hermes-agent/tools/computer_use
Brooklyn Nicholson 807b696295 fix(computer-use): vision capture returns an image on cua-driver >=0.5.x
Vision mode called a `screenshot` MCP tool that cua-driver dropped in
0.5.x (full-window PNG capture was folded into `get_window_state`). The
driver replied "Unknown tool: screenshot", so `images` came back empty,
`png_b64` stayed None, and capture returned a 0x0 result with no image
on every call. `som`/`ax` were unaffected because they already use
`get_window_state`, which masked the regression.

Route vision by capability:
- driver advertises `screenshot` (older builds) -> use it (no AX walk)
- otherwise -> call `get_window_state` but discard the AX tree/elements,
  returning only the PNG so vision stays free of element noise
- capabilities not yet discovered -> try `screenshot` first; if it returns
  no image, fall back to `get_window_state`, so the path self-heals
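
The routing above can be sketched roughly as follows. This is an illustration only, not the code in `cua_backend.py`: `choose_vision_tools`, `capture_vision`, and the `call_tool` callable (assumed to return a base64 PNG string, or None when the driver gives no image) are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, List, Optional, Set


def choose_vision_tools(advertised: Optional[Set[str]]) -> List[str]:
    """Return the tools to try, in order, for a vision capture.

    `advertised` is the set of tool names the driver reported, or None
    when capabilities have not been discovered yet (assumed convention).
    """
    if advertised is None:
        # Unknown capabilities: try the cheap tool, then fall back.
        return ["screenshot", "get_window_state"]
    if "screenshot" in advertised:
        # Older builds: dedicated screenshot tool, no AX walk.
        return ["screenshot"]
    # Newer builds: the PNG is only available through get_window_state.
    return ["get_window_state"]


def capture_vision(
    advertised: Optional[Set[str]],
    call_tool: Callable[[str], Optional[str]],
) -> Dict[str, object]:
    """Try each candidate tool until one yields an image.

    Elements are always returned empty, so vision mode stays free of
    AX noise even when get_window_state supplied the PNG.
    """
    for name in choose_vision_tools(advertised):
        png_b64 = call_tool(name)
        if png_b64:
            return {"png_b64": png_b64, "elements": []}
    return {"png_b64": None, "elements": []}
```

Keeping the tool order in a separate function makes each routing case testable without a live driver.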

Add `_image_from_tool_result` to pull the PNG from either an MCP image
content-part or `structuredContent.screenshot_png_b64`, and use it on
the som path too so the image won't silently drop on driver builds that
deliver it via structuredContent instead of a content part.
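
A minimal sketch of that extraction, assuming the tool result is a plain dict shaped like an MCP tool result (a `content` list of typed parts plus an optional `structuredContent` object). The function name below is a stand-in and the real `_image_from_tool_result` may differ in signature and details.

```python
from typing import Any, Mapping, Optional


def image_from_tool_result(result: Mapping[str, Any]) -> Optional[str]:
    """Return the base64 PNG from a tool result, or None if absent.

    Checks image content parts first, then the structured payload.
    """
    # Case 1: the image arrives as an MCP image content part.
    for part in result.get("content") or []:
        if (
            isinstance(part, Mapping)
            and part.get("type") == "image"
            and part.get("data")
        ):
            return part["data"]

    # Case 2: the image arrives inside structuredContent.
    structured = result.get("structuredContent")
    if isinstance(structured, Mapping):
        return structured.get("screenshot_png_b64") or None

    return None
```

Returning None rather than raising lets the caller decide whether a missing image should trigger a fallback.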

Verified live (vision: 1568x954, 0 elements; som: image + 527 elements)
and with unit coverage of all four routing cases.
2026-06-22 17:41:42 -05:00
__init__.py feat(computer-use): cua-driver backend, universal any-model schema 2026-05-08 11:07:38 -07:00
backend.py feat(computer_use): cross-platform cua-driver (macOS/Windows/Linux) 2026-06-22 06:42:30 -07:00
cua_backend.py fix(computer-use): vision capture returns an image on cua-driver >=0.5.x 2026-06-22 17:41:42 -05:00
doctor.py feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842) 2026-06-22 09:57:16 -07:00
schema.py feat(computer_use): cross-platform cua-driver (macOS/Windows/Linux) 2026-06-22 06:42:30 -07:00
tool.py fix(computer_use): reconcile Linux gate with stale "gated off" comments 2026-06-22 06:42:30 -07:00
vision_routing.py fix(computer_use): honor custom vision routing 2026-06-07 02:09:20 -07:00