hermes-agent/tools/computer_use
Bartok9 4cc18877c6 fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5)
Bug 2 (capture_after=True loses app context):
_maybe_follow_capture called backend.capture(mode='som') with no app=,
causing cua-driver to capture the frontmost window instead of the app
targeted by the preceding capture/focus_app. Fix: track _last_app on
CuaDriverBackend and thread it through the follow-up capture call so
the same app is re-captured regardless of which window has OS focus.

Bug 5 (element labels stripped in capture results):
_ELEMENT_LINE_RE matched the classic '  - [N] AXRole "label"' format
but not the '[N] AXRole (order) id=Label' format introduced in
cua-driver v0.1.6. Labels in the new format were silently replaced
with empty strings, making element identification impossible.

Fix: extend regex to capture both group(3) (quoted label) and group(4)
(id= label), and update _parse_elements_from_tree to use group(4) as
fallback. Both old and new cua-driver output now produce populated
UIElement.label values.
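
Illustrative sketch of the two-format match described above. This is not
the code in cua_backend.py: the pattern, the Element dataclass, and
parse_elements are hypothetical stand-ins, and it assumes that "(order)"
is a parenthesised token and that the id= label runs to the end of the
line.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical pattern; the real _ELEMENT_LINE_RE may differ.
ELEMENT_LINE_RE = re.compile(
    r'^\s*(?:-\s*)?\[(\d+)\]\s+(\w+)'  # group 1: index, group 2: AX role
    r'(?:\s+"([^"]*)")?'               # group 3: quoted label (classic format)
    r'(?:\s+\([^)]*\))?'               # optional "(order)" token, not captured
    r'(?:\s+id=(.*\S))?'               # group 4: id= label (newer format)
)


@dataclass
class Element:
    """Stand-in for UIElement, reduced to the fields used here."""
    index: int
    role: str
    label: str


def parse_elements(tree_text: str) -> List[Element]:
    """Parse element lines, preferring group(3) and falling back to group(4)."""
    elements = []
    for line in tree_text.splitlines():
        m = ELEMENT_LINE_RE.match(line)
        if m is None:
            continue  # not an element line (window header, blank line, ...)
        label = m.group(3) or m.group(4) or ""
        elements.append(Element(int(m.group(1)), m.group(2), label))
    return elements
```

Under these assumptions, parse_elements('  - [3] AXButton "Save"') and
parse_elements('[7] AXTextField (2) id=Search') each return one element
with a non-empty label.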

focus_app() now also sets _last_app so that capture_after= on any
subsequent action re-targets the focused app.
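
Illustrative sketch of the _last_app threading for bug 2. SketchBackend
and maybe_follow_capture are hypothetical stand-ins for CuaDriverBackend
and _maybe_follow_capture; the real signatures may differ. It assumes
that a capture with no explicit app= leaves the remembered app unchanged.

```python
from typing import Any, Dict, List, Optional, Tuple


class SketchBackend:
    """Stand-in backend that records calls instead of talking to cua-driver."""

    def __init__(self) -> None:
        self._last_app: Optional[str] = None
        self.calls: List[Tuple[Any, ...]] = []

    def capture(self, mode: str = "som", app: Optional[str] = None) -> Dict[str, Any]:
        # Remember the most recent explicit target (assumption: app=None
        # does not clear it).
        if app is not None:
            self._last_app = app
        self.calls.append(("capture", mode, app))
        return {"mode": mode, "app": app}

    def focus_app(self, app: str) -> None:
        # Focusing an app also makes it the target of later follow-up captures.
        self._last_app = app
        self.calls.append(("focus_app", app))


def maybe_follow_capture(backend: SketchBackend, capture_after: bool) -> Optional[Dict[str, Any]]:
    """Re-capture the remembered app rather than whichever window is frontmost."""
    if not capture_after:
        return None
    return backend.capture(mode="som", app=backend._last_app)
```

With this shape, a focus_app("Safari") followed by an action with
capture_after=True issues capture(mode="som", app="Safari") even if
another window has taken OS focus in the meantime.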

5 new regression tests added.

Part of #24170 (bugs 1 and 3/4 addressed separately).
2026-05-21 14:19:09 -07:00
__init__.py feat(computer-use): cua-driver backend, universal any-model schema 2026-05-08 11:07:38 -07:00
backend.py feat(computer-use): cua-driver backend, universal any-model schema 2026-05-08 11:07:38 -07:00
cua_backend.py fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5) 2026-05-21 14:19:09 -07:00
schema.py feat(computer-use): background focus-safe backend — set_value, structured windows, MIME detection 2026-05-08 11:07:38 -07:00
tool.py fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5) 2026-05-21 14:19:09 -07:00