mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-05 07:41:39 +00:00
fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5)
Bug 2 (capture_after=True loses app context): _maybe_follow_capture called backend.capture(mode='som') with no app=, causing cua-driver to capture the frontmost window instead of the app targeted by the preceding capture/focus_app.

Fix: track _last_app on CuaDriverBackend and thread it through the follow-up capture call so the same app is re-captured regardless of which window has OS focus.

Bug 5 (element labels stripped in capture results): _ELEMENT_LINE_RE matched the classic ' - [N] AXRole "label"' format but not the '[N] AXRole (order) id=Label' format introduced in cua-driver v0.1.6. All element labels were silently dropped as empty strings, making element identification impossible.

Fix: extend the regex to capture both group(3) (quoted label) and group(4) (id= label), and update _parse_elements_from_tree to use group(4) as a fallback. Both old and new cua-driver output now produce populated UIElement.label values.

focus_app() now also sets _last_app so that capture_after= on any subsequent action re-targets the focused app.

5 new regression tests added.

Part of #24170 (bugs 1 and 3/4 addressed separately).
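The dual-format label parsing described for Bug 5 could look roughly like the sketch below. This is not the actual _ELEMENT_LINE_RE from the repository: the pattern, the ELEMENT_LINE name, and the parse_elements helper are hypothetical, and the shape of the "(order)" token is assumed to be a single parenthesized annotation. Only the group layout (group 3 for the quoted label, group 4 for the id= label) follows the commit message.

```python
import re

# Hypothetical pattern, written for illustration only; the real
# _ELEMENT_LINE_RE in hermes-agent may differ.
ELEMENT_LINE = re.compile(
    r'^\s*(?:-\s*)?\[(\d+)\]\s+(\w+)'  # group 1: index, group 2: AX role
    r'(?:\s+"([^"]*)")?'               # group 3: quoted label (classic format)
    r'(?:\s+\([^)]*\))?'               # optional "(order)" annotation (assumed shape)
    r'(?:\s+id=(\S.*?))?\s*$'          # group 4: id= label (newer format)
)


def parse_elements(tree: str) -> list[tuple[int, str, str]]:
    """Return (index, role, label) for every element line in the tree text."""
    elements = []
    for line in tree.splitlines():
        m = ELEMENT_LINE.match(line)
        if not m:
            continue
        # Prefer the quoted label; fall back to the id= label, then to "".
        label = m.group(3) or m.group(4) or ""
        elements.append((int(m.group(1)), m.group(2), label))
    return elements
```

With this fallback, a line in either format yields a non-empty label whenever one is present, which is the behaviour the fix is meant to restore.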
parent 3fde8c153d
commit 4cc18877c6
3 changed files with 226 additions and 6 deletions
@@ -463,7 +463,11 @@ def _maybe_follow_capture(
     if not do_capture:
         return _text_response(res)
     try:
-        cap = backend.capture(mode="som")
+        # Preserve the app context established by the preceding capture/focus_app so
+        # that capture_after=True re-captures the same app rather than the frontmost
+        # window (which may have changed if the action caused a focus shift).
+        last_app = getattr(backend, "_last_app", None)
+        cap = backend.capture(mode="som", app=last_app)
     except Exception as e:
         logger.warning("follow-up capture failed: %s", e)
         return _text_response(res)
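The interaction between _last_app and the follow-up capture can be illustrated with a small stand-in. This is a sketch under stated assumptions: FakeBackend and follow_capture are hypothetical names, not the real CuaDriverBackend or _maybe_follow_capture, and the rule that capture() itself records the app is inferred from the commit message rather than confirmed by the hunk shown.

```python
from typing import Optional


class FakeBackend:
    """Hypothetical stand-in for CuaDriverBackend that records capture targets."""

    def __init__(self) -> None:
        self._last_app: Optional[str] = None
        self.calls: list[tuple[str, Optional[str]]] = []

    def focus_app(self, app: str) -> None:
        # Per the commit message, focusing an app also updates _last_app.
        self._last_app = app

    def capture(self, mode: str = "som", app: Optional[str] = None) -> dict:
        # Assumption: an explicit app= on capture is remembered as well.
        if app is not None:
            self._last_app = app
        self.calls.append((mode, app))
        return {"mode": mode, "app": app}


def follow_capture(backend) -> dict:
    # getattr with a default keeps backends without _last_app working:
    # they receive app=None, i.e. the frontmost-window behaviour.
    last_app = getattr(backend, "_last_app", None)
    return backend.capture(mode="som", app=last_app)
```

Reading the attribute through getattr rather than directly is what lets the same follow-up path serve backends that never track an app.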