hermes-agent/tools
ddupont e31f3b3c56 feat(computer-use): background focus-safe backend — set_value, structured windows, MIME detection
Extends the cua-driver computer-use backend to drive backgrounded macOS
windows without stealing keyboard or mouse focus from the foreground app.
All changes target the cua-driver MCP backend and the shared dispatcher.

## cua_backend.py

**Window-aware capture**: capture() now calls list_windows + get_window_state
instead of the removed capture tool. Prefers structuredContent.windows
(MCP 2024-11-05+ cua-driver) for zero-parse window enumeration; falls back
to regex-parsed text for older builds. Stores the selected (pid, window_id)
as sticky context so subsequent action calls do not need a redundant round-trip.

**Action routing**: click/scroll/type_text/key all carry the sticky pid
(and window_id for element-indexed clicks). type_text routes through
type_text_chars (individual key events) rather than AX attribute write --
WebKit AXTextFields reject attribute writes from backgrounded processes.

**Key parsing**: _parse_key_combo splits cmd+s-style strings into
(key, [modifiers]) and routes to hotkey (modifier present) or
press_key (bare key) -- cua-driver actual tool names.

**set_value method**: new set_value(value, element) calls the cua-driver
set_value MCP tool. For AXPopUpButton / HTML select in a backgrounded Safari,
AXPress opens the native macOS popup which closes immediately when the app is
non-frontmost; set_value AX-presses the matching child option directly
(no menu required, no focus steal).

**focus_app**: reimplemented as a pure window-selector (enumerates
list_windows, sets sticky pid/window_id) without ever raising the window
or stealing focus.

**list_apps**: fixed tool name from listApps to list_apps; handles plain-text
response via regex when structured data is absent.

**Structured-content extraction**: _extract_tool_result now surfaces
structuredContent from MCP results, enabling the list_windows window array
without text parsing.

**Helpers**: _parse_windows_from_text, _parse_elements_from_tree,
_split_tree_text, _parse_key_combo extracted as module-level functions.

## schema.py

Added set_value to the action enum with a description explaining when to
prefer it over click (select/popup elements, sliders, no focus steal).
Added value field for set_value payloads.

## tool.py

Routed set_value action through _dispatch to backend.set_value.
Added set_value to _DESTRUCTIVE_ACTIONS (approval-gated).
Fixed MIME-type detection in _capture_response: cua-driver may return
JPEG; detect from base64 magic bytes (/9j/ -> image/jpeg, else image/png)
rather than hardcoding image/png.

## agent/display.py + run_agent.py

Guard _detect_tool_failure and result-preview logic against non-string
function_result values: multimodal tool results (dicts with _multimodal=True)
are not string-sliceable; treat them as successes and fall back to str()
for length/preview.
2026-05-08 11:07:38 -07:00
..
browser_providers feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
computer_use feat(computer-use): background focus-safe backend — set_value, structured windows, MIME detection 2026-05-08 11:07:38 -07:00
environments fix(windows): terminal drain and cwd path conversion for native Windows 2026-05-07 06:11:00 -07:00
neutts_samples refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow 2026-03-17 02:33:12 -07:00
web_providers docs(web): fix SearXNG env configuration 2026-05-07 17:54:47 -07:00
__init__.py Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-31 08:48:54 +09:00
ansi_strip.py fix: strip ANSI at the source — clean terminal output before it reaches the model 2026-03-23 07:43:12 -07:00
approval.py fix(approval): cron jobs must not be treated as gateway context 2026-05-08 07:30:14 -07:00
binary_extensions.py fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening 2026-04-08 02:24:32 -07:00
browser_camofox.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
browser_camofox_state.py feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419) 2026-04-01 04:18:50 -07:00
browser_cdp_tool.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
browser_dialog_tool.py feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540) 2026-04-23 22:23:37 -07:00
browser_supervisor.py fix(browser_supervisor): verify thread and loop health before returning cached supervisor 2026-04-30 20:33:33 -07:00
browser_tool.py fix(browser): enforce cloud-metadata SSRF floor in hybrid routing (#16234) (#21228) 2026-05-07 05:38:05 -07:00
budget_config.py fix: preserve existing thresholds, remove pre-read byte guard 2026-04-08 02:24:32 -07:00
checkpoint_manager.py feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
clarify_tool.py refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites 2026-04-07 13:36:38 -07:00
code_execution_tool.py fix(tools): serialize concurrent hermes_tools RPC calls from execute_code 2026-04-30 03:31:16 -07:00
computer_use_tool.py feat(computer-use): cua-driver backend, universal any-model schema 2026-05-08 11:07:38 -07:00
credential_files.py fix(gateway): translate inbound document host paths to container paths for Docker backend 2026-05-07 05:02:26 -07:00
cronjob_tools.py feat(cron): routing intent — deliver=all fans out to every connected channel (#21495) 2026-05-08 04:17:21 -07:00
debug_helpers.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
delegate_tool.py fix(delegate): expand composite toolsets before intersection in delegate_task 2026-05-07 06:41:42 -07:00
discord_tool.py feat: add Discord message deletion action 2026-05-07 05:11:09 -07:00
env_passthrough.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
feishu_doc_tool.py fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP 2026-04-17 19:04:11 -07:00
feishu_drive_tool.py fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP 2026-04-17 19:04:11 -07:00
file_operations.py docs: align terminal-backend count and naming across docs and code 2026-05-05 13:44:09 -07:00
file_state.py feat(delegate): cross-agent file state coordination for concurrent subagents (#13718) 2026-04-21 16:41:26 -07:00
file_tools.py feat(file_tools): post-write delta lint on write_file + patch, add JSON/YAML/TOML/Python in-process linters (#20191) 2026-05-05 04:54:17 -07:00
fuzzy_match.py fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage 2026-04-21 02:03:46 -07:00
homeassistant_tool.py fix: clean up description escaping, add string-data tests 2026-04-13 04:45:07 -07:00
image_generation_tool.py feat(image-gen): honor image_gen.model from config.yaml in plugin dispatch 2026-05-07 06:24:24 -07:00
interrupt.py fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907) 2026-04-17 20:39:25 -07:00
kanban_tools.py fix(kanban): heartbeat tool extends claim TTL, not just last_heartbeat_at 2026-05-07 05:05:20 -07:00
managed_tool_gateway.py fix(tools): add debug logging for token refresh and tighten domain check 2026-04-02 12:40:03 +11:00
mcp_oauth.py fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226) 2026-05-07 05:35:33 -07:00
mcp_oauth_manager.py fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226) 2026-05-07 05:35:33 -07:00
mcp_tool.py fix(mcp): gate utility stubs on server-advertised capabilities (#21347) 2026-05-07 07:39:50 -07:00
memory_tool.py fix(memory): remove dead allOf schema block at the source 2026-05-07 07:03:21 -07:00
microsoft_graph_auth.py feat(msgraph): add auth and client foundation 2026-05-08 09:27:26 -07:00
microsoft_graph_client.py fix(msgraph): stream download_to_file body instead of buffering 2026-05-08 09:27:26 -07:00
mixture_of_agents_tool.py Fix (mixture_of_agents): replace deprecated Gemini model and forward max_tokens to OpenRouter (#6621) 2026-04-23 15:14:11 -07:00
neutts_synth.py fix(tts): document NeuTTS provider and align install guidance (#1903) 2026-03-18 02:55:30 -07:00
openrouter_client.py refactor: route ad-hoc LLM consumers through centralized provider router 2026-03-11 20:02:36 -07:00
osv_check.py feat: OSV malware check for MCP extension packages (#5305) 2026-04-05 12:46:07 -07:00
patch_parser.py fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage 2026-04-21 02:03:46 -07:00
path_security.py refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
process_registry.py fix(terminal): guard background process spawn against deleted cwd (#19933) 2026-05-04 15:35:34 -07:00
registry.py perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098) 2026-04-28 18:20:17 -07:00
rl_training_tool.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
schema_sanitizer.py fix: strip Codex-hostile top-level schema combinators 2026-05-07 07:03:21 -07:00
send_message_tool.py feat(gateway): support [[as_document]] directive for skill media routing 2026-05-07 05:20:10 -07:00
session_search_tool.py fix(session-search): report source from resolved parent, not FTS5 child session (#15909) 2026-05-04 05:07:40 -07:00
skill_manager_tool.py fix: exclude hidden and archive dirs from _find_skill rglob 2026-05-07 05:15:28 -07:00
skill_provenance.py fix(curator): only mark agent-created for background-review sediment (#19621) 2026-05-04 02:42:16 -07:00
skill_usage.py fix(skills): lock usage telemetry updates 2026-05-07 06:13:37 -07:00
skills_guard.py feat(skills-guard): gate agent-created scanner on config.skills.guard_agent_created (default off) 2026-04-23 06:20:47 -07:00
skills_hub.py skills-hub: hash binary skill bundle files correctly 2026-05-04 01:28:12 -07:00
skills_sync.py refactor: consolidate symlink-safe atomic replace into shared helper 2026-04-28 04:58:22 -07:00
skills_tool.py fix(skills): support category-qualified local skill names 2026-05-05 10:15:31 -07:00
slash_confirm.py feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation 2026-04-29 21:56:47 -07:00
terminal_tool.py fix(terminal): skip sudo prompt when local NOPASSWD sudo works 2026-04-30 20:38:09 -07:00
tirith_security.py fix: guard against None tirith path in security scanner 2026-04-23 03:08:53 -07:00
todo_tool.py fix(tools): enforce ID uniqueness in TODO store during replace operations 2026-04-11 16:22:50 -07:00
tool_backend_helpers.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
tool_output_limits.py feat(skills): add design-md skill for Google's DESIGN.md spec (#14876) 2026-04-23 21:51:19 -07:00
tool_result_storage.py fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940) 2026-04-11 14:26:11 -07:00
transcription_tools.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
tts_tool.py fix(tts): update MiniMax API endpoint to v1/text_to_speech 2026-05-04 12:36:09 -07:00
url_safety.py fix(browser): enforce cloud-metadata SSRF floor in hybrid routing (#16234) (#21228) 2026-05-07 05:38:05 -07:00
vision_tools.py fix(vision): guard user_prompt type in video_analyze_tool before debug_call_data construction 2026-05-03 15:28:04 -07:00
voice_mode.py fix: point optional-dep install hints at the venv's python (#11938) 2026-04-17 21:16:33 -07:00
web_tools.py feat(web): add Brave Search (free tier) and DDGS search providers 2026-05-07 09:59:17 -07:00
website_policy.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
xai_http.py feat(xai): upgrade to Responses API, add TTS provider 2026-04-16 02:24:08 -07:00
yuanbao_tools.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00