hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-13 14:02:16 +00:00

xxxigm 531efe7208 fix(computer_use): add helper to decide capture vision routing	2026-05-21 17:38:19 -07:00

Add tools/computer_use/vision_routing.py with should_route_capture_to_aux_vision(provider, model, cfg), a small policy helper that decides whether a captured screenshot should be returned as a multimodal envelope (main model has native vision) or pre-analysed through the auxiliary.vision pipeline so the main model only sees text.

The decision mirrors agent.image_routing.decide_image_input_mode for user-attached images, so the capture path and the user-turn path agree on what counts as an explicit aux vision override:

* provider/model/base_url under auxiliary.vision => explicit override => route through aux vision
* provider+model accepts multimodal tool results AND main model reports supports_vision=True => keep multimodal envelope
* everything else (no tool-result image support, non-vision model, metadata lookup failure) => fail closed and route through aux

No call sites are changed in this commit; the helper is added in isolation so the routing decision can be unit-tested before it is plumbed into _capture_response().
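The three rules in the commit message above can be sketched roughly as follows. This is an illustrative reading of the description only, not the contents of vision_routing.py. The function name route_capture_to_aux_vision_sketch, the nested-dict shape of cfg, and the two injected capability lookups (accepts_tool_images, supports_vision) are all assumptions made so the example is self-contained.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any, Callable, Optional

# Keys the commit message names as an explicit override under auxiliary.vision.
_OVERRIDE_KEYS = ("provider", "model", "base_url")


def has_explicit_aux_vision_override(cfg: Mapping[str, Any]) -> bool:
    """Return True if cfg["auxiliary"]["vision"] sets any override key.

    Assumption: cfg is a plain nested mapping; the real config object may differ.
    """
    aux = cfg.get("auxiliary") if isinstance(cfg, Mapping) else None
    vision = aux.get("vision") if isinstance(aux, Mapping) else None
    if not isinstance(vision, Mapping):
        return False
    return any(vision.get(key) for key in _OVERRIDE_KEYS)


def route_capture_to_aux_vision_sketch(
    provider: str,
    model: str,
    cfg: Mapping[str, Any],
    *,
    # Hypothetical lookups, injected here so the sketch has no external deps.
    accepts_tool_images: Callable[[str, str], bool],
    supports_vision: Callable[[str, str], Optional[bool]],
) -> bool:
    """Return True to pre-analyse the capture via aux vision, False to keep
    the multimodal envelope."""
    # Rule 1: an explicit auxiliary.vision override always wins.
    if has_explicit_aux_vision_override(cfg):
        return True
    # Rule 2: keep the envelope only when both capabilities are confirmed.
    try:
        if accepts_tool_images(provider, model) and supports_vision(provider, model) is True:
            return False
    except Exception:
        # Rule 3 (metadata lookup failure): fail closed.
        return True
    # Rule 3 (everything else): fail closed and route through aux vision.
    return True
```

Taking the capability lookups as arguments keeps the policy a pure function, which suits the commit's stated goal of unit-testing the decision before it is wired into _capture_response().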
__init__.py	feat(computer-use): cua-driver backend, universal any-model schema	2026-05-08 11:07:38 -07:00
backend.py	feat(computer-use): cua-driver backend, universal any-model schema	2026-05-08 11:07:38 -07:00
cua_backend.py	fix(computer-use): surface app=… filter no-match instead of silently using frontmost (#24170 bug 1)	2026-05-21 17:15:35 -07:00
schema.py	feat(computer-use): background focus-safe backend — set_value, structured windows, MIME detection	2026-05-08 11:07:38 -07:00
tool.py	fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5)	2026-05-21 14:19:09 -07:00
vision_routing.py	fix(computer_use): add helper to decide capture vision routing	2026-05-21 17:38:19 -07:00