hermes-agent

mirrors/hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-30 06:41:51 +00:00

Commit graph

Author	SHA1	Message	Date

xxxigm

531efe7208

fix(computer_use): add helper to decide capture vision routing

Add tools/computer_use/vision_routing.py with
should_route_capture_to_aux_vision(provider, model, cfg) — a small
policy helper that decides whether a captured screenshot should be
returned as a multimodal envelope (main model has native vision) or
pre-analysed through the auxiliary.vision pipeline so the main model
only sees text.

The decision mirrors agent.image_routing.decide_image_input_mode for
user-attached images, so the capture path and the user-turn path agree
on what counts as an explicit aux vision override:
  * provider/model/base_url under auxiliary.vision => explicit override
    => route through aux vision
  * provider+model accepts multimodal tool results AND main model
    reports supports_vision=True => keep multimodal envelope
  * everything else (no tool-result image support, non-vision model,
    metadata lookup failure) => fail closed and route through aux

No call sites are changed in this commit; the helper is added in
isolation so the routing decision can be unit-tested before it is
plumbed into _capture_response().
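
The three routing rules in this commit message can be sketched roughly as follows. This is an illustration only, not the contents of tools/computer_use/vision_routing.py: the function name sketch_should_route_capture_to_aux_vision, the two capability-lookup callables, the nested-mapping shape of cfg, and the choice to treat any non-empty provider/model/base_url value as an override are all assumptions made for the example.

```python
from typing import Any, Callable, Mapping, Optional


def has_explicit_aux_vision_override(cfg: Mapping[str, Any]) -> bool:
    """Return True if provider, model, or base_url is set under auxiliary.vision.

    Assumption: cfg is a nested mapping and any non-empty value counts as set.
    """
    aux = cfg.get("auxiliary")
    vision = aux.get("vision") if isinstance(aux, Mapping) else None
    if not isinstance(vision, Mapping):
        return False
    return any(vision.get(key) for key in ("provider", "model", "base_url"))


def sketch_should_route_capture_to_aux_vision(
    provider: str,
    model: str,
    cfg: Mapping[str, Any],
    *,
    # Hypothetical lookups, injected so the policy can be tested in isolation.
    accepts_tool_result_images: Callable[[str, str], bool],
    supports_vision: Callable[[str, str], Optional[bool]],
) -> bool:
    """Return True to pre-analyse the capture via aux vision, False to keep
    the multimodal envelope for the main model."""
    # Rule 1: an explicit auxiliary.vision override always wins.
    if has_explicit_aux_vision_override(cfg):
        return True
    try:
        # Rule 2: both capabilities must be positively confirmed.
        native = (
            accepts_tool_result_images(provider, model)
            and supports_vision(provider, model) is True
        )
    except Exception:
        # Rule 3: a metadata lookup failure fails closed.
        return True
    # Rule 3: anything not confirmed as natively capable goes through aux.
    return not native
```

Passing the capability lookups in as arguments is one way to unit-test the decision before any call site depends on it, which is the stated reason the helper lands on its own in this commit.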

2026-05-21 17:38:19 -07:00

1 commit