mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-30 06:41:51 +00:00
Add tools/computer_use/vision_routing.py with
should_route_capture_to_aux_vision(provider, model, cfg) — a small
policy helper that decides whether a captured screenshot should be
returned as a multimodal envelope (main model has native vision) or
pre-analysed through the auxiliary.vision pipeline so the main model
only sees text.
The decision mirrors agent.image_routing.decide_image_input_mode for
user-attached images, so the capture path and the user-turn path agree
on what counts as an explicit aux vision override:
* provider/model/base_url under auxiliary.vision => explicit override
  => route through aux vision
* provider+model accepts multimodal tool results AND main model
  reports supports_vision=True => keep multimodal envelope
* everything else (no tool-result image support, non-vision model,
  metadata lookup failure) => fail closed and route through aux
No call sites are changed in this commit; the helper is added in
isolation so the routing decision can be unit-tested before it is
plumbed into _capture_response().
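
The three rules above can be sketched roughly as follows. This is an illustrative sketch only, not the code in vision_routing.py: the function is renamed with a _sketch suffix, the two capability lookups (accepts_tool_images, supports_vision) are hypothetical callables injected as parameters, cfg is assumed to be a plain nested mapping, and any non-empty provider/model/base_url value is assumed to count as an explicit override.

```python
from typing import Any, Callable, Mapping, Optional

# Hypothetical capability lookup: (provider, model) -> True / False / None.
# The real helper presumably reads provider/model metadata itself.
CapabilityLookup = Callable[[str, str], Optional[bool]]


def has_explicit_aux_vision_override(cfg: Mapping[str, Any]) -> bool:
    """Return True if auxiliary.vision sets provider, model, or base_url.

    Assumption: any non-empty value counts as an explicit override.
    """
    aux = cfg.get("auxiliary") or {}
    vision = aux.get("vision") if isinstance(aux, Mapping) else None
    if not isinstance(vision, Mapping):
        return False
    return any(
        str(vision.get(key) or "").strip()
        for key in ("provider", "model", "base_url")
    )


def should_route_capture_to_aux_vision_sketch(
    provider: str,
    model: str,
    cfg: Mapping[str, Any],
    *,
    accepts_tool_images: CapabilityLookup,
    supports_vision: CapabilityLookup,
) -> bool:
    """Return True to pre-analyse the capture via aux vision, False to keep
    the multimodal envelope for the main model."""
    # Rule 1: an explicit auxiliary.vision override always wins.
    if has_explicit_aux_vision_override(cfg):
        return True
    # Rule 2: keep the envelope only when both capabilities are known True.
    try:
        native_ok = (
            accepts_tool_images(provider, model) is True
            and supports_vision(provider, model) is True
        )
    except Exception:
        # Rule 3: a metadata lookup failure fails closed (route through aux).
        return True
    # Rule 3: anything short of a confirmed "yes" also routes through aux.
    return not native_ok
```

Comparing against `is True` rather than truthiness keeps an unknown (None) capability on the fail-closed side, which matches the "everything else" rule in the list above.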
| Name |
|---|
| __init__.py |
| backend.py |
| cua_backend.py |
| schema.py |
| tool.py |
| vision_routing.py |