Commit graph

1 commit

Author SHA1 Message Date
xxxigm
531efe7208 fix(computer_use): add helper to decide capture vision routing
Add tools/computer_use/vision_routing.py with
should_route_capture_to_aux_vision(provider, model, cfg) — a small
policy helper that decides whether a captured screenshot should be
returned as a multimodal envelope (main model has native vision) or
pre-analysed through the auxiliary.vision pipeline so the main model
only sees text.

The decision mirrors agent.image_routing.decide_image_input_mode for
user-attached images, so the capture path and the user-turn path agree
on what counts as an explicit aux vision override:
  * provider/model/base_url under auxiliary.vision => explicit override
    => route through aux vision
  * provider+model accepts multimodal tool results AND main model
    reports supports_vision=True => keep multimodal envelope
  * everything else (no tool-result image support, non-vision model,
    metadata lookup failure) => fail closed and route through aux

No call sites are changed in this commit; the helper is added in
isolation so the routing decision can be unit-tested before it is
plumbed into _capture_response().
2026-05-21 17:38:19 -07:00