mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-30 06:41:51 +00:00
fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452)
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision

Fixes #31179.

Three coupled fixes so a configured aux vision backend actually serves vision tasks instead of silently routing images to the user's main provider:

1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves to `custom` + `https://api.openai.com/v1`. "openai" was not in PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for manual base_url), so the obvious config name silently failed to build a client. User-supplied base_url is still preserved; only the provider name normalises to `custom` so resolution doesn't hit the PROVIDER_REGISTRY-only path.

2. agent/auxiliary_client.py: the vision auto-detect chain now skips the user's main provider when models.dev reports `supports_vision=False`. Without this guard, a misconfigured aux provider would fall back to `auto`, which happily returned the main-provider client. The caller would then send image content to e.g. api.deepseek.com with model `gpt-4o-mini` and get a cryptic `unknown variant 'image_url', expected 'text'` from the provider's parser.

3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements` now mirrors the runtime fallback chain (explicit provider, then auto), so `vision_analyze` shows up whenever vision is actually serviceable. `browser_vision` gets a new `check_browser_vision_requirements` check_fn that AND-gates browser + vision availability, so it doesn't get advertised to the model when the call would fail at runtime.

Reproduction (config from the bug report):

    model.provider: deepseek
    model.default: deepseek-v4-pro
    auxiliary.vision.provider: openai
    auxiliary.vision.model: gpt-4o-mini

Before: resolve_vision_provider_client() returns None for the explicit provider, fallback auto returns the deepseek client with model='gpt-4o-mini', image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze hidden from tool list; browser_vision exposed but fails at call time.

After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini. vision_analyze and browser_vision both gate correctly on capability.

Tests: tests/agent/test_vision_routing_31179.py covers all three fixes (12 cases including the user's exact scenario, base_url preservation, text-only-main skip, capability-unknown permissive fallback, and tool gating parity). Existing 382 tests across auxiliary/vision/image_routing suites still pass.

* test(vision): use exact hostname check to silence CodeQL substring-sanitization alert

* fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL

The new `logger.debug(...)` added in the previous commit interpolated both `main_provider` and `vision_model` (a public model slug, not sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic re-flagged it twice because the rule mis-detects multi-value interpolations near tainted-via-config provider strings. Drop the model from the log args (provider alone is enough to diagnose the skip; the same sibling branch a few lines up already logs provider only). Behavior unchanged; CodeQL false positive cleared.
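The first two fixes in the message can be illustrated with a minimal sketch. It is not the code in agent/auxiliary_client.py: the names `AuxTarget`, `normalize_aux_provider`, and `main_usable_for_vision` are hypothetical, and the `supports_vision` flag is assumed to arrive as `True`, `False`, or `None` (unknown). The last line shows the exact-hostname comparison mentioned in the test commit, as opposed to a substring match on the URL.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Assumption: the default endpoint named in the commit message.
OPENAI_BASE_URL = "https://api.openai.com/v1"


@dataclass
class AuxTarget:
    """Hypothetical container for a resolved auxiliary backend."""
    provider: str
    base_url: Optional[str]


def normalize_aux_provider(provider: str, base_url: Optional[str] = None) -> AuxTarget:
    """Map the `openai` alias onto `custom`, keeping any user-supplied base_url."""
    if provider.strip().lower() == "openai":
        return AuxTarget("custom", base_url or OPENAI_BASE_URL)
    return AuxTarget(provider, base_url)


def main_usable_for_vision(supports_vision: Optional[bool]) -> bool:
    """Skip the main provider only when it is known to be text-only.

    None means the capability is unknown, which stays permissive.
    """
    return supports_vision is not False


target = normalize_aux_provider("openai")
# Compare the parsed hostname exactly rather than searching the URL string.
host_ok = urlparse(target.base_url).hostname == "api.openai.com"
```

With the reproduction config above, this sketch would resolve the vision task to `custom` at api.openai.com, and a main provider reported as text-only would not be offered as a fallback.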
parent d9ec90585c
commit 3d66787a04
4 changed files with 417 additions and 3 deletions
@@ -3652,6 +3652,24 @@ def check_browser_requirements() -> bool:
     return True
 
 
+def check_browser_vision_requirements() -> bool:
+    """Whether ``browser_vision`` should be advertised to the model.
+
+    Requires BOTH a working browser (``check_browser_requirements``) AND a
+    resolvable vision backend. Without the vision check, the tool stays in
+    the model's tool list even when no vision provider is configured, then
+    fails at call time with a cryptic provider-side error like
+    ``unknown variant `image_url`, expected `text``` (issue #31179).
+    """
+    if not check_browser_requirements():
+        return False
+    try:
+        from tools.vision_tools import check_vision_requirements
+    except ImportError:
+        return False
+    return check_vision_requirements()
+
+
 # ============================================================================
 # Module Test
 # ============================================================================
@@ -3786,7 +3804,7 @@ registry.register(
     toolset="browser",
     schema=_BROWSER_SCHEMA_MAP["browser_vision"],
     handler=lambda args, **kw: browser_vision(question=args.get("question", ""), annotate=args.get("annotate", False), task_id=kw.get("task_id")),
-    check_fn=check_browser_requirements,
+    check_fn=check_browser_vision_requirements,
     emoji="👁️",
 )
 registry.register(
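
To show why swapping the `check_fn` changes what the model sees, here is a minimal sketch of check-gated tool advertisement. The `ToolRegistry` class and the `all_of` helper are hypothetical stand-ins, not the project's `registry` module; the only assumption is that a tool is advertised when its `check_fn` returns True.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical registry: a tool is advertised only if its check passes."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], bool]] = {}

    def register(self, name: str, check_fn: Callable[[], bool]) -> None:
        self._checks[name] = check_fn

    def available(self) -> List[str]:
        # Checks run at listing time, so availability tracks current config.
        return [name for name, check in self._checks.items() if check()]


def all_of(*checks: Callable[[], bool]) -> Callable[[], bool]:
    """AND-gate several requirement checks into one check_fn."""
    return lambda: all(check() for check in checks)


browser_ok = lambda: True   # stand-in for a working browser
vision_ok = lambda: False   # stand-in for no resolvable vision backend

registry = ToolRegistry()
registry.register("browser_navigate", browser_ok)
registry.register("browser_vision", all_of(browser_ok, vision_ok))
advertised = registry.available()
```

In this sketch `browser_vision` is left out of `advertised` while the plain browser tool stays listed. The function in the diff reaches the same result with an explicit early return and a guarded import, so a missing `tools.vision_tools` module also counts as unavailable.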