hermes-agent/tests
Teknium 3800972dd0
feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)
When the active main model has native vision and the provider supports
multimodal tool results (Anthropic, OpenAI Chat, Codex Responses, Gemini
3, OpenRouter, Nous), vision_analyze loads the image bytes and returns
them to the model as a multimodal tool-result envelope. The model then
sees the pixels directly on its next turn instead of receiving a lossy
text description from an auxiliary LLM.

Falls back to the legacy aux-LLM text path for non-vision models and
unverified providers.

Mirrors the architecture used in OpenCode, Claude Code, Codex CLI, and
Cline. All four converge on the same pattern: tool results carry image
content blocks for vision-capable provider/model combinations.

Changes
- tools/vision_tools.py: _vision_analyze_native fast path + provider
  capability table (_supports_media_in_tool_results). Schema description
  updated to reflect new behaviour.
- agent/codex_responses_adapter.py: function_call_output.output now
  accepts the array form for multimodal tool results (was string-only).
  Preflight validates input_text/input_image parts.
- agent/auxiliary_client.py: _RUNTIME_MAIN_PROVIDER/_MODEL globals so
  tools see the live CLI/gateway override, not the stale config.yaml
  default. set_runtime_main()/clear_runtime_main() helpers.
- run_agent.py: AIAgent.run_conversation calls set_runtime_main at turn
  start so vision_analyze's fast-path check sees the actual runtime.
- tests/conftest.py: clear runtime-main override between tests.

Tests
- tests/tools/test_vision_native_fast_path.py: provider capability
  table, envelope shape, fast-path gating (vision-capable model uses
  fast path; non-vision model falls through to aux).
- tests/run_agent/test_codex_multimodal_tool_result.py: list tool
  content becomes function_call_output.output array; preflight
  preserves arrays and drops unknown part types.

Live verified
- Opus 4.6 + Sonnet 4.6 on OpenRouter: model calls vision_analyze on a
  typed filepath, gets pixels back, reads exact text from images that
  no aux description could capture (font color irony, multi-line
  fruit-count list, etc.).

PR replaces the closed prior efforts (#16506 shipped the inbound user-
attached path; this PR closes the gap for tool-discovered images).
2026-05-09 21:06:19 -07:00
..
acp fix(acp): preserve assistant reasoning metadata in session persistence 2026-05-05 10:18:28 -07:00
acp_adapter fix: make session search initialize session db 2026-05-09 14:36:58 -07:00
agent feat(curator): show rename map in user-visible summary (#22910) 2026-05-09 18:43:40 -07:00
cli fix(cli): preserve config comments on setting writes 2026-05-09 17:55:12 -07:00
cron fix(cron): avoid github skill false positives in scanner 2026-05-09 11:11:45 -07:00
e2e fix(gateway): move quick-command dispatch before built-in handlers 2026-05-04 01:39:23 -07:00
environments/benchmarks
fakes
gateway fix(gateway): degrade gracefully when all platform adapters are missing 2026-05-09 17:53:46 -07:00
hermes_cli chore(test): comment of test case rewrite to english 2026-05-09 19:31:41 -07:00
hermes_state
honcho_plugin
integration
openviking_plugin fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints 2026-04-30 02:35:29 -07:00
plugins fix(kanban): request default board explicitly (#21819) 2026-05-09 19:31:32 -07:00
providers feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838) 2026-05-09 14:47:00 -07:00
run_agent feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955) 2026-05-09 21:06:19 -07:00
skills fix(deps): declare youtube-transcript-api in pyproject.toml [youtube] extra 2026-05-09 13:36:01 -07:00
stress fix(kanban): gate claim + unblock on parent completion 2026-05-09 11:07:37 -07:00
tools feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955) 2026-05-09 21:06:19 -07:00
tui_gateway fix(tui): close slash parity gaps with CLI (#20339) 2026-05-05 15:42:39 -05:00
website docs(skills): explain restoring bundled skills 2026-05-05 13:46:20 -07:00
__init__.py
conftest.py feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955) 2026-05-09 21:06:19 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py
test_cli_manual_compress.py
test_cli_skin_integration.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_get_tool_definitions_cache_isolation.py fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335) 2026-04-30 04:32:06 -07:00
test_hermes_bootstrap.py fix(entry-points): guard hermes_bootstrap import so partial updates don't brick hermes (#22091) 2026-05-08 14:43:13 -07:00
test_hermes_constants.py test(hermes_constants): cover parse_reasoning_effort() 2026-05-07 09:59:07 -07:00
test_hermes_home_profile_warning.py fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746) 2026-05-02 01:49:55 -07:00
test_hermes_logging.py
test_hermes_state.py fix(session): route OR-combined short CJK tokens to LIKE fallback (#20494) 2026-05-09 17:53:02 -07:00
test_hermes_state_wal_fallback.py fix(sqlite): fall back to journal_mode=DELETE on NFS/SMB/FUSE (#22043) 2026-05-09 02:09:35 -07:00
test_honcho_client_config.py
test_install_sh_pythonpath_sanitization.py fix: harden install.sh against inherited Python env leakage 2026-05-06 04:02:02 -07:00
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_termux_network_prereqs.py fix: strengthen termux install network prerequisites 2026-05-07 13:04:08 -07:00
test_ipv4_preference.py
test_lazy_session_regressions.py fix: resolve lazy session creation regressions (#18370 fallout) (#20363) 2026-05-06 01:11:49 +05:30
test_lint_config.py lint: enable PLW1514 as a blocking ruff rule 2026-05-08 14:27:40 -07:00
test_mcp_serve.py fix(mcp): unwrap platforms key in channels_list 2026-05-07 13:41:16 -07:00
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py test(cli): cover minimax-oauth resolution, refresh, menu wiring 2026-04-29 09:53:42 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py fix(plugins): stop firing pre_tool_call hook twice per tool execution (#17611) 2026-04-29 12:43:39 -07:00
test_model_tools_async_bridge.py fix(model_tools): cancel coroutine on timeout so worker thread exits + log full traceback 2026-04-29 05:00:40 -07:00
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py fix(skills): support category-qualified local skill names 2026-05-05 10:15:31 -07:00
test_process_loop_event_loop_warning.py fix(cli): replace get_event_loop() with get_running_loop() to silence RuntimeWarning in process_loop thread (#19285) 2026-05-07 06:35:54 -07:00
test_project_metadata.py
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_termux_all_extra_compat.py fix: add termux-all install profile and safe fallbacks 2026-05-07 13:04:08 -07:00
test_timezone.py
test_toolset_distributions.py
test_toolsets.py fix: merge plugin tools into builtin toolsets 2026-05-05 10:14:17 -07:00
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py test+docs: cover transform_llm_output hook + release author map 2026-05-07 05:46:05 -07:00
test_transform_tool_result_hook.py
test_tui_gateway_server.py Merge pull request #20942 from NousResearch/austin/fix/personality 2026-05-07 18:54:29 -04:00
test_utils_truthy_values.py
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py
test_yuanbao_proto.py