hermes-agent/tests
Erosika edc23e888d fix(gateway): scrub memory-context leaks from vision auto-analysis output
fixes #5719

The auxiliary vision LLM called by gateway._enrich_message_with_vision
can echo its injected Honcho system prompt back into the image
description.  That description gets embedded verbatim into the enriched
user message, so recalled memory (personal facts, dialectic output)
surfaces into a user-visible bubble.

Strips both forms of leak before embedding:
  - <memory-context>...</memory-context> fenced blocks (sanitize_context)
  - trailing '## Honcho Context' sections (header + everything after)

Plus regression tests:
  - tests/agent/test_streaming_context_scrubber.py — 13 tests on the
    stateful scrubber (whole block, split tags, false-positive partial
    tags, unterminated span, reset, case-insensitivity)
  - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on
    _fire_stream_delta covering the realistic 7-chunk leak scenario and
    the cross-turn scrubber reset
  - tests/gateway/test_vision_memory_leak.py — 4 tests covering the
    vision auto-analysis boundary (clean pass-through, '## Honcho Context'
    header, fenced block, both patterns together)
2026-04-24 18:48:10 -04:00
..
acp fix(acp): include MCP toolsets in ACP sessions 2026-04-24 03:04:42 -07:00
agent fix(gateway): scrub memory-context leaks from vision auto-analysis output 2026-04-24 18:48:10 -04:00
cli feat: add slash command for busy input mode 2026-04-24 15:15:26 -07:00
cron feat(cron): per-job workdir for project-aware cron runs (#15110) 2026-04-24 05:07:01 -07:00
e2e refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047) 2026-04-24 03:10:52 -07:00
environments/benchmarks
fakes
gateway fix(gateway): scrub memory-context leaks from vision auto-analysis output 2026-04-24 18:48:10 -04:00
hermes_cli fix: mobile chat in new layout 2026-04-24 12:07:46 -04:00
hermes_state fix(resume): redirect --resume to the descendant that actually holds the messages 2026-04-24 03:04:42 -07:00
honcho_plugin fix(plugins/memory/honcho): default Honcho SDK HTTP timeout to 30s 2026-04-24 18:48:10 -04:00
integration fix(discord): strip RTP padding before DAVE/Opus decode (#11267) 2026-04-16 16:50:15 -07:00
plugins feat(hindsight): optional bank_id_template for per-agent / per-user banks 2026-04-24 03:38:17 -07:00
run_agent fix(gateway): scrub memory-context leaks from vision auto-analysis output 2026-04-24 18:48:10 -04:00
skills fix(google-workspace): normalize authorized user token writes 2026-04-16 04:22:16 -07:00
tools fix(env): safely quote ~/ subpaths in wrapped cd commands 2026-04-24 15:25:12 -07:00
tui_gateway fix(tui): keep default personality neutral 2026-04-24 16:19:23 -05:00
__init__.py
conftest.py test(conftest): reset module-level state + unset platform allowlists (#13400) 2026-04-21 01:33:10 -07:00
run_interrupt_test.py
test_account_usage.py feat(account-usage): add per-provider account limits module 2026-04-21 01:56:35 -07:00
test_base_url_hostname.py security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522) 2026-04-21 06:06:16 -07:00
test_batch_runner_checkpoint.py test: regression coverage for checkpoint dedup and inf/nan coercion 2026-04-24 14:32:21 -07:00
test_cli_file_drop.py fix(tui): improve macOS paste and shortcut parity 2026-04-21 08:00:00 -07:00
test_cli_skin_integration.py fix: align status bar skin tests with upstream main 2026-04-22 13:20:02 -07:00
test_ctx_halving_fix.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_empty_model_fallback.py fix: fall back to provider's default model when model config is empty (#8303) 2026-04-12 03:53:30 -07:00
test_evidence_store.py
test_hermes_constants.py fix(gateway): harden Docker/container gateway pathway 2026-04-12 16:36:11 -07:00
test_hermes_logging.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_hermes_state.py fix: harden memory-context leak boundaries 2026-04-24 18:48:10 -04:00
test_honcho_client_config.py
test_ipv4_preference.py feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196) 2026-04-11 23:12:11 -07:00
test_mcp_serve.py
test_mini_swe_runner.py fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) 2026-04-20 12:23:05 -07:00
test_minimax_model_validation.py fix(models): validate MiniMax models against static catalog (#12611, #12460, #12399, #12547) 2026-04-19 22:44:47 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py test: regression coverage for checkpoint dedup and inf/nan coercion 2026-04-24 14:32:21 -07:00
test_model_tools_async_bridge.py fix(core): ensure non-blocking executor shutdown on async timeout 2026-04-22 14:42:32 -07:00
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453) 2026-04-17 00:20:40 -07:00
test_project_metadata.py build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627) 2026-04-17 13:31:53 -07:00
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py fix: per-profile subprocess HOME isolation (#4426) (#7357) 2026-04-10 13:37:45 -07:00
test_timezone.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_toolset_distributions.py
test_toolsets.py fix(ci): unblock test suite + cut ~2s of dead Z.AI probes from every AIAgent 2026-04-19 19:18:19 -07:00
test_trajectory_compressor.py fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) 2026-04-20 12:23:05 -07:00
test_trajectory_compressor_async.py fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) 2026-04-20 12:23:05 -07:00
test_transform_tool_result_hook.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_tui_gateway_server.py feat(tui): per-section visibility for the details accordion 2026-04-24 02:34:32 -05:00
test_utils_truthy_values.py