hermes-agent/tests/run_agent
Erosika 3b2edb347d fix(gateway): scrub memory-context leaks from vision auto-analysis output
fixes #5719

The auxiliary vision LLM called by gateway._enrich_message_with_vision
can echo its injected Honcho system prompt back into the image
description.  That description gets embedded verbatim into the enriched
user message, so recalled memory (personal facts, dialectic output)
surfaces into a user-visible bubble.

Strips both forms of leak before embedding:
  - <memory-context>...</memory-context> fenced blocks (sanitize_context)
  - trailing '## Honcho Context' sections (header + everything after)

Plus regression tests:
  - tests/agent/test_streaming_context_scrubber.py — 13 tests on the
    stateful scrubber (whole block, split tags, false-positive partial
    tags, unterminated span, reset, case-insensitivity)
  - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on
    _fire_stream_delta covering the realistic 7-chunk leak scenario and
    the cross-turn scrubber reset
  - tests/gateway/test_vision_memory_leak.py — 4 tests covering the
    vision auto-analysis boundary (clean pass-through, '## Honcho Context'
    header, fenced block, both patterns together)
2026-04-27 12:37:33 -07:00
..
__init__.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
conftest.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_413_compression.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_860_dedup.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_1630_context_overflow_loop.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_agent_guardrails.py fix(delegate): make max_concurrent_children configurable + error on excess 2026-04-10 13:38:14 -07:00
test_agent_loop.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_agent_loop_tool_calling.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_agent_loop_vllm.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_anthropic_error_handling.py feat(providers): extend request_timeout_seconds to all client paths 2026-04-19 11:23:00 -07:00
test_anthropic_prompt_cache_policy.py fix(cache): enable prompt caching for Qwen on OpenCode/OpenCode-Go/Alibaba (#13528) 2026-04-21 06:40:58 -07:00
test_anthropic_third_party_oauth_guard.py fix(anthropic): complete third-party Anthropic-compatible provider support (#12846) 2026-04-19 22:43:09 -07:00
test_anthropic_truncation_continuation.py refactor: remove _nr_to_assistant_message shim + fix flush_memories guard 2026-04-23 02:30:05 -07:00
test_api_max_retries_config.py feat(agent): make API retry count configurable via agent.api_max_retries (#14730) 2026-04-23 13:59:32 -07:00
test_async_httpx_del_neuter.py fix(copilot): send vision header for Copilot vision requests 2026-04-27 08:35:50 -07:00
test_background_review.py fix(approval): close remaining prompt_toolkit deadlock vectors (#15216) 2026-04-27 06:42:32 -07:00
test_background_review_summary.py fix(agent): exclude prior-history tool messages from background review summary 2026-04-24 03:10:19 -07:00
test_background_review_toolset_restriction.py fix(agent): restrict background review agent to memory and skills toolsets 2026-04-27 06:41:23 -07:00
test_compress_focus_plugin_fallback.py refactor(memory): remove flush_memories entirely (#15696) 2026-04-25 08:21:14 -07:00
test_compression_boundary.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_compression_boundary_hook.py fix: signal compression boundary to context engine 2026-04-26 19:07:18 -07:00
test_compression_feasibility.py refactor(memory): remove flush_memories entirely (#15696) 2026-04-25 08:21:14 -07:00
test_compression_persistence.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_compression_trigger_excludes_reasoning.py fix(compression): exclude completion tokens from compression trigger (#12026) 2026-04-20 05:12:10 -07:00
test_compressor_fallback_update.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_concurrent_interrupt.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_context_token_tracking.py feat(providers): extend request_timeout_seconds to all client paths 2026-04-19 11:23:00 -07:00
test_copilot_native_vision_headers.py fix(copilot): mark native image requests as vision 2026-04-27 08:35:50 -07:00
test_create_openai_client_kwargs_isolation.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_create_openai_client_proxy_env.py test(proxy): regression tests for NO_PROXY bypass on keepalive client 2026-04-24 03:04:42 -07:00
test_create_openai_client_reuse.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_deepseek_reasoning_content_echo.py fix(agent): block cross-provider reasoning leak to DeepSeek/Kimi (#15748) (#16500) 2026-04-27 04:06:23 -07:00
test_dict_tool_call_args.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_exit_cleanup_interrupt.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_fallback_model.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_image_shrink_recovery.py feat(image-input): native multimodal routing based on model vision capability (#16506) 2026-04-27 06:27:59 -07:00
test_interactive_interrupt.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_interrupt_propagation.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_invalid_context_length_warning.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_jsondecodeerror_retryable.py fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107) 2026-04-24 05:02:58 -07:00
test_long_context_tier_429.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_memory_provider_init.py fix(memory): keep Honcho provider opt-in 2026-04-18 22:50:55 -07:00
test_memory_sync_interrupted.py fix(memory): skip external-provider sync on interrupted turns (#15218) 2026-04-24 15:30:18 -07:00
test_openai_client_lifecycle.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_percentage_clamp.py fix: update 6 test files broken by dead code removal 2026-04-10 03:44:43 -07:00
test_plugin_context_engine_init.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_primary_runtime_restore.py fix(agent): only set rate-limit cooldown when leaving primary; add tests 2026-04-24 05:35:43 -07:00
test_provider_attribution_headers.py fix(providers): send user agent to routermint endpoints 2026-04-24 03:02:16 -07:00
test_provider_fallback.py fix(agent): fall back on rate limit when pool has no rotation room 2026-04-24 05:20:05 -07:00
test_provider_parity.py fix(agent): preserve Codex message items for replay 2026-04-25 18:22:06 -07:00
test_real_interrupt_subagent.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_redirect_stdout_issue.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_repair_tool_call_arguments.py fix(run_agent): handle unescaped control chars in tool_call arguments (#15356) 2026-04-24 15:06:41 -07:00
test_repair_tool_call_name.py fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124) 2026-04-24 05:32:08 -07:00
test_review_prompt_class_first.py feat(review): class-first skill review prompt (#16026) 2026-04-26 05:17:10 -07:00
test_run_agent.py fix: harden memory-context leak boundaries 2026-04-27 12:37:33 -07:00
test_run_agent_codex_responses.py fix(gateway): scrub memory-context leaks from vision auto-analysis output 2026-04-27 12:37:33 -07:00
test_run_agent_multimodal_prologue.py refactor: unify transport dispatch + collapse normalize shims 2026-04-22 18:34:25 -07:00
test_sequential_chats_live.py test: regression guards for the keepalive/transport bug class (#10933) (#11266) 2026-04-16 16:36:33 -07:00
test_session_meta_filtering.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_session_reset_fix.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_steer.py refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340) 2026-04-20 22:18:49 -07:00
test_stream_interrupt_retry.py fix: /stop now immediately aborts streaming retry loop 2026-04-25 09:51:39 -07:00
test_streaming.py fix(streaming): silent retry when stream dies mid tool-call (#14151) 2026-04-22 13:47:33 -07:00
test_streaming_tool_call_repair.py fix: repair malformed tool call args in streaming assembly before flagging as truncated 2026-04-24 15:03:07 -07:00
test_strict_api_validation.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_strip_reasoning_tags_cli.py fix(display): strip standalone tool-call XML tags from visible text 2026-04-22 18:12:42 -07:00
test_switch_model_context.py fix: pass config_context_length to switch_model context compressor 2026-04-10 05:52:45 -07:00
test_switch_model_fallback_prune.py fix(agent): default missing fallback chain on switch 2026-04-24 05:35:43 -07:00
test_token_persistence_non_cli.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_tool_arg_coercion.py test,chore: cover stringified array/object coercion + AUTHOR_MAP entry 2026-04-23 16:38:38 -07:00
test_tool_call_args_sanitizer.py fix(run_agent): repair corrupted tool_call arguments before sending to provider 2026-04-24 14:55:47 -07:00
test_unicode_ascii_codec.py fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization 2026-04-15 15:03:28 -07:00
test_vision_aware_preprocessing.py feat(image-input): native multimodal routing based on model vision capability (#16506) 2026-04-27 06:27:59 -07:00