Commit graph

2 commits

Erosika
e553f6f3e4 fix(memory): narrow scrub surface to known wrapper boundaries
Addresses reviewer pushback on the original boundary-hardening
commits, where four overreach points pulled plugin-specific policy
into shared core paths:

1. gateway/run.py hardcoded a '## Honcho Context' literal split for
   vision-LLM output.  That put a plugin-format heading in framework
   code and could truncate legitimate output that naturally contains
   the header.  Drop the literal split; keep the generic
   sanitize_context (the wrapper strip is plugin-agnostic).
   Plugin-specific cleanup belongs at the provider boundary, not the
   shared gateway path.

2. run_agent.run_conversation scrubbed user_message and
   persist_user_message before the conversation loop.  User text is
   sacred — if a user types a literal <memory-context> tag we must
   not silently delete it.  The producer (build_memory_context_block)
   is the only legitimate emitter; user input should never need the
   reverse op.

3. _build_assistant_message scrubbed model output before persistence.
   Same hazard: would silently mutate legitimate documentation/code
   the model emits containing the literal markers.  The streaming
   scrubber catches real leaks delta-by-delta before content is
   concatenated; persist-time scrub was redundant belt-and-suspenders.

4. _fire_stream_delta stripped leading newlines from every delta unless
   a paragraph break flag was set.  Mid-stream '\n' is legitimate
   markdown — lists, code fences, paragraph breaks — and chunk
   boundaries are arbitrary.  Narrow lstrip to the very first delta
   of the stream only (so stale provider preamble still gets cleaned
   on turn start, but mid-stream formatting survives).
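
Roughly, for point 4 (illustrative sketch only; the state holder and
emit callback are stand-ins, not the actual run_agent internals):

    class StreamState:
        def __init__(self):
            self.first_delta_seen = False

    def fire_stream_delta(state: StreamState, delta: str, emit) -> None:
        # Only the very first delta of the turn gets its leading
        # newlines stripped; later deltas pass through untouched so
        # mid-stream markdown (lists, fences, paragraph breaks) survives.
        if not state.first_delta_seen:
            state.first_delta_seen = True
            delta = delta.lstrip("\n")
        if delta:
            emit(delta)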

Plus: build_memory_context_block now logs a warning when its defensive
sanitize_context strips something — surfaces buggy providers returning
pre-wrapped text instead of silently double-fencing.
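
Sketch of that warn path (the function body, logger, and regex here
are illustrative assumptions; only sanitize_context, the
<memory-context> wrapper, and the warning itself come from the
change):

    import logging
    import re

    logger = logging.getLogger(__name__)

    # Assumed wrapper-tag pattern; only the tag name is given above.
    _WRAPPER_RE = re.compile(r"</?memory-context>", re.IGNORECASE)

    def sanitize_context(text: str) -> str:
        # Stand-in for the shared helper: drop pre-existing wrapper tags.
        return _WRAPPER_RE.sub("", text)

    def build_memory_context_block(provider_text: str) -> str:
        cleaned = sanitize_context(provider_text)
        if cleaned != provider_text:
            # Surface the buggy provider instead of silently re-fencing.
            logger.warning(
                "memory provider returned pre-wrapped context; "
                "stripped <memory-context> markers before wrapping"
            )
        return f"<memory-context>\n{cleaned}\n</memory-context>"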

Net architectural change: scrub surface collapses from 8 sites to 3
(StreamingContextScrubber on output deltas, plugin→backend send,
build_memory_context_block input-validation).  Plugin-specific strings
stay out of shared runtime paths.  User input and persisted assistant
output are no longer mutated.

Tests: rescoped TestMemoryContextSanitization (helper-correctness only,
no source-inspection of removed call sites), updated vision tests to
drop '## Honcho Context' literal-split assertions, updated
_build_assistant_message persistence test to assert preservation.
Added: cross-turn scrubber reset, build_memory_context_block warn-on-
violation, mid-stream newline preservation (plain + code fence).
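
The mid-stream newline case, expressed against the illustrative
fire_stream_delta sketch above (the real tests drive run_agent's
actual streaming path):

    def test_mid_stream_newlines_survive():
        state = StreamState()
        out = []
        for delta in ["\n\nHello", "\n\n- item one", "\n```python\npass\n```"]:
            fire_stream_delta(state, delta, out.append)
        joined = "".join(out)
        assert joined.startswith("Hello")          # turn-start lstrip applied
        assert "\n\n- item one" in joined          # paragraph break kept
        assert "\n```python\npass\n```" in joined  # code fence kept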
2026-04-27 12:37:33 -07:00
Erosika
3b2edb347d fix(gateway): scrub memory-context leaks from vision auto-analysis output
fixes #5719

The auxiliary vision LLM called by gateway._enrich_message_with_vision
can echo its injected Honcho system prompt back into the image
description.  That description gets embedded verbatim into the enriched
user message, so recalled memory (personal facts, dialectic output)
surfaces into a user-visible bubble.

Strips both forms of leak before embedding:
  - <memory-context>...</memory-context> fenced blocks (sanitize_context)
  - trailing '## Honcho Context' sections (header + everything after)
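
Illustrative sketch of the strip (the helper name and regexes are
assumptions; only the two patterns come from this change, and the
header strip is later narrowed by the follow-up commit above):

    import re

    _FENCED_BLOCK_RE = re.compile(
        r"<memory-context>.*?</memory-context>\s*",
        re.IGNORECASE | re.DOTALL,
    )
    _HONCHO_HEADER_RE = re.compile(r"## Honcho Context.*\Z", re.DOTALL)

    def scrub_vision_description(description: str) -> str:
        # Fenced <memory-context> blocks anywhere in the description.
        description = _FENCED_BLOCK_RE.sub("", description)
        # A '## Honcho Context' header and everything after it.
        description = _HONCHO_HEADER_RE.sub("", description)
        return description.strip()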

Plus regression tests:
  - tests/agent/test_streaming_context_scrubber.py — 13 tests on the
    stateful scrubber (whole block, split tags, false-positive partial
    tags, unterminated span, reset, case-insensitivity); a stand-in
    sketch of that behavior follows this list
  - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on
    _fire_stream_delta covering the realistic 7-chunk leak scenario and
    the cross-turn scrubber reset
  - tests/gateway/test_vision_memory_leak.py — 4 tests covering the
    vision auto-analysis boundary (clean pass-through, '## Honcho Context'
    header, fenced block, both patterns together)
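
For reference, a minimal stand-in showing the kind of stateful,
split-tag handling the scrubber tests above exercise (the class name
and internals are illustrative, not the real StreamingContextScrubber):

    class ScrubberSketch:
        OPEN = "<memory-context>"
        CLOSE = "</memory-context>"

        def __init__(self):
            self._pending = ""      # possible partial tag held back
            self._suppress = False  # inside a memory-context span?

        def reset(self):
            # Cross-turn reset: forget held-back text and suppression state.
            self._pending = ""
            self._suppress = False

        def feed(self, delta: str) -> str:
            text = self._pending + delta
            self._pending = ""
            out = []
            while text:
                tag = self.CLOSE if self._suppress else self.OPEN
                idx = text.lower().find(tag)
                if idx != -1:
                    if not self._suppress:
                        out.append(text[:idx])
                    text = text[idx + len(tag):]
                    self._suppress = not self._suppress
                    continue
                # No complete tag: hold back any suffix that could still
                # grow into one next delta (catches split tags without
                # swallowing text that merely resembles a partial tag).
                low, hold = text.lower(), 0
                for n in range(min(len(low), len(tag) - 1), 0, -1):
                    if tag.startswith(low[-n:]):
                        hold = n
                        break
                if hold:
                    self._pending, text = text[-hold:], text[:-hold]
                if not self._suppress:
                    out.append(text)
                text = ""
            return "".join(out)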
2026-04-27 12:37:33 -07:00