feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619)

Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto
current main with minimal core modifications.

Plugin changes (plugins/memory/honcho/):
- New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context)
- Two-layer context injection: base context (summary + representation + card)
  on contextCadence, dialectic supplement on dialecticCadence
- Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal
- Cold/warm prompt selection based on session state
- dialecticCadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls
- Session summary injection for conversational continuity
- Bidirectional peer targeting on all 5 tools
- Correctness fixes: peer param fallback, None guard on set_peer_card,
  schema validation, signal_sufficient anchored regex, mid->medium level fix
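The cadence gating described above can be sketched as follows. This is an illustrative sketch only: `contextCadence` and `dialecticCadence` are the plugin config names from this commit, but the class and method names here are hypothetical, not the plugin's actual code.

```python
from dataclasses import dataclass

@dataclass
class CadenceConfig:
    context_cadence: int = 1    # base layer (summary + representation + card) every turn
    dialectic_cadence: int = 3  # dialectic supplement every 3rd turn (new default)

class CadenceTracker:
    """Hypothetical sketch: tracks user turns and gates the two context layers."""

    def __init__(self, config: CadenceConfig):
        self.config = config
        self.turn_count = 0

    def on_turn_start(self) -> None:
        # Must be called once per user turn, before any prefetch.
        self.turn_count += 1

    def should_refresh_context(self) -> bool:
        # Cheap base layer: fires on contextCadence.
        return self.turn_count % self.config.context_cadence == 0

    def should_run_dialectic(self) -> bool:
        # Expensive layer (Honcho LLM calls): gated more coarsely.
        return self.turn_count % self.config.dialectic_cadence == 0

tracker = CadenceTracker(CadenceConfig())
for _ in range(3):
    tracker.on_turn_start()
```

With `dialectic_cadence=3` the dialectic fires on turns 3, 6, 9, ... instead of every turn, which is where the "~66% fewer Honcho LLM calls" figure comes from.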

Core changes (~20 lines across 3 files):
- agent/memory_manager.py: Enhanced sanitize_context() to strip full
  <memory-context> blocks and system notes (prevents leak from saveMessages)
- run_agent.py: gateway_session_key param for stable per-chat Honcho sessions,
  on_turn_start() call before prefetch_all() for cadence tracking,
  sanitize_context() on user messages to strip leaked memory blocks
- gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions),
  gateway_session_key threading to main agent
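The sanitize_context() behavior described above (and exercised by the new tests below) can be sketched with a minimal regex-based version. This is an assumption-laden sketch, not the actual agent/memory_manager.py implementation, which may use a different stripping strategy.

```python
import re

# Sketch: remove whole <memory-context>...</memory-context> blocks and any
# bracketed "[System note: ...]" annotations leaked into user input.
MEMORY_BLOCK_RE = re.compile(r"<memory-context>.*?</memory-context>", re.DOTALL)
SYSTEM_NOTE_RE = re.compile(r"\[System note:.*?\]", re.DOTALL)

def sanitize_context(text: str) -> str:
    """Strip leaked memory-context blocks and system notes from a user message."""
    text = MEMORY_BLOCK_RE.sub("", text)
    text = SYSTEM_NOTE_RE.sub("", text)
    return text.strip()
```

Applying this to a message with an embedded block leaves only the user's actual text, which is exactly what TestMemoryContextSanitization asserts.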

Tests: 509 passed (3 skipped — honcho SDK not installed locally)
Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md

Co-authored-by: erosika <erosika@users.noreply.github.com>
Teknium 2026-04-15 19:12:19 -07:00 committed by GitHub
parent 00ff9a26cd
commit cc6e8941db
17 changed files with 2632 additions and 396 deletions


@@ -3998,3 +3998,63 @@ class TestDeadRetryCode:
            f"Expected 2 occurrences of 'if retry_count >= max_retries:' "
            f"but found {occurrences}"
        )
class TestMemoryContextSanitization:
    """run_conversation() must strip leaked <memory-context> blocks from user input."""

    def test_memory_context_stripped_from_user_message(self):
        """Verify that <memory-context> blocks are removed before the message
        enters the conversation loop, preventing stale Honcho injection from
        leaking into user text."""
        import inspect

        src = inspect.getsource(AIAgent.run_conversation)
        # The sanitize_context call must appear in run_conversation's preamble
        assert "sanitize_context(user_message)" in src
        assert "sanitize_context(persist_user_message)" in src

    def test_sanitize_context_strips_full_block(self):
        """End-to-end: a user message with an embedded memory-context block
        is cleaned to just the actual user text."""
        from agent.memory_manager import sanitize_context

        user_text = "how is the honcho working"
        injected = (
            user_text + "\n\n"
            "<memory-context>\n"
            "[System note: The following is recalled memory context, "
            "NOT new user input. Treat as informational background data.]\n\n"
            "## User Representation\n"
            "[2026-01-13 02:13:00] stale observation about AstroMap\n"
            "</memory-context>"
        )
        result = sanitize_context(injected)
        assert "memory-context" not in result.lower()
        assert "stale observation" not in result
        assert "how is the honcho working" in result
class TestMemoryProviderTurnStart:
    """run_conversation() must call memory_manager.on_turn_start() before prefetch_all().

    Without this call, providers like Honcho never update _turn_count, so cadence
    checks (contextCadence, dialecticCadence) are always satisfied: every turn
    fires both context refresh and dialectic, ignoring the configured cadence.
    """

    def test_on_turn_start_called_before_prefetch(self):
        """Source-level check: on_turn_start appears before prefetch_all in run_conversation."""
        import inspect

        src = inspect.getsource(AIAgent.run_conversation)
        # Find the actual method calls, not comments
        idx_turn_start = src.index(".on_turn_start(")
        idx_prefetch = src.index(".prefetch_all(")
        assert idx_turn_start < idx_prefetch, (
            "on_turn_start() must be called before prefetch_all() in run_conversation "
            "so that memory providers have the correct turn count for cadence checks"
        )

    def test_on_turn_start_uses_user_turn_count(self):
        """Source-level check: on_turn_start receives self._user_turn_count."""
        import inspect

        src = inspect.getsource(AIAgent.run_conversation)
        assert "on_turn_start(self._user_turn_count" in src