fix(gateway): stop per-turn agent-cache eviction from model + message_id signature churn

Two independent bugs evicted the cached gateway AIAgent on every turn,
preventing the prompt cache from ever warming:

1. Model normalization mismatch: the post-run fallback-eviction check
   compared _agent.model (stripped in AIAgent.__init__) against the raw
   _resolve_gateway_model() config string. For vendor-prefixed config on
   native providers (e.g. 'deepseek/deepseek-v4-pro' vs 'deepseek-v4-pro')
   this was always unequal, so the agent was evicted after every
   successful run. Normalize _cfg_model the same way (skip aggregators).

2. Discord triggering message_id leaked into the cached system prompt via
   build_session_context_prompt()'s Discord IDs block. message_id changes
   every turn, so the agent-cache signature (computed from the ephemeral
   prompt) changed every Discord turn -> rebuild every message. The id is
   now injected per-turn into the user message (where per-turn content
   belongs and does not touch the cache signature); the cached IDs block
   carries a static pointer to it, preserving reply/react/pin via the
   discord tools.

Adapted from #28846. Bug #1 fix is the contributor's; bug #2 reworked to
be non-destructive (keeps the triggering-id capability instead of deleting
it). Redundant auto-reset eviction (already on main via #9893/#48031) and
the wrong-premise reset_context_note plumbing from the original PR were
dropped.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
This commit is contained in:
fayenix 2026-06-30 03:41:23 -07:00 committed by Teknium
parent e7ca53e6b8
commit d6c53dcdcb
4 changed files with 91 additions and 1 deletions

View file

@ -47,6 +47,7 @@ ACP_REGISTRY_MANIFEST = REPO_ROOT / "acp_registry" / "agent.json"
AUTHOR_MAP = {
"193368749+jimmyjohansson84@users.noreply.github.com": "jimmyjohansson84", # PR #27123 salvage (Kanban unknown-skill warn-instead-of-crash; #27136)
"gxalong@gmail.com": "Jeffgithub0029", # PR #28558 salvage (chunk Telegram text *after* MarkdownV2/HTML formatting so escaping inflation can't push a send over the 4096 UTF-16 limit; #28557)
"273238055+fayenix@users.noreply.github.com": "fayenix", # PR #28846 salvage (normalize _cfg_model in gateway fallback-eviction so vendor-prefixed config matches stripped agent.model on native providers)
"phanvanhoa@gmail.com": "theAgenticBuilder", # PR #14180 salvage (route delegate_task progress lines through _safe_print so ACP stdio JSON-RPC frames stay clean)
"huangxudong663@gmail.com": "huangxudong663-sys", # PR #15157 salvage (isinstance(dict) guard on tool-call model_extra; NVIDIA NIM non-dict crash)
"39369769+jasonQin6@users.noreply.github.com": "jasonQin6", # PR #15093 salvage (session staleness guard on stream consumer run() loop; #11016 follow-up)