fix(gateway): always inject reply-to pointer, not just when quoted text is absent (#13676)

The [Replying to: "..."] prefix is disambiguation, not deduplication. When a user explicitly replies to a prior message, the agent needs a pointer to which specific message they're referencing — even when the quoted text already exists somewhere in history. History can contain the same or similar text multiple times; without an explicit pointer the agent has to guess (or answer for both subjects), and the reply signal is silently dropped. Example: in a conversation comparing Japan and Italy, replying to the "Japan is great for culture..." message and asking "What's the best time to go?" — previously the found_in_history check suppressed the prefix because the quoted text was already in history, leaving the agent to guess which destination the user meant. Now the pointer is always present. Drops the found_in_history guard added in #1594. Token overhead is minimal (snippet capped at 500 chars on the new user turn; cached prefix unaffected). Behavior becomes deterministic: reply sent ⇒ pointer present. Thanks to smartyi for flagging this.
2026-04-25 00:51:20 +00:00 · 2026-04-21 13:33:02 -07:00 · 2026-04-21 13:33:02 -07:00 · e889332c99
commit e889332c99
parent 7ff7155cbd
2 changed files with 166 additions and 7 deletions
--- a/gateway/run.py
+++ b/gateway/run.py
@ -3887,14 +3887,14 @@ class GatewayRunner:
                message_text = f"{context_note}\n\n{message_text}"

        if getattr(event, "reply_to_text", None) and event.reply_to_message_id:
+            # Always inject the reply-to pointer — even when the quoted text
+            # already appears in history. The prefix isn't deduplication, it's
+            # disambiguation: it tells the agent *which* prior message the user
+            # is referencing. History can contain the same or similar text
+            # multiple times, and without an explicit pointer the agent has to
+            # guess (or answer for both subjects). Token overhead is minimal.
            reply_snippet = event.reply_to_text[:500]
-            found_in_history = any(
-                reply_snippet[:200] in (msg.get("content") or "")
-                for msg in history
-                if msg.get("role") in ("assistant", "user", "tool")
-            )
-            if not found_in_history:
-                message_text = f'[Replying to: "{reply_snippet}"]\n\n{message_text}'
+            message_text = f'[Replying to: "{reply_snippet}"]\n\n{message_text}'

        if "@" in message_text:
            try: