fix(xai): restore encrypted reasoning replay across turns

xAI partner integration requires Hermes to thread `encrypted_content` reasoning items back to the Responses API on every turn so Grok can maintain cross-turn reasoning coherence. PR #26644 (May 15) gated this off for `is_xai_responses` on the theory that the OAuth/SuperGrok surface rejected replayed encrypted blobs and produced the multi-turn "Expected to have received \`response.created\` before \`error\`" failure. That diagnosis was wrong — the prelude-SSE fallback added in the same PR is what actually fixed that failure mode. Suppressing the replay was an unnecessary side-effect that broke the whole point of xAI's partnership integration. Changes: - agent/codex_responses_adapter.py — drop the `is_xai_responses` gate in `_chat_messages_to_responses_input`. Keep the kwarg in the signature for transport compatibility; update the docstring to document the May 2026 reversal. - agent/transports/codex.py — restore `kwargs["include"] = ["reasoning.encrypted_content"]` on the xAI Responses path so xAI echoes encrypted reasoning back to us. - tests/run_agent/test_codex_xai_oauth_recovery.py — flip the three xAI assertions (now: xAI MUST receive replayed reasoning AND we MUST include encrypted_content in the request). - tests/agent/transports/test_codex_transport.py — flip the `include` assertions on `test_xai_reasoning_effort_passed` and `test_xai_grok_4_omits_reasoning_effort`; update the allowlist block comment. The prelude-SSE fallback and the entitlement-403 surfacing fixes from #26644 are untouched — they were independent fixes that happened to ride along with the reasoning-replay gate. Validation: - Targeted: tests/run_agent/test_codex_xai_oauth_recovery.py + tests/agent/transports/test_codex_transport.py → 65/65 pass - Broader: tests/agent/transports/ + tests/run_agent/ → 1674 passed, 3 skipped, 0 failures - E2E (real imports, isolated HERMES_HOME, ResponsesApiTransport build_kwargs): turn-1 request carries `include: ["reasoning.encrypted_content"]`; turn-2 input replays the encrypted_content blob from turn-1's `codex_reasoning_items`; native Codex unchanged.
2026-05-29 06:31:32 +00:00 · 2026-05-20 22:14:18 -07:00 · 2026-05-20 22:14:18 -07:00 · b4afc6546e
commit b4afc6546e
parent 127b56a61a
4 changed files with 67 additions and 57 deletions
--- a/agent/codex_responses_adapter.py
+++ b/agent/codex_responses_adapter.py
@ -251,13 +251,16 @@ def _chat_messages_to_responses_input(
 ) -> List[Dict[str, Any]]:
    """Convert internal chat-style messages to Responses input items.

-    ``is_xai_responses=True`` strips ``encrypted_content`` from replayed
-    reasoning items.  xAI's OAuth/SuperGrok ``/v1/responses`` surface
-    rejects encrypted reasoning blobs minted by prior turns: the request
-    streams an ``error`` SSE frame before ``response.created`` and the
-    OpenAI SDK collapses it into a generic stream-ordering error.  Native
-    Codex (chatgpt.com backend-api) DOES accept replayed encrypted_content
-    — keep the default off.
+    ``is_xai_responses`` is kept for transport signature compatibility but
+    no longer suppresses encrypted reasoning replay.  Earlier (PR #26644,
+    May 2026) we believed xAI's OAuth/SuperGrok ``/v1/responses`` surface
+    rejected replayed ``encrypted_content`` reasoning items minted by
+    prior turns, and we stripped them.  That decision was wrong — xAI
+    explicitly relies on Hermes threading encrypted reasoning back across
+    turns for cross-turn coherence (the whole point of their partnership
+    integration).  We now replay encrypted reasoning on every Responses
+    transport (xAI, native Codex, custom relays) and let xAI tell us
+    explicitly if a specific surface ever rejects a payload.
    """
    items: List[Dict[str, Any]] = []
    seen_item_ids: set = set()
@ -284,17 +287,12 @@ def _chat_messages_to_responses_input(
            if role == "assistant":
                # Replay encrypted reasoning items from previous turns
                # so the API can maintain coherent reasoning chains.
-                #
-                # xAI OAuth (SuperGrok/Premium) rejects replayed
-                # ``encrypted_content`` reasoning items minted by prior
-                # turns — see _chat_messages_to_responses_input docstring.
-                # When ``is_xai_responses`` is set we drop the replay
-                # entirely; Grok still reasons on each turn server-side,
-                # we just don't try to thread the prior turn's encrypted
-                # blob back in.
+                # This applies to every Responses transport including
+                # xAI — see _chat_messages_to_responses_input docstring
+                # for the May 2026 reversal of the earlier xAI gate.
                codex_reasoning = msg.get("codex_reasoning_items")
                has_codex_reasoning = False
-                if isinstance(codex_reasoning, list) and not is_xai_responses:
+                if isinstance(codex_reasoning, list):
                    for ri in codex_reasoning:
                        if isinstance(ri, dict) and ri.get("encrypted_content"):
                            item_id = ri.get("id")