fix(openrouter): add x-grok-conv-id header for Grok models to improve prompt cache hit rates (carve-out of #22708)

Pass session_id through to the provider profile's build_api_kwargs_extras so the OpenRouter profile can attach an xAI cache-affinity header (x-grok-conv-id: <session-id>) for x-ai/grok-* models. The xAI prompt cache requires server affinity via this header; without it, requests lose that affinity and Grok prompt-cache hit rates drop dramatically on multi-turn sessions.

Carve-out of #22708 by Ninso112. The original PR bundled a /diff slash command, a zsh completion fix (already on main via #22802), and holographic memory null-guards. This salvage keeps just the Grok header work, which is small, targeted, and well-tested. Other contributors' changes are preserved for separate review.

Closes #22705.
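
The header logic described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the two-argument signature, the `extra_headers` return shape, and the constant names are assumptions made for the example, and the real build_api_kwargs_extras takes more parameters than shown here.

```python
from typing import Any, Optional

# Prefix and header name are taken from the commit message; the constant
# names themselves are hypothetical.
GROK_MODEL_PREFIX = "x-ai/grok-"
GROK_CONV_HEADER = "x-grok-conv-id"


def build_api_kwargs_extras(
    model: str, session_id: Optional[str] = None
) -> dict[str, Any]:
    """Simplified, hypothetical stand-in for the OpenRouter profile hook.

    Returns extra request kwargs. The header is attached only when a
    session id is available and the model is an x-ai/grok-* model.
    """
    extras: dict[str, Any] = {}
    if session_id and model.startswith(GROK_MODEL_PREFIX):
        # Assumed shape: headers are forwarded under an "extra_headers" key.
        extras["extra_headers"] = {GROK_CONV_HEADER: session_id}
    return extras
```

In this sketch, non-Grok models and calls without a session id get an empty dict back, so existing request construction is unaffected.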
2026-05-18 04:41:56 +00:00 · 2026-05-09 13:23:39 -07:00
commit 883e11f0a0
parent 5e2eba87e6
3 changed files with 61 additions and 2 deletions
--- a/agent/transports/chat_completions.py
+++ b/agent/transports/chat_completions.py
@@ -448,6 +448,7 @@ class ChatCompletionsTransport(ProviderTransport):
                qwen_session_metadata=params.get("qwen_session_metadata"),
                model=model,
                ollama_num_ctx=params.get("ollama_num_ctx"),
+                session_id=params.get("session_id"),
            )
        )
        api_kwargs.update(top_level_from_profile)