fix(openrouter): add x-grok-conv-id header for Grok models to improve prompt cache hit rates (carve-out of #22708)

Pass session_id through to provider profile build_api_kwargs_extras so
the OpenRouter profile can attach an xAI cache-affinity header
(x-grok-conv-id: <session-id>) for x-ai/grok-* models. xAI's prompt
cache relies on server affinity, which this header provides. Without
it, consecutive turns are not pinned to the same backend server, and
Grok prompt-cache hit rates drop sharply on multi-turn sessions.

Carve-out of #22708 by Ninso112. The original PR bundled a /diff
slash command, a zsh completion fix (already on main via #22802),
and holographic memory null-guards. This salvage keeps only the
Grok header work, which is small, targeted, and well-tested. The
other changes and their contributors are left for separate review.

Closes #22705.
Ninso112 2026-05-09 13:23:39 -07:00 committed by Teknium
parent 5e2eba87e6
commit 883e11f0a0
3 changed files with 61 additions and 2 deletions

@@ -53,16 +53,28 @@ class OpenRouterProfile(ProviderProfile):
         *,
         reasoning_config: dict | None = None,
         supports_reasoning: bool = False,
         model: str | None = None,
+        session_id: str | None = None,
         **context: Any,
     ) -> tuple[dict[str, Any], dict[str, Any]]:
-        """OpenRouter passes the full reasoning_config dict as extra_body.reasoning."""
+        """OpenRouter passes the full reasoning_config dict as extra_body.reasoning.
+        For xAI Grok models routed through OpenRouter, attach the
+        ``x-grok-conv-id`` header so that xAI's prompt cache stays pinned to
+        the same backend server across turns.
+        """
         extra_body: dict[str, Any] = {}
         if supports_reasoning:
             if reasoning_config is not None:
                 extra_body["reasoning"] = dict(reasoning_config)
             else:
                 extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
-        return extra_body, {}
+        extra_headers: dict[str, Any] = {}
+        if session_id and model and model.startswith(("x-ai/grok-", "xai/grok-")):
+            extra_headers["x-grok-conv-id"] = session_id
+        return extra_body, {"extra_headers": extra_headers} if extra_headers else {}
 openrouter = OpenRouterProfile(
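
The header logic in the hunk above can be exercised in isolation. The following is a minimal standalone sketch, not the project's code: the helper names `grok_affinity_headers` and `top_level_extras` are hypothetical, and the model slugs in the usage note are illustrative. Only the two prefixes and the header name are taken from the diff.

```python
from __future__ import annotations

from typing import Any

# Prefixes and header name copied from the diff; everything else is illustrative.
GROK_PREFIXES = ("x-ai/grok-", "xai/grok-")
GROK_CONV_HEADER = "x-grok-conv-id"


def grok_affinity_headers(model: str | None, session_id: str | None) -> dict[str, str]:
    """Return the cache-affinity header for Grok models, else an empty dict."""
    # Both values must be present and non-empty, and the model must be a Grok slug.
    if session_id and model and model.startswith(GROK_PREFIXES):
        return {GROK_CONV_HEADER: session_id}
    return {}


def top_level_extras(model: str | None, session_id: str | None) -> dict[str, Any]:
    """Mirror the second element of the tuple returned in the diff.

    The "extra_headers" key is omitted entirely when no header applies,
    so non-Grok requests see no change in their kwargs.
    """
    headers = grok_affinity_headers(model, session_id)
    return {"extra_headers": headers} if headers else {}
```

For example, `top_level_extras("x-ai/grok-4", "sess-1")` yields `{"extra_headers": {"x-grok-conv-id": "sess-1"}}`, while a non-Grok slug or a missing session id yields `{}`. Returning an empty dict rather than an empty `extra_headers` entry keeps the request kwargs unchanged for every other model.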