feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider

Adds a new authentication provider that lets SuperGrok subscribers sign
in to Hermes with their xAI account via the standard OAuth 2.0 PKCE
loopback flow, instead of pasting a raw API key from console.x.ai.

Highlights
----------
* OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery,
  state/nonce, and a strict CORS-origin allowlist on the callback.
* Authorize URL carries `plan=generic` (required for non-allowlisted
  loopback clients) and `referrer=hermes-agent` for best-effort
  attribution in xAI's OAuth server logs.
* Token storage in `auth.json` with file-locked atomic writes; JWT
  `exp`-based expiry detection with skew; refresh-token rotation
  synced both ways between the singleton store and the credential
  pool so multi-process / multi-profile setups don't clobber each
  other's refresh tokens.
* Reactive 401 retry: on a 401 from the xAI Responses API, the agent
  refreshes the token, swaps it back into `self.api_key`, and retries
  the call once. Guarded against silent account swaps when the active
  key was sourced from a different (manual) pool entry.
* Auxiliary tasks (curator, vision, embeddings, etc.) route through a
  dedicated xAI Responses-mode auxiliary client instead of falling back
  to OpenRouter billing.
* Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen
  plugin) resolve credentials through a unified runtime → singleton →
  env-var fallback chain so xai-oauth users get them for free.
* `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are
  wired through the standard auth-commands surface; remove cleans up
  the singleton loopback_pkce entry so it isn't silently re-seeded.
* `hermes model` provider picker shows
  "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls
  back to pool credentials when the singleton is missing.
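
The reactive 401 retry described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Hermes implementation: `UnauthorizedError`, `call_with_refresh`, and the `refresh` callback are hypothetical names, and the account-swap guard is reduced to a single `refreshable` flag.

```python
from typing import Callable, Optional


class UnauthorizedError(Exception):
    """Hypothetical stand-in for an HTTP 401 from the Responses API."""


def call_with_refresh(
    send: Callable[[str], str],
    api_key: str,
    refresh: Callable[[], Optional[str]],
    refreshable: bool = True,
) -> tuple[str, str]:
    """Call send(api_key); on a 401, refresh once and retry once.

    Returns (response, api_key_in_use). `refreshable` models the guard
    against swapping accounts when the key came from a manual pool entry.
    """
    try:
        return send(api_key), api_key
    except UnauthorizedError:
        if not refreshable:
            raise
        new_key = refresh()
        if not new_key:
            raise
        # Single retry: a second 401 propagates to the caller.
        return send(new_key), new_key
```

Returning the key in use lets the caller store it back (the commit does this via `self.api_key`), so later requests do not repeat the failed call with the stale token.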

Hardening
---------
* Discovery and refresh responses validate the returned
  `token_endpoint` host against the same `*.x.ai` allowlist as the
  authorization endpoint, blocking MITM persistence of a hostile
  endpoint.
* Discovery / refresh / token-exchange `response.json()` calls are
  wrapped to raise typed `AuthError` on malformed bodies (captive
  portals, proxy error pages) instead of leaking JSONDecodeError
  tracebacks.
* `prompt_cache_key` is routed through `extra_body` on the codex
  transport (sending it as a top-level kwarg trips xAI's SDK with a
  TypeError).
* Credential-pool sync-back preserves `active_provider` so refreshing
  an OAuth entry doesn't silently flip the active provider out from
  under the running agent.
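
As a rough sketch of the malformed-body guard listed above: the idea is to convert a decode failure (for example an HTML captive-portal page) into a typed error. The `AuthError` class here is a simplified stand-in and `parse_json_object` is a hypothetical helper; the real code wraps `response.json()` rather than `json.loads`.

```python
import json


class AuthError(Exception):
    """Simplified stand-in for the typed error mentioned above."""

    def __init__(self, message: str, *, code: str) -> None:
        super().__init__(message)
        self.code = code


def parse_json_object(body: str, *, code: str) -> dict:
    """Decode a response body, raising AuthError instead of a raw decode error."""
    try:
        payload = json.loads(body)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        raise AuthError("Response body was not valid JSON.", code=code) from exc
    if not isinstance(payload, dict):
        raise AuthError("Response JSON was not an object.", code=code)
    return payload
```

Checking for a JSON object as well as valid JSON means a body such as `[]` or `"ok"` also surfaces as an `AuthError` rather than failing later on a `.get()` call.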

Testing
-------
* New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests)
  covers JWT expiry, OAuth URL params (plan + referrer), CORS origins,
  redirect URI validation, singleton↔pool sync, concurrency races,
  refresh error paths, runtime resolution, and malformed-JSON guards.
* Extended `test_credential_pool.py`, `test_codex_transport.py`, and
  `test_run_agent_codex_responses.py` cover the pool sync-back,
  `extra_body` routing, and 401 reactive refresh paths.
* 165 tests passing on this branch via `scripts/run_tests.sh`.
Jaaneek 2026-05-15 16:10:38 +01:00 committed by Teknium
parent 9fb40e6a3d
commit b62c997973
27 changed files with 3843 additions and 131 deletions


@ -1254,6 +1254,30 @@ def _resolve_nous_runtime_api(*, force_refresh: bool = False) -> Optional[tuple[
return api_key, base_url
def _resolve_xai_oauth_for_aux() -> Optional[Tuple[str, str]]:
    """Resolve a fresh xAI OAuth (api_key, base_url) for auxiliary clients.

    Routes through ``hermes_cli.auth``'s runtime resolver so the auto-refresh
    path is shared with the main agent, instead of relying on whatever raw
    tokens happen to be sitting in auth.json or the credential pool. Returns
    ``None`` if the user is not authenticated with xAI Grok OAuth (so
    ``_resolve_auto`` Step 1 falls through to the next provider in the chain).
    """
    try:
        from hermes_cli.auth import resolve_xai_oauth_runtime_credentials
        creds = resolve_xai_oauth_runtime_credentials()
    except Exception as exc:
        logger.debug("Auxiliary xAI OAuth runtime credential resolution failed: %s", exc)
        return None
    api_key = str(creds.get("api_key") or "").strip()
    base_url = str(creds.get("base_url") or "").strip().rstrip("/")
    if not api_key or not base_url:
        return None
    return api_key, base_url
def _read_codex_access_token() -> Optional[str]:
"""Read a valid, non-expired Codex OAuth access token from Hermes auth store.
@ -1744,6 +1768,32 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
return _fallback_client, model
def _build_xai_oauth_aux_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
"""Build a CodexAuxiliaryClient for an xAI Grok OAuth-authenticated session.
xAI's ``/v1/responses`` endpoint speaks the OpenAI Responses API, so we
wrap a plain ``OpenAI`` client in ``CodexAuxiliaryClient`` to translate
``chat.completions.create()`` calls into ``responses.stream()`` requests.
The caller must pass an explicit model; pinning a default for Grok
would silently rot when xAI's allowlist drifts. Returns ``(None, None)``
when the user has not authenticated with xAI Grok OAuth.
"""
if not model:
logger.warning(
"Auxiliary client: xai-oauth requested without a model; "
"pass model explicitly (auxiliary.<task>.model in config.yaml)."
)
return None, None
resolved = _resolve_xai_oauth_for_aux()
if resolved is None:
return None, None
api_key, base_url = resolved
logger.debug("Auxiliary client: xAI OAuth (%s via Responses API)", model)
real_client = OpenAI(api_key=api_key, base_url=base_url)
return CodexAuxiliaryClient(real_client, model), model
def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
"""Build a CodexAuxiliaryClient for an explicitly-requested model.
@ -2851,6 +2901,26 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── xAI Grok OAuth (loopback PKCE → Responses API) ───────────────
# Without this branch, an xai-oauth main provider falls through to the
# generic ``oauth_external`` arm below and returns ``(None, None)``,
# silently re-routing every auxiliary task (compression, web extract,
# session search, curator, etc.) to whatever Step-2 fallback the user
# has configured. Users on xAI Grok OAuth would then see surprise
# OpenRouter / Nous bills for side tasks they thought were running on
# their xAI subscription.
if provider == "xai-oauth":
client, default = _build_xai_oauth_aux_client(model)
if client is None:
logger.warning(
"resolve_provider_client: xai-oauth requested but no xAI "
"OAuth token found (run: hermes model -> xAI Grok OAuth — SuperGrok Subscription)"
)
return None, None
final_model = _normalize_resolved_model(model or default, provider)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY) ───────────
if provider == "custom":
if explicit_base_url:
@ -3201,6 +3271,8 @@ def resolve_provider_client(
return resolve_provider_client("nous", model, async_mode)
if provider == "openai-codex":
return resolve_provider_client("openai-codex", model, async_mode)
if provider == "xai-oauth":
return resolve_provider_client("xai-oauth", model, async_mode)
# Other OAuth providers not directly supported
logger.warning("resolve_provider_client: OAuth provider %s not "
"directly supported, try 'auto'", provider)


@ -726,7 +726,7 @@ def _preflight_codex_api_kwargs(
"model", "instructions", "input", "tools", "store",
"reasoning", "include", "max_output_tokens", "temperature",
"tool_choice", "parallel_tool_calls", "prompt_cache_key", "service_tier",
"extra_headers",
"extra_headers", "extra_body",
}
normalized: Dict[str, Any] = {
"model": model,
@ -776,6 +776,19 @@ def _preflight_codex_api_kwargs(
if normalized_headers:
normalized["extra_headers"] = normalized_headers
extra_body = api_kwargs.get("extra_body")
if extra_body is not None:
if not isinstance(extra_body, dict):
raise ValueError("Codex Responses request 'extra_body' must be an object.")
# Pass extra_body through verbatim — used by xAI Responses to
# carry `prompt_cache_key` as a body-level field (the documented
# cache-routing surface on /v1/responses). The openai SDK
# serializes extra_body into the JSON body without per-field
# type checks, so it survives Responses.stream() kwarg-signature
# changes that would otherwise raise TypeError before the wire.
if extra_body:
normalized["extra_body"] = dict(extra_body)
if allow_stream:
stream = api_kwargs.get("stream")
if stream is not None and stream is not True:


@ -29,6 +29,7 @@ from hermes_cli.auth import (
_resolve_zai_base_url,
_save_auth_store,
_save_provider_state,
_store_provider_state,
read_credential_pool,
write_credential_pool,
)
@ -539,6 +540,64 @@ class CredentialPool:
logger.debug("Failed to sync Codex entry from auth.json: %s", exc)
return entry
def _sync_xai_oauth_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
"""Sync an xAI OAuth pool entry from auth.json if tokens differ.
xAI OAuth refresh tokens are single-use. When another Hermes process
(or another profile sharing the same auth.json) refreshes the token,
it writes the new pair to ``providers["xai-oauth"]["tokens"]`` under
``_auth_store_lock``. Without this resync, our in-memory pool entry
keeps the consumed refresh_token and the next ``_refresh_entry`` call
would replay it and get a ``refresh_token_reused``-style 4xx.
Only applies to entries seeded from the singleton (``loopback_pkce``);
manually added entries (``manual:xai_pkce``) are independent
credentials with their own refresh-token lifecycle.
"""
if self.provider != "xai-oauth" or entry.source != "loopback_pkce":
return entry
try:
with _auth_store_lock():
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "xai-oauth")
if not isinstance(state, dict):
return entry
tokens = state.get("tokens")
if not isinstance(tokens, dict):
return entry
store_access = tokens.get("access_token", "")
store_refresh = tokens.get("refresh_token", "")
entry_access = entry.access_token or ""
entry_refresh = entry.refresh_token or ""
if store_access and (
store_access != entry_access
or (store_refresh and store_refresh != entry_refresh)
):
logger.debug(
"Pool entry %s: syncing xAI OAuth tokens from auth.json "
"(refreshed by another process)",
entry.id,
)
field_updates: Dict[str, Any] = {
"access_token": store_access,
"refresh_token": store_refresh or entry.refresh_token,
"last_status": None,
"last_status_at": None,
"last_error_code": None,
"last_error_reason": None,
"last_error_message": None,
"last_error_reset_at": None,
}
if state.get("last_refresh"):
field_updates["last_refresh"] = state["last_refresh"]
updated = replace(entry, **field_updates)
self._replace_entry(entry, updated)
self._persist()
return updated
except Exception as exc:
logger.debug("Failed to sync xAI OAuth entry from auth.json: %s", exc)
return entry
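
The "adopt fresher tokens if they differ" decision in `_sync_xai_oauth_entry_from_auth_store` can be illustrated in isolation. This is a sketch only: `Entry` is a hypothetical, trimmed-down stand-in for `PooledCredential`, and locking, persistence, and status-field resets are left out.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical, trimmed-down pool entry."""
    access_token: str
    refresh_token: Optional[str]


def adopt_store_tokens(entry: Entry, store_access: str, store_refresh: str) -> Entry:
    """Return an updated entry if the store holds different tokens, else the same object."""
    differs = store_access and (
        store_access != entry.access_token
        or (store_refresh and store_refresh != (entry.refresh_token or ""))
    )
    if not differs:
        return entry
    return replace(
        entry,
        access_token=store_access,
        # Keep the old refresh token if the store has none.
        refresh_token=store_refresh or entry.refresh_token,
    )
```

Returning the same object when nothing changed is what lets callers test `synced is not entry` to detect that another process rotated the tokens.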
def _sync_nous_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
"""Sync a Nous pool entry from auth.json if tokens differ.
@ -604,9 +663,22 @@ class CredentialPool:
re-seeding a consumed single-use refresh token.
Applies to any OAuth provider whose singleton lives in auth.json
(currently Nous and OpenAI Codex).
(currently Nous, OpenAI Codex, and xAI Grok OAuth).
``set_active=False`` on every write: a pool sync-back is a
token-rotation side effect, not the user choosing a provider.
Using ``_save_provider_state`` (which sets ``active_provider``)
here would mean every Nous/Codex/xAI refresh in a multi-provider
setup silently flips the ``active_provider`` flag; the next
``hermes`` invocation that defaults to the active provider
(e.g. setup wizard, ``hermes auth status``) would land on
whatever provider happened to refresh last, not whatever the
user actually chose.
"""
if entry.source != "device_code":
# Only sync entries that were seeded *from* a singleton. Manually
# added pool entries (source="manual:*") are independent credentials
# and must not write back to the singleton.
if entry.source not in {"device_code", "loopback_pkce"}:
return
try:
with _auth_store_lock():
@ -632,7 +704,7 @@ class CredentialPool:
state[extra_key] = val
if entry.inference_base_url:
state["inference_base_url"] = entry.inference_base_url
_save_provider_state(auth_store, "nous", state)
_store_provider_state(auth_store, "nous", state, set_active=False)
elif self.provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
@ -646,7 +718,21 @@ class CredentialPool:
tokens["refresh_token"] = entry.refresh_token
if entry.last_refresh:
state["last_refresh"] = entry.last_refresh
_save_provider_state(auth_store, "openai-codex", state)
_store_provider_state(auth_store, "openai-codex", state, set_active=False)
elif self.provider == "xai-oauth":
state = _load_provider_state(auth_store, "xai-oauth")
if not isinstance(state, dict):
return
tokens = state.get("tokens")
if not isinstance(tokens, dict):
return
tokens["access_token"] = entry.access_token
if entry.refresh_token:
tokens["refresh_token"] = entry.refresh_token
if entry.last_refresh:
state["last_refresh"] = entry.last_refresh
_store_provider_state(auth_store, "xai-oauth", state, set_active=False)
else:
return
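
The effect of `set_active=False` can be shown with a toy store. The sketch below is an assumption-laden illustration: the `providers` / `active_provider` layout is inferred from the docstrings in this diff, and `store_provider_state` is a hypothetical helper, not the signature of the real `_store_provider_state`.

```python
from typing import Any, Dict


def store_provider_state(
    auth_store: Dict[str, Any],
    provider: str,
    state: Dict[str, Any],
    *,
    set_active: bool = True,
) -> None:
    """Hypothetical sketch: write provider state, optionally marking it active."""
    auth_store.setdefault("providers", {})[provider] = state
    if set_active:
        auth_store["active_provider"] = provider


# A token-rotation sync-back should not change the user's chosen provider.
store: Dict[str, Any] = {"active_provider": "nous", "providers": {}}
store_provider_state(
    store, "xai-oauth", {"tokens": {"access_token": "t"}}, set_active=False
)
```

After the call, `store["active_provider"]` is still `"nous"`, while the xAI tokens have been written.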
@ -699,6 +785,25 @@ class CredentialPool:
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
elif self.provider == "xai-oauth":
# Adopt fresher tokens from auth.json before spending the
# refresh_token — single-use tokens consumed by another
# process (or another profile sharing the singleton) would
# otherwise trigger ``refresh_token_reused`` on the next
# POST. Only meaningful for singleton-seeded entries.
synced = self._sync_xai_oauth_entry_from_auth_store(entry)
if synced is not entry:
entry = synced
refreshed = auth_mod.refresh_xai_oauth_pure(
entry.access_token,
entry.refresh_token,
)
updated = replace(
entry,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
elif self.provider == "nous":
synced = self._sync_nous_entry_from_auth_store(entry)
if synced is not entry:
@ -777,6 +882,30 @@ class CredentialPool:
# Credentials file had a valid (non-expired) token — use it directly
logger.debug("Credentials file has valid token, using without refresh")
return synced
# For xai-oauth: same race as nous — another process may have
# consumed the refresh token between our proactive sync and the
# HTTP call. Re-check auth.json and adopt the fresh tokens if
# they have rotated since. Only meaningful for singleton-seeded
# (loopback_pkce) entries; manual entries don't share state with
# the singleton.
if self.provider == "xai-oauth":
synced = self._sync_xai_oauth_entry_from_auth_store(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug(
"xAI OAuth refresh failed but auth.json has newer tokens — adopting"
)
updated = replace(
synced,
last_status=STATUS_OK,
last_status_at=None,
last_error_code=None,
last_error_reason=None,
last_error_message=None,
last_error_reset_at=None,
)
self._replace_entry(synced, updated)
self._persist()
return updated
# For nous: another process may have consumed the refresh token
# between our proactive sync and the HTTP call. Re-sync from
# auth.json and adopt the fresh tokens if available.
@ -829,6 +958,11 @@ class CredentialPool:
entry.access_token,
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
)
if self.provider == "xai-oauth":
return auth_mod._xai_access_token_is_expiring(
entry.access_token,
auth_mod.XAI_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
)
if self.provider == "nous":
# Nous refresh/mint can require network access and should happen when
# runtime credentials are actually resolved, not merely when the pool
@ -883,6 +1017,17 @@ class CredentialPool:
if synced is not entry:
entry = synced
cleared_any = True
# For xai-oauth singleton-seeded entries, identical pattern:
# an entry frozen as exhausted may simply be holding stale
# tokens that another process (or a fresh `hermes model` ->
# xAI Grok OAuth login) has since rotated in auth.json.
if (self.provider == "xai-oauth"
and entry.source == "loopback_pkce"
and entry.last_status == STATUS_EXHAUSTED):
synced = self._sync_xai_oauth_entry_from_auth_store(entry)
if synced is not entry:
entry = synced
cleared_any = True
if entry.last_status == STATUS_EXHAUSTED:
exhausted_until = _exhausted_until(entry)
if exhausted_until is not None and now < exhausted_until:
@ -1394,6 +1539,37 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
},
)
elif provider == "xai-oauth":
# When the user logs in via ``hermes model`` -> xAI Grok OAuth,
# tokens are written to the auth.json singleton
# (``providers["xai-oauth"]``). Surface them in the pool too so
# ``hermes auth list`` reflects the logged-in state and so the pool
# is the single source of truth for refresh during runtime resolution.
if _is_suppressed(provider, "loopback_pkce"):
return changed, active_sources
state = _load_provider_state(auth_store, "xai-oauth")
tokens = state.get("tokens") if isinstance(state, dict) else None
if isinstance(tokens, dict) and tokens.get("access_token"):
active_sources.add("loopback_pkce")
from hermes_cli.auth import DEFAULT_XAI_OAUTH_BASE_URL
base_url = DEFAULT_XAI_OAUTH_BASE_URL
changed |= _upsert_entry(
entries,
provider,
"loopback_pkce",
{
"source": "loopback_pkce",
"auth_type": AUTH_TYPE_OAUTH,
"access_token": tokens.get("access_token", ""),
"refresh_token": tokens.get("refresh_token"),
"base_url": base_url,
"last_refresh": state.get("last_refresh"),
"label": label_from_token(tokens.get("access_token", ""), "loopback_pkce"),
},
)
return changed, active_sources


@ -265,6 +265,31 @@ def _remove_minimax_oauth(provider: str, removed) -> RemovalResult:
return result
def _remove_xai_oauth_loopback_pkce(provider: str, removed) -> RemovalResult:
"""xAI OAuth tokens live in auth.json providers.xai-oauth — clear them.
Without this step, ``hermes auth remove xai-oauth <N>`` silently undoes
itself: the central dispatcher only removes the in-memory pool entry,
leaves ``providers.xai-oauth`` in auth.json intact, and on the next
``load_pool("xai-oauth")`` call ``_seed_from_singletons`` re-seeds the
entry from the still-present singleton, so credentials reappear with no
user feedback. Clearing the singleton in step with the suppression set
by the central dispatcher makes the removal stick.
Belt-and-braces against the manual entry path: ``hermes auth add
xai-oauth`` produces a ``manual:xai_pkce`` entry whose removal step
falls through to "unregistered → nothing to clean up" (correct:
manual entries are pool-only).
"""
result = RemovalResult()
if _clear_auth_store_provider(provider):
result.cleaned.append(f"Cleared {provider} OAuth tokens from auth store")
result.hints.append(
"Run `hermes model` → xAI Grok OAuth (SuperGrok Subscription) to re-authenticate if needed."
)
return result
def _remove_codex_device_code(provider: str, removed) -> RemovalResult:
"""Codex tokens live in TWO places: our auth store AND ~/.codex/auth.json.
@ -397,6 +422,11 @@ def _register_all_sources() -> None:
remove_fn=_remove_codex_device_code,
description="auth.json providers.openai-codex + ~/.codex/auth.json",
))
register(RemovalStep(
provider="xai-oauth", source_id="loopback_pkce",
remove_fn=_remove_xai_oauth_loopback_pkce,
description="auth.json providers.xai-oauth",
))
register(RemovalStep(
provider="qwen-oauth", source_id="qwen-cli",
remove_fn=_remove_qwen_cli,


@ -89,18 +89,25 @@ class ResponsesApiTransport(ProviderTransport):
_effort_clamp = {"minimal": "low"}
reasoning_effort = _effort_clamp.get(reasoning_effort, reasoning_effort)
response_tools = _responses_tools(tools)
kwargs = {
"model": model,
"instructions": instructions,
"input": _chat_messages_to_responses_input(payload_messages),
"tools": _responses_tools(tools),
"tool_choice": "auto",
"parallel_tool_calls": True,
"tools": response_tools,
"store": False,
}
if response_tools:
kwargs["tool_choice"] = "auto"
kwargs["parallel_tool_calls"] = True
session_id = params.get("session_id")
if not is_github_responses and session_id:
# xAI's Responses API uses `prompt_cache_key` (body-level) as the
# cache-routing key, not a top-level kwarg — the body-field
# injection below survives openai SDK builds whose
# Responses.stream() signature drops the kwarg. Everything else
# that ISN'T github/xAI keeps using the typed kwarg.
if not is_github_responses and not is_xai_responses and session_id:
kwargs["prompt_cache_key"] = session_id
if reasoning_enabled and is_xai_responses:
@ -165,6 +172,22 @@ class ResponsesApiTransport(ProviderTransport):
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
# xAI Responses cache-routing field. Lives in the request body
# (per https://docs.x.ai/.../prompt-caching/maximizing-cache-hits),
# so we ship it via extra_body — the openai SDK serializes
# extra_body fields into the JSON body without per-field type
# validation, sidestepping the TypeError that fires on
# Responses.stream() builds whose `prompt_cache_key` kwarg has
# been dropped. Setdefault preserves a caller-supplied value
# (e.g. request_overrides.extra_body.prompt_cache_key) over
# the auto-derived session_id.
existing_extra_body = kwargs.get("extra_body")
merged_extra_body: Dict[str, Any] = {}
if isinstance(existing_extra_body, dict):
merged_extra_body.update(existing_extra_body)
merged_extra_body.setdefault("prompt_cache_key", session_id)
kwargs["extra_body"] = merged_extra_body
return kwargs
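
The `setdefault` merge above can be exercised on its own. This is a minimal sketch: `merge_prompt_cache_key` is a hypothetical helper, and the early return for a missing `session_id` is an assumption about the surrounding guard, which this hunk does not show.

```python
from typing import Any, Dict, Optional


def merge_prompt_cache_key(
    kwargs: Dict[str, Any], session_id: Optional[str]
) -> Dict[str, Any]:
    """Copy kwargs and place prompt_cache_key in extra_body, keeping any caller value."""
    merged = dict(kwargs)
    if not session_id:
        return merged
    extra_body: Dict[str, Any] = {}
    existing = merged.get("extra_body")
    if isinstance(existing, dict):
        extra_body.update(existing)
    # setdefault: a caller-supplied prompt_cache_key wins over session_id.
    extra_body.setdefault("prompt_cache_key", session_id)
    merged["extra_body"] = extra_body
    return merged
```

Copying into a fresh dict avoids mutating a caller-owned `extra_body` that may be reused across requests.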
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:


@ -72,6 +72,7 @@ DEFAULT_AGENT_KEY_MIN_TTL_SECONDS = 30 * 60 # 30 minutes
ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120 # refresh 2 min before expiry
DEVICE_AUTH_POLL_INTERVAL_CAP_SECONDS = 1 # poll at most every 1s
DEFAULT_CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex"
DEFAULT_XAI_OAUTH_BASE_URL = "https://api.x.ai/v1"
MINIMAX_OAUTH_CLIENT_ID = "78257093-7e40-4613-99e0-527b14b39113"
MINIMAX_OAUTH_SCOPE = "group_id profile model.completion"
MINIMAX_OAUTH_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:user_code"
@ -89,6 +90,14 @@ STEPFUN_STEP_PLAN_CN_BASE_URL = "https://api.stepfun.com/step_plan/v1"
CODEX_OAUTH_CLIENT_ID = "app_EMoamEEZ73f0CkXaXp7hrann"
CODEX_OAUTH_TOKEN_URL = "https://auth.openai.com/oauth/token"
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
XAI_OAUTH_ISSUER = "https://auth.x.ai"
XAI_OAUTH_DISCOVERY_URL = f"{XAI_OAUTH_ISSUER}/.well-known/openid-configuration"
XAI_OAUTH_CLIENT_ID = "b1a00492-073a-47ea-816f-4c329264a828"
XAI_OAUTH_SCOPE = "openid profile email offline_access grok-cli:access api:access"
XAI_OAUTH_REDIRECT_HOST = "127.0.0.1"
XAI_OAUTH_REDIRECT_PORT = 56121
XAI_OAUTH_REDIRECT_PATH = "/callback"
XAI_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
QWEN_OAUTH_CLIENT_ID = "f0304373b74a44d2b584a3fb70ca9e56"
QWEN_OAUTH_TOKEN_URL = "https://chat.qwen.ai/api/v1/oauth2/token"
QWEN_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
@ -162,6 +171,12 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="oauth_external",
inference_base_url=DEFAULT_CODEX_BASE_URL,
),
"xai-oauth": ProviderConfig(
id="xai-oauth",
name="xAI Grok OAuth (SuperGrok Subscription)",
auth_type="oauth_external",
inference_base_url=DEFAULT_XAI_OAUTH_BASE_URL,
),
"qwen-oauth": ProviderConfig(
id="qwen-oauth",
name="Qwen OAuth",
@ -1364,6 +1379,8 @@ def resolve_provider(
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
"google": "gemini", "google-gemini": "gemini", "google-ai-studio": "gemini",
"x-ai": "xai", "x.ai": "xai", "grok": "xai",
"xai-oauth": "xai-oauth", "x-ai-oauth": "xai-oauth",
"grok-oauth": "xai-oauth", "xai-grok-oauth": "xai-oauth",
"kimi": "kimi-coding", "kimi-for-coding": "kimi-coding", "moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn", "moonshot-cn": "kimi-coding-cn",
"step": "stepfun", "stepfun-coding-plan": "stepfun",
@ -1907,6 +1924,16 @@ def _spotify_code_challenge(code_verifier: str) -> str:
return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
def _oauth_pkce_code_verifier(length: int = 64) -> str:
    raw = base64.urlsafe_b64encode(os.urandom(length)).decode("ascii")
    return raw.rstrip("=")[:128]


def _oauth_pkce_code_challenge(code_verifier: str) -> str:
    digest = hashlib.sha256(code_verifier.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
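
For reference, the S256 challenge derivation above can be checked against the test vector in RFC 7636, Appendix B, and combined with the `plan` / `referrer` parameters from the commit message. The sketch below is illustrative only: `build_authorize_url` is a hypothetical helper, the authorization endpoint and client ID in the example are placeholders, and the exact parameter set Hermes sends is not shown in this part of the diff.

```python
import base64
import hashlib
import os
from urllib.parse import urlencode


def pkce_verifier(length: int = 64) -> str:
    """Random URL-safe verifier, capped at RFC 7636's 128-character limit."""
    raw = base64.urlsafe_b64encode(os.urandom(length)).decode("ascii")
    return raw.rstrip("=")[:128]


def pkce_challenge(verifier: str) -> str:
    """S256 challenge: base64url(sha256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def build_authorize_url(
    authorization_endpoint: str,  # placeholder; normally taken from discovery
    client_id: str,
    redirect_uri: str,
    challenge: str,
    state: str,
) -> str:
    """Hypothetical helper assembling the authorize URL query string."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "state": state,
        "plan": "generic",           # from the commit message
        "referrer": "hermes-agent",  # from the commit message
    }
    return f"{authorization_endpoint}?{urlencode(params)}"


# Test vector from RFC 7636, Appendix B.
RFC_VERIFIER = "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
RFC_CHALLENGE = "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM"
```

With the default `length=64`, the verifier comes out at 86 characters, inside the 43 to 128 range the RFC requires.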
def _spotify_build_authorize_url(
*,
client_id: str,
@ -2029,6 +2056,158 @@ def _spotify_wait_for_callback(
)
def _xai_validate_loopback_redirect_uri(redirect_uri: str) -> tuple[str, int, str]:
parsed = urlparse(redirect_uri)
if parsed.scheme != "http":
raise AuthError(
"xAI OAuth redirect_uri must use http://127.0.0.1.",
provider="xai-oauth",
code="xai_redirect_invalid",
)
host = parsed.hostname or ""
if host != XAI_OAUTH_REDIRECT_HOST:
raise AuthError(
"xAI OAuth redirect_uri must point to 127.0.0.1.",
provider="xai-oauth",
code="xai_redirect_invalid",
)
if not parsed.port:
raise AuthError(
"xAI OAuth redirect_uri must include an explicit localhost port.",
provider="xai-oauth",
code="xai_redirect_invalid",
)
return host, parsed.port, parsed.path or "/"
def _xai_callback_cors_origin(origin: Optional[str]) -> str:
    allowed = {
        "https://accounts.x.ai",
        "https://auth.x.ai",
        "https://accounts.mouseion.dev",
        "http://localhost:20000",
        "http://127.0.0.1:20000",
    }
    return origin if origin in allowed else ""
def _make_xai_callback_handler(expected_path: str) -> tuple[type[BaseHTTPRequestHandler], dict[str, Any]]:
result: dict[str, Any] = {
"code": None,
"state": None,
"error": None,
"error_description": None,
}
class _XAICallbackHandler(BaseHTTPRequestHandler):
def _maybe_write_cors_headers(self) -> None:
origin = self.headers.get("Origin")
allow_origin = _xai_callback_cors_origin(origin)
if allow_origin:
self.send_header("Access-Control-Allow-Origin", allow_origin)
self.send_header("Access-Control-Allow-Methods", "GET, OPTIONS")
self.send_header("Access-Control-Allow-Headers", "Content-Type")
self.send_header("Access-Control-Allow-Private-Network", "true")
self.send_header("Vary", "Origin")
def do_OPTIONS(self) -> None: # noqa: N802
self.send_response(204)
self._maybe_write_cors_headers()
self.end_headers()
def do_GET(self) -> None: # noqa: N802
parsed = urlparse(self.path)
if parsed.path != expected_path:
self.send_response(404)
self.end_headers()
self.wfile.write(b"Not found.")
return
params = parse_qs(parsed.query)
result["code"] = params.get("code", [None])[0]
result["state"] = params.get("state", [None])[0]
result["error"] = params.get("error", [None])[0]
result["error_description"] = params.get("error_description", [None])[0]
self.send_response(200)
self._maybe_write_cors_headers()
self.send_header("Content-Type", "text/html; charset=utf-8")
self.end_headers()
if result["error"]:
body = "<html><body><h1>xAI authorization failed.</h1>You can close this tab.</body></html>"
else:
body = "<html><body><h1>xAI authorization received.</h1>You can close this tab.</body></html>"
self.wfile.write(body.encode("utf-8"))
def log_message(self, format: str, *args: Any) -> None: # noqa: A003
return
return _XAICallbackHandler, result
def _xai_start_callback_server(
preferred_port: int = XAI_OAUTH_REDIRECT_PORT,
) -> tuple[HTTPServer, threading.Thread, dict[str, Any], str]:
host = XAI_OAUTH_REDIRECT_HOST
expected_path = XAI_OAUTH_REDIRECT_PATH
handler_cls, result = _make_xai_callback_handler(expected_path)
class _ReuseHTTPServer(HTTPServer):
allow_reuse_address = True
ports_to_try = [preferred_port]
if preferred_port != 0:
ports_to_try.append(0)
server = None
last_error: Optional[OSError] = None
for port in ports_to_try:
try:
server = _ReuseHTTPServer((host, port), handler_cls)
break
except OSError as exc:
last_error = exc
if server is None:
raise AuthError(
f"Could not bind xAI callback server on {host}:{preferred_port}: {last_error}",
provider="xai-oauth",
code="xai_callback_bind_failed",
) from last_error
actual_port = int(server.server_address[1])
redirect_uri = f"http://{host}:{actual_port}{expected_path}"
thread = threading.Thread(
target=server.serve_forever,
kwargs={"poll_interval": 0.1},
daemon=True,
)
thread.start()
return server, thread, result, redirect_uri
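
The "preferred port first, then let the OS pick" fallback used by `_xai_start_callback_server` can be sketched with plain sockets. This is a simplified illustration: `bind_first_available` is a hypothetical helper, and unlike the code above it does not set `SO_REUSEADDR` or start an HTTP server.

```python
import socket
from typing import Optional, Sequence


def bind_first_available(host: str, ports: Sequence[int]) -> socket.socket:
    """Bind and listen on the first port that works; port 0 means OS-assigned."""
    last_error: Optional[OSError] = None
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock
        except OSError as exc:
            sock.close()
            last_error = exc
    raise OSError(f"could not bind any of {list(ports)}") from last_error
```

Because the fallback port is not known in advance, the redirect URI has to be built from the port actually bound, as the function above does with `server.server_address[1]`.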
def _xai_wait_for_callback(
server: HTTPServer,
thread: threading.Thread,
result: dict[str, Any],
*,
timeout_seconds: float = 180.0,
) -> dict[str, Any]:
deadline = time.monotonic() + max(5.0, timeout_seconds)
try:
while time.monotonic() < deadline:
if result["code"] or result["error"]:
return result
time.sleep(0.1)
finally:
server.shutdown()
server.server_close()
thread.join(timeout=1.0)
raise AuthError(
"xAI authorization timed out waiting for the local callback.",
provider="xai-oauth",
code="xai_callback_timeout",
)
def _spotify_token_payload_to_state(
token_payload: Dict[str, Any],
*,
@ -2680,6 +2859,348 @@ def resolve_codex_runtime_credentials(
}
# =============================================================================
# xAI Grok OAuth — tokens stored in ~/.hermes/auth.json
# =============================================================================
def _read_xai_oauth_tokens(*, _lock: bool = True) -> Dict[str, Any]:
if _lock:
with _auth_store_lock():
auth_store = _load_auth_store()
else:
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "xai-oauth")
if not state:
raise AuthError(
"No xAI OAuth credentials stored. Select xAI Grok OAuth (SuperGrok Subscription) in `hermes model`.",
provider="xai-oauth",
code="xai_auth_missing",
relogin_required=True,
)
tokens = state.get("tokens")
if not isinstance(tokens, dict):
raise AuthError(
"xAI OAuth state is missing tokens. Re-authenticate with `hermes model`.",
provider="xai-oauth",
code="xai_auth_invalid_shape",
relogin_required=True,
)
access_token = str(tokens.get("access_token", "") or "").strip()
refresh_token = str(tokens.get("refresh_token", "") or "").strip()
if not access_token:
raise AuthError(
"xAI OAuth state is missing access_token. Re-authenticate with `hermes model`.",
provider="xai-oauth",
code="xai_auth_missing_access_token",
relogin_required=True,
)
if not refresh_token:
raise AuthError(
"xAI OAuth state is missing refresh_token. Re-authenticate with `hermes model`.",
provider="xai-oauth",
code="xai_auth_missing_refresh_token",
relogin_required=True,
)
return {
"tokens": tokens,
"last_refresh": state.get("last_refresh"),
"discovery": state.get("discovery") or {},
"redirect_uri": state.get("redirect_uri"),
}
def _save_xai_oauth_tokens(
tokens: Dict[str, Any],
*,
discovery: Optional[Dict[str, Any]] = None,
redirect_uri: str = "",
last_refresh: Optional[str] = None,
) -> None:
if last_refresh is None:
last_refresh = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
with _auth_store_lock():
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "xai-oauth") or {}
state["tokens"] = tokens
state["last_refresh"] = last_refresh
state["auth_mode"] = "oauth_pkce"
if discovery:
state["discovery"] = discovery
if redirect_uri:
state["redirect_uri"] = redirect_uri
_save_provider_state(auth_store, "xai-oauth", state)
_save_auth_store(auth_store)
def _xai_access_token_is_expiring(access_token: str, skew_seconds: int = 0) -> bool:
    if not isinstance(access_token, str) or "." not in access_token:
        return False
    try:
        parts = access_token.split(".")
        if len(parts) < 2:
            return False
        payload_b64 = parts[1]
        payload_b64 += "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(payload_b64.encode("ascii")).decode("utf-8"))
        exp = payload.get("exp")
        if not isinstance(exp, (int, float)):
            return False
        return float(exp) <= (time.time() + max(0, int(skew_seconds)))
    except Exception:
        return False
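
To see how the `exp`-plus-skew check behaves, here is a small self-contained sketch with an injectable clock. Both `make_unsigned_jwt` and `is_expiring` are hypothetical helpers written for illustration; the token built here carries no real signature and is only JWT-shaped.

```python
import base64
import json
import time
from typing import Optional


def make_unsigned_jwt(exp: float) -> str:
    """Build a JWT-shaped string (no real signature) for exercising the check."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    return f"{b64({'alg': 'none'})}.{b64({'exp': exp})}.sig"


def is_expiring(token: str, skew_seconds: int = 0, now: Optional[float] = None) -> bool:
    """Same idea as the function above, with an injectable clock for tests."""
    now = time.time() if now is None else now
    try:
        payload_b64 = token.split(".")[1]
        payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
        exp = payload.get("exp")
        if not isinstance(exp, (int, float)):
            return False
        return float(exp) <= now + max(0, int(skew_seconds))
    except Exception:
        return False
```

Note the fail-open choice: a token that cannot be decoded is treated as not expiring, so opaque tokens are never refreshed proactively and only the reactive 401 path would catch them.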
def _xai_validate_oauth_endpoint(url: str, *, field: str) -> str:
"""Refuse any OIDC discovery endpoint that isn't HTTPS on the xAI origin.
The OIDC discovery response is a long-lived, low-frequency request whose
output is cached in ``~/.hermes/auth.json``. A single MITM during initial
login could substitute a malicious ``token_endpoint``; that URL would
then receive the refresh_token on every subsequent refresh: a permanent
credential leak from a one-time MITM. Validating scheme + host pins the
cached endpoint to the xAI auth origin (or a future ``*.x.ai`` subdomain
if xAI migrates) so the cache poisoning loses its persistence guarantee.
RFC 8414 §2 requires the issuer to be ``https://`` and SHOULD-keeps the
token_endpoint on the same origin; we enforce both. ``x.ai`` is the
bare apex, so we accept either exact host match or any ``.x.ai`` suffix.
"""
parsed = urlparse(url)
if parsed.scheme != "https":
raise AuthError(
f"xAI OIDC discovery returned a non-HTTPS {field}: {url!r}.",
provider="xai-oauth",
code="xai_discovery_invalid",
)
host = (parsed.hostname or "").lower()
if not host:
raise AuthError(
f"xAI OIDC discovery {field} is missing a hostname: {url!r}.",
provider="xai-oauth",
code="xai_discovery_invalid",
)
if host != "x.ai" and not host.endswith(".x.ai"):
raise AuthError(
f"xAI OIDC discovery {field} host {host!r} is not on the xAI origin "
f"(expected x.ai or a *.x.ai subdomain). Refusing to use a cached "
f"endpoint that may have been substituted by a MITM during initial "
f"discovery; re-authenticate with `hermes model` to re-fetch.",
provider="xai-oauth",
code="xai_discovery_invalid",
)
return url
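The host rule above (exact `x.ai` or a `.x.ai` suffix, HTTPS only) can be exercised in isolation. The sketch below is a boolean variant with a hypothetical name, `is_allowed_xai_endpoint`; the example URLs are made up and only illustrate which shapes pass or fail.

```python
from urllib.parse import urlparse


def is_allowed_xai_endpoint(url: str) -> bool:
    """True only for https URLs whose host is x.ai or a *.x.ai subdomain."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False
    # The leading dot in the suffix check stops look-alikes such as "evilx.ai".
    return host == "x.ai" or host.endswith(".x.ai")
```

Because the check uses `parsed.hostname` rather than the raw netloc, a userinfo trick such as `https://x.ai@evil.example/` resolves to the host `evil.example` and is rejected.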
def _xai_oauth_discovery(timeout_seconds: float = 15.0) -> Dict[str, str]:
try:
response = httpx.get(
XAI_OAUTH_DISCOVERY_URL,
headers={"Accept": "application/json"},
timeout=timeout_seconds,
)
except Exception as exc:
raise AuthError(
f"xAI OIDC discovery failed: {exc}",
provider="xai-oauth",
code="xai_discovery_failed",
) from exc
if response.status_code != 200:
raise AuthError(
f"xAI OIDC discovery returned status {response.status_code}.",
provider="xai-oauth",
code="xai_discovery_failed",
)
try:
payload = response.json()
except Exception as exc:
raise AuthError(
f"xAI OIDC discovery returned invalid JSON: {exc}",
provider="xai-oauth",
code="xai_discovery_invalid_json",
) from exc
if not isinstance(payload, dict):
raise AuthError(
"xAI OIDC discovery response was not a JSON object.",
provider="xai-oauth",
code="xai_discovery_incomplete",
)
authorization_endpoint = str(payload.get("authorization_endpoint", "") or "").strip()
token_endpoint = str(payload.get("token_endpoint", "") or "").strip()
if not authorization_endpoint or not token_endpoint:
raise AuthError(
"xAI OIDC discovery response was missing required endpoints.",
provider="xai-oauth",
code="xai_discovery_incomplete",
)
_xai_validate_oauth_endpoint(authorization_endpoint, field="authorization_endpoint")
_xai_validate_oauth_endpoint(token_endpoint, field="token_endpoint")
return {
"authorization_endpoint": authorization_endpoint,
"token_endpoint": token_endpoint,
}
def refresh_xai_oauth_pure(
access_token: str,
refresh_token: str,
*,
token_endpoint: str = "",
timeout_seconds: float = 20.0,
) -> Dict[str, Any]:
del access_token
if not isinstance(refresh_token, str) or not refresh_token.strip():
raise AuthError(
"xAI OAuth is missing refresh_token. Re-authenticate with `hermes model`.",
provider="xai-oauth",
code="xai_auth_missing_refresh_token",
relogin_required=True,
)
endpoint = token_endpoint.strip() or _xai_oauth_discovery(timeout_seconds)["token_endpoint"]
# Re-validate cached endpoints on the refresh hot path: an auth.json
# written by an older Hermes (or hand-edited) may carry a non-xAI
# token_endpoint that would receive every future refresh_token in
# plaintext if we trusted it blindly. Cheap suffix check; fast-fail
# with a clear error so the user can re-run `hermes model` to refetch.
_xai_validate_oauth_endpoint(endpoint, field="token_endpoint")
timeout = httpx.Timeout(max(5.0, float(timeout_seconds)))
with httpx.Client(timeout=timeout, headers={"Accept": "application/json"}) as client:
response = client.post(
endpoint,
headers={"Content-Type": "application/x-www-form-urlencoded"},
data={
"grant_type": "refresh_token",
"client_id": XAI_OAUTH_CLIENT_ID,
"refresh_token": refresh_token,
},
)
if response.status_code != 200:
detail = response.text.strip()
raise AuthError(
"xAI token refresh failed."
+ (f" Response: {detail}" if detail else ""),
provider="xai-oauth",
code="xai_refresh_failed",
relogin_required=(response.status_code in {400, 401, 403}),
)
try:
payload = response.json()
except Exception as exc:
raise AuthError(
f"xAI token refresh returned invalid JSON: {exc}",
provider="xai-oauth",
code="xai_refresh_invalid_json",
) from exc
if not isinstance(payload, dict):
raise AuthError(
"xAI token refresh response was not a JSON object.",
provider="xai-oauth",
code="xai_refresh_invalid_response",
relogin_required=True,
)
refreshed_access = str(payload.get("access_token", "") or "").strip()
if not refreshed_access:
raise AuthError(
"xAI token refresh response was missing access_token.",
provider="xai-oauth",
code="xai_refresh_missing_access_token",
relogin_required=True,
)
updated = {
"access_token": refreshed_access,
"refresh_token": str(payload.get("refresh_token") or refresh_token).strip(),
"id_token": str(payload.get("id_token") or "").strip(),
"expires_in": payload.get("expires_in"),
"token_type": str(payload.get("token_type") or "Bearer").strip() or "Bearer",
"last_refresh": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
}
return updated
def _refresh_xai_oauth_tokens(
tokens: Dict[str, Any],
*,
token_endpoint: str,
redirect_uri: str = "",
timeout_seconds: float,
) -> Dict[str, Any]:
refreshed = refresh_xai_oauth_pure(
str(tokens.get("access_token", "") or ""),
str(tokens.get("refresh_token", "") or ""),
token_endpoint=token_endpoint,
timeout_seconds=timeout_seconds,
)
updated_tokens = dict(tokens)
updated_tokens["access_token"] = refreshed["access_token"]
updated_tokens["refresh_token"] = refreshed["refresh_token"]
if refreshed.get("id_token"):
updated_tokens["id_token"] = refreshed["id_token"]
if refreshed.get("expires_in") is not None:
updated_tokens["expires_in"] = refreshed["expires_in"]
if refreshed.get("token_type"):
updated_tokens["token_type"] = refreshed["token_type"]
_save_xai_oauth_tokens(
updated_tokens,
discovery={"token_endpoint": token_endpoint},
redirect_uri=redirect_uri,
last_refresh=refreshed["last_refresh"],
)
return updated_tokens
def resolve_xai_oauth_runtime_credentials(
*,
force_refresh: bool = False,
refresh_if_expiring: bool = True,
refresh_skew_seconds: int = XAI_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
) -> Dict[str, Any]:
data = _read_xai_oauth_tokens()
tokens = dict(data["tokens"])
access_token = str(tokens.get("access_token", "") or "").strip()
refresh_timeout_seconds = float(os.getenv("HERMES_XAI_REFRESH_TIMEOUT_SECONDS", "20"))
discovery = dict(data.get("discovery") or {})
token_endpoint = str(discovery.get("token_endpoint", "") or "").strip()
redirect_uri = str(data.get("redirect_uri", "") or "").strip()
should_refresh = bool(force_refresh)
if (not should_refresh) and refresh_if_expiring:
should_refresh = _xai_access_token_is_expiring(access_token, refresh_skew_seconds)
if should_refresh:
with _auth_store_lock(timeout_seconds=max(float(AUTH_LOCK_TIMEOUT_SECONDS), refresh_timeout_seconds + 5.0)):
data = _read_xai_oauth_tokens(_lock=False)
tokens = dict(data["tokens"])
access_token = str(tokens.get("access_token", "") or "").strip()
discovery = dict(data.get("discovery") or {})
token_endpoint = str(discovery.get("token_endpoint", "") or "").strip()
redirect_uri = str(data.get("redirect_uri", "") or "").strip()
should_refresh = bool(force_refresh)
if (not should_refresh) and refresh_if_expiring:
should_refresh = _xai_access_token_is_expiring(access_token, refresh_skew_seconds)
if should_refresh:
if not token_endpoint:
token_endpoint = _xai_oauth_discovery(refresh_timeout_seconds)["token_endpoint"]
tokens = _refresh_xai_oauth_tokens(
tokens,
token_endpoint=token_endpoint,
redirect_uri=redirect_uri,
timeout_seconds=refresh_timeout_seconds,
)
access_token = str(tokens.get("access_token", "") or "").strip()
base_url = (
os.getenv("HERMES_XAI_BASE_URL", "").strip().rstrip("/")
or os.getenv("XAI_BASE_URL", "").strip().rstrip("/")
or DEFAULT_XAI_OAUTH_BASE_URL
)
return {
"provider": "xai-oauth",
"base_url": base_url,
"api_key": access_token,
"source": "hermes-auth-store",
"last_refresh": data.get("last_refresh"),
"auth_mode": "oauth_pkce",
}
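The resolver above decides whether to refresh once before taking the long-held lock and again after acquiring it, so a caller that waited on the lock can reuse a token that another caller just refreshed. A minimal in-process sketch of that double-checked pattern follows; `TokenCache` is a hypothetical stand-in that uses a `threading.Lock` instead of the file-locked auth store, and `_refresh` is a placeholder for the network call.

```python
import threading
import time


class TokenCache:
    """Hypothetical stand-in for the auth store; not part of Hermes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.access_token = "token-0"
        self.expires_at = 0.0  # starts out already expired
        self.refresh_calls = 0

    def _is_expiring(self, skew: float) -> bool:
        return self.expires_at <= time.time() + skew

    def _refresh(self) -> None:
        # Placeholder for the POST to the token endpoint.
        self.refresh_calls += 1
        self.access_token = f"token-{self.refresh_calls}"
        self.expires_at = time.time() + 3600

    def get(self, skew: float = 60.0) -> str:
        if self._is_expiring(skew):  # cheap check without the lock
            with self._lock:
                # Re-check under the lock: another caller may have refreshed
                # while this one was waiting.
                if self._is_expiring(skew):
                    self._refresh()
        return self.access_token
```

With several threads calling `get()` at once, only the first one to take the lock performs the refresh; the rest see a fresh token on the second check and skip it.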
# =============================================================================
# TLS verification helper
# =============================================================================
@ -4030,6 +4551,48 @@ def get_codex_auth_status() -> Dict[str, Any]:
}
def get_xai_oauth_auth_status() -> Dict[str, Any]:
try:
from agent.credential_pool import load_pool
pool = load_pool("xai-oauth")
if pool and pool.has_credentials():
entry = pool.select()
if entry is not None:
api_key = (
getattr(entry, "runtime_api_key", None)
or getattr(entry, "access_token", "")
)
if api_key and not _xai_access_token_is_expiring(api_key, 0):
return {
"logged_in": True,
"auth_store": str(_auth_file_path()),
"last_refresh": getattr(entry, "last_refresh", None),
"auth_mode": "oauth_pkce",
"source": f"pool:{getattr(entry, 'label', 'unknown')}",
"api_key": api_key,
}
except Exception:
pass
try:
creds = resolve_xai_oauth_runtime_credentials()
return {
"logged_in": True,
"auth_store": str(_auth_file_path()),
"last_refresh": creds.get("last_refresh"),
"auth_mode": creds.get("auth_mode"),
"source": creds.get("source"),
"api_key": creds.get("api_key"),
}
except AuthError as exc:
return {
"logged_in": False,
"auth_store": str(_auth_file_path()),
"error": str(exc),
}
def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
"""Status snapshot for API-key providers (z.ai, Kimi, MiniMax)."""
pconfig = PROVIDER_REGISTRY.get(provider_id)
@ -4100,6 +4663,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_nous_auth_status()
if target == "openai-codex":
return get_codex_auth_status()
if target == "xai-oauth":
return get_xai_oauth_auth_status()
if target == "qwen-oauth":
return get_qwen_auth_status()
if target == "google-gemini-cli":
@ -4320,7 +4885,7 @@ def _logout_default_provider_from_config() -> Optional[str]:
"No provider is currently logged in" and never reset model.provider.
"""
provider = _get_config_provider()
- if provider in {"nous", "openai-codex"}:
+ if provider in {"nous", "openai-codex", "xai-oauth"}:
return provider
return None
@ -4619,6 +5184,245 @@ def _login_openai_codex(
print(f" Config updated: {config_path} (model.provider=openai-codex)")
def _login_xai_oauth(
args,
pconfig: ProviderConfig,
*,
force_new_login: bool = False,
) -> None:
del pconfig
if not force_new_login:
try:
existing = resolve_xai_oauth_runtime_credentials()
api_key = existing.get("api_key", "")
if isinstance(api_key, str) and api_key and not _xai_access_token_is_expiring(api_key, 60):
print("Existing xAI OAuth credentials found in Hermes auth store.")
try:
reuse = input("Use existing credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
reuse = "y"
if reuse in ("", "y", "yes"):
config_path = _update_config_for_provider(
"xai-oauth",
existing.get("base_url", DEFAULT_XAI_OAUTH_BASE_URL),
)
print()
print("Login successful!")
print(f" Config updated: {config_path} (model.provider=xai-oauth)")
return
except AuthError:
pass
print()
print("Signing in to xAI Grok OAuth (SuperGrok Subscription)...")
print("(Hermes creates its own local OAuth session)")
print()
timeout_seconds = float(getattr(args, "timeout", None) or 20.0)
open_browser = not getattr(args, "no_browser", False)
if _is_remote_session():
open_browser = False
creds = _xai_oauth_loopback_login(timeout_seconds=timeout_seconds, open_browser=open_browser)
_save_xai_oauth_tokens(
creds["tokens"],
discovery=creds.get("discovery"),
redirect_uri=creds.get("redirect_uri", ""),
last_refresh=creds.get("last_refresh"),
)
config_path = _update_config_for_provider("xai-oauth", creds.get("base_url", DEFAULT_XAI_OAUTH_BASE_URL))
print()
print("Login successful!")
from hermes_constants import display_hermes_home as _dhh
print(f" Auth state: {_dhh()}/auth.json")
print(f" Config updated: {config_path} (model.provider=xai-oauth)")
def _xai_oauth_build_authorize_url(
*,
authorization_endpoint: str,
redirect_uri: str,
code_challenge: str,
state: str,
nonce: str,
) -> str:
# `plan=generic` opts the consent screen into xAI's generic OAuth plan
# tier instead of falling back to the per-account default. Without it,
# accounts.x.ai rejects loopback OAuth from non-allowlisted clients.
# `referrer=hermes-agent` lets xAI attribute Hermes-originated logins
# in their OAuth server logs (we still impersonate the upstream Grok-CLI
# client_id; this is best-effort attribution until xAI mints us our own).
authorize_params = {
"response_type": "code",
"client_id": XAI_OAUTH_CLIENT_ID,
"redirect_uri": redirect_uri,
"scope": XAI_OAUTH_SCOPE,
"code_challenge": code_challenge,
"code_challenge_method": "S256",
"state": state,
"nonce": nonce,
"plan": "generic",
"referrer": "hermes-agent",
}
return f"{authorization_endpoint}?{urlencode(authorize_params)}"
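The authorize URL above carries a PKCE `code_challenge` with method `S256`. The helpers that produce the verifier and challenge are not shown in this diff, so the following is a sketch of how such a pair is typically derived under RFC 7636, not the Hermes implementation. All function names, the endpoint, the client id, and the redirect URI are placeholders.

```python
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def pkce_code_verifier() -> str:
    # 64 random bytes encode to 86 URL-safe characters (RFC 7636 allows 43..128).
    return secrets.token_urlsafe(64)


def pkce_code_challenge(verifier: str) -> str:
    # S256: base64url(SHA-256(verifier)) with the "=" padding stripped.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_authorize_url(endpoint: str, client_id: str, redirect_uri: str,
                        challenge: str, state: str) -> str:
    # Placeholder parameter set; the real function above also sends scope,
    # nonce, plan and referrer.
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "state": state,
    }
    return f"{endpoint}?{urlencode(params)}"
```

The verifier stays on the client and is sent only in the later token exchange, while the challenge travels in the browser URL; the server recomputes the hash to confirm that both requests came from the same client.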
def _xai_oauth_loopback_login(
*,
timeout_seconds: float = 20.0,
open_browser: bool = True,
) -> Dict[str, Any]:
discovery = _xai_oauth_discovery(timeout_seconds)
authorization_endpoint = discovery["authorization_endpoint"]
token_endpoint = discovery["token_endpoint"]
server, thread, callback_result, redirect_uri = _xai_start_callback_server()
try:
_xai_validate_loopback_redirect_uri(redirect_uri)
code_verifier = _oauth_pkce_code_verifier()
code_challenge = _oauth_pkce_code_challenge(code_verifier)
state = uuid.uuid4().hex
nonce = uuid.uuid4().hex
authorize_url = _xai_oauth_build_authorize_url(
authorization_endpoint=authorization_endpoint,
redirect_uri=redirect_uri,
code_challenge=code_challenge,
state=state,
nonce=nonce,
)
print("Open this URL to authorize Hermes with xAI:")
print(authorize_url)
print()
print(f"Waiting for callback on {redirect_uri}")
if open_browser and not _is_remote_session():
try:
opened = webbrowser.open(authorize_url)
except Exception:
opened = False
if opened:
print("Browser opened for xAI authorization.")
else:
print("Could not open the browser automatically; use the URL above.")
callback = _xai_wait_for_callback(
server,
thread,
callback_result,
timeout_seconds=max(30.0, timeout_seconds * 9),
)
except Exception:
try:
server.shutdown()
server.server_close()
except Exception:
pass
try:
thread.join(timeout=1.0)
except Exception:
pass
raise
if callback.get("error"):
detail = callback.get("error_description") or callback["error"]
raise AuthError(
f"xAI authorization failed: {detail}",
provider="xai-oauth",
code="xai_authorization_failed",
)
if callback.get("state") != state:
raise AuthError(
"xAI authorization failed: state mismatch.",
provider="xai-oauth",
code="xai_state_mismatch",
)
code = str(callback.get("code") or "").strip()
if not code:
raise AuthError(
"xAI authorization failed: missing authorization code.",
provider="xai-oauth",
code="xai_code_missing",
)
try:
response = httpx.post(
token_endpoint,
headers={"Content-Type": "application/x-www-form-urlencoded", "Accept": "application/json"},
data={
"grant_type": "authorization_code",
"code": code,
"redirect_uri": redirect_uri,
"client_id": XAI_OAUTH_CLIENT_ID,
"code_verifier": code_verifier,
},
timeout=max(20.0, timeout_seconds),
)
except Exception as exc:
raise AuthError(
f"xAI token exchange failed: {exc}",
provider="xai-oauth",
code="xai_token_exchange_failed",
) from exc
if response.status_code != 200:
detail = response.text.strip()
raise AuthError(
"xAI token exchange failed."
+ (f" Response: {detail}" if detail else ""),
provider="xai-oauth",
code="xai_token_exchange_failed",
)
try:
payload = response.json()
except Exception as exc:
raise AuthError(
f"xAI token exchange returned invalid JSON: {exc}",
provider="xai-oauth",
code="xai_token_exchange_invalid",
) from exc
if not isinstance(payload, dict):
raise AuthError(
"xAI token exchange response was not a JSON object.",
provider="xai-oauth",
code="xai_token_exchange_invalid",
)
access_token = str(payload.get("access_token", "") or "").strip()
refresh_token = str(payload.get("refresh_token", "") or "").strip()
if not access_token:
raise AuthError(
"xAI token exchange did not return an access_token.",
provider="xai-oauth",
code="xai_token_exchange_invalid",
)
if not refresh_token:
raise AuthError(
"xAI token exchange did not return a refresh_token.",
provider="xai-oauth",
code="xai_token_exchange_invalid",
)
base_url = (
os.getenv("HERMES_XAI_BASE_URL", "").strip().rstrip("/")
or os.getenv("XAI_BASE_URL", "").strip().rstrip("/")
or DEFAULT_XAI_OAUTH_BASE_URL
)
return {
"tokens": {
"access_token": access_token,
"refresh_token": refresh_token,
"id_token": str(payload.get("id_token", "") or "").strip(),
"expires_in": payload.get("expires_in"),
"token_type": str(payload.get("token_type") or "Bearer").strip() or "Bearer",
},
"discovery": discovery,
"redirect_uri": redirect_uri,
"base_url": base_url,
"last_refresh": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
"source": "oauth-loopback",
}
def _codex_device_code_login() -> Dict[str, Any]:
"""Run the OpenAI device code login flow and return credentials dict."""
import time as _time


@ -33,7 +33,7 @@ from hermes_constants import OPENROUTER_BASE_URL
# Providers that support OAuth login in addition to API keys.
- _OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex", "qwen-oauth", "google-gemini-cli", "minimax-oauth"}
+ _OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex", "xai-oauth", "qwen-oauth", "google-gemini-cli", "minimax-oauth"}
def _get_custom_provider_names() -> list:
@ -77,6 +77,8 @@ def _normalize_provider(provider: str) -> str:
normalized = (provider or "").strip().lower()
if normalized in {"or", "open-router"}:
return "openrouter"
if normalized in {"grok-oauth", "xai-oauth", "x-ai-oauth", "xai-grok-oauth"}:
return "xai-oauth"
# Check if it matches a custom provider name
custom_key = _resolve_custom_provider_input(normalized)
if custom_key:
@ -170,7 +172,7 @@ def auth_add_command(args) -> None:
if provider.startswith(CUSTOM_POOL_PREFIX):
requested_type = AUTH_TYPE_API_KEY
else:
- requested_type = AUTH_TYPE_OAUTH if provider in {"anthropic", "nous", "openai-codex", "qwen-oauth", "google-gemini-cli", "minimax-oauth"} else AUTH_TYPE_API_KEY
+ requested_type = AUTH_TYPE_OAUTH if provider in _OAUTH_CAPABLE_PROVIDERS else AUTH_TYPE_API_KEY
pool = load_pool(provider)
@ -333,6 +335,31 @@ def auth_add_command(args) -> None:
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
return
if provider == "xai-oauth":
creds = auth_mod._xai_oauth_loopback_login(
timeout_seconds=getattr(args, "timeout", None) or 20.0,
open_browser=not getattr(args, "no_browser", False),
)
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["tokens"]["access_token"],
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential(
provider=provider,
id=uuid.uuid4().hex[:6],
label=label,
auth_type=AUTH_TYPE_OAUTH,
priority=0,
source=f"{SOURCE_MANUAL}:xai_pkce",
access_token=creds["tokens"]["access_token"],
refresh_token=creds["tokens"].get("refresh_token"),
base_url=creds.get("base_url"),
last_refresh=creds.get("last_refresh"),
)
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
return
if provider == "google-gemini-cli":
from agent.google_oauth import run_gemini_oauth_login_pure


@ -1932,6 +1932,8 @@ def select_provider_and_model(args=None):
_model_flow_nous(config, current_model, args=args)
elif selected_provider == "openai-codex":
_model_flow_openai_codex(config, current_model)
elif selected_provider == "xai-oauth":
_model_flow_xai_oauth(config, current_model)
elif selected_provider == "qwen-oauth":
_model_flow_qwen_oauth(config, current_model)
elif selected_provider == "minimax-oauth":
@ -2813,6 +2815,87 @@ def _model_flow_openai_codex(config, current_model=""):
print("No change.")
def _model_flow_xai_oauth(_config, current_model=""):
"""xAI Grok OAuth (SuperGrok Subscription) provider: ensure logged in, then pick model."""
from hermes_cli.auth import (
get_xai_oauth_auth_status,
_prompt_model_selection,
_save_model_choice,
_update_config_for_provider,
resolve_xai_oauth_runtime_credentials,
_login_xai_oauth,
DEFAULT_XAI_OAUTH_BASE_URL,
PROVIDER_REGISTRY,
)
from hermes_cli.models import _PROVIDER_MODELS
status = get_xai_oauth_auth_status()
if status.get("logged_in"):
print(" xAI Grok OAuth (SuperGrok Subscription) credentials: ✓")
print()
print(" 1. Use existing credentials")
print(" 2. Reauthenticate (new OAuth login)")
print(" 3. Cancel")
print()
try:
choice = input(" Choice [1/2/3]: ").strip()
except (KeyboardInterrupt, EOFError):
choice = "1"
if choice == "2":
print("Starting a fresh xAI OAuth login...")
print()
try:
mock_args = argparse.Namespace()
_login_xai_oauth(
mock_args,
PROVIDER_REGISTRY["xai-oauth"],
force_new_login=True,
)
except SystemExit:
print("Login cancelled or failed.")
return
except Exception as exc:
print(f"Login failed: {exc}")
return
elif choice == "3":
return
else:
print("Not logged into xAI Grok OAuth (SuperGrok Subscription). Starting login...")
print()
try:
mock_args = argparse.Namespace()
_login_xai_oauth(mock_args, PROVIDER_REGISTRY["xai-oauth"])
except SystemExit:
print("Login cancelled or failed.")
return
except Exception as exc:
print(f"Login failed: {exc}")
return
# Resolve a usable base URL. ``resolve_xai_oauth_runtime_credentials``
# only reads from the auth.json singleton — but credentials may legitimately
# live only in the pool (e.g. after ``hermes auth add xai-oauth``). Fall
# back to the default base URL in that case so the model picker still
# completes successfully instead of bailing out with
# ``Could not resolve xAI OAuth credentials``.
base_url = DEFAULT_XAI_OAUTH_BASE_URL
try:
creds = resolve_xai_oauth_runtime_credentials()
base_url = (creds.get("base_url") or "").strip().rstrip("/") or base_url
except Exception:
pass
models = list(_PROVIDER_MODELS.get("xai-oauth") or _PROVIDER_MODELS.get("xai") or [])
selected = _prompt_model_selection(models, current_model=current_model or (models[0] if models else "grok-code-fast-1"))
if selected:
_save_model_choice(selected)
_update_config_for_provider("xai-oauth", base_url)
print(f"Default model set to: {selected} (via xAI Grok OAuth — SuperGrok Subscription)")
else:
print("No change.")
_DEFAULT_QWEN_PORTAL_MODELS = [
"qwen3-coder-plus",
"qwen3-coder",
@ -9400,7 +9483,7 @@ def _build_provider_choices() -> list[str]:
except Exception:
# Fallback: static list guarantees the CLI always works
return [
- "auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
+ "auto", "openrouter", "nous", "openai-codex", "xai-oauth", "copilot-acp", "copilot",
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
"stepfun", "minimax", "minimax-cn", "kilocode", "novita", "xiaomi", "arcee",
@ -9931,7 +10014,7 @@ def main():
)
login_parser.add_argument(
"--provider",
- choices=["nous", "openai-codex"],
+ choices=["nous", "openai-codex", "xai-oauth"],
default=None,
help="Provider to authenticate with (default: nous)",
)
@ -9977,7 +10060,7 @@ def main():
)
logout_parser.add_argument(
"--provider",
- choices=["nous", "openai-codex", "spotify"],
+ choices=["nous", "openai-codex", "xai-oauth", "spotify"],
default=None,
help="Provider to log out from (default: active provider)",
)


@ -116,13 +116,23 @@ def _codex_curated_models() -> list[str]:
# (grok-4, grok-4-0709, grok-4-fast{,-reasoning,-non-reasoning},
# grok-4-1-fast{,-reasoning,-non-reasoning}, grok-code-fast-1 → grok-4.3).
_XAI_STATIC_FALLBACK: list[str] = [
"grok-4.3",
"grok-4.20-0309-reasoning",
"grok-4.20-0309-non-reasoning",
"grok-4.20-multi-agent-0309",
"grok-4.3",
]
_XAI_TOP_MODEL = "grok-4.3"
def _xai_promote_top(ids: list[str]) -> list[str]:
"""Pin the headline xAI model to the top of the curated list."""
if _XAI_TOP_MODEL in ids:
return [_XAI_TOP_MODEL] + [m for m in ids if m != _XAI_TOP_MODEL]
return ids
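A short usage sketch of the promotion step: an alphabetical sort places `grok-4.3` after the `grok-4.20-*` entries, and promotion moves it to the front while keeping the rest in order. `promote_top` below is a standalone copy written for illustration, with the headline model passed as a parameter.

```python
TOP_MODEL = "grok-4.3"


def promote_top(ids: list[str], top: str = TOP_MODEL) -> list[str]:
    """Move `top` to the front if present; otherwise return the list unchanged."""
    if top in ids:
        return [top] + [m for m in ids if m != top]
    return ids


catalog = sorted([
    "grok-4.3",
    "grok-4.20-0309-reasoning",
    "grok-4.20-multi-agent-0309",
])
promoted = promote_top(catalog)
```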
def _xai_curated_models() -> list[str]:
"""Derive the xAI-direct curated list from models.dev disk cache.
@ -142,7 +152,7 @@ def _xai_curated_models() -> list[str]:
if isinstance(models, dict) and models:
ids = [mid for mid in models.keys() if isinstance(mid, str)]
if ids:
- return sorted(ids)
+ return _xai_promote_top(sorted(ids))
except Exception:
# Any failure (missing file, malformed JSON, import error)
# falls through to the static list.
@ -190,6 +200,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gpt-4o-mini",
],
"openai-codex": _codex_curated_models(),
"xai-oauth": _xai_curated_models(),
"copilot-acp": [
"copilot-acp",
],
@ -918,6 +929,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("alibaba", "Qwen Cloud", "Qwen Cloud / DashScope Coding (Qwen + multi-provider)"),
ProviderEntry("xai-oauth", "xAI Grok OAuth (SuperGrok Subscription)", "xAI Grok OAuth (SuperGrok Subscription)"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
ProviderEntry("tencent-tokenhub", "Tencent TokenHub", "Tencent TokenHub (Hy3 Preview — direct API via tokenhub.tencentmaas.com)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
@ -1036,6 +1048,10 @@ _PROVIDER_ALIASES = {
"amazon-bedrock": "bedrock",
"amazon": "bedrock",
"grok": "xai",
"grok-oauth": "xai-oauth",
"xai-oauth": "xai-oauth",
"x-ai-oauth": "xai-oauth",
"xai-grok-oauth": "xai-oauth",
"x-ai": "xai",
"x.ai": "xai",
"nim": "nvidia",
@ -2166,6 +2182,8 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
except Exception:
access_token = None
return get_codex_model_ids(access_token=access_token)
if normalized == "xai-oauth":
return list(_PROVIDER_MODELS.get("xai-oauth", _PROVIDER_MODELS.get("xai", [])))
if normalized in {"copilot", "copilot-acp"}:
try:
live = _fetch_github_models(_resolve_copilot_catalog_api_key())
@ -3444,14 +3462,14 @@ def validate_requested_model(
"message": message,
}
- # OpenAI Codex has its own catalog path; /v1/models probing is not the right validation path.
- if normalized == "openai-codex":
+ # Providers with non-standard catalog validation — /v1/models probing is not the right path.
+ if normalized in {"openai-codex", "xai-oauth"}:
try:
- codex_models = provider_model_ids("openai-codex")
+ catalog_models = provider_model_ids(normalized)
except Exception:
- codex_models = []
- if codex_models:
- if requested_for_lookup in set(codex_models):
+ catalog_models = []
+ if catalog_models:
+ if requested_for_lookup in set(catalog_models):
return {
"accepted": True,
"persist": True,
@ -3459,7 +3477,7 @@ def validate_requested_model(
"message": None,
}
# Auto-correct if the top match is very similar (e.g. typo)
- auto = get_close_matches(requested_for_lookup, codex_models, n=1, cutoff=0.9)
+ auto = get_close_matches(requested_for_lookup, catalog_models, n=1, cutoff=0.9)
if auto:
return {
"accepted": True,
@ -3468,17 +3486,18 @@ def validate_requested_model(
"corrected_model": auto[0],
"message": f"Auto-corrected `{requested}` → `{auto[0]}`",
}
- suggestions = get_close_matches(requested_for_lookup, codex_models, n=3, cutoff=0.5)
+ suggestions = get_close_matches(requested_for_lookup, catalog_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
+ provider_label = "OpenAI Codex" if normalized == "openai-codex" else "xAI Grok OAuth (SuperGrok Subscription)"
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
- f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
- "It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
+ f"Note: `{requested}` was not found in the {provider_label} model listing. "
+ "It may still work if your account has access to a newer or hidden model ID."
f"{suggestion_text}"
),
}
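The catalog validation above works in three tiers: an exact match, an auto-correction when the closest catalog entry scores at least 0.9, and otherwise a soft warning with up to three suggestions at a 0.5 cutoff. A minimal sketch of that tiering with `difflib.get_close_matches` follows; the `check_model` function and its `status` values are hypothetical and simplified compared with the dict returned by `validate_requested_model`.

```python
from difflib import get_close_matches


def check_model(requested: str, catalog: list[str]) -> dict:
    """Classify a requested model ID against a static catalog."""
    if requested in set(catalog):
        return {"status": "exact", "model": requested, "suggestions": []}
    # Tier 2: a near-identical ID (likely a typo) is silently corrected.
    auto = get_close_matches(requested, catalog, n=1, cutoff=0.9)
    if auto:
        return {"status": "corrected", "model": auto[0], "suggestions": []}
    # Tier 3: keep the ID as typed but offer looser matches as hints.
    hints = get_close_matches(requested, catalog, n=3, cutoff=0.5)
    return {"status": "unknown", "model": requested, "suggestions": hints}
```

The unknown tier still returns the requested ID rather than rejecting it, which matches the diff's choice to accept unlisted models with a note because an account may have access to IDs missing from the static list.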


@ -60,6 +60,12 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
auth_type="oauth_external",
base_url_override="https://chatgpt.com/backend-api/codex",
),
"xai-oauth": HermesOverlay(
transport="codex_responses",
auth_type="oauth_external",
base_url_override="https://api.x.ai/v1",
base_url_env_var="XAI_BASE_URL",
),
"qwen-oauth": HermesOverlay(
transport="openai_chat",
auth_type="oauth_external",
@ -244,6 +250,10 @@ ALIASES: Dict[str, str] = {
"x-ai": "xai",
"x.ai": "xai",
"grok": "xai",
"grok-oauth": "xai-oauth",
"xai-oauth": "xai-oauth",
"x-ai-oauth": "xai-oauth",
"xai-grok-oauth": "xai-oauth",
# nvidia
"nim": "nvidia",


@ -15,12 +15,14 @@ from hermes_cli.auth import (
AuthError,
DEFAULT_CODEX_BASE_URL,
DEFAULT_QWEN_BASE_URL,
DEFAULT_XAI_OAUTH_BASE_URL,
PROVIDER_REGISTRY,
_agent_key_is_usable,
format_auth_error,
resolve_provider,
resolve_nous_runtime_credentials,
resolve_codex_runtime_credentials,
resolve_xai_oauth_runtime_credentials,
resolve_qwen_runtime_credentials,
resolve_gemini_oauth_runtime_credentials,
resolve_api_key_provider_credentials,
@ -238,6 +240,9 @@ def _resolve_runtime_from_pool_entry(
if provider == "openai-codex":
api_mode = "codex_responses"
base_url = base_url or DEFAULT_CODEX_BASE_URL
elif provider == "xai-oauth":
api_mode = "codex_responses"
base_url = base_url or DEFAULT_XAI_OAUTH_BASE_URL
elif provider == "qwen-oauth":
api_mode = "chat_completions"
base_url = base_url or DEFAULT_QWEN_BASE_URL
@ -1132,6 +1137,24 @@ def resolve_runtime_provider(
logger.info("Auto-detected Codex provider but credentials failed; "
"falling through to next provider.")
if provider == "xai-oauth":
try:
creds = resolve_xai_oauth_runtime_credentials()
return {
"provider": "xai-oauth",
"api_mode": "codex_responses",
"base_url": (creds.get("base_url") or "").rstrip("/") or DEFAULT_XAI_OAUTH_BASE_URL,
"api_key": creds.get("api_key", ""),
"source": creds.get("source", "hermes-auth-store"),
"last_refresh": creds.get("last_refresh"),
"requested_provider": requested_provider,
}
except AuthError:
if requested_provider != "auto":
raise
logger.info("Auto-detected xAI OAuth provider but credentials failed; "
"falling through to next provider.")
if provider == "qwen-oauth":
try:
creds = resolve_qwen_runtime_credentials()


@ -1091,6 +1091,58 @@ def _install_kittentts_deps() -> bool:
return False
def _xai_oauth_logged_in_for_setup() -> bool:
"""True iff xAI Grok OAuth credentials are already stored locally.
Lets TTS / STT setup skip the API-key prompt for users who logged in
through ``hermes model`` -> xAI Grok OAuth (SuperGrok Subscription).
"""
try:
from hermes_cli.auth import get_xai_oauth_auth_status
return bool(get_xai_oauth_auth_status().get("logged_in"))
except Exception:
return False
def _run_xai_oauth_login_from_setup() -> bool:
"""Run the xAI Grok OAuth loopback login from inside the setup wizard.
Returns True on success, False on any failure (the caller falls back
to whatever the user picked next, e.g. Edge TTS).
"""
try:
from hermes_cli.auth import (
DEFAULT_XAI_OAUTH_BASE_URL,
_is_remote_session,
_save_xai_oauth_tokens,
_update_config_for_provider,
_xai_oauth_loopback_login,
)
except Exception as exc:
print_warning(f"xAI Grok OAuth helpers unavailable: {exc}")
return False
open_browser = not _is_remote_session()
print()
print_info("Signing in to xAI Grok OAuth (SuperGrok Subscription)...")
try:
creds = _xai_oauth_loopback_login(open_browser=open_browser)
_save_xai_oauth_tokens(
creds["tokens"],
discovery=creds.get("discovery"),
redirect_uri=creds.get("redirect_uri", ""),
last_refresh=creds.get("last_refresh"),
)
_update_config_for_provider(
"xai-oauth", creds.get("base_url", DEFAULT_XAI_OAUTH_BASE_URL)
)
return True
except Exception as exc:
print_warning(f"xAI Grok OAuth login failed: {exc}")
return False
def _setup_tts_provider(config: dict):
"""Interactive TTS provider selection with install flow for NeuTTS."""
tts_config = config.get("tts", {})
@ -1125,7 +1177,7 @@ def _setup_tts_provider(config: dict):
"Edge TTS (free, cloud-based, no setup needed)",
"ElevenLabs (premium quality, needs API key)",
"OpenAI TTS (good quality, needs API key)",
"xAI TTS (Grok voices, needs API key)",
"xAI TTS (Grok voices — OAuth login or API key)",
"MiniMax TTS (high quality with voice cloning, needs API key)",
"Mistral Voxtral TTS (multilingual, native Opus, needs API key)",
"Google Gemini TTS (30 prebuilt voices, prompt-controllable, needs API key)",
@ -1199,21 +1251,59 @@ def _setup_tts_provider(config: dict):
selected = "edge"
elif selected == "xai":
existing = get_env_value("XAI_API_KEY")
if not existing:
# Resolution order: existing OAuth tokens (free for SuperGrok subscribers
# via the Hermes auth store) > existing XAI_API_KEY > prompt the user.
# When neither is configured, offer both options instead of forcing the
# API-key path — xAI TTS works fine with OAuth bearer tokens too.
oauth_logged_in = _xai_oauth_logged_in_for_setup()
existing_api_key = get_env_value("XAI_API_KEY")
if oauth_logged_in:
print_success(
"xAI TTS will use your xAI Grok OAuth (SuperGrok Subscription) "
"credentials"
)
elif existing_api_key:
print_success("xAI TTS will use your existing XAI_API_KEY")
else:
print()
api_key = prompt("xAI API key for TTS", password=True)
if api_key:
save_env_value("XAI_API_KEY", api_key)
print_success("xAI TTS API key saved")
choice_idx = prompt_choice(
"How do you want xAI TTS to authenticate?",
choices=[
"Sign in with xAI Grok OAuth (SuperGrok Subscription) — browser login",
"Paste an xAI API key (console.x.ai)",
"Skip → fallback to Edge TTS",
],
default=0,
)
if choice_idx == 0:
if _run_xai_oauth_login_from_setup():
print_success(
"Logged in — xAI TTS will use these OAuth credentials"
)
else:
print_warning(
"xAI Grok OAuth login did not complete. "
"Falling back to Edge TTS."
)
selected = "edge"
elif choice_idx == 1:
api_key = prompt("xAI API key for TTS", password=True)
if api_key:
save_env_value("XAI_API_KEY", api_key)
print_success("xAI TTS API key saved")
else:
from hermes_constants import display_hermes_home as _dhh
print_warning(
"No xAI API key provided for TTS. Configure XAI_API_KEY "
f"via hermes setup model or {_dhh()}/.env to use xAI TTS. "
"Falling back to Edge TTS."
)
selected = "edge"
else:
from hermes_constants import display_hermes_home as _dhh
print_warning(
"No xAI API key provided for TTS. Configure XAI_API_KEY via "
f"hermes setup model or {_dhh()}/.env to use xAI TTS. "
"Falling back to Edge TTS."
)
print_warning("xAI TTS skipped. Falling back to Edge TTS.")
selected = "edge"
if selected == "xai":
print()
voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")

@ -194,11 +194,10 @@ TOOL_CATEGORIES = {
},
{
"name": "xAI TTS",
"tag": "Grok voices - requires xAI API key",
"env_vars": [
{"key": "XAI_API_KEY", "prompt": "xAI API key", "url": "https://console.x.ai/"},
],
"tag": "Grok voices — uses xAI Grok OAuth or XAI_API_KEY",
"env_vars": [],
"tts_provider": "xai",
"post_setup": "xai_grok",
},
{
"name": "ElevenLabs",
@ -925,6 +924,73 @@ def _run_post_setup(post_setup_key: str):
_print_info(" Restart Hermes for tracing to take effect.")
_print_info(" Verify: hermes plugins list")
elif post_setup_key == "xai_grok":
# Shared credential bootstrap for any picker entry that talks to xAI
# (TTS, Video Gen, future Image Gen, etc.). Accepts either a
# SuperGrok-tier OAuth bearer token (preferred — billed against the
# user's existing subscription) or a raw XAI_API_KEY from
# console.x.ai. The picker entries declare empty env_vars so we
# drive the full auth UX here.
try:
from hermes_cli.auth import get_xai_oauth_auth_status
oauth_logged_in = bool(get_xai_oauth_auth_status().get("logged_in"))
except Exception:
oauth_logged_in = False
existing_api_key = get_env_value("XAI_API_KEY")
if oauth_logged_in:
_print_success(
" xAI will use your xAI Grok OAuth (SuperGrok Subscription) credentials"
)
return
if existing_api_key:
_print_success(" xAI will use your existing XAI_API_KEY")
return
_print_info(" xAI needs credentials. Choose one:")
try:
from hermes_cli.setup import (
_run_xai_oauth_login_from_setup,
prompt_choice,
prompt as _setup_prompt,
)
from hermes_cli.config import save_env_value
except Exception as exc:
_print_warning(f" Could not load setup helpers: {exc}")
_print_info(" Run later: hermes auth add xai-oauth (or set XAI_API_KEY)")
return
idx = prompt_choice(
" How do you want xAI to authenticate?",
choices=[
"Sign in with xAI Grok OAuth (SuperGrok Subscription) — browser login",
"Paste an xAI API key (console.x.ai)",
"Skip — configure later via `hermes auth add xai-oauth`",
],
default=0,
)
if idx == 0:
if _run_xai_oauth_login_from_setup():
_print_success(
" Logged in — xAI will use these OAuth credentials"
)
else:
_print_warning(
" xAI Grok OAuth login did not complete. "
"Run later: hermes auth add xai-oauth"
)
elif idx == 1:
api_key = _setup_prompt(" xAI API key", password=True)
if api_key:
save_env_value("XAI_API_KEY", api_key)
_print_success(" XAI_API_KEY saved")
else:
_print_warning(
" No API key provided. Run later: hermes auth add xai-oauth"
)
else:
_print_info(" xAI will remain inactive until credentials are configured.")
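
The `xai_grok` hook above checks for stored OAuth tokens first, then an existing `XAI_API_KEY`, and only prompts when neither is present. That ordering can be sketched as a pure function; `pick_xai_auth_action` and its return labels are hypothetical, chosen only to make the precedence testable:

```python
from typing import Optional


def pick_xai_auth_action(oauth_logged_in: bool, api_key: Optional[str]) -> str:
    """Return which credential path the setup hook would take.

    Precedence (as in the hunk above): OAuth tokens, then XAI_API_KEY,
    then an interactive prompt.
    """
    if oauth_logged_in:
        return "use_oauth"
    if api_key:
        return "use_api_key"
    return "prompt_user"
```

Keeping the decision separate from the printing and prompting calls is a design choice of this sketch, not something the diff itself does.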
# ─── Platform / Toolset Helpers ───────────────────────────────────────────────

@ -31,7 +31,7 @@ from agent.image_gen_provider import (
save_b64_image,
success_response,
)
from tools.xai_http import hermes_xai_user_agent
from tools.xai_http import hermes_xai_user_agent, resolve_xai_http_credentials
logger = logging.getLogger(__name__)
@ -39,14 +39,17 @@ logger = logging.getLogger(__name__)
# Model catalog
# ---------------------------------------------------------------------------
API_MODEL = "grok-imagine-image"
_MODELS: Dict[str, Dict[str, Any]] = {
"grok-imagine-image": {
"display": "Grok Imagine Image",
"speed": "~5-10s",
"strengths": "Fast, high-quality",
},
"grok-imagine-image-quality": {
"display": "Grok Imagine Image (Quality)",
"speed": "~10-20s",
"strengths": "Higher fidelity / detail; slower than the standard model.",
},
}
DEFAULT_MODEL = "grok-imagine-image"
@ -127,7 +130,8 @@ class XAIImageGenProvider(ImageGenProvider):
return "xAI (Grok)"
def is_available(self) -> bool:
return bool(os.getenv("XAI_API_KEY"))
creds = resolve_xai_http_credentials()
return bool(creds.get("api_key"))
def list_models(self) -> List[Dict[str, Any]]:
return [
@ -141,17 +145,16 @@ class XAIImageGenProvider(ImageGenProvider):
]
def get_setup_schema(self) -> Dict[str, Any]:
# Auth resolution is delegated to the shared ``xai_grok`` post_setup
# hook (``hermes_cli/tools_config.py``); identical to the TTS / video
# gen entries so users see the same OAuth-or-API-key choice for every
# xAI service.
return {
"name": "xAI (Grok)",
"name": "xAI Grok Imagine (image)",
"badge": "paid",
"tag": "Native xAI image generation via grok-imagine-image",
"env_vars": [
{
"key": "XAI_API_KEY",
"prompt": "xAI API key",
"url": "https://console.x.ai/",
},
],
"tag": "grok-imagine-image — text-to-image; uses xAI Grok OAuth or XAI_API_KEY",
"env_vars": [],
"post_setup": "xai_grok",
}
def generate(
@ -161,12 +164,14 @@ class XAIImageGenProvider(ImageGenProvider):
**kwargs: Any,
) -> Dict[str, Any]:
"""Generate an image using xAI's grok-imagine-image."""
api_key = os.getenv("XAI_API_KEY", "").strip()
creds = resolve_xai_http_credentials()
api_key = str(creds.get("api_key") or "").strip()
provider_name = str(creds.get("provider") or "xai").strip() or "xai"
if not api_key:
return error_response(
error="XAI_API_KEY not set. Get one at https://console.x.ai/",
error="No xAI credentials found. Configure xAI OAuth in `hermes model` or set XAI_API_KEY.",
error_type="missing_api_key",
provider="xai",
provider=provider_name,
aspect_ratio=aspect_ratio,
)
@ -177,7 +182,7 @@ class XAIImageGenProvider(ImageGenProvider):
xai_res = resolution if resolution in _XAI_RESOLUTIONS else DEFAULT_RESOLUTION
payload: Dict[str, Any] = {
"model": API_MODEL,
"model": model_id,
"prompt": prompt,
"aspect_ratio": xai_ar,
"resolution": xai_res,
@ -189,7 +194,7 @@ class XAIImageGenProvider(ImageGenProvider):
"User-Agent": hermes_xai_user_agent(),
}
base_url = (os.getenv("XAI_BASE_URL") or "https://api.x.ai/v1").strip().rstrip("/")
base_url = str(creds.get("base_url") or "https://api.x.ai/v1").strip().rstrip("/")
try:
response = requests.post(
@ -210,7 +215,7 @@ class XAIImageGenProvider(ImageGenProvider):
return error_response(
error=f"xAI image generation failed ({status}): {err_msg}",
error_type="api_error",
provider="xai",
provider=provider_name,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
@ -219,7 +224,7 @@ class XAIImageGenProvider(ImageGenProvider):
return error_response(
error="xAI image generation timed out (120s)",
error_type="timeout",
provider="xai",
provider=provider_name,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
@ -228,7 +233,7 @@ class XAIImageGenProvider(ImageGenProvider):
return error_response(
error=f"xAI connection error: {exc}",
error_type="connection_error",
provider="xai",
provider=provider_name,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
@ -240,7 +245,7 @@ class XAIImageGenProvider(ImageGenProvider):
return error_response(
error=f"xAI returned invalid JSON: {exc}",
error_type="invalid_response",
provider="xai",
provider=provider_name,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
@ -252,7 +257,7 @@ class XAIImageGenProvider(ImageGenProvider):
return error_response(
error="xAI returned no image data",
error_type="empty_response",
provider="xai",
provider=provider_name,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,

@ -10,8 +10,12 @@ Originally salvaged from PR #10600 by @Jaaneek; reshaped into the
:class:`VideoGenProvider` plugin interface and trimmed to the
generate-only surface.
Authentication via ``XAI_API_KEY``. Output is an HTTPS URL from xAI's
CDN; the gateway downloads and delivers it.
Authentication: xAI Grok OAuth tokens (preferred; billed against the
user's SuperGrok subscription) or ``XAI_API_KEY``. Both routes are
resolved through ``tools.xai_http.resolve_xai_http_credentials`` so a
single login covers chat + TTS + image gen + video gen + transcription.
Output is an HTTPS URL from xAI's CDN; the gateway downloads and
delivers it.
"""
from __future__ import annotations
@ -20,7 +24,7 @@ import asyncio
import logging
import os
import uuid
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Tuple
import httpx
@ -66,24 +70,44 @@ _MODELS: Dict[str, Dict[str, Any]] = {
# ---------------------------------------------------------------------------
def _xai_base_url() -> str:
return (os.getenv("XAI_BASE_URL") or DEFAULT_XAI_BASE_URL).strip().rstrip("/")
def _resolve_xai_credentials() -> Tuple[str, str]:
"""Return ``(api_key, base_url)`` from the shared xAI credential resolver.
Order: runtime provider (xai-oauth pool entry) -> singleton ``auth.json``
OAuth tokens -> ``XAI_API_KEY`` env var. ``api_key`` is empty when no
credential source is available; callers must check before using it.
"""
try:
from tools.xai_http import resolve_xai_http_credentials
creds = resolve_xai_http_credentials() or {}
except Exception as exc:
logger.debug("xAI credential resolver failed: %s", exc)
creds = {}
api_key = str(creds.get("api_key") or os.getenv("XAI_API_KEY", "")).strip()
base_url = str(
creds.get("base_url")
or os.getenv("XAI_BASE_URL")
or DEFAULT_XAI_BASE_URL
).strip().rstrip("/")
return api_key, base_url
def _xai_headers() -> Dict[str, str]:
api_key = os.getenv("XAI_API_KEY", "").strip()
if not api_key:
raise ValueError("XAI_API_KEY not set. Get one at https://console.x.ai/")
def _xai_user_agent() -> str:
try:
from tools.xai_http import hermes_xai_user_agent
ua = hermes_xai_user_agent()
return hermes_xai_user_agent()
except Exception:
ua = "hermes-agent/video_gen"
return "hermes-agent/video_gen"
def _xai_headers(api_key: str) -> Dict[str, str]:
return {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"User-Agent": ua,
"User-Agent": _xai_user_agent(),
}
@ -110,12 +134,15 @@ def _clamp_duration(duration: Optional[int], has_reference_images: bool) -> int:
async def _submit(
client: httpx.AsyncClient,
payload: Dict[str, Any],
*,
api_key: str,
base_url: str,
) -> str:
"""POST to /videos/generations — xAI's only public endpoint for our
text-to-video and image-to-video surface."""
response = await client.post(
f"{_xai_base_url()}/videos/generations",
headers={**_xai_headers(), "x-idempotency-key": str(uuid.uuid4())},
f"{base_url}/videos/generations",
headers={**_xai_headers(api_key), "x-idempotency-key": str(uuid.uuid4())},
json=payload,
timeout=60,
)
@ -131,6 +158,8 @@ async def _poll(
client: httpx.AsyncClient,
request_id: str,
*,
api_key: str,
base_url: str,
timeout_seconds: int,
poll_interval: int,
) -> Dict[str, Any]:
@ -138,8 +167,8 @@ async def _poll(
last_status = "queued"
while elapsed < timeout_seconds:
response = await client.get(
f"{_xai_base_url()}/videos/{request_id}",
headers=_xai_headers(),
f"{base_url}/videos/{request_id}",
headers=_xai_headers(api_key),
timeout=30,
)
response.raise_for_status()
@ -174,7 +203,8 @@ class XAIVideoGenProvider(VideoGenProvider):
return "xAI"
def is_available(self) -> bool:
return bool(os.environ.get("XAI_API_KEY", "").strip())
api_key, _ = _resolve_xai_credentials()
return bool(api_key)
def list_models(self) -> List[Dict[str, Any]]:
return [{"id": mid, **meta} for mid, meta in _MODELS.items()]
@ -183,17 +213,18 @@ class XAIVideoGenProvider(VideoGenProvider):
return DEFAULT_MODEL
def get_setup_schema(self) -> Dict[str, Any]:
# Auth resolution lives entirely in the shared ``xai_grok`` post_setup
# hook (``hermes_cli/tools_config.py``) so the picker doesn't blindly
# prompt for an API key when the user is already signed in via xAI
# Grok OAuth (SuperGrok Subscription) — TTS / image gen / video gen
# all share the same credential resolver. The hook offers an
# OAuth-vs-API-key choice when neither is configured.
return {
"name": "xAI",
"name": "xAI Grok Imagine",
"badge": "paid",
"tag": "grok-imagine-video — text-to-video & image-to-video with reference images",
"env_vars": [
{
"key": "XAI_API_KEY",
"prompt": "xAI API key",
"url": "https://console.x.ai/",
},
],
"tag": "grok-imagine-video — text-to-video & image-to-video; uses xAI Grok OAuth or XAI_API_KEY",
"env_vars": [],
"post_setup": "xai_grok",
}
def capabilities(self) -> Dict[str, Any]:
@ -259,9 +290,14 @@ class XAIVideoGenProvider(VideoGenProvider):
aspect_ratio: str,
resolution: str,
) -> Dict[str, Any]:
if not os.environ.get("XAI_API_KEY", "").strip():
api_key, base_url = _resolve_xai_credentials()
if not api_key:
return error_response(
error="XAI_API_KEY not set. Get one at https://console.x.ai/",
error=(
"No xAI credentials found. Sign in via `hermes auth add xai-oauth` "
"(SuperGrok subscription) or set XAI_API_KEY from "
"https://console.x.ai/."
),
error_type="auth_required",
provider="xai", prompt=prompt,
)
@ -317,7 +353,9 @@ class XAIVideoGenProvider(VideoGenProvider):
async with httpx.AsyncClient() as client:
try:
request_id = await _submit(client, payload)
request_id = await _submit(
client, payload, api_key=api_key, base_url=base_url
)
except httpx.HTTPStatusError as exc:
detail = ""
try:
@ -334,6 +372,7 @@ class XAIVideoGenProvider(VideoGenProvider):
poll_result = await _poll(
client, request_id,
api_key=api_key, base_url=base_url,
timeout_seconds=DEFAULT_TIMEOUT_SECONDS,
poll_interval=DEFAULT_POLL_INTERVAL_SECONDS,
)
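
The fallback in `_resolve_xai_credentials` (shared resolver first, environment variables second, built-in default last) can be exercised in isolation. The sketch below assumes the resolver and the environment are injected as parameters so no real credentials are needed; `resolve_credentials` and `DEFAULT_BASE_URL` are illustrative names, not the plugin's own:

```python
import os
from typing import Callable, Mapping, Optional, Tuple

DEFAULT_BASE_URL = "https://api.x.ai/v1"  # assumed to match DEFAULT_XAI_BASE_URL


def resolve_credentials(
    resolver: Callable[[], Optional[dict]],
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (api_key, base_url); api_key is "" when nothing is configured."""
    try:
        creds = resolver() or {}
    except Exception:
        creds = {}  # a failing resolver must not block the env-var fallback
    api_key = str(creds.get("api_key") or env.get("XAI_API_KEY", "")).strip()
    base_url = str(
        creds.get("base_url") or env.get("XAI_BASE_URL") or DEFAULT_BASE_URL
    ).strip().rstrip("/")
    return api_key, base_url
```

Callers still have to check for an empty `api_key` before building the `Authorization` header, as `is_available()` and `generate()` do in the diff.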

@ -1275,7 +1275,7 @@ class AIAgent:
self.api_mode = api_mode
elif self.provider == "openai-codex":
self.api_mode = "codex_responses"
elif self.provider == "xai":
elif self.provider in {"xai", "xai-oauth"}:
self.api_mode = "codex_responses"
elif (provider_name is None) and (
self._base_url_hostname == "chatgpt.com"
@ -7139,15 +7139,60 @@ class AIAgent:
raise RuntimeError("Responses create(stream=True) fallback did not emit a terminal response.")
def _try_refresh_codex_client_credentials(self, *, force: bool = True) -> bool:
if self.api_mode != "codex_responses" or self.provider != "openai-codex":
if self.api_mode != "codex_responses" or self.provider not in {"openai-codex", "xai-oauth"}:
return False
# Guard against silent account swap.
#
# When an agent is using a non-singleton credential — e.g. a manual
# pool entry (``hermes auth add xai-oauth``) whose tokens belong to
# a different account than the loopback_pkce singleton, or an agent
# constructed with an explicit ``api_key=`` arg — force-refreshing
# the singleton here and adopting its tokens silently re-routes the
# rest of the conversation onto the singleton's account. The
# credential pool's reactive recovery (``_recover_with_credential_pool``)
# is the right channel for that case; this path is the
# singleton-only fallback used when the pool can't recover, and
# MUST only fire when the agent really is on singleton tokens.
try:
if self.provider == "openai-codex":
from hermes_cli.auth import resolve_codex_runtime_credentials
singleton_now = resolve_codex_runtime_credentials(
refresh_if_expiring=False,
)
else:
from hermes_cli.auth import resolve_xai_oauth_runtime_credentials
singleton_now = resolve_xai_oauth_runtime_credentials(
refresh_if_expiring=False,
)
except Exception as exc:
logger.debug("%s singleton read failed: %s", self.provider, exc)
return False
singleton_key = str(singleton_now.get("api_key") or "").strip()
active_key = str(self.api_key or "").strip()
if singleton_key and active_key and singleton_key != active_key:
logger.debug(
"%s singleton tokens differ from the active api_key; "
"skipping singleton force-refresh to avoid silent account swap. "
"Reactive credential rotation should go through the pool.",
self.provider,
)
return False
try:
from hermes_cli.auth import resolve_codex_runtime_credentials
if self.provider == "openai-codex":
from hermes_cli.auth import resolve_codex_runtime_credentials
creds = resolve_codex_runtime_credentials(force_refresh=force)
creds = resolve_codex_runtime_credentials(force_refresh=force)
else:
from hermes_cli.auth import resolve_xai_oauth_runtime_credentials
creds = resolve_xai_oauth_runtime_credentials(force_refresh=force)
except Exception as exc:
logger.debug("Codex credential refresh failed: %s", exc)
logger.debug("%s credential refresh failed: %s", self.provider, exc)
return False
api_key = creds.get("api_key")
@ -7162,7 +7207,7 @@ class AIAgent:
self._client_kwargs["api_key"] = self.api_key
self._client_kwargs["base_url"] = self.base_url
if not self._replace_primary_openai_client(reason="codex_credential_refresh"):
if not self._replace_primary_openai_client(reason=f"{self.provider}_credential_refresh"):
return False
return True
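
The account-swap guard in `_try_refresh_codex_client_credentials` reduces to a comparison between the agent's active key and the singleton's key. A minimal sketch of that predicate, assuming both values may be `None` or padded with whitespace; `should_force_refresh` is a hypothetical helper name:

```python
from typing import Optional


def should_force_refresh(active_key: Optional[str], singleton_key: Optional[str]) -> bool:
    """Return False when refreshing the singleton would swap accounts.

    The refresh is skipped only when both keys are present and differ;
    a missing key on either side leaves the refresh allowed, matching the
    `singleton_key and active_key and singleton_key != active_key` check.
    """
    active = str(active_key or "").strip()
    singleton = str(singleton_key or "").strip()
    if singleton and active and singleton != active:
        return False
    return True
```

When this returns False, the diff's comments point to the credential pool's reactive recovery as the intended channel.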
@ -9631,7 +9676,7 @@ class AIAgent:
and "/backend-api/codex" in self._base_url_lower
)
)
is_xai_responses = self.provider == "xai" or self._base_url_hostname == "api.x.ai"
is_xai_responses = self.provider in {"xai", "xai-oauth"} or self._base_url_hostname == "api.x.ai"
_msgs_for_codex = self._prepare_messages_for_non_vision_model(api_messages)
return _ct.build_kwargs(
model=self.model,
@ -13700,13 +13745,14 @@ class AIAgent:
if (
self.api_mode == "codex_responses"
and self.provider == "openai-codex"
and self.provider in {"openai-codex", "xai-oauth"}
and status_code == 401
and not codex_auth_retry_attempted
):
codex_auth_retry_attempted = True
if self._try_refresh_codex_client_credentials(force=True):
self._vprint(f"{self.log_prefix}🔐 Codex auth refreshed after 401. Retrying request...")
_label = "xAI OAuth" if self.provider == "xai-oauth" else "Codex"
self._vprint(f"{self.log_prefix}🔐 {_label} auth refreshed after 401. Retrying request...")
continue
if (
self.api_mode == "chat_completions"
@ -14346,11 +14392,15 @@ class AIAgent:
self._vprint(f"{self.log_prefix} 🌐 Endpoint: {_base}", force=True)
# Actionable guidance for common auth errors
if classified.is_auth or classified.reason == FailoverReason.billing:
if _provider == "openai-codex" and status_code == 401:
self._vprint(f"{self.log_prefix} 💡 Codex OAuth token was rejected (HTTP 401). Your token may have been", force=True)
self._vprint(f"{self.log_prefix} refreshed by another client (Codex CLI, VS Code). To fix:", force=True)
self._vprint(f"{self.log_prefix} 1. Run `codex` in your terminal to generate fresh tokens.", force=True)
self._vprint(f"{self.log_prefix} 2. Then run `hermes auth` to re-authenticate.", force=True)
if _provider in {"openai-codex", "xai-oauth"} and status_code == 401:
if _provider == "openai-codex":
self._vprint(f"{self.log_prefix} 💡 Codex OAuth token was rejected (HTTP 401). Your token may have been", force=True)
self._vprint(f"{self.log_prefix} refreshed by another client (Codex CLI, VS Code). To fix:", force=True)
self._vprint(f"{self.log_prefix} 1. Run `codex` in your terminal to generate fresh tokens.", force=True)
self._vprint(f"{self.log_prefix} 2. Then run `hermes auth` to re-authenticate.", force=True)
else:
self._vprint(f"{self.log_prefix} 💡 xAI OAuth token was rejected (HTTP 401). To fix:", force=True)
self._vprint(f"{self.log_prefix} re-authenticate with xAI Grok OAuth (SuperGrok Subscription) from `hermes model`.", force=True)
else:
self._vprint(f"{self.log_prefix} 💡 Your API key was rejected by the provider. Check:", force=True)
self._vprint(f"{self.log_prefix} • Is the key valid? Run: hermes setup", force=True)

@ -100,6 +100,49 @@ class TestCodexBuildKwargs:
)
assert "prompt_cache_key" not in kw
def test_xai_responses_sends_cache_key_via_extra_body(self, transport):
"""xAI's Responses API documents ``prompt_cache_key`` as the
body-level cache-routing key (the ``x-grok-conv-id`` header is
Chat-Completions-only). Passing it via ``extra_body`` is robust
against openai SDK builds whose ``Responses.stream()`` kwarg
signature ever drops the field; the body field still serializes
and reaches xAI either way. The ``x-grok-conv-id`` header is kept
as a belt-and-braces fallback so cache routing survives even
when the body field would be stripped by an intermediate proxy.
Ref: https://docs.x.ai/developers/advanced-api-usage/prompt-caching/maximizing-cache-hits
"""
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="grok-4.3", messages=messages, tools=[],
session_id="conv-xai-1",
is_xai_responses=True,
)
# Top-level prompt_cache_key must NOT be set for xAI — the SDK
# signature drop is what motivated the extra_body indirection in
# the first place. The cache-routing field must travel in the
# body via extra_body.
assert "prompt_cache_key" not in kw
assert kw.get("extra_body", {}).get("prompt_cache_key") == "conv-xai-1"
# Header kept as belt-and-braces.
assert kw.get("extra_headers", {}).get("x-grok-conv-id") == "conv-xai-1"
def test_xai_responses_extra_body_preserves_caller_fields(self, transport):
"""When the caller already supplies ``extra_body`` (e.g. via
request_overrides), the xAI cache-key injection must merge into
the existing dict instead of overwriting it. Caller-supplied
``prompt_cache_key`` wins (setdefault semantics) so user overrides
aren't silently clobbered by the transport."""
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="grok-4.3", messages=messages, tools=[],
session_id="conv-xai-1",
is_xai_responses=True,
request_overrides={"extra_body": {"prompt_cache_key": "caller-override", "other_field": 42}},
)
eb = kw.get("extra_body", {})
assert eb.get("prompt_cache_key") == "caller-override"
assert eb.get("other_field") == 42
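
The two tests above pin down a merge rule: the cache key travels in `extra_body`, and a caller-supplied value wins. A minimal sketch of that rule using `dict.setdefault`; `inject_cache_key` is a hypothetical helper, and setting the header unconditionally is an assumption of this sketch rather than something the tests specify:

```python
def inject_cache_key(kwargs: dict, session_id: str) -> dict:
    """Return a copy of kwargs with xAI cache routing merged in.

    The caller's dicts are copied, not mutated. setdefault keeps any
    caller-supplied prompt_cache_key instead of overwriting it.
    """
    merged = dict(kwargs)
    extra_body = dict(merged.get("extra_body") or {})
    extra_body.setdefault("prompt_cache_key", session_id)
    merged["extra_body"] = extra_body
    extra_headers = dict(merged.get("extra_headers") or {})
    extra_headers["x-grok-conv-id"] = session_id  # assumption: header is always set
    merged["extra_headers"] = extra_headers
    return merged
```

Because the key never appears as a top-level kwarg, an SDK build whose `Responses.stream()` signature lacks `prompt_cache_key` would not reject the call.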
def test_max_tokens(self, transport):
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(

File diff suppressed because it is too large

@ -72,10 +72,13 @@ class TestXAIImageGenProvider:
provider = XAIImageGenProvider()
schema = provider.get_setup_schema()
assert schema["name"] == "xAI (Grok)"
assert schema["name"] == "xAI Grok Imagine (image)"
assert schema["badge"] == "paid"
assert len(schema["env_vars"]) == 1
assert schema["env_vars"][0]["key"] == "XAI_API_KEY"
# Auth resolution is delegated to the shared "xai_grok" post_setup
# hook so the picker doesn't blindly prompt for XAI_API_KEY when the
# user is already signed in via xAI Grok OAuth.
assert schema["env_vars"] == []
assert schema["post_setup"] == "xai_grok"
# ---------------------------------------------------------------------------

@ -54,6 +54,50 @@ def test_xai_generate_requires_xai_key(monkeypatch):
assert result["error_type"] == "auth_required"
def test_xai_available_with_oauth_only(monkeypatch):
"""The plugin must honour xAI Grok OAuth credentials, not just
XAI_API_KEY. Otherwise the agent's tool-availability check filters
``video_generate`` out of the toolbelt and the agent silently falls
back to whatever skill advertises video generation (e.g. comfyui).
"""
import plugins.video_gen.xai as xai_plugin
monkeypatch.delenv("XAI_API_KEY", raising=False)
monkeypatch.setattr(
"tools.xai_http.resolve_xai_http_credentials",
lambda: {
"provider": "xai-oauth",
"api_key": "oauth-bearer-token",
"base_url": "https://api.x.ai/v1",
},
)
assert xai_plugin.XAIVideoGenProvider().is_available() is True
def test_xai_resolved_credentials_threaded_through_request(monkeypatch):
"""OAuth-resolved creds must reach the HTTP layer — bug class where
``is_available()`` says yes but the request still hits with no key.
"""
import plugins.video_gen.xai as xai_plugin
monkeypatch.delenv("XAI_API_KEY", raising=False)
monkeypatch.setattr(
"tools.xai_http.resolve_xai_http_credentials",
lambda: {
"provider": "xai-oauth",
"api_key": "oauth-bearer-token",
"base_url": "https://api.x.ai/v1",
},
)
api_key, base_url = xai_plugin._resolve_xai_credentials()
assert api_key == "oauth-bearer-token"
assert base_url == "https://api.x.ai/v1"
headers = xai_plugin._xai_headers(api_key)
assert headers["Authorization"] == "Bearer oauth-bearer-token"
def test_xai_no_operation_kwarg():
"""The ABC's generate() signature no longer accepts 'operation'.
Passing it through **kwargs should be ignored (forward-compat)."""

@ -578,6 +578,197 @@ def test_run_conversation_codex_refreshes_after_401_and_retries(monkeypatch):
assert result["final_response"] == "Recovered after refresh"
def _build_xai_oauth_agent(monkeypatch):
_patch_agent_bootstrap(monkeypatch)
agent = run_agent.AIAgent(
model="grok-code-fast-1",
provider="xai-oauth",
api_mode="codex_responses",
base_url="https://api.x.ai/v1",
api_key="xai-oauth-token",
quiet_mode=True,
max_iterations=4,
skip_context_files=True,
skip_memory=True,
)
agent._cleanup_task_resources = lambda task_id: None
agent._persist_session = lambda messages, history=None: None
agent._save_trajectory = lambda messages, user_message, completed: None
agent._save_session_log = lambda messages: None
return agent
def test_build_api_kwargs_xai_oauth_sends_cache_key_via_extra_body(monkeypatch):
"""xai-oauth + codex_responses must route prompt caching via the
``prompt_cache_key`` body field on /v1/responses (xAI's documented
Responses-API cache key; see docs.x.ai prompt-caching/maximizing-cache-hits).
We pass it through ``extra_body`` rather than as a top-level kwarg so
the body field is serialized into JSON regardless of whether the
installed openai SDK build still accepts ``prompt_cache_key`` on
``Responses.stream()``. Older or trimmed SDK builds drop it from the
signature and would otherwise raise ``TypeError`` before the request
reaches api.x.ai. The ``x-grok-conv-id`` header is retained as a
belt-and-braces fallback for clients/proxies that route on headers."""
agent = _build_xai_oauth_agent(monkeypatch)
kwargs = agent._build_api_kwargs(
[
{"role": "system", "content": "You are Hermes."},
{"role": "user", "content": "Ping"},
]
)
assert kwargs.get("model") == "grok-code-fast-1"
# Top-level kwarg must NOT be set — that's the openai SDK
# incompatibility this whole indirection exists to dodge.
assert "prompt_cache_key" not in kwargs
extra_body = kwargs.get("extra_body") or {}
assert extra_body.get("prompt_cache_key"), (
"xAI prompt-cache routing must travel via extra_body.prompt_cache_key "
"for /v1/responses — body field is the documented surface."
)
headers = kwargs.get("extra_headers") or {}
assert "x-grok-conv-id" in headers, (
"x-grok-conv-id header kept as belt-and-braces fallback for clients "
"that route on headers."
)
def test_run_conversation_xai_oauth_refreshes_after_401_and_retries(monkeypatch):
"""xai-oauth speaks the Responses API just like codex. When the access
token is rejected mid-call (401), the same reactive refresh-and-retry
handler that fires for openai-codex must also fire for xai-oauth. The
bug this test caught: the gating condition checked only ``provider == "openai-codex"``,
so xai-oauth 401s leaked straight to the non-retryable abort path with no
chance to swap in a freshly refreshed access token."""
agent = _build_xai_oauth_agent(monkeypatch)
calls = {"api": 0, "refresh": 0}
class _UnauthorizedError(RuntimeError):
def __init__(self):
super().__init__("Error code: 401 - unauthorized")
self.status_code = 401
def _fake_api_call(api_kwargs):
calls["api"] += 1
if calls["api"] == 1:
raise _UnauthorizedError()
return _codex_message_response("Recovered after xAI refresh")
def _fake_refresh(*, force=True):
calls["refresh"] += 1
assert force is True
return True
monkeypatch.setattr(agent, "_interruptible_api_call", _fake_api_call)
monkeypatch.setattr(agent, "_try_refresh_codex_client_credentials", _fake_refresh)
result = agent.run_conversation("Say OK")
assert calls["api"] == 2
assert calls["refresh"] == 1
assert result["completed"] is True
assert result["final_response"] == "Recovered after xAI refresh"
def test_try_refresh_codex_client_credentials_handles_xai_oauth(monkeypatch):
"""``_try_refresh_codex_client_credentials`` must rebuild the OpenAI
client with freshly resolved xAI OAuth credentials when the active
provider is xai-oauth. The function name is shared between codex and
xai-oauth (both speak codex_responses); covering both cases prevents
silent regressions where the function gets gated to a single provider."""
agent = _build_xai_oauth_agent(monkeypatch)
closed = {"value": False}
rebuilt = {"kwargs": None}
class _ExistingClient:
def close(self):
closed["value"] = True
class _RebuiltClient:
pass
def _fake_openai(**kwargs):
rebuilt["kwargs"] = kwargs
return _RebuiltClient()
def _fake_resolve(force_refresh=False, refresh_if_expiring=True, **_):
# The pre-refresh guard reads the singleton with refresh_if_expiring=False
# to verify that the agent's active key still matches; the actual
# refresh later passes force_refresh=True. Both calls must succeed.
return {
"api_key": "fresh-xai-token" if force_refresh else agent.api_key,
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"hermes_cli.auth.resolve_xai_oauth_runtime_credentials",
_fake_resolve,
)
monkeypatch.setattr(run_agent, "OpenAI", _fake_openai)
agent.client = _ExistingClient()
ok = agent._try_refresh_codex_client_credentials(force=True)
assert ok is True
assert closed["value"] is True
assert rebuilt["kwargs"]["api_key"] == "fresh-xai-token"
assert rebuilt["kwargs"]["base_url"] == "https://api.x.ai/v1"
assert isinstance(agent.client, _RebuiltClient)
assert agent.api_key == "fresh-xai-token"
def test_try_refresh_codex_client_credentials_skips_xai_oauth_when_singleton_differs(monkeypatch):
"""An xai-oauth agent constructed with a non-singleton credential
(e.g. a manual pool entry whose tokens belong to a different account
than the loopback_pkce singleton, or an explicit ``api_key=`` arg)
MUST NOT silently adopt the singleton's tokens on a 401 reactive
refresh. Otherwise a 401 mid-conversation would re-route the rest
of the conversation onto a different account, with no user feedback.
The credential pool's reactive recovery is the right channel for
pool-managed credentials; this fallback path is for the singleton-
only case and must short-circuit when the active key differs."""
agent = _build_xai_oauth_agent(monkeypatch)
# Agent is using "xai-oauth-token" (per the builder); singleton holds
# a *different* account's token. No force_refresh should fire.
refresh_calls = {"count": 0}
def _fake_resolve(force_refresh=False, refresh_if_expiring=True, **_):
if force_refresh:
refresh_calls["count"] += 1
return {
"api_key": "singleton-account-token",
"base_url": "https://api.x.ai/v1",
}
# The pre-refresh guard read — return the singleton's view of the
# singleton's token, which is NOT what the agent is currently using.
return {
"api_key": "singleton-account-token",
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"hermes_cli.auth.resolve_xai_oauth_runtime_credentials",
_fake_resolve,
)
pre_refresh_key = agent.api_key
ok = agent._try_refresh_codex_client_credentials(force=True)
assert ok is False, (
"must not refresh when the active credential isn't the singleton; "
"otherwise the conversation silently swaps accounts mid-flight."
)
assert refresh_calls["count"] == 0, (
"force_refresh must not run — that would mutate the singleton's "
"tokens on disk and consume its single-use refresh_token for an "
"agent that wasn't even using the singleton."
)
assert agent.api_key == pre_refresh_key
def test_run_conversation_copilot_refreshes_after_401_and_retries(monkeypatch):
agent = _build_copilot_agent(monkeypatch)
calls = {"api": 0, "refresh": 0}
@@ -624,12 +815,18 @@ def test_try_refresh_codex_client_credentials_rebuilds_client(monkeypatch):
rebuilt["kwargs"] = kwargs
return _RebuiltClient()
def _fake_resolve(force_refresh=False, refresh_if_expiring=True, **_):
# Pre-refresh guard reads the singleton (refresh_if_expiring=False).
# It must report the agent's current api_key so the equality check
# passes; only then does the actual force_refresh run.
return {
"api_key": "new-codex-token" if force_refresh else agent.api_key,
"base_url": "https://chatgpt.com/backend-api/codex",
}
monkeypatch.setattr(
"hermes_cli.auth.resolve_codex_runtime_credentials",
lambda force_refresh=True: {
"api_key": "new-codex-token",
"base_url": "https://chatgpt.com/backend-api/codex",
},
_fake_resolve,
)
monkeypatch.setattr(run_agent, "OpenAI", _fake_openai)


@@ -266,10 +266,12 @@ def _get_provider(stt_config: dict) -> str:
return "none"
if provider == "xai":
if get_env_value("XAI_API_KEY"):
from tools.xai_http import resolve_xai_http_credentials
if resolve_xai_http_credentials().get("api_key"):
return "xai"
logger.warning(
"STT provider 'xai' configured but XAI_API_KEY not set"
"STT provider 'xai' configured but no xAI credentials are available"
)
return "none"
@@ -289,9 +291,14 @@ def _get_provider(stt_config: dict) -> str:
if _HAS_OPENAI and _has_openai_audio_backend():
logger.info("No local STT available, using OpenAI Whisper API")
return "openai"
if get_env_value("XAI_API_KEY"):
logger.info("No local STT available, using xAI Grok STT API")
return "xai"
try:
from tools.xai_http import resolve_xai_http_credentials
if resolve_xai_http_credentials().get("api_key"):
logger.info("No local STT available, using xAI Grok STT API")
return "xai"
except Exception:
pass
return "none"
# ---------------------------------------------------------------------------
@@ -704,14 +711,22 @@ def _transcribe_xai(file_path: str, model_name: str) -> Dict[str, Any]:
Supports Inverse Text Normalization, diarization, and word-level timestamps.
Requires ``XAI_API_KEY`` environment variable.
"""
api_key = get_env_value("XAI_API_KEY")
from tools.xai_http import resolve_xai_http_credentials
creds = resolve_xai_http_credentials()
api_key = str(creds.get("api_key") or "").strip()
if not api_key:
return {"success": False, "transcript": "", "error": "XAI_API_KEY not set"}
return {
"success": False,
"transcript": "",
"error": "No xAI credentials found. Configure xAI OAuth in `hermes model` or set XAI_API_KEY",
}
stt_config = _load_stt_config()
xai_config = stt_config.get("xai", {})
base_url = str(
xai_config.get("base_url")
or creds.get("base_url")
or get_env_value("XAI_STT_BASE_URL")
or XAI_STT_BASE_URL
).strip().rstrip("/")
@@ -872,7 +887,7 @@ def transcribe_audio(file_path: str, model: Optional[str] = None) -> Dict[str, A
"No STT provider available. Install faster-whisper for free local "
f"transcription, configure {LOCAL_STT_COMMAND_ENV} or install a local whisper CLI, "
"set GROQ_API_KEY for free Groq Whisper, set MISTRAL_API_KEY for Mistral "
"Voxtral Transcribe, set XAI_API_KEY for xAI Grok STT, or set VOICE_TOOLS_OPENAI_KEY "
"Voxtral Transcribe, configure xAI OAuth or set XAI_API_KEY for xAI Grok STT, or set VOICE_TOOLS_OPENAI_KEY "
"or OPENAI_API_KEY for the OpenAI Whisper API."
),
}


@@ -9,7 +9,7 @@ Built-in TTS providers:
- MiniMax TTS: High-quality with voice cloning, needs MINIMAX_API_KEY
- Mistral (Voxtral TTS): Multilingual, native Opus, needs MISTRAL_API_KEY
- Google Gemini TTS: Controllable, 30 prebuilt voices, needs GEMINI_API_KEY
- xAI TTS: Grok voices, needs XAI_API_KEY
- xAI TTS: Grok voices, uses xAI Grok OAuth credentials or XAI_API_KEY
- NeuTTS (local, free, no API key): On-device TTS via neutts
- KittenTTS (local, free, no API key): On-device 25MB model
- Piper (local, free, no API key): OHF-Voice/piper1-gpl neural VITS, 44 languages
@@ -902,9 +902,12 @@ def _generate_xai_tts(text: str, output_path: str, tts_config: Dict[str, Any]) -
"""
import requests
api_key = (get_env_value("XAI_API_KEY") or "").strip()
from tools.xai_http import resolve_xai_http_credentials
creds = resolve_xai_http_credentials()
api_key = str(creds.get("api_key") or "").strip()
if not api_key:
raise ValueError("XAI_API_KEY not set. Get one at https://console.x.ai/")
raise ValueError("No xAI credentials found. Configure xAI OAuth in `hermes model` or set XAI_API_KEY.")
xai_config = tts_config.get("xai", {})
voice_id = str(xai_config.get("voice_id", DEFAULT_XAI_VOICE_ID)).strip() or DEFAULT_XAI_VOICE_ID
@@ -913,6 +916,7 @@ def _generate_xai_tts(text: str, output_path: str, tts_config: Dict[str, Any]) -
bit_rate = int(xai_config.get("bit_rate", DEFAULT_XAI_BIT_RATE))
base_url = str(
xai_config.get("base_url")
or creds.get("base_url")
or get_env_value("XAI_BASE_URL")
or DEFAULT_XAI_BASE_URL
).strip().rstrip("/")
@@ -1917,8 +1921,13 @@ def check_tts_requirements() -> bool:
pass
if get_env_value("MINIMAX_API_KEY"):
return True
if get_env_value("XAI_API_KEY"):
return True
try:
from tools.xai_http import resolve_xai_http_credentials
if resolve_xai_http_credentials().get("api_key"):
return True
except Exception:
pass
if get_env_value("GEMINI_API_KEY") or get_env_value("GOOGLE_API_KEY"):
return True
try:


@@ -2,6 +2,9 @@
from __future__ import annotations
import os
from typing import Dict
def hermes_xai_user_agent() -> str:
"""Return a stable Hermes-specific User-Agent for xAI HTTP calls."""
@@ -10,3 +13,49 @@ def hermes_xai_user_agent() -> str:
except Exception:
__version__ = "unknown"
return f"Hermes-Agent/{__version__}"
def resolve_xai_http_credentials() -> Dict[str, str]:
"""Resolve bearer credentials for direct xAI HTTP endpoints.
Prefers Hermes-managed xAI OAuth credentials when available, then falls back
to ``XAI_API_KEY`` from the environment. This keeps direct xAI endpoints
(images, TTS, STT, etc.) aligned with the main runtime auth model.
"""
try:
from hermes_cli.runtime_provider import resolve_runtime_provider
runtime = resolve_runtime_provider(requested="xai-oauth")
access_token = str(runtime.get("api_key") or "").strip()
base_url = str(runtime.get("base_url") or "").strip().rstrip("/")
if access_token:
return {
"provider": "xai-oauth",
"api_key": access_token,
"base_url": base_url or "https://api.x.ai/v1",
}
except Exception:
pass
try:
from hermes_cli.auth import resolve_xai_oauth_runtime_credentials
creds = resolve_xai_oauth_runtime_credentials()
access_token = str(creds.get("api_key") or "").strip()
base_url = str(creds.get("base_url") or "").strip().rstrip("/")
if access_token:
return {
"provider": "xai-oauth",
"api_key": access_token,
"base_url": base_url or "https://api.x.ai/v1",
}
except Exception:
pass
api_key = os.getenv("XAI_API_KEY", "").strip()
base_url = (os.getenv("XAI_BASE_URL") or "https://api.x.ai/v1").strip().rstrip("/")
return {
"provider": "xai",
"api_key": api_key,
"base_url": base_url,
}


@@ -0,0 +1,214 @@
---
sidebar_position: 16
title: "xAI Grok OAuth (SuperGrok Subscription)"
description: "Sign in with your SuperGrok subscription to use Grok models in Hermes Agent — no API key required"
---
# xAI Grok OAuth (SuperGrok Subscription)
Hermes Agent supports xAI Grok through a browser-based OAuth login flow against [accounts.x.ai](https://accounts.x.ai), using your existing **SuperGrok subscription**. No `XAI_API_KEY` is required — log in once and Hermes automatically refreshes your session in the background.
The transport reuses the `codex_responses` adapter (xAI exposes a Responses-style endpoint), so reasoning, tool-calling, streaming, and prompt caching work without any adapter changes.
The same OAuth bearer token is also reused by every direct-to-xAI surface in Hermes — TTS, image generation, video generation, and transcription — so a single login covers all four.
## Overview
| Item | Value |
|------|-------|
| Provider ID | `xai-oauth` |
| Display name | xAI Grok OAuth (SuperGrok Subscription) |
| Auth type | Browser OAuth 2.0 PKCE (loopback callback) |
| Transport | xAI Responses API (`codex_responses`) |
| Default model | `grok-4.3` |
| Endpoint | `https://api.x.ai/v1` |
| Auth server | `https://accounts.x.ai` |
| Requires env var | No (`XAI_API_KEY` is **not** used for this provider) |
| Subscription | [SuperGrok](https://x.ai/grok) (any active tier) |
## Prerequisites
- Python 3.9+
- Hermes Agent installed
- An active SuperGrok subscription on your xAI account
- A browser available on the local machine (or use `--no-browser` for remote sessions)
## Quick Start
```bash
# Launch the provider and model picker
hermes model
# → Select "xAI Grok OAuth (SuperGrok Subscription)" from the provider list
# → Hermes opens your browser to accounts.x.ai
# → Approve access in the browser
# → Pick a model (grok-4.3 is at the top)
# → Start chatting
hermes
```
After the first login, credentials are stored under `~/.hermes/auth.json` and refreshed automatically before they expire.
## Logging In Manually
You can trigger a login without going through the model picker:
```bash
hermes auth add xai-oauth
```
### Remote / headless sessions
On servers, containers, or SSH sessions where no browser is available, Hermes detects the remote environment and prints the authorization URL instead of opening a browser. Open the URL on any device with a browser, complete the consent flow, and Hermes finishes the loopback exchange when the redirect comes back.
If you need to force this behaviour explicitly:
```bash
hermes auth add xai-oauth --no-browser
```
## How the Login Works
1. Hermes opens your browser to `accounts.x.ai`.
2. You sign in (or confirm your existing session) and approve access.
3. xAI redirects back to Hermes and the tokens are saved to `~/.hermes/auth.json`.
4. From then on, Hermes refreshes the access token in the background — you stay signed in until you `hermes auth remove xai-oauth` or revoke access from your xAI account settings.
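The PKCE portion of the steps above can be sketched as follows. This is a minimal illustration of the S256 method from RFC 7636, not Hermes's actual implementation; the function names `s256_challenge` and `make_pkce_pair` are hypothetical.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code challenge for a PKCE code verifier."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url without padding, as required by RFC 7636.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for one login attempt."""
    # 32 random bytes encode to a 43-character URL-safe verifier.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    return verifier, s256_challenge(verifier)
```

The client sends the challenge in the authorize request and keeps the verifier private until the token exchange, so an intercepted authorization code cannot be redeemed on its own.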
## Checking Login Status
```bash
hermes doctor
```
The `◆ Auth Providers` section will show the current state of every provider, including `xai-oauth`.
## Switching Models
```bash
hermes model
# → Select "xAI Grok OAuth (SuperGrok Subscription)"
# → Pick from the model list (grok-4.3 is pinned to the top)
```
Or set the model directly:
```bash
hermes config set model.default grok-4.3
hermes config set model.provider xai-oauth
```
## Configuration Reference
After login, `~/.hermes/config.yaml` will contain:
```yaml
model:
default: grok-4.3
provider: xai-oauth
base_url: https://api.x.ai/v1
```
### Provider aliases
All of the following resolve to `xai-oauth`:
```bash
hermes --provider xai-oauth # canonical
hermes --provider grok-oauth # alias
hermes --provider x-ai-oauth # alias
hermes --provider xai-grok-oauth # alias
```
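Alias resolution of this kind is usually a small lookup table. The sketch below assumes a hypothetical `canonical_provider` helper and `_PROVIDER_ALIASES` mapping; only the alias strings themselves come from the list above.

```python
# Hypothetical alias table; the keys mirror the aliases listed above.
_PROVIDER_ALIASES = {
    "grok-oauth": "xai-oauth",
    "x-ai-oauth": "xai-oauth",
    "xai-grok-oauth": "xai-oauth",
}


def canonical_provider(name: str) -> str:
    """Map a user-supplied provider name to its canonical ID."""
    key = name.strip().lower()
    # Unknown names pass through unchanged so other providers keep working.
    return _PROVIDER_ALIASES.get(key, key)
```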
## Direct-to-xAI Tools (TTS / Image / Video / Transcription)
Once you're logged in via OAuth, every direct-to-xAI tool reuses the same bearer token automatically — there is **no separate setup** unless you'd rather use an API key.
To pick a backend for each tool:
```bash
hermes tools
# → Text-to-Speech → "xAI TTS"
# → Image Generation → "xAI Grok Imagine (image)"
# → Video Generation → "xAI Grok Imagine"
```
If OAuth tokens are already stored, the picker confirms it and skips the credential prompt. If neither OAuth nor `XAI_API_KEY` is set, the picker offers a 3-choice menu: OAuth login, paste API key, or skip.
:::note Video generation is off by default
The `video_gen` toolset is disabled by default. Enable it in `hermes tools` → `🎬 Video Generation` (press space) before the agent can call `video_generate`. Otherwise the agent may fall back to the bundled ComfyUI skill, which is also tagged for video generation.
:::
### Models
| Tool | Model | Notes |
|------|-------|-------|
| Chat | `grok-4.3` | Default; auto-selected when you log in via OAuth |
| Chat | `grok-4.20-0309-reasoning` | Reasoning variant |
| Chat | `grok-4.20-0309-non-reasoning` | Non-reasoning variant |
| Chat | `grok-4.20-multi-agent-0309` | Multi-agent variant |
| Image | `grok-imagine-image` | Default; ~5–10 s |
| Image | `grok-imagine-image-quality` | Higher fidelity; ~10–20 s |
| Video | `grok-imagine-video` | Text-to-video and image-to-video; up to 7 reference images |
| TTS | (default voice) | xAI `/v1/tts` endpoint |
The chat catalog is derived live from the on-disk `models.dev` cache; new xAI releases appear automatically once that cache refreshes. `grok-4.3` is always pinned to the top of the list.
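The pinning behaviour can be pictured with a short sketch. It assumes the catalog is a plain list of model IDs and that the pinned model is shown even when the cache does not list it; `pin_first` is a hypothetical name, not a Hermes function.

```python
from typing import List


def pin_first(models: List[str], pinned: str = "grok-4.3") -> List[str]:
    """Return the catalog with `pinned` first and the remaining order preserved."""
    # Drop any existing occurrence so the pinned model is never listed twice.
    rest = [m for m in models if m != pinned]
    return [pinned] + rest
```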
## Environment Variables
| Variable | Effect |
|----------|--------|
| `XAI_BASE_URL` | Override the default `https://api.x.ai/v1` endpoint (rarely needed). |
| `HERMES_INFERENCE_PROVIDER` | Force the active provider at runtime, e.g. `HERMES_INFERENCE_PROVIDER=xai-oauth hermes`. |
## Troubleshooting
### Token expired — not re-logging in automatically
Hermes refreshes the token before each session and again reactively on a 401. If refresh fails with `invalid_grant` (the refresh token was revoked, or the account was rotated), Hermes surfaces a typed re-auth message instead of crashing.
**Fix:** run `hermes auth add xai-oauth` again to start a fresh login.
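The PR description mentions JWT `exp`-based expiry detection with skew. A minimal sketch of that idea follows, assuming the access token is a standard three-part JWT; `jwt_expiry`, `is_expiring`, and the 60-second default skew are illustrative assumptions rather than Hermes's actual names or values.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[float]:
    """Read the `exp` claim from a JWT payload without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return float(exp) if isinstance(exp, (int, float)) else None


def is_expiring(token: str, skew_seconds: float = 60.0, now: Optional[float] = None) -> bool:
    """True when the token expires within `skew_seconds`, or cannot be read."""
    exp = jwt_expiry(token)
    if exp is None:
        return True  # unreadable tokens are treated as expired so a refresh is tried
    current = time.time() if now is None else now
    return current >= exp - skew_seconds
```

The skew makes the client refresh slightly early, which avoids sending a token that expires while the request is in flight.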
### Authorization timed out
The loopback listener has a finite expiry window (default 180 s). If you don't approve the login in time, Hermes raises a timeout error.
**Fix:** re-run `hermes auth add xai-oauth` (or `hermes model`). The flow starts fresh.
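A loopback listener with a deadline can be sketched with the standard library alone. This is an illustration of the general technique under stated assumptions, not Hermes's code: `wait_for_callback` is a hypothetical name, and the sketch returns `None` on timeout where Hermes raises an error.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse


def wait_for_callback(timeout_seconds: float = 180.0, port: int = 0) -> Optional[Dict[str, str]]:
    """Serve on 127.0.0.1 until a request with query parameters arrives or time runs out."""
    captured: Dict[str, str] = {}

    class _Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            captured.update({k: v[0] for k, v in query.items()})
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"You can close this tab.")

        def log_message(self, *args):
            pass  # keep the terminal quiet

    deadline = time.monotonic() + timeout_seconds
    with HTTPServer(("127.0.0.1", port), _Handler) as server:
        while not captured:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None  # caller decides how to report the timeout
            server.timeout = remaining  # handle_request() gives up after this long
            server.handle_request()
    return captured
```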
### State mismatch (possible CSRF)
Hermes detected that the `state` value returned by the authorization server doesn't match what it sent.
**Fix:** re-run the login. If it persists, check for a proxy or redirect that is modifying the OAuth response.
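The check behind this error is a comparison between the `state` sent and the `state` returned. A minimal sketch, with hypothetical helper names `new_state` and `state_matches`:

```python
import hmac
import secrets
from typing import Optional


def new_state() -> str:
    """Generate an unguessable state value for one authorization request."""
    return secrets.token_urlsafe(32)


def state_matches(expected: str, returned: Optional[str]) -> bool:
    """Compare states in constant time; a missing value never matches."""
    if not returned:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), returned.encode("utf-8"))
```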
### Logging in from a remote server
On SSH or container sessions Hermes prints the authorization URL instead of opening a browser. Open the URL on any device with a browser and complete the consent there — the loopback callback comes back to your remote host.
You can also force this behaviour:
```bash
hermes auth add xai-oauth --no-browser
```
### "No xAI credentials found" error at runtime
The auth store has no `xai-oauth` entry and no `XAI_API_KEY` is set. You haven't logged in yet, or the credential file was deleted.
**Fix:** run `hermes model` and pick the xAI Grok OAuth provider, or run `hermes auth add xai-oauth`.
## Logging Out
To remove stored xAI Grok OAuth credentials:
```bash
hermes auth remove xai-oauth
```
This clears both the singleton `loopback_pkce` entry in `auth.json` and any matching credential-pool rows.
## See Also
- [AI Providers reference](../integrations/providers.md)
- [Environment Variables](../reference/environment-variables.md)
- [Configuration](../user-guide/configuration.md)
- [Voice & TTS](../user-guide/features/tts.md)


@@ -331,6 +331,8 @@ When using the Z.AI / GLM provider, Hermes automatically probes multiple endpoin
xAI is wired through the Responses API (`codex_responses` transport) for automatic reasoning support on Grok 4 models: no `reasoning_effort` parameter is needed because the server reasons by default. Set `XAI_API_KEY` in `~/.hermes/.env` and pick xAI in `hermes model`, or use `grok` as a shortcut for `/model grok-4-1-fast-reasoning`.
SuperGrok subscribers can sign in with browser OAuth instead of using an API key — pick **xAI Grok OAuth (SuperGrok Subscription)** in `hermes model`, or run `hermes auth add xai-oauth`. The same OAuth bearer token is automatically reused by direct-to-xAI tools (TTS, image gen, video gen, transcription). See the [xAI Grok OAuth guide](../guides/xai-grok-oauth.md) for the full flow.
When using xAI as a provider (any base URL containing `x.ai`), Hermes automatically enables prompt caching by sending the `x-grok-conv-id` header with every API request. This routes requests to the same server within a conversation session, allowing xAI's infrastructure to reuse cached system prompts and conversation history.
No configuration is needed — caching activates automatically when an xAI endpoint is detected and a session ID is available. This reduces latency and cost for multi-turn conversations.
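The header logic described above might look roughly like the following sketch. The function name `xai_cache_headers` is hypothetical, and the host check here is stricter than the "base URL containing `x.ai`" wording: it parses the URL and matches the hostname, which is an assumption made for this example.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def xai_cache_headers(base_url: str, session_id: Optional[str]) -> Dict[str, str]:
    """Return the conversation-routing header for xAI endpoints, else nothing."""
    host = (urlparse(base_url).hostname or "").lower()
    is_xai = host == "x.ai" or host.endswith(".x.ai")
    if is_xai and session_id:
        # The same ID on every request keeps a conversation on one server.
        return {"x-grok-conv-id": session_id}
    return {}
```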
@@ -1444,7 +1446,7 @@ fallback_model:
When activated, the fallback swaps the model and provider mid-session without losing your conversation. The chain is tried entry-by-entry; activation is one-shot per session.
Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `azure-foundry`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `lmstudio`, `alibaba`, `alibaba-coding-plan`, `tencent-tokenhub`, `custom`.
Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `xai-oauth`, `ollama-cloud`, `bedrock`, `ai-gateway`, `azure-foundry`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `lmstudio`, `alibaba`, `alibaba-coding-plan`, `tencent-tokenhub`, `custom`.
:::tip
Fallback is configured exclusively through `config.yaml` — or interactively via `hermes fallback`. For full details on when it triggers, how the chain advances, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).


@@ -191,6 +191,7 @@ const sidebars: SidebarsConfig = {
'guides/migrate-from-openclaw',
'guides/aws-bedrock',
'guides/azure-foundry',
'guides/xai-grok-oauth',
'guides/microsoft-graph-app-registration',
'guides/operate-teams-meeting-pipeline',
],