mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-25 05:52:34 +00:00
feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider
Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai. Highlights ---------- * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback. * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs. * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens. * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry. * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing. * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free. * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate. * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing. Hardening --------- * Discovery and refresh responses validate the returned `token_endpoint` host against the same `*.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint. * Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks. * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError). * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent. Testing ------- * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards. * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths. * 165 tests passing on this branch via `scripts/run_tests.sh`.
This commit is contained in:
parent
9fb40e6a3d
commit
b62c997973
27 changed files with 3843 additions and 131 deletions
|
|
@ -89,18 +89,25 @@ class ResponsesApiTransport(ProviderTransport):
|
|||
_effort_clamp = {"minimal": "low"}
|
||||
reasoning_effort = _effort_clamp.get(reasoning_effort, reasoning_effort)
|
||||
|
||||
response_tools = _responses_tools(tools)
|
||||
kwargs = {
|
||||
"model": model,
|
||||
"instructions": instructions,
|
||||
"input": _chat_messages_to_responses_input(payload_messages),
|
||||
"tools": _responses_tools(tools),
|
||||
"tool_choice": "auto",
|
||||
"parallel_tool_calls": True,
|
||||
"tools": response_tools,
|
||||
"store": False,
|
||||
}
|
||||
if response_tools:
|
||||
kwargs["tool_choice"] = "auto"
|
||||
kwargs["parallel_tool_calls"] = True
|
||||
|
||||
session_id = params.get("session_id")
|
||||
if not is_github_responses and session_id:
|
||||
# xAI's Responses API uses `prompt_cache_key` (body-level) as the
|
||||
# cache-routing key, not a top-level kwarg — the body-field
|
||||
# injection below survives openai SDK builds whose
|
||||
# Responses.stream() signature drops the kwarg. Everything else
|
||||
# that ISN'T github/xAI keeps using the typed kwarg.
|
||||
if not is_github_responses and not is_xai_responses and session_id:
|
||||
kwargs["prompt_cache_key"] = session_id
|
||||
|
||||
if reasoning_enabled and is_xai_responses:
|
||||
|
|
@ -165,6 +172,22 @@ class ResponsesApiTransport(ProviderTransport):
|
|||
merged_extra_headers["x-grok-conv-id"] = session_id
|
||||
kwargs["extra_headers"] = merged_extra_headers
|
||||
|
||||
# xAI Responses cache-routing field. Lives in the request body
|
||||
# (per https://docs.x.ai/.../prompt-caching/maximizing-cache-hits),
|
||||
# so we ship it via extra_body — the openai SDK serializes
|
||||
# extra_body fields into the JSON body without per-field type
|
||||
# validation, sidestepping the TypeError that fires on
|
||||
# Responses.stream() builds whose `prompt_cache_key` kwarg has
|
||||
# been dropped. Setdefault preserves a caller-supplied value
|
||||
# (e.g. request_overrides.extra_body.prompt_cache_key) over
|
||||
# the auto-derived session_id.
|
||||
existing_extra_body = kwargs.get("extra_body")
|
||||
merged_extra_body: Dict[str, Any] = {}
|
||||
if isinstance(existing_extra_body, dict):
|
||||
merged_extra_body.update(existing_extra_body)
|
||||
merged_extra_body.setdefault("prompt_cache_key", session_id)
|
||||
kwargs["extra_body"] = merged_extra_body
|
||||
|
||||
return kwargs
|
||||
|
||||
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue