mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-29 06:31:32 +00:00
fix(xai-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions
When refresh_xai_oauth_pure raises a terminal error (HTTP 400/401/403, i.e. revoked or reused refresh token), _refresh_entry's existing race-recovery path re-syncs from auth.json and returns if another process has already rotated the tokens. If auth.json still holds the same stale token pair, the function fell through to _mark_exhausted, leaving the dead credentials in auth.json. On the next Hermes startup _seed_from_singletons re-seeded the pool from those stale tokens, causing the same failure loop on every session.

Fix: after the auth.json re-sync check in the xAI-oauth error handler, detect terminal errors with the new _is_terminal_xai_oauth_refresh_error helper and apply a quarantine:

- Clear access_token and refresh_token from providers["xai-oauth"]["tokens"] in auth.json so they are not re-seeded.
- Write a last_auth_error entry for hermes doctor / auth status diagnostics.
- Remove all loopback_pkce entries from the in-memory pool so the current session stops retrying with the dead credentials.

Mirrors the identical quarantine already in place for Nous OAuth (c90556262). Closes the parity gap introduced when c90556262 added Nous-only terminal error handling without a corresponding xAI-oauth path.
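The three quarantine steps described in the commit message can be sketched roughly as follows. This is an illustration on plain in-memory dictionaries, not the code from the commit: the function name `quarantine_xai_oauth`, the shape of the `last_auth_error` record, its placement under the provider state, and the `source` key on pool entries are all assumptions made for the example. Only the `providers["xai-oauth"]["tokens"]` path, the token field names, and the `loopback_pkce` label come from the message itself.

```python
from typing import Any, Dict, List


def quarantine_xai_oauth(
    auth: Dict[str, Any],
    pool: List[Dict[str, Any]],
    code: str,
    message: str,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: apply the three quarantine steps to in-memory data."""
    state = auth.setdefault("providers", {}).setdefault("xai-oauth", {})
    tokens = state.setdefault("tokens", {})
    # Step 1: drop the dead token pair so a later startup cannot re-seed it.
    tokens.pop("access_token", None)
    tokens.pop("refresh_token", None)
    # Step 2: record why the tokens were cleared (record layout is assumed).
    state["last_auth_error"] = {"code": code, "message": message}
    # Step 3: keep only pool entries that did not come from loopback_pkce
    # (the "source" key is an assumption about how entries are labelled).
    return [entry for entry in pool if entry.get("source") != "loopback_pkce"]


# Example data standing in for auth.json and the in-memory credential pool.
auth_json = {
    "providers": {
        "xai-oauth": {
            "tokens": {"access_token": "stale-a", "refresh_token": "stale-r"}
        }
    }
}
pool = [{"id": 1, "source": "loopback_pkce"}, {"id": 2, "source": "env"}]
pool = quarantine_xai_oauth(
    auth_json, pool, "xai_refresh_failed", "refresh token revoked"
)
```

Returning a filtered list rather than mutating the pool in place is a choice made for this sketch; the commit message only says the entries are removed from the in-memory pool.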
parent 226680500d
commit 5e40f83cb7
3 changed files with 200 additions and 1 deletions
@@ -4044,6 +4044,23 @@ def _is_terminal_nous_refresh_error(exc: Exception) -> bool:
    )


def _is_terminal_xai_oauth_refresh_error(exc: Exception) -> bool:
    """True when retrying the same xAI OAuth refresh token cannot succeed.

    ``xai_refresh_failed`` covers HTTP 400/401/403 from the token endpoint
    (invalid_grant, token revoked, refresh_token_reused).
    ``xai_auth_missing_refresh_token`` means the pool entry has no refresh
    token at all — retrying will never work.
    Both carry ``relogin_required=True``; transient failures (429, 5xx) do not.
    """
    return (
        isinstance(exc, AuthError)
        and exc.provider == "xai-oauth"
        and exc.code in {"xai_refresh_failed", "xai_auth_missing_refresh_token"}
        and bool(exc.relogin_required)
    )


def _quarantine_nous_oauth_state(
    state: Dict[str, Any],
    error: AuthError,