Commit graph

1 commit

Author SHA1 Message Date
Teknium
c4a7992317
fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226)
The MCP SDK discovers OAuth server metadata (token_endpoint, etc.) on
demand and keeps it in memory only. Without disk persistence, a restart
with valid cached refresh tokens forces the SDK to fall back to the
guessed '{server_url}/token' path — which returns 404 on most real
providers (Notion, Atlassian, GitHub remote MCP, etc.) and triggers a
full browser re-authorization even though the refresh token is fine.

Add a .meta.json file next to the existing tokens/client_info files:

  HERMES_HOME/mcp-tokens/<server>.json        -- tokens (existing)
  HERMES_HOME/mcp-tokens/<server>.client.json -- client info (existing)
  HERMES_HOME/mcp-tokens/<server>.meta.json   -- oauth metadata (new)

Changes:
- HermesTokenStorage.save_oauth_metadata / load_oauth_metadata / _meta_path
  — disk layer for the discovered OAuthMetadata.
- HermesTokenStorage.remove() now also clears .meta.json so
  'hermes mcp remove <name>' and the manager's remove() path clean up fully.
- HermesMCPOAuthProvider._initialize cold-restores from disk before the
  existing pre-flight discovery runs. If disk has metadata we skip the
  discovery HTTP round-trips entirely.
- HermesMCPOAuthProvider._prefetch_oauth_metadata now persists ASM as
  soon as it's discovered, so even the first pre-flight run seeds disk.
- HermesMCPOAuthProvider._persist_oauth_metadata_if_changed() is called
  at the end of async_auth_flow so metadata discovered via the SDK's
  lazy 401-branch (not pre-flight) is also saved for next time.

Tests cover the storage roundtrip (save/load/missing/corrupt/remove) and
the manager provider path (cold-load restore, skip-when-in-memory,
persist-on-discover, noop-when-unchanged, end-to-end async_auth_flow).

Co-authored-by: nocturnum91 <50326054+nocturnum91@users.noreply.github.com>
2026-05-07 05:35:33 -07:00