feat(copilot): add 401 auth recovery with automatic token refresh and client rebuild

When using GitHub Copilot as provider, HTTP 401 errors could cause Hermes to silently fall back to the next model in the chain instead of recovering. This adds a one-shot retry mechanism that:

1. Re-resolves the Copilot token via the standard priority chain (COPILOT_GITHUB_TOKEN -> GH_TOKEN -> GITHUB_TOKEN -> gh auth token)
2. Rebuilds the OpenAI client with fresh credentials and Copilot headers
3. Retries the failed request before falling back

The fix handles the common case where the gho_* OAuth token remains valid but the httpx client state becomes stale (e.g. after startup race conditions or long-lived sessions).

Key design decisions:

- Always rebuild client even if token string unchanged (recovers stale state)
- Uses _apply_client_headers_for_base_url() for canonical header management
- One-shot flag guard prevents infinite 401 loops (matches existing pattern used by Codex/Nous/Anthropic providers)
- No token exchange via /copilot_internal/v2/token (returns 404 for some account types; direct gho_* auth works reliably)

Tests: 3 new test cases covering end-to-end 401->refresh->retry, client rebuild verification, and same-token rebuild scenarios.

Docs: Updated providers.md with Copilot auth behavior section.
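
The recovery flow described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: `resolve_copilot_token`, `CopilotSession`, `AuthError`, and the `build_client` factory are hypothetical names, and the real implementation rebuilds an OpenAI client and applies headers through `_apply_client_headers_for_base_url()`. Only the environment variable order and the `gh auth token` fallback are taken from the message.

```python
import os
import subprocess
from typing import Any, Callable, Mapping, Optional

# Priority order taken from the commit message; the gh CLI is the last resort.
TOKEN_ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


class AuthError(Exception):
    """Hypothetical stand-in for an HTTP 401 raised by the provider client."""


def _gh_auth_token() -> Optional[str]:
    """Ask the GitHub CLI for a token; return None if gh is missing or fails."""
    try:
        proc = subprocess.run(
            ["gh", "auth", "token"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None


def resolve_copilot_token(
    env: Optional[Mapping[str, str]] = None,
    gh_cli: Callable[[], Optional[str]] = _gh_auth_token,
) -> Optional[str]:
    """Return the first non-empty token from the priority chain."""
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return gh_cli()


class CopilotSession:
    """Wraps a client callable and retries once after a 401 (assumed design)."""

    def __init__(
        self,
        build_client: Callable[[str], Callable[[Any], Any]],
        resolve_token: Callable[[], Optional[str]],
    ) -> None:
        self._build_client = build_client
        self._resolve_token = resolve_token
        self._client = build_client(resolve_token() or "")

    def request(self, payload: Any) -> Any:
        retried = False  # one-shot guard: at most one recovery per request
        while True:
            try:
                return self._client(payload)
            except AuthError:
                if retried:
                    raise  # second 401: let the caller fall back
                retried = True
                token = self._resolve_token()
                if not token:
                    raise
                # Rebuild even when the token string is unchanged, so that
                # stale client state is discarded along with old credentials.
                self._client = self._build_client(token)
```

A caller would catch the `AuthError` that escapes `request()` and only then move on to the next model in its fallback chain. Rebuilding unconditionally keeps the sketch aligned with the "same-token rebuild" scenario mentioned under Tests.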
2026-04-25 00:51:20 +00:00 · 2026-04-15 10:28:17 +02:00 · 2026-04-15 10:28:17 +02:00 · 2cab8129d1
commit 2cab8129d1
parent 7d2f93a97f
3 changed files with 143 additions and 0 deletions
--- a/website/docs/integrations/providers.md
+++ b/website/docs/integrations/providers.md
@@ -216,6 +216,18 @@ The Copilot API does **not** support classic Personal Access Tokens (`ghp_*`). S
 If your `gh auth token` returns a `ghp_*` token, use `hermes model` to authenticate via OAuth instead.
 :::

+:::info Copilot auth behavior in Hermes
+Hermes sends a supported GitHub token (`gho_*`, `github_pat_*`, or `ghu_*`) directly to `api.githubcopilot.com` and includes Copilot-specific headers (`Editor-Version`, `Copilot-Integration-Id`, `Openai-Intent`, `x-initiator`).
+
+On HTTP 401, Hermes now performs a one-shot credential recovery before fallback:
+
+1. Re-resolve token via the normal priority chain (`COPILOT_GITHUB_TOKEN` → `GH_TOKEN` → `GITHUB_TOKEN` → `gh auth token`)
+2. Rebuild the shared OpenAI client with refreshed headers
+3. Retry the request once
+
+Some older community proxies use `api.github.com/copilot_internal/v2/token` exchange flows. That endpoint can be unavailable for some account types (returns 404). Hermes therefore keeps direct-token auth as the primary path and relies on runtime credential refresh + retry for robustness.
+:::
+
 **API routing**: GPT-5+ models (except `gpt-5-mini`) automatically use the Responses API. All other models (GPT-4o, Claude, Gemini, etc.) use Chat Completions. Models are auto-detected from the live Copilot catalog.

 **`copilot-acp` — Copilot ACP agent backend**. Spawns the local Copilot CLI as a subprocess: