fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361)

Three bugs prevented credential pool rotation from working when multiple
Codex OAuth tokens were configured:

1. credential_pool was dropped during smart model turn routing.
   resolve_turn_route() constructed runtime dicts without it, so the
   AIAgent was created without pool access. Fixed in smart_model_routing.py
   (no-route and fallback paths), cli.py, and gateway/run.py.

2. Eager fallback fired before pool rotation on 429. The rate-limit
   handler at line ~7180 switched to a fallback provider immediately,
   before _recover_with_credential_pool got a chance to rotate to the
   next credential. Now deferred when the pool still has credentials.

3. (Non-issue) Retry budget was reported as too small, but successful
   pool rotations already skip retry_count increment — no change needed.

Reported by community member Schinsly who identified all three root
causes and verified the fix locally with multiple Codex accounts.
This commit is contained in:
Teknium 2026-04-01 01:02:34 -07:00 committed by GitHub
parent ef2ae3e48f
commit a7f7e87070
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
6 changed files with 369 additions and 4 deletions

1
cli.py
View file

@ -2024,6 +2024,7 @@ class HermesCLI:
"api_mode": self.api_mode,
"command": self.acp_command,
"args": list(self.acp_args or []),
"credential_pool": getattr(self, "_credential_pool", None),
},
)