fix(xai-proxy): handle 429 rate-limit responses in proxy retry path

get_retry_credential was only called on 401, so a 429 Too Many Requests
from xAI was streamed back to the client unchanged, with no key rotation
or back-off signal.

- server.py: widen retry gate from == 401 to in {401, 429}
- xai.py: on 429, skip token refresh and call mark_exhausted_and_rotate
  to stamp the 1-hour cooldown on the rate-limited key and return the
  next available credential. Returns None if pool is exhausted.
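
A minimal sketch of the 429 path described above, not the actual xai.py
code. Only the names get_retry_credential, mark_exhausted_and_rotate and
the failed_credential argument come from this commit; the Credential and
CredentialPool classes, the status and now parameters, and the 401
placeholder are assumptions made for illustration.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Optional

COOLDOWN_SECONDS = 3600  # the 1-hour cooldown named in the commit message


@dataclass
class Credential:
    """Hypothetical credential record; not the real xai.py type."""
    key: str
    exhausted_until: float = 0.0  # epoch seconds; 0.0 means available


@dataclass
class CredentialPool:
    """Hypothetical pool; only the method name is taken from the commit."""
    credentials: list[Credential] = field(default_factory=list)

    def mark_exhausted_and_rotate(
        self, failed: Credential, now: Optional[float] = None
    ) -> Optional[Credential]:
        now = time.time() if now is None else now
        # Stamp the cooldown on the rate-limited key.
        failed.exhausted_until = now + COOLDOWN_SECONDS
        # Return the first credential whose cooldown has passed, if any.
        for cred in self.credentials:
            if cred.exhausted_until <= now:
                return cred
        return None  # pool is exhausted


def get_retry_credential(
    pool: CredentialPool,
    failed_credential: Credential,
    status: int,  # assumed parameter; the real signature may differ
    now: Optional[float] = None,
) -> Optional[Credential]:
    if status == 429:
        # Rate limited: a token refresh would not help, so rotate keys.
        return pool.mark_exhausted_and_rotate(failed_credential, now)
    if status == 401:
        # Placeholder for the existing token-refresh path (not shown here).
        return failed_credential
    return None
```

With a two-key pool, a 429 on the first key returns the second; a 429 on
the second key while the first is still cooling down returns None, which
the caller can treat as "no retry possible".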
sprmn24 2026-05-20 00:07:15 +03:00 committed by Teknium
parent aa3466063b
commit 4ed482549f
2 changed files with 14 additions and 5 deletions


@@ -206,7 +206,7 @@ def create_app(adapter: UpstreamAdapter) -> "web.Application":
     return session_or_response
 session = session_or_response
-if upstream_resp.status == 401:
+if upstream_resp.status in {401, 429}:
     try:
         retry_cred = adapter.get_retry_credential(
             failed_credential=cred,