fix(agent): re-enter retry loop on genuine Nous 429 so fallback guard runs

The genuine-rate-limit branch set retry_count = max_retries before
continue, intending the top-of-loop Nous guard to handle fallback or
bail cleanly. But the loop condition is retry_count < max_retries, so
the guard never ran: no fallback activation, no clean rate-limit
message — just the generic retry-exhaustion error.

Set retry_count = max(0, max_retries - 1) so the loop body runs exactly
once more and the guard sees the breaker state recorded moments earlier.

Extracted from the #44061 bugfix rollup by @AIalliAI.
This commit is contained in:
Aðalsteinn Helgason 2026-06-12 12:02:46 -07:00 committed by Teknium
parent dc467488a7
commit 2714fc8396

View file

@ -2631,10 +2631,13 @@ def run_conversation(
except Exception:
pass
if _genuine_nous_rate_limit:
# Skip straight to max_retries -- the
# top-of-loop guard will handle fallback or
# bail cleanly.
retry_count = max_retries
# Re-enter the loop exactly once so the
# top-of-loop Nous guard handles fallback or
# bails cleanly. (Setting retry_count to
# max_retries would make the while condition
# false immediately and the guard would never
# run -- no fallback, generic exhaustion error.)
retry_count = max(0, max_retries - 1)
continue
# Upstream capacity 429: fall through to normal
# retry logic. A different model (or the same