fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107)

json.JSONDecodeError inherits from ValueError. The agent loop's
non-retryable classifier at run_agent.py ~L10782 treated any
ValueError/TypeError as a local programming bug and short-circuited
retry. Without a carve-out, a transient JSONDecodeError from a
provider that returned a malformed response body, a truncated stream,
or a router-layer corruption would fail the turn immediately.

Add JSONDecodeError to the existing UnicodeEncodeError exclusion
tuple so the classified-retry logic (which already handles 429/529/
context-overflow/etc.) gets to run on bad-JSON errors.

Tests (tests/run_agent/test_jsondecodeerror_retryable.py):
  - JSONDecodeError: NOT local validation
  - UnicodeEncodeError: NOT local validation (existing carve-out)
  - bare ValueError: IS local validation (programming bug)
  - bare TypeError: IS local validation (programming bug)
  - source-level assertion that run_agent.py still carries the carve-out
    (guards against accidental revert)

Closes #14782
This commit is contained in:
Teknium 2026-04-24 05:02:58 -07:00 committed by GitHub
parent 1eb29e6452
commit c2b3db48f5
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 97 additions and 1 deletions

View file

@ -10797,9 +10797,18 @@ class AIAgent:
# already accounts for 413, 429, 529 (transient), context
# overflow, and generic-400 heuristics. Local validation
# errors (ValueError, TypeError) are programming bugs.
# Exclude UnicodeEncodeError — it's a ValueError subclass
# but is handled separately by the surrogate sanitization
# path above. Exclude json.JSONDecodeError — also a
# ValueError subclass, but it indicates a transient
# provider/network failure (malformed response body,
# truncated stream, routing layer corruption), not a
# local programming bug, and should be retried (#14782).
is_local_validation_error = (
isinstance(api_error, (ValueError, TypeError))
and not isinstance(api_error, UnicodeEncodeError)
and not isinstance(
api_error, (UnicodeEncodeError, json.JSONDecodeError)
)
)
is_client_error = (
is_local_validation_error