fix(anthropic): reactive recovery for OAuth 1M-context beta rejection (#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.
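
A minimal sketch of the retry-once recovery described above, assuming a simplified session object. `Session`, `BetaRejected`, `send`, and `call_with_recovery` are hypothetical names for illustration, not the code in run_agent.py; recomputing the beta list stands in for rebuilding the client.

```python
from dataclasses import dataclass, field

LONG_CONTEXT_BETA = "context-1m-2025-08-07"


class BetaRejected(Exception):
    """Hypothetical stand-in for the provider's HTTP 400 rejection."""


@dataclass
class Session:
    # Hypothetical session state; the commit keeps a similar flag on the agent.
    oauth_1m_beta_disabled: bool = False
    betas_sent: list = field(default_factory=list)

    def betas(self):
        # Placeholder for every other beta; only the 1M beta is ever dropped.
        base = ["some-other-beta"]
        return base if self.oauth_1m_beta_disabled else base + [LONG_CONTEXT_BETA]


def send(session, subscription_supports_1m):
    """Simulated request: rejects the 1M beta for unsupported subscriptions."""
    betas = session.betas()
    session.betas_sent.append(betas)
    if LONG_CONTEXT_BETA in betas and not subscription_supports_1m:
        raise BetaRejected(
            "The long context beta is not yet available for this subscription."
        )
    return "ok"


def call_with_recovery(session, subscription_supports_1m):
    """Send once; on the beta rejection, disable the beta and retry one time."""
    try:
        return send(session, subscription_supports_1m)
    except BetaRejected:
        if session.oauth_1m_beta_disabled:
            raise  # Guard: the beta is already off, so do not loop.
        session.oauth_1m_beta_disabled = True  # Sticky for the rest of the session.
        return send(session, subscription_supports_1m)
```

Because the flag stays set, later requests in the same session skip the beta without paying for another rejected round trip.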

Addresses #17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.
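
The classifier rule in the first bullet can be sketched as follows. This is an illustrative reduction, not the code in agent/error_classifier.py: `Reason` and `classify` are hypothetical names, and the 429 rule is inferred from the test docstring ('extra usage' + 'long context').

```python
from enum import Enum
from typing import Optional


class Reason(Enum):
    # Hypothetical two-member subset of the real FailoverReason enum.
    oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden"
    long_context_tier = "long_context_tier"


def classify(status_code: int, message: str) -> Optional[Reason]:
    """Return a reason for the two long-context rejections, else None."""
    text = message.lower()
    # Narrow match: status 400 plus both phrases, so other messages that
    # merely mention 'long context' fall through.
    if (
        status_code == 400
        and "long context beta" in text
        and "not yet available" in text
    ):
        return Reason.oauth_long_context_beta_forbidden
    # The 429 tier gate keeps its own reason.
    if status_code == 429 and "extra usage" in text and "long context" in text:
        return Reason.long_context_tier
    return None
```

Requiring the status code and both phrases together is what keeps the new reason from swallowing the 429 tier gate or a generic 400.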

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
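
The adapter behavior these tests pin down can be sketched as below. `filter_betas` and the placeholder beta names are hypothetical; the real logic lives in `_common_betas_for_base_url` and its signature is not reproduced here.

```python
CONTEXT_1M_BETA = "context-1m-2025-08-07"


def filter_betas(betas, drop_context_1m_beta=False):
    """Return the beta list, removing only the 1M-context beta when asked."""
    if not drop_context_1m_beta:
        # Default: the 1M beta stays, so capable subscriptions keep full context.
        return list(betas)
    return [b for b in betas if b != CONTEXT_1M_BETA]


# Placeholder names; only the 1M identifier comes from the commit message.
example = ["example-beta-a", CONTEXT_1M_BETA, "example-beta-b"]
kept = filter_betas(example)
stripped = filter_betas(example, drop_context_1m_beta=True)
```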
Teknium 2026-04-29 21:56:54 -07:00 committed by GitHub
parent 4d363499db
commit 828d3a320b
GPG key ID: B5690EEEBB952194
8 changed files with 264 additions and 23 deletions


@@ -57,7 +57,9 @@ class TestFailoverReason:
             "context_overflow", "payload_too_large", "image_too_large",
             "model_not_found", "format_error",
             "provider_policy_blocked",
-            "thinking_signature", "long_context_tier", "unknown",
+            "thinking_signature", "long_context_tier",
+            "oauth_long_context_beta_forbidden",
+            "unknown",
         }
         actual = {r.value for r in FailoverReason}
         assert expected == actual
@@ -458,6 +460,40 @@ class TestClassifyApiError:
         result = classify_api_error(e, provider="anthropic")
         assert result.reason == FailoverReason.rate_limit

+    # ── Provider-specific: Anthropic OAuth 1M-context beta forbidden ──
+
+    def test_anthropic_oauth_1m_beta_forbidden(self):
+        """400 + 'long context beta is not yet available for this subscription'
+        maps to oauth_long_context_beta_forbidden (retryable, no compression)."""
+        e = MockAPIError(
+            "The long context beta is not yet available for this subscription.",
+            status_code=400,
+        )
+        result = classify_api_error(e, provider="anthropic", model="claude-sonnet-4.6")
+        assert result.reason == FailoverReason.oauth_long_context_beta_forbidden
+        assert result.retryable is True
+        assert result.should_compress is False
+
+    def test_anthropic_oauth_1m_beta_forbidden_does_not_collide_with_tier_gate(self):
+        """The 429 'extra usage' + 'long context' tier gate keeps its own
+        classification even though its message mentions 'long context'."""
+        e = MockAPIError(
+            "Extra usage is required for long context requests over 200k tokens",
+            status_code=429,
+        )
+        result = classify_api_error(e, provider="anthropic", model="claude-sonnet-4.6")
+        assert result.reason == FailoverReason.long_context_tier
+
+    def test_400_without_beta_phrase_is_not_1m_beta_forbidden(self):
+        """A generic 400 that happens to mention 'long context' but not the
+        exact beta-availability phrase should not be misclassified."""
+        e = MockAPIError(
+            "long context window exceeded",
+            status_code=400,
+        )
+        result = classify_api_error(e, provider="anthropic")
+        assert result.reason != FailoverReason.oauth_long_context_beta_forbidden
+
     # ── Transport errors ──

     def test_read_timeout(self):