Commit graph

3 commits

Author SHA1 Message Date
Siddharth Balyan
f3006ebef9
refactor(tests): re-architect tests + fix CI failures (#5946)
* refactor: re-architect tests to mirror the codebase

* Update tests.yml

* fix: add missing tool_error imports after registry refactor

* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist

patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.

* fix(tests): fix update_check and telegram xdist failures

- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
  monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
  directly, it uses get_hermes_home() from hermes_constants.

- test_telegram_conflict/approval_buttons: provide real exception classes
  for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
  except clause in connect() doesn't fail with "catching classes that do
  not inherit from BaseException" when xdist pollutes sys.modules.

* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock
2026-04-07 17:19:07 -07:00
Teknium
8fd9fafc84
fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (#4747)
Anthropic returns HTTP 429 'Extra usage is required for long context
requests' when a Claude Max subscription doesn't include the 1M context
tier. This is NOT a transient rate limit — retrying won't help.

Only applies to Sonnet models (Opus 1M is general access). Detects
this specific error before the generic rate-limit handler and:
1. Reduces context_length from 1M to 200k (the standard tier)
2. Triggers context compression to fit
3. Retries with the reduced context

The reduction is session-scoped (not persisted) so it auto-recovers
if the user later enables extra usage on their subscription.

Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage
2026-04-03 02:05:02 -07:00
Teknium
470c3ea51a
fix: handle Anthropic long-context tier 429 by reducing to 200k
Anthropic returns HTTP 429 'Extra usage is required for long context
requests' when a Claude Max subscription doesn't include the 1M context
tier. This is NOT a transient rate limit — retrying won't help.

Detect this specific error before the generic rate-limit handler and:
1. Reduce context_length from 1M to 200k (the standard tier)
2. Trigger context compression to fit
3. Retry with the reduced context

The reduction is session-scoped (not persisted) so it auto-recovers
if the user later enables extra usage on their subscription.

Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage
2026-04-03 01:56:43 -07:00