hermes-agent/agent
Teknium 9d9b424390
fix: Nous Portal rate limit guard — prevent retry amplification (#10568)
When Nous returns a 429, the retry amplification chain burns up to 9
API requests per conversation turn (3 SDK retries × 3 Hermes retries),
each counting against the requests-per-hour (RPH) quota and prolonging
the rate limit. With multiple concurrent sessions (cron + gateway +
auxiliary), this creates a spiral where retries keep the limit
exhausted indefinitely.

New module: agent/nous_rate_guard.py
- Shared file-based rate limit state (~/.hermes/rate_limits/nous.json)
- Parses reset time from the x-ratelimit-reset-requests-1h,
  x-ratelimit-reset-requests, or retry-after headers, or from error
  context
- Falls back to 5-minute default cooldown if no header data
- Atomic writes (tempfile + rename) for cross-process safety
- Auto-cleanup of expired state files
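A minimal sketch of how the module behavior listed above could fit together. This is not the actual contents of agent/nous_rate_guard.py. The function names, the `reset_at` JSON field, and the reading of header values as a plain number of seconds are all assumptions made for illustration; only the header names, the 5-minute default, and the tempfile + rename approach come from the commit message.

```python
import json
import math
import os
import tempfile
import time
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_COOLDOWN_SECONDS = 300.0  # the 5-minute fallback from the commit message

# Header names in the order the commit message lists them.
RESET_HEADERS = (
    "x-ratelimit-reset-requests-1h",
    "x-ratelimit-reset-requests",
    "retry-after",
)


def parse_reset_seconds(headers: Mapping[str, str]) -> float:
    """Return seconds until reset, or the default cooldown if nothing parses.

    Assumption: header values are plain numbers of seconds.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in RESET_HEADERS:
        raw = lowered.get(name)
        if raw is None:
            continue
        try:
            seconds = float(raw)
        except ValueError:
            continue
        if math.isfinite(seconds) and seconds > 0:
            return seconds
    return DEFAULT_COOLDOWN_SECONDS


def record_rate_limit(path: Path, headers: Mapping[str, str],
                      now: Optional[float] = None) -> float:
    """Atomically write the reset timestamp (hypothetical schema) and return it."""
    now = time.time() if now is None else now
    reset_at = now + parse_reset_seconds(headers)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump({"reset_at": reset_at}, handle)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return reset_at


def clear_rate_limit(path: Path) -> None:
    """Remove the state file; a missing file is not an error."""
    try:
        path.unlink()
    except FileNotFoundError:
        pass


def seconds_until_reset(path: Path, now: Optional[float] = None) -> Optional[float]:
    """Return the remaining cooldown, or None if not limited.

    Expired state files are removed as a side effect.
    """
    now = time.time() if now is None else now
    try:
        state = json.loads(path.read_text())
        remaining = float(state["reset_at"]) - now
    except (OSError, ValueError, KeyError, TypeError):
        return None
    if remaining <= 0:
        clear_rate_limit(path)
        return None
    return remaining
```

Readers in other processes see either the old file or the new one, never a partial write, because the rename replaces the target in a single step when both paths are on the same filesystem.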

run_agent.py changes:
- Top-of-retry-loop guard: when another session already recorded Nous
  as rate-limited, skip the API call entirely. Try fallback provider
  first, then return a clear message with the reset time.
- On 429 from Nous: record rate limit state and skip further retries
  (sets retry_count = max_retries to trigger fallback path)
- On success from Nous: clear the rate limit state so other sessions
  know they can resume
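The control flow described in these three bullets might look roughly like the sketch below. It is not the code in run_agent.py: `run_turn`, `RateLimited`, the injected callables, and the in-memory `guard` dict (standing in for the shared state file) are hypothetical names chosen to keep the example self-contained.

```python
from typing import Callable, Optional


class RateLimited(Exception):
    """Hypothetical stand-in for a 429 response from Nous."""


def run_turn(
    call_nous: Callable[[], str],
    call_fallback: Optional[Callable[[], str]],
    guard: dict,
    max_retries: int = 3,
) -> str:
    """One conversation turn with the guard applied.

    `guard` is an in-memory stand-in for the shared state file; it holds
    "limited": True while Nous is known to be rate-limited.
    """
    retry_count = 0
    while retry_count < max_retries:
        # Top-of-loop guard: skip the API call if Nous is already marked limited.
        if guard.get("limited"):
            break
        try:
            result = call_nous()
        except RateLimited:
            # Record the limit and skip the remaining retries.
            guard["limited"] = True
            retry_count = max_retries
            continue
        except Exception:
            # Other errors keep the normal retry behavior.
            retry_count += 1
            continue
        # Success: clear the state so other sessions can resume.
        guard.pop("limited", None)
        return result

    if call_fallback is not None:
        return call_fallback()
    if guard.get("limited"):
        return "Nous is rate-limited; try again after the reset time."
    raise RuntimeError("retries exhausted")
```

With this shape, a 429 costs one Hermes-level attempt for the session that hits it and zero for any session that sees the recorded state afterwards.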

auxiliary_client.py changes:
- _try_nous() checks rate guard before attempting Nous in the auxiliary
  fallback chain. When rate-limited, returns (None, None) so the chain
  skips to the next provider instead of piling more requests onto Nous.
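As an illustration of the chain behavior, the sketch below shows how a (None, None) return lets the next provider take over. Only the (None, None) convention comes from the commit message; `try_nous`, `resolve_provider`, the `is_rate_limited` callable, and the placeholder client and model strings are assumptions, not the real auxiliary_client.py API.

```python
from typing import Callable, List, Optional, Tuple

ProviderResult = Tuple[Optional[str], Optional[str]]


def try_nous(is_rate_limited: Callable[[], bool]) -> ProviderResult:
    """Return (client, model), or (None, None) while Nous is rate-limited."""
    if is_rate_limited():
        return None, None
    # Placeholder values; a real implementation would build a client here.
    return "nous-client", "nous-model"


def resolve_provider(chain: List[Callable[[], ProviderResult]]) -> ProviderResult:
    """Walk the fallback chain and return the first usable provider."""
    for attempt in chain:
        client, model = attempt()
        if client is not None:
            return client, model
    return None, None
```

Because the guard check happens before any request is built, a rate-limited Nous costs the auxiliary tasks no API calls at all.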

This eliminates three sources of amplification:
1. Hermes-level retries (saves 6 of 9 calls per turn)
2. Cross-session retries (cron + gateway all skip Nous)
3. Auxiliary fallback to Nous (compression/session_search skip too)
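The "6 of 9" figure in item 1 follows from simple multiplication. The snippet below works it through, assuming (as the commit message implies) 3 SDK-level attempts for each of 3 Hermes-level attempts; the function name is hypothetical.

```python
def requests_per_turn(sdk_attempts: int, hermes_attempts: int) -> int:
    """Worst-case API requests for one turn when every attempt gets a 429."""
    return sdk_attempts * hermes_attempts


before = requests_per_turn(sdk_attempts=3, hermes_attempts=3)  # 9 requests
after = requests_per_turn(sdk_attempts=3, hermes_attempts=1)   # Hermes retries skipped
saved = before - after                                         # 6 requests
```

The remaining 3 requests are the SDK-level retries, which this change does not alter.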

Includes 24 tests covering the rate guard module, header parsing,
state lifecycle, and auxiliary client integration.
2026-04-15 16:31:48 -07:00
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
anthropic_adapter.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
auxiliary_client.py fix: Nous Portal rate limit guard — prevent retry amplification (#10568) 2026-04-15 16:31:48 -07:00
bedrock_adapter.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
context_compressor.py fix: stale agent timeout, uv venv detection, empty response after tools, compression model fallback (#9051, #8620, #9400) (#10093) 2026-04-14 22:38:17 -07:00
context_engine.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
context_references.py fix(agent): preserve quoted @file references with spaces 2026-04-10 13:05:01 -07:00
copilot_acp_client.py fix: bridge tool-calls in copilot-acp adapter 2026-04-06 01:47:57 -07:00
credential_pool.py fix(copilot): preserve base URL and gpt-5-mini routing 2026-04-15 15:04:14 -07:00
display.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
error_classifier.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
insights.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
manual_compression_feedback.py fix(gateway): make manual compression feedback truthful 2026-04-10 21:16:53 -07:00
memory_manager.py refactor(memory): drop on_session_reset — commit-only is enough 2026-04-15 11:28:45 -07:00
memory_provider.py refactor(memory): drop on_session_reset — commit-only is enough 2026-04-15 11:28:45 -07:00
model_metadata.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
models_dev.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
nous_rate_guard.py fix: Nous Portal rate limit guard — prevent retry amplification (#10568) 2026-04-15 16:31:48 -07:00
prompt_builder.py feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions 2026-04-14 00:11:49 -07:00
prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
rate_limit_tracker.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
redact.py fix(security): add JWT token and Discord mention redaction (#10547) 2026-04-15 16:08:52 -07:00
retry_utils.py feat(agent): add jittered retry backoff 2026-04-08 00:41:36 -07:00
skill_commands.py fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285) 2026-04-15 04:57:55 -07:00
skill_utils.py feat(plugins): namespaced skill registration for plugin skill bundles 2026-04-14 10:42:58 -07:00
smart_model_routing.py fix: UTF-8 config encoding, pairing hint, credential_pool key, header normalization (#7174) 2026-04-10 05:33:48 -07:00
subdirectory_hints.py fix(agent): catch PermissionError in subdirectory hint discovery 2026-04-09 03:10:30 -07:00
title_generator.py fix: title_generator no longer logs as 'compression' task 2026-04-12 04:17:18 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
usage_pricing.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00