hermes-agent/agent
Teknium fae920642a
fix(agent): throttle cross-turn fallback-switch replay storm (#24996) (#53909)
When every provider in the fallback chain fails non-retryably back-to-back
(e.g. HTTP 400/402/429 across distinct providers), the within-turn walk is
already bounded — _fallback_index advances monotonically and the loop aborts
when the chain exhausts. The damaging mode is cross-turn: restore_primary_
runtime resets _fallback_index=0 every turn, so a client that re-submits
immediately replays the entire chain, re-marshaling the full (potentially
80k-token) context once per provider every turn with no throttle on the
non-rate-limit path. On constrained hosts this exhausts memory/swap.

Rate-limit/billing failures already arm a 60s cooldown via _rate_limited_until;
the gap was the non-rate-limit case. Now, when the chain exhausts on a non-
rate-limit failure with a non-empty chain, arm a short (5s) cooldown on the
same _rate_limited_until gate (max(), never shrinking an existing window).
The next turn's restore stays gated and does NOT reset the index, so the
chain isn't replayed until the cooldown clears. No new state, no thread sleep,
no false-trip on legitimately long chains (those walk normally within a turn).

Tests: tests/run_agent/test_24996_fallback_exhaustion_cooldown.py
2026-06-27 19:15:40 -07:00
..
lsp fix: prevent TUI gateway stdin EOF crash across all TUI-context subprocess calls 2026-06-08 22:46:57 -07:00
pet feat(pets): quality-first OpenRouter model chain + stronger atlas gates + global pet-gen notifications 2026-06-24 23:11:21 -05:00
secret_sources fix: prevent TUI gateway stdin EOF crash across all TUI-context subprocess calls 2026-06-08 22:46:57 -07:00
transports fix(cache): content-address prompt_cache_key so recurring cron jobs reuse the warm prefix (#52295) 2026-06-24 21:46:30 -07:00
__init__.py fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
account_usage.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
agent_init.py fix(moa): keep virtual provider on MoA client 2026-06-27 14:20:51 -07:00
agent_runtime_helpers.py fix(moa): keep virtual provider on MoA client 2026-06-27 14:20:51 -07:00
anthropic_adapter.py fix(anthropic): adopt Claude Code's already-refreshed token before racing refresh 2026-06-27 19:14:43 -07:00
async_utils.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
auxiliary_client.py fix(auxiliary): fall back to OPENROUTER_API_KEY when credential pool exhausted 2026-06-27 19:09:27 -07:00
azure_identity_adapter.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
background_review.py feat(background-review): aux-model selector for the self-improvement review (#49252) 2026-06-22 14:54:53 -07:00
bedrock_adapter.py fix(bedrock): check boto3 version >= 1.34.59 before using converse_stream 2026-06-15 05:25:17 -07:00
billing_view.py feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449) 2026-06-19 01:53:32 +05:30
browser_provider.py fix(browser): self-review pass — dead-import, log levels, future-proofing 2026-05-17 04:04:15 -07:00
browser_registry.py style: restore PEP8 blank-line separation after dead-code removal 2026-05-29 04:22:27 -07:00
chat_completion_helpers.py fix(agent): throttle cross-turn fallback-switch replay storm (#24996) (#53909) 2026-06-27 19:15:40 -07:00
codex_responses_adapter.py fix(xai): OAuth Responses native web_search, incomplete guard, grok-composer context 2026-06-17 17:33:32 -07:00
codex_runtime.py fix(codex): seed app-server sessions with configured cwd 2026-06-21 16:39:02 -07:00
coding_context.py revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
context_compressor.py fix(compression): abort (preserve context) on transient network summary failure (#29559, #25585) 2026-06-24 18:31:51 +05:30
context_engine.py fix(compression): avoid repeat preflight compaction from rough estimates 2026-05-29 19:05:03 -07:00
context_references.py revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
conversation_compression.py fix(agent): rebaseline in-place compression flushes 2026-06-27 03:04:26 -07:00
conversation_loop.py fix(agent): activate fallback on persistent transport failures (#22277) 2026-06-27 19:12:21 -07:00
copilot_acp_client.py fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
credential_persistence.py fix: avoid persisting borrowed credential secrets (#31416) 2026-05-25 00:32:08 -07:00
credential_pool.py fix(auth): preserve concurrently-added credentials on pool rewrite 2026-06-27 19:01:37 -07:00
credential_sources.py docs(auth): replace stale 'hermes login' references with 'hermes auth add' 2026-05-26 15:41:11 -07:00
credits_tracker.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
curator.py fix(curator): make external-skill write guard actually fire during curation 2026-06-25 22:03:02 -07:00
curator_backup.py fix(curator): stop the rollback safety snapshot from pruning its target 2026-06-17 05:40:05 -07:00
display.py fix(browser): force secret-pattern redaction on browser_type display 2026-06-25 22:02:22 -07:00
error_classifier.py fix(agent): classify 429 'overloaded' bodies as overloaded, not rate_limit 2026-06-27 04:16:54 -07:00
errors.py fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard 2026-06-13 21:14:32 -07:00
file_safety.py fix: use os.pathsep, add tests, update tips for multi-root support 2026-06-27 04:01:12 +05:30
gemini_native_adapter.py fix(gemini): strip native self prefixes before generateContent (#36141) 2026-06-13 13:47:08 -07:00
gemini_schema.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
i18n.py fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383) 2026-06-03 12:00:27 -07:00
image_gen_provider.py feat(image-gen): add image-to-image / editing to image_generate (#48705) 2026-06-18 22:13:07 -07:00
image_gen_registry.py fix(plugins): filter resolution by is_available() in web + image_gen registries 2026-05-13 22:31:28 -07:00
image_routing.py fix(vision): honor custom_providers per-model supports_vision (#41036) 2026-06-07 21:50:57 -07:00
insights.py refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618) 2026-06-07 18:33:20 -07:00
iteration_budget.py refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget 2026-05-16 17:59:32 -07:00
jiter_preload.py fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
learn_prompt.py fix(learn): name distilled skills as author Hermes, not the host OS user (#52388) 2026-06-25 12:48:08 -07:00
lmstudio_reasoning.py feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
manual_compression_feedback.py fix(compression): include system prompt + tool schemas in token estimates (#18265) 2026-04-30 23:03:54 -07:00
markdown_tables.py fix(cli): vertical fallback for markdown tables wider than terminal (#23948) 2026-05-11 16:49:13 -07:00
memory_manager.py fix(agent): validate context/memory tool schemas before wrapping 2026-06-25 02:17:29 +05:30
memory_provider.py fix(backup): capture memory-provider state stored outside HERMES_HOME (#50325) 2026-06-21 12:03:46 -07:00
message_content.py fix(openviking): preserve structured sync attribution 2026-06-19 15:23:41 +08:00
message_sanitization.py fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn 2026-06-25 12:24:34 -05:00
moa_loop.py fix(moa): preserve Codex slot routing 2026-06-27 14:20:51 -07:00
model_metadata.py fix(moa): resolve context window from the aggregator, not the 256K default (#53780) 2026-06-27 12:08:09 -07:00
models_dev.py remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
moonshot_schema.py fix(moonshot): handle union type arrays in tool schemas 2026-06-13 05:51:41 -07:00
nous_rate_guard.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
onboarding.py feat(onboarding): opt-in structured profile-build path on first contact (#41114) 2026-06-07 08:36:48 -07:00
oneshot.py feat(agent): one-shot LLM helper + llm.oneshot gateway RPC (#51261) 2026-06-23 08:01:50 +00:00
plugin_llm.py feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194) 2026-05-10 07:09:28 -07:00
portal_tags.py feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779) 2026-05-12 20:49:20 -07:00
process_bootstrap.py refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget 2026-05-16 17:59:32 -07:00
prompt_builder.py fix(gateway): WhatsApp/Signal hints affirm markdown instead of forbidding it (#53564) 2026-06-27 03:46:41 -07:00
prompt_caching.py fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778) 2026-05-12 20:46:04 -07:00
rate_limit_tracker.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
reasoning_timeouts.py fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice 2026-06-25 19:00:48 -07:00
redact.py fix(redact): mask passwords in lowercase/dotted config keys (#53590) 2026-06-27 04:43:28 -07:00
retry_utils.py fix: handle named custom providers and Z.AI overload retries 2026-06-25 00:17:17 -07:00
runtime_cwd.py fix(desktop): stabilize project folder sessions (#37586) 2026-06-02 20:23:09 +00:00
secret_scope.py feat(gateway): multiplex phase 2 — fail-closed profile credential isolation (Workstream A) 2026-06-19 07:34:15 -07:00
shell_hooks.py docs: document per-event extra keys in shell-hook wire protocol 2026-06-20 23:23:47 -07:00
skill_bundles.py feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373) 2026-05-18 21:38:05 -07:00
skill_commands.py fix(memory): strip skill scaffolding for all providers, not just openviking 2026-06-16 10:37:37 -07:00
skill_preprocessing.py fix: prevent TUI gateway stdin EOF crash across all TUI-context subprocess calls 2026-06-08 22:46:57 -07:00
skill_utils.py fix(curator): protect external skills from background curation 2026-06-25 22:03:02 -07:00
ssl_guard.py fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
stream_diag.py feat(agent): buffer retry/fallback status, surface only on terminal failure (#33816) 2026-05-28 04:53:27 -07:00
subdirectory_hints.py fix(subdirectory_hints): prevent loading AGENTS.md outside workspace 2026-05-25 23:17:33 -07:00
system_prompt.py feat(computer_use): cross-platform cua-driver (macOS/Windows/Linux) 2026-06-22 06:42:30 -07:00
think_scrubber.py fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184) 2026-05-05 04:33:38 -07:00
thinking_timeout_guidance.py fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice 2026-06-25 19:00:48 -07:00
title_generator.py feat(titles): support language-aware title generation (#45296) 2026-06-19 17:15:52 -07:00
tool_dispatch_helpers.py feat(agent): require verification before finishing edits 2026-06-24 23:02:48 -05:00
tool_executor.py fix: redact browser typed text surfaces 2026-06-25 22:02:22 -07:00
tool_guardrails.py fix: add recovery hints to loop guard warnings 2026-05-19 00:12:12 -07:00
tool_result_classification.py fix: classify landed file mutations with diagnostics 2026-05-13 06:46:23 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
transcription_provider.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
transcription_registry.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
tts_provider.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
tts_registry.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
turn_context.py fix(agent): rebaseline in-place compression flushes 2026-06-27 03:04:26 -07:00
turn_finalizer.py fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn 2026-06-25 12:24:34 -05:00
turn_retry_state.py fix(agent): fail over to fallback provider on persistent auth failure (401/403) 2026-06-20 11:38:01 -07:00
usage_pricing.py fix(bedrock): price Claude prompt-cache tokens in /usage (#50307) 2026-06-21 11:48:43 -07:00
verification_evidence.py feat(agent): recognize focused ad-hoc verification scripts 2026-06-24 23:03:45 -05:00
verification_stop.py feat(verify-on-stop): default OFF, one-time migration, skip doc-only edits (#53552) 2026-06-27 03:23:22 -07:00
video_gen_provider.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
video_gen_registry.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
web_search_provider.py chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
web_search_registry.py chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00