hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-09 08:21:50 +00:00

Teknium c6fd2619f7 fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)	2026-04-17 15:34:12 -07:00

Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit path (status_code on the exception, Retry-After preserved via error.response) instead of being opaque RuntimeErrors. User sees a one-line capacity message instead of a 500-char JSON dump.

Changes

- CodeAssistError grows status_code / response / retry_after / details attrs. _extract_status_code in error_classifier picks up status_code and classifies 429 as FailoverReason.rate_limit, so fallback_providers triggers the same way it does for SDK errors. run_agent.py line ~10428 already walks error.response.headers for Retry-After — preserving the response means that path just works.
- _gemini_http_error parses the Google error envelope (error.status + error.details[].reason from google.rpc.ErrorInfo, retryDelay from google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404 model-not-found each produce a human-readable message; unknown shapes fall back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and agent/model_metadata.py — Google returned 404 for it today in local repro. Kept gemma-4-31b-it (capacity-constrained but not retired).

Validation

| | Before | After |
|---|---|---|
| Error message | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error | None (opaque RuntimeError) | 429 |
| Classifier reason | unknown (string-match fallback) | FailoverReason.rate_limit |
| Retry-After honored | ignored | extracted from RetryInfo or header |
| gemma-4-26b-it picker | advertised (404s on Google) | removed |

Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found, Retry-After header fallback, malformed body, and classifier integration. Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full tests/hermes_cli (2203 tests) green.

Co-authored-by: teknium1 <teknium@nousresearch.com>
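The envelope parsing described in the commit message can be illustrated with a minimal sketch. This is not the code from google_code_assist.py: the class `CodeAssistErrorSketch`, the helper `build_error`, and the exact message strings are hypothetical names chosen for illustration. The sketch assumes the standard Google API error JSON shape (`error.status`, `error.details[]` entries typed as `google.rpc.ErrorInfo` and `google.rpc.RetryInfo`) and handles only the numeric-seconds form of the Retry-After header.

```python
import json
import re
from typing import Optional


class CodeAssistErrorSketch(RuntimeError):
    """Hypothetical stand-in for CodeAssistError (not the Hermes class)."""

    def __init__(self, message, *, status_code=None, response=None,
                 retry_after=None, details=None):
        super().__init__(message)
        self.status_code = status_code  # lets a classifier see the 429
        self.response = response        # kept so headers stay reachable
        self.retry_after = retry_after  # seconds, or None if unknown
        self.details = details or {}


def _parse_retry_delay(value) -> Optional[float]:
    """Convert a RetryInfo retryDelay string such as '30s' into seconds."""
    if not isinstance(value, str):
        return None
    match = re.fullmatch(r"(\d+(?:\.\d+)?)s", value.strip())
    return float(match.group(1)) if match else None


def build_error(status_code, body, model, headers=None):
    """Build an error from an HTTP status, raw body text, and headers."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # malformed body: fall through to the raw format
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}

    status = error.get("status")
    reason = None
    retry_after = None
    details = error.get("details")
    for detail in details if isinstance(details, list) else []:
        if not isinstance(detail, dict):
            continue
        type_url = str(detail.get("@type", ""))
        if type_url.endswith("google.rpc.ErrorInfo"):
            reason = detail.get("reason", reason)
        elif type_url.endswith("google.rpc.RetryInfo"):
            retry_after = _parse_retry_delay(detail.get("retryDelay"))

    if retry_after is None:
        # Header fallback; only the delay-in-seconds form is handled here.
        try:
            retry_after = float(headers.get("retry-after", ""))
        except (TypeError, ValueError):
            retry_after = None

    # Illustrative messages; the real wording lives in the Hermes source.
    if reason == "MODEL_CAPACITY_EXHAUSTED":
        message = f"Gemini capacity exhausted for {model}"
    elif status == "RESOURCE_EXHAUSTED":
        message = f"Gemini rate limit reached for {model}"
    elif status_code == 404 and status == "NOT_FOUND":
        message = f"Model not found: {model}"
    else:
        message = f"Code Assist returned HTTP {status_code}: {body[:500]}"

    return CodeAssistErrorSketch(
        message,
        status_code=status_code,
        retry_after=retry_after,
        details={"status": status, "reason": reason},
    )
```

Because the status code and retry delay sit on the exception as plain attributes, a downstream classifier can branch on `err.status_code == 429` rather than string-matching the message, which is the behaviour change the Validation table summarises.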
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
anthropic_adapter.py	fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models	2026-04-16 12:00:56 -07:00
auxiliary_client.py	fix(agent): complete Claude Opus 4.7 API migration	2026-04-16 10:48:20 -07:00
bedrock_adapter.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
context_compressor.py	fix(context_compressor): always keep last user message in tail to prevent active-task loss	2026-04-16 07:45:31 -07:00
context_engine.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
context_references.py	fix(agent): preserve quoted @file references with spaces	2026-04-10 13:05:01 -07:00
copilot_acp_client.py	fix: handle httpx.Timeout object in CopilotACPClient (#11058)	2026-04-16 12:05:11 -07:00
credential_pool.py	fix(auth): codex auth remove no longer silently undone by auto-import (#11485)	2026-04-17 04:10:17 -07:00
display.py	fix: remove context pressure warnings entirely (#11039)	2026-04-16 06:44:23 -07:00
error_classifier.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
gemini_cloudcode_adapter.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)	2026-04-17 15:34:12 -07:00
google_code_assist.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)	2026-04-17 15:34:12 -07:00
google_oauth.py	feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270)	2026-04-16 16:49:00 -07:00
insights.py	fix(insights): hide cache read/write and cost metrics from display (#11477)	2026-04-17 01:02:06 -07:00
manual_compression_feedback.py	fix(gateway): make manual compression feedback truthful	2026-04-10 21:16:53 -07:00
memory_manager.py	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619)	2026-04-15 19:12:19 -07:00
memory_provider.py	refactor(memory): drop on_session_reset — commit-only is enough	2026-04-15 11:28:45 -07:00
model_metadata.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)	2026-04-17 15:34:12 -07:00
models_dev.py	feat: add Ollama Cloud as built-in provider	2026-04-16 02:22:09 -07:00
nous_rate_guard.py	fix: Nous Portal rate limit guard — prevent retry amplification (#10568)	2026-04-15 16:31:48 -07:00
prompt_builder.py	fix(prompt): list all supported Telegram markdown formatting	2026-04-15 17:54:13 -07:00
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
redact.py	fix(security): add JWT token and Discord mention redaction (#10547)	2026-04-15 16:08:52 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
skill_commands.py	fix: use absolute skill_dir for external skills (#10313) (#10587)	2026-04-15 17:22:55 -07:00
skill_utils.py	feat(plugins): namespaced skill registration for plugin skill bundles	2026-04-14 10:42:58 -07:00
smart_model_routing.py	fix: UTF-8 config encoding, pairing hint, credential_pool key, header normalization (#7174)	2026-04-10 05:33:48 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
title_generator.py	fix: title_generator no longer logs as 'compression' task	2026-04-12 04:17:18 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00