mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit
path (status_code on the exception, Retry-After preserved via error.response)
instead of being opaque RuntimeErrors. User sees a one-line capacity message
instead of a 500-char JSON dump.
Changes
- CodeAssistError grows status_code / response / retry_after / details attrs.
_extract_status_code in error_classifier picks up status_code and classifies
429 as FailoverReason.rate_limit, so fallback_providers triggers the same
way it does for SDK errors. run_agent.py line ~10428 already walks
error.response.headers for Retry-After — preserving the response means that
path just works.
- _gemini_http_error parses the Google error envelope (error.status +
error.details[].reason from google.rpc.ErrorInfo, retryDelay from
google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404
model-not-found each produce a human-readable message; unknown shapes fall
back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and
agent/model_metadata.py — Google returned 404 for it today in local repro.
Kept gemma-4-31b-it (capacity-constrained but not retired).
Validation
| | Before | After |
|---------------------------|--------------------------------|-------------------------------------------|
| Error message | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error | None (opaque RuntimeError) | 429 |
| Classifier reason | unknown (string-match fallback) | FailoverReason.rate_limit |
| Retry-After honored | ignored | extracted from RetryInfo or header |
| gemma-4-26b-it picker | advertised (404s on Google) | removed |
Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found,
Retry-After header fallback, malformed body, and classifier integration.
Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full
tests/hermes_cli (2203 tests) green.
Co-authored-by: teknium1 <teknium@nousresearch.com>
|
||
|---|---|---|
| .. | ||
| __init__.py | ||
| auth.py | ||
| auth_commands.py | ||
| backup.py | ||
| banner.py | ||
| callbacks.py | ||
| claw.py | ||
| cli_output.py | ||
| clipboard.py | ||
| codex_models.py | ||
| colors.py | ||
| commands.py | ||
| completion.py | ||
| config.py | ||
| copilot_auth.py | ||
| cron.py | ||
| curses_ui.py | ||
| debug.py | ||
| default_soul.py | ||
| dingtalk_auth.py | ||
| doctor.py | ||
| dump.py | ||
| env_loader.py | ||
| gateway.py | ||
| logs.py | ||
| main.py | ||
| mcp_config.py | ||
| memory_setup.py | ||
| model_normalize.py | ||
| model_switch.py | ||
| models.py | ||
| nous_subscription.py | ||
| pairing.py | ||
| platforms.py | ||
| plugins.py | ||
| plugins_cmd.py | ||
| profiles.py | ||
| providers.py | ||
| runtime_provider.py | ||
| setup.py | ||
| skills_config.py | ||
| skills_hub.py | ||
| skin_engine.py | ||
| status.py | ||
| tips.py | ||
| tools_config.py | ||
| uninstall.py | ||
| web_server.py | ||
| webhook.py | ||