hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-18 09:51:59 +00:00


Andre Kurait a8e89cbbf6 fix(bedrock): resolve context length via static table before custom-endpoint probe	2026-04-23 20:33:09 +00:00

## Problem

`get_model_context_length()` in `agent/model_metadata.py` had a resolution order bug that caused every Bedrock model to fall back to the 128K default context length instead of reaching the static Bedrock table (200K for Claude, etc.).

The root cause: `bedrock-runtime.<region>.amazonaws.com` is not listed in `_URL_TO_PROVIDER`, so `_is_known_provider_base_url()` returned False. The resolution order then ran the custom-endpoint probe (step 2) before the Bedrock branch (step 4b), which:

1. Treated Bedrock as a custom endpoint (via `_is_custom_endpoint`).
2. Called `fetch_endpoint_model_metadata()` → `GET /models` on the bedrock-runtime URL (Bedrock doesn't serve this shape).
3. Fell through to `return DEFAULT_FALLBACK_CONTEXT` (128K) at the "probe-down" branch, never reaching the Bedrock static table.

Result: users on Bedrock saw 128K context for Claude models that actually support 200K on Bedrock, causing premature auto-compression.

## Fix

Promote the Bedrock branch from step 4b to step 1b, so it runs before the custom-endpoint probe at step 2. The static table in `bedrock_adapter.py::get_bedrock_context_length()` is the authoritative source for Bedrock (the ListFoundationModels API doesn't expose context window sizes), so there's no reason to probe `/models` first.

The original step 4b is replaced with a one-line breadcrumb comment pointing to the new location, to make the resolution-order docstring accurate.

## Changes

- `agent/model_metadata.py`
  - Add step 1b: Bedrock static-table branch (unchanged predicate, moved).
  - Remove dead step 4b block, replace with breadcrumb comment.
  - Update resolution-order docstring to include step 1b.
- `tests/agent/test_model_metadata.py`
  - New `TestBedrockContextResolution` class (3 tests):
    - `test_bedrock_provider_returns_static_table_before_probe`: confirms `provider="bedrock"` hits the static table and does NOT call `fetch_endpoint_model_metadata` (regression guard).
    - `test_bedrock_url_without_provider_hint`: confirms the `bedrock-runtime.*.amazonaws.com` host match works without an explicit `provider=` hint.
    - `test_non_bedrock_url_still_probes`: confirms the probe still fires for genuinely-custom endpoints (no over-reach).

## Testing

    pytest tests/agent/test_model_metadata.py -q
    # 83 passed in 1.95s (3 new + 80 existing)

## Risk

Very low.

- Predicate is identical to the original step 4b; no behaviour change for non-Bedrock paths.
- Original step 4b was dead code for the user-facing case (always hit the 128K fallback first), so removing it cannot regress behaviour.
- Bedrock path now short-circuits before any network I/O (faster too).
- `ImportError` fall-through preserved so users without `boto3` installed are unaffected.

## Related

- This is a prerequisite for accurate context-window accounting on Bedrock: the fix for #14710 (stale-connection client eviction) depends on correct context sizing to know when to compress.

Signed-off-by: Andre Kurait <andrekurait@gmail.com>
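The reordering described in the commit message can be illustrated with a minimal sketch. This is not the code in `agent/model_metadata.py`: the function names `resolve_context_length`, `is_bedrock`, and `bedrock_static_lookup`, the one-entry table, the host regex, and the injected `probe` callable are all hypothetical stand-ins chosen for illustration. Only the 128K fallback, the 200K Claude figure, and the "static table before probe" ordering come from the commit message. Letting an unknown Bedrock model fall through to the later steps is likewise an assumption of this sketch.

```python
import re
from typing import Callable, Optional

# 128K default named in the commit message.
DEFAULT_FALLBACK_CONTEXT = 128_000

# Hypothetical stand-in for the static Bedrock table; the real one is
# described as living in bedrock_adapter.py.
BEDROCK_CONTEXT_TABLE = {"claude": 200_000}

# Assumed host pattern for bedrock-runtime.<region>.amazonaws.com.
_BEDROCK_HOST = re.compile(r"bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com")

Probe = Callable[[str, str], Optional[int]]


def is_bedrock(provider: Optional[str], base_url: Optional[str]) -> bool:
    """True for an explicit provider hint or a bedrock-runtime host."""
    if provider == "bedrock":
        return True
    return bool(base_url and _BEDROCK_HOST.search(base_url))


def bedrock_static_lookup(model: str) -> Optional[int]:
    """Substring match against the (hypothetical) static table."""
    lowered = model.lower()
    for key, length in BEDROCK_CONTEXT_TABLE.items():
        if key in lowered:
            return length
    return None


def resolve_context_length(
    model: str,
    provider: Optional[str] = None,
    base_url: Optional[str] = None,
    probe: Optional[Probe] = None,
) -> int:
    # Step 1b: consult the static Bedrock table first, with no network I/O.
    if is_bedrock(provider, base_url):
        length = bedrock_static_lookup(model)
        if length is not None:
            return length
    # Step 2: only genuinely custom endpoints reach the /models-style probe.
    if base_url and probe is not None:
        probed = probe(base_url, model)
        if probed is not None:
            return probed
    # Last resort: the default fallback.
    return DEFAULT_FALLBACK_CONTEXT
```

Passing the probe in as a parameter keeps the sketch free of network calls and makes the regression condition easy to check: for a Bedrock provider or URL, the probe must never be invoked.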
transports	fix: add extra_content property to ToolCall for Gemini thought_signature (#14488)	2026-04-23 23:45:07 +05:30
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
account_usage.py	feat(account-usage): add per-provider account limits module	2026-04-21 01:56:35 -07:00
anthropic_adapter.py	refactor: remove _nr_to_assistant_message shim + fix flush_memories guard	2026-04-23 02:30:05 -07:00
auxiliary_client.py	feat: add Xiaomi MiMo v2.5-pro and v2.5 model support (#14635)	2026-04-23 10:06:25 -07:00
bedrock_adapter.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
codex_responses_adapter.py	refactor: extract codex_responses logic into dedicated adapter	2026-04-20 11:53:17 -07:00
context_compressor.py	fix: pass correct arguments in summary model fallback retry	2026-04-22 17:57:13 -07:00
context_engine.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
copilot_acp_client.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
credential_pool.py	fix(auth): unify credential source removal — every source sticks (#13427)	2026-04-21 01:52:49 -07:00
credential_sources.py	fix(auth): unify credential source removal — every source sticks (#13427)	2026-04-21 01:52:49 -07:00
display.py	fix(display): render <missing old_text> in memory previews instead of empty quotes (#12852)	2026-04-19 22:45:47 -07:00
error_classifier.py	fix(error_classifier): retry mid-stream SSL/TLS alert errors as transport	2026-04-22 17:44:50 -07:00
file_safety.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
gemini_cloudcode_adapter.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
gemini_native_adapter.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
gemini_schema.py	fix(gemini): sanitize tool schemas for Google providers	2026-04-20 00:26:18 -07:00
google_code_assist.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)	2026-04-17 15:34:12 -07:00
google_oauth.py	feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270)	2026-04-16 16:49:00 -07:00
image_gen_provider.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
image_gen_registry.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
insights.py	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
manual_compression_feedback.py	fix(gateway): make manual compression feedback truthful	2026-04-10 21:16:53 -07:00
memory_manager.py	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619)	2026-04-15 19:12:19 -07:00
memory_provider.py	refactor(memory): drop on_session_reset — commit-only is enough	2026-04-15 11:28:45 -07:00
model_metadata.py	fix(bedrock): resolve context length via static table before custom-endpoint probe	2026-04-23 20:33:09 +00:00
models_dev.py	fix: normalize provider in list_provider_models to support aliases	2026-04-23 01:59:20 -07:00
nous_rate_guard.py	fix: Nous Portal rate limit guard — prevent retry amplification (#10568)	2026-04-15 16:31:48 -07:00
prompt_builder.py	feat(agent): add PLATFORM_HINTS for matrix, mattermost, and feishu (#14428)	2026-04-23 12:50:22 +05:30
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
redact.py	feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148)	2026-04-20 11:49:54 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
shell_hooks.py	feat: shell hooks — wire shell scripts as Hermes hook callbacks	2026-04-20 20:53:51 -07:00
skill_commands.py	feat(skills+terminal): make bundled skill scripts runnable out of the box (#13384)	2026-04-21 00:39:19 -07:00
skill_utils.py	fix(skills): follow symlinks in iter_skill_index_files	2026-04-22 17:43:30 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
title_generator.py	fix: increase max_tokens for GLM 5.1 reasoning headroom	2026-04-22 18:44:07 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies	2026-04-22 17:40:49 -07:00