mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
fix(bedrock): resolve context length via static table before custom-endpoint probe
## Problem
`get_model_context_length()` in `agent/model_metadata.py` had a resolution
order bug that caused every Bedrock model to fall back to the 128K default
context length instead of reaching the static Bedrock table (200K for
Claude, etc.).
The root cause: `bedrock-runtime.<region>.amazonaws.com` is not listed in
`_URL_TO_PROVIDER`, so `_is_known_provider_base_url()` returned False.
The resolution order then ran the custom-endpoint probe (step 2) *before*
the Bedrock branch (step 4b), which:
1. Treated Bedrock as a custom endpoint (via `_is_custom_endpoint`).
2. Called `fetch_endpoint_model_metadata()` → `GET /models` on the
bedrock-runtime URL (Bedrock doesn't serve this shape).
3. Fell through to `return DEFAULT_FALLBACK_CONTEXT` (128K) at the
"probe-down" branch — never reaching the Bedrock static table.
Result: users on Bedrock saw 128K context for Claude models that
actually support 200K on Bedrock, causing premature auto-compression.
## Fix
Promote the Bedrock branch from step 4b to step 1b, so it runs *before*
the custom-endpoint probe at step 2. The static table in
`bedrock_adapter.py::get_bedrock_context_length()` is the authoritative
source for Bedrock (the ListFoundationModels API doesn't expose context
window sizes), so there's no reason to probe `/models` first.
The original step 4b is replaced with a one-line breadcrumb comment
pointing to the new location, to make the resolution-order docstring
accurate.
## Changes
- `agent/model_metadata.py`
- Add step 1b: Bedrock static-table branch (unchanged predicate, moved).
- Remove dead step 4b block, replace with breadcrumb comment.
- Update resolution-order docstring to include step 1b.
- `tests/agent/test_model_metadata.py`
- New `TestBedrockContextResolution` class (3 tests):
- `test_bedrock_provider_returns_static_table_before_probe`:
confirms `provider="bedrock"` hits the static table and does NOT
call `fetch_endpoint_model_metadata` (regression guard).
- `test_bedrock_url_without_provider_hint`: confirms the
`bedrock-runtime.*.amazonaws.com` host match works without an
explicit `provider=` hint.
- `test_non_bedrock_url_still_probes`: confirms the probe still
fires for genuinely-custom endpoints (no over-reach).
## Testing
pytest tests/agent/test_model_metadata.py -q
# 83 passed in 1.95s (3 new + 80 existing)
## Risk
Very low.
- Predicate is identical to the original step 4b — no behaviour change
for non-Bedrock paths.
- Original step 4b was dead code for the user-facing case (always hit
the 128K fallback first), so removing it cannot regress behaviour.
- Bedrock path now short-circuits before any network I/O — faster too.
- `ImportError` fall-through preserved so users without `boto3`
installed are unaffected.
## Related
- This is a prerequisite for accurate context-window accounting on
Bedrock — the fix for #14710 (stale-connection client eviction)
depends on correct context sizing to know when to compress.
Signed-off-by: Andre Kurait <andrekurait@gmail.com>
This commit is contained in:
parent
f2fba4f9a1
commit
b290297d66
2 changed files with 73 additions and 13 deletions
|
|
@ -588,6 +588,57 @@ class TestGetModelContextLength:
|
|||
assert result == 200000
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Bedrock context resolution — must run BEFORE custom-endpoint probe
|
||||
# =========================================================================
|
||||
|
||||
class TestBedrockContextResolution:
|
||||
"""Regression tests for Bedrock context-length resolution order.
|
||||
|
||||
Bug: because ``bedrock-runtime.<region>.amazonaws.com`` is not listed in
|
||||
``_URL_TO_PROVIDER``, ``_is_known_provider_base_url`` returned False and
|
||||
the custom-endpoint probe at step 2 ran first — fetching ``/models`` from
|
||||
Bedrock (which it doesn't serve), returning the 128K default-fallback
|
||||
before execution ever reached the Bedrock branch.
|
||||
|
||||
Fix: promote the Bedrock branch ahead of the custom-endpoint probe.
|
||||
"""
|
||||
|
||||
@patch("agent.model_metadata.fetch_endpoint_model_metadata")
|
||||
def test_bedrock_provider_returns_static_table_before_probe(self, mock_fetch):
|
||||
"""provider='bedrock' resolves via static table, bypasses /models probe."""
|
||||
ctx = get_model_context_length(
|
||||
"anthropic.claude-opus-4-v1:0",
|
||||
provider="bedrock",
|
||||
base_url="https://bedrock-runtime.us-east-1.amazonaws.com",
|
||||
)
|
||||
# Must return the static Bedrock table value (200K for Claude),
|
||||
# NOT DEFAULT_FALLBACK_CONTEXT (128K).
|
||||
assert ctx == 200000
|
||||
mock_fetch.assert_not_called()
|
||||
|
||||
@patch("agent.model_metadata.fetch_endpoint_model_metadata")
|
||||
def test_bedrock_url_without_provider_hint(self, mock_fetch):
|
||||
"""bedrock-runtime host infers Bedrock even when provider is omitted."""
|
||||
ctx = get_model_context_length(
|
||||
"anthropic.claude-sonnet-4-v1:0",
|
||||
base_url="https://bedrock-runtime.us-west-2.amazonaws.com",
|
||||
)
|
||||
assert ctx == 200000
|
||||
mock_fetch.assert_not_called()
|
||||
|
||||
@patch("agent.model_metadata.fetch_endpoint_model_metadata")
|
||||
def test_non_bedrock_url_still_probes(self, mock_fetch):
|
||||
"""Non-Bedrock hosts still reach the custom-endpoint probe."""
|
||||
mock_fetch.return_value = {"some-model": {"context_length": 50000}}
|
||||
ctx = get_model_context_length(
|
||||
"some-model",
|
||||
base_url="https://api.example.com/v1",
|
||||
)
|
||||
assert ctx == 50000
|
||||
assert mock_fetch.called
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# _strip_provider_prefix — Ollama model:tag vs provider:model
|
||||
# =========================================================================
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue