fix: update claude 4.6 context length from 200K to 1M (#2155)

* fix: preserve Ollama model:tag colons in context length detection

The colon-split logic in get_model_context_length() and
_query_local_context_length() assumed any colon meant provider:model
format (e.g. "local:my-model"). But Ollama uses model:tag format
(e.g. "qwen3.5:27b"), so the split reduced "qwen3.5:27b" to just
"27b", which matches no known model and caused a fallback to the
2M token probe tier.

Now only recognised provider prefixes (local, openrouter, anthropic,
etc.) are stripped. Ollama model:tag names pass through intact.
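
As a rough illustration of the rule described above, the sketch below strips a
leading segment only when it is a recognised provider name. The helper name
`strip_provider_prefix` and the contents of `KNOWN_PROVIDER_PREFIXES` are
assumptions made for this example; the real change lives inside
get_model_context_length() and _query_local_context_length() and may differ.

```python
# Hypothetical sketch: the prefix set and helper name are illustrative
# assumptions, not the project's actual code.
KNOWN_PROVIDER_PREFIXES = frozenset({"local", "openrouter", "anthropic"})


def strip_provider_prefix(model: str) -> str:
    """Drop a leading 'provider:' only when the provider is recognised."""
    prefix, sep, rest = model.partition(":")
    if sep and rest and prefix.lower() in KNOWN_PROVIDER_PREFIXES:
        return rest
    # No colon, or the part before it is not a provider (e.g. an Ollama
    # model:tag name), so the name is returned unchanged.
    return model


print(strip_provider_prefix("local:my-model"))  # my-model
print(strip_provider_prefix("qwen3.5:27b"))     # qwen3.5:27b
```

Splitting on the first colon only means a provider-qualified Ollama name such
as "local:qwen3.5:27b" would keep its tag after the prefix is removed.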

* fix: update claude-opus-4-6 and claude-sonnet-4-6 context length from 200K to 1M

Both models support 1,000,000 token context windows. The hardcoded defaults
were set before Anthropic expanded the context for the 4.6 generation.
Verified via models.dev and OpenRouter API data.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Co-authored-by: Test <test@test.com>
Teknium 2026-03-20 04:38:59 -07:00 committed by GitHub
parent b19f5133c3
commit 3ec6c71e43
2 changed files with 11 additions and 6 deletions

@@ -69,15 +69,15 @@ CONTEXT_PROBE_TIERS = [
 DEFAULT_CONTEXT_LENGTHS = {
     "anthropic/claude-opus-4": 200000,
     "anthropic/claude-opus-4.5": 200000,
-    "anthropic/claude-opus-4.6": 200000,
+    "anthropic/claude-opus-4.6": 1000000,
     "anthropic/claude-sonnet-4": 200000,
     "anthropic/claude-sonnet-4-20250514": 200000,
     "anthropic/claude-sonnet-4.5": 200000,
-    "anthropic/claude-sonnet-4.6": 200000,
+    "anthropic/claude-sonnet-4.6": 1000000,
     "anthropic/claude-haiku-4.5": 200000,
     # Bare Anthropic model IDs (for native API provider)
-    "claude-opus-4-6": 200000,
-    "claude-sonnet-4-6": 200000,
+    "claude-opus-4-6": 1000000,
+    "claude-sonnet-4-6": 1000000,
     "claude-opus-4-5-20251101": 200000,
     "claude-sonnet-4-5-20250929": 200000,
     "claude-opus-4-1-20250805": 200000,

@@ -106,9 +106,14 @@ class TestEstimateMessagesTokensRough:
 # =========================================================================
 
 class TestDefaultContextLengths:
-    def test_claude_models_200k(self):
+    def test_claude_models_context_lengths(self):
         for key, value in DEFAULT_CONTEXT_LENGTHS.items():
-            if "claude" in key:
+            if "claude" not in key:
+                continue
+            # Claude 4.6 models have 1M context
+            if "4.6" in key or "4-6" in key:
+                assert value == 1000000, f"{key} should be 1000000"
+            else:
                 assert value == 200000, f"{key} should be 200000"
 
     def test_gpt4_models_128k_or_1m(self):