feat: add claude-opus-4.8 and claude-opus-4.8-fast (#34003)

Anthropic released Claude Opus 4.8 on 2026-05-27, available on
OpenRouter, Anthropic, Amazon Bedrock, and Claude Platform on AWS:
  - https://openrouter.ai/anthropic/claude-opus-4.8
  - https://openrouter.ai/anthropic/claude-opus-4.8-fast

The fast-mode variant is a separate model ID (anthropic/claude-opus-4.8-fast)
priced at 2x of the base model — a notable improvement over the 6x premium
on older Opus generations (4.6/4.7). It is NOT a `speed: "fast"` request
parameter like Opus 4.6; Anthropic's native fast-mode beta still only
covers Opus 4.6.

Changes:

  hermes_cli/models.py
    - Add anthropic/claude-opus-4.8 + anthropic/claude-opus-4.8-fast to
      the OpenRouter fallback snapshot and the Nous Portal curated list
      (live catalogs surface them automatically when reachable; the
      fallback list matters when the manifest fetch fails).
    - Add claude-opus-4-8 to the Anthropic-native picker list.

  agent/model_metadata.py
    - Register claude-opus-4-8 / claude-opus-4.8 in DEFAULT_CONTEXT_LENGTHS
      with 1M tokens (matches 4.6/4.7).

  agent/anthropic_adapter.py
    - Extend _XHIGH_EFFORT_SUBSTRINGS, _ADAPTIVE_THINKING_SUBSTRINGS, and
      _NO_SAMPLING_PARAMS_SUBSTRINGS with "4-8"/"4.8". 4.8 inherits the
      Opus 4.7 API contract: adaptive thinking only, xhigh effort level
      supported, sampling parameters (temperature/top_p/top_k) return 400.
    - Add claude-opus-4-8 to _ANTHROPIC_OUTPUT_LIMITS (128k max output,
      same as 4.7). Matches by substring so claude-opus-4-8-fast and
      date-stamped variants resolve correctly.

  agent/usage_pricing.py
    - Add anthropic/claude-opus-4-8: $5/$25 per MTok input/output, $0.50
      cache read, $6.25 cache write (same as 4.6/4.7).
    - Add anthropic/claude-opus-4-8-fast: $10/$50 per MTok (2x), $1.00
      cache read, $12.50 cache write. Per OpenRouter, the 2x premium is
      the only differentiator from regular Opus 4.8.
    - OpenRouter routes still pull pricing from the live /models API, so
      no static OpenRouter entry is needed.

  tests/agent/test_model_metadata.py
    - Extend the Claude 4.6+ context-length tag list with 4.8/4-8.

  website/static/api/model-catalog.json
    - Regenerated via `python scripts/build_model_catalog.py` to pick up
      the new entries in the OpenRouter and Nous Portal fallback lists.

E2E verification (isolated sys.path import against the worktree):
  - _supports_adaptive_thinking, _supports_xhigh_effort, _forbids_sampling_params
    all return True for claude-opus-4.8 and claude-opus-4.8-fast.
  - _supports_fast_mode (the `speed: "fast"` request-parameter gate) stays
    False for 4.8 — fast mode is a separate model ID on OpenRouter, not a
    parameter Anthropic accepts on the base model.
  - DEFAULT_CONTEXT_LENGTHS resolves 1M for both notations.
  - resolve_billing_route + _lookup_official_docs_pricing resolve the
    correct $5/$25 (regular) and $10/$50 (fast) pricing for both
    dot-notation and dash-notation inputs.
  - 4.7 and 4.6 regression: behavior unchanged.

Unit tests: 305 passed across tests/agent/test_usage_pricing.py,
test_model_metadata.py, tests/hermes_cli/test_model_catalog.py,
test_models.py, test_model_validation.py, test_models_dev_preferred_merge.py.
This commit is contained in:
kshitij 2026-05-28 10:31:59 -07:00 committed by GitHub
parent e8b9369a9d
commit 1a74795735
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 65 additions and 7 deletions

View file

@ -1188,16 +1188,27 @@ class TestBuildAnthropicKwargs:
# params through its signature, we exercise the strip behavior by
# calling the internal predicate directly.
from agent.anthropic_adapter import _forbids_sampling_params
assert _forbids_sampling_params("claude-opus-4-8") is True
assert _forbids_sampling_params("claude-opus-4-8-fast") is True
assert _forbids_sampling_params("claude-opus-4-7") is True
assert _forbids_sampling_params("claude-opus-4-6") is False
assert _forbids_sampling_params("claude-sonnet-4-5") is False
def test_supports_fast_mode_predicate(self):
"""Fast mode is Opus 4.6 only — Opus 4.7 and others must be excluded."""
"""Fast mode is Opus 4.6 only — Opus 4.7 and others must be excluded.
For Opus 4.8 the fast variant is a separate model ID
(anthropic/claude-opus-4.8-fast) routed through the normal model
field, NOT via the ``speed: "fast"`` request parameter. So
``_supports_fast_mode`` (which gates the parameter) must stay
False for both opus-4-8 and opus-4-8-fast.
"""
from agent.anthropic_adapter import _supports_fast_mode
assert _supports_fast_mode("claude-opus-4-6") is True
assert _supports_fast_mode("anthropic/claude-opus-4-6") is True
assert _supports_fast_mode("claude-opus-4-7") is False
assert _supports_fast_mode("claude-opus-4-8") is False
assert _supports_fast_mode("claude-opus-4-8-fast") is False
assert _supports_fast_mode("claude-sonnet-4-6") is False
assert _supports_fast_mode("claude-haiku-4-5") is False
assert _supports_fast_mode("") is False

View file

@ -131,10 +131,10 @@ class TestDefaultContextLengths:
for key, value in DEFAULT_CONTEXT_LENGTHS.items():
if "claude" not in key:
continue
# Claude 4.6+ models (4.6 and 4.7) have 1M context at standard
# Claude 4.6+ models (4.6, 4.7, 4.8) have 1M context at standard
# API pricing (no long-context premium). Older Claude 4.x and
# 3.x models cap at 200k.
if any(tag in key for tag in ("4.6", "4-6", "4.7", "4-7")):
if any(tag in key for tag in ("4.6", "4-6", "4.7", "4-7", "4.8", "4-8")):
assert value == 1000000, f"{key} should be 1000000"
else:
assert value == 200000, f"{key} should be 200000"