hermes-agent/tests/agent
trevthefoolish 0517ac3e93 fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide

Fixes NousResearch/hermes-agent#11137

Breaking-change coverage:

1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
   _supports_adaptive_thinking() (extends previous 4.6-only gate).

2. Sampling parameter stripping — 4.7 returns 400 for any non-default
   temperature / top_p / top_k. build_anthropic_kwargs drops them as a
   safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
   and AnthropicCompletionsAdapter.create() both early-exit before
   setting temperature for 4.7+ models. This keeps flush_memories and
   structured-JSON aux paths that hardcode temperature from 400ing
   when the aux model is flipped to 4.7.

3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
   which silently hides reasoning text from Hermes's CLI activity feed
   during long tool runs. Restoring "summarized" preserves 4.6 UX.

4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
   silently over-efforted every coding/agentic request). max is now a
   distinct ceiling per Anthropic's 5-level effort model.

5. New stop_reason values — refusal and model_context_window_exceeded
   were silently collapsed to "stop" (end_turn) by the adapter's
   stop_reason_map. Now mapped to "content_filter" and "length"
   respectively, matching upstream finish-reason handling already in
   bedrock_adapter.

6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
   list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
   catalog (recommended), claude-opus-4-7 added to model_metadata
   DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).

7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
   that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
   prefill (400).

8. Tests — 4 new tests in test_anthropic_adapter covering display
   default, xhigh preservation, max on 4.7, refusal / context-overflow
   stop_reason mapping, plus the sampling-param predicate. test_model_metadata
   accepts 4.7 at 1M context.

Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 10:48:20 -07:00
..
__init__.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_anthropic_adapter.py fix(agent): complete Claude Opus 4.7 API migration 2026-04-16 10:48:20 -07:00
test_auxiliary_client.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_auxiliary_config_bridge.py fix: remove legacy compression.summary_* config and env var fallbacks (#8992) 2026-04-13 04:59:26 -07:00
test_auxiliary_named_custom_providers.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_bedrock_adapter.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
test_bedrock_integration.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
test_compress_focus.py fix: resolve CI test failures — add missing functions, fix stale tests (#9483) 2026-04-14 01:43:45 -07:00
test_context_compressor.py fix(agent): route compression aux through live session runtime 2026-04-12 01:34:52 -07:00
test_context_engine.py feat: wire context engine plugin slot into agent and plugin system 2026-04-10 19:15:50 -07:00
test_context_references.py fix(agent): preserve quoted @file references with spaces 2026-04-10 13:05:01 -07:00
test_credential_pool.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_credential_pool_routing.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_crossloop_client_cache.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_display.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_display_emoji.py feat(tools): centralize tool emoji metadata in registry + skin integration 2026-03-15 20:21:21 -07:00
test_error_classifier.py fix: add vLLM/local server error patterns + MCP initial connection retry (#9281) 2026-04-13 18:46:14 -07:00
test_external_skills.py feat(skills): support external skill directories via config (#3678) 2026-03-29 00:33:30 -07:00
test_insights.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_local_stream_timeout.py fix: is_local_endpoint misses Docker/Podman DNS names (#7950) 2026-04-11 14:46:18 -07:00
test_memory_provider.py feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619) 2026-04-15 19:12:19 -07:00
test_memory_user_id.py fix: resolve CI test failures — add missing functions, fix stale tests (#9483) 2026-04-14 01:43:45 -07:00
test_minimax_auxiliary_url.py fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983) 2026-04-07 22:23:28 -07:00
test_minimax_provider.py fix: preserve dots in model names for OpenCode Zen and ZAI providers (#8794) 2026-04-12 21:22:59 -07:00
test_model_metadata.py fix(agent): complete Claude Opus 4.7 API migration 2026-04-16 10:48:20 -07:00
test_model_metadata_local_ctx.py fix(agent): prefer Ollama Modelfile num_ctx over GGUF training max 2026-04-13 04:24:07 -07:00
test_models_dev.py fix: three provider-related bugs (#8161, #8181, #8147) (#8243) 2026-04-12 01:44:18 -07:00
test_nous_rate_guard.py fix: Nous Portal rate limit guard — prevent retry amplification (#10568) 2026-04-15 16:31:48 -07:00
test_prompt_builder.py feat: add WSL environment hint to system prompt (#8285) 2026-04-12 02:26:28 -07:00
test_prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
test_proxy_and_url_validation.py fix(runtime): surface malformed proxy env and base URL before client init 2026-04-15 16:10:53 -07:00
test_rate_limit_tracker.py feat: capture provider rate limit headers and show in /usage (#6541) 2026-04-09 03:43:14 -07:00
test_redact.py fix(security): add JWT token and Discord mention redaction (#10547) 2026-04-15 16:08:52 -07:00
test_skill_commands.py fix: sanitize Telegram command names to strip invalid characters 2026-04-06 11:27:28 -07:00
test_smart_model_routing.py fix: hermes update causes dual gateways on macOS (launchd) (#1567) 2026-03-16 12:36:29 -07:00
test_subagent_progress.py feat(api): structured run events via /v1/runs SSE endpoint 2026-04-05 12:05:13 -07:00
test_subdirectory_hints.py fix(agent): catch PermissionError in subdirectory hint discovery 2026-04-09 03:10:30 -07:00
test_title_generator.py feat: auto-generate session titles after first exchange 2026-03-17 04:14:40 -07:00
test_usage_pricing.py feat: use endpoint metadata for custom model context and pricing (#1906) 2026-03-18 03:04:07 -07:00
test_vision_resolved_args.py fix: pass resolved args to resolve_vision_provider_client() 2026-04-16 07:45:13 -07:00