hermes-agent/tests/agent
Teknium 1368caf66f
fix(anthropic): smart thinking block signature management (#6112)
Anthropic signs thinking blocks against the full turn content. Any
upstream mutation (context compression, session truncation, orphan
stripping, message merging) invalidates the signature, causing HTTP 400
'Invalid signature in thinking block' — especially in long-lived
gateway sessions.

Strategy (following clawdbot/OpenClaw pattern):

1. Strip thinking/redacted_thinking from all assistant messages EXCEPT
   the last one — preserves reasoning continuity on the current
   tool-use chain while avoiding stale signature errors on older turns.

2. Downgrade unsigned thinking blocks to plain text — Anthropic can't
   validate them, but the reasoning content is preserved.

3. Strip cache_control from thinking/redacted_thinking blocks to
   prevent cache markers from interfering with signature validation.

4. Drop thinking blocks from the second message when merging
   consecutive assistant messages (role alternation enforcement).

5. Error recovery: on HTTP 400 mentioning 'signature' and 'thinking',
   strip all reasoning_details from the conversation and retry once.
   This is the safety net for edge cases the proactive stripping
   misses.

Addresses the issue reported in PR #6086 by @mingginwan while
preserving reasoning continuity (their PR stripped ALL thinking
blocks unconditionally).

Files changed:
- agent/anthropic_adapter.py: thinking block management in
  convert_messages_to_anthropic (strip old turns, downgrade unsigned,
  strip cache_control, merge-time strip)
- run_agent.py: one-shot signature error recovery in retry loop
- tests/test_anthropic_adapter.py: 10 new tests covering all cases
2026-04-08 03:38:08 -07:00
..
__init__.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_anthropic_adapter.py fix(anthropic): smart thinking block signature management (#6112) 2026-04-08 03:38:08 -07:00
test_auxiliary_client.py fix(vision): simplify vision auto-detection to openrouter → nous → active provider 2026-04-08 01:21:54 -07:00
test_auxiliary_config_bridge.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_auxiliary_named_custom_providers.py fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing (#5978) 2026-04-07 17:59:47 -07:00
test_context_compressor.py Fix compaction summary retries for temperature-restricted models 2026-04-06 16:49:57 -07:00
test_context_references.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_credential_pool.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_credential_pool_routing.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_crossloop_client_cache.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_display.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_display_emoji.py feat(tools): centralize tool emoji metadata in registry + skin integration 2026-03-15 20:21:21 -07:00
test_external_skills.py feat(skills): support external skill directories via config (#3678) 2026-03-29 00:33:30 -07:00
test_insights.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_memory_plugin_e2e.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_memory_provider.py fix: mem0 API v2 compat, prefetch context fencing, secret redaction (#5423) 2026-04-05 22:43:33 -07:00
test_memory_user_id.py fix: thread gateway user_id to memory plugins for per-user scoping (#5895) 2026-04-07 11:14:12 -07:00
test_minimax_auxiliary_url.py fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983) 2026-04-07 22:23:28 -07:00
test_minimax_provider.py fix(minimax): correct context lengths, model catalog, thinking guard, aux model, and config base_url 2026-04-08 02:20:46 -07:00
test_model_metadata.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
test_model_metadata_local_ctx.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_models_dev.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
test_prompt_builder.py feat: switch managed browser provider from Browserbase to Browser Use (#5750) 2026-04-07 08:40:22 -04:00
test_prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
test_redact.py test(redact): add regression tests for lowercase variable redaction (#4367) (#5185) 2026-04-05 00:10:16 -07:00
test_skill_commands.py fix: sanitize Telegram command names to strip invalid characters 2026-04-06 11:27:28 -07:00
test_smart_model_routing.py fix: hermes update causes dual gateways on macOS (launchd) (#1567) 2026-03-16 12:36:29 -07:00
test_subagent_progress.py feat(api): structured run events via /v1/runs SSE endpoint 2026-04-05 12:05:13 -07:00
test_subdirectory_hints.py feat: progressive subdirectory hint discovery (#5291) 2026-04-05 12:33:47 -07:00
test_title_generator.py feat: auto-generate session titles after first exchange 2026-03-17 04:14:40 -07:00
test_usage_pricing.py feat: use endpoint metadata for custom model context and pricing (#1906) 2026-03-18 03:04:07 -07:00