Commit graph

3 commits

Author SHA1 Message Date
teknium1
efcbbde48c refactor: keep anthropic_content_blocks in-memory only (no state.db column)
Drop the hermes_state.py column + persistence plumbing from the salvaged
interleaved-thinking fix. The ordered-block channel covers the failure
window in-memory (turn replayed within the live conversation loop). A
session reloaded from disk after a crash falls back to reconstruction;
if that replay 400s, the thinking-signature recovery (#43667) strips
reasoning_details and retries — one degraded call in a rare resume path
instead of a schema column. Replaces the DB-roundtrip test with a
fallback-shape test.
2026-06-10 20:45:16 -07:00
RaumfahrerSpiffy
7a1eed8268 fix(anthropic): redact replayed tool inputs and broaden thinking-replay 400 recovery
Two additive hardening changes on the interleaved-thinking replay path
introduced by this PR's anthropic_content_blocks channel. Both are scoped
to that channel's blast radius; neither changes correct behavior.

1. Replay-time tool-input re-sourcing (credential safety).
   The ordered-block channel captures each tool_use `input` from the RAW
   API response in normalize_response, which is NOT credential-redacted.
   The parallel tool_calls[].function.arguments IS redacted at storage
   time (build_assistant_message, #19798). The verbatim-replay fast path
   in _convert_assistant_message replayed the raw block input, so a secret
   a model inlined into a tool call (e.g. an Authorization header value
   passed inside a terminal command) would ride back onto the wire even
   though it is redacted everywhere else in history. Re-source tool_use
   input from the redacted tool_calls map by
   sanitized id; interleave order (the reason this channel exists) is
   unaffected. Adapted from #36071, which re-sources tool inputs the same
   way on its replay path.

2. Broaden the thinking-replay 400 classifier (defense-in-depth).
   error_classifier only matched "signature" + "thinking", so the
   frozen-block variant — "thinking ... blocks in the latest assistant
   message cannot be modified. These blocks must remain as they were in
   the original response." — carried no "signature" token and fell through
   to a non-retryable abort. The anthropic_content_blocks channel prevents
   the reorder that triggers this 400 at the source, but if any future
   mutator reintroduces it, the turn now self-heals via the existing
   strip-reasoning-and-retry recovery instead of crash-looping. A negative
   case ensures an unrelated "cannot be modified" 400 (no "thinking") is
   not swept in. Mirrors the classifier broadening in #36087 and #36071.

Tests
- tests/agent/test_anthropic_thinking_block_order.py: a replay test
  asserting an inlined secret is redacted on the wire while interleave
  order is preserved.
- tests/agent/test_error_classifier.py: three cases — frozen-block 400
  native and via OpenRouter route to thinking_signature/retryable; an
  unrelated "cannot be modified" 400 does not.
Both grafts verified RED (tests fail with the change reverted) then GREEN.
Full adapter, transport, classifier and output-field-leak suites pass.

Co-authored-by: AlexanderBFoley <92330381+AlexanderBFoley@users.noreply.github.com>
2026-06-10 20:45:16 -07:00
RaumfahrerSpiffy
aaccaada28 fix(anthropic): preserve interleaved thinking/tool_use block order on replay
Interleaved-thinking turns (adaptive thinking, Claude 4.6+/Opus 4.8) emit
content blocks like:

    thinking_1(signed) tool_use_1 thinking_2(signed) tool_use_2

Anthropic signs each thinking block against the turn content preceding it
at its position. normalize_response split the turn into two parallel lists
(reasoning_details + tool_calls), discarding cross-type order, and
_convert_assistant_message rebuilt it as [all thinking][text][all tool_use].
That moved thinking_2 ahead of tool_use_1, invalidating its signature, so
Anthropic rejected the latest assistant message with HTTP 400:

    messages.N.content.M: `thinking` or `redacted_thinking` blocks in the
    latest assistant message cannot be modified.

Observed repeatedly in agent.conversation_loop against api.anthropic.com /
claude-opus-4-8, recurring across sessions on multi-thinking-block turns.

Fix: carry a verbatim, order-preserving copy of the turn's content blocks
(anthropic_content_blocks) end-to-end - capture in normalize_response,
persist/restore through state.db, and replay unchanged for the latest
assistant message. Gated to turns that actually interleave signed thinking
with tool_use, so normal turns are unaffected.

Adds 3 regression tests including a SQLite round-trip covering the
crash-recovery reload path.
2026-06-10 20:45:16 -07:00