Drop the hermes_state.py column + persistence plumbing from the salvaged
interleaved-thinking fix. The ordered-block channel covers the failure
window in-memory (turn replayed within the live conversation loop). A
session reloaded from disk after a crash falls back to reconstruction;
if that replay 400s, the thinking-signature recovery (#43667) strips
reasoning_details and retries — one degraded call in a rare resume path
instead of a schema column. Replaces the DB-roundtrip test with a
fallback-shape test.
Two additive hardening changes on the interleaved-thinking replay path
introduced by this PR's anthropic_content_blocks channel. Both are scoped
to that channel's blast radius; neither changes correct behavior.
1. Replay-time tool-input re-sourcing (credential safety).
The ordered-block channel captures each tool_use `input` from the RAW
API response in normalize_response, which is NOT credential-redacted.
The parallel tool_calls[].function.arguments IS redacted at storage
time (build_assistant_message, #19798). The verbatim-replay fast path
in _convert_assistant_message replayed the raw block input, so a secret
a model inlined into a tool call (e.g. an Authorization header value
passed inside a terminal command) would ride back onto the wire even
though it is redacted everywhere else in history. Re-source tool_use
input from the redacted tool_calls map by
sanitized id; interleave order (the reason this channel exists) is
unaffected. Adapted from #36071, which re-sources tool inputs the same
way on its replay path.
2. Broaden the thinking-replay 400 classifier (defense-in-depth).
error_classifier only matched "signature" + "thinking", so the
frozen-block variant — "thinking ... blocks in the latest assistant
message cannot be modified. These blocks must remain as they were in
the original response." — carried no "signature" token and fell through
to a non-retryable abort. The anthropic_content_blocks channel prevents
the reorder that triggers this 400 at the source, but if any future
mutator reintroduces it, the turn now self-heals via the existing
strip-reasoning-and-retry recovery instead of crash-looping. A negative
case ensures an unrelated "cannot be modified" 400 (no "thinking") is
not swept in. Mirrors the classifier broadening in #36087 and #36071.
Tests
- tests/agent/test_anthropic_thinking_block_order.py: a replay test
asserting an inlined secret is redacted on the wire while interleave
order is preserved.
- tests/agent/test_error_classifier.py: three cases — frozen-block 400
native and via OpenRouter route to thinking_signature/retryable; an
unrelated "cannot be modified" 400 does not.
Both grafts verified RED (tests fail with the change reverted) then GREEN.
Full adapter, transport, classifier and output-field-leak suites pass.
Co-authored-by: AlexanderBFoley <92330381+AlexanderBFoley@users.noreply.github.com>
Interleaved-thinking turns (adaptive thinking, Claude 4.6+/Opus 4.8) emit
content blocks like:
thinking_1(signed) tool_use_1 thinking_2(signed) tool_use_2
Anthropic signs each thinking block against the turn content preceding it
at its position. normalize_response split the turn into two parallel lists
(reasoning_details + tool_calls), discarding cross-type order, and
_convert_assistant_message rebuilt it as [all thinking][text][all tool_use].
That moved thinking_2 ahead of tool_use_1, invalidating its signature, so
Anthropic rejected the latest assistant message with HTTP 400:
messages.N.content.M: `thinking` or `redacted_thinking` blocks in the
latest assistant message cannot be modified.
Observed repeatedly in agent.conversation_loop against api.anthropic.com /
claude-opus-4-8, recurring across sessions on multi-thinking-block turns.
Fix: carry a verbatim, order-preserving copy of the turn's content blocks
(anthropic_content_blocks) end-to-end - capture in normalize_response,
persist/restore through state.db, and replay unchanged for the latest
assistant message. Gated to turns that actually interleave signed thinking
with tool_use, so normal turns are unaffected.
Adds 3 regression tests including a SQLite round-trip covering the
crash-recovery reload path.