hermes-agent

mirrors/hermes-agent

Fork 0

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-01 07:01:41 +00:00

Commit graph

Author	SHA1	Message	Date
daimon-nous[bot]	ac5359a3f3	fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998 ) (#32012 ) * fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) When a stream stalls mid-tool-call (e.g. a large write_file), the partial-stream-stub recovery used finish_reason='stop' which caused the conversation loop to treat the turn as complete, returning only the warning text. When users said 'continue', the model retried the same large tool call, hit the same stale timeout, and looped indefinitely. Changes: - chat_completion_helpers.py: change _stub_finish_reason from 'stop' to 'length' for mid-tool-call partials. The stub still has tool_calls=None so no tool auto-executes — the model gets a fresh API call through the existing length-continuation machinery (bounded to 3 retries). Also attach _dropped_tool_names to the stub for downstream use. - conversation_loop.py: add a third continuation prompt branch for partial-stream-stubs with dropped tool calls. Instead of the generic 'continue where you left off' (which would retry the same large call), tell the model to break the output into smaller tool calls (~8K tokens each) to avoid stream timeouts. - test_partial_stream_finish_reason.py: update existing test from finish_reason='stop' to 'length', add _dropped_tool_names assertion, add new test_dropped_tool_call_uses_chunking_prompt for the 3-way prompt branching. Safety: tool_calls=None is preserved on the stub, so the conversation loop enters the text-continuation branch (line 1513), NOT the tool-call execution branch (line 3246). No tool auto-executes. The model simply gets another API call with targeted guidance. * refactor: extract constants and continuation prompt helper - Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH) - Extract _get_continuation_prompt() in conversation_loop.py — DRYs the 3-way prompt branching and lets tests import the real function - Trim verbose inline comments in chat_completion_helpers.py - Tests import constants + helper instead of duplicating logic --------- Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-05-25 17:43:10 +05:30
xxxigm	6cafcf9c77	test(streaming): pin partial-stream-stub finish_reason + continuation contract Three test classes lock in the #30963 fix: 1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call through the two recovery branches and asserts: - text-only partial → finish_reason="length" (the new behaviour), - mid-tool-call partial → finish_reason="stop" (unchanged on purpose). 2. TestLengthContinuationPromptBranching — pure-Python check on the branch that picks the continuation prompt by response.id. Locks the network error wording for partial-stream-stub vs. the output-length wording for everything else. 3. TestConversationLoopPartialStreamContinuation — feeds a stub + continuation pair into run_conversation, verifies the loop makes a second API call (instead of exiting with text_response(stop)), confirms the network-error continuation prompt actually reaches the model on call #2, and that final_response stitches both halves. Refs: NousResearch/hermes-agent#30963	2026-05-24 04:35:15 -07:00

Author

SHA1

Message

Date

daimon-nous[bot]

ac5359a3f3

fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998 ) (#32012 )

* fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998)

When a stream stalls mid-tool-call (e.g. a large write_file), the
partial-stream-stub recovery used finish_reason='stop' which caused the
conversation loop to treat the turn as complete, returning only the
warning text. When users said 'continue', the model retried the same
large tool call, hit the same stale timeout, and looped indefinitely.

Changes:
- chat_completion_helpers.py: change _stub_finish_reason from 'stop' to
'length' for mid-tool-call partials. The stub still has tool_calls=None
so no tool auto-executes — the model gets a fresh API call through the
existing length-continuation machinery (bounded to 3 retries).
Also attach _dropped_tool_names to the stub for downstream use.
- conversation_loop.py: add a third continuation prompt branch for
partial-stream-stubs with dropped tool calls. Instead of the generic
'continue where you left off' (which would retry the same large call),
tell the model to break the output into smaller tool calls (~8K
tokens each) to avoid stream timeouts.
- test_partial_stream_finish_reason.py: update existing test from
finish_reason='stop' to 'length', add _dropped_tool_names assertion,
add new test_dropped_tool_call_uses_chunking_prompt for the 3-way
prompt branching.

Safety: tool_calls=None is preserved on the stub, so the conversation
loop enters the text-continuation branch (line 1513), NOT the tool-call
execution branch (line 3246). No tool auto-executes. The model simply
gets another API call with targeted guidance.

* refactor: extract constants and continuation prompt helper

- Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID,
FINISH_REASON_LENGTH)
- Extract _get_continuation_prompt() in conversation_loop.py — DRYs the
3-way prompt branching and lets tests import the real function
- Trim verbose inline comments in chat_completion_helpers.py
- Tests import constants + helper instead of duplicating logic

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>

2026-05-25 17:43:10 +05:30

xxxigm

6cafcf9c77

test(streaming): pin partial-stream-stub finish_reason + continuation contract

Three test classes lock in the #30963 fix:

1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call
   through the two recovery branches and asserts:
     - text-only partial → finish_reason="length" (the new behaviour),
     - mid-tool-call partial → finish_reason="stop" (unchanged on purpose).

2. TestLengthContinuationPromptBranching — pure-Python check on the branch
   that picks the continuation prompt by response.id. Locks the network
   error wording for partial-stream-stub vs. the output-length wording
   for everything else.

3. TestConversationLoopPartialStreamContinuation — feeds a stub +
   continuation pair into run_conversation, verifies the loop makes a
   second API call (instead of exiting with text_response(stop)),
   confirms the network-error continuation prompt actually reaches the
   model on call #2, and that final_response stitches both halves.

Refs: NousResearch/hermes-agent#30963

2026-05-24 04:35:15 -07:00

2 commits