mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-28 01:21:43 +00:00
When streaming died after text was already delivered to the user but before a tool-call's arguments finished streaming, the partial-stream stub at the end of _interruptible_streaming_api_call silently set `tool_calls=None` on the returned message and kept `finish_reason=stop`. The agent treated the turn as complete, the session exited cleanly with code 0, and the attempted action was lost with zero user-facing signal. Live-observed Apr 2026 with MiniMax M2.7 on a ~6-minute audit task: agent streamed 'Let me write the audit:', started emitting a write_file tool call, MiniMax stalled for 240s mid-arguments, the stale-stream detector killed the connection, the stub fired, session ended, no file written, no error shown. Fix: the streaming accumulator now records each tool-call's name into `result['partial_tool_names']` as soon as the name is known. When the stub builder fires after a partial delivery and finds any recorded tool names, it appends a human-visible warning to the stub's content — and also fires it as a live stream delta so the user sees it immediately, not only in the persisted transcript. The next turn's model also sees the warning in conversation history and can retry on its own. Text-only partial streams keep the original bare-recovery behaviour (no warning). Validation: | Scenario | Before | After | |---------------------------------------------|---------------------------|---------------------------------------------| | Stream dies mid tool-call, text already sent | Silent exit, no indication | User sees ⚠ warning naming the dropped tool | | Text-only partial stream | Bare recovered text | Unchanged | | tests/run_agent/test_streaming.py | 24 passed | 26 passed (2 new) | |
||
|---|---|---|
| .. | ||
| __init__.py | ||
| conftest.py | ||
| test_413_compression.py | ||
| test_860_dedup.py | ||
| test_1630_context_overflow_loop.py | ||
| test_agent_guardrails.py | ||
| test_agent_loop.py | ||
| test_agent_loop_tool_calling.py | ||
| test_agent_loop_vllm.py | ||
| test_anthropic_error_handling.py | ||
| test_async_httpx_del_neuter.py | ||
| test_compression_boundary.py | ||
| test_compression_feasibility.py | ||
| test_compression_persistence.py | ||
| test_compressor_fallback_update.py | ||
| test_concurrent_interrupt.py | ||
| test_context_token_tracking.py | ||
| test_create_openai_client_kwargs_isolation.py | ||
| test_create_openai_client_reuse.py | ||
| test_dict_tool_call_args.py | ||
| test_exit_cleanup_interrupt.py | ||
| test_fallback_model.py | ||
| test_flush_memories_codex.py | ||
| test_interactive_interrupt.py | ||
| test_interrupt_propagation.py | ||
| test_invalid_context_length_warning.py | ||
| test_long_context_tier_429.py | ||
| test_openai_client_lifecycle.py | ||
| test_percentage_clamp.py | ||
| test_plugin_context_engine_init.py | ||
| test_primary_runtime_restore.py | ||
| test_provider_fallback.py | ||
| test_provider_parity.py | ||
| test_real_interrupt_subagent.py | ||
| test_redirect_stdout_issue.py | ||
| test_run_agent.py | ||
| test_run_agent_codex_responses.py | ||
| test_sequential_chats_live.py | ||
| test_session_meta_filtering.py | ||
| test_session_reset_fix.py | ||
| test_streaming.py | ||
| test_strict_api_validation.py | ||
| test_switch_model_context.py | ||
| test_token_persistence_non_cli.py | ||
| test_tool_arg_coercion.py | ||
| test_unicode_ascii_codec.py | ||