hermes-agent/agent
daimon-nous[bot] ac5359a3f3
fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012)
* fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998)

When a stream stalls mid-tool-call (e.g. a large write_file), the
partial-stream-stub recovery used finish_reason='stop' which caused the
conversation loop to treat the turn as complete, returning only the
warning text. When users said 'continue', the model retried the same
large tool call, hit the same stale timeout, and looped indefinitely.

Changes:
- chat_completion_helpers.py: change _stub_finish_reason from 'stop' to
  'length' for mid-tool-call partials. The stub still has tool_calls=None
  so no tool auto-executes — the model gets a fresh API call through the
  existing length-continuation machinery (bounded to 3 retries).
  Also attach _dropped_tool_names to the stub for downstream use.
- conversation_loop.py: add a third continuation prompt branch for
  partial-stream-stubs with dropped tool calls. Instead of the generic
  'continue where you left off' (which would retry the same large call),
  tell the model to break the output into smaller tool calls (~8K
  tokens each) to avoid stream timeouts.
- test_partial_stream_finish_reason.py: update existing test from
  finish_reason='stop' to 'length', add _dropped_tool_names assertion,
  add new test_dropped_tool_call_uses_chunking_prompt for the 3-way
  prompt branching.

Safety: tool_calls=None is preserved on the stub, so the conversation
loop enters the text-continuation branch (line 1513), NOT the tool-call
execution branch (line 3246). No tool auto-executes. The model simply
gets another API call with targeted guidance.

* refactor: extract constants and continuation prompt helper

- Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID,
  FINISH_REASON_LENGTH)
- Extract _get_continuation_prompt() in conversation_loop.py — DRYs the
  3-way prompt branching and lets tests import the real function
- Trim verbose inline comments in chat_completion_helpers.py
- Tests import constants + helper instead of duplicating logic

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-25 17:43:10 +05:30
..
lsp chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
secret_sources perf(cli): cut hermes startup 63% — flip head-to-head vs codex (#31968) 2026-05-25 03:06:39 -07:00
transports fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults 2026-05-25 01:47:55 -07:00
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
account_usage.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
agent_init.py fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches 2026-05-23 17:46:55 -07:00
agent_runtime_helpers.py fix(compressor): propagate api_mode and fix root logger calls 2026-05-23 17:38:19 -07:00
anthropic_adapter.py fix(security): close TOCTOU window when saving Claude Code OAuth credentials (#21152) 2026-05-24 17:45:12 -07:00
async_utils.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
auxiliary_client.py fix(model): include Premium+ in xAI OAuth label 2026-05-24 18:12:16 -07:00
azure_identity_adapter.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
background_review.py fix(background-review): allow pinned skills to be improved 2026-05-23 22:57:42 -07:00
bedrock_adapter.py chore(deps): lazy-install boto3/botocore for bedrock adapter 2026-05-17 02:31:18 -07:00
browser_provider.py fix(browser): self-review pass — dead-import, log levels, future-proofing 2026-05-17 04:04:15 -07:00
browser_registry.py fix(browser): self-review pass — dead-import, log levels, future-proofing 2026-05-17 04:04:15 -07:00
chat_completion_helpers.py fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012) 2026-05-25 17:43:10 +05:30
codex_responses_adapter.py fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults 2026-05-25 01:47:55 -07:00
codex_runtime.py fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184) 2026-05-16 23:41:09 -07:00
context_compressor.py fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency 2026-05-23 17:38:19 -07:00
context_engine.py fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency 2026-05-23 17:38:19 -07:00
context_references.py fix(agent): fall back when rg is blocked for @folder references 2026-04-20 01:56:41 -07:00
conversation_compression.py fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches 2026-05-23 17:46:55 -07:00
conversation_loop.py fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012) 2026-05-25 17:43:10 +05:30
copilot_acp_client.py fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes 2026-05-19 00:12:41 -07:00
credential_persistence.py fix: avoid persisting borrowed credential secrets (#31416) 2026-05-25 00:32:08 -07:00
credential_pool.py fix: avoid persisting borrowed credential secrets (#31416) 2026-05-25 00:32:08 -07:00
credential_sources.py fix(model): include Premium+ in xAI OAuth label 2026-05-24 18:12:16 -07:00
curator.py feat(curator): hint at hermes curator pin in the rename block (#23212) 2026-05-10 06:44:53 -07:00
curator_backup.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
display.py feat(cli): show todo progress as done/total fraction 2026-05-23 21:03:51 -07:00
error_classifier.py fix(error-classifier): treat 5xx request-validation errors as non-retryable 2026-05-24 15:15:37 -07:00
file_safety.py fix(security): block read_file on project-local .env files 2026-05-25 03:40:47 -07:00
gemini_cloudcode_adapter.py fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks 2026-05-14 08:03:56 -07:00
gemini_native_adapter.py fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482) 2026-05-11 11:13:20 -07:00
gemini_schema.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
google_code_assist.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
google_oauth.py fix(security): guard os.chmod(parent) against / and top-level dirs 2026-05-20 22:56:55 -07:00
i18n.py feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914) 2026-05-10 07:14:14 -07:00
image_gen_provider.py fix(image_gen): cache xAI ephemeral URL responses to disk (#26942) (#31759) 2026-05-24 18:10:47 -07:00
image_gen_registry.py fix(plugins): filter resolution by is_available() in web + image_gen registries 2026-05-13 22:31:28 -07:00
image_routing.py fix(agent): consult supports_vision override in auto-mode routing 2026-05-20 23:27:10 -07:00
insights.py Merge branch 'main' into feat/dashboard-skill-analytics 2026-04-20 05:25:49 -07:00
iteration_budget.py refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget 2026-05-16 17:59:32 -07:00
lmstudio_reasoning.py feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
manual_compression_feedback.py fix(compression): include system prompt + tool schemas in token estimates (#18265) 2026-04-30 23:03:54 -07:00
markdown_tables.py fix(cli): vertical fallback for markdown tables wider than terminal (#23948) 2026-05-11 16:49:13 -07:00
memory_manager.py 🐛 fix(memory): require newline after context tag 2026-05-18 10:53:08 -07:00
memory_provider.py docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings 2026-05-05 13:33:49 -07:00
message_sanitization.py refactor(run_agent): extract message sanitization to agent/message_sanitization.py 2026-05-16 17:41:09 -07:00
model_metadata.py fix(compressor): propagate api_mode and fix root logger calls 2026-05-23 17:38:19 -07:00
models_dev.py fix(xai): resolve Grok Build context for OAuth 2026-05-22 13:05:36 -07:00
moonshot_schema.py fix(moonshot): strip $ref siblings and collapse tuple items in tool schemas (#27104) 2026-05-16 13:02:19 -07:00
nous_rate_guard.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
onboarding.py docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507) 2026-04-29 08:08:36 -07:00
plugin_llm.py feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194) 2026-05-10 07:09:28 -07:00
portal_tags.py feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779) 2026-05-12 20:49:20 -07:00
process_bootstrap.py refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget 2026-05-16 17:59:32 -07:00
prompt_builder.py refactor(ntfy): convert built-in adapter to platform plugin 2026-05-23 16:13:01 -07:00
prompt_caching.py fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778) 2026-05-12 20:46:04 -07:00
rate_limit_tracker.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
redact.py fix(debug): redact BlueBubbles webhook secrets 2026-05-24 15:43:48 -07:00
retry_utils.py feat(agent): add jittered retry backoff 2026-04-08 00:41:36 -07:00
shell_hooks.py fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes 2026-05-19 00:12:41 -07:00
skill_bundles.py feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373) 2026-05-18 21:38:05 -07:00
skill_commands.py fix(skills): load symlinked skill slash commands 2026-05-18 00:34:29 -07:00
skill_preprocessing.py fix: treat inline-shell timeout guard as timeout 2026-05-18 19:36:04 -07:00
skill_utils.py fix(skills): load Linux-tagged skills on Termux (android sys.platform) 2026-05-21 19:08:38 -07:00
stream_diag.py refactor(run_agent): extract stream diagnostics to agent/stream_diag.py 2026-05-16 18:28:17 -07:00
subdirectory_hints.py fix(agent): catch PermissionError in subdirectory hint discovery 2026-04-09 03:10:30 -07:00
system_prompt.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
think_scrubber.py fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184) 2026-05-05 04:33:38 -07:00
title_generator.py fix: improve telegram topic mode setup 2026-05-04 12:07:17 -07:00
tool_dispatch_helpers.py fix(agent): set tool_name on tool-result messages at construction time 2026-05-19 20:49:11 +01:00
tool_executor.py fix(cli): surface tool failures with specific error messages 2026-05-23 21:03:51 -07:00
tool_guardrails.py fix: add recovery hints to loop guard warnings 2026-05-19 00:12:12 -07:00
tool_result_classification.py fix: classify landed file mutations with diagnostics 2026-05-13 06:46:23 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
transcription_provider.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
transcription_registry.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
tts_provider.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
tts_registry.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
usage_pricing.py fix(pricing): add deepseek-v4-pro to official docs pricing table 2026-05-12 16:32:57 -07:00
video_gen_provider.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
video_gen_registry.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
web_search_provider.py fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup 2026-05-13 22:31:28 -07:00
web_search_registry.py fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup 2026-05-13 22:31:28 -07:00