hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

daimon-nous[bot] ac5359a3f3	2026-05-25 17:43:10 +05:30

fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012)

* fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998)

When a stream stalls mid-tool-call (e.g. a large write_file), the partial-stream-stub recovery used finish_reason='stop', which caused the conversation loop to treat the turn as complete, returning only the warning text. When users said 'continue', the model retried the same large tool call, hit the same stale timeout, and looped indefinitely.

Changes:
- chat_completion_helpers.py: change _stub_finish_reason from 'stop' to 'length' for mid-tool-call partials. The stub still has tool_calls=None, so no tool auto-executes; the model gets a fresh API call through the existing length-continuation machinery (bounded to 3 retries). Also attach _dropped_tool_names to the stub for downstream use.
- conversation_loop.py: add a third continuation prompt branch for partial-stream-stubs with dropped tool calls. Instead of the generic 'continue where you left off' (which would retry the same large call), tell the model to break the output into smaller tool calls (~8K tokens each) to avoid stream timeouts.
- test_partial_stream_finish_reason.py: update existing test from finish_reason='stop' to 'length', add _dropped_tool_names assertion, add new test_dropped_tool_call_uses_chunking_prompt for the 3-way prompt branching.

Safety: tool_calls=None is preserved on the stub, so the conversation loop enters the text-continuation branch (line 1513), NOT the tool-call execution branch (line 3246). No tool auto-executes. The model simply gets another API call with targeted guidance.

* refactor: extract constants and continuation prompt helper

- Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH)
- Extract _get_continuation_prompt() in conversation_loop.py; this DRYs the 3-way prompt branching and lets tests import the real function
- Trim verbose inline comments in chat_completion_helpers.py
- Tests import constants + helper instead of duplicating logic

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
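
The recovery path described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `StubResponse`, `build_partial_stream_stub`, and `pick_continuation_prompt` are hypothetical names, the constant values are placeholders, the prompt wording is invented, and the two non-chunking branches are assumptions about what the "3-way" branching covers. The real `_get_continuation_prompt()` in conversation_loop.py may have a different signature and different text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Placeholder values: the real constants live in hermes_constants.py and may differ.
PARTIAL_STREAM_STUB_ID = "partial-stream-stub"
FINISH_REASON_LENGTH = "length"


@dataclass
class StubResponse:
    """Hypothetical stand-in for the response built after a stalled stream."""
    id: str
    content: str
    finish_reason: str
    tool_calls: Optional[list] = None
    dropped_tool_names: List[str] = field(default_factory=list)


def build_partial_stream_stub(partial_text: str, dropped: List[str]) -> StubResponse:
    # finish_reason is 'length' (not 'stop') so the loop continues the turn;
    # tool_calls stays None so nothing is auto-executed.
    return StubResponse(
        id=PARTIAL_STREAM_STUB_ID,
        content=partial_text,
        finish_reason=FINISH_REASON_LENGTH,
        tool_calls=None,
        dropped_tool_names=list(dropped),
    )


def pick_continuation_prompt(response: StubResponse) -> str:
    """Assumed 3-way branching; all prompt strings are illustrative only."""
    is_stub = response.id == PARTIAL_STREAM_STUB_ID
    if is_stub and response.dropped_tool_names:
        # Stub with dropped tool calls: ask for smaller chunks, not a retry.
        names = ", ".join(response.dropped_tool_names)
        return (
            f"The stream stalled during a tool call ({names}). "
            "Break the output into smaller tool calls (about 8K tokens each)."
        )
    if is_stub:
        # Stub without dropped tool calls (assumed branch).
        return "The stream was interrupted. Continue where you left off."
    # Ordinary length truncation (assumed branch).
    return "Your response was cut off. Continue where you left off."


stub = build_partial_stream_stub("partial text", ["write_file"])
prompt = pick_continuation_prompt(stub)
```

Keeping `tool_calls=None` on the stub is what makes the change safe in this sketch: the continuation is purely textual guidance, and the retry bound stays with the existing length-continuation logic rather than this helper.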
__init__.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
conftest.py	ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861)	2026-05-19 17:27:24 -07:00
test_413_compression.py	test+polish(compression): pin anti-thrash gate and gateway session_id persistence	2026-05-25 01:44:46 -07:00
test_860_dedup.py	refactor(gateway): stop writing JSONL in append_to_transcript / rewrite_transcript	2026-05-20 13:00:57 -07:00
test_1630_context_overflow_loop.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_31273_402_not_retried.py	fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443)	2026-05-24 15:14:13 -07:00
test_agent_guardrails.py	fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)	2026-05-04 05:06:33 -07:00
test_anthropic_prompt_cache_policy.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778)	2026-05-12 20:46:04 -07:00
test_anthropic_third_party_oauth_guard.py	fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)	2026-04-19 22:43:09 -07:00
test_anthropic_truncation_continuation.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
test_api_max_retries_config.py	feat(agent): make API retry count configurable via agent.api_max_retries (#14730)	2026-04-23 13:59:32 -07:00
test_async_httpx_del_neuter.py	fix(dashboard): UI polish — modals, layout, consistency, test fixes	2026-05-12 13:59:22 -04:00
test_background_review.py	fix(run_agent): isolate background review fork from external memory plugins (#27190)	2026-05-16 20:33:38 -07:00
test_background_review_cache_parity.py	chore: trim verbose comments/docstrings, add AUTHOR_MAP entry	2026-05-21 12:49:21 +05:30
test_background_review_summary.py	fix(agent): exclude prior-history tool messages from background review summary	2026-04-24 03:10:19 -07:00
test_background_review_toolset_restriction.py	chore: trim verbose comments/docstrings, add AUTHOR_MAP entry	2026-05-21 12:49:21 +05:30
test_callable_api_key.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
test_codex_app_server_integration.py	fix(codex-runtime): retire wedged sessions + post-tool watchdog + OAuth refresh classify (#25769)	2026-05-14 07:55:09 -07:00
test_codex_multimodal_tool_result.py	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)	2026-05-09 21:06:19 -07:00
test_codex_silent_hang_hint.py	fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern	2026-05-25 04:49:22 -07:00
test_codex_xai_oauth_recovery.py	test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344)	2026-05-23 02:48:13 -07:00
test_commit_memory_session_context_engine.py	fix(agent): notify context engine on commit_memory_session (#22764)	2026-05-09 12:28:42 -07:00
test_compress_focus_plugin_fallback.py	refactor(memory): remove flush_memories entirely (#15696)	2026-04-25 08:21:14 -07:00
test_compression_boundary.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_compression_boundary_hook.py	fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465)	2026-05-18 21:43:59 -07:00
test_compression_feasibility.py	perf(compression): defer feasibility check to first compression attempt (#28957)	2026-05-19 17:27:17 -07:00
test_compression_persistence.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_compression_trigger_excludes_reasoning.py	fix(compression): exclude completion tokens from compression trigger (#12026)	2026-04-20 05:12:10 -07:00
test_compressor_fallback_update.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_concurrent_interrupt.py	test: remove 50 stale/broken tests to unblock CI (#22098)	2026-05-08 14:55:40 -07:00
test_context_token_tracking.py	refactor(session-log): delete _save_session_log and all callers	2026-05-20 11:44:10 -07:00
test_copilot_native_vision_headers.py	fix(copilot): mark native image requests as vision	2026-04-27 08:35:50 -07:00
test_create_openai_client_kwargs_isolation.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_create_openai_client_proxy_env.py	test(proxy): regression tests for NO_PROXY bypass on keepalive client	2026-04-24 03:04:42 -07:00
test_create_openai_client_reuse.py	fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507)	2026-05-23 02:31:10 -07:00
test_deepseek_reasoning_content_echo.py	fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode	2026-04-30 23:04:23 -07:00
test_deepseek_v4_thinking_live.py	fix(deepseek): preserve v4 reasoning_content on replay	2026-04-30 11:18:39 -07:00
test_dict_tool_call_args.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00
test_empty_response_recovery_persistence.py	refactor(session-log): delete _save_session_log and all callers	2026-05-20 11:44:10 -07:00
test_exit_cleanup_interrupt.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)	2026-04-17 14:21:22 -07:00
test_file_mutation_verifier.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
test_image_rejection_fallback.py	fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602)	2026-05-11 07:37:22 -07:00
test_image_shrink_recovery.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00
test_init_fallback_on_exhausted_pool.py	fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)	2026-05-02 02:09:46 -07:00
test_interactive_interrupt.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_interrupt_propagation.py	test: stop testing mutable data — convert change-detectors to invariants (#13363)	2026-04-20 23:20:33 -07:00
test_invalid_context_length_warning.py	fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation	2026-04-15 22:05:21 -07:00
test_iteration_budget_race.py	fix(run_agent): acquire lock in IterationBudget.used property	2026-05-04 12:37:28 -07:00
test_jsondecodeerror_retryable.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_last_reasoning_per_turn.py	test: pin per-turn reasoning extraction semantics	2026-05-05 05:00:05 -07:00
test_long_context_tier_429.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_materialize_data_url_cleanup.py	fix(misc): three small defensive fixes from PR #1974	2026-05-10 22:28:01 -07:00
test_memory_nudge_counter_hydration.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_memory_provider_init.py	fix(memory): keep Honcho provider opt-in	2026-04-18 22:50:55 -07:00
test_memory_sync_interrupted.py	feat(memory): notify providers on mid-process session_id rotation (#17409)	2026-04-29 04:57:22 -07:00
test_message_sequence_repair.py	fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385)	2026-05-07 08:35:10 -07:00
test_multimodal_tool_content_recovery.py	fix(agent): recover from providers rejecting list-type tool content (#27344) (#30259)	2026-05-21 23:40:16 -07:00
test_openai_client_lifecycle.py	fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults	2026-05-25 01:47:55 -07:00
test_partial_stream_finish_reason.py	fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012)	2026-05-25 17:43:10 +05:30
test_percentage_clamp.py	fix: update 6 test files broken by dead code removal	2026-04-10 03:44:43 -07:00
test_plugin_context_engine_init.py	fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency	2026-05-23 17:38:19 -07:00
test_primary_runtime_restore.py	fix(agent): reset _fallback_index at turn start even when no fallback activated	2026-05-16 17:12:48 -07:00
test_provider_attribution_headers.py	feat(nvidia): add NIM billing origin header	2026-05-15 14:06:51 -07:00
test_provider_fallback.py	fix(fallback): skip chain entries matching current provider/model/base_url (#22780)	2026-05-09 12:48:19 -07:00
test_provider_parity.py	fix(tests): stabilize xai env and provider parity	2026-05-17 11:55:25 -07:00
test_real_interrupt_subagent.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00
test_redirect_stdout_issue.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_repair_tool_call_arguments.py	fix(run_agent): handle unescaped control chars in tool_call arguments (#15356)	2026-04-24 15:06:41 -07:00
test_repair_tool_call_name.py	fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124)	2026-04-24 05:32:08 -07:00
test_review_prompt_class_first.py	fix(review): tell background reviewer not to capture transient env failures as skills (#23004)	2026-05-09 22:51:25 -07:00
test_run_agent.py	fix(security): redact credentials before persistence in session capture	2026-05-24 17:58:25 -07:00
test_run_agent_codex_responses.py	fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults	2026-05-25 01:47:55 -07:00
test_run_agent_multimodal_prologue.py	refactor: unify transport dispatch + collapse normalize shims	2026-04-22 18:34:25 -07:00
test_sequential_chats_live.py	test: regression guards for the keepalive/transport bug class (#10933) (#11266)	2026-04-16 16:36:33 -07:00
test_session_id_env.py	feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847)	2026-05-12 00:16:45 +05:30
test_session_meta_filtering.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_session_reset_fix.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_steer.py	refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340)	2026-04-20 22:18:49 -07:00
test_stream_drop_logging.py	feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)	2026-05-09 22:49:35 -07:00
test_stream_interrupt_retry.py	fix: /stop now immediately aborts streaming retry loop	2026-04-25 09:51:39 -07:00
test_streaming.py	fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)	2026-05-16 17:09:41 -07:00
test_streaming_tool_call_repair.py	chore: remove Atropos RL environments and tinker-atropos integration (#26106)	2026-05-15 10:36:38 +05:30
test_strict_api_validation.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_strip_reasoning_tags_cli.py	fix(display): strip standalone tool-call XML tags from visible text	2026-04-22 18:12:42 -07:00
test_switch_model_context.py	test(ci): stabilize shared optional dependency baselines	2026-05-13 17:32:22 -07:00
test_switch_model_fallback_prune.py	fix(agent): default missing fallback chain on switch	2026-04-24 05:35:43 -07:00
test_thinking_only_sanitizer.py	fix(agent): drop thinking-only assistant turns before provider call (#16959)	2026-04-28 03:50:51 -07:00
test_tls_fd_recycle_corruption.py	test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507)	2026-05-23 02:31:10 -07:00
test_token_persistence_non_cli.py	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
test_tool_arg_coercion.py	fix(tools): wrap bare scalars in single-element list for array-typed args	2026-05-04 05:00:37 -07:00
test_tool_call_args_sanitizer.py	ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861)	2026-05-19 17:27:24 -07:00
test_tool_call_guardrail_runtime.py	test(guardrail): assert halt message reaches stream_delta_callback	2026-05-24 07:38:24 -07:00
test_tool_executor_contextvar_propagation.py	refactor(run_agent): extract tool execution to agent/tool_executor.py	2026-05-16 18:24:05 -07:00
test_tool_name_db_persistence.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
test_unicode_ascii_codec.py	fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization	2026-04-15 15:03:28 -07:00
test_vision_aware_preprocessing.py	fix(agent): resolve supports_vision override for named custom providers	2026-05-20 23:27:10 -07:00