hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-03 12:23:08 +00:00

Teknium 126cbffb8a feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)	2026-05-09 22:49:35 -07:00

The previous PR (#22993) gave us a structured WARNING per stream drop, but the only diagnostic was 'error_type=APIError error=Network connection lost.', the same nothing the user started with. To actually diagnose why subagents drop streams disproportionately we need to know WHERE the drop happened.

Adds three breadcrumbs to the agent.log WARNING:

1. Inner exception chain. openai SDK wraps httpx errors as APIConnectionError / APIError so the catch site only sees the wrapper. _flatten_exception_chain walks __cause__/__context__ up to 4 levels deep and renders 'Outer(msg) <- Inner(msg)' so we can tell ConnectError vs RemoteProtocolError vs ReadError vs ProxyError without enabling verbose mode.

2. Upstream HTTP headers. Snapshots cf-ray, x-openrouter-provider, x-openrouter-model, x-openrouter-id, x-request-id, server, via, etc. from stream.response immediately after open (so they survive even when the stream dies before the first chunk). These answer 'is one CF edge / one downstream provider responsible, or random?'

3. Per-attempt counters. bytes streamed, chunk count, elapsed time on the dying attempt, and time-to-first-byte. Distinguishes 'couldn't connect at all' (0s, 0 bytes) from 'died after 30s mid-stream' (very different root causes: the first is auth/routing, the second is upstream idle-kill or proxy timeout).

Plumbing:
- _stream_diag_init / _stream_diag_capture_response live on AIAgent and produce a per-attempt dict held on request_client_holder['diag'] for closure access from the retry block.
- _call_chat_completions and _call_anthropic both initialize the diag and increment counters per chunk/event (best-effort, never raises in the streaming hot path).
- _log_stream_retry / _emit_stream_drop accept an optional diag and render the new fields. Final-exhaustion log goes through the same helper so it gets the same diagnostic dump.
- UI status line gains a brief 'after Xs' suffix when timing is available; it distinguishes 'connect failed' from 'died mid-stream' at a glance without grepping logs.

Sample WARNING after this change:

  Stream drop mid tool-call on attempt 2/3 — retrying.
  subagent_id=sa-2-cafef00d depth=1 provider=openrouter base_url=https://openrouter.ai/api/v1
  error_type=APIError error=Connection error.
  chain=APIError(Connection error.) <- RemoteProtocolError(peer closed connection without sending complete message body)
  http_status=200 bytes=12400 chunks=47 elapsed=12.00s ttfb=0.83s
  upstream=[cf-ray=8f1a2b3c4d5e6f7g-LAX x-openrouter-provider=Anthropic x-openrouter-id=gen-abc123 server=cloudflare]

Tests: 10 covering diag init, header capture (whitelist enforced for PII), exception-chain walking + depth cap, log content with full diag, log content without diag (placeholders), UI elapsed-suffix on/off.
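
The first two breadcrumbs described in the commit message can be illustrated with a minimal Python sketch. This is not the code from the commit: the function names, the constant names, and the exact whitelist contents below are hypothetical (the message lists some headers and ends with "etc."), and the reading of "up to 4 levels deep" as "at most 4 exceptions rendered" is an assumption.

```python
from typing import Dict, List, Mapping, Optional

# Assumption: the depth cap of 4 means at most 4 exceptions are rendered.
MAX_CHAIN_DEPTH = 4

# Hypothetical whitelist built from the headers named in the commit message;
# the real list is not shown there and may contain more entries.
HEADER_WHITELIST = (
    "cf-ray",
    "x-openrouter-provider",
    "x-openrouter-model",
    "x-openrouter-id",
    "x-request-id",
    "server",
    "via",
)


def flatten_exception_chain(exc: BaseException, max_depth: int = MAX_CHAIN_DEPTH) -> str:
    """Render 'Outer(msg) <- Inner(msg)' by following __cause__ / __context__."""
    parts: List[str] = []
    seen = set()  # guards against cyclic exception chains
    current: Optional[BaseException] = exc
    while current is not None and len(parts) < max_depth and id(current) not in seen:
        seen.add(id(current))
        parts.append(f"{type(current).__name__}({current})")
        # Prefer the explicit cause ('raise X from Y'), else the implicit context.
        current = current.__cause__ or current.__context__
    return " <- ".join(parts)


def snapshot_upstream_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Keep only whitelisted headers (case-insensitive) so nothing sensitive is logged."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in HEADER_WHITELIST if name in lowered}
```

With a wrapper raised `from` an inner error, `flatten_exception_chain` yields a single string in the same shape as the `chain=` field of the sample WARNING, and `snapshot_upstream_headers` drops anything outside the whitelist (for example `Set-Cookie`).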
..
acp	fix(acp): preserve assistant reasoning metadata in session persistence	2026-05-05 10:18:28 -07:00
acp_adapter	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
agent	feat(curator): show rename map in user-visible summary (#22910)	2026-05-09 18:43:40 -07:00
cli	fix(cli): preserve config comments on setting writes	2026-05-09 17:55:12 -07:00
cron	fix(cron): avoid github skill false positives in scanner	2026-05-09 11:11:45 -07:00
e2e	fix(gateway): move quick-command dispatch before built-in handlers	2026-05-04 01:39:23 -07:00
environments/benchmarks
fakes
gateway	fix(gateway): degrade gracefully when all platform adapters are missing	2026-05-09 17:53:46 -07:00
hermes_cli	fix(kanban): /kanban slash command emits argparse garbage instead of help	2026-05-09 22:49:29 -07:00
hermes_state	fix(resume): redirect --resume to the descendant that actually holds the messages	2026-04-24 03:04:42 -07:00
honcho_plugin	feat(honcho): explain why when honcho_profile returns an empty card	2026-04-27 12:37:33 -07:00
integration	fix(discord): strip RTP padding before DAVE/Opus decode (#11267)	2026-04-16 16:50:15 -07:00
openviking_plugin	fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints	2026-04-30 02:35:29 -07:00
plugins	fix(kanban): request default board explicitly (#21819)	2026-05-09 19:31:32 -07:00
providers	feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)	2026-05-09 14:47:00 -07:00
run_agent	feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)	2026-05-09 22:49:35 -07:00
skills	fix(deps): declare youtube-transcript-api in pyproject.toml [youtube] extra	2026-05-09 13:36:01 -07:00
stress	fix(kanban): gate claim + unblock on parent completion	2026-05-09 11:07:37 -07:00
tools	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)	2026-05-09 21:06:19 -07:00
tui_gateway	fix(tui): close slash parity gaps with CLI (#20339)	2026-05-05 15:42:39 -05:00
website	docs(skills): explain restoring bundled skills	2026-05-05 13:46:20 -07:00
__init__.py
conftest.py	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)	2026-05-09 21:06:19 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py
test_cli_manual_compress.py	test(cli): regression test for manual /compress system_message	2026-04-28 05:21:49 -07:00
test_cli_skin_integration.py	fix(ci): stabilize main test suite regressions (#17660)	2026-04-29 23:18:55 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py	feat: add OSS Security Forensics skill (Skills Hub) (#1482)	2026-03-15 21:59:53 -07:00
test_get_tool_definitions_cache_isolation.py	fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335)	2026-04-30 04:32:06 -07:00
test_hermes_bootstrap.py	fix(entry-points): guard hermes_bootstrap import so partial updates don't brick hermes (#22091)	2026-05-08 14:43:13 -07:00
test_hermes_constants.py	test(hermes_constants): cover parse_reasoning_effort()	2026-05-07 09:59:07 -07:00
test_hermes_home_profile_warning.py	fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746)	2026-05-02 01:49:55 -07:00
test_hermes_logging.py
test_hermes_state.py	fix(session): route OR-combined short CJK tokens to LIKE fallback (#20494)	2026-05-09 17:53:02 -07:00
test_hermes_state_wal_fallback.py	fix(sqlite): fall back to journal_mode=DELETE on NFS/SMB/FUSE (#22043)	2026-05-09 02:09:35 -07:00
test_honcho_client_config.py
test_install_sh_pythonpath_sanitization.py	fix: harden install.sh against inherited Python env leakage	2026-05-06 04:02:02 -07:00
test_install_sh_setup_wizard_tty_probe.py	fix(install): widen /dev/tty open-probe to sibling gates (#16746)	2026-04-28 06:45:55 -07:00
test_install_sh_termux_network_prereqs.py	fix: strengthen termux install network prerequisites	2026-05-07 13:04:08 -07:00
test_ipv4_preference.py
test_lazy_session_regressions.py	fix: resolve lazy session creation regressions (#18370 fallout) (#20363)	2026-05-06 01:11:49 +05:30
test_lint_config.py	lint: enable PLW1514 as a blocking ruff rule	2026-05-08 14:27:40 -07:00
test_mcp_serve.py	fix(mcp): unwrap platforms key in channels_list	2026-05-07 13:41:16 -07:00
test_mini_swe_runner.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_minimax_model_validation.py
test_minimax_oauth.py	test(cli): cover minimax-oauth resolution, refresh, menu wiring	2026-04-29 09:53:42 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py	fix(plugins): stop firing pre_tool_call hook twice per tool execution (#17611)	2026-04-29 12:43:39 -07:00
test_model_tools_async_bridge.py	fix(model_tools): cancel coroutine on timeout so worker thread exits + log full traceback	2026-04-29 05:00:40 -07:00
test_ollama_num_ctx.py	fix: provider/model resolution, salvage 4 PRs + MiniMax aux URL fix (#5983)	2026-04-07 22:23:28 -07:00
test_packaging_metadata.py
test_plugin_skills.py	fix(skills): support category-qualified local skill names	2026-05-05 10:15:31 -07:00
test_process_loop_event_loop_warning.py	fix(cli): replace get_event_loop() with get_running_loop() to silence RuntimeWarning in process_loop thread (#19285)	2026-05-07 06:35:54 -07:00
test_project_metadata.py
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py	fix: per-profile subprocess HOME isolation (#4426) (#7357)	2026-04-10 13:37:45 -07:00
test_termux_all_extra_compat.py	fix: add termux-all install profile and safe fallbacks	2026-05-07 13:04:08 -07:00
test_timezone.py
test_toolset_distributions.py
test_toolsets.py	fix: merge plugin tools into builtin toolsets	2026-05-05 10:14:17 -07:00
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py	test+docs: cover transform_llm_output hook + release author map	2026-05-07 05:46:05 -07:00
test_transform_tool_result_hook.py
test_tui_gateway_server.py	Merge pull request #20942 from NousResearch/austin/fix/personality	2026-05-07 18:54:29 -04:00
test_utils_truthy_values.py
test_yuanbao_integration.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_markdown.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_pipeline.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_proto.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00