Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-05-18 04:41:56 +00:00)
The previous PR (#22993) gave us a structured WARNING per stream drop, but the only diagnostic was 'error_type=APIError error=Network connection lost.', the same nothing the user started with. To actually diagnose why subagents drop streams disproportionately, we need to know WHERE the drop happened.

Adds three breadcrumbs to the agent.log WARNING:

1. Inner exception chain. The openai SDK wraps httpx errors as APIConnectionError / APIError, so the catch site only sees the wrapper. _flatten_exception_chain walks __cause__/__context__ up to 4 levels deep and renders 'Outer(msg) <- Inner(msg)' so we can tell ConnectError vs RemoteProtocolError vs ReadError vs ProxyError without enabling verbose mode.

2. Upstream HTTP headers. Snapshots cf-ray, x-openrouter-provider, x-openrouter-model, x-openrouter-id, x-request-id, server, via, etc. from stream.response immediately after open (so they survive even when the stream dies before the first chunk). These answer 'is one CF edge / one downstream provider responsible, or random?'

3. Per-attempt counters. Bytes streamed, chunk count, elapsed time on the dying attempt, and time-to-first-byte. Distinguishes 'couldn't connect at all' (0s, 0 bytes) from 'died after 30s mid-stream'. These have very different root causes: the first is auth/routing, the second is upstream idle-kill or proxy timeout.

Plumbing:

- _stream_diag_init / _stream_diag_capture_response live on AIAgent and produce a per-attempt dict held on request_client_holder['diag'] for closure access from the retry block.
- _call_chat_completions and _call_anthropic both initialize the diag and increment counters per chunk/event (best-effort, never raises in the streaming hot path).
- _log_stream_retry / _emit_stream_drop accept an optional diag and render the new fields. The final-exhaustion log goes through the same helper, so it gets the same diagnostic dump.
- The UI status line gains a brief 'after Xs' suffix when timing is available, which distinguishes 'connect failed' from 'died mid-stream' at a glance without grepping logs.

Sample WARNING after this change:

    Stream drop mid tool-call on attempt 2/3 — retrying.
    subagent_id=sa-2-cafef00d depth=1 provider=openrouter
    base_url=https://openrouter.ai/api/v1
    error_type=APIError error=Connection error.
    chain=APIError(Connection error.) <- RemoteProtocolError(peer closed connection without sending complete message body)
    http_status=200 bytes=12400 chunks=47 elapsed=12.00s ttfb=0.83s
    upstream=[cf-ray=8f1a2b3c4d5e6f7g-LAX x-openrouter-provider=Anthropic x-openrouter-id=gen-abc123 server=cloudflare]

Tests: 10, covering diag init, header capture (whitelist enforced for PII), exception-chain walking + depth cap, log content with full diag, log content without diag (placeholders), and the UI elapsed-suffix on/off.
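The first two breadcrumbs can be illustrated with a minimal sketch. This is not the repository's code: the function names, the constants, and the choice to cap the chain at four entries in total are assumptions made for illustration, and the allowlist contains only the headers the commit message names explicitly.

```python
from typing import Mapping, Optional

# Hypothetical constants; the real helper's depth semantics and header
# list may differ (the commit message ends its header list with "etc.").
MAX_CHAIN_DEPTH = 4
HEADER_ALLOWLIST = (
    "cf-ray",
    "x-openrouter-provider",
    "x-openrouter-model",
    "x-openrouter-id",
    "x-request-id",
    "server",
    "via",
)


def flatten_exception_chain(exc: BaseException,
                            max_depth: int = MAX_CHAIN_DEPTH) -> str:
    """Render 'Outer(msg) <- Inner(msg)' by following __cause__/__context__."""
    parts = []
    seen = set()  # guards against cyclic exception chains
    current: Optional[BaseException] = exc
    while (current is not None
           and len(parts) < max_depth
           and id(current) not in seen):
        seen.add(id(current))
        parts.append(f"{type(current).__name__}({current})")
        # Prefer the explicit cause ("raise ... from ..."), else the
        # implicit context set while handling another exception.
        current = current.__cause__ or current.__context__
    return " <- ".join(parts)


def snapshot_headers(headers: Mapping[str, str]) -> dict:
    """Copy only allowlisted headers so cookies and tokens never reach logs."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {name: lowered[name] for name in HEADER_ALLOWLIST if name in lowered}


if __name__ == "__main__":
    class RemoteProtocolError(Exception):
        """Stand-in for the httpx exception of the same name."""

    class APIError(Exception):
        """Stand-in for the openai SDK wrapper exception."""

    try:
        try:
            raise RemoteProtocolError("peer closed connection")
        except RemoteProtocolError as inner:
            raise APIError("Connection error.") from inner
    except APIError as err:
        print(flatten_exception_chain(err))
        # APIError(Connection error.) <- RemoteProtocolError(peer closed connection)

    print(snapshot_headers({"CF-Ray": "abc-LAX", "Set-Cookie": "secret"}))
    # {'cf-ray': 'abc-LAX'}
```

Using an allowlist rather than a blocklist means a header the code has never seen is dropped by default, which matches the commit's stated goal of enforcing the whitelist for PII.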
| | | |
|---|---|---|
| __init__.py | ||
| conftest.py | ||
| test_413_compression.py | ||
| test_860_dedup.py | ||
| test_1630_context_overflow_loop.py | ||
| test_agent_guardrails.py | ||
| test_agent_loop.py | ||
| test_agent_loop_tool_calling.py | ||
| test_agent_loop_vllm.py | ||
| test_anthropic_error_handling.py | ||
| test_anthropic_prompt_cache_policy.py | ||
| test_anthropic_third_party_oauth_guard.py | ||
| test_anthropic_truncation_continuation.py | ||
| test_api_max_retries_config.py | ||
| test_async_httpx_del_neuter.py | ||
| test_background_review.py | ||
| test_background_review_summary.py | ||
| test_background_review_toolset_restriction.py | ||
| test_codex_multimodal_tool_result.py | ||
| test_commit_memory_session_context_engine.py | ||
| test_compress_focus_plugin_fallback.py | ||
| test_compression_boundary.py | ||
| test_compression_boundary_hook.py | ||
| test_compression_feasibility.py | ||
| test_compression_persistence.py | ||
| test_compression_trigger_excludes_reasoning.py | ||
| test_compressor_fallback_update.py | ||
| test_concurrent_interrupt.py | ||
| test_context_token_tracking.py | ||
| test_copilot_native_vision_headers.py | ||
| test_create_openai_client_kwargs_isolation.py | ||
| test_create_openai_client_proxy_env.py | ||
| test_create_openai_client_reuse.py | ||
| test_deepseek_reasoning_content_echo.py | ||
| test_deepseek_v4_thinking_live.py | ||
| test_dict_tool_call_args.py | ||
| test_empty_response_recovery_persistence.py | ||
| test_exit_cleanup_interrupt.py | ||
| test_fallback_model.py | ||
| test_image_rejection_fallback.py | ||
| test_image_shrink_recovery.py | ||
| test_init_fallback_on_exhausted_pool.py | ||
| test_interactive_interrupt.py | ||
| test_interrupt_propagation.py | ||
| test_invalid_context_length_warning.py | ||
| test_iteration_budget_race.py | ||
| test_jsondecodeerror_retryable.py | ||
| test_last_reasoning_per_turn.py | ||
| test_long_context_tier_429.py | ||
| test_memory_nudge_counter_hydration.py | ||
| test_memory_provider_init.py | ||
| test_memory_sync_interrupted.py | ||
| test_message_sequence_repair.py | ||
| test_openai_client_lifecycle.py | ||
| test_percentage_clamp.py | ||
| test_plugin_context_engine_init.py | ||
| test_primary_runtime_restore.py | ||
| test_provider_attribution_headers.py | ||
| test_provider_fallback.py | ||
| test_provider_parity.py | ||
| test_real_interrupt_subagent.py | ||
| test_redirect_stdout_issue.py | ||
| test_repair_tool_call_arguments.py | ||
| test_repair_tool_call_name.py | ||
| test_review_prompt_class_first.py | ||
| test_run_agent.py | ||
| test_run_agent_codex_responses.py | ||
| test_run_agent_multimodal_prologue.py | ||
| test_sequential_chats_live.py | ||
| test_session_meta_filtering.py | ||
| test_session_reset_fix.py | ||
| test_steer.py | ||
| test_stream_drop_logging.py | ||
| test_stream_interrupt_retry.py | ||
| test_streaming.py | ||
| test_streaming_tool_call_repair.py | ||
| test_strict_api_validation.py | ||
| test_strip_reasoning_tags_cli.py | ||
| test_switch_model_context.py | ||
| test_switch_model_fallback_prune.py | ||
| test_thinking_only_sanitizer.py | ||
| test_token_persistence_non_cli.py | ||
| test_tool_arg_coercion.py | ||
| test_tool_call_args_sanitizer.py | ||
| test_tool_call_guardrail_runtime.py | ||
| test_tool_executor_contextvar_propagation.py | ||
| test_unicode_ascii_codec.py | ||
| test_vision_aware_preprocessing.py | ||