mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-29 06:31:32 +00:00
* refactor(codex): drop SDK responses.stream() helper; consume events directly

  The OpenAI Python SDK's high-level `client.responses.stream(...)` helper does post-hoc typed reconstruction from the terminal `response.completed.response.output` field. The chatgpt.com Codex backend has been observed (today, gpt-5.5) to ship `response.output = null` on terminal frames, which crashes the SDK with `TypeError: 'NoneType' object is not iterable` mid-iteration.

  Carlton's #32963 patched the symptom by wrapping the helper in try/except and recovering from the same per-event accumulator the SDK was supposed to populate. This PR removes the helper from the call path entirely: we now use `client.responses.create(stream=True)` (raw AsyncIterable of SSE events) and assemble the final response object ourselves from `response.output_item.done` events as they arrive. The terminal event's `output` field is never read for content. Same strategy OpenClaw uses for the same backend.

  This makes Hermes structurally immune to the bug class, not patched. The next time OpenAI ships a shape change to chatgpt.com's terminal frame, our consumer keeps working because it doesn't read that frame for content, only for usage/status/id.

  Changes

  - `agent/codex_runtime.py`: new `_consume_codex_event_stream()` shared consumer; `run_codex_stream()` uses `responses.create(stream=True)`; `run_codex_create_stream_fallback()` collapses into a thin alias since the primary path now does what the fallback used to do.
  - `agent/auxiliary_client.py`: `_CodexCompletionsAdapter` uses the same consumer; old null-output recovery helpers deleted as unreferenced.
  - Tests migrated: fixtures that mocked `responses.stream` now mock `responses.create` returning a raw iterable. New regression test asserts the auxiliary path returns streamed items even when the terminal event's `output` is literally `null`.

  Validation

  - Live: tested against fresh OAuth on `chatgpt.com/backend-api/codex` with `gpt-5.5`: response built correctly with `response.output=null` on the terminal frame, all events consumed, usage/reasoning tokens propagated.
  - `tests/run_agent/test_run_agent_codex_responses.py` + `tests/agent/test_auxiliary_client.py`: 242 passed.

* test+fix(codex): migrate streaming tests, raise on truncated streams

  CI surfaced 10 test failures across `tests/run_agent/test_streaming.py` and `tests/run_agent/test_codex_xai_oauth_recovery.py`; both files had their own `responses.stream(...)` mocks I missed in the first sweep.

  `agent/codex_runtime.py`: `_consume_codex_event_stream()` now raises "Codex Responses stream did not emit a terminal response" when the stream ends without any terminal frame AND no usable content. This preserves the signal callers used to get from the SDK's high-level helper, which they distinguished from "completed with empty body" in error handling.

  Tests migrated:

  - `test_streaming.py`: text-delta callback, activity-touch, and remote-protocol-error tests all switch from mocking `responses.stream` to `responses.create` returning an iterable of events.
  - `test_codex_xai_oauth_recovery.py`: prelude-error tests are recast as wire-error-event tests (the new path raises `_StreamErrorEvent` directly when the wire emits `type=error`, which is strictly better than the old two-phase "SDK RuntimeError → retry → fallback"). The retry-on-transport-error test moves from a `responses.stream` side-effect to a `responses.create` side-effect.

  Verified live against chatgpt.com Codex with gpt-5.5: `AIAgent.chat()` through the full codex_responses path returns correctly, 319/319 targeted tests passing.
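The consumer strategy described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `agent/codex_runtime.py`: the names `consume_event_stream` and `StreamErrorEvent` are hypothetical stand-ins for `_consume_codex_event_stream()` and `_StreamErrorEvent`, events are modeled as plain dicts rather than the SDK's typed event objects, and the sketch is synchronous for brevity. The second half shows one way a test fixture might mock `responses.create` to return a raw iterable, again as an assumption about shape rather than the actual fixtures.

```python
from typing import Any, Dict, Iterable, List, Optional
from unittest.mock import MagicMock

# Event types treated as terminal frames in this sketch (an assumption).
TERMINAL_TYPES = {"response.completed", "response.incomplete", "response.failed"}


class StreamErrorEvent(RuntimeError):
    """Hypothetical stand-in for the error raised on a wire `type=error` event."""


def consume_event_stream(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a response from per-item events, never from terminal `output`."""
    items: List[Dict[str, Any]] = []
    terminal: Optional[Dict[str, Any]] = None

    for event in events:
        etype = event.get("type")
        if etype == "error":
            raise StreamErrorEvent(event.get("message", "stream error"))
        if etype == "response.output_item.done":
            item = event.get("item")
            if item is not None:
                items.append(item)
        elif etype in TERMINAL_TYPES:
            # Keep the terminal frame for metadata only (usage/status/id).
            terminal = event.get("response") or {}

    # Truncated stream: no terminal frame and nothing usable was collected.
    if terminal is None and not items:
        raise RuntimeError("Codex Responses stream did not emit a terminal response")

    terminal = terminal or {}
    return {
        "id": terminal.get("id"),
        "status": terminal.get("status"),
        "usage": terminal.get("usage"),
        "output": items,  # built from streamed items, not terminal["output"]
    }


# Test-style usage: mock `responses.create` so it returns a raw iterable.
events = [
    {"type": "response.output_item.done", "item": {"type": "message", "id": "msg_1"}},
    {
        "type": "response.completed",
        "response": {
            "id": "resp_1",
            "status": "completed",
            "output": None,  # the null terminal output described above
            "usage": {"total_tokens": 7},
        },
    },
]
client = MagicMock()
client.responses.create.return_value = iter(events)

stream = client.responses.create(model="gpt-5.5", input="hi", stream=True)
result = consume_event_stream(stream)
```

Because the terminal frame's `output` is never iterated, a `null` there has no effect on the assembled result; only a stream that ends with neither a terminal frame nor any items is treated as an error.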
||
|---|---|---|
| __init__.py | ||
| conftest.py | ||
| test_413_compression.py | ||
| test_860_dedup.py | ||
| test_1630_context_overflow_loop.py | ||
| test_31273_402_not_retried.py | ||
| test_agent_guardrails.py | ||
| test_anthropic_prompt_cache_policy.py | ||
| test_anthropic_third_party_oauth_guard.py | ||
| test_anthropic_truncation_continuation.py | ||
| test_api_max_retries_config.py | ||
| test_async_httpx_del_neuter.py | ||
| test_background_review.py | ||
| test_background_review_cache_parity.py | ||
| test_background_review_summary.py | ||
| test_background_review_toolset_restriction.py | ||
| test_callable_api_key.py | ||
| test_codex_app_server_integration.py | ||
| test_codex_multimodal_tool_result.py | ||
| test_codex_silent_hang_hint.py | ||
| test_codex_xai_oauth_recovery.py | ||
| test_commit_memory_session_context_engine.py | ||
| test_compress_focus_plugin_fallback.py | ||
| test_compression_boundary.py | ||
| test_compression_boundary_hook.py | ||
| test_compression_feasibility.py | ||
| test_compression_persistence.py | ||
| test_compression_trigger_excludes_reasoning.py | ||
| test_compressor_fallback_update.py | ||
| test_concurrent_interrupt.py | ||
| test_context_token_tracking.py | ||
| test_copilot_native_vision_headers.py | ||
| test_create_openai_client_kwargs_isolation.py | ||
| test_create_openai_client_proxy_env.py | ||
| test_create_openai_client_reuse.py | ||
| test_credential_pool_interrupt.py | ||
| test_deepseek_reasoning_content_echo.py | ||
| test_deepseek_v4_thinking_live.py | ||
| test_dict_tool_call_args.py | ||
| test_empty_response_recovery_persistence.py | ||
| test_exit_cleanup_interrupt.py | ||
| test_file_mutation_verifier.py | ||
| test_image_rejection_fallback.py | ||
| test_image_shrink_recovery.py | ||
| test_init_fallback_on_exhausted_pool.py | ||
| test_interactive_interrupt.py | ||
| test_interrupt_propagation.py | ||
| test_invalid_context_length_warning.py | ||
| test_iteration_budget_race.py | ||
| test_jsondecodeerror_retryable.py | ||
| test_last_reasoning_per_turn.py | ||
| test_long_context_tier_429.py | ||
| test_materialize_data_url_cleanup.py | ||
| test_memory_nudge_counter_hydration.py | ||
| test_memory_provider_init.py | ||
| test_memory_sync_interrupted.py | ||
| test_message_sequence_repair.py | ||
| test_multimodal_tool_content_recovery.py | ||
| test_openai_client_lifecycle.py | ||
| test_partial_stream_finish_reason.py | ||
| test_percentage_clamp.py | ||
| test_plugin_context_engine_init.py | ||
| test_primary_runtime_restore.py | ||
| test_provider_attribution_headers.py | ||
| test_provider_fallback.py | ||
| test_provider_parity.py | ||
| test_real_interrupt_subagent.py | ||
| test_redirect_stdout_issue.py | ||
| test_repair_tool_call_arguments.py | ||
| test_repair_tool_call_name.py | ||
| test_review_prompt_class_first.py | ||
| test_run_agent.py | ||
| test_run_agent_codex_responses.py | ||
| test_run_agent_multimodal_prologue.py | ||
| test_sequential_chats_live.py | ||
| test_session_id_env.py | ||
| test_session_meta_filtering.py | ||
| test_session_reset_fix.py | ||
| test_steer.py | ||
| test_stream_drop_logging.py | ||
| test_stream_interrupt_retry.py | ||
| test_streaming.py | ||
| test_streaming_tool_call_repair.py | ||
| test_strict_api_validation.py | ||
| test_strip_reasoning_tags_cli.py | ||
| test_switch_model_context.py | ||
| test_switch_model_fallback_prune.py | ||
| test_thinking_only_sanitizer.py | ||
| test_tls_fd_recycle_corruption.py | ||
| test_token_persistence_non_cli.py | ||
| test_tool_arg_coercion.py | ||
| test_tool_call_args_sanitizer.py | ||
| test_tool_call_guardrail_runtime.py | ||
| test_tool_executor_contextvar_propagation.py | ||
| test_tool_name_db_persistence.py | ||
| test_unicode_ascii_codec.py | ||
| test_vision_aware_preprocessing.py | ||