Mirror of https://github.com/NousResearch/hermes-agent.git, synced 2026-05-29 06:31:32 +00:00
* refactor(codex): drop SDK responses.stream() helper; consume events directly

  The OpenAI Python SDK's high-level `client.responses.stream(...)` helper does post-hoc typed reconstruction from the terminal `response.completed.response.output` field. The chatgpt.com Codex backend has been observed (today, gpt-5.5) to ship `response.output = null` on terminal frames, which crashes the SDK with `TypeError: 'NoneType' object is not iterable` mid-iteration.

  Carlton's #32963 patched the symptom by wrapping the helper in try/except and recovering from the same per-event accumulator the SDK was supposed to populate. This PR removes the helper from the call path entirely: we now use `client.responses.create(stream=True)` (raw AsyncIterable of SSE events) and assemble the final response object ourselves from `response.output_item.done` events as they arrive. The terminal event's `output` field is never read for content. Same strategy OpenClaw uses for the same backend.

  This makes Hermes structurally immune to the bug class, not patched. The next time OpenAI ships a shape change to chatgpt.com's terminal frame, our consumer keeps working because it reads that frame only for usage/status/id, never for content.

  Changes
  - `agent/codex_runtime.py`: new `_consume_codex_event_stream()` shared consumer; `run_codex_stream()` uses `responses.create(stream=True)`; `run_codex_create_stream_fallback()` collapses into a thin alias since the primary path now does what the fallback used to do.
  - `agent/auxiliary_client.py`: `_CodexCompletionsAdapter` uses the same consumer; old null-output recovery helpers deleted as unreferenced.
  - Tests migrated: fixtures that mocked `responses.stream` now mock `responses.create` returning a raw iterable. New regression test asserts the auxiliary path returns streamed items even when the terminal event's `output` is literally `null`.

  Validation
  - Live: tested against fresh OAuth on `chatgpt.com/backend-api/codex` with `gpt-5.5`: response built correctly with `response.output=null` on the terminal frame, all events consumed, usage/reasoning tokens propagated.
  - `tests/run_agent/test_run_agent_codex_responses.py` + `tests/agent/test_auxiliary_client.py`: 242 passed.

* test+fix(codex): migrate streaming tests, raise on truncated streams

  CI surfaced 10 test failures across tests/run_agent/test_streaming.py and tests/run_agent/test_codex_xai_oauth_recovery.py; both files had their own `responses.stream(...)` mocks I missed in the first sweep.

  agent/codex_runtime.py: _consume_codex_event_stream() now raises "Codex Responses stream did not emit a terminal response" when the stream ends without any terminal frame AND no usable content. This preserves the signal callers used to get from the SDK's high-level helper, which they distinguished from "completed with empty body" in error handling.

  Tests migrated:
  - test_streaming.py: text-delta callback, activity-touch, and remote-protocol-error tests all switch from mocking responses.stream to responses.create returning an iterable of events.
  - test_codex_xai_oauth_recovery.py: prelude-error tests are recast as wire-error-event tests (the new path raises _StreamErrorEvent directly when the wire emits type=error, which is strictly better than the old two-phase "SDK RuntimeError → retry → fallback"). The retry-on-transport-error test moves from responses.stream side-effect to responses.create side-effect.

  Verified live against chatgpt.com Codex with gpt-5.5: AIAgent.chat() through the full codex_responses path returns correctly, 319/319 targeted tests passing.
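The consumption strategy described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's `_consume_codex_event_stream()`: events are modeled as plain dicts instead of SDK objects, and the names `consume_event_stream`, `StreamErrorEvent`, and `TERMINAL_TYPES` are hypothetical. The set of terminal event types and the shape of the returned dict are also assumptions.

```python
from typing import Any, Dict, Iterable, List, Optional
from unittest.mock import MagicMock

# Assumption: these event types are treated as terminal frames.
TERMINAL_TYPES = {"response.completed", "response.incomplete", "response.failed"}


class StreamErrorEvent(RuntimeError):
    """Hypothetical stand-in for an error raised on a wire-level `type=error` event."""


def consume_event_stream(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a response from per-item events, ignoring the terminal `output` field."""
    items: List[Dict[str, Any]] = []
    terminal: Optional[Dict[str, Any]] = None

    for event in events:
        etype = event.get("type")
        if etype == "error":
            # Surface wire errors immediately instead of retrying through a fallback.
            raise StreamErrorEvent(event.get("message") or "stream error")
        if etype == "response.output_item.done":
            item = event.get("item")
            if item is not None:
                items.append(item)
        elif etype in TERMINAL_TYPES:
            # Keep the terminal frame for metadata only; its `output` may be null.
            terminal = event.get("response") or {}

    if terminal is None and not items:
        # Truncated stream: no terminal frame and no usable content.
        raise RuntimeError("Codex Responses stream did not emit a terminal response")

    meta = terminal or {}
    return {
        "id": meta.get("id"),
        "status": meta.get("status"),
        "usage": meta.get("usage"),
        "output": items,  # built from streamed items, never from the terminal frame
    }


def demo() -> Dict[str, Any]:
    """Mock `responses.create` returning a raw iterable, as the migrated tests do."""
    events = [
        {"type": "response.output_item.done", "item": {"type": "message", "id": "msg_1"}},
        {
            "type": "response.completed",
            "response": {"id": "resp_1", "status": "completed",
                         "usage": {"total_tokens": 7}, "output": None},
        },
    ]
    client = MagicMock()
    client.responses.create.return_value = iter(events)
    stream = client.responses.create(model="gpt-5.5", input="hi", stream=True)
    return consume_event_stream(stream)
```

In this sketch, `demo()` returns the streamed message item in `output` even though the terminal frame carries `output: None`, which mirrors the regression case the commit describes.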
| Name | | |
|---|---|---|
| lsp | ||
| secret_sources | ||
| transports | ||
| __init__.py | ||
| account_usage.py | ||
| agent_init.py | ||
| agent_runtime_helpers.py | ||
| anthropic_adapter.py | ||
| async_utils.py | ||
| auxiliary_client.py | ||
| azure_identity_adapter.py | ||
| background_review.py | ||
| bedrock_adapter.py | ||
| browser_provider.py | ||
| browser_registry.py | ||
| chat_completion_helpers.py | ||
| codex_responses_adapter.py | ||
| codex_runtime.py | ||
| context_compressor.py | ||
| context_engine.py | ||
| context_references.py | ||
| conversation_compression.py | ||
| conversation_loop.py | ||
| copilot_acp_client.py | ||
| credential_persistence.py | ||
| credential_pool.py | ||
| credential_sources.py | ||
| curator.py | ||
| curator_backup.py | ||
| display.py | ||
| error_classifier.py | ||
| file_safety.py | ||
| gemini_cloudcode_adapter.py | ||
| gemini_native_adapter.py | ||
| gemini_schema.py | ||
| google_code_assist.py | ||
| google_oauth.py | ||
| i18n.py | ||
| image_gen_provider.py | ||
| image_gen_registry.py | ||
| image_routing.py | ||
| insights.py | ||
| iteration_budget.py | ||
| lmstudio_reasoning.py | ||
| manual_compression_feedback.py | ||
| markdown_tables.py | ||
| memory_manager.py | ||
| memory_provider.py | ||
| message_sanitization.py | ||
| model_metadata.py | ||
| models_dev.py | ||
| moonshot_schema.py | ||
| nous_rate_guard.py | ||
| onboarding.py | ||
| plugin_llm.py | ||
| portal_tags.py | ||
| process_bootstrap.py | ||
| prompt_builder.py | ||
| prompt_caching.py | ||
| rate_limit_tracker.py | ||
| redact.py | ||
| retry_utils.py | ||
| shell_hooks.py | ||
| skill_bundles.py | ||
| skill_commands.py | ||
| skill_preprocessing.py | ||
| skill_utils.py | ||
| stream_diag.py | ||
| subdirectory_hints.py | ||
| system_prompt.py | ||
| think_scrubber.py | ||
| title_generator.py | ||
| tool_dispatch_helpers.py | ||
| tool_executor.py | ||
| tool_guardrails.py | ||
| tool_result_classification.py | ||
| trajectory.py | ||
| transcription_provider.py | ||
| transcription_registry.py | ||
| tts_provider.py | ||
| tts_registry.py | ||
| usage_pricing.py | ||
| video_gen_provider.py | ||
| video_gen_registry.py | ||
| web_search_provider.py | ||
| web_search_registry.py | ||