perf(cli): defer openai._base_client import via sys.meta_path finder (#28864)

`cli.py` was eager-importing `openai._base_client` at module-load time purely to monkeypatch `AsyncHttpxClientWrapper.__del__` (defense against "Press ENTER to continue..." errors when AsyncOpenAI clients are GC'd against dead event loops). That import cost ~166ms / ~30MB on every cold CLI start because openai's type tree (responses/*, graders/*) is huge. Replace with a `sys.meta_path` finder that intercepts the first import of `openai._base_client` from anywhere in the codebase, lets the normal load run, then applies the `__del__ = lambda self: None` patch before control returns to the caller. Same correctness guarantee (patch applies before any AsyncOpenAI instance can be constructed), zero cost until the SDK is actually needed. Hot path: every hermes chat / gateway boot / cron tick / subagent spawn. A/B benchmark, 10 runs each, fresh subprocess: BEFORE AFTER delta import cli wall 0.86s 0.62s -28% (median) import cli wall 0.85s 0.59s -31% (min) import cli RSS 91.2MB 74.0MB -19% (median) The `neuter_async_httpx_del` function in agent/auxiliary_client.py is unchanged; its tests still pass and any future callers can still invoke it directly. Verified: - import cli no longer pulls openai into sys.modules - first 'from openai._base_client import AsyncHttpxClientWrapper' triggers the patch; __del__.__name__ == '<lambda>' - tests/run_agent/test_async_httpx_del_neuter.py: 9/9 pass - tests/agent/test_auxiliary_client.py: 159/159 pass - tests/cli/: 715/715 pass
2026-07-14 14:12:44 +00:00 · 2026-05-19 14:24:53 -07:00 · 2026-05-19 14:24:53 -07:00 · 784febe1cf
commit 784febe1cf
parent 6a159be7ca
1 changed files with 51 additions and 2 deletions
--- a/cli.py
+++ b/cli.py
@ -655,9 +655,58 @@ except Exception:
 # which, during CLI idle time, finds prompt_toolkit's event loop and tries to
 # close TCP transports bound to dead worker loops — producing
 # "Event loop is closed" / "Press ENTER to continue..." errors.
+#
+# We install a sys.meta_path finder that defers the actual import + patch
+# until ``openai._base_client`` is first loaded by the rest of the codebase.
+# Eagerly importing it here (the old approach) cost ~166ms / ~30MB on every
+# cold CLI start because openai's type tree (responses/*, graders/*) is huge.
+# The finder approach pays nothing until the SDK is genuinely needed and
+# still guarantees the patch is applied before any AsyncOpenAI instance can
+# be constructed (the import-then-instantiate ordering is enforced by
+# Python's import system).
 try:
-    from agent.auxiliary_client import neuter_async_httpx_del
-    neuter_async_httpx_del()
+    import sys as _httpx_neuter_sys
+    import importlib.util as _httpx_neuter_imp_util
+
+    class _AsyncHttpxDelNeuter:
+        """Defer ``AsyncHttpxClientWrapper.__del__`` neutering until import.
+
+        Saves ~166ms on cold CLI start where openai is never used (e.g.
+        ``hermes --help`` paths inside the chat command flow).  See
+        ``agent.auxiliary_client.neuter_async_httpx_del`` for full rationale
+        on why ``__del__`` must be a no-op.
+        """
+
+        _armed = True
+
+        def find_spec(self, fullname, path=None, target=None):
+            if not self._armed or fullname != "openai._base_client":
+                return None
+            # Disarm before delegating so the recursive find_spec call
+            # below doesn't loop through us.
+            self._armed = False
+            try:
+                _httpx_neuter_sys.meta_path.remove(self)
+            except ValueError:
+                pass
+            spec = _httpx_neuter_imp_util.find_spec(fullname)
+            if spec is None or spec.loader is None:
+                return None
+            _orig_exec = spec.loader.exec_module
+
+            def _patched_exec(module):
+                _orig_exec(module)
+                try:
+                    cls = getattr(module, "AsyncHttpxClientWrapper", None)
+                    if cls is not None:
+                        cls.__del__ = lambda self: None  # type: ignore[assignment]
+                except Exception:
+                    pass
+
+            spec.loader.exec_module = _patched_exec  # type: ignore[method-assign]
+            return spec
+
+    _httpx_neuter_sys.meta_path.insert(0, _AsyncHttpxDelNeuter())
 except Exception:
    pass