perf(cli): defer openai._base_client import via sys.meta_path finder (#28864)

`cli.py` eagerly imported `openai._base_client` at module-load time
purely to monkeypatch `AsyncHttpxClientWrapper.__del__` (a defense against
"Event loop is closed" / "Press ENTER to continue..." errors when
AsyncOpenAI clients are GC'd against dead event loops). That import cost
~166ms / ~30MB on every cold CLI start because openai's type tree
(responses/*, graders/*) is huge.

Replace it with a `sys.meta_path` finder that intercepts the first import
of `openai._base_client` from anywhere in the codebase, lets the normal
load run, then applies the `__del__ = lambda self: None` patch before
control returns to the caller. The correctness guarantee is unchanged
(the patch is applied before any AsyncOpenAI instance can be
constructed), and the cost is zero until the SDK is actually needed.
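
A minimal, self-contained sketch of the deferred-patch technique described
above. It does not touch openai: `demo_sdk_base`, `Wrapper`, and
`DeferredPatchFinder` are hypothetical stand-ins written to a temp directory
so the example runs anywhere. It only illustrates the mechanism and is not
the code shipped in `cli.py`.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

TARGET = "demo_sdk_base"  # hypothetical stand-in for openai._base_client

# Write a throwaway module whose class defines a real __del__.
_tmp = tempfile.mkdtemp()
with open(os.path.join(_tmp, TARGET + ".py"), "w") as fh:
    fh.write(
        "class Wrapper:\n"
        "    def __del__(self):\n"
        "        print('closing transport')\n"
    )
sys.path.insert(0, _tmp)
importlib.invalidate_caches()


class DeferredPatchFinder:
    """One-shot finder: neuter Wrapper.__del__ right after TARGET loads."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname != TARGET:
            return None
        # Remove ourselves first so the lookup below uses the normal finders.
        sys.meta_path.remove(self)
        spec = importlib.util.find_spec(fullname)
        if spec is None or spec.loader is None:
            return None
        original_exec = spec.loader.exec_module

        def patched_exec(module):
            original_exec(module)  # the normal load runs first
            module.Wrapper.__del__ = lambda self: None

        spec.loader.exec_module = patched_exec
        return spec


sys.meta_path.insert(0, DeferredPatchFinder())

assert TARGET not in sys.modules          # installing the finder imports nothing
module = importlib.import_module(TARGET)  # first import triggers the patch
patched_name = module.Wrapper.__del__.__name__
```

Because the patch runs inside `exec_module`, it is in place before the
import statement returns, so no caller can construct an instance of the
class with the original `__del__`.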

Hot path: every hermes chat / gateway boot / cron tick / subagent spawn.

A/B benchmark, 10 runs each, fresh subprocess:
                              BEFORE   AFTER    delta
  import cli wall (median)    0.86s    0.62s    -28%
  import cli wall (min)       0.85s    0.59s    -31%
  import cli RSS (median)     91.2MB   74.0MB   -19%
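
The commit does not show the benchmark harness, so the following is only a
sketch of one way to collect median/min wall times over fresh
subprocesses. `time_import` is a hypothetical helper; it times the whole
interpreter run (startup included) and does not measure RSS. `json` is
used here purely so the example runs without the project installed.

```python
import statistics
import subprocess
import sys
import time


def time_import(module_name, runs=10):
    """Return (median, min) wall seconds to import module_name in fresh interpreters."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new interpreter per run, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)


median_s, min_s = time_import("json", runs=3)
```

Run against the project, the call would presumably be
`time_import("cli", runs=10)` on the parent and on this commit.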

The `neuter_async_httpx_del` function in agent/auxiliary_client.py is
unchanged; its tests still pass and any future callers can still invoke
it directly.

Verified:
- import cli no longer pulls openai into sys.modules
- first 'from openai._base_client import AsyncHttpxClientWrapper'
  triggers the patch; __del__.__name__ == '<lambda>'
- tests/run_agent/test_async_httpx_del_neuter.py: 9/9 pass
- tests/agent/test_auxiliary_client.py: 159/159 pass
- tests/cli/: 715/715 pass
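
The first check in the list above could be scripted roughly as follows.
This is a sketch only: `import_pulls_in` is a hypothetical helper, not
part of the test suite, and the stdlib modules are placeholders so the
example runs without the project installed.

```python
import subprocess
import sys


def import_pulls_in(importer, prefix):
    """True if importing `importer` in a fresh interpreter loads `prefix` or a submodule."""
    code = (
        "import sys\n"
        f"import {importer}\n"
        f"hit = any(m == {prefix!r} or m.startswith({prefix!r} + '.') for m in sys.modules)\n"
        "print(int(hit))\n"
    )
    out = subprocess.run(
        [sys.executable, "-c", code],
        check=True, capture_output=True, text=True,
    ).stdout
    # Only the last character matters, so other output from the import is ignored.
    return out.strip().endswith("1")


pulls_decoder = import_pulls_in("json", "json.decoder")  # json imports its decoder
pulls_tkinter = import_pulls_in("json", "tkinter")       # json does not need tkinter
```

For this commit the equivalent call would be
`import_pulls_in("cli", "openai")`, which should return `False`.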
Teknium 2026-05-19 14:24:53 -07:00 committed by GitHub
parent 6a159be7ca
commit 784febe1cf

cli.py

@@ -655,9 +655,58 @@ except Exception:
 # which, during CLI idle time, finds prompt_toolkit's event loop and tries to
 # close TCP transports bound to dead worker loops — producing
 # "Event loop is closed" / "Press ENTER to continue..." errors.
+#
+# We install a sys.meta_path finder that defers the actual import + patch
+# until ``openai._base_client`` is first loaded by the rest of the codebase.
+# Eagerly importing it here (the old approach) cost ~166ms / ~30MB on every
+# cold CLI start because openai's type tree (responses/*, graders/*) is huge.
+# The finder approach pays nothing until the SDK is genuinely needed and
+# still guarantees the patch is applied before any AsyncOpenAI instance can
+# be constructed (the import-then-instantiate ordering is enforced by
+# Python's import system).
 try:
-    from agent.auxiliary_client import neuter_async_httpx_del
-    neuter_async_httpx_del()
+    import sys as _httpx_neuter_sys
+    import importlib.util as _httpx_neuter_imp_util
+    class _AsyncHttpxDelNeuter:
+        """Defer ``AsyncHttpxClientWrapper.__del__`` neutering until import.
+        Saves ~166ms on cold CLI start where openai is never used (e.g.
+        ``hermes --help`` paths inside the chat command flow). See
+        ``agent.auxiliary_client.neuter_async_httpx_del`` for full rationale
+        on why ``__del__`` must be a no-op.
+        """
+        _armed = True
+        def find_spec(self, fullname, path=None, target=None):
+            if not self._armed or fullname != "openai._base_client":
+                return None
+            # Disarm before delegating so the recursive find_spec call
+            # below doesn't loop through us.
+            self._armed = False
+            try:
+                _httpx_neuter_sys.meta_path.remove(self)
+            except ValueError:
+                pass
+            spec = _httpx_neuter_imp_util.find_spec(fullname)
+            if spec is None or spec.loader is None:
+                return None
+            _orig_exec = spec.loader.exec_module
+            def _patched_exec(module):
+                _orig_exec(module)
+                try:
+                    cls = getattr(module, "AsyncHttpxClientWrapper", None)
+                    if cls is not None:
+                        cls.__del__ = lambda self: None  # type: ignore[assignment]
+                except Exception:
+                    pass
+            spec.loader.exec_module = _patched_exec  # type: ignore[method-assign]
+            return spec
+    _httpx_neuter_sys.meta_path.insert(0, _AsyncHttpxDelNeuter())
 except Exception:
     pass