Mirror of https://github.com/NousResearch/hermes-agent.git, synced 2026-06-13 09:01:54 +00:00
perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage (#17046)
* perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage
Four heavy SDK/module imports are now deferred off the hot startup path.
Net savings on cold module imports:
  cli                      1200 → 958 ms  (-242)
  run_agent                1220 → 901 ms  (-319)
  tools.web_tools           711 → 423 ms  (-288)
  agent.anthropic_adapter   230 →  15 ms  (-215)
  agent.auxiliary_client    253 →  68 ms  (-185)
Four independent changes in one PR since they all use the same pattern
and share the same risk profile (heavy SDK import → lazy proxy or
function-local import):
1. tools/web_tools.py:
'from firecrawl import Firecrawl' moved into _get_firecrawl_client(),
which is only called when backend='firecrawl'. Users on Exa/Tavily/
Parallel pay zero firecrawl cost.
2. cli.py + gateway/run.py:
'from agent.account_usage import ...' moved into the /limits handlers.
account_usage transitively pulls the OpenAI SDK chain; only needed
when the user runs /limits.
3. agent/anthropic_adapter.py:
'try: import anthropic as _anthropic_sdk' replaced with a cached
'_get_anthropic_sdk()' accessor. The three usage sites
(build_anthropic_client, build_anthropic_bedrock_client,
read_claude_code_credentials_from_keychain) now resolve via the
accessor. All pre-existing test patches of
'agent.anthropic_adapter._anthropic_sdk' keep working because the
accessor respects any value already in module globals.
4. agent/auxiliary_client.py AND run_agent.py:
'from openai import OpenAI' replaced with an '_OpenAIProxy()' module-
level object that looks like the OpenAI class but imports the SDK on
first call/isinstance check. This preserves:
- 15+ in-module OpenAI(...) construction sites in auxiliary_client
and the single site in run_agent's _create_openai_client (Python's
function-scope name lookup finds the proxy, forwards the call);
- 'patch("agent.auxiliary_client.OpenAI", ...)' and
'patch("run_agent.OpenAI", ...)' test patterns used by 28+ test
files (patch replaces the module attribute as usual).
Tried two alternatives first:
- 'from openai._client import OpenAI' — doesn't skip openai/__init__.py
(the audit's hypothesis here was wrong).
- Module-level __getattr__: works for external attribute access, but
Python's function-scope name resolution never consults __getattr__, so
in-module OpenAI(...) calls raise NameError.
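The difference between the proxy object and the rejected module-level __getattr__ can be reproduced with a minimal sketch. Everything here is a stand-in and not the PR's code: `fractions.Fraction` plays the role of the heavy `openai.OpenAI` class, and the names `Heavy`, `_LazyClassProxy`, `build`, and `getattr_demo` are hypothetical.

```python
import types

# Hypothetical stand-in: fractions.Fraction plays the role of the heavy
# SDK class so the sketch runs with the standard library only.
_CLS_CACHE = None


def _load_cls():
    """Import and cache the real class on first use."""
    global _CLS_CACHE
    if _CLS_CACHE is None:
        from fractions import Fraction as _cls  # deferred import
        _CLS_CACHE = _cls
    return _CLS_CACHE


class _LazyClassProxy:
    """Instance that forwards calls and isinstance checks to the real class."""

    __slots__ = ()

    def __call__(self, *args, **kwargs):
        return _load_cls()(*args, **kwargs)

    def __instancecheck__(self, obj):
        # isinstance(x, Heavy) looks this method up on type(Heavy).
        return isinstance(obj, _load_cls())


Heavy = _LazyClassProxy()  # bound in module globals, so in-module lookups find it


def build():
    return Heavy(1, 3)  # ordinary global lookup resolves to the proxy


# Rejected alternative: a module-level __getattr__ in a throwaway module.
_SOURCE = '''
def __getattr__(name):
    if name == "Heavy":
        from fractions import Fraction
        return Fraction
    raise AttributeError(name)

def build():
    return Heavy(1, 3)  # global lookup, does not go through __getattr__
'''
demo = types.ModuleType("getattr_demo")
exec(_SOURCE, demo.__dict__)

external_ok = demo.Heavy(1, 3) == build()  # attribute access works
try:
    demo.build()
    in_module_error = None
except NameError as exc:
    in_module_error = type(exc).__name__
```

In this sketch the proxy survives both in-module calls and isinstance checks, while the __getattr__ variant only serves attribute access from outside the module.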
Note: 'openai' still loads on 'import cli' because
cli.py -> neuter_async_httpx_del() -> openai._base_client, and
run_agent.py -> code_execution_tool.py (module-level
build_execute_code_schema) -> _load_config() -> 'from cli import
CLI_CONFIG'. Deferring those is a separate, larger change — out of scope
for this PR. The savings above all come from avoiding the openai/*,
anthropic/*, and firecrawl/* top-level type-tree imports on paths that
don't need them.
Verified:
- 302/302 tests in tests/agent/{test_anthropic_adapter,
test_bedrock_1m_context, test_minimax_provider, test_anthropic_keychain}
pass. Two pre-existing failures on main unchanged.
- 106/106 tests/agent/test_auxiliary_client.py pass (1 pre-existing fail).
- 97/97 tests/run_agent/test_create_openai_client_kwargs_isolation.py,
test_plugin_context_engine_init.py, test_invalid_context_length_warning.py,
test_api_max_retries_config.py,
tests/hermes_cli/test_gemini_provider.py, test_ollama_cloud_provider.py
pass (1 pre-existing fail).
- Live hermes chat smoke: 2 turns + /model switch + tool calls, zero
errors in the 57-line agent.log window.
- Module-level import of run_agent + auxiliary_client + anthropic_adapter
no longer pulls 'anthropic' or 'firecrawl' at all.
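A claim of the form "importing X no longer pulls Y" can be checked in a fresh interpreter by inspecting sys.modules after the import. The sketch below is one possible way to do that and is not taken from the PR; the helper name `modules_loaded_by` is hypothetical, and `json`/`sqlite3` are stand-ins for `run_agent` and `anthropic`/`firecrawl`, which are assumed to be importable only inside the project environment.

```python
import subprocess
import sys


def modules_loaded_by(import_stmt: str, candidates: list[str]) -> dict[str, bool]:
    """Run import_stmt in a fresh interpreter and report which candidates got loaded."""
    probe = (
        f"{import_stmt}\n"
        "import sys\n"
        f"print(','.join(n for n in {candidates!r} if n in sys.modules))\n"
    )
    out = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    loaded = set(filter(None, out.split(",")))
    return {name: name in loaded for name in candidates}


# Stand-in check: importing json should load json but not sqlite3.
result = modules_loaded_by("import json", ["json", "sqlite3"])
```

A fresh subprocess matters here because the current process may already have imported the modules under test, which would hide a regression.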
* fix(gateway): restore top-level account_usage import for test-patch surface
CI caught two failures in tests/gateway/test_usage_command.py that I
missed locally:
AttributeError: 'module' object at gateway.run has no attribute 'fetch_account_usage'
The test uses monkeypatch.setattr('gateway.run.fetch_account_usage', ...)
to inject a fake account-fetch call. Moving the import inside the
handler deleted that module-level attribute, breaking the patch surface.
Restoring the top-level import in gateway/run.py gives up the ~230 ms
gateway-boot savings from that one lazy, but:
1. the gateway is a long-running daemon — boot cost is paid once per
install, not per turn;
2. the other four lazy-imports (firecrawl, openai, anthropic, cli's
account_usage) remain in place and still account for the bulk of
the savings reported in the PR body;
3. preserving the patch surface keeps the established
'gateway.run.fetch_account_usage' monkeypatch pattern working
without touching tests.
Verified: tests/gateway/test_usage_command.py — 8 passed, 0 failed.
Full targeted sweep (2336 tests across agent/gateway/hermes_cli/run_agent):
2332 passed, 4 failed — all 4 pre-existing on main.
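The two patch-surface outcomes described in this PR (the gateway import that had to be restored, and the anthropic accessor that kept its patches working) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the project's code: `json` stands in for the heavy dependency, and `handler`, `_get_json`, `_json_mod`, `load`, and `FakeJson` are hypothetical names.

```python
import types

# Variant 1: function-local import. The imported name never becomes a
# module attribute, so there is nothing for a test to patch.
_FUNCTION_LOCAL = '''
def handler():
    from json import dumps  # lazy: name exists only inside handler()
    return dumps({"ok": True})
'''

# Variant 2: sentinel-cached accessor. The module keeps an attribute, and
# the accessor returns whatever value is already stored there.
_SENTINEL_ACCESSOR = '''
_json_mod = ...  # sentinel: not tried yet; None would mean "tried and missing"

def _get_json():
    global _json_mod
    if _json_mod is ...:
        try:
            import json as _mod
            _json_mod = _mod
        except ImportError:
            _json_mod = None
    return _json_mod

def handler():
    return _get_json().dumps({"ok": True})
'''


def load(name, source):
    """Build a throwaway module from source text."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    return mod


local_mod = load("local_demo", _FUNCTION_LOCAL)
accessor_mod = load("accessor_demo", _SENTINEL_ACCESSOR)

# pytest's monkeypatch.setattr raises AttributeError by default when the
# target attribute is missing, which is the failure mode seen in CI.
has_patch_target = hasattr(local_mod, "dumps")

unpatched_result = accessor_mod.handler()


class FakeJson:
    @staticmethod
    def dumps(obj):
        return "patched"


# A test can overwrite the module attribute; the accessor respects it.
accessor_mod._json_mod = FakeJson
patched_result = accessor_mod.handler()
```

The design trade-off is the one the commit message describes: a function-local import is the smallest change but removes the patch target, while an accessor or proxy keeps a module-level name at the cost of a little indirection.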
---------
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
parent df51ad7973
commit b5128a751b
6 changed files with 125 additions and 9 deletions
agent/anthropic_adapter.py
@@ -22,10 +22,25 @@ from hermes_constants import get_hermes_home
 from typing import Any, Dict, List, Optional, Tuple
 from utils import normalize_proxy_env_vars
 
-try:
-    import anthropic as _anthropic_sdk
-except ImportError:
-    _anthropic_sdk = None  # type: ignore[assignment]
+# NOTE: `import anthropic` is deliberately NOT at module top — the SDK pulls
+# ~220 ms of imports (anthropic.types, anthropic.lib.tools._beta_runner, etc.)
+# and the 3 usage sites (build_anthropic_client, build_anthropic_bedrock_client,
+# read_claude_code_credentials_from_keychain) are all on cold user-triggered
+# paths. Access via the `_get_anthropic_sdk()` accessor below, which caches
+# the module after the first call and returns None on ImportError.
+_anthropic_sdk: Any = ...  # sentinel — None means "tried and missing"
+
+
+def _get_anthropic_sdk():
+    """Return the ``anthropic`` SDK module, importing lazily. None if not installed."""
+    global _anthropic_sdk
+    if _anthropic_sdk is ...:
+        try:
+            import anthropic as _sdk
+            _anthropic_sdk = _sdk
+        except ImportError:
+            _anthropic_sdk = None
+    return _anthropic_sdk
 
 logger = logging.getLogger(__name__)
 
@@ -395,6 +410,7 @@ def build_anthropic_client(api_key: str, base_url: str = None, timeout: float =
 
     Returns an anthropic.Anthropic instance.
     """
+    _anthropic_sdk = _get_anthropic_sdk()
     if _anthropic_sdk is None:
         raise ImportError(
             "The 'anthropic' package is required for the Anthropic provider. "
@@ -492,6 +508,7 @@ def build_anthropic_bedrock_client(region: str):
 
     Auth uses the boto3 default credential chain (IAM roles, SSO, env vars).
     """
+    _anthropic_sdk = _get_anthropic_sdk()
     if _anthropic_sdk is None:
         raise ImportError(
             "The 'anthropic' package is required for the Bedrock provider. "
agent/auxiliary_client.py
@@ -41,10 +41,57 @@ import threading
 import time
 from pathlib import Path  # noqa: F401 — used by test mocks
 from types import SimpleNamespace
-from typing import Any, Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional, Tuple, TYPE_CHECKING
 from urllib.parse import urlparse, parse_qs, urlunparse
 
-from openai import OpenAI
+# NOTE: `from openai import OpenAI` is deliberately NOT at module top — the
+# openai SDK pulls a large type tree (~240 ms cold, including responses/*,
+# graders/*). We expose `OpenAI` here as a thin proxy that imports the SDK on
+# first call and forwards, so:
+# (a) the 15+ in-module `OpenAI(...)` construction sites work unchanged
+# (Python's function-scope name lookup resolves `OpenAI` to the proxy
+# object bound in module globals here, without triggering any import);
+# (b) external code can still do `auxiliary_client.OpenAI` or
+# `patch("agent.auxiliary_client.OpenAI", ...)` — tests see the proxy,
+# and patch replaces the module attribute as usual;
+# (c) `OpenAI` as a type annotation resolves at runtime to the proxy class
+# (which is harmless — annotations aren't type-checked at runtime).
+# See tests/agent/test_auxiliary_client.py for patch patterns this supports.
+if TYPE_CHECKING:
+    from openai import OpenAI  # noqa: F401 — type hints only
+
+_OPENAI_CLS_CACHE: Optional[type] = None
+
+
+def _load_openai_cls() -> type:
+    """Import and cache ``openai.OpenAI``."""
+    global _OPENAI_CLS_CACHE
+    if _OPENAI_CLS_CACHE is None:
+        from openai import OpenAI as _cls
+        _OPENAI_CLS_CACHE = _cls
+    return _OPENAI_CLS_CACHE
+
+
+class _OpenAIProxy:
+    """Module-level proxy that looks like the ``openai.OpenAI`` class.
+
+    Forwards ``OpenAI(...)`` calls and ``isinstance(x, OpenAI)`` checks to the
+    real SDK class, importing the SDK lazily on first use.
+    """
+
+    __slots__ = ()
+
+    def __call__(self, *args, **kwargs):
+        return _load_openai_cls()(*args, **kwargs)
+
+    def __instancecheck__(self, obj):
+        return isinstance(obj, _load_openai_cls())
+
+    def __repr__(self):
+        return "<lazy openai.OpenAI proxy>"
+
+
+OpenAI = _OpenAIProxy()  # module-level name, resolves lazily on call/isinstance
 
 from agent.credential_pool import load_pool
 from hermes_cli.config import get_hermes_home
cli.py (6 lines changed)
@@ -69,7 +69,9 @@ from agent.usage_pricing import (
     format_duration_compact,
     format_token_count_compact,
 )
-from agent.account_usage import fetch_account_usage, render_account_usage_lines
+# NOTE: `from agent.account_usage import ...` is deliberately NOT at module
+# top — it transitively pulls the OpenAI SDK chain (~230 ms cold) and is only
+# needed when the user runs `/limits`. Lazy-imported inside the handler below.
 from hermes_cli.banner import _format_context_length, format_banner_version_label
 
 _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏")
@@ -7285,6 +7287,8 @@ class HermesCLI:
         provider = getattr(agent, "provider", None) or getattr(self, "provider", None)
         base_url = getattr(agent, "base_url", None) or getattr(self, "base_url", None)
         api_key = getattr(agent, "api_key", None) or getattr(self, "api_key", None)
+        # Lazy import — pulls the OpenAI SDK chain, only needed here.
+        from agent.account_usage import fetch_account_usage, render_account_usage_lines
         account_snapshot = None
         if provider:
             with concurrent.futures.ThreadPoolExecutor(max_workers=1) as _pool:
gateway/run.py
@@ -31,6 +31,12 @@ from pathlib import Path
 from datetime import datetime
 from typing import Dict, Optional, Any, List
 
+# account_usage imports the OpenAI SDK chain (~230 ms). Only needed by
+# /usage; we still import it at module top in the gateway because test
+# patches (tests/gateway/test_usage_command.py) target
+# `gateway.run.fetch_account_usage` as a module-level attribute. The
+# gateway is a long-running daemon, so its boot cost matters less than
+# preserving the established test-patch surface.
 from agent.account_usage import fetch_account_usage, render_account_usage_lines
 
 # --- Agent cache tuning ---------------------------------------------------
run_agent.py (39 lines changed)
@@ -41,13 +41,48 @@ import urllib.request
 import uuid
 from typing import List, Dict, Any, Optional
 from urllib.parse import urlparse, parse_qs, urlunparse
-from openai import OpenAI
+# NOTE: `from openai import OpenAI` is deliberately NOT at module top — the
+# SDK pulls ~240 ms of imports. We expose `OpenAI` as a thin proxy object
+# that imports the SDK on first call/isinstance check. This preserves:
+# (a) the single in-module `OpenAI(**client_kwargs)` call site at
+# _create_openai_client, and
+# (b) `patch("run_agent.OpenAI", ...)` test patterns used by ~28 test files.
 import fire
 from datetime import datetime
 from pathlib import Path
 
 from hermes_constants import get_hermes_home
 
+
+_OPENAI_CLS_CACHE: Optional[type] = None
+
+
+def _load_openai_cls() -> type:
+    """Import and cache ``openai.OpenAI``."""
+    global _OPENAI_CLS_CACHE
+    if _OPENAI_CLS_CACHE is None:
+        from openai import OpenAI as _cls
+        _OPENAI_CLS_CACHE = _cls
+    return _OPENAI_CLS_CACHE
+
+
+class _OpenAIProxy:
+    """Module-level proxy that looks like ``openai.OpenAI`` but imports lazily."""
+
+    __slots__ = ()
+
+    def __call__(self, *args, **kwargs):
+        return _load_openai_cls()(*args, **kwargs)
+
+    def __instancecheck__(self, obj):
+        return isinstance(obj, _load_openai_cls())
+
+    def __repr__(self):
+        return "<lazy openai.OpenAI proxy>"
+
+
+OpenAI = _OpenAIProxy()
+
 # Load .env from ~/.hermes/.env first, then project root as dev fallback.
 # User-managed env files should override stale shell exports on restart.
 from hermes_cli.env_loader import load_hermes_dotenv
@@ -5243,6 +5278,8 @@ class AIAgent:
         keepalive_http = self._build_keepalive_http_client(client_kwargs.get("base_url", ""))
         if keepalive_http is not None:
             client_kwargs["http_client"] = keepalive_http
+        # Uses the module-level `OpenAI` proxy defined at the top of this module,
+        # which imports the SDK on first call. Tests patch via `run_agent.OpenAI`.
         client = OpenAI(**client_kwargs)
         logger.info(
             "OpenAI client created (%s, shared=%s) %s",
tools/web_tools.py
@@ -47,7 +47,10 @@ import re
 import asyncio
 from typing import List, Dict, Any, Optional
 import httpx
-from firecrawl import Firecrawl
+# NOTE: `from firecrawl import Firecrawl` is deliberately NOT at module top —
+# the SDK pulls ~200 ms of imports (httpcore, firecrawl.v1/v2 type trees) and
+# we only need it when the backend is actually "firecrawl". See
+# _get_firecrawl_client() below for the lazy import.
 from agent.auxiliary_client import (
     async_call_llm,
     extract_content_or_reasoning,
@@ -236,6 +239,8 @@ def _get_firecrawl_client():
     if _firecrawl_client is not None and _firecrawl_client_config == client_config:
         return _firecrawl_client
 
+    # Lazy import — ~200 ms of SDK init, only paid when firecrawl is actually used.
+    from firecrawl import Firecrawl  # noqa: E402
     _firecrawl_client = Firecrawl(**kwargs)
     _firecrawl_client_config = client_config
     return _firecrawl_client