mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup

Self-review of the plugin migration surfaced one warning and a handful of
doc/dead-code cleanups. None affect production behaviour through the main
dispatcher (which always calls `tools.web_tools._get_backend()` first and
preserves the full 7-provider walk), but direct callers of
`agent.web_search_registry.get_active_*_provider()` previously diverged
from the legacy order and could return `None` for users with credentials
but no explicit `web.backend` config key.

Changes
-------

1. `_LEGACY_PREFERENCE` was shipped as a 4-tuple
   `("brave-free", "firecrawl", "searxng", "ddgs")` while the PR
   description and the legacy `_get_backend()` candidate order both
   call for the 7-tuple
   `(firecrawl, parallel, tavily, exa, searxng, brave-free, ddgs)`.
   Replaced with the 7-tuple. Verified empirically: with TAVILY+EXA keys
   and no config, `get_active_search_provider()` now returns tavily
   (was None); with EXA+PARALLEL it returns parallel (was None); with
   BRAVE+FIRECRAWL it returns firecrawl (was brave-free).

2. `agent/web_search_registry.py`: module docstring, `_resolve` step-3
   docstring, and inline comment all listed the old 4-tuple and claimed
   "brave-free first because it was the shipped default". The legacy
   default is `"firecrawl"`. Rewritten to match the new ordering and
   reference `tools.web_tools._get_backend()` as the source of truth.

3. `agent/web_search_registry.py`: `get_active_crawl_provider`
   docstring said "only Tavily implements it among built-in providers".
   Firecrawl also advertises `supports_crawl=True` after the previous
   commit. Updated to "Tavily and Firecrawl".

4. `plugins/web/tavily/provider.py`: module docstring said "Tavily is
   the only built-in backend that natively crawls". Updated.

5. `agent/web_search_provider.py`: ABC docstring mentioned only
   `search` / `extract` capabilities. Added `crawl` for accuracy.

6. `plugins/web/{firecrawl,parallel,exa}/provider.py`: dead plugin-level
   cache globals (`_firecrawl_client`, `_parallel_client`,
   `_async_parallel_client`, `_exa_client`) were declared but never read
   (all reads/writes go through `_wt.*` per the
   `extracting-inline-helpers-to-plugins` recipe). Removed the dead
   declarations; the reset-for-tests helpers in firecrawl + parallel now
   clear the canonical `_wt._<name>` slots, matching the pattern exa
   already used.

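The effect of change 1 can be illustrated with a minimal sketch of a preference walk. This is not the registry's actual code: `resolve_provider`, `with_credentials`, and the `is_available` callback are hypothetical names, and the assumption that an explicit config value wins outright is ours. Only the two tuples are taken from the commit message.

```python
from typing import Callable, Optional, Sequence

# Orders quoted in the commit message (change 1).
OLD_PREFERENCE = ("brave-free", "firecrawl", "searxng", "ddgs")
LEGACY_PREFERENCE = (
    "firecrawl", "parallel", "tavily", "exa", "searxng", "brave-free", "ddgs",
)


def resolve_provider(
    configured: Optional[str],
    is_available: Callable[[str], bool],
    preference: Sequence[str] = LEGACY_PREFERENCE,
) -> Optional[str]:
    """Return the explicitly configured provider, else the first available one.

    Hypothetical helper: `is_available` stands in for whatever credential
    check the real registry performs.
    """
    if configured:
        # Assumption: an explicit config key bypasses the preference walk.
        return configured
    for name in preference:
        if is_available(name):
            return name
    return None


def with_credentials(*names: str) -> Callable[[str], bool]:
    """Build an availability check from a fixed set of provider names."""
    available = set(names)
    return lambda name: name in available


# TAVILY + EXA keys, no explicit config: old 4-tuple first, then the 7-tuple.
print(resolve_provider(None, with_credentials("tavily", "exa"), OLD_PREFERENCE))
print(resolve_provider(None, with_credentials("tavily", "exa")))
```

Under these assumptions the 4-tuple walk finds no match for a user holding only Tavily and Exa keys, while the 7-tuple walk selects tavily, which is the divergence the commit message reports.
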
Tests
-----

218/218 web-targeted tests still pass (no test changes needed). 4910/4910
in `tests/tools/` still green.

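The canonical-slot pattern from change 6, which is what lets existing tests keep resetting state without modification, can be sketched as follows. This is an illustration under stated assumptions, not the plugin code: `wt` is a stand-in module created with `types.ModuleType` in place of `tools.web_tools`, and `FakeClient`, `get_client`, and `reset_client_for_tests` are hypothetical names.

```python
import types

# Stand-in for tools.web_tools: the one module that owns the cache slot.
wt = types.ModuleType("fake_web_tools")
wt._client = None


class FakeClient:
    """Placeholder for an SDK client object (hypothetical)."""


def get_client() -> FakeClient:
    """Read and write the cache through the owning module, never a local global."""
    if wt._client is None:
        wt._client = FakeClient()
    return wt._client


def reset_client_for_tests() -> None:
    """Clear the canonical slot so the next get_client() call rebuilds it."""
    wt._client = None


first = get_client()
assert get_client() is first      # cached between calls

wt._client = None                 # what a test's setup_method() would do
second = get_client()
assert second is not first        # the direct reset was observed

reset_client_for_tests()
assert wt._client is None         # helper clears the same slot
```

Because there is only one slot, a test that assigns `None` on the owning module and a test that calls the reset helper both see fresh state; a second plugin-level global would have been a copy that neither path touched.
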
parent 21e3a863bb
commit 657e6d87cc
6 changed files with 82 additions and 48 deletions
--- a/plugins/web/exa/provider.py
+++ b/plugins/web/exa/provider.py
@@ -32,9 +32,10 @@ from agent.web_search_provider import WebSearchProvider
 
 logger = logging.getLogger(__name__)
 
-# Module-level cache for the Exa client so we don't reconstruct it per
-# call. Matches the legacy `_exa_client` pattern in tools/web_tools.py.
-_exa_client: Any = None
+# Module-level note: the canonical ``_exa_client`` cache slot lives on
+# :mod:`tools.web_tools` so tests that do ``tools.web_tools._exa_client =
+# None`` between cases see fresh state. The plugin reads/writes through
+# that public module (see :func:`_get_exa_client`).
 
 
 def _get_exa_client() -> Any:
--- a/plugins/web/firecrawl/provider.py
+++ b/plugins/web/firecrawl/provider.py
@@ -112,9 +112,11 @@ Firecrawl = _FirecrawlProxy()
 # ---------------------------------------------------------------------------
 # Client construction (direct vs managed-gateway)
 # ---------------------------------------------------------------------------
-
-_firecrawl_client: Any = None
-_firecrawl_client_config: Any = None
+#
+# The canonical cache slots live on :mod:`tools.web_tools` so tests that do
+# ``tools.web_tools._firecrawl_client = None`` between cases see fresh
+# state. The plugin reads/writes through that public module — see
+# :func:`_get_firecrawl_client` below.
 
 
 def _get_direct_firecrawl_config() -> Optional[tuple]:
@@ -257,10 +259,15 @@ def _get_firecrawl_client() -> Any:
 
 
 def _reset_client_for_tests() -> None:
-    """Drop the cached Firecrawl client so tests can re-instantiate cleanly."""
-    global _firecrawl_client, _firecrawl_client_config
-    _firecrawl_client = None
-    _firecrawl_client_config = None
+    """Drop the cached Firecrawl client so tests can re-instantiate cleanly.
+
+    Clears the canonical slots on :mod:`tools.web_tools` (where
+    :func:`_get_firecrawl_client` reads/writes them).
+    """
+    import tools.web_tools as _wt
+
+    _wt._firecrawl_client = None
+    _wt._firecrawl_client_config = None
 
 
 # ---------------------------------------------------------------------------
--- a/plugins/web/parallel/provider.py
+++ b/plugins/web/parallel/provider.py
@@ -36,13 +36,11 @@ from agent.web_search_provider import WebSearchProvider
 
 logger = logging.getLogger(__name__)
 
-# Module-level client caches mirroring the legacy `tools.web_tools._parallel_client`
-# / `_async_parallel_client` pattern. For tests, the canonical cache lives on
-# tools.web_tools so existing setup_method() handlers that reset
-# ``tools.web_tools._parallel_client = None`` keep working — we read/write
-# the cache via that module rather than these module-level globals.
-_parallel_client: Any = None
-_async_parallel_client: Any = None
+# Module-level note: the canonical cache slots ``_parallel_client`` and
+# ``_async_parallel_client`` live on :mod:`tools.web_tools` so tests that do
+# ``tools.web_tools._parallel_client = None`` between cases see fresh state.
+# The plugin reads/writes through that public module (see
+# :func:`_get_sync_client` / :func:`_get_async_client`).
 
 
 def _ensure_parallel_sdk_installed() -> None:
@@ -117,10 +115,15 @@ def _get_async_client() -> Any:
 
 
 def _reset_clients_for_tests() -> None:
-    """Drop both cached clients so tests can re-instantiate cleanly."""
-    global _parallel_client, _async_parallel_client
-    _parallel_client = None
-    _async_parallel_client = None
+    """Drop both cached clients so tests can re-instantiate cleanly.
+
+    Clears the canonical slots on :mod:`tools.web_tools` (where
+    :func:`_get_sync_client` / :func:`_get_async_client` read/write them).
+    """
+    import tools.web_tools as _wt
+
+    _wt._parallel_client = None
+    _wt._async_parallel_client = None
 
 
 # Backward-compatible aliases for the names that lived in tools.web_tools
--- a/plugins/web/tavily/provider.py
+++ b/plugins/web/tavily/provider.py
@@ -5,8 +5,8 @@ capabilities advertised:
 
 - ``supports_search()`` -> True (Tavily ``/search``)
 - ``supports_extract()`` -> True (Tavily ``/extract``)
-- ``supports_crawl()`` -> True (Tavily ``/crawl``) — Tavily is the only
-  built-in backend that natively crawls
+- ``supports_crawl()`` -> True (Tavily ``/crawl``) — sync HTTP crawl;
+  Firecrawl also advertises ``supports_crawl=True`` (async)
 
 All three are sync — the underlying call is ``httpx.post(...)``. The
 dispatcher in :func:`tools.web_tools.web_crawl_tool` (which is itself
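
The capability flags referenced in the docstring above (`supports_search()`, `supports_extract()`, `supports_crawl()`) suggest how a crawl-provider lookup might filter candidates. The sketch below is an assumption about the shape of that logic, not the real `WebSearchProvider` ABC: `SketchProvider` and `first_crawl_provider` are hypothetical, and the flags given to exa are illustrative only. The Tavily and Firecrawl crawl flags follow the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SketchProvider:
    """Hypothetical stand-in for a provider that advertises capabilities."""

    name: str
    search: bool = False
    extract: bool = False
    crawl: bool = False

    def supports_search(self) -> bool:
        return self.search

    def supports_extract(self) -> bool:
        return self.extract

    def supports_crawl(self) -> bool:
        return self.crawl


def first_crawl_provider(
    candidates: Iterable[SketchProvider],
) -> Optional[SketchProvider]:
    """Return the first candidate, in the given order, that can crawl."""
    for provider in candidates:
        if provider.supports_crawl():
            return provider
    return None


# Candidates listed in preference order; exa's flags are illustrative.
firecrawl = SketchProvider("firecrawl", search=True, extract=True, crawl=True)
tavily = SketchProvider("tavily", search=True, extract=True, crawl=True)
exa = SketchProvider("exa", search=True)

chosen = first_crawl_provider([exa, tavily, firecrawl])
print(chosen.name if chosen else None)
```

With more than one crawl-capable backend, the order of the candidate list decides the result, which is why the docstrings that named Tavily as the only crawler needed updating once Firecrawl advertised the same flag.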