Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-05-18 04:41:56 +00:00)
refactor(web): dispatch all three tools through web_search_registry
Cuts over web_search_tool, web_extract_tool, and web_crawl_tool in
tools/web_tools.py to dispatch through agent.web_search_registry
instead of the legacy hardcoded if-elif backend chains.
Per-tool changes:

web_search_tool (sync)
  Replace 5 backend branches (parallel, exa, registry-3-providers,
  tavily, firecrawl-fallthrough) with a single registry path:
    1. _get_search_backend() resolves the configured name
    2. _wsp_get_provider(name) for explicit-config-wins semantics
    3. get_active_search_provider() fallback for a typo or unknown name
    4. provider.search(query, limit), which is sync for all 7 providers
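
The resolution order above can be sketched roughly as follows. This is a minimal illustration, not the code in tools/web_tools.py: StubProvider, REGISTRY, dispatch_search, and MAX_LIMIT are hypothetical names, and the cap of 100 is inferred only from the updated test, which expects limit=500 to reach the provider as 100.

```python
from typing import Dict, Optional

# Assumption: upper bound inferred from the updated test (500 -> 100).
MAX_LIMIT = 100


class StubProvider:
    """Hypothetical stand-in for a registered web search provider."""

    def __init__(self, name: str) -> None:
        self.name = name

    def search(self, query: str, limit: int) -> dict:
        # Echo the arguments so the caller can see what reached the backend.
        return {"success": True, "provider": self.name, "query": query, "limit": limit}


# Hypothetical registry contents, standing in for agent.web_search_registry.
REGISTRY: Dict[str, StubProvider] = {
    "parallel": StubProvider("parallel"),
    "firecrawl": StubProvider("firecrawl"),
}


def get_provider(name: str) -> Optional[StubProvider]:
    """Explicit lookup: returns None for a typo or unknown name."""
    return REGISTRY.get(name)


def get_active_search_provider() -> StubProvider:
    """Fallback; choosing firecrawl here is an assumption of this sketch."""
    return REGISTRY["firecrawl"]


def dispatch_search(configured_name: str, query: str, limit: int) -> dict:
    limit = min(limit, MAX_LIMIT)            # clamp before reaching the backend
    provider = get_provider(configured_name)  # explicit config wins
    if provider is None:                      # typo / unknown name
        provider = get_active_search_provider()
    return provider.search(query, limit)
```

With this shape, a misspelled backend name degrades to the active provider instead of raising, while a correctly configured name always takes precedence.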
web_extract_tool (async)
  Replace 5 branches (parallel-async, exa-sync, tavily-sync,
  search-only-error, firecrawl-perurl-loop) with:
    1. Same provider resolution as search.
    2. When the configured backend IS registered but doesn't support
       extract (search-only providers like brave-free), surface a
       typed "search-only" error matching the legacy text, because
       tests assert that wording.
    3. inspect.iscoroutinefunction(provider.extract) detects sync vs
       async: parallel + firecrawl are async; exa + tavily are sync.
       Sync extracts run in asyncio.to_thread() so they don't block
       the event loop.
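
A minimal sketch of the sync-vs-async dispatch described in step 3. The provider classes and the call_extract helper are hypothetical; only inspect.iscoroutinefunction and asyncio.to_thread (Python 3.9+) come from the commit message.

```python
import asyncio
import inspect
from typing import Any, List


class SyncExtractProvider:
    """Hypothetical provider whose extract() is a plain blocking function."""

    def extract(self, urls: List[str]) -> List[dict]:
        return [{"url": u, "mode": "sync"} for u in urls]


class AsyncExtractProvider:
    """Hypothetical provider whose extract() is a coroutine function."""

    async def extract(self, urls: List[str]) -> List[dict]:
        return [{"url": u, "mode": "async"} for u in urls]


async def call_extract(provider: Any, urls: List[str]) -> List[dict]:
    # Coroutine functions are awaited directly on the running loop.
    if inspect.iscoroutinefunction(provider.extract):
        return await provider.extract(urls)
    # Blocking implementations run in a worker thread so the event loop
    # stays responsive while they wait on the network.
    return await asyncio.to_thread(provider.extract, urls)
```

Checking the bound method at call time means a provider can switch between sync and async implementations without any change to the dispatcher.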
web_crawl_tool (async)
  Replace the tavily-specific branch + search-only-error block with:
    1. _wsp_get_provider(backend): explicit config first
    2. Search-only typed error when the configured name doesn't
       support crawl (matches legacy phrasing)
    3. get_active_crawl_provider() fallback otherwise
    4. provider.crawl(url, **kwargs), with the same async-or-sync
       dispatch as extract
    5. Response post-processing (LLM summarization, trimming) stays
       unchanged; it's not provider-specific.
  When no plugin advertises supports_crawl, the tool falls through to
  the existing Firecrawl-via-web-summarize path below (unchanged).
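
The crawl resolution, including the search-only error and the fall-through case, might look roughly like this. Everything here is a hypothetical sketch: the class names, resolve_crawl_provider, and the error wording are illustrative only, and the real legacy error text is not reproduced.

```python
import json
from typing import Callable, Dict, Optional, Tuple


class SearchOnlyProvider:
    """Hypothetical search-only provider (for example, brave-free)."""
    name = "brave-free"

    def supports_crawl(self) -> bool:
        return False


class CrawlCapableProvider:
    """Hypothetical provider that advertises crawl support."""
    name = "tavily"

    def supports_crawl(self) -> bool:
        return True

    def crawl(self, url: str, **kwargs) -> dict:
        return {"results": [{"url": url}]}


def resolve_crawl_provider(
    configured: str,
    registry: Dict[str, object],
    get_active_crawl_provider: Callable[[], Optional[object]],
) -> Tuple[Optional[object], Optional[str]]:
    """Return (provider, error_json); at most one of the two is set."""
    provider = registry.get(configured)       # explicit config first
    if provider is not None:
        if not provider.supports_crawl():
            # Placeholder wording, NOT the legacy text the tests assert.
            error = json.dumps({
                "success": False,
                "error": f"{configured} is a search-only backend and cannot crawl",
            })
            return None, error
        return provider, None
    # Unknown name: use whichever plugin advertises crawl support. A None
    # result means the caller falls through to the existing Firecrawl path.
    return get_active_crawl_provider(), None
```

Returning the error as a value rather than raising keeps the dispatcher's response shape uniform for the caller.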
Test updates (2 tests in tests/tools/test_web_tools_config.py):
  - test_web_search_clamps_limit_before_backend_call:
      patch("tools.web_tools._parallel_search") -> patch the registry
      provider returned by agent.web_search_registry.get_provider
  - test_search_error_response_does_not_expose_diagnostics:
      patch("tools.web_tools._get_firecrawl_client") -> same pattern
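
One detail of the fake-provider pattern is worth a small standalone illustration: in unittest.mock, the name= constructor argument labels the mock for its repr and does not set a .name attribute, which is presumably why the updated tests assign fake_provider.name separately. The snippet below is a sketch using only unittest.mock; the variable names mirror the tests but nothing here touches the real registry.

```python
from unittest.mock import MagicMock

# name= only labels the mock (it shows up in repr); it does not create a
# string-valued .name attribute.
fake_provider = MagicMock(name="ParallelWebSearchProvider")
name_before = fake_provider.name      # an auto-created child mock, not a str

# Assigning after construction sets a real attribute the dispatcher can read.
fake_provider.name = "parallel"

# Replace .search with a dedicated mock so call arguments can be asserted.
fake_search = MagicMock(return_value={"success": True, "data": {"web": []}})
fake_provider.search = fake_search

result = fake_provider.search("docs", 100)
fake_search.assert_called_once_with("docs", 100)
```

An alternative with the same effect is fake_provider.configure_mock(name="parallel").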

Tests unchanged (still pass):
  - All TestXBackendWiring classes (they test _get_backend /
    _is_backend_available config resolution, independent of dispatch)
  - All TestXSearchOnlyErrors classes (they test the search-only error
    path via web_extract_tool / web_crawl_tool; error text preserved)
  - 141 passing web tests total, 0 regressions.

Dead-code cleanup is deferred to a follow-up commit so this diff stays
focused on the cutover. After this commit:
  - tools.web_tools._exa_search / _exa_extract / _parallel_search /
    _parallel_extract / _tavily_request / _normalize_tavily_* /
    _get_firecrawl_client / _extract_web_search_results /
    _extract_scrape_payload / _to_plain_object / _normalize_result_list
    are no longer called by the dispatchers, but still exist.
  - The config-resolution layer (_get_backend, _is_backend_available,
    _is_tool_gateway_ready, _has_direct_firecrawl_config) IS still in
    use and must stay.
  - The Firecrawl proxy and check_firecrawl_api_key are still imported
    by integration tests and patched by unit tests, so they must stay
    (or be re-exported from the plugin).

Parent: 143184e943
Commit: b05253ceed
2 changed files with 175 additions and 238 deletions

@@ -485,15 +485,28 @@ class TestWebSearchSchema:
     def test_web_search_clamps_limit_before_backend_call(self):
         import tools.web_tools
 
-        with patch("tools.web_tools._get_backend", return_value="parallel"), \
-             patch("tools.web_tools._parallel_search", return_value={"success": True, "data": {"web": []}}) as mock_search, \
+        # After the web-provider plugin migration, _parallel_search lives in
+        # plugins.web.parallel.provider.ParallelWebSearchProvider.search; the
+        # tool dispatcher resolves a provider from the registry and calls
+        # provider.search(query, limit). Mock the provider lookup so we can
+        # assert the limit is clamped before reaching the backend.
+        fake_search = MagicMock(return_value={"success": True, "data": {"web": []}})
+        fake_provider = MagicMock(
+            name="ParallelWebSearchProvider",
+            supports_search=MagicMock(return_value=True),
+        )
+        fake_provider.search = fake_search
+        fake_provider.name = "parallel"
+
+        with patch("tools.web_tools._get_search_backend", return_value="parallel"), \
+             patch("agent.web_search_registry.get_provider", return_value=fake_provider), \
              patch("tools.interrupt.is_interrupted", return_value=False), \
              patch.object(tools.web_tools._debug, "log_call"), \
              patch.object(tools.web_tools._debug, "save"):
             result = json.loads(tools.web_tools.web_search_tool("docs", limit=500))
 
         assert result == {"success": True, "data": {"web": []}}
-        mock_search.assert_called_once_with("docs", 100)
+        fake_search.assert_called_once_with("docs", 100)
 
 
 class TestWebSearchErrorHandling:
@@ -502,11 +515,19 @@ class TestWebSearchErrorHandling:
     def test_search_error_response_does_not_expose_diagnostics(self):
         import tools.web_tools
 
-        firecrawl_client = MagicMock()
-        firecrawl_client.search.side_effect = RuntimeError("boom")
+        # After the web-provider plugin migration, the firecrawl client lives
+        # at plugins.web.firecrawl.provider._get_firecrawl_client. We mock the
+        # registry's get_provider to return a fake provider whose .search()
+        # raises so we can verify error sanitization.
+        fake_provider = MagicMock(
+            name="FirecrawlWebSearchProvider",
+            supports_search=MagicMock(return_value=True),
+        )
+        fake_provider.search.side_effect = RuntimeError("boom")
+        fake_provider.name = "firecrawl"
 
-        with patch("tools.web_tools._get_backend", return_value="firecrawl"), \
-             patch("tools.web_tools._get_firecrawl_client", return_value=firecrawl_client), \
+        with patch("tools.web_tools._get_search_backend", return_value="firecrawl"), \
+             patch("agent.web_search_registry.get_provider", return_value=fake_provider), \
              patch("tools.interrupt.is_interrupted", return_value=False), \
              patch.object(tools.web_tools._debug, "log_call") as mock_log_call, \
              patch.object(tools.web_tools._debug, "save"):