fix(web): preserve top-level error envelope on unconfigured systems

Surfaced by local E2E behavior-parity testing of PR vs origin/main: the
plugin-migrated dispatchers were quietly changing the error envelope
shape returned to function-calling models on unconfigured systems.

Two findings, both from per-result error wrapping bleeding into the
pre-flight configuration error path:

1. **search**: ``firecrawl.search()`` caught the
   ``ValueError("Web tools are not configured...")`` from
   ``_get_firecrawl_client()`` and returned it as
   ``{"success": False, "error": ...}``, losing the legacy
   ``{"error": "Error searching web: ..."}`` envelope that
   ``tool_error()`` emits on main. Models that special-case the
   ``error`` key still detect the failure, but the prefix is part of
   the legacy contract some users rely on.

2. **crawl**: ``firecrawl.crawl()`` caught the same pre-flight
   ``ValueError`` and wrapped it as a per-page error inside
   ``results[0]``. Main short-circuits on ``check_firecrawl_api_key()``
   BEFORE dispatching, so its unconfigured response is
   ``{"success": False, "error": "web_crawl requires Firecrawl..."}``
   at the top level. The PR's per-page burying hid the failure inside
   ``results[]`` where models that check ``result.get("error")`` would
   miss it.
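
To illustrate the crawl finding, here is a minimal sketch of why a caller
that only inspects the top-level ``error`` key misses the buried failure.
The payloads are hypothetical: the message strings are abbreviated, and
the per-page field names (``url``, ``error``) are assumptions rather than
the provider's confirmed schema.

```python
import json

# Hypothetical payloads modeled on the two shapes described above.
# Top-level envelope (what main emits on an unconfigured system):
top_level = json.dumps(
    {"success": False, "error": "web_crawl requires Firecrawl..."}
)
# Per-page burying (assumed field names inside results[0]):
buried = json.dumps(
    {
        "results": [
            {
                "url": "https://example.com",
                "error": "Web tools are not configured...",
            }
        ]
    }
)


def detects_failure(raw: str) -> bool:
    """Mimic a caller that only checks the top-level ``error`` key."""
    result = json.loads(raw)
    return result.get("error") is not None


print(detects_failure(top_level))  # True
print(detects_failure(buried))     # False
```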

Fix:
- ``plugins/web/firecrawl/provider.py``: pull
  ``_get_firecrawl_client()`` outside the broad ``try`` in
  ``search()``. Pre-flight ``ValueError`` / ``ImportError`` propagate
  to the dispatcher's top-level exception handler. In-flight SDK
  errors still get wrapped as ``{"success": False, ...}``.
- ``tools/web_tools.py``: mirror main's upstream availability gate in
  ``web_crawl_tool``. When the resolved crawl provider's
  ``is_available()`` returns ``False``, short-circuit BEFORE dispatching
  with the same top-level error shape main emits.
- ``tests/tools/test_web_providers.py``: 2 regression tests
  (``TestUnconfiguredErrorEnvelopeParity``) lock in the behavior so
  future plugin work can't undo this.
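
The ``search()`` change can be sketched roughly as below. This is an
illustration under stated assumptions, not the provider's actual code:
``_get_client``, ``search_before``, ``search_after`` and ``dispatch`` are
hypothetical stand-ins, and the exception message is abbreviated.

```python
def _get_client():
    # Hypothetical stand-in for the pre-flight client factory on an
    # unconfigured system: it always raises.
    raise ValueError("Web tools are not configured")


def search_before(query):
    # Old shape: the pre-flight call sits inside the broad ``try``, so
    # its ValueError is wrapped like an in-flight SDK error.
    try:
        client = _get_client()
        return {"success": True, "data": client.search(query)}
    except Exception as exc:
        return {"success": False, "error": str(exc)}


def search_after(query):
    # New shape: the pre-flight call is outside the ``try`` and
    # propagates; only in-flight errors are wrapped.
    client = _get_client()
    try:
        return {"success": True, "data": client.search(query)}
    except Exception as exc:
        return {"success": False, "error": str(exc)}


def dispatch(search_fn, query):
    # Hypothetical dispatcher with a top-level handler that emits the
    # legacy prefixed envelope.
    try:
        return search_fn(query)
    except Exception as exc:
        return {"error": f"Error searching web: {exc}"}


print(dispatch(search_before, "q"))
print(dispatch(search_after, "q"))
```

With the client lookup hoisted, the configuration error reaches the
dispatcher's handler and comes back in the prefixed ``{"error": ...}``
envelope instead of the per-result ``{"success": False, ...}`` one.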

Verified via local subprocess-based parity test (14/14 scenarios match
origin/main shape exactly) and full 210/210 web test suite green.
kshitijk4poor 2026-05-14 02:06:45 +05:30 committed by Teknium
parent 657e6d87cc
commit 4ca5e72444
3 changed files with 105 additions and 11 deletions

@@ -1192,6 +1192,25 @@ async def web_crawl_tool(
    if crawl_provider is None:
        crawl_provider = get_active_crawl_provider()
    # Mirror main's upstream availability gate: when the resolved
    # provider is configured-but-unavailable (e.g. firecrawl without
    # FIRECRAWL_API_KEY), short-circuit BEFORE we dispatch so the
    # error envelope matches the legacy top-level shape
    # ``{"success": False, "error": "..."}`` rather than burying the
    # configuration message inside a per-page ``results[]`` entry.
    if crawl_provider is not None and not crawl_provider.is_available():
        return json.dumps(
            {
                "success": False,
                "error": (
                    "web_crawl requires Firecrawl. Set FIRECRAWL_API_KEY, "
                    f"FIRECRAWL_API_URL{_firecrawl_backend_help_suffix()}, "
                    "or use web_search + web_extract instead."
                ),
            },
            ensure_ascii=False,
        )
    if crawl_provider is not None:
        # Ensure URL has protocol
        if not url.startswith(('http://', 'https://')):