hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-02 12:13:05 +00:00

Author	SHA1	Message	Date
Teknium	ee8cbfdc03	feat(web_extract): truncate-and-store instead of LLM summarization (#54843 ) * feat(web_extract): truncate-and-store instead of LLM summarization web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate- stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost. Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them. Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty). Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page). web_search is unchanged. No own scraper added; no changes to web_search. * fix(web_extract): add char_limit to execute_code web_extract stub The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails — the stub schema must cover every real schema param.	2026-06-29 10:00:49 -07:00
Teknium	f3fe99863d	revert(web): remove keyless Parallel search fallback (#46350 ) Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced. Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.	2026-06-14 16:47:57 -07:00
Matt Harris	e0e2571711	feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed Make Parallel the web search/extract backend with a zero-setup free tier: - Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel becomes the default backend when no other web credentials are configured (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing web_search/web_extract tools are the only tools registered. - Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints (client.search / client.extract with advanced_settings.full_content) — no beta. Bumps parallel-web 0.4.2 -> 0.6.0. - Attribution: on the free path only, results carry provider/attribution and the CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is unbranded. - Selection/registration: web tools register unconditionally (free MCP backstop) while check_web_api_key remains a real usability probe; explicit per-capability backends are honored (so misconfig surfaces) rather than masked by the fallback. Tested: live web_search/web_extract against search.parallel.ai in keyless and keyed modes; unit suites for the MCP client, backend selection, and display labeling; full agent run shows the "Parallel search" label on the free path.	2026-06-10 19:54:38 -07:00
briandevans	6e179c44b1	fix(web): ensure plugin discovery before web_*_tool registry lookups Web search/extract dispatch read agent.web_search_registry before plugin discovery had run, so in any process that hadn't imported model_tools.py (subprocess agent runs, delegate children, standalone scripts) the registry was empty: get_provider('firecrawl') returned None and the dispatcher emitted the misleading 'No web extract provider configured' error even with web.extract_backend set and FIRECRAWL_API_KEY exported. Adds an idempotent _ensure_web_plugins_loaded() helper (mirrors tools.browser_tool._ensure_browser_plugins_loaded) and calls it at the top of both the web_search_tool and web_extract_tool dispatch sites before the registry lookup. Fixes #27580. Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>	2026-05-29 04:00:00 -07:00
kshitijk4poor	66827f8947	chore: prune unused imports and duplicate import redefinitions Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves	2026-05-28 22:26:25 -07:00
Teknium	5e1f793430	chore(web): remove web_crawl tool + provider crawl plumbing (#33824 ) The web_crawl_tool() function was an orphan — no model schema registered it, no skill or CLI command called it, and the agent had no way to invoke it. PR #32608 proposed wiring it up as a model-callable tool; we've decided not to expose crawl as a separate capability since web_search + web_extract cover the use cases we want models to have. Removed: - tools/web_tools.py: web_crawl_tool() (~230 LOC) - plugins/web/firecrawl/provider.py: supports_crawl() + crawl() - plugins/web/tavily/provider.py: supports_crawl() + crawl() - plugins/web/xai/provider.py: supports_crawl() override - agent/web_search_provider.py: supports_crawl() + crawl() ABC methods - agent/web_search_registry.py: get_active_crawl_provider() + the 'crawl' branch in _resolve() - agent/display.py: web_crawl tool-progress rendering - hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools - tools/website_policy.py: stale comment reference - Tests: removed TestWebCrawlTavily class, the two website-policy web_crawl tests, the searxng/ddgs/brave-free crawl-error tests, the integration test_web_crawl method, and the test_unconfigured_crawl_emits_top_level_error test. Trimmed the capability-flag parametrize list and the WebSearchProvider ABC conformance tests. - Docs: trimmed the Crawl column from capability tables in both EN and zh-Hans, updated the developer-guide ABC table. Net: 25 files, +115/-1067. Closes #33762 (the schema-text bug only existed if #32608 landed). Supersedes #32608.	2026-05-28 04:52:42 -07:00
ethernet	48be2e0e4d	test: use subprocesses for each test file (#29016 ) * ci(tests): install ripgrep from prebuilt tarball instead of apt apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu runners (the apt-get update against archive.ubuntu.com is the slow part; ripgrep itself is small). Switching to the upstream musl binary tarball cuts the step to a few seconds. - Pinned to ripgrep 15.1.0 with sha256 verification (same hash as published in the releases sha256 sidecar file). - Drops the `rg` binary into /usr/local/bin so it is on PATH for every subsequent step without GITHUB_PATH manipulation. - Applied to both the test and e2e jobs in tests.yml. * fix(cli): compile syntax check to tempdir, not source __pycache__ `_validate_critical_files_syntax` runs `py_compile.compile()` on each critical bootstrap file after a successful `git pull`. The default `py_compile` writes the resulting `.pyc` next to the source under `__pycache__/`, which causes two real problems: 1. Parallel test workers walking the same source tree (e.g. running the suite under per-file process isolation) can race against each other on the `__pycache__` write — manifests as flaky 'directory not empty' errors during teardown. 2. In production, the post-pull syntax check leaves a `.pyc` behind that the next interpreter run might pick up — fine when the interpreter version matches, sketchy if it doesn't. Fix: write the compiled output to a `tempfile.TemporaryDirectory()` that's discarded on function exit. We only care about the compile-or-not signal, not the artifact. * test(runner): per-file process isolation, drop manual state reset + xdist Replace fragile manual _reset_module_state test fixtures with robust per-file subprocess isolation. Each test file runs in a fresh `python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist, no custom pytest plugin, no shared worker state. Key changes: * scripts/run_tests_parallel.py — new runner: discovers test files, runs N in parallel via ThreadPoolExecutor, captures stdout per file, treats exit code 5 (no tests collected) as pass, kills all children on exit. Change from cpu_count to cpu_count2. The runner is I/O-bound (waiting on subprocess.communicate() from pytest children) The parent process does almost no CPU work, so 2x oversubscription keeps more pipes full. When a file fails, immediately show the last 30 lines of pytest output (stack traces + FAILED summary) plus a ready-to-copy repro command: python -m pytest tests/agent/test_auxiliary_client.py scripts/run_tests.sh — delegates to run_tests_parallel.py * .github/workflows/tests.yml — test step: python scripts/run_tests_parallel.py * pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts * tests/conftest.py — remove ~200 lines of manual state-reset fixtures * AGENTS.md — update Testing section for per-file design * test(runner): speed gateway test antipattern scan up * fix(test): web search provider plugin test missing xai * fix(tests): make 14 test files pass under per-file subprocess isolation Tests that relied on cross-file state pollution from xdist workers fail when run in isolation (per-file subprocess model). Root causes and fixes: Tool registry not populated: - test_video_generation_tool_surface_matrix: add discover_builtin_tools() - test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures registering all 8 bundled web providers, reset after each test - test_website_policy: same provider registration pattern - test_web_tools_tavily: same pattern across 3 dispatch test classes - Also add is_safe_url/check_website_access mocks where SSRF check blocks example.com (DNS resolution fails in isolated envs) Stale check_fn cache: - test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache() in both kanban guidance tests (prior test cached False for kanban_show) - test_discord_tool: cache invalidation in setup/teardown - test_homeassistant_tool: invalidate_check_fn_cache() before registry queries Module-level state pollution: - test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache - test_skill_commands: set_session_vars() instead of patch.dict(os.environ) (ContextVar takes precedence over os.environ) - test_dm_topics: overwrite sys.modules + separate telegram.constants mock + force-reimport of gateway.platforms.telegram - test_terminal_tool_requirements: removed duplicate class declaration, autouse _clear_caches fixture * change(tests): run_tests.sh explicitly includes env vars instead of manually dropping some vars, now we just only include some * fix(tests): 5 more isolation/NixOS fixes - test_approval_plugin_hooks: isolate HERMES_HOME so real user's command_allowlist doesn't short-circuit the approval path - test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum (feature not merged on this branch) - test_write_deny: test systemd prefix against tmp_path instead of /etc/systemd which resolves to /nix/store on NixOS - test_pty_bridge: use shutil.which('cat') instead of /bin/cat (doesn't exist on NixOS) - profiles.py: rmtree onexc handler chmod's parent dirs too, fixing profile deletion when copytree preserved read-only modes from nix store * fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client * fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor * fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test * fix: address PR #29016 review feedback - Remove tracked .pytest-cache/ artifact and add to .gitignore - Fix stale 'xdist worker' comment in conftest.py - Deduplicate web provider registration into tests/tools/conftest.py shared helper (register_all_web_providers), replacing 8 copy-pasted blocks across 6 test files - Update PR description: remove stale recovered-test-files claim, fix worker count to match code (cpu_count2) fix: eliminate race in stale-cache achievements test The background scan thread could complete and overwrite _SNAPSHOT_CACHE before evaluate_all() returned the stale data — only 10 fake sessions made the scan finish instantly. Added scan_delay param to _FakeSessionDB and set it to 2s in the stale-cache test so the background thread can't win the race.	2026-05-21 16:40:04 +05:30
kshitijk4poor	4ca5e72444	fix(web): preserve top-level error envelope on unconfigured systems Surfaced by local E2E behavior-parity testing of PR vs origin/main: the plugin-migrated dispatchers were quietly changing the error envelope shape returned to function-calling models on unconfigured systems. Two findings, both from per-result error wrapping bleeding into the pre-flight configuration error path: 1. search: ``firecrawl.search()`` caught the ``ValueError("Web tools are not configured...")`` from ``_get_firecrawl_client()`` and returned it as ``{"success": False, "error": ...}``, losing the legacy ``{"error": "Error searching web: ..."}`` envelope that ``tool_error()`` emits on main. Models that special-case the ``error`` key still detect the failure, but the prefix is part of the legacy contract some users rely on. 2. crawl: ``firecrawl.crawl()`` caught the same pre-flight ``ValueError`` and wrapped it as a per-page error inside ``results[0]``. Main short-circuits on ``check_firecrawl_api_key()`` BEFORE dispatching, so its unconfigured response is ``{"success": False, "error": "web_crawl requires Firecrawl..."}`` at the top level. The PR's per-page burying hid the failure inside ``results[]`` where models that check ``result.get("error")`` would miss it. Fix: - ``plugins/web/firecrawl/provider.py``: pull ``_get_firecrawl_client()`` outside the broad ``try`` in ``search()``. Pre-flight ``ValueError`` / ``ImportError`` propagate to the dispatcher's top-level exception handler. In-flight SDK errors still get wrapped as ``{"success": False, ...}``. - ``tools/web_tools.py``: mirror main's upstream availability gate in ``web_crawl_tool``. When the resolved crawl provider is ``is_available()==False``, short-circuit BEFORE dispatching with the same top-level error shape main emits. - ``tests/tools/test_web_providers.py``: 2 regression tests (``TestUnconfiguredErrorEnvelopeParity``) lock in the behavior so future plugin work can't undo this. Verified via local subprocess-based parity test (14/14 scenarios match origin/main shape exactly) and full 210/210 web test suite green.	2026-05-13 22:31:28 -07:00
kshitijk4poor	39b4ebfcea	refactor(web): delete legacy tools/web_providers/ directory + migrate ABC tests Removes the legacy in-tree provider scaffolding that PR #25182 fully replaced with the plugin architecture: tools/web_providers/__init__.py (6 lines) tools/web_providers/base.py (89 lines — old ABCs) tools/web_providers/ARCHITECTURE.md (73 lines — old design doc) These were the staging-ground ABCs and provider modules that the plugin migration absorbed. All seven web providers now implement the single :class:`agent.web_search_provider.WebSearchProvider` ABC and live under ``plugins/web/<vendor>/``. Nothing else in the tree imports ``tools.web_providers`` — verified via grep before deletion. Test migration (tests/tools/test_web_providers.py) -------------------------------------------------- Rewrote ``TestWebProviderABCs`` to test the new unified ABC at :mod:`agent.web_search_provider`: - test_cannot_instantiate_abc_directly — abstract ``name`` + ``is_available`` - test_concrete_search_only_provider_works — exercise default ``supports_extract=False`` / ``supports_crawl=False`` flags - test_concrete_multi_capability_provider_works — exercise all three capabilities, async extract supported (declared sync here for simplicity; real plugins like parallel + firecrawl use async) - test_search_only_provider_skips_extract_and_crawl — verify ``supports_*()`` flags default to False so search-only providers don't have to implement extract() or crawl() The 9 other tests in the file (per-capability backend selection, DEFAULT_CONFIG merge, dispatcher routing) test public helpers in ``tools.web_tools`` that still exist and pass unchanged. agent/web_search_provider.py docstring updated to reflect that the legacy ABCs no longer exist; the response-shape contract is preserved bit-for-bit so external consumers see no behavioral change. Net diff -------- - tools/web_providers/ removed (-168 lines) - tests/tools/test_web_providers.py rewritten ABC section (+78/-30 net, same coverage, new API) - agent/web_search_provider.py docstring (-3/+5 lines) Verified -------- - 173/173 targeted web tests pass - 12/12 ABC contract tests pass with the new interface - No remaining grep hits for ``tools.web_providers`` outside of intentional historical references in plugin docstrings.	2026-05-13 22:31:28 -07:00
kshitij	cd2cbc73b7	refactor(web): per-capability backend selection for search/extract split Introduce the foundation for independently selecting web search and extract backends — enabling future combinations like SearXNG for search + Firecrawl for extract. Architecture: - tools/web_providers/base.py: WebSearchProvider and WebExtractProvider ABCs with normalized result contracts (mirrors CloudBrowserProvider) - tools/web_tools.py: _get_search_backend() and _get_extract_backend() read per-capability config keys, fall through to shared web.backend - hermes_cli/config.py: web.search_backend and web.extract_backend in DEFAULT_CONFIG (empty = inherit from web.backend) Behavioral change: - web_search_tool() now dispatches via _get_search_backend() - web_extract_tool() now dispatches via _get_extract_backend() - When per-capability keys are empty (default), behavior is identical to before — _get_search_backend() falls through to _get_backend() This is purely structural — no new backends are added. SearXNG and other search-only/extract-only providers can now be added as simple drop-in modules in follow-up PRs. 12 new tests, 49 existing tests pass with zero regressions. Ref: #19198	2026-05-06 09:16:25 -07:00

10 commits