mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-19 04:52:06 +00:00

docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)

* docs: deep audit — fix stale config keys, missing commands, and registry drift

Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:

  hermes_cli/commands.py    COMMAND_REGISTRY (slash commands)
  hermes_cli/auth.py        PROVIDER_REGISTRY (providers)
  hermes_cli/config.py      DEFAULT_CONFIG (config keys)
  toolsets.py               TOOLSETS (toolsets)
  tools/registry.py         get_all_tool_names() (tools)
  python -m hermes_cli.main <subcmd> --help (CLI args)

reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
  add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
  lists to match --help output (status/logout/spotify, login, archive/prune/
  list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
  correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
  'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
  add missing 'kanban' and 'video' toolset sections, fix MCP example to use
  the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
  row, add missing 'kanban' and 'video' toolset rows, drop the stale
  '38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
  fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
  one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
  via OpenRouter), refresh the OpenAI model list.

getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
  fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
  back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
  'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
  point at the snapshot/quick-snapshot flow; correct config key
  'updates.pre_update_backup' (was 'update.backup').

user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
  is the real key (not display.runtime_metadata_footer); checkpoints defaults
  enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
  exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
  kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
  not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
  OpenAI-compatible API server runs inside hermes gateway.

user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
  ./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
  model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
  not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
  modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
  that 'hermes honcho' subcommand is gated on memory.provider=honcho;
  reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.

Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).

* docs: round 2 audit fixes + regenerate skill catalogs

Follow-up to the previous commit on this branch:

Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
  voice-mode and ACP install commands rewritten — bare 'pip install ...'
  doesn't work for curl-installed setups (no pip on PATH, not in repo
  dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
  ".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
  'google/gemini-3-flash-preview' (the doc's own claimed default);
  actual default is empty (= use main model). Reworded as 'leave empty
  (default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
  that was missing from the table.

Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
  pages and both reference catalogs (skills-catalog.md,
  optional-skills-catalog.md). This adds the entries that were genuinely
  missing — productivity/teams-meeting-pipeline (bundled),
  optional/finance/* (entire category — 7 skills:
  3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
  merger-model, pptx-author), creative/hyperframes,
  creative/kanban-video-orchestrator, devops/watchers,
  productivity/shop-app, research/searxng-search,
  apple/macos-computer-use — and rewrites every other per-skill page from
  the current SKILL.md. Most diffs are tiny (one line of refreshed
  metadata).

Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
  shells that lag every newly-added skill page (pre-existing pattern).
  No regressions on any en/ page.

2026-05-09 13:19:51 -07:00


title	description	sidebar_label	sidebar_position
Web Search & Extract	Search the web, extract page content, and crawl websites with multiple backend providers — including free self-hosted SearXNG.	Web Search	6

Web Search & Extract

Hermes Agent includes two model-callable web tools backed by multiple providers:

web_search — search the web and return ranked results
web_extract — fetch and extract readable content from one or more URLs (with built-in deep-crawl support when the backend provides it)

By default, both tools share a single backend selection, which you choose via hermes tools or set directly in config.yaml. Recursive crawling capabilities (Firecrawl/Tavily) are exposed through web_extract rather than as a separate web_crawl tool.

Backends

Provider	Env Var	Search	Extract	Crawl	Free tier
Firecrawl (default)	`FIRECRAWL_API_KEY`	✔	✔	✔	500 credits/mo
SearXNG	`SEARXNG_URL`	✔	—	—	✔ Free (self-hosted)
Tavily	`TAVILY_API_KEY`	✔	✔	✔	1 000 searches/mo
Exa	`EXA_API_KEY`	✔	✔	—	1 000 searches/mo
Parallel	`PARALLEL_API_KEY`	✔	✔	—	Paid

Per-capability split: you can use different providers for search and extract independently — for example SearXNG (free) for search and Firecrawl for extract. See Per-capability configuration below.
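As a rough illustration of the table above, the sketch below transcribes the Search/Extract/Crawl columns into a lookup and checks whether a search/extract pairing is workable. The names `CAPABILITIES`, `supports`, and `check_split` are hypothetical and are not part of Hermes; the data is only what the table states.

```python
# Capability matrix transcribed from the Backends table (illustrative only).
CAPABILITIES = {
    "firecrawl": {"search", "extract", "crawl"},
    "searxng": {"search"},
    "tavily": {"search", "extract", "crawl"},
    "exa": {"search", "extract"},
    "parallel": {"search", "extract"},
}


def supports(provider: str, capability: str) -> bool:
    """Return True if the table lists `capability` for `provider`."""
    return capability in CAPABILITIES.get(provider, set())


def check_split(search_backend: str, extract_backend: str) -> list[str]:
    """Return the problems with a search/extract pairing (empty list if none)."""
    problems = []
    if not supports(search_backend, "search"):
        problems.append(f"{search_backend} cannot serve web_search")
    if not supports(extract_backend, "extract"):
        problems.append(f"{extract_backend} cannot serve web_extract")
    return problems
```

For example, `check_split("searxng", "firecrawl")` returns an empty list, while `check_split("searxng", "searxng")` reports that SearXNG cannot serve web_extract.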

:::tip Nous Subscribers
If you have a paid Nous Portal subscription, web search and extract are available through the Tool Gateway via managed Firecrawl, with no API key needed. Run hermes tools to enable it.
:::

Setup

Quick setup via `hermes tools`

Run hermes tools, navigate to Web Search & Extract, and pick a provider. The wizard prompts for the required URL or API key and writes it to your config.

hermes tools

Firecrawl (default)

Full-featured search, extract, and crawl. Recommended for most users.

# ~/.hermes/.env
FIRECRAWL_API_KEY=fc-your-key-here

Get a key at firecrawl.dev. The free tier includes 500 credits/month.

Self-hosted Firecrawl: Point at your own instance instead of the cloud API:

# ~/.hermes/.env
FIRECRAWL_API_URL=http://localhost:3002

When FIRECRAWL_API_URL is set, FIRECRAWL_API_KEY is optional, provided authentication is disabled on the self-hosted Firecrawl server (USE_DB_AUTHENTICATION=false in the server's environment).

SearXNG (free, self-hosted)

SearXNG is a privacy-respecting, open-source metasearch engine that aggregates results from 70+ search engines. No API key required — just point Hermes at a running SearXNG instance.

SearXNG is search-only — web_extract (including its crawl modes) requires a separate extract provider.

Option A — Self-host with Docker (recommended)

This gives you a private instance with no rate limits.

1. Create a working directory:

mkdir -p ~/searxng/searxng
cd ~/searxng

2. Write a docker-compose.yml:

# ~/searxng/docker-compose.yml
services:
  searxng:
    image: searxng/searxng:latest
    container_name: searxng
    ports:
      - "8888:8080"
    volumes:
      - ./searxng:/etc/searxng:rw
    environment:
      - SEARXNG_BASE_URL=http://localhost:8888/
    restart: unless-stopped

3. Start the container:

docker compose up -d

4. Enable the JSON API format:

SearXNG ships with JSON output disabled by default. Copy the generated config and enable it:

# Copy the auto-generated config out of the container
docker cp searxng:/etc/searxng/settings.yml ~/searxng/searxng/settings.yml

Open ~/searxng/searxng/settings.yml and find the formats block (around line 84):

# Before (default — JSON disabled):
formats:
  - html

# After (enable JSON for Hermes):
formats:
  - html
  - json

5. Restart to apply:

docker cp ~/searxng/searxng/settings.yml searxng:/etc/searxng/settings.yml
docker restart searxng

6. Verify it works:

curl -s "http://localhost:8888/search?q=test&format=json" | python3 -c \
  "import sys,json; d=json.load(sys.stdin); print(f'{len(d[\"results\"])} results')"

You should see something like 10 results. If you get a 403 Forbidden, JSON format is still disabled — recheck step 4.
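The same check can be scripted in Python using only the standard library. This is a minimal sketch, assuming the instance from the steps above is listening on http://localhost:8888; the helper names (`build_search_url`, `count_results`, `check_searxng`) are made up for illustration and are not Hermes APIs.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG JSON search URL from a base URL and a query string."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def count_results(body: str) -> int:
    """Count the entries in the "results" list of a SearXNG JSON response."""
    return len(json.loads(body).get("results", []))


def check_searxng(base_url: str = "http://localhost:8888") -> str:
    """Query the instance once and describe the outcome in one line."""
    try:
        url = build_search_url(base_url, "test")
        with urllib.request.urlopen(url, timeout=10) as resp:
            return f"{count_results(resp.read().decode('utf-8'))} results"
    except urllib.error.HTTPError as err:
        # HTTPError must be handled before the broader OSError below.
        if err.code == 403:
            return "403 Forbidden: JSON format is still disabled (recheck step 4)"
        return f"HTTP {err.code}"
    except OSError as err:
        # Covers URLError (connection refused, DNS failure) and timeouts.
        return f"connection error: {err}"
```

Calling `print(check_searxng())` after step 5 should report a result count; a 403 message points back at the formats block in settings.yml.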

7. Configure Hermes:

# ~/.hermes/.env
SEARXNG_URL=http://localhost:8888

Then select SearXNG as the search backend in ~/.hermes/config.yaml:

web:
  search_backend: "searxng"

Or set via hermes tools → Web Search & Extract → SearXNG.

Option B — Use a public instance

Public SearXNG instances are listed at searx.space. Filter by instances that have JSON format enabled (shown in the table).

# ~/.hermes/.env
SEARXNG_URL=https://searx.example.com

:::caution Public instances
Public instances have rate limits and variable uptime, and may disable JSON format at any time. For production use, self-hosting is strongly recommended.
:::

Pair SearXNG with an extract provider

SearXNG handles search; you need a separate provider for web_extract (including any deep-crawl modes). Use the per-capability keys:

# ~/.hermes/config.yaml
web:
  search_backend: "searxng"
  extract_backend: "firecrawl"   # or tavily, exa, parallel

With this config, Hermes uses SearXNG for all search queries and Firecrawl for URL extraction — combining free search with high-quality extraction.

Tavily

AI-optimised search, extract, and crawl with a generous free tier.

# ~/.hermes/.env
TAVILY_API_KEY=tvly-your-key-here

Get a key at app.tavily.com. The free tier includes 1 000 searches/month.

Exa

Neural search with semantic understanding. Good for research and finding conceptually related content.

# ~/.hermes/.env
EXA_API_KEY=your-exa-key-here

Get a key at exa.ai. The free tier includes 1 000 searches/month.

Parallel

AI-native search and extraction with deep research capabilities.

# ~/.hermes/.env
PARALLEL_API_KEY=your-parallel-key-here

Get access at parallel.ai.

Configuration

Single backend

Set one provider for all web capabilities:

# ~/.hermes/config.yaml
web:
  backend: "searxng"   # firecrawl | searxng | tavily | exa | parallel

Per-capability configuration

Use different providers for search vs extract. This lets you combine free search (SearXNG) with a paid extract provider, or vice versa:

# ~/.hermes/config.yaml
web:
  search_backend: "searxng"     # used by web_search
  extract_backend: "firecrawl"  # used by web_extract (and its deep-crawl modes)

When per-capability keys are empty, both fall through to web.backend. When web.backend is also empty, the backend is auto-detected from whichever API key/URL is present.

Priority order (per capability):

1. web.search_backend / web.extract_backend (explicit per-capability)
2. web.backend (shared fallback)
3. Auto-detect from environment variables
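The priority order can be sketched as a small function. This is an illustration of the documented fall-through only, not the actual Hermes implementation; `resolve_backend` and its parameters are hypothetical, and the auto-detected value is passed in rather than computed here.

```python
from typing import Optional


def resolve_backend(
    web_config: dict,
    capability: str,
    auto_detected: Optional[str] = None,
) -> Optional[str]:
    """Pick a backend for `capability` ("search" or "extract").

    `web_config` stands in for the `web:` mapping from config.yaml.
    """
    # 1. Explicit per-capability key, e.g. web.search_backend
    per_capability = web_config.get(f"{capability}_backend")
    if per_capability:
        return per_capability
    # 2. Shared fallback, web.backend
    shared = web_config.get("backend")
    if shared:
        return shared
    # 3. Whatever auto-detection found from the environment (may be None)
    return auto_detected
```

With `{"search_backend": "searxng", "backend": "tavily"}`, search resolves to searxng while extract falls through to tavily. Empty strings are treated the same as missing keys, matching the "when per-capability keys are empty" wording above.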

Auto-detection

If no backend is explicitly configured, Hermes picks the first available one based on which credentials are set:

Credential present	Auto-selected backend
`FIRECRAWL_API_KEY` or `FIRECRAWL_API_URL`	firecrawl
`PARALLEL_API_KEY`	parallel
`TAVILY_API_KEY`	tavily
`EXA_API_KEY`	exa
`SEARXNG_URL`	searxng
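A minimal sketch of this lookup follows. It assumes the rows of the table are checked top to bottom, which the text ("picks the first available one") suggests but does not state outright; `AUTO_DETECT_ORDER` and `auto_detect_backend` are hypothetical names, not Hermes internals.

```python
from typing import Mapping, Optional

# Rows of the auto-detection table, assumed to be in priority order.
AUTO_DETECT_ORDER = [
    (("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL"), "firecrawl"),
    (("PARALLEL_API_KEY",), "parallel"),
    (("TAVILY_API_KEY",), "tavily"),
    (("EXA_API_KEY",), "exa"),
    (("SEARXNG_URL",), "searxng"),
]


def auto_detect_backend(env: Mapping[str, str]) -> Optional[str]:
    """Return the first backend whose credential is set and non-empty."""
    for names, backend in AUTO_DETECT_ORDER:
        if any(env.get(name) for name in names):
            return backend
    return None
```

Passing `os.environ` (after `import os`) checks the current process environment. Under the ordering assumption, an environment with both EXA_API_KEY and SEARXNG_URL set would resolve to exa, so an explicit web.search_backend is the safer way to force SearXNG.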

Verify your setup

Run hermes setup to see which web backend is detected:

✅ Web Search & Extract (searxng)

Or check via the CLI:

# Activate the venv and run the web tools module directly
source ~/.hermes/hermes-agent/.venv/bin/activate
python -m tools.web_tools

This prints the active backend and its status:

✅ Web backend: searxng
   Using SearXNG (search only): http://localhost:8888

Troubleshooting

`web_search` returns `{"success": false}`

Check that SEARXNG_URL is reachable: curl -s "http://localhost:8888/search?q=test&format=json"
If you get HTTP 403, JSON format is disabled. Add json to the formats list in settings.yml and restart the container.
If you get a connection error, the container may not be running: docker ps | grep searxng

`web_extract` says "search-only backend"

SearXNG cannot extract URL content. Set web.extract_backend to a provider that supports extraction:

web:
  search_backend: "searxng"
  extract_backend: "firecrawl"  # or tavily / exa / parallel

SearXNG returns 0 results

Some public instances disable certain search engines or categories. Try:

A different query
A different public instance from searx.space
Self-hosting your own instance for reliable results

Rate limited on a public instance

Switch to a self-hosted instance (see Option A above). With Docker, your own instance has no rate limits.

Optional skill: `searxng-search`

For agents that need to use SearXNG via curl directly (e.g. as a fallback when the web toolset isn't available), install the searxng-search optional skill:

hermes skills install official/research/searxng-search

This adds a skill that teaches the agent how to:

Call the SearXNG JSON API via curl or Python
Filter by category (general, news, science, etc.)
Handle pagination and error cases
Fall back gracefully when SearXNG is unreachable
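For a sense of what such a direct call looks like, here is a hedged Python sketch. It is not the skill's own content; `searxng_query_url` and `search_or_empty` are hypothetical helpers. The `categories` and `pageno` query parameters are part of SearXNG's search API.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def searxng_query_url(
    base_url: str,
    query: str,
    category: Optional[str] = None,
    page: int = 1,
) -> str:
    """Build a SearXNG JSON search URL with optional category and page."""
    params = {"q": query, "format": "json"}
    if category:
        params["categories"] = category  # e.g. "general", "news", "science"
    if page > 1:
        params["pageno"] = page  # SearXNG pages are 1-based
    return f"{base_url.rstrip('/')}/search?{urllib.parse.urlencode(params)}"


def search_or_empty(base_url: str, query: str, **kwargs) -> list:
    """Return the result list, or an empty list if the instance is unreachable."""
    try:
        url = searxng_query_url(base_url, query, **kwargs)
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8")).get("results", [])
    except (OSError, ValueError):
        # OSError covers HTTP and connection failures; ValueError covers bad JSON.
        return []
```

A caller might use `search_or_empty("http://localhost:8888", "rust async", category="news", page=2)` and treat an empty list as the signal to try another source.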
