Closes the remaining gaps from PR #11562 that weren't covered by the core SearXNG integration landed in #20823.

- optional-skills/research/searxng-search/: installable skill with SKILL.md (curl-based usage, category support, Python example) and searxng.sh helper script for health checks and instance queries
- website/docs/user-guide/configuration.md: SearXNG added to the Web Search Backends section (5 backends, backend table, per-capability split config example, correct search-only note)
- website/docs/reference/environment-variables.md: SEARXNG_URL row
- website/docs/reference/optional-skills-catalog.md: searxng-search entry

The core SearXNG code, OPTIONAL_ENV_VARS, hermes tools picker, and tests were already on main via #20823. This commit is purely additive docs + the optional skill scaffold.

Credits from #11562 salvage:

- @w4rum: original _searxng_search structure
- @nathansdev: tools_config.py integration
- @moyomartin: category support and result formatting
- @0xMihai: config/env var approach
- @nicobailon: skill and documentation structure
- @searxng-fan: error handling patterns
- @local-first: self-hosted-first philosophy and docs
| name | description | version | author | license | metadata |
|---|---|---|---|---|---|
| searxng-search | Free meta-search via SearXNG: aggregates results from 70+ search engines. Self-hosted or use a public instance. No API key needed. Falls back automatically when the web search toolset is unavailable. | 1.0.0 | hermes-agent | MIT | |
SearXNG Search
Free meta-search using SearXNG — a privacy-respecting, self-hosted search aggregator that queries 70+ search engines simultaneously.
No API key required when using a public instance. Can also be self-hosted for full control. Automatically appears as a fallback when the main web search toolset (FIRECRAWL_API_KEY) is not configured.
Configuration
SearXNG requires a SEARXNG_URL environment variable pointing to your SearXNG instance:
# Public instances (no setup required)
SEARXNG_URL=https://searxng.example.com
# Self-hosted SearXNG
SEARXNG_URL=http://localhost:8888
If no instance is configured, this skill is unavailable and the agent falls back to other search options.
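A minimal sketch of how the configuration check above could look in Python. The helper name `resolve_searxng_url` and the decision to reject values without an http/https scheme are assumptions made for illustration, not part of the skill itself.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse


def resolve_searxng_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a normalized base URL from SEARXNG_URL, or None if it is unusable.

    Hypothetical helper: `env` defaults to os.environ and exists only so the
    function can be exercised without touching the real environment.
    """
    source = os.environ if env is None else env
    raw = (source.get("SEARXNG_URL") or "").strip()
    if not raw:
        return None  # unset or empty: the skill is unavailable
    parsed = urlparse(raw)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None  # e.g. "localhost:8888" without a scheme
    # Drop a trailing slash so callers can append "/search" safely.
    return raw.rstrip("/")


print(resolve_searxng_url({"SEARXNG_URL": "http://localhost:8888/"}))
```

Returning None rather than raising keeps the fallback to other search options a plain `if` at the call site.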
Detection Flow
Check what is actually available before choosing an approach:
# Check if SEARXNG_URL is set and the instance is reachable
curl -s --max-time 5 "${SEARXNG_URL}/search?q=test&format=json" | head -c 200
Decision tree:
- If SEARXNG_URL is set and the instance responds, use SearXNG
- If SEARXNG_URL is unset or unreachable, fall back to other available search tools
- If the user wants SearXNG specifically, help them set up an instance or find a public one
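The decision tree above can be sketched in Python as follows. The function names `probe_instance` and `choose_backend` and the returned labels "searxng" and "fallback" are hypothetical. The probe uses only the standard library and mirrors the curl check with a 5-second timeout.

```python
import json
import urllib.request
from typing import Callable, Optional


def probe_instance(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the instance answers a JSON search request."""
    url = f"{base_url.rstrip('/')}/search?q=test&format=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # A usable instance returns a JSON object.
            return isinstance(json.load(resp), dict)
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors, and timeouts;
        # ValueError covers malformed URLs and non-JSON bodies.
        return False


def choose_backend(
    base_url: Optional[str],
    probe: Callable[[str], bool] = probe_instance,
) -> str:
    """Pick "searxng" when the instance is set and reachable, else "fallback"."""
    if base_url and probe(base_url):
        return "searxng"
    return "fallback"
```

Passing the probe as a parameter is a design choice that lets the decision logic be checked without network access, for example `choose_backend("http://localhost:8888", probe=lambda url: True)`.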
Method 1: CLI via curl (Preferred)
Use curl via terminal to call the SearXNG JSON API. This avoids assuming any particular Python package is installed.
# Text search (JSON output)
curl -s --max-time 10 \
"${SEARXNG_URL}/search?q=python+async+programming&format=json&engines=google,bing&limit=10"
# With Safesearch off
curl -s --max-time 10 \
"${SEARXNG_URL}/search?q=example&format=json&safesearch=0"
# Specific categories (general, news, science, etc.)
curl -s --max-time 10 \
"${SEARXNG_URL}/search?q=AI+news&format=json&categories=news"
Common CLI Flags
| Flag | Description | Example |
|---|---|---|
| q | Query string (URL-encoded) | q=python+async |
| format | Output format: json, csv, rss | format=json |
| engines | Comma-separated engine names | engines=google,bing,ddg |
| limit | Max results per engine (default 10) | limit=5 |
| categories | Filter by category | categories=news,science |
| safesearch | 0=none, 1=moderate, 2=strict | safesearch=0 |
| time_range | Filter: day, week, month, year | time_range=week |
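For building these query strings programmatically, a small sketch using the standard library's `urlencode`, which handles URL-encoding of the query. The helper name `build_search_url` is hypothetical. It always sets `format=json` and joins list values with commas, as the table's examples do.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, **params) -> str:
    """Build a SearXNG /search URL with URL-encoded parameters.

    Hypothetical helper: list or tuple values (e.g. categories) are joined
    with commas; parameters set to None are skipped.
    """
    merged = {"q": query, "format": "json"}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        merged[key] = value
    return f"{base_url.rstrip('/')}/search?{urlencode(merged)}"


print(build_search_url(
    "http://localhost:8888",
    "python async",
    categories=["news", "science"],
    time_range="week",
))
```

Note that `urlencode` percent-encodes the comma in list values as `%2C`, which the server decodes back to a comma.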
Parsing JSON Results
# Extract titles and URLs from JSON
curl -s --max-time 10 "${SEARXNG_URL}/search?q=fastapi&format=json&limit=5" \
  | python3 -c "
import json, sys
data = json.load(sys.stdin)
for r in data.get('results', []):
    print(r.get('title', ''))
    print(r.get('url', ''))
    # content may be missing or null, so fall back to an empty string
    print((r.get('content') or '')[:200])
    print()
"
Returns per result: title, url, content (snippet), engine, parsed_url, img_src, thumbnail, author, published_date
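As an alternative to the inline one-liner, the per-result fields listed above could be parsed into a small typed structure. This is a sketch under assumptions: the `SearchHit` dataclass and `parse_hits` function are hypothetical names, only four of the listed fields are kept, and the sample payload is made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SearchHit:
    title: str
    url: str
    snippet: str
    engine: str


def parse_hits(payload: str, max_hits: int = 5, snippet_len: int = 200) -> List[SearchHit]:
    """Parse a SearXNG JSON response body into at most max_hits SearchHit objects."""
    data = json.loads(payload)
    hits: List[SearchHit] = []
    for item in data.get("results", []):
        if len(hits) >= max_hits:
            break
        if not item.get("url"):
            continue  # a result without a URL cannot be extracted later
        hits.append(SearchHit(
            title=item.get("title") or "",
            url=item["url"],
            snippet=(item.get("content") or "")[:snippet_len],
            engine=item.get("engine") or "",
        ))
    return hits


# Made-up sample payload, shaped like the fields described above.
sample = json.dumps({"results": [
    {"title": "FastAPI guide", "url": "https://example.com/fastapi",
     "content": "Deploying FastAPI", "engine": "google"},
    {"title": "entry without a url"},
]})
for hit in parse_hits(sample):
    print(hit.title, hit.url)
```

Truncating to `max_hits` on the client side keeps the result count predictable regardless of how many results the instance returns.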
Method 2: Python API via requests
Use the SearXNG REST API directly from Python with the requests library:
import os

import requests

# Base URL of the SearXNG instance, e.g. http://localhost:8888
base_url = os.environ.get("SEARXNG_URL", "").rstrip("/")
if not base_url:
    raise RuntimeError("SEARXNG_URL is not set")

query = "fastapi deployment guide"
params = {
    "q": query,  # requests URL-encodes parameter values
    "format": "json",
    "limit": 5,
    "engines": "google,bing",
}

resp = requests.get(f"{base_url}/search", params=params, timeout=10)
resp.raise_for_status()
data = resp.json()

for r in data.get("results", []):
    print(r.get("title", ""))
    print(r.get("url", ""))
    print((r.get("content") or "")[:200])
    print()
Method 3: searxng-data Python Package
For more structured access, install the searxng-data package:
pip install searxng-data
from searxng_data import engines
# List available engines
print(engines.list_engines())
Note: This package only provides engine metadata, not the search API itself.
Self-Hosting SearXNG
To run your own SearXNG instance:
# Using Docker
docker run -d -p 8888:8080 \
-v $(pwd)/searxng:/etc/searxng \
searxng/searxng:latest
# Then set
SEARXNG_URL=http://localhost:8888
Or install via pip:
pip install searxng
# Edit /etc/searxng/settings.yml
searxng-run
Public SearXNG instances are available at:
https://searxng.example.com (replace with any public instance)
Workflow: Search then Extract
SearXNG returns titles, URLs, and snippets — not full page content. To get full page content, search first and then extract the most relevant URL with web_extract, browser tools, or curl.
# Search for relevant pages
curl -s "${SEARXNG_URL}/search?q=fastapi+deployment&format=json&limit=3"
# Output: list of results with titles and URLs
# Then extract the best URL with web_extract
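The hand-off between the two steps can be sketched as picking a few distinct URLs from the search results before extraction. The helper name `pick_urls_to_extract` is hypothetical, and treating URLs that differ only by a trailing slash as duplicates is an assumption. The extraction step itself (web_extract, browser tools, or curl) is left out.

```python
from typing import Dict, Iterable, List


def pick_urls_to_extract(results: Iterable[Dict], max_urls: int = 3) -> List[str]:
    """Return up to max_urls distinct result URLs, preserving result order."""
    seen = set()
    picked: List[str] = []
    for item in results:
        if len(picked) >= max_urls:
            break
        url = (item.get("url") or "").strip()
        key = url.rstrip("/")  # assumption: ".../a" and ".../a/" are the same page
        if not url or key in seen:
            continue
        seen.add(key)
        picked.append(url)
    return picked


# Made-up results, shaped like the "results" list of a JSON response.
results = [
    {"url": "https://example.com/a"},
    {"url": "https://example.com/a/"},
    {"title": "entry without a url"},
    {"url": "https://example.com/b"},
]
print(pick_urls_to_extract(results, max_urls=2))
```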
Limitations
- Instance availability: If the SearXNG instance is down or unreachable, search fails. Always check SEARXNG_URL is set and the instance is reachable.
- No content extraction: SearXNG returns snippets, not full page content. Use web_extract, browser tools, or curl for full articles.
- Rate limiting: Some public instances limit requests. Self-hosting avoids this.
- Engine coverage: Available engines depend on the SearXNG instance configuration. Some engines may be disabled.
- Results freshness: Meta-search aggregates external engines — result freshness depends on those engines.
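One way to soften the availability and rate-limiting limitations is to retry a failed request a few times with exponential backoff. The sketch below is not part of the skill. The name `with_retries`, the three-attempt default, and the doubling delay are assumptions. It retries on `OSError`, which covers connection errors and timeouts from the standard library and from requests.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the delay after each failure."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"search failed after {attempts} attempts") from last_error
```

A caller would wrap its own request function, for example `with_retries(lambda: run_search(query))`, where `run_search` stands in for whichever search call is in use. The `sleep` parameter exists so the backoff schedule can be checked without actually waiting.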
Troubleshooting
| Problem | Likely Cause | What To Do |
|---|---|---|
| SEARXNG_URL not set | No instance configured | Use a public SearXNG instance or set up your own |
| Connection refused | Instance not running or wrong URL | Check the URL is correct and the instance is running |
| Empty results | Instance blocks the query | Try a different instance or self-host |
| Slow responses | Public instance under load | Self-host or use a less-loaded public instance |
| json format not supported | JSON output not enabled in the instance settings, or old SearXNG version | Try format=rss, enable json under search.formats in settings.yml, or upgrade SearXNG |
Pitfalls
- Always set SEARXNG_URL: Without it, the skill cannot function.
- URL-encode queries: Spaces and special characters must be URL-encoded in curl, or use urllib.parse.quote() in Python.
- Use format=json: The default format is HTML, which is not convenient to parse. Always request JSON explicitly.
- Set a timeout: Always use --max-time or timeout= to avoid hanging on unreachable instances.
- Self-hosting is best: Public instances may go down, rate-limit, or block. A self-hosted instance is more reliable.
Instance Discovery
If SEARXNG_URL is not set and the user asks about SearXNG, help them either:
- Find a public SearXNG instance (search for "public searxng instance")
- Set up their own with Docker or pip
Public instances are listed at: https://searxng.org/