mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934)
* feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap

Map active skills to Telegram's slash command menu so users can discover and invoke skills directly. Three changes:

1. Telegram menu now includes active skill commands alongside built-in commands, capped at 100 entries (Telegram Bot API limit). Overflow commands remain callable but hidden from the picker. Logged at startup when cap is hit.

2. New /commands [page] gateway command for paginated browsing of all commands + skills. /help now shows first 10 skill commands and points to /commands for the full list.

3. When a user types a slash command that matches a disabled or uninstalled skill, they get actionable guidance:
   - Disabled: 'Enable it with: hermes skills config'
   - Optional (not installed): 'Install with: hermes skills install official/<path>'

Built on ideas from PR #3921 by @kshitijk4poor.

* chore: move 21 niche skills to optional-skills

Move specialized/niche skills from built-in (skills/) to optional (optional-skills/) to reduce the default skill count. Users can install them with:

  hermes skills install official/<category>/<name>

Moved skills (21):
- mlops: accelerate, chroma, faiss, flash-attention, hermes-atropos-environments, huggingface-tokenizers, instructor, lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning, qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan
- research: domain-intel, duckduckgo-search
- devops: inference-sh cli

Built-in skills: 96 → 75
Optional skills: 22 → 43

* fix: only include repo built-in skills in Telegram menu, not user-installed

User-installed skills (from hub or manually added) stay accessible via /skills and by typing the command directly, but don't get registered in the Telegram slash command picker. Only skills whose SKILL.md is under the repo's skills/ directory are included in the menu.

This keeps the Telegram menu focused on the curated built-in set while user-installed skills remain discoverable through /skills and /commands.
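
The cap and the paginated listing described in this message can be sketched roughly as follows. This is an illustration only, not the gateway's actual code: `build_menu`, `paginate`, and the page size of 20 are hypothetical, and only the 100-entry cap comes from the commit message.

```python
from typing import List, Tuple

TELEGRAM_MENU_CAP = 100  # cap named in the commit message (Telegram Bot API limit)


def build_menu(builtin: List[str], skills: List[str],
               cap: int = TELEGRAM_MENU_CAP) -> Tuple[List[str], List[str]]:
    """Return (visible, overflow): built-ins first, then skills, cut at `cap`."""
    ordered = list(dict.fromkeys(builtin + skills))  # de-duplicate, keep order
    return ordered[:cap], ordered[cap:]


def paginate(items: List[str], page: int,
             per_page: int = 20) -> Tuple[List[str], int]:
    """Return (items on `page`, total pages); pages are 1-based and clamped."""
    total = max(1, -(-len(items) // per_page))  # ceiling division
    page = min(max(page, 1), total)
    start = (page - 1) * per_page
    return items[start:start + per_page], total
```

Overflow entries are returned rather than dropped, which matches the stated behaviour that hidden commands stay callable and can still be listed by a paginated command.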
parent 97d6813f51
commit 5ceed021dc
73 changed files with 163 additions and 4 deletions
@@ -1,96 +0,0 @@
---
name: domain-intel
description: Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required.
---

# Domain Intelligence — Passive OSINT

Passive domain reconnaissance using only Python stdlib.
**Zero dependencies. Zero API keys. Works on Linux, macOS, and Windows.**

## Helper script

This skill includes `scripts/domain_intel.py` — a complete CLI tool for all domain intelligence operations.

```bash
# Subdomain discovery via Certificate Transparency logs
python3 SKILL_DIR/scripts/domain_intel.py subdomains example.com

# SSL certificate inspection (expiry, cipher, SANs, issuer)
python3 SKILL_DIR/scripts/domain_intel.py ssl example.com

# WHOIS lookup (registrar, dates, name servers; about 60 TLDs)
python3 SKILL_DIR/scripts/domain_intel.py whois example.com

# DNS records (A, AAAA, MX, NS, TXT, CNAME)
python3 SKILL_DIR/scripts/domain_intel.py dns example.com

# Domain availability check (passive: DNS + WHOIS + SSL signals)
python3 SKILL_DIR/scripts/domain_intel.py available coolstartup.io

# Bulk analysis — multiple domains, multiple checks in parallel
python3 SKILL_DIR/scripts/domain_intel.py bulk example.com github.com google.com
python3 SKILL_DIR/scripts/domain_intel.py bulk example.com github.com --checks ssl,dns
```

`SKILL_DIR` is the directory containing this SKILL.md file. All output is structured JSON.
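
Because every command prints JSON, a caller can parse the output directly. The sketch below assumes a payload shaped like the `ssl` command's result (`host`, `days_remaining`) and the `{"error": ...}` shape that `whois` and `bulk` return on failure. The sample values are invented, and `summarize_ssl` is a hypothetical helper, not part of the skill.

```python
import json

# Hypothetical sample shaped like the `ssl` command's output; values are invented.
SAMPLE = '{"host": "example.com", "port": 443, "days_remaining": 12, "is_expired": false}'


def summarize_ssl(raw: str) -> str:
    """Turn one JSON result into a one-line summary."""
    data = json.loads(raw)
    if "error" in data:
        return f"lookup failed: {data['error']}"
    days = data.get("days_remaining")
    if days is None:
        return f"{data['host']}: expiry unknown"
    return f"{data['host']}: {days} day(s) until expiry"
```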

## Available commands

| Command | What it does | Data source |
|---------|-------------|-------------|
| `subdomains` | Find subdomains from certificate logs | crt.sh (HTTPS) |
| `ssl` | Inspect TLS certificate details | Direct TCP:443 to target |
| `whois` | Registration info, registrar, dates | WHOIS servers (TCP:43) |
| `dns` | A, AAAA, MX, NS, TXT, CNAME records | System DNS + Google DoH |
| `available` | Check if domain is registered | DNS + WHOIS + SSL signals |
| `bulk` | Run multiple checks on multiple domains | All of the above |

## When to use this vs built-in tools

- **Use this skill** for infrastructure questions: subdomains, SSL certs, WHOIS, DNS records, availability
- **Use `web_search`** for general research about what a domain/company does
- **Use `web_extract`** to get the actual content of a webpage
- **Use `terminal` with `curl -I`** for a simple "is this URL reachable" check

| Task | Better tool | Why |
|------|-------------|-----|
| "What does example.com do?" | `web_extract` | Gets page content, not DNS/WHOIS data |
| "Find info about a company" | `web_search` | General research, not domain-specific |
| "Is this website safe?" | `web_search` | Reputation checks need web context |
| "Check if a URL is reachable" | `terminal` with `curl -I` | Simple HTTP check |
| "Find subdomains of X" | **This skill** | Only passive source for this |
| "When does the SSL cert expire?" | **This skill** | Built-in tools can't inspect TLS |
| "Who registered this domain?" | **This skill** | WHOIS data not in web search |
| "Is coolstartup.io available?" | **This skill** | Passive availability via DNS+WHOIS+SSL |

## Platform compatibility

Pure Python stdlib (`socket`, `ssl`, `urllib`, `json`, `concurrent.futures`).
Works identically on Linux, macOS, and Windows with no dependencies.

- **crt.sh queries** use HTTPS (port 443) and work behind most firewalls
- **WHOIS queries** use TCP port 43 and may be blocked on restrictive networks
- **DNS queries** use the system resolver for A/AAAA and Google DoH (HTTPS) for MX/NS/TXT/CNAME, which is firewall-friendly
- **SSL checks** connect to the target on port 443, the only "active" operation

## Data sources

All queries are **passive** (no port scanning, no vulnerability testing):

- **crt.sh**: Certificate Transparency logs (subdomain discovery, HTTPS only)
- **WHOIS servers**: Direct TCP (port 43) to the registry WHOIS servers for about 60 TLDs
- **Google DNS-over-HTTPS**: MX, NS, TXT, CNAME resolution (firewall-friendly)
- **System DNS**: A/AAAA record resolution
- **SSL check** is the only "active" operation (TCP connection to target:443)

## Notes

- WHOIS queries use TCP port 43 — may be blocked on restrictive networks
- Some WHOIS servers redact registrant info (GDPR) — mention this to the user
- crt.sh can be slow for very popular domains (thousands of certs) — set reasonable expectations
- The availability check is heuristic-based (3 passive signals) — not authoritative like a registrar API
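
How the three signals combine can be sketched as a pure function. It follows the branch order used by `check_available` in the helper script. The name `availability_verdict` is hypothetical, and the script's final `UNCERTAIN` branch is left out because it cannot be reached when the WHOIS signal is only ever `True`, `False`, or `None`.

```python
from typing import Optional, Tuple


def availability_verdict(dns_exists: bool,
                         whois_available: Optional[bool],
                         ssl_reachable: bool) -> Tuple[str, str]:
    """Map the three passive signals to (verdict, confidence)."""
    # No DNS footprint and WHOIS explicitly reports "not found".
    if not dns_exists and whois_available is True:
        return "LIKELY AVAILABLE", "high"
    # Any positive sign of registration or a live TLS endpoint.
    if dns_exists or whois_available is False or ssl_reachable:
        return "REGISTERED / IN USE", "high"
    # Nothing resolves and WHOIS was inconclusive or unsupported for the TLD.
    return "POSSIBLY AVAILABLE", "medium"
```

Keeping the decision separate from the network calls makes the heuristic easy to test without touching DNS, WHOIS, or port 443.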

---

*Contributed by [@FurkanL0](https://github.com/FurkanL0)*

@@ -1,397 +0,0 @@
#!/usr/bin/env python3
"""
Domain Intelligence — Passive OSINT via Python stdlib.

Usage:
    python domain_intel.py subdomains example.com
    python domain_intel.py ssl example.com
    python domain_intel.py whois example.com
    python domain_intel.py dns example.com
    python domain_intel.py available example.com
    python domain_intel.py bulk example.com github.com google.com --checks ssl,dns

All output is structured JSON. No dependencies beyond Python stdlib.
Works on Linux, macOS, and Windows.
"""

import json
import re
import socket
import ssl
import sys
import urllib.request
import urllib.parse
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime, timezone


# ─── Subdomain Discovery (crt.sh) ──────────────────────────────────────────

def subdomains(domain, include_expired=False, limit=200):
    """Find subdomains via Certificate Transparency logs."""
    url = f"https://crt.sh/?q=%25.{urllib.parse.quote(domain)}&output=json"
    req = urllib.request.Request(url, headers={
        "User-Agent": "domain-intel-skill/1.0", "Accept": "application/json",
    })
    with urllib.request.urlopen(req, timeout=15) as r:
        entries = json.loads(r.read().decode())

    seen, results = set(), []
    now = datetime.now(timezone.utc)
    for e in entries:
        not_after = e.get("not_after", "")
        if not include_expired and not_after:
            try:
                dt = datetime.strptime(not_after[:19], "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
                if dt <= now:
                    continue
            except ValueError:
                pass
        for name in e.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name and name not in seen:
                seen.add(name)
                results.append({
                    "subdomain": name,
                    "issuer": e.get("issuer_name", ""),
                    "not_after": not_after,
                })

    results.sort(key=lambda r: (r["subdomain"].startswith("*"), r["subdomain"]))
    return {"domain": domain, "count": min(len(results), limit), "subdomains": results[:limit]}


# ─── SSL Certificate Inspection ────────────────────────────────────────────

def check_ssl(host, port=443, timeout=10):
    """Inspect the TLS certificate of a host."""
    def flat(rdns):
        r = {}
        for rdn in rdns:
            for item in rdn:
                if isinstance(item, (list, tuple)) and len(item) == 2:
                    r[item[0]] = item[1]
        return r

    def parse_date(s):
        # strptime treats a space in the format as any run of whitespace, so
        # one pattern covers both "Jan  5 ..." and "Jan 15 ...".
        try:
            return datetime.strptime(s, "%b %d %H:%M:%S %Y %Z").replace(tzinfo=timezone.utc)
        except ValueError:
            return None

    warning = None
    try:
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as s:
                cert, cipher, proto = s.getpeercert(), s.cipher(), s.version()
    except ssl.SSLCertVerificationError as e:
        warning = str(e)
        # Retry without verification. getpeercert() returns an empty dict when
        # the certificate was not validated, so on this path only the TLS
        # version and cipher are populated.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as s:
                cert, cipher, proto = s.getpeercert(), s.cipher(), s.version()

    not_before = parse_date(cert.get("notBefore", ""))
    not_after = parse_date(cert.get("notAfter", ""))
    now = datetime.now(timezone.utc)
    days = (not_after - now).days if not_after else None
    is_expired = days is not None and days < 0

    if is_expired:
        status = f"EXPIRED ({abs(days)} days ago)"
    elif days is not None and days <= 14:
        status = f"CRITICAL — {days} day(s) left"
    elif days is not None and days <= 30:
        status = f"WARNING — {days} day(s) left"
    else:
        status = f"OK — {days} day(s) remaining" if days is not None else "unknown"

    return {
        "host": host, "port": port,
        "subject": flat(cert.get("subject", [])),
        "issuer": flat(cert.get("issuer", [])),
        "subject_alt_names": [f"{t}:{v}" for t, v in cert.get("subjectAltName", [])],
        "not_before": not_before.isoformat() if not_before else "",
        "not_after": not_after.isoformat() if not_after else "",
        "days_remaining": days, "is_expired": is_expired, "expiry_status": status,
        "tls_version": proto,
        "cipher_suite": cipher[0] if cipher else None,
        "serial_number": cert.get("serialNumber", ""),
        "verification_warning": warning,
    }


# ─── WHOIS Lookup ──────────────────────────────────────────────────────────

WHOIS_SERVERS = {
    "com": "whois.verisign-grs.com", "net": "whois.verisign-grs.com",
    "org": "whois.pir.org", "io": "whois.nic.io", "co": "whois.nic.co",
    "ai": "whois.nic.ai", "dev": "whois.nic.google", "app": "whois.nic.google",
    "tech": "whois.nic.tech", "shop": "whois.nic.shop", "store": "whois.nic.store",
    "online": "whois.nic.online", "site": "whois.nic.site", "cloud": "whois.nic.cloud",
    "digital": "whois.nic.digital", "media": "whois.nic.media", "blog": "whois.nic.blog",
    "info": "whois.afilias.net", "biz": "whois.biz", "me": "whois.nic.me",
    "tv": "whois.nic.tv", "cc": "whois.nic.cc", "ws": "whois.website.ws",
    "uk": "whois.nic.uk", "co.uk": "whois.nic.uk", "de": "whois.denic.de",
    "nl": "whois.domain-registry.nl", "fr": "whois.nic.fr", "it": "whois.nic.it",
    "es": "whois.nic.es", "pl": "whois.dns.pl", "ru": "whois.tcinet.ru",
    "se": "whois.iis.se", "no": "whois.norid.no", "fi": "whois.fi",
    "ch": "whois.nic.ch", "at": "whois.nic.at", "be": "whois.dns.be",
    "cz": "whois.nic.cz", "br": "whois.registro.br", "ca": "whois.cira.ca",
    "mx": "whois.mx", "au": "whois.auda.org.au", "jp": "whois.jprs.jp",
    "cn": "whois.cnnic.cn", "in": "whois.inregistry.net", "kr": "whois.kr",
    "sg": "whois.sgnic.sg", "hk": "whois.hkirc.hk", "tr": "whois.nic.tr",
    "ae": "whois.aeda.net.ae", "za": "whois.registry.net.za",
    "space": "whois.nic.space", "zone": "whois.nic.zone", "ninja": "whois.nic.ninja",
    "guru": "whois.nic.guru", "rocks": "whois.nic.rocks", "live": "whois.nic.live",
    "game": "whois.nic.game", "games": "whois.nic.games",
}


def whois_lookup(domain):
    """Query WHOIS servers for domain registration info."""
    parts = domain.split(".")
    server = WHOIS_SERVERS.get(".".join(parts[-2:])) or WHOIS_SERVERS.get(parts[-1])
    if not server:
        return {"error": f"No WHOIS server for .{parts[-1]}"}

    try:
        with socket.create_connection((server, 43), timeout=10) as s:
            s.sendall((domain + "\r\n").encode())
            chunks = []
            while True:
                c = s.recv(4096)
                if not c:
                    break
                chunks.append(c)
        raw = b"".join(chunks).decode("utf-8", errors="replace")
    except Exception as e:
        return {"error": str(e)}

    patterns = {
        "registrar": r"(?:Registrar|registrar):\s*(.+)",
        "creation_date": r"(?:Creation Date|Created|created):\s*(.+)",
        "expiration_date": r"(?:Registry Expiry Date|Expiration Date|Expiry Date):\s*(.+)",
        "updated_date": r"(?:Updated Date|Last Modified):\s*(.+)",
        "name_servers": r"(?:Name Server|nserver):\s*(.+)",
        "status": r"(?:Domain Status|status):\s*(.+)",
        "dnssec": r"DNSSEC:\s*(.+)",
    }
    result = {"domain": domain, "whois_server": server}
    for key, pat in patterns.items():
        matches = re.findall(pat, raw, re.IGNORECASE)
        if matches:
            if key in ("name_servers", "status"):
                result[key] = list(dict.fromkeys(m.strip().lower() for m in matches))
            else:
                result[key] = matches[0].strip()

    for field in ("creation_date", "expiration_date", "updated_date"):
        if field in result:
            for fmt in ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
                try:
                    dt = datetime.strptime(result[field][:19], fmt).replace(tzinfo=timezone.utc)
                    result[field] = dt.isoformat()
                    if field == "expiration_date":
                        days = (dt - datetime.now(timezone.utc)).days
                        result["expiration_days_remaining"] = days
                        result["is_expired"] = days < 0
                    break
                except ValueError:
                    pass
    return result


# ─── DNS Records ───────────────────────────────────────────────────────────

def dns_records(domain, types=None):
    """Resolve DNS records using system DNS + Google DoH."""
    if not types:
        types = ["A", "AAAA", "MX", "NS", "TXT", "CNAME"]
    records = {}

    for qtype in types:
        if qtype == "A":
            try:
                records["A"] = list(dict.fromkeys(
                    i[4][0] for i in socket.getaddrinfo(domain, None, socket.AF_INET)
                ))
            except Exception:
                records["A"] = []
        elif qtype == "AAAA":
            try:
                records["AAAA"] = list(dict.fromkeys(
                    i[4][0] for i in socket.getaddrinfo(domain, None, socket.AF_INET6)
                ))
            except Exception:
                records["AAAA"] = []
        else:
            url = f"https://dns.google/resolve?name={urllib.parse.quote(domain)}&type={qtype}"
            try:
                req = urllib.request.Request(url, headers={"User-Agent": "domain-intel-skill/1.0"})
                with urllib.request.urlopen(req, timeout=10) as r:
                    data = json.loads(r.read())
                records[qtype] = [
                    a.get("data", "").strip().rstrip(".")
                    for a in data.get("Answer", []) if a.get("data")
                ]
            except Exception:
                records[qtype] = []

    return {"domain": domain, "records": records}


# ─── Domain Availability Check ─────────────────────────────────────────────

def check_available(domain):
    """Check domain availability using passive signals (DNS + WHOIS + SSL)."""
    signals = {}

    # DNS
    try:
        a = [i[4][0] for i in socket.getaddrinfo(domain, None, socket.AF_INET)]
    except Exception:
        a = []

    try:
        ns_url = f"https://dns.google/resolve?name={urllib.parse.quote(domain)}&type=NS"
        req = urllib.request.Request(ns_url, headers={"User-Agent": "domain-intel-skill/1.0"})
        with urllib.request.urlopen(req, timeout=10) as r:
            ns = [x.get("data", "") for x in json.loads(r.read()).get("Answer", [])]
    except Exception:
        ns = []

    signals["dns_a"] = a
    signals["dns_ns"] = ns
    dns_exists = bool(a or ns)

    # SSL
    ssl_up = False
    try:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        with socket.create_connection((domain, 443), timeout=3) as s:
            with ctx.wrap_socket(s, server_hostname=domain):
                ssl_up = True
    except Exception:
        pass
    signals["ssl_reachable"] = ssl_up

    # WHOIS (quick check)
    tld = domain.rsplit(".", 1)[-1]
    server = WHOIS_SERVERS.get(tld)
    whois_avail = None
    whois_note = ""
    if server:
        try:
            with socket.create_connection((server, 43), timeout=10) as s:
                s.sendall((domain + "\r\n").encode())
                raw = b""
                while True:
                    c = s.recv(4096)
                    if not c:
                        break
                    raw += c
            raw = raw.decode("utf-8", errors="replace").lower()
            if any(p in raw for p in ["no match", "not found", "no data found", "status: free"]):
                whois_avail = True
                whois_note = "WHOIS: not found"
            elif "registrar:" in raw or "creation date:" in raw:
                whois_avail = False
                whois_note = "WHOIS: registered"
            else:
                whois_note = "WHOIS: inconclusive"
        except Exception as e:
            whois_note = f"WHOIS error: {e}"

    signals["whois_available"] = whois_avail
    signals["whois_note"] = whois_note

    if not dns_exists and whois_avail is True:
        verdict, conf = "LIKELY AVAILABLE", "high"
    elif dns_exists or whois_avail is False or ssl_up:
        verdict, conf = "REGISTERED / IN USE", "high"
    elif not dns_exists and whois_avail is None:
        verdict, conf = "POSSIBLY AVAILABLE", "medium"
    else:
        verdict, conf = "UNCERTAIN", "low"

    return {"domain": domain, "verdict": verdict, "confidence": conf, "signals": signals}


# ─── Bulk Analysis ─────────────────────────────────────────────────────────

COMMAND_MAP = {
    "subdomains": subdomains,
    "ssl": check_ssl,
    "whois": whois_lookup,
    "dns": dns_records,
    "available": check_available,
}


def bulk_check(domains, checks=None, max_workers=5):
    """Run multiple checks across multiple domains in parallel."""
    if not checks:
        checks = ["ssl", "whois", "dns"]

    def run_one(d):
        entry = {"domain": d}
        for check in checks:
            fn = COMMAND_MAP.get(check)
            if fn:
                try:
                    entry[check] = fn(d)
                except Exception as e:
                    entry[check] = {"error": str(e)}
        return entry

    results = []
    with ThreadPoolExecutor(max_workers=min(max_workers, 10)) as ex:
        futures = {ex.submit(run_one, d): d for d in domains[:20]}
        for f in as_completed(futures):
            results.append(f.result())

    return {"total": len(results), "checks": checks, "results": results}


# ─── CLI Entry Point ───────────────────────────────────────────────────────

def main():
    if len(sys.argv) < 3:
        print(__doc__)
        sys.exit(1)

    command = sys.argv[1].lower()
    args = sys.argv[2:]

    if command == "bulk":
        # Parse --checks flag
        checks = None
        domains = []
        i = 0
        while i < len(args):
            if args[i] == "--checks" and i + 1 < len(args):
                checks = [c.strip() for c in args[i + 1].split(",")]
                i += 2
            else:
                domains.append(args[i])
                i += 1
        result = bulk_check(domains, checks)
    elif command in COMMAND_MAP:
        result = COMMAND_MAP[command](args[0])
    else:
        print(f"Unknown command: {command}")
        print(f"Available: {', '.join(COMMAND_MAP.keys())}, bulk")
        sys.exit(1)

    print(json.dumps(result, indent=2))


if __name__ == "__main__":
    main()
@@ -1,237 +0,0 @@
---
name: duckduckgo-search
description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime.
version: 1.3.0
author: gamedevCloudy
license: MIT
metadata:
  hermes:
    tags: [search, duckduckgo, web-search, free, fallback]
    related_skills: [arxiv]
    fallback_for_toolsets: [web]
---

# DuckDuckGo Search

Free web search using DuckDuckGo. **No API key required.**

Preferred when `web_search` is unavailable or unsuitable (for example when `FIRECRAWL_API_KEY` is not set). Can also be used as a standalone search path when DuckDuckGo results are specifically desired.

## Detection Flow

Check what is actually available before choosing an approach:

```bash
# Check CLI availability
command -v ddgs >/dev/null && echo "DDGS_CLI=installed" || echo "DDGS_CLI=missing"
```

Decision tree:
1. If `ddgs` CLI is installed, prefer `terminal` + `ddgs`
2. If `ddgs` CLI is missing, do not assume `execute_code` can import `ddgs`
3. If the user wants DuckDuckGo specifically, install `ddgs` first in the relevant environment
4. Otherwise fall back to built-in web/browser tools

Important runtime note:
- Terminal and `execute_code` are separate runtimes
- A successful shell install does not guarantee `execute_code` can import `ddgs`
- Never assume third-party Python packages are preinstalled inside `execute_code`
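
Inside a Python runtime, the two checks can be made separately with the standard library. This is a minimal sketch: `runtime_availability` is a hypothetical helper, and it only reports what the current interpreter and its `PATH` can see.

```python
import importlib.util
import shutil


def runtime_availability(module: str = "ddgs", cli: str = "ddgs") -> dict:
    """Report CLI and import availability separately for this runtime."""
    return {
        # True if an executable with this name is on PATH for this process.
        "cli": shutil.which(cli) is not None,
        # True if this interpreter can locate the package, without importing it.
        "python": importlib.util.find_spec(module) is not None,
    }
```

A result such as `{"cli": True, "python": False}` is the case described above: the shell has the CLI, but this interpreter cannot import the package.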

## Installation

Install `ddgs` only when DuckDuckGo search is specifically needed and the runtime does not already provide it.

```bash
# Python package + CLI entrypoint
pip install ddgs

# Verify CLI
ddgs --help
```

If a workflow depends on Python imports, verify that same runtime can import `ddgs` before using `from ddgs import DDGS`.

## Method 1: CLI Search (Preferred)

Use the `ddgs` command via `terminal` when it exists. This is the preferred path because it avoids assuming the `execute_code` sandbox has the `ddgs` Python package installed.

```bash
# Text search
ddgs text -k "python async programming" -m 5

# News search
ddgs news -k "artificial intelligence" -m 5

# Image search
ddgs images -k "landscape photography" -m 10

# Video search
ddgs videos -k "python tutorial" -m 5

# With region filter
ddgs text -k "best restaurants" -m 5 -r us-en

# Recent results only (d=day, w=week, m=month, y=year)
ddgs text -k "latest AI news" -m 5 -t w

# JSON output for parsing
ddgs text -k "fastapi tutorial" -m 5 -o json
```

### CLI Flags

| Flag | Description | Example |
|------|-------------|---------|
| `-k` | Keywords (query) — **required** | `-k "search terms"` |
| `-m` | Max results | `-m 5` |
| `-r` | Region | `-r us-en` |
| `-t` | Time limit | `-t w` (week) |
| `-s` | Safe search | `-s off` |
| `-o` | Output format | `-o json` |

## Method 2: Python API (Only After Verification)

Use the `DDGS` class in `execute_code` or another Python runtime only after verifying that `ddgs` is installed there. Do not assume `execute_code` includes third-party packages by default.

Safe wording:
- "Use `execute_code` with `ddgs` after installing or verifying the package if needed"

Avoid saying:
- "`execute_code` includes `ddgs`"
- "DuckDuckGo search works by default in `execute_code`"

**Important:** `max_results` must always be passed as a **keyword argument** — positional usage raises an error on all methods.

### Text Search

Best for: general research, companies, documentation.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.text("python async programming", max_results=5):
        print(r["title"])
        print(r["href"])
        print(r.get("body", "")[:200])
        print()
```

Returns: `title`, `href`, `body`

### News Search

Best for: current events, breaking news, latest updates.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.news("AI regulation 2026", max_results=5):
        print(r["date"], "-", r["title"])
        print(r.get("source", ""), "|", r["url"])
        print(r.get("body", "")[:200])
        print()
```

Returns: `date`, `title`, `body`, `url`, `image`, `source`

### Image Search

Best for: visual references, product images, diagrams.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.images("semiconductor chip", max_results=5):
        print(r["title"])
        print(r["image"])
        print(r.get("thumbnail", ""))
        print(r.get("source", ""))
        print()
```

Returns: `title`, `image`, `thumbnail`, `url`, `height`, `width`, `source`

### Video Search

Best for: tutorials, demos, explainers.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.videos("FastAPI tutorial", max_results=5):
        print(r["title"])
        print(r.get("content", ""))
        print(r.get("duration", ""))
        print(r.get("provider", ""))
        print(r.get("published", ""))
        print()
```

Returns: `title`, `content`, `description`, `duration`, `provider`, `published`, `statistics`, `uploader`

### Quick Reference

| Method | Use When | Key Fields |
|--------|----------|------------|
| `text()` | General research, companies | title, href, body |
| `news()` | Current events, updates | date, title, source, body, url |
| `images()` | Visuals, diagrams | title, image, thumbnail, url |
| `videos()` | Tutorials, demos | title, content, duration, provider |

## Workflow: Search then Extract

DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full page content, search first and then extract the most relevant URL with `web_extract`, browser tools, or curl.

CLI example:

```bash
ddgs text -k "fastapi deployment guide" -m 3 -o json
```

Python example, only after verifying `ddgs` is installed in that runtime:

```python
from ddgs import DDGS

with DDGS() as ddgs:
    results = list(ddgs.text("fastapi deployment guide", max_results=3))
    for r in results:
        print(r["title"], "->", r["href"])
```

Then extract the best URL with `web_extract` or another content-retrieval tool.

## Limitations

- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add a short delay between searches if needed.
- **No content extraction**: `ddgs` returns snippets, not full page content. Use `web_extract`, browser tools, or curl for the full article/page.
- **Results quality**: Generally good but less configurable than Firecrawl's search.
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or wait a few seconds.
- **Field variability**: Return fields may vary between results or `ddgs` versions. Use `.get()` for optional fields to avoid `KeyError`.
- **Separate runtimes**: A successful `ddgs` install in terminal does not automatically mean `execute_code` can import it.
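
The "wait and retry" advice for rate limiting and empty results can be sketched as a small wrapper. The wrapper is hypothetical and knows nothing about `ddgs`: it takes any zero-argument callable that returns a list, and the attempt count and delay are assumed values, not documented limits.

```python
import time
from typing import Callable, List


def search_with_retry(search: Callable[[], List[dict]],
                      attempts: int = 3,
                      delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Call `search` until it returns results, pausing longer after each miss."""
    for attempt in range(attempts):
        try:
            results = search()
        except Exception:
            results = []  # treat an error like an empty (possibly throttled) reply
        if results:
            return results
        if attempt < attempts - 1:
            sleep(delay * (attempt + 1))  # linear back-off: delay, 2*delay, ...
    return []
```

In a runtime where `ddgs` has been verified, the callable could wrap a call like the ones shown earlier, for example a function that opens `DDGS()` and returns `list(ddgs.text("query", max_results=5))`.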

## Troubleshooting

| Problem | Likely Cause | What To Do |
|---------|--------------|------------|
| `ddgs: command not found` | CLI not installed in the shell environment | Install `ddgs`, or use built-in web/browser tools instead |
| `ModuleNotFoundError: No module named 'ddgs'` | Python runtime does not have the package installed | Do not use Python DDGS there until that runtime is prepared |
| Search returns nothing | Temporary rate limiting or poor query | Wait a few seconds, retry, or adjust the query |
| CLI works but `execute_code` import fails | Terminal and `execute_code` are different runtimes | Keep using CLI, or separately prepare the Python runtime |

## Pitfalls

- **`max_results` is keyword-only**: `ddgs.text("query", 5)` raises an error. Use `ddgs.text("query", max_results=5)`.
- **Do not assume the CLI exists**: Check `command -v ddgs` before using it.
- **Do not assume `execute_code` can import `ddgs`**: `from ddgs import DDGS` may fail with `ModuleNotFoundError` unless that runtime was prepared separately.
- **Package name**: The package is `ddgs` (previously `duckduckgo-search`). Install with `pip install ddgs`.
- **Don't confuse `-k` and `-m`** (CLI): `-k` is for keywords, `-m` is for max results count.
- **Empty results**: If `ddgs` returns nothing, it may be rate-limited. Wait a few seconds and retry.

## Validated With

Validated examples against `ddgs==9.11.2` semantics. Skill guidance now treats CLI availability and Python import availability as separate concerns so the documented workflow matches actual runtime behavior.
@@ -1,28 +0,0 @@
#!/bin/bash
# DuckDuckGo Search Helper Script
# Wrapper around ddgs CLI with sensible defaults
# Usage: ./duckduckgo.sh <query> [max_results]

set -e

QUERY="$1"
MAX_RESULTS="${2:-5}"

if [ -z "$QUERY" ]; then
    echo "Usage: $0 <query> [max_results]"
    echo ""
    echo "Examples:"
    echo "  $0 'python async programming' 5"
    echo "  $0 'latest AI news' 10"
    echo ""
    echo "Requires: pip install ddgs"
    exit 1
fi

# Check if ddgs is available
if ! command -v ddgs &> /dev/null; then
    echo "Error: ddgs not found. Install with: pip install ddgs"
    exit 1
fi

ddgs text -k "$QUERY" -m "$MAX_RESULTS"