Adds optional-skills/security/web-pentest/, an authorized web app penetration testing skill adapted from Shannon's methodology (concepts only; AGPL-clean fresh implementation). Phased: recon (read-only) → vuln analysis (delegate_task per OWASP class) → proof-based exploitation → report.
Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history; payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as candidates only
Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is cheaper and matches the existing optional-skills/security/ pattern).
Scope Enforcement
The pentest skill is dangerous because Hermes can drive network tools unattended. The single most important rule: every active request must target a host the operator authorized. This file is the procedure.
The Three Authorities
- engagement/authorization.md: what the operator wrote down.
- engagement/scope.txt: the machine-readable allowlist.
- The current shell prompt: implicit, "I'm running as Hermes inside the operator's box."
If any of those three disagree, you STOP and ask. Don't try to reconcile.
scope.txt format
One target per line. Comments start with #.
# Hostnames — resolved at use time
localhost
127.0.0.1
::1
staging.example.com
api-staging.example.com
# CIDR — internal labs only, requires operator OK in writing
192.168.50.0/24
10.0.5.0/24
Wildcards are NOT supported. If you need *.staging.example.com, list
each host explicitly. This is on purpose: subdomain wildcards in
authorization scope are how unauthorized testing happens.
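The format above is simple enough to parse in a few lines. The following is a minimal sketch, not the skill's actual loader: the function name `parse_scope` and its return shape are hypothetical, and it assumes that `#` may also start a trailing comment on a target line. It rejects wildcard entries outright, matching the rule above.

```python
import ipaddress


def parse_scope(text):
    """Parse scope.txt content into (hostnames, networks).

    Hypothetical helper for illustration; not part of the skill itself.
    """
    hostnames = set()
    networks = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Assumption: everything after '#' is a comment, even mid-line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if "*" in line:
            raise ValueError(f"line {lineno}: wildcards are not supported: {line!r}")
        if "/" in line:
            # CIDR entry; a malformed one raises ValueError instead of
            # silently becoming a hostname.
            networks.append(ipaddress.ip_network(line))
            continue
        try:
            # Bare IPs become single-address networks (/32 or /128).
            networks.append(ipaddress.ip_network(line))
        except ValueError:
            hostnames.add(line.lower())
    return hostnames, networks
```

Storing bare IPs as single-address networks means one membership test covers both listed IPs and CIDR ranges.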
Host Extraction Rules
Before any active request, extract the target host from the command or URL and confirm it's in scope.
| Surface | Where the host lives | Example |
|---|---|---|
| curl URL | The URL | curl https://staging.example.com/login |
| curl --resolve HOST:PORT:ADDR | HOST | reject: resolve overrides scope |
| nmap TARGET | Each TARGET arg | nmap 10.0.5.5 staging.example.com |
| whatweb URL | The URL | whatweb https://staging.example.com |
| browser_navigate(url) | The URL | python-side: extract host from url |
| Tool-driven HTTP (sqlmap, wfuzz, gobuster) | -u, -h, target arg | depends on tool |
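For URLs: urllib.parse.urlparse(url).hostname.lower(). Note that hostname is None when urlparse finds no network location (for example, a URL written without a scheme), so check for None first and treat it as out of scope.
For raw IPs: keep as IP, check against CIDR entries with
ipaddress.ip_address(host) in ipaddress.ip_network(cidr).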
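The two rules above can be combined into one check. This is a sketch under stated assumptions: `extract_host` and `in_scope` are hypothetical names, and the scope is passed in as a set of exact hostname entries plus a list of CIDR strings rather than read from scope.txt.

```python
import ipaddress
from urllib.parse import urlparse


def extract_host(url):
    """Return the lowercased host of a URL, or raise if there is none."""
    host = urlparse(url).hostname  # None when the URL has no network location
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host.lower()


def in_scope(host, hostnames, cidrs):
    """Exact hostname match, or raw IP listed directly or inside a CIDR."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: only an exact hostname entry counts.
        return host in hostnames
    if str(addr) in hostnames:
        return True
    # Membership is False when the address and network versions differ.
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
```

For example, `in_scope(extract_host("https://staging.example.com/login"), hostnames, cidrs)` is true only if `staging.example.com` is listed verbatim; a sibling such as `auth.example.com` fails the check.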
Pre-Send Checklist
For every active request, before you press enter:
- Did you extract the host correctly? (URL host, not Host header, not --resolve aliasing.)
- Is the host in scope.txt (exact hostname match) OR is its resolved IP in a scope.txt CIDR?
- If it's a redirect target you're following, did you re-check scope on the redirect URL?
- If it's the second hop of an SSRF probe, is the inner URL in scope? (Usually NOT — that's the whole point. Don't auto-fire.)
- Did the operator approve this class of payload? (Read-only recon is auto-OK; destructive payloads need explicit OK.)
If any answer is "no" or "not sure," STOP and ask the operator.
Things That Look In-Scope But Aren't
- Redirects to a parent or sister host. staging.example.com → auth.example.com is a different host. Stop, re-confirm.
- CNAMEs. app.staging.example.com may CNAME to prod-cluster.aws.example.com. Resolve and check IP, not just name.
- Cloud metadata IPs. 169.254.169.254 is not in any sane scope.txt. If your SSRF candidate resolves there, you're probably testing against a real cloud host and need explicit approval before the probe.
- 127.0.0.1 / localhost on a shared box. If you're in a container or shared dev box, localhost may be someone else's service. Confirm with the operator that 127.0.0.1 means what they think.
- External services the target depends on. Stripe API, OAuth providers, S3 buckets: even if your tests would touch them, they are NOT in scope by default.
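The CNAME and metadata cases above both come down to looking at what a name actually resolves to. The following is one possible sketch, not the skill's own procedure: `resolve_all` and `review_resolved` are hypothetical helpers, and the choice to flag every link-local address (which includes 169.254.169.254) is an assumption made for illustration. Anything the function returns is meant to be shown to the operator, not decided on automatically.

```python
import ipaddress
import socket


def resolve_all(host):
    """Return every address the system resolver gives for host."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def review_resolved(addresses, cidrs):
    """Split resolved addresses into (metadata_like, outside_cidrs).

    metadata_like: link-local addresses such as 169.254.169.254.
    outside_cidrs: addresses not covered by any scope CIDR.
    """
    nets = [ipaddress.ip_network(cidr) for cidr in cidrs]
    metadata_like, outside = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.is_link_local:
            metadata_like.append(text)
        elif not any(addr in net for net in nets):
            outside.append(text)
    return metadata_like, outside
```

A typical call would be `review_resolved(resolve_all("app.staging.example.com"), ["10.0.5.0/24"])`; if either returned list is non-empty, stop and ask before sending anything.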
When Scope Fails Open
If you can't decide whether a host is in scope:
DEFAULT: out of scope.
Stop the agent. Ask the operator. Resume only after written confirmation. There is no penalty for asking; there is a significant penalty for testing the wrong host.
Logging
Every active request should append to engagement/request-log.jsonl:
{"ts": "2026-05-25T03:14:15Z", "method": "GET", "url": "https://staging.example.com/api/users", "host": "staging.example.com", "in_scope": true, "phase": "recon", "result_status": 200, "evidence_ref": "evidence/recon.md#endpoints"}
This is your audit trail. If anyone ever asks "why did the pentest agent hit X?" you can answer from this log.
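Appending one such record per request might look like the sketch below. The function name `log_request` and its parameters are hypothetical; the field names mirror the example record above, and the log path is passed in by the caller rather than assumed.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlparse


def log_request(path, method, url, in_scope, phase,
                result_status=None, evidence_ref=None):
    """Append one JSON record for an active request to the log at path."""
    record = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "method": method,
        "url": url,
        "host": urlparse(url).hostname,  # None if the URL has no host
        "in_scope": in_scope,
        "phase": phase,
        "result_status": result_status,
        "evidence_ref": evidence_ref,
    }
    # Append mode keeps earlier entries; one JSON object per line.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Logging the scope decision alongside the request, including refused ones with `in_scope` set to false, keeps the answer to "why did the agent hit X?" in a single file.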