mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-07 08:02:23 +00:00

feat(skills): add web-pentest optional skill (#32265)

Adds optional-skills/security/web-pentest/ — an authorized web app
penetration testing skill adapted from Shannon's methodology (concepts
only; AGPL-clean fresh implementation).

Phased: recon (read-only) → vuln analysis (delegate_task per OWASP
class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and
  documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history;
  payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as
  candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is
cheaper and matches the existing optional-skills/security/ pattern).

2026-05-25 14:51:41 -07:00

13 KiB

Web Application Penetration Testing

A phased pentesting workflow for running web applications. Adapted from Shannon's pipeline (Keygraph, AGPL — concepts only, no code borrowed). Built around three rules:

No exploit, no report — every finding requires reproducible evidence.
Bounded scope — every active request goes against a target the operator pre-declared. Off-scope hosts are refused.
Bypass exhaustion before false-positive dismissal — a "blocked" payload is not a clean bill of health until you've tried the bypass set.

⚠️ Hard Guardrails — Read Before Every Engagement

Violating any of these invalidates the engagement and may be illegal.

Authorization gate. Before the first active scan in a session, you MUST confirm with the user, in writing, that they own or have written authorization to test the target. Record the acknowledgement in engagement/authorization.md (see template). No acknowledgement → no active scanning. Reading public pages with curl is fine; sending payloads is not.
Scope allowlist. Maintain engagement/scope.txt — one hostname or CIDR per line. Every nmap, curl, whatweb, browser navigation, or payload-bearing request MUST be against an entry in scope. If a target redirects you off-scope (3xx to a different host, a link in HTML), STOP and confirm with the user before following.
No production systems without paper. If the user hasn't told you "yes, prod is in scope and I have written sign-off," assume not. Default targets are staging, local docker, dedicated test instances.
Cloud metadata is off by default. Do not probe 169.254.169.254, metadata.google.internal, 100.100.100.200, [fd00:ec2::254], or equivalent unless the engagement explicitly includes SSRF-to-metadata as a goal AND the target is one you control. The agent's browser tool can reach these from inside your own infrastructure — don't.
Destructive payloads need approval. SQLi payloads that DROP/DELETE, filesystem-write SSTI, command injection with rm/shutdown/mkfs, anything that mutates beyond a single test row → ASK FIRST. The approval.py system catches some; don't rely on it alone.
Aux-client leakage risk (Hermes-specific). This skill produces sessions full of SQLi/XSS/RCE payloads, captured credentials, JWT tokens. Hermes' compression and title-generation paths replay history through the auxiliary client (often the main model). Anything sensitive you write to the conversation can leave the box on the next compress. Mitigation:
- Redact captured tokens/credentials to the LAST 6 CHARS before logging them in any message. Full values go to engagement/evidence/ files, never into chat history.
- If the engagement is sensitive, set auxiliary.title_generation.enabled: false in ~/.hermes/config.yaml for the session.
Rate limit yourself. Default 200ms between active requests against any single host. The recon-scan.sh script enforces this. Don't bypass it without operator approval.
Authority of the report. This skill produces a security assessment, not a "PASS." Even a clean run is "no exploitable issues FOUND in scope X within time T using methods Y" — not "the application is secure." Mirror that language in the report.
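
The rate-limit guardrail above can be sketched as a small per-host limiter. This is an illustration only, not the logic inside recon-scan.sh; the class name `HostRateLimiter` and its injectable `clock`/`sleep` parameters are assumptions made so the sketch can be exercised without real delays.

```python
import time
from urllib.parse import urlsplit


class HostRateLimiter:
    """Enforce a minimum gap between requests to the same host (hypothetical helper)."""

    def __init__(self, min_interval=0.2, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # 0.2 s mirrors the 200ms default
        self._clock = clock
        self._sleep = sleep
        self._last_sent = {}  # host -> time the previous request was released

    def wait(self, url):
        """Block until the URL's host may be contacted again; return seconds slept."""
        host = (urlsplit(url).hostname or "").lower()
        now = self._clock()
        last = self._last_sent.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_sent[host] = now + delay
        return delay
```

Call `wait(url)` immediately before each active request. The limit is tracked per host, so requests to two different in-scope hosts do not delay each other.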

Phase 0: Engagement Setup

Before any scanning happens, create the engagement directory and authorization acknowledgement.

ENGAGEMENT=engagement-$(date +%Y%m%d-%H%M%S)
mkdir -p "$ENGAGEMENT"/{evidence,findings,reports}
cd "$ENGAGEMENT"

Ask the user (verbatim):

"Confirm: (a) the target URL is [X], (b) you own this application or have written authorization to test it, and (c) the engagement may run for up to [N] hours starting now. Reply 'authorized' to proceed."
Wait for an explicit "authorized" response. Any other answer means STOP.
Record authorization to engagement/authorization.md using the template in templates/authorization.md. Include:
- Target URL(s) and IP(s)
- Authorization basis (ownership / written authz from $name)
- Engagement window
- Out-of-scope items (production, third-party services, etc.)
- Operator name (the user driving this session)
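
A minimal sketch of the gate and the record, assuming the reply arrives as a plain string. The function names and the markdown layout are hypothetical; the real layout is whatever templates/authorization.md defines, which is not reproduced here.

```python
from datetime import datetime, timezone


def is_authorized(reply):
    """True only for the single word 'authorized' (case and outer quotes ignored)."""
    return reply.strip().strip("'\"").lower() == "authorized"


def authorization_record(targets, basis, window, out_of_scope, operator, now=None):
    """Render the acknowledgement fields as a small markdown document (assumed layout)."""
    now = now or datetime.now(timezone.utc)
    lines = [
        "# Authorization",
        f"- Recorded (UTC): {now.isoformat(timespec='seconds')}",
        f"- Targets: {', '.join(targets)}",
        f"- Authorization basis: {basis}",
        f"- Engagement window: {window}",
        f"- Out of scope: {', '.join(out_of_scope) or 'none declared'}",
        f"- Operator: {operator}",
    ]
    return "\n".join(lines) + "\n"
```

Anything other than the exact word fails the gate, including hedged replies such as "authorized, I think", which matches the rule that any other answer means STOP.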

Build scope.txt:

localhost
127.0.0.1
staging.example.com
192.168.1.0/24    # internal lab only, with operator OK

Read references/scope-enforcement.md before issuing the first active request — that doc has the host-extraction rules you apply to every command/URL before it goes out.
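
As a rough illustration of an allowlist check against scope.txt, the sketch below parses hostnames and CIDR entries and tests a URL or host against them. It does not reproduce the rules in references/scope-enforcement.md; the function names are hypothetical, and the exact-match policy for hostnames (no wildcard or suffix matching) is an assumption.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_scope(text):
    """Split scope.txt content into (hostnames, networks), ignoring # comments."""
    hosts, networks = set(), []
    for raw in text.splitlines():
        entry = raw.split("#", 1)[0].strip().lower()
        if not entry:
            continue
        try:
            # Bare IPs become /32 (or /128) networks; CIDRs parse directly.
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            hosts.add(entry)
    return hosts, networks


def host_of(target):
    """Extract the hostname from a URL or a bare host[:port] string."""
    if "://" not in target:
        target = "//" + target
    return (urlsplit(target).hostname or "").lower()


def in_scope(target, hosts, networks):
    """True if the target's host is an allowlisted name or inside an allowlisted network."""
    host = host_of(target)
    if not host:
        return False
    if host in hosts:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # unlisted hostname: refuse rather than guess
    return any(addr in net for net in networks)
```

The same check applies to redirect targets: run `in_scope` on the `Location` host before following a 3xx, and stop if it returns `False`.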

Phase 1: Pre-Recon (Code Analysis, optional)

Skip if no source access (black-box engagement).

If you have read access to the application source:

Map the architecture — framework, routing, middleware stack
Inventory sinks — every execute(, os.system(, eval(, template render, file read/write, redirect target
Map auth — session cookie vs JWT, OAuth flows, password reset, privileged endpoints
Identify trust boundaries — what's authenticated, what's not, what comes from request.*
Trace taint backward from each sink to a request source. Stop tracing a path as soon as proper sanitization is found (parameterized queries, allowlists, shlex.quote, well-known escapers).

Output: evidence/pre-recon.md — architecture map, sink inventory, suspected vulnerable code paths.
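
The sink inventory step could start from a simple line scan like the one below. It is a rough sketch covering only the three call patterns named above (`execute(`, `os.system(`, `eval(`); the pattern table and function name are hypothetical, and a regex scan will miss aliased or dynamically built calls, so treat hits as leads for manual taint tracing.

```python
import re

# Only the sinks named in the text; extend per framework as needed.
SINK_PATTERNS = {
    "execute": re.compile(r"\bexecute\("),
    "os.system": re.compile(r"\bos\.system\("),
    "eval": re.compile(r"\beval\("),
}


def find_sinks(path, source):
    """Return (file:line, sink kind, stripped code line) for each match in source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in SINK_PATTERNS.items():
            if pattern.search(line):
                hits.append((f"{path}:{lineno}", kind, line.strip()))
    return hits
```

The `file:line` strings are in the same form the analysis queue later uses for its source field. This runs entirely on local files, so it stays within the offline constraint of this phase.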

This is OFFLINE work. No traffic to the target.

Phase 2: Recon (Live, Read-Only)

Maps the attack surface. Requests are limited to read-only fetches (GET/HEAD) of public pages plus, where scope permits, a port scan. No payloads yet. Still scope-bounded.

Verify scope. Resolve every target hostname → IP. Confirm IPs are in scope (avoids the "DNS points somewhere unexpected" trap).
Network surface (only if scope permits port scanning):
```
# TCP connect scan of the 100 most common ports; quote the target to avoid word splitting
nmap -sT -T3 --top-ports 100 -oN evidence/nmap.txt "$TARGET"
```
Use -T3 (nmap's default "normal" timing), not -T4/-T5. It is less aggressive, so it is less likely to trip IDS/IPS or overload hosts in shared environments.

Tech fingerprint:

whatweb -v "$TARGET_URL" > evidence/whatweb.txt
# -I sends a HEAD request; -k skips TLS certificate verification
curl -sIk "$TARGET_URL" > evidence/headers.txt

Endpoint discovery:
- Crawl the app with the browser tool (browser_navigate, browser_get_images, follow links).
- Inspect robots.txt, sitemap.xml, .well-known/*.
- Use the developer tools network panel via browser tool to capture XHR/fetch calls.
Auth surface: Identify login, registration, password reset, session cookie names, token formats. Do NOT send credentials yet — just observe.
Correlate with pre-recon (if you have source). For each evidence/pre-recon.md finding, mark whether the live surface confirms it's reachable.

Output: evidence/recon.md — endpoints, technologies, auth model, input vectors.
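
The "verify scope" step at the top of this phase can be sketched as follows: resolve the hostname, then list any address that falls outside the networks the operator declared. The function names are hypothetical, and the sketch assumes the operator has listed the expected networks as CIDRs; the `resolver` parameter exists so the check can be tested without live DNS.

```python
import ipaddress
import socket


def resolve_ips(hostname):
    """Return the sorted set of addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def out_of_scope_ips(hostname, allowed_networks, resolver=resolve_ips):
    """List resolved addresses that fall outside every allowed network."""
    nets = [ipaddress.ip_network(n, strict=False) for n in allowed_networks]
    stray = []
    for ip in resolver(hostname):
        addr = ipaddress.ip_address(ip.split("%", 1)[0])  # drop any IPv6 zone id
        if not any(addr in net for net in nets):
            stray.append(ip)
    return stray
```

A non-empty result means the name points somewhere the operator did not declare; stop and confirm before sending anything to that address.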

Phase 3: Vulnerability Analysis

One delegate_task per vulnerability class. Each agent reads evidence/recon.md (plus evidence/pre-recon.md if present) and produces findings/<class>-queue.json using templates/exploitation-queue.json.

Use delegate_task with these focused subagents (parallel where possible):

| Class | Goal | Reference |
| --- | --- | --- |
| `injection` | SQLi, command, path traversal, SSTI, LFI/RFI, deserialization | `references/vuln-taxonomy.md` (slot types) |
| `xss` | Reflected, stored, DOM-based | `references/vuln-taxonomy.md` (render contexts) |
| `auth` | Login bypass, JWT confusion, session fixation, OAuth flaws | `references/exploitation-techniques.md` |
| `authz` | IDOR, vertical/horizontal escalation, business logic | `references/exploitation-techniques.md` |
| `ssrf` | Internal reachability, metadata, protocol smuggling | Skip metadata unless explicitly authorized |
| `infra` | Misconfig, info disclosure, default creds, exposed admin | `references/exploitation-techniques.md` |

Each queue entry has: id, vuln class, source (file:line if known), endpoint, parameter, slot type, suspected defense, verdict (identified / partial / confirmed / critical), witness payload, confidence (0-1), notes.
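
One way to keep queue entries well-formed is a small validated record type. This is a sketch only: the field names below are snake_case guesses derived from the list above, not the actual keys in templates/exploitation-queue.json, and `VERDICTS` contains just the four verdicts named in that list.

```python
from dataclasses import dataclass, asdict
import json

VERDICTS = ("identified", "partial", "confirmed", "critical")


@dataclass
class QueueEntry:
    id: str
    vuln_class: str
    endpoint: str
    parameter: str
    slot_type: str
    suspected_defense: str
    verdict: str
    witness_payload: str  # staged only; nothing is sent during analysis
    confidence: float
    source: str = ""  # file:line if known
    notes: str = ""

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")


def dump_queue(entries):
    """Serialize entries to the JSON text written to findings/<class>-queue.json."""
    return json.dumps([asdict(e) for e in entries], indent=2)
```

Rejecting bad verdicts and out-of-range confidence at construction time keeps malformed entries from reaching the exploitation phase.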

The analysis phase doesn't send malicious payloads yet — it stages them. The exploitation phase actually fires them.

Phase 4: Exploitation (Proof-Based, Conditional)

Run a subagent only for classes whose analysis queue has actionable entries (identified or partial).

For each candidate:

Pre-send check — host in scope? auth gate satisfied? payload approved if destructive?
Send the witness payload — minimal proof. SQLi: ' AND 1=1-- then ' AND 1=2--. XSS: a benign marker like <svg/onload=console.log("HERMES-PENTEST-XSS")>. Never alert(1) in stored XSS — it'll fire for other users in shared environments.
Verify the witness fires. For blind injection, use a sleep probe (SLEEP(5)) and compare the response time against an unmodified baseline request. For SSRF, use a callback host you control, not a public service such as webhook.site: on sensitive engagements a third-party callback is itself an exfiltration path.
Promote level:
- L1 Identified — pattern matched, no behavior change
- L2 Partial — sink reached, but defense in place
- L3 Confirmed — payload changed app behavior in observable way
- L4 Critical — data extracted, code executed, access escalated
Bypass exhaustion before classifying as FP. For each candidate that is blocked, try at least the bypass set in references/bypass-techniques.md for that class. Only after the set is exhausted may you write verdict: false_positive.
Record evidence for every L3/L4:
- Full request (method, URL, headers, body)
- Response (status, headers, relevant body excerpt)
- Reproducer command (curl one-liner)
- Impact statement

Output: findings/exploitation-evidence.md
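
The promotion levels and the evidence requirement can be captured in a few lines. The sketch below assumes three boolean observations per candidate and a dict-shaped evidence record; those inputs, and the names `promote` and `missing_evidence`, are illustrative rather than part of the skill.

```python
from enum import IntEnum


class Level(IntEnum):
    IDENTIFIED = 1  # pattern matched, no behavior change
    PARTIAL = 2     # sink reached, but a defense held
    CONFIRMED = 3   # payload changed app behavior observably
    CRITICAL = 4    # data extracted, code executed, or access escalated


REQUIRED_EVIDENCE = ("request", "response", "reproducer", "impact")


def promote(sink_reached, behavior_changed, impact_demonstrated):
    """Map what was actually observed to the highest level it supports."""
    if impact_demonstrated:
        return Level.CRITICAL
    if behavior_changed:
        return Level.CONFIRMED
    if sink_reached:
        return Level.PARTIAL
    return Level.IDENTIFIED


def missing_evidence(level, record):
    """Evidence fields still absent for a finding; L1/L2 need none."""
    if level < Level.CONFIRMED:
        return []
    return [key for key in REQUIRED_EVIDENCE if not record.get(key)]
```

A finding is ready for findings/exploitation-evidence.md only when `missing_evidence` returns an empty list for its level.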

Redaction rules for chat and evidence files:

Any captured credentials/tokens → last 6 chars only in chat; full value to findings/secrets-vault.md (gitignored).
Other users' PII → redact.
Your test credentials → fine to keep.
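
A minimal sketch of the "last 6 chars" rule for anything that enters chat history. The function name `redact` and the `***` prefix are assumptions; the choice to mask short values entirely is a conservative addition, since revealing six characters of a six-character secret reveals all of it.

```python
def redact(secret, keep=6):
    """Mask all but the last `keep` characters of a captured secret."""
    if len(secret) <= keep:
        return "*" * len(secret)  # too short to show any part safely
    return "***" + secret[-keep:]
```

Apply it at the point where a value is about to be written into a message; the unredacted value goes only to the on-disk location named above.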

Phase 5: Reporting

Generate the final report using templates/pentest-report.md. Sections:

Executive summary
Engagement scope (from engagement/scope.txt)
Authorization (from engagement/authorization.md)
Findings (L3/L4 only — proof-required). Per finding:
- Title, severity (CVSS 3.1), CWE
- Affected endpoint(s)
- Proof (request + response excerpt)
- Reproduction steps
- Impact
- Remediation
Not-exploited candidates (L1/L2 with notes on what blocked them)
Out-of-scope observations
Methodology / tools used
Limitations and what was NOT tested

Severity policy: CVSS only for L3/L4. L1/L2 are "candidates pending verification" — don't assign CVSS to unverified findings.
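
The severity policy can be enforced mechanically when assembling the report. The sketch below assumes each finding is a dict with an integer `level` (1 to 4) and an optional `cvss` score; both key names and the function name are hypothetical.

```python
def split_for_report(findings):
    """Separate proof-backed findings (L3/L4) from unverified candidates (L1/L2)."""
    reportable, candidates = [], []
    for finding in findings:
        if finding["level"] >= 3:
            reportable.append(finding)
        else:
            # Unverified candidates never carry a CVSS score into the report.
            candidates.append({k: v for k, v in finding.items() if k != "cvss"})
    return reportable, candidates
```

The first list feeds the Findings section and the second feeds "Not-exploited candidates". Stripping `cvss` from the second list guards against a score that was assigned prematurely during analysis.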

When to Stop

The user revokes authorization.
A candidate finding clearly impacts production data and you don't have approval for destructive testing — STOP and ask.
The target starts returning 503/429 storms — back off, reconvene with the operator.
You discover something outside the contracted scope (e.g. an exposed customer database while testing an unrelated endpoint). STOP, document, report to the operator. Do not pivot without explicit approval: an unauthorized pivot is what turns an authorized test into illegal access.
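
The 503/429 stop condition above is easier to apply consistently with an explicit threshold. The sketch below is one possible definition of a "storm": at least half of the last 20 responses were 429 or 503. Both numbers and the class name `BackoffMonitor` are assumptions, not values from the skill.

```python
from collections import deque


class BackoffMonitor:
    """Track recent status codes and flag a 429/503 storm (hypothetical helper)."""

    def __init__(self, window=20, threshold=0.5):
        self._recent = deque(maxlen=window)  # True where the response was 429/503
        self.threshold = threshold

    def record(self, status):
        """Record one response status; return True if testing should pause."""
        self._recent.append(status in (429, 503))
        return self.should_stop()

    def should_stop(self):
        if len(self._recent) < self._recent.maxlen:
            return False  # not enough samples to judge yet
        return sum(self._recent) / len(self._recent) >= self.threshold
```

When `record` returns `True`, stop sending requests and reconvene with the operator rather than retrying.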

What This Skill Does NOT Cover

Network-layer pentesting beyond port scanning (no Metasploit, Cobalt Strike, AD attacks, network protocol fuzzing).
Reverse engineering / binary analysis (see issue #383).
Source-only static analysis (see issue #382).
Active social engineering / phishing.
Anything against systems the operator hasn't pre-authorized.

If the engagement needs any of these, escalate to a professional pentester. This skill complements professional pentesting; it does not replace it.
