fix(security): harden heredoc approval, NFKC homograph fold, env-var filter

Three independent security-scanner hardenings, re-homed onto the current shared threat-pattern architecture (tools/threat_patterns.py): - approval.py: add bash/sh/zsh/ksh heredoc to DANGEROUS_PATTERNS. The existing heredoc pattern only covered python/perl/ruby/node, so `bash <<'EOF' ... EOF` ran arbitrary shell — including exfil pipelines whose inner commands don't individually match a pattern — with no prompt. - threat_patterns.py: apply unicodedata.normalize("NFKC", ...) before pattern matching so full-width / compatibility homographs (e.g. `ｃａｔ ~/.hermes/.env`) are folded to ASCII and no longer bypass the keyword scanners. Invisible-char detection still runs on the raw content first (NFKC can strip those codepoints). - code_execution_tool.py: add CREDS/BEARER/APIKEY to _SECRET_SUBSTRINGS so vars like HERMES_LLM_CREDS, API_BEARER, MY_APIKEY are scrubbed from the sandbox env. PASS was intentionally dropped from the original proposal — it false-positives on BYPASS_CACHE / COMPASS_DIR / PASSENGER_HOST while PASSWORD/PASSWD already cover the credential cases. The original PR also proposed a 'synonym' injection pattern block (overlook/forget/set aside/bypass/discard + developer-mode); dropped here because it false-positives on ordinary AGENTS.md/SOUL.md prose ("don't forget to follow the rules", "run in developer mode"), exactly the bossy-English class threat_patterns.py is documented to avoid. Salvaged from #9028. Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-07-01 12:02:05 +00:00 · 2026-06-30 01:21:28 -07:00 · 2026-06-30 01:21:28 -07:00 · 3b2bb30c5d
commit 3b2bb30c5d
parent c8376e0dc6
6 changed files with 75 additions and 3 deletions
--- a/tools/code_execution_tool.py
+++ b/tools/code_execution_tool.py
@ -88,7 +88,16 @@ _SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
                      "TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
                      "XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA")
 _SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
-                      "PASSWD", "AUTH", "DSN", "WEBHOOK")
+                      "PASSWD", "AUTH", "DSN", "WEBHOOK",
+                      # Abbreviations that appear in real-world credential
+                      # variable names but were previously undetected:
+                      # CREDS (CREDENTIALS abbreviated), BEARER
+                      # (Authorization: Bearer tokens), APIKEY (written
+                      # without an underscore). "PASS" is intentionally NOT
+                      # added — it false-positives on legitimate non-secret
+                      # vars (BYPASS_CACHE, COMPASS_DIR, PASSENGER_HOST) while
+                      # PASSWORD/PASSWD already cover the credential cases.
+                      "CREDS", "BEARER", "APIKEY")

 # Operational HERMES_* vars the child legitimately needs by exact name — these
 # are non-secret runtime-location flags (the same set hermes_cli treats as the