fix(security): align cron invisible-unicode set with install-time scanner

The cron runtime tripwire (_scan_cron_prompt) used a 10-char invisible-unicode
set while the install-time scanner (threat_patterns.INVISIBLE_CHARS) flags 17.
The cron-local set was missing U+2062-U+2064 (invisible math operators) and
U+2066-U+2069 (directional isolates), so a directive obfuscated with one of
those codepoints (e.g. "ig<U+2063>nore all previous instructions") slipped past
the runtime cron gate while being caught at install time.

Import the canonical set so the cron tripwire and install scanner can't drift
apart again. Emoji-ZWJ protection (_zwj_has_emoji_neighbour) is unchanged.

Fixes #35075

Co-authored-by: rlaope <piyrw9754@gmail.com>
This commit is contained in:
teknium1 2026-06-26 00:52:33 -07:00 committed by Teknium
parent a0dc92450b
commit fbfccbb3ee
3 changed files with 34 additions and 4 deletions

View file

@ -115,10 +115,14 @@ _CRON_EXFIL_COMMAND_PATTERNS = [
(rf'curl\s+[^\n]*(?:-H|--header)\s+["\']Authorization:\s*(?:Bearer|token)\s+{_CRON_SECRET_VAR_RE}["\']', "exfil_curl_auth_header"),
]
_CRON_INVISIBLE_CHARS = {
'\u200b', '\u200c', '\u200d', '\u2060', '\ufeff',
'\u202a', '\u202b', '\u202c', '\u202d', '\u202e',
}
# Single source of truth, shared with the install-time scanner
# (threat_patterns.INVISIBLE_CHARS / skills_guard). Keeping a separate, narrower
# copy here let an obfuscated injection directive slip past this runtime cron
# tripwire while being caught at install time (or vice versa): U+2062-U+2064
# (invisible math operators) and U+2066-U+2069 (directional isolates) are real
# attack tools and were missing from the cron-local set. Importing the canonical
# set keeps the cron tripwire and the install scanner from drifting apart.
from tools.threat_patterns import INVISIBLE_CHARS as _CRON_INVISIBLE_CHARS
# U+200D Zero-Width Joiner is also a legitimate, required part of many
# Unicode emoji sequences (for example 👨‍👩‍👧, 🏳️‍🌈, ❤️‍🩹, 🧑‍💻).