feat(approval): hardline blocklist for unrecoverable commands (#15878)

Adds a floor below --yolo: a tiny set of commands so catastrophic they
should never run via the agent, regardless of --yolo, gateway /yolo,
approvals.mode=off, or cron approve mode.  Opting into yolo is trusting
the agent with your files and services — not trusting it to wipe the
disk or power the box off.

The list is deliberately small (12 patterns), covering only
unrecoverable ops:
- rm -rf targeting /, /home, /etc, /usr, /var, /boot, /bin, /sbin,
  /lib, ~, $HOME
- mkfs (any variant)
- dd + redirection to raw block devices (/dev/sd*, /dev/nvme*, etc.)
- fork bomb
- kill -1 / kill -9 -1
- shutdown, reboot, halt, poweroff, init 0/6, telinit 0/6,
  systemctl poweroff/reboot/halt/kexec

Recoverable-but-costly commands (git reset --hard, rm -rf /tmp/x,
chmod -R 777, curl | sh) stay in DANGEROUS_PATTERNS where yolo can
still pass them through — that's what yolo is for.

Container backends (docker/singularity/modal/daytona) continue to
bypass both hardline and dangerous checks, since nothing they do can
touch the host.

Inspired by Mercury Agent's permission-hardened blocklist.
This commit is contained in:
Teknium 2026-04-25 22:07:12 -07:00 committed by GitHub
parent a55de5bcd0
commit eb28145f36
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 427 additions and 13 deletions

View file

@ -234,7 +234,7 @@ class TestCronModeInteractions:
assert result["approved"]
def test_yolo_overrides_cron_deny(self, monkeypatch):
"""--yolo still works even if cron_mode=deny."""
"""--yolo still bypasses cron_mode=deny for dangerous (non-hardline) commands."""
monkeypatch.setenv("HERMES_CRON_SESSION", "1")
monkeypatch.setenv("HERMES_YOLO_MODE", "1")
monkeypatch.delenv("HERMES_INTERACTIVE", raising=False)
@ -242,7 +242,9 @@ class TestCronModeInteractions:
from unittest.mock import patch as mock_patch
with mock_patch("tools.approval._get_cron_approval_mode", return_value="deny"):
result = check_dangerous_command("rm -rf /", "local")
# Use a dangerous-but-not-hardline command — `rm -rf /` is now
# hardline-blocked regardless of yolo (see test_hardline_blocklist.py).
result = check_dangerous_command("rm -rf /tmp/stuff", "local")
assert result["approved"]
def test_non_cron_non_interactive_still_auto_approves(self, monkeypatch):