mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-29 06:31:32 +00:00
fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint
Adds a soft guard so an agent running under one Hermes profile cannot
silently edit a different profile's skills/plugins/cron/memories.
Three layers:
A. agent/file_safety.classify_cross_profile_target
Classifies a write target against the active HERMES_HOME. Returns
a {active_profile, target_profile, area, target_path} dict when the
path lands in another profile's scoped area. PROFILE_SCOPED_AREAS =
(skills, plugins, cron, memories). get_cross_profile_warning()
wraps it into a model-facing error string that names both profiles,
names the area, and points at the cross_profile=True bypass.
Defense-in-depth, NOT a security boundary — the terminal tool runs
as the same OS user and can write any of these paths directly. The
guard exists to prevent confused-agent corruption, not to stop a
determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
still applies.
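A minimal sketch of the classification step described above, under stated assumptions. The real function reads the active HERMES_HOME itself; this version takes `root` and `active_home` as explicit parameters so it can run standalone. The names `classify_sketch` and `_profile_name` are hypothetical. The layout (default profile at the root, named profiles under `profiles/<name>/`) follows the paths quoted in this message.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional

PROFILE_SCOPED_AREAS = ("skills", "plugins", "cron", "memories")


def _profile_name(home: Path, root: Path) -> str:
    """Hypothetical helper: map a profile home directory to a profile name."""
    try:
        parts = home.resolve().relative_to(root.resolve()).parts
    except ValueError:
        return "default"
    if len(parts) >= 2 and parts[0] == "profiles":
        return parts[1]
    return "default"


def classify_sketch(path: str, root: Path, active_home: Path) -> Optional[dict]:
    """Return a description of a cross-profile write target, or None."""
    target = Path(path).expanduser().resolve()
    try:
        parts = target.relative_to(root.resolve()).parts
    except ValueError:
        return None  # not under the Hermes root at all

    # <root>/profiles/<name>/... belongs to a named profile; everything
    # else under <root> belongs to the default profile.
    if len(parts) >= 3 and parts[0] == "profiles":
        target_profile, rest = parts[1], parts[2:]
    else:
        target_profile, rest = "default", parts

    if not rest or rest[0] not in PROFILE_SCOPED_AREAS:
        return None  # e.g. <root>/config.yaml is not a profile-scoped area

    active_profile = _profile_name(active_home, root)
    if active_profile == target_profile:
        return None  # in-profile write, nothing to warn about

    return {
        "active_profile": active_profile,
        "target_profile": target_profile,
        "area": rest[0],
        "target_path": str(target),
    }
```

Resolving both the target and the root before comparing keeps the check stable when the Hermes root sits behind a symlink.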
Wired into tools/file_tools.write_file_tool and patch_tool with a
cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
advertise cross_profile so the model can pass it after explicit
user direction. patch_tool extracts target paths from V4A patch
bodies before checking (same shape as the existing sensitive-path
check).
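For the patch path, a sketch of pulling target paths out of a V4A patch body before running the guard on each one. It assumes the usual V4A file headers (`*** Add File:`, `*** Update File:`, `*** Delete File:`, and `*** Move to:` for renames). `extract_v4a_paths` is a hypothetical name, not the helper used in tools/file_tools.

```python
from __future__ import annotations

import re

# Assumed V4A header shapes; each names one file the patch will touch.
_V4A_HEADER = re.compile(
    r"^\*\*\* (?:Add File|Update File|Delete File|Move to): (.+?)[ \t]*$",
    re.MULTILINE,
)


def extract_v4a_paths(patch_body: str) -> list[str]:
    """Return the file paths named in a V4A patch, in order, without duplicates."""
    paths: list[str] = []
    for match in _V4A_HEADER.finditer(patch_body):
        path = match.group(1)
        if path not in paths:
            paths.append(path)
    return paths
```

Each extracted path would then go through the same cross-profile check as a plain write_file target.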
skill_manage is already scoped to the active profile's SKILLS_DIR
by construction, so no extra guard wiring is needed there. The
D-side error message (below) still names other profiles when the
skill exists elsewhere.
B. agent/system_prompt
One deterministic line near the environment-hints block names the
active profile and tells the model not to modify another profile's
skills/plugins/cron/memories without explicit direction. Profile
name is stable for the lifetime of the AIAgent, so the line is
prompt-cache-safe.
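A sketch of what such a deterministic line could look like. The wording and the function name `profile_hint_line` are illustrative assumptions, not the text shipped in agent/system_prompt. The point is that the output depends only on the profile name, so it never changes within a session.

```python
def profile_hint_line(active_profile: str) -> str:
    """Build one deterministic line naming the active profile (wording is illustrative)."""
    if active_profile == "default":
        home = "~/.hermes/"
        others = "anything under ~/.hermes/profiles/"
    else:
        home = f"~/.hermes/profiles/{active_profile}/"
        others = (
            "~/.hermes/ (the default profile) or other directories "
            "under ~/.hermes/profiles/"
        )
    # No timestamps, counters, or random values: same input, same line.
    return (
        f"Active Hermes profile: {active_profile} ({home}). "
        f"Do not modify skills/plugins/cron/memories in {others} "
        "without explicit user direction; pass cross_profile=True only "
        "when the user asks for it."
    )
```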
D. tools/skill_manager_tool._skill_not_found_error
Replaces the bare "Skill 'X' not found." with a message that:
- names the active profile,
- searches OTHER profiles' skills dirs for the same name,
- names the profile(s) where the skill exists and the path,
- suggests `hermes -p <name>` to switch profiles, or
cross_profile=True for an explicit edit.
All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
write_file, remove_file) now go through the helper.
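The lookup behind that message can be sketched as follows. This is an assumption-laden illustration: `skill_not_found_sketch` is a hypothetical name, the Hermes root is passed in explicitly, and a skill is assumed to live at `<profile home>/skills/<name>/` as in the paths quoted in this message.

```python
from __future__ import annotations

from pathlib import Path


def skill_not_found_sketch(name: str, active_profile: str, root: Path) -> str:
    """Build a 'not found' message that points at other profiles holding the skill."""
    # The default profile lives at the root; named profiles under profiles/<name>/.
    homes = {"default": root}
    profiles_dir = root / "profiles"
    if profiles_dir.is_dir():
        for entry in sorted(profiles_dir.iterdir()):
            if entry.is_dir():
                homes[entry.name] = entry

    hits = [
        (profile, home / "skills" / name)
        for profile, home in homes.items()
        if profile != active_profile and (home / "skills" / name).is_dir()
    ]

    message = f"Skill '{name}' not found in the active profile '{active_profile}'."
    if not hits:
        # Missing everywhere: fall back to a skills_list hint.
        return message + " Use skills_list to see the skills in this profile."

    found = "; ".join(f"profile '{profile}' at {path}" for profile, path in hits)
    return (
        f"{message} It exists in {found}. Switch with `hermes -p <name>`, "
        "or pass cross_profile=True for an explicit edit."
    )
```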
Reference incident (May 2026): a hermes-security profile session
edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills) without
realizing the second path belonged to a different profile. Three of
the four skill files needed manual restoration afterward.
What this PR does NOT do:
* No hard block. The terminal tool can still touch any of these
paths with no guard — same posture as the dangerous-command
approval flow. SECURITY.md §3.2 applies.
* No regex sweep on terminal commands for cross-profile paths.
That direction is a Skills-Guard-style arms race (cd + relative
paths, base64, etc.) and would false-positive on legitimate
cross-profile reads. Filed as a follow-up.
* No on-disk path migration. ~/.hermes/skills/ remains the
default profile's skills dir; this PR is about telling the
agent about that boundary, not changing the layout.
Tests:
tests/agent/test_file_safety_cross_profile.py (16 tests)
- _resolve_active_profile_name covers default/named/failure paths
- classify_cross_profile_target covers all four scoped areas,
all three directions (default → named, named → default, named → named),
non-Hermes paths, and root-level config files
- get_cross_profile_warning covers in-profile no-op, cross-profile
message shape, and the defense-in-depth self-documentation
tests/tools/test_cross_profile_guard.py (12 tests)
- write_file: in-profile allow, cross-profile block, cross_profile=True
bypass, non-Hermes pass-through
- patch: replace-mode block, cross_profile=True bypass, V4A patch
path extraction
- skill_manage: error names the other profile (single + multiple),
missing-everywhere falls back to skills_list hint
- system prompt: contract-level checks (both branches present,
cross_profile=True mentioned, ~/.hermes/profiles/ referenced)
All 207 existing tests in file_safety/file_operations/skill_manager
still pass. 10 system-prompt tests still pass.
E2E verified: the exact incident scenario (security profile editing
default's hermes-agent-dev skill) is now blocked with the warning
message; cross_profile=True unblocks.
* fix(code_execution): add cross_profile to write_file/patch stubs
The cross_profile kwarg added to write_file_tool/patch_tool needs to
flow through the execute_code sandbox stubs in _TOOL_STUBS so the
test_stubs_cover_all_schema_params drift test passes. Without this,
scripts running inside execute_code couldn't pass cross_profile=True
through hermes_tools.write_file().
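A drift check of this kind can be sketched as below. The schema shape and the stub functions are illustrative assumptions; the real WRITE_FILE_SCHEMA and _TOOL_STUBS entries may be structured differently, and `missing_stub_params` is a hypothetical name.

```python
from __future__ import annotations

import inspect
from typing import Callable


def missing_stub_params(schema: dict, stub: Callable) -> set[str]:
    """Return schema parameter names that the stub's signature does not accept."""
    advertised = set(schema["parameters"]["properties"])
    accepted = set(inspect.signature(stub).parameters)
    return advertised - accepted


# Illustrative schema and stubs, not the real project definitions.
EXAMPLE_SCHEMA = {
    "name": "write_file",
    "parameters": {
        "type": "object",
        "properties": {"path": {}, "content": {}, "cross_profile": {}},
    },
}


def stale_stub(path, content):
    """Stub written before cross_profile was added to the schema."""


def updated_stub(path, content, cross_profile=False):
    """Stub that accepts the new kwarg."""
```

A test that asserts the returned set is empty for every stub fails as soon as a schema gains a parameter the sandbox cannot forward.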
Caught by CI on PR #31290.
parent b207dc28b3
commit d3c167b644
7 changed files with 846 additions and 19 deletions
219
tests/agent/test_file_safety_cross_profile.py
Normal file
@@ -0,0 +1,219 @@
"""Tests for the cross-Hermes-profile write guard in agent/file_safety.

The guard fires when a tool tries to write into another Hermes profile's
skills/plugins/cron/memories directory. It's a soft guard — defense in
depth, NOT a security boundary — but it prevents the agent from silently
corrupting a profile that belongs to a different session.

Reference: May 2026 incident — a hermes-security profile session
accidentally edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills), realizing only
afterwards that the second path belonged to a different profile.
"""
from __future__ import annotations

import os
from pathlib import Path

import pytest


# ---------------------------------------------------------------------------
# Helpers — set up a fake Hermes root with two profiles, monkeypatch the
# resolver helpers so the classifier sees the test layout.
# ---------------------------------------------------------------------------


@pytest.fixture
def fake_hermes(tmp_path, monkeypatch):
    """Build a fake Hermes layout:

        <tmp>/
            skills/foo/SKILL.md            # default profile
            plugins/foo/__init__.py
            cron/<state>
            memories/MEMORY.md
            profiles/
                hermes-security/
                    skills/foo/SKILL.md    # named profile
                    plugins/...
                coder/
                    skills/foo/SKILL.md    # another named profile
    """
    root = tmp_path / "fake-hermes"
    (root / "skills" / "foo").mkdir(parents=True)
    (root / "skills" / "foo" / "SKILL.md").write_text("# default skill\n")
    (root / "plugins" / "foo").mkdir(parents=True)
    (root / "memories").mkdir(parents=True)
    (root / "cron").mkdir(parents=True)

    sec_home = root / "profiles" / "hermes-security"
    (sec_home / "skills" / "foo").mkdir(parents=True)
    (sec_home / "skills" / "foo" / "SKILL.md").write_text("# sec skill\n")
    (sec_home / "plugins").mkdir(parents=True)

    coder_home = root / "profiles" / "coder"
    (coder_home / "skills" / "foo").mkdir(parents=True)
    (coder_home / "skills" / "foo" / "SKILL.md").write_text("# coder skill\n")

    # Monkeypatch the resolver functions used by file_safety so each test
    # can choose which profile is "active".
    import hermes_constants
    monkeypatch.setattr(hermes_constants, "get_default_hermes_root", lambda: root)

    # The reloads below ensure get_cross_profile_warning/classify see the patched root.
    import agent.file_safety as fs
    monkeypatch.setattr(fs, "_hermes_root_path", lambda: root)

    return {
        "root": root,
        "default_home": root,
        "security_home": sec_home,
        "coder_home": coder_home,
    }


def _set_active_home(monkeypatch, hermes_home: Path):
    """Point file_safety._hermes_home_path at a specific profile dir."""
    import agent.file_safety as fs
    monkeypatch.setattr(fs, "_hermes_home_path", lambda: hermes_home)


# ---------------------------------------------------------------------------
# _resolve_active_profile_name
# ---------------------------------------------------------------------------


class TestResolveActiveProfileName:
    def test_default_when_home_is_root(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["default_home"])
        from agent.file_safety import _resolve_active_profile_name
        assert _resolve_active_profile_name() == "default"

    def test_named_profile(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import _resolve_active_profile_name
        assert _resolve_active_profile_name() == "hermes-security"

    def test_falls_back_to_default_on_resolution_failure(self, fake_hermes, monkeypatch):
        """If HERMES_HOME resolution raises, return 'default' rather than crashing the tool."""
        import agent.file_safety as fs

        def _boom():
            raise RuntimeError("simulated")

        monkeypatch.setattr(fs, "_hermes_home_path", _boom)
        # Should not raise — falls back to "default"
        assert fs._resolve_active_profile_name() == "default"


# ---------------------------------------------------------------------------
# classify_cross_profile_target
# ---------------------------------------------------------------------------


class TestClassifyCrossProfileTarget:
    def test_same_profile_write_returns_none(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        result = classify_cross_profile_target(
            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
        )
        assert result is None

    def test_security_writing_default_skill(self, fake_hermes, monkeypatch):
        """The exact incident from May 2026."""
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        result = classify_cross_profile_target(
            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
        )
        assert result is not None
        assert result["active_profile"] == "hermes-security"
        assert result["target_profile"] == "default"
        assert result["area"] == "skills"

    def test_default_writing_security_skill(self, fake_hermes, monkeypatch):
        """Inverse direction — default-profile session reaching into a named profile."""
        _set_active_home(monkeypatch, fake_hermes["default_home"])
        from agent.file_safety import classify_cross_profile_target
        result = classify_cross_profile_target(
            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
        )
        assert result is not None
        assert result["active_profile"] == "default"
        assert result["target_profile"] == "hermes-security"

    def test_named_to_named_cross_profile(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        result = classify_cross_profile_target(
            str(fake_hermes["coder_home"] / "skills" / "foo" / "SKILL.md")
        )
        assert result is not None
        assert result["target_profile"] == "coder"

    @pytest.mark.parametrize("area", ["skills", "plugins", "cron", "memories"])
    def test_all_profile_scoped_areas_classified(self, fake_hermes, monkeypatch, area):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        target = fake_hermes["default_home"] / area / "foo.txt"
        result = classify_cross_profile_target(str(target))
        assert result is not None
        assert result["area"] == area

    def test_non_hermes_path_returns_none(self, fake_hermes, monkeypatch, tmp_path):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        # Path outside any Hermes root
        assert classify_cross_profile_target(str(tmp_path / "random.txt")) is None

    def test_hermes_config_not_classified_as_cross_profile(self, fake_hermes, monkeypatch):
        """Files under <root>/config.yaml or <root>/.env are NOT profile-scoped
        (already covered by build_write_denied_paths). Don't double-warn."""
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import classify_cross_profile_target
        # config.yaml at root level is not in PROFILE_SCOPED_AREAS
        result = classify_cross_profile_target(
            str(fake_hermes["default_home"] / "config.yaml")
        )
        assert result is None


# ---------------------------------------------------------------------------
# get_cross_profile_warning
# ---------------------------------------------------------------------------


class TestGetCrossProfileWarning:
    def test_in_profile_returns_none(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import get_cross_profile_warning
        assert get_cross_profile_warning(
            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
        ) is None

    def test_cross_profile_warning_names_both_profiles(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import get_cross_profile_warning
        warn = get_cross_profile_warning(
            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
        )
        assert warn is not None
        # Must name BOTH profiles so the model knows which is which.
        assert "default" in warn
        assert "hermes-security" in warn
        # Must name the bypass kwarg.
        assert "cross_profile=True" in warn
        # Must reference the area.
        assert "skills" in warn

    def test_warning_is_defense_in_depth_not_boundary(self, fake_hermes, monkeypatch):
        _set_active_home(monkeypatch, fake_hermes["security_home"])
        from agent.file_safety import get_cross_profile_warning
        warn = get_cross_profile_warning(
            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
        )
        # Must self-document as defense-in-depth so future reviewers
        # don't promote it to a hard block.
        assert "not a security boundary" in warn.lower()