fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)

* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint

  Adds a soft guard so an agent running under one Hermes profile cannot
  silently edit a different profile's skills/plugins/cron/memories.

  Three layers:

  A. agent/file_safety.classify_cross_profile_target

     Classifies a write target against the active HERMES_HOME. Returns a
     {active_profile, target_profile, area, target_path} dict when the path
     lands in another profile's scoped area.
     PROFILE_SCOPED_AREAS = (skills, plugins, cron, memories).

     get_cross_profile_warning() wraps it into a model-facing error string
     that names both profiles, names the area, and points at the
     cross_profile=True bypass.

     Defense-in-depth, NOT a security boundary: the terminal tool runs as
     the same OS user and can write any of these paths directly. The guard
     exists to prevent confused-agent corruption, not to stop a determined
     attacker. SECURITY.md §3.2 (terminal-bypass posture) still applies.

     Wired into tools/file_tools.write_file_tool and patch_tool with a
     cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
     advertise cross_profile so the model can pass it after explicit user
     direction. patch_tool extracts target paths from V4A patch bodies
     before checking (same shape as the existing sensitive-path check).

     skill_manage is already scoped to the active profile's SKILLS_DIR by
     construction, so no extra guard wiring is needed there. The D-side
     error message (below) still names other profiles when the skill
     exists elsewhere.

  B. agent/system_prompt

     One deterministic line near the environment-hints block names the
     active profile and tells the model not to modify another profile's
     skills/plugins/cron/memories without explicit direction. Profile name
     is stable for the lifetime of the AIAgent, so the line is
     prompt-cache-safe.

  D. tools/skill_manager_tool._skill_not_found_error

     Replaces the bare "Skill 'X' not found." with a message that:
     - names the active profile,
     - searches OTHER profiles' skills dirs for the same name,
     - names the profile(s) where the skill exists and the path,
     - suggests `hermes -p <name>` to switch profiles, or
       cross_profile=True for an explicit edit.

     All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
     write_file, remove_file) now go through the helper.

  Reference incident (May 2026): a hermes-security profile session edited
  skills under both ~/.hermes/profiles/hermes-security/skills/ AND
  ~/.hermes/skills/ (the default profile's skills) without realizing the
  second path belonged to a different profile. Three of the four skill
  files needed manual restoration afterward.

  What this PR does NOT do:

  * No hard block. The terminal tool can still touch any of these paths
    with no guard, the same posture as the dangerous-command approval
    flow. SECURITY.md §3.2 applies.
  * No regex sweep on terminal commands for cross-profile paths. That
    direction is a Skills-Guard-style arms race (cd + relative paths,
    base64, etc.) and would false-positive on legitimate cross-profile
    reads. Filed as a follow-up.
  * No on-disk path migration. ~/.hermes/skills/ remains the default
    profile's skills dir; this PR is about telling the agent about that
    boundary, not changing the layout.

  Tests:

  tests/agent/test_file_safety_cross_profile.py (16 tests)
  - _resolve_active_profile_name covers default/named/failure paths
  - classify_cross_profile_target covers all four scoped areas, all three
    directions (default → named, named → default, named → named),
    non-Hermes paths, and root-level config files
  - get_cross_profile_warning covers in-profile no-op, cross-profile
    message shape, and the defense-in-depth self-documentation

  tests/tools/test_cross_profile_guard.py (12 tests)
  - write_file: in-profile allow, cross-profile block, cross_profile=True
    bypass, non-Hermes pass-through
  - patch: replace-mode block, cross_profile=True bypass, V4A patch path
    extraction
  - skill_manage: error names the other profile (single + multiple),
    missing-everywhere falls back to skills_list hint
  - system prompt: contract-level checks (both branches present,
    cross_profile=True mentioned, ~/.hermes/profiles/ referenced)

  All 207 existing tests in file_safety/file_operations/skill_manager
  still pass. 10 system-prompt tests still pass.

  E2E verified: the exact incident scenario (security profile editing
  default's hermes-agent-dev skill) is now blocked with the warning
  message; cross_profile=True unblocks.

* fix(code_execution): add cross_profile to write_file/patch stubs

  The cross_profile kwarg added to write_file_tool/patch_tool needs to
  flow through the execute_code sandbox stubs in _TOOL_STUBS so the
  test_stubs_cover_all_schema_params drift test passes. Without this,
  scripts running inside execute_code couldn't pass cross_profile=True
  through hermes_tools.write_file().

  Caught by CI on PR #31290.
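
The classification rule described under layer A can be illustrated with a minimal sketch. This is not the code in agent/file_safety: the `sketch_*` function names, the explicit `hermes_root` / `active_home` parameters, and the warning wording are all assumptions made so the example is self-contained. The only things taken from the commit message and tests are the four scoped areas, the `<root>/profiles/<name>/` layout, the returned dict keys, and the requirement that the warning names both profiles, the area, `cross_profile=True`, and its defense-in-depth status.

```python
from __future__ import annotations

from pathlib import Path

# Areas named in the commit message as profile-scoped.
PROFILE_SCOPED_AREAS = ("skills", "plugins", "cron", "memories")


def sketch_classify_cross_profile_target(
    target: str | Path, hermes_root: str | Path, active_home: str | Path
) -> dict | None:
    """Hypothetical stand-in for the classifier: return a dict describing a
    cross-profile hit, or None when the write is in-profile or not scoped."""
    root = Path(hermes_root).resolve()
    home = Path(active_home).resolve()
    path = Path(target).expanduser().resolve()

    try:
        parts = path.relative_to(root).parts
    except ValueError:
        return None  # path is outside the Hermes root entirely

    # <root>/profiles/<name>/<area>/... belongs to a named profile;
    # <root>/<area>/... belongs to the default profile.
    if len(parts) >= 3 and parts[0] == "profiles":
        target_profile, area_parts = parts[1], parts[2:]
        target_home = root / "profiles" / parts[1]
    else:
        target_profile, area_parts, target_home = "default", parts, root

    if not area_parts or area_parts[0] not in PROFILE_SCOPED_AREAS:
        return None  # e.g. <root>/config.yaml is not profile-scoped
    if target_home == home:
        return None  # writing inside the active profile

    active_profile = home.name if home.parent == root / "profiles" else "default"
    return {
        "active_profile": active_profile,
        "target_profile": target_profile,
        "area": area_parts[0],
        "target_path": str(path),
    }


def sketch_cross_profile_warning(
    target: str | Path, hermes_root: str | Path, active_home: str | Path
) -> str | None:
    """Hypothetical model-facing message; the wording is illustrative only."""
    hit = sketch_classify_cross_profile_target(target, hermes_root, active_home)
    if hit is None:
        return None
    return (
        f"Cross-profile write refused: {hit['target_path']} is in the "
        f"{hit['area']} area of profile '{hit['target_profile']}', but the "
        f"active profile is '{hit['active_profile']}'. Pass cross_profile=True "
        "only after explicit user direction. This guard is defense-in-depth, "
        "not a security boundary."
    )
```

Under these assumptions, a write from a hermes-security session to `<root>/skills/foo/SKILL.md` is classified as targeting the default profile's skills area, while a write to `<root>/config.yaml` or to any path outside the root is not classified at all.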
2026-07-17 14:42:06 +00:00 · 2026-05-24 00:38:17 -07:00 · 2026-05-24 00:38:17 -07:00 · d3c167b644
commit d3c167b644
parent b207dc28b3
7 changed files with 846 additions and 19 deletions
--- a/tests/agent/test_file_safety_cross_profile.py
+++ b/tests/agent/test_file_safety_cross_profile.py
@@ -0,0 +1,219 @@
+"""Tests for the cross-Hermes-profile write guard in agent/file_safety.
+
+The guard fires when a tool tries to write into another Hermes profile's
+skills/plugins/cron/memories directory. It's a soft guard — defense in
+depth, NOT a security boundary — but it prevents the agent from silently
+corrupting a profile that belongs to a different session.
+
+Reference: May 2026 incident — a hermes-security profile session
+accidentally edited skills under both ~/.hermes/profiles/hermes-security/skills/
+AND ~/.hermes/skills/ (the default profile's skills), realizing only
+afterwards that the second path belonged to a different profile.
+"""
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+import pytest
+
+
+# ---------------------------------------------------------------------------
+# Helpers — set up a fake Hermes root with two profiles, monkeypatch the
+# resolver helpers so the classifier sees the test layout.
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def fake_hermes(tmp_path, monkeypatch):
+    """Build a fake Hermes layout:
+
+        <tmp>/
+          skills/foo/SKILL.md           # default profile
+          plugins/foo/__init__.py
+          cron/<state>
+          memories/MEMORY.md
+          profiles/
+            hermes-security/
+              skills/foo/SKILL.md       # named profile
+              plugins/...
+            coder/
+              skills/foo/SKILL.md       # another named profile
+    """
+    root = tmp_path / "fake-hermes"
+    (root / "skills" / "foo").mkdir(parents=True)
+    (root / "skills" / "foo" / "SKILL.md").write_text("# default skill\n")
+    (root / "plugins" / "foo").mkdir(parents=True)
+    (root / "memories").mkdir(parents=True)
+    (root / "cron").mkdir(parents=True)
+
+    sec_home = root / "profiles" / "hermes-security"
+    (sec_home / "skills" / "foo").mkdir(parents=True)
+    (sec_home / "skills" / "foo" / "SKILL.md").write_text("# sec skill\n")
+    (sec_home / "plugins").mkdir(parents=True)
+
+    coder_home = root / "profiles" / "coder"
+    (coder_home / "skills" / "foo").mkdir(parents=True)
+    (coder_home / "skills" / "foo" / "SKILL.md").write_text("# coder skill\n")
+
+    # Monkeypatch the resolver functions used by file_safety so each test
+    # can choose which profile is "active".
+    import hermes_constants
+    monkeypatch.setattr(hermes_constants, "get_default_hermes_root", lambda: root)
+
+    # Patch the root resolver on file_safety too, so get_cross_profile_warning/classify see the patched root.
+    import agent.file_safety as fs
+    monkeypatch.setattr(fs, "_hermes_root_path", lambda: root)
+
+    return {
+        "root": root,
+        "default_home": root,
+        "security_home": sec_home,
+        "coder_home": coder_home,
+    }
+
+
+def _set_active_home(monkeypatch, hermes_home: Path):
+    """Point file_safety._hermes_home_path at a specific profile dir."""
+    import agent.file_safety as fs
+    monkeypatch.setattr(fs, "_hermes_home_path", lambda: hermes_home)
+
+
+# ---------------------------------------------------------------------------
+# _resolve_active_profile_name
+# ---------------------------------------------------------------------------
+
+
+class TestResolveActiveProfileName:
+    def test_default_when_home_is_root(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["default_home"])
+        from agent.file_safety import _resolve_active_profile_name
+        assert _resolve_active_profile_name() == "default"
+
+    def test_named_profile(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import _resolve_active_profile_name
+        assert _resolve_active_profile_name() == "hermes-security"
+
+    def test_falls_back_to_default_on_resolution_failure(self, fake_hermes, monkeypatch):
+        """If HERMES_HOME resolution raises, return 'default' rather than crashing the tool."""
+        import agent.file_safety as fs
+
+        def _boom():
+            raise RuntimeError("simulated")
+
+        monkeypatch.setattr(fs, "_hermes_home_path", _boom)
+        # Should not raise — falls back to "default"
+        assert fs._resolve_active_profile_name() == "default"
+
+
+# ---------------------------------------------------------------------------
+# classify_cross_profile_target
+# ---------------------------------------------------------------------------
+
+
+class TestClassifyCrossProfileTarget:
+    def test_same_profile_write_returns_none(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        result = classify_cross_profile_target(
+            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        assert result is None
+
+    def test_security_writing_default_skill(self, fake_hermes, monkeypatch):
+        """The exact incident from May 2026."""
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        result = classify_cross_profile_target(
+            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        assert result is not None
+        assert result["active_profile"] == "hermes-security"
+        assert result["target_profile"] == "default"
+        assert result["area"] == "skills"
+
+    def test_default_writing_security_skill(self, fake_hermes, monkeypatch):
+        """Inverse direction — default-profile session reaching into a named profile."""
+        _set_active_home(monkeypatch, fake_hermes["default_home"])
+        from agent.file_safety import classify_cross_profile_target
+        result = classify_cross_profile_target(
+            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        assert result is not None
+        assert result["active_profile"] == "default"
+        assert result["target_profile"] == "hermes-security"
+
+    def test_named_to_named_cross_profile(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        result = classify_cross_profile_target(
+            str(fake_hermes["coder_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        assert result is not None
+        assert result["target_profile"] == "coder"
+
+    @pytest.mark.parametrize("area", ["skills", "plugins", "cron", "memories"])
+    def test_all_profile_scoped_areas_classified(self, fake_hermes, monkeypatch, area):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        target = fake_hermes["default_home"] / area / "foo.txt"
+        result = classify_cross_profile_target(str(target))
+        assert result is not None
+        assert result["area"] == area
+
+    def test_non_hermes_path_returns_none(self, fake_hermes, monkeypatch, tmp_path):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        # Path outside any Hermes root
+        assert classify_cross_profile_target(str(tmp_path / "random.txt")) is None
+
+    def test_hermes_config_not_classified_as_cross_profile(self, fake_hermes, monkeypatch):
+        """Files under <root>/config.yaml or <root>/.env are NOT profile-scoped
+        (already covered by build_write_denied_paths). Don't double-warn."""
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import classify_cross_profile_target
+        # config.yaml at root level is not in PROFILE_SCOPED_AREAS
+        result = classify_cross_profile_target(
+            str(fake_hermes["default_home"] / "config.yaml")
+        )
+        assert result is None
+
+
+# ---------------------------------------------------------------------------
+# get_cross_profile_warning
+# ---------------------------------------------------------------------------
+
+
+class TestGetCrossProfileWarning:
+    def test_in_profile_returns_none(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import get_cross_profile_warning
+        assert get_cross_profile_warning(
+            str(fake_hermes["security_home"] / "skills" / "foo" / "SKILL.md")
+        ) is None
+
+    def test_cross_profile_warning_names_both_profiles(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import get_cross_profile_warning
+        warn = get_cross_profile_warning(
+            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        assert warn is not None
+        # Must name BOTH profiles so the model knows which is which.
+        assert "default" in warn
+        assert "hermes-security" in warn
+        # Must name the bypass kwarg.
+        assert "cross_profile=True" in warn
+        # Must reference the area.
+        assert "skills" in warn
+
+    def test_warning_is_defense_in_depth_not_boundary(self, fake_hermes, monkeypatch):
+        _set_active_home(monkeypatch, fake_hermes["security_home"])
+        from agent.file_safety import get_cross_profile_warning
+        warn = get_cross_profile_warning(
+            str(fake_hermes["default_home"] / "skills" / "foo" / "SKILL.md")
+        )
+        # Must self-document as defense-in-depth so future reviewers
+        # don't promote it to a hard block.
+        assert "not a security boundary" in warn.lower()
--- a/tests/tools/test_cross_profile_guard.py
+++ b/tests/tools/test_cross_profile_guard.py
@@ -0,0 +1,259 @@
+"""Tests for the cross-profile soft guard wired into write_file / patch /
+skill_manage.
+
+The classifier is tested in tests/agent/test_file_safety_cross_profile.py.
+This file tests that the tool surfaces:
+
+  1. Refuse cross-profile writes by default and return the warning.
+  2. Accept cross-profile writes when cross_profile=True is passed.
+  3. Continue to accept in-profile writes normally.
+  4. skill_manage's "not found" error names other profiles where the
+     skill exists.
+"""
+from __future__ import annotations
+
+import json
+import os
+from pathlib import Path
+
+import pytest
+
+
+@pytest.fixture
+def fake_hermes(tmp_path, monkeypatch):
+    """Build a two-profile Hermes layout and point HERMES_HOME at
+    the hermes-security profile (matching the original-incident shape).
+    """
+    root = tmp_path / "fake-hermes"
+    (root / "skills" / "shared-skill").mkdir(parents=True)
+    (root / "skills" / "shared-skill" / "SKILL.md").write_text(
+        "---\nname: shared-skill\ndescription: default copy.\n---\n"
+    )
+
+    sec_home = root / "profiles" / "hermes-security"
+    (sec_home / "skills").mkdir(parents=True)
+
+    coder_home = root / "profiles" / "coder"
+    (coder_home / "skills").mkdir(parents=True)
+
+    monkeypatch.setenv("HERMES_HOME", str(sec_home))
+
+    import hermes_constants
+    monkeypatch.setattr(hermes_constants, "get_default_hermes_root", lambda: root)
+
+    import agent.file_safety as fs
+    monkeypatch.setattr(fs, "_hermes_home_path", lambda: sec_home)
+    monkeypatch.setattr(fs, "_hermes_root_path", lambda: root)
+
+    return {
+        "root": root,
+        "sec_home": sec_home,
+        "coder_home": coder_home,
+    }
+
+
+# ---------------------------------------------------------------------------
+# write_file
+# ---------------------------------------------------------------------------
+
+
+class TestWriteFileCrossProfileGuard:
+    def test_in_profile_write_allowed(self, fake_hermes):
+        from tools.file_tools import write_file_tool
+        target = fake_hermes["sec_home"] / "skills" / "new-skill" / "SKILL.md"
+        target.parent.mkdir(parents=True)
+        result_json = write_file_tool(str(target), "in-profile content")
+        result = json.loads(result_json)
+        assert not result.get("error"), f"In-profile write should succeed: {result}"
+        assert target.exists()
+        assert target.read_text() == "in-profile content"
+
+    def test_cross_profile_write_blocked_by_default(self, fake_hermes):
+        """The May 2026 incident — security-profile session edits default
+        profile's skill. Must be blocked."""
+        from tools.file_tools import write_file_tool
+        target = fake_hermes["root"] / "skills" / "shared-skill" / "SKILL.md"
+        original = target.read_text()
+        result_json = write_file_tool(str(target), "OVERWRITTEN")
+        result = json.loads(result_json)
+        assert result.get("error"), "Cross-profile write should be refused"
+        assert "cross-profile" in result["error"].lower()
+        assert "default" in result["error"]
+        assert "hermes-security" in result["error"]
+        # File untouched.
+        assert target.read_text() == original
+
+    def test_cross_profile_True_bypass(self, fake_hermes):
+        """Explicit override after user direction must succeed."""
+        from tools.file_tools import write_file_tool
+        target = fake_hermes["root"] / "skills" / "shared-skill" / "SKILL.md"
+        result_json = write_file_tool(
+            str(target), "user-directed override", cross_profile=True
+        )
+        result = json.loads(result_json)
+        assert not result.get("error"), f"cross_profile=True must succeed: {result}"
+        assert target.read_text() == "user-directed override"
+
+    def test_non_hermes_path_unaffected(self, fake_hermes, tmp_path):
+        from tools.file_tools import write_file_tool
+        target = tmp_path / "outside" / "main.py"
+        target.parent.mkdir()
+        result_json = write_file_tool(str(target), "print('hello')")
+        result = json.loads(result_json)
+        assert not result.get("error")
+        assert target.exists()
+
+
+# ---------------------------------------------------------------------------
+# patch
+# ---------------------------------------------------------------------------
+
+
+class TestPatchCrossProfileGuard:
+    def test_cross_profile_patch_blocked(self, fake_hermes):
+        from tools.file_tools import patch_tool
+        target = fake_hermes["root"] / "skills" / "shared-skill" / "SKILL.md"
+        original = target.read_text()
+        result_json = patch_tool(
+            mode="replace",
+            path=str(target),
+            old_string="default copy.",
+            new_string="HIJACKED.",
+        )
+        result = json.loads(result_json)
+        assert result.get("error")
+        assert "cross-profile" in result["error"].lower()
+        assert target.read_text() == original
+
+    def test_cross_profile_patch_bypass(self, fake_hermes):
+        from tools.file_tools import patch_tool
+        target = fake_hermes["root"] / "skills" / "shared-skill" / "SKILL.md"
+        result_json = patch_tool(
+            mode="replace",
+            path=str(target),
+            old_string="default copy.",
+            new_string="user-directed update.",
+            cross_profile=True,
+        )
+        result = json.loads(result_json)
+        assert not result.get("error"), f"cross_profile=True bypass: {result}"
+        assert "user-directed update." in target.read_text()
+
+    def test_v4a_patch_extracts_path_for_guard(self, fake_hermes):
+        """V4A patches embed the target paths in the patch body, not in
+        a ``path`` kwarg. The guard must still apply."""
+        from tools.file_tools import patch_tool
+        target = fake_hermes["root"] / "skills" / "shared-skill" / "SKILL.md"
+        original = target.read_text()
+        v4a = (
+            "*** Begin Patch\n"
+            f"*** Update File: {target}\n"
+            "@@\n"
+            "-default copy.\n"
+            "+HIJACKED.\n"
+            "*** End Patch"
+        )
+        result_json = patch_tool(mode="patch", patch=v4a)
+        result = json.loads(result_json)
+        assert result.get("error"), f"V4A cross-profile must block: {result}"
+        assert "cross-profile" in result["error"].lower()
+        assert target.read_text() == original
+
+
+# ---------------------------------------------------------------------------
+# skill_manage — error message naming other profile (item D)
+# ---------------------------------------------------------------------------
+
+
+class TestSkillManageCrossProfileErrorUX:
+    def _make_skill_in_profile(self, profile_dir: Path, name: str):
+        d = profile_dir / "skills" / name
+        d.mkdir(parents=True, exist_ok=True)
+        (d / "SKILL.md").write_text(
+            f"---\nname: {name}\ndescription: a skill.\n---\n"
+        )
+
+    def test_error_names_other_profile_when_skill_lives_there(
+        self, fake_hermes, monkeypatch
+    ):
+        """The original incident shape — model expects 'foo' in active
+        profile, but 'foo' lives in default. Error must point at default."""
+        self._make_skill_in_profile(fake_hermes["root"], "default-only-skill")
+
+        # Re-import the module so SKILLS_DIR picks up HERMES_HOME (set in
+        # the fixture); skill_manager_tool computes SKILLS_DIR at import.
+        import importlib
+        import tools.skill_manager_tool
+        importlib.reload(tools.skill_manager_tool)
+        from tools.skill_manager_tool import _skill_not_found_error
+
+        err = _skill_not_found_error("default-only-skill")
+        assert "not found in active profile 'hermes-security'" in err
+        assert "default" in err
+        assert "cross_profile=True" in err
+
+    def test_error_names_multiple_profiles(self, fake_hermes, monkeypatch):
+        """When the skill exists in TWO other profiles, both should be named."""
+        self._make_skill_in_profile(fake_hermes["root"], "everywhere-skill")
+        self._make_skill_in_profile(fake_hermes["coder_home"], "everywhere-skill")
+
+        import importlib
+        import tools.skill_manager_tool
+        importlib.reload(tools.skill_manager_tool)
+        from tools.skill_manager_tool import _skill_not_found_error
+
+        err = _skill_not_found_error("everywhere-skill")
+        assert "default" in err
+        assert "coder" in err
+        # Switch-profiles hint
+        assert "hermes -p" in err
+
+    def test_genuinely_missing_skill_keeps_helpful_hint(
+        self, fake_hermes, monkeypatch
+    ):
+        """When no profile has the skill, error falls back to skills_list hint."""
+        import importlib
+        import tools.skill_manager_tool
+        importlib.reload(tools.skill_manager_tool)
+        from tools.skill_manager_tool import _skill_not_found_error
+
+        err = _skill_not_found_error("totally-imaginary-skill")
+        assert "not found in active profile 'hermes-security'" in err
+        assert "skills_list" in err
+
+
+# ---------------------------------------------------------------------------
+# System prompt active-profile line (item B)
+# ---------------------------------------------------------------------------
+
+
+class TestSystemPromptActiveProfile:
+    def test_default_profile_line_in_prompt(self, tmp_path, monkeypatch):
+        """When active profile is 'default', the prompt names it and warns
+        about ~/.hermes/profiles/<name>/."""
+        # Home and root are patched to the same dir, so the profile is 'default'.
+        import agent.file_safety as fs
+        monkeypatch.setattr(fs, "_hermes_home_path", lambda: tmp_path / "fake")
+        monkeypatch.setattr(fs, "_hermes_root_path", lambda: tmp_path / "fake")
+
+        from agent.file_safety import _resolve_active_profile_name
+        assert _resolve_active_profile_name() == "default"
+        # Only the resolved profile name is pinned here; the prompt builder
+        # is too heavy to instantiate end-to-end in a unit test.
+        # See agent/system_prompt.py for the exact wording.
+
+    def test_named_profile_line_in_prompt_text(self, fake_hermes):
+        """When active profile is 'hermes-security', the prompt warns
+        explicitly about NOT modifying default's skills/plugins/cron/memories."""
+        # Spot-check by reading the source — the contract is:
+        # (1) names the active profile, (2) names the default-profile
+        # paths, (3) says "do not modify another profile's" without
+        # explicit user direction.
+        repo_root = Path(__file__).resolve().parents[2]  # tests/tools/ -> repo root
+        src = (repo_root / "agent" / "system_prompt.py").read_text()
+        assert "Active Hermes profile" in src
+        assert "cross_profile=True" in src
+        assert "~/.hermes/profiles/" in src
+        # Both branches present (default and named profile).
+        assert "Active Hermes profile: default" in src
+        assert "Active Hermes profile: {active_profile}" in src