fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)

* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint

  Adds a soft guard so an agent running under one Hermes profile cannot
  silently edit a different profile's skills/plugins/cron/memories.
  Three layers:

  A. agent/file_safety.classify_cross_profile_target

     Classifies a write target against the active HERMES_HOME. Returns a
     {active_profile, target_profile, area, target_path} dict when the
     path lands in another profile's scoped area.
     PROFILE_SCOPED_AREAS = (skills, plugins, cron, memories).

     get_cross_profile_warning() wraps it into a model-facing error
     string that names both profiles, names the area, and points at the
     cross_profile=True bypass.

     Defense-in-depth, NOT a security boundary: the terminal tool runs
     as the same OS user and can write any of these paths directly. The
     guard exists to prevent confused-agent corruption, not to stop a
     determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
     still applies.

     Wired into tools/file_tools.write_file_tool and patch_tool with a
     cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
     advertise cross_profile so the model can pass it after explicit
     user direction. patch_tool extracts target paths from V4A patch
     bodies before checking (same shape as the existing sensitive-path
     check).

     skill_manage is already scoped to the active profile's SKILLS_DIR
     by construction, so no extra guard wiring is needed there. The
     D-side error message (below) still names other profiles when the
     skill exists elsewhere.

  B. agent/system_prompt

     One deterministic line near the environment-hints block names the
     active profile and tells the model not to modify another profile's
     skills/plugins/cron/memories without explicit direction. Profile
     name is stable for the lifetime of the AIAgent, so the line is
     prompt-cache-safe.

  D. tools/skill_manager_tool._skill_not_found_error

     Replaces the bare "Skill 'X' not found." with a message that:
     - names the active profile,
     - searches OTHER profiles' skills dirs for the same name,
     - names the profile(s) where the skill exists and the path,
     - suggests `hermes -p <name>` to switch profiles, or
       cross_profile=True for an explicit edit.

     All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
     write_file, remove_file) now go through the helper.

  Reference incident (May 2026): a hermes-security profile session
  edited skills under both ~/.hermes/profiles/hermes-security/skills/
  AND ~/.hermes/skills/ (the default profile's skills) without realizing
  the second path belonged to a different profile. Three of the four
  skill files needed manual restoration afterward.

  What this PR does NOT do:

  * No hard block. The terminal tool can still touch any of these paths
    with no guard, the same posture as the dangerous-command approval
    flow. SECURITY.md §3.2 applies.
  * No regex sweep on terminal commands for cross-profile paths. That
    direction is a Skills-Guard-style arms race (cd + relative paths,
    base64, etc.) and would false-positive on legitimate cross-profile
    reads. Filed as a follow-up.
  * No on-disk path migration. ~/.hermes/skills/ remains the default
    profile's skills dir; this PR is about telling the agent about that
    boundary, not changing the layout.

  Tests:

  tests/agent/test_file_safety_cross_profile.py (16 tests)
  - _resolve_active_profile_name covers default/named/failure paths
  - classify_cross_profile_target covers all four scoped areas, both
    directions (default → named, named → default, named → named),
    non-Hermes paths, and root-level config files
  - get_cross_profile_warning covers in-profile no-op, cross-profile
    message shape, and the defense-in-depth self-documentation

  tests/tools/test_cross_profile_guard.py (12 tests)
  - write_file: in-profile allow, cross-profile block,
    cross_profile=True bypass, non-Hermes pass-through
  - patch: replace-mode block, cross_profile=True bypass, V4A patch
    path extraction
  - skill_manage: error names the other profile (single + multiple),
    missing-everywhere falls back to skills_list hint
  - system prompt: contract-level checks (both branches present,
    cross_profile=True mentioned, ~/.hermes/profiles/ referenced)

  All 207 existing tests in file_safety/file_operations/skill_manager
  still pass. 10 system-prompt tests still pass.

  E2E verified: the exact incident scenario (security profile editing
  default's hermes-agent-dev skill) is now blocked with the warning
  message; cross_profile=True unblocks.

* fix(code_execution): add cross_profile to write_file/patch stubs

  The cross_profile kwarg added to write_file_tool/patch_tool needs to
  flow through the execute_code sandbox stubs in _TOOL_STUBS so the
  test_stubs_cover_all_schema_params drift test passes. Without this,
  scripts running inside execute_code couldn't pass cross_profile=True
  through hermes_tools.write_file().

  Caught by CI on PR #31290.
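
For illustration, the layer A classification described above might look roughly like the sketch below. This is not the code in agent/file_safety, which this diff does not show. The function name `classify_target` and its explicit `hermes_root` / `active_profile` parameters are hypothetical, chosen so the example runs without a real HERMES_HOME. It assumes the layout named in the reference incident: the default profile lives directly under the Hermes root, and named profiles live under `profiles/<name>/`.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Taken from the commit message; everything else here is an assumption.
PROFILE_SCOPED_AREAS = ("skills", "plugins", "cron", "memories")


def classify_target(target: str, hermes_root: str, active_profile: str) -> dict | None:
    """Hypothetical stand-in for classify_cross_profile_target.

    Returns a dict describing the cross-profile hit, or None when the
    target is in the active profile, outside Hermes, or not in a
    profile-scoped area. Paths are compared lexically; this sketch does
    not resolve symlinks or ".." segments.
    """
    path = PurePosixPath(target)
    try:
        parts = path.relative_to(PurePosixPath(hermes_root)).parts
    except ValueError:
        return None  # not under the Hermes root at all

    if len(parts) >= 2 and parts[0] == "profiles":
        # <root>/profiles/<name>/<area>/...
        target_profile, rest = parts[1], parts[2:]
    else:
        # <root>/<area>/... belongs to the default profile
        target_profile, rest = "default", parts

    if not rest or rest[0] not in PROFILE_SCOPED_AREAS:
        return None  # e.g. a root-level config file
    if target_profile == active_profile:
        return None  # in-profile write, nothing to warn about

    return {
        "active_profile": active_profile,
        "target_profile": target_profile,
        "area": rest[0],
        "target_path": str(path),
    }


if __name__ == "__main__":
    # The incident scenario: a hermes-security session touching the
    # default profile's skills directory.
    print(classify_target(
        "/home/u/.hermes/skills/hermes-agent-dev/SKILL.md",
        "/home/u/.hermes",
        "hermes-security",
    ))
```

Taking the root and active profile as arguments keeps the sketch free of environment lookups; the real helper is described as resolving both from the active HERMES_HOME.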
2026-07-17 14:42:06 +00:00 · 2026-05-24 00:38:17 -07:00 · 2026-05-24 00:38:17 -07:00 · d3c167b644
commit d3c167b644
parent b207dc28b3
7 changed files with 846 additions and 19 deletions
--- a/tools/file_tools.py
+++ b/tools/file_tools.py
@@ -174,6 +174,37 @@ def _check_sensitive_path(filepath: str, task_id: str = "default") -> str | None
    return None


+def _check_cross_profile_path(filepath: str, task_id: str = "default") -> str | None:
+    """Return a cross-profile warning string when ``filepath`` lands in
+    another Hermes profile's skills/plugins/cron/memories directory.
+
+    Returns ``None`` when the write is in-scope (same profile) or outside
+    Hermes scope entirely. Soft guard — the agent can override by passing
+    ``cross_profile=True`` to its write tool after explicit user direction.
+
+    Defense-in-depth, NOT a security boundary — the terminal tool runs
+    as the same OS user and can write any of these paths directly.
+    See ``agent/file_safety.classify_cross_profile_target`` for the
+    detection rules.
+    """
+    try:
+        from agent.file_safety import get_cross_profile_warning
+    except Exception:
+        # Fail open on import error — the existing sensitive-path guard
+        # plus the write_denied list still apply.
+        return None
+
+    # Resolve via the task's cwd so a relative ``skills/foo/SKILL.md``
+    # in a session that cd'd into ``~/.hermes/profiles/other/`` is
+    # classified against the right base.
+    try:
+        resolved = str(_resolve_path_for_task(filepath, task_id))
+    except (OSError, ValueError):
+        resolved = filepath
+
+    return get_cross_profile_warning(resolved)
+
+
 def _is_expected_write_exception(exc: Exception) -> bool:
    """Return True for expected write denials that should not hit error logs."""
    if isinstance(exc, PermissionError):
@@ -795,11 +826,23 @@ def _check_file_staleness(filepath: str, task_id: str) -> str | None:
    return None


-def write_file_tool(path: str, content: str, task_id: str = "default") -> str:
-    """Write content to a file."""
+def write_file_tool(path: str, content: str, task_id: str = "default",
+                    cross_profile: bool = False) -> str:
+    """Write content to a file.
+
+    ``cross_profile`` opts out of the soft cross-Hermes-profile guard. The
+    guard fires only on writes that land in another profile's
+    skills/plugins/cron/memories directory; everything else is unaffected.
+    Pass ``True`` after explicit user direction — same shape as ``force``
+    on the terminal tool.
+    """
    sensitive_err = _check_sensitive_path(path, task_id)
    if sensitive_err:
        return tool_error(sensitive_err)
+    if not cross_profile:
+        cross_warning = _check_cross_profile_path(path, task_id)
+        if cross_warning:
+            return tool_error(cross_warning)
    if _is_internal_file_status_text(content):
        return tool_error(
            "Refusing to write internal read_file status text as file content. "
@@ -854,8 +897,13 @@ def write_file_tool(path: str, content: str, task_id: str = "default") -> str:

 def patch_tool(mode: str = "replace", path: str = None, old_string: str = None,
               new_string: str = None, replace_all: bool = False, patch: str = None,
-               task_id: str = "default") -> str:
-    """Patch a file using replace mode or V4A patch format."""
+               task_id: str = "default", cross_profile: bool = False) -> str:
+    """Patch a file using replace mode or V4A patch format.
+
+    ``cross_profile`` opts out of the soft cross-Hermes-profile guard for
+    targets under another profile's skills/plugins/cron/memories
+    directory. Same shape as ``write_file``'s flag.
+    """
    # Check sensitive paths for both replace (explicit path) and V4A patch (extract paths)
    _paths_to_check = []
    if path:
@@ -868,6 +916,10 @@ def patch_tool(mode: str = "replace", path: str = None, old_string: str = None,
        sensitive_err = _check_sensitive_path(_p, task_id)
        if sensitive_err:
            return tool_error(sensitive_err)
+        if not cross_profile:
+            cross_warning = _check_cross_profile_path(_p, task_id)
+            if cross_warning:
+                return tool_error(cross_warning)
    try:
        # Resolve paths for locking.  Ordered + deduplicated so concurrent
        # callers lock in the same order — prevents deadlock on overlapping
@@ -1052,7 +1104,12 @@ WRITE_FILE_SCHEMA = {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path to the file to write (will be created if it doesn't exist, overwritten if it does)"},
-            "content": {"type": "string", "description": "Complete content to write to the file"}
+            "content": {"type": "string", "description": "Complete content to write to the file"},
+            "cross_profile": {
+                "type": "boolean",
+                "description": "Opt out of the cross-profile soft guard. Defaults to false. Set true ONLY after explicit user direction to edit another Hermes profile's skills/plugins/cron/memories — by default these writes are blocked with a warning because they affect a different profile than the one this session is running under.",
+                "default": False,
+            },
        },
        "required": ["path", "content"]
    }
@@ -1099,6 +1156,11 @@ PATCH_SCHEMA = {
                "type": "string",
                "description": "REQUIRED when mode='patch'. V4A format patch content. Format:\n*** Begin Patch\n*** Update File: path/to/file\n@@ context hint @@\n context line\n-removed line\n+added line\n*** End Patch",
            },
+            "cross_profile": {
+                "type": "boolean",
+                "description": "Opt out of the cross-profile soft guard. Defaults to false. Set true ONLY after explicit user direction to edit another Hermes profile's skills/plugins/cron/memories.",
+                "default": False,
+            },
        },
        "required": ["mode"],
    },
@@ -1149,7 +1211,10 @@ def _handle_write_file(args, **kw):
            f"write_file: 'content' must be a string, got "
            f"{type(args['content']).__name__}."
        )
-    return write_file_tool(path=args["path"], content=args["content"], task_id=tid)
+    return write_file_tool(
+        path=args["path"], content=args["content"], task_id=tid,
+        cross_profile=bool(args.get("cross_profile", False)),
+    )


 def _handle_patch(args, **kw):
@@ -1157,7 +1222,9 @@ def _handle_patch(args, **kw):
    return patch_tool(
        mode=args.get("mode", "replace"), path=args.get("path"),
        old_string=args.get("old_string"), new_string=args.get("new_string"),
-        replace_all=args.get("replace_all", False), patch=args.get("patch"), task_id=tid)
+        replace_all=args.get("replace_all", False), patch=args.get("patch"), task_id=tid,
+        cross_profile=bool(args.get("cross_profile", False)),
+    )


 def _handle_search_files(args, **kw):
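
The patch_tool hunk above runs the guard over every path it collects, including paths pulled out of a V4A patch body. As a rough sketch of that flow, the block below extracts file headers from a V4A patch and returns the first guard warning unless the caller opted out. The helper names `extract_patch_paths` and `first_guard_error` are hypothetical, and the real extraction code is not part of this diff. Only the `*** Update File:` header appears in PATCH_SCHEMA; matching `Add` and `Delete` headers as well is an assumption about the patch format.

```python
from __future__ import annotations

import re
from typing import Callable, Iterable, Optional

# Assumption: file headers have the form "*** <Verb> File: <path>".
# Only "Update" is confirmed by PATCH_SCHEMA in this diff.
_FILE_HEADER = re.compile(
    r"^\*\*\* (?:Update|Add|Delete) File: (.+)$", re.MULTILINE
)


def extract_patch_paths(patch: str) -> list[str]:
    """Return the target paths named in a V4A patch body, in order, deduplicated."""
    paths: list[str] = []
    for match in _FILE_HEADER.finditer(patch):
        candidate = match.group(1).strip()
        if candidate and candidate not in paths:
            paths.append(candidate)
    return paths


def first_guard_error(
    paths: Iterable[str],
    check: Callable[[str], Optional[str]],
    cross_profile: bool = False,
) -> Optional[str]:
    """Return the first warning produced by ``check``, or None.

    ``check`` plays the role of _check_cross_profile_path. When
    ``cross_profile`` is True the guard is skipped entirely, mirroring
    the ``if not cross_profile:`` branch in the diff.
    """
    if cross_profile:
        return None
    for path in paths:
        warning = check(path)
        if warning:
            return warning
    return None


if __name__ == "__main__":
    body = (
        "*** Begin Patch\n"
        "*** Update File: skills/foo/SKILL.md\n"
        "@@ context hint @@\n"
        " context line\n"
        "-removed line\n"
        "+added line\n"
        "*** End Patch"
    )
    # Toy check: flag anything under skills/ as if it were cross-profile.
    toy_check = lambda p: f"blocked: {p}" if p.startswith("skills/") else None
    print(extract_patch_paths(body))
    print(first_guard_error(extract_patch_paths(body), toy_check))
    print(first_guard_error(extract_patch_paths(body), toy_check, cross_profile=True))
```

Stopping at the first warning matches the early `return tool_error(...)` in the diff: a multi-file patch is rejected as a whole if any one target trips the guard.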