fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)

* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint

Adds a soft guard so an agent running under one Hermes profile cannot
silently edit a different profile's skills/plugins/cron/memories.
Three layers:

A. agent/file_safety.classify_cross_profile_target
   Classifies a write target against the active HERMES_HOME. Returns
   a {active_profile, target_profile, area, target_path} dict when the
   path lands in another profile's scoped area. PROFILE_SCOPED_AREAS =
   (skills, plugins, cron, memories). get_cross_profile_warning()
   wraps it into a model-facing error string that names both profiles,
   names the area, and points at the cross_profile=True bypass.
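
The classification step described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent/file_safety: the names `profile_of` and `classify_target`, and the explicit `root` and `active_profile` parameters, are assumptions made so the example is self-contained. Only the directory layout (`~/.hermes/<area>` for the default profile, `~/.hermes/profiles/<name>/<area>` for named ones) and the returned keys come from this commit message.

```python
from pathlib import Path
from typing import Dict, Optional

# Mirrors the PROFILE_SCOPED_AREAS tuple named in this commit message.
PROFILE_SCOPED_AREAS = ("skills", "plugins", "cron", "memories")


def profile_of(path: Path, root: Path) -> Optional[Dict[str, str]]:
    """Map a path to its owning profile and scoped area (hypothetical helper).

    Returns None for paths outside ``root`` or outside every scoped area,
    for example a root-level config file.
    """
    try:
        parts = path.resolve().relative_to(root.resolve()).parts
    except ValueError:
        return None  # not under the Hermes root at all
    if len(parts) >= 3 and parts[0] == "profiles":
        name, rest = parts[1], parts[2:]   # <root>/profiles/<name>/<area>/...
    else:
        name, rest = "default", parts      # <root>/<area>/...
    if not rest or rest[0] not in PROFILE_SCOPED_AREAS:
        return None
    return {"profile": name, "area": rest[0]}


def classify_target(target: str, active_profile: str, root: Path) -> Optional[Dict[str, str]]:
    """Return a classification dict when ``target`` belongs to another profile."""
    target_path = Path(target).expanduser()
    owner = profile_of(target_path, root)
    if owner is None or owner["profile"] == active_profile:
        return None  # non-Hermes path, unscoped area, or the agent's own profile
    return {
        "active_profile": active_profile,
        "target_profile": owner["profile"],
        "area": owner["area"],
        "target_path": str(target_path),
    }
```

In this sketch a session running as `hermes-security` that targets `<root>/skills/...` is classified as writing into the `default` profile's `skills` area, which matches the shape of the reference incident.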

   Defense-in-depth, NOT a security boundary — the terminal tool runs
   as the same OS user and can write any of these paths directly. The
   guard exists to prevent confused-agent corruption, not to stop a
   determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
   still applies.

   Wired into tools/file_tools.write_file_tool and patch_tool with a
   cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
   advertise cross_profile so the model can pass it after explicit
   user direction. patch_tool extracts target paths from V4A patch
   bodies before checking (same shape as the existing sensitive-path
   check).
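
A hedged sketch of how the guard and the bypass kwarg might fit together. `guarded_write` and `extract_v4a_paths` are hypothetical names, and the `warn` callable stands in for get_cross_profile_warning(); the real write_file_tool and patch_tool are not shown in this diff. The header pattern assumes the usual V4A markers (`*** Add File:`, `*** Update File:`, `*** Delete File:`).

```python
import re
from typing import Callable, Dict, List, Optional

# Assumed V4A convention: each file section starts with "*** <Action> File: <path>".
_V4A_FILE_HEADER = re.compile(
    r"^\*\*\* (?:Add|Update|Delete) File: (.+)$", re.MULTILINE
)


def extract_v4a_paths(patch_body: str) -> List[str]:
    """Collect every target path named in a V4A patch body, in order."""
    return [path.strip() for path in _V4A_FILE_HEADER.findall(patch_body)]


def guarded_write(
    path: str,
    content: str,
    warn: Callable[[str], Optional[str]],
    cross_profile: bool = False,
) -> Dict[str, object]:
    """Refuse a cross-profile write unless the caller opts in explicitly.

    ``warn`` returns an error string for cross-profile targets and None
    otherwise. The actual file write is omitted from this sketch.
    """
    if not cross_profile:
        warning = warn(path)
        if warning:
            return {"success": False, "error": warning}
    return {"success": True, "path": path, "chars": len(content)}
```

A patch-side guard would run `warn` over each path from `extract_v4a_paths` before applying anything, so a single cross-profile target rejects the whole patch.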

   skill_manage is already scoped to the active profile's SKILLS_DIR
   by construction, so no extra guard wiring is needed there. The
   D-side error message (below) still names other profiles when the
   skill exists elsewhere.

B. agent/system_prompt
   One deterministic line near the environment-hints block names the
   active profile and tells the model not to modify another profile's
   skills/plugins/cron/memories without explicit direction. Profile
   name is stable for the lifetime of the AIAgent, so the line is
   prompt-cache-safe.
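
As an illustration of why such a line is prompt-cache-safe, here is a minimal sketch of a hint builder. The function name and the exact wording are assumptions, not the text added to agent/system_prompt. The point is that the output depends only on the profile name, so it is byte-identical on every turn of a session.

```python
def build_profile_hint(active_profile: str) -> str:
    """Return a one-line profile hint that depends only on the profile name."""
    if active_profile == "default":
        scope = (
            "Your files live directly under ~/.hermes/; other profiles "
            "live under ~/.hermes/profiles/<name>/."
        )
    else:
        scope = (
            f"Your files live under ~/.hermes/profiles/{active_profile}/; "
            "paths directly under ~/.hermes/ belong to the default profile."
        )
    return (
        f"Active Hermes profile: {active_profile}. {scope} "
        "Do not modify another profile's skills/plugins/cron/memories "
        "unless the user explicitly asks; in that case pass cross_profile=True."
    )
```

Because nothing time-dependent or random is interpolated, calling the builder twice with the same profile yields the same string.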

D. tools/skill_manager_tool._skill_not_found_error
   Replaces the bare "Skill 'X' not found." with a message that:
     - names the active profile,
     - searches OTHER profiles' skills dirs for the same name,
     - names the profile(s) where the skill exists and the path,
     - suggests `hermes -p <name>` to switch profiles, or
       cross_profile=True for an explicit edit.

   All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
   write_file, remove_file) now go through the helper.

Reference incident (May 2026): a hermes-security profile session
edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills) without
realizing the second path belonged to a different profile. Three of
the four skill files needed manual restoration afterward.

What this PR does NOT do:

  * No hard block. The terminal tool can still touch any of these
    paths with no guard — same posture as the dangerous-command
    approval flow. SECURITY.md §3.2 applies.
  * No regex sweep on terminal commands for cross-profile paths.
    That direction is a Skills-Guard-style arms race (cd + relative
    paths, base64, etc.) and would false-positive on legitimate
    cross-profile reads. Filed as a follow-up.
  * No on-disk path migration. ~/.hermes/skills/ remains the
    default profile's skills dir; this PR is about telling the
    agent about that boundary, not changing the layout.

Tests:
  tests/agent/test_file_safety_cross_profile.py (16 tests)
    - _resolve_active_profile_name covers default/named/failure paths
    - classify_cross_profile_target covers all four scoped areas,
      all three directions (default → named, named → default, named → named),
      non-Hermes paths, and root-level config files
    - get_cross_profile_warning covers in-profile no-op, cross-profile
      message shape, and the defense-in-depth self-documentation

  tests/tools/test_cross_profile_guard.py (12 tests)
    - write_file: in-profile allow, cross-profile block, cross_profile=True
      bypass, non-Hermes pass-through
    - patch: replace-mode block, cross_profile=True bypass, V4A patch
      path extraction
    - skill_manage: error names the other profile (single + multiple),
      missing-everywhere falls back to skills_list hint
    - system prompt: contract-level checks (both branches present,
      cross_profile=True mentioned, ~/.hermes/profiles/ referenced)

All 207 existing tests in file_safety/file_operations/skill_manager
still pass. 10 system-prompt tests still pass.

E2E verified: the exact incident scenario (security profile editing
default's hermes-agent-dev skill) is now blocked with the warning
message; cross_profile=True unblocks.

* fix(code_execution): add cross_profile to write_file/patch stubs

The cross_profile kwarg added to write_file_tool/patch_tool needs to
flow through the execute_code sandbox stubs in _TOOL_STUBS so the
test_stubs_cover_all_schema_params drift test passes. Without this,
scripts running inside execute_code couldn't pass cross_profile=True
through hermes_tools.write_file().
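
A drift check of this kind can be sketched as below. This is an assumption about the general technique, not the body of test_stubs_cover_all_schema_params: the schema shape (`parameters.properties`) and the idea that stubs are plain callables are both hypothetical, and the real _TOOL_STUBS may be structured differently.

```python
import inspect
from typing import Callable, Dict, List


def missing_stub_params(
    schemas: Dict[str, dict], stubs: Dict[str, Callable]
) -> Dict[str, List[str]]:
    """Report schema parameters that a tool's stub does not accept."""
    missing: Dict[str, List[str]] = {}
    for tool, schema in schemas.items():
        declared = set(schema.get("parameters", {}).get("properties", {}))
        stub = stubs.get(tool)
        if stub is None:
            missing[tool] = sorted(declared)  # no stub at all
            continue
        params = inspect.signature(stub).parameters
        # A **kwargs stub accepts any declared parameter.
        if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
            continue
        gap = sorted(declared - set(params))
        if gap:
            missing[tool] = gap
    return missing


# Hypothetical stubs before and after this commit.
def write_file_before(path, content):
    return None


def write_file_after(path, content, cross_profile=False):
    return None


EXAMPLE_SCHEMA = {
    "write_file": {
        "parameters": {
            "properties": {"path": {}, "content": {}, "cross_profile": {}}
        }
    }
}
```

With the example schema, the pre-commit stub is reported as missing `cross_profile`, and the post-commit stub produces an empty report.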

Caught by CI on PR #31290.
Teknium 2026-05-24 00:38:17 -07:00 committed by GitHub
parent b207dc28b3
commit d3c167b644
GPG key ID: B5690EEEBB952194
7 changed files with 846 additions and 19 deletions

@@ -40,7 +40,7 @@ import shutil
 import tempfile
 from pathlib import Path
 from hermes_constants import get_hermes_home, display_hermes_home
-from typing import Dict, Any, Optional, Tuple
+from typing import Dict, Any, List, Optional, Tuple
 from utils import atomic_replace, is_truthy_value
 from hermes_cli.config import cfg_get
@@ -295,6 +295,109 @@ def _find_skill(name: str) -> Optional[Dict[str, Any]]:
     return None
+def _find_skill_in_other_profiles(name: str) -> List[Tuple[str, Path]]:
+    """Look for ``name`` under SKILL.md across OTHER Hermes profiles.
+
+    Returns a list of ``(profile_name, skill_dir)`` pairs. Used to make
+    the "Skill X not found" error explain when the user is editing the
+    wrong profile. Returns an empty list when no other profile has the
+    skill, or when profile discovery fails: the helper is fail-quiet, so
+    the caller falls back to the plain "not found" error.
+    """
+    matches: List[Tuple[str, Path]] = []
+    try:
+        from hermes_constants import get_default_hermes_root
+        from agent.skill_utils import is_excluded_skill_path
+    except Exception:
+        return matches
+
+    try:
+        root = get_default_hermes_root()
+    except Exception:
+        return matches
+
+    # Collect (profile_name, skills_dir) for every profile EXCEPT the
+    # one whose SKILLS_DIR we already searched in _find_skill().
+    active_dir = SKILLS_DIR.resolve() if SKILLS_DIR.exists() else SKILLS_DIR
+    candidates: List[Tuple[str, Path]] = []
+
+    # Default profile (~/.hermes/skills): only consider when active is non-default.
+    default_skills = root / "skills"
+    try:
+        if default_skills.resolve() != active_dir:
+            candidates.append(("default", default_skills))
+    except (OSError, RuntimeError):
+        pass
+
+    # All named profiles (~/.hermes/profiles/*/skills)
+    profiles_root = root / "profiles"
+    if profiles_root.is_dir():
+        try:
+            for entry in profiles_root.iterdir():
+                if not entry.is_dir():
+                    continue
+                pskills = entry / "skills"
+                try:
+                    if pskills.resolve() == active_dir:
+                        continue
+                except (OSError, RuntimeError):
+                    continue
+                candidates.append((entry.name, pskills))
+        except OSError:
+            pass
+
+    for profile_name, skills_dir in candidates:
+        if not skills_dir.is_dir():
+            continue
+        try:
+            for skill_md in skills_dir.rglob("SKILL.md"):
+                if is_excluded_skill_path(skill_md):
+                    continue
+                if skill_md.parent.name == name:
+                    matches.append((profile_name, skill_md.parent))
+                    break  # one match per profile is enough
+        except OSError:
+            continue
+    return matches
+
+
+def _skill_not_found_error(name: str, suffix: str = "") -> str:
+    """Build a "skill not found" error that names other profiles holding
+    the same skill, so the agent can recognize a profile-scoping mistake.
+
+    ``suffix`` is appended after the cross-profile hint if present
+    (e.g. ``" Create it first with action='create'."``).
+    """
+    from agent.file_safety import _resolve_active_profile_name
+
+    active = _resolve_active_profile_name()
+    base = f"Skill '{name}' not found in active profile '{active}'."
+    others = _find_skill_in_other_profiles(name)
+    if others:
+        if len(others) == 1:
+            other_profile, other_path = others[0]
+            base += (
+                f" A skill by that name exists in profile "
+                f"'{other_profile}' ({other_path}). To edit a skill in "
+                f"another profile, switch profiles (`hermes -p "
+                f"{other_profile}`) or operate via explicit file tools "
+                f"with ``cross_profile=True``."
+            )
+        else:
+            names = ", ".join(f"'{p}'" for p, _ in others)
+            base += (
+                f" Skills by that name exist in other profiles: {names}. "
+                f"Switch profiles (`hermes -p <name>`) to edit there, or "
+                f"operate via explicit file tools with ``cross_profile=True``."
+            )
+    else:
+        base += " Use skills_list() to see available skills."
+    if suffix:
+        base += suffix
+    return base
+
+
 def _validate_file_path(file_path: str) -> Optional[str]:
     """
     Validate a file path for write_file/remove_file.
@@ -439,7 +542,7 @@ def _edit_skill(name: str, content: str) -> Dict[str, Any]:
     existing = _find_skill(name)
     if not existing:
-        return {"success": False, "error": f"Skill '{name}' not found. Use skills_list() to see available skills."}
+        return {"success": False, "error": _skill_not_found_error(name)}
     skill_md = existing["path"] / "SKILL.md"
     # Back up original content for rollback
@@ -479,7 +582,7 @@ def _patch_skill(
     existing = _find_skill(name)
     if not existing:
-        return {"success": False, "error": f"Skill '{name}' not found."}
+        return {"success": False, "error": _skill_not_found_error(name)}
     skill_dir = existing["path"]
@@ -568,7 +671,7 @@ def _delete_skill(name: str, absorbed_into: Optional[str] = None) -> Dict[str, A
     """
     existing = _find_skill(name)
     if not existing:
-        return {"success": False, "error": f"Skill '{name}' not found."}
+        return {"success": False, "error": _skill_not_found_error(name)}
     pinned_err = _pinned_guard(name)
     if pinned_err:
@@ -637,7 +740,7 @@ def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
     existing = _find_skill(name)
     if not existing:
-        return {"success": False, "error": f"Skill '{name}' not found. Create it first with action='create'."}
+        return {"success": False, "error": _skill_not_found_error(name, " Create it first with action='create'.")}
     target, err = _resolve_skill_target(existing["path"], file_path)
     if err:
@@ -671,7 +774,7 @@ def _remove_file(name: str, file_path: str) -> Dict[str, Any]:
     existing = _find_skill(name)
     if not existing:
-        return {"success": False, "error": f"Skill '{name}' not found."}
+        return {"success": False, "error": _skill_not_found_error(name)}
     skill_dir = existing["path"]