diff --git a/SECURITY.md b/SECURITY.md index ad9dc4149b3..c58e348b579 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -31,11 +31,31 @@ through the private security channel. ## 2. Trust Model -Hermes is a single-tenant personal agent. Its posture is layered, and -the layers are not equally load-bearing. Reporters and operators -should reason about them in the same terms. +Hermes Agent is a single-tenant personal agent. Its posture is +layered, and the layers are not equally load-bearing. Reporters and +operators should reason about them in the same terms. -### 2.1 The Boundary: OS-Level Isolation +### 2.1 Definitions + +- **Agent process.** The Python interpreter running Hermes Agent, + including any Python modules it has loaded (skills, plugins, + hook handlers). +- **Terminal backend.** A pluggable execution target for the + `terminal()` tool. The default runs commands directly on the host. + Other backends run commands inside a container, cloud sandbox, or + remote host. +- **Input surface.** Any channel through which content enters the + agent's context: operator input, web fetches, email, gateway + messages, file reads, MCP server responses, tool results. +- **Trust envelope.** The set of resources an operator has implicitly + granted Hermes Agent access to by running it — typically, whatever + the operator's own user account can reach on the host. +- **Stance.** An explicit statement in Hermes Agent's documentation + or code about how a consuming layer (adapter, UI, file writer, + shell) should treat agent output — e.g. "the dashboard renders + agent output as inert HTML." + +### 2.2 The Boundary: OS-Level Isolation **The only security boundary against an adversarial LLM is the operating system.** Nothing inside the agent process constitutes @@ -44,51 +64,76 @@ pattern scanner, not any tool allowlist. Any in-process component that screens LLM output is a heuristic operating on an attacker-influenced string, and this policy treats it as such. -Hermes supports two OS-level isolation postures. They address +Hermes Agent supports two OS-level isolation postures. They address different threats and an operator should choose deliberately. -**Terminal-backend isolation** sandboxes the shell tool. A -non-default terminal backend runs LLM-emitted shell commands inside -a container, remote host, or cloud sandbox. This confines the blast -radius of destructive shell — but only of shell. The Python process -running the agent itself stays on the host, along with every code -path that doesn't go through the shell tool: the code-execution -tool, MCP subprocesses, file tools, plugin loading, hook dispatch, -skill loading. This is the right posture when the concern is -LLM-emitted destructive shell and the operator is otherwise -trusted. +#### Terminal-backend isolation -**Whole-process wrapping** sandboxes the agent itself. The agent -runs inside an external runtime that enforces filesystem, network, -process, and inference policies across the entire agent process -tree. [NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell) is -the reference deployment. Under this posture, every code path in -the agent is subject to the same policy, and the in-process -heuristics in §2.3 become accident-prevention layered on top of a -real boundary. This is the supported posture when the agent -ingests content from surfaces the operator does not control — the -open web, inbound email, multi-user channels, untrusted MCP -servers — and for production or shared deployments. +A non-default terminal backend runs LLM-emitted shell commands +inside a container, remote host, or cloud sandbox. The file tools +(`read_file`, `write_file`, `patch`) also run through this backend, +since they are implemented on top of the shell contract — they +cannot reach paths the backend doesn't expose. + +What this confines: anything the agent does by issuing shell or +file operations. What this does **not** confine: everything the +agent does in its own Python process. That includes the +code-execution tool (spawned as a host subprocess), MCP subprocesses +(spawned from the agent's environment), plugin loading, hook +dispatch, and skill loading (all imported into the agent +interpreter). + +Terminal-backend isolation is the right posture when the concern is +LLM-emitted destructive shell or unwanted file-tool writes, and the +operator is otherwise trusted. + +#### Whole-process wrapping + +Whole-process wrapping runs the entire agent process tree inside a +sandbox. Every code path — shell, code-execution, MCP, file tools, +plugins, hooks, skill loading — is subject to the same filesystem, +network, process, and (where applicable) inference policy. + +Hermes Agent supports this in two ways: + +- **Hermes Agent's own Docker image and Compose setup.** Lighter- + weight; the agent runs in a standard container with operator- + configured mounts and network policy. +- **[NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell)**. + OpenShell provides per-session sandboxes with declarative policy + across filesystem, network (L7 egress), process/syscall, and + inference-routing layers. Network and inference policies are + hot-reloadable. Credentials are injected from a Provider store + and never touch the sandbox filesystem. + +Under a whole-process wrapper, Hermes Agent's in-process heuristics +(§2.4) function as accident-prevention layered on top of a real +boundary. This is the supported posture when the agent ingests +content from surfaces the operator does not control — the open web, +inbound email, multi-user channels, untrusted MCP servers — and for +production or shared deployments. Operators running the default local backend with untrusted input surfaces, or running a terminal-backend sandbox and expecting it to contain code paths that don't go through the shell, are operating outside the supported security posture. -### 2.2 Credential Scoping +### 2.3 Credential Scoping -Hermes filters the environment it passes to its lower-trust +Hermes Agent filters the environment it passes to its lower-trust in-process components: shell subprocesses, MCP subprocesses, and the code-execution child. Credentials like provider API keys and gateway tokens are stripped by default; variables explicitly declared by the operator or by a loaded skill are passed through. -This reduces casual exfiltration. It is not containment. A -component with code-execution primitives can always reach -filesystem-resident credentials that the agent process itself can -read. +This reduces casual exfiltration. It is not containment. Any +component running inside the agent process (skills, plugins, hook +handlers) can read whatever the agent itself can read, including +in-memory credentials. The mitigation against a compromised +in-process component is operator review before install (§2.4, +§2.5), not environment scrubbing. -### 2.3 In-Process Heuristics +### 2.4 In-Process Heuristics The following components screen or warn about LLM behavior. They are useful. They are not boundaries. @@ -102,35 +147,75 @@ are useful. They are not boundaries. A motivated output producer will defeat it. - **Skills Guard** scans installable skill content for injection patterns. It is a review aid; the boundary for third-party skills - is operator review before install. + is operator review before install. Reviewing a skill means + reading its Python code and scripts, not just its SKILL.md + description — skills execute arbitrary Python at import time. -### 2.4 Gateway Authorization +### 2.5 Plugin Trust Model -When the gateway integrates with a messaging platform, each platform -adapter authenticates callers against an operator-configured -allowlist. **An allowlist is required for every enabled adapter.** -Adapters should refuse to dispatch agent work, resolve approvals, or -relay output until an allowlist is set; code paths that fail open -when no allowlist is configured are code bugs in scope under §3.1. -Within the allowlist, all authorized callers are equally trusted. -Session identifiers are routing handles, not authorization -boundaries. +Plugins load into the agent process and run with full agent +privileges: they can read the same credentials, call the same +tools, register the same hooks, and import the same modules as +anything shipped in-tree. The boundary for third-party plugins is +operator review before install — the same rule as skills (§2.4), +called out separately because plugins are architecturally heavier +and often ship their own background services, network listeners, +and dependencies. -### 2.5 Agent-Loaded Content +A malicious or buggy plugin is not a vulnerability in Hermes Agent +itself. Bugs in Hermes Agent's plugin-install or plugin-discovery +path that prevent the operator from seeing what they're installing +are in scope under §3.1. -Hermes chooses, by design, to load and execute content from specific -on-disk locations at its own initiative — skills, hooks, plugins, -operator-configured shortcuts. Content placed in these locations -becomes code the agent runs on its next session, hook dispatch, or -command invocation. +### 2.6 External Surfaces -Hermes does not claim these locations are protected files. -Filesystem-level protection is whatever the OS provides under the -operator's chosen isolation posture (§2.1). What Hermes commits to -is narrower and different: **attacker-influenced input must not be -chainable into a write that Hermes would later load and execute on -its own initiative**. The concern is not what the filesystem -allows; it is what Hermes loads. +An **external surface** is any channel outside the local agent +process through which a caller can dispatch agent work, resolve +approvals, or receive agent output. Each surface has its own +authorization model, but the rules below apply uniformly. + +**Surfaces in Hermes Agent:** + +- **Gateway platform adapters.** Messaging integrations in + `gateway/platforms/` (Telegram, Discord, Slack, email, SMS, etc.) + and analogous adapters shipped as plugins. +- **Network-exposed HTTP surfaces.** The API server adapter, the + dashboard plugin, the kanban plugin's HTTP endpoints, and any + other plugin that binds a listening socket. +- **Editor / IDE adapters.** The ACP adapter (`acp_adapter/`) and + equivalent integrations that accept requests from a local client + process. +- **The TUI gateway (`tui_gateway/`).** JSON-RPC backend for the + Ink terminal UI, reached over local IPC. + +**Uniform rules:** + +1. **Authorization is required at every surface that crosses a + trust boundary.** For messaging and network HTTP surfaces, the + boundary is the network: authorization means an operator- + configured caller allowlist. For editor and local-IPC surfaces + (ACP, TUI gateway), the boundary is the host's user account: + authorization means relying on OS-level access control (file + permissions, loopback-only binds) and not exposing the surface + beyond the local user without an explicit network auth layer. +2. **An allowlist is required for every enabled network-exposed + adapter.** Adapters must refuse to dispatch agent work, resolve + approvals, or relay output until an allowlist is set. Code paths + that fail open when no allowlist is configured are code bugs in + scope under §3.1. +3. **Session identifiers are routing handles, not authorization + boundaries.** Knowing another caller's session ID does not grant + access to their approvals or output; authorization is always + re-checked against the allowlist (or OS-level equivalent). +4. **Within the authorized set, all callers are equally trusted.** + Hermes Agent does not model per-caller capabilities inside a + single adapter. Operators who need capability separation should + run separate agent instances with separate allowlists. +5. **Binding a local-only surface to a non-loopback interface is a + break-glass operator decision (§3.2).** The dashboard and other + plugin HTTP servers default to loopback; exposing them via + `--host 0.0.0.0` or equivalent makes public-exposure hardening + (§4) the operator's responsibility. --- @@ -138,60 +223,71 @@ allows; it is what Hermes loads. ### 3.1 In Scope -- Escape from a declared OS-level isolation posture (§2.1): an +- Escape from a declared OS-level isolation posture (§2.2): an attacker-controlled code path reaching state that the posture claimed to confine. -- Unauthorized gateway access: a caller outside the configured - allowlist dispatching work, receiving output, or resolving - approvals (§2.4). +- Unauthorized external-surface access: a caller outside the + configured authorization set (allowlist, or OS-level equivalent + for local-IPC surfaces) dispatching work, receiving output, or + resolving approvals (§2.6). - Credential exfiltration: leakage of operator credentials or session authorization material to a destination outside the - operator's trust envelope. -- Untrusted input chaining into agent-loaded content: an untrusted - input surface chains into a write whose target is a location - Hermes loads and executes on its own initiative (§2.5). -- Output integrity failures into external platforms: agent output - rendered on a receiving platform with unintended authority — - broadcast-mention passthrough, content that fetches attacker - resources for every recipient, markup injection into hosted UIs. + trust envelope, via a mechanism that should have prevented it + (environment scrubbing bug, adapter logging, transport error + that flushes credentials to an upstream, etc.). - Trust-model documentation violations: code behaving contrary to - what this policy states, where an operator relying on the policy - would reasonably expect otherwise. + what this policy, Hermes Agent's own documentation, or reasonable + operator expectations would predict — including cases where + Hermes Agent has documented a stance about how its output should + be rendered by a consuming layer (dashboard, gateway adapter, + file writer, shell) and a code path breaks that stance. ### 3.2 Out of Scope "Out of scope" here means "not a security vulnerability under this policy." It does not mean "not worth reporting." Improvements to the in-process heuristics, hardening ideas, and UX fixes are welcome as -regular issues or pull requests — we can always make the approval -gate catch more patterns, make redaction smarter, or tighten adapter -behavior. These items just don't go through the private-disclosure -channel and don't receive advisories. +regular issues or pull requests — the approval gate can always catch +more patterns, redaction can always get smarter, adapter behavior +can always be tightened. These items just don't go through the +private-disclosure channel and don't receive advisories. -- **Bypasses of in-process heuristics (§2.3)** — approval-gate regex +- **Bypasses of in-process heuristics (§2.4)** — approval-gate regex bypasses, redaction bypasses, Skills Guard pattern bypasses, and analogous reports against future heuristics. These components are not boundaries; defeating them is not a vulnerability under this policy. -- **Prompt injection that does not chain to a §3.1 outcome.** Getting - the LLM to emit unusual text or "ignore previous instructions" is - not itself a vulnerability; it becomes one only when it results in - something §3.1 describes. +- **Prompt injection per se.** Getting the LLM to emit unusual + output — via injected content, hallucination, training artifacts, + or any other cause — is not itself a vulnerability. "I achieved + prompt injection" without a chained §3.1 outcome is not an + actionable report under this policy. - **Consequences of a chosen isolation posture.** Reports that a code path operating within its posture's scope can do what that - posture permits are not vulnerabilities. Examples: shell tools - reaching host state under the local backend; code-execution or - file tools reaching host state under terminal-backend isolation - that only sandboxes shell; reports whose preconditions require - pre-existing write access to operator-owned configuration or - credential files (those are already inside the operator's trust - envelope). + posture permits are not vulnerabilities. Examples: shell or file + tools reaching host state under the local backend; code-execution + or MCP subprocesses reaching host state under terminal-backend + isolation that only sandboxes shell; reports whose preconditions + require pre-existing write access to operator-owned configuration + or credential files (those are already inside the trust envelope). +- **Documented break-glass settings.** Operator-selected trade-offs + that explicitly disable protections: `--insecure` and equivalent + flags on the dashboard or other components, disabled approvals, + local backend in production, development profiles that bypass + hermes-home security, and similar. Reports against those + configurations are not vulnerabilities — that's the flag's job. +- **Community-contributed skills and plugins.** Third-party skills + (including the community skills repository) and third-party + plugins are in the operator's review surface, not Hermes Agent's + trust surface (§2.4, §2.5). A skill or plugin doing something + malicious is the expected failure mode of one that wasn't + reviewed, not a vulnerability in Hermes Agent. Bugs in Hermes + Agent's skill-install or plugin-install path that prevent the + operator from seeing what they're installing are in scope under + §3.1. - **Public exposure without external controls.** Exposing the gateway or API to the public internet without authentication, VPN, or firewall. -- **Documented break-glass settings.** Disabled approvals, local - backend in production, development profiles that bypass - hermes-home security, and similar operator-selected trade-offs. - **Tool-level read/write restrictions on a posture where shell is permitted.** If a path is reachable via the terminal tool, reports that other file tools can reach it add nothing. @@ -201,25 +297,26 @@ channel and don't receive advisories. ## 4. Deployment Hardening The single most important hardening decision is matching isolation -(§2.1) to the trust of the content the agent will ingest. Beyond +(§2.2) to the trust of the content the agent will ingest. Beyond that: - Run the agent as a non-root user. The supplied container image does this by default. - Keep credentials in the operator credential file with tight permissions, never in the main config, never in version control. - Under OpenShell, use its Provider store rather than an on-disk + Under OpenShell, use the Provider store rather than an on-disk credential file. - Do not expose the gateway or API to the public internet without VPN, Tailscale, or firewall protection. Under OpenShell, use the network policy layer to restrict egress. -- Configure a caller allowlist for every gateway adapter you enable - (§2.4). -- Review third-party skills before install. Skills Guard reports and - the install audit log are the review surface. -- The OSV malware database is consulted before launching - ecosystem-resolved MCP servers. Additional supply-chain guards - on dependency and bundled-package changes run in CI; see +- Configure a caller allowlist for every network-exposed adapter + you enable (§2.6). +- Review third-party skills and plugins before install (§2.4, + §2.5). For skills, this means reading the Python and scripts, + not just SKILL.md. Skills Guard reports and the install audit + log are the review surface. +- Hermes Agent includes supply-chain guards for MCP server + launches and for dependency / bundled-package changes in CI; see `CONTRIBUTING.md` for specifics. ---