Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-05-18 04:41:56 +00:00)

changes from feedback

parent 401aadb5b8
commit 0d1cbc2dda

1 changed file with 196 additions and 99 deletions

SECURITY.md (295 changed lines)
@@ -31,11 +31,31 @@ through the private security channel.

 ## 2. Trust Model

-Hermes is a single-tenant personal agent. Its posture is layered, and
-the layers are not equally load-bearing. Reporters and operators
-should reason about them in the same terms.
+Hermes Agent is a single-tenant personal agent. Its posture is
+layered, and the layers are not equally load-bearing. Reporters and
+operators should reason about them in the same terms.

-### 2.1 The Boundary: OS-Level Isolation
+### 2.1 Definitions
+
+- **Agent process.** The Python interpreter running Hermes Agent,
+including any Python modules it has loaded (skills, plugins,
+hook handlers).
+- **Terminal backend.** A pluggable execution target for the
+`terminal()` tool. The default runs commands directly on the host.
+Other backends run commands inside a container, cloud sandbox, or
+remote host.
+- **Input surface.** Any channel through which content enters the
+agent's context: operator input, web fetches, email, gateway
+messages, file reads, MCP server responses, tool results.
+- **Trust envelope.** The set of resources an operator has implicitly
+granted Hermes Agent access to by running it — typically, whatever
+the operator's own user account can reach on the host.
+- **Stance.** An explicit statement in Hermes Agent's documentation
+or code about how a consuming layer (adapter, UI, file writer,
+shell) should treat agent output — e.g. "the dashboard renders
+agent output as inert HTML."
+
+### 2.2 The Boundary: OS-Level Isolation

 **The only security boundary against an adversarial LLM is the
 operating system.** Nothing inside the agent process constitutes
@@ -44,51 +64,76 @@ pattern scanner, not any tool allowlist. Any in-process component
 that screens LLM output is a heuristic operating on an
 attacker-influenced string, and this policy treats it as such.

-Hermes supports two OS-level isolation postures. They address
+Hermes Agent supports two OS-level isolation postures. They address
 different threats and an operator should choose deliberately.

-**Terminal-backend isolation** sandboxes the shell tool. A
-non-default terminal backend runs LLM-emitted shell commands inside
-a container, remote host, or cloud sandbox. This confines the blast
-radius of destructive shell — but only of shell. The Python process
-running the agent itself stays on the host, along with every code
-path that doesn't go through the shell tool: the code-execution
-tool, MCP subprocesses, file tools, plugin loading, hook dispatch,
-skill loading. This is the right posture when the concern is
-LLM-emitted destructive shell and the operator is otherwise
-trusted.
+#### Terminal-backend isolation

-**Whole-process wrapping** sandboxes the agent itself. The agent
-runs inside an external runtime that enforces filesystem, network,
-process, and inference policies across the entire agent process
-tree. [NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell) is
-the reference deployment. Under this posture, every code path in
-the agent is subject to the same policy, and the in-process
-heuristics in §2.3 become accident-prevention layered on top of a
-real boundary. This is the supported posture when the agent
-ingests content from surfaces the operator does not control — the
-open web, inbound email, multi-user channels, untrusted MCP
-servers — and for production or shared deployments.
+A non-default terminal backend runs LLM-emitted shell commands
+inside a container, remote host, or cloud sandbox. The file tools
+(`read_file`, `write_file`, `patch`) also run through this backend,
+since they are implemented on top of the shell contract — they
+cannot reach paths the backend doesn't expose.
+
+What this confines: anything the agent does by issuing shell or
+file operations. What this does **not** confine: everything the
+agent does in its own Python process. That includes the
+code-execution tool (spawned as a host subprocess), MCP subprocesses
+(spawned from the agent's environment), plugin loading, hook
+dispatch, and skill loading (all imported into the agent
+interpreter).
+
+Terminal-backend isolation is the right posture when the concern is
+LLM-emitted destructive shell or unwanted file-tool writes, and the
+operator is otherwise trusted.
+
+#### Whole-process wrapping
+
+Whole-process wrapping runs the entire agent process tree inside a
+sandbox. Every code path — shell, code-execution, MCP, file tools,
+plugins, hooks, skill loading — is subject to the same filesystem,
+network, process, and (where applicable) inference policy.
+
+Hermes Agent supports this in two ways:
+
+- **Hermes Agent's own Docker image and Compose setup.** Lighter-
+weight; the agent runs in a standard container with operator-
+configured mounts and network policy.
+- **[NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell)**.
+OpenShell provides per-session sandboxes with declarative policy
+across filesystem, network (L7 egress), process/syscall, and
+inference-routing layers. Network and inference policies are
+hot-reloadable. Credentials are injected from a Provider store
+and never touch the sandbox filesystem.
+
+Under a whole-process wrapper, Hermes Agent's in-process heuristics
+(§2.4) function as accident-prevention layered on top of a real
+boundary. This is the supported posture when the agent ingests
+content from surfaces the operator does not control — the open web,
+inbound email, multi-user channels, untrusted MCP servers — and for
+production or shared deployments.

 Operators running the default local backend with untrusted input
 surfaces, or running a terminal-backend sandbox and expecting it to
 contain code paths that don't go through the shell, are operating
 outside the supported security posture.
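Reviewer note: the coverage of the two postures can be written down as a small lookup, which may help when triaging "escape" reports. This is an illustrative model only. The posture and code-path labels are taken from the revised policy wording (which places the file tools behind the terminal backend); they are assumptions for this sketch, not identifiers from the Hermes Agent codebase.

```python
from typing import List

# Hypothetical labels for the code paths the policy text enumerates.
CODE_PATHS = frozenset({
    "shell", "file_tools", "code_execution", "mcp_subprocess",
    "plugin_loading", "hook_dispatch", "skill_loading",
})

# Which code paths each posture confines, per the revised policy text.
CONFINED = {
    "local_backend": frozenset(),                           # nothing is sandboxed
    "terminal_backend": frozenset({"shell", "file_tools"}),
    "whole_process": CODE_PATHS,                            # entire process tree
}


def unconfined_paths(posture: str) -> List[str]:
    """Return the code paths that the given posture does not confine."""
    return sorted(CODE_PATHS - CONFINED[posture])


terminal_gaps = unconfined_paths("terminal_backend")
wrapper_gaps = unconfined_paths("whole_process")
```

Under this model, a report that the code-execution tool reaches host state under terminal-backend isolation describes a listed gap rather than an escape.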

-### 2.2 Credential Scoping
+### 2.3 Credential Scoping

-Hermes filters the environment it passes to its lower-trust
+Hermes Agent filters the environment it passes to its lower-trust
 in-process components: shell subprocesses, MCP subprocesses, and
 the code-execution child. Credentials like provider API keys and
 gateway tokens are stripped by default; variables explicitly
 declared by the operator or by a loaded skill are passed through.

-This reduces casual exfiltration. It is not containment. A
-component with code-execution primitives can always reach
-filesystem-resident credentials that the agent process itself can
-read.
+This reduces casual exfiltration. It is not containment. Any
+component running inside the agent process (skills, plugins, hook
+handlers) can read whatever the agent itself can read, including
+in-memory credentials. The mitigation against a compromised
+in-process component is operator review before install (§2.4,
+§2.5), not environment scrubbing.
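Reviewer note: a minimal sketch of the kind of environment filtering this section describes, to make the "strip by default, pass through what is declared" rule concrete. The marker list, the function name `scrub_env`, and the variable names are all hypothetical; the policy does not show the real filter's rules.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical substrings that mark a variable as credential-like.
SENSITIVE_MARKERS = ("API_KEY", "TOKEN", "SECRET", "PASSWORD")


def scrub_env(env: Mapping[str, str], passthrough: Iterable[str] = ()) -> Dict[str, str]:
    """Return a copy of env without credential-looking variables.

    Names listed in passthrough (declared by the operator or by a
    loaded skill) are kept even if they look sensitive.
    """
    allowed = set(passthrough)
    return {
        name: value
        for name, value in env.items()
        if name in allowed
        or not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


parent_env = {
    "PATH": "/usr/bin",
    "PROVIDER_API_KEY": "sk-example",
    "GATEWAY_TOKEN": "t-example",
    "WEATHER_API_KEY": "w-example",
}
child_env = scrub_env(parent_env, passthrough=["WEATHER_API_KEY"])
```

As the section says, this only shapes what a child process inherits. Code imported into the agent interpreter never goes through such a filter.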

-### 2.3 In-Process Heuristics
+### 2.4 In-Process Heuristics

 The following components screen or warn about LLM behavior. They
 are useful. They are not boundaries.
@@ -102,35 +147,75 @@ are useful. They are not boundaries.
 A motivated output producer will defeat it.
 - **Skills Guard** scans installable skill content for injection
 patterns. It is a review aid; the boundary for third-party skills
-is operator review before install.
+is operator review before install. Reviewing a skill means
+reading its Python code and scripts, not just its SKILL.md
+description — skills execute arbitrary Python at import time.
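Reviewer note: the point that skills run code at import time can be illustrated with a small review aid built on the standard `ast` module. It lists calls that sit outside function bodies and so run as soon as the module is imported. This is a hypothetical helper, not part of Skills Guard, and like any in-process heuristic it is easy to evade; it does not replace reading the code. It needs Python 3.9 or later for `ast.unparse`, and it over-reports calls inside class methods.

```python
import ast
from typing import List, Tuple


def import_time_calls(source: str) -> List[Tuple[int, str]]:
    """List (line, callee) for calls outside top-level function bodies."""
    tree = ast.parse(source)
    hits = []
    for stmt in tree.body:
        # Function bodies run only when the function is called.
        if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call):
                hits.append((node.lineno, ast.unparse(node.func)))
    return sorted(hits)


# Hypothetical skill module: the second line runs on import.
sample = '''import os
os.system("echo hi")

def run():
    os.remove("x")
'''
findings = import_time_calls(sample)
```

Here `findings` flags the `os.system` call on line 2 and ignores the `os.remove` call inside `run()`, which executes only when the function is invoked.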

-### 2.4 Gateway Authorization
+### 2.5 Plugin Trust Model

-When the gateway integrates with a messaging platform, each platform
-adapter authenticates callers against an operator-configured
-allowlist. **An allowlist is required for every enabled adapter.**
-Adapters should refuse to dispatch agent work, resolve approvals, or
-relay output until an allowlist is set; code paths that fail open
-when no allowlist is configured are code bugs in scope under §3.1.
-Within the allowlist, all authorized callers are equally trusted.
-Session identifiers are routing handles, not authorization
-boundaries.
+Plugins load into the agent process and run with full agent
+privileges: they can read the same credentials, call the same
+tools, register the same hooks, and import the same modules as
+anything shipped in-tree. The boundary for third-party plugins is
+operator review before install — the same rule as skills (§2.4),
+called out separately because plugins are architecturally heavier
+and often ship their own background services, network listeners,
+and dependencies.

-### 2.5 Agent-Loaded Content
+A malicious or buggy plugin is not a vulnerability in Hermes Agent
+itself. Bugs in Hermes Agent's plugin-install or plugin-discovery
+path that prevent the operator from seeing what they're installing
+are in scope under §3.1.

-Hermes chooses, by design, to load and execute content from specific
-on-disk locations at its own initiative — skills, hooks, plugins,
-operator-configured shortcuts. Content placed in these locations
-becomes code the agent runs on its next session, hook dispatch, or
-command invocation.
+### 2.6 External Surfaces

-Hermes does not claim these locations are protected files.
-Filesystem-level protection is whatever the OS provides under the
-operator's chosen isolation posture (§2.1). What Hermes commits to
-is narrower and different: **attacker-influenced input must not be
-chainable into a write that Hermes would later load and execute on
-its own initiative**. The concern is not what the filesystem
-allows; it is what Hermes loads.
+An **external surface** is any channel outside the local agent
+process through which a caller can dispatch agent work, resolve
+approvals, or receive agent output. Each surface has its own
+authorization model, but the rules below apply uniformly.
+
+**Surfaces in Hermes Agent:**
+
+- **Gateway platform adapters.** Messaging integrations in
+`gateway/platforms/` (Telegram, Discord, Slack, email, SMS, etc.)
+and analogous adapters shipped as plugins.
+- **Network-exposed HTTP surfaces.** The API server adapter, the
+dashboard plugin, the kanban plugin's HTTP endpoints, and any
+other plugin that binds a listening socket.
+- **Editor / IDE adapters.** The ACP adapter (`acp_adapter/`) and
+equivalent integrations that accept requests from a local client
+process.
+- **The TUI gateway (`tui_gateway/`).** JSON-RPC backend for the
+Ink terminal UI, reached over local IPC.
+
+**Uniform rules:**
+
+1. **Authorization is required at every surface that crosses a
+trust boundary.** For messaging and network HTTP surfaces, the
+boundary is the network: authorization means an operator-
+configured caller allowlist. For editor and local-IPC surfaces
+(ACP, TUI gateway), the boundary is the host's user account:
+authorization means relying on OS-level access control (file
+permissions, loopback-only binds) and not exposing the surface
+beyond the local user without an explicit network auth layer.
+2. **An allowlist is required for every enabled network-exposed
+adapter.** Adapters must refuse to dispatch agent work, resolve
+approvals, or relay output until an allowlist is set. Code paths
+that fail open when no allowlist is configured are code bugs in
+scope under §3.1.
+3. **Session identifiers are routing handles, not authorization
+boundaries.** Knowing another caller's session ID does not grant
+access to their approvals or output; authorization is always
+re-checked against the allowlist (or OS-level equivalent).
+4. **Within the authorized set, all callers are equally trusted.**
+Hermes Agent does not model per-caller capabilities inside a
+single adapter. Operators who need capability separation should
+run separate agent instances with separate allowlists.
+5. **Binding a local-only surface to a non-loopback interface is a
+break-glass operator decision (§3.2).** The dashboard and other
+plugin HTTP servers default to loopback; exposing them via
+`--host 0.0.0.0` or equivalent makes public-exposure hardening
+(§4) the operator's responsibility.
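Reviewer note: rules 2 and 3 above describe a fail-closed check, which can be sketched in a few lines. The function name `authorize` and its parameters are hypothetical and do not come from the Hermes Agent adapters; the sketch covers only the network-exposed (allowlist) case.

```python
from typing import AbstractSet, Optional


def authorize(allowlist: AbstractSet[str], caller_id: str,
              session_id: Optional[str] = None) -> bool:
    """Decide whether a caller may dispatch work, resolve approvals, or get output.

    The session_id parameter is accepted but deliberately ignored:
    it routes messages and never grants access (rule 3).
    """
    if not allowlist:
        # Rule 2: no allowlist configured means refuse, never fail open.
        return False
    return caller_id in allowlist


no_allowlist = authorize(frozenset(), "alice")
outsider_with_session = authorize({"alice"}, "mallory", session_id="alice-session-1")
listed_caller = authorize({"alice"}, "alice")
```

An adapter that returned `True` in the empty-allowlist branch would be the fail-open bug that rule 2 places in scope under §3.1.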

 ---

@@ -138,60 +223,71 @@ allows; it is what Hermes loads.

 ### 3.1 In Scope

-- Escape from a declared OS-level isolation posture (§2.1): an
+- Escape from a declared OS-level isolation posture (§2.2): an
 attacker-controlled code path reaching state that the posture
 claimed to confine.
-- Unauthorized gateway access: a caller outside the configured
-allowlist dispatching work, receiving output, or resolving
-approvals (§2.4).
+- Unauthorized external-surface access: a caller outside the
+configured authorization set (allowlist, or OS-level equivalent
+for local-IPC surfaces) dispatching work, receiving output, or
+resolving approvals (§2.6).
 - Credential exfiltration: leakage of operator credentials or
 session authorization material to a destination outside the
-operator's trust envelope.
-- Untrusted input chaining into agent-loaded content: an untrusted
-input surface chains into a write whose target is a location
-Hermes loads and executes on its own initiative (§2.5).
-- Output integrity failures into external platforms: agent output
-rendered on a receiving platform with unintended authority —
-broadcast-mention passthrough, content that fetches attacker
-resources for every recipient, markup injection into hosted UIs.
+trust envelope, via a mechanism that should have prevented it
+(environment scrubbing bug, adapter logging, transport error
+that flushes credentials to an upstream, etc.).
 - Trust-model documentation violations: code behaving contrary to
-what this policy states, where an operator relying on the policy
-would reasonably expect otherwise.
+what this policy, Hermes Agent's own documentation, or reasonable
+operator expectations would predict — including cases where
+Hermes Agent has documented a stance about how its output should
+be rendered by a consuming layer (dashboard, gateway adapter,
+file writer, shell) and a code path breaks that stance.

 ### 3.2 Out of Scope

 "Out of scope" here means "not a security vulnerability under this
 policy." It does not mean "not worth reporting." Improvements to the
 in-process heuristics, hardening ideas, and UX fixes are welcome as
-regular issues or pull requests — we can always make the approval
-gate catch more patterns, make redaction smarter, or tighten adapter
-behavior. These items just don't go through the private-disclosure
-channel and don't receive advisories.
+regular issues or pull requests — the approval gate can always catch
+more patterns, redaction can always get smarter, adapter behavior
+can always be tightened. These items just don't go through the
+private-disclosure channel and don't receive advisories.

-- **Bypasses of in-process heuristics (§2.3)** — approval-gate regex
+- **Bypasses of in-process heuristics (§2.4)** — approval-gate regex
 bypasses, redaction bypasses, Skills Guard pattern bypasses, and
 analogous reports against future heuristics. These components are
 not boundaries; defeating them is not a vulnerability under this
 policy.
-- **Prompt injection that does not chain to a §3.1 outcome.** Getting
-the LLM to emit unusual text or "ignore previous instructions" is
-not itself a vulnerability; it becomes one only when it results in
-something §3.1 describes.
+- **Prompt injection per se.** Getting the LLM to emit unusual
+output — via injected content, hallucination, training artifacts,
+or any other cause — is not itself a vulnerability. "I achieved
+prompt injection" without a chained §3.1 outcome is not an
+actionable report under this policy.
 - **Consequences of a chosen isolation posture.** Reports that a
 code path operating within its posture's scope can do what that
-posture permits are not vulnerabilities. Examples: shell tools
-reaching host state under the local backend; code-execution or
-file tools reaching host state under terminal-backend isolation
-that only sandboxes shell; reports whose preconditions require
-pre-existing write access to operator-owned configuration or
-credential files (those are already inside the operator's trust
-envelope).
+posture permits are not vulnerabilities. Examples: shell or file
+tools reaching host state under the local backend; code-execution
+or MCP subprocesses reaching host state under terminal-backend
+isolation that only sandboxes shell; reports whose preconditions
+require pre-existing write access to operator-owned configuration
+or credential files (those are already inside the trust envelope).
+- **Documented break-glass settings.** Operator-selected trade-offs
+that explicitly disable protections: `--insecure` and equivalent
+flags on the dashboard or other components, disabled approvals,
+local backend in production, development profiles that bypass
+hermes-home security, and similar. Reports against those
+configurations are not vulnerabilities — that's the flag's job.
+- **Community-contributed skills and plugins.** Third-party skills
+(including the community skills repository) and third-party
+plugins are in the operator's review surface, not Hermes Agent's
+trust surface (§2.4, §2.5). A skill or plugin doing something
+malicious is the expected failure mode of one that wasn't
+reviewed, not a vulnerability in Hermes Agent. Bugs in Hermes
+Agent's skill-install or plugin-install path that prevent the
+operator from seeing what they're installing are in scope under
+§3.1.
 - **Public exposure without external controls.** Exposing the
 gateway or API to the public internet without authentication,
 VPN, or firewall.
-- **Documented break-glass settings.** Disabled approvals, local
-backend in production, development profiles that bypass
-hermes-home security, and similar operator-selected trade-offs.
 - **Tool-level read/write restrictions on a posture where shell is
 permitted.** If a path is reachable via the terminal tool, reports
 that other file tools can reach it add nothing.
@@ -201,25 +297,26 @@ channel and don't receive advisories.
 ## 4. Deployment Hardening

 The single most important hardening decision is matching isolation
-(§2.1) to the trust of the content the agent will ingest. Beyond
+(§2.2) to the trust of the content the agent will ingest. Beyond
 that:

 - Run the agent as a non-root user. The supplied container image
 does this by default.
 - Keep credentials in the operator credential file with tight
 permissions, never in the main config, never in version control.
-Under OpenShell, use its Provider store rather than an on-disk
+Under OpenShell, use the Provider store rather than an on-disk
 credential file.
 - Do not expose the gateway or API to the public internet without
 VPN, Tailscale, or firewall protection. Under OpenShell, use the
 network policy layer to restrict egress.
-- Configure a caller allowlist for every gateway adapter you enable
-(§2.4).
-- Review third-party skills before install. Skills Guard reports and
-the install audit log are the review surface.
-- The OSV malware database is consulted before launching
-ecosystem-resolved MCP servers. Additional supply-chain guards
-on dependency and bundled-package changes run in CI; see
+- Configure a caller allowlist for every network-exposed adapter
+you enable (§2.6).
+- Review third-party skills and plugins before install (§2.4,
+§2.5). For skills, this means reading the Python and scripts,
+not just SKILL.md. Skills Guard reports and the install audit
+log are the review surface.
+- Hermes Agent includes supply-chain guards for MCP server
+launches and for dependency / bundled-package changes in CI; see
 `CONTRIBUTING.md` for specifics.
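Reviewer note: the "tight permissions" item in the list above can be checked mechanically on POSIX hosts. The sketch below treats a file as tight when neither group nor others have any access bits set. The helper name is hypothetical, the policy does not name the credential file path, and the demonstration uses a temporary file in its place.

```python
import os
import stat
import tempfile


def has_tight_permissions(path: str) -> bool:
    """True if neither group nor others have any permission bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Demonstration on a throwaway file standing in for the credential file.
with tempfile.NamedTemporaryFile() as fh:
    os.chmod(fh.name, 0o600)   # owner read/write only
    tight = has_tight_permissions(fh.name)
    os.chmod(fh.name, 0o644)   # group and others can read
    loose = has_tight_permissions(fh.name)
```

A check like this is a hardening aid for operators. It says nothing about credentials held in memory, which in-process components can still read.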

 ---