# Hermes Agent Security Policy

This document describes Hermes Agent's trust model, names the one
security boundary the project treats as load-bearing, and defines the
scope for vulnerability reports.


## 1. Reporting a Vulnerability

Report privately via [GitHub Security Advisories](https://github.com/NousResearch/hermes-agent/security/advisories/new)
or **security@nousresearch.com**. Do not open public issues for
security vulnerabilities. **Hermes Agent does not operate a bug
bounty program.**

A useful report includes:

- A concise description and severity assessment.
- The affected component, identified by file path and line range
  (e.g. `path/to/file.py:120-145`).
- Environment details (`hermes version`, commit SHA, OS, Python
  version).
- A reproduction against `main` or the latest release.
- A statement of which trust boundary in §2 is crossed.

Please read §2 and §3 before submitting. Reports that demonstrate
limits of an in-process heuristic this policy does not treat as a
boundary will be closed as out-of-scope under §3 — but see §3.2:
they are still welcome as regular issues or pull requests, just not
through the private security channel.

---


## 2. Trust Model

Hermes Agent is a single-tenant personal agent. Its posture is
layered, and the layers are not equally load-bearing. Reporters and
operators should reason about them in the same terms.


### 2.1 Definitions

- **Agent process.** The Python interpreter running Hermes Agent,
  including any Python modules it has loaded (skills, plugins,
  hook handlers).
- **Terminal backend.** A pluggable execution target for the
  `terminal()` tool. The default runs commands directly on the host.
  Other backends run commands inside a container, cloud sandbox, or
  remote host.
- **Input surface.** Any channel through which content enters the
  agent's context: operator input, web fetches, email, gateway
  messages, file reads, MCP server responses, tool results.
- **Trust envelope.** The set of resources an operator has implicitly
  granted Hermes Agent access to by running it — typically, whatever
  the operator's own user account can reach on the host.
- **Stance.** An explicit statement in Hermes Agent's documentation
  or code about how a consuming layer (adapter, UI, file writer,
  shell) should treat agent output — e.g. "the dashboard renders
  agent output as inert HTML."


### 2.2 The Boundary: OS-Level Isolation

**The only security boundary against an adversarial LLM is the
operating system.** Nothing inside the agent process constitutes
containment — not the approval gate, not output redaction, not any
pattern scanner, not any tool allowlist. Any in-process component
that screens LLM output is a heuristic operating on an
attacker-influenced string, and this policy treats it as such.

Hermes Agent supports two OS-level isolation postures. They address
different threats and an operator should choose deliberately.


#### Terminal-backend isolation

A non-default terminal backend runs LLM-emitted shell commands
inside a container, remote host, or cloud sandbox. The file tools
(`read_file`, `write_file`, `patch`) also run through this backend,
since they are implemented on top of the shell contract — they
cannot reach paths the backend doesn't expose.

What this confines: anything the agent does by issuing shell or
file operations. What this does **not** confine: everything the
agent does in its own Python process. That includes the
code-execution tool (spawned as a host subprocess), MCP subprocesses
(spawned from the agent's environment), plugin loading, hook
dispatch, and skill loading (all imported into the agent
interpreter).

Terminal-backend isolation is the right posture when the concern is
LLM-emitted destructive shell or unwanted file-tool writes, and the
operator is otherwise trusted.


#### Whole-process wrapping

Whole-process wrapping runs the entire agent process tree inside a
sandbox. Every code path — shell, code-execution, MCP, file tools,
plugins, hooks, skill loading — is subject to the same filesystem,
network, process, and (where applicable) inference policy.

Hermes Agent supports this in two ways:

- **Hermes Agent's own Docker image and Compose setup.**
  Lighter-weight; the agent runs in a standard container with
  operator-configured mounts and network policy.
- **[NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell)**.
  OpenShell provides per-session sandboxes with declarative policy
  across filesystem, network (L7 egress), process/syscall, and
  inference-routing layers. Network and inference policies are
  hot-reloadable. Credentials are injected from a Provider store
  and never touch the sandbox filesystem.

Under a whole-process wrapper, Hermes Agent's in-process heuristics
(§2.4) function as accident-prevention layered on top of a real
boundary. This is the supported posture when the agent ingests
content from surfaces the operator does not control — the open web,
inbound email, multi-user channels, untrusted MCP servers — and for
production or shared deployments.

Operators running the default local backend with untrusted input
surfaces, or running a terminal-backend sandbox and expecting it to
contain code paths that don't go through the shell, are operating
outside the supported security posture.

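
The coverage of each posture described in §2.2 can be restated as a small lookup table. The sketch below is illustrative only: the `Posture` enum, the code-path labels, and `unconfined_paths` are hypothetical names chosen for this example, not identifiers from Hermes Agent.

```python
from enum import Enum


class Posture(Enum):
    LOCAL = "local"                        # default backend, no OS-level isolation
    TERMINAL_BACKEND = "terminal_backend"  # shell and file tools are sandboxed
    WHOLE_PROCESS = "whole_process"        # entire agent process tree is sandboxed


# Code paths named in this section (labels are illustrative).
CODE_PATHS = frozenset({
    "shell", "file_tools", "code_execution", "mcp",
    "plugins", "hooks", "skill_loading",
})

# Which code paths each posture confines, per the prose above.
CONFINED = {
    Posture.LOCAL: frozenset(),
    Posture.TERMINAL_BACKEND: frozenset({"shell", "file_tools"}),
    Posture.WHOLE_PROCESS: CODE_PATHS,
}


def unconfined_paths(posture: Posture) -> frozenset:
    """Return the code paths that the given posture does not confine."""
    return CODE_PATHS - CONFINED[posture]


if __name__ == "__main__":
    for posture in Posture:
        print(posture.value, sorted(unconfined_paths(posture)))
```

Under this reading, terminal-backend isolation leaves five of the seven code paths unconfined, which is why it is only suitable when the operator's other inputs are trusted.
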

### 2.3 Credential Scoping

Hermes Agent filters the environment it passes to its lower-trust
child processes: shell subprocesses, MCP subprocesses, and
the code-execution child. Credentials like provider API keys and
gateway tokens are stripped by default; variables explicitly
declared by the operator or by a loaded skill are passed through.

This reduces casual exfiltration. It is not containment. Any
component running inside the agent process (skills, plugins, hook
handlers) can read whatever the agent itself can read, including
in-memory credentials. The mitigation against a compromised
in-process component is operator review before install (§2.4,
§2.5), not environment scrubbing.

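
As a rough illustration of the scrubbing idea, the sketch below drops credential-like variables from an environment mapping unless they are explicitly declared. It is not Hermes Agent's implementation: `scrub_env`, `SENSITIVE_MARKERS`, and the name-matching rule are assumptions made for this example.

```python
# Hypothetical markers for credential-like variable names (an assumption,
# not the list Hermes Agent uses).
SENSITIVE_MARKERS = ("API_KEY", "TOKEN", "SECRET", "PASSWORD")


def scrub_env(env, passthrough=()):
    """Return a copy of env without credential-like variables.

    Names listed in passthrough (for example, declared by the operator
    or by a loaded skill) are kept even if they look sensitive.
    """
    allowed = set(passthrough)
    return {
        name: value
        for name, value in env.items()
        if name in allowed
        or not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


if __name__ == "__main__":
    parent = {
        "PATH": "/usr/bin",
        "OPENAI_API_KEY": "sk-test",
        "GITHUB_TOKEN": "ghp-test",
    }
    child = scrub_env(parent, passthrough={"GITHUB_TOKEN"})
    print(sorted(child))  # ['GITHUB_TOKEN', 'PATH']
```

The resulting mapping could be handed to `subprocess.run(..., env=child)`. Note that this only affects what a child process inherits; code imported into the agent interpreter still sees the unfiltered environment and in-memory credentials.
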

### 2.4 In-Process Heuristics

The following components screen or warn about LLM behavior. They
are useful. They are not boundaries.

- The **approval gate** detects common destructive shell patterns
  and prompts the operator before execution. Shell is
  Turing-complete; a denylist over shell strings is structurally
  incomplete. The gate catches cooperative-mode mistakes, not
  adversarial output.
- **Output redaction** strips secret-like patterns from display.
  A motivated output producer will defeat it.
- **Skills Guard** scans installable skill content for injection
  patterns. It is a review aid; the boundary for third-party skills
  is operator review before install. Reviewing a skill means
  reading its Python code and scripts, not just its SKILL.md
  description — skills execute arbitrary Python at import time.

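
A toy example makes the "heuristic, not boundary" point concrete. The two regular expressions below are hypothetical stand-ins written for this illustration; they are not the patterns Hermes Agent ships. Each catches the literal form it was written for and misses a trivially transformed equivalent.

```python
import base64
import re

# Hypothetical denylist entry: flags a literal recursive delete of "/".
DESTRUCTIVE = re.compile(r"\brm\s+-rf\s+/")

# Hypothetical redaction rule: masks strings shaped like "sk-..." keys.
SECRET = re.compile(r"sk-[A-Za-z0-9]{8,}")


def needs_approval(command: str) -> bool:
    """Return True if the command matches the denylist pattern."""
    return DESTRUCTIVE.search(command) is not None


def redact(text: str) -> str:
    """Mask anything that matches the secret-like pattern."""
    return SECRET.sub("[REDACTED]", text)


# The literal form is caught ...
assert needs_approval("rm -rf /")
# ... but an equivalent command assembled at run time is not.
assert not needs_approval('c=rm; "$c" -r -f /')

# The literal key is masked ...
assert redact("key=sk-abcdef123456") == "key=[REDACTED]"
# ... but the same key, base64-encoded by the producer, passes through.
encoded = base64.b64encode(b"sk-abcdef123456").decode()
assert redact(encoded) == encoded
```

Adding more patterns narrows the gap for honest mistakes, but the producer of the string always has another encoding available, which is why bypasses of this kind are handled as regular issues rather than vulnerabilities (§3.2).
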

### 2.5 Plugin Trust Model

Plugins load into the agent process and run with full agent
privileges: they can read the same credentials, call the same
tools, register the same hooks, and import the same modules as
anything shipped in-tree. The boundary for third-party plugins is
operator review before install — the same rule as skills (§2.4),
called out separately because plugins are architecturally heavier
and often ship their own background services, network listeners,
and dependencies.

A malicious or buggy plugin is not a vulnerability in Hermes Agent
itself. Bugs in Hermes Agent's plugin-install or plugin-discovery
path that prevent the operator from seeing what they're installing
are in scope under §3.1.


### 2.6 External Surfaces

An **external surface** is any channel outside the local agent
process through which a caller can dispatch agent work, resolve
approvals, or receive agent output. Each surface has its own
authorization model, but the rules below apply uniformly.

**Surfaces in Hermes Agent:**

- **Gateway platform adapters.** Messaging integrations in
  `gateway/platforms/` (Telegram, Discord, Slack, email, SMS, etc.)
  and analogous adapters shipped as plugins.
- **Network-exposed HTTP surfaces.** The API server adapter, the
  dashboard plugin, the kanban plugin's HTTP endpoints, and any
  other plugin that binds a listening socket.
- **Editor / IDE adapters.** The ACP adapter (`acp_adapter/`) and
  equivalent integrations that accept requests from a local client
  process.
- **The TUI gateway (`tui_gateway/`).** JSON-RPC backend for the
  Ink terminal UI, reached over local IPC.


**Uniform rules:**

1. **Authorization is required at every surface that crosses a
   trust boundary.** For messaging and network HTTP surfaces, the
   boundary is the network: authorization means an
   operator-configured caller allowlist. For editor and local-IPC
   surfaces (ACP, TUI gateway), the boundary is the host's user account:
   authorization means relying on OS-level access control (file
   permissions, loopback-only binds) and not exposing the surface
   beyond the local user without an explicit network auth layer.
2. **An allowlist is required for every enabled network-exposed
   adapter.** Adapters must refuse to dispatch agent work, resolve
   approvals, or relay output until an allowlist is set. Code paths
   that fail open when no allowlist is configured are code bugs in
   scope under §3.1.
3. **Session identifiers are routing handles, not authorization
   boundaries.** Knowing another caller's session ID does not grant
   access to their approvals or output; authorization is always
   re-checked against the allowlist (or OS-level equivalent).
4. **Within the authorized set, all callers are equally trusted.**
   Hermes Agent does not model per-caller capabilities inside a
   single adapter. Operators who need capability separation should
   run separate agent instances with separate allowlists.
5. **Binding a local-only surface to a non-loopback interface is a
   break-glass operator decision (§3.2).** The dashboard and other
   plugin HTTP servers default to loopback; exposing them via
   `--host 0.0.0.0` or equivalent makes public-exposure hardening
   (§4) the operator's responsibility.

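
Rules 2 and 3 can be pictured with a minimal fail-closed check. This is a sketch under stated assumptions, not adapter code from Hermes Agent: `is_authorized`, `resolve_approval`, and the in-memory `sessions` mapping are hypothetical names invented for the example.

```python
from typing import Collection, Mapping, Optional


def is_authorized(caller_id: str, allowlist: Optional[Collection[str]]) -> bool:
    """Fail closed: a missing or empty allowlist authorizes nobody."""
    if not allowlist:
        return False
    return caller_id in allowlist


def resolve_approval(
    caller_id: str,
    session_id: str,
    allowlist: Optional[Collection[str]],
    sessions: Mapping[str, str],
) -> str:
    """Look up a session only after the caller passes the allowlist check.

    The session ID is used purely for routing; knowing it grants nothing.
    """
    if not is_authorized(caller_id, allowlist):
        raise PermissionError(f"caller {caller_id!r} is not on the allowlist")
    return sessions[session_id]


if __name__ == "__main__":
    sessions = {"s1": "pending approval"}
    print(resolve_approval("alice", "s1", {"alice"}, sessions))
    try:
        # Same session ID, but no allowlist configured: refused.
        resolve_approval("alice", "s1", None, sessions)
    except PermissionError as exc:
        print("refused:", exc)
```
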

---

## 3. Scope

### 3.1 In Scope

- Escape from a declared OS-level isolation posture (§2.2): an
  attacker-controlled code path reaching state that the posture
  claimed to confine.
- Unauthorized external-surface access: a caller outside the
  configured authorization set (allowlist, or OS-level equivalent
  for local-IPC surfaces) dispatching work, receiving output, or
  resolving approvals (§2.6).
- Credential exfiltration: leakage of operator credentials or
  session authorization material to a destination outside the
  trust envelope, via a mechanism that should have prevented it
  (environment scrubbing bug, adapter logging, transport error
  that flushes credentials to an upstream, etc.).
- Trust-model documentation violations: code behaving contrary to
  what this policy, Hermes Agent's own documentation, or reasonable
  operator expectations would predict — including cases where
  Hermes Agent has documented a stance about how its output should
  be rendered by a consuming layer (dashboard, gateway adapter,
  file writer, shell) and a code path breaks that stance.


### 3.2 Out of Scope

"Out of scope" here means "not a security vulnerability under this
policy." It does not mean "not worth reporting." Improvements to the
in-process heuristics, hardening ideas, and UX fixes are welcome as
regular issues or pull requests — the approval gate can always catch
more patterns, redaction can always get smarter, adapter behavior
can always be tightened. These items just don't go through the
private-disclosure channel and don't receive advisories.

- **Bypasses of in-process heuristics (§2.4)** — approval-gate regex
  bypasses, redaction bypasses, Skills Guard pattern bypasses, and
  analogous reports against future heuristics. These components are
  not boundaries; defeating them is not a vulnerability under this
  policy.
- **Prompt injection per se.** Getting the LLM to emit unusual
  output — via injected content, hallucination, training artifacts,
  or any other cause — is not itself a vulnerability. "I achieved
  prompt injection" without a chained §3.1 outcome is not an
  actionable report under this policy.
- **Consequences of a chosen isolation posture.** Reports that a
  code path operating within its posture's scope can do what that
  posture permits are not vulnerabilities. Examples: shell or file
  tools reaching host state under the local backend; code-execution
  or MCP subprocesses reaching host state under terminal-backend
  isolation that only sandboxes shell; reports whose preconditions
  require pre-existing write access to operator-owned configuration
  or credential files (those are already inside the trust envelope).
- **Documented break-glass settings.** Operator-selected trade-offs
  that explicitly disable protections: `--insecure` and equivalent
  flags on the dashboard or other components, disabled approvals,
  local backend in production, development profiles that bypass
  hermes-home security, and similar. Reports against those
  configurations are not vulnerabilities — that's the flag's job.
- **Community-contributed skills and plugins.** Third-party skills
  (including the community skills repository) and third-party
  plugins are in the operator's review surface, not Hermes Agent's
  trust surface (§2.4, §2.5). A skill or plugin doing something
  malicious is the expected failure mode of one that wasn't
  reviewed, not a vulnerability in Hermes Agent. Bugs in Hermes
  Agent's skill-install or plugin-install path that prevent the
  operator from seeing what they're installing are in scope under
  §3.1.
- **Public exposure without external controls.** Exposing the
  gateway or API to the public internet without authentication,
  VPN, or firewall.
- **Tool-level read/write restrictions on a posture where shell is
  permitted.** If a path is reachable via the terminal tool, reports
  that other file tools can reach it add nothing.


---

## 4. Deployment Hardening

The single most important hardening decision is matching isolation
(§2.2) to the trust of the content the agent will ingest. Beyond
that:

- Run the agent as a non-root user. The supplied container image
  does this by default.
- Keep credentials in the operator credential file with tight
  permissions, never in the main config, never in version control.
  Under OpenShell, use the Provider store rather than an on-disk
  credential file.
- Do not expose the gateway or API to the public internet without
  VPN, Tailscale, or firewall protection. Under OpenShell, use the
  network policy layer to restrict egress.
- Configure a caller allowlist for every network-exposed adapter
  you enable (§2.6).
- Review third-party skills and plugins before install (§2.4,
  §2.5). For skills, this means reading the Python and scripts,
  not just SKILL.md. Skills Guard reports and the install audit
  log are the review surface.
- Hermes Agent includes supply-chain guards for MCP server
  launches and for dependency / bundled-package changes in CI; see
  `CONTRIBUTING.md` for specifics.

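
One way an operator might spot-check the "tight permissions" item above on a POSIX host is sketched below. It assumes "tight" means owner-only access (for example mode `0600`), which this policy does not define; `has_tight_permissions` is a hypothetical helper, and the temporary file stands in for whatever credential file a deployment actually uses.

```python
import os
import stat
import tempfile


def has_tight_permissions(path: str) -> bool:
    """True if path is a regular file with no group or other permission bits.

    POSIX only: permission bits are not meaningful in the same way on Windows.
    """
    mode = os.stat(path).st_mode
    return stat.S_ISREG(mode) and (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        os.chmod(handle.name, 0o600)  # owner read/write only
        print(has_tight_permissions(handle.name))
        os.chmod(handle.name, 0o644)  # group and others can read
        print(has_tight_permissions(handle.name))
```
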

---

## 5. Disclosure

- **Coordinated disclosure window:** 90 days from report, or until a
  fix is released, whichever comes first.
- **Channel:** the GHSA thread or email correspondence with
  security@nousresearch.com.
- **Credit:** reporters are credited in release notes unless
  anonymity is requested.