plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131)

New opt-in plugin that scans the content passed to write_file / patch /
skill_manage for 25 known-dangerous code patterns — pickle.load,
yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec,
dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/
insertAdjacentHTML, crypto.createCipher (no IV), AES ECB,
TLS verification disabled, XXE-prone xml.etree/minidom parsers,
<script src=//...> without SRI, torch.load without weights_only=True,
GitHub Actions ${{ github.event.* }} injection — and appends a
"Security guidance" warning block to the tool result via the
transform_tool_result hook.

Default behaviour is non-blocking: the file is written and the warning
rides back to the model in the next turn so it can self-correct or
document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades
to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the
kill switch.

Pattern data (patterns.py) is a verbatim Apache-2.0 fork of
Anthropic's claude-plugins-official/plugins/security-guidance/hooks/
patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE
preserve attribution. The Hermes-side plugin glue (__init__.py,
plugin.yaml, README.md, tests) is original work.

Plugin is opt-in like all bundled plugins:
  hermes plugins enable security-guidance

Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic
shipped this as their security-guidance plugin for Claude Code on
2026-05-26 with a measured 30-40% reduction in security-related PR
comments on internal rollout.

What's NOT ported (deferred):
  * Layer 2 (LLM diff review on turn end) — would route through main
    model by default on Hermes, real money on reasoning models. A
    follow-up can wire it to a cheap aux model with explicit opt-in.
  * Layer 3 (agentic commit-time review) — agent can run this on
    demand via delegate_task today.
  * .hermes/security-guidance.md project-rules file — only used by
    layers 2/3 upstream.
Teknium 2026-05-27 02:07:21 -07:00 committed by GitHub
parent c752205635
commit 249534e472
8 changed files with 1311 additions and 0 deletions


@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@ -0,0 +1,30 @@
Hermes Agent security-guidance plugin
=====================================
This plugin (plugins/security-guidance/) includes work originally
published in the claude-plugins-official repository by Anthropic, PBC.,
licensed under the Apache License, Version 2.0.
Source: https://github.com/anthropics/claude-plugins-official
Subpath: plugins/security-guidance/hooks/patterns.py
Commit: 0bde168 (2026-05-26)
License: Apache License 2.0 (see LICENSE in this directory)
Forked content
--------------
The file patterns.py in this directory is a verbatim copy of the upstream
patterns.py at the commit above, with a modified module docstring noting
this attribution. The pattern data — 25 regex/substring rules covering
unsafe deserialization, command injection, XSS sinks, crypto footguns,
XXE, GitHub Actions injection, and TLS-verification disablement — is
unmodified.
Original work
-------------
The Hermes-side plugin glue code (__init__.py, plugin.yaml, README.md,
tests) is original work by NousResearch and is licensed under the MIT
License that applies to the rest of the hermes-agent project, except
where it imports from patterns.py — that import does not change the
license of either file.


@ -0,0 +1,88 @@
# security-guidance
Pattern-matched security warnings for code the agent writes. When the agent
calls `write_file`, `patch`, or `skill_manage` with content that matches a
known-dangerous code pattern (eval, pickle.load, yaml.load, os.system,
subprocess with `shell=True`, `dangerouslySetInnerHTML`, `verify=False`, ECB
mode, GitHub Actions `${{ github.event.* }}` injection, `torch.load` without
`weights_only=True`, ...), the plugin appends a warning to the tool's result.
The file is still written; the model sees the warning in the next turn and
can fix the code or briefly document why the construct is safe.
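
The warn-mode flow described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the helper name `append_warning`, the sample tool result, and the simplified header line are all hypothetical.

```python
import json
from typing import List, Tuple


def append_warning(result: str, findings: List[Tuple[str, str]]) -> str:
    """Return the tool result with a warning block appended (hypothetical helper)."""
    if not findings:
        # Nothing matched: the model sees the original result unchanged.
        return result
    names = ", ".join(name for name, _ in findings)
    plural = "s" if len(findings) != 1 else ""
    lines = ["", "---", f"Security guidance: {len(findings)} pattern{plural} matched ({names})", ""]
    for _, reminder in findings:
        lines += [reminder, ""]
    return result + "\n\n" + "\n".join(lines)


# Made-up tool result; the real write_file result shape is an assumption here.
tool_result = json.dumps({"ok": True, "path": "app.py"})
decorated = append_warning(tool_result, [("eval_injection", "eval() executes arbitrary code.")])
print(decorated)
```

The write itself is not affected in this mode: only the string returned to the model changes, which is why the warning arrives one turn after the file lands on disk.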
This is layer 1 of Anthropic's `security-guidance` plugin design — a fast
first-pass that runs locally with zero LLM tokens spent. Layers 2 and 3 (LLM
diff review on turn end, agentic commit review) are not ported; the agent
can already run those kinds of reviews on demand via `delegate_task`.
## Coverage (25 rules)
The pattern set is forked verbatim from Anthropic's `claude-plugins-official`
under Apache-2.0. Categories:
| Category | Rules |
|---|---|
| Unsafe deserialization | `pickle.load`, `cPickle/cloudpickle/dill.load`, `marshal.loads`, `shelve.open`, `yaml.load`, `yaml.unsafe_load`, `torch.load` (without `weights_only=True`), `joblib.load`, `pandas.read_pickle`, `numpy.load(allow_pickle=True)` |
| Command injection | `os.system`, `subprocess(..., shell=True)`, JS `child_process.exec`, Go `exec.Command("sh"...)` |
| Code injection | `eval(`, JS `new Function(...)` |
| XSS sinks | `.innerHTML =`, `.outerHTML =`, `.insertAdjacentHTML(`, `document.write`, React `dangerouslySetInnerHTML` |
| Crypto footguns | AES ECB mode, Node `crypto.createCipher` (no IV), TLS verification disabled (`verify=False`, `rejectUnauthorized: false`, `InsecureSkipVerify: true`, ...) |
| XXE | `xml.etree`, `minidom`, `xml.sax` without `defusedxml` |
| Supply chain | `<script src="https://..."` without `integrity=` SRI hash |
| CI/CD injection | GitHub Actions workflow files using `${{ github.event.* }}` in `run:` |
The pattern data uses Python regex + literal-substring matching. Rules can
carry a `path_filter` lambda keyed on file extension: Python-only rules skip
`.js`, JS-only rules skip `.py`, and rules that tend to match prose (such as
`eval(`) skip `.md/.txt/.rst/.json/.yaml`. Rules without a filter (for example
`dangerouslySetInnerHTML`) apply to every path. Negative lookbehind
assertions exclude method calls (so `model.eval()` and `redis.eval()` don't
trip the `eval(` rule). False positives still occur often enough that the
plugin warns by default instead of blocking.
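
To make the path filter and lookbehind behaviour concrete, here is a small sketch. The regex is the one used by the `eval_injection` rule in `patterns.py`; the helper `eval_rule_fires` and the shortened `DOC_EXTS` tuple are hypothetical and only mirror the extensions listed above.

```python
import re

# Regex taken from the eval_injection rule: the negative lookbehind rejects
# a preceding identifier character or dot, so method calls do not match.
EVAL_RULE = re.compile(r"(?<![a-zA-Z0-9_\.])eval\(")

# Hypothetical, shortened list of doc/data extensions the rule skips.
DOC_EXTS = (".md", ".txt", ".rst", ".json", ".yaml")


def eval_rule_fires(path: str, content: str) -> bool:
    """Apply the path filter first, then the content regex."""
    if path.endswith(DOC_EXTS):
        return False
    return EVAL_RULE.search(content) is not None


print(eval_rule_fires("train.py", "model.eval()"))               # False
print(eval_rule_fires("calc.py", "result = eval(user_input)"))   # True
print(eval_rule_fires("notes.md", "never call eval(x)"))         # False
```

The filter runs before the regex, so a documentation file is skipped without its content being scanned for that rule.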
## Enabling
Plugins are opt-in. Add it to your allow-list:
```bash
hermes plugins enable security-guidance
# or edit ~/.hermes/config.yaml manually:
plugins:
  enabled:
    - security-guidance
```
## Modes
| Env var | Default | Effect |
|---|---|---|
| (none) | warn | Appends a `⚠️ Security guidance` block to the tool result. The file is written. |
| `SECURITY_GUIDANCE_BLOCK=1` | unset | Refuses the write entirely with the warning as the block reason. Use for stricter environments. |
| `SECURITY_GUIDANCE_DISABLE=1` | unset | Kill switch — plugin loads but does nothing. |
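
As a rough sketch of how the two variables combine, the function below resolves the effective mode from an environment mapping. The name `resolve_mode` is hypothetical. The accepted truthy values (`1`, `true`, `yes`, `on`, case-insensitive) and the precedence of the kill switch over block mode follow the plugin's `__init__.py` in this commit.

```python
import os
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}


def resolve_mode(env: Mapping[str, str]) -> str:
    """Return "disabled", "block", or "warn" for the given environment."""

    def flag(name: str) -> bool:
        return env.get(name, "").lower() in _TRUTHY

    if flag("SECURITY_GUIDANCE_DISABLE"):
        return "disabled"  # the kill switch is checked first
    if flag("SECURITY_GUIDANCE_BLOCK"):
        return "block"
    return "warn"


print(resolve_mode(os.environ))
```

With both variables set, the plugin ends up doing nothing, because the scan returns no findings when the kill switch is on.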
## What it does **not** do (yet)
* **No LLM diff review.** Anthropic's layer 2 spawns an auxiliary LLM call
on every agent turn that touched files. On hermes that would route
through the main model by default (`auxiliary_client._resolve_auto()` is
main-model-first), which is real money on reasoning models. A separate
PR can wire layer 2 to a cheap auxiliary model with explicit opt-in.
* **No agentic commit review.** Anthropic's layer 3 spawns an SDK subagent
with `Read`/`Grep`/`Glob` to trace data flow on `git commit`. That's a
follow-up that would build on `delegate_task`.
* **No project-local rules file.** Anthropic's `.claude/claude-security-guidance.md`
is read by their layer 2/3 LLM prompts, not the pattern scanner. We can
add an analogous `.hermes/security-guidance.md` once layer 2 lands.
## Limitations
This is a best-effort assistive tool. Pattern matching can miss
vulnerabilities and produce false positives. Treat warnings as suggestions,
not a substitute for code review, SAST, dependency scanning, or pen testing.
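
Both failure modes are easy to reproduce with the `eval(` regex from `patterns.py`. The two sample snippets below are made up for illustration.

```python
import re

EVAL_RULE = re.compile(r"(?<![a-zA-Z0-9_\.])eval\(")

# False negative: aliasing the builtin hides the call from the pattern.
missed = "run = eval\nrun(user_input)"

# False positive: the text only mentions eval( inside a string literal.
flagged = 'HELP = "do not call eval(x) on user input"'

print(bool(EVAL_RULE.search(missed)))   # False
print(bool(EVAL_RULE.search(flagged)))  # True
```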
## Attribution and licensing
* `patterns.py` is a verbatim fork from
[`anthropics/claude-plugins-official`](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/security-guidance/hooks)
(commit `0bde168`, 2026-05-26), licensed under the
[Apache License 2.0](./LICENSE). See [NOTICE](./NOTICE) for the full
attribution.
* `__init__.py`, `plugin.yaml`, `README.md`, and tests are original work by
NousResearch, MIT-licensed alongside the rest of hermes-agent.


@ -0,0 +1,259 @@
"""security-guidance plugin — fast pattern-matched security warnings on file writes.
Wires one behaviour:
* ``transform_tool_result`` hook scans the *content being written* by
``write_file`` / ``patch`` / ``skill_manage`` (write/patch modes) for known
dangerous code patterns (eval(, pickle.load, yaml.load, os.system,
subprocess(shell=True), dangerouslySetInnerHTML, verify=False, ECB,
XXE-prone XML parsers, GitHub Actions ``${{ github.event.* }}`` injection,
torch.load without ``weights_only=True``, ...). When any pattern matches,
the plugin appends a ``⚠️ Security guidance`` block to the JSON tool-result
string. The file is still written; the model sees the warning in the next
turn's tool message and can self-correct.
Why not block? Patterns have a non-trivial false-positive rate (``eval(`` in
a tokenizer, ``yaml.load`` already called with ``Loader=yaml.SafeLoader``,
ECB inside a test fixture). Blocking would force every false positive into
an approval prompt or an interrupted workflow. Warning is the right severity
for layer 1: the agent reads the warning and either fixes the code or
briefly documents why the construct is safe.
For block-mode (refuse the write entirely), set
``SECURITY_GUIDANCE_BLOCK=1``. This trades convenience for strictness and
is intended for shared dev environments where unsafe-by-default patterns
are policy violations.
Pattern data lives in ``patterns.py``, forked verbatim from Anthropic's
``claude-plugins-official`` under Apache-2.0. See ``LICENSE`` and ``NOTICE``
in this directory.
"""
from __future__ import annotations
import json
import logging
import os
import re
from typing import Any, Dict, List, Optional, Tuple
from . import patterns as _patterns
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
# Tool names whose args carry "code being written to disk" we want to scan.
# Maps tool name -> (path_arg_name, content_arg_names). For tools with multiple
# possible content fields (patch's old/new_string vs raw patch text), we scan
# every populated string field.
_TARGET_TOOLS: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "write_file": ("path", ("content",)),
    "patch": ("path", ("new_string", "patch")),
    # skill_manage write_file / patch sub-actions land here. file_path holds
    # the relative path inside the skill dir; we scan it the same way.
    "skill_manage": ("file_path", ("file_content", "new_string")),
}
# Cap on how much content we scan. Above this we skip — pattern matching a
# 10 MB blob has poor signal-to-noise and would slow down the agent loop.
_MAX_SCAN_BYTES = 256 * 1024


def _block_mode_enabled() -> bool:
    return os.environ.get("SECURITY_GUIDANCE_BLOCK", "").lower() in {"1", "true", "yes", "on"}


def _plugin_disabled() -> bool:
    return os.environ.get("SECURITY_GUIDANCE_DISABLE", "").lower() in {"1", "true", "yes", "on"}
# ---------------------------------------------------------------------------
# Scanning
# ---------------------------------------------------------------------------
# Pre-compile the regex patterns once. Substring patterns stay as plain
# strings — ``str.__contains__`` is faster than a regex of literal chars.
_COMPILED: List[Dict[str, Any]] = []
for _rule in _patterns.SECURITY_PATTERNS:
    _entry: Dict[str, Any] = {
        "ruleName": _rule["ruleName"],
        "reminder": _rule["reminder"],
        "path_filter": _rule.get("path_filter"),
        "path_check": _rule.get("path_check"),
        "substrings": tuple(_rule.get("substrings", ())),
        "regex": None,
    }
    _re_src = _rule.get("regex")
    if _re_src:
        try:
            _entry["regex"] = re.compile(_re_src)
        except re.error as _err:
            # A rule with a broken regex is dropped entirely, not kept as
            # a substring-only rule.
            logger.warning(
                "security-guidance: skipping rule %s \u2014 invalid regex %r: %s",
                _rule["ruleName"], _re_src, _err,
            )
            continue
    _COMPILED.append(_entry)


def _scan_content(path: str, content: str) -> List[Tuple[str, str]]:
    """Return [(ruleName, reminder), ...] for every pattern that matches.

    ``path`` is used by per-rule path filters (path_filter / path_check).
    Each rule fires at most once per call; multiple matches of the same
    rule collapse into a single warning entry.
    """
    if not content or len(content.encode("utf-8", errors="ignore")) > _MAX_SCAN_BYTES:
        return []
    hits: List[Tuple[str, str]] = []
    for entry in _COMPILED:
        # path_check: rule fires PURELY on path match (no content regex). Used
        # for blanket "you're editing a sensitive file, here are reminders"
        # warnings; github_actions_workflow is the canonical example.
        path_check = entry.get("path_check")
        if path_check is not None:
            try:
                if path_check(path or ""):
                    hits.append((entry["ruleName"], entry["reminder"]))
            except Exception:
                pass
            # Path-check rules don't also pattern-match content; move on.
            continue
        # path_filter: rule is skipped when the path filter returns False
        # (e.g. Python-only rules skip .js files; eval_injection skips .md)
        path_filter = entry.get("path_filter")
        if path_filter is not None:
            try:
                if not path_filter(path or ""):
                    continue
            except Exception:
                continue
        matched = False
        for sub in entry["substrings"]:
            if sub in content:
                matched = True
                break
        if not matched and entry["regex"] is not None:
            if entry["regex"].search(content):
                matched = True
        if matched:
            hits.append((entry["ruleName"], entry["reminder"]))
    return hits


def _extract_path_and_content(tool_name: str, args: Any) -> List[Tuple[str, str]]:
    """Return [(path, content), ...] for a tool call. Empty if nothing to scan."""
    spec = _TARGET_TOOLS.get(tool_name)
    if spec is None or not isinstance(args, dict):
        return []
    path_key, content_keys = spec
    path = args.get(path_key) or ""
    if not isinstance(path, str):
        path = ""
    out: List[Tuple[str, str]] = []
    for ck in content_keys:
        val = args.get(ck)
        if isinstance(val, str) and val:
            out.append((path, val))
    return out


def _format_warning_block(findings: List[Tuple[str, str]]) -> str:
    """Render findings into a Markdown block appended to the tool result."""
    names = ", ".join(name for name, _ in findings)
    lines = [
        "",
        "---",
        f"⚠️ Security guidance \u2014 {len(findings)} pattern{'s' if len(findings) != 1 else ''} matched ({names})",
        "",
    ]
    for _, reminder in findings:
        lines.append(reminder)
        lines.append("")
    lines.append(
        "Pattern matches can be false positives. If the construct is safe in this "
        "context, briefly document why in a code comment and continue. Otherwise, "
        "fix the code before moving on."
    )
    return "\n".join(lines)
# ---------------------------------------------------------------------------
# Hooks
# ---------------------------------------------------------------------------
def _scan_args(tool_name: str, args: Any) -> List[Tuple[str, str]]:
    """Common scan path used by both pre_tool_call (block mode) and
    transform_tool_result (warn mode)."""
    if _plugin_disabled():
        return []
    findings: List[Tuple[str, str]] = []
    for path, content in _extract_path_and_content(tool_name, args):
        findings.extend(_scan_content(path, content))
    return findings


def _on_pre_tool_call(
    tool_name: str = "",
    args: Any = None,
    **_: Any,
) -> Optional[Dict[str, str]]:
    """In block mode, refuse the write if any pattern matches.

    Default mode is non-blocking: we return None here and let
    ``transform_tool_result`` append a warning to the result instead.
    """
    if not _block_mode_enabled():
        return None
    findings = _scan_args(tool_name, args)
    if not findings:
        return None
    return {
        "action": "block",
        "message": (
            "security-guidance refused this write: "
            + _format_warning_block(findings)
            + "\n\nTo override, unset SECURITY_GUIDANCE_BLOCK and retry."
        ),
    }


def _on_transform_tool_result(
    tool_name: str = "",
    args: Any = None,
    result: Any = None,
    **_: Any,
) -> Optional[str]:
    """Warn-mode hook: append a security-warning block to the tool result.

    Returning a string replaces the result that the model sees in the next
    turn. Returning None leaves the result unchanged.
    """
    # Block mode handles findings via pre_tool_call; nothing for this hook
    # to do in that case (the tool didn't run, so there's no result to wrap).
    if _block_mode_enabled():
        return None
    findings = _scan_args(tool_name, args)
    if not findings:
        return None
    if not isinstance(result, str):
        return None
    # Don't decorate error results; the model already has bigger problems.
    try:
        parsed = json.loads(result)
        if isinstance(parsed, dict) and "error" in parsed and len(parsed) <= 2:
            return None
    except (ValueError, TypeError):
        pass
    return result + "\n\n" + _format_warning_block(findings)


def register(ctx) -> None:
    ctx.register_hook("pre_tool_call", _on_pre_tool_call)
    ctx.register_hook("transform_tool_result", _on_transform_tool_result)


@ -0,0 +1,368 @@
"""
Regex-based security pattern definitions for the security-guidance plugin.
Pure data + one pure helper. No env-var reads, no I/O; kept side-effect-free
so it can be imported in isolation.
Forked verbatim from Anthropic's claude-plugins-official repository
(plugins/security-guidance/hooks/patterns.py) under the Apache License 2.0:
https://github.com/anthropics/claude-plugins-official
Copyright (c) Anthropic, PBC. and the security-guidance contributors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Modifications by NousResearch for the Hermes Agent plugin port:
- none to the pattern data itself; this file is byte-for-byte the upstream
patterns.py at commit 0bde168 (2026-05-26). Hermes-side wiring lives in
__init__.py.
"""
from enum import IntEnum
_JS_EXTS = (".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs", ".mts", ".cts", ".vue", ".svelte")
_PY_EXTS = (".py", ".pyi", ".ipynb")
_DOC_EXTS = (".md", ".mdx", ".txt", ".rst", ".json", ".yaml", ".yml")
_UNSAFE_DESERIALIZATION_REMINDER = """⚠️ Security Warning: Loading pickle data (or equivalents: cPickle, cloudpickle, dill, marshal, shelve, joblib, pandas.read_pickle, numpy with allow_pickle=True) from untrusted sources allows arbitrary code execution.
For simple data, prefer JSON or msgspec. For typed objects, prefer a schema-validated deserializer (msgspec.Struct, pydantic, marshmallow) that constructs only declared types.
If this is safe or is explicitly needed, briefly document that in a comment before continuing."""
_UNSAFE_YAML_LOAD_REMINDER = """⚠️ Security Warning: yaml.load() / yaml.unsafe_load() execute arbitrary Python via !!python/object tags.
Use yaml.safe_load() if the file only contains simple data structures (dicts, lists, strings, numbers). If you need typed objects, parse with safe_load and validate the result against a schema (pydantic, msgspec, marshmallow); never use a custom Loader that constructs arbitrary types."""
_UNSAFE_TORCH_LOAD_REMINDER = """⚠️ Security Warning: torch.load() defaults to weights_only=False, which unpickles arbitrary Python objects and allows arbitrary code execution.
If the file only contains tensors and simple data structures, pass weights_only=True (or set TORCH_FORCE_WEIGHTS_ONLY_LOAD=1)."""
# Security patterns configuration
SECURITY_PATTERNS = [
{
"ruleName": "github_actions_workflow",
"path_check": lambda path: ".github/workflows/" in path
and (path.endswith(".yml") or path.endswith(".yaml")),
"reminder": """⚠️ Security Warning: You are editing a GitHub Actions workflow file. Be aware of these security risks:
1. **Command Injection**: Never use untrusted input (like issue titles, PR descriptions, commit messages) directly in run: commands without proper escaping
2. **Use environment variables**: Instead of ${{ github.event.issue.title }}, use env: with proper quoting
3. **Review the guide**: https://github.blog/security/vulnerability-research/how-to-catch-github-actions-workflow-injections-before-attackers-do/
Example of UNSAFE pattern to avoid:
  run: echo "${{ github.event.issue.title }}"
Example of SAFE pattern:
  env:
    TITLE: ${{ github.event.issue.title }}
  run: echo "$TITLE"
Other risky inputs to be careful with:
- github.event.issue.body
- github.event.pull_request.title
- github.event.pull_request.body
- github.event.comment.body
- github.event.review.body
- github.event.review_comment.body
- github.event.pages.*.page_name
- github.event.commits.*.message
- github.event.head_commit.message
- github.event.head_commit.author.email
- github.event.head_commit.author.name
- github.event.commits.*.author.email
- github.event.commits.*.author.name
- github.event.pull_request.head.ref
- github.event.pull_request.head.label
- github.event.pull_request.head.repo.default_branch
- github.event.client_payload.* (repository_dispatch events: attacker can set any field)
- github.head_ref
4. **Ref injection**: Never use untrusted input in `ref:` parameters of `actions/checkout`. For `client_payload.pr_number`, validate it matches `^[0-9]+$` before using in `ref: refs/pull/${{ ... }}/head`""",
},
{
"ruleName": "child_process_exec",
# Gate to JS/TS files — bare `exec(` otherwise fires on Python's
# exec() and on prose/docstrings mentioning exec.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["child_process.exec", "execSync("],
"regex": r"(?<![a-zA-Z0-9_\.])exec\(",
"reminder": """⚠️ Security Warning: Using child_process.exec() can lead to command injection vulnerabilities.
exec() runs the command string through a shell, so any user input interpolated into it can inject arbitrary commands. Prefer child_process.execFile() (or spawn()) with an argument array instead of building a shell string.
Instead of:
exec(`command ${userInput}`)
Use:
import { execFile } from 'node:child_process'
execFile('command', [userInput], callback)
Why execFile/spawn with an argument array is safer:
- No shell is involved, so shell metacharacters in arguments are not interpreted
- Arguments are passed directly to the program rather than interpolated into a command string
Only use exec() if you absolutely need shell features and the input is guaranteed to be safe.""",
},
{
"ruleName": "new_function_injection",
"substrings": ["new Function"],
"reminder": "\u26a0\ufe0f Security Warning: Using new Function() with string interpolation is a CODE INJECTION vulnerability. If any variable is concatenated or interpolated into the function body string, an attacker controlling that variable can execute arbitrary code. Use safe alternatives: for property access use obj[key] or array.reduce((o, k) => o[k], root); for computation use a safe expression parser. NEVER interpolate untrusted strings into new Function() bodies.",
},
{
"ruleName": "eval_injection",
# Lookbehind excludes `.` so method calls like PyTorch model.eval(),
# redis.eval(), spec.eval() don't match. Skip doc/prose files.
"path_filter": lambda p: not p.endswith(_DOC_EXTS),
"regex": r"(?<![a-zA-Z0-9_\.])eval\(",
"reminder": "⚠️ Security Warning: eval() executes arbitrary code and is a major security risk. Use JSON.parse() for data, ast.literal_eval() for Python literals, or a safe expression parser. If this is safe or is explicitly needed, briefly document that in a comment before continuing.",
},
{
"ruleName": "react_dangerously_set_html",
"substrings": ["dangerouslySetInnerHTML"],
"reminder": "⚠️ Security Warning: dangerouslySetInnerHTML can lead to XSS vulnerabilities if used with untrusted content. Ensure all content is properly sanitized using an HTML sanitizer library like DOMPurify, or use safe alternatives.",
},
{
"ruleName": "document_write_xss",
"substrings": ["document.write"],
"reminder": "⚠️ Security Warning: document.write() can be exploited for XSS attacks and has performance issues. Use DOM manipulation methods like createElement() and appendChild() instead.",
},
{
"ruleName": "innerHTML_xss",
"substrings": [".innerHTML =", ".innerHTML="],
"reminder": "⚠️ Security Warning: Setting innerHTML with untrusted content can lead to XSS vulnerabilities. Use textContent for plain text or safe DOM methods for HTML content. If you need HTML support, consider using an HTML sanitizer library such as DOMPurify.",
},
{
"ruleName": "pickle_deserialization",
# Match deserialization only (load/loads/Unpickler). pickle.dump is
# not the RCE surface. `pkl_load` needs a word boundary so similarly
# named safe loaders don't match.
"path_filter": lambda p: p.endswith(_PY_EXTS),
"regex": r"(?<![a-zA-Z0-9_])pickle\.(loads?|Unpickler)\b|(?<![a-zA-Z0-9_])pkl_load\(",
"reminder": _UNSAFE_DESERIALIZATION_REMINDER,
},
{
"ruleName": "os_system_injection",
"path_filter": lambda p: p.endswith(_PY_EXTS),
"regex": r"\bos\.system\s*\(",
"substrings": ["from os import system"],
"reminder": "⚠️ Security Warning: os.system() runs a shell and is a command-injection sink. Use subprocess.run([...]) with a list of arguments instead. If this is safe or is explicitly needed, briefly document that in a comment before continuing.",
},
{
"ruleName": "python_subprocess_shell",
"regex": r"subprocess\.(?:run|call|Popen|check_output|check_call)\(.*shell\s*=\s*True",
"reminder": """⚠️ Security Warning: Using subprocess with shell=True enables command injection.
UNSAFE:
subprocess.run(f"ls {user_input}", shell=True)
subprocess.call("grep " + pattern, shell=True)
SAFE - pass arguments as a list without shell:
subprocess.run(["ls", user_input])
subprocess.call(["grep", pattern])
When arguments are passed as a list without shell=True, special characters cannot be interpreted as shell metacharacters.""",
},
# =====================================================================
# Go-specific security patterns
# =====================================================================
{
"ruleName": "go_exec_shell_injection",
# Detect exec.Command with shell invocation (sh, bash, /bin/sh, /bin/bash)
"regex": r'exec\.Command\(\s*"(?:sh|bash|/bin/sh|/bin/bash)"',
"reminder": """⚠️ Security Warning: Using exec.Command with a shell interpreter (sh/bash) enables command injection.
UNSAFE:
exec.Command("sh", "-c", "ping -c 1 " + host)
exec.Command("bash", "-c", fmt.Sprintf("df -h %s", path))
SAFE - pass arguments directly without a shell:
exec.Command("ping", "-c", "1", host)
exec.Command("df", "-h", path)
When arguments are passed directly (not through a shell), special characters in user input cannot be interpreted as shell metacharacters. This prevents command injection entirely.
Additionally, validate user inputs:
- For hostnames/IPs: use net.ParseIP() or a hostname regex
- For file paths: use filepath.Clean() and verify the result is within an allowed directory
- For numeric values: parse to int/float first""",
},
{
"ruleName": "unsafe_yaml_load",
"regex": r"\byaml\.load\s*\((?![^)\n]{0,80}\bSafe)",
"reminder": _UNSAFE_YAML_LOAD_REMINDER,
},
{
"ruleName": "node_createcipher_no_iv",
"regex": r"\bcrypto\.(createCipher|createDecipher)\b",
"reminder": "⚠️ Security Warning: Use crypto.createCipheriv() / createDecipheriv(). createCipher was removed in Node 22 and derives the key insecurely (no IV, MD5-based KDF).",
},
{
"ruleName": "aes_ecb_mode",
"regex": r"\bAES\.MODE_ECB\b|\bmodes\.ECB\s*\(|[\x22\x27]aes-\d+-ecb[\x22\x27]",
"reminder": "⚠️ Security Warning: Use AES-GCM or AES-CBC with HMAC. ECB mode leaks plaintext structure (identical blocks encrypt to identical ciphertext).",
},
{
"ruleName": "tls_verification_disabled",
"regex": r"\bverify\s*=\s*False\b|rejectUnauthorized\s*:\s*false|InsecureSkipVerify\s*:\s*true|NODE_TLS_REJECT_UNAUTHORIZED\s*=\s*[\x22\x27]?0|ssl\._create_unverified_context|check_hostname\s*=\s*False",
"reminder": "⚠️ Security Warning: Don't disable TLS verification. This allows MITM attacks. For self-signed dev certs, add the CA to your trust store or use a properly-issued cert.",
},
{
"ruleName": "marshal_loads",
"regex": r"\bmarshal\.loads?\s*\(",
"reminder": _UNSAFE_DESERIALIZATION_REMINDER,
},
{
"ruleName": "shelve_open",
"regex": r"\bshelve\.open\s*\(",
"reminder": _UNSAFE_DESERIALIZATION_REMINDER,
},
{
"ruleName": "xml_unsafe_parse",
"regex": r"\b(xml\.etree\.ElementTree|ElementTree|ET)\.(parse|fromstring|XML)\s*\(|\bminidom\.(parse|parseString)\s*\(|\bxml\.sax\.(parse|make_parser)\b",
"reminder": "⚠️ Security Warning: Use defusedxml.ElementTree. Python's stdlib XML parsers are vulnerable to XXE (external entity) and billion-laughs attacks by default.",
},
{
"ruleName": "pickle_variants_load",
"regex": r"\b(cPickle|cloudpickle|dill)\.(load|loads)\s*\(",
"reminder": _UNSAFE_DESERIALIZATION_REMINDER,
},
{
"ruleName": "outerHTML_xss",
"substrings": [".outerHTML =", ".outerHTML="],
"reminder": "⚠️ Security Warning: Use textContent or sanitize with DOMPurify. outerHTML assignment is an XSS sink equivalent to innerHTML.",
},
{
"ruleName": "insertAdjacentHTML_xss",
"substrings": [".insertAdjacentHTML("],
"reminder": "⚠️ Security Warning: Use insertAdjacentText() or sanitize with DOMPurify. insertAdjacentHTML is an XSS sink.",
},
{
"ruleName": "script_src_without_sri",
# Detect external <script src=...> tags that lack Subresource Integrity.
# The negative lookahead right after `<script` rejects tags that carry integrity= anywhere in the next 400 chars.
"regex": (
r"<script\s+(?![^>]{0,400}integrity\s*=)"
r"[^>]{0,200}src\s*=\s*[\x22\x27](?:https?:)?//"
r"[^\x22\x27]{1,300}[\x22\x27]"
r"[^>]{0,100}>"
),
"reminder": '⚠️ Security Warning: Add integrity="sha384-..." crossorigin="anonymous" to external script tags. Loading scripts without Subresource Integrity exposes you to CDN compromise.',
},
{
"ruleName": "torch_unsafe_load",
# Suppressed by weights_only=True on the same line (within 200 chars). weights_only=False
# still triggers. Multi-line calls false-positive — same known limitation as unsafe_yaml_load.
"regex": r"(?:\btorch\.load|\.torch_load)\s*\((?![^)\n]{0,200}weights_only\s*=\s*True)",
"reminder": _UNSAFE_TORCH_LOAD_REMINDER,
},
{
"ruleName": "yaml_unsafe_load_variants",
# yaml.unsafe_load (PyYAML's alias for load with UnsafeLoader) plus unsafe wrapper method names seen in the wild.
# Bare yaml.load() is unsafe_yaml_load's job (RuleId 12).
"regex": r"(?:\byaml\.unsafe_load|\.yaml_unsafe_load)\s*\(",
"reminder": _UNSAFE_YAML_LOAD_REMINDER,
},
{
"ruleName": "pickle_wrapper_load",
# Library APIs that unpickle without saying "pickle". numpy.load only triggers
# when allow_pickle=True is explicit (defaults to False since numpy 1.16.3).
"regex": r"\bjoblib\.load\s*\(|\b(?:pd|pandas)\.read_pickle\s*\(|\.cloudpickle_load\s*\(|\b(?:np|numpy)\.load\s*\([^)\n]{0,200}allow_pickle\s*=\s*True",
"reminder": _UNSAFE_DESERIALIZATION_REMINDER,
},
]
class RuleId(IntEnum):
"""
Stable numeric IDs for SECURITY_PATTERNS rules, emitted via the PostToolUse
metrics field so telemetry can attribute pattern-warning events to
specific checks. The metrics schema only allows bool|number values (no
strings), so rule names can't be sent directly.
Values are frozen: do not renumber existing entries. Append new ones.
"""
GITHUB_ACTIONS_WORKFLOW = 1
CHILD_PROCESS_EXEC = 2
NEW_FUNCTION_INJECTION = 3
EVAL_INJECTION = 4
REACT_DANGEROUSLY_SET_HTML = 5
DOCUMENT_WRITE_XSS = 6
INNERHTML_XSS = 7
PICKLE_DESERIALIZATION = 8
OS_SYSTEM_INJECTION = 9
PYTHON_SUBPROCESS_SHELL = 10
GO_EXEC_SHELL_INJECTION = 11
UNSAFE_YAML_LOAD = 12
NODE_CREATECIPHER_NO_IV = 13
AES_ECB_MODE = 14
TLS_VERIFICATION_DISABLED = 15
MARSHAL_LOADS = 16
SHELVE_OPEN = 17
XML_UNSAFE_PARSE = 18
PICKLE_VARIANTS_LOAD = 19
OUTERHTML_XSS = 20
INSERTADJACENTHTML_XSS = 21
SCRIPT_SRC_WITHOUT_SRI = 22
TORCH_UNSAFE_LOAD = 23
YAML_UNSAFE_LOAD_VARIANTS = 24
PICKLE_WRAPPER_LOAD = 25
_RULE_NAME_TO_ID = {
"github_actions_workflow": RuleId.GITHUB_ACTIONS_WORKFLOW,
"child_process_exec": RuleId.CHILD_PROCESS_EXEC,
"new_function_injection": RuleId.NEW_FUNCTION_INJECTION,
"eval_injection": RuleId.EVAL_INJECTION,
"react_dangerously_set_html": RuleId.REACT_DANGEROUSLY_SET_HTML,
"document_write_xss": RuleId.DOCUMENT_WRITE_XSS,
"innerHTML_xss": RuleId.INNERHTML_XSS,
"pickle_deserialization": RuleId.PICKLE_DESERIALIZATION,
"os_system_injection": RuleId.OS_SYSTEM_INJECTION,
"python_subprocess_shell": RuleId.PYTHON_SUBPROCESS_SHELL,
"go_exec_shell_injection": RuleId.GO_EXEC_SHELL_INJECTION,
"unsafe_yaml_load": RuleId.UNSAFE_YAML_LOAD,
"node_createcipher_no_iv": RuleId.NODE_CREATECIPHER_NO_IV,
"aes_ecb_mode": RuleId.AES_ECB_MODE,
"tls_verification_disabled": RuleId.TLS_VERIFICATION_DISABLED,
"marshal_loads": RuleId.MARSHAL_LOADS,
"shelve_open": RuleId.SHELVE_OPEN,
"xml_unsafe_parse": RuleId.XML_UNSAFE_PARSE,
"pickle_variants_load": RuleId.PICKLE_VARIANTS_LOAD,
"outerHTML_xss": RuleId.OUTERHTML_XSS,
"insertAdjacentHTML_xss": RuleId.INSERTADJACENTHTML_XSS,
"script_src_without_sri": RuleId.SCRIPT_SRC_WITHOUT_SRI,
"torch_unsafe_load": RuleId.TORCH_UNSAFE_LOAD,
"yaml_unsafe_load_variants": RuleId.YAML_UNSAFE_LOAD_VARIANTS,
"pickle_wrapper_load": RuleId.PICKLE_WRAPPER_LOAD,
}
# Fail loudly at import time if a pattern is added without a RuleId.
# This fires in pytest on every PR, so desync is caught before merge.
assert set(_RULE_NAME_TO_ID) == {p["ruleName"] for p in SECURITY_PATTERNS}, (
f"RuleId enum out of sync with SECURITY_PATTERNS: "
f"missing={set(p['ruleName'] for p in SECURITY_PATTERNS) - set(_RULE_NAME_TO_ID)}, "
f"extra={set(_RULE_NAME_TO_ID) - set(p['ruleName'] for p in SECURITY_PATTERNS)}"
)
def rule_names_to_mask(rule_names):
    """Pack a set of rule names into a bitmask. Bit N set means RuleId(N) matched.
    User-defined patterns (rule_name starting with "user:") have no static
    RuleId and are excluded from the mask."""
    mask = 0
    for name in rule_names:
        if name in _RULE_NAME_TO_ID:
            mask |= 1 << _RULE_NAME_TO_ID[name]
    return mask

@ -0,0 +1,7 @@
name: security-guidance
version: "0.1.0"
description: "Append security warnings to file-write tool results when the new content contains known-dangerous patterns (pickle.load, yaml.load, eval(, os.system, dangerouslySetInnerHTML, verify=False, ECB, XXE, GitHub Actions injection, ...). 25 regex/substring rules forked from Anthropic's claude-plugins-official under Apache-2.0. Non-blocking — the file is written and the warning rides back to the model in the next turn so it can self-correct."
author: "Anthropic (patterns, Apache-2.0) / NousResearch (Hermes plugin port)"
hooks:
- transform_tool_result
- pre_tool_call

@ -0,0 +1,334 @@
"""Tests for the security-guidance plugin.
Covers ``plugins/security-guidance/``:
* ``patterns.py`` data integrity: every rule has a ``RuleId`` and the
fail-loud import assertion is wired.
* ``_scan_content``: true positives (pickle.load, yaml.load, eval,
dangerouslySetInnerHTML, GitHub Actions workflow), true negatives
(.md skips Python rules, ``model.eval()`` doesn't trip eval),
path-only rules (``path_check``), and path-gated content rules
(``path_filter``).
* Hooks: ``transform_tool_result`` appends a warning block in warn
mode and stays out of error results; ``pre_tool_call`` blocks
writes when ``SECURITY_GUIDANCE_BLOCK=1`` and stays silent
otherwise.
* Bundled-plugin discovery via ``PluginManager.discover_and_load``.
"""
import importlib
import importlib.util
import json
import sys
import types
from pathlib import Path
import pytest
@pytest.fixture(autouse=True)
def _isolate_env(tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("SECURITY_GUIDANCE_BLOCK", raising=False)
monkeypatch.delenv("SECURITY_GUIDANCE_DISABLE", raising=False)
yield hermes_home
# ---------------------------------------------------------------------------
# Module loading
# ---------------------------------------------------------------------------
def _repo_root() -> Path:
return Path(__file__).resolve().parents[2]
def _load_patterns():
"""Import patterns.py in isolation (no plugin glue)."""
pat_path = _repo_root() / "plugins" / "security-guidance" / "patterns.py"
spec = importlib.util.spec_from_file_location(
"security_guidance_patterns_under_test", pat_path
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
return mod
def _load_plugin_init():
"""Import the plugin __init__.py with patterns.py as a sibling."""
plugin_dir = _repo_root() / "plugins" / "security-guidance"
if "hermes_plugins" not in sys.modules:
ns = types.ModuleType("hermes_plugins")
ns.__path__ = []
sys.modules["hermes_plugins"] = ns
spec = importlib.util.spec_from_file_location(
"hermes_plugins.security_guidance",
plugin_dir / "__init__.py",
submodule_search_locations=[str(plugin_dir)],
)
mod = importlib.util.module_from_spec(spec)
mod.__package__ = "hermes_plugins.security_guidance"
mod.__path__ = [str(plugin_dir)]
sys.modules["hermes_plugins.security_guidance"] = mod
spec.loader.exec_module(mod)
return mod
# ---------------------------------------------------------------------------
# patterns.py data integrity
# ---------------------------------------------------------------------------
class TestPatternsData:
def test_has_at_least_one_rule(self):
p = _load_patterns()
assert len(p.SECURITY_PATTERNS) >= 1
def test_every_rule_has_required_fields(self):
p = _load_patterns()
for rule in p.SECURITY_PATTERNS:
assert "ruleName" in rule
assert "reminder" in rule and rule["reminder"]
# At least one of substrings/regex/path_check must be present —
# otherwise the rule could never fire.
assert any(k in rule for k in ("substrings", "regex", "path_check")), rule
def test_rule_names_are_unique(self):
p = _load_patterns()
names = [r["ruleName"] for r in p.SECURITY_PATTERNS]
assert len(names) == len(set(names))
def test_rule_id_enum_in_sync(self):
# The upstream patterns.py asserts this at import time. If the
# set diverges, the import itself raises and this test fails.
p = _load_patterns()
rule_names = {r["ruleName"] for r in p.SECURITY_PATTERNS}
enum_names = set(p._RULE_NAME_TO_ID)
assert rule_names == enum_names
def test_rule_names_to_mask_packs_bits(self):
p = _load_patterns()
# PICKLE_DESERIALIZATION = 8, EVAL_INJECTION = 4 → bits 8 and 4 set.
mask = p.rule_names_to_mask({"pickle_deserialization", "eval_injection"})
assert mask & (1 << p.RuleId.PICKLE_DESERIALIZATION)
assert mask & (1 << p.RuleId.EVAL_INJECTION)
# ---------------------------------------------------------------------------
# _scan_content
# ---------------------------------------------------------------------------
class TestScanContent:
def test_pickle_load_in_py_warns(self):
mod = _load_plugin_init()
findings = mod._scan_content(
"/tmp/foo.py", "import pickle\nx = pickle.load(open('p.pkl', 'rb'))\n"
)
names = [n for n, _ in findings]
assert "pickle_deserialization" in names
def test_pickle_load_in_md_skipped_by_path_filter(self):
mod = _load_plugin_init()
findings = mod._scan_content(
"/tmp/foo.md", "import pickle\nx = pickle.load(open('p.pkl', 'rb'))\n"
)
assert findings == []
def test_method_call_eval_does_not_trip(self):
"""model.eval() / redis.eval() / spec.eval() must not match eval_injection."""
mod = _load_plugin_init()
findings = mod._scan_content("/tmp/foo.py", "model.eval()\nout = model(x)\n")
assert "eval_injection" not in [n for n, _ in findings]
def test_bare_eval_in_py_warns(self):
mod = _load_plugin_init()
findings = mod._scan_content("/tmp/foo.py", "result = eval(user_input)\n")
assert "eval_injection" in [n for n, _ in findings]
def test_subprocess_shell_true_warns(self):
mod = _load_plugin_init()
findings = mod._scan_content(
"/tmp/foo.py", "subprocess.run('ls ' + path, shell=True)\n"
)
assert "python_subprocess_shell" in [n for n, _ in findings]
def test_dangerously_set_inner_html_warns(self):
mod = _load_plugin_init()
findings = mod._scan_content(
"/tmp/foo.tsx", "<div dangerouslySetInnerHTML={{__html: x}} />"
)
assert "react_dangerously_set_html" in [n for n, _ in findings]
def test_github_workflow_path_check_fires_on_path_alone(self):
"""github_actions_workflow has no regex/substring — fires on path."""
mod = _load_plugin_init()
findings = mod._scan_content(
".github/workflows/test.yml", "name: CI\non: pull_request"
)
assert "github_actions_workflow" in [n for n, _ in findings]
def test_non_workflow_path_doesnt_trip_workflow_rule(self):
mod = _load_plugin_init()
findings = mod._scan_content("/tmp/foo.py", "name: CI")
assert "github_actions_workflow" not in [n for n, _ in findings]
def test_empty_content_returns_no_findings(self):
mod = _load_plugin_init()
assert mod._scan_content("/tmp/foo.py", "") == []
def test_huge_content_skipped(self):
mod = _load_plugin_init()
# 1 MB of content with a dangerous pattern at the end — scanner caps
# out at _MAX_SCAN_BYTES (256 KB), so this should return [].
big = "x" * (1024 * 1024) + "\npickle.load(open('p.pkl', 'rb'))\n"
assert mod._scan_content("/tmp/foo.py", big) == []
# ---------------------------------------------------------------------------
# Hooks
# ---------------------------------------------------------------------------
class TestTransformToolResultHook:
def test_warns_on_write_file_with_dangerous_content(self):
mod = _load_plugin_init()
args = {
"path": "/tmp/foo.py",
"content": "import pickle\nx = pickle.loads(b)\n",
}
result = mod._on_transform_tool_result(
tool_name="write_file",
args=args,
result='{"success": true, "bytes_written": 30}',
)
assert isinstance(result, str)
assert "Security guidance" in result
assert "pickle_deserialization" in result
# The original JSON should still be there at the start of the string.
assert result.startswith('{"success": true')
def test_no_warn_on_clean_content(self):
mod = _load_plugin_init()
args = {"path": "/tmp/foo.py", "content": "import json\nx = json.loads(b)\n"}
assert (
mod._on_transform_tool_result(
tool_name="write_file", args=args, result='{"success": true}'
)
is None
)
def test_no_warn_when_result_is_error(self):
mod = _load_plugin_init()
args = {"path": "/tmp/foo.py", "content": "pickle.load(f)\n"}
# When the tool itself errored, we don't pile a security warning on
# top — the model has bigger problems to solve.
assert (
mod._on_transform_tool_result(
tool_name="write_file", args=args, result='{"error": "boom"}'
)
is None
)
def test_patch_tool_new_string_scanned(self):
mod = _load_plugin_init()
args = {
"path": "/tmp/foo.py",
"old_string": "x = 1",
"new_string": "x = eval(user_input)",
}
result = mod._on_transform_tool_result(
tool_name="patch", args=args, result='{"success": true}'
)
assert isinstance(result, str)
assert "eval_injection" in result
def test_untargeted_tool_skipped(self):
mod = _load_plugin_init()
# The plugin only scans write_file/patch/skill_manage. terminal output
# should pass through untouched.
args = {"command": "echo pickle.load"}
assert (
mod._on_transform_tool_result(
tool_name="terminal", args=args, result='{"output": "pickle.load"}'
)
is None
)
def test_disable_kill_switch(self, monkeypatch):
mod = _load_plugin_init()
monkeypatch.setenv("SECURITY_GUIDANCE_DISABLE", "1")
args = {"path": "/tmp/foo.py", "content": "pickle.load(f)\n"}
assert (
mod._on_transform_tool_result(
tool_name="write_file", args=args, result='{"ok": true}'
)
is None
)
def test_block_mode_makes_transform_hook_quiet(self, monkeypatch):
"""In block mode, pre_tool_call handles the warning; the transform
hook stays silent so we don't double-emit."""
mod = _load_plugin_init()
monkeypatch.setenv("SECURITY_GUIDANCE_BLOCK", "1")
args = {"path": "/tmp/foo.py", "content": "pickle.load(f)\n"}
assert (
mod._on_transform_tool_result(
tool_name="write_file", args=args, result='{"ok": true}'
)
is None
)
class TestPreToolCallHook:
def test_no_block_in_warn_mode(self):
mod = _load_plugin_init()
args = {"path": "/tmp/foo.py", "content": "pickle.load(f)\n"}
assert mod._on_pre_tool_call(tool_name="write_file", args=args) is None
def test_blocks_in_block_mode_on_dangerous_pattern(self, monkeypatch):
mod = _load_plugin_init()
monkeypatch.setenv("SECURITY_GUIDANCE_BLOCK", "1")
args = {"path": "/tmp/foo.py", "content": "pickle.load(f)\n"}
out = mod._on_pre_tool_call(tool_name="write_file", args=args)
assert isinstance(out, dict)
assert out["action"] == "block"
assert "pickle_deserialization" in out["message"]
assert "SECURITY_GUIDANCE_BLOCK" in out["message"] # tells user how to disable
def test_no_block_in_block_mode_on_clean_content(self, monkeypatch):
mod = _load_plugin_init()
monkeypatch.setenv("SECURITY_GUIDANCE_BLOCK", "1")
args = {"path": "/tmp/foo.py", "content": "import json\n"}
assert mod._on_pre_tool_call(tool_name="write_file", args=args) is None
def test_untargeted_tool_skipped(self, monkeypatch):
mod = _load_plugin_init()
monkeypatch.setenv("SECURITY_GUIDANCE_BLOCK", "1")
args = {"command": "echo pickle.load(f)"}
assert mod._on_pre_tool_call(tool_name="terminal", args=args) is None
# ---------------------------------------------------------------------------
# Bundled-plugin discovery
# ---------------------------------------------------------------------------
class TestPluginDiscovery:
def test_loads_via_plugin_manager(self, _isolate_env, monkeypatch):
"""End-to-end: enable in config.yaml and verify the PluginManager
picks it up via the standard discovery path."""
import yaml
config = {"plugins": {"enabled": ["security-guidance"]}}
(_isolate_env / "config.yaml").write_text(yaml.safe_dump(config))
# Wipe any cached plugin state from earlier tests in this worker.
for k in list(sys.modules):
if k.startswith(("hermes_plugins", "hermes_cli.plugins")):
del sys.modules[k]
from hermes_cli.plugins import _ensure_plugins_discovered
mgr = _ensure_plugins_discovered(force=True)
loaded = set()
if hasattr(mgr, "_plugins"):
loaded = set(mgr._plugins.keys())
assert "security-guidance" in loaded

@ -56,6 +56,7 @@ The repo ships these bundled plugins under `plugins/`. All are opt-in — enable
| Plugin | Kind | Purpose |
|---|---|---|
| `disk-cleanup` | hooks + slash command | Auto-track ephemeral files and clean them on session end |
| `security-guidance` | hooks | Pattern-match dangerous code on `write_file`/`patch`/`skill_manage` and append a security warning (or block the write); 25 rules (Apache-2.0 fork of Anthropic's `claude-plugins-official` patterns) |
| `observability/langfuse` | hooks | Trace turns / LLM calls / tools to [Langfuse](https://langfuse.com) |
| `spotify` | backend (7 tools) | Native Spotify playback, queue, search, playlists, albums, library |
| `google_meet` | standalone | Join Meet calls, live-caption transcription, optional realtime duplex audio |
@ -115,6 +116,28 @@ Auto-tracks and removes ephemeral files created during sessions — test scripts
**Disabling again:** `hermes plugins disable disk-cleanup`.
### security-guidance
Fast pattern-matched security warnings on file writes. When the agent's `write_file` / `patch` / `skill_manage` calls carry content matching a known-dangerous code pattern, the plugin appends a `⚠️ Security guidance` block to the tool's result. The patterns include `pickle.load`, `yaml.load` without `SafeLoader`, `eval(`, `os.system`, `subprocess` calls with `shell=True`, JS `child_process.exec`, React `dangerouslySetInnerHTML`, raw `.innerHTML =` / `.outerHTML =` / `document.write`, Node `crypto.createCipher`, AES ECB mode, disabled TLS verification, XXE-prone `xml.etree` / `minidom` parsers, `<script src="//...">` without SRI, `torch.load` without `weights_only=True`, and GitHub Actions `${{ github.event.* }}` injection.
The file is still written. The model reads the warning in the next turn's tool message and can either fix the code or document why the construct is safe in this context. Pattern matching has a non-trivial false-positive rate, which is why warn (not block) is the default.
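To make the warn-mode flow concrete, here is a minimal sketch of how a rule table of this shape could be scanned and how a warning could be appended to a tool result. It is an illustration under assumptions, not the plugin's actual code: `RULES`, `scan`, and `append_warning` are hypothetical names, the reminders are shortened, and the doc-extension tuple is made up. Only the rule keys (`ruleName`, `path_filter`, `substrings`, `regex`, `reminder`) and the `eval(` lookbehind come from the pattern data.

```python
import re

# Hypothetical two-rule table using the same keys as the pattern data.
RULES = [
    {
        "ruleName": "eval_injection",
        # Assumed doc extensions; the real list lives in the plugin.
        "path_filter": lambda p: not p.endswith((".md", ".rst", ".txt")),
        # The lookbehind skips method calls such as model.eval().
        "regex": r"(?<![a-zA-Z0-9_\.])eval\(",
        "reminder": "eval() executes arbitrary code.",
    },
    {
        "ruleName": "react_dangerously_set_html",
        "substrings": ["dangerouslySetInnerHTML"],
        "reminder": "dangerouslySetInnerHTML can lead to XSS.",
    },
]


def scan(path, content):
    """Return (ruleName, reminder) pairs for every rule that matches."""
    findings = []
    for rule in RULES:
        path_filter = rule.get("path_filter")
        if path_filter is not None and not path_filter(path):
            continue  # rule does not apply to this file type
        hit = any(s in content for s in rule.get("substrings", ()))
        if not hit and "regex" in rule:
            hit = re.search(rule["regex"], content) is not None
        if hit:
            findings.append((rule["ruleName"], rule["reminder"]))
    return findings


def append_warning(result, findings):
    """Warn mode: keep the original result and append a warning block."""
    if not findings:
        return None  # assumed convention: None leaves the result unchanged
    lines = ["", "⚠️ Security guidance:"]
    lines += [f"- [{name}] {reminder}" for name, reminder in findings]
    return result + "\n".join(lines)
```

Calling `scan("/tmp/foo.py", "model.eval()")` returns an empty list because of the lookbehind, while a bare `eval(user_input)` in the same file produces one finding. Appending rather than replacing keeps the tool's original JSON at the start of the result string.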
**Coverage:** 25 rules total, covering unsafe deserialization, command injection, XSS sinks, crypto footguns, XXE, supply-chain (SRI), and CI/CD workflow injection. The pattern data is a verbatim Apache-2.0 fork of [Anthropic's `claude-plugins-official`](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/security-guidance/hooks) — see the plugin's `LICENSE` and `NOTICE` files for attribution.
**Modes:**
| Env var | Effect |
|---|---|
| (unset) | **warn mode** (default) — file is written, warning appended to result |
| `SECURITY_GUIDANCE_BLOCK=1` | **block mode** — write refused, warning returned as the block reason |
| `SECURITY_GUIDANCE_DISABLE=1` | kill switch — plugin loads but does nothing |
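The table above can be read as a small decision function. The sketch below is one plausible way to resolve the mode from the environment. `resolve_mode` is a hypothetical helper, and two details are assumptions rather than documented behaviour: that the kill switch wins when both variables are set, and that only the literal value `1` turns a variable on.

```python
import os


def resolve_mode(env=None):
    """Map the two environment variables to 'disabled', 'block', or 'warn'."""
    env = os.environ if env is None else env
    # Assumption: the kill switch takes precedence over block mode.
    if env.get("SECURITY_GUIDANCE_DISABLE") == "1":
        return "disabled"
    if env.get("SECURITY_GUIDANCE_BLOCK") == "1":
        return "block"
    return "warn"  # default when neither variable is set
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests without touching the process environment.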
**Enabling:** `hermes plugins enable security-guidance` (or check the box in `hermes plugins`).
**Disabling again:** `hermes plugins disable security-guidance`.
**What it does not do (yet):** the upstream Anthropic plugin has two more layers — an LLM diff review on each agent turn that touched files, and an agentic commit-time review that traces data flow across files. Neither is ported. The agent can already run those reviews on demand via `delegate_task`.
### observability/langfuse
Traces Hermes turns, LLM calls, and tool invocations to [Langfuse](https://langfuse.com) — an open-source LLM observability platform. One span per turn, one generation per API call, one tool observation per tool call. Usage totals, per-type token counts, and cost estimates come out of Hermes' canonical `agent.usage_pricing` numbers, so the Langfuse dashboard sees the same breakdown (input / output / `cache_read_input_tokens` / `cache_creation_input_tokens` / `reasoning_tokens`) that appears in `hermes logs`.