Salvages #25872 by @konsisumer against current main.
NAS users (UGOS, Synology, unRAID) expect the LinuxServer.io
PUID/PGID convention and bind-mount /opt/data from a host directory
owned by their own UID. Without this alias those vars are silently
ignored and the s6-setuidgid drop to UID 10000 leaves the runtime
unable to read the volume. HERMES_UID/HERMES_GID still take
precedence when both are set.
The original PR targeted docker/entrypoint.sh, which is now a 27-line
deprecation shim under s6-overlay (the May 2026 rework moved all
bootstrap logic to docker/stage2-hook.sh, installed as
/etc/cont-init.d/01-hermes-setup). Re-applied the same 2-line
alias resolution at the equivalent spot in stage2-hook.sh just
before the existing UID/GID remap block. Test was retargeted at
docker/stage2-hook.sh; docs hunk adapted to current main's wording
("stage2 hook" + s6-setuidgid, not the obsolete "entrypoint drops
via gosu") with the NAS bind-mount example preserved verbatim.
Test-first regression verification: reverted just docker/stage2-hook.sh
to origin/main and re-ran the new tests. Result:
FAILED test_stage2_hook_resolves_puid_pgid_aliases
FAILED test_puid_pgid_populate_hermes_uid_gid
AssertionError: assert ':' == '1000:10'
That's the exact bug shape — PUID=1000 PGID=10 silently ignored,
HERMES_UID/HERMES_GID stay empty. With the salvage applied, all 4
tests pass.
Closes#25872
Co-authored-by: konsisumer <11262660+konsisumer@users.noreply.github.com>
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
NVIDIA's verified skills catalog (https://github.com/NVIDIA/skills) ships
NVIDIA-signed skills for CUDA-X, AIQ, cuOpt, cuPyNumeric, DeepStream, NeMo,
NemoClaw and the Skill Card Generator — each bundle carrying a detached
`skill.oms.sig` signature, a governance `skill-card.md`, and `evals/`. The
sync pipeline drops any skill missing those artifacts before publishing.
Changes:
- tools/skills_hub.py: add NVIDIA/skills to GitHubSource.DEFAULT_TAPS so
it lights up in `hermes skills browse`, `hermes skills search <q>`, the
twice-daily skills-index build, and the docs-site Skills Hub page
(https://hermes-agent.nousresearch.com/docs/skills) automatically.
- tools/skills_guard.py: add NVIDIA/skills to TRUSTED_REPOS so installs
resolve to trust_level="trusted" (looser install policy than community).
- website/scripts/extract-skills.py: map the `github` source id to a
friendly "NVIDIA" pill label for the docs hub page.
- website/src/pages/skills/index.tsx: register the NVIDIA pill (green
#76b900) and slot it into SOURCE_ORDER after HuggingFace.
- website/docs/user-guide/features/skills.md (+ zh-Hans i18n): document
the new default tap and the expanded trusted-repos list.
- tests/tools/test_skills_guard.py: assert NVIDIA/skills resolves to
"trusted" (including the skills-sh-wrapped form).
- tests/tools/test_skills_hub.py: invariant — every TRUSTED_REPOS entry
must be reachable via GitHubSource.DEFAULT_TAPS (prevents future
trusted repos from being declared but never browseable).
Validation:
- Live GitHub fetch: `src.fetch('NVIDIA/skills/skills/aiq-deploy')` pulled
17 files including SKILL.md (13 KB), skill-card.md, skill.oms.sig, and
the full references/ + evals/ tree. trust_level="trusted".
- Live inspect resolved name, description, and trust correctly.
- All 193 existing skills_guard + skills_hub tests still pass.
The page was last meaningfully rewritten in the pre-s6 (tini) era and had
drifted on five points that no longer matched the image:
1. "Running the dashboard" claimed the entrypoint backgrounds
`hermes dashboard` and prefixes its output with `[dashboard]`. That
was the pre-s6 entrypoint.sh path; under s6 the dashboard is a
supervised s6-rc service (`docker/s6-rc.d/dashboard/run`) with no
sed-prefix pipeline. Rewrote the section accordingly.
2. The default for `HERMES_DASHBOARD_HOST` was documented as
`127.0.0.1`. The s6 run script defaults it to `0.0.0.0`
(`dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"`). Fixed the table
and the surrounding prose.
3. Multi-profile was documented as "not recommended in Docker — run
one container per profile." That advice was load-bearing when
there was no in-container supervisor, but the s6 architecture
explicitly adds per-profile gateway supervision: each profile
created via `hermes profile create <name>` gets a slot under
`/run/service/gateway-<name>/`, the `02-reconcile-profiles`
cont-init script restores them across `docker restart` from
`gateway_state.json`, and `hermes gateway start/stop/restart` is
intercepted by `_dispatch_via_service_manager_if_s6` to route
through `s6-svc`. Pivoted the section to "one container, many
supervised profile gateways" as the default, with a comparison
table and a "When you DO want a separate container" escape
hatch for the genuine resource-isolation / network-segmentation
cases.
4. The Compose example trailer also claimed `[dashboard]` log
prefixing. Replaced with the actual log routing.
5. Added a new "Where the logs go" section covering all four log
surfaces: per-profile gateways (tee'd to `docker logs` AND
`${HERMES_HOME}/logs/gateways/<profile>/current` since PR
b34532319), dashboard (`docker logs`, no prefix), boot reconciler
(`container-boot.log`), and `hermes logs`. The gateway-mode and
Compose sections cross-reference this rather than each carrying
their own routing prose.
Added a new "docker exec automatically drops to the hermes user"
subsection under "What the Dockerfile does", next to the existing
Privilege model warning. Documents the `/opt/hermes/bin/hermes` shim
(landed via the docker-exec privilege-drop work) — operators don't
need to remember `--user hermes` for `docker exec hermes login`,
`docker exec hermes profile create …`, etc. The historical footgun
(`auth.json` written as `root:root`, supervised gateway then can't
read its own auth file) is mentioned only as context for what the
fail-loud `exit 126` is protecting against, not as a problem the
reader needs to solve. The `HERMES_DOCKER_EXEC_AS_ROOT=1` opt-out is
documented for diagnostic sessions.
The "Permission denied" troubleshooting subsection now carries a
single-line pointer to the new section instead of duplicating it.
The `--insecure` framing reflects PR #fb5125362 (opt-in via
`HERMES_DASHBOARD_INSECURE`, not derived from bind host): the OAuth
gate is the authority, the bind host alone never implies
`--insecure`, and opting out is an explicit security trade-off.
Anchors verified resolve. i18n zh-Hans mirror left for the
translation flow to catch up.
Updates the Docker Backend section of the user-guide configuration page
to match the actual behavior shipped in PR #33645. Pre-PR the docs
claimed "container is stopped and removed on shutdown," which was
never quite true for the documented happy path and is now actively
wrong: in default mode the container survives across Hermes processes
so background processes (npm watchers, dev servers, long-running
pytest) carry over the way the "ONE long-lived container shared
across sessions" promise requires.
Changes to `website/docs/user-guide/configuration.md`:
* Reworked the intro paragraph at the top of the Docker Backend
section to describe the actual cross-process reuse contract.
* Expanded the YAML example with the new keys
`docker_persist_across_processes` and `docker_orphan_reaper`, plus
the pre-existing-but-undocumented `docker_env`, `timeout`, and
`lifetime_seconds`. Clarified the `container_persistent` comment
to disambiguate from `docker_persist_across_processes`.
* Added a `docker_env` vs `docker_forward_env` explainer (one
injects literal KEY=value, the other forwards values from the
host/.env — easy to confuse).
* Replaced the one-line "Container lifecycle" paragraph with a full
subsection covering:
- the three labels Hermes tags every container with
(hermes-agent, hermes-task-id, hermes-profile)
- the label-probe reuse mechanism on startup
- a teardown-trigger table with four rows for every situation
that destroys the container in default mode
- edge cases (OOM kill, profile switching)
* Added an "Environment variable overrides" table covering all
TERMINAL_* env vars relevant to the Docker backend, including the
previously-undocumented `TERMINAL_DOCKER_ENV` and
`HERMES_DOCKER_BINARY`.
Changes to `website/docs/user-guide/docker.md`:
* Extended the cross-link admonition (around l.227) so the
Hermes-in-Docker page points at the new terminal-backend keys
(`docker_env`, `docker_persist_across_processes`,
`docker_orphan_reaper`) alongside the ones already mentioned.
No code changes. Behavior already covered by tests added in earlier
commits on this branch (#33645 commits 1-5).
Refs #20561
Regression from PR #33809 (lazy-fetch refactor). The `sources` and
`categoryEntries` useMemo blocks were derived from `allSkillsLocal`
but had empty/incomplete deps arrays — so they computed once at mount
when the catalog was still `[]`, then never recomputed when the fetch
resolved.
Symptom: live site shows only the "All 87,639" source button and
"All Skills 87,639" category — no per-source pills (ClawHub, skills.sh,
LobeHub, etc.) and no category breakdown. Filtering by source/category
is unusable.
Fix: add `allSkillsLocal` to both deps arrays so they recompute when
data arrives. Local build green on en + zh-Hans.
The s6 dashboard run script flipped `--insecure` on whenever
`HERMES_DASHBOARD_HOST` was anything other than 127.0.0.1 / localhost.
That comment ("the dashboard refuses otherwise") predates the OAuth
auth gate: back when it was written, `start_server` would SystemExit
on any non-loopback bind, so the run script's `--insecure` was the
only way to make in-container deployments work at all.
The gate has since been replaced by `should_require_auth(host,
allow_public)`, which engages the OAuth flow when a
`DashboardAuthProvider` is registered (the bundled `dashboard_auth/nous`
provider auto-registers on `HERMES_DASHBOARD_OAUTH_CLIENT_ID`) and
fails closed with a specific operator-facing error when none is. The
host-derived `--insecure` ran upstream of all that and silently
disabled the gate on every container-deployed dashboard.
Most visible under the portal's wildcard-subdomain rollout: every Fly
machine binds 0.0.0.0 so the edge can reach Flycast, every machine
boots with the correct `HERMES_DASHBOARD_OAUTH_CLIENT_ID`, the nous
provider registers — and `/api/status` still returns
`{"auth_required": false, "auth_providers": ["nous"]}` because the
run script disabled the gate before `start_server` ever saw the
request. The dashboard SPA was served to anyone, no `/login` redirect,
no OAuth challenge.
Fix: derive `--insecure` from an explicit opt-in env var,
`HERMES_DASHBOARD_INSECURE` (truthy values matching the rest of the
s6 boolean envs: 1, true, TRUE, True, yes, YES, Yes). Operators on
trusted LANs behind a reverse proxy without the OAuth contract
(the existing `docker-compose.windows.yml` use case) opt in
explicitly; portal-managed agent deployments leave it unset and let
the gate engage.
`docker-compose.windows.yml` already passes `--insecure` on the
`command:` array directly (line 38), so it doesn't depend on the s6
auto-injection. No compose-file change required.
Tests:
* `tests/test_docker_home_override_scripts.py` — extends the existing
static-text guard with a regression assertion that the legacy
host-derived case-statement is gone and the new env-var opt-in is
present (locks against accidental revert).
* `tests/docker/test_dashboard.py` — adds two Docker-in-Docker tests
exercising the actual `/api/status` round-trip:
- 0.0.0.0 bind + `HERMES_DASHBOARD_OAUTH_CLIENT_ID` → gate engaged
- 0.0.0.0 bind + `HERMES_DASHBOARD_INSECURE=1` → gate disabled
Docs:
* `website/docs/user-guide/docker.md` + zh-Hans i18n — adds the new
env var to the table, replaces the stale prose ("the entrypoint
no longer auto-enables insecure mode" — which until this PR was
flat-out wrong) with an accurate description of the gate's
trigger conditions and the explicit opt-out.
shellcheck clean. Python static-text test passes locally. Behavioural
test will run against any future image build (CI's Docker harness).
Adds an optional `messages` keyword to the `MemoryProvider.sync_turn`
contract so external/community memory plugins can receive the OpenAI-style
conversation message list for the completed turn — including assistant tool
calls and tool result content — not just the final assistant text.
Dispatch uses signature inspection (`_provider_sync_accepts_messages`): only
providers that declare a `messages` parameter (or `**kwargs`) receive it; all
existing in-tree providers keep their legacy text-only signature and are
called unchanged. No structured-trace envelope is added to core — providers
reconstruct whatever they need from the standard message list.
Also documents Memori as a standalone community memory provider.
Salvaged from #28065 — rebased onto current main.
Co-authored-by: Dave Heritage <david@memorilabs.ai>
Anthropic released Claude Opus 4.8 on 2026-05-27, available on
OpenRouter, Anthropic, Amazon Bedrock, and Claude Platform on AWS:
- https://openrouter.ai/anthropic/claude-opus-4.8
- https://openrouter.ai/anthropic/claude-opus-4.8-fast
The fast-mode variant is a separate model ID (anthropic/claude-opus-4.8-fast)
priced at 2x of the base model — a notable improvement over the 6x premium
on older Opus generations (4.6/4.7). It is NOT a `speed: "fast"` request
parameter like Opus 4.6; Anthropic's native fast-mode beta still only
covers Opus 4.6.
Changes:
hermes_cli/models.py
- Add anthropic/claude-opus-4.8 + anthropic/claude-opus-4.8-fast to
the OpenRouter fallback snapshot and the Nous Portal curated list
(live catalogs surface them automatically when reachable; the
fallback list matters when the manifest fetch fails).
- Add claude-opus-4-8 to the Anthropic-native picker list.
agent/model_metadata.py
- Register claude-opus-4-8 / claude-opus-4.8 in DEFAULT_CONTEXT_LENGTHS
with 1M tokens (matches 4.6/4.7).
agent/anthropic_adapter.py
- Extend _XHIGH_EFFORT_SUBSTRINGS, _ADAPTIVE_THINKING_SUBSTRINGS, and
_NO_SAMPLING_PARAMS_SUBSTRINGS with "4-8"/"4.8". 4.8 inherits the
Opus 4.7 API contract: adaptive thinking only, xhigh effort level
supported, sampling parameters (temperature/top_p/top_k) return 400.
- Add claude-opus-4-8 to _ANTHROPIC_OUTPUT_LIMITS (128k max output,
same as 4.7). Matches by substring so claude-opus-4-8-fast and
date-stamped variants resolve correctly.
agent/usage_pricing.py
- Add anthropic/claude-opus-4-8: $5/$25 per MTok input/output, $0.50
cache read, $6.25 cache write (same as 4.6/4.7).
- Add anthropic/claude-opus-4-8-fast: $10/$50 per MTok (2x), $1.00
cache read, $12.50 cache write. Per OpenRouter, the 2x premium is
the only differentiator from regular Opus 4.8.
- OpenRouter routes still pull pricing from the live /models API, so
no static OpenRouter entry is needed.
tests/agent/test_model_metadata.py
- Extend the Claude 4.6+ context-length tag list with 4.8/4-8.
website/static/api/model-catalog.json
- Regenerated via `python scripts/build_model_catalog.py` to pick up
the new entries in the OpenRouter and Nous Portal fallback lists.
E2E verification (isolated sys.path import against the worktree):
- _supports_adaptive_thinking, _supports_xhigh_effort, _forbids_sampling_params
all return True for claude-opus-4.8 and claude-opus-4.8-fast.
- _supports_fast_mode (the `speed: "fast"` request-parameter gate) stays
False for 4.8 — fast mode is a separate model ID on OpenRouter, not a
parameter Anthropic accepts on the base model.
- DEFAULT_CONTEXT_LENGTHS resolves 1M for both notations.
- resolve_billing_route + _lookup_official_docs_pricing resolve the
correct $5/$25 (regular) and $10/$50 (fast) pricing for both
dot-notation and dash-notation inputs.
- 4.7 and 4.6 regression: behavior unchanged.
Unit tests: 305 passed across tests/agent/test_usage_pricing.py,
test_model_metadata.py, tests/hermes_cli/test_model_catalog.py,
test_models.py, test_model_validation.py, test_models_dev_preferred_merge.py.
The web_crawl_tool() function was an orphan — no model schema registered
it, no skill or CLI command called it, and the agent had no way to invoke
it. PR #32608 proposed wiring it up as a model-callable tool; we've
decided not to expose crawl as a separate capability since web_search +
web_extract cover the use cases we want models to have.
Removed:
- tools/web_tools.py: web_crawl_tool() (~230 LOC)
- plugins/web/firecrawl/provider.py: supports_crawl() + crawl()
- plugins/web/tavily/provider.py: supports_crawl() + crawl()
- plugins/web/xai/provider.py: supports_crawl() override
- agent/web_search_provider.py: supports_crawl() + crawl() ABC methods
- agent/web_search_registry.py: get_active_crawl_provider() +
the 'crawl' branch in _resolve()
- agent/display.py: web_crawl tool-progress rendering
- hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools
- tools/website_policy.py: stale comment reference
- Tests: removed TestWebCrawlTavily class, the two website-policy
web_crawl tests, the searxng/ddgs/brave-free crawl-error tests,
the integration test_web_crawl method, and the
test_unconfigured_crawl_emits_top_level_error test. Trimmed the
capability-flag parametrize list and the WebSearchProvider ABC
conformance tests.
- Docs: trimmed the Crawl column from capability tables in both EN
and zh-Hans, updated the developer-guide ABC table.
Net: 25 files, +115/-1067.
Closes#33762 (the schema-text bug only existed if #32608 landed).
Supersedes #32608.
PR #33748 grew the live skills index from ~2k skills to ~69k, which made
the previous build-time bundling strategy untenable: the skills page's
JS chunk was about to balloon from ~1MB to ~35MB. Initial page load
on mobile became unusable, search lagged on every keystroke against the
68k-item array, and JSON.parse blocked the main thread at startup.
Three changes:
1. extract-skills.py writes skills.json + skills-meta.json into
website/static/api/ instead of website/src/data/. Static-served by
Vercel as /docs/api/skills.json (gzipped on the wire), same CDN that
already serves skills-index.json.
2. skills/index.tsx drops the static import and fetches both files in
parallel on mount. Loading state shows '…' for the count; failures
surface a small error pill instead of blanking the page.
3. Search is debounced 150ms and runs against a precomputed lowercase
haystack stamped onto each row at load time. Before: array-join +
toLowerCase per row per keystroke on a 68k array. After: single
.includes() per row, deferred until typing settles.
Validation:
| | before | after |
|---|---|---|
| skills.json location | src/data/ (bundled) | static/api/ (CDN) |
| Largest JS chunk | would be ~35MB at 68k skills | 659 KB |
| Initial page render | wait for full parse | immediate, fetch async |
| Per-keystroke filter | join+lowercase x 68k rows | single includes x 68k rows |
| Debounce | none | 150ms |
Built locally for both en and zh-Hans locales; the 34MB skills.json now
lives in build/api/ and is served separately rather than inlined into
the page's bundle.
skills.json and skills-meta.json added to .gitignore — they were already
build artifacts, but the gitignore only listed skills-index.json before.
Kimi K2.6 is natively multimodal — flagged by Shengyuan from the Kimi
growth team. Replace the named-vendor example with a model-agnostic
phrasing so the row doesn't go stale as more vendors ship vision.
Adds first-class `client_cert` / `client_key` config keys so MCP servers
behind mTLS work without an external TLS-terminating proxy. Resolves
inbound community question (Jeremy W.).
Schema (per `mcp_servers.<name>`, HTTP/SSE only):
- `client_cert: "/path/to/combined.pem"` — single PEM with cert + key
- `client_cert: "/path/to/cert"` + `client_key: "/path/to/key"` — separate
- `client_cert: [cert, key]` or `[cert, key, password]` — list form,
with optional passphrase for encrypted keys
Paths support `~` expansion. Missing files raise a server-scoped
`FileNotFoundError` at connect time rather than failing later with an
opaque TLS handshake error.
Wiring:
- New SDK HTTP path (mcp >= 1.24): `cert=` on the user-owned
`httpx.AsyncClient` alongside the existing `verify=` handling.
- SSE path: routed through an `httpx_client_factory` that wraps the
SDK's defaults (follow_redirects=True) and layers `verify` + `cert`
on top. The factory is only injected when needed, so the SDK's
built-in `create_mcp_http_client` keeps being used in the default
case.
- Deprecated mcp<1.24 path left untouched — that SDK's
`streamablehttp_client` signature doesn't expose `cert`, and adding
it would be dead code.
Also documents the previously-undocumented `ssl_verify` key (bool or
CA bundle path) in the MCP config reference.
Tests:
- `tests/tools/test_mcp_client_cert.py` (new, 19 tests):
- `_resolve_client_cert` helper: all three input forms, `~` expansion,
missing-file and validation errors.
- HTTP transport: `cert=` forwarded into `httpx.AsyncClient` for
string and tuple forms; absent when unset; missing-file error
propagates.
- SSE transport: factory only injected when cert or non-default
verify is set; factory applies cert, custom CA bundle, and
preserves `follow_redirects=True` + forwarded headers/auth.
- Existing tests: 200/200 in `test_mcp_tool.py` + `test_mcp_sse_transport.py`
still pass.
Resolves the two Dependabot alerts currently open against the website
lockfile:
- serialize-javascript: pin to ^7.0.5 (was 6.0.2 — high-severity RCE
via RegExp.flags + Date.prototype.to*, plus medium-severity DoS)
- uuid: pin to ^14.0.0 (was 8.3.2 — medium buffer bounds check miss
in v3/v5/v6 when buf is provided)
Lockfile regenerated against current main (not the stale lockfile
from the original PR — several Dependabot bumps for mermaid,
webpack-dev-server, @babel/plugin-transform-modules-systemjs,
fast-uri, lodash-es+langium, lodash, follow-redirects, and dompurify
have landed since #30036 was opened, so the website portion was
re-applied surgically on top of those).
Salvaged the website half of PR #30036. The TUI test half landed
on main separately, so this PR is web-only.
Follow-up to #33583 (the gateway-run-supervised redirect).
Before this fix, the supervised gateway's stdout (most visibly the
"Hermes Gateway Starting…" rich-console banner) was swallowed by
`s6-log` into the rotated file at
`${HERMES_HOME}/logs/gateways/<profile>/current` and never reached
`docker logs`. Operational signal lived in two places:
* **docker logs** — saw stderr (Python `logging` defaults to
stderr), so warnings/errors were visible.
* **the rotated file** — saw stdout (rich banners, `print()`
output, third-party libs that wrote to fd 1).
This was surprising for users coming from the pre-s6 image, where
`docker run … gateway run` produced a single unified stream in
`docker logs`. They'd see partial output, conclude something was
broken, and dig around for the missing pieces.
Fix: add the `1` s6-log action directive before the file destination
so each line is forwarded to s6-log's stdout — which propagates up
the s6-supervise pipeline to /init's stdout = container stdout =
`docker logs`. The file destination is preserved as a second
destination, so the rotated log (with ISO 8601 timestamps) still
exists for `hermes logs` and for survival across container restarts.
Trade-off considered: timestamps. Putting `T` between `1` and the
file destination (not before `1`) means:
* docker logs sees raw lines — Python's logging formatter has its
own timestamps, and `docker logs --timestamps` adds another
layer when desired. No double-stamping in the common reading
path.
* The persisted file gets s6-log's ISO 8601 timestamp so even
output that lacked a Python-logger timestamp (rich banners,
third-party raw prints) is correlatable in `current`.
Verification:
* New unit-test assertion in `test_service_manager.py` locks the
`s6-log 1` directive into the rendered run-script. Mutation-
tested by reverting to the pre-fix script (no `1`); the assert
catches it cleanly.
* New docker-harness test `test_supervised_gateway_stdout_reaches_docker_logs`
builds the image, runs `docker run … gateway run`, and asserts
the unique `⚕` banner glyph reaches `docker logs`. Also verifies
the rotated file still contains the banner (no regression on
the existing file destination). Mutation-tested end-to-end: built
a deliberately-broken image without the `1` directive and the
test failed exactly as designed, citing the banner present in
`current` but absent from `docker logs`.
* `website/docs/user-guide/docker.md` gains a new `:::note Where
gateway logs go` admonition documenting both destinations and
the audit-log file at `${HERMES_HOME}/logs/container-boot.log`.
Existing functionality preserved: every other docker-harness test
still passes against the new image. Unit-test sweep across
`tests/hermes_cli/` (5561 tests) is green.
Pre-s6, `docker run nousresearch/hermes-agent gateway run` was the
standard invocation: gateway ran as the container's main process,
tini reaped zombies, container exit code matched gateway exit code,
no supervision. With s6-overlay as PID 1, the same invocation now
auto-upgrades to supervised semantics — auto-restart on crash,
dashboard supervised alongside (when HERMES_DASHBOARD=1 is set),
multiple profile gateways under the same /init.
Users get the new behavior with zero changes to their docker run
command. A loud one-line breadcrumb on stderr explains the upgrade
and points at the opt-out for users who genuinely want pre-s6
foreground semantics.
How it works:
1. `_gateway_command_inner` (the `gateway run` handler) checks if
we're inside a container with s6 as PID 1.
2. If yes, dispatches `start` to the s6 service manager (registers
and starts gateway-default), then `exec sleep infinity` to keep
the CMD process alive without binding container lifetime to
gateway PID lifetime. The supervised gateway can flap freely;
`docker stop` still tears everything down via /init stage 3.
3. If no, falls through to the existing foreground code path
unchanged. Host runs of `hermes gateway run` are unaffected.
Three gates make the redirect inert outside the intended scope:
* `detect_service_manager() != "s6"` — host/non-s6-container runs.
* `HERMES_S6_SUPERVISED_CHILD=1` env var (recursion guard) —
exported by `S6ServiceManager._render_run_script` for the
s6-supervised invocation itself. Without this guard, the
supervised `gateway run --replace` would re-enter the redirect
and recurse (run → start → run → start → ...) infinitely.
* `--no-supervise` CLI flag OR `HERMES_GATEWAY_NO_SUPERVISE=1` env
var — explicit user opt-out for CI smoke tests, debugging the
foreground startup path, or any case wanting "CMD exit =
container exit" semantics. Strict truthiness (1/true/yes,
case-insensitive); typos like `=0` do NOT silently opt out.
Tests:
* Unit tests in tests/hermes_cli/test_gateway_s6_dispatch.py
cover all five paths (host no-op, supervised fire, sentinel
recursion guard, CLI flag, env var truthy + falsy). The two
load-bearing gates (sentinel + opt-out) were mutation-tested
by removing each gate in isolation and confirming the dedicated
test fails with the expected error.
* Docker harness tests in tests/docker/test_gateway_run_supervised.py
cover the round trips end-to-end against a built image: redirect
fires (sleep-infinity heartbeat + supervised gateway-default
slot + breadcrumb), --no-supervise opt-out (foreground gateway,
no want-up on the slot), HERMES_GATEWAY_NO_SUPERVISE env var
works identically, recursion is impossible (≤1 supervised
python gateway-run + exactly 1 sleep-infinity parented to the
CMD wrapper), and HERMES_DASHBOARD=1 produces both supervised
gateway and supervised dashboard.
Docs:
* Added a `:::tip Gateway runs supervised` admonition near the
main docker.md example explaining the upgrade and pointing at
the opt-out. Pre-s6 (tini-based) images still run gateway run
as the foreground main process, so the note is scoped to the
s6 image only.
Trade-off documented in the helper docstring: container exit code
under the redirect is sleep's exit code (always 0 on SIGTERM), not
the gateway's. That was an explicit design call — the supervised
gateway is allowed to flap without taking the container with it,
which is what "supervision" means. CI users who want exit-code
forwarding can pass --no-supervise.
#33151 flipped THREE Telegram display defaults to false:
- tool_progress: new -> off (kept: per-tool stream is too chatty)
- interim_assistant_messages: T -> F (REVERTED here)
- long_running_notifications: T -> F (REVERTED here)
- busy_ack_detail: T -> F (kept: verbose iteration counter)
The two reverts were wrong. interim_assistant_messages = the model's REAL
words mid-turn ("I'll inspect the repo first.", "Let me check both files
in parallel"). That is signal, not noise. Suppressing it left Telegram
users staring at "typing..." for the entire turn duration with no
feedback. long_running_notifications = the periodic heartbeat. Silent
agent for 30 minutes is worse than one bubble updating every 3 minutes.
Changes:
- gateway/display_config.py: Telegram tier-1 inbox keeps both defaults
on (only tool_progress and busy_ack_detail stay off).
- gateway/run.py _notify_long_running(): edit a single heartbeat
message in place (where the adapter supports it) instead of posting
a new "Still working..." bubble each interval. Telegram, Discord,
Slack, Matrix all qualify. Falls back to send-new when edit fails.
- gateway/run.py: tighten heartbeat text. "⏳ Still working... (12 min
elapsed — iteration 21/60, running: terminal)" -> "⏳ Working — 12
min, terminal". Verbose iteration detail moves behind busy_ack_detail
(one knob now controls both busy acks AND heartbeat verbosity).
- tests/, cli-config.yaml.example, website/docs/user-guide/messaging:
updated to reflect the corrected story.
Operators behind reverse proxies that don't reliably forward
X-Forwarded-Host / X-Forwarded-Proto / X-Forwarded-Prefix (manual
nginx setups, on-prem ingresses, custom-domain Fly deploys with
incomplete proxy chains) had no way to force the absolute base URL
the OAuth callback redirects from. The dashboard would reconstruct
the redirect_uri from request headers, the IDP would echo it back,
and the user would land on the wrong host or wrong path — 404.
Add `dashboard.public_url` to config.yaml with env override
HERMES_DASHBOARD_PUBLIC_URL. When set, it is the complete authority —
scheme + host + optional path prefix (e.g. https://example.com/hermes) —
and becomes the base for the OAuth `redirect_uri`. X-Forwarded-Prefix
is IGNORED on this code path because the operator has explicitly
declared the public URL; we no longer need to guess from proxy
headers, and stacking the prefix on top would double-prefix the
common case where the prefix is already baked into public_url.
When unset, the existing proxy_headers + X-Forwarded-Prefix
reconstruction runs untouched. Existing Fly.io deploys continue to
work without configuration — this is purely additive.
Precedence mirrors dashboard.oauth.client_id:
env (non-empty) > config.yaml > reconstructed from request
Implementation:
- hermes_cli/config.py: add dashboard.public_url to DEFAULT_CONFIG
with a multi-paragraph doc comment explaining the use case,
the X-Forwarded-Prefix interaction, and the validation rules.
- hermes_cli/dashboard_auth/prefix.py: factored out the existing
_REJECT_CHARS frozenset, added _normalise_public_url() validator
(requires http/https scheme + non-empty host + no header-injection
chars), _load_dashboard_section() loader (robust to load_config
raising, non-dict shapes), and resolve_public_url() entry point
with the env-overrides-config precedence. A malformed value
silently falls through to ""; the caller treats "" as "reconstruct
from request" so a typo never breaks the login flow.
- hermes_cli/dashboard_auth/routes.py: rewrite _redirect_uri()
docstring to spell out the three resolution tiers; add the
public_url short-circuit before the existing X-Forwarded-Prefix
splicing. Source-level comment notes that X-Forwarded-Prefix is
intentionally ignored when public_url is set so a future reader
doesn't try to "fix" the missing prefix layering.
- cli-config.yaml.example: extend the existing dashboard section
with a public_url block.
- website/docs/user-guide/features/web-dashboard.md: new "Public
URL override" section between the provider configuration and
the OAuth flow walkthrough. Documents the env-vs-config table,
the validation rules, and the `http://` `public_url` ↔ Secure
cookie footgun.
Test coverage — new TestPublicUrlOverride class (8 tests):
- env var overrides request reconstruction (the primary motivating
case)
- config.yaml used when env unset
- env wins over config (precedence pin)
- public_url with a path prefix already baked in (the Q1-a case the
user explicitly chose)
- public_url suppresses X-Forwarded-Prefix layering (defends
against the double-prefix bug)
- trailing slash stripped from public_url (no //auth/callback)
- malformed public_url falls through to reconstruction (six
hostile inputs: javascript:, ftp:, missing scheme, missing host,
quote chars, CRLF injection)
- empty env string doesn't shadow config.yaml entry (CI / Fly
provisioned-but-empty secret case)
Mutation-tested: flipping the precedence in resolve_public_url() trips
exactly test_env_overrides_config_public_url; weakening the validator
(accept any scheme) trips exactly test_malformed_public_url_falls_through_to_reconstruction.
Both other tests in each pair stay green, confirming the suite
discriminates the specific regression each test pins.
Per AGENTS.md, ~/.hermes/.env is reserved for API keys / secrets and
config.yaml is the surface for non-secret configuration. The Nous
Portal plugin previously read HERMES_DASHBOARD_OAUTH_CLIENT_ID and
HERMES_DASHBOARD_PORTAL_URL from the environment only, which forced
local-dev / on-prem operators to put non-secret per-instance
configuration in .env — violating the convention.
Add dashboard.oauth.{client_id,portal_url} to DEFAULT_CONFIG and have
the plugin resolve each setting with env-overrides-config precedence:
1. Env var when set to a non-empty value (Fly.io platform-secret
injection — what pushes per-deploy client_ids without baking
them into the image).
2. config.yaml entry (canonical surface for local dev / on-prem).
3. Plugin default (no provider registered when client_id is empty;
portal_url defaults to https://portal.nousresearch.com).
Empty env values are explicitly treated as unset so a provisioned-but-
not-populated Fly secret can't accidentally shadow a valid config.yaml
entry with an empty string — operators would otherwise lose the gate.
Implementation:
- hermes_cli/config.py: add dashboard.oauth.{client_id,portal_url}
block to DEFAULT_CONFIG with full doc comment explaining the
override precedence and Fly.io rationale.
- plugins/dashboard_auth/nous/__init__.py: add _load_config_oauth_section,
_resolve_client_id, _resolve_portal_url helpers; replace the two
direct os.environ.get() calls in register() with the resolvers.
Update the skip-reason string to mention BOTH surfaces so an
operator looking at the fail-closed bind error knows config.yaml
is a valid alternative to the env var.
- plugins/dashboard_auth/nous/plugin.yaml: update description to
name both surfaces. requires_env stays pointing at the env var
name — it's metadata-only (not used by the plugin loader for
gating) so this is documentation/UX, not enforcement.
- cli-config.yaml.example: append commented dashboard.oauth block
with the same override rationale operators see in code.
- website/docs/user-guide/features/web-dashboard.md: rewrite the
'Default provider: Nous Research' section to lead with config.yaml,
present env vars as operator overrides (Fly.io's primary path).
Updated the example fail-closed bind error to match the new
skip-reason text.
Test coverage — new TestConfigYamlSource class (8 tests) pinning
every tier of the precedence chain:
- config-yaml-only path registers correctly
- both config-yaml fields (client_id + portal_url) honoured
- env var overrides config for client_id (Fly.io critical path)
- env var overrides config for portal_url
- empty env string does NOT shadow config (CI/Fly edge case)
- neither source set → skip with reason mentioning BOTH surfaces
- load_config() raising falls through to env-only path (resilience)
- non-dict oauth section falls through cleanly (typo resilience)
Mutation-tested: flipping the precedence to config-wins-over-env trips
exactly test_env_overrides_config_client_id while the other 7 stay
green, confirming the suite discriminates the order, not just the
sources.
This closes the last item in Teknium's PR review (PR #30156).
The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.
Three changes:
1. plugins/dashboard_auth/nous/__init__.py:
- HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
'https://portal.nousresearch.com'. Override only for staging
(portal.rewbs.uk) or a custom deployment. Empty string also
falls back to the default so an empty Fly secret can't point
the dashboard at nowhere.
- Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
reads when no providers register. Cleared on each register() call.
Skip reasons are human-readable and actionable
('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
provisions this env var…').
2. plugins/dashboard_auth/nous/plugin.yaml:
- requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
is mandatory. Description updated to reflect this.
3. hermes_cli/web_server.py:
- When the gate fail-closes for 'no providers', it now reads each
bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
message. Operator sees the specific config fix needed:
Bundled providers reported these issues:
• nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
instead of the prior generic 'Install the default Nous provider'.
Tests:
- TestPluginRegister rewritten to assert the new defaults +
LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
- New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
- test_get_method_is_not_allowed widened to handle the SPA-shell 200
path explicitly — assertion now verifies no JSON ticket leaks
rather than asserting a specific status code (covers all four of
401/404/405/200).
Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
Adds an 'OAuth Authentication (gated mode)' section to the existing web
dashboard docs, slotted just before the CORS section so readers
encounter it after the REST API reference. Covers:
- When the gate engages (decision table for --host / --insecure
combinations).
- Fail-closed semantics if no provider is registered.
- Bundled Nous provider, env-var contract, Portal provisioning.
- Full OAuth dance (link to nous-account-service contract doc) — auth
code + PKCE S256, JWKS verification, 15-min token TTL, no refresh
token in V1.
- Cookies set (hermes_session_at + hermes_session_pkce; mentions the
deprecated hermes_session_rt slot).
- Logout flow, audit log path, redacted fields.
- Custom provider plugin recipe with the DashboardAuthProvider ABC.
- Verification recipe: env vars + /api/status curl.
The docs follow the existing web-dashboard.md style (option tables,
ASCII flow diagrams, curl examples). No frontmatter/sidebar position
changes — the section is appended in place.
New opt-in plugin that scans the content passed to write_file / patch /
skill_manage for 25 known-dangerous code patterns — pickle.load,
yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec,
dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/
insertAdjacentHTML, crypto.createCipher (no IV), AES ECB,
TLS verification disabled, XXE-prone xml.etree/minidom parsers,
<script src=//...> without SRI, torch.load without weights_only=True,
GitHub Actions ${{ github.event.* }} injection — and appends a
"Security guidance" warning block to the tool result via the
transform_tool_result hook.
Default behaviour is non-blocking: the file is written and the warning
rides back to the model in the next turn so it can self-correct or
document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades
to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the
kill switch.
Pattern data (patterns.py) is a verbatim Apache-2.0 fork of
Anthropic's claude-plugins-official/plugins/security-guidance/hooks/
patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE
preserve attribution. The Hermes-side plugin glue (__init__.py,
plugin.yaml, README.md, tests) is original work.
Plugin is opt-in like all bundled plugins:
hermes plugins enable security-guidance
Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic
shipped this as their security-guidance plugin for Claude Code on
2026-05-26 with a measured 30-40% reduction in security-related PR
comments on internal rollout.
What's NOT ported (deferred):
* Layer 2 (LLM diff review on turn end) — would route through main
model by default on Hermes, real money on reasoning models. A
follow-up can wire it to a cheap aux model with explicit opt-in.
* Layer 3 (agentic commit-time review) — agent can run this on
demand via delegate_task today.
* .hermes/security-guidance.md project-rules file — only used by
layers 2/3 upstream.
* fix(plugins/discord): correct install_hint extra to [messaging]
The Discord platform registered install_hint pointing at
'hermes-agent[discord]', but pyproject.toml has no [discord] extra —
the deps live in [messaging] alongside Telegram and Slack. Users hitting
"Platform 'Discord' requirements not met" were directed at a pip command
that installs nothing.
* feat(nix): add #messaging and #full package variants
Make Discord/Telegram/Slack work out of the box for `nix profile install`
users. Messaging deps were dropped from [all] on 2026-05-12 in favor of
lazy-install, but lazy-install can't write to the read-only /nix/store —
users hit "No adapter available for discord" with no actionable guidance.
- #messaging: pre-built with discord.py/telegram/slack (+33 MB venv)
- #full: all 18 platform-portable extras + matrix on Linux only
(python-olm lacks Darwin PyPI wheels) (+738 MB venv)
Also adds a `messaging-variant` flake check that verifies `import discord`
succeeds in the sealed venv — regression guard for the lazy-install
migration.
Docs updated: Quick Start callout, extraDependencyGroups rewrite with
messaging as primary example + full extras table, troubleshooting row,
cheatsheet row.
Closure size deltas (measured x86_64-linux):
default 1792 MB pkg / 512 MB venv
messaging 1826 MB pkg / 546 MB venv (+33 MB)
full 2530 MB pkg / 1250 MB venv (+738 MB)
* chore(nix): trim variant comments + alphabetize full extras
Drop the date-stamped changelog from messaging-variant's comment and the
"+33 MB / +704 MB" numbers from the variant defs — those drift and belong
in the PR description, not source. Alphabetize the 18-extra list in #full
so future additions produce clean one-line diffs.
No semantic change. messaging-variant check still passes.
* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend
Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.
What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
code_execution_tool, file_operations, approval, skills_tool,
environments/local, credential_files, lazy_deps, prompt_builder,
cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
`VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
`TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
test_vercel_sandbox_environment.py; scrubs references across 23
surviving test files (no entire tests deleted unless they were
dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
notes, tool config, terminal-backend tables — English plus zh-Hans
i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list
What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
Sandbox — archive, not active docs
Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
`ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
`vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)
* test: convert profile-count check from change-detector to invariant
The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
Add a first-class active-session orchestrator for the Ink TUI:
- list, activate, close, and launch live process-local TUI sessions
- hydrate committed and in-flight output when switching sessions
- dispatch a new prompt session from the +new row with session-scoped model picks
- expose a clickable live-session count in the status chrome
- preserve stable row order while initially focusing the current session
- support mouse hit-testing for floating orchestrator overlays
- add backend and frontend regression coverage for the lifecycle and UI helpers
'hermes login' was removed (the command now just prints a deprecation
message and exits). The bundled hermes-agent SKILL.md, in-code error
messages, the tip rotation, the proxy adapters, and the docs site
still pointed agents and users at the dead command — so models loading
the skill kept running 'hermes login --provider openai-codex' and
getting a dead-end print.
Replacements use the canonical 'hermes auth add <provider>' surface
(or bare 'hermes auth' for the interactive manager).
Files:
- skills/autonomous-ai-agents/hermes-agent/SKILL.md (+ regenerated docs page)
- hermes_cli/tips.py (tip rotation)
- agent/google_oauth.py (gemini-cli error message)
- agent/conversation_loop.py (nous re-auth troubleshooting line)
- agent/credential_sources.py (docstring)
- hermes_cli/proxy/cli.py + hermes_cli/proxy/adapters/nous_portal.py (proxy auth hints)
- tests/hermes_cli/test_proxy.py (updated assertions)
- website/docs/reference/faq.md, website/docs/user-guide/features/subscription-proxy.md
- zh-Hans i18n mirrors for the above
'hermes logout' is still a live command and is left untouched.
The 'hermes login' stub in hermes_cli/auth.py:login_command() and
the cli-commands.md 'Deprecated' rows are intentionally kept as
the discoverable deprecation surface.
Updates curated picker lists for both the OpenRouter fallback snapshot
(`OPENROUTER_MODELS`) and the Nous Portal list (`_PROVIDER_MODELS['nous']`).
Regenerates website/static/api/model-catalog.json via
`scripts/build_model_catalog.py` to keep the docs-hosted manifest in
sync (drift guard in `test_in_repo_lists_match_manifest`).
tests/hermes_cli/test_models.py fixtures updated — they pinned the
old model id as their live-fetch sample.
* feat(mcp): Nous-approved MCP catalog with interactive picker
Adds an optional-mcps/ directory mirroring optional-skills/: curated,
Nous-approved MCP servers shipped with the repo but disabled by default.
Presence in optional-mcps/ = approval. No community tier, no trust signals.
Entries are added by merging a PR.
New surface:
hermes mcp Interactive catalog picker (default)
hermes mcp catalog Plain-text list, scriptable
hermes mcp install <name> Install a catalog entry
Picker behavior:
not installed -> install (clone/bootstrap if needed, prompt for creds)
installed/off -> enable
installed/on -> menu (disable / uninstall / reinstall)
Manifest schema (manifest_version: 1) supports:
- transport: stdio (command/args, ${INSTALL_DIR} substitution) or http (url)
- install: optional git clone + bootstrap commands (for repos that need
local venv setup, like the n8n bridge); omit for npx/uvx servers
- auth: api_key (prompts -> ~/.hermes/.env), oauth (provider-mediated
or native MCP), or none
Catalog entries are never auto-updated. Users re-run `hermes mcp install`
to refresh. Credentials always go to ~/.hermes/.env (the .env-is-for-secrets
rule), never to per-server env blocks.
Ships n8n as the reference manifest (https://github.com/CyberSamuraiX/hermes-n8n-mcp).
Tests: 19 catalog tests + E2E install/uninstall round-trip via the shipped
manifest.
* feat(mcp): tool-selection checklist + Linear catalog entry
Adds install-time tool selection so users only enable the MCP tools they
actually want, and ships Linear as a second reference catalog entry to
demonstrate the http+oauth path alongside n8n's stdio+api_key+git-bootstrap.
Tool selection flow:
install (clone/auth/credentials) ->
probe server for available tools ->
curses checklist with pre-checked rows ->
write mcp_servers.<name>.tools.include
Pre-check priority:
1. user's prior tools.include (reinstall preserves selection)
2. manifest's tools.default_enabled (curated subset)
3. all probed tools (default)
Probe-failure fallback (server unreachable, OAuth not yet complete,
backing service offline):
- manifest declared default_enabled -> applied directly
- no default declared -> no filter written (all-on when reachable)
- both cases point user at hermes mcp configure <name>
Manifest schema additions:
tools:
default_enabled: [list, of, tool, names] # optional
Updates:
- optional-mcps/linear/manifest.yaml -- new reference entry (http+oauth)
- optional-mcps/n8n/manifest.yaml -- tools.default_enabled set to the
8 read-mostly tools; mutating tools (activate/deactivate, container_logs)
pruned by default
- docs: new 'Tool selection at install time' section in features/mcp.md
Tests: 7 new tests in TestToolSelection covering probe-success / probe-fail
matrix, manifest-default filtering, reinstall-preserves-selection, and
invalid-default-enabled rejection. 26 catalog tests + 32 existing
mcp_config tests passing.
* feat(mcp): polish — picker unification, include-mode convergence, hardening
Addresses review findings on PR #30870. Lands all improvements that
belong in this PR before merge; defers separate cleanup (consolidating
two probe implementations, change-detector tests) to follow-ups.
Picker UX (mcp_picker.py)
- Unifies catalog + custom (user-added) MCPs in one view with distinct
status badges (available / enabled / installed (disabled) /
custom — enabled / custom — disabled)
- Adds 'Configure tools (probe server + re-pick)' action to both the
catalog-installed and custom-row submenus — the existing
hermes mcp configure flow was previously unreachable from the picker
- Loops until ESC/q so the user can manage several entries in one
session instead of having to re-launch
- Uninstall message now mentions .env credentials are preserved with a
pointer to clean them up manually if no longer needed
- Surfaces a 'requires a newer Hermes' warning per future-manifest
entry instead of silently hiding it
Catalog (mcp_catalog.py)
- catalog_diagnostics() exposes which manifests were skipped and why
(future_manifest vs invalid) so UIs can give actionable feedback
- _do_git_install detects SHA-shaped refs (regex /[0-9a-f]{7,40}/)
and skips the doomed 'git clone --branch <sha>' attempt — clone --branch
only accepts branches/tags, so SHAs always failed noisily before
falling back to the full-clone path
- Probe-success all-tools-enabled message now mentions that new tools
the server adds later will be auto-enabled (no-filter mode)
Convergence (tools_config.py)
- _configure_mcp_tools_interactive now writes tools.include (whitelist)
instead of tools.exclude (blacklist), matching the catalog flow and
hermes mcp configure. The on-disk config shape no longer depends on
which UI the user touched last
- Two existing tests updated to assert the new include-mode contract
Discoverability
- Setup wizard final step now prints 'Browse curated MCPs: hermes mcp'
- Three tip-corpus entries pointing at the new catalog
- Docs updated with: trust model (manifests run code locally, gated by
PR review, but read before installing), runtime ${ENV_VAR} substitution
semantics, and the manifest_version forward-compat behavior
Tests
- 7 new tests covering future-manifest diagnostics, custom MCP picker
rows, SHA-ref git-install path, branch-ref git-install path, and the
tools_config include-mode write contract
- 80 MCP-related tests passing across test_mcp_catalog.py,
test_mcp_config.py, test_mcp_tools_config.py
* fix(mcp): drop setup-wizard catalog hint to satisfy supply-chain scanner
The wizard line 'Browse curated MCPs: hermes mcp' triggered the
CI supply-chain scanner because it pattern-matches on edits to any
file named hermes_cli/setup.py — that filename matches the Python
'install-hook file' heuristic even though this setup.py is the
user-facing 'hermes setup' wizard, not a packaging install hook.
The catalog is already surfaced via three tip-corpus entries in
hermes_cli/tips.py (which the scanner doesn't flag), so dropping the
wizard mention loses no discoverability. Worth revisiting after a
scanner allowlist for this specific file lands.
Layered safety so the Skills Hub at /docs/skills stays in sync without
silent rot. Three pieces:
1. build_skills_index.py — refuses to ship a degenerate index.
EXPECTED_FLOORS per source (skills.sh ≥100, lobehub ≥100, clawhub ≥50,
official ≥50, github ≥30, browse-sh ≥50) and MIN_TOTAL=1500. Any source
collapsing to zero (the silent OpenAI breakage that hid for weeks) now
fails the workflow loud — broken index never reaches the live site.
2. extract-skills.py + the React page — visible freshness signal.
Sidecar website/src/data/skills-meta.json carries the index's
generated_at timestamp, plus per-source counts. Skills Hub renders a
'Catalog refreshed N hours ago · auto-rebuilt twice daily' line under
the hero copy. If the cron stalls, users see the staleness immediately.
3. .github/workflows/skills-index-freshness.yml — watchdog cron.
Every 4 hours, fetches the live /docs/api/skills-index.json, validates
shape, checks age (>26h is stale), checks the same per-source floors,
and opens (or appends to) a GitHub issue when anything is off. The
issue is title-prefixed [skills-index-watchdog] so subsequent failures
append a comment instead of spamming new issues.
Net effect:
- A silent regression like 'OpenAI tap moved its skills' now fails the
build instead of shipping a quietly broken catalog.
- A stuck cron (like the landingpage breakage that ran red for weeks) now
files an issue within 4 hours.
- Users see how fresh the catalog is on the page itself.
Test plan:
- Local: built skills-meta.json from the live index → 'Catalog refreshed
N minutes ago' rendered correctly in the static HTML.
- Probe logic dry-run against the live index: total=2456, all 6 sources
above floor, age 0.1h — issues=NONE.
- Triggered skills-index.yml manually; both jobs green, deploy-site.yml
dispatch fired.
The Skills Hub page was stuck on a stale Feb 25 snapshot, showing only Built-in
+ Optional + Anthropic + LobeHub. The unified index already has 2078 skills
from skills.sh / ClawHub / LobeHub / GitHub taps / Claude Marketplace, and
BrowseShSource adds another ~330 — none of it was reaching the page.
Changes:
- website/scripts/extract-skills.py: read website/static/api/skills-index.json
(the unified multi-source catalog, rebuilt twice daily) as the canonical
external source. Keep the legacy skills/index-cache/ fallback for offline
builds. Add friendly per-source labels (skills.sh, ClawHub, browse.sh,
OpenAI, HuggingFace, Anthropic, LobeHub, etc.) and per-entry installCmd.
- website/src/pages/skills/index.tsx: add source pills + ordering for the 11
new sources; render installCmd from the index entry.
- website/scripts/prebuild.mjs: when no local skills-index.json exists, fetch
the live one from hermes-agent.nousresearch.com so local 'npm run build'
matches production without burning GitHub API quota.
- scripts/build_skills_index.py: crawl BrowseShSource so browse.sh entries
land in the unified index. Adjust source_order.
- tools/skills_hub.py: GitHubSource.DEFAULT_TAPS — openai/skills moved its
skills into skills/.curated/ and skills/.system/, so add both as explicit
taps (the listing code skips dotted dirs by design). Drop
VoltAgent/awesome-agent-skills (README-only, no SKILL.md files) and
MiniMax-AI/cli (singular skill, not a tap directory). Net effect: github
source jumps from 83 → 143 skills, with OpenAI properly included.
- .github/workflows/deploy-site.yml: build the unified index BEFORE running
extract-skills.py — previous order meant extract-skills always fell back
to the legacy cache. Drop the 'skip if file exists' guard; the file is
gitignored and must be rebuilt every deploy.
- .github/workflows/skills-index.yml: drop the broken 'deploy-with-index'
job (it cp'd 'landingpage/\*' which no longer exists, failing every cron
run since the landingpage move). Replace it with a workflow_dispatch
trigger of deploy-site.yml so the index refresh still reaches production
on schedule.
- website/docs/user-guide/features/skills.md: drop VoltAgent from the
default-taps doc list to match the code.
Before: 695 skills (Built-in 90, Optional 84, Anthropic 16, LobeHub 505).
After: 2168 skills across 9 source pills, including the 1212 skills.sh
entries the user expected to see.
xAI retired grok-4-1-fast. hermes_cli/models.py already removed it from
the static fallback in an earlier commit, but the context-length
metadata, the tests pinning those values, and the provider doc still
referenced the retired ID. Clean those up so retired model names stop
appearing in user-facing output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an optional autonomous-ai-agents skill that delegates coding tasks
to the OpenHands CLI (https://github.com/All-Hands-AI/OpenHands). Sits
alongside claude-code / codex / opencode and is the model-agnostic
option in that family — any LiteLLM-supported provider works.
This is a ground-truth rewrite of #19325 by @xzessmedia (Tim Koepsel).
The original PR's SKILL.md was drafted by the OpenHands agent itself and
hallucinated several flags that don't exist in the real CLI (\`--model\`,
\`--max-iterations\`, \`--workspace\`, \`--sandbox docker\`), pointed at
the wrong PyPI package (\`openhands-ai\`, which is the legacy V0 SDK),
and claimed native Windows support that the upstream docs explicitly
disclaim. Rather than cherry-pick and rewrite half the lines under
contributor authorship, the SKILL.md was rebuilt against a verified
install (\`uv tool install openhands --python 3.12\`) and a real
end-to-end \`--headless --json\` run against openrouter/openai/gpt-4o-mini.
Authorship credited via the \`author:\` frontmatter field and an
AUTHOR_MAP entry in scripts/release.py.
Changes:
- optional-skills/autonomous-ai-agents/openhands/SKILL.md (new)
- website/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-openhands.md (auto-gen)
- website/docs/reference/optional-skills-catalog.md (one new row)
- website/sidebars.ts (one new entry under Optional → Autonomous AI Agents)
- scripts/release.py (AUTHOR_MAP entry for xzessmedia)
Pitfalls documented in the SKILL came from running the tool, not from
the upstream README: LiteLLM bedrock/sagemaker stderr noise on every
invocation, banner spam (\`OPENHANDS_SUPPRESS_BANNER=1\` required),
\`--override-with-envs\` mandatory or the CLI ignores LLM_* env vars
entirely, the dashed-vs-undashed Conversation ID footgun for \`--resume\`,
LiteLLM model-slug double-prefix when going through OpenRouter.
* feat(skills): add code-wiki skill — closes#486
Bundled skill at skills/software-development/code-wiki/ that generates
comprehensive documentation for any codebase: project overview, architecture
walkthrough with Mermaid flowchart, per-module deep-dives, class diagram,
sequence diagrams, getting-started guide, and (when applicable) API reference.
Output defaults to ~/.hermes/wikis/<repo-name>/ (external to repo, like
Google CodeWiki); in-repo output supported when user explicitly requests it.
Uses only existing Hermes tools (terminal, read_file, search_files,
write_file) — no Docker, no external services, no extra dependencies. Works
on local repos and GitHub URLs (shallow-clones to a temp dir). Bounded scope
defaults (depth 3, cap 10 modules) keep token cost reasonable on large repos.
* refactor(skills): move code-wiki to optional-skills
Per the 'when in doubt, optional' rule — wiki generation is a 'I want this
big thing right now' capability, not daily-driver behavior. Lines up with
finance/research/blockchain skills as install-on-demand rather than always
loaded.
Install via: hermes skills install official/software-development/code-wiki
Follow-up to #32053. The OAuth-over-SSH guide and the MCP feature page
previously only covered xAI and Spotify. Now that MCP servers can complete
OAuth via stdin paste-back on remote/headless hosts, document it.
oauth-over-ssh.md:
- Add MCP servers to the 'Which Providers Need This' table.
- New 'MCP Servers' section covering: paste-back (no setup, works
anywhere), SSH port forward (same pattern as xAI/Spotify), and the 30s
config-auto-reload race pitfall (use 'hermes mcp login <server>' from a
fresh terminal instead of editing config from inside a running session).
mcp.md:
- New 'OAuth-authenticated HTTP servers' section under HTTP servers,
covering auth: oauth config, token cache path, paste-back vs SSH
tunnel for headless hosts, and the same reload-race pitfall.
- Cross-links to the OAuth-over-SSH guide anchor.
Translates the full English docs corpus (335 files) into Simplified
Chinese under website/i18n/zh-Hans/. Combined with PR #31895 (cross-
locale link fix), the 简体中文 locale toggle now serves a complete
Chinese site with working cross-page navigation.
Pipeline:
- Claude Sonnet 4.6 via OpenRouter, 8-way concurrent
- Preserves frontmatter keys, code blocks, MDX/JSX, link URLs, brand
names, and technical jargon (prompt/token/hook/MCP/ACP/etc.)
- Translates only frontmatter title/description and prose
- Two largest files (configuration.md 93KB, research-paper-writing.md
107KB) retried with 64K max_tokens after initial fence-drift
- 3 manual post-fixes for MDX edge cases the model didn't escape:
< in optional-skills-catalog table, double-quotes in an alt= tag,
and a bare URL adjacent to a full-width period
Cost: ~$30 total (Sonnet 4.6 input $3/M + output $15/M).
Verified `npm run build` succeeds for both en and zh-Hans locales,
no double-prefixed /docs/zh-Hans/docs/ URLs in rendered output,
all in-page navigation resolves correctly.
Translations are machine-generated and may need human review on
specific pages — but they're an enormous improvement over the
previous state (3 zh-Hans pages out of 335).
Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any
shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds,
SenseVoice, curl pipelines — become an STT backend with zero Python.
Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved
untouched via the built-in local_command path) and the
register_transcription_provider() Python plugin hook also shipped in this
PR.
Resolution order (mirrors TTS exactly):
1. Built-in (local, local_command, groq, openai, mistral, xai)
→ native handler. Always wins.
2. stt.providers.<name>: type: command → command-provider runner.
3. Plugin-registered TranscriptionProvider → plugin dispatch.
4. No match → 'No STT provider available'.
Files
-----
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained;
added _resolve_command_stt_provider_config, _transcribe_command_stt,
and local helpers for template rendering, shell-quote context, and
process-tree termination. Helpers are documented as mirrors of their
tts_tool.py counterparts (kept local to avoid cross-tool private
import). Wire-in is one insertion point in transcribe_audio() after
the xai elif and before the plugin dispatcher. Plugin dispatcher
additionally defensively short-circuits when a same-name command
config exists (command-wins-over-plugin invariant).
- tests/tools/test_transcription_command_providers.py: 50 new tests
covering resolution (builtin precedence, type/command gating,
case-insensitive lookup, legacy stt.<name> back-compat), helpers
(timeout fallback, format validation, iter, has-any), template
rendering (shell-quote contexts, doubled-brace preservation),
end-to-end via _transcribe_command_stt (output_path read, stdout
fallback, timeout, nonzero exit envelope, model override,
language precedence), and dispatcher integration via the real
transcribe_audio() including command-wins-over-plugin and
builtin-shadow-rejection.
- tests/plugins/transcription/check_parity_vs_main.py: extended from
10 to 13 scenarios. New cases: command-provider-installed,
command-vs-plugin-same-name (verifies command wins precedence),
explicit-openai-with-command-shadow (verifies built-in wins).
Adds command_provider dispatch_kind detection via transcript prefix
(CMD: vs PLUGIN:) so command-provider scenarios can be distinguished
from plugin scenarios even when sharing a provider name.
- website/docs/user-guide/features/tts.md: new 'STT custom command
providers' section symmetric to the TTS section — example config,
placeholder grammar table (input_path / output_path / output_dir /
format / language / model), transcript-read-back semantics (file
first, then stdout fallback), optional keys table, behavior notes,
security note. Updated 'Python plugin providers (STT)' to include
the new 'When to pick which (STT)' decision table and updated
resolution-order section (now 4 layers instead of 3).
Verification
------------
189/189 STT targeted tests + 50/50 new command-provider tests pass.
Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/
8623/8623 — zero regressions across 14,199 tests.
Parity harness: 13 scenarios, 9 OK + 4 expected diffs
(no_provider_error → plugin, plugin_unavailable, command_provider × 2).
E2E live-verified in an isolated HERMES_HOME with a real .wav file:
command: → dispatched to stt.providers.my-fake-cli
plugin: → dispatched to registered TranscriptionProvider
command-wins-over-plugin: → command provider beats same-name plugin
builtin-wins-over-command: → built-in OpenAI handler fires;
stt.providers.openai: type: command
does NOT hijack it.
Add an opt-in Python plugin surface for speech-to-text backends,
mirroring the TTS hook pattern. New backends (OpenRouter, SenseAudio,
Gemini-STT, custom proprietary engines) can be implemented as plugins
without modifying tools/transcription_tools.py.
Built-ins always win
--------------------
The 6 built-in STT providers (local/faster-whisper, local_command,
groq, openai, mistral, xai) keep their native handlers. Plugins
attempting to register under a built-in name are rejected at
registration time with a warning and re-checked defensively at
dispatch.
Resolution order
----------------
1. stt.provider matches a built-in → built-in dispatch (unchanged)
2. stt.provider matches a registered plugin →
a. if plugin.is_available() returns False → unavailability envelope
identifying the plugin (not the generic "No STT provider"
message — the user explicitly opted into this plugin)
b. otherwise plugin.transcribe() with model + language forwarded
from stt.<provider>.{model,language} config
3. No match → legacy "No STT provider available" error (unchanged)
Per-provider config namespace
-----------------------------
Plugins read their config from stt.<provider> in config.yaml, mirroring
how built-ins read stt.openai.model / stt.mistral.model. The dispatcher
forwards `model` and `language` from this section. Caller's explicit
`model=` argument overrides the config-set model.
Files
-----
- agent/transcription_provider.py: TranscriptionProvider ABC
- agent/transcription_registry.py: register/get/list providers,
built-in shadow guard, _reset_for_tests
- hermes_cli/plugins.py: register_transcription_provider() on
PluginContext
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset,
_dispatch_to_plugin_provider() with availability gate, wire-in
after xai branch and before "No STT provider" error
- tests/agent/test_transcription_registry.py: 27 tests
- tests/hermes_cli/test_plugins_transcription_registration.py: 3 tests
- tests/tools/test_transcription_plugin_dispatch.py: 28 tests
(covering built-in short-circuit, plugin dispatch, exception
envelope, non-dict guard, availability gate, language forwarding)
- tests/plugins/transcription/check_parity_vs_main.py: 10-scenario
subprocess-pinned parity harness vs origin/main
- website/docs/user-guide/features/{tts,plugins}.md: docs
Behavior parity
---------------
10 scenarios, 8 OK + 2 expected DIFFs:
no_provider_error → plugin (plugin-installed scenario)
no_provider_error → plugin_unavailable (plugin-installed-unavailable
scenario; PR returns cleaner envelope)
Zero behavior change for users not opting into a plugin.
Issue follow-up to #30398.