When HERMES_HOME points at a custom path whose parent directories
only root can create (e.g. HERMES_HOME=/home/hermes/.hermes in a
Compose file, or any path under a fresh / not pre-populated by the
image), stage2-hook.sh fails on first boot:
[stage2] Warning: chown failed (rootless container?) - continuing
mkdir: cannot create directory '/custom': Permission denied
mkdir: cannot create directory '/custom': Permission denied
... (one per s6-setuidgid hermes mkdir invocation)
cont-init: info: /etc/cont-init.d/01-hermes-setup exited 1
The mkdirs fail because s6-setuidgid drops to hermes (UID 10000)
before invoking mkdir -p, and the runtime user has no permission to
create root-owned ancestor directories. 02-reconcile-profiles then
crashes with FileNotFoundError, .install_method never lands, and
the container limps on in a half-initialized state.
Bootstrap HERMES_HOME with mkdir -p while still root, before the
ownership normalization. Idempotent on the default /opt/data path
(directory already exists from the Dockerfile RUN mkdir -p) and on
any subsequent restart. (#18482)
Retargeted from the original PR's docker/entrypoint.sh (now a
deprecated shim) to docker/stage2-hook.sh where the related chown
logic moved during the s6-overlay rework.
Co-authored-by: wpengpeng168 <133926080+wpengpeng168@users.noreply.github.com>
shellcheck doesn't recognize the s6-overlay `#!/command/with-contenv sh`
shebang and aborts with SC1008 ("This shebang was unrecognized. ShellCheck
only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to
specify."). The error fires at --severity=error too, so it fails the
"Docker / shell lint" CI job on every PR that touches docker/.
Add the canonical `# shellcheck shell=sh` directive — same fix already
applied to the sibling cont-init.d scripts (`02-reconcile-profiles` and
`015-supervise-perms`) when they adopted the with-contenv shebang.
The shebang was changed from `#!/bin/sh` → `#!/command/with-contenv sh`
in PR #32412 (commit 29c71e9) to fix env-propagation through s6's PID 1.
The shellcheck-directive line was missed in that PR; this patches it.
Reproduces locally:
docker run --rm -v "$PWD:/mnt" -w /mnt koalaman/shellcheck:stable \
--severity=error --format=gcc docker/main-wrapper.sh
Before: docker/main-wrapper.sh:1:1: error: [SC1008] (rc=1)
After: (no output) (rc=0)
Script behavior is unchanged — the directive is a comment, and `sh -n`
/ `bash -n` parse the file cleanly either way.
Debian trixie's bundled `nodejs` package is pinned to 20.19.2, which
reached LTS EOL in April 2026. Trixie won't upgrade in place; Debian 14
(forky) — where the apt nodejs is 24.x — isn't released until ~mid-2027.
To stay on a supported LTS without waiting for Debian 14, copy node + npm
+ corepack from the upstream `node:22-bookworm-slim` image as a
multi-stage source, matching the existing `uv_source` and `gosu_source`
patterns in the Dockerfile. Bookworm-based slim image is used so the
produced binary links against glibc 2.36, which runs cleanly on Debian 13
(trixie, glibc 2.41).
Changes:
- Add `FROM node:22-bookworm-slim@sha256:... AS node_source` stage
- Remove `nodejs npm` from `apt-get install` (now sourced from node_source)
- Add `ca-certificates` explicitly to apt install (was a transitive of
the apt nodejs package; removing nodejs broke the chain and curl
inside the build failed with "error setting certificate file")
- COPY node binary + npm + corepack from node_source; recreate the
symlinks at /usr/local/bin/{npm,npx,corepack}
- Update the npm_config_install_links=false comment block — npm 10's
default is already `install-links=false`, but we keep the env as
defense-in-depth against future Node-source-version regressions
Future bumps to Node 24/26 are a one-line ARG change.
Validation:
- Built --no-cache against current origin/main; build succeeds in 1m42s
- Image size: 3.27 GB (pre-salvage-1 baseline) → 3.14 GB (this PR);
net 130 MiB savings (60 MiB from this change alone vs current main —
removing apt nodejs+transitive deps that duplicated what node bundles)
- Node 22.22.3 / npm 10.9.8 / esbuild 0.27.7 all run cleanly under
trixie's glibc 2.41
- Standard image smoke (6/6), Node-version E2E (8/8), chown E2E from
#19788 (6/6), TUI UID-remap E2E from #28851 (4/4) — 24 checks total
Co-authored-by: Prithvi Monangi <8312237+Prithvi1994@users.noreply.github.com>
When HERMES_UID remaps the hermes user from 10000 to another UID
(e.g. matching the host user's UID for bind-mount ergonomics), the TUI
launcher's esbuild step fails:
✘ [ERROR] Failed to write to output file:
open /opt/hermes/ui-tui/dist/entry.js: permission denied
TUI build failed.
This is because the Dockerfile's build-time `chown -R hermes:hermes` on
`/opt/hermes/{.venv,ui-tui,node_modules}` (line 154) wrote UID 10000,
and stage2-hook.sh only re-chowned `.venv` on UID remap — leaving the
TUI build trees still owned by the old UID.
Extend the stage2 re-chown to include the same set as the build-time
chown: `.venv`, `ui-tui`, `node_modules`. These are the runtime-writable
trees under $INSTALL_DIR; everything else under /opt/hermes is read-only
at runtime so keeping it root-owned is fine.
Original fix targeted docker/entrypoint.sh which is now a deprecated shim;
retargeted to docker/stage2-hook.sh where the .venv chown moved during
the s6-overlay rework.
Co-authored-by: Andreas Steffan <623481+deas@users.noreply.github.com>
Replaces the recursive chown of $HERMES_HOME in stage2-hook.sh with a
targeted approach: chown the top-level dir (so hermes can create new subdirs)
plus the specific hermes-owned subdirectories (cron/, sessions/, logs/,
hooks/, memories/, skills/, skins/, plans/, workspace/, home/, profiles/) —
the same canonical list seeded by the s6-setuidgid mkdir -p block below.
Avoids clobbering host-side file ownership when $HERMES_HOME is a bind
mount that contains user-owned files not managed by hermes (issue #19788).
Original fix targeted docker/entrypoint.sh which is now a deprecated shim;
retargeted to docker/stage2-hook.sh where the recursive chown moved during
the s6-overlay rework.
Co-authored-by: Ptichalouf <1809721+ptichalouf@users.noreply.github.com>
* fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144)
When an OpenAI-compatible Responses API surface accepts an initial
request but later rejects the replayed `codex_reasoning_items`
encrypted blob with HTTP 400 `invalid_encrypted_content`, the
session previously got stuck retrying the same poisoned payload.
Recovery: classify the error as a dedicated FailoverReason, and on the
first hit disable encrypted reasoning replay for the rest of the
session, strip cached items from message history, and retry once.
Changes:
* error_classifier: add FailoverReason.invalid_encrypted_content
branch in _classify_400 (before context_overflow so the messages
that mention 'encrypted content … could not be verified' don't trip
context heuristics), in _classify_by_error_code, and extend
_extract_error_code to peek inside wrapped JSON in error.message and
ignore the bare '400' as a code.
* agent_init: initialize `_codex_reasoning_replay_enabled = True` on
every agent.
* run_agent: add AIAgent._disable_codex_reasoning_replay() helper
that flips the flag and pops cached items.
* codex_responses_adapter: thread a `replay_encrypted_reasoning`
kwarg through _chat_messages_to_responses_input so that when the
flag is False we don't replay codex_reasoning_items.
* transports/codex.py: read `replay_encrypted_reasoning` from params,
thread it into the adapter, and gate the
`include=['reasoning.encrypted_content']` request hint on it.
* chat_completion_helpers: pass the agent's replay flag through to
the transport.
* conversation_loop: in the retry loop, add an
invalid_encrypted_content recovery branch that fires once per
session, only when api_mode == codex_responses, only when replay is
still enabled, and only when at least one assistant message in
history actually carries cached reasoning items (otherwise the 400
has nothing to do with our cache and the normal retry path handles
it).
Tests:
* test_error_classifier: new wrapped-JSON _extract_error_code case;
new TestClassifyApiError cases proving the 400 is retryable with
no fallback, that the broad message match doesn't catch a generic
'parsed' message, and that the error code match is
case-insensitive.
* test_run_agent_codex_responses: end-to-end test of the recovery
branch firing once and disabling replay, plus a sibling test that
proves the branch does *not* fire (and the flag stays True) when
history has no cached reasoning items.
Salvages PR #10144 onto the post-refactor module layout
(error_classifier / codex_responses_adapter / transports/codex /
conversation_loop / agent_init) since the original diff was written
against the pre-refactor monolithic run_agent.py.
* chore(release): map victorGPT in AUTHOR_MAP for #10144 salvage
---------
Co-authored-by: victorGPT <wuxuebin1993@gmail.com>
build-essential is a Debian metapackage (libc6-dev + gcc + g++ + make + dpkg-dev).
The Dockerfile already installs gcc + python3-dev + libffi-dev explicitly,
which covers the C-ext compile cases lazy_deps may hit at first boot.
g++/make/dpkg-dev aren't reached by the resolved [all]+[messaging] tree
on current main — verified via uv sync --dry-run on cp313-linux.
Co-authored-by: Monty Taylor <mordred@inaugust.com>
Add a first-class active-session orchestrator for the Ink TUI:
- list, activate, close, and launch live process-local TUI sessions
- hydrate committed and in-flight output when switching sessions
- dispatch a new prompt session from the +new row with session-scoped model picks
- expose a clickable live-session count in the status chrome
- preserve stable row order while initially focusing the current session
- support mouse hit-testing for floating orchestrator overlays
- add backend and frontend regression coverage for the lifecycle and UI helpers
qwen3.7-max on OpenCode Go rejects the OpenAI-compatible (oa-compat)
format with HTTP 401 but works correctly via the Anthropic Messages
endpoint (/v1/messages with x-api-key auth). Route it the same way
MiniMax models are routed: anthropic_messages api_mode.
Changes:
- hermes_cli/models.py: add qwen3.7-max routing + curated list
- hermes_cli/setup.py: add to setup wizard model list
- hermes_cli/auth.py: update provider comment
- tests: add assertions for qwen3.7-max api_mode routing
'hermes login' was removed (the command now just prints a deprecation
message and exits). The bundled hermes-agent SKILL.md, in-code error
messages, the tip rotation, the proxy adapters, and the docs site
still pointed agents and users at the dead command — so models loading
the skill kept running 'hermes login --provider openai-codex' and
getting a dead-end print.
Replacements use the canonical 'hermes auth add <provider>' surface
(or bare 'hermes auth' for the interactive manager).
Files:
- skills/autonomous-ai-agents/hermes-agent/SKILL.md (+ regenerated docs page)
- hermes_cli/tips.py (tip rotation)
- agent/google_oauth.py (gemini-cli error message)
- agent/conversation_loop.py (nous re-auth troubleshooting line)
- agent/credential_sources.py (docstring)
- hermes_cli/proxy/cli.py + hermes_cli/proxy/adapters/nous_portal.py (proxy auth hints)
- tests/hermes_cli/test_proxy.py (updated assertions)
- website/docs/reference/faq.md, website/docs/user-guide/features/subscription-proxy.md
- zh-Hans i18n mirrors for the above
'hermes logout' is still a live command and is left untouched.
The 'hermes login' stub in hermes_cli/auth.py:login_command() and
the cli-commands.md 'Deprecated' rows are intentionally kept as
the discoverable deprecation surface.
When the gateway processes /reload-mcp, it reconnects MCP servers and
updates the global _servers registry, but cached AIAgent instances in
_agent_cache keep the tools list they were built with. The user had to
also run /new (discarding conversation history) before the agent could
see the new tools — even though /reload-mcp had succeeded.
This patch refreshes each cached agent's .tools and .valid_tool_names
in _execute_mcp_reload after discovery returns, so existing sessions
pick up new MCP tools on their next turn. The slash-confirm gate in
_handle_reload_mcp_command already obtains user consent for the
implied prompt-cache invalidation before this code runs.
Mirrors the equivalent behaviour the CLI already does in cli.py
_reload_mcp. Per-agent enabled_toolsets and disabled_toolsets are
preserved so an agent that was scoped to a subset of toolsets does
not silently gain disabled tools after the reload.
Original diagnosis + initial implementation in #23812 from @fujinice.
The auto-reload watcher half of that PR is intentionally dropped —
users want /reload-mcp to remain explicit.
Co-authored-by: fujinice <45688690+fujinice@users.noreply.github.com>
Grok models (and other LLMs) sometimes omit the schedule parameter
when calling the cronjob tool with action=create because the schema
only listed 'action' in required[] and the schedule description did
not explicitly state it was mandatory (issue #32427).
Fix: update schema descriptions to clearly state schedule is REQUIRED
for action=create, making this explicit for models that rely on
description text for parameter compliance.
Fixes#32427
Updates curated picker lists for both the OpenRouter fallback snapshot
(`OPENROUTER_MODELS`) and the Nous Portal list (`_PROVIDER_MODELS['nous']`).
Regenerates website/static/api/model-catalog.json via
`scripts/build_model_catalog.py` to keep the docs-hosted manifest in
sync (drift guard in `test_in_repo_lists_match_manifest`).
tests/hermes_cli/test_models.py fixtures updated — they pinned the
old model id as their live-fetch sample.
* feat(mcp): Nous-approved MCP catalog with interactive picker
Adds an optional-mcps/ directory mirroring optional-skills/: curated,
Nous-approved MCP servers shipped with the repo but disabled by default.
Presence in optional-mcps/ = approval. No community tier, no trust signals.
Entries are added by merging a PR.
New surface:
hermes mcp Interactive catalog picker (default)
hermes mcp catalog Plain-text list, scriptable
hermes mcp install <name> Install a catalog entry
Picker behavior:
not installed -> install (clone/bootstrap if needed, prompt for creds)
installed/off -> enable
installed/on -> menu (disable / uninstall / reinstall)
Manifest schema (manifest_version: 1) supports:
- transport: stdio (command/args, ${INSTALL_DIR} substitution) or http (url)
- install: optional git clone + bootstrap commands (for repos that need
local venv setup, like the n8n bridge); omit for npx/uvx servers
- auth: api_key (prompts -> ~/.hermes/.env), oauth (provider-mediated
or native MCP), or none
Catalog entries are never auto-updated. Users re-run `hermes mcp install`
to refresh. Credentials always go to ~/.hermes/.env (the .env-is-for-secrets
rule), never to per-server env blocks.
Ships n8n as the reference manifest (https://github.com/CyberSamuraiX/hermes-n8n-mcp).
Tests: 19 catalog tests + E2E install/uninstall round-trip via the shipped
manifest.
* feat(mcp): tool-selection checklist + Linear catalog entry
Adds install-time tool selection so users only enable the MCP tools they
actually want, and ships Linear as a second reference catalog entry to
demonstrate the http+oauth path alongside n8n's stdio+api_key+git-bootstrap.
Tool selection flow:
install (clone/auth/credentials) ->
probe server for available tools ->
curses checklist with pre-checked rows ->
write mcp_servers.<name>.tools.include
Pre-check priority:
1. user's prior tools.include (reinstall preserves selection)
2. manifest's tools.default_enabled (curated subset)
3. all probed tools (default)
Probe-failure fallback (server unreachable, OAuth not yet complete,
backing service offline):
- manifest declared default_enabled -> applied directly
- no default declared -> no filter written (all-on when reachable)
- both cases point user at hermes mcp configure <name>
Manifest schema additions:
tools:
default_enabled: [list, of, tool, names] # optional
Updates:
- optional-mcps/linear/manifest.yaml -- new reference entry (http+oauth)
- optional-mcps/n8n/manifest.yaml -- tools.default_enabled set to the
8 read-mostly tools; mutating tools (activate/deactivate, container_logs)
pruned by default
- docs: new 'Tool selection at install time' section in features/mcp.md
Tests: 7 new tests in TestToolSelection covering probe-success / probe-fail
matrix, manifest-default filtering, reinstall-preserves-selection, and
invalid-default-enabled rejection. 26 catalog tests + 32 existing
mcp_config tests passing.
* feat(mcp): polish — picker unification, include-mode convergence, hardening
Addresses review findings on PR #30870. Lands all improvements that
belong in this PR before merge; defers separate cleanup (consolidating
two probe implementations, change-detector tests) to follow-ups.
Picker UX (mcp_picker.py)
- Unifies catalog + custom (user-added) MCPs in one view with distinct
status badges (available / enabled / installed (disabled) /
custom — enabled / custom — disabled)
- Adds 'Configure tools (probe server + re-pick)' action to both the
catalog-installed and custom-row submenus — the existing
hermes mcp configure flow was previously unreachable from the picker
- Loops until ESC/q so the user can manage several entries in one
session instead of having to re-launch
- Uninstall message now mentions .env credentials are preserved with a
pointer to clean them up manually if no longer needed
- Surfaces a 'requires a newer Hermes' warning per future-manifest
entry instead of silently hiding it
Catalog (mcp_catalog.py)
- catalog_diagnostics() exposes which manifests were skipped and why
(future_manifest vs invalid) so UIs can give actionable feedback
- _do_git_install detects SHA-shaped refs (regex /[0-9a-f]{7,40}/)
and skips the doomed 'git clone --branch <sha>' attempt — clone --branch
only accepts branches/tags, so SHAs always failed noisily before
falling back to the full-clone path
- Probe-success all-tools-enabled message now mentions that new tools
the server adds later will be auto-enabled (no-filter mode)
Convergence (tools_config.py)
- _configure_mcp_tools_interactive now writes tools.include (whitelist)
instead of tools.exclude (blacklist), matching the catalog flow and
hermes mcp configure. The on-disk config shape no longer depends on
which UI the user touched last
- Two existing tests updated to assert the new include-mode contract
Discoverability
- Setup wizard final step now prints 'Browse curated MCPs: hermes mcp'
- Three tip-corpus entries pointing at the new catalog
- Docs updated with: trust model (manifests run code locally, gated by
PR review, but read before installing), runtime ${ENV_VAR} substitution
semantics, and the manifest_version forward-compat behavior
Tests
- 7 new tests covering future-manifest diagnostics, custom MCP picker
rows, SHA-ref git-install path, branch-ref git-install path, and the
tools_config include-mode write contract
- 80 MCP-related tests passing across test_mcp_catalog.py,
test_mcp_config.py, test_mcp_tools_config.py
* fix(mcp): drop setup-wizard catalog hint to satisfy supply-chain scanner
The wizard line 'Browse curated MCPs: hermes mcp' triggered the
CI supply-chain scanner because it pattern-matches on edits to any
file named hermes_cli/setup.py — that filename matches the Python
'install-hook file' heuristic even though this setup.py is the
user-facing 'hermes setup' wizard, not a packaging install hook.
The catalog is already surfaced via three tip-corpus entries in
hermes_cli/tips.py (which the scanner doesn't flag), so dropping the
wizard mention loses no discoverability. Worth revisiting after a
scanner allowlist for this specific file lands.
Follow-up to #32087 after community report from @ethernet that 8000-char
single-line pastes get dumped raw into the input box.
A) Fallback regression revert
paste_collapse_threshold_fallback default: 0 -> 5
#32087 disabled the fallback handler by default. The fallback path
has been always-on with line_count >= 5 since #3065 (March 2026);
the previous shape was the salvaged contributor's design and didn't
match pre-existing behavior for terminals without bracketed paste
support (Windows terminals, some SSH setups). Restoring the original
on-by-default.
B) Long single-line paste guard
New config key: paste_collapse_char_threshold (default 2000)
Bracketed-paste handler and fallback handler now BOTH collapse when
line count >= line threshold OR total char length >= char threshold.
Catches the case ethernet hit: ~8000 chars of minified JSON / log
output on a single line dumped raw into the buffer.
TUI mirrors the same config via uiStore.pasteCollapseChars.
Set 0 to disable.
Defaults verified:
paste_collapse_threshold: 5
paste_collapse_threshold_fallback: 5
paste_collapse_char_threshold: 2000
Tests:
tests/hermes_cli/test_config.py: 87/87 pass
ui-tui useConfigSync.test.ts: 34/34 pass
ui-tui useComposerState.test.ts: 9/9 pass
tsc: 0 new errors in touched files
Follow-up on top of @TheOnlyMika's #32155 cherry-pick. The defusedxml
hardening import was unconditional, which would break the gateway for
anyone running a WeComCallback adapter without the (transitive-only)
defusedxml present.
- Wrap the import in the same try/except pattern as aiohttp/httpx in
the same file. Sets DEFUSEDXML_AVAILABLE flag.
- Extend check_wecom_callback_requirements() to gate on the flag, so
the gateway logs the actual missing dep and skips the adapter
instead of crashing.
- Add [wecom] extra to pyproject.toml with defusedxml==0.7.1.
- Register platform.wecom_callback in tools/lazy_deps.py so users get
prompted to install it on first WeComCallback configuration, same
pattern as discord/slack/matrix.
defusedxml is still the right call for pre-auth XML parsing — this
commit just makes the dep declarative and recoverable instead of a
hard import-time crash.
Two small defensive-hardening changes:
- web/src/components/Markdown.tsx: render links only for http(s)/mailto
schemes; other schemes (javascript:, data:, vbscript:) are dropped to
plain text so a crafted link in rendered content can't execute on click.
- gateway/platforms/wecom_callback.py: parse the untrusted, pre-auth WeCom
callback request body with defusedxml instead of xml.etree, blocking
entity-expansion / billion-laughs (and XXE) on the parse path. defusedxml
is already a dependency (uv.lock); response-building XML in
wecom_crypto.py is unchanged (it is not parsed from untrusted input).
Verified: dashboard typechecks and builds; defusedxml blocks an
entity-expansion payload while valid WeCom envelopes still parse.
SubdirectoryHintTracker was scanning directories outside the active
working directory, allowing files like ~/.codex/AGENTS.md or
~/.claude/CLAUDE.md to be loaded and injected into the agent context.
This causes cross-agent context contamination and instruction mixup.
Add _is_ancestor_or_same() helper and a path boundary check in
_is_valid_subdir(): only directories within the working directory tree
(i.e. path.is_relative_to(working_dir)) are allowed.
Also add exist_ok=True to mkdir() calls in new tests to prevent
pytest-xdist race conditions when workers share the same tmp_path parent.
Tests added:
- test_outside_working_dir_rejected: verifies sibling dirs are blocked
- test_outside_working_dir_absolute_path_rejected: verifies ~/.codex paths blocked
- test_inside_workspace_subdir_allowed: verifies normal subdir access unaffected
- test_sibling_repo_not_loaded_via_ancestor_walk: ancestor walk stays within workspace
The GFM → Telegram-row-group rewriter previously joined every line in
every row with a blank line ("\n\n".join(rendered_rows)), which made
multi-column tables explode into one-bullet-per-paragraph walls on
mobile. It also emitted the row heading twice when the table had no
row-label column: once as the standalone bold heading and once again
as the first labeled bullet (heading == headers[0] == data_cells[0]).
This commit:
* Uses single newlines between the heading and its bullets within a
row-group, and a blank line only BETWEEN row-groups.
* Skips any bullet whose value duplicates the heading text when the
table has no row-label column (the heading already carries that
information). Tables WITH a row-label column are unaffected since
the heading comes from the label cell and never duplicates a header.
Updated existing test assertions accordingly and added two regression
tests: one that reproduces the screenshot bug (wide five-column "Plays"
comparison table) and one that pins the row-label-column behavior so
the dedup logic doesn't accidentally swallow real data.
tests/gateway/test_telegram_format.py: 101 passed
Layered safety so the Skills Hub at /docs/skills stays in sync without
silent rot. Three pieces:
1. build_skills_index.py — refuses to ship a degenerate index.
EXPECTED_FLOORS per source (skills.sh ≥100, lobehub ≥100, clawhub ≥50,
official ≥50, github ≥30, browse-sh ≥50) and MIN_TOTAL=1500. Any source
collapsing to zero (the silent OpenAI breakage that hid for weeks) now
fails the workflow loud — broken index never reaches the live site.
2. extract-skills.py + the React page — visible freshness signal.
Sidecar website/src/data/skills-meta.json carries the index's
generated_at timestamp, plus per-source counts. Skills Hub renders a
'Catalog refreshed N hours ago · auto-rebuilt twice daily' line under
the hero copy. If the cron stalls, users see the staleness immediately.
3. .github/workflows/skills-index-freshness.yml — watchdog cron.
Every 4 hours, fetches the live /docs/api/skills-index.json, validates
shape, checks age (>26h is stale), checks the same per-source floors,
and opens (or appends to) a GitHub issue when anything is off. The
issue is title-prefixed [skills-index-watchdog] so subsequent failures
append a comment instead of spamming new issues.
Net effect:
- A silent regression like 'OpenAI tap moved its skills' now fails the
build instead of shipping a quietly broken catalog.
- A stuck cron (like the landingpage breakage that ran red for weeks) now
files an issue within 4 hours.
- Users see how fresh the catalog is on the page itself.
Test plan:
- Local: built skills-meta.json from the live index → 'Catalog refreshed
N minutes ago' rendered correctly in the static HTML.
- Probe logic dry-run against the live index: total=2456, all 6 sources
above floor, age 0.1h — issues=NONE.
- Triggered skills-index.yml manually; both jobs green, deploy-site.yml
dispatch fired.
s6-overlay's /init scrubs the environment before invoking both
/etc/cont-init.d/* scripts and the container's CMD wrapper. As a
result, ENV directives from the Dockerfile (HERMES_HOME=/opt/data,
HERMES_WEB_DIST, …) and compose-time `environment:` entries
(HERMES_UID, HERMES_GID) never reached the scripts that actually
use them. Three concrete failures observed on macOS Docker Desktop
with `~/.hermes:/opt/data`:
* stage2-hook.sh ran with HERMES_UID unset → no UID remap, hermes
user stayed at UID 10000 instead of the host user's UID.
* skills_sync.py (invoked from stage2-hook) ran with HERMES_HOME
unset → get_hermes_home() fell back to Path.home()/.hermes,
populating a shadow $HERMES_HOME/.hermes/skills tree on the
mounted volume (visible on the host as ~/.hermes/.hermes/skills).
* The main `hermes gateway run` process inherited HOME=/root from
the /init context (s6-setuidgid doesn't update HOME), so
libraries resolving XDG_STATE_HOME via $HOME tried to write to
/root/.local/state/hermes/gateway-locks/ and failed with EACCES,
preventing the Discord adapter from acquiring its bot-token lock.
Three surgical changes restore correct env flow:
1. The auto-generated /etc/cont-init.d/01-hermes-setup wrapper now
uses `#!/command/with-contenv sh`, matching the pattern already
used by docker/cont-init.d/02-reconcile-profiles. The container
env (Dockerfile ENV + compose `environment:`) now reaches
stage2-hook.sh and the skills_sync.py subprocess it spawns.
2. docker/main-wrapper.sh also switches to `#!/command/with-contenv
sh`. The container CMD (`gateway run`, `chat`, `setup`, …) now
sees HERMES_HOME and the other container-level env vars.
3. docker/main-wrapper.sh exports HOME=/opt/data before
`s6-setuidgid hermes`. with-contenv populates HOME from the
/init context (/root); s6-setuidgid drops privileges but does
not update HOME. The hermes user's home per /etc/passwd is
/opt/data, so the explicit override matches passwd.
No behavior change for the non-buggy paths: the s6-supervised
services already used with-contenv, and HOME=/opt/data only affects
processes that resolved $HOME-based paths to /root (silently
broken).
The Skills Hub page was stuck on a stale Feb 25 snapshot, showing only Built-in
+ Optional + Anthropic + LobeHub. The unified index already has 2078 skills
from skills.sh / ClawHub / LobeHub / GitHub taps / Claude Marketplace, and
BrowseShSource adds another ~330 — none of it was reaching the page.
Changes:
- website/scripts/extract-skills.py: read website/static/api/skills-index.json
(the unified multi-source catalog, rebuilt twice daily) as the canonical
external source. Keep the legacy skills/index-cache/ fallback for offline
builds. Add friendly per-source labels (skills.sh, ClawHub, browse.sh,
OpenAI, HuggingFace, Anthropic, LobeHub, etc.) and per-entry installCmd.
- website/src/pages/skills/index.tsx: add source pills + ordering for the 11
new sources; render installCmd from the index entry.
- website/scripts/prebuild.mjs: when no local skills-index.json exists, fetch
the live one from hermes-agent.nousresearch.com so local 'npm run build'
matches production without burning GitHub API quota.
- scripts/build_skills_index.py: crawl BrowseShSource so browse.sh entries
land in the unified index. Adjust source_order.
- tools/skills_hub.py: GitHubSource.DEFAULT_TAPS — openai/skills moved its
skills into skills/.curated/ and skills/.system/, so add both as explicit
taps (the listing code skips dotted dirs by design). Drop
VoltAgent/awesome-agent-skills (README-only, no SKILL.md files) and
MiniMax-AI/cli (singular skill, not a tap directory). Net effect: github
source jumps from 83 → 143 skills, with OpenAI properly included.
- .github/workflows/deploy-site.yml: build the unified index BEFORE running
extract-skills.py — previous order meant extract-skills always fell back
to the legacy cache. Drop the 'skip if file exists' guard; the file is
gitignored and must be rebuilt every deploy.
- .github/workflows/skills-index.yml: drop the broken 'deploy-with-index'
job (it cp'd 'landingpage/\*' which no longer exists, failing every cron
run since the landingpage move). Replace it with a workflow_dispatch
trigger of deploy-site.yml so the index refresh still reaches production
on schedule.
- website/docs/user-guide/features/skills.md: drop VoltAgent from the
default-taps doc list to match the code.
Before: 695 skills (Built-in 90, Optional 84, Anthropic 16, LobeHub 505).
After: 2168 skills across 9 source pills, including the 1212 skills.sh
entries the user expected to see.
Pre-salvage prep for the must-have security cluster (#32103, #32155).
#32103 author commit uses dearmayo@localhost; PR opener is ffr31mr —
same pattern as the existing holynn-q localhost mapping.
The runtime cron prompt scanner (added in #3968 to plug the
"malicious skill carrying an injection payload" gap) reuses the same
critical-severity patterns as the create-time user-prompt scan against
the *assembled* prompt — which includes loaded skill markdown.
That works fine for narrow patterns like "ignore previous instructions"
which never legitimately appear in prose. It catastrophically false-
positives on command-shape patterns like `cat ~/.hermes/.env`,
`authorized_keys`, `/etc/sudoers`, and `rm -rf /`, which routinely
appear in security postmortems and runbooks as **descriptive prose**
about attacks, not as actual commands.
Concrete failure: the bundled `hermes-agent-dev` skill contains a
security postmortem section saying "the attacker could just
`cat ~/.hermes/.env`". Every PR-scout cron job that loaded this skill
was silently blocked with `Blocked: prompt matches threat pattern
'read_secrets'`. All 11 scout jobs failed for weeks.
Fix: split the scanner into two tiers and route by context:
- `_scan_cron_prompt` (strict, unchanged behavior) runs against
the small user-authored cron prompt at create/update and as a
runtime defense-in-depth when no skills are attached. A legit
user prompt has no business saying `cat .env`, so the strict
patterns still apply there.
- `_scan_cron_skill_assembled` (new, looser) runs against the
assembled prompt when skills are attached. It only catches
unambiguous prompt-injection directives ("ignore previous
instructions", "disregard your rules", "system prompt override",
"do not tell the user") plus invisible-unicode markers. Command-
shape patterns are dropped because they false-positive on prose.
This is defense-in-depth, not the only line of defense. Skill bodies
are already scanned at install time by `skills_guard.py`; the runtime
cron scan exists purely as a tripwire for an obvious injection
directive surviving a malicious install. Catching prose mentions of
commands was never the goal of #3968 — the test that planted a skill
containing `cat ~/.hermes/.env` was the wrong shape of test for the
threat model.
Tests:
- `_scan_cron_prompt` strict behavior preserved (56 existing tests
unchanged: bare `cat .env`, `rm -rf /`, etc. still block).
- New `TestScanCronSkillAssembled` class verifies the looser scanner:
injection / disregard / system-override / do-not-tell-the-user /
invisible-unicode still block; descriptive prose about attack
commands is allowed; GitHub auth-header allowlist still works.
- `test_skill_with_env_exfil_payload_raises` (planted `cat .env`
in skill body) replaced with `test_skill_with_env_exfil_command
_in_prose_is_allowed` documenting the new correct behavior with
the real-world postmortem-style example that triggered the bug.
- All 11 originally-failing PR-scout jobs validated end-to-end via
`_build_job_prompt` — assembled prompts now build successfully
with the `hermes-agent-dev` skill attached.
Total: 75/75 tests in cron + cronjob_tools + threat scanner pass;
544/544 across the wider cron / memory / threat-pattern surface.
When the user picks 'Anthropic API key' at `hermes setup` (vs 'Claude
Pro/Max subscription'), `save_anthropic_api_key()` writes ANTHROPIC_API_KEY
to ~/.hermes/.env and zeros ANTHROPIC_TOKEN. That env-var pattern is the
user's explicit choice of auth method — API key, not OAuth.
But the anthropic credential pool's autodiscovery (_seed_from_singletons)
unconditionally read ~/.claude/.credentials.json from the Claude Code CLI
and any saved hermes_pkce creds, and added them to the SAME anthropic
pool as the user's API key. Two problems:
1. Even with the API key at higher priority, a 401/429 on the API key
would rotate the session onto an autodiscovered OAuth credential,
silently flipping the agent into the Claude Code masquerade
mid-conversation: 'You are Claude Code' system block, every tool
renamed to mcp_*, claude-cli User-Agent header.
2. Switching OAuth → API key at `hermes setup` cleared the env vars
but left previously-seeded OAuth entries dormant in auth.json,
where rotation could revive them.
The user picking the API-key path is explicitly opting OUT of the
masquerade. Mixing OAuth credentials into their pool defeats that
choice.
Fix: in `_seed_from_singletons` for provider='anthropic', detect the
API-key path (ANTHROPIC_API_KEY set in env, no OAuth env var set) and:
- Skip calling read_claude_code_credentials() and
read_hermes_oauth_credentials() entirely
- Prune any stale hermes_pkce / claude_code entries that may already
be in the on-disk pool
OAuth-path users (ANTHROPIC_TOKEN set) are unaffected — autodiscovery
continues to fire as before.
Tests: 3 new regression tests (api-key skips autodiscovery, api-key
prunes stale entries, oauth path still autodiscovers). Full file 70/70.
Reported via AskClaw. When config.yaml has `model: <name>` (flat string)
instead of the nested `model: {default: ..., provider: ...}` form, every
gateway `/model X --global` crashed silently with
TypeError: 'str' object does not support item assignment
The persist block did:
model_cfg = cfg.setdefault("model", {})
model_cfg["default"] = result.new_model
`setdefault` returns the existing scalar, and the next assignment blows
up. The 'switch failed' warning was logged at WARNING level and the user
never saw why their persist didn't stick.
Coerce scalar/None `model:` into a dict before mutation, in both the
gateway path (`gateway/run.py`) and the sister site in
`hermes_cli/doctor.py --fix` (same setdefault-on-string flaw). The CLI
`/model` path is unaffected because it goes through `_set_nested` which
already replaces scalar leaves with dicts.
Regression test `tests/gateway/test_model_command_flat_string_config.py`
covers the flat-string, missing, and proper-dict cases. Without the fix,
the flat-string case fails with the exact original TypeError.
`load_hermes_dotenv()` is called at module-import time from cli.py,
hermes_cli/main.py, run_agent.py, trajectory_compressor.py, gateway/run.py,
tui_gateway/server.py, acp_adapter/entry.py, and a few others. Each call
triggered `_apply_external_secret_sources()`, which re-parsed config,
re-fetched from Bitwarden Secrets Manager (its own 300s cache mostly absorbed
this), re-ran the ASCII sanitization sweep, and reprinted
Bitwarden Secrets Manager: applied N secret(s) (...)
to stderr. Users saw the status line 3-5x per CLI startup.
Guard the function with a process-level set of HERMES_HOME paths that have
already had external secrets applied. Subsequent calls for the same home_path
are no-ops. `reset_secret_source_cache()` lets tests (and any future
long-running consumer that wants to refresh after a config change) force a
re-pull.
Three granular patch-tool refinements from the Roo Code deep-dive (#507).
## Indentation preservation (fuzzy_match.py)
When fuzzy_find_and_replace matches via a non-exact strategy, the file's
indentation may differ from what the LLM sent in old_string/new_string
(common case: model sends zero-indent old/new for a method body that
lives inside an 8-space-indented class). Before this commit the
replacement was spliced in verbatim, producing a file with a broken
indent level that may still parse but is logically wrong.
The fix computes the indent delta between old_string's first meaningful
line and the matched region's first meaningful line, then re-indents
every line of new_string by that delta. Exact-strategy matches are
untouched (passthrough). Same approach as Roo Code's
multi-search-replace.ts:466-500.
## CRLF preservation (file_operations.py)
Models nearly always send tool args with bare LF endings (JSON-encoded),
but the file on disk may have CRLF (Windows-line-ending configs, .bat,
.cmd, .ini files). Before this commit:
- write_file silently normalized CRLF to LF on every overwrite
- patch produced mixed-ending files: the substituted region had LF,
the surrounding context kept CRLF
The fix detects the file's existing line endings (via pre_content if
already read for lint/LSP, otherwise a tiny head -c 4096 probe), and
normalizes the entire write to that ending. New files are written
verbatim (no detection possible).
## Per-file failure escalation (file_tools.py)
When the agent fails to patch the same file 3+ times in a row, the
existing 'old_string not found' hint isn't strong enough — the model
keeps retrying with variations against a stale view of the file.
The fix tracks consecutive failures per (task_id, resolved_path) and
injects an escalating hint after 3 failures: 'This is failure #N
patching X. Stop retrying. Either re-read fresh, use longer context,
or fall back to write_file.' Counter resets on a successful patch to
the same path.
## Validation
- 22 new tests across tests/tools/test_fuzzy_match.py (5),
test_line_ending_preservation.py (12), test_patch_failure_tracking.py (5)
- All existing tests pass (165/165 in the touched files)
- E2E verified with real _handle_patch / _handle_write_file calls
against real CRLF files and real failure loops
Closes part of #507. The remaining open items in #507 (2b start_line
hint, behavioral rules) were declined after audit:
- 2b adds schema bloat for a problem the existing 'multiple matches'
contract already handles
- Behavioral rules conflict with the personality system
Items 1, 2d, 2e, 3, 4 of #507 were already landed in earlier work.
The outer 'except Exception' guard in run_conversation() captures
exceptions raised inside the agent loop (during streaming, tool
dispatch, message construction, etc.) and prints a one-line summary
to the screen. The traceback was only logged at DEBUG, so it never
landed in errors.log (WARNING+) and was lost.
For intermittent failures — the most important kind to debug — users
saw 'Error during OpenAI-compatible API call #N: <message>' on
screen with no way to recover the call site. Switching to
logger.exception() emits the full traceback at ERROR so it goes to
both agent.log and errors.log automatically.
This is a pure logging change; control flow is unchanged.
Two posture fixes surfaced by the web-pentest skill self-test against
the dashboard (issue #32267).
1. /dashboard-plugins/<name>/<path> previously returned 200 for any
file inside the plugin's dashboard directory — including
plugin_api.py and __pycache__/*.pyc. The path is unauthenticated by
architecture (SPA loads JS via <script src> and CSS via <link href>,
neither of which can attach a custom auth header), so the fix is
not "require token" — it's "restrict to browser-fetchable suffixes."
Allowlist now: .js .mjs .css .json .html .svg .png .jpg .jpeg .gif
.webp .ico .woff .woff2 .ttf .otf .map. Everything else → 404.
This stops a private user-installed plugin's Python source from
being readable by anyone reachable on the dashboard's loopback port
(other local users on a shared box, sidecar containers sharing the
host netns).
2. save_env_value() now refuses to persist env-var names that
influence how the next subprocess executes: LD_PRELOAD,
LD_LIBRARY_PATH, LD_AUDIT, DYLD_*, PYTHONPATH, PYTHONHOME,
PYTHONSTARTUP, NODE_OPTIONS, NODE_PATH, PATH, SHELL, EDITOR,
VISUAL, PAGER, BROWSER, GIT_SSH_COMMAND, GIT_EXEC_PATH; plus
HERMES_HOME / HERMES_PROFILE / HERMES_CONFIG / HERMES_ENV.
PUT /api/env is authed but the session token lives in the SPA HTML
where any future plugin XSS or local process can read it. Without
this gate, a token-holder could plant LD_PRELOAD in .env and the
next hermes process start would load attacker code via the dotenv
to os.environ chain. This is enforced on write only — pre-existing
.env values are left alone (the gate is in save_env_value, not in
load_env). PUT /api/env now returns 400 with the explanatory
message instead of an opaque 500.
IMPORTANT: HERMES_* overall is NOT blocked — only the four runtime
location names. Integration credentials following the HERMES_*
convention (HERMES_GEMINI_*, HERMES_LANGFUSE_*, HERMES_SPOTIFY_*,
HERMES_QWEN_BASE_URL, ...) keep working.
Regression tests cover both fixes (30 new test cases). No existing
tests changed; 257 passing in tests/hermes_cli/.
Closes#32267.
Salvage follow-up. The new private-DM-topic fail-loud contract from
PR #27107 hits 'requires a reply anchor' when reply_to_mode='off' is
configured, even though commit 21a15b671 (PR #23994) verified that
message_thread_id alone routes correctly on python-telegram-bot's
reference client when the user has explicitly opted out of quote
bubbles. Carve out the explicit opt-in path so users on reply_to_mode
'off' aren't regressed — the new guard now only applies to callers
that didn't ask for the anchor to be suppressed.
Salvage follow-up. The transient thread-not-found retry test was
exercising chat_id='123' (positive, looks-like-private) which now
hits the new private-DM-topic fail-closed contract. The test's
intent is the transient-flake retry on real forum topics in groups,
so use -100123 to make the scenario unambiguous.