mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-03 07:21:54 +00:00
fix(docker): drop docker exec to hermes uid before invoking the CLI
When operators ran `docker exec <c> hermes login` (or anything else that wrote under $HERMES_HOME) they defaulted to root, leaving /opt/data/auth.json root:root mode 0600. The supervised gateway (UID 10000) then couldn't read its own credentials and returned "Provider authentication failed: Hermes is not logged into Nous Portal" on every Telegram/Discord/etc. message — even though `docker exec <c> hermes chat -q ping` (also root) succeeded because root could read its own root-owned file. _load_auth_store swallowed PermissionError as a parse failure and copied the file aside as auth.json.corrupt, making the diagnostic more misleading. Fix: install a privilege-drop shim at /opt/hermes/bin/hermes, prepended ahead of the venv on PATH. When invoked as root the shim exec's the real venv binary via `s6-setuidgid hermes` — so any file the docker-exec session writes is uid-aligned with the supervised processes. Non-root callers (the supervised processes themselves, `docker exec --user hermes`, kanban subagents, anything inside the container that's not coming through docker-exec) hit a single exec to the absolute venv path with no privilege change. Recursion is impossible: the shim exec's the venv binary by absolute path (/opt/hermes/.venv/bin/hermes), so the second hop cannot re-enter the shim regardless of PATH state. No sentinel env var needed (unlike #33583's gateway-run redirect which DOES need HERMES_S6_SUPERVISED_CHILD because there's no absolute-path equivalent for the s6 dispatch). Opt-out: `docker exec -e HERMES_DOCKER_EXEC_AS_ROOT=1 …` for diagnostic sessions where the operator deliberately wants root. Strict truthiness (1/true/yes case-insensitive); typos like `=0` do not silently opt out, mirroring HERMES_GATEWAY_NO_SUPERVISE in #33583. If `s6-setuidgid` is missing (someone stripped s6-overlay in a downstream fork), the shim exits 126 with a remediation message pointing at `--user hermes` and the opt-out — never silently runs as root. Test plan: - tests/docker/test_docker_exec_privilege_drop.py — 11 tests - shim drops root to hermes uid (file ownership check) - shim short-circuits for non-root docker exec - HERMES_DOCKER_EXEC_AS_ROOT=1 keeps root - strict-truthiness parametrization (5 falsy values reject) - main CMD path unaffected (recursion guard) - E2E: every file written by docker-exec is readable by uid 10000 - Full tests/docker/ harness: 32/32 pass against fresh image build - shellcheck --severity=error: clean - hadolint: clean - Manual: reproduced the original symptom (root-owned auth.json) by bypassing the shim; confirmed default docker-exec produces hermes-owned files; confirmed opt-out env keeps root semantics. Known follow-up: this prevents NEW instances of the bug. Volumes that already have root:root /opt/data/auth.json from a pre-shim image need a one-time `chown hermes:hermes` before rebooting onto the new image. A stage2-hook chown sweep can self-heal that, but is deferred per scope decision.
This commit is contained in:
parent
b345323195
commit
aeb992d343
3 changed files with 397 additions and 1 deletions
21
Dockerfile
21
Dockerfile
|
|
@ -213,13 +213,32 @@ COPY --chmod=0755 docker/cont-init.d/02-reconcile-profiles /etc/cont-init.d/02-r
|
|||
# ---------- Runtime ----------
|
||||
ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
|
||||
ENV HERMES_HOME=/opt/data
|
||||
|
||||
# `docker exec` privilege-drop shim. When operators run
|
||||
# `docker exec <c> hermes ...` they default to root, and any file the
|
||||
# command writes under $HERMES_HOME (auth.json, .env, config.yaml) ends
|
||||
# up root-owned and unreadable to the supervised gateway (UID 10000).
|
||||
# The shim lives at /opt/hermes/bin/hermes, sits earliest on PATH, and
|
||||
# transparently re-exec's the real venv binary via `s6-setuidgid hermes`
|
||||
# when invoked as root. Non-root callers (supervised processes,
|
||||
# `--user hermes`, etc.) hit the short-circuit path with no overhead.
|
||||
# Recursion is impossible because the shim exec's the venv binary by
|
||||
# absolute path (/opt/hermes/.venv/bin/hermes). See the shim source for
|
||||
# the opt-out env var (HERMES_DOCKER_EXEC_AS_ROOT=1).
|
||||
COPY --chmod=0755 docker/hermes-exec-shim.sh /opt/hermes/bin/hermes
|
||||
|
||||
# Pre-s6 entrypoint.sh did `source .venv/bin/activate` which exported
|
||||
# the venv bin onto PATH; Architecture B's main-wrapper.sh does the
|
||||
# same for the container's main process, but `docker exec` and our
|
||||
# cont-init.d scripts don't pass through the wrapper. Expose the venv
|
||||
# bin globally so `docker exec <container> hermes ...` and any
|
||||
# subprocess that doesn't activate the venv first still find hermes.
|
||||
ENV PATH="/opt/hermes/.venv/bin:/opt/data/.local/bin:${PATH}"
|
||||
#
|
||||
# /opt/hermes/bin is prepended ahead of the venv so the privilege-drop
|
||||
# shim wins PATH resolution. The shim's last act is to exec the venv
|
||||
# binary by absolute path, so this PATH ordering is transparent to
|
||||
# every other consumer.
|
||||
ENV PATH="/opt/hermes/bin:/opt/hermes/.venv/bin:/opt/data/.local/bin:${PATH}"
|
||||
RUN mkdir -p /opt/data
|
||||
VOLUME [ "/opt/data" ]
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue