hermes-agent/website/docs/user-guide
Ben Barclay 0927fb5584 feat(docker): auto-redirect gateway run to supervised mode inside s6 image
Pre-s6, `docker run nousresearch/hermes-agent gateway run` was the
standard invocation: gateway ran as the container's main process,
tini reaped zombies, container exit code matched gateway exit code,
no supervision. With s6-overlay as PID 1, the same invocation now
auto-upgrades to supervised semantics — auto-restart on crash,
dashboard supervised alongside (when HERMES_DASHBOARD=1 is set),
multiple profile gateways under the same /init.

Users get the new behavior with zero changes to their docker run
command. A loud one-line breadcrumb on stderr explains the upgrade
and points at the opt-out for users who genuinely want pre-s6
foreground semantics.

How it works:

  1. `_gateway_command_inner` (the `gateway run` handler) checks if
     we're inside a container with s6 as PID 1.
  2. If yes, dispatches `start` to the s6 service manager (registers
     and starts gateway-default), then `exec sleep infinity` to keep
     the CMD process alive without binding container lifetime to
     gateway PID lifetime. The supervised gateway can flap freely;
     `docker stop` still tears everything down via /init stage 3.
  3. If no, falls through to the existing foreground code path
     unchanged. Host runs of `hermes gateway run` are unaffected.

Three gates make the redirect inert outside the intended scope:

  * `detect_service_manager() != "s6"` — host/non-s6-container runs.
  * `HERMES_S6_SUPERVISED_CHILD=1` env var (recursion guard) —
    exported by `S6ServiceManager._render_run_script` for the
    s6-supervised invocation itself. Without this guard, the
    supervised `gateway run --replace` would re-enter the redirect
    and recurse (run → start → run → start → ...) infinitely.
  * `--no-supervise` CLI flag OR `HERMES_GATEWAY_NO_SUPERVISE=1` env
    var — explicit user opt-out for CI smoke tests, debugging the
    foreground startup path, or any case wanting "CMD exit =
    container exit" semantics. Strict truthiness (1/true/yes,
    case-insensitive); typos like `=0` do NOT silently opt out.

Tests:

  * Unit tests in tests/hermes_cli/test_gateway_s6_dispatch.py
    cover all five paths (host no-op, supervised fire, sentinel
    recursion guard, CLI flag, env var truthy + falsy). The two
    load-bearing gates (sentinel + opt-out) were mutation-tested
    by removing each gate in isolation and confirming the dedicated
    test fails with the expected error.
  * Docker harness tests in tests/docker/test_gateway_run_supervised.py
    cover the round trips end-to-end against a built image: redirect
    fires (sleep-infinity heartbeat + supervised gateway-default
    slot + breadcrumb), --no-supervise opt-out (foreground gateway,
    no want-up on the slot), HERMES_GATEWAY_NO_SUPERVISE env var
    works identically, recursion is impossible (≤1 supervised
    python gateway-run + exactly 1 sleep-infinity parented to the
    CMD wrapper), and HERMES_DASHBOARD=1 produces both supervised
    gateway and supervised dashboard.

Docs:

  * Added a `:::tip Gateway runs supervised` admonition near the
    main docker.md example explaining the upgrade and pointing at
    the opt-out. Pre-s6 (tini-based) images still run gateway run
    as the foreground main process, so the note is scoped to the
    s6 image only.

Trade-off documented in the helper docstring: container exit code
under the redirect is sleep's exit code (always 0 on SIGTERM), not
the gateway's. That was an explicit design call — the supervised
gateway is allowed to flap without taking the container with it,
which is what "supervision" means. CI users who want exit-code
forwarding can pass --no-supervise.
2026-05-28 12:42:13 +10:00
..
features fix: preserve skill packages during curator consolidation 2026-05-27 13:39:58 -07:00
messaging fix(gateway): keep Telegram heartbeat + interim commentary on; edit heartbeat in place (#33187) 2026-05-27 05:21:53 -07:00
secrets feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378) 2026-05-24 02:19:57 -07:00
skills remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
checkpoints-and-rollback.md feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
cli.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
configuration.md remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
configuring-models.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
docker.md feat(docker): auto-redirect gateway run to supervised mode inside s6 image 2026-05-28 12:42:13 +10:00
git-worktrees.md docs: restructure site navigation — promote features and platforms to top-level (#4116) 2026-03-30 18:39:51 -07:00
profile-distributions.md docs(profiles): full user guide for profile distributions (#22017) 2026-05-08 11:13:45 -07:00
profiles.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
security.md remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
sessions.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
tui.md feat: add TUI session orchestrator 2026-05-26 20:51:59 -07:00
windows-native.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
windows-wsl-quickstart.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00