fix(docker): tee supervised gateway stdout to docker logs

Follow-up to #33583 (the gateway-run-supervised redirect).

Before this fix, the supervised gateway's stdout (most visibly the
"Hermes Gateway Starting…" rich-console banner) was swallowed by
`s6-log` into the rotated file at
`${HERMES_HOME}/logs/gateways/<profile>/current` and never reached
`docker logs`. Operational signal lived in two places:

  * **docker logs** — saw stderr (Python `logging` defaults to
    stderr), so warnings/errors were visible.
  * **the rotated file** — saw stdout (rich banners, `print()`
    output, third-party libs that wrote to fd 1).

This was surprising for users coming from the pre-s6 image, where
`docker run … gateway run` produced a single unified stream in
`docker logs`. They'd see partial output, conclude something was
broken, and dig around for the missing pieces.

Fix: add the `1` s6-log action directive before the file destination
so each line is forwarded to s6-log's stdout — which propagates up
the s6-supervise pipeline to /init's stdout = container stdout =
`docker logs`. The file destination is preserved as a second
destination, so the rotated log (with ISO 8601 timestamps) still
exists for `hermes logs` and for survival across container restarts.

Trade-off considered: timestamps. Putting `T` between `1` and the
file destination (not before `1`) means:

  * docker logs sees raw lines — Python's logging formatter has its
    own timestamps, and `docker logs --timestamps` adds another
    layer when desired. No double-stamping in the common reading
    path.
  * The persisted file gets s6-log's ISO 8601 timestamp so even
    output that lacked a Python-logger timestamp (rich banners,
    third-party raw prints) is correlatable in `current`.

Verification:

  * New unit-test assertion in `test_service_manager.py` locks the
    `s6-log 1` directive into the rendered run-script. Mutation-
    tested by reverting to the pre-fix script (no `1`); the assert
    catches it cleanly.
  * New docker-harness test `test_supervised_gateway_stdout_reaches_docker_logs`
    builds the image, runs `docker run … gateway run`, and asserts
    the unique `⚕` banner glyph reaches `docker logs`. Also verifies
    the rotated file still contains the banner (no regression on
    the existing file destination). Mutation-tested end-to-end: built
    a deliberately-broken image without the `1` directive and the
    test failed exactly as designed, citing the banner present in
    `current` but absent from `docker logs`.
  * `website/docs/user-guide/docker.md` gains a new `:::note Where
    gateway logs go` admonition documenting both destinations and
    the audit-log file at `${HERMES_HOME}/logs/container-boot.log`.

Existing functionality preserved: every other docker-harness test
still passes against the new image. Unit-test sweep across
`tests/hermes_cli/` (5561 tests) is green.
This commit is contained in:
Ben Barclay 2026-05-28 13:06:11 +10:00 committed by Ben Barclay
parent 912e6e2274
commit b345323195
4 changed files with 118 additions and 1 deletions

View file

@ -628,6 +628,38 @@ class S6ServiceManager:
so a container started with ``-e HERMES_HOME=/data/hermes``
gets its logs under /data/hermes/logs/..., not the build-time
default.
Output routing the script is two action directives, applied
per line, in order:
1. ``1`` (forward to stdout) propagates the line up the
s6-supervise pipeline to /init's stdout, which is the
container's stdout, which is ``docker logs``. Without
this, supervised stdout would be terminated inside
s6-log and never reach the container's log stream;
users would have to ``docker exec`` and ``tail`` the
file just to see startup banners. (Python's ``logging``
module defaults to stderr, which s6-supervise leaves
unfiltered so warnings/errors already reach docker
logs. This change is specifically about the rich-console
banner output and other plain stdout writes.)
2. ``T <log_dir>`` also write a timestamped copy to the
rotated log directory (``current`` + archived ``@*.s``
files). This is what ``hermes logs`` reads and what
persists across container restarts via the volume mount.
``T`` is non-sticky: it only prefixes lines for the next
action directive. We deliberately put ``T`` between ``1``
and the log dir (not before ``1``) so:
* ``docker logs`` shows raw lines Python's logging
formatter has its own timestamps, and ``docker logs
--timestamps`` adds a third layer when desired. No
double-stamping in the most common reading path.
* The persisted file gets s6-log's own ISO 8601 timestamp
so even output that lacked a Python-logger timestamp
(rich banners, third-party libs' raw prints) is
correlatable in ``current``.
"""
import shlex
prof = shlex.quote(profile)
@ -638,7 +670,7 @@ class S6ServiceManager:
f'log_dir="$HERMES_HOME/logs/gateways/{prof}"\n'
f'mkdir -p "$log_dir"\n'
f'chown -R hermes:hermes "$log_dir" 2>/dev/null || true\n'
f'exec s6-setuidgid hermes s6-log n10 s1000000 T "$log_dir"\n'
f'exec s6-setuidgid hermes s6-log 1 n10 s1000000 T "$log_dir"\n'
)
# -- lifecycle ---------------------------------------------------------