feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop

The s6-overlay migration replaced every runtime use of gosu with
s6-setuidgid (in stage2-hook.sh, main-wrapper.sh, per-service run
scripts, and cont-init.d hooks), but the gosu binary itself was still
being copied into the image from tianon/gosu, and several comments
across the repo still pointed to it.

Image changes:
- Drop the FROM tianon/gosu:1.19-trixie AS gosu_source stage
- Drop the COPY --from=gosu_source /gosu /usr/local/bin/ layer
- Net: one fewer base-image pull, ~12-15 MB layer eliminated

Documentation/comment refresh (no behavior change):
- Dockerfile: update root-user rationale comment + cont-init.d comment
- docker/main-wrapper.sh: drop "pre-s6 contract (gosu drop)" reference
- docker-compose.yml: update UID/GID remap comment
- .hadolint.yaml: update DL3002 ignore rationale
- website/docs/user-guide/docker.md: privilege-drop helper is s6-setuidgid now
- hermes_cli/config.py: docker_run_as_host_user docstring

tools/environments/docker.py runs *arbitrary user images* via the
terminal backend, not the bundled Hermes image. It still needs SETUID/
SETGID caps so user images that use gosu/su/s6-setuidgid all work.
Renamed the cap-list constant _GOSU_CAP_ARGS → _PRIVDROP_CAP_ARGS and
updated comments to list s6-setuidgid alongside the others as examples.
The matching test (test_security_args_include_setuid_setgid_for_gosu_drop
→ test_security_args_include_setuid_setgid_for_privdrop) was renamed
and its docstring updated; behavior is unchanged.

Verification:
- hadolint clean against .hadolint.yaml
- shellcheck clean against all docker/ shell scripts
- Image rebuilt successfully (sha 1a090924ccea)
- Docker harness: 19 passed in 41.87s (every Phase 0 test + Phase 4
  per-profile-gateway lifecycle + container-restart reconciliation)
- tests/tools/test_docker_environment.py: 23 passed (rename did not
  break test discovery; pre-existing unrelated mock warning)

The plan document (docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md)
intentionally retains its historical references to gosu — it describes
the pre-s6 entrypoint as background for understanding the migration.
This commit is contained in:
Ben 2026-05-22 10:43:57 +10:00 committed by teknium1
parent a36221ed91
commit 4b4c36cb61
No known key found for this signature in database
8 changed files with 50 additions and 44 deletions

View file

@ -385,18 +385,19 @@ def test_normalize_env_dict_rejects_complex_values():
assert result == {"GOOD": "string"}
def test_security_args_include_setuid_setgid_for_gosu_drop(monkeypatch):
def test_security_args_include_setuid_setgid_for_privdrop(monkeypatch):
"""The default (run_as_host_user=False) invocation must include SETUID and
SETGID caps so the image entrypoint can drop from root to the non-root
`hermes` user via gosu.
SETGID caps so the image's init can drop from root to a non-root user
(e.g. via ``s6-setuidgid`` in the bundled Hermes image, or ``gosu``/``su``
in user-provided images).
Without these caps gosu exits with
``error: failed switching to 'hermes': operation not permitted``
and the container exits immediately (exit 1) before running any work.
Without these caps the privilege-drop helper fails with
``operation not permitted`` and the container exits immediately (exit 1)
before running any work.
`no-new-privileges` is kept, so gosu still cannot escalate back to root
after the drop the drop is a one-way transition performed before the
`no_new_privs` bit is enforced on the exec boundary.
``no-new-privileges`` is kept, so the dropped process still cannot
escalate back to root after the drop the drop is a one-way transition
performed before the ``no_new_privs`` bit is enforced on the exec boundary.
"""
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
calls = _mock_subprocess_run(monkeypatch)
@ -412,8 +413,8 @@ def test_security_args_include_setuid_setgid_for_gosu_drop(monkeypatch):
for i, flag in enumerate(run_args[:-1])
if flag == "--cap-add"
}
assert "SETUID" in added, "SETUID cap missing — gosu drop in entrypoint will fail"
assert "SETGID" in added, "SETGID cap missing — gosu drop in entrypoint will fail"
assert "SETUID" in added, "SETUID cap missing — image privilege-drop will fail"
assert "SETGID" in added, "SETGID cap missing — image privilege-drop will fail"
# ── run_as_host_user tests ────────────────────────────────────────
@ -441,8 +442,9 @@ def test_run_as_host_user_passes_uid_gid(monkeypatch):
def test_run_as_host_user_drops_setuid_setgid_caps(monkeypatch):
"""When --user is passed, the container never needs gosu, so SETUID/SETGID
caps are omitted for a tighter security posture."""
"""When --user is passed, the container already starts unprivileged and
never needs a privilege drop, so SETUID/SETGID caps are omitted for a
tighter security posture."""
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
monkeypatch.setattr(docker_env.os, "getuid", lambda: 1000, raising=False)
monkeypatch.setattr(docker_env.os, "getgid", lambda: 1000, raising=False)
@ -459,10 +461,10 @@ def test_run_as_host_user_drops_setuid_setgid_caps(monkeypatch):
if flag == "--cap-add"
}
assert "SETUID" not in added, (
"SETUID cap should be dropped when running as host user — no gosu drop is needed"
"SETUID cap should be dropped when running as host user — no privilege drop is needed"
)
assert "SETGID" not in added, (
"SETGID cap should be dropped when running as host user — no gosu drop is needed"
"SETGID cap should be dropped when running as host user — no privilege drop is needed"
)
# Core non-privilege-drop caps must still be there (pip/npm/apt need them).
assert "DAC_OVERRIDE" in added