hermes-agent/tests/hermes_cli/test_service_manager.py
Ben 4f416fc40c fix(docker): make s6 lifecycle work for the unprivileged hermes user
Resolves the explicit "Known follow-up" left by commit 2f8ceeab9 and
the resulting CI failures in tests/docker/test_dashboard.py and
tests/docker/test_s6_profile_gateway_integration.py.

The product gap
---------------
Every hermes runtime operation inside the container runs as the
hermes user (UID 10000) via s6-setuidgid. But s6-supervise — spawned
by s6-svscan running as PID 1 — creates each service's supervise/
and top-level event/ directories with mode 0700 owned by its
effective UID (root). That left every s6-svc / s6-svstat / s6-svwait
call from hermes hitting EACCES on the supervise/control FIFO and
supervise/status — i.e. the entire S6ServiceManager lifecycle
(register, start, stop, unregister) was inert in production.

The 2f8ceeab9 commit message called this out and deferred the fix.
The audit changes that landed alongside it (defaulting docker_exec
to -u hermes) made the integration tests reproduce the bug
deterministically; the fix below resolves it.

The fix: pre-create the supervise/ skeleton hermes-owned
----------------------------------------------------------
Reading s6's source (src/supervision/s6-supervise.c::trymkdir +
control_init), the mkdir and mkfifo calls that build the supervise
tree are EEXIST-safe: if the directory or FIFO is already present,
s6-supervise reuses it and skips the chown/chmod fix-up that would
normally make event/ 03730 root:root. So if we lay the skeleton
down with hermes ownership before triggering s6-svscanctl -a,
s6-supervise inherits our layout and never touches it. The
death_tally / lock / status regular files written later by
s6-supervise (still as root) land mode 0644 — world-readable —
which is all s6-svstat needs.

New module-level helper _seed_supervise_skeleton(svc_dir) in
hermes_cli/service_manager.py lays down:
  svc_dir/event/                       hermes:hermes 03730
  svc_dir/supervise/                   hermes:hermes 0755
  svc_dir/supervise/event/             hermes:hermes 03730
  svc_dir/supervise/control            hermes:hermes 0660 (FIFO)
  svc_dir/log/event/                   hermes:hermes 03730  (if log/ present)
  svc_dir/log/supervise/               hermes:hermes 0755
  svc_dir/log/supervise/event/         hermes:hermes 03730
  svc_dir/log/supervise/control        hermes:hermes 0660 (FIFO)

The log/ branch matters because the logger is a second
s6-supervise instance — without it, unregister rmtree races on
the logger's root-owned supervise dir even after the parent
slot's supervise/ is hermes-owned. The helper is idempotent and
swallows PermissionError on chown so it works equally well when
called from root (cont-init.d) or hermes (runtime register).

Wiring
------
1. S6ServiceManager.register_profile_gateway calls
   _seed_supervise_skeleton(tmp_dir) just before publishing the
   slot via Path.replace. Runtime-registered profile gateways are
   set up by hermes.

2. container_boot._register_service does the same in the cont-init.d
   reconciliation path so boot-time-restored profile slots inherit
   the same layout.

3. New cont-init.d/015-supervise-perms script chowns the supervise/
   and event/ trees for STATIC s6-rc services (dashboard,
   main-hermes). These are spawned by s6-rc before cont-init.d
   gets to run, so the EEXIST-trick doesn't apply; we chown the
   already-existing tree instead. s6-supervise keeps using the
   same files; it never re-asserts ownership on a running service.
   The script skips s6-overlay internal services (s6rc-*,
   s6-linux-*) so the supervision tree itself stays root-only.
   015- slot is intentional: lex-sorts between 01-hermes-setup
   and 02-reconcile-profiles in the container's C-locale, so
   the chown finishes before the reconciler walks the scandir.

Unregister teardown reordering
------------------------------
S6ServiceManager.unregister_profile_gateway now fires
s6-svscanctl -an BEFORE rmtree (with a 200ms grace), so
s6-svscan reaps the supervise child and releases its file
handles on supervise/lock + supervise/status before we try to
remove the directory. Previously rmtree raced s6-supervise on a
set of files inside the supervise dir, and even with the parent
supervise/ now hermes-owned, the contained files (death_tally,
lock, status, written by root) could still be in use.

Dashboard down-state redesign
-----------------------------
The original PR #30136 review fix wrote a 'down' marker file
into /run/service/dashboard/ via cont-init.d/03-dashboard-toggle.
That approach was broken in two ways:

  (a) /run/service/dashboard is a symlink to a TRANSIENT
      /run/s6-rc:s6-rc-init:<tmpdir>/ directory while s6-rc is
      mid-transaction; the touch landed in a soon-to-be-discarded
      tmp.

  (b) Even when written to the final /run/s6-rc/servicedirs/
      location, the 'down' file is only consulted by s6-supervise
      at slot startup. s6-rc's user-bundle explicitly transitions
      'dashboard' to 'up' on every boot, overriding any down
      marker.

The right fix is the canonical s6 pattern: when HERMES_DASHBOARD
is unset, the dashboard run script exits 0 and a companion
finish script exits 125. Per s6-supervise(8), exit code 125 from
the finish script is the 'permanent failure, do not restart'
marker — equivalent to s6-svc -O. The slot reports as 'down' to
s6-svstat, matching the reality that no dashboard process is
running. When HERMES_DASHBOARD IS truthy, finish exits 0 and
restart-on-crash semantics apply.

03-dashboard-toggle is removed (its function is now subsumed by
the run/finish pair).

Tests
-----
Adds four unit tests for _seed_supervise_skeleton covering the
produced layout, the log/ subservice case, the skip-when-no-log
case, and idempotency. The live-container verification continues
to live in tests/docker/test_s6_profile_gateway_integration.py and
tests/docker/test_dashboard.py — both now pass against the
rebuilt image.

References
----------
* Skarnet skaware mailing list 2020-02-02 (Laurent Bercot
  + Guillermo Diaz Hartusch) on unprivileged s6 tool semantics:
  http://skarnet.org/lists/skaware/1424.html
* just-containers/s6-overlay#130 — same EEXIST-preseed pattern,
  community-validated 2016 onward
* https://skarnet.org/software/s6/servicedir.html — exit-code 125
  semantics in finish scripts

(cherry picked from commit c41f908ad4)
2026-05-25 12:23:23 +10:00

793 lines
28 KiB
Python

"""Tests for hermes_cli.service_manager — the abstract ServiceManager
protocol, the detect_service_manager() entry point, and the host-side
adapter wrappers (Systemd / Launchd / Windows).
The s6 backend is added in Phase 3; its tests live alongside the
implementation in this same file once that phase ships.
"""
from __future__ import annotations
import pytest
from hermes_cli.service_manager import (
LaunchdServiceManager,
S6ServiceManager,
ServiceManager,
ServiceManagerKind,
SystemdServiceManager,
WindowsServiceManager,
detect_service_manager,
get_service_manager,
validate_profile_name,
)
# ---------------------------------------------------------------------------
# validate_profile_name
# ---------------------------------------------------------------------------
def test_validate_profile_name_accepts_valid_names() -> None:
# Smoke: known-good names should not raise.
validate_profile_name("coder")
validate_profile_name("my-profile")
validate_profile_name("assistant_v2")
validate_profile_name("a")
validate_profile_name("0")
validate_profile_name("0abc")
@pytest.mark.parametrize(
"bad",
[
"", # empty
"Coder", # uppercase
"foo/bar", # path traversal
"../escape", # path traversal
"-leading-dash", # leading dash (s6 reads as a flag)
"_leading_underscore", # leading underscore
"name with spaces", # whitespace
"name.with.dots", # punctuation
"a" * 252, # too long
],
)
def test_validate_profile_name_rejects_invalid(bad: str) -> None:
with pytest.raises(ValueError):
validate_profile_name(bad)
# ---------------------------------------------------------------------------
# detect_service_manager
# ---------------------------------------------------------------------------
def test_detect_service_manager_returns_known_value() -> None:
"""Without mocking, the function must still return one of the
advertised literals — anything else means a new platform branch
was added without updating ServiceManagerKind."""
result = detect_service_manager()
assert result in ("systemd", "launchd", "windows", "s6", "none")
# ---------------------------------------------------------------------------
# _s6_running — must work for unprivileged users, not just root
# ---------------------------------------------------------------------------
def _patch_s6_paths(
monkeypatch: pytest.MonkeyPatch,
*,
comm: str | OSError | None,
basedir_is_dir: bool,
) -> None:
"""Stub /proc/1/comm and /run/s6/basedir for _s6_running tests."""
from pathlib import Path as _Path
real_read_text = _Path.read_text
real_is_dir = _Path.is_dir
def fake_read_text(self, *args, **kwargs): # type: ignore[override]
if str(self) == "/proc/1/comm":
if isinstance(comm, OSError):
raise comm
if comm is None:
raise FileNotFoundError(2, "No such file or directory")
return comm + "\n"
return real_read_text(self, *args, **kwargs)
def fake_is_dir(self): # type: ignore[override]
if str(self) == "/run/s6/basedir":
return basedir_is_dir
return real_is_dir(self)
monkeypatch.setattr(_Path, "read_text", fake_read_text)
monkeypatch.setattr(_Path, "is_dir", fake_is_dir)
def test_s6_running_true_when_comm_and_basedir_match(
monkeypatch: pytest.MonkeyPatch,
) -> None:
from hermes_cli.service_manager import _s6_running
_patch_s6_paths(monkeypatch, comm="s6-svscan", basedir_is_dir=True)
assert _s6_running() is True
def test_s6_running_false_when_comm_is_wrong(
monkeypatch: pytest.MonkeyPatch,
) -> None:
from hermes_cli.service_manager import _s6_running
# systemd as PID 1, basedir present from some stray s6 install
_patch_s6_paths(monkeypatch, comm="systemd", basedir_is_dir=True)
assert _s6_running() is False
def test_s6_running_false_when_basedir_missing(
monkeypatch: pytest.MonkeyPatch,
) -> None:
from hermes_cli.service_manager import _s6_running
# The comm matches but the basedir is missing — e.g. an unrelated
# process happens to be named "s6-svscan"
_patch_s6_paths(monkeypatch, comm="s6-svscan", basedir_is_dir=False)
assert _s6_running() is False
def test_s6_running_false_when_comm_unreadable(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""Regression: /proc/1/exe was unreadable to UID 10000 and
resolve() silently returned the unresolved path, making detection
always-False inside the container under the hermes user. The new
probe must FAIL CLOSED — not raise — when /proc/1/comm can't be
read.
"""
from hermes_cli.service_manager import _s6_running
_patch_s6_paths(
monkeypatch,
comm=PermissionError(13, "Permission denied"),
basedir_is_dir=True,
)
assert _s6_running() is False
def test_s6_running_handles_missing_proc(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""On macOS / Windows / WSL-without-procfs, /proc/1/comm doesn't
exist. Must return False, not raise."""
from hermes_cli.service_manager import _s6_running
_patch_s6_paths(monkeypatch, comm=None, basedir_is_dir=False)
assert _s6_running() is False
# ---------------------------------------------------------------------------
# Backend wrappers — kind + registration unsupported on hosts
# ---------------------------------------------------------------------------
def test_systemd_manager_kind_and_registration_unsupported() -> None:
mgr = SystemdServiceManager()
assert mgr.kind == "systemd"
assert mgr.supports_runtime_registration() is False
with pytest.raises(NotImplementedError):
mgr.register_profile_gateway("foo")
with pytest.raises(NotImplementedError):
mgr.unregister_profile_gateway("foo")
assert mgr.list_profile_gateways() == []
# Protocol conformance — runtime_checkable lets us assert this.
assert isinstance(mgr, ServiceManager)
def test_launchd_manager_kind_and_registration_unsupported() -> None:
mgr = LaunchdServiceManager()
assert mgr.kind == "launchd"
assert mgr.supports_runtime_registration() is False
with pytest.raises(NotImplementedError):
mgr.register_profile_gateway("foo")
assert mgr.list_profile_gateways() == []
assert isinstance(mgr, ServiceManager)
def test_windows_manager_kind_and_registration_unsupported() -> None:
mgr = WindowsServiceManager()
assert mgr.kind == "windows"
assert mgr.supports_runtime_registration() is False
with pytest.raises(NotImplementedError):
mgr.register_profile_gateway("foo")
assert isinstance(mgr, ServiceManager)
# ---------------------------------------------------------------------------
# Lifecycle delegation — wrappers must call through to module-level fns
# ---------------------------------------------------------------------------
def test_systemd_manager_lifecycle_delegates(monkeypatch: pytest.MonkeyPatch) -> None:
called: list[str] = []
monkeypatch.setattr(
"hermes_cli.gateway.systemd_start", lambda: called.append("start"),
)
monkeypatch.setattr(
"hermes_cli.gateway.systemd_stop", lambda: called.append("stop"),
)
monkeypatch.setattr(
"hermes_cli.gateway.systemd_restart", lambda: called.append("restart"),
)
monkeypatch.setattr(
"hermes_cli.gateway._probe_systemd_service_running",
lambda *a, **kw: (False, True),
)
mgr = SystemdServiceManager()
mgr.start("ignored")
mgr.stop("ignored")
mgr.restart("ignored")
assert called == ["start", "stop", "restart"]
assert mgr.is_running("ignored") is True
def test_launchd_manager_lifecycle_delegates(monkeypatch: pytest.MonkeyPatch) -> None:
called: list[str] = []
monkeypatch.setattr(
"hermes_cli.gateway.launchd_start", lambda: called.append("start"),
)
monkeypatch.setattr(
"hermes_cli.gateway.launchd_stop", lambda: called.append("stop"),
)
monkeypatch.setattr(
"hermes_cli.gateway.launchd_restart", lambda: called.append("restart"),
)
monkeypatch.setattr(
"hermes_cli.gateway._probe_launchd_service_running", lambda: False,
)
mgr = LaunchdServiceManager()
mgr.start("ignored")
mgr.stop("ignored")
mgr.restart("ignored")
assert called == ["start", "stop", "restart"]
assert mgr.is_running("ignored") is False
def test_windows_manager_lifecycle_delegates(monkeypatch: pytest.MonkeyPatch) -> None:
called: list[str] = []
# Force-import the submodule so monkeypatch's attribute lookup
# against the `hermes_cli` package succeeds — gateway_windows is
# imported lazily inside the wrapper and may not yet be loaded.
import hermes_cli.gateway_windows # noqa: F401
class _FakeWindowsModule:
@staticmethod
def start() -> None: called.append("start")
@staticmethod
def stop() -> None: called.append("stop")
@staticmethod
def restart() -> None: called.append("restart")
@staticmethod
def is_installed() -> bool: return True
monkeypatch.setattr("hermes_cli.gateway_windows", _FakeWindowsModule)
monkeypatch.setattr(
"hermes_cli.gateway.find_gateway_pids",
lambda **kw: [12345],
)
mgr = WindowsServiceManager()
mgr.start("ignored")
mgr.stop("ignored")
mgr.restart("ignored")
assert called == ["start", "stop", "restart"]
assert mgr.is_running("ignored") is True
def test_windows_manager_is_running_false_when_not_installed(
monkeypatch: pytest.MonkeyPatch,
) -> None:
import hermes_cli.gateway_windows # noqa: F401
class _FakeWindowsModule:
@staticmethod
def is_installed() -> bool: return False
monkeypatch.setattr("hermes_cli.gateway_windows", _FakeWindowsModule)
monkeypatch.setattr(
"hermes_cli.gateway.find_gateway_pids",
lambda **kw: [12345], # PIDs would otherwise vote "running"
)
assert WindowsServiceManager().is_running("ignored") is False
def test_windows_manager_install_forwards_kwargs(monkeypatch: pytest.MonkeyPatch) -> None:
captured: dict[str, object] = {}
import hermes_cli.gateway_windows # noqa: F401
class _FakeWindowsModule:
@staticmethod
def install(*, force, start_now, start_on_login, elevated_handoff) -> None:
captured["force"] = force
captured["start_now"] = start_now
captured["start_on_login"] = start_on_login
captured["elevated_handoff"] = elevated_handoff
monkeypatch.setattr("hermes_cli.gateway_windows", _FakeWindowsModule)
WindowsServiceManager().install(
force=True, start_now=True, start_on_login=False, elevated_handoff=True,
)
assert captured == {
"force": True,
"start_now": True,
"start_on_login": False,
"elevated_handoff": True,
}
# ---------------------------------------------------------------------------
# get_service_manager factory
# ---------------------------------------------------------------------------
@pytest.mark.parametrize(
"kind,cls",
[
("systemd", SystemdServiceManager),
("launchd", LaunchdServiceManager),
("windows", WindowsServiceManager),
],
)
def test_get_service_manager_returns_correct_backend(
monkeypatch: pytest.MonkeyPatch,
kind: ServiceManagerKind,
cls: type,
) -> None:
monkeypatch.setattr(
"hermes_cli.service_manager.detect_service_manager", lambda: kind,
)
assert isinstance(get_service_manager(), cls)
def test_get_service_manager_raises_when_unsupported(
monkeypatch: pytest.MonkeyPatch,
) -> None:
monkeypatch.setattr(
"hermes_cli.service_manager.detect_service_manager", lambda: "none",
)
with pytest.raises(RuntimeError, match="no supported service manager"):
get_service_manager()
def test_get_service_manager_returns_s6_instance(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""The s6 backend ships in Phase 3 — the factory must return an
S6ServiceManager when running inside a container."""
from hermes_cli.service_manager import S6ServiceManager
monkeypatch.setattr(
"hermes_cli.service_manager.detect_service_manager", lambda: "s6",
)
assert isinstance(get_service_manager(), S6ServiceManager)
# ---------------------------------------------------------------------------
# S6ServiceManager — unit tests against a tmp-path scandir (no real s6)
# ---------------------------------------------------------------------------
@pytest.fixture
def s6_scandir(tmp_path):
"""Empty scandir for the S6ServiceManager tests."""
d = tmp_path / "service"
d.mkdir()
return d
@pytest.fixture
def fake_subprocess_run(monkeypatch: pytest.MonkeyPatch):
"""Capture subprocess.run calls + always return success. Lets the
S6ServiceManager tests run on hosts that don't have s6-svc /
s6-svscanctl installed.
Records are normalized: leading ``/command/`` is stripped from
cmd[0] so assertions can match on the bare s6-svc / s6-svstat /
s6-svscanctl name regardless of whether the manager calls them
via absolute path or bare name."""
calls: list[list[str]] = []
def _fake(cmd, **kw):
import subprocess as _sp
seq = list(cmd) if isinstance(cmd, (list, tuple)) else [str(cmd)]
if seq and seq[0].startswith("/command/"):
seq[0] = seq[0][len("/command/"):]
calls.append(seq)
return _sp.CompletedProcess(cmd, 0, "", "")
monkeypatch.setattr("subprocess.run", _fake)
return calls
def test_s6_manager_kind_and_supports_registration() -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager()
assert mgr.kind == "s6"
assert mgr.supports_runtime_registration() is True
# ---------------------------------------------------------------------------
# _seed_supervise_skeleton — unit tests
# ---------------------------------------------------------------------------
#
# The skeleton helper pre-creates the dirs and FIFOs that s6-supervise
# would otherwise create as root mode 0700, locking out the
# unprivileged hermes user from every lifecycle op. These tests run
# against tmp_path and assert the produced layout — the live-container
# verification (against real s6-svc / s6-svstat) lives in
# tests/docker/test_s6_profile_gateway_integration.py.
def test_seed_supervise_skeleton_creates_expected_layout(tmp_path) -> None:
"""Verifies the dirs + FIFO + modes the helper lays down."""
import stat
from hermes_cli.service_manager import _seed_supervise_skeleton
svc_dir = tmp_path / "gateway-foo"
svc_dir.mkdir()
_seed_supervise_skeleton(svc_dir)
# Top-level event/ — s6-svlisten1 event subscription dir.
event = svc_dir / "event"
assert event.is_dir(), "missing top-level event/"
assert stat.S_IMODE(event.stat().st_mode) == 0o3730, (
f"event/ mode = {oct(event.stat().st_mode)}, want 03730"
)
# supervise/ dir.
supervise = svc_dir / "supervise"
assert supervise.is_dir(), "missing supervise/"
assert stat.S_IMODE(supervise.stat().st_mode) == 0o755
# supervise/event/.
supervise_event = supervise / "event"
assert supervise_event.is_dir(), "missing supervise/event/"
assert stat.S_IMODE(supervise_event.stat().st_mode) == 0o3730
# supervise/control FIFO.
control = supervise / "control"
assert control.exists(), "missing supervise/control FIFO"
assert stat.S_ISFIFO(control.stat().st_mode), (
"supervise/control must be a FIFO"
)
assert stat.S_IMODE(control.stat().st_mode) == 0o660
def test_seed_supervise_skeleton_handles_log_subservice(tmp_path) -> None:
"""When a log/ subdir exists, its supervise tree also gets seeded.
Without this, ``unregister_profile_gateway``'s rmtree would EACCES
on the logger's root-owned supervise dir even after the parent
slot's supervise/ was hermes-owned.
"""
import stat
from hermes_cli.service_manager import _seed_supervise_skeleton
svc_dir = tmp_path / "gateway-foo"
svc_dir.mkdir()
(svc_dir / "log").mkdir() # logger subdir present
_seed_supervise_skeleton(svc_dir)
# Logger's own supervise tree is seeded the same way.
log_event = svc_dir / "log" / "event"
log_supervise = svc_dir / "log" / "supervise"
log_supervise_event = log_supervise / "event"
log_control = log_supervise / "control"
assert log_event.is_dir()
assert stat.S_IMODE(log_event.stat().st_mode) == 0o3730
assert log_supervise.is_dir()
assert log_supervise_event.is_dir()
assert log_control.exists() and stat.S_ISFIFO(log_control.stat().st_mode)
def test_seed_supervise_skeleton_skips_when_no_log_subservice(tmp_path) -> None:
"""If log/ isn't present, no logger skeleton is created."""
from hermes_cli.service_manager import _seed_supervise_skeleton
svc_dir = tmp_path / "gateway-foo"
svc_dir.mkdir()
_seed_supervise_skeleton(svc_dir)
assert not (svc_dir / "log").exists(), (
"helper must not synthesize a log/ subdir on its own"
)
def test_seed_supervise_skeleton_is_idempotent(tmp_path) -> None:
"""Calling the helper twice on the same dir is a no-op the second time.
Important because s6-supervise may have already opened the FIFO
when a re-register / reconcile happens; double-creation would
error out. The helper short-circuits on existence.
"""
from hermes_cli.service_manager import _seed_supervise_skeleton
svc_dir = tmp_path / "gateway-foo"
svc_dir.mkdir()
_seed_supervise_skeleton(svc_dir)
_seed_supervise_skeleton(svc_dir) # must not raise
def test_s6_register_creates_service_dir_and_triggers_scan(
s6_scandir, fake_subprocess_run,
) -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager(scandir=s6_scandir)
mgr.register_profile_gateway("coder")
svc_dir = s6_scandir / "gateway-coder"
assert svc_dir.is_dir()
assert (svc_dir / "type").read_text().strip() == "longrun"
run_path = svc_dir / "run"
assert run_path.is_file()
assert run_path.stat().st_mode & 0o111 # executable
run_text = run_path.read_text()
assert "hermes -p coder gateway run" in run_text
assert "s6-setuidgid hermes" in run_text
log_run = svc_dir / "log" / "run"
assert log_run.is_file()
log_text = log_run.read_text()
# CRITICAL: HERMES_HOME must be a runtime env-var expansion, NOT
# a Python-substituted absolute path. Negative-assert the wrong
# form so future regressions are caught.
assert "$HERMES_HOME" in log_text
assert "logs/gateways/coder" in log_text
assert "/opt/data/logs/gateways/coder" not in log_text, (
"log_dir was hard-coded; must use ${HERMES_HOME} at run time"
)
# s6-svscanctl -a was invoked against the scandir
assert any(
cmd[0] == "s6-svscanctl" and "-a" in cmd
and str(s6_scandir) in cmd
for cmd in fake_subprocess_run
), f"s6-svscanctl -a not invoked; saw: {fake_subprocess_run}"
def test_s6_register_extra_env_is_quoted(s6_scandir, fake_subprocess_run) -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager(scandir=s6_scandir)
mgr.register_profile_gateway(
"x", extra_env={"FOO": "bar baz", "QUOTED": "a'b"},
)
run_text = (s6_scandir / "gateway-x" / "run").read_text()
# shlex.quote should have wrapped both values
assert "export FOO='bar baz'" in run_text
assert "export QUOTED='a'\"'\"'b'" in run_text
def test_s6_register_rejects_invalid_profile_name(s6_scandir) -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager(scandir=s6_scandir)
with pytest.raises(ValueError):
mgr.register_profile_gateway("Bad/Name")
def test_s6_register_rejects_duplicate(s6_scandir, fake_subprocess_run) -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager(scandir=s6_scandir)
(s6_scandir / "gateway-coder").mkdir(parents=True)
with pytest.raises(ValueError, match="already registered"):
mgr.register_profile_gateway("coder")
def test_s6_register_rolls_back_on_svscanctl_failure(
s6_scandir, monkeypatch: pytest.MonkeyPatch,
) -> None:
"""If s6-svscanctl fails the service dir must be cleaned up so the
next register call doesn't see a stale duplicate."""
import subprocess as _sp
from hermes_cli.service_manager import S6ServiceManager
def _fail_scanctl(cmd, **kw):
# Manager calls s6-svscanctl by absolute path; match on basename.
if cmd[0].endswith("/s6-svscanctl"):
return _sp.CompletedProcess(cmd, 1, "", "rescan failed")
return _sp.CompletedProcess(cmd, 0, "", "")
monkeypatch.setattr("subprocess.run", _fail_scanctl)
mgr = S6ServiceManager(scandir=s6_scandir)
with pytest.raises(RuntimeError, match="s6-svscanctl failed"):
mgr.register_profile_gateway("coder")
assert not (s6_scandir / "gateway-coder").exists()
def test_s6_unregister_removes_service_dir(
s6_scandir, fake_subprocess_run,
) -> None:
from hermes_cli.service_manager import S6ServiceManager
svc_dir = s6_scandir / "gateway-coder"
svc_dir.mkdir(parents=True)
(svc_dir / "type").write_text("longrun\n")
mgr = S6ServiceManager(scandir=s6_scandir)
mgr.unregister_profile_gateway("coder")
# s6-svc -d was issued
assert any(
cmd[0] == "s6-svc" and "-d" in cmd
for cmd in fake_subprocess_run
)
# Service dir was removed
assert not svc_dir.exists()
# Rescan was triggered
assert any(cmd[0] == "s6-svscanctl" for cmd in fake_subprocess_run)
def test_s6_unregister_absent_profile_is_noop(s6_scandir) -> None:
from hermes_cli.service_manager import S6ServiceManager
# Should NOT raise even though "ghost" doesn't exist
S6ServiceManager(scandir=s6_scandir).unregister_profile_gateway("ghost")
def test_s6_list_profile_gateways(s6_scandir) -> None:
from hermes_cli.service_manager import S6ServiceManager
# Three gateway profiles + one unrelated service + one hidden dir
(s6_scandir / "gateway-coder").mkdir()
(s6_scandir / "gateway-assistant").mkdir()
(s6_scandir / "gateway-writer").mkdir()
(s6_scandir / "s6-linux-init-shutdownd").mkdir() # filtered out
(s6_scandir / ".lock").mkdir() # filtered out (hidden)
profiles = sorted(S6ServiceManager(scandir=s6_scandir).list_profile_gateways())
assert profiles == ["assistant", "coder", "writer"]
def test_s6_list_profile_gateways_empty_when_scandir_missing(tmp_path) -> None:
from hermes_cli.service_manager import S6ServiceManager
missing = tmp_path / "does-not-exist"
assert S6ServiceManager(scandir=missing).list_profile_gateways() == []
def test_s6_lifecycle_dispatches_to_s6_svc(
s6_scandir, fake_subprocess_run,
) -> None:
from hermes_cli.service_manager import S6ServiceManager
mgr = S6ServiceManager(scandir=s6_scandir)
# _run_svc now verifies the slot exists before invoking s6-svc, so
# we have to pre-seed the dir. In real use the slot is created by
# register_profile_gateway or the cont-init.d reconciler.
(s6_scandir / "gateway-coder").mkdir()
mgr.start("gateway-coder")
mgr.stop("gateway-coder")
mgr.restart("gateway-coder")
flags = [c[1] for c in fake_subprocess_run if c[0] == "s6-svc"]
assert flags == ["-u", "-d", "-t"]
# ---------------------------------------------------------------------------
# Lifecycle errors — friendly messages, not raw CalledProcessError
# ---------------------------------------------------------------------------
def test_lifecycle_raises_gateway_not_registered_for_missing_slot(
s6_scandir, fake_subprocess_run,
) -> None:
"""When the service slot doesn't exist, the lifecycle methods
must raise GatewayNotRegisteredError BEFORE invoking s6-svc, so
the user sees a clear 'no such gateway' message instead of an
opaque CalledProcessError stacktrace."""
from hermes_cli.service_manager import (
GatewayNotRegisteredError,
S6ServiceManager,
)
mgr = S6ServiceManager(scandir=s6_scandir)
# No gateway-typo/ directory exists — slot is missing.
with pytest.raises(GatewayNotRegisteredError) as excinfo:
mgr.start("gateway-typo")
assert excinfo.value.profile == "typo"
assert excinfo.value.service == "gateway-typo"
msg = str(excinfo.value)
assert "'typo'" in msg
assert "hermes profile create typo" in msg
# And critically: s6-svc was NOT invoked.
assert not any(c[0] == "s6-svc" for c in fake_subprocess_run)
@pytest.mark.parametrize("action,method_name", [
("start", "start"),
("stop", "stop"),
("restart", "restart"),
])
def test_all_lifecycle_methods_check_for_missing_slot(
s6_scandir,
fake_subprocess_run,
action: str,
method_name: str,
) -> None:
"""start/stop/restart all check for missing slots the same way."""
from hermes_cli.service_manager import (
GatewayNotRegisteredError,
S6ServiceManager,
)
mgr = S6ServiceManager(scandir=s6_scandir)
with pytest.raises(GatewayNotRegisteredError):
getattr(mgr, method_name)("gateway-absent")
def test_gateway_not_registered_unprefixed_service_name(s6_scandir) -> None:
"""If the caller passes a name without the 'gateway-' prefix (the
Protocol allows arbitrary service names), the error still carries
that name verbatim as the 'profile' so error messages don't
accidentally strip user-provided text."""
from hermes_cli.service_manager import (
GatewayNotRegisteredError,
S6ServiceManager,
)
mgr = S6ServiceManager(scandir=s6_scandir)
with pytest.raises(GatewayNotRegisteredError) as excinfo:
mgr.start("not-prefixed")
assert excinfo.value.profile == "not-prefixed"
def test_lifecycle_raises_s6_command_error_on_subprocess_failure(
s6_scandir, monkeypatch: pytest.MonkeyPatch,
) -> None:
"""When s6-svc itself fails (non-zero exit) — e.g. EACCES on the
supervise control FIFO — the lifecycle methods translate the
CalledProcessError into a named S6CommandError carrying the
return code and stderr."""
import subprocess as _sp
from hermes_cli.service_manager import S6CommandError, S6ServiceManager
# Pre-create the slot so we reach the s6-svc call.
(s6_scandir / "gateway-coder").mkdir()
def _fail(cmd, **kw):
raise _sp.CalledProcessError(
returncode=111,
cmd=cmd,
stderr="s6-svc: fatal: unable to control supervise/control: "
"Permission denied\n",
)
monkeypatch.setattr("subprocess.run", _fail)
mgr = S6ServiceManager(scandir=s6_scandir)
with pytest.raises(S6CommandError) as excinfo:
mgr.start("gateway-coder")
assert excinfo.value.service == "gateway-coder"
assert excinfo.value.action == "start"
assert excinfo.value.returncode == 111
assert "Permission denied" in excinfo.value.stderr
assert "Permission denied" in str(excinfo.value)
assert "rc=111" in str(excinfo.value)
def test_s6_is_running_parses_svstat(
s6_scandir, monkeypatch: pytest.MonkeyPatch,
) -> None:
import subprocess as _sp
from hermes_cli.service_manager import S6ServiceManager
def _svstat(cmd, **kw):
if cmd[0].endswith("/s6-svstat"):
return _sp.CompletedProcess(cmd, 0, "up (pid 42) 17 seconds\n", "")
return _sp.CompletedProcess(cmd, 0, "", "")
monkeypatch.setattr("subprocess.run", _svstat)
assert S6ServiceManager(scandir=s6_scandir).is_running("gateway-coder") is True
def _svstat_down(cmd, **kw):
if cmd[0].endswith("/s6-svstat"):
return _sp.CompletedProcess(cmd, 0, "down 5 seconds\n", "")
return _sp.CompletedProcess(cmd, 0, "", "")
monkeypatch.setattr("subprocess.run", _svstat_down)
assert S6ServiceManager(scandir=s6_scandir).is_running("gateway-coder") is False