fix(cli): add missing subprocess.run() timeouts in gateway CLI (#5424)

All 35 subprocess.run() calls in hermes_cli/gateway.py lacked timeout
parameters. If systemctl, launchctl, loginctl, wmic, or ps blocks,
hermes gateway start/stop/restart/status/install/uninstall hangs
indefinitely with no feedback.

Timeouts tiered by operation type:
- 10s: instant queries (is-active, status, list, ps, tail, journalctl)
- 30s: fast lifecycle (daemon-reload, enable, start, bootstrap, kickstart)
- 90s: graceful shutdown (stop, restart, bootout, kickstart -k) — exceeds
  our TimeoutStopSec=60 to avoid premature timeout during shutdown

Special handling: _is_service_running() and launchd_status() catch
TimeoutExpired and treat it as not-running/not-loaded, consistent with
how non-zero return codes are already handled.

Inspired by PR #3732 (dlkakbs) and issue #4057 (SHL0MS).
Reimplemented on current main which has significantly changed launchctl
handling (bootout/bootstrap/kickstart vs legacy load/unload/start/stop).
This commit is contained in:
Teknium 2026-04-05 22:41:42 -07:00 committed by GitHub
parent ab086a320b
commit 786970925e
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 75 additions and 54 deletions

View file

@ -40,7 +40,7 @@ def test_systemd_status_warns_when_linger_disabled(monkeypatch, tmp_path, capsys
monkeypatch.setattr(gateway, "get_systemd_unit_path", lambda system=False: unit_path)
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (False, ""))
def fake_run(cmd, capture_output=False, text=False, check=False):
def fake_run(cmd, capture_output=False, text=False, check=False, **kwargs):
if cmd[:4] == ["systemctl", "--user", "status", gateway.get_service_name()]:
return SimpleNamespace(returncode=0, stdout="", stderr="")
if cmd[:3] == ["systemctl", "--user", "is-active"]: