Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-06-26 11:12:03 +00:00)
fix(macos): clearly distinguish launchd supervision from detached fallback in gateway status
## Description
On macOS 26.x, `launchctl bootstrap` and `launchctl kickstart` return exit code 5 ("Input/output error"), which Hermes already anticipates and handles by spawning a detached fallback process. However, the gateway status reporting is ambiguous:
- `gateway status` says "Gateway service is loaded" (because `launchctl list` returns exit 0)
- But `launchctl print` shows `state = not running` — launchd isn't actually supervising anything
- The PID of the running detached fallback process is invisible to the status command
- Users can't tell whether auto-start at login and auto-restart on crash are available
### Root Cause
Two problems in `hermes_cli/gateway.py`:
1. **`_probe_launchd_service_running()`** (line 1067): Determined launchd service liveness solely from the exit code of `launchctl list <label>`. On macOS 26, this command returns 0 even when the service is only *registered* but not running (the output lacks a `"PID"` field). As a result, `GatewayRuntimeSnapshot.service_running` was incorrectly set to `True`, which suppressed the process/service mismatch warning.
2. **`launchd_status()`** (line 3569): Used the same binary "loaded/not loaded" check without inspecting whether launchd actually has a PID, whether a detached fallback is running, or whether auto-start/restart are available.
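
The difference between "registered" and "supervised" described above can be illustrated with a minimal sketch. It is not the code from this commit: the `classify` helper, the regex, and the two sample outputs are assumptions modeled on the status examples in this description, and the sketch only shows why the exit code alone cannot answer the question.

```python
import re

# Hypothetical sample outputs, modeled on the examples in this description.
REGISTERED_ONLY = '{\n\t"Label" = "ai.hermes.gateway";\n\t"OnDemand" = true;\n};'
SUPERVISED = '{\n\t"PID" = 12345;\n\t"Label" = "ai.hermes.gateway";\n};'

# Matches a `"PID" = <number>;` line, with or without quotes around PID.
_PID_RE = re.compile(r'^\s*"?PID"?\s*=\s*(-?\d+)\s*;', re.MULTILINE)


def classify(returncode: int, stdout: str) -> str:
    """Classify a `launchctl list <label>` result (illustrative helper)."""
    if returncode != 0:
        return "not loaded"
    match = _PID_RE.search(stdout)
    # Exit 0 only proves the definition is registered; a positive PID is
    # what indicates a live process under launchd.
    if match and int(match.group(1)) > 0:
        return "supervised"
    return "registered only"


print(classify(0, REGISTERED_ONLY))
print(classify(0, SUPERVISED))
```

Both sample calls pass exit code 0, yet only the second output carries a PID, which is the signal the old check ignored.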
### Changes
**`hermes_cli/gateway.py`:**
1. **New `_parse_launchd_pid_from_list_output()` helper** — Extracts the PID from `launchctl list` output. When launchd is actively supervising, the output includes `"PID" = <number>;`. When only registered but not running, no PID field is present.
2. **Fixed `_probe_launchd_service_running()`** — Now requires a PID in the `launchctl list` output to confirm launchd is actually supervising. This correctly sets `service_running = False` when launchd has the service registered but `state = not running`, which triggers the existing process/service mismatch detection.
3. **Reworked `launchd_status()`** — Reports clearly separated information:
- LaunchAgent plist currentness (stale or current)
- Whether launchd is actively supervising (with PID)
- Whether a detached fallback PID is running
- Whether auto-start at login and auto-restart on crash are available
- When launchd supervision is known to be unavailable, explains why
4. **Persistent unsupported marker** (`~/.hermes/.gateway-launchd-unsupported`) — Written when `_launchd_fallback_to_detached()` is called (launchd exit 5/125). Allows `launchd_status()` to explain *why* launchd can't supervise even when no fallback process is currently running. Cleared automatically when a future bootstrap/kickstart succeeds (e.g., after an OS update fixes the issue).
5. **Updated `_print_gateway_process_mismatch()`** — Distinguishes the managed detached fallback from a genuinely manual `nohup hermes gateway run`, providing accurate guidance for each case.
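
The marker lifecycle from item 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in `hermes_cli/gateway.py`: `start_with_marker`, the injected `run` callable, and the placeholder argv are hypothetical, and the real code calls `launchctl` directly and spawns the detached process itself.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Sequence

# Exit codes that this description treats as "launchd cannot manage the domain".
UNSUPPORTED_CODES = {5, 125}


def marker_path(home: Path) -> Path:
    return home / ".gateway-launchd-unsupported"


def start_with_marker(home: Path, run: Callable[[Sequence[str]], int]) -> str:
    """Try launchd first; record or clear the marker based on the exit code.

    `run` is a hypothetical stand-in for invoking launchctl and returning
    its exit code, injected so the sketch can run without macOS.
    """
    code = run(["launchctl", "kickstart", "<service-target>"])  # placeholder argv
    if code == 0:
        # launchd works (again): drop any marker left by an earlier fallback.
        marker_path(home).unlink(missing_ok=True)
        return "launchd"
    if code in UNSUPPORTED_CODES:
        payload = {
            "written_at": datetime.now(timezone.utc).isoformat(),
            "reason": f"launchctl exit {code}",
        }
        marker_path(home).write_text(json.dumps(payload), encoding="utf-8")
        return "fallback"
    return "error"


with tempfile.TemporaryDirectory() as tmp:
    home_dir = Path(tmp)
    print(start_with_marker(home_dir, lambda argv: 5), marker_path(home_dir).exists())
    print(start_with_marker(home_dir, lambda argv: 0), marker_path(home_dir).exists())
```

Because the marker lives on disk rather than in process state, a later `gateway status` call can still explain the missing supervision after the fallback process has exited.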
### Status Output Examples
**Before** (macOS 26, fallback active):
```
Launchd plist: ~/Library/LaunchAgents/ai.hermes.gateway.plist
✓ Service definition matches the current Hermes install
✓ Gateway service is loaded
{
"Label" = "ai.hermes.gateway";
"OnDemand" = true;
...
};
```
**After** (macOS 26, fallback active):
```
Launchd plist: ~/Library/LaunchAgents/ai.hermes.gateway.plist
✓ Service definition matches the current Hermes install
⚠ Gateway service is registered but launchd is not supervising it
launchd cannot manage the gateway on this macOS version.
✓ Detached fallback process is running (PID 12345)
Cron jobs will fire. Stop with: hermes gateway stop
⚠ Auto-start at login and auto-restart on crash are NOT available.
```
**After** (normal launchd supervision):
```
Launchd plist: ~/Library/LaunchAgents/ai.hermes.gateway.plist
✓ Service definition matches the current Hermes install
✓ Gateway is supervised by launchd (PID 12345)
Auto-start at login and auto-restart on crash are available.
```
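
The branching behind these outputs can be summarized as a small pure function. The sketch below is an approximation of the reporting logic, assuming four inputs that mirror the signals named in this description; `summarize_launchd_state` and its label strings are hypothetical and do not reproduce the exact text printed by `launchd_status()`.

```python
from typing import Optional


def summarize_launchd_state(
    service_listed: bool,
    launchd_pid: Optional[int],
    fallback_pid: Optional[int],
    unsupported_marker: bool,
) -> list:
    """Map the four observed signals to short status labels (illustrative only)."""
    # When launchd supervises the gateway, both PIDs refer to the same
    # process, so it must not also be reported as a detached fallback.
    if launchd_pid is not None and fallback_pid == launchd_pid:
        fallback_pid = None

    if not service_listed:
        labels = ["not-loaded"]
        if fallback_pid:
            labels.append(f"detached-process:{fallback_pid}")
        return labels

    if launchd_pid is not None:
        return [f"supervised:{launchd_pid}", "auto-restart:available"]

    if unsupported_marker:
        labels = ["registered-not-supervised"]
        labels.append(f"fallback:{fallback_pid}" if fallback_pid else "fallback:none")
        labels.append("auto-restart:unavailable")
        return labels

    labels = ["registered"]
    if fallback_pid:
        labels.append(f"detached-process:{fallback_pid}")
    return labels


# The "fallback active" case from the examples above.
print(summarize_launchd_state(True, None, 12345, True))
```

Keeping the decision separate from the printing makes each of the reported cases easy to cover with a table of inputs and expected labels.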
### Tests
Updated 5 existing tests and added 11 new tests in `tests/hermes_cli/test_gateway_service.py`:
- PID parsing from `launchctl list` output (with PID, without PID, empty, unquoted PID, negative PID)
- `_probe_launchd_service_running()` requires PID presence
- Unsupported-marker lifecycle (write, clear, persist across fallback)
- Marker cleared on successful bootstrap
- `launchd_status()` reporting: supervised, fallback-running, fallback-unavailable
- Existing fallback tests now verify marker creation
### Related Issues
- Issue #23387 (original macOS 26 launchd workaround)
- Issue #42524 (this issue)
parent 1c832762a8
commit e3db1ef92d
2 changed files with 339 additions and 24 deletions
|
|
@ -138,16 +138,22 @@ def _get_service_pids() -> set:
|
|||
timeout=5,
|
||||
)
|
||||
if result.returncode == 0:
|
||||
# Output: "PID\tStatus\tLabel" header, then one data line
|
||||
for line in result.stdout.strip().splitlines():
|
||||
parts = line.split()
|
||||
if len(parts) >= 3 and parts[2] == label:
|
||||
try:
|
||||
pid = int(parts[0])
|
||||
if pid > 0:
|
||||
pids.add(pid)
|
||||
except ValueError:
|
||||
pass
|
||||
# Try plist format first (macOS 26+): "PID" = <N>;
|
||||
pid = _parse_launchd_pid_from_list_output(result.stdout)
|
||||
if pid is not None and pid > 0:
|
||||
pids.add(pid)
|
||||
else:
|
||||
# Fall back to legacy tab-separated format:
|
||||
# "PID\tStatus\tLabel"
|
||||
for line in result.stdout.strip().splitlines():
|
||||
parts = line.split()
|
||||
if len(parts) >= 3 and parts[2] == label:
|
||||
try:
|
||||
pid = int(parts[0])
|
||||
if pid > 0:
|
||||
pids.add(pid)
|
||||
except ValueError:
|
||||
pass
|
||||
except (FileNotFoundError, subprocess.TimeoutExpired):
|
||||
pass
|
||||
|
||||
|
|
@ -1129,7 +1135,37 @@ def _recover_pending_systemd_restart(
|
|||
return False
|
||||
|
||||
|
||||
def _parse_launchd_pid_from_list_output(output: str) -> int | None:
|
||||
"""Extract the PID from ``launchctl list <label>`` output.
|
||||
|
||||
When launchd is actively supervising a process, the output includes a
|
||||
``"PID" = <number>;`` line. When the service definition is only *registered*
|
||||
but not running (macOS 26+ with an unmanageable domain, fallback active),
|
||||
the output lacks a PID field entirely. Returns ``None`` when no PID is
|
||||
found or the PID is non-positive (e.g. ``-1`` for a recently-crashed service).
|
||||
"""
|
||||
for line in output.splitlines():
|
||||
stripped = line.strip()
|
||||
if stripped.startswith('"PID"') or stripped.startswith("PID"):
|
||||
parts = stripped.split("=", 1)
|
||||
if len(parts) == 2:
|
||||
val = parts[1].strip().rstrip(";").strip('"')
|
||||
try:
|
||||
pid = int(val)
|
||||
return pid if pid > 0 else None
|
||||
except ValueError:
|
||||
return None
|
||||
return None
|
||||
|
||||
|
||||
def _probe_launchd_service_running() -> bool:
|
||||
"""Return True when launchd is actively supervising the gateway process.
|
||||
|
||||
``launchctl list <label>`` returns exit 0 whenever the service definition is
|
||||
registered with launchd — even when ``state = not running`` (macOS 26+).
|
||||
We additionally require a PID in the output to confirm launchd is actually
|
||||
managing a live process, not just holding a static definition.
|
||||
"""
|
||||
if not get_launchd_plist_path().exists():
|
||||
return False
|
||||
try:
|
||||
|
|
@ -1141,7 +1177,9 @@ def _probe_launchd_service_running() -> bool:
|
|||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
return False
|
||||
return result.returncode == 0
|
||||
if result.returncode != 0:
|
||||
return False
|
||||
return _parse_launchd_pid_from_list_output(result.stdout) is not None
|
||||
|
||||
|
||||
def get_gateway_runtime_snapshot(system: bool = False) -> GatewayRuntimeSnapshot:
|
||||
|
|
@ -1235,12 +1273,23 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
|
|||
if not snapshot.has_process_service_mismatch:
|
||||
return
|
||||
print()
|
||||
print(
|
||||
"⚠ Gateway process is running for this profile, but the service is not active"
|
||||
)
|
||||
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids, limit=None)}")
|
||||
print(" This is usually a manual foreground/tmux/nohup run, so `hermes gateway`")
|
||||
print(" can refuse to start another copy until this process stops.")
|
||||
# Distinguish the managed detached fallback (macOS launchd exit-5 path)
|
||||
# from a genuinely manual foreground/tmux/nohup run.
|
||||
if _launchd_unsupported_marker_exists():
|
||||
print(
|
||||
"⚠ Gateway is running as a detached fallback process — "
|
||||
"launchd cannot supervise it"
|
||||
)
|
||||
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids, limit=None)}")
|
||||
print(" Auto-start at login and auto-restart on crash are NOT available.")
|
||||
print(" Stop it with: hermes gateway stop")
|
||||
else:
|
||||
print(
|
||||
"⚠ Gateway process is running for this profile, but the service is not active"
|
||||
)
|
||||
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids, limit=None)}")
|
||||
print(" This is usually a manual foreground/tmux/nohup run, so `hermes gateway`")
|
||||
print(" can refuse to start another copy until this process stops.")
|
||||
|
||||
|
||||
def _print_other_profiles_gateway_status() -> None:
|
||||
|
|
@ -3404,6 +3453,47 @@ def _launchctl_domain_unsupported(returncode: int) -> bool:
|
|||
return returncode in _LAUNCHCTL_DOMAIN_UNSUPPORTED_CODES
|
||||
|
||||
|
||||
# ── launchd unsupported marker ─────────────────────────────────────────────
|
||||
# When launchd can't manage the domain on this host (error 5/125, macOS 26+),
|
||||
# we write a persistent marker so `launchd_status()` can explain that launchd
|
||||
# supervision is unavailable regardless of whether a fallback process is
|
||||
# currently running. The marker is cleared when bootstrap/kickstart succeeds,
|
||||
# so an OS update that fixes the underlying issue allows automatic recovery.
|
||||
|
||||
|
||||
def _launchd_unsupported_marker_path() -> Path:
|
||||
return get_hermes_home() / ".gateway-launchd-unsupported"
|
||||
|
||||
|
||||
def _write_launchd_unsupported_marker() -> None:
|
||||
"""Persist that launchd cannot supervise the gateway on this host."""
|
||||
import json
|
||||
from datetime import datetime, timezone
|
||||
|
||||
try:
|
||||
_launchd_unsupported_marker_path().write_text(
|
||||
json.dumps({
|
||||
"written_at": datetime.now(timezone.utc).isoformat(),
|
||||
"reason": "launchd domain unsupported (exit 5/125)",
|
||||
}),
|
||||
encoding="utf-8",
|
||||
)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
|
||||
def _clear_launchd_unsupported_marker() -> None:
|
||||
"""Clear the unsupported marker when launchd bootstrap succeeds."""
|
||||
try:
|
||||
_launchd_unsupported_marker_path().unlink(missing_ok=True)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
|
||||
def _launchd_unsupported_marker_exists() -> bool:
|
||||
return _launchd_unsupported_marker_path().exists()
|
||||
|
||||
|
||||
def _gateway_run_command() -> list[str]:
|
||||
"""Build the `python -m hermes_cli.main [--profile X] gateway run --replace` argv.
|
||||
|
||||
|
|
@ -3461,6 +3551,7 @@ def _launchd_fallback_to_detached(reason: str, *, exit_on_failure: bool = True)
|
|||
"""
|
||||
from hermes_constants import display_hermes_home as _dhh
|
||||
|
||||
_write_launchd_unsupported_marker()
|
||||
print(f"⚠ launchd cannot manage the gateway on this macOS version ({reason}).")
|
||||
if _spawn_detached_gateway():
|
||||
print("✓ Started gateway as a background process instead")
|
||||
|
|
@ -3702,6 +3793,7 @@ def launchd_install(force: bool = False):
|
|||
|
||||
print()
|
||||
print("✓ Service installed and loaded!")
|
||||
_clear_launchd_unsupported_marker()
|
||||
print()
|
||||
print("Next steps:")
|
||||
print(" hermes gateway status # Check status")
|
||||
|
|
@ -3755,6 +3847,7 @@ def launchd_start():
|
|||
_launchd_fallback_to_detached(f"launchctl exit {e.returncode}")
|
||||
return
|
||||
print("✓ Service started")
|
||||
_clear_launchd_unsupported_marker()
|
||||
return
|
||||
|
||||
refresh_launchd_plist_if_needed()
|
||||
|
|
@ -3788,6 +3881,7 @@ def launchd_start():
|
|||
_launchd_fallback_to_detached(f"launchctl exit {e2.returncode}")
|
||||
return
|
||||
print("✓ Service started")
|
||||
_clear_launchd_unsupported_marker()
|
||||
|
||||
|
||||
def launchd_stop():
|
||||
|
|
@ -3883,6 +3977,7 @@ def launchd_restart():
|
|||
pid = get_running_pid()
|
||||
if pid is not None and _request_gateway_self_restart(pid):
|
||||
print("✓ Service restart requested")
|
||||
_clear_launchd_unsupported_marker()
|
||||
return
|
||||
if pid is not None:
|
||||
# Announce the drain BEFORE waiting on it. This wait can run for
|
||||
|
|
@ -3907,6 +4002,7 @@ def launchd_restart():
|
|||
)
|
||||
subprocess.run(["launchctl", "kickstart", "-k", target], check=True, timeout=90)
|
||||
print("✓ Service restarted")
|
||||
_clear_launchd_unsupported_marker()
|
||||
except subprocess.CalledProcessError as e:
|
||||
if not _launchd_error_indicates_unloaded(e):
|
||||
# Not a "job unloaded" code. If the domain is fundamentally
|
||||
|
|
@ -3932,6 +4028,7 @@ def launchd_restart():
|
|||
_launchd_fallback_to_detached(f"launchctl exit {e2.returncode}")
|
||||
return
|
||||
print("✓ Service restarted")
|
||||
_clear_launchd_unsupported_marker()
|
||||
|
||||
|
||||
def launchd_status(deep: bool = False):
|
||||
|
|
@ -3944,12 +4041,35 @@ def launchd_status(deep: bool = False):
|
|||
text=True,
|
||||
timeout=10,
|
||||
)
|
||||
loaded = result.returncode == 0
|
||||
loaded_output = result.stdout
|
||||
service_listed = result.returncode == 0
|
||||
list_output = result.stdout
|
||||
except subprocess.TimeoutExpired:
|
||||
loaded = False
|
||||
loaded_output = ""
|
||||
service_listed = False
|
||||
list_output = ""
|
||||
|
||||
# Determine whether launchd is actively supervising a process.
|
||||
# ``launchctl list`` returns exit 0 whenever the service definition is
|
||||
# registered — even when ``state = not running`` (macOS 26+ with an
|
||||
# unmanageable domain). A PID in the output confirms a live process.
|
||||
launchd_pid = _parse_launchd_pid_from_list_output(list_output) if service_listed else None
|
||||
|
||||
# Hermes PID tracking — may be a detached fallback process spawned when
|
||||
# launchd cannot manage the domain on this host.
|
||||
from gateway.status import get_running_pid
|
||||
fallback_pid = get_running_pid(cleanup_stale=False)
|
||||
|
||||
# Avoid double-counting: when launchd IS supervising, fallback_pid and
|
||||
# launchd_pid point at the same process (the gateway writes both the
|
||||
# launchd PID and the Hermes PID file).
|
||||
if launchd_pid is not None and fallback_pid == launchd_pid:
|
||||
fallback_pid = None
|
||||
|
||||
# Persistent marker written when launchd bootstrap/kickstart fails with
|
||||
# exit 5/125 on this host. Lets us explain *why* launchd can't supervise
|
||||
# even when no fallback process is currently running.
|
||||
launchd_unsupported = _launchd_unsupported_marker_exists()
|
||||
|
||||
# ── Report ──
|
||||
print(f"Launchd plist: {plist_path}")
|
||||
if launchd_plist_is_current():
|
||||
print("✓ Service definition matches the current Hermes install")
|
||||
|
|
@ -3957,13 +4077,33 @@ def launchd_status(deep: bool = False):
|
|||
print("⚠ Service definition is stale relative to the current Hermes install")
|
||||
print(" Run: hermes gateway start")
|
||||
|
||||
if loaded:
|
||||
print("✓ Gateway service is loaded")
|
||||
print(loaded_output)
|
||||
if service_listed:
|
||||
if launchd_pid is not None:
|
||||
print(f"✓ Gateway is supervised by launchd (PID {launchd_pid})")
|
||||
print(" Auto-start at login and auto-restart on crash are available.")
|
||||
if launchd_unsupported:
|
||||
print(" (launchd domain was previously unavailable but is now working)")
|
||||
elif launchd_unsupported:
|
||||
print("⚠ Gateway service is registered but launchd is not supervising it")
|
||||
print(" launchd cannot manage the gateway on this macOS version.")
|
||||
if fallback_pid:
|
||||
print(f"✓ Detached fallback process is running (PID {fallback_pid})")
|
||||
print(" Cron jobs will fire. Stop with: hermes gateway stop")
|
||||
else:
|
||||
print("✗ No fallback process is running")
|
||||
print(" Run: hermes gateway start")
|
||||
print(" ⚠ Auto-start at login and auto-restart on crash are NOT available.")
|
||||
else:
|
||||
print("✓ Gateway service is registered with launchd")
|
||||
print(list_output)
|
||||
if fallback_pid:
|
||||
print(f" Detached gateway process is running (PID {fallback_pid})")
|
||||
else:
|
||||
print("✗ Gateway service is not loaded")
|
||||
print(" Service definition exists locally but launchd has not loaded it.")
|
||||
print(" Run: hermes gateway start")
|
||||
if fallback_pid:
|
||||
print(f" Note: a detached gateway process is running (PID {fallback_pid})")
|
||||
|
||||
if deep:
|
||||
log_file = get_hermes_home() / "logs" / "gateway.log"
|
||||
|
|
|
|||
|
|
@ -993,6 +993,8 @@ class TestLaunchdServiceRecovery:
|
|||
|
||||
assert spawned == [True]
|
||||
assert "background process" in capsys.readouterr().out.lower()
|
||||
# Verify the unsupported marker was written so status can explain why
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
def test_launchd_install_falls_back_to_detached_on_bootstrap_5(self, tmp_path, monkeypatch, capsys):
|
||||
"""macOS bootstrap error 5 should spawn a detached gateway, not crash."""
|
||||
|
|
@ -1028,6 +1030,7 @@ class TestLaunchdServiceRecovery:
|
|||
|
||||
assert spawned == [True]
|
||||
assert "Service installed and loaded" not in capsys.readouterr().out
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
def test_launchd_restart_falls_back_to_detached_on_error_5(self, monkeypatch, capsys):
|
||||
"""kickstart -k error 5 (domain unmanageable) should relaunch detached."""
|
||||
|
|
@ -1056,6 +1059,7 @@ class TestLaunchdServiceRecovery:
|
|||
gateway_cli.launchd_restart()
|
||||
|
||||
assert spawned == [True]
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
def test_launchd_stop_tolerates_domain_unsupported_bootout(self, monkeypatch, capsys):
|
||||
"""bootout exit 125 (macOS 26) must fall through to PID-based kill, not raise."""
|
||||
|
|
@ -1082,6 +1086,177 @@ class TestLaunchdServiceRecovery:
|
|||
assert exc.value.code == 1
|
||||
out = capsys.readouterr().out
|
||||
assert "nohup hermes gateway run" in out
|
||||
# Marker is still written so status knows launchd is unavailable
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
# ── PID parsing ──────────────────────────────────────────────────────
|
||||
|
||||
def test_parse_launchd_pid_from_list_output_with_pid(self):
|
||||
output = '{\n "PID" = 12345;\n "Label" = "ai.hermes.gateway";\n}'
|
||||
assert gateway_cli._parse_launchd_pid_from_list_output(output) == 12345
|
||||
|
||||
def test_parse_launchd_pid_from_list_output_without_pid(self):
|
||||
output = '{\n "Label" = "ai.hermes.gateway";\n "OnDemand" = true;\n}'
|
||||
assert gateway_cli._parse_launchd_pid_from_list_output(output) is None
|
||||
|
||||
def test_parse_launchd_pid_from_list_output_empty(self):
|
||||
assert gateway_cli._parse_launchd_pid_from_list_output("") is None
|
||||
|
||||
def test_parse_launchd_pid_from_list_output_unquoted_pid(self):
|
||||
"""Older macOS versions may output PID without quotes."""
|
||||
output = "{\n PID = 99999;\n}"
|
||||
assert gateway_cli._parse_launchd_pid_from_list_output(output) == 99999
|
||||
|
||||
def test_parse_launchd_pid_from_list_output_negative_pid_returns_none(self):
|
||||
"""PID = -1 (recently-crashed service sentinel) must return None."""
|
||||
output = '{\n "PID" = -1;\n "Label" = "ai.hermes.gateway";\n}'
|
||||
assert gateway_cli._parse_launchd_pid_from_list_output(output) is None
|
||||
|
||||
# ── Probe requires PID ───────────────────────────────────────────────
|
||||
|
||||
def test_probe_launchd_service_running_false_without_pid_in_output(self, tmp_path, monkeypatch):
|
||||
"""launchctl list returns 0 but no PID → not actually running."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
monkeypatch.setattr(
|
||||
gateway_cli.subprocess,
|
||||
"run",
|
||||
lambda *args, **kwargs: SimpleNamespace(
|
||||
returncode=0,
|
||||
stdout='{\n "Label" = "ai.hermes.gateway";\n}',
|
||||
stderr="",
|
||||
),
|
||||
)
|
||||
assert gateway_cli._probe_launchd_service_running() is False
|
||||
|
||||
def test_probe_launchd_service_running_true_with_pid_in_output(self, tmp_path, monkeypatch):
|
||||
"""launchctl list returns 0 with PID → actually running."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
monkeypatch.setattr(
|
||||
gateway_cli.subprocess,
|
||||
"run",
|
||||
lambda *args, **kwargs: SimpleNamespace(
|
||||
returncode=0,
|
||||
stdout='{\n "PID" = 55555;\n "Label" = "ai.hermes.gateway";\n}',
|
||||
stderr="",
|
||||
),
|
||||
)
|
||||
assert gateway_cli._probe_launchd_service_running() is True
|
||||
|
||||
# ── Unsupported marker lifecycle ─────────────────────────────────────
|
||||
|
||||
def test_launchd_unsupported_marker_write_and_clear(self, tmp_path, monkeypatch):
|
||||
monkeypatch.setattr(gateway_cli, "get_hermes_home", lambda: tmp_path)
|
||||
assert not gateway_cli._launchd_unsupported_marker_exists()
|
||||
gateway_cli._write_launchd_unsupported_marker()
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
gateway_cli._clear_launchd_unsupported_marker()
|
||||
assert not gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
def test_launchd_start_clears_unsupported_marker_on_bootstrap_success(self, tmp_path, monkeypatch, capsys):
|
||||
"""When bootstrap succeeds (OS update fixes the issue), clear the marker."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
# Pre-seed the marker as if a previous fallback wrote it
|
||||
monkeypatch.setattr(gateway_cli, "get_hermes_home", lambda: tmp_path)
|
||||
# Bypass the temp-home service write guard (added on main after PR #42567)
|
||||
monkeypatch.setattr(gateway_cli, "_refuse_temp_home_service_write", lambda d, k: False)
|
||||
gateway_cli._write_launchd_unsupported_marker()
|
||||
assert gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
# Simulate a bootstrap that succeeds
|
||||
def fake_run(cmd, check=False, **kwargs):
|
||||
return SimpleNamespace(returncode=0, stdout="", stderr="")
|
||||
monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
|
||||
|
||||
gateway_cli.launchd_install(force=True)
|
||||
|
||||
assert "Service installed and loaded" in capsys.readouterr().out
|
||||
assert not gateway_cli._launchd_unsupported_marker_exists()
|
||||
|
||||
# ── launchd_status with active supervision ───────────────────────────
|
||||
|
||||
def test_launchd_status_reports_supervised_when_pid_present(self, tmp_path, monkeypatch, capsys):
|
||||
"""When launchd is actively supervising, report it clearly."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
|
||||
def fake_run(cmd, capture_output=False, text=False, timeout=None, check=False, **kwargs):
|
||||
if isinstance(cmd, list) and cmd[:2] == ["launchctl", "list"]:
|
||||
return SimpleNamespace(
|
||||
returncode=0,
|
||||
stdout='{\n "PID" = 77777;\n "Label" = "ai.hermes.gateway";\n}',
|
||||
stderr="",
|
||||
)
|
||||
return SimpleNamespace(returncode=0, stdout="", stderr="")
|
||||
monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
|
||||
# No fallback PID — when launchd supervises, get_running_pid returns
|
||||
# the same PID; launchd_status deduplicates it.
|
||||
monkeypatch.setattr("gateway.status.get_running_pid", lambda cleanup_stale=False: 77777)
|
||||
|
||||
gateway_cli.launchd_status()
|
||||
|
||||
out = capsys.readouterr().out
|
||||
assert "supervised by launchd" in out
|
||||
assert "Auto-start at login" in out
|
||||
|
||||
def test_launchd_status_reports_fallback_when_unsupported_and_pid_running(self, tmp_path, monkeypatch, capsys):
|
||||
"""When the unsupported marker exists and a fallback PID is running."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
|
||||
def fake_run(cmd, capture_output=False, text=False, timeout=None, check=False, **kwargs):
|
||||
if isinstance(cmd, list) and cmd[:2] == ["launchctl", "list"]:
|
||||
return SimpleNamespace(
|
||||
returncode=0,
|
||||
stdout='{\n "Label" = "ai.hermes.gateway";\n "OnDemand" = true;\n}',
|
||||
stderr="",
|
||||
)
|
||||
return SimpleNamespace(returncode=0, stdout="", stderr="")
|
||||
monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
|
||||
monkeypatch.setattr("gateway.status.get_running_pid", lambda cleanup_stale=False: 88888)
|
||||
# Pre-seed the unsupported marker
|
||||
monkeypatch.setattr(gateway_cli, "get_hermes_home", lambda: tmp_path)
|
||||
gateway_cli._write_launchd_unsupported_marker()
|
||||
|
||||
gateway_cli.launchd_status()
|
||||
|
||||
out = capsys.readouterr().out
|
||||
assert "cannot manage the gateway on this macos version" in out.lower()
|
||||
assert "Detached fallback process is running" in out
|
||||
assert "PID 88888" in out
|
||||
assert "NOT available" in out
|
||||
|
||||
def test_launchd_status_reports_fallback_unavailable_when_unsupported_no_pid(self, tmp_path, monkeypatch, capsys):
|
||||
"""Unsupported marker exists but no fallback process is running."""
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
|
||||
|
||||
def fake_run(cmd, capture_output=False, text=False, timeout=None, check=False, **kwargs):
|
||||
if isinstance(cmd, list) and cmd[:2] == ["launchctl", "list"]:
|
||||
return SimpleNamespace(
|
||||
returncode=0,
|
||||
stdout='{\n "Label" = "ai.hermes.gateway";\n "OnDemand" = true;\n}',
|
||||
stderr="",
|
||||
)
|
||||
return SimpleNamespace(returncode=0, stdout="", stderr="")
|
||||
monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
|
||||
monkeypatch.setattr("gateway.status.get_running_pid", lambda cleanup_stale=False: None)
|
||||
monkeypatch.setattr(gateway_cli, "get_hermes_home", lambda: tmp_path)
|
||||
gateway_cli._write_launchd_unsupported_marker()
|
||||
|
||||
gateway_cli.launchd_status()
|
||||
|
||||
out = capsys.readouterr().out
|
||||
assert "cannot manage the gateway on this macos version" in out.lower()
|
||||
assert "No fallback process is running" in out
|
||||
assert "NOT available" in out
|
||||
|
||||
|
||||
class TestLaunchdDomainDetection:
|
||||
|
|
|
|||