mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-23 10:42:00 +00:00
`hermes gateway restart` on Windows could take the gateway offline with no
replacement. restart() was stop() -> sleep(1.0) -> start(), but the graceful
drain can run up to ~180s while the detached pythonw process stays alive. The
1s sleep let start() run against the still-draining old process; its
"already running" guard then no-opped, and when the old process finally exited
nothing relaunched it.
Two root causes, both fixed:
1. Loose PID detection. `_scan_gateway_pids` and the gateway.status helpers
used substring matches ("... gateway" in cmdline) for lifecycle decisions,
so they false-matched `gateway status`/`dashboard` siblings and unrelated
processes like `python -m tui_gateway`, plus stale gateway.pid records.
Add a shared strict matcher `looks_like_gateway_command_line()` in
gateway/status.py that requires the real `gateway run` subcommand (or the
dedicated entrypoints), and route `_looks_like_gateway_process`,
`_record_looks_like_gateway`, and `_scan_gateway_pids` through it.
2. restart() race. Wait until the gateway is authoritatively gone
(`get_running_pid()` + strict `_gateway_pids()`) before relaunch; force-kill
once if it lingers and raise rather than start a duplicate; verify the
relaunch produced a running gateway and raise loudly if not (no more
exit-0 silent outage).
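
A minimal sketch of the wait-then-verify sequence from item 2, assuming the lifecycle operations are passed in as callables. `restart_when_gone`, `_wait_until`, and every parameter name here are hypothetical and are not the project's actual `restart()`; only the ~180s drain window comes from the description above.

```python
import time
from typing import Callable


def _wait_until(pred: Callable[[], bool], timeout: float, poll: float) -> bool:
    """Poll pred() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if pred():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def restart_when_gone(
    is_running: Callable[[], bool],   # hypothetical: authoritative "gateway alive?" check
    stop: Callable[[], None],         # hypothetical: request a graceful stop
    force_kill: Callable[[], None],   # hypothetical: hard-kill the old process
    start: Callable[[], None],        # hypothetical: launch the replacement
    drain_timeout: float = 180.0,     # graceful drain can take up to ~180s
    kill_grace: float = 10.0,         # assumed grace period after force-kill
    start_timeout: float = 15.0,      # assumed time allowed for the relaunch
    poll: float = 0.5,
) -> None:
    stop()
    # Wait for the old process to be gone instead of sleeping a fixed 1s.
    if not _wait_until(lambda: not is_running(), drain_timeout, poll):
        force_kill()  # force-kill once if it lingers
        if not _wait_until(lambda: not is_running(), kill_grace, poll):
            raise RuntimeError("old gateway still running; refusing to start a duplicate")
    start()
    # Verify the relaunch produced a running gateway; fail loudly otherwise.
    if not _wait_until(is_running, start_timeout, poll):
        raise RuntimeError("gateway did not come up after restart")
```

Injecting the callables keeps the sequencing testable without spawning real processes; the real fix would supply its own PID checks in place of `is_running`.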
Scoped to Windows; systemd/launchd restart paths are already drain-aware.
Adds tests/gateway/test_gateway_command_line_matcher.py.
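
For illustration only, one way a strict matcher of this kind could be written. `strict_gateway_match` is a hypothetical stand-in, not the real `looks_like_gateway_command_line()`; its token rules are assumptions inferred from the accept/reject cases in the test file, and it splits on whitespace, so executable paths containing spaces are not handled.

```python
from typing import Optional

# Assumed names, inferred from the test cases; the real matcher may differ.
_CLI_LAUNCHERS = {"hermes", "hermes.exe"}
_CLI_MODULES = {"hermes_cli.main"}
_ENTRYPOINTS = {"hermes-gateway", "hermes-gateway.exe"}


def _is_path(token: str, suffix: str) -> bool:
    """True if token is exactly suffix or ends with it as a full path component."""
    return token == suffix or token.endswith("/" + suffix)


def strict_gateway_match(cmdline: Optional[str]) -> bool:
    if not cmdline:
        return False
    # Lowercase and normalise Windows separators so paths compare uniformly.
    tokens = [t.replace("\\", "/") for t in cmdline.lower().split()]
    cli_index = None
    for i, tok in enumerate(tokens):
        base = tok.rsplit("/", 1)[-1]
        # Dedicated entrypoints count as a gateway on their own.
        if base in _ENTRYPOINTS or _is_path(tok, "gateway/run.py"):
            return True
        if cli_index is None and (
            base in _CLI_LAUNCHERS
            or tok in _CLI_MODULES
            or _is_path(tok, "hermes_cli/main.py")
        ):
            cli_index = i
    if cli_index is None:
        return False  # not a hermes CLI invocation at all
    rest = tokens[cli_index + 1:]
    if "gateway" not in rest:
        return False  # some other subcommand, e.g. dashboard
    after = rest[rest.index("gateway") + 1:]
    # Bare `gateway` defaults to run; anything else must be exactly `run`.
    return not after or after[0] == "run"
```

Matching whole tokens rather than substrings is what keeps `python -m tui_gateway` and `gateway status` from being mistaken for a running gateway.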
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
48 lines
1.6 KiB
Python
"""Tests for the strict gateway command-line matcher.

Regression guard for the Windows ``hermes gateway restart`` silent-outage bug:
the previous loose substring match (``"... gateway" in cmdline``) false-matched
``gateway status``/``dashboard`` siblings and unrelated processes such as
``python -m tui_gateway``, which let ``restart()`` race a still-draining old
process and ``status``/``start`` report false positives.
"""

from __future__ import annotations

import pytest

from gateway.status import looks_like_gateway_command_line as matches


ACCEPT = [
    "pythonw.exe -m hermes_cli.main gateway run",
    r"C:\Users\me\hermes\venv\Scripts\pythonw.exe -m hermes_cli.main gateway run",
    "python -m hermes_cli.main --profile work gateway run",
    "python -m hermes_cli.main gateway run --replace",
    "python -m hermes_cli/main.py gateway run",
    "python gateway/run.py",
    "hermes-gateway.exe",
    "hermes gateway",  # bare `hermes gateway` defaults to run
    "hermes gateway run",
]

REJECT = [
    "python -m tui_gateway",  # unrelated module
    "python -m hermes_cli.main gateway status",  # other subcommand
    "python -m hermes_cli.main gateway restart",
    "python -m hermes_cli.main gateway stop",
    "python -m hermes_cli.main --profile x dashboard",  # non-gateway subcommand
    "some random python -m mygateway thing",
    "",
    None,
]


@pytest.mark.parametrize("cmd", ACCEPT)
def test_accepts_real_gateway_run(cmd):
    assert matches(cmd) is True


@pytest.mark.parametrize("cmd", REJECT)
def test_rejects_non_gateway_run(cmd):
    assert matches(cmd) is False