Commit graph

1 commit

Author SHA1 Message Date
Teknium
9c263fbf8a feat(windows): gateway as a Scheduled Task + Startup-folder fallback
Hermes gateway now installs as a real Windows service via
`hermes gateway install`, auto-starts on user logon, and stays running
across reboots. Mirrors the launchd (macOS) / systemd (Linux) contract
so the rest of the CLI dispatcher just plugs into the same `install /
uninstall / start / stop / restart / status` entrypoints.

Primary implementation is the new `hermes_cli/gateway_windows.py`:

- `schtasks /Create /SC ONLOGON /RL LIMITED /RU <user> /NP /IT` creates
  a per-user Scheduled Task running as the current user at next logon,
  with no UAC prompt and no stored password. Same pattern OpenClaw uses.
- When `schtasks /Create` returns "Access is denied" or times out
  (locked-down corporate boxes, 15s/30s hard + no-output cutoffs),
  fall back to writing a `.cmd` file into
  `%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\`, which
  Windows Explorer fires at every logon. Either path produces the same
  end-user experience.
- `_spawn_detached()` launches `pythonw.exe -m hermes_cli.main gateway
  run --replace` directly with `DETACHED_PROCESS |
  CREATE_NEW_PROCESS_GROUP | CREATE_NO_WINDOW |
  CREATE_BREAKAWAY_FROM_JOB` + DEVNULL stdio + sidecar
  `logs/gateway-stdio.log`. Going through pythonw.exe (no console)
  instead of a cmd.exe shim is what lets the gateway survive the
  spawning shell's exit on Windows — documented in
  `references/windows-subprocess-sigint-storm.md`.
- Two separate quoting helpers for cmd.exe vs schtasks (`/TR` argument)
  — they're different parsers and mixing breaks both. Same split
  OpenClaw documents in src/daemon/schtasks.ts.
- `_wait_for_gateway_ready()` + `_report_gateway_start()` poll for a
  live gateway process after spawn and report the PID, so install
  doesn't lie about success.

Dispatcher wiring in `hermes_cli/gateway.py`:

- `_gateway_command_inner()` gets Windows branches for install /
  uninstall / start / stop / restart / status + `_is_service_installed`
  + `_is_service_running`. `gateway status` output + suggested
  commands now mention `hermes gateway install` instead of
  `sudo hermes gateway install --system` on Windows.

Two separable Windows fixes that only matter for a working
detached gateway, bundled here because shipping them independently
leaves install broken:

(1) Spurious CTRL_C_EVENT on detached pythonw runs. When the gateway
is launched detached on Windows, something on the boot path (HTTPX /
python-telegram-bot / asyncio ProactorEventLoop subprocess plumbing)
synthesizes a Ctrl+C within ~60-90 seconds. Python 3.11 translates it
into KeyboardInterrupt inside `asyncio.run(start_gateway(...))`, the
outer `except KeyboardInterrupt: return` exits cleanly, and the
process dies with no shutdown log — "bot started typing, then
stopped" is the fingerprint because the interrupt fires mid-send.
Fix in `run_gateway()`: when `is_windows()` and stdin is not a TTY,
install `signal.signal(SIGINT, SIG_IGN)` + same for SIGBREAK. Real
console runs have a TTY and skip the absorber, so user Ctrl+C still
works interactively. Same family as commit 449ad952b's browser-tool
SIGINT absorber; cross-referenced in the ref doc.

(2) `wmic process get` is the process-list path used by
`_scan_gateway_pids()` / `find_gateway_pids()`, which power status,
stop, and restart on Windows. `C:\Windows\System32\wbem\WMIC.exe` has
been deprecated since Windows 10 21H1 and is not installed on modern
Win 10/11 boxes, so `find_gateway_pids()` silently returns [] — status
sees no gateway even when one is running. Fix: `shutil.which("wmic")`
first, fall back to PowerShell's `Get-CimInstance Win32_Process`
emitting the same LIST-style `CommandLine=...` / `ProcessId=...` pairs
the downstream parser already handles. Zero behavior change on boxes
where wmic still works.

Verified end-to-end on Windows 10 (Delta-1):
- `hermes gateway install` → falls back to Startup folder (access
  denied on schtasks for this user) + detached pythonw spawn, PID
  reported correctly.
- Gateway connects to Telegram, answers messages, stays alive past
  2min (previously died at ~85s with no shutdown log).
- `hermes gateway stop` + `uninstall` both clean up both tracks.

Refs: openclaw/openclaw src/daemon/schtasks.ts for the ONLOGON +
startup-folder-fallback pattern. skill hermes-agent
references/windows-subprocess-sigint-storm.md for the deeper
CTRL_C_EVENT / ProactorEventLoop background.
2026-05-08 14:27:40 -07:00