feat(windows): close native-Windows install gaps — crash-free startup, UTF-8 stdio, tzdata dep, docs

Native Windows (with Git for Windows installed) can now run the Hermes CLI
and gateway end-to-end without crashing.  install.ps1 already existed and
the Git Bash terminal backend was already wired up — this PR fills the
remaining gaps discovered by auditing every Windows-unsafe primitive
(`signal.SIGKILL`, `os.kill(pid, 0)` probes, bare `fcntl`/`termios`
imports) and by comparing hermes against how Claude Code, OpenCode, Codex,
and Cline handle native Windows.

## What changed

### UTF-8 stdio (new module)
- `hermes_cli/stdio.py` — single `configure_windows_stdio()` entry point.
  Flips the console code page to CP_UTF8 (65001), reconfigures
  `sys.stdout`/`stderr`/`stdin` to UTF-8, sets `PYTHONIOENCODING` + `PYTHONUTF8`
  for subprocesses.  No-op on non-Windows.  Opt out via `HERMES_DISABLE_WINDOWS_UTF8=1`.
- Called early in `cli.py::main`, `hermes_cli/main.py::main`, and
  `gateway/run.py::main` so Unicode banners (box-drawing, geometric
  symbols, non-Latin chat text) don't `UnicodeEncodeError` on cp1252
  consoles.

### Crash sites fixed
- `hermes_cli/main.py:7970` (hermes update → stuck gateway sweep): raw
  `os.kill(pid, _signal.SIGKILL)` → `gateway.status.terminate_pid(pid, force=True)`
  which routes through `taskkill /T /F` on Windows.
- `hermes_cli/profiles.py::_stop_gateway_process`: same fix — also
  converted SIGTERM path to `terminate_pid()` and widened OSError catch
  on the intermediate `os.kill(pid, 0)` probe.
- `hermes_cli/kanban_db.py:2914, 3041`: raw `signal.SIGKILL` →
  `getattr(signal, "SIGKILL", signal.SIGTERM)` fallback (matches the
  pattern already used in `gateway/status.py`).

### OSError widening on `os.kill(pid, 0)` probes
Windows raises `OSError` (WinError 87) for a gone PID instead of
`ProcessLookupError`.  Widened the catch at:
- `gateway/run.py:15101` (`--replace` wait-for-exit loop — without this,
  the loop busy-spins the full 10s every Windows gateway start)
- `hermes_cli/gateway.py:228, 460, 940`
- `hermes_cli/profiles.py:777`
- `tools/process_registry.py::_is_host_pid_alive`
- `tools/browser_tool.py:1170, 1206`

### Dashboard PTY graceful degradation
`hermes_cli/pty_bridge.py` depends on `fcntl`/`termios`/`ptyprocess`,
none of which exist on native Windows.  Previously a Windows dashboard
would crash on `import hermes_cli.web_server` because of a top-level
import.  Now:
- `hermes_cli/web_server.py` wraps the pty_bridge import in
  `try/except ImportError` and sets `_PTY_BRIDGE_AVAILABLE=False`.
- The `/api/pty` WebSocket handler returns a friendly "use WSL2 for
  this tab" message instead of exploding.
- Every other dashboard feature (sessions, jobs, metrics, config
  editor) runs natively on Windows.

### Dependency
- `pyproject.toml`: add `tzdata>=2023.3; sys_platform == 'win32'` so
  Python's `zoneinfo` works on Windows (which has no IANA tzdata
  shipped with the OS).  Credits @sprmn24 (PR #13182).

### Docs
- README.md: removed "Native Windows is not supported"; added
  PowerShell one-liner and Git-for-Windows prerequisite note.
- `website/docs/getting-started/installation.md`: new Windows section
  with capability matrix (everything native except the dashboard
  `/chat` PTY tab, which is WSL2-only).
- `website/docs/user-guide/windows-wsl-quickstart.md`: reframed as
  "WSL2 as an alternative to native" rather than "the only way".
- `website/docs/developer-guide/contributing.md`: updated
  cross-platform guidance with the `signal.SIGKILL` / `OSError`
  rules we enforce now.
- `website/docs/user-guide/features/web-dashboard.md`: acknowledged
  native Windows works for everything except the embedded PTY pane.

## Why this shape

Pulled from a survey of how other agent codebases handle native
Windows (Claude Code, OpenCode, Codex, Cline):

- All four treat Git Bash as the canonical shell on Windows, same as
  hermes already does in `tools/environments/local.py::_find_bash()`.
- None of them force `SetConsoleOutputCP` — but they don't have to,
  Node/Rust write UTF-16 to the Win32 console API.  Python does not get
  that for free, so we flip CP_UTF8 via ctypes.
- None of them ship PowerShell-as-primary-shell (Claude Code exposes
  PS as a secondary tool; scope creep for this PR).
- All of them use `taskkill /T /F` for force-kill on Windows, which
  is exactly what `gateway.status.terminate_pid(force=True)` does.

## Non-goals (deliberate scope limits)

- No PowerShell-as-a-second-shell tool — worth designing separately.
- No terminal routing rewrite (#12317, #15461, #19800 cluster) — that's
  the hardest design call and needs a separate doc.
- No wholesale `open()` → `open(..., encoding="utf-8")` sweep (Tianworld
  cluster) — will do as follow-up if users hit actual breakage; most
  modern code already specifies it.

## Validation

- 28 new tests in `tests/tools/test_windows_native_support.py` — all
  platform-mocked, pass on Linux CI.  Cover:
  - `configure_windows_stdio` idempotency, opt-out, env-preservation
  - `terminate_pid` taskkill routing, failure → OSError, FileNotFoundError fallback
  - `getattr(signal, "SIGKILL", …)` fallback shape
  - `_is_host_pid_alive` OSError widening (Windows-gone-PID behavior)
  - Source-level checks that all entry points call `configure_windows_stdio`
  - pty_bridge import-guard present in `web_server.py`
  - README no longer says "not supported"
- 12 pre-existing tests in `tests/tools/test_windows_compat.py` still pass.
- `tests/hermes_cli/` ran fully (3909 passed, 9 failures — all confirmed
  pre-existing on main by stash-test).
- `tests/gateway/` ran fully (5021 passed, 1 pre-existing failure).
- `tests/tools/test_process_registry.py` + `test_browser_*` pass.
- Manual smoke: `import hermes_cli.stdio; import gateway.run;
  import hermes_cli.web_server` — all clean, `_PTY_BRIDGE_AVAILABLE=True`
  on Linux (as expected).

## Files

- New: `hermes_cli/stdio.py`, `tests/tools/test_windows_native_support.py`
- Modified: `cli.py`, `gateway/run.py`, `hermes_cli/main.py`,
  `hermes_cli/profiles.py`, `hermes_cli/gateway.py`,
  `hermes_cli/kanban_db.py`, `hermes_cli/pty_bridge.py`,
  `hermes_cli/web_server.py`, `tools/browser_tool.py`,
  `tools/process_registry.py`, `pyproject.toml`, `README.md`, and 4
  docs pages.

Credits to everyone whose prior PR work informed these fixes — see
the co-author trailers.  All of the PRs listed in
`~/.hermes/plans/windows-support-prs.md` fixing `os.kill` / `signal.SIGKILL`
/ UTF-8 stdio / tzdata / README patterns found the same issues; this PR
consolidates them.

Co-authored-by: Philip D'Souza <9472774+PhilipAD@users.noreply.github.com>
Co-authored-by: Arecanon <42595053+ArecaNon@users.noreply.github.com>
Co-authored-by: XiaoXiao0221 <263113677+XiaoXiao0221@users.noreply.github.com>
Co-authored-by: Lars Hagen <1360677+lars-hagen@users.noreply.github.com>
Co-authored-by: Luan Dias <65574834+luandiasrj@users.noreply.github.com>
Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com>
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: adybag14-cyber <252811164+adybag14-cyber@users.noreply.github.com>
Co-authored-by: Prasanna28Devadiga <54196612+Prasanna28Devadiga@users.noreply.github.com>
This commit is contained in:
Teknium 2026-05-07 16:31:40 -07:00
parent 7f369bfe55
commit fda234a210
No known key found for this signature in database
18 changed files with 728 additions and 29 deletions

View file

@ -80,7 +80,7 @@ The **Chat** tab embeds the full Hermes TUI (the same interface you get from `he
- Node.js (same requirement as `hermes --tui`; the TUI bundle is built on first launch)
- `ptyprocess` — installed by the `pty` extra (`pip install 'hermes-agent[web,pty]'`, or `[all]` covers both)
- POSIX kernel (Linux, macOS, or WSL). Native Windows Python is not supported — use WSL.
- POSIX kernel (Linux, macOS, or WSL2). The `/chat` terminal pane specifically needs a POSIX PTY — native Windows Python has no equivalent, so on a native Windows install the rest of the dashboard (sessions, jobs, metrics, config editor) works but the `/chat` tab will show a banner telling you to use WSL2 for that feature.
Close the browser tab and the PTY is reaped cleanly on the server. Re-opening spawns a fresh session.

View file

@ -7,7 +7,18 @@ sidebar_position: 2
# Windows (WSL2) Guide
Hermes Agent is developed and tested on **Linux** and **macOS**. Native Windows is not supported — on Windows you run Hermes inside **WSL2** (Windows Subsystem for Linux, version 2). That means there are effectively two computers in play: your Windows host, and a Linux VM managed by WSL. Most confusion comes from not being sure which one you're on at any moment.
Hermes Agent now supports **both** native Windows and WSL2. This page covers the WSL2 path; for the native PowerShell install see [Installation](../getting-started/installation.md#windows-native-powershell).
**When to pick WSL2 over native:**
- You want to use the dashboard's embedded terminal (`/chat` tab) — that pane requires a POSIX PTY and is WSL2-only.
- You're doing POSIX-heavy development work and want your Hermes sessions to share the same filesystem / paths as your dev tools.
- You already have a WSL2 environment and don't want to maintain a second install.
**When native is fine (or better):**
- Interactive chat, gateway (Telegram/Discord/etc.), cron scheduler, browser tool, MCP servers, and most Hermes features all run natively on Windows.
- You don't want to think about crossing the WSL↔Windows boundary every time you reference a file or open a URL.
In WSL2 there are effectively two computers in play: your Windows host, and a Linux VM managed by WSL. Most confusion comes from not being sure which one you're on at any moment.
This guide covers the parts of that split that specifically affect Hermes: installing WSL2, getting files back and forth between Windows and Linux, networking in both directions, and the pitfalls people actually hit.
@ -15,11 +26,13 @@ This guide covers the parts of that split that specifically affect Hermes: insta
A Chinese-language walkthrough of the minimum install path is maintained on this same page — switch via the **language** menu (top right) and select **简体中文**.
:::
## Why WSL2 (and not "just Windows")
## Why WSL2 (vs. native Windows)
Hermes assumes a POSIX environment: `fork`, `/tmp`, UNIX sockets, signal semantics, PTY-backed terminals, shells like `bash`/`zsh`, and tools like `rg`, `git`, `ffmpeg` that behave the way they do on Linux. Rewriting that for native Windows would be a full port — WSL2 gives you a real Linux kernel in a lightweight VM instead, and Hermes inside it is essentially identical to running on Ubuntu.
The native Windows install runs in Windows directly: your Windows terminal (PowerShell, Windows Terminal, etc.), Windows filesystem paths (`C:\Users\…`), and Windows processes. Hermes uses Git Bash to run shell commands, which is how Claude Code and other agents handle Windows today — it sidesteps the POSIX-vs-Windows gap without a full rewrite.
Practical consequences of this choice:
WSL2 runs a real Linux kernel in a lightweight VM, so Hermes inside it is essentially identical to running on Ubuntu. That's valuable when you want a real POSIX environment: `fork`, `/tmp`, UNIX sockets, signal semantics, PTY-backed terminals, shells like `bash`/`zsh`, and tools like `rg`, `git`, `ffmpeg` that behave the way they do on Linux.
Practical consequences of WSL2:
- The Hermes CLI, gateway, sessions, memory, skills, and tool runtimes all live inside the Linux VM.
- Windows programs (browsers, native apps, Chrome with your logged-in profile) live outside it.