mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-09 03:11:58 +00:00
Second pass on native Windows support, driven by a systematic audit across
five areas: POSIX-only primitives (signal.SIGKILL/SIGHUP/SIGPIPE, os.WNOHANG,
os.setsid), path translation bugs (/c/Users → C:\Users), subprocess patterns
(npm.cmd batch shims, start_new_session no-op on Windows), subsystem health
(cron, gateway daemon, update flow), and module-level import guards.
Every change is platform-gated — POSIX (Linux/macOS) behaviour is preserved
bit-identical. Explicit "do no harm" test: test_posix_path_preserved_on_linux,
test_posix_noop, test_windows_detach_popen_kwargs_is_posix_equivalent_on_posix.
## New module
- hermes_cli/_subprocess_compat.py — shared helpers (resolve_node_command,
windows_detach_flags, windows_hide_flags, windows_detach_popen_kwargs).
All no-ops on non-Windows.
## CRITICAL fixes (would crash or silently break on Windows)
- tui_gateway/entry.py: SIGPIPE/SIGHUP referenced at module top level would
AttributeError on import on Windows, breaking `hermes --tui` entirely (it
spawns this module as a subprocess). Guard each signal.signal() call with
hasattr() and add SIGBREAK as Windows' SIGHUP equivalent.
- hermes_cli/kanban_db.py: os.waitpid(-1, os.WNOHANG) in dispatcher tick was
unguarded. os.WNOHANG doesn't exist on Windows. Gate the whole reap loop
behind `os.name != "nt"` — Windows has no zombies anyway.
- tools/code_execution_tool.py: AF_UNIX socket for execute_code RPC fails on
most Windows builds. Fall back to loopback TCP (AF_INET on 127.0.0.1:0
ephemeral port) when _IS_WINDOWS. HERMES_RPC_SOCKET env var now accepts
either a filesystem path (POSIX) or `tcp://127.0.0.1:<port>` (Windows).
Generated sandbox client parses both.
- cron/scheduler.py: `argv = ["/bin/bash", str(path)]` hardcoded. Use
shutil.which("bash") so Windows (Git Bash via MinGit) works, with a
readable error when bash is genuinely absent.
- 6 bare npm/npx spawn sites: tools_config.py x2, doctor.py, whatsapp.py
(npm install + node version probe), browser_tool.py x2. On Windows npm
is npm.cmd / npx is npx.cmd (batch shims); subprocess.Popen(["npm", ...])
fails with WinError 193. shutil.which(...) returns the absolute .cmd
path which CreateProcessW accepts because the extension routes through
cmd.exe /c. POSIX behaviour unchanged (shutil.which still returns the
same path subprocess would resolve itself).
## HIGH fixes (silent misbehaviour on Windows)
- tools/environments/local.py get_temp_dir: hardcoded /tmp returned on
Windows meant `_cwd_file = "/tmp/hermes-cwd-*.txt"`, which bash wrote
via MSYS2's virtual /tmp but native Python couldn't open. Result: cwd
tracking silently broken — `cd` in terminal tool did nothing. Windows
branch now returns `%HERMES_HOME%/cache/terminal` with forward slashes
(works in both bash and Python, guaranteed no spaces).
- tools/environments/local.py _make_run_env PATH injection: `/usr/bin not
in split(":")` heuristic mangles Windows PATH (";" separator). Gate
the injection behind `not _IS_WINDOWS`.
- hermes_cli/gateway.py launch_detached_profile_gateway_restart: outer
Popen + watcher-script Popen both used start_new_session=True, which
Windows silently ignores. Watcher stayed attached to CLI's console,
died when user closed terminal after `hermes update`, left gateway
stale. Now branches through windows_detach_popen_kwargs() helper
(CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW on
Windows, start_new_session=True on POSIX — identical to main).
## MEDIUM fixes
- gateway/run.py /restart and /update handlers: hardcoded bash/setsid
chain crashes on Windows when user triggers /update in-gateway. Now
has sys.platform=="win32" branch using sys.executable + a tiny
Python watcher with proper detach flags. POSIX path is unchanged.
- cli.py _git_repo_root: Git on Windows sometimes returns /c/Users/...
style paths that break subprocess.Popen(cwd=...) and Path().resolve().
Added _normalize_git_bash_path() helper that translates /c/Users,
/cygdrive/c, /mnt/c variants to native C:\Users form. POSIX no-op.
_git_repo_root() now routes every result through it.
- cli.py worktree .worktreeinclude: os.symlink on directories failed
hard on Windows (requires admin or Developer Mode). Falls back to
shutil.copytree with a warning log.
## Tests
- 29 new tests in tests/tools/test_windows_native_support.py covering:
subprocess_compat helpers, TUI entry signal guards, kanban waitpid
guard, code_execution TCP fallback source-level invariants, cron bash
resolution, npm/npx bare-spawn lint per-file, local env Windows temp
dir, PATH injection gating, git bash path normalization, symlink
fallback, gateway detached watcher flags.
- One existing test assertion adjusted in test_browser_homebrew_paths:
it compared captured Popen argv to the BARE `"npx"` literal; after the
shutil.which() change argv[0] is the absolute path. New assertion
checks the shape (two items, second is `agent-browser`) rather than
the exact first-item string. Behaviour unchanged; test was too strict.
All 56 tests pass on Linux (30 from previous commits + 26 new).
267 tests from the affected files/dirs (browser, code_exec, local_env,
process_registry, kanban_db, windows_compat) all pass — zero regressions.
tests/hermes_cli/ (3909 pass) and tests/gateway/ (5021 pass) unchanged;
all pre-existing test failures confirmed unrelated via `git stash` re-run.
## What's still deferred (LOW priority)
- Visible cmd-window flashes on short-lived console apps (~14 sites) —
cosmetic, needs a follow-up pass once we have user reports.
- agent/file_safety.py POSIX-only security deny patterns — separate
hardening task.
- tools/process_registry.py returning "/tmp" as fallback — theoretical;
reachable only when all env-var candidates fail.
175 lines
6.5 KiB
Python
175 lines
6.5 KiB
Python
"""Windows subprocess compatibility helpers.
|
|
|
|
Hermes is developed on Linux / macOS and tested natively on Windows too.
|
|
Several common subprocess patterns break silently-or-loudly on Windows:
|
|
|
|
* ``["npm", "install", ...]`` — on Windows ``npm`` is ``npm.cmd``, a batch
|
|
shim. ``subprocess.Popen(["npm", ...])`` fails with WinError 193
|
|
("not a valid Win32 application") because CreateProcessW can't run a
|
|
``.cmd`` file without ``shell=True`` or PATHEXT resolution.
|
|
|
|
* ``start_new_session=True`` — on POSIX, this maps to ``os.setsid()`` and
|
|
actually detaches the child. On Windows it's silently ignored; the
|
|
Windows equivalent is ``CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS``
|
|
creationflags, which Python only applies when you pass them explicitly.
|
|
|
|
* Console-window flashes — every ``subprocess.Popen`` of a ``.exe`` on
|
|
Windows spawns a cmd window briefly unless ``CREATE_NO_WINDOW`` is
|
|
passed. Cosmetic but jarring for background daemons.
|
|
|
|
This module centralizes the platform-branching logic so the rest of the
|
|
codebase doesn't sprinkle ``if sys.platform == "win32":`` everywhere.
|
|
|
|
**All helpers are no-ops on non-Windows** — calling them in Linux/macOS
|
|
code paths is safe by design. That's the "do no damage on POSIX"
|
|
guarantee.
|
|
"""
|
|
|
|
from __future__ import annotations
|
|
|
|
import os
|
|
import shutil
|
|
import subprocess
|
|
import sys
|
|
from typing import Optional, Sequence
|
|
|
|
__all__ = [
|
|
"IS_WINDOWS",
|
|
"resolve_node_command",
|
|
"windows_detach_flags",
|
|
"windows_hide_flags",
|
|
"windows_detach_popen_kwargs",
|
|
]
|
|
|
|
|
|
IS_WINDOWS = sys.platform == "win32"
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
# Node ecosystem launcher resolution
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
|
def resolve_node_command(name: str, argv: Sequence[str]) -> list[str]:
|
|
"""Resolve a Node-ecosystem command name to an absolute-path argv.
|
|
|
|
On Windows, commands like ``npm``, ``npx``, ``yarn``, ``pnpm``,
|
|
``playwright``, ``prettier`` ship as ``.cmd`` files (batch shims).
|
|
``subprocess.Popen(["npm", "install"])`` fails with WinError 193
|
|
because CreateProcessW doesn't execute batch files directly.
|
|
|
|
``shutil.which(name)`` *does* resolve ``.cmd`` via PATHEXT and returns
|
|
the fully-qualified path — which CreateProcessW accepts because the
|
|
extension tells Windows to route through ``cmd.exe /c``.
|
|
|
|
On POSIX ``shutil.which`` also returns a fully-qualified path when
|
|
found. That's a small change from bare-name resolution (the OS does
|
|
its own PATH search) but functionally identical and has the side
|
|
benefit of making the argv reproducible in logs.
|
|
|
|
Behavior when the command is not on PATH:
|
|
- On Windows: return the bare name — caller can still try with
|
|
``shell=True`` as a last resort, OR the subsequent Popen will
|
|
raise FileNotFoundError with a readable error we want to surface.
|
|
- On POSIX: same. Bare ``npm`` on a Linux box without npm installed
|
|
fails the same way it did before this function existed.
|
|
|
|
Args:
|
|
name: The command name to resolve (``npm``, ``npx``, ``node`` …).
|
|
argv: The remaining arguments. Must NOT include ``name`` itself —
|
|
this function builds the full argv list.
|
|
|
|
Returns:
|
|
A list suitable for passing to subprocess.Popen/run/call.
|
|
"""
|
|
resolved = shutil.which(name)
|
|
if resolved:
|
|
return [resolved, *argv]
|
|
return [name, *argv]
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
# Detached / hidden process creation
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
|
# Win32 CreationFlags — defined here rather than imported from subprocess
|
|
# because CREATE_NO_WINDOW and DETACHED_PROCESS aren't guaranteed to be
|
|
# present on stdlib subprocess on older Pythons or non-Windows builds.
|
|
_CREATE_NEW_PROCESS_GROUP = 0x00000200
|
|
_DETACHED_PROCESS = 0x00000008
|
|
_CREATE_NO_WINDOW = 0x08000000
|
|
|
|
|
|
def windows_detach_flags() -> int:
|
|
"""Return Win32 creationflags that detach a child from the parent
|
|
console and process group. 0 on non-Windows.
|
|
|
|
Pair with ``start_new_session=False`` (default) when calling
|
|
subprocess.Popen — on POSIX use ``start_new_session=True`` instead,
|
|
which maps to ``os.setsid()`` in the child.
|
|
|
|
Rationale:
|
|
- ``CREATE_NEW_PROCESS_GROUP`` — child has its own process group so
|
|
Ctrl+C in the parent console doesn't propagate.
|
|
- ``DETACHED_PROCESS`` — child has no console at all. Necessary for
|
|
background daemons (gateway watchers, update respawners) because
|
|
without it, closing the console kills the child.
|
|
- ``CREATE_NO_WINDOW`` — suppress the brief cmd flash that would
|
|
otherwise appear when launching a console app. Redundant with
|
|
DETACHED_PROCESS but explicit for clarity.
|
|
"""
|
|
if not IS_WINDOWS:
|
|
return 0
|
|
return _CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
|
|
|
|
|
|
def windows_hide_flags() -> int:
|
|
"""Return Win32 creationflags that merely hide the child's console
|
|
window without detaching the child. 0 on non-Windows.
|
|
|
|
Use for short-lived console apps spawned as part of a larger
|
|
operation (``taskkill``, ``where``, version probes) where we want no
|
|
flash but also want to collect stdout/exit code synchronously.
|
|
|
|
The key difference from :func:`windows_detach_flags`: NO
|
|
``DETACHED_PROCESS`` — the child still inherits stdio handles so
|
|
``capture_output=True`` works. ``DETACHED_PROCESS`` would sever
|
|
stdio and break stdout capture.
|
|
"""
|
|
if not IS_WINDOWS:
|
|
return 0
|
|
return _CREATE_NO_WINDOW
|
|
|
|
|
|
def windows_detach_popen_kwargs() -> dict:
|
|
"""Return a dict of Popen kwargs that detach a child on Windows and
|
|
fall back to the POSIX equivalent (``start_new_session=True``) on
|
|
Linux/macOS.
|
|
|
|
Usage pattern:
|
|
|
|
.. code-block:: python
|
|
|
|
subprocess.Popen(
|
|
argv,
|
|
stdout=subprocess.DEVNULL,
|
|
stderr=subprocess.DEVNULL,
|
|
stdin=subprocess.DEVNULL,
|
|
close_fds=True,
|
|
**windows_detach_popen_kwargs(),
|
|
)
|
|
|
|
This replaces the unsafe-on-Windows pattern:
|
|
|
|
.. code-block:: python
|
|
|
|
subprocess.Popen(..., start_new_session=True)
|
|
|
|
which silently fails to detach on Windows (the flag is accepted but
|
|
has no effect — the child stays attached to the parent's console
|
|
and dies when the console closes).
|
|
"""
|
|
if IS_WINDOWS:
|
|
return {"creationflags": windows_detach_flags()}
|
|
return {"start_new_session": True}
|