hermes-agent/hermes_cli/relaunch.py
Teknium cf9b2df57a
fix(windows): use PortableGit (not MinGit), fix relaunch os.execvp crash, surface npm errors
Three real bugs from teknium1's first Windows install run:

1. **MinGit has no bash.exe.**  MinGit is the minimal-automation Git for Windows
   distribution — it ships git.exe but deliberately strips bash and the POSIX
   coreutils.  Installer logged "Could not locate bash.exe" and Hermes would
   fail to run any shell command.  Switched to PortableGit — the full Git for
   Windows minus the installer UI.  PortableGit ships bash.exe at
   <root>\bin\bash.exe plus sh, awk, sed, grep, curl, ssh in usr\bin\.  ARM64
   variant is detected separately (PortableGit-*-arm64.7z.exe).  32-bit falls
   back to MinGit-32-bit with a warning (PortableGit is 64-bit only).

   PortableGit ships as a 7z self-extractor (56MB vs MinGit's 38MB).  We
   invoke it with `-o<target> -y` to extract silently — no 7z install needed,
   it's self-contained.

   Updated tools/environments/local.py::_find_bash candidate order to prefer
   the PortableGit layout (<root>\bin\bash.exe) with the MinGit layout
   (<root>\usr\bin\bash.exe) as a fallback so existing installs keep working.

2. **os.execvp "Exec format error" on Windows.**  Setup wizard's "Launch
   hermes chat now? Y" called `os.execvp(["hermes", "chat"])` which on
   Windows can only swap to real Win32 .exe files — chokes with OSError(8)
   on .cmd batch shims and Python console-script wrappers.  Added a
   win32 branch in hermes_cli/relaunch.py::relaunch() that uses
   subprocess.run + sys.exit — functionally identical (user sees "hermes
   exited, then new hermes started") with one extra PID in play.  POSIX
   path is UNCHANGED — still uses os.execvp for in-place replacement.
   Catches OSError in the Windows branch and surfaces a "open a new
   terminal so PATH picks up, then re-run hermes" hint instead of a
   cryptic traceback.

3. **npm install failures silent on Windows.**  The install.ps1 was invoking
   `npm install --silent 2>&1 | Out-Null` inside a try/catch.  PowerShell's
   try/catch does NOT trigger on non-zero process exit codes — only on
   unhandled .NET exceptions — so npm failing printed a generic "npm
   install failed" with zero information about WHY.  The silent pipe ate
   the stderr.

   Rewrote Install-NodeDeps to:
   - Resolve npm.cmd via Get-Command (respects PATHEXT) instead of
     relying on bare `npm` name resolution.
   - Use Start-Process with -PassThru to capture the actual exit code.
   - Redirect stderr to a temp log and surface the first ~800 chars of
     the real npm error when install fails, plus the log path for the
     full text.
   - Fail loudly with the right exit code instead of a misleading success.
   - Bail cleanly with a helpful message when npm isn't on PATH at all.

4. **"True" printing to console after Node check.**  `Test-Node` returns $true;
   installer called it as a bare statement (no assignment, no cast).  PowerShell
   prints bare return values.  Wrapped the call in `[void](Test-Node)`.

## Tests

- Added 3 new tests in tests/hermes_cli/test_relaunch.py covering the
  Windows branch: subprocess is called (not execvp), child exit code
  propagates, OSError surfaces a helpful message.  All 23 tests pass
  (20 existing + 3 new).
- 77 Windows-compat tests still pass, POSIX behaviour unchanged.
2026-05-07 17:42:47 -07:00

189 lines
No EOL
6.4 KiB
Python

"""
Unified self-relaunch for Hermes CLI.
Preserves critical flags (--tui, --dev, --profile, --model, etc.) across
process replacement so that ``hermes sessions browse`` or post-setup relaunch
doesn't silently drop the user's UI mode or other preferences.
Also works when ``hermes`` is not on PATH (e.g. ``nix run`` or ``python -m``).
"""
import os
import shutil
import sys
from typing import Optional, Sequence
from hermes_cli._parser import (
PRE_ARGPARSE_INHERITED_FLAGS,
build_top_level_parser,
)
def _build_inherited_flag_table() -> list[tuple[str, bool]]:
"""Build the ``(option_string, takes_value)`` table of flags that must
survive a self-relaunch, by introspecting the real parser used by
``hermes`` itself.
A flag participates if its argparse Action carries
``inherit_on_relaunch = True`` — set by ``_parser._inherited_flag``.
"""
parser, _subparsers, chat_parser = build_top_level_parser()
table: list[tuple[str, bool]] = []
seen: set[tuple[str, bool]] = set()
for p in (parser, chat_parser):
for action in p._actions:
if not action.option_strings:
continue # positional / no flag form
if not getattr(action, "inherit_on_relaunch", False):
continue
takes_value = action.nargs != 0 # store_true/false set nargs=0
for opt in action.option_strings:
key = (opt, takes_value)
if key not in seen:
seen.add(key)
table.append(key)
table.extend(PRE_ARGPARSE_INHERITED_FLAGS)
return table
_INHERITED_FLAGS_TABLE = _build_inherited_flag_table()
def _extract_inherited_flags(argv: Sequence[str]) -> list[str]:
"""Pull out flags that should carry over into a self-relaunched hermes."""
flags: list[str] = []
i = 0
while i < len(argv):
arg = argv[i]
if "=" in arg:
key = arg.split("=", 1)[0]
for flag, _ in _INHERITED_FLAGS_TABLE:
if key == flag:
flags.append(arg)
break
i += 1
continue
for flag, takes_value in _INHERITED_FLAGS_TABLE:
if arg == flag:
flags.append(arg)
if takes_value and i + 1 < len(argv) and not argv[i + 1].startswith("-"):
flags.append(argv[i + 1])
i += 1
break
i += 1
return flags
def resolve_hermes_bin() -> Optional[str]:
"""Find the hermes entry point.
Priority:
1. ``sys.argv[0]`` if it resolves to a real executable.
2. ``shutil.which("hermes")`` on PATH.
3. ``None`` → caller should fall back to ``python -m hermes_cli.main``.
"""
argv0 = sys.argv[0]
# Absolute path to an executable (covers nix store, venv wrappers, etc.)
if os.path.isabs(argv0) and os.path.isfile(argv0) and os.access(argv0, os.X_OK):
return argv0
# Relative path — resolve against CWD
if not argv0.startswith("-") and os.path.isfile(argv0):
abs_path = os.path.abspath(argv0)
if os.access(abs_path, os.X_OK):
return abs_path
# PATH lookup
path_bin = shutil.which("hermes")
if path_bin:
return path_bin
return None
def build_relaunch_argv(
extra_args: Sequence[str],
*,
preserve_inherited: bool = True,
original_argv: Optional[Sequence[str]] = None,
) -> list[str]:
"""Construct an argv list for replacing the current process with hermes.
Args:
extra_args: Arguments to append (e.g. ``["--resume", id]``).
preserve_inherited: Whether to carry over UI / behaviour flags
tagged with ``inherit_on_relaunch`` in the parser.
original_argv: The original argv to scan for flags (defaults to
``sys.argv[1:]``).
"""
bin_path = resolve_hermes_bin()
if bin_path:
argv = [bin_path]
else:
argv = [sys.executable, "-m", "hermes_cli.main"]
src = list(original_argv) if original_argv is not None else list(sys.argv[1:])
if preserve_inherited:
argv.extend(_extract_inherited_flags(src))
argv.extend(extra_args)
return argv
def relaunch(
extra_args: Sequence[str],
*,
preserve_inherited: bool = True,
original_argv: Optional[Sequence[str]] = None,
) -> None:
"""Replace the current process with a fresh hermes invocation.
On POSIX we use ``os.execvp`` which replaces the running process with
the new one in place — same PID, no double-fork. That's what the
relaunch contract wants: "run hermes again as if the user had typed
the new argv".
Windows has no native exec semantics — ``os.execvp`` on Windows
*emulates* exec by spawning the child and exiting the parent, but
only works when the target is a real Win32 executable. Our target
is usually ``hermes.exe`` (a Python console-script shim that wraps
``python -m hermes_cli.main``) or a ``.cmd`` batch file, and both
raise ``OSError(8, "Exec format error")`` on Windows' execvp.
The Windows-correct pattern is: spawn the child with ``subprocess.run``
(which routes through ``cmd.exe`` via ``shell=False`` + PATHEXT resolution),
wait for it to exit, then propagate its exit code via ``sys.exit``.
That's functionally equivalent — the user sees "hermes exited, then
new hermes started" — just with two PIDs in play instead of one.
"""
new_argv = build_relaunch_argv(
extra_args, preserve_inherited=preserve_inherited, original_argv=original_argv
)
if sys.platform == "win32":
# Windows: subprocess + exit, because execvp can't swap to .cmd/.exe shims.
import subprocess
try:
result = subprocess.run(new_argv)
sys.exit(result.returncode)
except KeyboardInterrupt:
sys.exit(130)
except OSError as exc:
# Surface a helpful error rather than the raw OSError — the
# caller used to see ``[Errno 8] Exec format error`` which is
# cryptic. Common causes: ``hermes`` not on PATH yet (install
# hasn't propagated User PATH into this shell) or a stale shim.
print(
f"\nHermes relaunch failed: {exc}\n"
f"Command: {' '.join(new_argv)}\n"
f"Fix: open a new terminal so PATH picks up, then re-run hermes.",
file=sys.stderr,
)
sys.exit(1)
else:
os.execvp(new_argv[0], new_argv)