hermes-agent/hermes_cli
Teknium 0548facc50 fix(windows): gateway status dedup + install.ps1 platform-SDK bootstrap
## Two residual Windows fixes that were hanging from earlier commits.

### 1. `hermes gateway status` reported 2 PIDs per gateway — TWO bugs compounded

Diagnosed with psutil parent/child walk against live gateway PIDs:

**Bug A (the real one): `_get_parent_pid` silently failed on Windows.**
The helper shelled out to `ps -o ppid= -p <pid>`, which doesn't exist
on Windows — `FileNotFoundError` → returns `None` → the ancestor walk
terminated at `os.getpid()` alone.  Consequence: the PID table scan in
`_scan_gateway_pids` couldn't filter out `hermes gateway status`'s own
launcher stub (a venv `pythonw.exe`/`python.exe` that matches the same
`-m hermes_cli.main gateway` pattern as the gateway).  Every status
call saw "itself" as a second gateway.

Fix: `_get_parent_pid` now calls `psutil.Process(pid).ppid()` first
(psutil is a core dependency since 3dfb35700) and falls back to `ps`
only when `shutil.which("ps")` succeeds — matching the Windows-footgun
checker's "always guard `ps` / `wmic` / etc. with `shutil.which`" rule.

Before: `Gateway process running (PID: 21952, 46880)` — 46880 changing
on every call (the status invocation's own launcher, which died by the
time the next status call looked).

After (5 consecutive calls):
```
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
```

Ancestor walk on the fix: 14 PIDs (full chain through bash/explorer)
instead of the broken 1-PID set.

**Bug B (the cosmetic one): venv-launcher dedup.** Standard Windows
CPython venv behaviour is that `<venv>/Scripts/pythonw.exe` is a ~5 MB
launcher stub that spawns the base Python (`C:\\Program Files\\Python311
\\pythonw.exe`) with the same command line and waits.  Our process
scanner sees two PIDs for every gateway: launcher + interpreter, same
cmdline.  Bug A masked this by accidentally counting the status call
AS one of them; with Bug A fixed, we see both the real launcher and
real interpreter for the gateway process itself.

Fix: `_filter_venv_launcher_stubs` at the tail of `_scan_gateway_pids`
walks each matched PID's ppid via psutil.  Any PID that's the PARENT
of another matched PID is a launcher stub — drop it, keep the child.
Scoped to Windows (`is_windows() and len(pids) > 1`) and no-ops when
psutil isn't importable.

Net effect: `gateway status` now reports one PID per gateway — the
interpreter — matching POSIX behaviour and user expectations.

### 2. `install.ps1`: bootstrap pip + auto-install platform SDKs

New `Install-PlatformSdks` function wired between `Invoke-SetupWizard`
and `Start-GatewayIfConfigured`.  Fixes two related issues on fresh
Windows installs:

1. The tiered `uv pip install` cascade (introduced in 87fca8342)
   correctly falls through when tier 1 `.[all]` fails on the RL git
   deps, but the fallback tiers can silently skip SDKs from `[messaging]`
   when there's a partial-resolve.  Result: user sets `DISCORD_BOT_TOKEN`
   in `.env`, fires up gateway, hits "discord module not installed".

2. `uv` creates venvs WITHOUT pip by default, so the user's escape
   hatch (`pip install discord.py` in the venv) doesn't exist either.

The new function:
- Skips if `-NoVenv` (nothing to bootstrap into).
- Scans `~/.hermes/.env` for messaging tokens (TELEGRAM_BOT_TOKEN,
  DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, SLACK_APP_TOKEN, WHATSAPP_ENABLED),
  filtering placeholder values.
- For each token that's set, runs `python -c "import <sdk>"` to verify.
- If any import fails: runs `python -m ensurepip --upgrade` to bootstrap
  pip into the venv (idempotent — no-ops if pip is already present),
  then `pip install <spec>` for each missing SDK with specs mirroring
  pyproject.toml's `[messaging]` extra to avoid version drift.

The `$ErrorActionPreference = "SilentlyContinue"` spans are not
cosmetic — PowerShell wraps native-stderr from a non-zero-exit
subprocess as a `NativeCommandError` that prints even through
`*> $null` / `2>$null`.  Save + restore EAP over the import-probe
and pip-install blocks keeps the output clean.

Verified on this Windows 10 box:
- Initial state: telegram+fastapi+psutil present, discord+slack_sdk
  missing (tier 1 `.[all]` had failed — `.tirith-install-failed`
  marker in `%LOCALAPPDATA%\\hermes`).
- First run with discord+slack tokens in .env: detects both missing,
  ensurepip (skipped — pip was already bootstrapped earlier this
  session for telegram), installs `discord.py[voice]==2.7.1` +
  `PyNaCl` + `davey`, installs `slack-sdk==3.41.0`. All imports
  succeed on verify.
- Second run: all three SDKs report OK, function no-ops.

Pip spec strings mirror pyproject.toml's `[messaging]` extra verbatim
so a bump to the extra picks up here automatically — no drift.

### Files

- `hermes_cli/gateway.py`: `_get_parent_pid` rewritten (psutil-first);
  `_filter_venv_launcher_stubs` added; `_scan_gateway_pids` dedups
  launchers on Windows when it finds >1 match.
- `scripts/install.ps1`: new `Install-PlatformSdks` function (~85
  lines); wired into the main flow at line 1438.

### Verification

- `venv/Scripts/python.exe scripts/check-windows-footguns.py --all`
  → `✓ No Windows footguns found (380 file(s) scanned).`
- `ast.parse` passes on gateway.py.
- `[System.Management.Automation.Language.Parser]::ParseFile` passes
  on install.ps1.
- Live gateway (PID 21952, running since 12:33 today) survived 5x
  stress loop of `hermes gateway status` without dying.
2026-05-08 14:27:40 -07:00
..
__init__.py chore: release v0.13.0 (2026.5.7) (#21406) 2026-05-07 09:22:48 -07:00
_parser.py fix: add dashboard to CLI help epilogue and Docker CI smoke test 2026-05-07 06:16:23 -07:00
_subprocess_compat.py feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags 2026-05-08 14:27:40 -07:00
auth.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
auth_commands.py auth: use get_default_hermes_root() for shared nous_auth.json path 2026-05-08 14:27:40 -07:00
azure_detect.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
backup.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
banner.py fix(banner): show correct update status on nix-built hermes (#17550) 2026-04-30 07:03:00 +05:30
browser_connect.py fix(browser): address Copilot review on /browser connect 2026-04-28 22:11:10 -07:00
callbacks.py fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902) 2026-04-14 16:11:37 -07:00
checkpoints.py feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
claw.py Merge origin/main and resolve conflict in nix/tui.nix 2026-05-07 22:56:19 +00:00
cli_output.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
codex_models.py feat(codex): add gpt-5.5 and wire live model discovery into picker (#14720) 2026-04-23 13:32:43 -07:00
colors.py feat: respect NO_COLOR env var and TERM=dumb (#4079) 2026-03-30 17:07:21 -07:00
commands.py Merge pull request #20805 from NousResearch/austin-feat-sessions-skills-menu 2026-05-07 18:54:16 -04:00
completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
config.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
copilot_auth.py fix(oauth,gateway): monotonic deadlines for polling/timeout loops 2026-05-07 05:09:39 -07:00
cron.py feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709) 2026-05-04 12:31:01 -07:00
curator.py feat(curator): add hermes curator list-archived command (#21236) 2026-05-07 05:46:51 -07:00
curses_ui.py fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
debug.py fix(debug): redact log content at upload time in hermes debug share 2026-05-03 11:42:20 -07:00
default_soul.py fix: reset default SOUL.md to baseline identity text (#3159) 2026-03-26 01:34:27 -07:00
dingtalk_auth.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
doctor.py fix(windows): auto-install Playwright Chromium + surface it in doctor 2026-05-08 14:27:40 -07:00
dump.py refactor(env): use shared Hermes dotenv loader 2026-05-05 10:13:13 -07:00
env_loader.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
fallback_cmd.py feat(cli): add 'hermes fallback' command to manage fallback providers (#16052) 2026-04-26 06:19:04 -07:00
gateway.py fix(windows): gateway status dedup + install.ps1 platform-SDK bootstrap 2026-05-08 14:27:40 -07:00
gateway_windows.py feat(windows): gateway as a Scheduled Task + Startup-folder fallback 2026-05-08 14:27:40 -07:00
goals.py fix(goals): auto-pause when judge model returns unparseable output 2026-05-07 17:33:09 -07:00
hooks.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
kanban.py feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435) 2026-05-07 13:04:41 -07:00
kanban_db.py fix(windows): os.kill(pid, 0) is NOT a no-op on Windows — route through new _pid_exists helper 2026-05-08 14:27:40 -07:00
kanban_diagnostics.py fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410) 2026-05-05 13:55:37 -07:00
kanban_specify.py feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435) 2026-05-07 13:04:41 -07:00
logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
main.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
mcp_config.py fix(mcp): give 'mcp add --command' a distinct argparse dest 2026-05-07 05:17:03 -07:00
memory_setup.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
model_catalog.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
model_normalize.py fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802) 2026-05-06 09:08:33 -07:00
model_switch.py fix(model_switch): live model discovery for custom_providers in /model picker 2026-05-07 05:21:26 -07:00
models.py feat(models): add paid tencent/hy3-preview route on OpenRouter (#21077) 2026-05-07 06:34:48 -07:00
nous_subscription.py feat(web): add SearXNG as a native search-only backend 2026-05-06 10:05:29 -07:00
oneshot.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
pairing.py fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325) 2026-05-07 07:18:21 -07:00
platforms.py feat: complete plugin platform parity — all 12 integration points 2026-04-29 21:56:51 -07:00
plugins.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
plugins_cmd.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
profile_distribution.py feat(profile): shareable profile distributions via git (#20831) 2026-05-08 10:04:32 -07:00
profiles.py fix(windows): os.kill(pid, 0) is NOT a no-op on Windows — route through new _pid_exists helper 2026-05-08 14:27:40 -07:00
providers.py fix: prevent bare 'custom' slug in model.provider (#17478) 2026-04-30 04:32:11 -07:00
pty_bridge.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
relaunch.py fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch 2026-05-08 14:27:40 -07:00
runtime_provider.py fix(credential_pool): resolve key mix-up when custom providers share base_url 2026-05-07 05:27:41 -07:00
setup.py fix: include terminal backend in quick setup wizard (#21842) 2026-05-08 17:36:38 +05:30
skills_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
skills_hub.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
skin_engine.py fix(tui): honor skin highlight colors (#20895) 2026-05-06 14:01:56 -07:00
slack_cli.py fix(paths): route achievements plugin + profile-tui through HERMES_HOME 2026-04-30 23:21:54 -07:00
status.py fix(status): add missing popular provider API keys to hermes status display 2026-05-04 05:14:13 -07:00
stdio.py fix(windows): quote cache paths in bash + augment PATH so rg/bash resolve on first launch 2026-05-08 14:27:40 -07:00
timeouts.py refactor(timeouts): drop redundant ImportError in except clause 2026-04-26 20:48:20 -07:00
tips.py feat: Ctrl+Enter inserts newline on Windows Terminal 2026-05-08 14:27:40 -07:00
tools_config.py feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags 2026-05-08 14:27:40 -07:00
uninstall.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
vercel_auth.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
voice.py fix(tui): restore voice push-to-talk parity (#20897) 2026-05-06 15:49:59 -07:00
web_server.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
webhook.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00