fix(update): quarantine hermes.exe vs concurrent Windows instance (#26670) (#26677)

* fix(update): detect concurrent hermes.exe on Windows; retry + restart-defer quarantine

Closes #26670.

When 'hermes update' runs on Windows with another hermes.exe alive (most
commonly the Hermes Desktop Electron app's spawned backend) _quarantine_running_hermes_exe()
fails to rename the venv shim with [WinError 32]. uv pip install -e .
then exits 2, the git-pull fast path is silently abandoned, and the ZIP
fallback runs (and fails the same way) before eventually succeeding.

This change implements three of the five proposed fixes from the issue:

1. Concurrent-instance detection (preferred fix). _detect_concurrent_hermes_instances()
   uses psutil to enumerate processes whose .exe is one of our venv shims
   (hermes.exe / hermes-gateway.exe), excluding the caller's PID. When any
   match exists, cmd_update prints an actionable message naming the
   blocking PIDs and exits 2 BEFORE any destructive work. New --force flag
   bypasses the gate.

2. Retry + restart-deferred fallback. _quarantine_running_hermes_exe()
   now retries the rename up to 4 times with 100/250/500/1000 ms backoff
   (covers the transient AV-scanner-handle case). If all retries fail,
   it schedules the replacement via MoveFileExW with the OS deferred-rename
   flag so the new shim can land at the original path and the update
   completes; the old image is fully unloaded after the user's next
   system restart.

3. Actionable warning text. The old 'Could not quarantine: [WinError 32]'
   warning is replaced with one that names the likely culprits (Hermes
   Desktop, REPLs, gateway, AV) and points to the new --force flag.

Tests:
- 13 new tests in tests/hermes_cli/test_update_concurrent_quarantine.py
  covering: psutil-based enumeration, self-pid exclusion, case-insensitive
  matching of .EXE, no-psutil graceful degradation, off-Windows no-op,
  helpful warning formatting, retry-then-succeed, restart-deferred fallback,
  cmd_update abort + exit code 2, and --force bypass.
- New autouse fixture in tests/hermes_cli/conftest.py defaults
  _detect_concurrent_hermes_instances to [] so the rest of the suite
  isn't tripped by the developer's own running hermes.exe. Opt-out marker
  'real_concurrent_gate' registered in pyproject.toml.
- Updating docs page (website/docs/getting-started/updating.md) gains a
  short section explaining the new Windows error and remediation.

* chore: refresh uv.lock to match pyproject.toml exact pins

aiohttp 3.13.4 -> 3.13.3 (matches pyproject pin: aiohttp==3.13.3)
anthropic 0.87.0 -> 0.86.0 (matches pyproject pin: anthropic==0.86.0)
hermes-agent 0.13.0 -> 0.14.0 (matches pyproject version)

CI's uv lock --check was failing on the merged state because main
drifted: pyproject.toml uses exact == pins for those two deps and the
hermes-agent version was bumped to 0.14.0 but the lockfile still had
0.13.0.
This commit is contained in:
Teknium 2026-05-19 11:10:51 -07:00 committed by GitHub
parent 57af46fae2
commit 2a7308b7c4
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 591 additions and 9 deletions

View file

@ -69,6 +69,26 @@ updates:
`--backup` was the always-on behavior in earlier builds, but it was adding minutes to every update on large homes, so it's now opt-in. The lightweight pairing-data snapshot above still runs unconditionally.
### Windows: another `hermes.exe` is running
On Windows, `hermes update` will refuse to run if it detects another `hermes.exe` process holding the venv's entry-point executable open — most commonly the Hermes Desktop app's spawned backend, an open `hermes` REPL in another terminal, or a running gateway:
```
$ hermes update
✗ Another hermes.exe is running:
PID 12345 hermes.exe
Updating now would fail to overwrite ...\venv\Scripts\hermes.exe because
Windows blocks REPLACE on a running executable.
Close Hermes Desktop, exit any open `hermes` REPLs, and
stop the gateway (`hermes gateway stop`) before retrying.
Override with `hermes update --force` if you've already
confirmed those processes will not write to the venv.
```
Close the listed processes and re-run. If you're sure the concurrent process won't interfere (rare — usually only useful when an antivirus shim is mis-attributed), pass `--force` to skip the check. In that case the updater will still retry the `.exe` rename with exponential backoff and, on stubborn locks, schedule the replacement for next reboot via `MoveFileEx(MOVEFILE_DELAY_UNTIL_REBOOT)` so the update can complete.
Expected output looks like:
```