mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-23 10:42:00 +00:00
When a Windows user relaunches Hermes while an in-app update is still running (the desktop vanished with no progress and looks crashed), the fresh instance spawns its own dashboard backend. That backend re-locks the venv shim, the updater's straggler cleanup (force_kill_other_hermes -> taskkill /F /T /IM hermes.exe) kills it, the launch dies with the 45s "backend didn't come up" timeout, and the user relaunches into the same trap -- an infinite respawn/kill loop (#50238). Root cause: no mutual exclusion between an applying update and a fresh desktop spawning its own local backend. Fix: the updater publishes a HERMES_HOME/.hermes-update-in-progress marker (pid + start time) for the whole run via an RAII drop-guard that removes it on every exit path (success, early return, panic). A freshly-launched desktop checks the marker before spawning its local backend and PARKS until the update finishes -- then brings the backend up itself (it is the surviving instance; the updater's own relaunch hits the single-instance lock and quits). A stale marker (dead pid or past a 20-minute ceiling) is pruned so a crashed updater can never strand future launches. No rogue backend spawns mid-update, so force_kill_other_hermes has nothing legitimate to kill. Marker parse/staleness logic is extracted to update-marker.cjs and unit-tested; the Rust guard has unit tests; the Rust-write <-> JS-read contract is E2E-verified. |
||
|---|---|---|
| .. | ||
| bootstrap-installer | ||
| desktop | ||
| shared | ||