mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-24 10:52:21 +00:00
When the Node spectrum-ts sidecar process exited mid-session (crash,
OOM, upstream overflow escalation), _supervise_sidecar returned
silently — readline hit EOF, the log-pump loop broke, and nothing
notified the gateway. _inbound_loop entered an infinite retry loop
against a dead port, _running stayed True, and the adapter remained
in self.adapters with no path to self-recovery short of a manual
gateway restart.
Add a death-detection tail to _supervise_sidecar: after the log-pump
exits (EOF or exception), guard on _inbound_running to distinguish
unexpected death from a deliberate disconnect(). On unexpected exit,
call _set_fatal_error("SIDECAR_CRASHED", retryable=True) followed by
_notify_fatal_error() so the reconnect watcher picks up the platform
within 30 s and retries with exponential backoff (30 s → 300 s cap)
until the sidecar comes back up. All other platforms remain unaffected.
The _inbound_running guard is safe against races: disconnect() sets
_inbound_running = False before _stop_sidecar() cancels the supervisor
task. CancelledError is BaseException, not Exception, so it bypasses
the except clause and propagates normally — the detection block never
runs during a clean shutdown.
|
||
|---|---|---|
| .. | ||
| browser | ||
| context_engine | ||
| cron | ||
| dashboard_auth | ||
| disk-cleanup | ||
| google_meet | ||
| hermes-achievements | ||
| image_gen | ||
| kanban | ||
| memory | ||
| model-providers | ||
| observability | ||
| platforms | ||
| security-guidance | ||
| spotify | ||
| teams_pipeline | ||
| video_gen | ||
| web | ||
| __init__.py | ||
| plugin_utils.py | ||