hermes-agent/plugins
xxxigm 6f1a176b33 fix(gateway/discord): REST liveness probe to detect zombie clients (#26656)
The Discord adapter could enter a silent zombie state after a network
outage / proxy stall: the process is alive, _client looks open, but the
underlying socket is dead. discord.py's WebSocket reconnect never sees a
RST through a wedged proxy/NAT, so client.start() spins forever without
exiting, and the bot-task done callback (which only fires on task
completion) never trips either. The bot stays "offline" in Discord
until a manual `hermes gateway restart`. Outages of 13-17h were reported.

Adds an out-of-band REST liveness probe in DiscordAdapter. Every
`discord.liveness_interval_seconds` (default 60s) the adapter issues a
cheap fetch_user(bot_id) — the same REST path as message delivery, so it
fails when the proxy/NAT is wedged. After
`discord.liveness_failure_threshold` consecutive failures (default 3) the
probe closes the wedged client and surfaces a retryable fatal error,
which trips the gateway's existing _platform_reconnect_watcher and
rebuilds the adapter. Operators disable it by setting either knob to 0.
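
The probe logic described above can be sketched roughly as follows. This
is a minimal illustration, not the adapter's actual code: the
`LivenessProbe` class, its `check` and `on_dead` callbacks, and the
per-probe timeout are all hypothetical names and assumptions. In the
real adapter, `check` would correspond to the fetch_user(bot_id) call
and `on_dead` to closing the client and raising the retryable error.

```python
import asyncio
from typing import Awaitable, Callable


class LivenessProbe:
    """Hypothetical helper that counts consecutive probe failures."""

    def __init__(
        self,
        check: Callable[[], Awaitable[object]],    # e.g. a cheap REST call
        on_dead: Callable[[], Awaitable[object]],  # e.g. close client, raise fatal
        interval_seconds: float = 60.0,
        failure_threshold: int = 3,
        timeout_seconds: float = 10.0,             # assumed, not from the commit
    ) -> None:
        self.check = check
        self.on_dead = on_dead
        self.interval = interval_seconds
        self.threshold = failure_threshold
        self.timeout = timeout_seconds
        self.failures = 0

    @property
    def enabled(self) -> bool:
        # Setting either knob to 0 disables the probe.
        return self.interval > 0 and self.threshold > 0

    async def probe_once(self) -> bool:
        """Run one check; return True once the failure threshold is reached."""
        try:
            await asyncio.wait_for(self.check(), timeout=self.timeout)
        except Exception:
            # Timeouts and transport errors both count as a failed probe.
            self.failures += 1
        else:
            # Any success resets the streak: only consecutive failures count.
            self.failures = 0
        return self.enabled and self.failures >= self.threshold

    async def run(self) -> None:
        """Probe every `interval` seconds until the client is declared dead."""
        if not self.enabled:
            return
        while True:
            await asyncio.sleep(self.interval)
            if await self.probe_once():
                await self.on_dead()
                return
```

Resetting the counter on every success means a single transient REST
error does not tear down a healthy client; only an unbroken run of
failures triggers the rebuild.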

Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets
policy; _apply_yaml_config bridges it to internal env vars the adapter
reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern.
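
As an illustration of the bridging step, a sketch under stated
assumptions: the env var names below are hypothetical (the commit only
says they follow the HERMES_DISCORD_TEXT_BATCH_* pattern), and
`bridge_liveness_config` and `read_int_env` are stand-in helpers, not
the real _apply_yaml_config. The config is taken as an already-parsed
dict so the example needs no YAML library.

```python
import os
from typing import Any, Mapping, MutableMapping, Optional

# Hypothetical env var names; the real ones are not given in the commit.
_LIVENESS_KEYS = {
    "liveness_interval_seconds": "HERMES_DISCORD_LIVENESS_INTERVAL_SECONDS",
    "liveness_failure_threshold": "HERMES_DISCORD_LIVENESS_FAILURE_THRESHOLD",
}


def bridge_liveness_config(
    config: Mapping[str, Any],
    env: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Copy discord.liveness_* values from parsed config into env vars."""
    env = os.environ if env is None else env
    discord_cfg = config.get("discord") or {}
    for key, env_name in _LIVENESS_KEYS.items():
        value = discord_cfg.get(key)
        if value is not None:
            env[env_name] = str(value)


def read_int_env(
    name: str,
    default: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Read an integer env var, falling back to the default if malformed."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default
```

Under these assumptions the adapter side would call something like
`read_int_env("HERMES_DISCORD_LIVENESS_INTERVAL_SECONDS", 60)`, so a
missing or malformed value falls back to the documented default rather
than crashing at startup.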

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-27 19:30:32 -07:00
browser fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
context_engine feat(context-engine): host contract for external context engines 2026-05-28 01:45:30 -07:00
cron_providers fix(cron): avoid provider package shadowing core cron 2026-06-23 23:39:22 -07:00
dashboard_auth fix(dashboard-auth): follow redirects on self-hosted OIDC discovery (#53399) 2026-06-27 14:14:51 +10:00
disk-cleanup 🐛 fix(disk-cleanup): avoid brittle sweep review issues 2026-06-15 05:25:27 -07:00
google_meet fix: prevent TUI gateway stdin EOF crash across all TUI-context subprocess calls 2026-06-08 22:46:57 -07:00
hermes-achievements revert(plugins): restore user dashboard plugin backend API auto-import (#43719) (#51950) 2026-06-24 07:46:54 -07:00
image_gen fix shape 2026-06-25 12:38:33 -07:00
kanban fix(security): sanitize kanban markdown html 2026-06-21 13:10:17 -07:00
memory revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
model-providers feat: add reasoning_effort support to ollama-cloud provider 2026-06-23 11:51:43 -07:00
observability fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns 2026-06-18 12:59:41 +05:30
platforms fix(gateway/discord): REST liveness probe to detect zombie clients (#26656) 2026-06-27 19:30:32 -07:00
security-guidance plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131) 2026-05-27 02:07:21 -07:00
spotify chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
teams_pipeline chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
video_gen fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
web fix(ddgs): bound DuckDuckGo search with a wall-clock timeout (#36776) 2026-06-25 01:45:06 +05:30
__init__.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
plugin_utils.py fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00