hermes-agent/plugins/platforms
xxxigm 6f1a176b33 fix(gateway/discord): REST liveness probe to detect zombie clients (#26656)
The Discord adapter could enter a silent zombie state after a network
outage / proxy stall: the process is alive, _client looks open, but the
underlying socket is dead. discord.py's WebSocket reconnect never sees a
RST through a wedged proxy/NAT, so client.start() spins forever without
exiting — which means the bot-task done callback (which only fires on
task completion) never trips either. The bot stays "offline" in Discord
until a manual `hermes gateway restart`. Reported offline for 13-17h.

Adds an out-of-band REST liveness probe in DiscordAdapter. Every
`discord.liveness_interval_seconds` (default 60s) the adapter issues a
cheap fetch_user(bot_id) — the same REST path as message delivery, so it
fails when the proxy/NAT is wedged. After
`discord.liveness_failure_threshold` consecutive failures (default 3) the
probe closes the wedged client and surfaces a retryable fatal error,
which trips the gateway's existing _platform_reconnect_watcher and
rebuilds the adapter. Operators disable it by setting either knob to 0.

Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets
policy; _apply_yaml_config bridges it to internal env vars the adapter
reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern.

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-27 19:30:32 -07:00
..
dingtalk fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
discord fix(gateway/discord): REST liveness probe to detect zombie clients (#26656) 2026-06-27 19:30:32 -07:00
email fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
feishu fix(gateway,feishu): refuse executor resurrection during real shutdown 2026-06-27 04:13:09 -07:00
google_chat fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
homeassistant fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
irc fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
line fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
matrix fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
mattermost fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
ntfy fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
photon revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
raft revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
simplex fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
slack fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
sms fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
teams fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
telegram fix(telegram): surface failed media downloads to user and agent, not a silent empty turn (#53912) 2026-06-27 19:12:57 -07:00
wecom fix(telegram): preserve Bot API update queue on watcher reconnect 2026-06-25 21:29:57 -07:00
whatsapp revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00