hermes-agent/website/docs/reference
xxxigm 6f1a176b33 fix(gateway/discord): REST liveness probe to detect zombie clients (#26656)
The Discord adapter could enter a silent zombie state after a network
outage or proxy stall: the process is alive and _client looks open, but
the underlying socket is dead. discord.py's WebSocket reconnect never
sees a RST through a wedged proxy/NAT, so client.start() spins forever
without exiting. The bot-task done callback fires only on task
completion, so it never trips either. The bot stays "offline" in Discord
until a manual `hermes gateway restart`; outages of 13-17h were reported.

Adds an out-of-band REST liveness probe in DiscordAdapter. Every
`discord.liveness_interval_seconds` (default 60s) the adapter issues a
cheap fetch_user(bot_id) — the same REST path as message delivery, so it
fails when the proxy/NAT is wedged. After
`discord.liveness_failure_threshold` consecutive failures (default 3) the
probe closes the wedged client and surfaces a retryable fatal error,
which trips the gateway's existing _platform_reconnect_watcher and
rebuilds the adapter. Operators disable it by setting either knob to 0.
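
The probe loop described above can be sketched roughly as follows. This
is a minimal illustration, not the adapter's actual code: the
`LivenessProbe` class, its parameter names, and the `timeout_seconds`
knob are hypothetical. In the real adapter the `check` callable would
wrap something like fetch_user(bot_id), and `on_dead` would close the
client and raise the retryable fatal error.

```python
import asyncio
from typing import Awaitable, Callable


class LivenessProbe:
    """Hypothetical helper: counts consecutive failures of an async check."""

    def __init__(
        self,
        check: Callable[[], Awaitable[object]],
        on_dead: Callable[[], Awaitable[None]],
        interval_seconds: float = 60.0,
        failure_threshold: int = 3,
        timeout_seconds: float = 10.0,  # assumption: not named in the commit
    ) -> None:
        self.check = check
        self.on_dead = on_dead
        self.interval = interval_seconds
        self.threshold = failure_threshold
        self.timeout = timeout_seconds
        self.failures = 0

    @property
    def enabled(self) -> bool:
        # Setting either knob to 0 disables the probe.
        return self.interval > 0 and self.threshold > 0

    async def probe_once(self) -> bool:
        """Run one check; return True once the failure threshold is reached."""
        try:
            # Bound the call so a wedged socket cannot hang the probe itself.
            await asyncio.wait_for(self.check(), timeout=self.timeout)
        except Exception:
            self.failures += 1
        else:
            # Any success resets the streak: only consecutive failures count.
            self.failures = 0
        return self.failures >= self.threshold

    async def run(self) -> None:
        """Probe every `interval` seconds until the client is declared dead."""
        if not self.enabled:
            return
        while True:
            await asyncio.sleep(self.interval)
            if await self.probe_once():
                await self.on_dead()
                return
```

Resetting the counter on any success keeps a single transient REST
error from tearing down an otherwise healthy client.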

Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets
policy; _apply_yaml_config bridges it to internal env vars the adapter
reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern.
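
As a rough illustration of that bridging step, the sketch below copies
the two discord.liveness_* keys from a parsed config.yaml dict into
environment variables. The function name and both env var names are
hypothetical, chosen only to resemble the HERMES_DISCORD_TEXT_BATCH_*
pattern; the names used by _apply_yaml_config may differ.

```python
import os
from typing import MutableMapping, Optional

# Hypothetical env var names, modelled on HERMES_DISCORD_TEXT_BATCH_*.
_LIVENESS_KEYS = {
    "liveness_interval_seconds": "HERMES_DISCORD_LIVENESS_INTERVAL_SECONDS",
    "liveness_failure_threshold": "HERMES_DISCORD_LIVENESS_FAILURE_THRESHOLD",
}


def bridge_liveness_config(
    config: dict,
    environ: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Copy discord.liveness_* values from parsed YAML into env vars."""
    target = os.environ if environ is None else environ
    discord_cfg = config.get("discord") or {}
    for key, env_name in _LIVENESS_KEYS.items():
        value = discord_cfg.get(key)
        if value is not None:
            # Keep 0 as "0" so the adapter can treat it as "disabled".
            target[env_name] = str(value)
    return target
```

Unset keys are left alone here, so the adapter's own defaults (60s and
3 failures) would still apply.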

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-27 19:30:32 -07:00
_category_.json
automation-blueprints-catalog.mdx docs: finish Automation Blueprints terminology rebrand (#44470) 2026-06-11 17:22:22 -04:00
cli-commands.md feat(moa): make /moa one-shot only; route preset switching through the model picker 2026-06-27 03:09:09 -07:00
environment-variables.md fix(gateway/discord): REST liveness probe to detect zombie clients (#26656) 2026-06-27 19:30:32 -07:00
faq.md feat(docs): clarify platform support 2026-06-26 11:37:56 -07:00
mcp-config-reference.md refactor: remove agent-callable send_message tool (#47856) 2026-06-17 07:11:23 -07:00
model-catalog.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
optional-skills-catalog.md fix(docs): regenerate skill docs to fix stale cross-links, add tool-search to sidebar 2026-06-20 20:42:49 -07:00
profile-commands.md fix(profile): make clone-from a full source selector 2026-06-13 07:33:58 -07:00
skills-catalog.md Merge remote-tracking branch 'origin/main' into bb/pets 2026-06-22 05:25:49 -05:00
slash-commands.md feat(skills): /learn — distill a reusable skill from anything you describe (#51506) 2026-06-23 13:51:28 -07:00
tools-reference.md feat(docs): clarify platform support 2026-06-26 11:37:56 -07:00
toolsets-reference.md feat(docs): clarify platform support 2026-06-26 11:37:56 -07:00