hermes-agent/plugins/platforms
Dineth Hettiarachchi 020ef76cf1 fix(discord): cancel _bot_task on connect() timeout to prevent zombie client
When connect() times out waiting for the Discord ready event, the background
asyncio.Task running client.start() was not cancelled. discord.py's internal
reconnect loop can ignore client.close() while a WebSocket handshake is in
flight, so the orphaned task eventually completes and fires on_ready.

A later successful reconnect then leaves two live Discord clients in the same
process — each with its own on_message handler and MessageDeduplicator instance
— so every @mention creates two threads because the per-adapter dedup caches
cannot catch cross-client duplicates.

Fix: explicitly cancel and await _bot_task in two places:
1. The asyncio.TimeoutError handler inside connect() — catches the case where
   the adapter's own inner wait_for fires before the gateway's outer timeout.
2. The start of disconnect() — the load-bearing path, always reached via
   _dispose_unused_adapter regardless of which timeout fired first.

Root cause confirmed from production logs: a Jun 8 network outage caused three
consecutive connect() timeouts. The first attempt's bot_task completed its
handshake 4 minutes later ("Connected as") with no preceding watcher line,
then the watcher's real reconnect also connected 90 seconds after that. The two
clients ran continuously for 41+ hours, confirmed by the same user message
appearing as two separate inbound events in two different thread IDs 357ms apart.

Regression tests added to tests/gateway/test_discord_connect.py:
- test_connect_timeout_cancels_bot_task: simulates a connect() timeout with a
  NeverReadyBot and asserts _bot_task is None afterward
- test_disconnect_cancels_running_bot_task: injects a live zombie task, calls
  disconnect(), and asserts the task is cancelled and the attribute cleared
2026-06-11 12:09:18 -07:00
..
discord fix(discord): cancel _bot_task on connect() timeout to prevent zombie client 2026-06-11 12:09:18 -07:00
google_chat fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
homeassistant refactor(gateway): migrate Home Assistant adapter to bundled plugin 2026-06-06 11:46:24 -07:00
irc fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
line fix(line): map inbound message types to the correct MessageType 2026-06-04 21:55:20 -07:00
mattermost refactor(gateway): migrate Mattermost adapter to bundled plugin 2026-05-24 18:05:33 -07:00
ntfy test(ntfy): cover echo-tag filter; tag standalone send path 2026-05-29 13:17:46 -07:00
photon fix(photon): production hardening for the gRPC-native iMessage channel (#42732) 2026-06-09 11:12:58 -04:00
simplex feat(simplex): groups, native attachments, text batching, auto-accept 2026-06-08 21:03:45 -07:00
teams chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00