mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-27 11:22:03 +00:00
When a Telegram long-poll TCP socket enters CLOSE-WAIT (remote sent FIN but httpx hasn't noticed), epoll still reports it readable so no exception is raised. PTB's error_callback never fires, the reconnect ladder never engages, and the gateway silently stops receiving messages while the process stays alive — until a manual systemctl restart. The existing recovery only covers two cases: error_callback-driven reconnects (which require an exception PTB never gets) and a one-shot _verify_polling_after_reconnect probe (which runs only right after an explicit reconnect). A socket that wedges during steady-state operation is never detected. Add _polling_heartbeat_loop: a background asyncio.Task started in connect() (polling mode only) that probes get_me() every 90s on the general request pool (not the getUpdates pool, so healthy long-polls are never interrupted). On asyncio.TimeoutError/OSError it hands off to the existing _handle_polling_network_error ladder; other errors are swallowed. disconnect() cancels and awaits the task. Worst-case detection window ~105s. Complementary to #51541 (general-pool keepalive limits / fd leak) — that recycles idle pooled connections; this detects a wedged active read. Fixes #48495 Co-authored-by: agt-user <267614622+agt-user@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| dingtalk | ||
| discord | ||
| feishu | ||
| google_chat | ||
| homeassistant | ||
| irc | ||
| line | ||
| matrix | ||
| mattermost | ||
| ntfy | ||
| photon | ||
| raft | ||
| simplex | ||
| slack | ||
| sms | ||
| teams | ||
| telegram | ||
| wecom | ||