Network errors through proxies (e.g. sing-box) can leave httpx
connections in a half-closed state occupying pool slots. After enough
reconnect cycles the 256-connection default fills up entirely, causing
Pool timeout: All connections in the connection pool are occupied.
Fix: cycle only the getUpdates request object (_request[0]) via
shut-down + re-initialize before restarting polling. This drains stale
connections without touching the general request (_request[1]) that
concurrent send_message / edit_message calls rely on.
The drain is applied to both _handle_polling_network_error and
_handle_polling_conflict reconnect paths via a shared
_drain_polling_connections() helper. Failures in the drain are
swallowed so reconnect always proceeds.
Based on #16466 by @Mirac1eSky.
* feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable
On some networks (university, corporate), api.telegram.org resolves to a
valid Telegram IP that is unreachable due to routing/firewall rules. A
different IP in the same Telegram-owned 149.154.160.0/20 block works fine.
This adds automatic fallback IP discovery at connect time:
1. Query Google and Cloudflare DNS-over-HTTPS for api.telegram.org A records
2. Exclude the system-DNS IP (the unreachable one), use the rest as fallbacks
3. If DoH is also blocked, fall back to a seed list (149.154.167.220)
4. TelegramFallbackTransport tries primary first, sticks to whichever works
No configuration needed — works automatically. TELEGRAM_FALLBACK_IPS env var
still available as manual override. Zero impact on healthy networks (primary
path succeeds on first attempt, fallback never exercised).
No new dependencies (uses httpx already in deps + stdlib socket).
* fix: share transport instance and downgrade seed fallback log to info
- Use single TelegramFallbackTransport shared between request and
get_updates_request so sticky IP is shared across polling and API calls
- Keep separate HTTPXRequest instances (different timeout settings)
- Downgrade "using seed fallback IPs" from warning to info to avoid
noisy logs on healthy networks
* fix: add telegram.request mock and discovery fixture to remaining test files
The original PR missed test_dm_topics.py and
test_telegram_network_reconnect.py — both need the telegram.request
mock module. The reconnect test also needs _no_auto_discovery since
_handle_polling_network_error calls connect() which now invokes
discover_fallback_ips().
---------
Co-authored-by: Mohan Qiao <Gavin-Qiao@users.noreply.github.com>
After a Telegram 502, _handle_polling_network_error calls updater.stop()
then start_polling(). If start_polling() also raises, the old code logged
a warning and returned — but the comment 'The next network error will
trigger another attempt' was wrong. The updater loop is dead after stop(),
so no further error callbacks ever fire. The gateway stays alive but
permanently deaf to messages.
Fix: when start_polling() fails in the except branch, schedule a new
_handle_polling_network_error task to continue the exponential backoff
retry chain. The task is tracked in _background_tasks (preventing GC).
Guarded by has_fatal_error to avoid spurious retries during shutdown.
Closes#3173.
Salvaged from PR #3177 by Mibayy.