Two mitigations for the CLOSE_WAIT accumulation reported against QQ Bot
+ Feishu on macOS behind Cloudflare Warp.
1. Shared httpx.Limits helper (gateway/platforms/_http_client_limits.py).
Every long-lived platform adapter now constructs httpx.AsyncClient
with max_keepalive_connections=10 and keepalive_expiry=2.0, versus
httpx's defaults of 20 keepalive connections and a 5.0 s expiry. On
macOS/Warp the default 5 s window let idle keepalive sockets sit in
CLOSE_WAIT long enough for the six persistent adapters (QQ Bot, WeCom,
DingTalk, Signal, BlueBubbles, WeCom-callback) plus the transient
Feishu helper to collectively exhaust macOS's default 256-fd ulimit.
Both values are tunable via the HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY
and HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE env vars.
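A minimal sketch of what the shared helper might look like. The env-var
names and defaults come from this change; the function name and the
kwargs-dict shape are assumptions for illustration (the real helper lives
in gateway/platforms/_http_client_limits.py):

```python
import os

# Tightened defaults from this change (httpx itself defaults to 20 / 5.0).
DEFAULT_MAX_KEEPALIVE = 10
DEFAULT_KEEPALIVE_EXPIRY = 2.0


def keepalive_settings() -> dict:
    """Read keepalive tuning from the gateway env vars, with fallbacks.

    Returns kwargs suitable for httpx.Limits(**keepalive_settings()).
    """
    return {
        "max_keepalive_connections": int(os.environ.get(
            "HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE",
            str(DEFAULT_MAX_KEEPALIVE))),
        "keepalive_expiry": float(os.environ.get(
            "HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY",
            str(DEFAULT_KEEPALIVE_EXPIRY))),
    }

# Usage in an adapter (assumes httpx is installed):
#   import httpx
#   client = httpx.AsyncClient(limits=httpx.Limits(**keepalive_settings()))
```

Returning plain kwargs keeps the helper importable even where httpx
itself is not, which is one reasonable way to structure it.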
2. whatsapp.send_typing aiohttp leak. The call was
'await self._http_session.post(...)' with no 'async with' and no
variable capture, so the returned ClientResponse went out of scope
unclosed, holding its TCP socket in CLOSE_WAIT until garbage
collection finalized it. Fixed by wrapping the call in 'async with',
which closes the response on exit. An audit confirmed this was the
only bare-await aiohttp leak in the gateway/tools/plugins tree; every
other aiohttp call site already uses the context-manager pattern.
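The leak pattern and its fix can be demonstrated without a network or
aiohttp itself, using stand-ins for aiohttp's request context manager
(FakeSession/FakeResponse below are illustrative, not the real adapter
code):

```python
import asyncio


class FakeResponse:
    """Stand-in for aiohttp.ClientResponse: holds a 'socket' until closed."""
    open_sockets = 0  # class-wide count of unclosed responses

    def __init__(self):
        FakeResponse.open_sockets += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeResponse.open_sockets -= 1


class _RequestCM:
    """Mimics aiohttp's request object: both awaitable and an async CM."""

    def __init__(self):
        self._resp = FakeResponse()

    def __await__(self):
        # 'await session.post(...)' path — caller is responsible for closing.
        async def _coro():
            return self._resp
        return _coro().__await__()

    async def __aenter__(self):
        # 'async with session.post(...)' path.
        return self._resp

    async def __aexit__(self, *exc):
        self._resp.close()  # response (and its socket) released on exit


class FakeSession:
    def post(self, url, **kwargs):
        return _RequestCM()


async def main():
    session = FakeSession()

    # Buggy pattern: response never closed -> socket lingers in CLOSE_WAIT.
    await session.post("/typing")
    assert FakeResponse.open_sockets == 1  # leaked

    # Fixed pattern: 'async with' closes the response deterministically.
    async with session.post("/typing"):
        pass
    assert FakeResponse.open_sockets == 1  # only the earlier leak remains


asyncio.run(main())
```

The key point is that aiohttp's bare-await form hands back an open
response whose socket survives until someone calls close(), while the
context-manager form releases it deterministically.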
The original reporter also saw Feishu SDK (lark-oapi) connections in
CLOSE_WAIT. Those sockets are opened inside the SDK and are outside our
direct control, but tightening httpx keepalive across our own adapters
reduces aggregate pool pressure regardless of which individual adapter
leaks.