mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-30 11:52:04 +00:00
fix(dashboard): reap PTY bridge on child EOF, not only in writer finally (#54190)
The /api/pty handler only closed the PtyBridge in the writer loop's finally. On child EOF the reader task closes the WebSocket, but if the handler task is cancelled the instant the socket closes, the writer's finally can be skipped and the PTY fds leak (#54028) — the FD-leak the regression test guards. Under dashboard auto-reconnect this stacks orphaned PTYs until fds are exhausted. Reap the bridge in the reader's EOF finally too (close() is idempotent), so the PTY is reaped independently of the writer-loop cancellation race. Harden the regression test to poll for teardown instead of asserting on the same tick. Was flaky on main (2/20); now 25/25.
This commit is contained in:
parent
7968c90318
commit
a06d0198cd
2 changed files with 18 additions and 0 deletions
|
|
@ -12210,6 +12210,16 @@ async def pty_ws(ws: WebSocket) -> None:
|
|||
# handler never reaches its ``finally`` and the PTY's fds leak.
|
||||
# With dashboard auto-reconnect (#52962) every dropped socket then
|
||||
# stacks a fresh PTY on top of the orphaned one, exhausting fds.
|
||||
#
|
||||
# Reap the bridge here too (close() is idempotent): on child EOF the
|
||||
# writer loop's ``finally`` is the usual closer, but if the handler
|
||||
# task is cancelled the instant we close the WS, that ``finally``
|
||||
# can be skipped, leaking the PTY. Closing from the EOF path makes
|
||||
# the reap independent of that cancellation race (#54028).
|
||||
try:
|
||||
await asyncio.to_thread(bridge.close)
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
await ws.close()
|
||||
except Exception:
|
||||
|
|
|
|||
|
|
@ -165,4 +165,12 @@ def test_child_eof_closes_socket_and_bridge(pty_client, monkeypatch):
|
|||
conn.receive_bytes()
|
||||
|
||||
assert len(bridges) == 1
|
||||
# bridge.close() runs in the handler's `finally` via asyncio.to_thread,
|
||||
# which can lag the client-side context exit by a tick or two. Poll briefly
|
||||
# instead of asserting immediately so the teardown isn't a race.
|
||||
import time
|
||||
|
||||
deadline = time.monotonic() + 5.0
|
||||
while not bridges[0].closed and time.monotonic() < deadline:
|
||||
time.sleep(0.01)
|
||||
assert bridges[0].closed is True
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue