mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-05 07:41:39 +00:00
fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507)
The helper used to call ``socket.shutdown(SHUT_RDWR)`` followed by ``socket.close()`` to drop CLOSE-WAIT entries immediately. On its own, ``shutdown()`` is safe from any thread (it only sends FIN and breaks pending ``recv``/``send``), but ``close()`` releases the FD integer to the kernel.

When the helper runs on a stranger thread (the interrupt loop, the stale-call detector), the FD release races the owning httpx worker thread, which still has the same integer cached inside the SSL BIO. The kernel then recycles that integer to the next ``open()`` call (in production, the kanban dispatcher's ``kanban.db``), and the worker's delayed TLS flush writes a 24-byte TLS application-data record on top of the SQLite header.

Restrict the helper to ``shutdown(SHUT_RDWR)`` only. The owning httpx worker's own unwind will close the underlying socket via the same Python ``socket.socket`` object, which atomically swaps ``_fd`` to -1 before issuing ``close(2)``, so there is no FD-aliasing window.

The log field ``tcp_force_closed=N`` is kept (now counts shutdowns) so existing dashboards / log parsers keep working.
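The difference between the two calls can be observed on a plain socket pair. The following is a minimal sketch, not the project's helper: it assumes a POSIX-like platform, and the variable names are illustrative. It shows that ``shutdown(SHUT_RDWR)`` leaves the descriptor number owned by the socket object, while ``close()`` hands it back to the kernel, where the next ``open()`` may receive the same integer.

```python
import os
import socket

# A connected pair stands in for the httpx worker's TCP connection.
a, b = socket.socketpair()
fd_before = a.fileno()

# shutdown() breaks the connection but keeps the descriptor allocated.
a.shutdown(socket.SHUT_RDWR)
fd_after_shutdown = a.fileno()   # same integer as fd_before
peer_data = b.recv(16)           # peer observes EOF as an empty read

# close() releases the integer; the socket object now reports -1.
a.close()
fd_after_close = a.fileno()

# The kernel hands out the lowest free descriptor, so an unrelated open()
# will often get the integer that was just released. This is only
# illustrated here, not asserted, because other open files in the
# process can change which number is free.
new_fd = os.open(os.devnull, os.O_RDONLY)
recycled = (new_fd == fd_before)
os.close(new_fd)
b.close()
```

Any code that still holds the raw value of ``fd_before`` after the ``close()`` call would be writing to whatever file later receives that number, which is the aliasing window the commit message describes.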
parent 53cb6d32be
commit e2a7d73a66
2 changed files with 47 additions and 16 deletions
@@ -190,7 +190,13 @@ def test_replace_primary_openai_client_survives_repeated_rebuilds():


 def test_force_close_tcp_sockets_descends_httpcore_1_connection_wrapper():
-    """httpcore 1.x stores the real stream below conn._connection."""
+    """httpcore 1.x stores the real stream below conn._connection.
+
+    Post-#29507: the helper must shut sockets down but must NOT release the
+    FD via ``sock.close()`` — that race recycled FDs into unrelated file
+    descriptors (kanban.db) and let TLS bytes overwrite SQLite headers. The
+    owning httpx thread is responsible for closing FDs on its own unwind.
+    """
     from agent.agent_runtime_helpers import force_close_tcp_sockets

     class FakeSocket:
@@ -215,4 +221,6 @@ def test_force_close_tcp_sockets_descends_httpcore_1_connection_wrapper():
     assert force_close_tcp_sockets(openai_client) == 1
     assert sock.shutdown_calls == 1
-    assert sock.close_calls == 1
+    # #29507: close() must NOT be called from this helper — the owning
+    # httpx worker thread releases the FD, not us.
+    assert sock.close_calls == 0
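The assertions in this test depend on a socket double that counts calls rather than touching a real descriptor. The sketch below shows that pattern in isolation. ``FakeSocket`` here is a hypothetical reconstruction (the diff elides the real class body), and ``shutdown_all`` is an illustrative stand-in, not the project's ``force_close_tcp_sockets``. Skipping sockets whose ``shutdown()`` raises ``OSError`` is also an assumption.

```python
import socket


class FakeSocket:
    """Hypothetical test double: records calls, owns no real FD."""

    def __init__(self, fail_shutdown=False):
        self.fail_shutdown = fail_shutdown
        self.shutdown_calls = 0
        self.close_calls = 0

    def shutdown(self, how):
        self.shutdown_calls += 1
        if self.fail_shutdown:
            # Mimics a socket that is already disconnected.
            raise OSError("not connected")

    def close(self):
        self.close_calls += 1


def shutdown_all(socks):
    """Illustrative shutdown-only loop; returns how many shutdowns succeeded."""
    count = 0
    for sock in socks:
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            # Assumed behaviour: a failed shutdown is skipped, not counted.
            continue
        count += 1
    return count


healthy = FakeSocket()
broken = FakeSocket(fail_shutdown=True)
shut = shutdown_all([healthy, broken])
```

Because the loop never calls ``close()``, both doubles end with ``close_calls == 0``, which is the property the updated assertion checks.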