mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:01:47 +00:00
CPython's logging module is not reentrant-safe. `Logger.isEnabledFor`
caches level results in `Logger._cache`; under shutdown races the cache
can be cleared (`Manager._clear_cache`, triggered by logging config changes
from another thread) or caught mid-mutation when a signal fires, raising
`KeyError: <level_int>` (e.g. `KeyError: 10` for DEBUG) inside the signal
handler.
When that happens, the KeyError escapes before the `raise KeyboardInterrupt()`
on the next line can fire, which bypasses prompt_toolkit's normal interrupt
unwind and surfaces as the EIO cascade originally reported in #13710.
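The cache behaviour described above can be observed directly. This is a minimal sketch, not part of the patch: the logger names are hypothetical, and `_cache` is a private CPython detail that may change between versions. It shows only that a level lookup populates the cache and that a level change on an unrelated logger clears it; it does not reproduce the race itself.

```python
import logging

# Hypothetical logger names, used only for this illustration.
log = logging.getLogger("hermes.demo.cache")
log.setLevel(logging.INFO)

# The first lookup stores the answer in the private per-logger cache.
log.isEnabledFor(logging.DEBUG)
cached_levels = sorted(log._cache)

# A level change on any logger clears every logger's cache.
logging.getLogger("hermes.demo.other").setLevel(logging.WARNING)
after_clear = dict(log._cache)

print(cached_levels, after_clear)
```

A signal that lands between such a clear and the next cache read is the window the commit message is describing.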
The fix for #13710 shipped two defenses (an asyncio exception handler plus
an outer `except (KeyError, OSError)` with EIO suppression) that cover the
EIO unwind path. This patch closes the remaining escape hatch: the
`logger.debug` call at the top of `_signal_handler` itself. Wrap it in
`try/except Exception: pass` (not a bare `except:`) so logging can never
raise through a signal handler.
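The guard can be sketched in isolation as follows. This is an illustrative reduction, not the code in `cli.py`: `RacyLogger`, `make_signal_handler`, `run_handler`, and the `on_interrupt` callback are hypothetical names, and the failing logger simply raises `KeyError(10)` to stand in for the racy cache lookup.

```python
class RacyLogger:
    """Hypothetical stand-in that fails the way the racy lookup does."""

    def debug(self, *args, **kwargs):
        raise KeyError(10)


def make_signal_handler(log, on_interrupt=None):
    """Build a handler whose logging call cannot raise (sketch only)."""

    def _signal_handler(signum, frame):
        try:
            log.debug("Received signal %s, triggering graceful shutdown", signum)
        except Exception:
            pass  # logging must never raise through a signal handler
        try:
            if on_interrupt is not None:
                on_interrupt(f"received signal {signum}")
        except Exception:
            pass  # assumption: interrupt failures are contained as well
        raise KeyboardInterrupt()

    return _signal_handler


def run_handler(handler, signum=15):
    """Call the handler directly and report which exception type escapes."""
    try:
        handler(signum, None)
    except BaseException as exc:
        return type(exc).__name__
    return None


calls = []
escaped = run_handler(make_signal_handler(RacyLogger(), calls.append))
print(escaped, calls)
```

Even with the logger failing, the interrupt callback still runs and the only exception that leaves the handler is `KeyboardInterrupt`, which is what the normal interrupt unwind expects.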
Observed in the wild: debug report on 0.12.0 (commit 8163d371) shows the
exact stack — KeyError: 10 at logging/__init__.py:1742 inside the
signal handler's `logger.debug`, followed by the EIO cascade from
prompt_toolkit's emergency flush.
Tests: adds `TestSignalHandlerLoggingRace` to
`tests/hermes_cli/test_suppress_eio_on_interrupt.py` with 6 new cases:
- normal path still raises KeyboardInterrupt
- KeyError(10) from logger.debug does not escape
- any Exception from logger.debug is swallowed
- agent.interrupt still fires when logger.debug raises
- agent.interrupt raising also does not escape
- BaseException (SystemExit) is NOT swallowed — guard uses `except Exception`
deliberately so real shutdown signals still propagate
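The swallow-versus-propagate cases in the list above can be sketched roughly like this. It is not the actual `TestSignalHandlerLoggingRace` code: `guarded_debug` and `escapes` are hypothetical helpers that mirror only the `except Exception` guard, with `unittest.mock` standing in for the logger.

```python
from unittest import mock


def guarded_debug(log, signum):
    """Hypothetical helper mirroring the guard: swallow Exception only."""
    try:
        log.debug("Received signal %s, triggering graceful shutdown", signum)
    except Exception:
        pass


def escapes(side_effect):
    """Return the name of the exception type that escapes, or None."""
    log = mock.Mock()
    log.debug.side_effect = side_effect
    try:
        guarded_debug(log, 2)
    except BaseException as exc:
        return type(exc).__name__
    return None


results = {
    "KeyError": escapes(KeyError(10)),
    "RuntimeError": escapes(RuntimeError("boom")),
    "SystemExit": escapes(SystemExit(1)),
}
print(results)
```

`SystemExit` derives from `BaseException` rather than `Exception`, so it passes straight through the guard while `KeyError` and `RuntimeError` are absorbed.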
Closes #13710 regression.
parent a6f5f9c484
commit e70e49016f
2 changed files with 135 additions and 1 deletions
cli.py (16 lines changed)
@@ -11876,8 +11876,22 @@ class HermesCLI:
         call _kill_process (SIGTERM + 1 s wait + SIGKILL if needed) →
         return from _wait_for_process. ``time.sleep`` releases the
         GIL so the daemon actually runs during the window.
+
+        Guarded ``logger.debug``: CPython's ``logging`` module is not
+        reentrant-safe. ``Logger.isEnabledFor`` caches level results
+        in ``Logger._cache``; under shutdown races the cache can be
+        cleared (``_clear_cache``) or mid-mutation when the signal
+        fires, raising ``KeyError: <level_int>`` (e.g. ``KeyError: 10``
+        for DEBUG) inside the handler. That KeyError then escapes
+        before ``raise KeyboardInterrupt()`` can fire, which bypasses
+        prompt_toolkit's normal interrupt unwind and surfaces as the
+        EIO cascade from issue #13710. Wrap the log in a bare
+        ``try/except`` so the handler can never raise through it.
         """
-        logger.debug("Received signal %s, triggering graceful shutdown", signum)
+        try:
+            logger.debug("Received signal %s, triggering graceful shutdown", signum)
+        except Exception:
+            pass  # never let logging raise from a signal handler (#13710 regression)
         try:
             if getattr(self, "agent", None) and getattr(self, "_agent_running", False):
                 self.agent.interrupt(f"received signal {signum}")