mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-19 10:02:16 +00:00
fix(mcp): stability fix pack — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (#4757)
Four fixes for MCP server stability issues reported by community member (terminal lockup, zombie processes, escape sequence pollution, startup hang): 1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously, a hung MCP server could block the process_loop thread indefinitely, freezing the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work). 2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned by stdio_client via before/after snapshots of /proc children. On shutdown, _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from accumulating across sessions. 3. MCP event loop exception handler (mcp_tool.py): Installs _mcp_loop_exception_handler on the MCP background event loop — same pattern as the existing _suppress_closed_loop_errors on prompt_toolkit's loop. Suppresses benign 'Event loop is closed' RuntimeError from httpx transport __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen). 4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive() TTY detection. In non-interactive environments, build_oauth_auth() still returns a provider (cached tokens + refresh work), but the callback handler raises immediately instead of blocking the MCP event loop for 120s. Re-raises OAuth setup failures in _run_http so failed servers are reported cleanly without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465 (heathley). Closes #2537, closes #4462 Related: #4128, #3436
This commit is contained in:
parent
f374ae4c61
commit
cc54818d26
5 changed files with 431 additions and 12 deletions
13
cli.py
13
cli.py
|
|
@ -5016,11 +5016,18 @@ class HermesCLI:
|
|||
return # mcp_servers unchanged (some other section was edited)
|
||||
|
||||
self._config_mcp_servers = new_mcp
|
||||
# Notify user and reload
|
||||
# Notify user and reload. Run in a separate thread with a hard
|
||||
# timeout so a hung MCP server cannot block the process_loop
|
||||
# indefinitely (which would freeze the entire TUI).
|
||||
print()
|
||||
print("🔄 MCP server config changed — reloading connections...")
|
||||
with self._busy_command(self._slow_command_status("/reload-mcp")):
|
||||
self._reload_mcp()
|
||||
_reload_thread = threading.Thread(
|
||||
target=self._reload_mcp, daemon=True
|
||||
)
|
||||
_reload_thread.start()
|
||||
_reload_thread.join(timeout=30)
|
||||
if _reload_thread.is_alive():
|
||||
print(" ⚠️ MCP reload timed out (30s). Some servers may not have reconnected.")
|
||||
|
||||
def _reload_mcp(self):
|
||||
"""Reload MCP servers: disconnect all, re-read config.yaml, reconnect.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue