fix(mcp): address adversarial review round 1 (cache parity, gates, races)

Consolidated findings from three independent reviewers (Codex, Claude Code, a
Hermes subagent w/ the hermes-agent-dev skill):

- BLOCKING: refresh_agent_mcp_tools rebuilt only the registry subset, silently
  dropping post-build-injected memory-provider (mem0/honcho/…) and context-
  engine (lcm_*) tools on every refresh. Now additive-preserving: re-applies
  the same injectors agent_init uses, staged on locals and published atomically.
- Re-injection now honors the #5544 enabled_toolsets gate for context-engine
  tools, so a restricted-toolset platform can't get lcm_* leaked back in.
- Atomic read-diff-publish under one lock: the returned `added` set and the
  (tools, valid_tool_names) pair are consistent even under concurrent callers
  (no half-swap, no TOCTOU).
- background_review fork opts out (_skip_mcp_refresh) so its byte-identical
  tools[] cache parity with the parent is preserved.
- CLI /reload-mcp routed through the shared helper (was a 4th divergent copy
  with the same clobber bug + missing disabled_toolsets).
- Explicit reloads (TUI RPC + CLI) pass enabled_override so a server the user
  just enabled in config this session is picked up; automatic paths reuse the
  agent's build-time selection.
- mcp_discovery_timeout default 5.0 -> 1.5s: correctness now comes from the
  between-turns refresh, so the startup wait is only a small turn-1 UX bump
  rather than a heavy dead-server latency penalty.
- has_registered_mcp_tools checks registered TOOLS (not connected servers) so a
  zero-tool/prompt-only server doesn't make the per-turn hook fire forever.
- Tests: rewrote the thread-safety test to actually exercise the write path
  (alternating tool sets), added the #5544-gate regression, the memory/context
  preservation regression, and a "callable next turn via valid_tool_names"
  contract; removed a dead monkeypatch line.
This commit is contained in:
alt-glitch 2026-06-19 22:41:15 +05:30 committed by Teknium
parent 3713483874
commit b6e2a54a94
8 changed files with 278 additions and 51 deletions

18
cli.py
View file

@ -9661,16 +9661,20 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
else:
print(f" 🔧 {len(new_tools)} tool(s) available from {len(connected_servers)} server(s)")
# Refresh the agent's tool list so the model can call new tools
# Refresh the agent's tool list so the model can call new tools.
# Route through the shared helper so this CLI /reload-mcp path stays
# in lockstep with the TUI RPC / gateway reload / late-binding paths
# (name-diff, thread-safe, and — critically — additive-preserving so
# memory-provider and context-engine tools survive the rebuild).
if self.agent is not None:
self.agent.tools = get_tool_definitions(
enabled_toolsets=self.agent.enabled_toolsets
if hasattr(self.agent, "enabled_toolsets") else None,
from tools.mcp_tool import refresh_agent_mcp_tools
# Explicit reload: re-resolve enabled toolsets so a server the
# user just enabled in config this session is picked up.
refresh_agent_mcp_tools(
self.agent,
enabled_override=self.enabled_toolsets,
quiet_mode=True,
)
self.agent.valid_tool_names = {
tool["function"]["name"] for tool in self.agent.tools
} if self.agent.tools else set()
# Inject a message at the END of conversation history so the
# model knows tools changed. Appended after all existing