fix(mcp): address adversarial review round 1 (cache parity, gates, races)

Consolidated findings from three independent reviewers (Codex, Claude Code, a Hermes subagent w/ the hermes-agent-dev skill): - BLOCKING: refresh_agent_mcp_tools rebuilt only the registry subset, silently dropping post-build-injected memory-provider (mem0/honcho/…) and context- engine (lcm_*) tools on every refresh. Now additive-preserving: re-applies the same injectors agent_init uses, staged on locals and published atomically. - Re-injection now honors the #5544 enabled_toolsets gate for context-engine tools, so a restricted-toolset platform can't get lcm_* leaked back in. - Atomic read-diff-publish under one lock: the returned `added` set and the (tools, valid_tool_names) pair are consistent even under concurrent callers (no half-swap, no TOCTOU). - background_review fork opts out (_skip_mcp_refresh) so its byte-identical tools[] cache parity with the parent is preserved. - CLI /reload-mcp routed through the shared helper (was a 4th divergent copy with the same clobber bug + missing disabled_toolsets). - Explicit reloads (TUI RPC + CLI) pass enabled_override so a server the user just enabled in config this session is picked up; automatic paths reuse the agent's build-time selection. - mcp_discovery_timeout default 5.0 -> 1.5s: correctness now comes from the between-turns refresh, so the startup wait is only a small turn-1 UX bump rather than a heavy dead-server latency penalty. - has_registered_mcp_tools checks registered TOOLS (not connected servers) so a zero-tool/prompt-only server doesn't make the per-turn hook fire forever. - Tests: rewrote the thread-safety test to actually exercise the write path (alternating tool sets), added the #5544-gate regression, the memory/context preservation regression, and a "callable next turn via valid_tool_names" contract; removed a dead monkeypatch line.
2026-06-22 10:32:00 +00:00 · 2026-06-19 22:41:15 +05:30 · 2026-06-19 22:41:15 +05:30 · b6e2a54a94
commit b6e2a54a94
parent 3713483874
8 changed files with 278 additions and 51 deletions
--- a/cli.py
+++ b/cli.py
@ -9661,16 +9661,20 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
            else:
                print(f"  🔧 {len(new_tools)} tool(s) available from {len(connected_servers)} server(s)")

-            # Refresh the agent's tool list so the model can call new tools
+            # Refresh the agent's tool list so the model can call new tools.
+            # Route through the shared helper so this CLI /reload-mcp path stays
+            # in lockstep with the TUI RPC / gateway reload / late-binding paths
+            # (name-diff, thread-safe, and — critically — additive-preserving so
+            # memory-provider and context-engine tools survive the rebuild).
            if self.agent is not None:
-                self.agent.tools = get_tool_definitions(
-                    enabled_toolsets=self.agent.enabled_toolsets
-                    if hasattr(self.agent, "enabled_toolsets") else None,
+                from tools.mcp_tool import refresh_agent_mcp_tools
+                # Explicit reload: re-resolve enabled toolsets so a server the
+                # user just enabled in config this session is picked up.
+                refresh_agent_mcp_tools(
+                    self.agent,
+                    enabled_override=self.enabled_toolsets,
                    quiet_mode=True,
                )
-                self.agent.valid_tool_names = {
-                    tool["function"]["name"] for tool in self.agent.tools
-                } if self.agent.tools else set()

            # Inject a message at the END of conversation history so the
            # model knows tools changed.  Appended after all existing