feat(tools): progressive tool disclosure for MCP and plugin tools

Adds Tool Search, a structured-tools progressive-disclosure layer that
replaces MCP and non-core plugin tools in the model-visible tools array
with three bridge tools (tool_search / tool_describe / tool_call) when
the deferrable surface would consume more than a configurable percentage
of the active model's context window. Core Hermes tools are never deferred.

Default mode is 'auto' with a 10% context threshold, so small toolsets
pay no overhead. Set tools.tool_search.enabled to 'on' to always use the
bridge tools, or 'off' to disable Tool Search entirely.
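
A minimal sketch of how the 'auto' gate might decide, assuming the
char/4 token heuristic noted in the design list. The function names, the
threshold_pct parameter, and the use of json.dumps to measure schema size
are illustrative assumptions, not the code in tools/tool_search.py.

```python
import json


def estimate_tokens(text: str) -> int:
    """Cheap char/4 heuristic; precision is not needed for a threshold check."""
    return len(text) // 4


def should_defer(mode: str, deferrable_schemas: list, context_window: int,
                 threshold_pct: float = 10.0) -> bool:
    """Decide whether deferrable tools should be hidden behind the bridges.

    mode is 'on', 'off', or 'auto' (anything else is treated as 'auto' here,
    which is an assumption of this sketch).
    """
    if mode == "on":
        return True
    if mode == "off":
        return False
    # 'auto': defer only when the deferrable surface exceeds the budget.
    total = sum(estimate_tokens(json.dumps(s)) for s in deferrable_schemas)
    return total > context_window * threshold_pct / 100.0
```

With an 8000-token window the budget is 800 estimated tokens, so a single
tiny schema stays inline while a schema with a 4000-character description
trips the gate.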

The design addresses the OpenClaw production failure modes documented
in the openclaw-tool-search-report:

  - Core tools never defer (toolsets._HERMES_CORE_TOOLS). Addresses the
    'tools silently missing from isolated cron turns' regression class
    (openclaw#84141) by construction: there is no code path that can
    drop a core tool.
  - Catalog is stateless across turns — rebuilt from the live tool-defs
    list on every assembly. No session-keyed Map that can drift out of
    sync with the registry.
  - tool_call unwraps the bridge call before any hook fires, so plugin
    pre/post hooks, guardrails, approval flows, and the activity feed
    all see the underlying tool name, not the bridge (addresses
    openclaw#85588 and the verbose-mode complaint on openclaw#79823).
  - The unwrap happens in both the parallel and sequential paths of
    agent/tool_executor.py and also in handle_function_call, so direct
    callers (sandboxed code, eval harnesses) are covered too.
  - Bridge tools cannot invoke each other (recursion guard) and cannot
    invoke core tools (those must be called directly).
  - Tools mode only — no JS-sandbox code-mode. Keeps the surface small.
  - Token estimation via cheap char/4 heuristic; precision isn't needed
    for the threshold decision.
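
The unwrap and its two refusals (bridge recursion, core tools) can be
sketched as a pure function. The three-tuple return shape mirrors how the
diff calls resolve_underlying_call; the 'name' and 'arguments' keys, the
error strings, and the stand-in core set are assumptions made for
illustration only.

```python
from typing import Optional, Tuple

BRIDGE_TOOL_NAMES = frozenset({"tool_search", "tool_describe", "tool_call"})
# Stand-in for toolsets._HERMES_CORE_TOOLS; the real membership is not
# shown in this commit message, so this entry is hypothetical.
CORE_TOOL_NAMES = frozenset({"write_file"})


def resolve_underlying_call_sketch(
    bridge_args: dict,
) -> Tuple[Optional[str], dict, Optional[str]]:
    """Return (underlying_name, underlying_args, error) for a tool_call payload."""
    name = bridge_args.get("name")
    if not isinstance(name, str) or not name:
        return None, {}, "tool_call requires a non-empty 'name'"
    if name in BRIDGE_TOOL_NAMES:
        # Recursion guard: bridges may not invoke each other.
        return None, {}, f"bridge tool '{name}' cannot be invoked via tool_call"
    if name in CORE_TOOL_NAMES:
        # Core tools are always visible, so they must be called directly.
        return None, {}, f"core tool '{name}' must be called directly"
    args = bridge_args.get("arguments")
    if not isinstance(args, dict):
        args = {}
    return name, args, None
```

Because the function returns an error instead of raising, a caller can
leave the bridge call untouched on failure and let the bridge handler
report the problem to the model.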

Files:
  - tools/tool_search.py — new module (BM25 retrieval, classification,
    threshold gate, bridge dispatch, unwrap helper).
  - tests/tools/test_tool_search.py — 35 tests including the OpenClaw
    #84141 regression guard.
  - model_tools.py — wires assembly into _compute_tool_definitions as the
    final step, adds skip_tool_search_assembly kwarg so the bridge can
    see the real catalog, dispatches the three bridge tools.
  - agent/tool_executor.py — unwraps tool_call in both parallel and
    sequential parsing loops so checkpointing, guardrails, plugin hooks,
    and tool-progress callbacks all observe the underlying tool name.
  - hermes_cli/config.py — DEFAULT_CONFIG['tools']['tool_search'] block.
  - website/docs/user-guide/features/tool-search.md — user docs.
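
For readers unfamiliar with BM25, the retrieval step named for
tools/tool_search.py can be approximated as below. This is a generic
Okapi BM25 sketch over tool names and descriptions; the tokenizer, the
k1/b defaults, and the function names are assumptions and may differ
from the module's actual implementation.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase and split on non-alphanumerics (underscores split too)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: dict, k1: float = 1.5, b: float = 0.75,
              top_k: int = 3) -> list:
    """Rank docs (name -> searchable text) against query, best first.

    Only documents with a positive score are returned.
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: in how many docs does each term appear?
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Indexing the tool name together with its description lets a query such as
"create github issue" match a tool named github_create_issue even when the
description is terse.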

Validation:
  - 35/35 new tests pass.
  - Existing tool, registry, model_tools, config, coercion, and executor
    tests (82 + 74, plus small adjacent suites) pass.
  - Live E2E: 20 fake MCP tools registered, get_tool_definitions returns
    3 bridges, tool_search returns top 3 hits, tool_describe returns
    full schema, tool_call dispatches to the real underlying handler
    and the underlying result is what the model sees.
  - Reserved-name recursion guard verified live.
  - Core-tool refusal via tool_call verified live.
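
The E2E scenario above (20 fake MCP tools collapsing to 3 bridges) can be
approximated with a toy, stateless assembly step. The helper names, the
schema shape, and the example core tool names are hypothetical; the real
wiring lives in model_tools.py and is not reproduced here.

```python
BRIDGE_TOOL_NAMES = ("tool_search", "tool_describe", "tool_call")


def make_tool_def(name: str) -> dict:
    """Build a minimal function-style tool definition (illustrative shape)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": f"Placeholder description for {name}",
            "parameters": {"type": "object", "properties": {}},
        },
    }


def assemble_tools(tool_defs: list, core_names: set, defer: bool) -> list:
    """Return the model-visible tools list.

    Pure function of its inputs: nothing is cached between turns, so the
    result cannot drift out of sync with the live tool-defs list.
    """
    if not defer:
        return list(tool_defs)
    core = [t for t in tool_defs if t["function"]["name"] in core_names]
    deferred = [t for t in tool_defs if t["function"]["name"] not in core_names]
    if not deferred:
        return core
    # Core tools are always kept; everything else hides behind the bridges.
    return core + [make_tool_def(n) for n in BRIDGE_TOOL_NAMES]
```

Core tools pass through on every branch, which is the "by construction"
property the design list describes.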
teknium1 2026-05-23 15:22:01 -07:00 committed by Teknium
parent 73d73f1f0d
commit 369075dc95
6 changed files with 1453 additions and 1 deletions


@@ -100,6 +100,26 @@ def execute_tool_calls_concurrent(agent, assistant_message, messages: list, effe
        if not isinstance(function_args, dict):
            function_args = {}
        # ── Tool Search unwrap ────────────────────────────────────────
        # When the model invokes the tool_call bridge, peel it open so
        # every downstream check (checkpointing, guardrails, plugin
        # pre-tool-call hooks, the display/activity feed, the post-call
        # callback) sees the underlying tool, not the bridge. This is
        # the OpenClaw lesson: hooks must observe the real tool name.
        #
        # The original tool_call entry on ``tool_call.function`` is left
        # untouched so the conversation transcript and the matching
        # tool_call_id are preserved exactly as the model emitted them.
        try:
            from tools import tool_search as _ts
            if function_name == _ts.TOOL_CALL_NAME:
                _underlying, _underlying_args, _err = _ts.resolve_underlying_call(function_args)
                if not _err and _underlying:
                    function_name = _underlying
                    function_args = _underlying_args
        except Exception:
            pass
        # Checkpoint for file-mutating tools
        if function_name in {"write_file", "patch"} and agent._checkpoint_mgr.enabled:
            try:
@@ -497,6 +517,17 @@ def execute_tool_calls_sequential(agent, assistant_message, messages: list, effe
        if not isinstance(function_args, dict):
            function_args = {}
        # Tool Search unwrap: see execute_tool_calls_concurrent for full rationale.
        try:
            from tools import tool_search as _ts
            if function_name == _ts.TOOL_CALL_NAME:
                _underlying, _underlying_args, _err = _ts.resolve_underlying_call(function_args)
                if not _err and _underlying:
                    function_name = _underlying
                    function_args = _underlying_args
        except Exception:
            pass
        # Check plugin hooks for a block directive before executing.
        _block_msg: Optional[str] = None
        try: