mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-06 07:51:53 +00:00
fix(background_review): propagate parent toolset config to keep tools[] cache-stable
## Summary The background skill/memory-review fork constructed a child `AIAgent` without propagating `enabled_toolsets` / `disabled_toolsets` from the parent. When the parent narrowed its toolset (via `hermes tools disable` or `config.yaml`), the fork's default `enabled_toolsets=None` expanded to "all registered tools" — and the fork's outbound request body sent a wider `tools[]` array than the parent's main-turn request. Anthropic's prompt-cache key includes the `tools[]` array byte-for-byte, so this divergence forked the cache lineage on every nudge and forced a full prefix rewrite. On a captured ~4 hour Claude-via-Hermes session this cost roughly 4.3 M cache-write tokens — about half of those attributable to the per-nudge alternation between the main turn's narrowed `tools[]` and the review fork's wider `tools[]`. ## Goal Extend the byte-stability invariant established by PR #17276 (which fixed `system`) to the `tools[]` slot of the request body, so the review fork's outbound request hits the parent's warmed Anthropic prefix cache regardless of how the parent's toolset is configured. ## Implementation Two-line change in `agent/background_review.py`: pass `enabled_toolsets=getattr(agent, "enabled_toolsets", None)` and the matching `disabled_toolsets` kwarg into the `AIAgent(...)` call inside `_spawn_background_review`. Adds an explanatory block comment that calls out the cache-key dependency and the relationship to PR #17276. The post-construction runtime whitelist (`set_thread_tool_whitelist({memory, skills})`) is untouched — it still gates which tools the model is allowed to *dispatch*. This change aligns only what the request body *transmits*, not what the review is allowed to do, so the safety contract from issue #15204 remains intact. ## Testing - `tests/run_agent/test_background_review_cache_parity.py`: new `test_review_fork_inherits_parent_toolset_config` asserts the parent's `enabled_toolsets` and `disabled_toolsets` reach the review-fork constructor as kwargs. - `tests/run_agent/test_background_review_toolset_restriction.py`: the existing `test_background_review_does_not_narrow_toolset_schema` was inverted (its old "must NOT pass enabled_toolsets" rule was built on the assumption that the parent always ran with the registry default — wrong in practice when the parent is narrowed). Renamed to `test_background_review_matches_parent_toolset_config` and updated to assert the parent's value propagates verbatim. - Verified the new positive test fails without the fix and passes with it. - Full suite for `test_background_review*`: ``` $ python -m pytest tests/run_agent/test_background_review.py \ tests/run_agent/test_background_review_summary.py \ tests/run_agent/test_background_review_toolset_restriction.py \ tests/run_agent/test_background_review_cache_parity.py -q 18 passed in 1.85s ``` ## Scope - `agent/background_review.py`: 2 added kwargs + explanatory comment. - Two test files: one new positive test, one inverted existing test. - No production code paths outside the review fork; no schema changes; no public-API changes. Refs: ziliangpeng/hermes-agent#1 (root-cause analysis with wire-level cache-write measurements). Extends PR #17276's `system`-bytes invariant to the `tools[]` slot.
This commit is contained in:
parent
6c26727bb3
commit
c3a09f7835
3 changed files with 135 additions and 10 deletions
|
|
@ -390,6 +390,24 @@ def _run_review_in_thread(
|
|||
# parent below so memory(action="add") writes from
|
||||
# the review still land on disk; the review just
|
||||
# has zero side effects on external providers.
|
||||
# Inherit the parent's toolset configuration so the review
|
||||
# fork's outbound request body has byte-identical ``tools[]``
|
||||
# with the parent's last main-turn request. Without this,
|
||||
# ``enabled_toolsets=None`` defaults to "all registered tools"
|
||||
# and the fork transmits every tool descriptor (including any
|
||||
# the user has disabled via ``hermes tools disable``), while
|
||||
# the parent transmits only its narrower configured set —
|
||||
# making the two requests diverge in ``tools[]`` even though
|
||||
# they share ``messages[0..N]`` and ``system`` byte-for-byte.
|
||||
# Anthropic's prompt-cache key includes ``tools[]``, so any
|
||||
# divergence forks the cache lineage and forces a full
|
||||
# prefix rewrite (~100-200K tokens per turn for long convs).
|
||||
# The post-construction whitelist (``set_thread_tool_whitelist``
|
||||
# below) still restricts which tools the model is allowed
|
||||
# to dispatch — this change only aligns what the request
|
||||
# body transmits, not what the review is allowed to do.
|
||||
# This extends the byte-stability invariant established by
|
||||
# PR #17276 (which fixed ``system``) to ``tools[]``.
|
||||
review_agent = AIAgent(
|
||||
model=agent.model,
|
||||
max_iterations=16,
|
||||
|
|
@ -401,6 +419,8 @@ def _run_review_in_thread(
|
|||
api_key=_parent_runtime.get("api_key") or None,
|
||||
credential_pool=getattr(agent, "_credential_pool", None),
|
||||
parent_session_id=agent.session_id,
|
||||
enabled_toolsets=getattr(agent, "enabled_toolsets", None),
|
||||
disabled_toolsets=getattr(agent, "disabled_toolsets", None),
|
||||
skip_memory=True,
|
||||
)
|
||||
review_agent._memory_write_origin = "background_review"
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue