test(memory): cover cache-parity + runtime whitelist on background review fork

- test_background_review_does_not_narrow_toolset_schema: review fork must
  NOT pass enabled_toolsets to AIAgent (full parent schema = matching
  Anthropic cache key on the 'tools' field).
- test_background_review_installs_thread_local_whitelist: the runtime
  whitelist that replaces schema-level narrowing must contain memory +
  skills tools and exclude terminal / send_message / delegate_task /
  web_search / execute_code.
- test_review_fork_inherits_parent_cached_system_prompt: new test for
  PR #17276's first root cause — the fork's _cached_system_prompt must
  equal the parent's byte-for-byte.
- test_review_fork_pins_session_start_and_session_id: defensive belt-and-
  suspenders for the cached-prompt inheritance.

Inverted the original test_background_review_agent_uses_restricted_toolsets
(which asserted the schema-level narrowing) — that narrowing was the
direct cause of #25322's cache miss, and the runtime whitelist replaces
its safety claim without breaking cache parity.

Refs #25322, #15204, PR #17276.
This commit is contained in:
teknium1 2026-05-13 22:06:31 -07:00 committed by Teknium
parent 07349ce4df
commit 8c6b0c9ecd
3 changed files with 273 additions and 9 deletions

View file

@ -20,6 +20,9 @@ def _bare_agent() -> AIAgent:
agent._memory_store = object()
agent._memory_enabled = True
agent._user_profile_enabled = False
agent._cached_system_prompt = "test-cached-system-prompt"
import datetime as _dt
agent.session_start = _dt.datetime(2026, 1, 1, 12, 0, 0)
agent._MEMORY_REVIEW_PROMPT = "review memory"
agent._SKILL_REVIEW_PROMPT = "review skills"
agent._COMBINED_REVIEW_PROMPT = "review both"