fix(delegate): add explicit do-not-use guidance to acp_command/acp_args schema (carve-out of #22680)

acp_command / acp_args descriptions previously primed the model to
populate them — "Per-task ACP command override (e.g. 'copilot')" —
even when no ACP CLI was installed. Models with weaker schema-following
discipline would set them and the spawn would fail.

Add explicit "Do NOT set unless the user has explicitly told you"
guidance at both the top-level acp_command and the per-task override.
Strengthen acp_args to mention it's empty unless acp_command is set.
Adds 2 tests pinning the descriptions.

Note: this is a cosmetic prompt-engineering fix — the params remain
exposed in the schema. The fully-correct fix is to gate them behind
a config flag or runtime ACP-CLI detection so the schema only emits
them when an ACP harness is available. Tracked as a follow-up; this
PR ships the low-cost stopgap.

Salvage of #22680 (delegate schema only). The original PR also
bundled unrelated fixes for #22548, #21944, #22150 — those
need separate PRs since #22548 and #21944 are already addressed
on main (#22780 + #22798 in flight) and #22150 deserves its own
review.

Closes #22013.
This commit is contained in:
Wesley Simplicio 2026-05-09 13:22:25 -07:00 committed by Teknium
parent 1c9ffb177c
commit ca13993217
2 changed files with 38 additions and 4 deletions

View file

@ -2032,6 +2032,32 @@ class TestOrchestratorRoleSchema(unittest.TestCase):
self.assertIn("role", task_props)
self.assertEqual(task_props["role"]["enum"], ["leaf", "orchestrator"])
def test_acp_command_description_has_do_not_set_guidance(self):
# acp_command/acp_args descriptions must NOT bias the model toward
# assuming an ACP CLI (Claude, Copilot, etc.) is installed. They must
# carry explicit "do not set unless told" guidance so the model doesn't
# hallucinate ACP availability (#22013).
from tools.delegate_tool import DELEGATE_TASK_SCHEMA
props = DELEGATE_TASK_SCHEMA["parameters"]["properties"]
top_acp_desc = props["acp_command"]["description"]
self.assertIn("Do NOT set", top_acp_desc)
self.assertIn("explicitly told you", top_acp_desc)
task_props = props["tasks"]["items"]["properties"]
per_task_acp_desc = task_props["acp_command"]["description"]
self.assertIn("Do NOT set", per_task_acp_desc)
def test_acp_command_description_has_no_claude_as_example(self):
# Descriptions must not list 'claude' as a canonical example value —
# that directly primes the model to attempt Claude ACP even when it is
# not installed (#22013).
from tools.delegate_tool import DELEGATE_TASK_SCHEMA
props = DELEGATE_TASK_SCHEMA["parameters"]["properties"]
top_acp_desc = props["acp_command"]["description"].lower()
self.assertNotIn("e.g. 'claude'", top_acp_desc)
self.assertNotIn("e.g. \"claude\"", top_acp_desc)
# Sentinel used to distinguish "role kwarg omitted" from "role=None".
_SENTINEL = object()