feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)

Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
  releases an active worker claim immediately (unlike
  ``release_stale_claims`` which only acts after claim_expires has
  passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
  switch a task to a different profile, optionally reclaiming a stuck
  running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
  ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
  CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
  ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
  dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
  tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
  with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
  Reassign (with profile picker + reclaim-first checkbox), and a
  copy-to-clipboard hint for ``hermes -p <profile> model`` since
  profile config lives on disk and can't be edited from the browser.
  Auto-opens when the task has warnings, collapsed otherwise.
  Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the
  ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
  stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit
  event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
  returns False, reassign refuses running without reclaim_first,
  reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
  warnings cleared after clean completion, reclaim 200 + 409 paths,
  reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
This commit is contained in:
Teknium 2026-05-05 08:06:55 -07:00 committed by GitHub
parent 7de3c86c5a
commit de9238d37e
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
11 changed files with 1791 additions and 17 deletions

View file

@ -210,6 +210,20 @@ def _handle_complete(args: dict, **kw) -> str:
summary = args.get("summary")
metadata = args.get("metadata")
result = args.get("result")
created_cards = args.get("created_cards")
if created_cards is not None:
if isinstance(created_cards, str):
# Accept a single id as a string for convenience.
created_cards = [created_cards]
if not isinstance(created_cards, (list, tuple)):
return tool_error(
f"created_cards must be a list of task ids, got "
f"{type(created_cards).__name__}"
)
# Normalise: strings only, stripped, non-empty.
created_cards = [
str(c).strip() for c in created_cards if str(c).strip()
]
if not (summary or result):
return tool_error(
"provide at least one of: summary (preferred), result"
@ -221,10 +235,23 @@ def _handle_complete(args: dict, **kw) -> str:
try:
kb, conn = _connect()
try:
ok = kb.complete_task(
conn, tid,
result=result, summary=summary, metadata=metadata,
)
try:
ok = kb.complete_task(
conn, tid,
result=result, summary=summary, metadata=metadata,
created_cards=created_cards,
)
except kb.HallucinatedCardsError as hall_err:
# Structured rejection — surface the phantom ids so the
# worker can retry with a corrected list or drop the
# field. Audit event already landed in the DB.
return tool_error(
f"kanban_complete blocked: the following created_cards "
f"do not exist or were not created by this worker: "
f"{', '.join(hall_err.phantom)}. "
f"Either omit them, use only ids returned from successful "
f"kanban_create calls, or remove the created_cards field."
)
if not ok:
return tool_error(
f"could not complete {tid} (unknown id or already terminal)"
@ -452,7 +479,11 @@ KANBAN_COMPLETE_SCHEMA = {
"human-readable 1-3 sentence description of what you did; put "
"machine-readable facts in ``metadata`` (changed_files, "
"tests_run, decisions, findings, etc). At least one of "
"``summary`` or ``result`` is required."
"``summary`` or ``result`` is required. If you created new "
"tasks via ``kanban_create`` during this run, list their ids "
"in ``created_cards`` — the kernel verifies them so phantom "
"references are caught before they leak into downstream "
"automation."
),
"parameters": {
"type": "object",
@ -487,6 +518,22 @@ KANBAN_COMPLETE_SCHEMA = {
"callers that still set --result on the CLI."
),
},
"created_cards": {
"type": "array",
"items": {"type": "string"},
"description": (
"Optional structured manifest of task ids you "
"created via ``kanban_create`` during this run. "
"The kernel verifies each id exists and was "
"created by this worker's profile; any phantom "
"id blocks the completion with an error listing "
"what went wrong (auditable in the task's events). "
"Only list ids you got back from a successful "
"``kanban_create`` call — do not invent or "
"remember ids from prose. Omit the field if you "
"did not create any cards."
),
},
},
"required": [],
},