mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-11 03:31:55 +00:00
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they created via an optional ``created_cards`` field on ``kanban_complete``. The kernel verifies each id exists and was created by the completing worker's profile; any phantom id blocks the completion with a ``HallucinatedCardsError`` and records a ``completion_blocked_hallucination`` event on the task so the rejected attempt is auditable.

Successful completions also get a non-blocking prose-scan pass over their ``summary`` + ``result`` that emits a ``suspected_hallucinated_references`` event for any ``t_<hex>`` reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough: operators also need to see and act on stuck workers, especially when a profile's model is the root cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)``: operator-driven reclaim that releases an active worker claim immediately (unlike ``release_stale_claims``, which only acts after claim_expires has passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)``: switch a task to a different profile, optionally reclaiming a stuck running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]`` CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged tasks; dismissible per session.
* **events callout** in the task drawer: hallucination events render with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim, Reassign (with profile picker + reclaim-first checkbox), and a copy-to-clipboard hint for ``hermes -p <profile> model`` since profile config lives on disk and can't be edited from the browser. Auto-opens when the task has warnings, collapsed otherwise. Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or ``edited`` event supersedes the hallucination, so recovery is never permanently stigmatising. The audit events persist for debugging, but the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running returns False, reassign refuses running without reclaim_first, reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id, warnings cleared after clean completion, reclaim 200 + 409 paths, reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios: attention strip renders, badges land on the right cards, drawer callout shows phantom chips, Reclaim on a running task flips status to ready + emits manual reclaimed event + refreshes the drawer, Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
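The gate and the prose scan described in this message can be illustrated with a minimal sketch. This is not the kernel's implementation: the function names, the in-memory ``cards`` mapping (card id to creating profile), and the exception's constructor are assumptions made for illustration. Only the ``HallucinatedCardsError`` name and the ``t_<hex>`` id shape come from the commit.

```python
import re

# Assumed id pattern; the CLI tests in this commit match ids with t_[a-f0-9]+.
CARD_ID_RE = re.compile(r"\bt_[a-f0-9]+\b")


class HallucinatedCardsError(Exception):
    """Raised when a completion claims cards that cannot be verified."""

    def __init__(self, phantom_ids):
        self.phantom_ids = sorted(phantom_ids)
        super().__init__(f"unverifiable created_cards: {self.phantom_ids}")


def verify_created_cards(cards, worker_profile, created_cards):
    """Blocking gate: every claimed id must exist and belong to the worker.

    `cards` is a hypothetical stand-in for the database: id -> creating profile.
    """
    phantom = [
        cid for cid in created_cards
        if cid not in cards or cards[cid] != worker_profile
    ]
    if phantom:
        raise HallucinatedCardsError(phantom)


def scan_prose(cards, *texts):
    """Non-blocking pass: return referenced ids that do not resolve.

    Ids are returned in first-seen order without duplicates; None is
    tolerated so that a missing summary or result can be passed as-is.
    """
    unresolved = []
    for text in texts:
        for cid in CARD_ID_RE.findall(text or ""):
            if cid not in cards and cid not in unresolved:
                unresolved.append(cid)
    return unresolved
```

In this sketch a card created by another profile is rejected the same way as a nonexistent id, which mirrors the "cross-worker rejection" case listed under Tests. The prose scan only reports; it never raises, matching the non-blocking behaviour described above.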
parent 7de3c86c5a
commit de9238d37e
11 changed files with 1791 additions and 17 deletions
@@ -208,3 +208,81 @@ def test_kanban_not_gateway_only():
     cmd = next(c for c in COMMAND_REGISTRY if c.name == "kanban")
     assert not cmd.cli_only
     assert not cmd.gateway_only
+
+
+# ---------------------------------------------------------------------------
+# reclaim + reassign CLI smoke tests
+# ---------------------------------------------------------------------------
+
+def test_run_slash_reclaim_running_task(kanban_home):
+    import re
+    import time
+    import secrets
+    from hermes_cli import kanban_db as kb
+
+    out1 = kc.run_slash("create 'stuck worker task' --assignee broken-model")
+    m = re.search(r"(t_[a-f0-9]+)", out1)
+    assert m
+    tid = m.group(1)
+
+    # Simulate a running claim outside TTL.
+    conn = kb.connect()
+    try:
+        lock = secrets.token_hex(4)
+        conn.execute(
+            "UPDATE tasks SET status='running', claim_lock=?, claim_expires=?, "
+            "worker_pid=? WHERE id=?",
+            (lock, int(time.time()) + 3600, 4242, tid),
+        )
+        conn.execute(
+            "INSERT INTO task_runs (task_id, status, claim_lock, claim_expires, "
+            "worker_pid, started_at) VALUES (?, 'running', ?, ?, ?, ?)",
+            (tid, lock, int(time.time()) + 3600, 4242, int(time.time())),
+        )
+        rid = conn.execute("SELECT last_insert_rowid()").fetchone()[0]
+        conn.execute("UPDATE tasks SET current_run_id=? WHERE id=?", (rid, tid))
+        conn.commit()
+    finally:
+        conn.close()
+
+    out = kc.run_slash(f"reclaim {tid} --reason 'test'")
+    assert "Reclaimed" in out, out
+    # Status back to ready.
+    out2 = kc.run_slash(f"show {tid}")
+    assert "ready" in out2.lower()
+
+
+def test_run_slash_reassign_with_reclaim_flag(kanban_home):
+    import re
+    import time
+    import secrets
+    from hermes_cli import kanban_db as kb
+
+    out1 = kc.run_slash("create 'switch model' --assignee orig")
+    m = re.search(r"(t_[a-f0-9]+)", out1)
+    tid = m.group(1)
+
+    # Simulate a running claim.
+    conn = kb.connect()
+    try:
+        lock = secrets.token_hex(4)
+        conn.execute(
+            "UPDATE tasks SET status='running', claim_lock=?, claim_expires=?, "
+            "worker_pid=? WHERE id=?",
+            (lock, int(time.time()) + 3600, 4242, tid),
+        )
+        conn.execute(
+            "INSERT INTO task_runs (task_id, status, claim_lock, claim_expires, "
+            "worker_pid, started_at) VALUES (?, 'running', ?, ?, ?, ?)",
+            (tid, lock, int(time.time()) + 3600, 4242, int(time.time())),
+        )
+        rid = conn.execute("SELECT last_insert_rowid()").fetchone()[0]
+        conn.execute("UPDATE tasks SET current_run_id=? WHERE id=?", (rid, tid))
+        conn.commit()
+    finally:
+        conn.close()
+
+    out = kc.run_slash(f"reassign {tid} newbie --reclaim --reason 'switch'")
+    assert "Reassigned" in out, out
+    out2 = kc.run_slash(f"show {tid}")
+    assert "newbie" in out2
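Both smoke tests above build the same "running claim" state by hand before exercising the CLI. The setup could be factored into a shared helper along the following lines. This is a sketch only: ``simulate_running_claim`` and ``make_demo_db`` are hypothetical names, and the schema is a guess covering just the columns the tests touch, not the real ``kanban_db`` schema.

```python
import secrets
import sqlite3
import time


def make_demo_db():
    """Create an in-memory database with an assumed, minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tasks (
            id TEXT PRIMARY KEY,
            status TEXT,
            claim_lock TEXT,
            claim_expires INTEGER,
            worker_pid INTEGER,
            current_run_id INTEGER
        );
        CREATE TABLE task_runs (
            id INTEGER PRIMARY KEY,
            task_id TEXT,
            status TEXT,
            claim_lock TEXT,
            claim_expires INTEGER,
            worker_pid INTEGER,
            started_at INTEGER
        );
        """
    )
    return conn


def simulate_running_claim(conn, task_id, ttl=3600, pid=4242):
    """Mark a task as running under a claim that expires `ttl` seconds from now.

    Mirrors the statements used in the tests: update the task row, insert a
    matching run row, then point the task at that run. Returns the run id.
    """
    lock = secrets.token_hex(4)
    now = int(time.time())
    conn.execute(
        "UPDATE tasks SET status='running', claim_lock=?, claim_expires=?, "
        "worker_pid=? WHERE id=?",
        (lock, now + ttl, pid, task_id),
    )
    cur = conn.execute(
        "INSERT INTO task_runs (task_id, status, claim_lock, claim_expires, "
        "worker_pid, started_at) VALUES (?, 'running', ?, ?, ?, ?)",
        (task_id, lock, now + ttl, pid, now),
    )
    run_id = cur.lastrowid
    conn.execute("UPDATE tasks SET current_run_id=? WHERE id=?", (run_id, task_id))
    conn.commit()
    return run_id
```

With the default ``ttl`` the simulated claim has not yet expired. Per the commit message, that is the situation ``reclaim_task`` is meant for, since ``release_stale_claims`` only acts once ``claim_expires`` has passed.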