mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-12 03:42:08 +00:00
Workers completing a kanban task can now claim the ids of cards they created via an optional ``created_cards`` field on ``kanban_complete``. The kernel verifies each id exists and was created by the completing worker's profile; any phantom id blocks the completion with a ``HallucinatedCardsError`` and records a ``completion_blocked_hallucination`` event on the task so the rejected attempt is auditable. Successful completions also get a non-blocking prose-scan pass over their ``summary`` + ``result`` that emits a ``suspected_hallucinated_references`` event for any ``t_<hex>`` reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough: operators also need to see and act on stuck workers, especially when a profile's model is the root cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)``: operator-driven reclaim that releases an active worker claim immediately (unlike ``release_stale_claims``, which only acts after claim_expires has passed). Emits a ``reclaimed`` event with a ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)``: switch a task to a different profile, optionally reclaiming a stuck running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]`` CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged tasks; dismissible per session.
* **events callout** in the task drawer: hallucination events render with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim, Reassign (with profile picker + reclaim-first checkbox), and a copy-to-clipboard hint for ``hermes -p <profile> model``, since profile config lives on disk and can't be edited from the browser. Auto-opens when the task has warnings, collapsed otherwise. Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or ``edited`` event supersedes the hallucination, so recovery is never permanently stigmatising. The audit events persist for debugging, but the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running returns False, reassign refuses running without reclaim_first, reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id, warnings cleared after clean completion, reclaim 200 + 409 paths, reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios: attention strip renders, badges land on the right cards, drawer callout shows phantom chips, Reclaim on a running task flips status to ready + emits manual reclaimed event + refreshes the drawer, Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
162 lines
9.1 KiB
Markdown
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
  hermes:
    tags: [kanban, multi-agent, orchestration, routing]
    related_skills: [kanban-worker]
---

# Kanban Orchestrator: Decomposition Playbook

> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.

## When to use the board (vs. just doing the work)

Create Kanban tasks when any of these are true:

1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.

If *none* of those apply (it's a small one-shot reasoning task), use `delegate_task` instead or answer the user directly.

## The anti-temptation rules

Your job description says "route, don't execute." The rules that enforce that:

- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly," stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize: that's the whole job.**

## The standard specialist roster (convention)

Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has, and ask if you're unsure.

| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |

## Decomposition playbook

### Step 1: Understand the goal

Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.

### Step 2: Sketch the task graph

Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":

```
T1  researcher  research: Postgres cost vs current
T2  researcher  research: Postgres performance vs current
T3  analyst     synthesize migration recommendation    parents: T1, T2
T4  writer      draft decision memo                    parents: T3
```

Show this to the user. Let them correct it before you create anything.
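
A drafted graph can be sanity-checked before any task is created. The sketch below is not part of the Kanban tooling: it assumes the draft is held as a plain dict of hypothetical labels (`T1`..`T4`) mapped to their parent labels, and uses Python's standard `graphlib` to catch unknown parents and cycles and to show which tasks would start first.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical in-memory draft of the graph above: task label -> parent labels.
draft = {
    "T1": [],
    "T2": [],
    "T3": ["T1", "T2"],
    "T4": ["T3"],
}


def check_draft(graph):
    """Return (first_wave, full_order); raise ValueError on a bad draft."""
    # Every parent must itself be a task in the draft.
    unknown = {p for parents in graph.values() for p in parents} - set(graph)
    if unknown:
        raise ValueError(f"parents not in the draft: {sorted(unknown)}")

    sorter = TopologicalSorter(graph)
    try:
        sorter.prepare()  # raises CycleError if the draft contains a cycle
    except CycleError as exc:
        raise ValueError(f"cycle in draft: {exc.args[1]}") from exc

    # Tasks with no unfinished parents: these can run immediately.
    first_wave = sorted(sorter.get_ready())
    # One valid execution order for the whole graph.
    full_order = list(TopologicalSorter(graph).static_order())
    return first_wave, full_order


first_wave, full_order = check_draft(draft)
print("starts first:", first_wave)
print("one valid order:", full_order)
```

The first wave is what the dispatcher could pick up right away; everything else waits on its parents.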

### Step 3: Create tasks and link

```python
import os

# Pass the tenant on every call so child tasks stay in the same namespace.
tenant = os.environ.get("HERMES_TENANT")

t1 = kanban_create(
    title="research: Postgres cost vs current",
    assignee="researcher",
    body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
    tenant=tenant,
)["task_id"]

t2 = kanban_create(
    title="research: Postgres performance vs current",
    assignee="researcher",
    body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
    tenant=tenant,
)["task_id"]

# T3 waits on both research tasks.
t3 = kanban_create(
    title="synthesize migration recommendation",
    assignee="analyst",
    body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
    parents=[t1, t2],
    tenant=tenant,
)["task_id"]

# T4 waits on the analyst's recommendation.
t4 = kanban_create(
    title="draft decision memo",
    assignee="writer",
    body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
    parents=[t3],
    tenant=tenant,
)["task_id"]
```

`parents=[...]` gates promotion: children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination is needed; the dispatcher and dependency engine handle it.
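
The promotion rule can be pictured with a toy model. This is only a sketch of the behaviour described above, not the real dependency engine: the dict layout and the `promote_ready` helper are hypothetical, and only the `todo`, `ready`, and `done` status names come from this document.

```python
def promote_ready(tasks):
    """Promote every `todo` task whose parents are all `done` to `ready`.

    `tasks` maps task id -> {"status": str, "parents": list of task ids}.
    Returns the ids promoted in this pass.
    """
    promoted = []
    for task_id, task in tasks.items():
        if task["status"] != "todo":
            continue
        if all(tasks[parent]["status"] == "done" for parent in task["parents"]):
            task["status"] = "ready"
            promoted.append(task_id)
    return promoted


# Toy board mirroring the T1..T4 example; ids and layout are made up.
board = {
    "t1": {"status": "done", "parents": []},
    "t2": {"status": "running", "parents": []},
    "t3": {"status": "todo", "parents": ["t1", "t2"]},
    "t4": {"status": "todo", "parents": ["t3"]},
}

print(promote_ready(board))  # t2 is still running, so nothing is promoted
board["t2"]["status"] = "done"
print(promote_ready(board))  # t3 becomes ready; t4 still waits on t3
```

Note that `t4` stays in `todo` after the second pass: its parent is `ready`, not `done`.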

### Step 4: Complete your own task

If you were spawned as a task yourself (e.g. a `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:

```python
kanban_complete(
    summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
    metadata={
        "task_graph": {
            "T1": {"assignee": "researcher", "parents": []},
            "T2": {"assignee": "researcher", "parents": []},
            "T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
            "T4": {"assignee": "writer", "parents": ["T3"]},
        },
    },
)
```

### Step 5: Report back to the user

Tell them what you created in plain prose:

> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.

## Common patterns

**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
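
As a rough sketch of the fan-out + fan-in shape: `kanban_create` below is a local stand-in written for this example (it only records calls and hands back a made-up id), and `fan_out_fan_in` is a hypothetical helper. In a real session you would call the actual tool with the same `title` / `assignee` / `parents` arguments used in Step 3.

```python
import itertools

_ids = itertools.count(1)
created = {}


def kanban_create(title, assignee, body="", parents=None):
    """Stand-in for the real tool: records the task and returns a fake id."""
    task_id = f"t_{next(_ids):04x}"
    created[task_id] = {
        "title": title,
        "assignee": assignee,
        "body": body,
        "parents": list(parents or []),
    }
    return {"task_id": task_id}


def fan_out_fan_in(topics, synth_title):
    """Create one parentless researcher task per topic, then one analyst
    task that lists every researcher task as a parent."""
    research_ids = [
        kanban_create(title=f"research: {topic}", assignee="researcher")["task_id"]
        for topic in topics
    ]
    synth_id = kanban_create(
        title=synth_title,
        assignee="analyst",
        parents=research_ids,
    )["task_id"]
    return research_ids, synth_id


research_ids, synth_id = fan_out_fan_in(
    ["Postgres cost vs current", "Postgres performance vs current"],
    "synthesize migration recommendation",
)
print(created[synth_id]["parents"] == research_ids)
```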

**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage sets `parents=[previous_task]`. The reviewer blocks or completes; if the reviewer blocks, the operator unblocks with feedback and respawns.

**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. The dispatcher serializes them: `translator` processes them in priority order, accumulating experience in its own memory.

**Human-in-the-loop:** Any task can call `kanban_block()` to wait for input. The dispatcher respawns the worker after `/unblock`. The comment thread carries the full context.

## Pitfalls

**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task; don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.

**Argument order for links.** `kanban_link(parent_id=..., child_id=...)`: parent first. Mixing them up demotes the wrong task to `todo`.

**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.

**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
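
One way to avoid forgetting the tenant on a call is to build the keyword argument once and splat it into each call. The `tenant_kwargs` helper below is hypothetical, and it assumes the tool treats an omitted `tenant` the same as `tenant=None`; if that assumption does not hold in your setup, pass `os.environ.get("HERMES_TENANT")` directly as described above.

```python
import os


def tenant_kwargs(environ=os.environ):
    """Return {"tenant": ...} when HERMES_TENANT is set and non-empty, else {}."""
    tenant = environ.get("HERMES_TENANT")
    return {"tenant": tenant} if tenant else {}


# Intended use (not executed here, since kanban_create is a tool call):
#   kanban_create(title="...", assignee="researcher", **tenant_kwargs())
print(tenant_kwargs({"HERMES_TENANT": "acme"}))
print(tenant_kwargs({}))
```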

## Recovering stuck workers

When a worker profile keeps crashing, hallucinating, or getting blocked by its own mistakes (usually: wrong model, missing skill, broken credential), the kanban dashboard flags the task with a ⚠ badge and opens a **Recovery** section in the drawer. Three primary actions:

1. **Reclaim** (or `hermes kanban reclaim <task_id>`): abort the running worker immediately and reset the task to `ready`. The existing claim TTL is ~15 min; this is the fast path out.
2. **Reassign** (or `hermes kanban reassign <task_id> <new-profile> --reclaim`): switch the task to a different profile and let the dispatcher pick it up with a fresh worker.
3. **Change profile model**: the dashboard prints a copy-paste hint for `hermes -p <profile> model` since profile config lives on disk; edit it in a terminal, then Reclaim to retry with the new model.
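
The difference between Reclaim and Reassign can be pictured with a toy model. The `toy_reclaim` and `toy_reassign` functions below are hypothetical and operate on a plain dict; they are not the real `kanban_db` helpers. They only mirror the behaviour described for this change: reclaiming a running task resets it to `ready` and records a manual `reclaimed` event, reclaiming a non-running task returns False, and reassigning a running task is refused unless it is reclaimed first. How the real helper signals the refusal is an assumption here (this sketch returns False).

```python
def toy_reclaim(task):
    """Release an active claim. Returns False if the task is not running."""
    if task["status"] != "running":
        return False
    task["status"] = "ready"
    task["events"].append({"kind": "reclaimed", "payload": {"manual": True}})
    return True


def toy_reassign(task, profile, reclaim_first=False):
    """Switch the assignee. A running task must be reclaimed first."""
    if task["status"] == "running":
        if not reclaim_first:
            return False  # refuse: a worker still holds the claim
        toy_reclaim(task)
    task["assignee"] = profile
    return True


stuck = {"status": "running", "assignee": "researcher", "events": []}
print(toy_reassign(stuck, "analyst"))                      # refused while running
print(toy_reassign(stuck, "analyst", reclaim_first=True))  # reclaim, then reassign
print(stuck["status"], stuck["assignee"])
```

On the CLI, the second call corresponds to passing `--reclaim` to `hermes kanban reassign`.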

Hallucination warnings appear on tasks where a worker's `kanban_complete(created_cards=[...])` claim included card ids that don't exist or weren't created by the worker's profile (the gate blocks the completion), or where the free-form summary references `t_<hex>` ids that don't resolve (advisory prose scan, non-blocking). Both produce audit events that persist even after recovery actions, so the trail stays for debugging.
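
The two checks can be sketched roughly as follows. This is an illustration under assumptions, not the kernel's code: the `cards` dict, its `created_by` field, both function names, and the exact regular expression for `t_<hex>` are made up for the example.

```python
import re

# Assumed shape of a task reference: "t_" followed by lowercase hex digits.
TASK_REF = re.compile(r"\bt_[0-9a-f]+\b")


def phantom_cards(claimed, cards, worker_profile):
    """Blocking gate: return claimed ids that do not exist or were created
    by a different profile. A non-empty result would block the completion."""
    return [
        card_id
        for card_id in claimed
        if card_id not in cards or cards[card_id]["created_by"] != worker_profile
    ]


def unresolved_refs(text, cards):
    """Advisory scan: return t_<hex> references in free-form text that do
    not resolve to a known card. These only warn; they never block."""
    return sorted({ref for ref in TASK_REF.findall(text) if ref not in cards})


cards = {
    "t_1a2b": {"created_by": "planner"},
    "t_3c4d": {"created_by": "writer"},
}

print(phantom_cards(["t_1a2b", "t_3c4d", "t_dead"], cards, "planner"))
print(unresolved_refs("created t_1a2b, see also t_beef", cards))
```

The first check only looks at the structured `created_cards` claim; the second reads prose, which is why it is advisory rather than blocking.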