mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-11 03:31:55 +00:00
Workers completing a kanban task can now claim the ids of cards they created via an optional ``created_cards`` field on ``kanban_complete``. The kernel verifies each id exists and was created by the completing worker's profile; any phantom id blocks the completion with a ``HallucinatedCardsError`` and records a ``completion_blocked_hallucination`` event on the task so the rejected attempt is auditable. Successful completions also get a non-blocking prose-scan pass over their ``summary`` + ``result`` that emits a ``suspected_hallucinated_references`` event for any ``t_<hex>`` reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough: operators also need to see and act on stuck workers, especially when a profile's model is the root cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)``: operator-driven reclaim that releases an active worker claim immediately (unlike ``release_stale_claims`` which only acts after claim_expires has passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)``: switch a task to a different profile, optionally reclaiming a stuck running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]`` CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged tasks; dismissible per session.
* **events callout** in the task drawer: hallucination events render with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim, Reassign (with profile picker + reclaim-first checkbox), and a copy-to-clipboard hint for ``hermes -p <profile> model`` since profile config lives on disk and can't be edited from the browser. Auto-opens when the task has warnings, collapsed otherwise. Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or ``edited`` event supersedes the hallucination, so recovery is never permanently stigmatising. The audit events persist for debugging, but the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running returns False, reassign refuses running without reclaim_first, reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id, warnings cleared after clean completion, reclaim 200 + 409 paths, reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios: attention strip renders, badges land on the right cards, drawer callout shows phantom chips, Reclaim on a running task flips status to ready + emits manual reclaimed event + refreshes the drawer, Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).

160 lines
8.4 KiB
Markdown
---
name: kanban-worker
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
version: 2.0.0
metadata:
  hermes:
    tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
    related_skills: [kanban-orchestrator]
---

# Kanban Worker: Pitfalls and Examples

> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker`; it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.

## Workspace handling

Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:

| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |

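The three kinds in the table can be told apart with a small parser. This is a minimal sketch, not Hermes code: `WorkspaceKind`, `parse_workspace_kind`, and `needs_worktree_setup` are hypothetical names, and how the kind string actually reaches a worker is not stated here, so the caller is assumed to already have it as a plain string.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkspaceKind:
    """Hypothetical container for a parsed workspace kind."""
    kind: str                   # "scratch", "dir", or "worktree"
    path: Optional[str] = None  # only set for "dir:<path>"


def parse_workspace_kind(raw: str) -> WorkspaceKind:
    """Parse a kind string shaped like the first column of the table."""
    if raw in ("scratch", "worktree"):
        return WorkspaceKind(kind=raw)
    if raw.startswith("dir:"):
        path = raw[len("dir:"):]
        # Mirrors the rule from the table: relative paths are rejected.
        if not os.path.isabs(path):
            raise ValueError(f"dir workspace path must be absolute: {path!r}")
        return WorkspaceKind(kind="dir", path=path)
    raise ValueError(f"unknown workspace kind: {raw!r}")


def needs_worktree_setup(workspace: str) -> bool:
    """True when a worktree workspace has no .git entry yet."""
    return not os.path.exists(os.path.join(workspace, ".git"))
```

A worker could call `needs_worktree_setup(os.environ["HERMES_KANBAN_WORKSPACE"])` before deciding whether to run `git worktree add` from the main repo.
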
## Tenant isolation

If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:

- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`

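The prefixing rule can be applied mechanically before every memory write. The helper below is a sketch under stated assumptions: `tenant_prefixed` is a hypothetical name, and the `"<tenant>: "` separator is taken from the Good example above rather than from any documented memory API.

```python
import os
from typing import Mapping, Optional


def tenant_prefixed(entry: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return `entry` prefixed with the tenant from HERMES_TENANT, if one is set."""
    env = os.environ if env is None else env
    tenant = env.get("HERMES_TENANT", "").strip()
    if not tenant:
        # No tenant namespace: the entry is stored as-is.
        return entry
    prefix = f"{tenant}: "
    # Avoid double-prefixing when the entry was already tagged.
    return entry if entry.startswith(prefix) else prefix + entry
```

Passing `env` explicitly keeps the helper easy to test; in a worker it would normally be called with the default and read the real environment.
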
## Good summary + metadata shapes

The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:

**Coding task:**
```python
kanban_complete(
    summary="shipped rate limiter: token bucket, keys on user_id with IP fallback, 14 tests pass",
    metadata={
        "changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
        "tests_run": 14,
        "tests_passed": 14,
        "decisions": ["user_id primary, IP fallback for unauthenticated requests"],
    },
)
```

**Research task:**
```python
kanban_complete(
    summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, TensorRT-LLM on memory efficiency",
    metadata={
        "sources_read": 12,
        "recommendation": "vLLM",
        "benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
    },
)
```

**Review task:**
```python
kanban_complete(
    summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
    metadata={
        "pr_number": 123,
        "findings": [
            {"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
            {"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
        ],
        "approved": False,
    },
)
```

Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.

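To see why structured `metadata` matters, here is a sketch of what a downstream consumer might do with the review-task shape above. `blocking_findings` and `can_merge` are hypothetical helpers, and treating `critical` and `high` as the blocking severities is an assumption made for illustration only.

```python
from typing import Any, Dict, List

# Assumption: these severities block a merge; adjust to your own policy.
BLOCKING_SEVERITIES = frozenset({"critical", "high"})


def blocking_findings(metadata: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the findings whose severity is considered blocking."""
    findings = metadata.get("findings", [])
    return [f for f in findings if f.get("severity") in BLOCKING_SEVERITIES]


def can_merge(metadata: Dict[str, Any]) -> bool:
    """A review allows a merge only if approved and free of blocking findings."""
    return bool(metadata.get("approved")) and not blocking_findings(metadata)
```

Neither helper reads the `summary` string; everything it needs is in typed fields, which is the property the handoff shapes above aim for.
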
## Claiming cards you actually created

If your run produced new kanban tasks (via `kanban_create`), pass the ids in `created_cards` on `kanban_complete`. The kernel verifies each id exists and was created by your profile; any phantom id blocks the completion with an error listing what went wrong, and the rejected attempt is permanently recorded on the task's event log. **Only list ids you captured from a successful `kanban_create` return value: never invent ids from prose, never paste ids from earlier runs, never claim cards another worker created.**

```python
# GOOD: capture return values, then claim them.
c1 = kanban_create(title="remediate SQL injection", assignee="security-worker")
c2 = kanban_create(title="fix CSRF middleware", assignee="web-worker")

kanban_complete(
    summary="Review done; spawned remediations for both findings.",
    metadata={"pr_number": 123, "approved": False},
    created_cards=[c1["task_id"], c2["task_id"]],
)
```

```python
# BAD: claiming ids you don't have captured return values for.
kanban_complete(
    summary="Created remediation cards t_a1b2c3d4, t_deadbeef",  # hallucinated
    created_cards=["t_a1b2c3d4", "t_deadbeef"],  # → gate rejects
)
```

If a `kanban_create` call fails (exception, tool_error), the card was NOT created, so do not include a phantom id for it. Retry the create, or omit the id and mention the failure in your summary. The prose-scan pass also catches `t_<hex>` references in your free-form summary that don't resolve; these don't block the completion, but they show up as advisory warnings on the task in the dashboard.

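One way to keep `created_cards` honest is to build the list only from successful return values and to self-check the summary before completing. The sketch below is illustrative: `create_all` and `unbacked_refs` are hypothetical helpers, `create` stands in for the `kanban_create` tool, and the shape of a failed result (an exception, or a result without `task_id`) is an assumption. The regex only encodes the `t_<hex>` shape mentioned above, not a confirmed id length.

```python
import re
from typing import Any, Callable, Dict, Iterable, List, Set, Tuple

# Matches the t_<hex> id shape; the real id length is not assumed.
TASK_REF = re.compile(r"\bt_[0-9a-f]+\b")


def create_all(
    create: Callable[..., Any],
    specs: Iterable[Dict[str, Any]],
) -> Tuple[List[str], List[str]]:
    """Call `create` for each spec; return (created ids, failure notes)."""
    created: List[str] = []
    failures: List[str] = []
    for spec in specs:
        title = spec.get("title", "<untitled>")
        try:
            result = create(**spec)
        except Exception as exc:
            # A failed create means no card exists, so no id is recorded.
            failures.append(f"{title}: {exc}")
            continue
        task_id = result.get("task_id") if isinstance(result, dict) else None
        if task_id:
            created.append(task_id)
        else:
            failures.append(f"{title}: no task_id in result")
    return created, failures


def unbacked_refs(summary: str, created: Iterable[str]) -> Set[str]:
    """Return t_<hex> references in `summary` that this run did not create."""
    return set(TASK_REF.findall(summary)) - set(created)
```

This self-check is stricter than the kernel's prose scan as described above: it flags every id the run did not create, including ids of real cards created elsewhere. Treat a non-empty result as a prompt to re-read the summary, not as proof of a hallucination.
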
## Block reasons that get answered fast

Bad: `"stuck"`. The human has no context.

Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.

```python
import os

# Long-form context goes in a comment on the task...
kanban_comment(
    task_id=os.environ["HERMES_KANBAN_TASK"],
    body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
# ...while the block reason stays a single, answerable question.
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```

The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.

## Heartbeats worth sending

Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.

Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Send one every few minutes at most, and skip heartbeats entirely for tasks under ~2 minutes.

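The cadence advice above can be enforced with a small throttle wrapped around whatever heartbeat tool call you use. This is a sketch, not part of Hermes: `HeartbeatThrottle` is a hypothetical class, `send` is a caller-supplied callable, and the 180-second default is only one reading of "every few minutes".

```python
import time
from typing import Callable


class HeartbeatThrottle:
    """Forward progress notes to `send`, at most once per `min_interval_s`."""

    def __init__(
        self,
        send: Callable[[str], None],
        min_interval_s: float = 180.0,  # assumption: "every few minutes"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._min_interval_s = min_interval_s
        self._clock = clock
        # Start the timer at construction so short tasks never send a heartbeat.
        self._last_sent = clock()

    def beat(self, note: str) -> bool:
        """Send `note` if it is non-empty and the interval has elapsed."""
        note = note.strip()
        if not note:
            return False  # empty notes carry no progress information
        now = self._clock()
        if now - self._last_sent < self._min_interval_s:
            return False
        self._send(note)
        self._last_sent = now
        return True
```

Calling `beat(f"epoch {i}/50, loss {loss:.2f}")` inside a training loop is then safe at any frequency; only notes that clear the interval are forwarded.
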
## Retry scenarios

If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:

- `outcome: "timed_out"`: the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"`: OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."`: usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."`: an operator archived the task out from under the previous run. You probably shouldn't be running at all, so check the task status carefully.
- `outcome: "blocked"`: a previous attempt blocked; the unblock comment should be in the thread by now.

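The diagnostics list above amounts to a lookup from prior outcomes to next steps. The following sketch assumes the `kanban_show` result is a dict with a `runs` list whose entries carry `outcome` and optionally `error`, as the text suggests; `retry_hints` and `RETRY_HINTS` are hypothetical names.

```python
from typing import Any, Dict, List

# Hints paraphrase the list above; the keys are the outcome strings it names.
RETRY_HINTS: Dict[str, str] = {
    "timed_out": "chunk or shorten the work",
    "crashed": "reduce memory footprint",
    "spawn_failed": "ask the human via kanban_block",
    "reclaimed": "check task status before doing anything",
    "blocked": "read the unblock comment in the thread",
}


def retry_hints(task: Dict[str, Any]) -> List[str]:
    """Summarise what prior runs suggest, one line per distinct finding."""
    hints: List[str] = []
    for run in task.get("runs", []):
        outcome = run.get("outcome")
        hint = RETRY_HINTS.get(outcome)
        if hint is None:
            continue  # unknown or missing outcome: nothing to advise
        line = f"{outcome}: {hint}"
        if run.get("error"):
            line += f" (error: {run['error']})"
        if line not in hints:
            hints.append(line)
    return hints
```

An empty result means either a first attempt or prior runs with outcomes this table does not cover; in the second case, read the runs' `summary` fields directly.
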
## Do NOT

- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself. Assign them to the right specialist instead.
- Complete a task you didn't actually finish. Block it instead.

## Pitfalls

**Task state can change between dispatch and your startup.** Between the dispatcher's claim and your process actually booting, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop; you shouldn't be running.

**Workspace may have stale artifacts.** `dir:` and `worktree` workspaces in particular can have files from previous runs. Read the comment thread; it usually explains why you're running again and what state the workspace is in.

**Don't rely on the CLI when the tools are available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.

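The first pitfall above (task state changing before startup) reduces to a guard that runs before any real work. This is a minimal sketch: `should_proceed` is a hypothetical helper, and the `status` field name on the `kanban_show` result is an assumption.

```python
from typing import Any, Dict

# Statuses named above as reasons to stop immediately.
STOP_STATUSES = frozenset({"blocked", "archived"})


def should_proceed(task: Dict[str, Any]) -> bool:
    """Return False when the task was blocked or archived before startup."""
    # Assumption: the status is exposed under a "status" key.
    return task.get("status") not in STOP_STATUSES
```

A worker would call this on the result of its first `kanban_show` and exit without touching the workspace when it returns `False`.
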
## CLI fallback (for scripting)

Every tool has a CLI equivalent for human operators and scripts:
- `kanban_show` ↔ `hermes kanban show <id> --json`
- `kanban_complete` ↔ `hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block` ↔ `hermes kanban block <id> "reason"`
- `kanban_create` ↔ `hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.

Use the tools from inside an agent; the CLI exists for the human at the terminal.

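For operator scripts, the CLI forms listed above can be assembled as argument lists rather than shell strings, which sidesteps quoting problems in `--summary` and `--metadata`. The helper names below are hypothetical, and only the flags shown in the list are used; the sketch builds the commands without running them.

```python
import json
from typing import Any, Dict, List, Optional


def show_argv(task_id: str) -> List[str]:
    """argv for `hermes kanban show <id> --json`."""
    return ["hermes", "kanban", "show", task_id, "--json"]


def complete_argv(
    task_id: str,
    summary: str,
    metadata: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """argv for `hermes kanban complete <id> --summary ... --metadata ...`."""
    argv = ["hermes", "kanban", "complete", task_id, "--summary", summary]
    if metadata is not None:
        # Metadata is passed as a single JSON string argument.
        argv += ["--metadata", json.dumps(metadata)]
    return argv


def block_argv(task_id: str, reason: str) -> List[str]:
    """argv for `hermes kanban block <id> "reason"`."""
    return ["hermes", "kanban", "block", task_id, reason]
```

A script on a machine where the CLI is installed can hand any of these lists to `subprocess.run(argv, check=True, capture_output=True, text=True)`.
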