hermes-agent/tests/hermes_cli/test_kanban_specify_db.py
Teknium 24d48ffb82
feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435)
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks

The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.

`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.

Surface:

  hermes kanban specify <task_id>               # single task
  hermes kanban specify --all [--tenant T]      # sweep triage column
  hermes kanban specify ... --author NAME       # audit-comment author
  hermes kanban specify ... --json              # one JSON line per task

Design choices:

  - Parent gating is preserved. specify_triage_task flips to 'todo',
    then recompute_ready promotes to 'ready' only when parents are
    done — same rule as a normal parent-gated todo.
  - No daemon, no background watcher. Every invocation is explicit —
    keeps cost predictable and doesn't fight the dispatcher loop.
  - Response parse is lenient: strict JSON preferred, markdown-fence
    tolerated, raw-body fallback on malformed JSON so the LLM can't
    strand a task in triage.
  - All failure modes (no aux client, API error, task moved out of
    triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
    --all continues past individual failures.

Changes:

  hermes_cli/kanban_db.py    + specify_triage_task()
  hermes_cli/kanban_specify.py  NEW (~220 LOC — prompt, parse, call)
  hermes_cli/kanban.py       + specify subcommand + _cmd_specify
  hermes_cli/config.py       + auxiliary.triage_specifier task slot
  website/docs/user-guide/features/kanban.md  specify + config notes
  website/docs/reference/cli-commands.md      CLI reference entry
  tests/hermes_cli/test_kanban_specify_db.py    NEW (10 tests)
  tests/hermes_cli/test_kanban_specify.py       NEW (20 tests)

Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.

* feat(kanban): wire specifier into dashboard and gateway slash

Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.

Dashboard (plugins/kanban/dashboard/)
  - POST /tasks/:id/specify  NEW endpoint. Thin wrapper around
    kanban_specify.specify_task(). Returns the CLI outcome shape
    ({ok, task_id, reason, new_title}); ok=false with a human reason
    is a 200, not a 4xx, so the UI can render it inline without
    treating 'no aux client configured' as a crash.
  - Runs sync in FastAPI's threadpool because the LLM call can take
    tens of seconds on reasoning models.
  - Pins HERMES_KANBAN_BOARD around the specify call so the module's
    argless kb.connect() lands on the right board.
  - dist/index.js: doSpecify callback threaded through the drawer →
    TaskDetail → StatusActions prop chain.  Specify button appears
    ONLY when task.status === 'triage' (elsewhere the backend would
    reject anyway — hide the button to keep the action row clean).
    Busy state (Specifying…) + inline success/error banner under the
    button using the response.reason text.
  - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
    existing --color vars so themes reskin cleanly.

Gateway slash (/kanban specify)
  - Already works via the existing run_slash → build_parser →
    kanban_command pipeline. No code change needed — slash commands
    inherit the argparse tree automatically. Added coverage:
    test_run_slash_specify_end_to_end (create --triage, specify, verify
    promotion + retitle) and test_run_slash_specify_help_is_reachable.

Tests
  - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
    REST endpoint — happy path, non-triage rejection as ok=false 200,
    missing aux client as ok=false 200.
  - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.

Docs
  - website/docs/user-guide/features/kanban.md: dashboard action row
    description mentions  Specify + all three surfaces. REST table
    gains /tasks/:id/specify. Slash examples include /kanban specify.

Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
2026-05-07 13:04:41 -07:00

184 lines
6.3 KiB
Python

"""Tests for kb.specify_triage_task — the DB-layer atomic promotion
from the triage column to todo. LLM-free by design."""
from __future__ import annotations
from pathlib import Path
import pytest
from hermes_cli import kanban_db as kb
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
"""Isolated HERMES_HOME with an empty kanban DB."""
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
kb.init_db()
return home
def _create_triage(conn, title="rough idea", body=None, assignee=None):
return kb.create_task(
conn,
title=title,
body=body,
assignee=assignee,
triage=True,
)
def test_specify_promotes_triage_to_todo(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough idea")
assert kb.get_task(conn, tid).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="Refined: rough idea",
body="**Goal**\nDo the thing.",
author="specifier-bot",
)
assert ok is True
with kb.connect() as conn:
task = kb.get_task(conn, tid)
# No parents → recompute_ready should have flipped it past todo to ready.
assert task.status == "ready"
assert task.title == "Refined: rough idea"
assert "**Goal**" in (task.body or "")
def test_specify_with_open_parent_lands_in_todo_not_ready(kanban_home):
# Parent-gated specified tasks must not jump the dispatcher — they go
# to todo and wait for parent completion like any other gated task.
with kb.connect() as conn:
parent = kb.create_task(conn, title="parent work")
child = _create_triage(conn, title="child idea")
kb.link_tasks(conn, parent, child)
# After linking with an open parent, triage status should still be
# 'triage' (linking doesn't touch triage tasks).
assert kb.get_task(conn, child).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
child,
body="full spec",
author="specifier",
)
assert ok is True
with kb.connect() as conn:
t = kb.get_task(conn, child)
# Parent still open → specified child sits in 'todo', not 'ready'.
assert t.status == "todo"
def test_specify_refuses_non_triage_task(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="normal task")
assert kb.get_task(conn, tid).status == "ready"
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, tid, body="won't apply")
assert ok is False
with kb.connect() as conn:
# Status unchanged.
assert kb.get_task(conn, tid).status == "ready"
def test_specify_returns_false_for_unknown_id(kanban_home):
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, "t_does_not_exist", body="x")
assert ok is False
def test_specify_rejects_blank_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn, pytest.raises(ValueError):
kb.specify_triage_task(conn, tid, title=" ", body="ok")
def test_specify_emits_event(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn:
kb.specify_triage_task(
conn, tid, title="new", body="b", author="ace"
)
with kb.connect() as conn:
events = kb.list_events(conn, tid)
kinds = [e.kind for e in events]
assert "specified" in kinds
# The specified event records which fields actually changed as a
# JSON payload under task_events.payload.
spec_ev = next(e for e in events if e.kind == "specified")
assert spec_ev.payload is not None
fields = spec_ev.payload.get("changed_fields") or []
assert "title" in fields
assert "body" in fields
def test_specify_records_audit_comment_only_when_author_given(kanban_home):
# With author → comment added.
with kb.connect() as conn:
tid1 = _create_triage(conn, title="a")
kb.specify_triage_task(
conn, tid1, title="A-spec", body="b", author="ace"
)
comments1 = kb.list_comments(conn, tid1)
assert len(comments1) == 1
assert "Specified" in comments1[0].body
assert comments1[0].author == "ace"
# Without author → no comment (silent).
with kb.connect() as conn:
tid2 = _create_triage(conn, title="b")
kb.specify_triage_task(conn, tid2, title="B-spec", body="b")
comments2 = kb.list_comments(conn, tid2)
assert comments2 == []
def test_specify_skips_comment_when_nothing_changed(kanban_home):
# Create triage task with title and body already set; pass identical
# values to specify. Should promote to todo but skip audit comment.
with kb.connect() as conn:
tid = _create_triage(conn, title="same", body="same body")
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="same",
body="same body",
author="ace",
)
assert ok is True
with kb.connect() as conn:
# Promoted.
assert kb.get_task(conn, tid).status in {"todo", "ready"}
# No audit comment because neither field changed.
assert kb.list_comments(conn, tid) == []
def test_specify_with_only_body_preserves_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="keep this title")
with kb.connect() as conn:
kb.specify_triage_task(conn, tid, body="new body only")
with kb.connect() as conn:
t = kb.get_task(conn, tid)
assert t.title == "keep this title"
assert t.body == "new body only"
def test_specify_second_call_noop_false(kanban_home):
# Promoting twice must not crash and the second call returns False
# because the task is no longer in triage.
with kb.connect() as conn:
tid = _create_triage(conn, title="once")
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec") is True
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec again") is False