From 24d48ffb8294d6f13f0a6660dfff376d886d0466 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Thu, 7 May 2026 13:04:41 -0700
Subject: [PATCH] =?UTF-8?q?feat(kanban):=20add=20`specify`=20=E2=80=94=20a?=
 =?UTF-8?q?uxiliary=20LLM=20fleshes=20out=20triage=20tasks=20(#21435)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks

The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires it
up as a dedicated CLI verb.

`hermes kanban specify ` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks go
straight to the dispatcher on the same tick.

Surface:
  hermes kanban specify                        # single task
  hermes kanban specify --all [--tenant T]     # sweep triage column
  hermes kanban specify ... --author NAME      # audit-comment author
  hermes kanban specify ... --json             # one JSON line per task

Design choices:
  - Parent gating is preserved. specify_triage_task flips to 'todo',
    then recompute_ready promotes to 'ready' only when parents are
    done — same rule as a normal parent-gated todo.
  - No daemon, no background watcher. Every invocation is explicit —
    keeps cost predictable and doesn't fight the dispatcher loop.
  - Response parse is lenient: strict JSON preferred, markdown-fence
    tolerated, raw-body fallback on malformed JSON so the LLM can't
    strand a task in triage.
  - All failure modes (no aux client, API error, task moved out of
    triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
    --all continues past individual failures.
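
For illustration, the lenient-parse rule from the design choices above
could be sketched roughly as below. This is a simplified stand-in, not
the code shipped in kanban_specify.py: the helper name
`parse_spec_reply` and its (title, body) return shape are assumptions
made for the example.

```python
import json
import re
from typing import Optional, Tuple

# Strip a leading ``` / ```json fence and a trailing ``` fence, if present.
_FENCE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE)


def parse_spec_reply(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (title, body) from an LLM reply (hypothetical helper).

    Preference order: JSON object, fenced JSON, then raw text as body.
    (None, None) means nothing usable was found.
    """
    text = _FENCE.sub("", raw.strip())
    first, last = text.find("{"), text.rfind("}")
    if first != -1 and last > first:
        try:
            obj = json.loads(text[first:last + 1])
        except ValueError:
            obj = None
        if isinstance(obj, dict):
            title = obj.get("title")
            body = obj.get("body")
            title = title.strip() if isinstance(title, str) and title.strip() else None
            body = body if isinstance(body, str) and body.strip() else None
            return title, body
    # Malformed or missing JSON: leave the title alone and keep the whole
    # reply as the body, so the task can still leave triage.
    stripped = raw.strip()
    return None, (stripped or None)
```

A caller would treat (None, None) as a failed outcome and anything else
as enough to promote the task.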

Changes:
  hermes_cli/kanban_db.py                     + specify_triage_task()
  hermes_cli/kanban_specify.py                NEW (~220 LOC — prompt, parse, call)
  hermes_cli/kanban.py                        + specify subcommand + _cmd_specify
  hermes_cli/config.py                        + auxiliary.triage_specifier task slot
  website/docs/user-guide/features/kanban.md  specify + config notes
  website/docs/reference/cli-commands.md      CLI reference entry
  tests/hermes_cli/test_kanban_specify_db.py  NEW (10 tests)
  tests/hermes_cli/test_kanban_specify.py     NEW (20 tests)

Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.

* feat(kanban): wire specifier into dashboard and gateway slash

Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.

Dashboard (plugins/kanban/dashboard/)
  - POST /tasks/:id/specify NEW endpoint. Thin wrapper around
    kanban_specify.specify_task(). Returns the CLI outcome shape
    ({ok, task_id, reason, new_title}); ok=false with a human reason
    is a 200, not a 4xx, so the UI can render it inline without
    treating 'no aux client configured' as a crash.
  - Runs sync in FastAPI's threadpool because the LLM call can take
    tens of seconds on reasoning models.
  - Pins HERMES_KANBAN_BOARD around the specify call so the module's
    argless kb.connect() lands on the right board.
  - dist/index.js: doSpecify callback threaded through the drawer →
    TaskDetail → StatusActions prop chain. ✨ Specify button appears
    ONLY when task.status === 'triage' (elsewhere the backend would
    reject anyway — hide the button to keep the action row clean).
    Busy state (Specifying…) + inline success/error banner under the
    button using the response.reason text.
  - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
    existing --color vars so themes reskin cleanly.
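
The "pin HERMES_KANBAN_BOARD around the call" step in the Dashboard
notes above can be pictured as a small save/set/restore helper. The
sketch below is illustrative only: the context manager `pinned_env` is
a hypothetical name, and the endpoint in this patch uses an inline
try/finally instead.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def pinned_env(name: str, value: str) -> Iterator[None]:
    """Temporarily set os.environ[name]; restore the prior state on exit."""
    previous: Optional[str] = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

Usage would look like `with pinned_env("HERMES_KANBAN_BOARD", board):`
wrapped around the specify call. Note that os.environ is process-wide,
so this pattern assumes overlapping requests for different boards are
not a concern.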

Gateway slash (/kanban specify)
  - Already works via the existing run_slash → build_parser →
    kanban_command pipeline. No code change needed — slash commands
    inherit the argparse tree automatically. Added coverage:
    test_run_slash_specify_end_to_end (create --triage, specify,
    verify promotion + retitle) and
    test_run_slash_specify_help_is_reachable.

Tests
  - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
    REST endpoint — happy path, non-triage rejection as ok=false 200,
    missing aux client as ok=false 200.
  - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.

Docs
  - website/docs/user-guide/features/kanban.md: dashboard action row
    description mentions ✨ Specify + all three surfaces. REST table
    gains /tasks/:id/specify. Slash examples include /kanban specify.

Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
---
 hermes_cli/config.py                          |  13 +
 hermes_cli/kanban.py                          | 111 ++++++
 hermes_cli/kanban_db.py                       |  85 +++++
 hermes_cli/kanban_specify.py                  | 265 ++++++++++++++
 plugins/kanban/dashboard/dist/index.js        | 109 +++++-
 plugins/kanban/dashboard/dist/style.css       |  20 ++
 plugins/kanban/dashboard/plugin_api.py        |  56 +++
 tests/hermes_cli/test_kanban_cli.py           |  55 +++
 tests/hermes_cli/test_kanban_specify.py       | 337 ++++++++++++++++++
 tests/hermes_cli/test_kanban_specify_db.py    | 184 ++++++++++
 tests/plugins/test_kanban_dashboard_plugin.py | 101 ++++++
 website/docs/reference/cli-commands.md        |   1 +
 website/docs/user-guide/features/kanban.md    |  11 +-
 13 files changed, 1328 insertions(+), 20 deletions(-)
 create mode 100644 hermes_cli/kanban_specify.py
 create mode 100644 tests/hermes_cli/test_kanban_specify.py
 create mode 100644 tests/hermes_cli/test_kanban_specify_db.py

diff --git a/hermes_cli/config.py b/hermes_cli/config.py
index 65d85cd58b..1e040c3685 100644
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@@
-780,6 +780,19 @@ DEFAULT_CONFIG = { "timeout": 30, "extra_body": {}, }, + # Triage specifier — flesh out a rough one-liner in the Kanban + # Triage column into a concrete spec, then promote it to ``todo``. + # Invoked by ``hermes kanban specify`` (single id or --all). Set a + # cheap, capable model here (gemini-flash works well); the main + # model is overkill for short spec expansion. + "triage_specifier": { + "provider": "auto", + "model": "", + "base_url": "", + "api_key": "", + "timeout": 120, + "extra_body": {}, + }, # Curator — skill-usage review fork. Timeout is generous because the # review pass can take several minutes on reasoning models (umbrella # building over hundreds of candidate skills). "auto" = use main chat diff --git a/hermes_cli/kanban.py b/hermes_cli/kanban.py index 59e44795f3..7c63d973c2 100644 --- a/hermes_cli/kanban.py +++ b/hermes_cli/kanban.py @@ -570,6 +570,42 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu ) p_ctx.add_argument("task_id") + # --- specify --- (triage → todo via auxiliary LLM) + p_specify = sub.add_parser( + "specify", + help="Flesh out a triage-column task into a concrete spec " + "(title + body) and promote it to todo. 
Uses the auxiliary " + "LLM configured under auxiliary.triage_specifier.", + ) + p_specify.add_argument( + "task_id", + nargs="?", + default=None, + help="Task id to specify (required unless --all is given)", + ) + p_specify.add_argument( + "--all", + dest="all_triage", + action="store_true", + help="Specify every task currently in the triage column", + ) + p_specify.add_argument( + "--tenant", + default=None, + help="When used with --all, restrict the sweep to this tenant", + ) + p_specify.add_argument( + "--author", + default=None, + help="Author name recorded on the audit comment " + "(default: $HERMES_PROFILE or 'specifier')", + ) + p_specify.add_argument( + "--json", + action="store_true", + help="Emit one JSON object per task on stdout", + ) + # --- gc --- p_gc = sub.add_parser( "gc", help="Garbage-collect archived-task workspaces, old events, and old logs", @@ -684,6 +720,7 @@ def kanban_command(args: argparse.Namespace) -> int: "notify-list": _cmd_notify_list, "notify-unsubscribe": _cmd_notify_unsubscribe, "context": _cmd_context, + "specify": _cmd_specify, "gc": _cmd_gc, } handler = handlers.get(action) @@ -1980,6 +2017,80 @@ def _cmd_context(args: argparse.Namespace) -> int: return 0 +def _cmd_specify(args: argparse.Namespace) -> int: + """Flesh out a triage task (or all of them) via auxiliary LLM, + then promote to todo. Thin wrapper over ``kanban_specify``.""" + from hermes_cli import kanban_specify as spec + + all_flag = bool(getattr(args, "all_triage", False)) + tenant = getattr(args, "tenant", None) + author = getattr(args, "author", None) or _profile_author() + want_json = bool(getattr(args, "json", False)) + + if args.task_id and all_flag: + print( + "kanban: pass either a task id OR --all, not both", + file=sys.stderr, + ) + return 2 + + if all_flag: + ids = spec.list_triage_ids(tenant=tenant) + if not ids: + msg = ( + "No triage tasks" + + (f" for tenant {tenant!r}" if tenant else "") + + "." 
+ ) + if want_json: + print(json.dumps({"specified": 0, "total": 0})) + else: + print(msg) + return 0 + elif args.task_id: + ids = [args.task_id] + else: + print( + "kanban: specify requires a task id or --all", + file=sys.stderr, + ) + return 2 + + ok_count = 0 + fail_count = 0 + for tid in ids: + outcome = spec.specify_task(tid, author=author) + if outcome.ok: + ok_count += 1 + else: + fail_count += 1 + if want_json: + print(json.dumps({ + "task_id": outcome.task_id, + "ok": outcome.ok, + "reason": outcome.reason, + "new_title": outcome.new_title, + })) + else: + if outcome.ok: + title_suffix = ( + f" — retitled: {outcome.new_title!r}" + if outcome.new_title + else "" + ) + print(f"Specified {outcome.task_id} → todo{title_suffix}") + else: + print( + f"kanban: specify {outcome.task_id}: {outcome.reason}", + file=sys.stderr, + ) + if not all_flag: + return 0 if ok_count == 1 else 1 + # --all: succeed if at least one promotion landed; exit 1 only when + # every candidate failed (honest signal for scripts). + return 0 if (ok_count > 0 or not ids) else 1 + + def _cmd_gc(args: argparse.Namespace) -> int: """Remove scratch workspaces of archived tasks, prune old events, and delete old worker logs.""" diff --git a/hermes_cli/kanban_db.py b/hermes_cli/kanban_db.py index 920e23e403..f905dd89af 100644 --- a/hermes_cli/kanban_db.py +++ b/hermes_cli/kanban_db.py @@ -2503,6 +2503,91 @@ def unblock_task(conn: sqlite3.Connection, task_id: str) -> bool: return True +def specify_triage_task( + conn: sqlite3.Connection, + task_id: str, + *, + title: Optional[str] = None, + body: Optional[str] = None, + author: Optional[str] = None, +) -> bool: + """Flesh out a triage task and promote it to ``todo``. + + Atomically updates ``title`` / ``body`` (when provided) and transitions + ``status: triage -> todo`` in a single write txn. Returns False when + the task is missing or not in the ``triage`` column — callers should + surface that as "nothing to specify" rather than an error. 
+ + ``todo`` (not ``ready``) is the correct landing column: ``recompute_ready`` + promotes parent-free / parent-done todos to ``ready`` on the next + dispatcher tick, which keeps the normal parent-gating behaviour intact + for specified tasks that happen to have open parents. + + ``author`` is recorded on an audit comment only when at least one of + ``title`` / ``body`` actually changed — avoids noisy comment spam for + status-only promotions. + """ + if title is not None and not title.strip(): + raise ValueError("title cannot be blank") + with write_txn(conn): + existing = conn.execute( + "SELECT title, body FROM tasks WHERE id = ? AND status = 'triage'", + (task_id,), + ).fetchone() + if existing is None: + return False + sets: list[str] = ["status = 'todo'"] + params: list[Any] = [] + changed_fields: list[str] = [] + if title is not None and title.strip() != (existing["title"] or ""): + sets.append("title = ?") + params.append(title.strip()) + changed_fields.append("title") + if body is not None and (body or "") != (existing["body"] or ""): + sets.append("body = ?") + params.append(body) + changed_fields.append("body") + params.append(task_id) + cur = conn.execute( + f"UPDATE tasks SET {', '.join(sets)} " + f"WHERE id = ? AND status = 'triage'", + tuple(params), + ) + if cur.rowcount != 1: + return False + if changed_fields and author and author.strip(): + # Inline INSERT (rather than ``add_comment``) because we're + # already inside this function's write_txn — nested BEGIN + # IMMEDIATE would raise OperationalError. We also skip the + # 'commented' event that ``add_comment`` emits, since the + # 'specified' event below already records the change. 
+ conn.execute( + "INSERT INTO task_comments (task_id, author, body, created_at) " + "VALUES (?, ?, ?, ?)", + ( + task_id, + author.strip(), + "Specified — updated " + + ", ".join(changed_fields) + + " and promoted to todo.", + int(time.time()), + ), + ) + _append_event( + conn, + task_id, + "specified", + {"changed_fields": changed_fields} if changed_fields else None, + ) + # Outside the write_txn above, so we don't nest BEGIN IMMEDIATE — the + # ready-promotion pass opens its own IMMEDIATE txn. This runs the same + # logic the dispatcher would on its next tick, so a specified task + # with no open parents flips straight to 'ready' here instead of + # idling in 'todo' until the next sweep. + recompute_ready(conn) + return True + + def archive_task(conn: sqlite3.Connection, task_id: str) -> bool: with write_txn(conn): cur = conn.execute( diff --git a/hermes_cli/kanban_specify.py b/hermes_cli/kanban_specify.py new file mode 100644 index 0000000000..d069e5ee1a --- /dev/null +++ b/hermes_cli/kanban_specify.py @@ -0,0 +1,265 @@ +"""Kanban triage specifier — flesh out a one-liner into a real spec. + +Used by ``hermes kanban specify [task_id | --all]``. Takes a task that +lives in the Triage column (a rough idea, typically only a title), calls +the auxiliary LLM to produce: + + * A tightened title (optional — only replaces if the model proposes a + materially different one) + * A concrete body: goal, proposed approach, acceptance criteria + +and then flips the task ``triage -> todo`` via +``kanban_db.specify_triage_task``. The dispatcher promotes it to +``ready`` on its next tick (or immediately if there are no open parents). + +Design notes +------------ + +* This module intentionally mirrors ``hermes_cli/goals.py`` — same aux + client pattern, same "empty config => skip, don't crash" tolerance. + Keeps the surface area tiny and the failure modes predictable. + +* The prompt is a short system + user pair. 
We ask for JSON with + ``{title, body}``; if parsing fails, we fall back to treating the + whole response as the body and leave the title untouched. No + retry loop — one shot, keep cost bounded. + +* Structured output / JSON mode is not requested explicitly so the + specifier works on providers that don't implement it. The parse + is lenient (tolerates markdown code fences around the JSON). +""" + +from __future__ import annotations + +import json +import logging +import os +import re +from dataclasses import dataclass +from typing import Optional + +from hermes_cli import kanban_db as kb + +logger = logging.getLogger(__name__) + + +_SYSTEM_PROMPT = """You are the Kanban triage specifier for the Hermes Agent board. +A user dropped a rough idea into the Triage column. Your job is to turn it +into a concrete, actionable task spec that an autonomous worker can pick up +and execute without further clarification. + +Output a single JSON object with exactly two keys: + + { + "title": "", + "body": "" + } + +The body MUST include these sections, each prefixed with a bold markdown +heading, in this order: + + **Goal** — one sentence, user-facing outcome. + **Approach** — 2-5 bullets on how a worker should tackle it. + **Acceptance criteria** — checklist of concrete, verifiable conditions. + **Out of scope** — short list of things NOT to touch (omit if nothing + obvious; never invent scope creep). + +Rules: + - Keep the tightened title close in meaning to the original idea — do + NOT invent a different project. + - If the original idea is already detailed, preserve its substance and + just reformat into the sections above. + - Never add invented requirements the user didn't hint at. + - No preamble, no closing remarks, no code fences around the JSON. + - Output only the JSON object and nothing else. 
+""" + + +_USER_TEMPLATE = """Task id: {task_id} +Current title: {title} +Current body: +{body} +""" + + +@dataclass +class SpecifyOutcome: + """Result of specifying a single triage task.""" + + task_id: str + ok: bool + reason: str = "" + new_title: Optional[str] = None + + +def _truncate(text: str, limit: int) -> str: + if len(text) <= limit: + return text + return text[: limit - 1] + "…" + + +_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE) + + +def _extract_json_blob(raw: str) -> Optional[dict]: + """Lenient JSON extraction — tolerates fenced code blocks and + leading/trailing whitespace. Returns None if nothing parses.""" + if not raw: + return None + stripped = _FENCE_RE.sub("", raw.strip()) + # Greedy: find the first `{` and last `}` and try that slice. + first = stripped.find("{") + last = stripped.rfind("}") + if first == -1 or last == -1 or last <= first: + return None + candidate = stripped[first : last + 1] + try: + val = json.loads(candidate) + except (ValueError, json.JSONDecodeError): + return None + if not isinstance(val, dict): + return None + return val + + +def _profile_author() -> str: + """Mirror of ``hermes_cli.kanban._profile_author``. Kept local to + avoid a circular import when kanban.py imports this module.""" + return ( + os.environ.get("HERMES_PROFILE") + or os.environ.get("USER") + or "specifier" + ) + + +def specify_task( + task_id: str, + *, + author: Optional[str] = None, + timeout: Optional[int] = None, +) -> SpecifyOutcome: + """Specify a single triage task and promote it to ``todo``. + + Returns an outcome describing what happened. Never raises for expected + failure modes (task not in triage, no aux client configured, API + error, malformed response) — those surface via ``ok=False`` so the + ``--all`` sweep can continue past individual failures. 
+ """ + with kb.connect() as conn: + task = kb.get_task(conn, task_id) + if task is None: + return SpecifyOutcome(task_id, False, "unknown task id") + if task.status != "triage": + return SpecifyOutcome( + task_id, False, f"task is not in triage (status={task.status!r})" + ) + + try: + from agent.auxiliary_client import get_text_auxiliary_client + except Exception as exc: # pragma: no cover — import smoke test + logger.debug("specify: auxiliary client import failed: %s", exc) + return SpecifyOutcome(task_id, False, "auxiliary client unavailable") + + try: + client, model = get_text_auxiliary_client("triage_specifier") + except Exception as exc: + logger.debug("specify: get_text_auxiliary_client failed: %s", exc) + return SpecifyOutcome(task_id, False, "auxiliary client unavailable") + + if client is None or not model: + return SpecifyOutcome( + task_id, False, "no auxiliary client configured" + ) + + user_msg = _USER_TEMPLATE.format( + task_id=task.id, + title=_truncate(task.title or "", 400), + body=_truncate(task.body or "(no body)", 4000), + ) + + try: + resp = client.chat.completions.create( + model=model, + messages=[ + {"role": "system", "content": _SYSTEM_PROMPT}, + {"role": "user", "content": user_msg}, + ], + temperature=0.3, + max_tokens=1500, + timeout=timeout or 120, + ) + except Exception as exc: + logger.info( + "specify: API call failed for %s (%s) — skipping", + task_id, exc, + ) + return SpecifyOutcome( + task_id, False, f"LLM error: {type(exc).__name__}" + ) + + try: + raw = resp.choices[0].message.content or "" + except Exception: + raw = "" + + parsed = _extract_json_blob(raw) + + new_title: Optional[str] + new_body: Optional[str] + if parsed is None: + # Fall back: treat the whole reply as the body, leave title as-is. + # Worst case the user edits afterward — still better than stranding + # the task in triage on a malformed LLM reply. 
+ stripped_raw = raw.strip() + if not stripped_raw: + return SpecifyOutcome( + task_id, False, "LLM returned an empty response" + ) + new_title = None + new_body = stripped_raw + else: + title_val = parsed.get("title") + body_val = parsed.get("body") + new_title = ( + title_val.strip() + if isinstance(title_val, str) and title_val.strip() + else None + ) + new_body = ( + body_val if isinstance(body_val, str) and body_val.strip() else None + ) + if new_body is None and new_title is None: + return SpecifyOutcome( + task_id, False, "LLM response missing title and body" + ) + + with kb.connect() as conn: + ok = kb.specify_triage_task( + conn, + task_id, + title=new_title, + body=new_body, + author=author or _profile_author(), + ) + if not ok: + # Race: someone else promoted / archived the task between our + # read above and the write. Report, don't crash. + return SpecifyOutcome( + task_id, False, "task moved out of triage before promotion" + ) + return SpecifyOutcome(task_id, True, "specified", new_title=new_title) + + +def list_triage_ids(*, tenant: Optional[str] = None) -> list[str]: + """Return task ids currently in the triage column. + + ``tenant`` narrows the sweep; ``None`` returns every triage task. + """ + with kb.connect() as conn: + tasks = kb.list_tasks( + conn, + status="triage", + tenant=tenant, + include_archived=False, + ) + return [t.id for t in tasks] diff --git a/plugins/kanban/dashboard/dist/index.js b/plugins/kanban/dashboard/dist/index.js index 8bd2c8f40b..9947e26be9 100644 --- a/plugins/kanban/dashboard/dist/index.js +++ b/plugins/kanban/dashboard/dist/index.js @@ -1905,6 +1905,29 @@ }).then(function () { load(); props.onRefresh(); }); }; + // Triage specifier — calls the auxiliary LLM to flesh out a rough + // idea in the Triage column into a concrete spec (title + body with + // goal, approach, acceptance criteria) and promotes it to todo. 
+ // Not a PATCH: runs through a dedicated POST endpoint because the + // LLM call can take tens of seconds, and its outcome is richer than + // a status flip (may update title AND body AND emit an audit + // comment — or fail with a human-readable reason that the UI + // surfaces inline without treating it as an HTTP error). + const doSpecify = function () { + return SDK.fetchJSON( + withBoard(`${API}/tasks/${encodeURIComponent(props.taskId)}/specify`, boardSlug), + { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({}), + } + ).then(function (res) { + load(); + props.onRefresh(); + return res; + }); + }; + const addLink = function (parentId) { return SDK.fetchJSON(withBoard(`${API}/links`, boardSlug), { method: "POST", @@ -1994,6 +2017,7 @@ assignees: props.assignees || [], boardSlug: boardSlug, onPatch: doPatch, + onSpecify: doSpecify, onAddParent: addLink, onRemoveParent: removeLink, onAddChild: addChild, @@ -2062,7 +2086,11 @@ }) : null, t.created_by ? h(MetaRow, { label: "Created by", value: t.created_by }) : null, ), - h(StatusActions, { task: t, onPatch: props.onPatch }), + h(StatusActions, { + task: t, + onPatch: props.onPatch, + onSpecify: props.onSpecify, + }), h(DiagnosticsSection, { task: t, boardSlug: props.boardSlug, @@ -2495,6 +2523,8 @@ function StatusActions(props) { const t = props.task; + const [specifyBusy, setSpecifyBusy] = useState(false); + const [specifyMsg, setSpecifyMsg] = useState(null); const b = function (label, patch, enabled, confirmMsg) { return h(Button, { onClick: function () { if (enabled !== false) props.onPatch(patch, { confirm: confirmMsg }); }, @@ -2502,22 +2532,67 @@ size: "sm", }, label); }; - return h("div", { className: "hermes-kanban-actions" }, - b("→ triage", { status: "triage" }, t.status !== "triage"), - b("→ ready", { status: "ready" }, t.status !== "ready"), - // No direct → running button: /tasks/:id PATCH rejects status=running - // with 400 (issue #19535). 
Tasks enter running only through the - // dispatcher's claim_task path, which atomically creates the run row, - // claim lock, and worker process metadata. - b("Block", { status: "blocked" }, - t.status === "running" || t.status === "ready", - DESTRUCTIVE_TRANSITIONS.blocked), - b("Unblock", { status: "ready" }, t.status === "blocked"), - b("Complete", { status: "done" }, - t.status === "running" || t.status === "ready" || t.status === "blocked", - DESTRUCTIVE_TRANSITIONS.done), - b("Archive", { status: "archived" }, t.status !== "archived", - DESTRUCTIVE_TRANSITIONS.archived), + + // "Specify" appears only when the task is in the Triage column — the + // one column where an auxiliary LLM pass is meaningful. Elsewhere + // the backend would return ok:false with "not in triage" anyway, + // so hiding the button keeps the action row uncluttered. + const specifyButton = (t.status === "triage" && props.onSpecify) + ? h(Button, { + onClick: function () { + if (specifyBusy) return; + setSpecifyBusy(true); + setSpecifyMsg(null); + props.onSpecify().then(function (res) { + if (res && res.ok) { + const suffix = res.new_title + ? ` — retitled: ${res.new_title}` + : ""; + setSpecifyMsg({ ok: true, text: `Specified${suffix}` }); + } else { + setSpecifyMsg({ + ok: false, + text: "Specify failed: " + ((res && res.reason) || "unknown error"), + }); + } + }).catch(function (err) { + setSpecifyMsg({ + ok: false, + text: "Specify failed: " + (err.message || String(err)), + }); + }).then(function () { + setSpecifyBusy(false); + }); + }, + disabled: specifyBusy, + size: "sm", + }, specifyBusy ? "Specifying…" : "✨ Specify") + : null; + + return h("div", null, + h("div", { className: "hermes-kanban-actions" }, + specifyButton, + b("→ triage", { status: "triage" }, t.status !== "triage"), + b("→ ready", { status: "ready" }, t.status !== "ready"), + // No direct → running button: /tasks/:id PATCH rejects status=running + // with 400 (issue #19535). 
Tasks enter running only through the + // dispatcher's claim_task path, which atomically creates the run row, + // claim lock, and worker process metadata. + b("Block", { status: "blocked" }, + t.status === "running" || t.status === "ready", + DESTRUCTIVE_TRANSITIONS.blocked), + b("Unblock", { status: "ready" }, t.status === "blocked"), + b("Complete", { status: "done" }, + t.status === "running" || t.status === "ready" || t.status === "blocked", + DESTRUCTIVE_TRANSITIONS.done), + b("Archive", { status: "archived" }, t.status !== "archived", + DESTRUCTIVE_TRANSITIONS.archived), + ), + specifyMsg ? h("div", { + className: specifyMsg.ok + ? "hermes-kanban-msg-ok" + : "hermes-kanban-msg-err", + }, specifyMsg.text) : null, ); } diff --git a/plugins/kanban/dashboard/dist/style.css b/plugins/kanban/dashboard/dist/style.css index ec8934d314..7ecf2fd61f 100644 --- a/plugins/kanban/dashboard/dist/style.css +++ b/plugins/kanban/dashboard/dist/style.css @@ -402,6 +402,26 @@ gap: 0.3rem; } +/* Specifier result banner — sits directly under the status action row. 
*/ +.hermes-kanban-msg-ok, +.hermes-kanban-msg-err { + margin-top: 0.4rem; + padding: 0.35rem 0.55rem; + border-radius: 0.375rem; + font-size: 0.85rem; + line-height: 1.3; +} +.hermes-kanban-msg-ok { + background: rgba(46, 160, 67, 0.12); + color: #2ea043; + border: 1px solid rgba(46, 160, 67, 0.35); +} +.hermes-kanban-msg-err { + background: rgba(248, 81, 73, 0.12); + color: #f85149; + border: 1px solid rgba(248, 81, 73, 0.35); +} + /* ---- Home channel subscription toggles (per-platform, per-task) ----- */ .hermes-kanban-home-subs { diff --git a/plugins/kanban/dashboard/plugin_api.py b/plugins/kanban/dashboard/plugin_api.py index f7dfd91a7d..4cc2ccb3c3 100644 --- a/plugins/kanban/dashboard/plugin_api.py +++ b/plugins/kanban/dashboard/plugin_api.py @@ -30,6 +30,7 @@ import asyncio import hmac import json import logging +import os import sqlite3 import time from dataclasses import asdict @@ -1011,6 +1012,61 @@ def reclaim_task_endpoint( conn.close() +class SpecifyBody(BaseModel): + """Optional author override. Nothing else is configurable from the + dashboard — model + prompt come from ``auxiliary.triage_specifier`` + in config.yaml, same as the CLI.""" + + author: Optional[str] = None + + +@router.post("/tasks/{task_id}/specify") +def specify_task_endpoint( + task_id: str, + payload: SpecifyBody, + board: Optional[str] = Query(None), +): + """Flesh out a triage-column task via the auxiliary LLM and promote + it to ``todo``. Maps 1:1 to ``hermes kanban specify ``. + + Returns the outcome shape used by the CLI: ``{ok, task_id, reason, + new_title}``. A non-OK outcome is NOT an HTTP error — the UI renders + the reason inline (e.g. "no auxiliary client configured") so the + operator knows what to fix, and retries without a page reload. 
+ + This endpoint runs in FastAPI's threadpool (sync ``def``) because + the underlying LLM call can take tens of seconds to minutes on + reasoning models, which would block the event loop if we used + ``async def`` without an explicit ``run_in_executor``. + """ + board = _resolve_board(board) + # Pin the board for the duration of this call so the specifier module + # (which calls ``kb.connect()`` with no args) hits the right DB. + prev_env = os.environ.get("HERMES_KANBAN_BOARD") + try: + os.environ["HERMES_KANBAN_BOARD"] = board or kanban_db.DEFAULT_BOARD + # Import lazily so a missing auxiliary client at import time + # doesn't break plugin load. + from hermes_cli import kanban_specify # noqa: WPS433 (intentional) + + outcome = kanban_specify.specify_task( + task_id, + author=(payload.author or None), + ) + finally: + if prev_env is None: + os.environ.pop("HERMES_KANBAN_BOARD", None) + else: + os.environ["HERMES_KANBAN_BOARD"] = prev_env + + return { + "ok": bool(outcome.ok), + "task_id": outcome.task_id, + "reason": outcome.reason, + "new_title": outcome.new_title, + } + + class ReassignBody(BaseModel): profile: Optional[str] = None # "" or None = unassign reclaim_first: bool = False diff --git a/tests/hermes_cli/test_kanban_cli.py b/tests/hermes_cli/test_kanban_cli.py index 2c657124c1..7eed9e0be2 100644 --- a/tests/hermes_cli/test_kanban_cli.py +++ b/tests/hermes_cli/test_kanban_cli.py @@ -286,3 +286,58 @@ def test_run_slash_reassign_with_reclaim_flag(kanban_home): assert "Reassigned" in out, out out2 = kc.run_slash(f"show {tid}") assert "newbie" in out2 + + +# --------------------------------------------------------------------------- +# /kanban specify — slash surface (same entry point CLI + gateway use) +# --------------------------------------------------------------------------- + +def test_run_slash_specify_end_to_end(kanban_home, monkeypatch): + """The /kanban specify slash command routes through run_slash, which + both the interactive CLI and every 
gateway platform use. This test + covers both surfaces.""" + from unittest.mock import MagicMock + + # Create a triage task via the same slash surface. + create_out = kc.run_slash("create 'rough idea' --triage") + import re + m = re.search(r"(t_[a-f0-9]+)", create_out) + assert m, f"no task id in: {create_out!r}" + tid = m.group(1) + + # Mock the auxiliary client so we don't hit a real provider. + resp = MagicMock() + resp.choices = [MagicMock()] + resp.choices[0].message.content = ( + '{"title": "Spec: rough idea", "body": "**Goal**\\nShip it."}' + ) + fake_client = MagicMock() + fake_client.chat.completions.create = MagicMock(return_value=resp) + monkeypatch.setattr( + "agent.auxiliary_client.get_text_auxiliary_client", + lambda *a, **kw: (fake_client, "test-model"), + ) + + # Specify via slash. + out = kc.run_slash(f"specify {tid}") + assert "Specified" in out + assert tid in out + + # Task is promoted and retitled. + with kb.connect() as conn: + task = kb.get_task(conn, tid) + assert task.status in {"todo", "ready"} + assert task.title == "Spec: rough idea" + + +def test_run_slash_specify_help_is_reachable(kanban_home): + """`--help` on a subcommand is handled by argparse itself — it prints + to the process stdout and raises SystemExit before run_slash's output + redirection is installed, so the returned string is the usage-error + sentinel. All we're asserting here is that the subcommand is + registered (no "unknown action" error) — the shape of the help text + is covered by the direct argparse tests in test_kanban_specify.py.""" + out = kc.run_slash("specify --help") + # Either the usage-error sentinel (stdout swallowed by argparse) or + # a real help rendering — both mean the subcommand exists. 
+ assert "usage error" in out.lower() or "specify" in out.lower() diff --git a/tests/hermes_cli/test_kanban_specify.py b/tests/hermes_cli/test_kanban_specify.py new file mode 100644 index 0000000000..dd37700159 --- /dev/null +++ b/tests/hermes_cli/test_kanban_specify.py @@ -0,0 +1,337 @@ +"""Tests for the specifier module + `hermes kanban specify` CLI surface. + +The auxiliary LLM client is mocked; these tests don't hit any network or +real provider. They exercise the prompt plumbing, response parsing, DB +writes, and CLI flag surface. +""" + +from __future__ import annotations + +import argparse +import json as jsonlib +from pathlib import Path +from unittest.mock import MagicMock, patch + +import pytest + +from hermes_cli import kanban as kanban_cli +from hermes_cli import kanban_db as kb +from hermes_cli import kanban_specify as spec + + +@pytest.fixture +def kanban_home(tmp_path, monkeypatch): + home = tmp_path / ".hermes" + home.mkdir() + monkeypatch.setenv("HERMES_HOME", str(home)) + monkeypatch.setattr(Path, "home", lambda: tmp_path) + kb.init_db() + return home + + +def _fake_aux_response(content: str): + """Build a minimal object shaped like an OpenAI chat.completions result. + + The specifier only reads ``resp.choices[0].message.content``, so we + avoid importing the openai SDK and build the tree with MagicMock. + """ + resp = MagicMock() + resp.choices = [MagicMock()] + resp.choices[0].message.content = content + return resp + + +def _mock_client_returning(content: str): + client = MagicMock() + client.chat.completions.create = MagicMock(return_value=_fake_aux_response(content)) + return client + + +def _patch_aux_client(content: str, *, model: str = "test-model"): + """Patch get_text_auxiliary_client at its source module and return + ``(patcher, client)``. One patch at the source is enough + because kanban_specify imports the function inside the function body.
+ """ + client = _mock_client_returning(content) + return patch( + "agent.auxiliary_client.get_text_auxiliary_client", + return_value=(client, model), + ), client + + +# --------------------------------------------------------------------------- +# JSON extraction helpers +# --------------------------------------------------------------------------- + +def test_extract_json_blob_handles_plain_json(): + raw = '{"title": "T", "body": "B"}' + assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"} + + +def test_extract_json_blob_handles_fenced_json(): + raw = '```json\n{"title": "T", "body": "B"}\n```' + assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"} + + +def test_extract_json_blob_handles_prose_preamble(): + raw = 'Sure! Here you go:\n{"title": "T", "body": "B"}\nThanks.' + assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"} + + +def test_extract_json_blob_returns_none_for_unparseable(): + assert spec._extract_json_blob("no json here") is None + assert spec._extract_json_blob("") is None + assert spec._extract_json_blob("{not: valid}") is None + + +# --------------------------------------------------------------------------- +# specify_task (module-level entry point) +# --------------------------------------------------------------------------- + +def test_specify_task_happy_path(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + content = jsonlib.dumps({ + "title": "Refined rough", + "body": "**Goal**\nA concrete goal.", + }) + p, _ = _patch_aux_client(content) + with p: + outcome = spec.specify_task(tid, author="ace") + + assert outcome.ok is True + assert outcome.task_id == tid + assert outcome.new_title == "Refined rough" + + with kb.connect() as conn: + task = kb.get_task(conn, tid) + # Parent-free → recompute_ready promotes to ready. 
+ assert task.status == "ready" + assert task.title == "Refined rough" + assert "**Goal**" in (task.body or "") + + +def test_specify_task_falls_back_to_body_only_on_bad_json(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="keep title", triage=True) + + # Model returned plain markdown, no JSON object. + content = "Goal: Do a thing.\nApproach: Steps here." + p, _ = _patch_aux_client(content) + with p: + outcome = spec.specify_task(tid) + + assert outcome.ok is True + with kb.connect() as conn: + t = kb.get_task(conn, tid) + # Title preserved (no JSON with a title key). + assert t.title == "keep title" + # Body replaced with the raw response. + assert "Goal:" in (t.body or "") + + +def test_specify_task_rejects_non_triage_task(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="ready task") + + p, client = _patch_aux_client("unused") + with p: + outcome = spec.specify_task(tid) + + assert outcome.ok is False + assert "not in triage" in outcome.reason + # LLM must not be invoked for a non-triage task — fail cheap. + assert client.chat.completions.create.call_count == 0 + + +def test_specify_task_unknown_id(kanban_home): + p, client = _patch_aux_client("unused") + with p: + outcome = spec.specify_task("t_nope") + assert outcome.ok is False + assert "unknown task" in outcome.reason + assert client.chat.completions.create.call_count == 0 + + +def test_specify_task_no_aux_client_configured(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + with patch( + "agent.auxiliary_client.get_text_auxiliary_client", + return_value=(None, ""), + ): + outcome = spec.specify_task(tid) + + assert outcome.ok is False + assert "auxiliary client" in outcome.reason + # Task must stay in triage — we never touched it. 
+ with kb.connect() as conn: + assert kb.get_task(conn, tid).status == "triage" + + +def test_specify_task_llm_api_error_keeps_task_in_triage(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + client = MagicMock() + client.chat.completions.create = MagicMock(side_effect=RuntimeError("429 rate limited")) + with patch( + "agent.auxiliary_client.get_text_auxiliary_client", + return_value=(client, "test-model"), + ): + outcome = spec.specify_task(tid) + + assert outcome.ok is False + assert "LLM error" in outcome.reason + with kb.connect() as conn: + assert kb.get_task(conn, tid).status == "triage" + + +def test_specify_task_empty_llm_response(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + p, _ = _patch_aux_client("") + with p: + outcome = spec.specify_task(tid) + + assert outcome.ok is False + with kb.connect() as conn: + assert kb.get_task(conn, tid).status == "triage" + + +def test_list_triage_ids(kanban_home): + with kb.connect() as conn: + a = kb.create_task(conn, title="a", triage=True) + b = kb.create_task(conn, title="b", triage=True, tenant="proj-1") + kb.create_task(conn, title="c") # not triage — excluded + + ids_all = spec.list_triage_ids() + assert set(ids_all) == {a, b} + ids_tenant = spec.list_triage_ids(tenant="proj-1") + assert ids_tenant == [b] + + +# --------------------------------------------------------------------------- +# CLI wiring — argparse + _cmd_specify +# --------------------------------------------------------------------------- + +def _run_cli(*argv: str) -> int: + """Invoke the `hermes kanban …` argparse surface directly.""" + root = argparse.ArgumentParser() + subp = root.add_subparsers(dest="cmd") + kanban_cli.build_parser(subp) + ns = root.parse_args(["kanban", *argv]) + return kanban_cli.kanban_command(ns) + + +def test_cli_specify_requires_id_or_all(kanban_home, capsys): + rc = _run_cli("specify") + assert rc == 2 + err = 
+ capsys.readouterr().err + assert "requires a task id or --all" in err + + +def test_cli_specify_rejects_both_id_and_all(kanban_home, capsys): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + rc = _run_cli("specify", tid, "--all") + assert rc == 2 + err = capsys.readouterr().err + assert "either a task id OR --all" in err + + +def test_cli_specify_single_id_success(kanban_home, capsys): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + content = jsonlib.dumps({"title": "clean", "body": "body"}) + p, _ = _patch_aux_client(content) + with p: + rc = _run_cli("specify", tid) + assert rc == 0 + out = capsys.readouterr().out + assert tid in out + assert "→" in out or "-> todo" in out + + +def test_cli_specify_all_success_and_json(kanban_home, capsys): + with kb.connect() as conn: + a = kb.create_task(conn, title="a", triage=True) + b = kb.create_task(conn, title="b", triage=True) + + content = jsonlib.dumps({"title": "spec", "body": "body"}) + p, _ = _patch_aux_client(content) + with p: + rc = _run_cli("specify", "--all", "--json") + assert rc == 0 + lines = [line for line in capsys.readouterr().out.strip().splitlines() if line] + # One JSON object per task + nothing else.
+ assert len(lines) == 2 + parsed = [jsonlib.loads(l) for l in lines] + ids = {row["task_id"] for row in parsed} + assert ids == {a, b} + assert all(row["ok"] for row in parsed) + + +def test_cli_specify_all_empty_triage_column(kanban_home, capsys): + rc = _run_cli("specify", "--all") + assert rc == 0 + assert "No triage tasks" in capsys.readouterr().out + + +def test_cli_specify_all_returns_1_when_every_task_fails(kanban_home, capsys): + with kb.connect() as conn: + kb.create_task(conn, title="a", triage=True) + kb.create_task(conn, title="b", triage=True) + + with patch( + "agent.auxiliary_client.get_text_auxiliary_client", + return_value=(None, ""), # no aux client → every task fails + ): + rc = _run_cli("specify", "--all") + + assert rc == 1 + + +def test_cli_specify_tenant_filter(kanban_home, capsys): + with kb.connect() as conn: + outside = kb.create_task(conn, title="outside", triage=True) + inside = kb.create_task( + conn, title="inside", triage=True, tenant="proj-a", + ) + + content = jsonlib.dumps({"title": "spec", "body": "body"}) + p, _ = _patch_aux_client(content) + with p: + rc = _run_cli("specify", "--all", "--tenant", "proj-a", "--json") + assert rc == 0 + lines = [ + jsonlib.loads(l) + for l in capsys.readouterr().out.strip().splitlines() + if l + ] + ids = {row["task_id"] for row in lines} + assert ids == {inside} + + # The outside task stays in triage. + with kb.connect() as conn: + assert kb.get_task(conn, outside).status == "triage" + # The inside task was promoted. 
+ assert kb.get_task(conn, inside).status in {"todo", "ready"} + + +def test_cli_specify_author_passed_through(kanban_home, capsys): + with kb.connect() as conn: + tid = kb.create_task(conn, title="rough", triage=True) + + content = jsonlib.dumps({"title": "fresh title", "body": "fresh body"}) + p, _ = _patch_aux_client(content) + with p: + rc = _run_cli("specify", tid, "--author", "custom-agent") + assert rc == 0 + with kb.connect() as conn: + comments = kb.list_comments(conn, tid) + assert comments and comments[0].author == "custom-agent" diff --git a/tests/hermes_cli/test_kanban_specify_db.py b/tests/hermes_cli/test_kanban_specify_db.py new file mode 100644 index 0000000000..4128c8c522 --- /dev/null +++ b/tests/hermes_cli/test_kanban_specify_db.py @@ -0,0 +1,184 @@ +"""Tests for kb.specify_triage_task — the DB-layer atomic promotion +from the triage column to todo. LLM-free by design.""" + +from __future__ import annotations + +from pathlib import Path + +import pytest + +from hermes_cli import kanban_db as kb + + +@pytest.fixture +def kanban_home(tmp_path, monkeypatch): + """Isolated HERMES_HOME with an empty kanban DB.""" + home = tmp_path / ".hermes" + home.mkdir() + monkeypatch.setenv("HERMES_HOME", str(home)) + monkeypatch.setattr(Path, "home", lambda: tmp_path) + kb.init_db() + return home + + +def _create_triage(conn, title="rough idea", body=None, assignee=None): + return kb.create_task( + conn, + title=title, + body=body, + assignee=assignee, + triage=True, + ) + + +def test_specify_promotes_triage_to_todo(kanban_home): + with kb.connect() as conn: + tid = _create_triage(conn, title="rough idea") + assert kb.get_task(conn, tid).status == "triage" + with kb.connect() as conn: + ok = kb.specify_triage_task( + conn, + tid, + title="Refined: rough idea", + body="**Goal**\nDo the thing.", + author="specifier-bot", + ) + assert ok is True + with kb.connect() as conn: + task = kb.get_task(conn, tid) + # No parents → recompute_ready should have flipped it past 
todo to ready. + assert task.status == "ready" + assert task.title == "Refined: rough idea" + assert "**Goal**" in (task.body or "") + + +def test_specify_with_open_parent_lands_in_todo_not_ready(kanban_home): + # Parent-gated specified tasks must not jump the dispatcher — they go + # to todo and wait for parent completion like any other gated task. + with kb.connect() as conn: + parent = kb.create_task(conn, title="parent work") + child = _create_triage(conn, title="child idea") + kb.link_tasks(conn, parent, child) + # After linking with an open parent, triage status should still be + # 'triage' (linking doesn't touch triage tasks). + assert kb.get_task(conn, child).status == "triage" + with kb.connect() as conn: + ok = kb.specify_triage_task( + conn, + child, + body="full spec", + author="specifier", + ) + assert ok is True + with kb.connect() as conn: + t = kb.get_task(conn, child) + # Parent still open → specified child sits in 'todo', not 'ready'. + assert t.status == "todo" + + +def test_specify_refuses_non_triage_task(kanban_home): + with kb.connect() as conn: + tid = kb.create_task(conn, title="normal task") + assert kb.get_task(conn, tid).status == "ready" + with kb.connect() as conn: + ok = kb.specify_triage_task(conn, tid, body="won't apply") + assert ok is False + with kb.connect() as conn: + # Status unchanged. 
+ assert kb.get_task(conn, tid).status == "ready" + + +def test_specify_returns_false_for_unknown_id(kanban_home): + with kb.connect() as conn: + ok = kb.specify_triage_task(conn, "t_does_not_exist", body="x") + assert ok is False + + +def test_specify_rejects_blank_title(kanban_home): + with kb.connect() as conn: + tid = _create_triage(conn, title="rough") + with kb.connect() as conn, pytest.raises(ValueError): + kb.specify_triage_task(conn, tid, title=" ", body="ok") + + +def test_specify_emits_event(kanban_home): + with kb.connect() as conn: + tid = _create_triage(conn, title="rough") + with kb.connect() as conn: + kb.specify_triage_task( + conn, tid, title="new", body="b", author="ace" + ) + with kb.connect() as conn: + events = kb.list_events(conn, tid) + kinds = [e.kind for e in events] + assert "specified" in kinds + # The specified event records which fields actually changed as a + # JSON payload under task_events.payload. + spec_ev = next(e for e in events if e.kind == "specified") + assert spec_ev.payload is not None + fields = spec_ev.payload.get("changed_fields") or [] + assert "title" in fields + assert "body" in fields + + +def test_specify_records_audit_comment_only_when_author_given(kanban_home): + # With author → comment added. + with kb.connect() as conn: + tid1 = _create_triage(conn, title="a") + kb.specify_triage_task( + conn, tid1, title="A-spec", body="b", author="ace" + ) + comments1 = kb.list_comments(conn, tid1) + assert len(comments1) == 1 + assert "Specified" in comments1[0].body + assert comments1[0].author == "ace" + + # Without author → no comment (silent). + with kb.connect() as conn: + tid2 = _create_triage(conn, title="b") + kb.specify_triage_task(conn, tid2, title="B-spec", body="b") + comments2 = kb.list_comments(conn, tid2) + assert comments2 == [] + + +def test_specify_skips_comment_when_nothing_changed(kanban_home): + # Create triage task with title and body already set; pass identical + # values to specify. 
Should promote to todo but skip audit comment. + with kb.connect() as conn: + tid = _create_triage(conn, title="same", body="same body") + with kb.connect() as conn: + ok = kb.specify_triage_task( + conn, + tid, + title="same", + body="same body", + author="ace", + ) + assert ok is True + with kb.connect() as conn: + # Promoted. + assert kb.get_task(conn, tid).status in {"todo", "ready"} + # No audit comment because neither field changed. + assert kb.list_comments(conn, tid) == [] + + +def test_specify_with_only_body_preserves_title(kanban_home): + with kb.connect() as conn: + tid = _create_triage(conn, title="keep this title") + with kb.connect() as conn: + kb.specify_triage_task(conn, tid, body="new body only") + with kb.connect() as conn: + t = kb.get_task(conn, tid) + assert t.title == "keep this title" + assert t.body == "new body only" + + +def test_specify_second_call_noop_false(kanban_home): + # Promoting twice must not crash and the second call returns False + # because the task is no longer in triage. 
+ with kb.connect() as conn: + tid = _create_triage(conn, title="once") + with kb.connect() as conn: + assert kb.specify_triage_task(conn, tid, body="spec") is True + with kb.connect() as conn: + assert kb.specify_triage_task(conn, tid, body="spec again") is False diff --git a/tests/plugins/test_kanban_dashboard_plugin.py b/tests/plugins/test_kanban_dashboard_plugin.py index f1e562425d..9163025174 100644 --- a/tests/plugins/test_kanban_dashboard_plugin.py +++ b/tests/plugins/test_kanban_dashboard_plugin.py @@ -1582,3 +1582,104 @@ def test_board_exposes_diagnostics_list_and_summary(client): assert task_dict["warnings"] is not None assert task_dict["warnings"]["highest_severity"] == "error" assert task_dict["diagnostics"][0]["kind"] == "repeated_crashes" + + +# --------------------------------------------------------------------------- +# POST /tasks/:id/specify — triage specifier endpoint +# --------------------------------------------------------------------------- + + +def _patch_specifier_response(monkeypatch, *, content, model="test-model"): + """Helper: install a fake auxiliary client so the specifier endpoint + can run without hitting any real provider.""" + from unittest.mock import MagicMock + + resp = MagicMock() + resp.choices = [MagicMock()] + resp.choices[0].message.content = content + fake_client = MagicMock() + fake_client.chat.completions.create = MagicMock(return_value=resp) + monkeypatch.setattr( + "agent.auxiliary_client.get_text_auxiliary_client", + lambda *a, **kw: (fake_client, model), + ) + return fake_client + + +def test_specify_happy_path(client, monkeypatch): + import json as jsonlib + + # Create a triage task. 
+ t = client.post( + "/api/plugins/kanban/tasks", + json={"title": "one-liner", "triage": True}, + ).json()["task"] + assert t["status"] == "triage" + + _patch_specifier_response( + monkeypatch, + content=jsonlib.dumps( + {"title": "Polished", "body": "**Goal**\nDo the thing."} + ), + ) + + r = client.post( + f"/api/plugins/kanban/tasks/{t['id']}/specify", + json={"author": "ui-tester"}, + ) + assert r.status_code == 200 + body = r.json() + assert body["ok"] is True + assert body["task_id"] == t["id"] + assert body["new_title"] == "Polished" + + # Task should have moved off the triage column. + detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"] + assert detail["status"] in {"todo", "ready"} + assert detail["title"] == "Polished" + assert "**Goal**" in (detail["body"] or "") + + +def test_specify_non_triage_returns_ok_false_not_http_error(client, monkeypatch): + """The endpoint intentionally returns ``{ok: false, reason: ...}`` for + "task not in triage" rather than a 4xx — the dashboard renders the + reason inline so the user can fix it without a page reload.""" + # Create a normal (ready) task — not in triage. + t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"] + + _patch_specifier_response(monkeypatch, content="unused") + + r = client.post( + f"/api/plugins/kanban/tasks/{t['id']}/specify", + json={}, + ) + assert r.status_code == 200 + body = r.json() + assert body["ok"] is False + assert "not in triage" in body["reason"] + + +def test_specify_no_aux_client_surfaces_reason(client, monkeypatch): + t = client.post( + "/api/plugins/kanban/tasks", + json={"title": "rough", "triage": True}, + ).json()["task"] + + # Simulate "no auxiliary client configured". 
+ monkeypatch.setattr( + "agent.auxiliary_client.get_text_auxiliary_client", + lambda *a, **kw: (None, ""), + ) + + r = client.post( + f"/api/plugins/kanban/tasks/{t['id']}/specify", + json={}, + ) + assert r.status_code == 200 + body = r.json() + assert body["ok"] is False + assert "auxiliary client" in body["reason"] + + # Task must stay in triage — nothing was touched. + detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"] + assert detail["status"] == "triage" diff --git a/website/docs/reference/cli-commands.md b/website/docs/reference/cli-commands.md index 68e911984e..390204e533 100644 --- a/website/docs/reference/cli-commands.md +++ b/website/docs/reference/cli-commands.md @@ -378,6 +378,7 @@ Multi-profile, multi-project collaboration board. Each install can host many boa | `tail ` | Follow a task's event stream. | | `dispatch` | One dispatcher pass on the active board. Flags: `--dry-run`, `--max N`, `--json`. | | `context ` | Print the full context a worker would see (title + body + parent results + comments). | +| `specify ` / `specify --all` | Flesh out a triage-column task into a concrete spec (title + body with goal, approach, acceptance criteria) via the auxiliary LLM, then promote it to `todo`. Flags: `--tenant` (scope `--all` to one tenant), `--author`, `--json`. Configure the model under `auxiliary.triage_specifier` in `config.yaml`. | | `gc` | Remove scratch workspaces for archived tasks. | Examples: diff --git a/website/docs/user-guide/features/kanban.md b/website/docs/user-guide/features/kanban.md index acaa07c201..1f343a29f0 100644 --- a/website/docs/user-guide/features/kanban.md +++ b/website/docs/user-guide/features/kanban.md @@ -442,7 +442,7 @@ hermes dashboard # "Kanban" tab appears in the nav, after "Skills" ### What the plugin gives you - A **Kanban** tab showing one column per status: `triage`, `todo`, `ready`, `running`, `blocked`, `done` (plus `archived` when the toggle is on). 
- - `triage` is the parking column for rough ideas a specifier is expected to flesh out. Tasks created with `hermes kanban create --triage` (or via the Triage column's inline create) land here and the dispatcher leaves them alone until a human or specifier promotes them to `todo` / `ready`. + - `triage` is the parking column for rough ideas a specifier is expected to flesh out. Tasks created with `hermes kanban create --triage` (or via the Triage column's inline create) land here and the dispatcher leaves them alone until a human or specifier promotes them to `todo` / `ready`. Run `hermes kanban specify ` to have the auxiliary LLM expand a triage task into a concrete spec (title + body with goal, approach, acceptance criteria) and promote it to `todo` in one shot; `--all` sweeps every triage task at once. Configure which model runs the specifier under `auxiliary.triage_specifier` in `config.yaml`. - Cards show the task id, title, priority badge, tenant tag, assigned profile, comment/link counts, a **progress pill** (`N/M` children done when the task has dependents), and "created N ago". A per-card checkbox enables multi-select. - **Per-profile lanes inside Running** — toolbar checkbox toggles sub-grouping of the Running column by assignee. - **Live updates via WebSocket** — the plugin tails the append-only `task_events` table on a short poll interval; the board reflects changes the instant any profile (CLI, gateway, or another dashboard tab) acts. Reloads are debounced so a burst of events triggers a single refetch. @@ -454,7 +454,7 @@ hermes dashboard # "Kanban" tab appears in the nav, after "Skills" - **Editable assignee / priority** — click the meta row to rewrite. - **Editable description** — markdown-rendered by default (headings, bold, italic, inline code, fenced code, `http(s)` / `mailto:` links, bullet lists), with an "edit" button that swaps in a textarea. 
Markdown rendering is a tiny, XSS-safe renderer — every substitution runs on HTML-escaped input, only `http(s)` / `mailto:` links pass through, and `target="_blank"` + `rel="noopener noreferrer"` are always set. - **Dependency editor** — chip list of parents and children, each with an `×` to unlink, plus dropdowns over every other task to add a new parent or child. Cycle attempts are rejected server-side with a clear message. - - **Status action row** (→ triage / → ready / → running / block / unblock / complete / archive) with confirm prompts for destructive transitions. + - **Status action row** (→ triage / → ready / → running / block / unblock / complete / archive) with confirm prompts for destructive transitions. For cards in the **Triage** column the row also exposes a **✨ Specify** button that calls the auxiliary LLM (`auxiliary.triage_specifier` in `config.yaml`) to expand the one-liner into a concrete spec (title + body with goal, approach, acceptance criteria) and promote the task to `todo`. The same behaviour is reachable from the CLI (`hermes kanban specify ` / `--all`), from any gateway platform (`/kanban specify `), and programmatically via `POST /api/plugins/kanban/tasks/:id/specify`. - Result section (also markdown-rendered), comment thread with Enter-to-submit, the last 20 events. - **Toolbar filters** — free-text search, tenant dropdown (defaults to `dashboard.kanban.default_tenant` from `config.yaml`), assignee dropdown, "show archived" toggle, "lanes by profile" toggle, and a **Nudge dispatcher** button so you don't have to wait for the next 60 s tick. @@ -496,6 +496,7 @@ All routes are mounted under `/api/plugins/kanban/` and protected by the dashboa | `PATCH` | `/tasks/:id` | Status / assignee / priority / title / body / result | | `POST` | `/tasks/bulk` | Apply the same patch (status / archive / assignee / priority) to every id in `ids`. 
Per-id failures reported without aborting siblings | | `POST` | `/tasks/:id/comments` | Append a comment | +| `POST` | `/tasks/:id/specify` | Run the triage specifier — auxiliary LLM fleshes out the task body and promotes it from `triage` to `todo`. Returns `{ok, task_id, reason, new_title}`; `ok=false` with a human-readable reason on "not in triage" / no aux client / LLM error is a 200, not a 4xx | | `POST` | `/links` | Add a dependency (`parent_id` → `child_id`) | | `DELETE` | `/links?parent_id=…&child_id=…` | Remove a dependency | | `POST` | `/dispatch?max=…&dry_run=…` | Nudge the dispatcher — skip the 60 s wait | @@ -588,6 +589,8 @@ hermes kanban notify-list [] [--json] hermes kanban notify-unsubscribe --platform --chat-id [--thread-id ] hermes kanban context # what a worker sees +hermes kanban specify [ | --all] [--tenant T] # flesh out a triage-column idea + [--author NAME] [--json] # into a full spec and promote to todo hermes kanban gc [--event-retention-days N] # workspaces + old events + old logs [--log-retention-days N] ``` @@ -605,6 +608,8 @@ Every `hermes kanban ` verb is also reachable as `/kanban ` — /kanban comment t_abcd "looks good, ship it" /kanban unblock t_abcd /kanban dispatch --max 3 +/kanban specify t_abcd # flesh out a triage one-liner into a real spec +/kanban specify --all --tenant engineering # sweep every triage task in one tenant ``` Quote multi-word arguments the same way you would on a shell — `run_slash` parses the rest of the line with `shlex.split`, so `"..."` and `'...'` both work. 
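
The quoting rule above follows directly from how the standard library's `shlex.split` tokenizes a line. A minimal sketch of that behaviour, assuming nothing about `run_slash` beyond what the paragraph states (`split_slash_args` is a hypothetical helper name, not part of Hermes):

```python
import shlex
from typing import List


def split_slash_args(rest_of_line: str) -> List[str]:
    """Hypothetical helper: tokenize the text after `/kanban` like a POSIX shell."""
    return shlex.split(rest_of_line)


# Double and single quotes both keep a multi-word argument together.
print(split_slash_args('comment t_abcd "looks good, ship it"'))
# ['comment', 't_abcd', 'looks good, ship it']
print(split_slash_args("create 'rough idea' --triage"))
# ['create', 'rough idea', '--triage']

# An unterminated quote is rejected by shlex itself with ValueError.
try:
    split_slash_args('comment t_abcd "oops')
except ValueError as exc:
    print(f"rejected: {exc}")
```

An unbalanced quote therefore never reaches argument parsing; how `run_slash` reports that error to the user is not shown here.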
@@ -658,7 +663,7 @@ The board supports these eight patterns without any new primitives: | **P6 `@mention`** | inline routing from prose | `@reviewer look at this` | | **P7 Thread-scoped workspace** | `/kanban here` in a thread | per-project gateway threads | | **P8 Fleet farming** | one profile, N subjects | 50 social accounts | -| **P9 Triage specifier** | rough idea → `triage` → specifier expands body → `todo` | "turn this one-liner into a spec' task" | +| **P9 Triage specifier** | rough idea → `triage` → `hermes kanban specify` expands body → `todo` | "turn this one-liner into a spec'd task" | For worked examples of each, see `docs/hermes-kanban-v1-spec.pdf`.
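
The lenient response parsing exercised by the `_extract_json_blob` tests (plain JSON, fenced JSON, a prose preamble, unparseable input) could look roughly like the sketch below. This is an illustration under assumptions, not the code in `hermes_cli/kanban_specify.py`: the name `extract_json_blob`, the fence regex, and the order in which candidates are tried are all hypothetical.

```python
import json
import re
from typing import Optional

# Matches a Markdown code fence, with or without a "json" language tag.
_FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def extract_json_blob(raw: str) -> Optional[dict]:
    """Hypothetical stand-in: return the first JSON object found in raw, else None."""
    text = (raw or "").strip()
    if not text:
        return None

    # 1. Strict: the whole reply is JSON.
    candidates = [text]
    # 2. Tolerated: JSON wrapped in a Markdown fence.
    fenced = _FENCE_RE.search(text)
    if fenced:
        candidates.append(fenced.group(1).strip())
    # 3. Tolerated: JSON surrounded by prose; take the outermost brace span.
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

When a helper like this returns `None`, the caller can keep the existing title and store the raw reply as the body, which is the fallback the "falls back to body only on bad JSON" test describes.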