feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435)

* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks The Triage column shipped with a placeholder 'a specifier will flesh out the spec', but the specifier itself was never built. This wires it up as a dedicated CLI verb. `hermes kanban specify <id>` calls the auxiliary LLM (configured under `auxiliary.triage_specifier`) to expand a rough one-liner into a concrete spec — tightened title plus a body with Goal / Approach / Acceptance criteria / Out-of-scope sections — then atomically flips `status: triage -> todo` and recomputes ready so parent-free tasks go straight to the dispatcher on the same tick. Surface: hermes kanban specify <task_id> # single task hermes kanban specify --all [--tenant T] # sweep triage column hermes kanban specify ... --author NAME # audit-comment author hermes kanban specify ... --json # one JSON line per task Design choices: - Parent gating is preserved. specify_triage_task flips to 'todo', then recompute_ready promotes to 'ready' only when parents are done — same rule as a normal parent-gated todo. - No daemon, no background watcher. Every invocation is explicit — keeps cost predictable and doesn't fight the dispatcher loop. - Response parse is lenient: strict JSON preferred, markdown-fence tolerated, raw-body fallback on malformed JSON so the LLM can't strand a task in triage. - All failure modes (no aux client, API error, task moved out of triage mid-call) return SpecifyOutcome(ok=False, reason=...) so --all continues past individual failures. Changes: hermes_cli/kanban_db.py + specify_triage_task() hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call) hermes_cli/kanban.py + specify subcommand + _cmd_specify hermes_cli/config.py + auxiliary.triage_specifier task slot website/docs/user-guide/features/kanban.md specify + config notes website/docs/reference/cli-commands.md CLI reference entry tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests) tests/hermes_cli/test_kanban_specify.py NEW (20 tests) Validation: 30/30 targeted tests pass. E2E: triage task -> specify -> ends in 'ready' with events [created, specified, promoted] and the audit comment recorded under the configured author. * feat(kanban): wire specifier into dashboard and gateway slash Follow-ups to the initial PR #21435 — closes the two gaps I'd left as post-merge: dashboard button and first-class gateway surface. Dashboard (plugins/kanban/dashboard/) - POST /tasks/:id/specify NEW endpoint. Thin wrapper around kanban_specify.specify_task(). Returns the CLI outcome shape ({ok, task_id, reason, new_title}); ok=false with a human reason is a 200, not a 4xx, so the UI can render it inline without treating 'no aux client configured' as a crash. - Runs sync in FastAPI's threadpool because the LLM call can take tens of seconds on reasoning models. - Pins HERMES_KANBAN_BOARD around the specify call so the module's argless kb.connect() lands on the right board. - dist/index.js: doSpecify callback threaded through the drawer → TaskDetail → StatusActions prop chain. ✨ Specify button appears ONLY when task.status === 'triage' (elsewhere the backend would reject anyway — hide the button to keep the action row clean). Busy state (Specifying…) + inline success/error banner under the button using the response.reason text. - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using existing --color vars so themes reskin cleanly. Gateway slash (/kanban specify) - Already works via the existing run_slash → build_parser → kanban_command pipeline. No code change needed — slash commands inherit the argparse tree automatically. Added coverage: test_run_slash_specify_end_to_end (create --triage, specify, verify promotion + retitle) and test_run_slash_specify_help_is_reachable. Tests - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the REST endpoint — happy path, non-triage rejection as ok=false 200, missing aux client as ok=false 200. - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests. Docs - website/docs/user-guide/features/kanban.md: dashboard action row description mentions ✨ Specify + all three surfaces. REST table gains /tasks/:id/specify. Slash examples include /kanban specify. Validation: 340/340 targeted tests pass. E2E via TestClient: create a triage task over REST → POST /specify with mocked aux client → task moves to 'ready' column on /board with new title and body applied.
2026-05-12 03:42:08 +00:00 · 2026-05-07 13:04:41 -07:00 · 2026-05-07 13:04:41 -07:00 · 24d48ffb82
commit 24d48ffb82
parent 732a6c45fa
13 changed files with 1328 additions and 20 deletions
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@ -780,6 +780,19 @@ DEFAULT_CONFIG = {
            "timeout": 30,
            "extra_body": {},
        },
+        # Triage specifier — flesh out a rough one-liner in the Kanban
+        # Triage column into a concrete spec, then promote it to ``todo``.
+        # Invoked by ``hermes kanban specify`` (single id or --all). Set a
+        # cheap, capable model here (gemini-flash works well); the main
+        # model is overkill for short spec expansion.
+        "triage_specifier": {
+            "provider": "auto",
+            "model": "",
+            "base_url": "",
+            "api_key": "",
+            "timeout": 120,
+            "extra_body": {},
+        },
        # Curator — skill-usage review fork. Timeout is generous because the
        # review pass can take several minutes on reasoning models (umbrella
        # building over hundreds of candidate skills). "auto" = use main chat
--- a/hermes_cli/kanban.py
+++ b/hermes_cli/kanban.py
@ -570,6 +570,42 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
    )
    p_ctx.add_argument("task_id")

+    # --- specify --- (triage → todo via auxiliary LLM)
+    p_specify = sub.add_parser(
+        "specify",
+        help="Flesh out a triage-column task into a concrete spec "
+             "(title + body) and promote it to todo. Uses the auxiliary "
+             "LLM configured under auxiliary.triage_specifier.",
+    )
+    p_specify.add_argument(
+        "task_id",
+        nargs="?",
+        default=None,
+        help="Task id to specify (required unless --all is given)",
+    )
+    p_specify.add_argument(
+        "--all",
+        dest="all_triage",
+        action="store_true",
+        help="Specify every task currently in the triage column",
+    )
+    p_specify.add_argument(
+        "--tenant",
+        default=None,
+        help="When used with --all, restrict the sweep to this tenant",
+    )
+    p_specify.add_argument(
+        "--author",
+        default=None,
+        help="Author name recorded on the audit comment "
+             "(default: $HERMES_PROFILE or 'specifier')",
+    )
+    p_specify.add_argument(
+        "--json",
+        action="store_true",
+        help="Emit one JSON object per task on stdout",
+    )
+
    # --- gc ---
    p_gc = sub.add_parser(
        "gc", help="Garbage-collect archived-task workspaces, old events, and old logs",
@ -684,6 +720,7 @@ def kanban_command(args: argparse.Namespace) -> int:
        "notify-list":        _cmd_notify_list,
        "notify-unsubscribe": _cmd_notify_unsubscribe,
        "context":  _cmd_context,
+        "specify":  _cmd_specify,
        "gc":       _cmd_gc,
    }
    handler = handlers.get(action)
@ -1980,6 +2017,80 @@ def _cmd_context(args: argparse.Namespace) -> int:
    return 0


+def _cmd_specify(args: argparse.Namespace) -> int:
+    """Flesh out a triage task (or all of them) via auxiliary LLM,
+    then promote to todo. Thin wrapper over ``kanban_specify``."""
+    from hermes_cli import kanban_specify as spec
+
+    all_flag = bool(getattr(args, "all_triage", False))
+    tenant = getattr(args, "tenant", None)
+    author = getattr(args, "author", None) or _profile_author()
+    want_json = bool(getattr(args, "json", False))
+
+    if args.task_id and all_flag:
+        print(
+            "kanban: pass either a task id OR --all, not both",
+            file=sys.stderr,
+        )
+        return 2
+
+    if all_flag:
+        ids = spec.list_triage_ids(tenant=tenant)
+        if not ids:
+            msg = (
+                "No triage tasks"
+                + (f" for tenant {tenant!r}" if tenant else "")
+                + "."
+            )
+            if want_json:
+                print(json.dumps({"specified": 0, "total": 0}))
+            else:
+                print(msg)
+            return 0
+    elif args.task_id:
+        ids = [args.task_id]
+    else:
+        print(
+            "kanban: specify requires a task id or --all",
+            file=sys.stderr,
+        )
+        return 2
+
+    ok_count = 0
+    fail_count = 0
+    for tid in ids:
+        outcome = spec.specify_task(tid, author=author)
+        if outcome.ok:
+            ok_count += 1
+        else:
+            fail_count += 1
+        if want_json:
+            print(json.dumps({
+                "task_id": outcome.task_id,
+                "ok": outcome.ok,
+                "reason": outcome.reason,
+                "new_title": outcome.new_title,
+            }))
+        else:
+            if outcome.ok:
+                title_suffix = (
+                    f" — retitled: {outcome.new_title!r}"
+                    if outcome.new_title
+                    else ""
+                )
+                print(f"Specified {outcome.task_id} → todo{title_suffix}")
+            else:
+                print(
+                    f"kanban: specify {outcome.task_id}: {outcome.reason}",
+                    file=sys.stderr,
+                )
+    if not all_flag:
+        return 0 if ok_count == 1 else 1
+    # --all: succeed if at least one promotion landed; exit 1 only when
+    # every candidate failed (honest signal for scripts).
+    return 0 if (ok_count > 0 or not ids) else 1
+
+
 def _cmd_gc(args: argparse.Namespace) -> int:
    """Remove scratch workspaces of archived tasks, prune old events, and
    delete old worker logs."""
--- a/hermes_cli/kanban_db.py
+++ b/hermes_cli/kanban_db.py
@ -2503,6 +2503,91 @@ def unblock_task(conn: sqlite3.Connection, task_id: str) -> bool:
        return True


+def specify_triage_task(
+    conn: sqlite3.Connection,
+    task_id: str,
+    *,
+    title: Optional[str] = None,
+    body: Optional[str] = None,
+    author: Optional[str] = None,
+) -> bool:
+    """Flesh out a triage task and promote it to ``todo``.
+
+    Atomically updates ``title`` / ``body`` (when provided) and transitions
+    ``status: triage -> todo`` in a single write txn. Returns False when
+    the task is missing or not in the ``triage`` column — callers should
+    surface that as "nothing to specify" rather than an error.
+
+    ``todo`` (not ``ready``) is the correct landing column: ``recompute_ready``
+    promotes parent-free / parent-done todos to ``ready`` on the next
+    dispatcher tick, which keeps the normal parent-gating behaviour intact
+    for specified tasks that happen to have open parents.
+
+    ``author`` is recorded on an audit comment only when at least one of
+    ``title`` / ``body`` actually changed — avoids noisy comment spam for
+    status-only promotions.
+    """
+    if title is not None and not title.strip():
+        raise ValueError("title cannot be blank")
+    with write_txn(conn):
+        existing = conn.execute(
+            "SELECT title, body FROM tasks WHERE id = ? AND status = 'triage'",
+            (task_id,),
+        ).fetchone()
+        if existing is None:
+            return False
+        sets: list[str] = ["status = 'todo'"]
+        params: list[Any] = []
+        changed_fields: list[str] = []
+        if title is not None and title.strip() != (existing["title"] or ""):
+            sets.append("title = ?")
+            params.append(title.strip())
+            changed_fields.append("title")
+        if body is not None and (body or "") != (existing["body"] or ""):
+            sets.append("body = ?")
+            params.append(body)
+            changed_fields.append("body")
+        params.append(task_id)
+        cur = conn.execute(
+            f"UPDATE tasks SET {', '.join(sets)} "
+            f"WHERE id = ? AND status = 'triage'",
+            tuple(params),
+        )
+        if cur.rowcount != 1:
+            return False
+        if changed_fields and author and author.strip():
+            # Inline INSERT (rather than ``add_comment``) because we're
+            # already inside this function's write_txn — nested BEGIN
+            # IMMEDIATE would raise OperationalError. We also skip the
+            # 'commented' event that ``add_comment`` emits, since the
+            # 'specified' event below already records the change.
+            conn.execute(
+                "INSERT INTO task_comments (task_id, author, body, created_at) "
+                "VALUES (?, ?, ?, ?)",
+                (
+                    task_id,
+                    author.strip(),
+                    "Specified — updated "
+                    + ", ".join(changed_fields)
+                    + " and promoted to todo.",
+                    int(time.time()),
+                ),
+            )
+        _append_event(
+            conn,
+            task_id,
+            "specified",
+            {"changed_fields": changed_fields} if changed_fields else None,
+        )
+    # Outside the write_txn above, so we don't nest BEGIN IMMEDIATE — the
+    # ready-promotion pass opens its own IMMEDIATE txn. This runs the same
+    # logic the dispatcher would on its next tick, so a specified task
+    # with no open parents flips straight to 'ready' here instead of
+    # idling in 'todo' until the next sweep.
+    recompute_ready(conn)
+    return True
+
+
 def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
    with write_txn(conn):
        cur = conn.execute(
--- a/hermes_cli/kanban_specify.py
+++ b/hermes_cli/kanban_specify.py
@ -0,0 +1,265 @@
+"""Kanban triage specifier — flesh out a one-liner into a real spec.
+
+Used by ``hermes kanban specify [task_id | --all]``. Takes a task that
+lives in the Triage column (a rough idea, typically only a title), calls
+the auxiliary LLM to produce:
+
+  * A tightened title (optional — only replaces if the model proposes a
+    materially different one)
+  * A concrete body: goal, proposed approach, acceptance criteria
+
+and then flips the task ``triage -> todo`` via
+``kanban_db.specify_triage_task``. The dispatcher promotes it to
+``ready`` on its next tick (or immediately if there are no open parents).
+
+Design notes
+------------
+
+* This module intentionally mirrors ``hermes_cli/goals.py`` — same aux
+  client pattern, same "empty config => skip, don't crash" tolerance.
+  Keeps the surface area tiny and the failure modes predictable.
+
+* The prompt is a short system + user pair. We ask for JSON with
+  ``{title, body}``; if parsing fails, we fall back to treating the
+  whole response as the body and leave the title untouched. No
+  retry loop — one shot, keep cost bounded.
+
+* Structured output / JSON mode is not requested explicitly so the
+  specifier works on providers that don't implement it. The parse
+  is lenient (tolerates markdown code fences around the JSON).
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import re
+from dataclasses import dataclass
+from typing import Optional
+
+from hermes_cli import kanban_db as kb
+
+logger = logging.getLogger(__name__)
+
+
+_SYSTEM_PROMPT = """You are the Kanban triage specifier for the Hermes Agent board.
+A user dropped a rough idea into the Triage column. Your job is to turn it
+into a concrete, actionable task spec that an autonomous worker can pick up
+and execute without further clarification.
+
+Output a single JSON object with exactly two keys:
+
+  {
+    "title": "<tightened task title, <= 80 chars, imperative voice>",
+    "body":  "<multi-line spec, see structure below>"
+  }
+
+The body MUST include these sections, each prefixed with a bold markdown
+heading, in this order:
+
+  **Goal** — one sentence, user-facing outcome.
+  **Approach** — 2-5 bullets on how a worker should tackle it.
+  **Acceptance criteria** — checklist of concrete, verifiable conditions.
+  **Out of scope** — short list of things NOT to touch (omit if nothing
+      obvious; never invent scope creep).
+
+Rules:
+  - Keep the tightened title close in meaning to the original idea — do
+    NOT invent a different project.
+  - If the original idea is already detailed, preserve its substance and
+    just reformat into the sections above.
+  - Never add invented requirements the user didn't hint at.
+  - No preamble, no closing remarks, no code fences around the JSON.
+  - Output only the JSON object and nothing else.
+"""
+
+
+_USER_TEMPLATE = """Task id: {task_id}
+Current title: {title}
+Current body:
+{body}
+"""
+
+
+@dataclass
+class SpecifyOutcome:
+    """Result of specifying a single triage task."""
+
+    task_id: str
+    ok: bool
+    reason: str = ""
+    new_title: Optional[str] = None
+
+
+def _truncate(text: str, limit: int) -> str:
+    if len(text) <= limit:
+        return text
+    return text[: limit - 1] + "…"
+
+
+_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE)
+
+
+def _extract_json_blob(raw: str) -> Optional[dict]:
+    """Lenient JSON extraction — tolerates fenced code blocks and
+    leading/trailing whitespace. Returns None if nothing parses."""
+    if not raw:
+        return None
+    stripped = _FENCE_RE.sub("", raw.strip())
+    # Greedy: find the first `{` and last `}` and try that slice.
+    first = stripped.find("{")
+    last = stripped.rfind("}")
+    if first == -1 or last == -1 or last <= first:
+        return None
+    candidate = stripped[first : last + 1]
+    try:
+        val = json.loads(candidate)
+    except (ValueError, json.JSONDecodeError):
+        return None
+    if not isinstance(val, dict):
+        return None
+    return val
+
+
+def _profile_author() -> str:
+    """Mirror of ``hermes_cli.kanban._profile_author``. Kept local to
+    avoid a circular import when kanban.py imports this module."""
+    return (
+        os.environ.get("HERMES_PROFILE")
+        or os.environ.get("USER")
+        or "specifier"
+    )
+
+
+def specify_task(
+    task_id: str,
+    *,
+    author: Optional[str] = None,
+    timeout: Optional[int] = None,
+) -> SpecifyOutcome:
+    """Specify a single triage task and promote it to ``todo``.
+
+    Returns an outcome describing what happened. Never raises for expected
+    failure modes (task not in triage, no aux client configured, API
+    error, malformed response) — those surface via ``ok=False`` so the
+    ``--all`` sweep can continue past individual failures.
+    """
+    with kb.connect() as conn:
+        task = kb.get_task(conn, task_id)
+    if task is None:
+        return SpecifyOutcome(task_id, False, "unknown task id")
+    if task.status != "triage":
+        return SpecifyOutcome(
+            task_id, False, f"task is not in triage (status={task.status!r})"
+        )
+
+    try:
+        from agent.auxiliary_client import get_text_auxiliary_client
+    except Exception as exc:  # pragma: no cover — import smoke test
+        logger.debug("specify: auxiliary client import failed: %s", exc)
+        return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
+
+    try:
+        client, model = get_text_auxiliary_client("triage_specifier")
+    except Exception as exc:
+        logger.debug("specify: get_text_auxiliary_client failed: %s", exc)
+        return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
+
+    if client is None or not model:
+        return SpecifyOutcome(
+            task_id, False, "no auxiliary client configured"
+        )
+
+    user_msg = _USER_TEMPLATE.format(
+        task_id=task.id,
+        title=_truncate(task.title or "", 400),
+        body=_truncate(task.body or "(no body)", 4000),
+    )
+
+    try:
+        resp = client.chat.completions.create(
+            model=model,
+            messages=[
+                {"role": "system", "content": _SYSTEM_PROMPT},
+                {"role": "user", "content": user_msg},
+            ],
+            temperature=0.3,
+            max_tokens=1500,
+            timeout=timeout or 120,
+        )
+    except Exception as exc:
+        logger.info(
+            "specify: API call failed for %s (%s) — skipping",
+            task_id, exc,
+        )
+        return SpecifyOutcome(
+            task_id, False, f"LLM error: {type(exc).__name__}"
+        )
+
+    try:
+        raw = resp.choices[0].message.content or ""
+    except Exception:
+        raw = ""
+
+    parsed = _extract_json_blob(raw)
+
+    new_title: Optional[str]
+    new_body: Optional[str]
+    if parsed is None:
+        # Fall back: treat the whole reply as the body, leave title as-is.
+        # Worst case the user edits afterward — still better than stranding
+        # the task in triage on a malformed LLM reply.
+        stripped_raw = raw.strip()
+        if not stripped_raw:
+            return SpecifyOutcome(
+                task_id, False, "LLM returned an empty response"
+            )
+        new_title = None
+        new_body = stripped_raw
+    else:
+        title_val = parsed.get("title")
+        body_val = parsed.get("body")
+        new_title = (
+            title_val.strip()
+            if isinstance(title_val, str) and title_val.strip()
+            else None
+        )
+        new_body = (
+            body_val if isinstance(body_val, str) and body_val.strip() else None
+        )
+        if new_body is None and new_title is None:
+            return SpecifyOutcome(
+                task_id, False, "LLM response missing title and body"
+            )
+
+    with kb.connect() as conn:
+        ok = kb.specify_triage_task(
+            conn,
+            task_id,
+            title=new_title,
+            body=new_body,
+            author=author or _profile_author(),
+        )
+    if not ok:
+        # Race: someone else promoted / archived the task between our
+        # read above and the write. Report, don't crash.
+        return SpecifyOutcome(
+            task_id, False, "task moved out of triage before promotion"
+        )
+    return SpecifyOutcome(task_id, True, "specified", new_title=new_title)
+
+
+def list_triage_ids(*, tenant: Optional[str] = None) -> list[str]:
+    """Return task ids currently in the triage column.
+
+    ``tenant`` narrows the sweep; ``None`` returns every triage task.
+    """
+    with kb.connect() as conn:
+        tasks = kb.list_tasks(
+            conn,
+            status="triage",
+            tenant=tenant,
+            include_archived=False,
+        )
+    return [t.id for t in tasks]