mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:01:47 +00:00
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks
The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.
`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.
Surface:
hermes kanban specify <task_id> # single task
hermes kanban specify --all [--tenant T] # sweep triage column
hermes kanban specify ... --author NAME # audit-comment author
hermes kanban specify ... --json # one JSON line per task
Design choices:
- Parent gating is preserved. specify_triage_task flips to 'todo',
then recompute_ready promotes to 'ready' only when parents are
done — same rule as a normal parent-gated todo.
- No daemon, no background watcher. Every invocation is explicit —
keeps cost predictable and doesn't fight the dispatcher loop.
- Response parse is lenient: strict JSON preferred, markdown-fence
tolerated, raw-body fallback on malformed JSON so the LLM can't
strand a task in triage.
- All failure modes (no aux client, API error, task moved out of
triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
--all continues past individual failures.
Changes:
hermes_cli/kanban_db.py + specify_triage_task()
hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call)
hermes_cli/kanban.py + specify subcommand + _cmd_specify
hermes_cli/config.py + auxiliary.triage_specifier task slot
website/docs/user-guide/features/kanban.md specify + config notes
website/docs/reference/cli-commands.md CLI reference entry
tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests)
tests/hermes_cli/test_kanban_specify.py NEW (20 tests)
Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.
* feat(kanban): wire specifier into dashboard and gateway slash
Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.
Dashboard (plugins/kanban/dashboard/)
- POST /tasks/:id/specify NEW endpoint. Thin wrapper around
kanban_specify.specify_task(). Returns the CLI outcome shape
({ok, task_id, reason, new_title}); ok=false with a human reason
is a 200, not a 4xx, so the UI can render it inline without
treating 'no aux client configured' as a crash.
- Runs sync in FastAPI's threadpool because the LLM call can take
tens of seconds on reasoning models.
- Pins HERMES_KANBAN_BOARD around the specify call so the module's
argless kb.connect() lands on the right board.
- dist/index.js: doSpecify callback threaded through the drawer →
TaskDetail → StatusActions prop chain. ✨ Specify button appears
ONLY when task.status === 'triage' (elsewhere the backend would
reject anyway — hide the button to keep the action row clean).
Busy state (Specifying…) + inline success/error banner under the
button using the response.reason text.
- dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
existing --color vars so themes reskin cleanly.
Gateway slash (/kanban specify)
- Already works via the existing run_slash → build_parser →
kanban_command pipeline. No code change needed — slash commands
inherit the argparse tree automatically. Added coverage:
test_run_slash_specify_end_to_end (create --triage, specify, verify
promotion + retitle) and test_run_slash_specify_help_is_reachable.
Tests
- tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
REST endpoint — happy path, non-triage rejection as ok=false 200,
missing aux client as ok=false 200.
- tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.
Docs
- website/docs/user-guide/features/kanban.md: dashboard action row
description mentions ✨ Specify + all three surfaces. REST table
gains /tasks/:id/specify. Slash examples include /kanban specify.
Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
265 lines
8.5 KiB
Python
265 lines
8.5 KiB
Python
"""Kanban triage specifier — flesh out a one-liner into a real spec.
|
|
|
|
Used by ``hermes kanban specify [task_id | --all]``. Takes a task that
|
|
lives in the Triage column (a rough idea, typically only a title), calls
|
|
the auxiliary LLM to produce:
|
|
|
|
* A tightened title (optional — only replaces if the model proposes a
|
|
materially different one)
|
|
* A concrete body: goal, proposed approach, acceptance criteria
|
|
|
|
and then flips the task ``triage -> todo`` via
|
|
``kanban_db.specify_triage_task``. The dispatcher promotes it to
|
|
``ready`` on its next tick (or immediately if there are no open parents).
|
|
|
|
Design notes
|
|
------------
|
|
|
|
* This module intentionally mirrors ``hermes_cli/goals.py`` — same aux
|
|
client pattern, same "empty config => skip, don't crash" tolerance.
|
|
Keeps the surface area tiny and the failure modes predictable.
|
|
|
|
* The prompt is a short system + user pair. We ask for JSON with
|
|
``{title, body}``; if parsing fails, we fall back to treating the
|
|
whole response as the body and leave the title untouched. No
|
|
retry loop — one shot, keep cost bounded.
|
|
|
|
* Structured output / JSON mode is not requested explicitly so the
|
|
specifier works on providers that don't implement it. The parse
|
|
is lenient (tolerates markdown code fences around the JSON).
|
|
"""
|
|
|
|
from __future__ import annotations
|
|
|
|
import json
|
|
import logging
|
|
import os
|
|
import re
|
|
from dataclasses import dataclass
|
|
from typing import Optional
|
|
|
|
from hermes_cli import kanban_db as kb
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
_SYSTEM_PROMPT = """You are the Kanban triage specifier for the Hermes Agent board.
|
|
A user dropped a rough idea into the Triage column. Your job is to turn it
|
|
into a concrete, actionable task spec that an autonomous worker can pick up
|
|
and execute without further clarification.
|
|
|
|
Output a single JSON object with exactly two keys:
|
|
|
|
{
|
|
"title": "<tightened task title, <= 80 chars, imperative voice>",
|
|
"body": "<multi-line spec, see structure below>"
|
|
}
|
|
|
|
The body MUST include these sections, each prefixed with a bold markdown
|
|
heading, in this order:
|
|
|
|
**Goal** — one sentence, user-facing outcome.
|
|
**Approach** — 2-5 bullets on how a worker should tackle it.
|
|
**Acceptance criteria** — checklist of concrete, verifiable conditions.
|
|
**Out of scope** — short list of things NOT to touch (omit if nothing
|
|
obvious; never invent scope creep).
|
|
|
|
Rules:
|
|
- Keep the tightened title close in meaning to the original idea — do
|
|
NOT invent a different project.
|
|
- If the original idea is already detailed, preserve its substance and
|
|
just reformat into the sections above.
|
|
- Never add invented requirements the user didn't hint at.
|
|
- No preamble, no closing remarks, no code fences around the JSON.
|
|
- Output only the JSON object and nothing else.
|
|
"""
|
|
|
|
|
|
_USER_TEMPLATE = """Task id: {task_id}
|
|
Current title: {title}
|
|
Current body:
|
|
{body}
|
|
"""
|
|
|
|
|
|
@dataclass
|
|
class SpecifyOutcome:
|
|
"""Result of specifying a single triage task."""
|
|
|
|
task_id: str
|
|
ok: bool
|
|
reason: str = ""
|
|
new_title: Optional[str] = None
|
|
|
|
|
|
def _truncate(text: str, limit: int) -> str:
|
|
if len(text) <= limit:
|
|
return text
|
|
return text[: limit - 1] + "…"
|
|
|
|
|
|
_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE)
|
|
|
|
|
|
def _extract_json_blob(raw: str) -> Optional[dict]:
|
|
"""Lenient JSON extraction — tolerates fenced code blocks and
|
|
leading/trailing whitespace. Returns None if nothing parses."""
|
|
if not raw:
|
|
return None
|
|
stripped = _FENCE_RE.sub("", raw.strip())
|
|
# Greedy: find the first `{` and last `}` and try that slice.
|
|
first = stripped.find("{")
|
|
last = stripped.rfind("}")
|
|
if first == -1 or last == -1 or last <= first:
|
|
return None
|
|
candidate = stripped[first : last + 1]
|
|
try:
|
|
val = json.loads(candidate)
|
|
except (ValueError, json.JSONDecodeError):
|
|
return None
|
|
if not isinstance(val, dict):
|
|
return None
|
|
return val
|
|
|
|
|
|
def _profile_author() -> str:
|
|
"""Mirror of ``hermes_cli.kanban._profile_author``. Kept local to
|
|
avoid a circular import when kanban.py imports this module."""
|
|
return (
|
|
os.environ.get("HERMES_PROFILE")
|
|
or os.environ.get("USER")
|
|
or "specifier"
|
|
)
|
|
|
|
|
|
def specify_task(
|
|
task_id: str,
|
|
*,
|
|
author: Optional[str] = None,
|
|
timeout: Optional[int] = None,
|
|
) -> SpecifyOutcome:
|
|
"""Specify a single triage task and promote it to ``todo``.
|
|
|
|
Returns an outcome describing what happened. Never raises for expected
|
|
failure modes (task not in triage, no aux client configured, API
|
|
error, malformed response) — those surface via ``ok=False`` so the
|
|
``--all`` sweep can continue past individual failures.
|
|
"""
|
|
with kb.connect() as conn:
|
|
task = kb.get_task(conn, task_id)
|
|
if task is None:
|
|
return SpecifyOutcome(task_id, False, "unknown task id")
|
|
if task.status != "triage":
|
|
return SpecifyOutcome(
|
|
task_id, False, f"task is not in triage (status={task.status!r})"
|
|
)
|
|
|
|
try:
|
|
from agent.auxiliary_client import get_text_auxiliary_client
|
|
except Exception as exc: # pragma: no cover — import smoke test
|
|
logger.debug("specify: auxiliary client import failed: %s", exc)
|
|
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
|
|
|
|
try:
|
|
client, model = get_text_auxiliary_client("triage_specifier")
|
|
except Exception as exc:
|
|
logger.debug("specify: get_text_auxiliary_client failed: %s", exc)
|
|
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
|
|
|
|
if client is None or not model:
|
|
return SpecifyOutcome(
|
|
task_id, False, "no auxiliary client configured"
|
|
)
|
|
|
|
user_msg = _USER_TEMPLATE.format(
|
|
task_id=task.id,
|
|
title=_truncate(task.title or "", 400),
|
|
body=_truncate(task.body or "(no body)", 4000),
|
|
)
|
|
|
|
try:
|
|
resp = client.chat.completions.create(
|
|
model=model,
|
|
messages=[
|
|
{"role": "system", "content": _SYSTEM_PROMPT},
|
|
{"role": "user", "content": user_msg},
|
|
],
|
|
temperature=0.3,
|
|
max_tokens=1500,
|
|
timeout=timeout or 120,
|
|
)
|
|
except Exception as exc:
|
|
logger.info(
|
|
"specify: API call failed for %s (%s) — skipping",
|
|
task_id, exc,
|
|
)
|
|
return SpecifyOutcome(
|
|
task_id, False, f"LLM error: {type(exc).__name__}"
|
|
)
|
|
|
|
try:
|
|
raw = resp.choices[0].message.content or ""
|
|
except Exception:
|
|
raw = ""
|
|
|
|
parsed = _extract_json_blob(raw)
|
|
|
|
new_title: Optional[str]
|
|
new_body: Optional[str]
|
|
if parsed is None:
|
|
# Fall back: treat the whole reply as the body, leave title as-is.
|
|
# Worst case the user edits afterward — still better than stranding
|
|
# the task in triage on a malformed LLM reply.
|
|
stripped_raw = raw.strip()
|
|
if not stripped_raw:
|
|
return SpecifyOutcome(
|
|
task_id, False, "LLM returned an empty response"
|
|
)
|
|
new_title = None
|
|
new_body = stripped_raw
|
|
else:
|
|
title_val = parsed.get("title")
|
|
body_val = parsed.get("body")
|
|
new_title = (
|
|
title_val.strip()
|
|
if isinstance(title_val, str) and title_val.strip()
|
|
else None
|
|
)
|
|
new_body = (
|
|
body_val if isinstance(body_val, str) and body_val.strip() else None
|
|
)
|
|
if new_body is None and new_title is None:
|
|
return SpecifyOutcome(
|
|
task_id, False, "LLM response missing title and body"
|
|
)
|
|
|
|
with kb.connect() as conn:
|
|
ok = kb.specify_triage_task(
|
|
conn,
|
|
task_id,
|
|
title=new_title,
|
|
body=new_body,
|
|
author=author or _profile_author(),
|
|
)
|
|
if not ok:
|
|
# Race: someone else promoted / archived the task between our
|
|
# read above and the write. Report, don't crash.
|
|
return SpecifyOutcome(
|
|
task_id, False, "task moved out of triage before promotion"
|
|
)
|
|
return SpecifyOutcome(task_id, True, "specified", new_title=new_title)
|
|
|
|
|
|
def list_triage_ids(*, tenant: Optional[str] = None) -> list[str]:
|
|
"""Return task ids currently in the triage column.
|
|
|
|
``tenant`` narrows the sweep; ``None`` returns every triage task.
|
|
"""
|
|
with kb.connect() as conn:
|
|
tasks = kb.list_tasks(
|
|
conn,
|
|
status="triage",
|
|
tenant=tenant,
|
|
include_archived=False,
|
|
)
|
|
return [t.id for t in tasks]
|