hermes-agent/tests/hermes_cli/test_kanban_decompose.py
Teknium 1345dda0cf
feat(kanban): orchestrator-driven auto-decomposition on triage (#27572)
* feat(kanban): orchestrator-driven auto-decomposition on triage

Closes the core gap in the kanban system: dropping a one-liner into Triage
now decomposes it into a graph of child tasks routed to specialist
profiles by description, matching teknium's original vision ("main
orchestrator splits/creates actual tasks, doles them out to each agent").

The build
---------
- hermes_cli/profiles.py: new `description` + `description_auto` fields
  on ProfileInfo, persisted in <profile_dir>/profile.yaml. Helpers
  read_profile_meta / write_profile_meta. `create_profile` accepts
  optional description.
- hermes_cli/profile_describer.py: new module — auto-generate a 1-2
  sentence description from a profile's skills + model + name via the
  auxiliary LLM (`auxiliary.profile_describer`).
- hermes_cli/main.py: new `hermes profile create --description ...`
  flag; new `hermes profile describe [name] [--text ... | --auto |
  --all --auto]` subcommand.
- hermes_cli/kanban_db.py: new `decompose_triage_task` atomic helper —
  creates N child tasks, links the root as a child of every leaf
  (root waits for the whole graph), flips root `triage -> todo` with
  orchestrator assignee, records an audit comment + `decomposed` event
  in a single write_txn.
- hermes_cli/kanban_decompose.py: new module — calls the auxiliary LLM
  (`auxiliary.kanban_decomposer`) with the profile roster + descriptions
  to produce a JSON task graph, then invokes the DB helper. Rewrites
  unknown assignees to the configured `kanban.default_assignee` (or
  the active default profile) so a task NEVER lands with assignee=None.
  Falls back to specify-style single-task promotion when the LLM
  returns `fanout: false`.
- hermes_cli/kanban.py: new `hermes kanban decompose [task_id | --all]`
  CLI verb.
- hermes_cli/config.py: new DEFAULT_CONFIG keys —
  kanban.orchestrator_profile, kanban.default_assignee,
  kanban.auto_decompose (default True), kanban.auto_decompose_per_tick
  (default 3), auxiliary.kanban_decomposer, auxiliary.profile_describer.
- gateway/run.py: kanban dispatcher watcher now runs auto-decompose
  before each `_tick_once`, capped by `auto_decompose_per_tick` so a
  bulk-load of triage tasks doesn't burst-spend the aux LLM.
- plugins/kanban/dashboard/plugin_api.py: new endpoints —
  GET /profiles (list roster + descriptions),
  PATCH /profiles/<name> (set description, user-authored),
  POST /profiles/<name>/describe-auto (LLM-generate),
  POST /tasks/<id>/decompose (run decomposer),
  GET/PUT /orchestration (orchestrator/default-assignee/auto-decompose
  pickers, with resolved fallbacks echoed back).
- plugins/kanban/dashboard/dist/index.js: new OrchestrationPanel
  collapsible — dropdowns for orchestrator profile and default
  assignee, auto-decompose toggle, per-profile description editor with
  Save and Auto-generate buttons. New ⚗ Decompose button next to
  Specify on triage-column task drawers.
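
The assignee rewrite described above for hermes_cli/kanban_decompose.py can
be sketched roughly as follows. This is an illustration only, not the
module's code: `resolve_assignee` and its parameters are hypothetical names,
and the precedence shown (the LLM's pick if it is in the roster, else the
configured `kanban.default_assignee`, else the active default profile) is
inferred from the bullets in this message.

```python
from __future__ import annotations


def resolve_assignee(
    proposed: str | None,
    roster: set[str],
    default_assignee: str | None,
    active_profile: str,
) -> str:
    """Hypothetical helper: always return a profile name, never None."""
    # Keep the LLM's pick only when it names a profile that exists.
    if proposed and proposed in roster:
        return proposed
    # Otherwise prefer the configured default assignee, if it is a real profile.
    if default_assignee and default_assignee in roster:
        return default_assignee
    # Last resort: the active default profile.
    return active_profile


roster = {"orchestrator", "researcher", "fallback"}
print(resolve_assignee("researcher", roster, "fallback", "orchestrator"))
print(resolve_assignee("made_up", roster, "fallback", "orchestrator"))
print(resolve_assignee("made_up", roster, None, "orchestrator"))
```

Because every branch returns a string, a child task built from this value
cannot end up with assignee=None, which is the guarantee the bullet calls out.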

Behavior
--------
- A task in Triage gets fanned out into a small DAG of child tasks.
  Children with no internal parents flip to `ready` immediately
  (parallel dispatch). Children with sibling parents wait. The root
  stays alive and waits on the children: when the whole graph
  finishes, it promotes to `ready` and the orchestrator profile wakes
  back up to judge completion (the "adds more tasks until done" part
  of the original vision).
- `kanban.orchestrator_profile` unset -> falls back to the default
  profile (whichever `hermes` launches with no -p flag).
- `kanban.default_assignee` unset -> same fallback. Tasks NEVER end
  up unassigned.
- `kanban.auto_decompose=true` (default) runs the decomposer
  automatically on dispatcher ticks; manual `hermes kanban decompose`
  is always available.
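
The readiness rule in the first bullet can be sketched as below. This is an
assumption-laden illustration rather than the DB helper itself:
`initial_statuses` and `root_is_ready` are hypothetical names, and the graph
is represented the way the decomposer tests do it, as one list of sibling
indexes (`parents`) per child.

```python
from __future__ import annotations


def initial_statuses(parents: list[list[int]]) -> list[str]:
    """Hypothetical: map each child's parent indexes to a starting status."""
    statuses = []
    for idx, deps in enumerate(parents):
        for dep in deps:
            # Reject self-references and indexes that point outside the graph.
            if dep == idx or not 0 <= dep < len(parents):
                raise ValueError(f"task {idx} has invalid parent index {dep}")
        # No sibling parents: dispatchable right away. Otherwise it waits.
        statuses.append("todo" if deps else "ready")
    return statuses


def root_is_ready(child_done: list[bool]) -> bool:
    """Hypothetical: the root promotes only once every child has finished."""
    return all(child_done)


graph = [[], [0], []]  # child 1 waits on child 0; children 0 and 2 run in parallel
print(initial_statuses(graph))  # ['ready', 'todo', 'ready']
print(root_is_ready([True, True, False]))  # False
```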

Tests
-----
- tests/hermes_cli/test_kanban_decompose_db.py — 7 tests for the
  atomic DB helper (status transitions, dep graph, audit trail,
  validation errors).
- tests/hermes_cli/test_kanban_decompose.py — 6 tests for the
  decomposer module (fanout, no-fanout fallback, unknown-assignee
  rewrite, malformed-JSON resilience, non-triage rejection,
  no-aux-client path).
- tests/hermes_cli/test_profile_describer.py — 10 tests for
  profile.yaml r/w + the LLM auto-describer (yaml corrupt tolerance,
  user-vs-auto description protection, --overwrite, fallback parsing).

E2E
---
- CLI end-to-end: created profiles with descriptions, dropped a triage
  task, mocked the aux LLM with a 3-task graph -> verified all three
  children were created with the right assignees, the dependency
  edges matched the LLM's graph, root flipped to todo gated by every
  child, audit comment + `decomposed` event recorded.
- Dashboard end-to-end: started the dashboard against an isolated
  HERMES_HOME, verified all four new endpoints via curl (profile
  listing, PATCH for description, PUT for orchestration settings,
  POST for decompose). Opened the UI in the browser, confirmed the
  OrchestrationPanel renders with all three pickers + the per-profile
  description editor, typed a description, clicked Save, verified
  ~/.hermes/profile.yaml was written. Clicked Decompose on the triage
  card and confirmed the inline error message surfaced as designed
  ("no auxiliary client configured").

* feat(kanban): surface decompose mode (Auto/Manual) as a one-click pill

The auto/manual toggle already existed as kanban.auto_decompose (default
true), but it was buried inside the collapsed Orchestration settings
panel — users couldn't tell at a glance which mode they were in. This
hoists it to a pill at the top of the kanban page so the state is always
visible and one click flips it.

UX
- New "⚗ Decompose: AUTO|MANUAL" pill in the kanban header. Emerald
  styling when Auto is on (the default), muted/gray when Manual.
- Pill is visible both in the collapsed AND expanded Orchestration
  settings views so context is preserved when the user opens the panel.
- Tooltip explains both states + what clicking does.
- Renamed the in-panel "Auto-decompose on triage / Enabled" checkbox
  to "Decompose mode / Auto (default) | Manual" for language parity
  with the pill.

Behavior preserved
- Default remains Auto (kanban.auto_decompose=true).
- Manual mode restores pre-PR behavior: triage tasks stay in triage
  until the user clicks ⚗ Decompose on each card (or runs
  `hermes kanban decompose <id>`).

Implementation
- plugins/kanban/dashboard/dist/index.js: load /orchestration on mount
  (not just on expand) so the collapsed pill reflects real state.
  Render mode pill in both collapsed and expanded headers. Reuses the
  existing PUT /api/plugins/kanban/orchestration endpoint — no new
  backend, no new tests required.

E2E verified
- Pill renders as "⚗ Decompose: AUTO" on page load (default).
- One click flips to "⚗ Decompose: MANUAL" with muted styling.
- config.yaml on disk shows auto_decompose: false after the flip.
- Second click round-trips back to Auto; config.yaml flips to true.

* feat(kanban): rename mode pill to "Orchestration: Auto/Manual"

Per Teknium feedback — "Decompose" was too implementation-specific.
"Orchestration" is the user-facing concept (the whole pitch is the
orchestrator profile routing work), and the pill is the front door to it.

- Pill text: "Orchestration: Auto" / "Orchestration: Manual" (title case,
  no ⚗ prefix, no SHOUTY-CAPS for the mode value)
- In-panel checkbox label: "Orchestration mode" (was "Decompose mode")
- Tooltips updated to match
- No behavior change

* docs(kanban): document decompose, profile descriptions, orchestration mode

Brings the docs site up to parity with the PR. English build verified
locally (npx docusaurus build --locale en) — clean, no new broken links
or anchors. Pre-existing broken-link warnings (rl-training, llms.txt,
step-by-step-checklist, fallback-model) untouched.

- website/docs/reference/cli-commands.md
    + `hermes kanban decompose` action row in the action table, with
      pointer to the Auto vs Manual orchestration section.

- website/docs/reference/profile-commands.md
    + `--description "<text>"` flag on `hermes profile create`.
    + Full `hermes profile describe` section: read, --text, --auto,
      --overwrite, --all flags with examples.

- website/docs/user-guide/features/kanban.md (the big one)
    + Triage column intro rewritten around the Auto-decompose default
      behavior, with pointer to the new Auto vs Manual section.
    + Status action row updated to mention both ⚗ Decompose and
      Specify on triage cards.
    + New "Auto vs Manual orchestration" section explaining the two
      modes, how to flip them (pill, config), how routing-by-description
      works, the no-None-assignee guarantee, plus a config knob table
      (auto_decompose, auto_decompose_per_tick, orchestrator_profile,
      default_assignee) and the two new auxiliary slots
      (kanban_decomposer, profile_describer).
    + REST surface table gains 6 new endpoint rows: /tasks/:id/decompose,
      /profiles (GET), /profiles/:name (PATCH), /profiles/:name/describe-auto,
      /orchestration (GET + PUT).

- website/docs/user-guide/features/kanban-tutorial.md
    + Triage column blurb updated for Auto by default + Manual via the
      pill, with cross-link to the Auto vs Manual orchestration section.

- website/docs/user-guide/profiles.md
    + Blank-profile flow now mentions --description and points to the
      kanban routing model for context.

- website/docs/user-guide/configuration.md
    + `kanban_decomposer` and `profile_describer` added to the
      `hermes model -> Configure auxiliary models` menu listing.
2026-05-17 13:54:12 -07:00

242 lines
7.4 KiB
Python

"""Tests for the decomposer module + `hermes kanban decompose` CLI surface.
The auxiliary LLM client is mocked — no network calls. Tests exercise the
prompt plumbing, response parsing, DB writes (via the real DB helper),
and the assignee-fallback logic.
"""

from __future__ import annotations

import argparse
import json as jsonlib
from pathlib import Path
from unittest.mock import MagicMock, patch

import pytest

from hermes_cli import kanban as kanban_cli
from hermes_cli import kanban_db as kb
from hermes_cli import kanban_decompose as decomp


@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    kb.init_db()
    return home


def _fake_aux_response(content: str):
    resp = MagicMock()
    resp.choices = [MagicMock()]
    resp.choices[0].message.content = content
    return resp


def _mock_client_returning(content: str):
    client = MagicMock()
    client.chat.completions.create = MagicMock(return_value=_fake_aux_response(content))
    return client


def _patch_aux_client(content: str, *, model: str = "test-model"):
    client = _mock_client_returning(content)
    return patch(
        "agent.auxiliary_client.get_text_auxiliary_client",
        return_value=(client, model),
    )


def _patch_extra_body():
    return patch(
        "agent.auxiliary_client.get_auxiliary_extra_body",
        return_value={},
    )


def _patch_list_profiles(names: list[str]):
    """Pretend the named profiles exist. The decomposer uses
    profiles_mod.list_profiles() to build the roster + valid-set, and
    profiles_mod.profile_exists() to resolve orchestrator/default."""
    from types import SimpleNamespace

    fake_profiles = [
        SimpleNamespace(
            name=n, is_default=(i == 0), description=f"desc for {n}",
            description_auto=False, model="m", provider="p", skill_count=1,
        )
        for i, n in enumerate(names)
    ]
    return [
        patch("hermes_cli.profiles.list_profiles", return_value=fake_profiles),
        patch("hermes_cli.profiles.profile_exists", side_effect=lambda x: x in names),
        patch("hermes_cli.profiles.get_active_profile_name", return_value=names[0] if names else "default"),
    ]


def test_decompose_with_fanout_creates_children(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="ship a feature", triage=True)

    llm_payload = jsonlib.dumps({
        "fanout": True,
        "rationale": "test split",
        "tasks": [
            {"title": "research", "body": "look it up", "assignee": "researcher", "parents": []},
            {"title": "build", "body": "code it", "assignee": "engineer", "parents": [0]},
        ],
    })

    patches = _patch_list_profiles(["orchestrator", "researcher", "engineer"])
    for p in patches:
        p.start()
    try:
        with _patch_aux_client(llm_payload), _patch_extra_body():
            outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok, outcome.reason
    assert outcome.fanout is True
    assert outcome.child_ids and len(outcome.child_ids) == 2

    with kb.connect() as conn:
        root = kb.get_task(conn, tid)
        c0 = kb.get_task(conn, outcome.child_ids[0])
        c1 = kb.get_task(conn, outcome.child_ids[1])
    assert root.status == "todo"
    assert c0.status == "ready"
    assert c1.status == "todo"
    assert c0.assignee == "researcher"
    assert c1.assignee == "engineer"


def test_decompose_fanout_false_falls_back_to_specify(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="just one thing", triage=True)

    llm_payload = jsonlib.dumps({
        "fanout": False,
        "rationale": "single unit",
        "title": "Tightened title",
        "body": "**Goal**\nDo the thing.",
    })

    patches = _patch_list_profiles(["orchestrator"])
    for p in patches:
        p.start()
    try:
        with _patch_aux_client(llm_payload), _patch_extra_body():
            outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok, outcome.reason
    assert outcome.fanout is False
    assert outcome.new_title == "Tightened title"

    with kb.connect() as conn:
        task = kb.get_task(conn, tid)
    # specify path with no parents -> recompute_ready flips to 'ready'
    assert task.status == "ready"
    assert task.title == "Tightened title"


def test_decompose_unknown_assignee_falls_back_to_default(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="x", triage=True)

    # Roster only has 'orchestrator' and 'fallback'; LLM picks 'made_up'.
    llm_payload = jsonlib.dumps({
        "fanout": True,
        "rationale": "test",
        "tasks": [
            {"title": "do X", "body": "", "assignee": "made_up", "parents": []},
        ],
    })

    patches = _patch_list_profiles(["orchestrator", "fallback"])
    for p in patches:
        p.start()
    try:
        with patch.dict(
            "os.environ", {}, clear=False,
        ), _patch_aux_client(llm_payload), _patch_extra_body(), \
             patch(
                 "hermes_cli.kanban_decompose._load_config",
                 return_value={
                     "kanban": {
                         "orchestrator_profile": "orchestrator",
                         "default_assignee": "fallback",
                     }
                 },
             ):
            outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok, outcome.reason
    assert outcome.child_ids and len(outcome.child_ids) == 1

    with kb.connect() as conn:
        child = kb.get_task(conn, outcome.child_ids[0])
    # 'made_up' wasn't in roster, so assignee rewritten to 'fallback'
    assert child.assignee == "fallback"


def test_decompose_handles_malformed_llm_json(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="x", triage=True)

    patches = _patch_list_profiles(["orchestrator"])
    for p in patches:
        p.start()
    try:
        with _patch_aux_client("not json at all, sorry"), _patch_extra_body():
            outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok is False
    assert "malformed JSON" in outcome.reason


def test_decompose_returns_false_when_task_not_triage(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="x")  # ready, not triage

    patches = _patch_list_profiles(["orchestrator"])
    for p in patches:
        p.start()
    try:
        outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok is False
    assert "not in triage" in outcome.reason


def test_decompose_no_aux_client_configured(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="x", triage=True)

    patches = _patch_list_profiles(["orchestrator"])
    for p in patches:
        p.start()
    try:
        with patch(
            "agent.auxiliary_client.get_text_auxiliary_client",
            return_value=(None, ""),
        ):
            outcome = decomp.decompose_task(tid, author="me")
    finally:
        for p in patches:
            p.stop()

    assert outcome.ok is False
    assert "no auxiliary client" in outcome.reason