hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-29 06:31:32 +00:00

Author	SHA1	Message	Date
Teknium	53cf82a1ea	fix(kanban): remove orphan conflict markers from kanban.py (#28459 ) PR #28454 (salvage of #26745, workflow filter) merged with leftover git conflict markers in hermes_cli/kanban.py at three sites: - _task_to_dict() (session_id alongside workflow_template_id/current_step_key) - p_list parser (--sort alongside --workflow-template-id/--step-key) - _cmd_list (order_by alongside the new filter kwargs) Cleans up the markers and keeps both halves at each site. Resolves a self-introduced regression.	2026-05-18 21:29:31 -07:00
nehaaprasaad	341912c224	feat(kanban): filter tasks by workflow fields and runs by status/outcome Salvages #26745 by @nehaaprasaad. Exposes filtering for the existing workflow_template_id and current_step_key columns: - list_tasks() accepts workflow_template_id and current_step_key kwargs - 'hermes kanban list' adds matching CLI flags - dashboard plugin_api also exposes the filters Resolved a small conflict in list_tasks signature alongside main's session_id and order_by additions; combined all three into the single filter list.	2026-05-18 21:22:32 -07:00
thewillhuang	e286e68756	feat(kanban): stale detection for running tasks in dispatcher Salvages #23790 by @thewillhuang. Adds detect_stale_running() to the dispatcher cycle. Running tasks that have been started for longer than dispatch_stale_timeout_seconds (default 14400 = 4h) without a heartbeat in the last hour are auto-reclaimed to ready. - New config kanban.dispatch_stale_timeout_seconds (default 14400, 0 disables) - New 'stale' field on DispatchResult - detect_stale_running() in kanban_db.py with heartbeat freshness check - Records outcome='stale' on run close + 'stale' event; ticks failure counter - Wires config through gateway embedded dispatcher - Updates _cmd_dispatch verbose/JSON output and daemon logging Resolved test-file end-of-file conflict by appending both halves.	2026-05-18 21:20:56 -07:00
awizemann	31fe229039	feat(kanban): stamp originating ACP session_id on tasks Salvages #23208 by @awizemann. Tracks which chat session created a kanban task so clients can render a per-session board without falling back to tenant + time-window heuristics. - Schema: tasks gains nullable session_id TEXT column with index (additive migration in _migrate_add_optional_columns). - ACP: server.py exposes the originating session id via HERMES_SESSION_ID with save/restore around the agent loop. - Tool: kanban_create reads HERMES_SESSION_ID (with explicit override). - CLI: 'hermes kanban list --session <id>' filter; JSON output exposes session_id.	2026-05-18 21:15:21 -07:00
Niraven	3ee7a5546d	feat(cli): add kanban swarm topology helper Salvages #26791 by @Niraven. Adds 'hermes kanban swarm' to create a durable Kanban Swarm v1 graph: a completed root/blackboard card, parallel worker cards, a verifier gated on all workers, and a synthesizer gated on the verifier. Stores shared swarm blackboard updates as structured JSON comments on the root card. Self-contained: new hermes_cli/kanban_swarm.py module + CLI wiring + unit tests.	2026-05-18 21:10:12 -07:00
loicnico96	79f6654d16	feat(kanban): surface per-task model_override in show + tool output Salvages #26897 by @loicnico96. The per-task model_override DB column already exists on main, but it wasn't exposed in user-facing surfaces. This adds: - 'kanban show' prints 'model: <name>' when model_override is set - kanban_show / kanban_list tool responses include the model_override field Original branch was stale (PR was authored against an older field name 'model'); applied the substantive surface exposure manually using the current 'model_override' field name.	2026-05-18 21:09:02 -07:00
LizerAIDev	a846e500b0	feat(kanban): add --sort option to 'hermes kanban list' Salvages #25745 by @LizerAIDev. Adds --sort {created,created-desc, priority,priority-desc,status,assignee,title,updated} to 'hermes kanban list'. Validated against VALID_SORT_ORDERS map; invalid values raise ValueError. Default behaviour (priority DESC, created ASC) is unchanged when --sort is omitted.	2026-05-18 20:58:43 -07:00
Bartok9	365da2d2df	fix: 4 small surgical bugs Salvages #23302 by @Bartok9. Four independent one-area fixes: 1. kanban boards delete alias now hard-deletes (not archives) — the alias didn't carry --delete, so getattr(args, 'delete', False) returned False. Detect boards_action=='delete' explicitly. 2. Gateway auto-title failures no longer leak as user-visible warnings — debug-log only since they're not actionable. 3. Background process completion notification snaps truncation to the next newline boundary, prepends a marker when content is dropped. 4. _cprint() schedules the run_in_terminal coroutine via asyncio.ensure_future so output isn't silently dropped from background threads (fixes #23185 Bug A). Skips the double-print fallback that would fire for mock paths.	2026-05-18 20:54:52 -07:00
shunsuke-hikiyama	fb96208892	feat(kanban): add initial-status for human-ops cards Salvages #27526 by @shunsuke-hikiyama. Adds an --initial-status flag (running\|blocked, default running) to 'kanban create', threaded through kanban_db.create_task() and the kanban_create tool schema. 'blocked' parks the task directly in the blocked column for R3 human-ops review, skipping the brief running-to-blocked transition. Dropped the unrelated 'add' alias, WIFEXITED Windows compat, and slash-handler error formatting changes that were bundled in the original PR — those should ship as their own focused changes if still wanted.	2026-05-18 20:44:02 -07:00
ACR27	a5c2836b07	feat(kanban): allow trimmed task comments SS-1647 live SHIP validation: real code + tests for kanban comment --max-len.	2026-05-18 20:25:29 -07:00
zccyman	fe5e0bf5a3	feat(kanban): add board-level default workdir (#25430 )	2026-05-18 20:24:04 -07:00
QuenVix	afae2dd9ec	fix(kanban): keep board-management commands independent from board override	2026-05-18 20:14:27 -07:00
ms-alan	97ac94fe56	fix(kanban): seed bundled skills (e.g. kanban-worker) on kanban init Closes #23725	2026-05-18 20:13:52 -07:00
Beandon13	bde6313e34	feat(kanban): archive --rm to hard-delete archived tasks Salvages #19964 by @Beandon13. Adds `hermes kanban archive --rm` to permanently remove already-archived tasks with cascading cleanup of links, comments, events, runs, and notify-subs. Safety guard: only archived tasks can be deleted; active/blocked/done must be archived first. Cherry-picked from #19964 onto current main (severe stale base, applied manually to preserve substance only).	2026-05-18 20:09:26 -07:00
Teknium	dadc8aa255	fix(kanban): surface unusable triage auxiliary model (auto-decompose aware) (#27871 ) Adds a 'triage_aux_unavailable' diagnostic for tasks stuck in triage when neither the active aux helper slot nor the main-model auto fallback is usable. Auto-decompose aware: - kanban.auto_decompose=True (default): primary is auxiliary.kanban_decomposer, triage_specifier is the fanout=false fallback. - kanban.auto_decompose=False: primary is auxiliary.triage_specifier (manual 'hermes kanban specify' path). Default aux slots use 'provider: auto' which falls back to the main model, so this rule only fires when both the explicit slot config AND the main-model auto fallback are absent. Quiet by default; informative when there is a real config gap. Also adds kd.config_from_runtime_config() that carries kanban + auxiliary + model keys through to diagnostics, and updates CLI/dashboard call sites to use it. config_from_kanban_config() is preserved for back-compat. Reworks the original PR #25640 idea (@qWaitCrypto) to align with the new auto-decompose dispatcher path landed in #27572. The original PR pointed only at auxiliary.triage_specifier, which is now the fallback rather than the primary helper. Co-authored-by: qWaitCrypto <axmaiqiu@gmail.com>	2026-05-18 01:27:06 -07:00
qWaitCrypto	d9fef0c8ab	fix(kanban): align failure diagnostics with retry limit	2026-05-18 01:22:16 -07:00
Teknium	1345dda0cf	feat(kanban): orchestrator-driven auto-decomposition on triage (#27572 ) * feat(kanban): orchestrator-driven auto-decomposition on triage Closes the core gap in the kanban system: dropping a one-liner into Triage now decomposes it into a graph of child tasks routed to specialist profiles by description, matching teknium's original vision ("main orchestrator splits/creates actual tasks, doles them out to each agent"). The build --------- - hermes_cli/profiles.py: new `description` + `description_auto` fields on ProfileInfo, persisted in <profile_dir>/profile.yaml. Helpers read_profile_meta / write_profile_meta. `create_profile` accepts optional description. - hermes_cli/profile_describer.py: new module — auto-generate a 1-2 sentence description from a profile's skills + model + name via the auxiliary LLM (`auxiliary.profile_describer`). - hermes_cli/main.py: new `hermes profile create --description ...` flag; new `hermes profile describe [name] [--text ... \| --auto \| --all --auto]` subcommand. - hermes_cli/kanban_db.py: new `decompose_triage_task` atomic helper — creates N child tasks, links the root as a child of every leaf (root waits for the whole graph), flips root `triage -> todo` with orchestrator assignee, records an audit comment + `decomposed` event in a single write_txn. - hermes_cli/kanban_decompose.py: new module — calls the auxiliary LLM (`auxiliary.kanban_decomposer`) with the profile roster + descriptions to produce a JSON task graph, then invokes the DB helper. Rewrites unknown assignees to the configured `kanban.default_assignee` (or the active default profile) so a task NEVER lands with assignee=None. Falls back to specify-style single-task promotion when the LLM returns `fanout: false`. - hermes_cli/kanban.py: new `hermes kanban decompose [task_id \| --all]` CLI verb. - hermes_cli/config.py: new DEFAULT_CONFIG keys — kanban.orchestrator_profile, kanban.default_assignee, kanban.auto_decompose (default True), kanban.auto_decompose_per_tick (default 3), auxiliary.kanban_decomposer, auxiliary.profile_describer. - gateway/run.py: kanban dispatcher watcher now runs auto-decompose before each `_tick_once`, capped by `auto_decompose_per_tick` so a bulk-load of triage tasks doesn't burst-spend the aux LLM. - plugins/kanban/dashboard/plugin_api.py: new endpoints — GET /profiles (list roster + descriptions), PATCH /profiles/<name> (set description, user-authored), POST /profiles/<name>/describe-auto (LLM-generate), POST /tasks/<id>/decompose (run decomposer), GET/PUT /orchestration (orchestrator/default-assignee/auto-decompose pickers, with resolved fallbacks echoed back). - plugins/kanban/dashboard/dist/index.js: new OrchestrationPanel collapsible — dropdowns for orchestrator profile and default assignee, auto-decompose toggle, per-profile description editor with Save and Auto-generate buttons. New ⚗ Decompose button next to ✨ Specify on triage-column task drawers. Behavior -------- - A task in Triage gets fanned out into a small DAG of child tasks. Children with no internal parents flip to `ready` immediately (parallel dispatch). Children with sibling parents wait. The root stays alive as a parent of every child — when the whole graph finishes, it promotes to `ready` and the orchestrator profile wakes back up to judge completion (the "adds more tasks until done" part of the original vision). - `kanban.orchestrator_profile` unset -> falls back to the default profile (whichever `hermes` launches with no -p flag). - `kanban.default_assignee` unset -> same fallback. Tasks NEVER end up unassigned. - `kanban.auto_decompose=true` (default) runs the decomposer automatically on dispatcher ticks; manual `hermes kanban decompose` is always available. Tests ----- - tests/hermes_cli/test_kanban_decompose_db.py — 7 tests for the atomic DB helper (status transitions, dep graph, audit trail, validation errors). - tests/hermes_cli/test_kanban_decompose.py — 6 tests for the decomposer module (fanout, no-fanout fallback, unknown-assignee rewrite, malformed-JSON resilience, no-aux-client path). - tests/hermes_cli/test_profile_describer.py — 10 tests for profile.yaml r/w + the LLM auto-describer (yaml corrupt tolerance, user-vs-auto description protection, --overwrite, fallback parsing). E2E --- - CLI end-to-end: created profiles with descriptions, dropped a triage task, mocked the aux LLM with a 3-task graph -> verified all three children were created with the right assignees, the dependency edges matched the LLM's graph, root flipped to todo gated by every child, audit comment + `decomposed` event recorded. - Dashboard end-to-end: started the dashboard against an isolated HERMES_HOME, verified all four new endpoints via curl (profile listing, PATCH for description, PUT for orchestration settings, POST for decompose). Opened the UI in the browser, confirmed the OrchestrationPanel renders with all three pickers + the per-profile description editor, typed a description, clicked Save, verified ~/.hermes/profile.yaml was written. Clicked Decompose on the triage card and confirmed the inline error message surfaced as designed ("no auxiliary client configured"). * feat(kanban): surface decompose mode (Auto/Manual) as a one-click pill The auto/manual toggle already existed as kanban.auto_decompose (default true), but it was buried inside the collapsed Orchestration settings panel — users couldn't tell at a glance which mode they were in. This hoists it to a pill at the top of the kanban page so the state is always visible and one click flips it. UX - New "⚗ Decompose: AUTO\|MANUAL" pill in the kanban header. Emerald styling when Auto is on (the default), muted/gray when Manual. - Pill is visible both in the collapsed AND expanded Orchestration settings views so context is preserved when the user opens the panel. - Tooltip explains both states + what clicking does. - Renamed the in-panel "Auto-decompose on triage / Enabled" checkbox to "Decompose mode / Auto (default) \| Manual" for language parity with the pill. Behavior preserved - Default remains Auto (kanban.auto_decompose=true). - Manual mode restores pre-PR behavior: triage tasks stay in triage until the user clicks ⚗ Decompose on each card (or runs `hermes kanban decompose <id>`). Implementation - plugins/kanban/dashboard/dist/index.js: load /orchestration on mount (not just on expand) so the collapsed pill reflects real state. Render mode pill in both collapsed and expanded headers. Reuses the existing PUT /api/plugins/kanban/orchestration endpoint — no new backend, no new tests required. E2E verified - Pill renders as "⚗ Decompose: AUTO" on page load (default). - One click flips to "⚗ Decompose: MANUAL" with muted styling. - config.yaml on disk shows auto_decompose: false after the flip. - Second click round-trips back to Auto; config.yaml flips to true. * feat(kanban): rename mode pill to "Orchestration: Auto/Manual" Per Teknium feedback — "Decompose" was too implementation-specific. "Orchestration" is the user-facing concept (the whole pitch is the orchestrator profile routing work), and the pill is the front door to it. - Pill text: "Orchestration: Auto" / "Orchestration: Manual" (title case, no ⚗ prefix, no SHOUTY-CAPS for the mode value) - In-panel checkbox label: "Orchestration mode" (was "Decompose mode") - Tooltips updated to match - No behavior change * docs(kanban): document decompose, profile descriptions, orchestration mode Brings the docs site up to parity with the PR. English build verified locally (npx docusaurus build --locale en) — clean, no new broken links or anchors. Pre-existing broken-link warnings (rl-training, llms.txt, step-by-step-checklist, fallback-model) untouched. - website/docs/reference/cli-commands.md + `hermes kanban decompose` action row in the action table, with pointer to the Auto vs Manual orchestration section. - website/docs/reference/profile-commands.md + `--description "<text>"` flag on `hermes profile create`. + Full `hermes profile describe` section: read, --text, --auto, --overwrite, --all flags with examples. - website/docs/user-guide/features/kanban.md (the big one) + Triage column intro rewritten around the Auto-decompose default behavior, with pointer to the new Auto vs Manual section. + Status action row updated to mention both ⚗ Decompose and ✨ Specify on triage cards. + New "Auto vs Manual orchestration" section explaining the two modes, how to flip them (pill, config), how routing-by-description works, the no-None-assignee guarantee, plus a config knob table (auto_decompose, auto_decompose_per_tick, orchestrator_profile, default_assignee) and the two new auxiliary slots (kanban_decomposer, profile_describer). + REST surface table gains 6 new endpoint rows: /tasks/:id/decompose, /profiles (GET), /profiles/:name (PATCH), /profiles/:name/describe-auto, /orchestration (GET + PUT). - website/docs/user-guide/features/kanban-tutorial.md + Triage column blurb updated for Auto by default + Manual via the pill, with cross-link to the Auto vs Manual orchestration section. - website/docs/user-guide/profiles.md + Blank-profile flow now mentions --description and points to the kanban routing model for context. - website/docs/user-guide/configuration.md + `kanban_decomposer` and `profile_describer` added to the `hermes model -> Configure auxiliary models` menu listing.	2026-05-17 13:54:12 -07:00
shellybotmoyer	1eadb069c7	fix(kanban): --severity filter uses >= comparison per documented behavior (#26379 )	2026-05-16 22:54:22 -07:00
kshitij	2ec8d2b42f	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 ) Replace with for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero).	2026-05-11 11:13:25 -07:00
kshitij	657874460f	chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926 ) PLR5501 (collapsible-else-if): 28 instances — else: if: → elif: PLR1730 (if-stmt-min-max): 15 instances — if x<y: x=y → x=max(x,y) C420 (dict.fromkeys): 2 instances — dictcomp → dict.fromkeys PLR1704 (redefined-argument): 1 instance — reason → err_msg (shadow fix) C414 (unnecessary-list): 1 instance — sorted(list(x)) → sorted(x) 28 files, -44 net lines. All mechanical, zero logic changes. 17,211 tests pass, zero regressions.	2026-05-11 11:03:29 -07:00
Sylw3ster	641e40c4bd	fix(kanban): restore HERMES_KANBAN_BOARD after scoped slash override	2026-05-11 06:44:58 -07:00
teknium1	68d081f570	fix(kanban): keep '--created-by' default as 'user' Some checks are pending Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Docker Build and Publish / move-latest (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (push) Waiting to run Details Tests / e2e (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details Out-of-scope behavior change in #23521 — the kanban notifier-routing fix also flipped the 'kanban create --created-by' default from 'user' to the active profile name. Revert to keep PR scope focused on the notifier ownership fix; the profile-aware author default can be its own change.	2026-05-10 20:04:53 -07:00
Mike Nguyen	ba5640fa11	fix(gateway): route kanban notifications to creator profile	2026-05-10 20:04:53 -07:00
tymrtn	d1fc748def	fix(kanban): /kanban slash command emits argparse garbage instead of help Closes #21794. `/kanban`, `/kanban help`, `/kanban --help`, and `/kanban <sub> -h` all returned broken output to the gateway and interactive CLI. Three underlying bugs in `hermes_cli.kanban.run_slash`: 1. argparse writes help to stdout but `run_slash` only captured stderr at parse time, so `-h` text was silently swallowed and replaced with the `(usage error: 0)` sentinel. 2. The wrapping parser used `prog="/"` and routed via a synthetic "_top → kanban" subparser, producing `usage: / kanban …` (stray space) and `usage: /kanban kanban …` (doubled token) in error text. 3. Bare `/kanban` and `/kanban help` dumped argparse's full ~3KB usage tree, which reads as visual garbage in a chat bubble. Fix: drive the kanban_parser directly (no double-wrap), rewrite prog strings on every leaf subparser, capture stdout AND stderr around parse_args, distinguish SystemExit(0) (help — return captured stdout) from SystemExit(2) (error — return single-line ⚠-prefixed message), and add an explicit chat-friendly short-help block returned for bare invocation and the help aliases (`help`, `--help`, `-h`, `?`). Added 5 regression tests covering bare invocation, every help alias, subcommand help, unknown action, and missing required arg. Affects every chat platform via gateway/run.py::_handle_kanban_command and the interactive CLI via cli.py::_handle_kanban_command. Co-Authored-By: Nagatha (Claude Opus 4.7) <noreply@anthropic.com>	2026-05-09 22:49:29 -07:00
Teknium	24d48ffb82	feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks (#21435 ) * feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks The Triage column shipped with a placeholder 'a specifier will flesh out the spec', but the specifier itself was never built. This wires it up as a dedicated CLI verb. `hermes kanban specify <id>` calls the auxiliary LLM (configured under `auxiliary.triage_specifier`) to expand a rough one-liner into a concrete spec — tightened title plus a body with Goal / Approach / Acceptance criteria / Out-of-scope sections — then atomically flips `status: triage -> todo` and recomputes ready so parent-free tasks go straight to the dispatcher on the same tick. Surface: hermes kanban specify <task_id> # single task hermes kanban specify --all [--tenant T] # sweep triage column hermes kanban specify ... --author NAME # audit-comment author hermes kanban specify ... --json # one JSON line per task Design choices: - Parent gating is preserved. specify_triage_task flips to 'todo', then recompute_ready promotes to 'ready' only when parents are done — same rule as a normal parent-gated todo. - No daemon, no background watcher. Every invocation is explicit — keeps cost predictable and doesn't fight the dispatcher loop. - Response parse is lenient: strict JSON preferred, markdown-fence tolerated, raw-body fallback on malformed JSON so the LLM can't strand a task in triage. - All failure modes (no aux client, API error, task moved out of triage mid-call) return SpecifyOutcome(ok=False, reason=...) so --all continues past individual failures. Changes: hermes_cli/kanban_db.py + specify_triage_task() hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call) hermes_cli/kanban.py + specify subcommand + _cmd_specify hermes_cli/config.py + auxiliary.triage_specifier task slot website/docs/user-guide/features/kanban.md specify + config notes website/docs/reference/cli-commands.md CLI reference entry tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests) tests/hermes_cli/test_kanban_specify.py NEW (20 tests) Validation: 30/30 targeted tests pass. E2E: triage task -> specify -> ends in 'ready' with events [created, specified, promoted] and the audit comment recorded under the configured author. * feat(kanban): wire specifier into dashboard and gateway slash Follow-ups to the initial PR #21435 — closes the two gaps I'd left as post-merge: dashboard button and first-class gateway surface. Dashboard (plugins/kanban/dashboard/) - POST /tasks/:id/specify NEW endpoint. Thin wrapper around kanban_specify.specify_task(). Returns the CLI outcome shape ({ok, task_id, reason, new_title}); ok=false with a human reason is a 200, not a 4xx, so the UI can render it inline without treating 'no aux client configured' as a crash. - Runs sync in FastAPI's threadpool because the LLM call can take tens of seconds on reasoning models. - Pins HERMES_KANBAN_BOARD around the specify call so the module's argless kb.connect() lands on the right board. - dist/index.js: doSpecify callback threaded through the drawer → TaskDetail → StatusActions prop chain. ✨ Specify button appears ONLY when task.status === 'triage' (elsewhere the backend would reject anyway — hide the button to keep the action row clean). Busy state (Specifying…) + inline success/error banner under the button using the response.reason text. - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using existing --color vars so themes reskin cleanly. Gateway slash (/kanban specify) - Already works via the existing run_slash → build_parser → kanban_command pipeline. No code change needed — slash commands inherit the argparse tree automatically. Added coverage: test_run_slash_specify_end_to_end (create --triage, specify, verify promotion + retitle) and test_run_slash_specify_help_is_reachable. Tests - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the REST endpoint — happy path, non-triage rejection as ok=false 200, missing aux client as ok=false 200. - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests. Docs - website/docs/user-guide/features/kanban.md: dashboard action row description mentions ✨ Specify + all three surfaces. REST table gains /tasks/:id/specify. Slash examples include /kanban specify. Validation: 340/340 targeted tests pass. E2E via TestClient: create a triage task over REST → POST /specify with mocked aux client → task moves to 'ready' column on /board with new title and body applied.	2026-05-07 13:04:41 -07:00
Teknium	ac51c4c1ad	feat(kanban): per-task max_retries override (#20263 follow-up, supersedes #20972 ) (#21330 ) Adds a per-task override for the consecutive-failure circuit breaker, so individual tasks can opt out of the global ``kanban.failure_limit`` without dragging everyone else with them. Resolution order (now three tiers): 1. per-task ``max_retries`` (new, this commit) 2. caller-supplied ``failure_limit`` — the gateway threads ``kanban.failure_limit`` from config here 3. ``DEFAULT_FAILURE_LIMIT`` (2) Changes: - ``tasks.max_retries INTEGER`` column + migration for existing DBs (NULL = no override, matches pre-column behavior). - ``Task.max_retries`` field + ``from_row`` plumbing. - ``create_task(..., max_retries=N)`` kwarg. - ``_record_task_failure`` reads the per-task value first and records ``limit_source`` + ``effective_limit`` on the ``gave_up`` event so operators can see which tier won. - CLI: ``hermes kanban create --max-retries N`` (rejects ``< 1``). - CLI: ``hermes kanban show`` surfaces the effective threshold + source (``(task)``, ``(config kanban.failure_limit)``, ``(default)``). - CLI: ``_task_to_dict`` includes ``max_retries`` in ``--json`` output. Key design choice vs. the earlier #20972 attempt: - No new config key. The existing ``kanban.failure_limit`` (landed in #21183) is the dispatcher-tier source — no silent break for users who already tuned it. - No ``!=`` sentinel for "is config set" (which would misfire when config equals the default). The tier-winner is determined purely by "is per-task override set" — the dispatcher always wins when per-task is NULL, regardless of whether the caller passed the default or a configured value. E2E verified across four scenarios: default-only (trips at 2), config-only (trips at caller's value), per-task-only beats default (trips at task value), per-task beats larger config (trips at task value). ``gave_up`` event metadata correctly records ``limit_source`` and ``effective_limit`` in all cases. Tests: - ``test_per_task_max_retries_overrides_dispatcher_limit`` — task=1 beats caller=10. - ``test_per_task_max_retries_allows_more_than_default`` — task=5 does not trip at caller=default of 2. - ``test_max_retries_none_falls_through_to_dispatcher_limit`` — None honors caller's config value (4), records ``limit_source=dispatcher``. Full kanban trio (db + core + cli + tools + dashboard-plugin): 342 passed, no regressions. Supersedes: #20972 (@jelrod27) — credit in PR close comment. Ref: #20263 (tangentially — the reporter asked about adapter API drift, not retry caps, but the CLI discussion there is what surfaced the original ask).	2026-05-07 07:29:02 -07:00
mwnickerson	411cfa26e3	fix: auto-block repeated kanban retries	2026-05-07 05:05:20 -07:00
Brecht-H	3f97297413	feat(kanban): surface task_runs.summary on dashboard cards + ``kanban show`` The kanban-worker skill (built into the gateway dispatcher's spawn prompt) instructs every worker to hand off via ``kanban_complete(summary=..., metadata=...)``. That writes the summary onto the closing ``task_runs`` row, NOT onto ``tasks.result`` — the latter is left NULL unless the caller passes ``result=`` explicitly. Result: a glance at the dashboard or ``hermes kanban show <id>`` shows a blank "Result:" section even when the worker did real work, which on 2026-05-05 caused a Mac false-alarm ("Hermes did nothing") on a task that had a 10-line completion summary on its run. This patch surfaces the latest non-null run summary as ``latest_summary`` so the worker's actual handoff lands in front of operators. * New helpers ``kanban_db.latest_summary(conn, task_id)`` and ``kanban_db.latest_summaries(conn, task_ids)``. The batch variant uses a single window-function SELECT so the dashboard board endpoint doesn't pay an N+1 cost on multi-hundred-task boards. * CLI ``hermes kanban show <id>`` prints a "Latest summary:" block when ``tasks.result`` is empty but a run has produced a summary (the existing "Result:" section still wins when populated, so the back-compat path for hand-edited results is untouched). JSON output gains a top-level ``latest_summary`` field. * Dashboard ``/board`` and ``/tasks/{id}`` now include a ``latest_summary`` field on every task. Cards on /board carry a 200-character preview (cheap to render, plenty for "what did this worker do?" at a glance); the drawer/detail endpoint returns the full summary. * Five new tests cover: empty-runs case, post-complete surface, newest-of-multiple selection, empty-string skip, batch with missing tasks + empty input. Smoke-tested locally against the live profile DB on the three acceptance-criterion targets (t_f08fef91 cron-hygiene-audit, t_007b7f1c EMA-analysis, t_05746fa4 self-assessment) — all three now return their populated summaries via both ``latest_summary`` and ``latest_summaries``. Test plan: 255/255 kanban tests pass + 91/91 dashboard plugin tests pass. No regression on tasks where ``tasks.result`` is explicitly populated (the existing "Result:" branch is preserved). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:26:15 -07:00
misery-hl	56b4795115	guard kanban worker lifecycle by run id	2026-05-05 15:09:28 -07:00
Teknium	f67063ba81	feat(kanban): generic diagnostics engine for task distress signals (#20332 ) * feat(kanban): generic diagnostics engine for task distress signals Replaces the hallucination-specific ``warnings`` / ``RecoverySection`` surface (shipped in PR #20232) with a reusable diagnostic-rule engine that covers five distress kinds in v1 and can be extended without touching UI code. The "something's wrong with this task" signal is no longer limited to phantom card ids. Closes the follow-up from #20232 discussion. New module ---------- ``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule engine. Each rule is a pure function of ``(task, events, runs, now, config) -> list[Diagnostic]``. Registry is a simple list; adding a new distress kind is one function + one import, no UI or API changes required. v1 rule set ----------- * ``hallucinated_cards`` (error) — folds the existing ``completion_blocked_hallucination`` event into the new surface. * ``prose_phantom_refs`` (warning) — folds ``suspected_hallucinated_references``. * ``repeated_spawn_failures`` (error → critical at 2x threshold) — fires when ``tasks.spawn_failures >= 3``; suggests ``hermes -p <profile> doctor`` / ``auth``. * ``repeated_crashes`` (error → critical) — fires after N consecutive ``crashed`` run outcomes with no successful completion between; suggests ``hermes kanban log <id>``. * ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked`` state with no comments / unblock attempts; suggests commenting. Every diagnostic carries structured ``actions`` (reclaim, reassign, unblock, cli_hint, comment, open_docs) that render consistently in both CLI and dashboard. Suggested actions are highlighted; generic recovery actions (reclaim / reassign) are available on every kind as fallbacks. Diagnostics auto-clear when the underlying failure resolves — a clean ``completed``/``edited`` event drops hallucination diagnostics, a successful run drops crash diagnostics, a comment drops stuck-blocked diagnostics. Audit events persist; the badge goes away. API --- ``plugin_api.py``: * ``/board`` now attaches ``diagnostics`` (full list) and ``warnings`` (compact summary with ``highest_severity``) per task. * ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics section auto-opens on flagged tasks. * NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by severity, sorted critical-first. CLI --- * NEW ``hermes kanban diagnostics [--severity X] [--task id] [--json]`` — fleet view or single-task view, matches dashboard rule output so CLI users see the same picture. * ``hermes kanban show <id>`` now renders a Diagnostics section near the top with severity markers + suggested actions. Dashboard --------- * Card badge is severity-coloured (⚠ amber warning, !! orange error, !!! red critical) using ``warnings.highest_severity``. * Attention strip above the toolbar counts EVERY task with active diagnostics (not just hallucinations), severity-coloured, lists affected tasks with Open buttons when expanded. * Drawer's old ``RecoverySection`` replaced with generic ``DiagnosticsSection`` rendering a card per active diagnostic: title + detail + structured data (task-id chips when payload keys look like id lists) + action buttons. Reassign profile picker is inline per-diagnostic. Clipboard fallback uses ``.catch()`` for environments where writeText rejects. * Three-rung severity palette; amber for warning, orange for error, red for critical. Uses CSS variables so theming is straightforward. Tests ----- * NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests covering each rule's positive/negative/threshold paths, severity sorting, broken-rule isolation, and sqlite3.Row integration. * Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty, populated, severity-filtered), ``/board`` exposes both diagnostic list and compact summary with ``highest_severity``. * Existing hallucination-specific test (``test_board_surfaces_ warnings_field_for_hallucinated_completions``) updated to reflect the new contract: warning summary keys by diagnostic kind (``hallucinated_cards``) not event kind. 379 kanban-suite tests pass (+16 net from this PR). Live verification ----------------- Seeded all 5 diagnostic kinds + one clean + one plain-running task (7 total) into an isolated HERMES_HOME, spun up the dashboard, and verified: * Attention strip: shows ``!! 5 tasks need attention`` in the error-severity orange; Show expands to a list of 5 rows ordered critical > error > warning. * Card badges: error tasks render ``!!`` orange, warning tasks render ``⚠`` amber, clean and plain-running tasks render no badge. * Each of the 5 rules opens a correctly-coloured, correctly-styled diagnostic card in the drawer with its specific suggested action. * Live reassign from a diagnostic card flipped ``broken-ml-worker → alice`` and the drawer refreshed with the new assignee + the same diagnostic still firing (correct: spawn_failures counter hasn't reset yet). * CLI ``hermes kanban diagnostics`` prints all 5 in severity order; ``--severity error`` narrows to 3; ``kanban show <id>`` includes the Diagnostics block at the top with suggested action hint. Migration note -------------- The old ``warnings`` shape (``{count, kinds, latest_at}``) is preserved on the API but ``kinds`` now keys by diagnostic kind (``hallucinated_cards``) instead of event kind (``completion_blocked_hallucination``). ``highest_severity`` is a new required field. The dashboard was the only consumer and has been updated in the same commit; external API consumers of the ``warnings`` field will need to update their kind-match logic. * feat(kanban/diagnostics): lead titles with the actual error text The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn N times' titles buried the actual cause in the data section. Operators had to open logs or expand the diagnostic to see WHY the worker is stuck — rate-limit vs insufficient quota vs bad auth vs context overflow vs network blip all looked identical at a glance. New titles: Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance Agent crashed 3x: provider auth error: 401 Unauthorized Agent spawn failed 4x: insufficient_quota: You exceeded your current Detail keeps the full error snippet (capped at 500 chars + ellipsis for tracebacks). Title takes the first line capped at 160 chars. Fallback title if no error recorded stays honest ('no error recorded'). Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total pass (+4). Live-verified on dashboard with 6 seeded scenarios (rate-limit, billing, auth, context, network, spawn-billing) — each card title leads with the actionable error text.	2026-05-05 13:32:42 -07:00
Teknium	de9238d37e	feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232 ) Workers completing a kanban task can now claim the ids of cards they created via an optional ``created_cards`` field on ``kanban_complete``. The kernel verifies each id exists and was created by the completing worker's profile; any phantom id blocks the completion with a ``HallucinatedCardsError`` and records a ``completion_blocked_hallucination`` event on the task so the rejected attempt is auditable. Successful completions also get a non-blocking prose-scan pass over their ``summary`` + ``result`` that emits a ``suspected_hallucinated_references`` event for any ``t_<hex>`` reference that doesn't resolve. Closes #20017. Recovery UX (kernel + CLI + dashboard) -------------------------------------- A structural gate alone isn't enough — operators also need to see and act on stuck workers, especially when a profile's model is the root cause. This PR ships the full loop: * ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that releases an active worker claim immediately (unlike ``release_stale_claims`` which only acts after claim_expires has passed). Emits a ``reclaimed`` event with ``manual: True`` payload. * ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` — switch a task to a different profile, optionally reclaiming a stuck running worker in the same call. * ``hermes kanban reclaim <id> [--reason ...]`` and ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]`` CLI subcommands wired through to the same helpers. * ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the dashboard plugin. Dashboard surfacing ------------------- * ⚠ warning badge on cards with active hallucination events. * attention strip at the top of the board listing all flagged tasks; dismissible per session. * events callout in the task drawer — hallucination events render with a red left border, amber icon, and phantom ids as styled chips. * recovery section in the task drawer with three actions: Reclaim, Reassign (with profile picker + reclaim-first checkbox), and a copy-to-clipboard hint for ``hermes -p <profile> model`` since profile config lives on disk and can't be edited from the browser. Auto-opens when the task has warnings, collapsed otherwise. Keyed by task id so state doesn't leak between drawers. Active-vs-stale rule: warnings clear when a clean ``completed`` or ``edited`` event supersedes the hallucination, so recovery is never permanently stigmatising — the audit events persist for debugging but the badge goes away once the worker succeeds. Skill updates ------------- * ``skills/devops/kanban-worker/SKILL.md`` documents the ``created_cards`` contract with good/bad examples. * ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering stuck workers" section with the three actions and when to use each. Tests ----- * Kernel gate: verified-cards manifest, phantom rejection + audit event, cross-worker rejection, prose scan positive + negative. * Recovery helpers: reclaim on running task, reclaim on non-running returns False, reassign refuses running without reclaim_first, reassign with reclaim_first succeeds on running. * API endpoints: warnings field present on /board and /tasks/:id, warnings cleared after clean completion, reclaim 200 + 409 paths, reassign 200 + 409 + reclaim_first paths. * CLI smoke: reclaim + reassign subcommands. Live-verified end-to-end on a dashboard with seeded scenarios: attention strip renders, badges land on the right cards, drawer callout shows phantom chips, Reclaim on a running task flips status to ready + emits manual reclaimed event + refreshes the drawer, Reassign swaps the assignee and triggers board refresh. 359/359 kanban-suite tests pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).	2026-05-05 08:06:55 -07:00
LeonSGP43	354502ee48	fix(kanban): preserve dashboard completion summaries	2026-05-05 04:57:38 -07:00
Brecht-H	f25d3ec917	fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees After PR #20105 (dispatcher skips ready tasks whose assignee fails ``profile_exists()`` to prevent the orion-cc/orion-research crash loop), the gateway and CLI emit a spurious "kanban dispatcher stuck: ready queue non-empty for N consecutive ticks but 0 workers spawned" warning every 5 minutes on multi-lane setups where the queue is steadily full of human-pulled work assigned to terminal lanes. The warn is intended to catch real failure modes (broken PATH, missing venv, credential loss for a real Hermes profile). On a multi-lane host it fires forever even though everything is healthy: the dispatcher correctly chose not to spawn, and there is nothing for the operator to fix. Changes: * ``DispatchResult`` gains a ``skipped_nonspawnable`` field (separate from ``skipped_unassigned``) so callers can distinguish "task missing an owner — operator should route it" from "task owned by a control-plane lane — terminal will pull it". * ``dispatch_once`` routes the ``not profile_exists(assignee)`` skip into the new bucket (was lumped into ``skipped_unassigned``). * New helper ``has_spawnable_ready(conn)`` returns True iff at least one ready+assigned+unclaimed task in the DB has an assignee that maps to a real Hermes profile. Falls back to legacy "any ready+assigned" when ``profile_exists`` is unimportable so degraded installs still surface the original warn. * The gateway dispatcher (``gateway/run.py``) and the CLI standalone daemon (``hermes_cli/kanban.py``) both swap their cheap ``ready_nonempty`` probe to use ``has_spawnable_ready``. Stuck-warn now fires only when there is genuine spawnable work the dispatcher failed to start. * CLI dispatch output prints ``Skipped (non-spawnable assignee — terminal lane, OK)`` for visibility without alarm. Tests: * New ``has_spawnable_ready`` cases (empty queue, terminal-lane only, mixed real+terminal). * New ``test_dispatch_skips_nonspawnable_into_separate_bucket`` verifies the bucketing change. * Updated ``test_dispatch_skips_unassigned`` to assert no cross-leak. * Added ``all_assignees_spawnable`` fixture in ``tests/hermes_cli/conftest.py`` and threaded it through dispatcher tests that use synthetic assignees ("alice", "bob"). PR #20105 (the parent commit) silently broke 8 such tests by routing those assignees into ``skipped_nonspawnable`` instead of spawning; this PR repairs them as part of the same code area. Verified locally: 246/246 kanban-suite tests pass. Stacks on top of fix/kanban-dispatcher-skip-missing-profile-2026-05-05 (PR #20105). Reviewer: this PR is meant to merge AFTER #20105. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:13:12 -07:00
Teknium	5ec6baa400	feat(kanban): multi-project boards — one install, many kanbans (#19653 ) Adds first-class board support to kanban so users can separate unrelated streams of work (projects, repos, domains) into isolated queues. Single- project users stay on the 'default' board and see no UI change. Isolation model --------------- - Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh installs and pre-boards users get zero migration. - Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in their env alongside the existing `HERMES_KANBAN_DB` / `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see other boards' tasks. - The gateway's single dispatcher loop now sweeps every board per tick; per-tick cost is a few extra filesystem stats. - CAS concurrency guarantees are preserved per-board (each board is its own SQLite DB, same WAL+IMMEDIATE machinery as before). CLI --- hermes kanban boards list\|create\|switch\|show\|rename\|rm hermes kanban --board <slug> <any-subcommand> Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env → `~/.hermes/kanban/current` file → `default`. Slug validation is strict: lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` / control chars are rejected so boards can't name their way out of the boards/ directory. Passive discoverability: when more than one board exists, `hermes kanban list` prints a one-line header ("Board: foo (2 other boards …)") so users who stumble across multi-project never have to hunt for the feature. Invisible for single-board installs. Dashboard --------- - New `BoardSwitcher` component at the top of the Kanban tab: dropdown with all boards + task counts, `+ New board` button, `Archive` button (non-default only). Hidden entirely when only `default` exists and is empty — single-project users never see it. - New `NewBoardDialog` modal: slug / display name / description / icon + "switch to this board after creating" checkbox. - Selected board persists to `localStorage` so browser users don't shift the CLI's active board out from under a terminal they left open. - New `?board=<slug>` query param on every existing endpoint plus a new `/boards` CRUD surface (`GET /boards`, `POST /boards`, `PATCH /boards/<slug>`, `DELETE /boards/<slug>`, `POST /boards/<slug>/switch`). - Events WebSocket is pinned to a board at connection time; switching opens a fresh WS against the new board. Also fixes a pre-existing bug in the plugin's tenant / assignee filters: the SDK's `Select` uses `onValueChange(value)`, not native `onChange(event)`, so those filters silently didn't work. New `selectChangeHandler` helper wires both signatures. Tests ----- 49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering: slug validation (valid / invalid / auto-downcase), path resolution (default = legacy path, named = `boards/<slug>/`, env var override), current-board resolution chain (env > file > default), board CRUD + archive / hard-delete, per-board connection isolation (tasks don't leak), worker spawn env injection (`HERMES_KANBAN_BOARD`, `HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the right board), and end-to-end CLI surface. Regression surface: all 264 pre-existing kanban tests continue to pass. Live-tested via the dashboard: created 3 boards (default, hermes-agent, atm10-server), created tasks on each via both CLI (`--board <slug> create`) and dashboard (inline create on the Ready column), confirmed zero cross-board leakage, confirmed `BoardSwitcher` + `NewBoardDialog` work end-to-end in the browser.	2026-05-04 04:42:38 -07:00
GodsBoy	f5bd77b3e1	fix(kanban): anchor board, workspaces, and worker logs at the shared Hermes root The Kanban board is documented as shared across all Hermes profiles, but `kanban_db_path()` and `workspaces_root()` resolved through `get_hermes_home()`, which returns the active profile's HERMES_HOME. When the dispatcher spawned a worker with `hermes -p <profile> --skills kanban-worker chat -q "work kanban task <id>"`, the worker rewrote HERMES_HOME to the profile subdirectory before kanban_db.py imported, opening a profile-local `kanban.db` that did not contain the dispatcher's task. `kanban_show` and `kanban_complete` failed; the dispatcher's row stayed `running` and was retried/crashed. The same defect applied to `_default_spawn`'s log directory and `worker_log_path`, so `hermes kanban tail` did not see the worker's output. Add `kanban_home()` in `hermes_cli/kanban_db.py` that resolves through `HERMES_KANBAN_HOME` (explicit override) then `get_default_hermes_root()`, which already understands the `<root>/profiles/<name>` and Docker / custom HERMES_HOME shapes. Reroute `kanban_db_path`, `workspaces_root`, the `_default_spawn` log directory, `gc_worker_logs`, and `worker_log_path` through it. Profile-specific config, `.env`, memory, and sessions stay isolated as before; only the kanban surface is shared. Add a `TestSharedBoardPaths` regression class to `tests/hermes_cli/test_kanban_db.py` covering: default install, profile-worker convergence, Docker custom HERMES_HOME, Docker profile layout, explicit `HERMES_KANBAN_HOME` override, and a real SQLite round-trip across dispatcher and worker HERMES_HOME perspectives. The dispatcher/worker convergence tests fail on origin/main and pass after the fix. Update the `kanban.md` user-guide page and the misleading docstrings in `kanban_db.py` to describe the shared-root behavior. Fixes #19348	2026-05-03 15:13:39 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. - CLI — `hermes kanban init\|create\|list\|show\|assign\|link\|unlink\| claim\|comment\|complete\|block\|unblock\|archive\|tail\|dispatch\|context\| init\|gc\|watch\|stats\|notify\|log\|heartbeat\|runs\|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first- class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081 )" (#16098 ) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. - Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00

38 commits