hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-03 02:11:48 +00:00

Author	SHA1	Message	Date
johnncenae	1ef9e88549	fix(gateway): write restart markers atomically and fix Windows lock collisions	2026-04-30 19:58:16 -07:00
teknium1	447a2bba3a	fix(plugins): bound async plugin command await with 30s timeout Follow-up to #17963. The threaded branch of resolve_plugin_command_result previously called Event.wait() with no timeout — a hung async plugin handler would wedge the terminal indefinitely. Cap the wait at 30s and raise TimeoutError instead. Added a regression test covering the hung handler path.	2026-04-30 19:56:18 -07:00
hharry11	ca9a61ae38	fix(plugins): await async handlers in CLI and TUI dispatch	2026-04-30 19:56:18 -07:00
johnncenae	79cffa9232	auth: coerce tls insecure flag safely instead of using Python truthiness	2026-04-30 19:55:48 -07:00
johnncenae	2bf73fbe2c	fix(cli): coerce tls insecure flag safely in auth state	2026-04-30 19:55:48 -07:00
Teknium	7cbe943d2d	feat(skills): add here.now as an optional skill Moves the here-now skill under optional-skills/productivity/here-now/ so it's discoverable via the Skills Hub but not installed by default, and tightens the SKILL.md description to a single line to match sibling optional-skill descriptions. Install with: hermes skills install official/productivity/here-now Closes #378	2026-04-30 19:48:15 -07:00
adamludwin	21cc9c8d32	Update here.now skill bundle Made-with: Cursor	2026-04-30 19:48:15 -07:00
adamludwin	f7dfd4ae36	feat(skills): add built-in here.now skill Add the here.now productivity skill with a bundled publish runtime so Hermes can publish files and folders to live URLs. Keep the skill thin and docs-first while fixing script path resolution and upload failure handling. Made-with: Cursor	2026-04-30 19:48:15 -07:00
Yukipukii1	2110a3a0c4	fix(tui): return JSON-RPC errors for invalid request shapes	2026-04-30 19:47:00 -07:00
Yukipukii1	5f3f456784	fix(approval): wake blocked gateway approvals on session cleanup	2026-04-30 19:46:27 -07:00
Feranmi10	f4ba97ad9a	fix(status): add NVIDIA_API_KEY to hermes status API keys display Closes #16082 The `hermes status` command listed provider API keys under the ◆ API Keys section but NVIDIA_API_KEY was absent. Users configured with NVIDIA NIM had no way to verify their key was set from status output. Add it alongside the other inference provider keys.	2026-04-30 19:46:06 -07:00
Yukipukii1	75483b6db1	fix(curator): preserve last_report_path in state	2026-04-30 19:45:59 -07:00
Mind-Dragon	aab5bcc6ac	test(model_switch): cover private user_providers override	2026-04-30 19:44:26 -07:00
Mind-Dragon	5ad8281885	fix(model_switch): correct user_providers override for private models The switch_model override logic incorrectly iterated over user_providers as if it were a list of dicts, but it's actually a dict mapping provider_slug -> config. This meant private models defined in a provider's `models:` section (e.g. nahcrof-dedicated with discover_models: false) were never accepted when the API /models list didn't include them. Fix: iterate over user_providers.items(), match by slug, and handle both dict and list forms of the models config.	2026-04-30 19:44:26 -07:00
Aamir Jawaid	1e5a23fa64	docs(teams): use teams app get --install-link for Step 6 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	67f1198ba9	docs(teams): fix CLI install tag and Step 6 install flow - Keep @preview tag for teams CLI - Step 3: note client secret won't be shown again - Step 6: use the Install in Teams link from teams app create output Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	d5e72ae17f	docs(teams): fix CLI install tag and Step 6 install flow - Keep @preview tag for teams CLI - Step 3: note client secret won't be shown again - Step 6: just open the Install in Teams link from teams app create output Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	a5d60f42ee	docs(teams): fix CLI install tag and Step 6 install flow - Keep @preview tag for teams CLI - Step 3: note client secret won't be shown again - Step 6: use the install link printed by teams app create instead of a separate CLI command Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	09aba91766	docs(teams): note that tunnel port 3978 is the default, not fixed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	f59693c075	fix(teams): pipe TEAMS_PORT through docker-compose properly Was hardcoded to 3978; use ${TEAMS_PORT:-3978} so a custom port set in .env is actually passed into the container. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	c997830e1e	docs(teams): fix port references and add TEAMS_ALLOW_ALL_USERS - Replace hardcoded 3978 with configurable TEAMS_PORT references - Fix incorrect docker-compose port mapping claim (uses network_mode: host) - Add missing TEAMS_ALLOW_ALL_USERS to config reference table Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	4a6fac36d8	docs(teams): fix group chat behavior — @mention required Group chats require @mention just like channels, not respond-to-all. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
Aamir Jawaid	624057fce6	feat(teams): set User-Agent to Hermes via 2.0.0 client option microsoft-teams-apps 2.0.0 added the `client` option to AppOptions, accepting a ClientOptions instance. Use it to set the User-Agent header to "Hermes" on all outgoing HTTP requests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 19:43:32 -07:00
briandevans	97d6f25008	test(toolsets): include kanban in expected post-#17805 toolset assertions The kanban PR (#17805, `c86842546`) added the `kanban` toolset and `tools/kanban_tools.py`, but didn't update three pre-existing test assertions that bake the full toolset/tool inventory: * `tests/tools/test_registry.py::test_matches_previous_manual_builtin_tool_set` hard-codes the manual list of builtin tool modules. `tools.kanban_tools` was missing. * `tests/test_tui_gateway_server.py::test_load_enabled_toolsets_rejects_disabled_mcp_env` and `test_load_enabled_toolsets_falls_back_when_tui_env_invalid` both expect `["memory"]` from `_load_enabled_toolsets()`. With kanban now auto-recovered by `_get_platform_tools` (its tools live in hermes-cli's universe but are not in CONFIGURABLE_TOOLSETS), the resolver returns `["kanban", "memory"]`. * `tests/hermes_cli/test_tools_config.py::test_get_platform_tools_preserves_explicit_empty_selection` asserts `set()` for an explicit empty list. The recovery loop now also surfaces `kanban`. Reframed to assert the contract the test name describes — no CONFIGURABLE toolset gets re-enabled when the user explicitly saved an empty list — which stays correct as more non-configurable platform toolsets are added. Verified the failures reproduce on clean origin/main (`180a7036b`) with `.[all,dev]`-equivalent extras (fastapi, starlette, httpx, pytest-asyncio) and that all four pass with this commit applied. CI on main itself is currently red on these tests; this restores green for everyone's PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 19:43:03 -07:00
Chris Danis	f61695ee73	fix(signal): skip contentless envelopes (profile key updates, empty messages) Signal-cli sends dataMessage wrappers for profile key updates and other metadata events that have no actual text content. These were reaching the gateway as msg='' and triggering full agent turns for nothing. Add early return in _handle_envelope() when both message field is empty/missing/whitespace AND there are no attachments. Messages with media attachments but no text still flow through. - 12 lines added to gateway/platforms/signal.py - 5 new tests in TestSignalContentlessEnvelope class	2026-04-30 19:42:59 -07:00
Teknium	e2e6b6ff1a	chore(models): move Vercel AI Gateway to bottom of provider picker (#18112) It was sitting at position 4 of the `hermes model` list, ahead of Anthropic, OpenAI, Xiaomi, and other first-class API providers. Move it to the end of CANONICAL_PROVIDERS and drop the "(200+ models, $5 free credit, no markup)" parenthetical so the entry just reads "Vercel AI Gateway".	2026-04-30 19:34:19 -07:00
Teknium	e5dad4ac57	fix(agent): propagate ContextVars to concurrent tool worker threads (#18123) Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics. Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd). Salvages #16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site. Closes #16660. Co-authored-by: firefly <promptsiren@gmail.com> Co-authored-by: banditburai <banditburai@users.noreply.github.com>	2026-04-30 16:26:26 -07:00
Teknium	180a7036bc	feat(skills): add Shopify optional skill (Admin + Storefront GraphQL) (#18116) Adds optional-skills/productivity/shopify — curl-based guide for the Shopify Admin GraphQL API (products, orders, customers, inventory, metafields, bulk operations, webhooks) and the Storefront GraphQL API. - API version 2026-01 (current stable) - Custom-app access tokens (shpat_...) with X-Shopify-Access-Token header - Notes the 2026-01-01 deprecation of admin-created custom apps, points users at Dev Dashboard for new setups after that date - Includes a reusable shop_gql() bash helper, cursor pagination, rate-limit cost inspection, GID conventions, userErrors check - Safety section warns on destructive mutations (delete/refund/cancel) Installs cleanly via: hermes skills install official/productivity/shopify	2026-04-30 15:58:44 -07:00
brooklyn!	8fed969618	Merge pull request #18113 from NousResearch/bb/tui-sgr-mouse-fragments fix(tui): recover fragmented SGR mouse reports	2026-04-30 15:56:59 -07:00
Brooklyn Nicholson	ded011c5a5	fix(tui): tighten SGR fragment matching	2026-04-30 17:50:49 -05:00
Brooklyn Nicholson	71b685aee0	fix(tui): recover fragmented SGR mouse reports	2026-04-30 17:43:21 -05:00
Teknium	bbbce92651	feat(tui): render self-improvement review summaries in the transcript The Ink TUI (`hermes --tui` + dashboard `/chat`) had no wiring for the background self-improvement review. When the review fired and patched a skill or saved a memory entry, the change landed but the user had no visual indication it happened — only the CLI had a print surface for the '💾 Self-improvement review: …' line. Changes: - tui_gateway/server.py: in _init_session, attach agent.background_review_callback to an _emit('review.summary', sid, {text}) closure. Wrapped in try/except so agents with locked attribute slots don't break session startup. - ui-tui/src/app/createGatewayEventHandler.ts: handle 'review.summary' by routing ev.payload.text through sys(…), matching the existing 'background.complete' pattern. Empty / whitespace payloads are ignored so the transcript never gets a blank system line. - ui-tui/src/gatewayTypes.ts: extend the GatewayEvent discriminated union with { type: 'review.summary', payload?: { text?: string } }. Gateway platforms (Telegram, Discord, Slack, …) already route the review summary via background_review_callback → post-delivery queue in gateway/run.py, so they pick up the new 'Self-improvement review:' prefix from the companion run_agent change with no platform edits. Tests: - tests/tui_gateway/test_review_summary_callback.py (Python, 2 tests): _init_session attaches a callback that emits the right event; the callback path survives agents that can't accept the attribute. - ui-tui/src/__tests__/createGatewayEventHandler.test.ts (vitest, 2 new cases): review.summary events feed sys(...) with the full text; empty / missing payloads are no-ops. - TypeScript type-check passes. - tui_gateway suite: 64/64 pass.	2026-04-30 14:07:22 -07:00
Teknium	80a676658c	fix(cli): surface self-improvement review summaries from bg thread When the self-improvement background review fires after a turn, it runs in a bg thread and emits a ' 💾 <summary>' line to announce what it saved to memory or skills. Two problems made this invisible to users even when the review successfully modified a skill: 1. The print went through `_cprint` (prompt_toolkit's print_formatted_text) on a bg thread while the CLI's PromptSession was live. Direct print_formatted_text races with the input-area redraw and the line can land behind/above the prompt, scrolled off without the user seeing it. 2. The message said only '💾 Skill created.' / '💾 Memory updated' with no indication that the self-improvement loop was the one doing this. Users who did catch the line couldn't tell the background review from some other agent action. Fixes: - `_cprint` now detects when it's called from a non-app thread with a running prompt_toolkit Application, and routes through `run_in_terminal` via `loop.call_soon_threadsafe`. That pauses the input, prints the line above the prompt, and redraws — the normal prompt_toolkit contract for bg-thread output. Direct-print fallback preserved for the no-app / same-thread / import-error paths. Affects every bg-thread emission, not just the review summary (curator summaries and auxiliary failure prints benefit too). - The summary now reads ' 💾 Self-improvement review: <summary>' in both the CLI and the gateway `background_review_callback` path, so the origin is unambiguous. Tests: - New `tests/cli/test_cprint_bg_thread.py` covers all five routing branches (no app, app-not-running, cross-thread schedule, same-thread direct, app-loop-attribute-error, import-error). - New case in `tests/run_agent/test_background_review.py` asserts the attributed prefix shows up in both `_safe_print` and `background_review_callback`. 
Live E2E: exercised _cprint from a bg thread inside a real Application event loop; confirmed get_app_or_none() sees the app, call_soon_threadsafe schedules run_in_terminal, and the inner _pt_print runs.	2026-04-30 14:07:22 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. 
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|claim|comment|complete|block|unblock|archive|tail|dispatch|context|init|gc|watch|stats|notify|log|heartbeat|runs|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per-task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no _config_version bump needed. 
Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first-class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
ethernet	59c1a13f45	Merge pull request #15680 from NousResearch/fix/nix-package-lock fix: let fixing nix pkgs command work without an initial build	2026-04-30 16:21:51 -04:00
Teknium	1d8068d71d	feat(models): add openrouter/owl-alpha (free) to curated OpenRouter list (#18071)	2026-04-30 12:57:02 -07:00
Ari Lotter	9ac4a2e53e	fix: let fixing nix pkgs command work without an initial build	2026-04-30 15:39:45 -04:00
Austin Pickett	6bc5d72271	Merge pull request #16419 from vincez-hms-coder/feat/dashboard-profiles-hms-coder feat(dashboard): add profiles management page	2026-04-30 12:09:23 -07:00
ethernet	b737af8226	Merge pull request #18047 from stephenschoettler/fix/acp-persist-user-message-test-mocks test(acp): accept prompt persistence kwargs in MCP E2E mocks	2026-04-30 14:43:26 -04:00
Teknium	73bf3ab1b2	chore: release v0.12.0 (2026.4.30) (#18057) The Curator release — Hermes Agent now maintains itself. Autonomous background Curator grades, prunes, and consolidates the skill library; self-improvement loop substantially upgraded; four new inference providers; Microsoft Teams (via pluggable platforms) + Yuanbao as 18th and 19th messaging platforms; Spotify + Google Meet native integrations; ComfyUI + TouchDesigner-MCP bundled by default; Humanizer skill ported; ~57% cut to visible TUI cold start. Stats since v0.11.0: 1,096 commits, 550 merged PRs, 1,270 files changed, 217,776 insertions, 213 community contributors.	2026-04-30 11:31:01 -07:00
Teknium	76edc40ab0	fix(agent): extend thinking-mode reasoning_content pad to Kimi/Moonshot Builds on #16855 (@lsdsjy) which fixed DeepSeek v4 reasoning_content replay via model_extra fallback + capturing tool_calls at method entry. Kimi / Moonshot thinking mode enforces the same echo-back contract and hits the same 400 when a tool-call turn is persisted without reasoning_content. - _build_assistant_message: pad branch now uses _needs_thinking_reasoning_pad() (DeepSeek OR Kimi) instead of _needs_deepseek_tool_reasoning() alone. - Extract _needs_thinking_reasoning_pad() and reuse it in _copy_reasoning_content_for_api so both sites share one predicate. - tests/run_agent/test_deepseek_reasoning_content_echo.py: add TestBuildAssistantMessagePadsStrictProviders parametrized over DeepSeek (attr=None, attr-absent), Kimi (attr=None), Moonshot (via base_url), and an OpenRouter negative control that must NOT pad. Proven to fail 2/5 cases on Kimi/Moonshot without this change. - scripts/release.py: add AUTHOR_MAP entries for lsdsjy and season179. Refs #17400. Co-authored-by: season179 <season.saw@gmail.com>	2026-04-30 11:18:39 -07:00
lsdsjy	b9b9ee3e6c	fix(deepseek): preserve v4 reasoning_content on replay	2026-04-30 11:18:39 -07:00
ethernet	8fbc9d7d78	Merge pull request #18043 from NousResearch/feat/help-ui feat(tui): add a mini help menu when u write ? in the input field	2026-04-30 14:02:28 -04:00
Stephen Schoettler	699a9c11a9	test(acp): accept prompt persistence kwargs in mocks	2026-04-30 10:47:23 -07:00
Teknium	d60a9917d3	feat(curator): show most-used and least-used skills in `hermes curator status` (#18033) Alongside the existing 'least recently used' section, surface two more rankings so users can see which of their agent-created skills actually get exercised: - 'most used (top 5)' — sorted by use_count descending. Hidden when every skill has use_count=0 (noise suppression on fresh installs). - 'least used (top 5)' — sorted by use_count ascending. Always shown when the catalog is non-empty. use_count started tracking real agent skill activation in PR #17932 (bump_use wired into skill_view tool + slash invocation + --skill preload), so these rankings are now meaningful. Tests: 3 new in tests/hermes_cli/test_curator_status.py — happy path with mixed use_counts, zero-use suppression of the most-used section, and the no-skills clean-empty case.	2026-04-30 10:37:33 -07:00
ethernet	7c07422202	feat(tui): add a mini help menu when u write ? in the input field it feels so nice :3 just a lil popup ! doesn't get in the way or take any focus or anything, and directs users to /help for more info :3	2026-04-30 13:37:12 -04:00
y0shualee	f4b76fa272	fix: use skill activity in curator status Treat skill views and edits as activity when curator reports and applies lifecycle transitions, so recently loaded or patched skills are not displayed or transitioned as never used. Adds regression tests for activity derivation, automatic transitions, and CLI status output.	2026-04-30 10:31:47 -07:00
0xDevNinja	564a649e6a	fix(curator): scan nested archive subdirs in restore_skill restore_skill() in tools/skill_usage.py used archive_root.iterdir(), which only walked the top level of .archive/. Skills archived under nested layouts (e.g. .archive/openclaw-imports/<skill>/ from older archive paths or external imports) were invisible to both the exact-match and prefix-match candidate scans, surfacing as a misleading "skill '<name>' not found in archive" error even though the directory existed on disk. Switch both candidate scans to archive_root.rglob('*') so the lookup descends into category subdirectories. Fixes #17942	2026-04-30 10:31:44 -07:00
Teknium	7913d6a90f	chore(author-map): add y0shua1ee and 0xDevNinja for curator PRs (#18031)	2026-04-30 10:31:38 -07:00
Teknium	8b290a5908	feat(curator): split archived into consolidated vs pruned with model + heuristic classification (#17941) * fix(curator): split 'archived' into consolidated vs pruned in run reports Users who watched a curator run saw skills like 'anthropic-api' listed under 'Skills archived' and interpreted that as pruning — but the curator had actually absorbed those skills into a new umbrella (e.g. 'llm-providers') during the same run. The directory gets archived for safety (all removals are recoverable), but the content still lives under a different name. Users then 'restored' what they thought were deleted skills and ended up with confusingly duplicated skillsets (old-name + absorbed-inside-umbrella). Classify removed skills using this run's skill_manage tool calls: - consolidated: content absorbed into a surviving/newly-created skill (evidenced by a skill_manage write_file/patch/create/edit whose target is a different skill AND whose file_path/content references the removed skill's name) - pruned: archived without consolidation evidence (truly stale) REPORT.md now shows two distinct sections: - 'Consolidated into umbrella skills' — with `removed → merged into umbrella` - 'Pruned — archived for staleness' — pure staleness archives run.json schema additions (backward compatible): - counts.consolidated_this_run, counts.pruned_this_run - consolidated: [{name, into, evidence}, ...] - pruned: [names] - archived: retained as the union for backward compat Also: relabel the auto-transitions 'archived' counter to 'archived (no LLM, pure time-based staleness)' so it's clearly distinct from LLM-pass archives. Tests: 9 new tests in test_curator_classification.py covering consolidation evidence parsing (write_file/patch/create), hyphen/underscore name variants, self-reference rejection, destination-must-exist, mixed runs, and malformed-JSON fallback safety. Existing test_report_md_is_human_readable updated to cover the new section names. 
E2E: isolated HERMES_HOME, realistic 3-skill run, REPORT.md verified end-to-end. * feat(curator): hybrid model-declared + heuristic classification Extend the consolidated-vs-pruned split with LLM-authored intent: 1. Curator prompt now requires a structured YAML block at the end of the final response (consolidations / prunings with short rationale). 2. _parse_structured_summary() extracts it tolerantly — missing block, malformed YAML, partial lists all fall back to heuristic cleanly. 3. _reconcile_classification() merges model intent with the tool-call heuristic: - Model wins on rationale when its umbrella exists post-run - Model hallucination (umbrella doesn't exist) is downgraded to the heuristic's finding, or pruned if there's no evidence either - Heuristic catches model omission — consolidations the model enumerated tools for but forgot to list get surfaced with a '(detected via tool-call audit)' tag 4. REPORT.md now shows per-row rationale alongside 'removed → umbrella' and flags audit-only rows so the user knows why no reason is shown. Backward compat: run.json's 'archived' field (union) is preserved. 'pruned' is now a list of dicts with {name, source, reason}; 'pruned_names' is the flat-name list for legacy consumers. Tests: 15 new covering YAML parse edge cases (malformed, empty lists, bare-string entries, missing fields), reconciler rules (model wins, hallucination fallback, heuristic catches omission, prune with reason), and an end-to-end report-render test with all four paths exercised.	2026-04-30 10:31:23 -07:00
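
Commit e5dad4ac57 above describes propagating ContextVars into tool worker threads via `copy_context().run`. The sketch below illustrates that general technique using only the standard library; it is not the project's code. The names `session_key`, `read_session_key`, and `run_tools` are hypothetical stand-ins (the commit mentions `tools.approval._approval_session_key` as the real variable).

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-session ContextVar such as the approval session key.
session_key = contextvars.ContextVar("session_key", default=None)


def read_session_key():
    """Runs in a worker thread and reports the session key visible there."""
    return session_key.get()


def run_tools(propagate):
    """Submit one worker call, with or without the caller's context."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot the caller's context and run the worker inside that copy.
            ctx = contextvars.copy_context()
            future = pool.submit(ctx.run, read_session_key)
        else:
            # A bare submit does not carry the caller's context explicitly.
            future = pool.submit(read_session_key)
        return future.result()


session_key.set("session-A")
without_ctx = run_tools(propagate=False)
with_ctx = run_tools(propagate=True)
print(without_ctx, with_ctx)
```

With `propagate=True` the worker sees `"session-A"`. Without it, a standard CPython build typically gives the worker thread a fresh context, so it sees the default instead, which is the kind of gap that could let per-session state go missing in concurrent workers.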

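Commit 447a2bba3a above caps an `Event.wait()` at 30 seconds and raises `TimeoutError` when an async handler hangs. A minimal sketch of that pattern follows, assuming the coroutine runs on a helper thread. The function `run_coro_with_timeout` and both handlers are hypothetical and do not reproduce `resolve_plugin_command_result`.

```python
import asyncio
import threading


def run_coro_with_timeout(coro_factory, timeout=30.0):
    """Run a coroutine on a helper thread; raise TimeoutError if it hangs."""
    done = threading.Event()
    outcome = {}

    def worker():
        try:
            outcome["value"] = asyncio.run(coro_factory())
        except Exception as exc:
            outcome["error"] = exc
        finally:
            done.set()

    # Daemon thread, so a hung handler cannot keep the process alive at exit.
    threading.Thread(target=worker, daemon=True).start()
    # Event.wait returns False when the timeout elapses before the event is set.
    if not done.wait(timeout):
        raise TimeoutError(f"handler did not finish within {timeout}s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


async def quick_handler():
    return "ok"


async def hung_handler():
    await asyncio.sleep(60)  # simulates a handler that never returns in time


result = run_coro_with_timeout(quick_handler, timeout=1.0)
try:
    run_coro_with_timeout(hung_handler, timeout=0.1)
    timed_out = False
except TimeoutError:
    timed_out = True
print(result, timed_out)
```

The bounded wait only unblocks the caller; the hung coroutine keeps running on its daemon thread, so a real implementation may also want to cancel or log it.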

6857 commits