mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-06 02:41:48 +00:00
2 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
97acd66b4c
|
fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete Closes #18671. The classification pipeline that feeds cron-ref rewriting used to infer consolidation vs pruning from two brittle signals: the curator model's post-hoc YAML summary block, and a substring heuristic scanning other tool calls for the removed skill's name. Both miss in real consolidations — the model forgets the YAML under reasoning pressure, and the heuristic misses when the umbrella's patch content describes the absorbed behavior abstractly instead of naming the old slug. When both miss, the skill falls through to 'no-evidence fallback' pruned, and #18253's cron rewriter drops the cron ref entirely instead of mapping it to the umbrella. Same observable symptom as pre-#18253: 'Skill(s) not found and skipped' at the next cron run. The fix makes the model declare intent at the moment of deletion. skill_manage(action='delete') now accepts absorbed_into: - absorbed_into='<umbrella>' -> consolidated, target must exist on disk - absorbed_into='' -> explicit prune, no forwarding target - missing -> legacy path, falls through to heuristic/YAML The curator reconciler reads these declarations off llm_meta.tool_calls BEFORE either the YAML block or the substring heuristic. Declaration wins. Fallback logic stays intact for backward compat with any caller (human or older curator conversation) that doesn't populate the arg. Changes - tools/skill_manager_tool.py: add absorbed_into param to skill_manage + _delete_skill. Validate target exists when non-empty. Reject absorbed_into=<self>. Wire through dispatcher + registry + schema. - agent/curator.py: new _extract_absorbed_into_declarations() walks tool calls for skill_manage(delete) with the arg. _reconcile_classification accepts absorbed_declarations= and treats them as authoritative. Curator prompt updated to require the arg on every delete. - Tests: 7 new skill_manager tests covering the tool contract (valid target, empty string, nonexistent target, self-reference, whitespace, backward compat, dispatcher plumbing). 11 new curator tests covering the extractor + authoritative reconciler path + mixed-legacy-and- declared runs. Validation - 307/307 targeted tests pass (curator + cron + skill_manager suites). - E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing all 3. Model emits NO YAML block. Heuristic misses (patch prose doesn't name old slugs). Delete calls carry absorbed_into. Result: both PR skills correctly classified 'consolidated' + cron rewritten ['pr-review-format', 'pr-review-checklist', 'stale-junk'] -> ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''. - E2E backward-compat: delete without absorbed_into, model emits YAML -> routed via existing 'model' source, cron still rewritten correctly. * feat(curator): capture + restore cron skill links across snapshot/rollback Before this, rolling back a curator run restored the skills tree but cron jobs still pointed at the umbrella skills the curator had rewritten them to. The user would see their old narrow skills back on disk but their cron jobs still configured with the merged umbrella — not actually 'back to how it was'. Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json alongside the skills tarball, as cron-jobs.json. The manifest gets a new 'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI confirm dialog) can surface what's in the snapshot. If jobs.json is missing/unreadable/malformed, snapshot proceeds without cron data — the skills backup is the core guarantee; cron is additive. Rollback side: after the skills extract succeeds, the new _restore_cron_skill_links() reconciles the backed-up jobs into the live jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and only on jobs matched by id. Everything else about a cron job — schedule, last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live state the user or scheduler has modified since the snapshot; overwriting it would regress unrelated activity. Reconciliation rules: - Job in backup AND live, skills differ → skills restored. - Job in backup AND live, skills match → no-op. - Job in backup, NOT in live → skipped (user deleted it after snapshot; their choice is later than the snapshot). - Job in live, NOT in backup → untouched (user created it after snapshot). - Snapshot missing cron-jobs.json at all → rollback still succeeds, reports 'not captured' (older pre-feature snapshots keep working). Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the scheduler uses, so rollback doesn't race tick(). Also: - hermes_cli/curator.py: rollback confirm dialog now shows 'cron jobs: N (will be restored for skill-link fields only)' when the snapshot has cron data, or 'not in snapshot (<reason>)' otherwise. - rollback()'s message string includes a 'cron links: ...' clause summarizing the reconciliation outcome. Tests - 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json captured-as-raw, full rollback-restores-skills-and-cron, rollback touches only skill fields, rollback skips user-deleted jobs, rollback leaves user-created jobs untouched, rollback still works with pre-feature snapshot that has no cron-jobs.json, standalone unit test on _restore_cron_skill_links exercising the full report shape. Validation - 484/484 targeted tests pass (curator + cron + skill_manager suites). - E2E: real snapshot_skills, real cron rewrite, real rollback. Before: ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage']. After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id, name, prompt) preserved across the round trip. |
||
|
|
8b290a5908
|
feat(curator): split archived into consolidated vs pruned with model + heuristic classification (#17941)
* fix(curator): split 'archived' into consolidated vs pruned in run reports
Users who watched a curator run saw skills like 'anthropic-api' listed
under 'Skills archived' and interpreted that as pruning — but the curator
had actually absorbed those skills into a new umbrella (e.g. 'llm-providers')
during the same run. The directory gets archived for safety (all removals
are recoverable), but the content still lives under a different name.
Users then 'restored' what they thought were deleted skills and ended up
with confusingly duplicated skillsets (old-name + absorbed-inside-umbrella).
Classify removed skills using this run's skill_manage tool calls:
- consolidated: content absorbed into a surviving/newly-created skill
(evidenced by a skill_manage write_file/patch/create/edit whose target
is a different skill AND whose file_path/content references the
removed skill's name)
- pruned: archived without consolidation evidence (truly stale)
REPORT.md now shows two distinct sections:
- 'Consolidated into umbrella skills' — with `removed → merged into umbrella`
- 'Pruned — archived for staleness' — pure staleness archives
run.json schema additions (backward compatible):
- counts.consolidated_this_run, counts.pruned_this_run
- consolidated: [{name, into, evidence}, ...]
- pruned: [names]
- archived: retained as the union for backward compat
Also: relabel the auto-transitions 'archived' counter to 'archived (no
LLM, pure time-based staleness)' so it's clearly distinct from LLM-pass
archives.
Tests: 9 new tests in test_curator_classification.py covering consolidation
evidence parsing (write_file/patch/create), hyphen/underscore name variants,
self-reference rejection, destination-must-exist, mixed runs, and
malformed-JSON fallback safety. Existing test_report_md_is_human_readable
updated to cover the new section names.
E2E: isolated HERMES_HOME, realistic 3-skill run, REPORT.md verified
end-to-end.
* feat(curator): hybrid model-declared + heuristic classification
Extend the consolidated-vs-pruned split with LLM-authored intent:
1. Curator prompt now requires a structured YAML block at the end of the
final response (consolidations / prunings with short rationale).
2. _parse_structured_summary() extracts it tolerantly — missing block,
malformed YAML, partial lists all fall back to heuristic cleanly.
3. _reconcile_classification() merges model intent with the tool-call
heuristic:
- Model wins on rationale when its umbrella exists post-run
- Model hallucination (umbrella doesn't exist) is downgraded to the
heuristic's finding, or pruned if there's no evidence either
- Heuristic catches model omission — consolidations the model
enumerated tools for but forgot to list get surfaced with a
'(detected via tool-call audit)' tag
4. REPORT.md now shows per-row rationale alongside 'removed → umbrella'
and flags audit-only rows so the user knows why no reason is shown.
Backward compat: run.json's 'archived' field (union) is preserved.
'pruned' is now a list of dicts with {name, source, reason};
'pruned_names' is the flat-name list for legacy consumers.
Tests: 15 new covering YAML parse edge cases (malformed, empty lists,
bare-string entries, missing fields), reconciler rules (model wins,
hallucination fallback, heuristic catches omission, prune with reason),
and an end-to-end report-render test with all four paths exercised.
|