feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840)

The curator now defaults to prune-only: the deterministic inactivity pass
(mark stale / archive long-unused skills) still runs whenever the curator is
enabled, but the opinionated LLM umbrella-building consolidation fork is OFF
by default.

- agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate
  the forked aux-model review in run_curator_review behind it (new consolidate
  param, None=read config). When off, the LLM pass is skipped entirely (no
  aux-model cost); the run is still recorded and reported.
- config.py: add curator.consolidate (default false); v29->v30 migration seeds
  the key for existing installs without clobbering a user-set value.
- hermes_cli/curator.py: 'hermes curator run --consolidate' override; status
  shows consolidate state; prune-only notice on run.
- docs + tests.
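
The override semantics in the bullets above (an explicit argument wins, `None` falls back to config, a missing key means off) can be sketched in isolation. This is a minimal illustration, not the code in agent/curator.py: `resolve_consolidate` and the plain-dict `cfg` are hypothetical stand-ins for the handling inside `run_curator_review` and for `_load_config()`.

```python
from typing import Any, Mapping, Optional

DEFAULT_CONSOLIDATE = False  # mirrors the default introduced by this commit


def resolve_consolidate(override: Optional[bool], cfg: Mapping[str, Any]) -> bool:
    """Hypothetical helper: explicit override wins, else config, else default."""
    if override is not None:
        # A caller-supplied True/False applies to this invocation only.
        return bool(override)
    # None means "no opinion": read the config key, defaulting to off.
    return bool(cfg.get("consolidate", DEFAULT_CONSOLIDATE))
```

Keeping the parameter tri-state (`None`/`True`/`False`) rather than a plain boolean is what lets the CLI say "no opinion" without accidentally forcing the pass off.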
Teknium 2026-06-17 05:20:32 -07:00 committed by GitHub
parent e48803daec
commit 7bbffceb9c
5 changed files with 217 additions and 11 deletions


@ -57,6 +57,11 @@ DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
DEFAULT_ARCHIVE_AFTER_DAYS = 90
# Consolidation (the LLM umbrella-building fork) is OFF by default. The
# deterministic inactivity prune (apply_automatic_transitions) still runs
# whenever the curator is enabled; only the opinionated, aux-model-cost
# consolidation pass is opt-in.
DEFAULT_CONSOLIDATE = False
# ---------------------------------------------------------------------------
@ -182,6 +187,22 @@ def get_prune_builtins() -> bool:
return bool(cfg.get("prune_builtins", True))
def get_consolidate() -> bool:
"""Whether the curator runs its LLM consolidation (umbrella-building) pass.
OFF by default. When off, a curator run does ONLY the deterministic
inactivity prune (mark stale / archive long-unused skills) and skips the
forked aux-model review entirely: no consolidation, no umbrella-building,
no aux-model cost. Set ``curator.consolidate: true`` to opt back into the
LLM pass that merges overlapping skills into class-level umbrellas.
The explicit ``hermes curator run --consolidate`` flag overrides this for
a single invocation regardless of the config value.
"""
cfg = _load_config()
return bool(cfg.get("consolidate", DEFAULT_CONSOLIDATE))
# ---------------------------------------------------------------------------
# Idle / interval check
# ---------------------------------------------------------------------------
@ -1408,25 +1429,38 @@ def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
consolidate: Optional[bool] = None,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
Steps:
1. Apply automatic state transitions (pure, no LLM).
- 2. If there are agent-created skills, spawn a forked AIAgent that runs
- the LLM review prompt against the current candidate list.
+ 2. If consolidation is enabled AND there are agent-created skills, spawn
+ a forked AIAgent that runs the LLM review prompt against the current
+ candidate list.
3. Update .curator_state with last_run_at and a one-line summary.
4. Invoke *on_summary* with a user-visible description.
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
*consolidate* gates the LLM umbrella-building pass. ``None`` (the default)
reads ``curator.consolidate`` from config (OFF by default). Passing
``True``/``False`` overrides the config for this invocation (used by the
``hermes curator run --consolidate`` flag). When consolidation is off, only
the deterministic inactivity prune runs and the forked aux-model review is
skipped entirely (no aux-model cost).
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only: no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
- can read what the curator WOULD have done.
+ can read what the curator WOULD have done. A dry-run also honors
*consolidate*: when consolidation is off, the preview only reports the
deterministic prune candidates.
"""
if consolidate is None:
consolidate = get_consolidate()
start = datetime.now(timezone.utc)
if dry_run:
# Count candidates without mutating state.
@ -1489,6 +1523,53 @@ def run_curator_review(
before_report = []
before_names = {r.get("name") for r in before_report if isinstance(r, dict)}
# Consolidation gate. When off (the default), the curator does ONLY the
# deterministic inactivity prune above — no forked aux-model review, no
# umbrella-building, no aux-model cost. Record the run, write a report
# reflecting the prune-only outcome, and return without spawning a fork.
if not consolidate:
final_summary = (
f"{prefix}{auto_summary}; llm: skipped (consolidation off)"
)
llm_meta = {
"final": "",
"summary": "skipped (consolidation off)",
"model": "",
"provider": "",
"tool_calls": [],
"error": None,
}
elapsed = (datetime.now(timezone.utc) - start).total_seconds()
state2 = load_state()
state2["last_run_duration_seconds"] = elapsed
state2["last_run_summary"] = final_summary
try:
after_report = skill_usage.agent_created_report()
except Exception:
after_report = []
try:
report_path = _write_run_report(
started_at=start,
elapsed_seconds=elapsed,
auto_counts=counts,
auto_summary=auto_summary,
before_report=before_report,
before_names=before_names,
after_report=after_report,
llm_meta=llm_meta,
)
if report_path is not None:
state2["last_report_path"] = str(report_path)
except Exception as e:
logger.debug("Curator report write failed: %s", e, exc_info=True)
save_state(state2)
if on_summary:
try:
on_summary(f"curator: {final_summary}")
except Exception:
pass
return
llm_meta: Dict[str, Any] = {}
try:
candidate_list = _render_candidate_list()


@ -1895,6 +1895,14 @@ DEFAULT_CONFIG = {
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Run the LLM consolidation (umbrella-building) pass. OFF by default.
# When off, a curator run does ONLY the deterministic inactivity prune
# (mark stale / archive long-unused skills) and skips the forked
# aux-model review entirely — no umbrella-building, no aux-model cost.
# Set to true to opt back into merging overlapping skills into
# class-level umbrellas. `hermes curator run --consolidate` overrides
# this for a single invocation.
"consolidate": False,
# Also prune (archive) bundled built-in skills after the inactivity
# period, not just agent-created ones. ON by default. Built-ins are
# normally restored on every `hermes update`, so pruning them only
@ -2569,7 +2577,7 @@ DEFAULT_CONFIG = {
# Config schema version - bump this when adding new required fields
- "_config_version": 29,
+ "_config_version": 30,
}
# =============================================================================
@ -4858,6 +4866,29 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not quiet:
print(" ✓ Renamed write_mode → write_approval (boolean gate)")
# ── Version 29 → 30: seed curator.consolidate (default false) ──
# Consolidation (the LLM umbrella-building fork) is now an opt-in toggle,
# OFF by default. The deterministic inactivity prune still runs whenever
# the curator is enabled; only the opinionated, aux-model-cost LLM pass is
# gated. The runtime deep-merge already supplies the default, but we seed
# the key so it's visible/editable in config.yaml. Existing installs that
# WANT the old always-consolidate behavior must set it to true explicitly.
# Only add the key when a curator section exists and lacks it — never
# clobber a value the user already set.
if current_ver < 30:
config = read_raw_config()
raw_curator = config.get("curator")
if isinstance(raw_curator, dict) and "consolidate" not in raw_curator:
raw_curator["consolidate"] = False
config["curator"] = raw_curator
save_config(config)
results["config_added"].append("curator.consolidate=false")
if not quiet:
print(
" ✓ Seeded curator.consolidate: false "
"(LLM consolidation is now opt-in; pruning stays on)"
)
# ── Post-migration: disable exfiltration-shaped MCP stdio entries ──
# Users can hand-edit mcp_servers, and older installs may already contain a
# malicious entry. Preserve the stanza for auditability but mark it
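
The v29 -> v30 seeding step in this hunk can be sketched against a plain dict. `seed_consolidate` is a hypothetical stand-in written for illustration; the real migration goes through `read_raw_config()` and `save_config()`, which are not reproduced here.

```python
from typing import Any, Dict, List


def seed_consolidate(raw_config: Dict[str, Any]) -> List[str]:
    """Hypothetical sketch: seed curator.consolidate without clobbering.

    Returns the list of keys that were added (empty if nothing changed).
    """
    added: List[str] = []
    curator = raw_config.get("curator")
    # Only touch an existing curator section that lacks the key; a value the
    # user already set (true or false) is left exactly as it is.
    if isinstance(curator, dict) and "consolidate" not in curator:
        curator["consolidate"] = False
        added.append("curator.consolidate=false")
    return added
```

A config with no `curator` section is left untouched in this sketch, which matches the comment's note that the runtime deep-merge already supplies the default.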


@ -77,6 +77,10 @@ def _cmd_status(args) -> int:
print(f" interval: every {_interval_label}")
print(f" stale after: {curator.get_stale_after_days()}d unused")
print(f" archive after: {curator.get_archive_after_days()}d unused")
print(
f" consolidate: {'on' if curator.get_consolidate() else 'off'}"
f"{'' if curator.get_consolidate() else ' (prune-only; LLM merge pass opt-in)'}"
)
rows = skill_usage.agent_created_report()
if not rows:
@ -174,10 +178,20 @@ def _cmd_run(args) -> int:
dry = bool(getattr(args, "dry_run", False))
background = bool(getattr(args, "background", False))
synchronous = bool(getattr(args, "synchronous", False)) or not background
# --consolidate forces the LLM umbrella-building pass on for this run,
# overriding the config default (off). When the flag is absent, pass None
# so run_curator_review reads curator.consolidate from config.
consolidate = True if bool(getattr(args, "consolidate", False)) else None
if dry:
print("curator: running DRY-RUN (report only, no mutations)...")
else:
print("curator: running review pass...")
if consolidate is None and not curator.get_consolidate():
print(
"curator: consolidation is off — running prune-only "
"(deterministic stale/archive). Pass --consolidate or set "
"`curator.consolidate: true` to enable the LLM merge pass."
)
def _on_summary(msg: str) -> None:
print(msg)
@ -186,6 +200,7 @@ def _cmd_run(args) -> int:
on_summary=_on_summary,
synchronous=synchronous,
dry_run=dry,
consolidate=consolidate,
)
auto = result.get("auto_transitions", {})
if auto:
@ -503,6 +518,12 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
help="Report only — no state changes, no archives, no consolidation "
"(use this to preview what curator would do)",
)
p_run.add_argument(
"--consolidate", dest="consolidate", action="store_true",
help="Force the LLM umbrella-building consolidation pass on for this "
"run, overriding the config default (off). Without this flag the "
"run is prune-only unless `curator.consolidate: true` is set.",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
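
The flag wiring above maps an absent `--consolidate` to `None` rather than `False`, so the config value still applies when the flag is omitted. A minimal, self-contained sketch of that mapping follows; `parse_consolidate` and the throwaway parser are hypothetical and not part of hermes_cli.

```python
import argparse
from typing import Optional, Sequence


def parse_consolidate(argv: Sequence[str]) -> Optional[bool]:
    """Hypothetical sketch: turn a store_true flag into a tri-state override."""
    parser = argparse.ArgumentParser(prog="curator-run-sketch")
    parser.add_argument("--consolidate", dest="consolidate", action="store_true")
    args = parser.parse_args(list(argv))
    # Flag present -> force the pass on; flag absent -> None ("read config").
    return True if args.consolidate else None
```

Because `store_true` can only ever produce `True` or `False`, the flag can force consolidation on for one run but cannot force it off; turning it off is left to the config key.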


@ -520,7 +520,7 @@ def test_dry_run_injects_report_only_banner(curator_env, monkeypatch):
"tool_calls": [], "error": None}
monkeypatch.setattr(c, "_run_llm_review", _stub)
- c.run_curator_review(synchronous=True, dry_run=True)
+ c.run_curator_review(synchronous=True, dry_run=True, consolidate=True)
assert "DRY-RUN" in captured["prompt"]
assert "DO NOT" in captured["prompt"]
@ -571,7 +571,11 @@ def test_run_review_synchronous_invokes_llm_stub(curator_env, monkeypatch):
monkeypatch.setattr(c, "_run_llm_review", _stub)
captured = []
- c.run_curator_review(on_summary=lambda s: captured.append(s), synchronous=True)
+ c.run_curator_review(
+ on_summary=lambda s: captured.append(s),
+ synchronous=True,
+ consolidate=True,
+ )
assert len(calls) == 1
assert "skill CURATOR" in calls[0] or "CURATOR" in calls[0]
@ -595,6 +599,69 @@ def test_run_review_skips_llm_when_no_candidates(curator_env, monkeypatch):
assert any("skipped" in s for s in captured)
def test_consolidate_default_off(curator_env):
"""Consolidation (the LLM umbrella pass) is OFF by default — only the
deterministic inactivity prune runs unless the user opts in."""
c = curator_env["curator"]
assert c.get_consolidate() is False
def test_consolidate_enabled_via_config(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {"consolidate": True})
assert c.get_consolidate() is True
def test_run_review_skips_llm_when_consolidate_off(curator_env, monkeypatch):
"""With consolidation off (the default), a run does the deterministic
prune but never spawns the LLM consolidation fork, even with candidates
present. The run is still recorded and a 'consolidation off' summary is
surfaced."""
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
u.mark_agent_created("a")
calls = []
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: (calls.append(prompt), "never-called")[1],
)
captured = []
c.run_curator_review(on_summary=lambda s: captured.append(s), synchronous=True)
assert calls == [] # LLM consolidation fork not invoked
assert any("consolidation off" in s for s in captured)
# The run is still recorded (deterministic prune happened).
state = c.load_state()
assert state["last_run_at"] is not None
assert state["run_count"] >= 1
def test_run_review_consolidate_override_runs_llm(curator_env, monkeypatch):
"""Passing consolidate=True overrides the config default (off) and drives
the LLM consolidation pass; this mirrors `hermes curator run --consolidate`."""
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
u.mark_agent_created("a")
calls = []
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: (calls.append(prompt), {
"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None,
})[1],
)
c.run_curator_review(synchronous=True, consolidate=True)
assert len(calls) == 1
def test_maybe_run_curator_respects_disabled(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {"enabled": False})


@ -31,8 +31,12 @@ If you want to see what the curator *would* do before it runs for real, run `her
A run has two phases:
- 1. **Automatic transitions** (deterministic, no LLM). Skills unused for `stale_after_days` (30) become `stale`; skills unused for `archive_after_days` (90) are moved to `~/.hermes/skills/.archive/`.
- 2. **LLM review** (single aux-model pass, `max_iterations=8`). The forked agent surveys the agent-created skills, can read any of them with `skill_view`, and decides per-skill whether to keep, patch (via `skill_manage`), consolidate overlapping ones, or archive via the terminal tool. Consolidation treats a skill as a full package: if a skill has `references/`, `templates/`, `scripts/`, `assets/`, or relative links to those paths, the curator must either keep it standalone, re-home the needed support files and rewrite paths, or archive the entire package unchanged — not flatten only `SKILL.md` into another skill's `references/` file.
+ 1. **Automatic transitions** (deterministic, no LLM). Skills unused for `stale_after_days` (30) become `stale`; skills unused for `archive_after_days` (90) are moved to `~/.hermes/skills/.archive/`. This is the always-on pruning behavior — it runs whenever the curator is enabled, with no aux-model cost.
+ 2. **LLM consolidation** (single aux-model pass, `max_iterations=8`) — **OFF by default**. When `curator.consolidate: true`, the forked agent surveys the agent-created skills, can read any of them with `skill_view`, and decides per-skill whether to keep, patch (via `skill_manage`), consolidate overlapping ones into class-level umbrellas, or archive via the terminal tool. Consolidation treats a skill as a full package: if a skill has `references/`, `templates/`, `scripts/`, `assets/`, or relative links to those paths, the curator must either keep it standalone, re-home the needed support files and rewrite paths, or archive the entire package unchanged — not flatten only `SKILL.md` into another skill's `references/` file.
:::info Consolidation is opt-in
By default the curator only **prunes** — the deterministic inactivity pass marks skills stale and archives long-unused ones. The opinionated LLM **consolidation** pass (umbrella-building, merging overlapping skills) is off by default because it costs aux-model tokens on every run and makes broad structural changes to your library. Turn it on with `curator.consolidate: true`, or run it once on demand with `hermes curator run --consolidate`.
:::
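
The deterministic prune phase can be illustrated with a short sketch. This is an assumption-laden illustration rather than the curator's actual `apply_automatic_transitions`: the function name `classify_skill`, the `"active"` and `"archived"` labels, and the inclusive (`>=`) threshold comparison are all hypothetical; only the 30-day and 90-day defaults and the `stale` state come from the text above.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER_DAYS = 30    # documented default for stale_after_days
ARCHIVE_AFTER_DAYS = 90  # documented default for archive_after_days


def classify_skill(
    last_used: datetime,
    now: datetime,
    stale_after_days: int = STALE_AFTER_DAYS,
    archive_after_days: int = ARCHIVE_AFTER_DAYS,
) -> str:
    """Hypothetical sketch: pick a state purely from idle time (no LLM)."""
    idle = now - last_used
    # Check the longer threshold first so long-unused skills are archived
    # rather than merely marked stale. Inclusive comparison is an assumption.
    if idle >= timedelta(days=archive_after_days):
        return "archived"
    if idle >= timedelta(days=stale_after_days):
        return "stale"
    return "active"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(classify_skill(now - timedelta(days=45), now))
```

Because the decision depends only on timestamps and two thresholds, this phase is repeatable and incurs no aux-model cost, which is why it can stay on by default.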
Pinned skills are off-limits to both the curator's auto-transitions and the agent's own `skill_manage` tool. See [Pinning a skill](#pinning-a-skill) below.
@ -47,10 +51,11 @@ curator:
min_idle_hours: 2
stale_after_days: 30
archive_after_days: 90
consolidate: false # LLM umbrella-building pass — opt-in (prune-only by default)
prune_builtins: true # archive unused bundled built-in skills too (hub skills always exempt)
```
- To disable entirely, set `curator.enabled: false`.
+ To disable entirely, set `curator.enabled: false`. To keep the always-on pruning but opt into LLM consolidation, set `curator.consolidate: true`.
### Running the review on a cheaper aux model
@ -85,8 +90,9 @@ Earlier releases used a one-off `curator.auxiliary.{provider,model}` block. That
```bash
hermes curator status # last run, counts, pinned list, LRU top 5
- hermes curator run # trigger a review now (blocks until the LLM pass finishes)
- hermes curator run --background # fire-and-forget: start the LLM pass in a background thread
+ hermes curator run # trigger a run now (blocks until done). Prune-only unless curator.consolidate: true
+ hermes curator run --consolidate # force the LLM consolidation pass on for this run, overriding the config default
+ hermes curator run --background # fire-and-forget: start the run in a background thread
hermes curator run --dry-run # preview only — report without any mutations
hermes curator backup # take a manual snapshot of ~/.hermes/skills/
hermes curator rollback # restore from the newest snapshot