Mirror of https://github.com/NousResearch/hermes-agent.git, synced 2026-05-09 03:11:58 +00:00
feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709)
Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.
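
The dedup claim follows from git's content addressing: a blob's object id depends only on its bytes, so the same file seen in two working directories maps to one object in a shared store. A minimal sketch of that property, assuming the default SHA-1 object format; the `store` dict is a stand-in for the object DB, not the real checkpoint manager:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes, as they might appear in two different working directories.
shared_in_project_a = b"hello\n"
shared_in_project_b = b"hello\n"

# Stand-in for a shared object DB keyed by object id (illustrative only).
store = {}
for content in (shared_in_project_a, shared_in_project_b):
    store[git_blob_id(content)] = content

print(len(store))  # one entry: both copies share an id
```

With one store per directory, as in the pre-v2 layout, each copy would have lived in a separate object DB.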
Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:
1. Each working directory got its own full shadow git repo, with no object
   dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
   /rollback listing. Loose objects accumulated forever.
3. Defaults were enabled=True, auto_prune=False, so users paid the disk cost
   without ever asking for /rollback.
Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.
Changes
-------
- tools/checkpoint_manager.py: full rewrite.
  - Single bare store with per-project refs (refs/hermes/<hash>), per-project
    indexes (store/indexes/<hash>), and per-project metadata
    (store/projects/<hash>.json with workdir + created_at + last_touch).
  - On first v2 init, any pre-v2 per-directory shadow repos are auto-migrated
    into legacy-<timestamp>/ so the new store starts clean.
  - _prune() now actually rewrites the per-project ref to the last
    max_snapshots commits and runs git gc --prune=now.
  - New _enforce_size_cap() drops oldest commits round-robin across projects
    when the store exceeds max_total_size_mb.
  - _drop_oversize_from_index() filters any single file larger than
    max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
(status / list / prune / clear / clear-legacy) for managing the store
outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
*.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
coverage for the pre-v2 prune path (per-directory shadow repos under
base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
shared store, new defaults, migration, and the new CLI;
reference/cli-commands.md documents 'hermes checkpoints'.
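
The round-robin size cap described for _enforce_size_cap() can be sketched as a pure-Python simulation. This is not the code from tools/checkpoint_manager.py: the function name, the per-commit sizes, the alphabetical visiting order, and the rule that a project always keeps its newest commit are all assumptions made for illustration.

```python
from typing import Dict, List


def enforce_size_cap(projects: Dict[str, List[int]], max_total: int) -> Dict[str, List[int]]:
    """Drop the oldest commit from each project in turn until the total fits.

    `projects` maps a project name to its commit sizes, oldest first.
    Sizes are notional; the real store would measure disk usage instead.
    """
    # Copy so the caller's data is left untouched.
    remaining = {name: list(sizes) for name, sizes in projects.items()}

    def total() -> int:
        return sum(sum(sizes) for sizes in remaining.values())

    while total() > max_total:
        dropped_any = False
        for name in sorted(remaining):
            if total() <= max_total:
                break
            # Assumption: a project always keeps its newest commit.
            if len(remaining[name]) > 1:
                remaining[name].pop(0)
                dropped_any = True
        if not dropped_any:
            break  # nothing left to drop without emptying a project
    return remaining


result = enforce_size_cap({"a": [100, 100, 100], "b": [200, 50]}, max_total=300)
print(result)
```

Taking one commit per project per pass spreads the loss of history across projects instead of emptying the largest one first.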
E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
--retention-days 0' removes its ref, index, and metadata; GC reclaims
the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
CLI without an agent running.
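
Two of the checks above, the max_snapshots limit and the per-file size filter, reduce to simple list and dict operations. The sketch below mirrors those two scenarios; the helper names are hypothetical, the inputs are plain Python values in place of git refs and index entries, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
from typing import Dict, List

BYTES_PER_MB = 1024 * 1024  # assumption: the real code may use 10**6 instead


def keep_last_snapshots(commits: List[str], max_snapshots: int) -> List[str]:
    """Return only the newest max_snapshots commits (input is oldest first)."""
    if max_snapshots <= 0:
        return []
    return commits[-max_snapshots:]


def drop_oversize(files: Dict[str, int], max_file_size_mb: int) -> Dict[str, int]:
    """Filter out any file whose size in bytes exceeds the per-file limit."""
    limit = max_file_size_mb * BYTES_PER_MB
    return {path: size for path, size in files.items() if size <= limit}


# Six commits with max_snapshots=3 leaves the three newest.
commits = ["c1", "c2", "c3", "c4", "c5", "c6"]
print(keep_last_snapshots(commits, 3))

# A 2 MB weights.bin is dropped at max_file_size_mb=1; the source file stays.
files = {"main.py": 1_200, "weights.bin": 2 * BYTES_PER_MB}
print(sorted(drop_oversize(files, 1)))
```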
Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
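
The practical effect of the flipped defaults can be sketched as follows. The field names and default values come from this commit message; the CheckpointConfig class and load_checkpoint_config() helper are hypothetical and do not reflect how hermes_cli/config.py is actually structured.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CheckpointConfig:  # hypothetical name, not from the commit
    enabled: bool = False
    max_snapshots: int = 20
    auto_prune: bool = True
    max_total_size_mb: int = 500
    max_file_size_mb: int = 10


def load_checkpoint_config(user_settings: dict) -> CheckpointConfig:
    """Overlay the user's 'checkpoints' settings on the v2 defaults."""
    known = CheckpointConfig.__dataclass_fields__
    overrides = {k: v for k, v in user_settings.items() if k in known}
    return replace(CheckpointConfig(), **overrides)


# With no user settings, checkpoints (and therefore /rollback) stay off.
print(load_checkpoint_config({}).enabled)

# Opting in restores the pre-v2 behaviour of taking snapshots.
print(load_checkpoint_config({"enabled": True}).enabled)
```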
Parent: b045e7a2ba
Commit: a0fedfbb1b
10 changed files with 1965 additions and 715 deletions
@@ -54,6 +54,7 @@ hermes [global-options] <command> [subcommand/options]
 | `hermes dump` | Copy-pasteable setup summary for support/debugging. |
 | `hermes debug` | Debug tools — upload logs and system info for support. |
 | `hermes backup` | Back up Hermes home directory to a zip file. |
+| `hermes checkpoints` | Inspect / prune / clear `~/.hermes/checkpoints/` (the shadow store used by `/rollback`). Run with no args for a status overview. |
 | `hermes import` | Restore a Hermes backup from a zip file. |
 | `hermes logs` | View, tail, and filter agent/gateway/error log files. |
 | `hermes config` | Show, edit, migrate, and query configuration files. |
@@ -579,6 +580,44 @@ hermes backup --quick # Quick state-only snapshot
 hermes backup --quick --label "pre-upgrade" # Quick snapshot with label
 ```
 
+## `hermes checkpoints`
+
+```bash
+hermes checkpoints [COMMAND]
+```
+
+Inspect and manage the shadow git store at `~/.hermes/checkpoints/` — the storage layer behind the in-session `/rollback` command. Safe to run any time; does not require the agent to be running.
+
+| Subcommand | Description |
+|------------|-------------|
+| `status` (default) | Show total size, project count, and per-project breakdown. Bare `hermes checkpoints` is equivalent. |
+| `list` | Alias for `status`. |
+| `prune` | Force a cleanup sweep — delete orphan and stale projects, GC the store, enforce the size cap. Ignores the 24h idempotency marker. |
+| `clear` | Delete the entire checkpoint base. Irreversible; asks for confirmation unless `-f`. |
+| `clear-legacy` | Delete only the `legacy-<timestamp>/` archives produced by the v1→v2 migration. |
+
+### Options
+
+| Option | Subcommand | Description |
+|--------|------------|-------------|
+| `--limit N` | `status`, `list` | Max projects to list (default 20). |
+| `--retention-days N` | `prune` | Drop projects whose `last_touch` is older than N days (default 7). |
+| `--max-size-mb N` | `prune` | After the orphan/stale pass, drop the oldest commit per project until total store size ≤ N MB (default 500). |
+| `--keep-orphans` | `prune` | Skip deleting projects whose working directory no longer exists. |
+| `-f`, `--force` | `clear`, `clear-legacy` | Skip the confirmation prompt. |
+
+### Examples
+
+```bash
+hermes checkpoints # status overview
+hermes checkpoints prune --retention-days 3 # aggressive cleanup
+hermes checkpoints prune --max-size-mb 200 # tighten size cap once
+hermes checkpoints clear-legacy -f # drop v1 archive dirs
+hermes checkpoints clear -f # wipe everything
+```
+
+See [Checkpoints and `/rollback`](../user-guide/checkpoints-and-rollback.md) for the full architecture and the in-session commands.
+
 ## `hermes import`
 
 ```bash