Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.
Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:
1. Each working directory got its own full shadow git repo — no object
dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
/rollback listing. Loose objects accumulated forever.
3. Defaults: enabled=True, auto_prune=False — users paid the disk cost
without ever asking for /rollback.
Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.
Changes
-------
- tools/checkpoint_manager.py: full rewrite. Single bare store, per-project
refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>),
per-project metadata (store/projects/<hash>.json with workdir +
created_at + last_touch). On first v2 init, any pre-v2 per-directory
shadow repos are auto-migrated into legacy-<timestamp>/ so the new
store starts clean. _prune() now actually rewrites the per-project ref
to the last max_snapshots commits and runs git gc --prune=now. New
_enforce_size_cap() drops oldest commits round-robin across projects
when the store exceeds max_total_size_mb. _drop_oversize_from_index()
filters any single file larger than max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
(status / list / prune / clear / clear-legacy) for managing the store
outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
*.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
coverage for the pre-v2 prune path (per-directory shadow repos under
base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
shared store, new defaults, migration, and the new CLI;
reference/cli-commands.md documents 'hermes checkpoints'.
E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
--retention-days 0' removes its ref, index, and metadata; GC reclaims
the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
CLI without an agent running.
Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
| sidebar_position | sidebar_label | title | description |
|---|---|---|---|
| 8 | Checkpoints & Rollback | Checkpoints and /rollback | Filesystem safety nets for destructive operations using shadow git repos and automatic snapshots |
Checkpoints and /rollback
Hermes Agent can automatically snapshot your project before destructive operations and restore it with a single command. Checkpoints are opt-in as of v2: most users never use /rollback, and the shadow store's disk usage adds up over time, so the default is off.
Enable checkpoints per-session with --checkpoints:
hermes chat --checkpoints
Or enable globally in ~/.hermes/config.yaml:
checkpoints:
enabled: true
This safety net is powered by an internal Checkpoint Manager that keeps a single shared shadow git repository under ~/.hermes/checkpoints/store/ — your real project .git is never touched. Every project the agent works in shares the same store, so git's content-addressable object DB deduplicates across projects and across turns.
What Triggers a Checkpoint
Checkpoints are taken automatically before:
- File tools: write_file and patch
- Destructive terminal commands: rm, rmdir, cp, install, mv, sed -i, truncate, dd, shred, output redirects (>), and git reset/clean/checkout
The agent creates at most one checkpoint per directory per turn, so long-running sessions don't spam snapshots.
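The trigger rules above can be sketched as follows. This is a minimal illustration, not the matcher Hermes ships: `is_destructive` and `TurnCheckpointGate` are hypothetical names, and the check is deliberately conservative (any `>` in the command counts as a redirect, even inside quotes).

```python
import re
import shlex

# Command names taken from the list above; the matching logic is an assumption.
DESTRUCTIVE_BINARIES = {"rm", "rmdir", "cp", "install", "mv", "truncate", "dd", "shred"}
DESTRUCTIVE_GIT_SUBCOMMANDS = {"reset", "clean", "checkout"}


def is_destructive(command: str) -> bool:
    """Return True if a shell command looks like it may modify files."""
    # Inspect each simple command of a pipeline or command chain on its own.
    for segment in re.split(r"&&|\|\||[;|]", command):
        if ">" in segment:
            return True  # output redirect (conservative: also matches quoted '>')
        try:
            tokens = shlex.split(segment)
        except ValueError:
            return True  # unbalanced quotes: assume the worst
        if not tokens:
            continue
        head, rest = tokens[0], tokens[1:]
        if head in DESTRUCTIVE_BINARIES:
            return True
        if head == "sed" and any(
            t.startswith("-i") or t.startswith("--in-place") for t in rest
        ):
            return True
        if head == "git" and rest and rest[0] in DESTRUCTIVE_GIT_SUBCOMMANDS:
            return True
    return False


class TurnCheckpointGate:
    """Allow at most one checkpoint per directory per conversation turn."""

    def __init__(self) -> None:
        self._seen = set()

    def new_turn(self) -> None:
        self._seen.clear()

    def should_checkpoint(self, directory: str) -> bool:
        if directory in self._seen:
            return False
        self._seen.add(directory)
        return True
```

A false positive here only costs one extra snapshot, which is why the sketch errs toward returning True.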
Quick Reference
In-session slash commands:
| Command | Description |
|---|---|
| /rollback | List all checkpoints with change stats |
| /rollback <N> | Restore to checkpoint N (also undoes last chat turn) |
| /rollback diff <N> | Preview diff between checkpoint N and current state |
| /rollback <N> <file> | Restore a single file from checkpoint N |
CLI for inspecting and managing the store outside a session:
| Command | Description |
|---|---|
| hermes checkpoints | Show total size, project count, per-project breakdown |
| hermes checkpoints status | Same as bare checkpoints |
| hermes checkpoints list | Alias for status |
| hermes checkpoints prune | Force a sweep: delete orphans/stale, GC, enforce size cap |
| hermes checkpoints clear | Nuke the entire checkpoint base (asks first) |
| hermes checkpoints clear-legacy | Delete only the legacy-* archives from v1 migration |
How Checkpoints Work
At a high level:
- Hermes detects when tools are about to modify files in your working tree.
- Once per conversation turn (per directory), it:
  - Resolves a reasonable project root for the file.
  - Initialises or reuses the single shared shadow store at ~/.hermes/checkpoints/store/.
  - Stages into a per-project index, builds a tree, and commits to a per-project ref (refs/hermes/<project-hash>).
- These per-project refs form a checkpoint history that you can inspect and restore via /rollback.
flowchart LR
user["User command\n(hermes, gateway)"]
agent["AIAgent\n(run_agent.py)"]
tools["File & terminal tools"]
cpMgr["CheckpointManager"]
store["Shared shadow store\n~/.hermes/checkpoints/store/"]
user --> agent
agent -->|"tool call"| tools
tools -->|"before mutate\nensure_checkpoint()"| cpMgr
cpMgr -->|"git add/commit-tree/update-ref"| store
cpMgr -->|"OK / skipped"| tools
tools -->|"apply changes"| agent
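The "git add/commit-tree/update-ref" step in the diagram can be sketched with standard git plumbing. This is an illustration under stated assumptions, not the code in tools/checkpoint_manager.py: `snapshot_env` and `take_snapshot` are hypothetical names, the commit identity is made up, and the store is assumed to already exist as a bare repository (for example via `git init --bare`).

```python
import os
import subprocess
from typing import List, Optional


def snapshot_env(store: str, workdir: str, project_hash: str) -> dict:
    """Environment that points git at the shared store and a per-project index."""
    env = dict(os.environ)
    env.update({
        "GIT_DIR": store,                      # shared bare store
        "GIT_WORK_TREE": workdir,              # the user's project directory
        "GIT_INDEX_FILE": os.path.join(store, "indexes", project_hash),
        # commit-tree needs an identity; these values are placeholders.
        "GIT_AUTHOR_NAME": "hermes",
        "GIT_AUTHOR_EMAIL": "hermes@localhost",
        "GIT_COMMITTER_NAME": "hermes",
        "GIT_COMMITTER_EMAIL": "hermes@localhost",
    })
    return env


def _git(args: List[str], env: dict, check: bool = True) -> str:
    result = subprocess.run(
        ["git", *args], env=env, cwd=env["GIT_WORK_TREE"],
        check=check, capture_output=True, text=True,
    )
    return result.stdout.strip()


def take_snapshot(store: str, workdir: str, project_hash: str, message: str) -> Optional[str]:
    """Commit the current working tree to refs/hermes/<project_hash>.

    Returns the new commit id, or None when nothing changed.
    """
    env = snapshot_env(store, workdir, project_hash)
    ref = f"refs/hermes/{project_hash}"
    os.makedirs(os.path.join(store, "indexes"), exist_ok=True)

    _git(["add", "-A"], env)                   # stage into the per-project index
    tree = _git(["write-tree"], env)           # build a tree object from that index
    parent = _git(["rev-parse", "--verify", "--quiet", ref], env, check=False)
    if parent and _git(["rev-parse", f"{parent}^{{tree}}"], env) == tree:
        return None                            # no changes since the last snapshot

    cmd = ["commit-tree", tree, "-m", message]
    if parent:
        cmd += ["-p", parent]
    commit = _git(cmd, env)
    _git(["update-ref", ref, commit], env)     # advance the per-project ref
    return commit
```

Because the index lives under the store and the ref is namespaced per project, the project's own .git directory and index are never read or written by these commands.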
Configuration
Configure in ~/.hermes/config.yaml:
checkpoints:
enabled: false # master switch (default: false — opt-in)
max_snapshots: 20 # max checkpoints per project (enforced via ref rewrite + gc)
max_total_size_mb: 500 # hard cap on total store size; oldest commits dropped
max_file_size_mb: 10 # skip any single file larger than this
# Auto-maintenance (on by default): sweep ~/.hermes/checkpoints/ at startup
# and delete project entries whose working directory no longer exists
# (orphans) or whose last_touch is older than retention_days. Runs at most
# once per min_interval_hours, tracked via a .last_prune marker.
auto_prune: true
retention_days: 7
delete_orphans: true
min_interval_hours: 24
To disable everything:
checkpoints:
enabled: false
auto_prune: false
When enabled: false, the Checkpoint Manager is a no-op and never attempts git operations. When auto_prune: false, the store grows until you run hermes checkpoints prune manually.
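As a rough picture of how these keys and defaults fit together, the sketch below models the `checkpoints` section as a dataclass. The field names and default values come from the config above; the `CheckpointConfig` class and `load_checkpoint_config` helper are hypothetical and only illustrate one way to merge user overrides over the defaults.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class CheckpointConfig:
    enabled: bool = False            # master switch (opt-in)
    max_snapshots: int = 20          # per-project history length
    max_total_size_mb: int = 500     # cap on the whole store
    max_file_size_mb: int = 10       # larger files are left out of snapshots
    auto_prune: bool = True
    retention_days: int = 7
    delete_orphans: bool = True
    min_interval_hours: int = 24


def load_checkpoint_config(raw: Optional[dict]) -> CheckpointConfig:
    """Overlay the user's `checkpoints` section on the defaults.

    `raw` is assumed to be the already-parsed config.yaml as a dict.
    Unknown keys are ignored rather than raising.
    """
    section = (raw or {}).get("checkpoints") or {}
    known = {f.name for f in fields(CheckpointConfig)}
    return CheckpointConfig(**{k: v for k, v in section.items() if k in known})
```

For example, `load_checkpoint_config({"checkpoints": {"enabled": True}})` turns checkpoints on while leaving every other setting at its default.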
Listing Checkpoints
From a CLI session:
/rollback
Hermes responds with a formatted list showing change statistics:
📸 Checkpoints for /path/to/project:
1. 4270a8c 2026-03-16 04:36 before patch (1 file, +1/-0)
2. eaf4c1f 2026-03-16 04:35 before write_file
3. b3f9d2e 2026-03-16 04:34 before terminal: sed -i s/old/new/ config.py (1 file, +1/-1)
/rollback <N> restore to checkpoint N
/rollback diff <N> preview changes since checkpoint N
/rollback <N> <file> restore a single file from checkpoint N
Inspecting the Store from the Shell
hermes checkpoints
Sample output:
Checkpoint base: /home/you/.hermes/checkpoints
Total size: 142.3 MB
store/ 138.1 MB
legacy-* 4.2 MB
Projects: 12
WORKDIR COMMITS LAST TOUCH STATE
/home/you/code/hermes-agent 20 2h ago live
/home/you/code/experiments/rl-runner 8 1d ago live
/home/you/code/old-prototype 3 9d ago orphan
...
Legacy archives (1):
legacy-20260506-050616 4.2 MB
Clear with: hermes checkpoints clear-legacy
Force a full sweep (ignores the 24h idempotency marker):
hermes checkpoints prune --retention-days 3 --max-size-mb 200
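The decisions a sweep makes can be sketched as two small predicates. This is an assumption-laden illustration: `classify_project` and `prune_due` are hypothetical names, and `last_touch` and the `.last_prune` marker are assumed here to hold Unix timestamps in seconds, which the source does not specify.

```python
import os
import time
from typing import Optional

SECONDS_PER_DAY = 86400
SECONDS_PER_HOUR = 3600


def classify_project(
    workdir: str,
    last_touch: float,
    retention_days: float,
    delete_orphans: bool = True,
    now: Optional[float] = None,
) -> str:
    """Return 'orphan', 'stale', or 'live' for one project entry."""
    now = time.time() if now is None else now
    if delete_orphans and not os.path.isdir(workdir):
        return "orphan"   # working directory no longer exists
    if now - last_touch > retention_days * SECONDS_PER_DAY:
        return "stale"    # untouched for longer than retention_days
    return "live"


def prune_due(
    last_prune: Optional[float],
    min_interval_hours: float,
    now: Optional[float] = None,
    force: bool = False,
) -> bool:
    """Decide whether a sweep should run.

    force=True models the manual CLI sweep, which ignores the marker.
    """
    if force or last_prune is None:
        return True
    now = time.time() if now is None else now
    return now - last_prune >= min_interval_hours * SECONDS_PER_HOUR
```

Entries classified as orphan or stale are the ones a sweep would delete before running GC and enforcing the size cap.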
Previewing Changes with /rollback diff
Before committing to a restore, preview what has changed since a checkpoint:
/rollback diff 1
This shows a git diff stat summary followed by the actual diff.
Restoring with /rollback
/rollback 1
Behind the scenes, Hermes:
- Verifies the target commit exists in the shadow store.
- Takes a pre-rollback snapshot of the current state so you can "undo the undo" later.
- Restores tracked files in your working directory.
- Undoes the last conversation turn so the agent's context matches the restored filesystem state.
Single-File Restore
Restore just one file from a checkpoint without affecting the rest of the directory:
/rollback 1 src/broken_file.py
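The verify and restore steps map naturally onto two stock git commands. The sketch below only builds the argument lists; `rollback_commands` is a hypothetical helper, and whether Hermes uses exactly these commands is an assumption. They would need to run with the git directory, work tree, and index pointed at the shadow store.

```python
from typing import List, Optional


def rollback_commands(commit: str, file: Optional[str] = None) -> List[List[str]]:
    """Build git invocations for a full or single-file restore.

    The pre-rollback snapshot and the conversation-turn undo are not
    modelled here.
    """
    target = file if file is not None else "."
    return [
        # Fails with a non-zero exit status if the commit is missing from the store.
        ["git", "cat-file", "-e", f"{commit}^{{commit}}"],
        # Overwrite the working-tree paths with their content at that commit.
        ["git", "checkout", commit, "--", target],
    ]
```

Note that `git checkout <commit> -- .` restores paths recorded in the commit but leaves files created afterwards in place, so a real implementation may need an extra cleanup step.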
Safety and Performance Guards
- Git availability: if git is not found on PATH, checkpoints are transparently disabled.
- Directory scope: Hermes skips overly broad directories (root /, home $HOME).
- Repository size: directories with more than 50,000 files are skipped.
- Per-file size cap: files larger than max_file_size_mb (default 10 MB) are excluded from the snapshot. This prevents accidentally swallowing datasets, model weights, or generated media.
- Total store size cap: when the store exceeds max_total_size_mb (default 500 MB), the oldest commit per project is dropped round-robin until the store is under the cap.
- Real pruning: max_snapshots is enforced by rewriting the per-project ref and running git gc --prune=now afterwards, so loose objects don't accumulate.
- No-change snapshots: if there are no changes since the last snapshot, the checkpoint is skipped.
- Non-fatal errors: all errors inside the Checkpoint Manager are logged at debug level; your tools continue to run.
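The round-robin behaviour of the total size cap can be illustrated on plain data. This is a simplified model, not the real `_enforce_size_cap()`: the `enforce_size_cap` function below is hypothetical, and it treats each commit as having an independent size, whereas in the real store commits share deduplicated objects and space is only reclaimed after GC.

```python
from typing import Dict, List


def enforce_size_cap(projects: Dict[str, List[int]], cap: int) -> Dict[str, int]:
    """Drop oldest commits, one project at a time, until the total fits the cap.

    `projects` maps a project id to its commit sizes, oldest first, and is
    modified in place. Returns how many commits were dropped per project.
    """
    dropped = {name: 0 for name in projects}
    total = sum(sum(sizes) for sizes in projects.values())
    while total > cap:
        progressed = False
        for name, sizes in projects.items():
            if total <= cap:
                break
            if sizes:
                total -= sizes.pop(0)   # oldest commit of this project
                dropped[name] += 1
                progressed = True
        if not progressed:
            break                       # nothing left to drop
    return dropped
```

Taking one commit from each project per pass spreads the loss of history across projects instead of emptying the largest project first.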
Where Checkpoints Live
~/.hermes/checkpoints/
├── store/ # single shared bare git repo
│ ├── HEAD, objects/ # git internals (shared across projects)
│ ├── refs/hermes/<hash> # per-project branch tip
│ ├── indexes/<hash> # per-project git index
│ ├── projects/<hash>.json # workdir + created_at + last_touch
│ └── info/exclude
├── .last_prune # auto-prune idempotency marker
└── legacy-<ts>/ # archived pre-v2 per-project shadow repos
Each <hash> is derived from the absolute path of the working directory. You normally never need to touch these manually — use hermes checkpoints status / prune / clear instead.
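To make the layout concrete, the sketch below derives the per-project locations from a working directory. The exact hash function is not stated in this document, so the truncated SHA-256 used here is an assumption, and `project_hash` and `project_paths` are hypothetical names.

```python
import hashlib
import os


def project_hash(workdir: str) -> str:
    """Stable identifier for a working directory, based on its absolute path.

    Assumption: SHA-256 truncated to 16 hex characters; the real scheme may differ.
    """
    absolute = os.path.abspath(os.path.expanduser(workdir))
    return hashlib.sha256(absolute.encode("utf-8")).hexdigest()[:16]


def project_paths(base: str, workdir: str) -> dict:
    """Locations of one project's ref, index, and metadata inside the store."""
    h = project_hash(workdir)
    store = os.path.join(base, "store")
    return {
        "ref": f"refs/hermes/{h}",
        "index": os.path.join(store, "indexes", h),
        "metadata": os.path.join(store, "projects", f"{h}.json"),
    }
```

Because the identifier depends only on the path, two worktrees of the same project get separate refs and indexes while still sharing objects in the common store.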
Migration from v1
Before the v2 rewrite, each working directory got its own complete shadow git repo directly under ~/.hermes/checkpoints/<hash>/. That layout couldn't dedup objects across projects and had a documented no-op pruner — the store would grow without bound.
On first v2 run, any pre-v2 shadow repos are moved into ~/.hermes/checkpoints/legacy-<timestamp>/ so the new single-store layout starts clean. Old /rollback history is still reachable by manually inspecting the legacy archive with git; once you're confident you don't need it, run:
hermes checkpoints clear-legacy
to reclaim the space. Legacy archives are also swept by auto_prune after retention_days.
Best Practices
- Enable checkpoints only when you need them, with hermes chat --checkpoints or a per-profile enabled: true.
- Use /rollback diff before restoring: preview what will change to pick the right checkpoint.
- Use /rollback instead of git reset when you want to undo agent-driven changes only.
- Check hermes checkpoints status occasionally if you use checkpoints regularly: it shows which projects are active and what the store costs you.
- Combine with Git worktrees for maximum safety: keep each Hermes session in its own worktree/branch, with checkpoints as an extra layer.
For running multiple agents in parallel on the same repo, see the guide on Git worktrees.