hermes-agent/website/docs/user-guide
Teknium a0fedfbb1b
feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709)
Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.

Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:

1. Each working directory got its own full shadow git repo — no object
   dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
   /rollback listing. Loose objects accumulated forever.
3. Defaults: enabled=True, auto_prune=False — users paid the disk cost
   without ever asking for /rollback.

Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.

Changes
-------
- tools/checkpoint_manager.py: full rewrite. Single bare store, per-project
  refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>),
  per-project metadata (store/projects/<hash>.json with workdir +
  created_at + last_touch). On first v2 init, any pre-v2 per-directory
  shadow repos are auto-migrated into legacy-<timestamp>/ so the new
  store starts clean. _prune() now actually rewrites the per-project ref
  to the last max_snapshots commits and runs git gc --prune=now. New
  _enforce_size_cap() drops oldest commits round-robin across projects
  when the store exceeds max_total_size_mb. _drop_oversize_from_index()
  filters any single file larger than max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
  (status / list / prune / clear / clear-legacy) for managing the store
  outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
  auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
  Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
  *.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
  AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
  coverage for the pre-v2 prune path (per-directory shadow repos under
  base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
  shared store, new defaults, migration, and the new CLI;
  reference/cli-commands.md documents 'hermes checkpoints'.

E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
  7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
  --retention-days 0' removes its ref, index, and metadata; GC reclaims
  the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
  tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
  CLI without an agent running.

Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
2026-05-06 05:44:35 -07:00
..
features docs: update VS Code setup instructions for ACP Client integration 2026-05-05 14:16:30 -07:00
messaging docs: add Open WebUI bootstrap script 2026-05-05 14:12:09 -07:00
skills docs(codex): clarify OAuth auth prerequisite 2026-05-05 13:53:55 -07:00
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
checkpoints-and-rollback.md feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
cli.md docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) 2026-04-29 20:55:59 -07:00
configuration.md fix: add Turkish locale references in config, tests, and docs 2026-05-05 17:29:12 -07:00
configuring-models.md docs: document custom model aliases for /model command (#20475) 2026-05-05 19:11:20 -07:00
docker.md docs(docker): document API_SERVER_* env vars for exposing the OpenAI-compatible endpoint 2026-05-05 13:48:37 -07:00
git-worktrees.md docs: restructure site navigation — promote features and platforms to top-level (#4116) 2026-03-30 18:39:51 -07:00
profiles.md Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329) 2026-05-04 00:43:58 +05:30
security.md docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) 2026-04-29 20:55:59 -07:00
sessions.md docs: add Microsoft Teams to platform lists across docs 2026-05-04 20:59:18 -07:00
tui.md docs: two-week gap sweep — platforms, CLI, config, TUI, hooks, providers (#17727) 2026-04-29 20:32:37 -07:00
windows-wsl-quickstart.md docs(i18n): add zh-Hans Tool Gateway, image gen, and Windows WSL guide 2026-05-05 14:14:03 -07:00