hermes-agent/plugins/hermes-achievements/docs/achievements-performance-spec.md
Teknium 62a5d7207d
feat(plugins): bundle hermes-achievements + scan full session history (#17754)
* feat(plugins): bundle hermes-achievements, scan full session history

Ships @PCinkusz's hermes-achievements dashboard plugin (https://github.com/PCinkusz/hermes-achievements) as a bundled plugin at plugins/hermes-achievements/ and fixes a bug in the scan path that made the plugin only see the first 200 sessions — making lifetime badges (50k tool calls, 75k errors, etc.) unreachable on long-running installs.

Changes:

- plugins/hermes-achievements/: vendor v0.3.1 verbatim (manifest, dist/, plugin_api.py, tests, docs, README).
- plugins/hermes-achievements/dashboard/plugin_api.py:
  * scan_sessions(): limit=None now scans ALL sessions via SQLite LIMIT -1. Previously capped at 200, so users with 8000+ sessions saw ~2% of their history.
  * evaluate_all(): first-ever scans run in a background thread so the dashboard request path never blocks. Stale snapshots serve immediately while a background refresh runs. force=True still blocks synchronously for manual /rescan.
  * _build_pending_snapshot(), _start_background_scan(), _run_scan_and_update_cache(): supporting plumbing + idempotent thread spawn.
- tests/plugins/test_achievements_plugin.py: new tests covering the 200-cap regression, the background-scan first-run flow, stale-serve-plus-background-refresh, forced sync rescan, and scan-thread idempotency.
- website/docs/user-guide/features/built-in-plugins.md: lists hermes-achievements in the bundled-plugins table and documents API endpoints, state files, and performance characteristics.

E2E validated against a real 8564-session ~6.4GB state.db:
  * Cold scan: 13m 19s (one-time, backgrounded — UI never blocks)
  * Warm rescan: 1.47s (8563/8564 sessions reused from checkpoint cache)
  * 57/60 achievements unlocked, 3 discovered — aggregates like total_tool_calls=259958, total_errors=164213, skill_events=368243 correctly surface lifetime badges that the 200-cap made unreachable.

Original credit: @PCinkusz (MIT-licensed). Upstream repo remains the staging ground for new badges; this bundle keeps the dashboard feature parity with Hermes core changes.

* feat(achievements): publish partial snapshots during cold scan

Previously a cold scan on a large session DB (13min on 8564 sessions)
showed zero badges for the entire duration, then every badge at once
when the scan completed. A dashboard refresh mid-scan was indistinguishable
from a fresh install with no history.

Now the scanner publishes a partial snapshot to _SNAPSHOT_CACHE every
250 sessions, so each refresh during a cold scan surfaces more badges
incrementally.

Mechanism:
- scan_sessions() takes an optional progress_callback fired every
  progress_every sessions with (sessions_so_far, scanned, total).
- _compute_from_scan() is extracted from compute_all() and gains an
  is_partial flag that skips writing to state.json — we don't want
  to record unlocked_at based on a half-complete aggregate that a
  later session might rebalance.
- _run_scan_and_update_cache() installs a publisher callback that
  builds a partial snapshot, marks it mode='in_progress', and writes
  it to the cache with age=0 so the UI keeps polling /scan-status
  and picks up the final snapshot when the scan completes.
- Manual /rescan (force=True) disables partial publishing — the
  caller is blocking on the final result anyway.

E2E against real 8564-session state.db (polled cache every 10s):
  t=10s: cache empty
  t=20s: 250/8564 scanned, 35 unlocked, 25 discovered
  t=40s: 500/8564 scanned, 42 unlocked, 18 discovered
  t=60s: 1000/8564 scanned, 49 unlocked, 11 discovered
  ...

Tests: 9/9 pass (2 new — partial snapshot publication + no-persist-on-partial).
Upstream unittest suite: 10/10 pass.

* feat(achievements): in-progress scan banner with live % progress

Previously the dashboard showed zero badges silently during long cold
scans (13min on 8564 sessions). The backend was publishing partial
snapshots every 250 sessions, but the bundled UI didn't surface any
indicator that a scan was running — it just rendered the main page
with whatever counts were currently published and no way for the user
to know more progress was coming.

UI changes (dist/index.js, dist/style.css):

- Added a scan-in-progress banner rendered between the hero and stats
  when scan_meta.mode is 'pending' or 'in_progress'. Shows:
    BUILDING ACHIEVEMENT PROFILE…
    Scanned 1,750 of 8,564 sessions · 20%. Badges unlock as more history streams in.
  with a pulsing teal indicator and a filling teal/cyan progress bar.
  Disappears the moment the backend flips to 'full' or 'incremental'.

- Added an auto-poller via useEffect — while scanInFlight is true the
  page re-fetches /achievements every 4s WITHOUT toggling the loading
  skeleton, so unlock counts tick up visibly without the user refreshing.
  The effect cleans itself up when the scan finishes.

- Added refresh() (re-fetch, no loading flip) alongside the existing
  load() (full reload, used by the Rescan button).

Attribution preserved:

- Added a header comment to index.js crediting @PCinkusz
  (https://github.com/PCinkusz/hermes-achievements, MIT) as the
  original author, noting the banner is a layered addition on top
  of the original dist bundle.
- Matching header comment in style.css, flagging the new
  .ha-scan-banner* rules as the local addition.

Live-verified end to end:

- Spun up `hermes dashboard --port 9229 --no-open` against a fresh
  HERMES_HOME symlinked to the real 8564-session state.db.
- Opened /achievements in a browser, confirmed the banner renders with
  live progress: 'Scanned 1,000 of 8,564 sessions · 11%' → updates to
  '1,250 ... · 14%' → '1,750 ... · 20%' without user interaction,
  matching the backend's partial publications.
- Stats row simultaneously climbed from 35 → 49 → 53 unlocked as
  more history streamed in.
- Vision analysis of the rendered page confirms the banner styling
  matches the rest of the dashboard (dark card bg, teal accent, same
  small-caps typography, pulsing indicator reusing ha-pulse keyframes).
2026-04-29 23:23:57 -07:00


Hermes Achievements Performance Spec (Post-Hackathon)

Status: Draft (no code changes yet)
Owner: hermes-achievements plugin
Scope: dashboard/plugin_api.py + dashboard/dist/index.js request behavior
Decision: Drop /overview and top-banner slots; keep only the Achievements tab data path.


1) Problem Statement

The plugin endpoints /achievements and /overview each run a full history recomputation (evaluate_all()), which performs a full SessionDB scan on every request.

Observed on this machine/repo:

  • ~83 sessions
  • ~7,125 messages
  • ~3,623 tool calls
  • evaluate_all() ~13-16s per call
  • /achievements ~13-15s per call
  • /overview ~12-15s per call
  • Overlap between endpoints increases perceived wait.

Given current product direction, /overview and cross-tab top-banner slots are not needed.


2) Goals

  • Keep achievement correctness unchanged.
  • Keep all Achievements-tab UX/data (unlocked/discovered/secrets/highest/latest/cards).
  • Remove unused summary path (/overview) and slot wiring.
  • Make Achievements tab faster by avoiding duplicate endpoint pathways.
  • Ensure at most one heavy scan can run at a time.

Non-goals (phase 1):

  • Rewriting achievement rules.
  • Changing badge semantics/states.

3) Endpoint Semantics (Target)

GET /api/plugins/hermes-achievements/achievements

Single source endpoint for Achievements UI. Returns full payload used by the tab:

  • achievements
  • unlocked_count
  • discovered_count
  • secret_count
  • total_count
  • error

POST /api/plugins/hermes-achievements/rescan (optional)

Manual refresh trigger. Prefer async trigger + immediate status response.

GET /api/plugins/hermes-achievements/scan-status (optional, new)

Reports scan state for UX/ops.

Removed

  • GET /api/plugins/hermes-achievements/overview

4) UI Scope (Target)

Keep:

  • Achievements page/tab (/achievements in plugin tab manifest)
  • All existing Achievements tab stats/cards/filters

Remove:

  • Top-banner summary slot components using sessions:top and analytics:top
  • Any frontend call path to /overview

5) Runtime State Machine (for /achievements)

  • FRESH: cached snapshot age <= TTL
  • STALE: snapshot exists but expired
  • SCANNING: background recompute running
  • FAILED: last recompute failed, last good snapshot still served

Rules:

  1. FRESH -> serve immediately.
  2. STALE + not scanning -> serve stale snapshot immediately and launch background refresh.
  3. SCANNING -> do not start another scan; join the in-flight single-flight job.
  4. No snapshot yet -> allow one blocking bootstrap scan.

6) Caching & Invalidation

Phase 1

  • In-memory cache + persisted snapshot file.
  • TTL: 60-180 seconds (configurable).
  • Single-flight dedupe for scan requests.
  • Persist plugin data under:
    • ~/.hermes/plugins/hermes-achievements/scan_snapshot.json

Phase 2

  • Incremental scan checkpoints with per-session fingerprints.
  • Persist checkpoint data under:
    • ~/.hermes/plugins/hermes-achievements/scan_checkpoint.json
  • Checkpoint stores, per session:
    • session_id
    • fingerprint (updated_at, message_count, or hash)
    • cached per-session contribution used for aggregate recomposition
  • Scan policy:
    • First run: full scan and materialize snapshot + checkpoint.
    • Next runs: process only new/changed sessions, reuse unchanged contributions.
  • Full rebuild only on:
    • schema/version change
    • checkpoint corruption
    • explicit full rescan
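
The Phase 2 scan policy can be sketched as follows. The session dict keys, the (updated_at, message_count) fingerprint, and the `compute` callback are assumptions made for illustration; the real checkpoint schema may differ.

```python
from typing import Callable, Dict, List, Tuple

Contribution = Dict[str, int]  # assumed shape: per-session counters


def incremental_scan(
    sessions: List[dict],
    checkpoint: Dict[str, dict],
    compute: Callable[[dict], Contribution],
) -> Tuple[Contribution, Dict[str, dict], int]:
    """Recompute only new or changed sessions; reuse cached contributions otherwise.

    Returns (aggregate totals, new checkpoint, number of sessions reused).
    """
    new_checkpoint: Dict[str, dict] = {}
    totals: Contribution = {}
    reused = 0
    for session in sessions:
        sid = session["session_id"]
        # A list (not a tuple) so the fingerprint survives a JSON round trip unchanged.
        fingerprint = [session["updated_at"], session["message_count"]]
        cached = checkpoint.get(sid)
        if cached is not None and cached["fingerprint"] == fingerprint:
            contribution = cached["contribution"]
            reused += 1
        else:
            contribution = compute(session)
        new_checkpoint[sid] = {"fingerprint": fingerprint, "contribution": contribution}
        # Recompose the aggregate from per-session contributions.
        for key, value in contribution.items():
            totals[key] = totals.get(key, 0) + value
    return totals, new_checkpoint, reused
```

Because the new checkpoint is rebuilt from the current session list, sessions that no longer exist drop out of both the checkpoint and the totals without special handling.
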

7) Frontend Contract

  • Achievements tab requests /achievements once on mount.
  • No slot-based summary fetches.
  • If response says is_stale=true, UI may display “Updating in background”.
  • Avoid duplicate mount-triggered calls and cancel stale requests on navigation.

8) SLO Targets

  • /achievements p95 < 1s (cached)
  • Max concurrent heavy scans: 1
  • Background refresh should not block UI

9) Observability Requirements

Track:

  • scan count
  • scan duration avg/p95
  • dedupe hit count (joined in-flight scans)
  • stale-served count
  • failures + last error

Expose minimal diagnostics in /scan-status.
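
A small in-process counter object would be enough to back these diagnostics. This is a sketch only: `ScanMetrics`, its field names, and the `to_status()` payload keys are hypothetical and do not describe the actual /scan-status response.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScanMetrics:
    """In-memory counters for the tracked items above (hypothetical names)."""

    scan_count: int = 0
    dedupe_hits: int = 0
    stale_served: int = 0
    failures: int = 0
    last_error: Optional[str] = None
    durations_s: List[float] = field(default_factory=list)

    def record_scan(self, duration_s: float, error: Optional[str] = None) -> None:
        self.scan_count += 1
        self.durations_s.append(duration_s)
        if error is not None:
            self.failures += 1
            self.last_error = error

    def avg(self) -> Optional[float]:
        if not self.durations_s:
            return None
        return sum(self.durations_s) / len(self.durations_s)

    def p95(self) -> Optional[float]:
        """Nearest-rank 95th percentile, using integer arithmetic for the rank."""
        if not self.durations_s:
            return None
        ordered = sorted(self.durations_s)
        rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
        return ordered[rank - 1]

    def to_status(self) -> dict:
        """Shape a minimal diagnostics payload (keys are assumptions)."""
        return {
            "scan_count": self.scan_count,
            "scan_duration_avg_s": self.avg(),
            "scan_duration_p95_s": self.p95(),
            "dedupe_hits": self.dedupe_hits,
            "stale_served": self.stale_served,
            "failures": self.failures,
            "last_error": self.last_error,
        }
```
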


10) Backward Compatibility

  • Keep /achievements response shape backward-compatible.
  • Removing /overview is acceptable because slot UI is intentionally removed.
  • If temporary compatibility is needed, /overview can return a static deprecation response for one release.

11) Risks

  • Stale data confusion -> mitigate with generated_at and explicit refresh status.
  • Cache invalidation bugs -> start with conservative TTL + manual rescan.
  • Concurrency bugs -> protect scan section with lock/single-flight guard.
  • Session mutation edge cases -> use per-session fingerprint invalidation (not global timestamp only).

12) Persistence Files (Explicit)

Plugin state directory:

  • ~/.hermes/plugins/hermes-achievements/

Files:

  • state.json (existing): unlock tracking
  • scan_snapshot.json (new): latest materialized achievements payload
  • scan_checkpoint.json (new): per-session fingerprints + contributions for incremental refresh