fix(dashboard): include cache tokens in totals, track real API call count

The analytics dashboard had three accuracy issues:

1. TOTAL TOKENS excluded cache_read and cache_write tokens and counted only
   the non-cached input portion. With the 90%+ cache hit rates typical in
   Hermes, this dramatically undercounted actual token usage (e.g. showing
   9.1M when the real total was 169M+).

2. The 'API Calls' card displayed the session count (COUNT(*) from the
   sessions table), not actual LLM API requests. A single session makes
   10-90 API calls through the tool loop, so this was ~30x lower than reality.

3. cache_write_tokens was stored in the DB but never exposed through the
   analytics API endpoint or frontend.
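
To make issue 1 concrete, here is a minimal sketch of the two ways of totalling tokens. The `Usage` class, its field names, and the sample numbers are illustrative assumptions rather than code or data from this repository, and the pre-fix total is assumed to be non-cached input plus output.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one session (hypothetical field names)."""
    input_tokens: int        # non-cached prompt tokens
    cache_read_tokens: int   # prompt tokens served from the cache
    cache_write_tokens: int  # prompt tokens written to the cache
    output_tokens: int


def prompt_total(u: Usage) -> int:
    """Full prompt size: non-cached input plus both cache buckets."""
    return u.input_tokens + u.cache_read_tokens + u.cache_write_tokens


def total_tokens(u: Usage) -> int:
    """Corrected total: full prompt plus output."""
    return prompt_total(u) + u.output_tokens


def total_tokens_without_cache(u: Usage) -> int:
    """Assumed pre-fix behaviour: cache tokens are ignored."""
    return u.input_tokens + u.output_tokens


# Made-up numbers for a cache-heavy workload.
sample = Usage(input_tokens=1_000, cache_read_tokens=18_000,
               cache_write_tokens=1_000, output_tokens=500)
print(total_tokens_without_cache(sample))  # 1500
print(total_tokens(sample))                # 20500
```

With most of the prompt coming from the cache, the total that ignores cache tokens is a small fraction of the corrected one, which is the shape of the undercount described above.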

Changes:
- Add api_call_count column to sessions table (schema v7 migration)
- Persist api_call_count=1 per LLM API call in run_agent.py
- Analytics SQL queries now include cache_write_tokens and api_call_count
  in daily, by_model, and totals aggregations
- Frontend TOTAL TOKENS card now shows input + cache_read + cache_write +
  output (the full prompt total + output)
- API CALLS card now uses real api_call_count from DB
- New Cache Hit Rate card shows cache efficiency percentage
- Bar chart, tooltips, daily table, model table all use prompt totals
  (input + cache_read + cache_write) instead of just input
- Labels changed from 'Input' to 'Prompt' to reflect the full prompt total
- TypeScript interfaces and i18n strings updated (en + zh)
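
The storage and aggregation side of these changes could look roughly like the sketch below, using an in-memory SQLite database. Only `cache_write_tokens`, `api_call_count`, and the `total_*` key names come from this commit; the other column names, the `record_api_call` helper, and the cache hit rate formula (cache reads divided by the full prompt total) are assumptions made for illustration. Reading "api_call_count=1 per LLM API call" as an increment of one is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical pre-migration table; most column names are assumed.
conn.execute(
    """
    CREATE TABLE sessions (
        id TEXT PRIMARY KEY,
        model TEXT,
        input_tokens INTEGER DEFAULT 0,
        output_tokens INTEGER DEFAULT 0,
        cache_read_tokens INTEGER DEFAULT 0,
        cache_write_tokens INTEGER DEFAULT 0
    )
    """
)
conn.execute("INSERT INTO sessions (id, model) VALUES ('s1', 'model-a')")

# Migration step: add the counter with a default so existing rows stay valid.
conn.execute("ALTER TABLE sessions ADD COLUMN api_call_count INTEGER DEFAULT 0")


def record_api_call(session_id, input_t, output_t, cache_read, cache_write):
    """Accumulate one LLM API call's usage and bump the call counter by 1."""
    conn.execute(
        """
        UPDATE sessions SET
            input_tokens = input_tokens + ?,
            output_tokens = output_tokens + ?,
            cache_read_tokens = cache_read_tokens + ?,
            cache_write_tokens = cache_write_tokens + ?,
            api_call_count = api_call_count + 1
        WHERE id = ?
        """,
        (input_t, output_t, cache_read, cache_write, session_id),
    )


record_api_call("s1", 100, 50, 0, 900)  # first call writes the cache
record_api_call("s1", 100, 50, 900, 0)  # second call reads it back

row = conn.execute(
    """
    SELECT
        COUNT(*) AS total_sessions,
        SUM(api_call_count) AS total_api_calls,
        SUM(cache_read_tokens) AS total_cache_read,
        SUM(input_tokens + cache_read_tokens + cache_write_tokens) AS total_prompt,
        SUM(input_tokens + cache_read_tokens + cache_write_tokens
            + output_tokens) AS total_tokens
    FROM sessions
    """
).fetchone()
total_sessions, total_api_calls, total_cache_read, total_prompt, total_tokens = row

# Assumed definition: share of prompt tokens that were served from the cache.
cache_hit_rate = total_cache_read / total_prompt if total_prompt else 0.0
```

Keeping the counter on the session row means the session count and the API call count can come from the same query, while remaining distinct numbers.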
kshitijk4poor 2026-04-15 12:16:58 +05:30
parent da8bab77fb
commit 42aeb4ecac
10 changed files with 121 additions and 27 deletions


@@ -694,6 +694,8 @@ class TestNewEndpoints:
         assert "totals" in data
         assert isinstance(data["daily"], list)
         assert "total_sessions" in data["totals"]
+        assert "total_cache_write" in data["totals"]
+        assert "total_api_calls" in data["totals"]

     def test_session_token_endpoint_removed(self):
         """GET /api/auth/session-token no longer exists."""