feat: add persistent memory system + SQLite session store

Two-part implementation:

Part A - Curated Bounded Memory:
- New memory tool (tools/memory_tool.py) with MEMORY.md + USER.md stores
- Character-limited (2200/1375 chars), § delimited entries
- Frozen snapshot injected into system prompt at session start
- Model manages pruning via replace/remove with substring matching
- Usage indicator shown in system prompt header

Part B - SQLite Session Store:
- New hermes_state.py with SessionDB class, FTS5 full-text search
- Gateway session.py rewritten to dual-write SQLite + legacy JSONL
- Compression-triggered session splitting with parent_session_id chains
- New session_search tool with Gemini Flash summarization of matched sessions
- CLI session lifecycle (create on launch, close on exit)

Also:
- System prompt now cached per session, only rebuilt on compression
  (fixes prefix cache invalidation from date/time changes every turn)
- Config version bumped to 3, hermes doctor checks for new artifacts
- Disabled in batch_runner and RL environments
This commit is contained in:
teknium1 2026-02-19 00:57:31 -08:00
parent 655303f2f1
commit 440c244cac
19 changed files with 2397 additions and 327 deletions

View file

@ -142,6 +142,26 @@ compression:
# This model compresses the middle turns into a concise summary
summary_model: "google/gemini-3-flash-preview"
# =============================================================================
# Persistent Memory
# =============================================================================
# Bounded curated memory injected into the system prompt every session.
# Two stores: MEMORY.md (agent's notes) and USER.md (user profile).
# Character limits keep the memory small and focused. The agent manages
# pruning -- when at the limit, it must consolidate or replace entries.
# Disabled by default in batch_runner and RL environments.
#
memory:
# Agent's personal notes: environment facts, conventions, things learned
memory_enabled: true
# User profile: preferences, communication style, expectations
user_profile_enabled: true
# Character limits (~2.75 chars per token, model-independent)
memory_char_limit: 2200 # ~800 tokens
user_char_limit: 1375 # ~500 tokens
# =============================================================================
# Agent Behavior
# =============================================================================
@ -274,6 +294,8 @@ platform_toolsets:
# skills - Load skill documents (skills_list, skill_view)
# moa - Mixture of Agents reasoning (mixture_of_agents)
# todo - Task planning and tracking for multi-step work
# memory - Persistent memory across sessions (personal notes + user profile)
# session_search - Search and recall past conversations (FTS5 + Gemini Flash summarization)
# tts - Text-to-speech (Edge TTS free, ElevenLabs, OpenAI)
# cronjob - Schedule and manage automated tasks (CLI-only)
# rl - RL training tools (Tinker-Atropos)