hermes-agent/website/docs/user-guide/features/honcho.md
Erosika 11b4c9ecf9 feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation
Context Injection Overhaul:
- Base layer: peer.context() (representation + card) cached with 5-minute TTL
- Dialectic supplement: cadence-gated, cached until next refresh
- Trivial prompt skip: short inputs/slash commands skip injection
- New peer guard: dialectic skipped at session start when peer has no context
- Targeted warm prompt for better dialectic quality

Tool Surface (5 bidirectional tools):
- honcho_profile: read or update peer card
- honcho_search: semantic search over context
- honcho_context: full session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer, reasoning_level param
- honcho_conclude: create or delete conclusions (PII removal)

Cost Safety:
- dialectic_cadence defaults to 3 (~66% fewer LLM calls)
- context_tokens defaults to uncapped (cap opt-in via config/wizard)
- on_turn_start hook wired up (fixes broken cadence/injection gating)

Correctness:
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- Timeout config plumbing
- peerName precedence over gateway user_id
- skip_memory on temp agents (orphan session prevention)
- gateway_session_key for stable per-chat session continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- mid -> medium in reasoning level validation

ABC changes (minimal, honcho-only):
- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)
2026-04-14 18:07:19 -04:00


| sidebar_position | title | description |
|---|---|---|
| 99 | Honcho Memory | AI-native persistent memory via Honcho: dialectic reasoning, multi-agent user modeling, and deep personalization |

Honcho Memory

Honcho is an AI-native memory backend that adds dialectic reasoning and deep user modeling on top of Hermes's built-in memory system. Instead of simple key-value storage, Honcho maintains a running model of who the user is — their preferences, communication style, goals, and patterns — by reasoning about conversations after they happen.

:::info Honcho is a Memory Provider Plugin
Honcho is integrated into the Memory Providers system. All features below are available through the unified memory provider interface.
:::

What Honcho Adds

| Capability | Built-in Memory | Honcho |
|---|---|---|
| Cross-session persistence | ✔ File-based MEMORY.md/USER.md | ✔ Server-side with API |
| User profile | ✔ Manual agent curation | ✔ Automatic dialectic reasoning |
| Multi-agent isolation | | ✔ Per-peer profile separation |
| Observation modes | | ✔ Unified or directional observation |
| Conclusions (derived insights) | | ✔ Server-side reasoning about patterns |
| Search across history | ✔ FTS5 session search | ✔ Semantic search over conclusions |

Dialectic reasoning: After each conversation, Honcho analyzes the exchange and derives "conclusions" — insights about the user's preferences, habits, and goals. These conclusions accumulate over time, giving the agent a deepening understanding that goes beyond what the user explicitly stated.

Multi-agent profiles: When multiple Hermes instances talk to the same user (e.g., a coding assistant and a personal assistant), Honcho maintains separate "peer" profiles. Each peer sees only its own observations and conclusions, preventing cross-contamination of context.

Setup

hermes memory setup    # select "honcho" from the provider list

Or configure manually:

# ~/.hermes/config.yaml
memory:
  provider: honcho

echo "HONCHO_API_KEY=your-key" >> ~/.hermes/.env

Get an API key at honcho.dev.

Configuration Options

Honcho is configured in ~/.honcho/config.json (global) or $HERMES_HOME/honcho.json (profile-local). The setup wizard handles this for you.

Key settings:

| Setting | Default | Description |
|---|---|---|
| sessionStrategy | per-directory | per-directory, per-repo, per-session, or global |
| recallMode | hybrid | hybrid (auto-inject + tools), context (inject only), tools (tools only) |
| contextTokens | uncapped | Token budget for auto-injected context per turn. Set to an integer (e.g. 1200) to cap |
| dialecticReasoningLevel | low | Base reasoning level: minimal, low, medium, high, max |
| dialecticDynamic | true | When true, model can override reasoning level per-call via tool param |
| dialecticCadence | 3 | Turns between Honcho LLM calls (higher = fewer calls) |
| writeFrequency | async | When to flush messages to Honcho: async (background thread), turn (sync each turn), session (flush on end), or integer N (every N turns) |
| observation | all on | Per-peer observeMe/observeOthers booleans |
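
The allowed values listed above can be checked before a config file is written. The following is a minimal sketch, not Hermes's own validation: `validate_honcho_settings` is a hypothetical helper, it assumes the settings sit in a flat dict keyed by the names in the table, and it assumes "uncapped" is expressed by leaving `contextTokens` unset.

```python
from typing import Any

# Allowed values copied from the settings table above.
SESSION_STRATEGIES = {"per-directory", "per-repo", "per-session", "global"}
RECALL_MODES = {"hybrid", "context", "tools"}
REASONING_LEVELS = {"minimal", "low", "medium", "high", "max"}
WRITE_FREQUENCY_NAMES = {"async", "turn", "session"}


def _is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_honcho_settings(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a settings dict (empty if none)."""
    problems: list[str] = []

    def check_choice(key: str, allowed: set[str]) -> None:
        if key not in cfg:
            return  # unset keys fall back to their defaults
        value = cfg[key]
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{key}: {value!r} is not one of {sorted(allowed)}")

    check_choice("sessionStrategy", SESSION_STRATEGIES)
    check_choice("recallMode", RECALL_MODES)
    check_choice("dialecticReasoningLevel", REASONING_LEVELS)

    # Assumption: an absent (or None) contextTokens means "uncapped".
    tokens = cfg.get("contextTokens")
    if tokens is not None and not _is_positive_int(tokens):
        problems.append("contextTokens: expected a positive integer or no value")

    cadence = cfg.get("dialecticCadence")
    if cadence is not None and not _is_positive_int(cadence):
        problems.append("dialecticCadence: expected a positive integer")

    # writeFrequency is either a named policy or an integer N (every N turns).
    wf = cfg.get("writeFrequency")
    if wf is not None:
        named = isinstance(wf, str) and wf in WRITE_FREQUENCY_NAMES
        if not (named or _is_positive_int(wf)):
            problems.append("writeFrequency: expected async, turn, session, or a positive integer")

    return problems
```

For example, `validate_honcho_settings({"dialecticReasoningLevel": "mid"})` reports one problem, since the accepted spelling is `medium`.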

Session strategy controls how Honcho sessions map to your work:

  • per-session — each hermes run gets a fresh session. Clean starts, memory via tools. Recommended for new users.
  • per-directory — one Honcho session per working directory. Context accumulates across runs.
  • per-repo — one session per git repository.
  • global — single session across all directories.
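
One way to picture the four strategies is as a choice of what a session is keyed on. The sketch below is illustrative only: `session_scope`, its parameters, and the key format are hypothetical, and the fallback for `per-repo` outside a git repository is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import Optional


def session_scope(strategy: str, cwd: Path, repo_root: Optional[Path], run_id: str) -> str:
    """Return an illustrative key that a session would be grouped under."""
    if strategy == "per-session":
        # Every run gets its own key, so nothing carries over between runs.
        return f"run:{run_id}"
    if strategy == "per-directory":
        # Runs started in the same working directory share a key.
        return f"dir:{cwd.as_posix()}"
    if strategy == "per-repo":
        # Assumption: outside a git repository, fall back to the directory.
        if repo_root is None:
            return f"dir:{cwd.as_posix()}"
        return f"repo:{repo_root.as_posix()}"
    if strategy == "global":
        # A single key regardless of where the run starts.
        return "global"
    raise ValueError(f"unknown session strategy: {strategy!r}")
```

Under this model, two runs in the same directory map to the same key with `per-directory` but to different keys with `per-session`, which is why the latter gives clean starts.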

Recall mode controls how memory flows into conversations:

  • hybrid — context auto-injected into system prompt AND tools available (model decides when to query).
  • context — auto-injection only, tools hidden.
  • tools — tools only, no auto-injection. Agent must explicitly call honcho_reasoning, honcho_search, etc.

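Dialectic cadence controls cost. With the default of 3, Honcho makes its dialectic LLM call once every 3 turns instead of on every turn, which is about two-thirds fewer calls (~66%). The dialectic result is cached until the next refresh, so injected context remains available on the turns in between.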
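
The saving can be checked with simple arithmetic. This is a sketch under one assumption: a call fires on the first turn and then on every `cadence`-th turn after it. The function names are hypothetical, and the actual gating may differ (for example, at session start).

```python
def dialectic_calls(turns: int, cadence: int) -> int:
    """Count gated calls over `turns` turns, one call per `cadence` turns."""
    if turns < 0 or cadence < 1:
        raise ValueError("turns must be >= 0 and cadence must be >= 1")
    # Ceiling division: turns 1, 1 + cadence, 1 + 2 * cadence, ... trigger a call.
    return (turns + cadence - 1) // cadence


def call_reduction(turns: int, cadence: int) -> float:
    """Fraction of calls saved relative to calling on every turn."""
    baseline = dialectic_calls(turns, 1)
    if baseline == 0:
        return 0.0
    return 1 - dialectic_calls(turns, cadence) / baseline
```

Over 30 turns, a cadence of 3 gives 10 calls instead of 30, a reduction of two-thirds. Raising the cadence to 5 would give 6 calls, a reduction of 80%.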

Settings per recall mode:

| Setting | hybrid | context | tools |
|---|---|---|---|
| writeFrequency | flushes messages | flushes messages | flushes messages |
| dialecticCadence | gates auto LLM calls | gates auto LLM calls | irrelevant (model calls explicitly) |
| contextTokens | caps injection | caps injection | irrelevant (no injection) |
| dialecticDynamic | gates model override | N/A (no tools) | gates model override |

In tools mode, the model is fully in control — it calls honcho_reasoning when it wants, at whatever reasoning_level it picks. dialecticCadence and contextTokens only apply to modes with auto-injection (hybrid and context).
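
The relationship between recall modes and settings can be summarized as two flags per mode. The sketch below restates the table above in code; `RecallBehavior` and `settings_in_effect` are hypothetical names used for illustration, not part of Hermes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecallBehavior:
    auto_inject: bool    # context is injected into the prompt automatically
    tools_exposed: bool  # the honcho_* tools are visible to the model


RECALL_BEHAVIOR = {
    "hybrid": RecallBehavior(auto_inject=True, tools_exposed=True),
    "context": RecallBehavior(auto_inject=True, tools_exposed=False),
    "tools": RecallBehavior(auto_inject=False, tools_exposed=True),
}


def settings_in_effect(mode: str) -> set[str]:
    """Return the settings that have an effect in the given recall mode."""
    behavior = RECALL_BEHAVIOR[mode]
    active = {"writeFrequency"}  # messages are flushed in every mode
    if behavior.auto_inject:
        # Cadence and token cap only matter when context is auto-injected.
        active |= {"dialecticCadence", "contextTokens"}
    if behavior.tools_exposed:
        # The per-call override only matters when the model can call tools.
        active.add("dialecticDynamic")
    return active
```

Calling `settings_in_effect("tools")` leaves only `writeFrequency` and `dialecticDynamic`, matching the "irrelevant" cells in the table.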

Tools

When Honcho is active as the memory provider, five tools become available:

| Tool | Purpose |
|---|---|
| honcho_profile | Read or update the peer card: pass card (a list of facts) to update, omit it to read |
| honcho_search | Semantic search over context: returns raw excerpts, with no LLM synthesis |
| honcho_context | Full session context: summary, representation, card, recent messages |
| honcho_reasoning | Synthesized answer from Honcho's LLM: pass reasoning_level (minimal/low/medium/high/max) to control depth |
| honcho_conclude | Create or delete conclusions: pass conclusion to create, delete_id to remove (deletion is meant for PII removal only) |
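
The argument conventions in the table can be sketched as a small checker. This is not the tool schema Hermes ships: `check_tool_args` is hypothetical, only the parameter names `card`, `reasoning_level`, `conclusion`, and `delete_id` come from the table, and two rules are assumptions (facts are strings, and `honcho_conclude` takes exactly one of its two parameters).

```python
from typing import Any

REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")


def check_tool_args(tool: str, args: dict[str, Any]) -> str:
    """Describe what a call would do, or raise ValueError on inconsistent arguments."""
    if tool == "honcho_profile":
        if "card" not in args:
            return "read peer card"
        card = args["card"]
        # Assumption: each fact in the card is a plain string.
        if not isinstance(card, list) or not all(isinstance(f, str) for f in card):
            raise ValueError("card must be a list of fact strings")
        return f"update peer card with {len(card)} facts"

    if tool == "honcho_reasoning":
        level = args.get("reasoning_level")
        if level is not None and level not in REASONING_LEVELS:
            raise ValueError(f"reasoning_level must be one of {REASONING_LEVELS}")
        return f"reasoning at level {level or 'default'}"

    if tool == "honcho_conclude":
        has_create = "conclusion" in args
        has_delete = "delete_id" in args
        # Assumption: create and delete are mutually exclusive in one call.
        if has_create == has_delete:
            raise ValueError("pass exactly one of conclusion or delete_id")
        return "create conclusion" if has_create else "delete conclusion"

    raise ValueError(f"no argument rules sketched for tool: {tool!r}")
```

For instance, `check_tool_args("honcho_profile", {})` describes a read, while passing a `card` list describes an update.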

CLI Commands

hermes honcho status          # Connection status, config, and key settings
hermes honcho setup           # Interactive setup wizard
hermes honcho strategy        # Show or set session strategy
hermes honcho peer            # Update peer names for multi-agent setups
hermes honcho mode            # Show or set recall mode
hermes honcho tokens          # Show or set context token budget
hermes honcho identity        # Show Honcho peer identity
hermes honcho sync            # Sync host blocks for all profiles
hermes honcho enable          # Enable Honcho
hermes honcho disable         # Disable Honcho

Migrating from hermes honcho

If you previously used the standalone hermes honcho setup:

  1. Your existing configuration (honcho.json or ~/.honcho/config.json) is preserved
  2. Your server-side data (memories, conclusions, user profiles) is intact
  3. Set memory.provider: honcho in config.yaml to reactivate

No re-login or re-setup is needed. To reactivate through the wizard instead of editing config.yaml by hand, run hermes memory setup and select "honcho"; the wizard detects your existing config.
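
Before migrating, it can help to confirm which existing config files are present. The sketch below checks the two locations named in this guide. `find_honcho_configs` is a hypothetical helper, the fallback to `~/.hermes` when `$HERMES_HOME` is unset is an assumption, and the order of the result says nothing about which file takes precedence.

```python
import os
from pathlib import Path
from typing import Optional


def find_honcho_configs(home: Optional[Path] = None,
                        hermes_home: Optional[Path] = None) -> list[Path]:
    """Return the Honcho config files that exist at the documented locations."""
    home = home or Path.home()
    if hermes_home is None:
        # Assumption: default to ~/.hermes when $HERMES_HOME is not set.
        hermes_home = Path(os.environ.get("HERMES_HOME", home / ".hermes"))
    candidates = [
        hermes_home / "honcho.json",        # profile-local
        home / ".honcho" / "config.json",   # global
    ]
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    found = find_honcho_configs()
    if found:
        for path in found:
            print(f"found existing config: {path}")
    else:
        print("no existing Honcho config found")
```

If the list is empty, there is nothing for the wizard to detect and a fresh setup is needed.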

Full Documentation

See Memory Providers — Honcho for the complete reference.