mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00

Erosika 11b4c9ecf9 feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation

Context Injection Overhaul:
- Base layer: peer.context() (representation + card) cached with 5-minute TTL
- Dialectic supplement: cadence-gated, cached until next refresh
- Trivial prompt skip: short inputs/slash commands skip injection
- New peer guard: dialectic skipped at session start when peer has no context
- Targeted warm prompt for better dialectic quality

Tool Surface (5 bidirectional tools):
- honcho_profile: read or update peer card
- honcho_search: semantic search over context
- honcho_context: full session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer, reasoning_level param
- honcho_conclude: create or delete conclusions (PII removal)

Cost Safety:
- dialectic_cadence defaults to 3 (~66% fewer LLM calls)
- context_tokens defaults to uncapped (cap opt-in via config/wizard)
- on_turn_start hook wired up (fixes broken cadence/injection gating)

Correctness:
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- Timeout config plumbing
- peerName precedence over gateway user_id
- skip_memory on temp agents (orphan session prevention)
- gateway_session_key for stable per-chat session continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- mid -> medium in reasoning level validation

ABC changes (minimal, honcho-only):
- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)

2026-04-14 18:07:19 -04:00


---
sidebar_position: 99
title: "Honcho Memory"
description: "AI-native persistent memory via Honcho: dialectic reasoning, multi-agent user modeling, and deep personalization"
---

Honcho Memory

Honcho is an AI-native memory backend that adds dialectic reasoning and deep user modeling on top of Hermes's built-in memory system. Instead of simple key-value storage, Honcho maintains a running model of who the user is — their preferences, communication style, goals, and patterns — by reasoning about conversations after they happen.

:::info Honcho is a Memory Provider Plugin
Honcho is integrated into the Memory Providers system. All features below are available through the unified memory provider interface.
:::

What Honcho Adds

| Capability | Built-in Memory | Honcho |
|---|---|---|
| Cross-session persistence | ✔ File-based MEMORY.md/USER.md | ✔ Server-side with API |
| User profile | ✔ Manual agent curation | ✔ Automatic dialectic reasoning |
| Multi-agent isolation | ✗ | ✔ Per-peer profile separation |
| Observation modes | ✗ | ✔ Unified or directional observation |
| Conclusions (derived insights) | ✗ | ✔ Server-side reasoning about patterns |
| Search across history | ✔ FTS5 session search | ✔ Semantic search over conclusions |

Dialectic reasoning: After each conversation, Honcho analyzes the exchange and derives "conclusions" — insights about the user's preferences, habits, and goals. These conclusions accumulate over time, giving the agent a deepening understanding that goes beyond what the user explicitly stated.

Multi-agent profiles: When multiple Hermes instances talk to the same user (e.g., a coding assistant and a personal assistant), Honcho maintains separate "peer" profiles. Each peer sees only its own observations and conclusions, preventing cross-contamination of context.

Setup

hermes memory setup    # select "honcho" from the provider list

Or configure manually:

# ~/.hermes/config.yaml
memory:
  provider: honcho

echo "HONCHO_API_KEY=your-key" >> ~/.hermes/.env

Get an API key at honcho.dev.

Configuration Options

Honcho is configured in ~/.honcho/config.json (global) or $HERMES_HOME/honcho.json (profile-local). The setup wizard handles this for you.

Key settings:

| Setting | Default | Description |
|---|---|---|
| `sessionStrategy` | `per-directory` | `per-directory`, `per-repo`, `per-session`, or `global` |
| `recallMode` | `hybrid` | `hybrid` (auto-inject + tools), `context` (inject only), `tools` (tools only) |
| `contextTokens` | uncapped | Token budget for auto-injected context per turn. Set an integer (e.g. `1200`) to cap it |
| `dialecticReasoningLevel` | `low` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` |
| `dialecticDynamic` | `true` | When `true`, the model can override the reasoning level per call via a tool parameter |
| `dialecticCadence` | `3` | Turns between Honcho LLM calls (higher = fewer calls) |
| `writeFrequency` | `async` | When to flush messages to Honcho: `async` (background thread), `turn` (sync each turn), `session` (flush on end), or an integer N (every N turns) |
| `observation` | all on | Per-peer `observeMe`/`observeOthers` booleans |
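As a rough illustration of the settings table, the sketch below merges overrides onto the documented defaults and writes them out as JSON. The key names and defaults come from the table; the flat top-level layout and the helper names `build_config` and `write_config` are assumptions made for this example, and the real file may nest these keys differently. The `observation` block is left out because its exact shape is not described here.

```python
import json
import tempfile
from pathlib import Path

# Keys and defaults copied from the settings table.
# ASSUMPTION: a flat top-level JSON object; the real file may be nested.
DEFAULTS = {
    "sessionStrategy": "per-directory",
    "recallMode": "hybrid",
    "dialecticReasoningLevel": "low",
    "dialecticDynamic": True,
    "dialecticCadence": 3,
    "writeFrequency": "async",
}


def build_config(**overrides):
    """Merge overrides onto the documented defaults, rejecting unknown keys."""
    # contextTokens has no default entry: leaving it out means "uncapped".
    allowed = set(DEFAULTS) | {"contextTokens"}
    unknown = set(overrides) - allowed
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def write_config(path, config):
    """Write the config as pretty-printed JSON, creating parent directories."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


# Example: cap injected context at 1200 tokens and call the LLM less often.
config = build_config(contextTokens=1200, dialecticCadence=5)
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "honcho.json"  # stand-in for the real config path
    write_config(target, config)
    loaded = json.loads(target.read_text(encoding="utf-8"))
```

The example writes to a temporary directory so it never touches a real configuration; in practice the setup wizard remains the supported way to produce this file.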

Session strategy controls how Honcho sessions map to your work:

- `per-session`: each `hermes` run gets a fresh session. Clean starts, memory via tools. Recommended for new users.
- `per-directory`: one Honcho session per working directory. Context accumulates across runs.
- `per-repo`: one session per git repository.
- `global`: a single session across all directories.
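The four strategies can be pictured as different ways of deriving a session key from the current working directory. The sketch below is only an illustration: `session_key` and `find_repo_root` are hypothetical helpers, the key formats are invented, and the fallback to the working directory outside a git repository is an assumption rather than documented behavior.

```python
import uuid
from pathlib import Path


def find_repo_root(start):
    """Walk upward from `start` until a directory containing `.git` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def session_key(strategy, cwd):
    """Return an illustrative session identifier for the given strategy."""
    cwd = Path(cwd).resolve()
    if strategy == "per-session":
        # A fresh random key on every call, so nothing is shared across runs.
        return f"session-{uuid.uuid4().hex}"
    if strategy == "per-directory":
        return f"dir-{cwd}"
    if strategy == "per-repo":
        # ASSUMPTION: outside a git repository, fall back to the directory.
        root = find_repo_root(cwd)
        return f"repo-{root or cwd}"
    if strategy == "global":
        return "global"
    raise ValueError(f"unknown session strategy: {strategy!r}")
```

Under this sketch, two subdirectories of the same repository share a key with `per-repo` but get distinct keys with `per-directory`, which matches the descriptions in the list.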

Recall mode controls how memory flows into conversations:

- `hybrid`: context auto-injected into the system prompt AND tools available (the model decides when to query).
- `context`: auto-injection only, tools hidden.
- `tools`: tools only, no auto-injection. The agent must explicitly call `honcho_reasoning`, `honcho_search`, etc.

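Dialectic cadence controls cost. With the default of 3, Honcho rebuilds the user model once every 3 turns instead of every turn, which is roughly two-thirds (~66%) fewer LLM calls. The dialectic result is cached until the next refresh, so the turns in between still receive it.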
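The arithmetic behind the ~66% figure can be checked with a small simulation. This is a minimal sketch, not the provider's actual gating code: `should_run_dialectic` and `count_calls` are hypothetical names, and firing on the first turn is an assumption (the real gate may, for example, skip the call for a peer with no context yet).

```python
def should_run_dialectic(turn, cadence):
    """Return True on turns where the gate lets a dialectic call through.

    Turns are 1-indexed. ASSUMPTION: the first call fires on turn 1,
    then on every `cadence`-th turn after that.
    """
    if cadence < 1:
        raise ValueError("cadence must be >= 1")
    return (turn - 1) % cadence == 0


def count_calls(total_turns, cadence):
    """Count how many dialectic calls happen over `total_turns` turns."""
    return sum(should_run_dialectic(t, cadence) for t in range(1, total_turns + 1))


turns = 30
every_turn = count_calls(turns, 1)  # cadence 1: a call on every turn
gated = count_calls(turns, 3)       # cadence 3: a call on every third turn
savings = 1 - gated / every_turn    # fraction of calls avoided
print(f"{every_turn} calls vs {gated} calls, {savings:.0%} fewer")
```

Over 30 turns a cadence of 3 makes 10 calls instead of 30, so two out of every three calls are avoided.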

Settings per recall mode:

| Setting | `hybrid` | `context` | `tools` |
|---|---|---|---|
| `writeFrequency` | flushes messages | flushes messages | flushes messages |
| `dialecticCadence` | gates auto LLM calls | gates auto LLM calls | irrelevant (model calls explicitly) |
| `contextTokens` | caps injection | caps injection | irrelevant (no injection) |
| `dialecticDynamic` | gates model override | N/A (no tools) | gates model override |

In `tools` mode, the model is in control: it calls `honcho_reasoning` when it wants and, when `dialecticDynamic` is `true`, at whatever `reasoning_level` it picks. `dialecticCadence` and `contextTokens` only apply to the modes with auto-injection (`hybrid` and `context`).
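The table reduces to two properties of each mode: whether context is auto-injected and whether the tools are exposed. The sketch below encodes that reading; `ModeBehavior` and `setting_applies` are hypothetical names used only for illustration, not part of Hermes or Honcho.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeBehavior:
    auto_inject: bool    # context is injected into the system prompt
    tools_exposed: bool  # honcho_* tools are visible to the model


MODES = {
    "hybrid": ModeBehavior(auto_inject=True, tools_exposed=True),
    "context": ModeBehavior(auto_inject=True, tools_exposed=False),
    "tools": ModeBehavior(auto_inject=False, tools_exposed=True),
}


def setting_applies(setting, mode):
    """Return True if `setting` has any effect in the given recall mode."""
    behavior = MODES[mode]
    if setting == "writeFrequency":
        return True  # messages are flushed in every mode
    if setting in ("dialecticCadence", "contextTokens"):
        return behavior.auto_inject  # only meaningful with auto-injection
    if setting == "dialecticDynamic":
        return behavior.tools_exposed  # only meaningful when tools exist
    raise KeyError(f"unknown setting: {setting!r}")
```

For example, `setting_applies("contextTokens", "tools")` is `False`, matching the "irrelevant" cell in the table.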

Tools

When Honcho is active as the memory provider, five tools become available:

| Tool | Purpose |
|---|---|
| `honcho_profile` | Read or update the peer card: pass `card` (list of facts) to update, omit it to read |
| `honcho_search` | Semantic search over context, returning raw excerpts with no LLM synthesis |
| `honcho_context` | Full session context: summary, representation, card, recent messages |
| `honcho_reasoning` | Synthesized answer from Honcho's LLM. Pass `reasoning_level` (`minimal`/`low`/`medium`/`high`/`max`) to control depth |
| `honcho_conclude` | Create or delete conclusions: pass `conclusion` to create, `delete_id` to remove (deletion is intended for PII removal only) |
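To make the parameter conventions in the table concrete, here is a small sketch that builds argument dictionaries for three of the tools. The parameter names `card`, `reasoning_level`, `conclusion`, and `delete_id` come from the table. Everything else is an assumption for illustration: the helper names, the `query` parameter, the idea that card facts are plain strings, and the rule that `honcho_conclude` takes exactly one of its two parameters per call. The sketch does not call Honcho.

```python
REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")


def profile_args(card=None):
    """Arguments for honcho_profile: omit `card` to read, pass facts to update."""
    if card is None:
        return {}
    facts = list(card)
    # ASSUMPTION: each fact is a plain string.
    if not all(isinstance(fact, str) for fact in facts):
        raise TypeError("card must be a list of strings")
    return {"card": facts}


def reasoning_args(query, reasoning_level=None):
    """Arguments for honcho_reasoning; `query` is a hypothetical parameter name."""
    args = {"query": query}
    if reasoning_level is not None:
        if reasoning_level not in REASONING_LEVELS:
            raise ValueError(f"reasoning_level must be one of {REASONING_LEVELS}")
        args["reasoning_level"] = reasoning_level
    return args


def conclude_args(conclusion=None, delete_id=None):
    """Arguments for honcho_conclude: create a conclusion or delete one by id."""
    # ASSUMPTION: exactly one of the two parameters is supplied per call.
    if (conclusion is None) == (delete_id is None):
        raise ValueError("pass exactly one of conclusion or delete_id")
    if conclusion is not None:
        return {"conclusion": conclusion}
    return {"delete_id": delete_id}
```

Note that `"mid"` is rejected by this check: the accepted spelling in the table is `medium`.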

CLI Commands

hermes honcho status          # Connection status, config, and key settings
hermes honcho setup           # Interactive setup wizard
hermes honcho strategy        # Show or set session strategy
hermes honcho peer            # Update peer names for multi-agent setups
hermes honcho mode            # Show or set recall mode
hermes honcho tokens          # Show or set context token budget
hermes honcho identity        # Show Honcho peer identity
hermes honcho sync            # Sync host blocks for all profiles
hermes honcho enable          # Enable Honcho
hermes honcho disable         # Disable Honcho

Migrating from `hermes honcho`

If you previously used the standalone hermes honcho setup:

- Your existing configuration (`honcho.json` or `~/.honcho/config.json`) is preserved.
- Your server-side data (memories, conclusions, user profiles) is intact.
- Set `memory.provider: honcho` in `config.yaml` to reactivate.

No re-login or re-setup is needed. Alternatively, run `hermes memory setup` and select "honcho"; the wizard detects your existing config.

Full Documentation

See Memory Providers — Honcho for the complete reference.
