feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619)

Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto
current main with minimal core modifications.

Plugin changes (plugins/memory/honcho/):
- New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context)
- Two-layer context injection: base context (summary + representation + card)
  on contextCadence, dialectic supplement on dialecticCadence
- Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal
- Cold/warm prompt selection based on session state
- dialecticCadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls
- Session summary injection for conversational continuity
- Bidirectional peer targeting on all 5 tools
- Correctness fixes: peer param fallback, None guard on set_peer_card,
  schema validation, signal_sufficient anchored regex, mid->medium level fix

Core changes (~20 lines across 3 files):
- agent/memory_manager.py: Enhanced sanitize_context() to strip full
  <memory-context> blocks and system notes (prevents leak from saveMessages)
- run_agent.py: gateway_session_key param for stable per-chat Honcho sessions,
  on_turn_start() call before prefetch_all() for cadence tracking,
  sanitize_context() on user messages to strip leaked memory blocks
- gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions),
  gateway_session_key threading to main agent

Tests: 509 passed (3 skipped — honcho SDK not installed locally)
Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md

Co-authored-by: erosika <erosika@users.noreply.github.com>
Teknium 2026-04-15 19:12:19 -07:00 committed by GitHub
parent 00ff9a26cd
commit cc6e8941db
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
17 changed files with 2632 additions and 396 deletions


@ -1,6 +1,6 @@
# Honcho Memory Provider
AI-native cross-session user modeling with multi-pass dialectic reasoning, session summaries, bidirectional peer tools, and persistent conclusions.
> **Honcho docs:** <https://docs.honcho.dev/v3/guides/integrations/hermes>
@ -19,9 +19,86 @@ hermes memory setup # generic picker, also works
Or manually:
```bash
hermes config set memory.provider honcho
echo "HONCHO_API_KEY=***" >> ~/.hermes/.env
```
## Architecture Overview
### Two-Layer Context Injection
Context is injected into the **user message** at API-call time (not the system prompt) to preserve prompt caching. Only a static mode header goes in the system prompt. The injected block is wrapped in `<memory-context>` fences with a system note clarifying it's background data, not new user input.
Two independent layers, each on its own cadence:
**Layer 1 — Base context** (refreshed every `contextCadence` turns):
1. **SESSION SUMMARY** — from `session.context(summary=True)`, placed first
2. **User Representation** — Honcho's evolving model of the user
3. **User Peer Card** — key facts snapshot
4. **AI Self-Representation** — Honcho's model of the AI peer
5. **AI Identity Card** — AI peer facts
**Layer 2 — Dialectic supplement** (fired every `dialecticCadence` turns):
Multi-pass `.chat()` reasoning about the user, appended after base context.
Both layers are joined, then truncated to fit `contextTokens` budget via `_truncate_to_budget` (tokens × 4 chars, word-boundary safe).
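The truncation step can be sketched as follows. This is a hypothetical reimplementation of `_truncate_to_budget` based only on the description above (tokens x 4 chars heuristic, word-boundary safe cut); the plugin's actual helper may differ.

```python
# Sketch of the documented budget truncation. The 4-chars-per-token
# approximation and word-boundary behavior are from the docs; the
# function name and signature here are illustrative.
def truncate_to_budget(text: str, context_tokens: int) -> str:
    max_chars = context_tokens * 4  # rough chars-per-token heuristic
    if len(text) <= max_chars:
        return text
    # Back up to the last word boundary so no word is split mid-way
    cut = text.rfind(" ", 0, max_chars)
    if cut == -1:
        cut = max_chars
    return text[:cut].rstrip()
```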
### Cold Start vs Warm Session Prompts
Dialectic pass 0 automatically selects its prompt based on session state:
- **Cold** (no base context cached): "Who is this person? What are their preferences, goals, and working style? Focus on facts that would help an AI assistant be immediately useful."
- **Warm** (base context exists): "Given what's been discussed in this session so far, what context about this user is most relevant to the current conversation? Prioritize active context over biographical facts."
Not configurable — determined automatically.
### Dialectic Depth (Multi-Pass Reasoning)
`dialecticDepth` (1-3, clamped) controls how many `.chat()` calls fire per dialectic cycle:
| Depth | Passes | Behavior |
|-------|--------|----------|
| 1 | single `.chat()` | Base query only (cold or warm prompt) |
| 2 | audit + synthesis | Pass 0 result is self-audited; pass 1 does targeted synthesis. Conditional bail-out if pass 0 returns strong signal (>300 chars or structured with bullets/sections >100 chars) |
| 3 | audit + synthesis + reconciliation | Pass 2 reconciles contradictions across prior passes into a final synthesis |
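The depth-2 bail-out heuristic described in the table can be sketched like this. The thresholds (>300 chars, or structured with >100 chars) come from the table; the function name and exact structure regex are assumptions, not the plugin's actual code.

```python
import re

# Hedged sketch of the "strong signal" check that lets depth 2 skip
# the synthesis pass. Thresholds match the documented behavior;
# the regex for "structured with bullets/sections" is illustrative.
def signal_sufficient(answer: str) -> bool:
    text = answer.strip()
    if len(text) > 300:
        return True
    # Structured output: markdown bullets or section headers at line start
    structured = re.search(r"(?m)^\s*(?:[-*]\s+|#{1,4}\s+)", text)
    return bool(structured) and len(text) > 100
```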
### Proportional Reasoning Levels
When `dialecticDepthLevels` is not set, each pass uses a proportional level relative to `dialecticReasoningLevel` (the "base"):
| Depth | Pass levels |
|-------|-------------|
| 1 | [base] |
| 2 | [minimal, base] |
| 3 | [minimal, base, low] |
Override with `dialecticDepthLevels`: an explicit array of reasoning level strings per pass.
### Three Orthogonal Dialectic Knobs
| Knob | Controls | Type |
|------|----------|------|
| `dialecticCadence` | How often — minimum turns between dialectic firings | int |
| `dialecticDepth` | How many — passes per firing (1-3) | int |
| `dialecticReasoningLevel` | How hard — reasoning ceiling per `.chat()` call | string |
### Input Sanitization
`run_conversation` strips leaked `<memory-context>` blocks from user input before processing. When `saveMessages` persists a turn that included injected context, the block can reappear in subsequent turns via message history. The sanitizer removes `<memory-context>` blocks plus associated system notes.
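A minimal sketch of that sanitizer, assuming the injected block is fenced as `<memory-context>...</memory-context>`. The regex and function name are illustrative; the real `sanitize_context` also strips the associated system notes.

```python
import re

# Remove any leaked <memory-context> fenced blocks (and trailing
# whitespace) from user input before processing. DOTALL so the
# pattern spans multi-line blocks.
MEMORY_BLOCK_RE = re.compile(r"<memory-context>.*?</memory-context>\s*", re.DOTALL)

def sanitize_context(user_input: str) -> str:
    return MEMORY_BLOCK_RE.sub("", user_input).strip()
```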
## Tools
Five bidirectional tools. All accept an optional `peer` parameter (`"user"` or `"ai"`, default `"user"`).
| Tool | LLM call? | Description |
|------|-----------|-------------|
| `honcho_profile` | No | Peer card — key facts snapshot |
| `honcho_search` | No | Semantic search over stored context (800 tok default, 2000 max) |
| `honcho_context` | No | Full session context: summary, representation, card, messages |
| `honcho_reasoning` | Yes | LLM-synthesized answer via dialectic `.chat()` |
| `honcho_conclude` | No | Write a persistent fact/conclusion about the user |
Tool visibility depends on `recallMode`: hidden in `context` mode, always present in `tools` and `hybrid`.
## Config Resolution
Config is read from the first file that exists:
@ -34,42 +111,128 @@ Config is read from the first file that exists:
Host key is derived from the active Hermes profile: `hermes` (default) or `hermes.<profile>`.
For every key, resolution order is: **host block > root > env var > default**.
## Full Configuration Reference
### Identity & Connection
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `apiKey` | string | — | API key. Falls back to `HONCHO_API_KEY` env var |
| `baseUrl` | string | — | Base URL for self-hosted Honcho. Local URLs auto-skip API key auth |
| `environment` | string | `"production"` | SDK environment mapping |
| `enabled` | bool | auto | Master toggle. Auto-enables when `apiKey` or `baseUrl` present |
| `workspace` | string | host key | Honcho workspace ID. Shared environment — all profiles in the same workspace can see the same user identity and related memories |
| `peerName` | string | — | User peer identity |
| `aiPeer` | string | host key | AI peer identity |
### Memory & Recall
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `recallMode` | string | `"hybrid"` | `"hybrid"` (auto-inject + tools), `"context"` (auto-inject only, tools hidden), `"tools"` (tools only, no injection). Legacy `"auto"` normalizes to `"hybrid"` |
| `observationMode` | string | `"directional"` | Preset: `"directional"` (all on) or `"unified"` (shared pool). Use `observation` object for granular control |
| `observation` | object | — | Per-peer observation config (see Observation section) |
### Write Behavior
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `writeFrequency` | string/int | `"async"` | `"async"` (background), `"turn"` (sync per turn), `"session"` (batch on end), or integer N (every N turns) |
| `saveMessages` | bool | `true` | Persist messages to Honcho API |
### Session Resolution
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `sessionStrategy` | string | `"per-directory"` | `"per-directory"`, `"per-session"`, `"per-repo"` (git root), `"global"` |
| `sessionPeerPrefix` | bool | `false` | Prepend peer name to session keys |
| `sessions` | object | `{}` | Manual directory-to-session-name mappings |
#### Session Name Resolution
The Honcho session name determines which conversation bucket memory lands in. Resolution follows a priority chain — first match wins:
| Priority | Source | Example session name |
|----------|--------|---------------------|
| 1 | Manual map (`sessions` config) | `"myproject-main"` |
| 2 | `/title` command (mid-session rename) | `"refactor-auth"` |
| 3 | Gateway session key (Telegram, Discord, etc.) | `"agent-main-telegram-dm-8439114563"` |
| 4 | `per-session` strategy | Hermes session ID (`20260415_a3f2b1`) |
| 5 | `per-repo` strategy | Git root directory name (`hermes-agent`) |
| 6 | `per-directory` strategy | Current directory basename (`src`) |
| 7 | `global` strategy | Workspace name (`hermes`) |
Gateway platforms always resolve via priority 3 (per-chat isolation) regardless of `sessionStrategy`. The strategy setting only affects CLI sessions.
If `sessionPeerPrefix` is `true`, the peer name is prepended: `eri-hermes-agent`.
#### What each strategy produces
- **`per-directory`** — basename of `$PWD`. Opening hermes in `~/code/myapp` and `~/code/other` gives two separate sessions. Same directory = same session across runs.
- **`per-repo`** — git root directory name. All subdirectories within a repo share one session. Falls back to `per-directory` if not inside a git repo.
- **`per-session`** — Hermes session ID (timestamp + hex). Every `hermes` invocation starts a fresh Honcho session. Falls back to `per-directory` if no session ID is available.
- **`global`** — workspace name. One session for everything. Memory accumulates across all directories and runs.
### Multi-Profile Pattern
Multiple Hermes profiles can share one workspace while maintaining separate AI identities. Config resolution is **host block > root > env var > default** — host blocks inherit from root, so shared settings only need to be declared once:
```json
{
"apiKey": "***",
"workspace": "hermes",
"peerName": "yourname",
"hosts": {
"hermes": {
"aiPeer": "hermes",
"recallMode": "hybrid",
"sessionStrategy": "per-directory"
},
"hermes.coder": {
"aiPeer": "coder",
"recallMode": "tools",
"sessionStrategy": "per-repo"
}
}
}
```
Both profiles see the same user (`yourname`) in the same shared environment (`hermes`), but each AI peer builds its own observations, conclusions, and behavior patterns. The coder's memory stays code-oriented; the main agent's stays broad.
Host key is derived from the active Hermes profile: `hermes` (default) or `hermes.<profile>` (e.g. `hermes -p coder` → host key `hermes.coder`).
### Dialectic & Reasoning
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `dialecticDepth` | int | `1` | Passes per dialectic cycle (1-3, clamped). 1=single query, 2=audit+synthesis, 3=audit+synthesis+reconciliation |
| `dialecticDepthLevels` | array | — | Optional array of reasoning level strings per pass. Overrides proportional defaults. Example: `["minimal", "low", "medium"]` |
| `dialecticReasoningLevel` | string | `"low"` | Base reasoning level for `.chat()`: `"minimal"`, `"low"`, `"medium"`, `"high"`, `"max"` |
| `dialecticDynamic` | bool | `true` | When `true`, model can override reasoning level per-call via `honcho_reasoning` tool. When `false`, always uses `dialecticReasoningLevel` |
| `dialecticMaxChars` | int | `600` | Max chars of the dialectic result injected into the memory context block |
| `dialecticMaxInputChars` | int | `10000` | Max chars for dialectic query input to `.chat()`. Honcho cloud limit: 10k |
### Token Budgets
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `contextTokens` | int | SDK default | Token budget for `context()` API calls. Also gates prefetch truncation (tokens × 4 chars) |
| `messageMaxChars` | int | `25000` | Max chars per message sent via `add_messages()`. Exceeding this triggers chunking with `[continued]` markers. Honcho cloud limit: 25k |
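The chunking behavior can be sketched as below. The `[continued]` marker text is from the docs; the exact split policy (plain character split, marker prefix on continuation chunks) is an assumption.

```python
# Sketch of messageMaxChars chunking: messages over the limit are
# split, and continuation chunks carry a "[continued]" marker while
# still fitting under the per-message limit.
def chunk_message(text: str, max_chars: int = 25000) -> list[str]:
    if len(text) <= max_chars:
        return [text]
    marker = "[continued] "
    chunks = [text[:max_chars]]
    rest = text[max_chars:]
    step = max_chars - len(marker)  # leave room for the marker
    while rest:
        chunks.append(marker + rest[:step])
        rest = rest[step:]
    return chunks
```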
### Cadence (Cost Control)
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `contextCadence` | int | `1` | Minimum turns between base context refreshes (session summary + representation + card) |
| `dialecticCadence` | int | `1` | Minimum turns between dialectic `.chat()` firings |
| `injectionFrequency` | string | `"every-turn"` | `"every-turn"` or `"first-turn"` (inject context on the first user message only, skip from turn 2 onward) |
| `reasoningLevelCap` | string | — | Hard cap on reasoning level: `"minimal"`, `"low"`, `"medium"`, `"high"` |
### Observation (Granular)
Maps 1:1 to Honcho's per-peer `SessionPeerConfig`. When present, overrides `observationMode` preset.
```json
"observation": {
@ -85,74 +248,16 @@ Maps 1:1 to Honcho's per-peer `SessionPeerConfig`. Set at root or per host block
| `ai.observeMe` | `true` | AI peer self-observation (Honcho builds AI representation) |
| `ai.observeOthers` | `true` | AI peer observes user messages (enables cross-peer dialectic) |
Presets:
- `"directional"` (default): all four `true`
- `"unified"`: user `observeMe=true`, AI `observeOthers=true`, rest `false`
### Hardcoded Limits
| Limit | Value |
|-------|-------|
| Search tool max tokens | 2000 (hard cap), 800 (default) |
| Peer card fetch tokens | 200 |
## Environment Variables
@ -182,15 +287,16 @@ Host key derivation: `HERMES_HONCHO_HOST` env > active profile (`hermes.<profile
```json
{
"apiKey": "***",
"workspace": "hermes",
"peerName": "username",
"contextCadence": 2,
"dialecticCadence": 3,
"dialecticDepth": 2,
"hosts": {
"hermes": {
"enabled": true,
"aiPeer": "hermes",
"recallMode": "hybrid",
"observation": {
"user": { "observeMe": true, "observeOthers": true },
@ -199,14 +305,16 @@ Host key derivation: `HERMES_HONCHO_HOST` env > active profile (`hermes.<profile
"writeFrequency": "async",
"sessionStrategy": "per-directory",
"dialecticReasoningLevel": "low",
"dialecticDepth": 2,
"dialecticMaxChars": 600,
"saveMessages": true
},
"hermes.coder": {
"enabled": true,
"aiPeer": "coder",
"sessionStrategy": "per-repo",
"dialecticDepth": 1,
"dialecticDepthLevels": ["low"],
"observation": {
"user": { "observeMe": true, "observeOthers": false },
"ai": { "observeMe": true, "observeOthers": true }


@ -17,6 +17,7 @@ from __future__ import annotations
import json
import logging
import re
import threading
from typing import Any, Dict, List, Optional
@ -33,20 +34,33 @@ logger = logging.getLogger(__name__)
PROFILE_SCHEMA = {
"name": "honcho_profile",
"description": (
"Retrieve or update a peer card from Honcho — a curated list of key facts "
"about that peer (name, role, preferences, communication style, patterns). "
"Pass `card` to update; omit `card` to read."
),
"parameters": {
"type": "object",
"properties": {
"peer": {
"type": "string",
"description": "Peer to query. Built-in aliases: 'user' (default), 'ai'. Or pass any peer ID from this workspace.",
},
"card": {
"type": "array",
"items": {"type": "string"},
"description": "New peer card as a list of fact strings. Omit to read the current card.",
},
},
"required": [],
},
}
SEARCH_SCHEMA = {
"name": "honcho_search",
"description": (
"Semantic search over Honcho's stored context about a peer. "
"Returns raw excerpts ranked by relevance — no LLM synthesis. "
"Cheaper and faster than honcho_reasoning. "
"Good when you want to find specific past facts and reason over them yourself."
),
"parameters": {
@ -60,17 +74,23 @@ SEARCH_SCHEMA = {
"type": "integer",
"description": "Token budget for returned context (default 800, max 2000).",
},
"peer": {
"type": "string",
"description": "Peer to query. Built-in aliases: 'user' (default), 'ai'. Or pass any peer ID from this workspace.",
},
},
"required": ["query"],
},
}
REASONING_SCHEMA = {
"name": "honcho_reasoning",
"description": (
"Ask Honcho a natural language question and get a synthesized answer. "
"Uses Honcho's LLM (dialectic reasoning) — higher cost than honcho_profile or honcho_search. "
"Can query about any peer via alias or explicit peer ID. "
"Pass reasoning_level to control depth: minimal (fast/cheap), low (default), "
"medium, high, max (deep/expensive). Omit for configured default."
),
"parameters": {
"type": "object",
@ -79,37 +99,87 @@ CONTEXT_SCHEMA = {
"type": "string",
"description": "A natural language question.",
},
"reasoning_level": {
"type": "string",
"description": (
"Override the default reasoning depth. "
"Omit to use the configured default (typically low). "
"Guide:\n"
"- minimal: quick factual lookups (name, role, simple preference)\n"
"- low: straightforward questions with clear answers\n"
"- medium: multi-aspect questions requiring synthesis across observations\n"
"- high: complex behavioral patterns, contradictions, deep analysis\n"
"- max: thorough audit-level analysis, leave no stone unturned"
),
"enum": ["minimal", "low", "medium", "high", "max"],
},
"peer": {
"type": "string",
"description": "Peer to query. Built-in aliases: 'user' (default), 'ai'. Or pass any peer ID from this workspace.",
},
},
"required": ["query"],
},
}
CONTEXT_SCHEMA = {
"name": "honcho_context",
"description": (
"Retrieve full session context from Honcho — summary, peer representation, "
"peer card, and recent messages. No LLM synthesis. "
"Cheaper than honcho_reasoning. Use this to see what Honcho knows about "
"the current conversation and the specified peer."
),
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Optional focus query to filter context. Omit for full session context snapshot.",
},
"peer": {
"type": "string",
"description": "Peer to query. Built-in aliases: 'user' (default), 'ai'. Or pass any peer ID from this workspace.",
},
},
"required": [],
},
}
CONCLUDE_SCHEMA = {
"name": "honcho_conclude",
"description": (
"Write or delete a conclusion about a peer in Honcho's memory. "
"Conclusions are persistent facts that build a peer's profile. "
"You MUST pass exactly one of: `conclusion` (to create) or `delete_id` (to delete). "
"Passing neither is an error. "
"Deletion is only for PII removal — Honcho self-heals incorrect conclusions over time."
),
"parameters": {
"type": "object",
"properties": {
"conclusion": {
"type": "string",
"description": "A factual statement to persist. Required when not using delete_id.",
},
"delete_id": {
"type": "string",
"description": "Conclusion ID to delete (for PII removal). Required when not using conclusion.",
},
"peer": {
"type": "string",
"description": "Peer to query. Built-in aliases: 'user' (default), 'ai'. Or pass any peer ID from this workspace.",
},
},
"anyOf": [
{"required": ["conclusion"]},
{"required": ["delete_id"]},
],
},
}
ALL_TOOL_SCHEMAS = [PROFILE_SCHEMA, SEARCH_SCHEMA, REASONING_SCHEMA, CONTEXT_SCHEMA, CONCLUDE_SCHEMA]
# ---------------------------------------------------------------------------
@ -131,16 +201,18 @@ class HonchoMemoryProvider(MemoryProvider):
# B1: recall_mode — set during initialize from config
self._recall_mode = "hybrid" # "context", "tools", or "hybrid"
# Base context cache — refreshed on context_cadence, not frozen
self._base_context_cache: Optional[str] = None
self._base_context_lock = threading.Lock()
# B5: Cost-awareness turn counting and cadence
self._turn_count = 0
self._injection_frequency = "every-turn" # or "first-turn"
self._context_cadence = 1 # minimum turns between context API calls
self._dialectic_cadence = 3 # minimum turns between dialectic API calls
self._dialectic_depth = 1 # how many .chat() calls per dialectic cycle (1-3)
self._dialectic_depth_levels: list[str] | None = None # per-pass reasoning levels
self._reasoning_level_cap: Optional[str] = None # "minimal", "low", "medium", "high"
self._last_context_turn = -999
self._last_dialectic_turn = -999
@ -236,9 +308,11 @@ class HonchoMemoryProvider(MemoryProvider):
raw = cfg.raw or {}
self._injection_frequency = raw.get("injectionFrequency", "every-turn")
self._context_cadence = int(raw.get("contextCadence", 1))
self._dialectic_cadence = int(raw.get("dialecticCadence", 3))
self._dialectic_depth = max(1, min(cfg.dialectic_depth, 3))
self._dialectic_depth_levels = cfg.dialectic_depth_levels
cap = raw.get("reasoningLevelCap")
if cap and cap in ("minimal", "low", "medium", "high"):
self._reasoning_level_cap = cap
except Exception as e:
logger.debug("Honcho cost-awareness config parse error: %s", e)
@ -251,9 +325,7 @@ class HonchoMemoryProvider(MemoryProvider):
# ----- Port #1957: lazy session init for tools-only mode -----
if self._recall_mode == "tools":
if cfg.init_on_session_start:
# Eager init even in tools mode (opt-in)
self._do_session_init(cfg, session_id, **kwargs)
return
# Defer actual session creation until first tool call
@ -287,8 +359,13 @@ class HonchoMemoryProvider(MemoryProvider):
# ----- B3: resolve_session_name -----
session_title = kwargs.get("session_title")
gateway_session_key = kwargs.get("gateway_session_key")
self._session_key = (
cfg.resolve_session_name(
session_title=session_title,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
or session_id
or "hermes-default"
)
@ -299,12 +376,21 @@ class HonchoMemoryProvider(MemoryProvider):
self._session_initialized = True
# ----- B6: Memory file migration (one-time, for new sessions) -----
# Skip under per-session strategy: every Hermes run creates a fresh
# Honcho session by design, so uploading MEMORY.md/USER.md/SOUL.md to
# each one would flood the backend with short-lived duplicates instead
# of performing a one-time migration.
try:
if not session.messages and cfg.session_strategy != "per-session":
from hermes_constants import get_hermes_home
mem_dir = str(get_hermes_home() / "memories")
self._manager.migrate_memory_files(self._session_key, mem_dir)
logger.debug("Honcho memory file migration attempted for new session: %s", self._session_key)
elif cfg.session_strategy == "per-session":
logger.debug(
"Honcho memory file migration skipped: per-session strategy creates a fresh session per run (%s)",
self._session_key,
)
except Exception as e:
logger.debug("Honcho memory file migration skipped: %s", e)
@ -347,6 +433,11 @@ class HonchoMemoryProvider(MemoryProvider):
"""Format the prefetch context dict into a readable system prompt block."""
parts = []
# Session summary — session-scoped context, placed first for relevance
summary = ctx.get("summary", "")
if summary:
parts.append(f"## Session Summary\n{summary}")
rep = ctx.get("representation", "")
if rep:
parts.append(f"## User Representation\n{rep}")
@ -370,9 +461,9 @@ class HonchoMemoryProvider(MemoryProvider):
def system_prompt_block(self) -> str:
"""Return system prompt text, adapted by recall_mode.
Returns only the mode header and tool instructions static text
that doesn't change between turns (prompt-cache friendly).
Live context (representation, card) is injected via prefetch().
"""
if self._cron_skipped:
return ""
@ -382,24 +473,10 @@ class HonchoMemoryProvider(MemoryProvider):
return (
"# Honcho Memory\n"
"Active (tools-only mode). Use honcho_profile, honcho_search, "
"honcho_reasoning, honcho_context, and honcho_conclude tools to access user memory."
)
return ""
# ----- B1: adapt text based on recall_mode -----
if self._recall_mode == "context":
header = (
@ -412,7 +489,8 @@ class HonchoMemoryProvider(MemoryProvider):
header = (
"# Honcho Memory\n"
"Active (tools-only mode). Use honcho_profile for a quick factual snapshot, "
"honcho_search for raw excerpts, honcho_context for synthesized answers, "
"honcho_search for raw excerpts, honcho_context for raw peer context, "
"honcho_reasoning for synthesized answers, "
"honcho_conclude to save facts about the user. "
"No automatic context injection — you must use tools to access memory."
)
@ -421,16 +499,19 @@ class HonchoMemoryProvider(MemoryProvider):
"# Honcho Memory\n"
"Active (hybrid mode). Relevant context is auto-injected AND memory tools are available. "
"Use honcho_profile for a quick factual snapshot, "
"honcho_search for raw excerpts, honcho_context for synthesized answers, "
"honcho_search for raw excerpts, honcho_context for raw peer context, "
"honcho_reasoning for synthesized answers, "
"honcho_conclude to save facts about the user."
)
if first_turn_block:
return f"{header}\n\n{first_turn_block}"
return header
def prefetch(self, query: str, *, session_id: str = "") -> str:
"""Return prefetched dialectic context from background thread.
"""Return base context (representation + card) plus dialectic supplement.
Assembles two layers:
1. Base context from peer.context() — cached, refreshed on context_cadence
2. Dialectic supplement — cached, refreshed on dialectic_cadence
B1: Returns empty when recall_mode is "tools" (no injection).
B5: Respects injection_frequency — "first-turn" returns empty after the first turn.
@ -443,22 +524,95 @@ class HonchoMemoryProvider(MemoryProvider):
if self._recall_mode == "tools":
return ""
# B5: injection_frequency — if "first-turn" and past first turn, return empty
if self._injection_frequency == "first-turn" and self._turn_count > 0:
# B5: injection_frequency — if "first-turn" and past first turn, return empty.
# _turn_count is 1-indexed (first user message = 1), so > 1 means "past first".
if self._injection_frequency == "first-turn" and self._turn_count > 1:
return ""
parts = []
# ----- Layer 1: Base context (representation + card) -----
# On first call, fetch synchronously so turn 1 isn't empty.
# After that, serve from cache and refresh in background on cadence.
with self._base_context_lock:
if self._base_context_cache is None:
# First call — synchronous fetch
try:
ctx = self._manager.get_prefetch_context(self._session_key)
self._base_context_cache = self._format_first_turn_context(ctx) if ctx else ""
self._last_context_turn = self._turn_count
except Exception as e:
logger.debug("Honcho base context fetch failed: %s", e)
self._base_context_cache = ""
base_context = self._base_context_cache
# Check if background context prefetch has a fresher result
if self._manager:
fresh_ctx = self._manager.pop_context_result(self._session_key)
if fresh_ctx:
formatted = self._format_first_turn_context(fresh_ctx)
if formatted:
with self._base_context_lock:
self._base_context_cache = formatted
base_context = formatted
if base_context:
parts.append(base_context)
# ----- Layer 2: Dialectic supplement -----
# On the very first turn, no queue_prefetch() has run yet so the
# dialectic result is empty. Run with a bounded timeout so a slow
# Honcho connection doesn't block the first response indefinitely.
# On timeout the result is skipped and queue_prefetch() will pick it
# up at the next cadence-allowed turn.
if self._last_dialectic_turn == -999 and query:
_first_turn_timeout = (
self._config.timeout if self._config and self._config.timeout else 8.0
)
_result_holder: list[str] = []
def _run_first_turn() -> None:
try:
_result_holder.append(self._run_dialectic_depth(query))
except Exception as exc:
logger.debug("Honcho first-turn dialectic failed: %s", exc)
_t = threading.Thread(target=_run_first_turn, daemon=True)
_t.start()
_t.join(timeout=_first_turn_timeout)
if not _t.is_alive():
first_turn_dialectic = _result_holder[0] if _result_holder else ""
if first_turn_dialectic and first_turn_dialectic.strip():
with self._prefetch_lock:
self._prefetch_result = first_turn_dialectic
self._last_dialectic_turn = self._turn_count
else:
logger.debug(
"Honcho first-turn dialectic timed out (%.1fs) — "
"will inject at next cadence-allowed turn",
_first_turn_timeout,
)
# Don't update _last_dialectic_turn: queue_prefetch() will
# retry at the next cadence-allowed turn via the async path.
if self._prefetch_thread and self._prefetch_thread.is_alive():
self._prefetch_thread.join(timeout=3.0)
with self._prefetch_lock:
result = self._prefetch_result
dialectic_result = self._prefetch_result
self._prefetch_result = ""
if not result:
if dialectic_result and dialectic_result.strip():
parts.append(dialectic_result)
if not parts:
return ""
result = "\n\n".join(parts)
# ----- Port #3265: token budget enforcement -----
result = self._truncate_to_budget(result)
return f"## Honcho Context\n{result}"
return result
def _truncate_to_budget(self, text: str) -> str:
"""Truncate text to fit within context_tokens budget if set."""
@ -475,9 +629,11 @@ class HonchoMemoryProvider(MemoryProvider):
return truncated + "…"
def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
"""Fire a background dialectic query for the upcoming turn.
"""Fire background prefetch threads for the upcoming turn.
B5: Checks cadence before firing background threads.
B5: Checks cadence independently for dialectic and context refresh.
Context refresh updates the base layer (representation + card).
Dialectic fires the LLM reasoning supplement.
"""
if self._cron_skipped:
return
@ -488,6 +644,15 @@ class HonchoMemoryProvider(MemoryProvider):
if self._recall_mode == "tools":
return
# ----- Context refresh (base layer) — independent cadence -----
if self._context_cadence <= 1 or (self._turn_count - self._last_context_turn) >= self._context_cadence:
self._last_context_turn = self._turn_count
try:
self._manager.prefetch_context(self._session_key, query)
except Exception as e:
logger.debug("Honcho context prefetch failed: %s", e)
# ----- Dialectic prefetch (supplement layer) -----
# B5: cadence check — skip if too soon since last dialectic call
if self._dialectic_cadence > 1:
if (self._turn_count - self._last_dialectic_turn) < self._dialectic_cadence:
@ -499,9 +664,7 @@ class HonchoMemoryProvider(MemoryProvider):
def _run():
try:
result = self._manager.dialectic_query(
self._session_key, query, peer="user"
)
result = self._run_dialectic_depth(query)
if result and result.strip():
with self._prefetch_lock:
self._prefetch_result = result
@ -513,13 +676,140 @@ class HonchoMemoryProvider(MemoryProvider):
)
self._prefetch_thread.start()
# Also fire context prefetch if cadence allows
if self._context_cadence <= 1 or (self._turn_count - self._last_context_turn) >= self._context_cadence:
self._last_context_turn = self._turn_count
try:
self._manager.prefetch_context(self._session_key, query)
except Exception as e:
logger.debug("Honcho context prefetch failed: %s", e)
# ----- Dialectic depth: multi-pass .chat() with cold/warm prompts -----
# Proportional reasoning levels per depth/pass when dialecticDepthLevels
# is not configured. The base level is dialecticReasoningLevel.
# Index: (depth, pass) → level relative to base.
_PROPORTIONAL_LEVELS: dict[tuple[int, int], str] = {
# depth 1: single pass at base level
(1, 0): "base",
# depth 2: pass 0 lighter, pass 1 at base
(2, 0): "minimal",
(2, 1): "base",
# depth 3: pass 0 lighter, pass 1 at base, pass 2 one above minimal
(3, 0): "minimal",
(3, 1): "base",
(3, 2): "low",
}
_LEVEL_ORDER = ("minimal", "low", "medium", "high", "max")
def _resolve_pass_level(self, pass_idx: int) -> str:
"""Resolve reasoning level for a given pass index.
Uses dialecticDepthLevels if configured, otherwise proportional
defaults relative to dialecticReasoningLevel.
"""
if self._dialectic_depth_levels and pass_idx < len(self._dialectic_depth_levels):
return self._dialectic_depth_levels[pass_idx]
base = (self._config.dialectic_reasoning_level if self._config else "low")
mapping = self._PROPORTIONAL_LEVELS.get((self._dialectic_depth, pass_idx))
if mapping is None or mapping == "base":
return base
return mapping
def _build_dialectic_prompt(self, pass_idx: int, prior_results: list[str], is_cold: bool) -> str:
"""Build the prompt for a given dialectic pass.
Pass 0: cold start (general user query) or warm (session-scoped).
Pass 1: self-audit / targeted synthesis against gaps from pass 0.
Pass 2: reconciliation / contradiction check across prior passes.
"""
if pass_idx == 0:
if is_cold:
return (
"Who is this person? What are their preferences, goals, "
"and working style? Focus on facts that would help an AI "
"assistant be immediately useful."
)
return (
"Given what's been discussed in this session so far, what "
"context about this user is most relevant to the current "
"conversation? Prioritize active context over biographical facts."
)
elif pass_idx == 1:
prior = prior_results[-1] if prior_results else ""
return (
f"Given this initial assessment:\n\n{prior}\n\n"
"What gaps remain in your understanding that would help "
"going forward? Synthesize what you actually know about "
"the user's current state and immediate needs, grounded "
"in evidence from recent sessions."
)
else:
# pass 2: reconciliation
return (
f"Prior passes produced:\n\n"
f"Pass 1:\n{prior_results[0] if len(prior_results) > 0 else '(empty)'}\n\n"
f"Pass 2:\n{prior_results[1] if len(prior_results) > 1 else '(empty)'}\n\n"
"Do these assessments cohere? Reconcile any contradictions "
"and produce a final, concise synthesis of what matters most "
"for the current conversation."
)
@staticmethod
def _signal_sufficient(result: str) -> bool:
"""Check if a dialectic pass returned enough signal to skip further passes.
Heuristic: a response longer than 100 chars with some structure
(section headers, bullets, or an ordered list) is considered sufficient.
"""
if not result or len(result.strip()) < 100:
return False
# Structured output with sections/bullets is strong signal
if "\n" in result and (
"##" in result
or "•" in result
or re.search(r"^[*-] ", result, re.MULTILINE)
or re.search(r"^\s*\d+\. ", result, re.MULTILINE)
):
return True
# Long enough even without structure
return len(result.strip()) > 300
def _run_dialectic_depth(self, query: str) -> str:
"""Execute up to dialecticDepth .chat() calls with conditional bail-out.
Cold start (no base context): general user-oriented query.
Warm session (base context exists): session-scoped query.
Each pass is conditional — bails early if the prior pass returned strong signal.
Returns the best (usually last) result.
"""
if not self._manager or not self._session_key:
return ""
is_cold = not self._base_context_cache
results: list[str] = []
for i in range(self._dialectic_depth):
if i == 0:
prompt = self._build_dialectic_prompt(0, results, is_cold)
else:
# Skip further passes if prior pass delivered strong signal
if results and self._signal_sufficient(results[-1]):
logger.debug("Honcho dialectic depth %d: pass %d skipped, prior signal sufficient",
self._dialectic_depth, i)
break
prompt = self._build_dialectic_prompt(i, results, is_cold)
level = self._resolve_pass_level(i)
logger.debug("Honcho dialectic depth %d: pass %d, level=%s, cold=%s",
self._dialectic_depth, i, level, is_cold)
result = self._manager.dialectic_query(
self._session_key, prompt,
reasoning_level=level,
peer="user",
)
results.append(result or "")
# Return the last non-empty result (deepest pass that ran)
for r in reversed(results):
if r and r.strip():
return r
return ""
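The bail-out heuristic above can be exercised in isolation. This is a standalone re-implementation of the `_signal_sufficient` logic for illustration only (the function name and sample strings here are hypothetical, not part of the plugin):

```python
import re

def signal_sufficient(result: str) -> bool:
    """Mirror of the heuristic: >100 chars with structure, or >300 chars plain."""
    if not result or len(result.strip()) < 100:
        return False
    # Structured output with sections/bullets is strong signal
    if "\n" in result and (
        "##" in result
        or "•" in result
        or re.search(r"^[*-] ", result, re.MULTILINE)
        or re.search(r"^\s*\d+\. ", result, re.MULTILINE)
    ):
        return True
    # Long enough even without structure
    return len(result.strip()) > 300

structured = (
    "## Preferences\n"
    "- prefers terse replies\n"
    "- works mostly in Rust and Python\n"
    "- currently building a memory plugin for an agent framework\n"
)
print(signal_sufficient(structured))   # structured and over 100 chars
print(signal_sufficient("x" * 150))    # unstructured, not past the 300-char bar
print(signal_sufficient("y" * 350))    # long enough even without structure
```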
def on_turn_start(self, turn_number: int, message: str, **kwargs) -> None:
"""Track turn count for cadence and injection_frequency logic."""
@ -659,7 +949,14 @@ class HonchoMemoryProvider(MemoryProvider):
try:
if tool_name == "honcho_profile":
card = self._manager.get_peer_card(self._session_key)
peer = args.get("peer", "user")
card_update = args.get("card")
if card_update:
result = self._manager.set_peer_card(self._session_key, card_update, peer=peer)
if result is None:
return tool_error("Failed to update peer card.")
return json.dumps({"result": f"Peer card updated ({len(result)} facts).", "card": result})
card = self._manager.get_peer_card(self._session_key, peer=peer)
if not card:
return json.dumps({"result": "No profile facts available yet."})
return json.dumps({"result": card})
@ -669,30 +966,64 @@ class HonchoMemoryProvider(MemoryProvider):
if not query:
return tool_error("Missing required parameter: query")
max_tokens = min(int(args.get("max_tokens", 800)), 2000)
peer = args.get("peer", "user")
result = self._manager.search_context(
self._session_key, query, max_tokens=max_tokens
self._session_key, query, max_tokens=max_tokens, peer=peer
)
if not result:
return json.dumps({"result": "No relevant context found."})
return json.dumps({"result": result})
elif tool_name == "honcho_context":
elif tool_name == "honcho_reasoning":
query = args.get("query", "")
if not query:
return tool_error("Missing required parameter: query")
peer = args.get("peer", "user")
reasoning_level = args.get("reasoning_level")
result = self._manager.dialectic_query(
self._session_key, query, peer=peer
self._session_key, query,
reasoning_level=reasoning_level,
peer=peer,
)
# Update cadence tracker so auto-injection respects the gap after an explicit call
self._last_dialectic_turn = self._turn_count
return json.dumps({"result": result or "No result from Honcho."})
elif tool_name == "honcho_context":
peer = args.get("peer", "user")
ctx = self._manager.get_session_context(self._session_key, peer=peer)
if not ctx:
return json.dumps({"result": "No context available yet."})
parts = []
if ctx.get("summary"):
parts.append(f"## Summary\n{ctx['summary']}")
if ctx.get("representation"):
parts.append(f"## Representation\n{ctx['representation']}")
if ctx.get("card"):
parts.append(f"## Card\n{ctx['card']}")
if ctx.get("recent_messages"):
msgs = ctx["recent_messages"]
msg_str = "\n".join(
f" [{m['role']}] {m['content'][:200]}"
for m in msgs[-5:] # last 5 for brevity
)
parts.append(f"## Recent messages\n{msg_str}")
return json.dumps({"result": "\n\n".join(parts) or "No context available."})
elif tool_name == "honcho_conclude":
delete_id = args.get("delete_id")
peer = args.get("peer", "user")
if delete_id:
ok = self._manager.delete_conclusion(self._session_key, delete_id, peer=peer)
if ok:
return json.dumps({"result": f"Conclusion {delete_id} deleted."})
return tool_error(f"Failed to delete conclusion {delete_id}.")
conclusion = args.get("conclusion", "")
if not conclusion:
return tool_error("Missing required parameter: conclusion")
ok = self._manager.create_conclusion(self._session_key, conclusion)
return tool_error("Missing required parameter: conclusion or delete_id")
ok = self._manager.create_conclusion(self._session_key, conclusion, peer=peer)
if ok:
return json.dumps({"result": f"Conclusion saved: {conclusion}"})
return json.dumps({"result": f"Conclusion saved for {peer}: {conclusion}"})
return tool_error("Failed to save conclusion.")
return tool_error(f"Unknown tool: {tool_name}")

View file
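Both refresh layers in the provider above gate on the same cadence arithmetic. A minimal sketch of that check (a hypothetical helper, extracted here for illustration):

```python
def cadence_due(turn_count: int, last_turn: int, cadence: int) -> bool:
    # cadence <= 1 means "every turn"; otherwise the gap must reach the cadence
    return cadence <= 1 or (turn_count - last_turn) >= cadence

# With dialecticCadence=3 (the new default), a call on turn 1 suppresses
# turns 2-3 and fires again on turn 4, roughly a 66% reduction in LLM calls.
print(cadence_due(2, 1, 3))  # gap of 1 < 3: not due
print(cadence_due(4, 1, 3))  # gap of 3 >= 3: due
print(cadence_due(5, 5, 1))  # cadence 1: always due
```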

@ -440,11 +440,43 @@ def cmd_setup(args) -> None:
if new_recall in ("hybrid", "context", "tools"):
hermes_host["recallMode"] = new_recall
# --- 7. Session strategy ---
current_strat = hermes_host.get("sessionStrategy") or cfg.get("sessionStrategy", "per-directory")
# --- 7. Context token budget ---
current_ctx_tokens = hermes_host.get("contextTokens") or cfg.get("contextTokens")
current_display = str(current_ctx_tokens) if current_ctx_tokens else "uncapped"
print("\n Context injection per turn (hybrid/context recall modes only):")
print(" uncapped -- no limit (default)")
print(" N -- token limit per turn (e.g. 1200)")
new_ctx_tokens = _prompt("Context tokens", default=current_display)
if new_ctx_tokens.strip().lower() in ("none", "uncapped", "no limit"):
hermes_host.pop("contextTokens", None)
elif new_ctx_tokens.strip() == "":
pass # keep current
else:
try:
val = int(new_ctx_tokens)
if val >= 0:
hermes_host["contextTokens"] = val
except (ValueError, TypeError):
pass # keep current
# --- 7b. Dialectic cadence ---
current_dialectic = str(hermes_host.get("dialecticCadence") or cfg.get("dialecticCadence") or "3")
print("\n Dialectic cadence:")
print(" How often Honcho rebuilds its user model (LLM call on Honcho backend).")
print(" 1 = every turn (aggressive), 3 = every 3 turns (recommended), 5+ = sparse.")
new_dialectic = _prompt("Dialectic cadence", default=current_dialectic)
try:
val = int(new_dialectic)
if val >= 1:
hermes_host["dialecticCadence"] = val
except (ValueError, TypeError):
hermes_host["dialecticCadence"] = 3
# --- 8. Session strategy ---
current_strat = hermes_host.get("sessionStrategy") or cfg.get("sessionStrategy", "per-session")
print("\n Session strategy:")
print(" per-directory -- one session per working directory (default)")
print(" per-session -- new Honcho session each run")
print(" per-session -- each run starts clean, Honcho injects context automatically")
print(" per-directory -- reuses session per dir, prior context auto-injected each run")
print(" per-repo -- one session per git repository")
print(" global -- single session across all directories")
new_strat = _prompt("Session strategy", default=current_strat)
@ -490,10 +522,11 @@ def cmd_setup(args) -> None:
print(f" Recall: {hcfg.recall_mode}")
print(f" Sessions: {hcfg.session_strategy}")
print("\n Honcho tools available in chat:")
print(" honcho_context -- ask Honcho about the user (LLM-synthesized)")
print(" honcho_search -- semantic search over history (no LLM)")
print(" honcho_profile -- peer card, key facts (no LLM)")
print(" honcho_conclude -- persist a user fact to memory (no LLM)")
print(" honcho_context -- session context: summary, representation, card, messages")
print(" honcho_search -- semantic search over history")
print(" honcho_profile -- peer card, key facts")
print(" honcho_reasoning -- ask Honcho a question, synthesized answer")
print(" honcho_conclude -- persist a user fact to memory")
print("\n Other commands:")
print(" hermes honcho status -- show full config")
print(" hermes honcho mode -- change recall/observation mode")
@ -585,13 +618,26 @@ def cmd_status(args) -> None:
print(f" Enabled: {hcfg.enabled}")
print(f" API key: {masked}")
print(f" Workspace: {hcfg.workspace_id}")
print(f" Config path: {active_path}")
# Config paths — show where config was read from and where writes go
global_path = Path.home() / ".honcho" / "config.json"
print(f" Config: {active_path}")
if write_path != active_path:
print(f" Write path: {write_path} (instance-local)")
print(f" Write to: {write_path} (profile-local)")
if active_path == global_path:
print(f" Fallback: (none — using global ~/.honcho/config.json)")
elif global_path.exists():
print(f" Fallback: {global_path} (exists, cross-app interop)")
print(f" AI peer: {hcfg.ai_peer}")
print(f" User peer: {hcfg.peer_name or 'not set'}")
print(f" Session key: {hcfg.resolve_session_name()}")
print(f" Session strat: {hcfg.session_strategy}")
print(f" Recall mode: {hcfg.recall_mode}")
print(f" Context budget: {hcfg.context_tokens or '(uncapped)'} tokens")
raw = getattr(hcfg, "raw", None) or {}
dialectic_cadence = raw.get("dialecticCadence") or 3
print(f" Dialectic cad: every {dialectic_cadence} turn{'s' if dialectic_cadence != 1 else ''}")
print(f" Observation: user(me={hcfg.user_observe_me},others={hcfg.user_observe_others}) ai(me={hcfg.ai_observe_me},others={hcfg.ai_observe_others})")
print(f" Write freq: {hcfg.write_frequency}")
@ -599,8 +645,8 @@ def cmd_status(args) -> None:
print("\n Connection... ", end="", flush=True)
try:
client = get_honcho_client(hcfg)
print("OK")
_show_peer_cards(hcfg, client)
print("OK")
except Exception as e:
print(f"FAILED ({e})\n")
else:
@ -824,6 +870,41 @@ def cmd_mode(args) -> None:
print(f" {label}Recall mode -> {mode_arg} ({MODES[mode_arg]})\n")
def cmd_strategy(args) -> None:
"""Show or set the session strategy."""
STRATEGIES = {
"per-session": "each run starts clean, Honcho injects context automatically",
"per-directory": "reuses session per dir, prior context auto-injected each run",
"per-repo": "one session per git repository",
"global": "single session across all directories",
}
cfg = _read_config()
strat_arg = getattr(args, "strategy", None)
if strat_arg is None:
current = (
(cfg.get("hosts") or {}).get(_host_key(), {}).get("sessionStrategy")
or cfg.get("sessionStrategy")
or "per-session"
)
print("\nHoncho session strategy\n" + "─" * 40)
for s, desc in STRATEGIES.items():
marker = " <-" if s == current else ""
print(f" {s:<15} {desc}{marker}")
print(f"\n Set with: hermes honcho strategy [per-session|per-directory|per-repo|global]\n")
return
if strat_arg not in STRATEGIES:
print(f" Invalid strategy '{strat_arg}'. Options: {', '.join(STRATEGIES)}\n")
return
host = _host_key()
label = f"[{host}] " if host != "hermes" else ""
cfg.setdefault("hosts", {}).setdefault(host, {})["sessionStrategy"] = strat_arg
_write_config(cfg)
print(f" {label}Session strategy -> {strat_arg} ({STRATEGIES[strat_arg]})\n")
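After this command runs, the host block in the config file would contain something like the following (illustrative shape only; the key names match those used above, the values are examples):

```json
{
  "hosts": {
    "hermes": {
      "sessionStrategy": "per-repo",
      "dialecticCadence": 3,
      "contextTokens": 1200
    }
  }
}
```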
def cmd_tokens(args) -> None:
"""Show or set token budget settings."""
cfg = _read_config()
@ -1143,10 +1224,11 @@ def cmd_migrate(args) -> None:
print(" automatically. Files become the seed, not the live store.")
print()
print(" Honcho tools (available to the agent during conversation)")
print(" honcho_context — ask Honcho a question, get a synthesized answer (LLM)")
print(" honcho_search — semantic search over stored context (no LLM)")
print(" honcho_profile — fast peer card snapshot (no LLM)")
print(" honcho_conclude — write a conclusion/fact back to memory (no LLM)")
print(" honcho_context — session context: summary, representation, card, messages")
print(" honcho_search — semantic search over stored context")
print(" honcho_profile — fast peer card snapshot")
print(" honcho_reasoning — ask Honcho a question, synthesized answer")
print(" honcho_conclude — write a conclusion/fact back to memory")
print()
print(" Session naming")
print(" OpenClaw: no persistent session concept — files are global.")
@ -1197,6 +1279,8 @@ def honcho_command(args) -> None:
cmd_peer(args)
elif sub == "mode":
cmd_mode(args)
elif sub == "strategy":
cmd_strategy(args)
elif sub == "tokens":
cmd_tokens(args)
elif sub == "identity":
@ -1211,7 +1295,7 @@ def honcho_command(args) -> None:
cmd_sync(args)
else:
print(f" Unknown honcho command: {sub}")
print(" Available: status, sessions, map, peer, mode, tokens, identity, migrate, enable, disable, sync\n")
print(" Available: status, sessions, map, peer, mode, strategy, tokens, identity, migrate, enable, disable, sync\n")
def register_cli(subparser) -> None:
@ -1270,6 +1354,15 @@ def register_cli(subparser) -> None:
help="Recall mode to set (hybrid/context/tools). Omit to show current.",
)
strategy_parser = subs.add_parser(
"strategy", help="Show or set session strategy (per-session/per-directory/per-repo/global)",
)
strategy_parser.add_argument(
"strategy", nargs="?", metavar="STRATEGY",
choices=("per-session", "per-directory", "per-repo", "global"),
help="Session strategy to set. Omit to show current.",
)
tokens_parser = subs.add_parser(
"tokens", help="Show or set token budget for context and dialectic",
)

View file

@ -58,7 +58,8 @@ def resolve_config_path() -> Path:
Resolution order:
1. $HERMES_HOME/honcho.json (profile-local, if it exists)
2. ~/.honcho/config.json (global, cross-app interop)
2. ~/.hermes/honcho.json (default profile — shared host blocks live here)
3. ~/.honcho/config.json (global, cross-app interop)
Returns the global path if none exist (for first-time setup writes).
"""
@ -66,6 +67,11 @@ def resolve_config_path() -> Path:
if local_path.exists():
return local_path
# Default profile's config — host blocks accumulate here via setup/clone
default_path = Path.home() / ".hermes" / "honcho.json"
if default_path != local_path and default_path.exists():
return default_path
return GLOBAL_CONFIG_PATH
@ -88,6 +94,68 @@ def _resolve_bool(host_val, root_val, *, default: bool) -> bool:
return default
def _parse_context_tokens(host_val, root_val) -> int | None:
"""Parse contextTokens: host wins, then root, then None (uncapped)."""
for val in (host_val, root_val):
if val is not None:
try:
return int(val)
except (ValueError, TypeError):
pass
return None
def _parse_dialectic_depth(host_val, root_val) -> int:
"""Parse dialecticDepth: host wins, then root, then 1. Clamped to 1-3."""
for val in (host_val, root_val):
if val is not None:
try:
return max(1, min(int(val), 3))
except (ValueError, TypeError):
pass
return 1
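The clamping behavior can be checked standalone (a copy of `_parse_dialectic_depth` for illustration; note that a configured 0 clamps to 1 rather than falling through to the root value):

```python
def parse_dialectic_depth(host_val, root_val) -> int:
    # Host wins, then root, then 1; any parseable value is clamped to 1-3
    for val in (host_val, root_val):
        if val is not None:
            try:
                return max(1, min(int(val), 3))
            except (ValueError, TypeError):
                pass
    return 1

print(parse_dialectic_depth(None, None))  # default 1
print(parse_dialectic_depth(7, None))     # clamped to 3
print(parse_dialectic_depth(0, 2))        # 0 clamps to 1, does not fall through
print(parse_dialectic_depth("x", 2))      # unparseable host falls through to root
```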
_VALID_REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
def _parse_dialectic_depth_levels(host_val, root_val, depth: int) -> list[str] | None:
"""Parse dialecticDepthLevels: optional array of reasoning levels per pass.
Returns None when not configured (use proportional defaults).
When configured, validates each level and truncates/pads to match depth.
"""
for val in (host_val, root_val):
if val is not None and isinstance(val, list):
levels = [
lvl if lvl in _VALID_REASONING_LEVELS else "low"
for lvl in val[:depth]
]
# Pad with "low" if array is shorter than depth
while len(levels) < depth:
levels.append("low")
return levels
return None
def _resolve_optional_float(*values: Any) -> float | None:
"""Return the first non-empty value coerced to a positive float."""
for value in values:
if value is None:
continue
if isinstance(value, str):
value = value.strip()
if not value:
continue
try:
parsed = float(value)
except (TypeError, ValueError):
continue
if parsed > 0:
return parsed
return None
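The float resolver's skip rules are easy to misread, so here is a standalone sketch (re-implemented for illustration): `None`, empty strings, junk, and non-positive values all fall through to the next candidate.

```python
def resolve_optional_float(*values):
    # First value that parses to a positive float wins; None/"" are skipped
    for value in values:
        if value is None:
            continue
        if isinstance(value, str):
            value = value.strip()
            if not value:
                continue
        try:
            parsed = float(value)
        except (TypeError, ValueError):
            continue
        if parsed > 0:
            return parsed
    return None

print(resolve_optional_float(None, "", "12.5"))  # first usable value: 12.5
print(resolve_optional_float("-3", "abc"))       # negatives and junk rejected: None
```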
_VALID_OBSERVATION_MODES = {"unified", "directional"}
_OBSERVATION_MODE_ALIASES = {"shared": "unified", "separate": "directional", "cross": "directional"}
@ -153,6 +221,8 @@ class HonchoClientConfig:
environment: str = "production"
# Optional base URL for self-hosted Honcho (overrides environment mapping)
base_url: str | None = None
# Optional request timeout in seconds for Honcho SDK HTTP calls
timeout: float | None = None
# Identity
peer_name: str | None = None
ai_peer: str = "hermes"
@ -162,17 +232,25 @@ class HonchoClientConfig:
# Write frequency: "async" (background thread), "turn" (sync per turn),
# "session" (flush on session end), or int (every N turns)
write_frequency: str | int = "async"
# Prefetch budget
# Prefetch budget (None = no cap; set to an integer to bound auto-injected context)
context_tokens: int | None = None
# Dialectic (peer.chat) settings
# reasoning_level: "minimal" | "low" | "medium" | "high" | "max"
dialectic_reasoning_level: str = "low"
# dynamic: auto-bump reasoning level based on query length
# true — low->medium (120+ chars), low->high (400+ chars), capped at "high"
# false — always use dialecticReasoningLevel as-is
# When true, the model can override reasoning_level per-call via the
# honcho_reasoning tool param (agentic). When false, always uses
# dialecticReasoningLevel and ignores model-provided overrides.
dialectic_dynamic: bool = True
# Max chars of dialectic result to inject into Hermes system prompt
dialectic_max_chars: int = 600
# Dialectic depth: how many .chat() calls per dialectic cycle (1-3).
# Depth 1: single call. Depth 2: self-audit + targeted synthesis.
# Depth 3: self-audit + synthesis + reconciliation.
dialectic_depth: int = 1
# Optional per-pass reasoning level override. Array of reasoning levels
# matching dialectic_depth length. When None, uses proportional defaults
# derived from dialectic_reasoning_level.
dialectic_depth_levels: list[str] | None = None
# Honcho API limits — configurable for self-hosted instances
# Max chars per message sent via add_messages() (Honcho cloud: 25000)
message_max_chars: int = 25000
@ -183,10 +261,8 @@ class HonchoClientConfig:
# "context" — auto-injected context only, Honcho tools removed
# "tools" — Honcho tools only, no auto-injected context
recall_mode: str = "hybrid"
# When True and recallMode is "tools", create the Honcho session eagerly
# during initialize() instead of deferring to the first tool call.
# This ensures sync_turn() can write from the very first turn.
# Does NOT enable automatic context injection — only changes init timing.
# Eager init in tools mode — when true, initializes session during
# initialize() instead of deferring to first tool call
init_on_session_start: bool = False
# Observation mode: legacy string shorthand ("directional" or "unified").
# Kept for backward compat; granular per-peer booleans below are preferred.
@ -218,12 +294,14 @@ class HonchoClientConfig:
resolved_host = host or resolve_active_host()
api_key = os.environ.get("HONCHO_API_KEY")
base_url = os.environ.get("HONCHO_BASE_URL", "").strip() or None
timeout = _resolve_optional_float(os.environ.get("HONCHO_TIMEOUT"))
return cls(
host=resolved_host,
workspace_id=workspace_id,
api_key=api_key,
environment=os.environ.get("HONCHO_ENVIRONMENT", "production"),
base_url=base_url,
timeout=timeout,
ai_peer=resolved_host,
enabled=bool(api_key or base_url),
)
@ -284,6 +362,11 @@ class HonchoClientConfig:
or os.environ.get("HONCHO_BASE_URL", "").strip()
or None
)
timeout = _resolve_optional_float(
raw.get("timeout"),
raw.get("requestTimeout"),
os.environ.get("HONCHO_TIMEOUT"),
)
# Auto-enable when API key or base_url is present (unless explicitly disabled)
# Host-level enabled wins, then root-level, then auto-enable if key/url exists.
@ -329,12 +412,16 @@ class HonchoClientConfig:
api_key=api_key,
environment=environment,
base_url=base_url,
timeout=timeout,
peer_name=host_block.get("peerName") or raw.get("peerName"),
ai_peer=ai_peer,
enabled=enabled,
save_messages=save_messages,
write_frequency=write_frequency,
context_tokens=host_block.get("contextTokens") or raw.get("contextTokens"),
context_tokens=_parse_context_tokens(
host_block.get("contextTokens"),
raw.get("contextTokens"),
),
dialectic_reasoning_level=(
host_block.get("dialecticReasoningLevel")
or raw.get("dialecticReasoningLevel")
@ -350,6 +437,15 @@ class HonchoClientConfig:
or raw.get("dialecticMaxChars")
or 600
),
dialectic_depth=_parse_dialectic_depth(
host_block.get("dialecticDepth"),
raw.get("dialecticDepth"),
),
dialectic_depth_levels=_parse_dialectic_depth_levels(
host_block.get("dialecticDepthLevels"),
raw.get("dialecticDepthLevels"),
depth=_parse_dialectic_depth(host_block.get("dialecticDepth"), raw.get("dialecticDepth")),
),
message_max_chars=int(
host_block.get("messageMaxChars")
or raw.get("messageMaxChars")
@ -416,16 +512,18 @@ class HonchoClientConfig:
cwd: str | None = None,
session_title: str | None = None,
session_id: str | None = None,
gateway_session_key: str | None = None,
) -> str | None:
"""Resolve Honcho session name.
Resolution order:
1. Manual directory override from sessions map
2. Hermes session title (from /title command)
3. per-session strategy — Hermes session_id ({timestamp}_{hex})
4. per-repo strategy — git repo root directory name
5. per-directory strategy — directory basename
6. global strategy — workspace name
3. Gateway session key (stable per-chat identifier from gateway platforms)
4. per-session strategy — Hermes session_id ({timestamp}_{hex})
5. per-repo strategy — git repo root directory name
6. per-directory strategy — directory basename
7. global strategy — workspace name
"""
import re
@ -439,12 +537,22 @@ class HonchoClientConfig:
# /title mid-session remap
if session_title:
sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_title).strip('-')
sanitized = re.sub(r'[^a-zA-Z0-9_-]+', '-', session_title).strip('-')
if sanitized:
if self.session_peer_prefix and self.peer_name:
return f"{self.peer_name}-{sanitized}"
return sanitized
# Gateway session key: stable per-chat identifier passed by the gateway
# (e.g. "agent:main:telegram:dm:8439114563"). Sanitize colons to hyphens
# for Honcho session ID compatibility. This takes priority over strategy-
# based resolution because gateway platforms need per-chat isolation that
# cwd-based strategies cannot provide.
if gateway_session_key:
sanitized = re.sub(r'[^a-zA-Z0-9_-]+', '-', gateway_session_key).strip('-')
if sanitized:
return sanitized
# per-session: inherit Hermes session_id (new Honcho session each run)
if self.session_strategy == "per-session" and session_id:
if self.session_peer_prefix and self.peer_name:
@@ -506,13 +614,20 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
# mapping, enabling remote self-hosted Honcho deployments without
# requiring the server to live on localhost.
resolved_base_url = config.base_url
if not resolved_base_url:
resolved_timeout = config.timeout
if not resolved_base_url or resolved_timeout is None:
try:
from hermes_cli.config import load_config
hermes_cfg = load_config()
honcho_cfg = hermes_cfg.get("honcho", {})
if isinstance(honcho_cfg, dict):
resolved_base_url = honcho_cfg.get("base_url", "").strip() or None
if not resolved_base_url:
resolved_base_url = honcho_cfg.get("base_url", "").strip() or None
if resolved_timeout is None:
resolved_timeout = _resolve_optional_float(
honcho_cfg.get("timeout"),
honcho_cfg.get("request_timeout"),
)
except Exception:
pass
@@ -547,6 +662,8 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
}
if resolved_base_url:
kwargs["base_url"] = resolved_base_url
if resolved_timeout is not None:
kwargs["timeout"] = resolved_timeout
_honcho_client = Honcho(**kwargs)


@@ -486,36 +486,9 @@ class HonchoSessionManager:
_REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
def _dynamic_reasoning_level(self, query: str) -> str:
"""
Pick a reasoning level for a dialectic query.
When dialecticDynamic is true (default), auto-bumps based on query
length so Honcho applies more inference where it matters:
< 120 chars -> configured default (typically "low")
120-400 chars -> +1 level above default (cap at "high")
> 400 chars -> +2 levels above default (cap at "high")
"max" is never selected automatically -- reserve it for explicit config.
When dialecticDynamic is false, always returns the configured level.
"""
if not self._dialectic_dynamic:
return self._dialectic_reasoning_level
levels = self._REASONING_LEVELS
default_idx = levels.index(self._dialectic_reasoning_level) if self._dialectic_reasoning_level in levels else 1
n = len(query)
if n < 120:
bump = 0
elif n < 400:
bump = 1
else:
bump = 2
# Cap at "high" (index 3) for auto-selection
idx = min(default_idx + bump, 3)
return levels[idx]
def _default_reasoning_level(self) -> str:
"""Return the configured default reasoning level."""
return self._dialectic_reasoning_level
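For reference, the dynamic-bump behavior removed in this hunk (described by the deleted docstring) amounted to a length-keyed index bump, capped below "max":

```python
LEVELS = ("minimal", "low", "medium", "high", "max")

def dynamic_level(default: str, query_len: int) -> str:
    """Sketch of the removed auto-bump: short queries keep the default,
    longer ones get +1/+2 levels, capped at "high" (index 3)."""
    idx = LEVELS.index(default) if default in LEVELS else 1  # fall back to "low"
    bump = 0 if query_len < 120 else (1 if query_len < 400 else 2)
    return LEVELS[min(idx + bump, 3)]
```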
def dialectic_query(
self, session_key: str, query: str,
@@ -532,8 +505,9 @@ class HonchoSessionManager:
Args:
session_key: The session key to query against.
query: Natural language question.
reasoning_level: Override the config default. If None, uses
_dynamic_reasoning_level(query).
reasoning_level: Override the configured default (dialecticReasoningLevel).
Only honored when dialecticDynamic is true.
If None or dialecticDynamic is false, uses the configured default.
peer: Which peer to query: "user" (default) or "ai".
Returns:
@@ -543,29 +517,34 @@ class HonchoSessionManager:
if not session:
return ""
target_peer_id = self._resolve_peer_id(session, peer)
if target_peer_id is None:
return ""
# Guard: truncate query to Honcho's dialectic input limit
if len(query) > self._dialectic_max_input_chars:
query = query[:self._dialectic_max_input_chars].rsplit(" ", 1)[0]
level = reasoning_level or self._dynamic_reasoning_level(query)
if self._dialectic_dynamic and reasoning_level:
level = reasoning_level
else:
level = self._default_reasoning_level()
try:
if self._ai_observe_others:
# AI peer can observe user — use cross-observation routing
if peer == "ai":
ai_peer_obj = self._get_or_create_peer(session.assistant_peer_id)
# AI peer can observe other peers — use assistant as observer.
ai_peer_obj = self._get_or_create_peer(session.assistant_peer_id)
if target_peer_id == session.assistant_peer_id:
result = ai_peer_obj.chat(query, reasoning_level=level) or ""
else:
ai_peer_obj = self._get_or_create_peer(session.assistant_peer_id)
result = ai_peer_obj.chat(
query,
target=session.user_peer_id,
target=target_peer_id,
reasoning_level=level,
) or ""
else:
# AI can't observe others — each peer queries self
peer_id = session.assistant_peer_id if peer == "ai" else session.user_peer_id
target_peer = self._get_or_create_peer(peer_id)
# Without cross-observation, each peer queries its own context.
target_peer = self._get_or_create_peer(target_peer_id)
result = target_peer.chat(query, reasoning_level=level) or ""
# Apply Hermes-side char cap before caching
@@ -647,10 +626,11 @@ class HonchoSessionManager:
"""
Pre-fetch user and AI peer context from Honcho.
Fetches peer_representation and peer_card for both peers. search_query
is intentionally omitted — it would only affect additional excerpts
that this code does not consume, and passing the raw message exposes
conversation content in server access logs.
Fetches peer_representation and peer_card for both peers, plus the
session summary when available. search_query is intentionally omitted —
it would only affect additional excerpts that this code does not
consume, and passing the raw message exposes conversation content in
server access logs.
Args:
session_key: The session key to get context for.
@@ -658,15 +638,29 @@ class HonchoSessionManager:
Returns:
Dictionary with 'representation', 'card', 'ai_representation',
and 'ai_card' keys.
'ai_card', and optionally 'summary' keys.
"""
session = self._cache.get(session_key)
if not session:
return {}
result: dict[str, str] = {}
# Session summary — provides session-scoped context.
# Fresh sessions (per-session cold start, or first-ever per-directory)
# return null summary — the guard below handles that gracefully.
# Per-directory returning sessions get their accumulated summary.
try:
user_ctx = self._fetch_peer_context(session.user_peer_id)
honcho_session = self._sessions_cache.get(session.honcho_session_id)
if honcho_session:
ctx = honcho_session.context(summary=True)
if ctx.summary and getattr(ctx.summary, "content", None):
result["summary"] = ctx.summary.content
except Exception as e:
logger.debug("Failed to fetch session summary from Honcho: %s", e)
try:
user_ctx = self._fetch_peer_context(session.user_peer_id, target=session.user_peer_id)
result["representation"] = user_ctx["representation"]
result["card"] = "\n".join(user_ctx["card"])
except Exception as e:
@@ -674,7 +668,7 @@ class HonchoSessionManager:
# Also fetch AI peer's own representation so Hermes knows itself.
try:
ai_ctx = self._fetch_peer_context(session.assistant_peer_id)
ai_ctx = self._fetch_peer_context(session.assistant_peer_id, target=session.assistant_peer_id)
result["ai_representation"] = ai_ctx["representation"]
result["ai_card"] = "\n".join(ai_ctx["card"])
except Exception as e:
@@ -862,7 +856,7 @@ class HonchoSessionManager:
return [str(item) for item in card if item]
return [str(card)]
def _fetch_peer_card(self, peer_id: str) -> list[str]:
def _fetch_peer_card(self, peer_id: str, *, target: str | None = None) -> list[str]:
"""Fetch a peer card directly from the peer object.
This avoids relying on session.context(), which can return an empty
@@ -872,22 +866,33 @@ class HonchoSessionManager:
peer = self._get_or_create_peer(peer_id)
getter = getattr(peer, "get_card", None)
if callable(getter):
return self._normalize_card(getter())
return self._normalize_card(getter(target=target) if target is not None else getter())
legacy_getter = getattr(peer, "card", None)
if callable(legacy_getter):
return self._normalize_card(legacy_getter())
return self._normalize_card(legacy_getter(target=target) if target is not None else legacy_getter())
return []
def _fetch_peer_context(self, peer_id: str, search_query: str | None = None) -> dict[str, Any]:
def _fetch_peer_context(
self,
peer_id: str,
search_query: str | None = None,
*,
target: str | None = None,
) -> dict[str, Any]:
"""Fetch representation + peer card directly from a peer object."""
peer = self._get_or_create_peer(peer_id)
representation = ""
card: list[str] = []
try:
ctx = peer.context(search_query=search_query) if search_query else peer.context()
context_kwargs: dict[str, Any] = {}
if target is not None:
context_kwargs["target"] = target
if search_query is not None:
context_kwargs["search_query"] = search_query
ctx = peer.context(**context_kwargs) if context_kwargs else peer.context()
representation = (
getattr(ctx, "representation", None)
or getattr(ctx, "peer_representation", None)
@@ -899,24 +904,111 @@ class HonchoSessionManager:
if not representation:
try:
representation = peer.representation() or ""
representation = (
peer.representation(target=target) if target is not None else peer.representation()
) or ""
except Exception as e:
logger.debug("Direct peer.representation() failed for '%s': %s", peer_id, e)
if not card:
try:
card = self._fetch_peer_card(peer_id)
card = self._fetch_peer_card(peer_id, target=target)
except Exception as e:
logger.debug("Direct peer card fetch failed for '%s': %s", peer_id, e)
return {"representation": representation, "card": card}
def get_peer_card(self, session_key: str) -> list[str]:
def get_session_context(self, session_key: str, peer: str = "user") -> dict[str, Any]:
"""Fetch full session context from Honcho including summary.
Uses the session-level context() API which returns summary,
peer_representation, peer_card, and messages.
"""
Fetch the user peer's card — a curated list of key facts.
session = self._cache.get(session_key)
if not session:
return {}
honcho_session = self._sessions_cache.get(session.honcho_session_id)
if not honcho_session:
# Fall back to peer-level context, respecting the requested peer
peer_id = self._resolve_peer_id(session, peer)
if peer_id is None:
peer_id = session.user_peer_id
return self._fetch_peer_context(peer_id, target=peer_id)
try:
peer_id = self._resolve_peer_id(session, peer)
ctx = honcho_session.context(
summary=True,
peer_target=peer_id,
peer_perspective=session.user_peer_id if peer == "user" else session.assistant_peer_id,
)
result: dict[str, Any] = {}
# Summary
if ctx.summary:
result["summary"] = ctx.summary.content
# Peer representation and card
if ctx.peer_representation:
result["representation"] = ctx.peer_representation
if ctx.peer_card:
result["card"] = "\n".join(ctx.peer_card)
# Messages (last N for context)
if ctx.messages:
recent = ctx.messages[-10:] # last 10 messages
result["recent_messages"] = [
{"role": getattr(m, "peer_id", "unknown"), "content": (m.content or "")[:500]}
for m in recent
]
return result
except Exception as e:
logger.debug("Session context fetch failed: %s", e)
return {}
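The recent-message trimming performed in get_session_context above (last 10 messages, 500-char bodies) can be sketched over plain dicts; the real code reads attributes off Honcho message objects instead:

```python
def window_messages(messages: list[dict], last_n: int = 10, max_chars: int = 500) -> list[dict]:
    """Keep only the last N messages and cap each body, mirroring the
    recent_messages shaping in get_session_context (dict-based sketch)."""
    return [
        {
            "role": m.get("peer_id", "unknown"),
            # Guard against None bodies before slicing, as the original does.
            "content": (m.get("content") or "")[:max_chars],
        }
        for m in messages[-last_n:]
    ]
```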
def _resolve_peer_id(self, session: HonchoSession, peer: str | None) -> str:
"""Resolve a peer alias or explicit peer ID to a concrete Honcho peer ID.
Always returns a non-empty string: either a known peer ID or a
sanitized version of the caller-supplied alias/ID.
"""
candidate = (peer or "user").strip()
if not candidate:
return session.user_peer_id
normalized = self._sanitize_id(candidate)
if normalized == self._sanitize_id("user"):
return session.user_peer_id
if normalized == self._sanitize_id("ai"):
return session.assistant_peer_id
return normalized
def _resolve_observer_target(
self,
session: HonchoSession,
peer: str | None,
) -> tuple[str, str | None]:
"""Resolve observer and target peer IDs for context/search/profile queries."""
target_peer_id = self._resolve_peer_id(session, peer)
if target_peer_id == session.assistant_peer_id:
return session.assistant_peer_id, session.assistant_peer_id
if self._ai_observe_others:
return session.assistant_peer_id, target_peer_id
return target_peer_id, None
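The observer/target routing in _resolve_observer_target follows a three-case truth table, restated here as a standalone sketch (hypothetical function name):

```python
def route_observation(target_id: str, assistant_id: str, ai_observe_others: bool):
    """Return (observer_peer_id, target_or_None), as _resolve_observer_target does."""
    if target_id == assistant_id:
        # The assistant always introspects itself directly.
        return assistant_id, assistant_id
    if ai_observe_others:
        # Cross-observation enabled: the assistant observes the target peer.
        return assistant_id, target_id
    # Cross-observation disabled: the peer queries its own context.
    return target_id, None
```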
def get_peer_card(self, session_key: str, peer: str = "user") -> list[str]:
"""
Fetch a peer card — a curated list of key facts.
Fast, no LLM reasoning. Returns raw structured facts Honcho has
inferred about the user (name, role, preferences, patterns).
inferred about the target peer (name, role, preferences, patterns).
Empty list if unavailable.
"""
session = self._cache.get(session_key)
@@ -924,12 +1016,19 @@ class HonchoSessionManager:
return []
try:
return self._fetch_peer_card(session.user_peer_id)
observer_peer_id, target_peer_id = self._resolve_observer_target(session, peer)
return self._fetch_peer_card(observer_peer_id, target=target_peer_id)
except Exception as e:
logger.debug("Failed to fetch peer card from Honcho: %s", e)
return []
def search_context(self, session_key: str, query: str, max_tokens: int = 800) -> str:
def search_context(
self,
session_key: str,
query: str,
max_tokens: int = 800,
peer: str = "user",
) -> str:
"""
Semantic search over Honcho session context.
@@ -941,6 +1040,7 @@ class HonchoSessionManager:
session_key: Session to search against.
query: Search query for semantic matching.
max_tokens: Token budget for returned content.
peer: Peer alias or explicit peer ID to search about.
Returns:
Relevant context excerpts as a string, or empty string if none.
@@ -950,7 +1050,13 @@ class HonchoSessionManager:
return ""
try:
ctx = self._fetch_peer_context(session.user_peer_id, search_query=query)
observer_peer_id, target = self._resolve_observer_target(session, peer)
ctx = self._fetch_peer_context(
observer_peer_id,
search_query=query,
target=target,
)
parts = []
if ctx["representation"]:
parts.append(ctx["representation"])
@@ -962,16 +1068,17 @@ class HonchoSessionManager:
logger.debug("Honcho search_context failed: %s", e)
return ""
def create_conclusion(self, session_key: str, content: str) -> bool:
"""Write a conclusion about the user back to Honcho.
def create_conclusion(self, session_key: str, content: str, peer: str = "user") -> bool:
"""Write a conclusion about a target peer back to Honcho.
Conclusions are facts the AI peer observes about the user —
preferences, corrections, clarifications, project context.
They feed into the user's peer card and representation.
Conclusions are facts a peer observes about another peer or itself —
preferences, corrections, clarifications, and project context.
They feed into the target peer's card and representation.
Args:
session_key: Session to associate the conclusion with.
content: The conclusion text (e.g. "User prefers dark mode").
content: The conclusion text.
peer: Peer alias or explicit peer ID. "user" is the default alias.
Returns:
True on success, False on failure.
@@ -985,25 +1092,90 @@ class HonchoSessionManager:
return False
try:
if self._ai_observe_others:
# AI peer creates conclusion about user (cross-observation)
target_peer_id = self._resolve_peer_id(session, peer)
if target_peer_id is None:
logger.warning("Could not resolve conclusion peer '%s' for session '%s'", peer, session_key)
return False
if target_peer_id == session.assistant_peer_id:
assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
conclusions_scope = assistant_peer.conclusions_of(session.user_peer_id)
conclusions_scope = assistant_peer.conclusions_of(session.assistant_peer_id)
elif self._ai_observe_others:
assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
conclusions_scope = assistant_peer.conclusions_of(target_peer_id)
else:
# AI can't observe others — user peer creates self-conclusion
user_peer = self._get_or_create_peer(session.user_peer_id)
conclusions_scope = user_peer.conclusions_of(session.user_peer_id)
target_peer = self._get_or_create_peer(target_peer_id)
conclusions_scope = target_peer.conclusions_of(target_peer_id)
conclusions_scope.create([{
"content": content.strip(),
"session_id": session.honcho_session_id,
}])
logger.info("Created conclusion for %s: %s", session_key, content[:80])
logger.info("Created conclusion about %s for %s: %s", target_peer_id, session_key, content[:80])
return True
except Exception as e:
logger.error("Failed to create conclusion: %s", e)
return False
def delete_conclusion(self, session_key: str, conclusion_id: str, peer: str = "user") -> bool:
"""Delete a conclusion by ID. Use only for PII removal.
Args:
session_key: Session key for peer resolution.
conclusion_id: The conclusion ID to delete.
peer: Peer alias or explicit peer ID.
Returns:
True on success, False on failure.
"""
session = self._cache.get(session_key)
if not session:
return False
try:
target_peer_id = self._resolve_peer_id(session, peer)
if target_peer_id == session.assistant_peer_id:
observer = self._get_or_create_peer(session.assistant_peer_id)
scope = observer.conclusions_of(session.assistant_peer_id)
elif self._ai_observe_others:
observer = self._get_or_create_peer(session.assistant_peer_id)
scope = observer.conclusions_of(target_peer_id)
else:
target_peer = self._get_or_create_peer(target_peer_id)
scope = target_peer.conclusions_of(target_peer_id)
scope.delete(conclusion_id)
logger.info("Deleted conclusion %s for %s", conclusion_id, session_key)
return True
except Exception as e:
logger.error("Failed to delete conclusion %s: %s", conclusion_id, e)
return False
def set_peer_card(self, session_key: str, card: list[str], peer: str = "user") -> list[str] | None:
"""Update a peer's card.
Args:
session_key: Session key for peer resolution.
card: New peer card as list of fact strings.
peer: Peer alias or explicit peer ID.
Returns:
Updated card on success, None on failure.
"""
session = self._cache.get(session_key)
if not session:
return None
try:
peer_id = self._resolve_peer_id(session, peer)
if peer_id is None:
logger.warning("Could not resolve peer '%s' for set_peer_card in session '%s'", peer, session_key)
return None
peer_obj = self._get_or_create_peer(peer_id)
result = peer_obj.set_card(card)
logger.info("Updated peer card for %s (%d facts)", peer_id, len(card))
return result
except Exception as e:
logger.error("Failed to set peer card: %s", e)
return None
def seed_ai_identity(self, session_key: str, content: str, source: str = "manual") -> bool:
"""
Seed the AI peer's Honcho representation from text content.
@@ -1061,7 +1233,7 @@ class HonchoSessionManager:
return {"representation": "", "card": ""}
try:
ctx = self._fetch_peer_context(session.assistant_peer_id)
ctx = self._fetch_peer_context(session.assistant_peer_id, target=session.assistant_peer_id)
return {
"representation": ctx["representation"] or "",
"card": "\n".join(ctx["card"]),