feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623)

* feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. - Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). * fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. * feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. * refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. * feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt- and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add **kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup * fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.
2026-05-03 02:11:48 +00:00 · 2026-04-02 15:33:51 -07:00 · 2026-04-02 15:33:51 -07:00 · 924bc67eee
commit 924bc67eee
parent e0b2bdb089
69 changed files with 7501 additions and 2317 deletions
--- a/plugins/memory/honcho/README.md
+++ b/plugins/memory/honcho/README.md
@ -0,0 +1,35 @@
+# Honcho Memory Provider
+
+AI-native cross-session user modeling with dialectic Q&A, semantic search, peer cards, and persistent conclusions.
+
+## Requirements
+
+- `pip install honcho-ai`
+- Honcho API key from [app.honcho.dev](https://app.honcho.dev)
+
+## Setup
+
+```bash
+hermes memory setup    # select "honcho"
+```
+
+Or manually:
+```bash
+hermes config set memory.provider honcho
+echo "HONCHO_API_KEY=your-key" >> ~/.hermes/.env
+```
+
+## Config
+
+Config file: `$HERMES_HOME/honcho.json` (or `~/.honcho/config.json` legacy)
+
+Existing Honcho users: your config and data are preserved. Just set `memory.provider: honcho`.
+
+## Tools
+
+| Tool | Description |
+|------|-------------|
+| `honcho_profile` | User's peer card — key facts, no LLM |
+| `honcho_search` | Semantic search over stored context |
+| `honcho_context` | LLM-synthesized answer from memory |
+| `honcho_conclude` | Write a fact about the user to memory |
--- a/plugins/memory/honcho/init.py
+++ b/plugins/memory/honcho/init.py
@ -0,0 +1,355 @@
+"""Honcho memory plugin — MemoryProvider for Honcho AI-native memory.
+
+Provides cross-session user modeling with dialectic Q&A, semantic search,
+peer cards, and persistent conclusions via the Honcho SDK. Honcho provides AI-native cross-session user
+modeling with dialectic Q&A, semantic search, peer cards, and conclusions.
+
+The 4 tools (profile, search, context, conclude) are exposed through
+the MemoryProvider interface.
+
+Config: Uses the existing Honcho config chain:
+  1. $HERMES_HOME/honcho.json (profile-scoped)
+  2. ~/.honcho/config.json (legacy global)
+  3. Environment variables
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import threading
+from typing import Any, Dict, List, Optional
+
+from agent.memory_provider import MemoryProvider
+
+logger = logging.getLogger(__name__)
+
+
+# ---------------------------------------------------------------------------
+# Tool schemas (moved from tools/honcho_tools.py)
+# ---------------------------------------------------------------------------
+
+PROFILE_SCHEMA = {
+    "name": "honcho_profile",
+    "description": (
+        "Retrieve the user's peer card from Honcho — a curated list of key facts "
+        "about them (name, role, preferences, communication style, patterns). "
+        "Fast, no LLM reasoning, minimal cost. "
+        "Use this at conversation start or when you need a quick factual snapshot."
+    ),
+    "parameters": {"type": "object", "properties": {}, "required": []},
+}
+
+SEARCH_SCHEMA = {
+    "name": "honcho_search",
+    "description": (
+        "Semantic search over Honcho's stored context about the user. "
+        "Returns raw excerpts ranked by relevance — no LLM synthesis. "
+        "Cheaper and faster than honcho_context. "
+        "Good when you want to find specific past facts and reason over them yourself."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "What to search for in Honcho's memory.",
+            },
+            "max_tokens": {
+                "type": "integer",
+                "description": "Token budget for returned context (default 800, max 2000).",
+            },
+        },
+        "required": ["query"],
+    },
+}
+
+CONTEXT_SCHEMA = {
+    "name": "honcho_context",
+    "description": (
+        "Ask Honcho a natural language question and get a synthesized answer. "
+        "Uses Honcho's LLM (dialectic reasoning) — higher cost than honcho_profile or honcho_search. "
+        "Can query about any peer: the user (default) or the AI assistant."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "A natural language question.",
+            },
+            "peer": {
+                "type": "string",
+                "description": "Which peer to query about: 'user' (default) or 'ai'.",
+            },
+        },
+        "required": ["query"],
+    },
+}
+
+CONCLUDE_SCHEMA = {
+    "name": "honcho_conclude",
+    "description": (
+        "Write a conclusion about the user back to Honcho's memory. "
+        "Conclusions are persistent facts that build the user's profile. "
+        "Use when the user states a preference, corrects you, or shares "
+        "something to remember across sessions."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "conclusion": {
+                "type": "string",
+                "description": "A factual statement about the user to persist.",
+            }
+        },
+        "required": ["conclusion"],
+    },
+}
+
+
+# ---------------------------------------------------------------------------
+# MemoryProvider implementation
+# ---------------------------------------------------------------------------
+
+class HonchoMemoryProvider(MemoryProvider):
+    """Honcho AI-native memory with dialectic Q&A and persistent user modeling."""
+
+    def __init__(self):
+        self._manager = None   # HonchoSessionManager
+        self._config = None    # HonchoClientConfig
+        self._session_key = ""
+        self._prefetch_result = ""
+        self._prefetch_lock = threading.Lock()
+        self._prefetch_thread: Optional[threading.Thread] = None
+        self._sync_thread: Optional[threading.Thread] = None
+
+    @property
+    def name(self) -> str:
+        return "honcho"
+
+    def is_available(self) -> bool:
+        """Check if Honcho is configured. No network calls."""
+        try:
+            from plugins.memory.honcho.client import HonchoClientConfig
+            cfg = HonchoClientConfig.from_global_config()
+            return cfg.enabled and bool(cfg.api_key or cfg.base_url)
+        except Exception:
+            return False
+
+    def save_config(self, values, hermes_home):
+        """Write config to $HERMES_HOME/honcho.json (Honcho SDK native format)."""
+        import json
+        from pathlib import Path
+        config_path = Path(hermes_home) / "honcho.json"
+        existing = {}
+        if config_path.exists():
+            try:
+                existing = json.loads(config_path.read_text())
+            except Exception:
+                pass
+        existing.update(values)
+        config_path.write_text(json.dumps(existing, indent=2))
+
+    def get_config_schema(self):
+        return [
+            {"key": "api_key", "description": "Honcho API key", "secret": True, "env_var": "HONCHO_API_KEY", "url": "https://app.honcho.dev"},
+            {"key": "base_url", "description": "Honcho base URL", "default": "https://api.honcho.dev"},
+        ]
+
+    def initialize(self, session_id: str, **kwargs) -> None:
+        """Initialize Honcho session manager."""
+        try:
+            from plugins.memory.honcho.client import HonchoClientConfig, get_honcho_client
+            from plugins.memory.honcho.session import HonchoSessionManager
+
+            cfg = HonchoClientConfig.from_global_config()
+            if not cfg.enabled or not (cfg.api_key or cfg.base_url):
+                logger.debug("Honcho not configured — plugin inactive")
+                return
+
+            self._config = cfg
+            client = get_honcho_client(cfg)
+            self._manager = HonchoSessionManager(
+                honcho=client,
+                config=cfg,
+                context_tokens=cfg.context_tokens,
+            )
+
+            # Build session key from kwargs or session_id
+            platform = kwargs.get("platform", "cli")
+            user_id = kwargs.get("user_id", "")
+            if user_id:
+                self._session_key = f"{platform}:{user_id}"
+            else:
+                self._session_key = session_id
+
+        except ImportError:
+            logger.debug("honcho-ai package not installed — plugin inactive")
+        except Exception as e:
+            logger.warning("Honcho init failed: %s", e)
+            self._manager = None
+
+    def system_prompt_block(self) -> str:
+        if not self._manager or not self._session_key:
+            return ""
+        return (
+            "# Honcho Memory\n"
+            "Active. AI-native cross-session user modeling.\n"
+            "Use honcho_profile for a quick factual snapshot, "
+            "honcho_search for raw excerpts, honcho_context for synthesized answers, "
+            "honcho_conclude to save facts about the user."
+        )
+
+    def prefetch(self, query: str, *, session_id: str = "") -> str:
+        """Return prefetched dialectic context from background thread."""
+        if self._prefetch_thread and self._prefetch_thread.is_alive():
+            self._prefetch_thread.join(timeout=3.0)
+        with self._prefetch_lock:
+            result = self._prefetch_result
+            self._prefetch_result = ""
+        if not result:
+            return ""
+        return f"## Honcho Context\n{result}"
+
+    def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
+        """Fire a background dialectic query for the upcoming turn."""
+        if not self._manager or not self._session_key or not query:
+            return
+
+        def _run():
+            try:
+                result = self._manager.dialectic_query(
+                    self._session_key, query, peer="user"
+                )
+                if result and result.strip():
+                    with self._prefetch_lock:
+                        self._prefetch_result = result
+            except Exception as e:
+                logger.debug("Honcho prefetch failed: %s", e)
+
+        self._prefetch_thread = threading.Thread(
+            target=_run, daemon=True, name="honcho-prefetch"
+        )
+        self._prefetch_thread.start()
+
+    def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
+        """Record the conversation turn in Honcho (non-blocking)."""
+        if not self._manager or not self._session_key:
+            return
+
+        def _sync():
+            try:
+                session = self._manager.get_or_create_session(self._session_key)
+                session.add_message("user", user_content[:4000])
+                session.add_message("assistant", assistant_content[:4000])
+                # Flush to Honcho API
+                self._manager._flush_session(session)
+            except Exception as e:
+                logger.debug("Honcho sync_turn failed: %s", e)
+
+        if self._sync_thread and self._sync_thread.is_alive():
+            self._sync_thread.join(timeout=5.0)
+        self._sync_thread = threading.Thread(
+            target=_sync, daemon=True, name="honcho-sync"
+        )
+        self._sync_thread.start()
+
+    def on_memory_write(self, action: str, target: str, content: str) -> None:
+        """Mirror built-in user profile writes as Honcho conclusions."""
+        if action != "add" or target != "user" or not content:
+            return
+        if not self._manager or not self._session_key:
+            return
+
+        def _write():
+            try:
+                self._manager.create_conclusion(self._session_key, content)
+            except Exception as e:
+                logger.debug("Honcho memory mirror failed: %s", e)
+
+        t = threading.Thread(target=_write, daemon=True, name="honcho-memwrite")
+        t.start()
+
+    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
+        """Flush all pending messages to Honcho on session end."""
+        if not self._manager:
+            return
+        # Wait for pending sync
+        if self._sync_thread and self._sync_thread.is_alive():
+            self._sync_thread.join(timeout=10.0)
+        try:
+            self._manager.flush_all()
+        except Exception as e:
+            logger.debug("Honcho session-end flush failed: %s", e)
+
+    def get_tool_schemas(self) -> List[Dict[str, Any]]:
+        return [PROFILE_SCHEMA, SEARCH_SCHEMA, CONTEXT_SCHEMA, CONCLUDE_SCHEMA]
+
+    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
+        if not self._manager or not self._session_key:
+            return json.dumps({"error": "Honcho is not active for this session."})
+
+        try:
+            if tool_name == "honcho_profile":
+                card = self._manager.get_peer_card(self._session_key)
+                if not card:
+                    return json.dumps({"result": "No profile facts available yet."})
+                return json.dumps({"result": card})
+
+            elif tool_name == "honcho_search":
+                query = args.get("query", "")
+                if not query:
+                    return json.dumps({"error": "Missing required parameter: query"})
+                max_tokens = min(int(args.get("max_tokens", 800)), 2000)
+                result = self._manager.search_context(
+                    self._session_key, query, max_tokens=max_tokens
+                )
+                if not result:
+                    return json.dumps({"result": "No relevant context found."})
+                return json.dumps({"result": result})
+
+            elif tool_name == "honcho_context":
+                query = args.get("query", "")
+                if not query:
+                    return json.dumps({"error": "Missing required parameter: query"})
+                peer = args.get("peer", "user")
+                result = self._manager.dialectic_query(
+                    self._session_key, query, peer=peer
+                )
+                return json.dumps({"result": result or "No result from Honcho."})
+
+            elif tool_name == "honcho_conclude":
+                conclusion = args.get("conclusion", "")
+                if not conclusion:
+                    return json.dumps({"error": "Missing required parameter: conclusion"})
+                ok = self._manager.create_conclusion(self._session_key, conclusion)
+                if ok:
+                    return json.dumps({"result": f"Conclusion saved: {conclusion}"})
+                return json.dumps({"error": "Failed to save conclusion."})
+
+            return json.dumps({"error": f"Unknown tool: {tool_name}"})
+
+        except Exception as e:
+            logger.error("Honcho tool %s failed: %s", tool_name, e)
+            return json.dumps({"error": f"Honcho {tool_name} failed: {e}"})
+
+    def shutdown(self) -> None:
+        for t in (self._prefetch_thread, self._sync_thread):
+            if t and t.is_alive():
+                t.join(timeout=5.0)
+        # Flush any remaining messages
+        if self._manager:
+            try:
+                self._manager.flush_all()
+            except Exception:
+                pass
+
+
+# ---------------------------------------------------------------------------
+# Plugin entry point
+# ---------------------------------------------------------------------------
+
+def register(ctx) -> None:
+    """Register Honcho as a memory provider plugin."""
+    ctx.register_memory_provider(HonchoMemoryProvider())
--- a/plugins/memory/honcho/cli.py
+++ b/plugins/memory/honcho/cli.py
--- a/plugins/memory/honcho/client.py
+++ b/plugins/memory/honcho/client.py
@ -0,0 +1,485 @@
+"""Honcho client initialization and configuration.
+
+Resolution order for config file:
+  1. $HERMES_HOME/honcho.json  (instance-local, enables isolated Hermes instances)
+  2. ~/.honcho/config.json     (global, shared across all Honcho-enabled apps)
+  3. Environment variables     (HONCHO_API_KEY, HONCHO_ENVIRONMENT)
+
+Resolution order for host-specific settings:
+  1. Explicit host block fields (always win)
+  2. Flat/global fields from config root
+  3. Defaults (host name as workspace/peer)
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import logging
+from dataclasses import dataclass, field
+from pathlib import Path
+
+from hermes_constants import get_hermes_home
+from typing import Any, TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from honcho import Honcho
+
+logger = logging.getLogger(__name__)
+
+GLOBAL_CONFIG_PATH = Path.home() / ".honcho" / "config.json"
+HOST = "hermes"
+
+
+def resolve_active_host() -> str:
+    """Derive the Honcho host key from the active Hermes profile.
+
+    Resolution order:
+      1. HERMES_HONCHO_HOST env var (explicit override)
+      2. Active profile name via profiles system -> ``hermes.<profile>``
+      3. Fallback: ``"hermes"`` (default profile)
+    """
+    explicit = os.environ.get("HERMES_HONCHO_HOST", "").strip()
+    if explicit:
+        return explicit
+
+    try:
+        from hermes_cli.profiles import get_active_profile_name
+        profile = get_active_profile_name()
+        if profile and profile not in ("default", "custom"):
+            return f"{HOST}.{profile}"
+    except Exception:
+        pass
+    return HOST
+
+
+def resolve_config_path() -> Path:
+    """Return the active Honcho config path.
+
+    Resolution order:
+      1. $HERMES_HOME/honcho.json      (profile-local, if it exists)
+      2. ~/.hermes/honcho.json          (default profile — shared host blocks live here)
+      3. ~/.honcho/config.json          (global, cross-app interop)
+
+    Returns the global path if none exist (for first-time setup writes).
+    """
+    local_path = get_hermes_home() / "honcho.json"
+    if local_path.exists():
+        return local_path
+
+    # Default profile's config — host blocks accumulate here via setup/clone
+    default_path = Path.home() / ".hermes" / "honcho.json"
+    if default_path != local_path and default_path.exists():
+        return default_path
+
+    return GLOBAL_CONFIG_PATH
+
+
+_RECALL_MODE_ALIASES = {"auto": "hybrid"}
+_VALID_RECALL_MODES = {"hybrid", "context", "tools"}
+
+
+def _normalize_recall_mode(val: str) -> str:
+    """Normalize legacy recall mode values (e.g. 'auto' → 'hybrid')."""
+    val = _RECALL_MODE_ALIASES.get(val, val)
+    return val if val in _VALID_RECALL_MODES else "hybrid"
+
+
+def _resolve_memory_mode(
+    global_val: str | dict,
+    host_val: str | dict | None,
+) -> dict:
+    """Parse memoryMode (string or object) into memory_mode + peer_memory_modes.
+
+    Resolution order: host-level wins over global.
+    String form:  applies as the default for all peers.
+    Object form:  { "default": "hybrid", "hermes": "honcho", ... }
+                  "default" key sets the fallback; other keys are per-peer overrides.
+    """
+    # Pick the winning value (host beats global)
+    val = host_val if host_val is not None else global_val
+
+    if isinstance(val, dict):
+        default = val.get("default", "hybrid")
+        overrides = {k: v for k, v in val.items() if k != "default"}
+    else:
+        default = str(val) if val else "hybrid"
+        overrides = {}
+
+    return {"memory_mode": default, "peer_memory_modes": overrides}
+
+
+@dataclass
+class HonchoClientConfig:
+    """Configuration for Honcho client, resolved for a specific host."""
+
+    host: str = HOST
+    workspace_id: str = "hermes"
+    api_key: str | None = None
+    environment: str = "production"
+    # Optional base URL for self-hosted Honcho (overrides environment mapping)
+    base_url: str | None = None
+    # Identity
+    peer_name: str | None = None
+    ai_peer: str = "hermes"
+    linked_hosts: list[str] = field(default_factory=list)
+    # Toggles
+    enabled: bool = False
+    save_messages: bool = True
+    # memoryMode: default for all peers. "hybrid" / "honcho"
+    memory_mode: str = "hybrid"
+    # Per-peer overrides — any named Honcho peer. Override memory_mode when set.
+    # Config object form: "memoryMode": { "default": "hybrid", "hermes": "honcho" }
+    peer_memory_modes: dict[str, str] = field(default_factory=dict)
+
+    def peer_memory_mode(self, peer_name: str) -> str:
+        """Return the effective memory mode for a named peer.
+
+        Resolution: per-peer override → global memory_mode default.
+        """
+        return self.peer_memory_modes.get(peer_name, self.memory_mode)
+    # Write frequency: "async" (background thread), "turn" (sync per turn),
+    # "session" (flush on session end), or int (every N turns)
+    write_frequency: str | int = "async"
+    # Prefetch budget
+    context_tokens: int | None = None
+    # Dialectic (peer.chat) settings
+    # reasoning_level: "minimal" | "low" | "medium" | "high" | "max"
+    # Used as the default; prefetch_dialectic may bump it dynamically.
+    dialectic_reasoning_level: str = "low"
+    # Max chars of dialectic result to inject into Hermes system prompt
+    dialectic_max_chars: int = 600
+    # Recall mode: how memory retrieval works when Honcho is active.
+    # "hybrid"  — auto-injected context + Honcho tools available (model decides)
+    # "context" — auto-injected context only, Honcho tools removed
+    # "tools"   — Honcho tools only, no auto-injected context
+    recall_mode: str = "hybrid"
+    # Session resolution
+    session_strategy: str = "per-directory"
+    session_peer_prefix: bool = False
+    sessions: dict[str, str] = field(default_factory=dict)
+    # Raw global config for anything else consumers need
+    raw: dict[str, Any] = field(default_factory=dict)
+    # True when Honcho was explicitly configured for this host (hosts.hermes
+    # block exists or enabled was set explicitly), vs auto-enabled from a
+    # stray HONCHO_API_KEY env var.
+    explicitly_configured: bool = False
+
+    @classmethod
+    def from_env(
+        cls,
+        workspace_id: str = "hermes",
+        host: str | None = None,
+    ) -> HonchoClientConfig:
+        """Create config from environment variables (fallback)."""
+        resolved_host = host or resolve_active_host()
+        api_key = os.environ.get("HONCHO_API_KEY")
+        base_url = os.environ.get("HONCHO_BASE_URL", "").strip() or None
+        return cls(
+            host=resolved_host,
+            workspace_id=workspace_id,
+            api_key=api_key,
+            environment=os.environ.get("HONCHO_ENVIRONMENT", "production"),
+            base_url=base_url,
+            ai_peer=resolved_host,
+            enabled=bool(api_key or base_url),
+        )
+
+    @classmethod
+    def from_global_config(
+        cls,
+        host: str | None = None,
+        config_path: Path | None = None,
+    ) -> HonchoClientConfig:
+        """Create config from the resolved Honcho config path.
+
+        Resolution: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars.
+        When host is None, derives it from the active Hermes profile.
+        """
+        resolved_host = host or resolve_active_host()
+        path = config_path or resolve_config_path()
+        if not path.exists():
+            logger.debug("No global Honcho config at %s, falling back to env", path)
+            return cls.from_env(host=resolved_host)
+
+        try:
+            raw = json.loads(path.read_text(encoding="utf-8"))
+        except (json.JSONDecodeError, OSError) as e:
+            logger.warning("Failed to read %s: %s, falling back to env", path, e)
+            return cls.from_env(host=resolved_host)
+
+        host_block = (raw.get("hosts") or {}).get(resolved_host, {})
+        # A hosts.hermes block or explicit enabled flag means the user
+        # intentionally configured Honcho for this host.
+        _explicitly_configured = bool(host_block) or raw.get("enabled") is True
+
+        # Explicit host block fields win, then flat/global, then defaults
+        workspace = (
+            host_block.get("workspace")
+            or raw.get("workspace")
+            or resolved_host
+        )
+        ai_peer = (
+            host_block.get("aiPeer")
+            or raw.get("aiPeer")
+            or resolved_host
+        )
+        linked_hosts = host_block.get("linkedHosts", [])
+
+        api_key = (
+            host_block.get("apiKey")
+            or raw.get("apiKey")
+            or os.environ.get("HONCHO_API_KEY")
+        )
+
+        environment = (
+            host_block.get("environment")
+            or raw.get("environment", "production")
+        )
+
+        base_url = (
+            raw.get("baseUrl")
+            or os.environ.get("HONCHO_BASE_URL", "").strip()
+            or None
+        )
+
+        # Auto-enable when API key or base_url is present (unless explicitly disabled)
+        # Host-level enabled wins, then root-level, then auto-enable if key/url exists.
+        host_enabled = host_block.get("enabled")
+        root_enabled = raw.get("enabled")
+        if host_enabled is not None:
+            enabled = host_enabled
+        elif root_enabled is not None:
+            enabled = root_enabled
+        else:
+            # Not explicitly set anywhere -> auto-enable if API key or base_url exists
+            enabled = bool(api_key or base_url)
+
+        # write_frequency: accept int or string
+        raw_wf = (
+            host_block.get("writeFrequency")
+            or raw.get("writeFrequency")
+            or "async"
+        )
+        try:
+            write_frequency: str | int = int(raw_wf)
+        except (TypeError, ValueError):
+            write_frequency = str(raw_wf)
+
+        # saveMessages: host wins (None-aware since False is valid)
+        host_save = host_block.get("saveMessages")
+        save_messages = host_save if host_save is not None else raw.get("saveMessages", True)
+
+        # sessionStrategy / sessionPeerPrefix: host first, root fallback
+        session_strategy = (
+            host_block.get("sessionStrategy")
+            or raw.get("sessionStrategy", "per-directory")
+        )
+        host_prefix = host_block.get("sessionPeerPrefix")
+        session_peer_prefix = (
+            host_prefix if host_prefix is not None
+            else raw.get("sessionPeerPrefix", False)
+        )
+
+        return cls(
+            host=resolved_host,
+            workspace_id=workspace,
+            api_key=api_key,
+            environment=environment,
+            base_url=base_url,
+            peer_name=host_block.get("peerName") or raw.get("peerName"),
+            ai_peer=ai_peer,
+            linked_hosts=linked_hosts,
+            enabled=enabled,
+            save_messages=save_messages,
+            **_resolve_memory_mode(
+                raw.get("memoryMode", "hybrid"),
+                host_block.get("memoryMode"),
+            ),
+            write_frequency=write_frequency,
+            context_tokens=host_block.get("contextTokens") or raw.get("contextTokens"),
+            dialectic_reasoning_level=(
+                host_block.get("dialecticReasoningLevel")
+                or raw.get("dialecticReasoningLevel")
+                or "low"
+            ),
+            dialectic_max_chars=int(
+                host_block.get("dialecticMaxChars")
+                or raw.get("dialecticMaxChars")
+                or 600
+            ),
+            recall_mode=_normalize_recall_mode(
+                host_block.get("recallMode")
+                or raw.get("recallMode")
+                or "hybrid"
+            ),
+            session_strategy=session_strategy,
+            session_peer_prefix=session_peer_prefix,
+            sessions=raw.get("sessions", {}),
+            raw=raw,
+            explicitly_configured=_explicitly_configured,
+        )
+
+    @staticmethod
+    def _git_repo_name(cwd: str) -> str | None:
+        """Return the git repo root directory name, or None if not in a repo."""
+        import subprocess
+
+        try:
+            root = subprocess.run(
+                ["git", "rev-parse", "--show-toplevel"],
+                capture_output=True, text=True, cwd=cwd, timeout=5,
+            )
+            if root.returncode == 0:
+                return Path(root.stdout.strip()).name
+        except (OSError, subprocess.TimeoutExpired):
+            pass
+        return None
+
+    def resolve_session_name(
+        self,
+        cwd: str | None = None,
+        session_title: str | None = None,
+        session_id: str | None = None,
+    ) -> str | None:
+        """Resolve Honcho session name.
+
+        Resolution order:
+          1. Manual directory override from sessions map
+          2. Hermes session title (from /title command)
+          3. per-session strategy — Hermes session_id ({timestamp}_{hex})
+          4. per-repo strategy — git repo root directory name
+          5. per-directory strategy — directory basename
+          6. global strategy — workspace name
+        """
+        import re
+
+        if not cwd:
+            cwd = os.getcwd()
+
+        # Manual override always wins
+        manual = self.sessions.get(cwd)
+        if manual:
+            return manual
+
+        # /title mid-session remap
+        if session_title:
+            sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_title).strip('-')
+            if sanitized:
+                if self.session_peer_prefix and self.peer_name:
+                    return f"{self.peer_name}-{sanitized}"
+                return sanitized
+
+        # per-session: inherit Hermes session_id (new Honcho session each run)
+        if self.session_strategy == "per-session" and session_id:
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{session_id}"
+            return session_id
+
+        # per-repo: one Honcho session per git repository
+        if self.session_strategy == "per-repo":
+            base = self._git_repo_name(cwd) or Path(cwd).name
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{base}"
+            return base
+
+        # per-directory: one Honcho session per working directory (default)
+        if self.session_strategy in ("per-directory", "per-session"):
+            base = Path(cwd).name
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{base}"
+            return base
+
+        # global: single session across all directories
+        return self.workspace_id
+
+    def get_linked_workspaces(self) -> list[str]:
+        """Resolve linked host keys to workspace names."""
+        hosts = self.raw.get("hosts", {})
+        workspaces = []
+        for host_key in self.linked_hosts:
+            block = hosts.get(host_key, {})
+            ws = block.get("workspace") or host_key
+            if ws != self.workspace_id:
+                workspaces.append(ws)
+        return workspaces
+
+
+_honcho_client: Honcho | None = None
+
+
+def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
+    """Get or create the Honcho client singleton.
+
+    When no config is provided, attempts to load ~/.honcho/config.json
+    first, falling back to environment variables.
+    """
+    global _honcho_client
+
+    if _honcho_client is not None:
+        return _honcho_client
+
+    if config is None:
+        config = HonchoClientConfig.from_global_config()
+
+    if not config.api_key and not config.base_url:
+        raise ValueError(
+            "Honcho API key not found. "
+            "Get your API key at https://app.honcho.dev, "
+            "then run 'hermes honcho setup' or set HONCHO_API_KEY. "
+            "For local instances, set HONCHO_BASE_URL instead."
+        )
+
+    try:
+        from honcho import Honcho
+    except ImportError:
+        raise ImportError(
+            "honcho-ai is required for Honcho integration. "
+            "Install it with: pip install honcho-ai"
+        )
+
+    # Allow config.yaml honcho.base_url to override the SDK's environment
+    # mapping, enabling remote self-hosted Honcho deployments without
+    # requiring the server to live on localhost.
+    resolved_base_url = config.base_url
+    if not resolved_base_url:
+        try:
+            from hermes_cli.config import load_config
+            hermes_cfg = load_config()
+            honcho_cfg = hermes_cfg.get("honcho", {})
+            if isinstance(honcho_cfg, dict):
+                resolved_base_url = honcho_cfg.get("base_url", "").strip() or None
+        except Exception:
+            pass
+
+    if resolved_base_url:
+        logger.info("Initializing Honcho client (base_url: %s, workspace: %s)", resolved_base_url, config.workspace_id)
+    else:
+        logger.info("Initializing Honcho client (host: %s, workspace: %s)", config.host, config.workspace_id)
+
+    # Local Honcho instances don't require an API key, but the SDK
+    # expects a non-empty string.  Use a placeholder for local URLs.
+    _is_local = resolved_base_url and (
+        "localhost" in resolved_base_url
+        or "127.0.0.1" in resolved_base_url
+        or "::1" in resolved_base_url
+    )
+    effective_api_key = config.api_key or ("local" if _is_local else None)
+
+    kwargs: dict = {
+        "workspace_id": config.workspace_id,
+        "api_key": effective_api_key,
+        "environment": config.environment,
+    }
+    if resolved_base_url:
+        kwargs["base_url"] = resolved_base_url
+
+    _honcho_client = Honcho(**kwargs)
+
+    return _honcho_client
+
+
+def reset_honcho_client() -> None:
+    """Reset the Honcho client singleton (useful for testing)."""
+    global _honcho_client
+    _honcho_client = None
--- a/plugins/memory/honcho/plugin.yaml
+++ b/plugins/memory/honcho/plugin.yaml
@ -0,0 +1,7 @@
+name: honcho
+version: 1.0.0
+description: "Honcho AI-native memory — cross-session user modeling with dialectic Q&A, semantic search, and persistent conclusions."
+pip_dependencies:
+  - honcho-ai
+hooks:
+  - on_session_end
--- a/plugins/memory/honcho/session.py
+++ b/plugins/memory/honcho/session.py
@ -0,0 +1,997 @@
+"""Honcho-based session management for conversation history."""
+
+from __future__ import annotations
+
+import queue
+import re
+import logging
+import threading
+from dataclasses import dataclass, field
+from datetime import datetime
+from typing import Any, TYPE_CHECKING
+
+from plugins.memory.honcho.client import get_honcho_client
+
+if TYPE_CHECKING:
+    from honcho import Honcho
+
+logger = logging.getLogger(__name__)
+
+# Sentinel to signal the async writer thread to shut down
+_ASYNC_SHUTDOWN = object()
+
+
+@dataclass
+class HonchoSession:
+    """
+    A conversation session backed by Honcho.
+
+    Provides a local message cache that syncs to Honcho's
+    AI-native memory system for user modeling.
+    """
+
+    key: str  # channel:chat_id
+    user_peer_id: str  # Honcho peer ID for the user
+    assistant_peer_id: str  # Honcho peer ID for the assistant
+    honcho_session_id: str  # Honcho session ID
+    messages: list[dict[str, Any]] = field(default_factory=list)
+    created_at: datetime = field(default_factory=datetime.now)
+    updated_at: datetime = field(default_factory=datetime.now)
+    metadata: dict[str, Any] = field(default_factory=dict)
+
+    def add_message(self, role: str, content: str, **kwargs: Any) -> None:
+        """Add a message to the local cache."""
+        msg = {
+            "role": role,
+            "content": content,
+            "timestamp": datetime.now().isoformat(),
+            **kwargs,
+        }
+        self.messages.append(msg)
+        self.updated_at = datetime.now()
+
+    def get_history(self, max_messages: int = 50) -> list[dict[str, Any]]:
+        """Get message history for LLM context."""
+        recent = (
+            self.messages[-max_messages:]
+            if len(self.messages) > max_messages
+            else self.messages
+        )
+        return [{"role": m["role"], "content": m["content"]} for m in recent]
+
+    def clear(self) -> None:
+        """Clear all messages in the session."""
+        self.messages = []
+        self.updated_at = datetime.now()
+
+
+class HonchoSessionManager:
+    """
+    Manages conversation sessions using Honcho.
+
+    Runs alongside hermes' existing SQLite state and file-based memory,
+    adding persistent cross-session user modeling via Honcho's AI-native memory.
+    """
+
+    def __init__(
+        self,
+        honcho: Honcho | None = None,
+        context_tokens: int | None = None,
+        config: Any | None = None,
+    ):
+        """
+        Initialize the session manager.
+
+        Args:
+            honcho: Optional Honcho client. If not provided, uses the singleton.
+            context_tokens: Max tokens for context() calls (None = Honcho default).
+            config: HonchoClientConfig from global config (provides peer_name, ai_peer,
+                    write_frequency, memory_mode, etc.).
+        """
+        self._honcho = honcho
+        self._context_tokens = context_tokens
+        self._config = config
+        self._cache: dict[str, HonchoSession] = {}
+        self._peers_cache: dict[str, Any] = {}
+        self._sessions_cache: dict[str, Any] = {}
+
+        # Write frequency state
+        write_frequency = (config.write_frequency if config else "async")
+        self._write_frequency = write_frequency
+        self._turn_counter: int = 0
+
+        # Prefetch caches: session_key → last result (consumed once per turn)
+        self._context_cache: dict[str, dict] = {}
+        self._dialectic_cache: dict[str, str] = {}
+        self._prefetch_cache_lock = threading.Lock()
+        self._dialectic_reasoning_level: str = (
+            config.dialectic_reasoning_level if config else "low"
+        )
+        self._dialectic_max_chars: int = (
+            config.dialectic_max_chars if config else 600
+        )
+
+        # Async write queue — started lazily on first enqueue
+        self._async_queue: queue.Queue | None = None
+        self._async_thread: threading.Thread | None = None
+        if write_frequency == "async":
+            self._async_queue = queue.Queue()
+            self._async_thread = threading.Thread(
+                target=self._async_writer_loop,
+                name="honcho-async-writer",
+                daemon=True,
+            )
+            self._async_thread.start()
+
+    @property
+    def honcho(self) -> Honcho:
+        """Get the Honcho client, initializing if needed."""
+        if self._honcho is None:
+            self._honcho = get_honcho_client()
+        return self._honcho
+
+    def _get_or_create_peer(self, peer_id: str) -> Any:
+        """
+        Get or create a Honcho peer.
+
+        Peers are lazy -- no API call until first use.
+        Observation settings are controlled per-session via SessionPeerConfig.
+        """
+        if peer_id in self._peers_cache:
+            return self._peers_cache[peer_id]
+
+        peer = self.honcho.peer(peer_id)
+        self._peers_cache[peer_id] = peer
+        return peer
+
+    def _get_or_create_honcho_session(
+        self, session_id: str, user_peer: Any, assistant_peer: Any
+    ) -> tuple[Any, list]:
+        """
+        Get or create a Honcho session with peers configured.
+
+        Returns:
+            Tuple of (honcho_session, existing_messages).
+        """
+        if session_id in self._sessions_cache:
+            logger.debug("Honcho session '%s' retrieved from cache", session_id)
+            return self._sessions_cache[session_id], []
+
+        session = self.honcho.session(session_id)
+
+        # Configure peer observation settings.
+        # observe_me=True for AI peer so Honcho watches what the agent says
+        # and builds its representation over time — enabling identity formation.
+        try:
+            from honcho.session import SessionPeerConfig
+            user_config = SessionPeerConfig(observe_me=True, observe_others=True)
+            ai_config = SessionPeerConfig(observe_me=True, observe_others=True)
+
+            session.add_peers([(user_peer, user_config), (assistant_peer, ai_config)])
+        except Exception as e:
+            logger.warning(
+                "Honcho session '%s' add_peers failed (non-fatal): %s",
+                session_id, e,
+            )
+
+        # Load existing messages via context() - single call for messages + metadata
+        existing_messages = []
+        try:
+            ctx = session.context(summary=True, tokens=self._context_tokens)
+            existing_messages = ctx.messages or []
+
+            # Verify chronological ordering
+            if existing_messages and len(existing_messages) > 1:
+                timestamps = [m.created_at for m in existing_messages if m.created_at]
+                if timestamps and timestamps != sorted(timestamps):
+                    logger.warning(
+                        "Honcho messages not chronologically ordered for session '%s', sorting",
+                        session_id,
+                    )
+                    existing_messages = sorted(
+                        existing_messages,
+                        key=lambda m: m.created_at or datetime.min,
+                    )
+
+            if existing_messages:
+                logger.info(
+                    "Honcho session '%s' retrieved (%d existing messages)",
+                    session_id, len(existing_messages),
+                )
+            else:
+                logger.info("Honcho session '%s' created (new)", session_id)
+        except Exception as e:
+            logger.warning(
+                "Honcho session '%s' loaded (failed to fetch context: %s)",
+                session_id, e,
+            )
+
+        self._sessions_cache[session_id] = session
+        return session, existing_messages
+
+    def _sanitize_id(self, id_str: str) -> str:
+        """Sanitize an ID to match Honcho's pattern: ^[a-zA-Z0-9_-]+"""
+        return re.sub(r'[^a-zA-Z0-9_-]', '-', id_str)
+
+    def get_or_create(self, key: str) -> HonchoSession:
+        """
+        Get an existing session or create a new one.
+
+        Args:
+            key: Session key (usually channel:chat_id).
+
+        Returns:
+            The session.
+        """
+        if key in self._cache:
+            logger.debug("Local session cache hit: %s", key)
+            return self._cache[key]
+
+        # Use peer names from global config when available
+        if self._config and self._config.peer_name:
+            user_peer_id = self._sanitize_id(self._config.peer_name)
+        else:
+            # Fallback: derive from session key
+            parts = key.split(":", 1)
+            channel = parts[0] if len(parts) > 1 else "default"
+            chat_id = parts[1] if len(parts) > 1 else key
+            user_peer_id = self._sanitize_id(f"user-{channel}-{chat_id}")
+
+        assistant_peer_id = self._sanitize_id(
+            self._config.ai_peer if self._config else "hermes-assistant"
+        )
+
+        # Sanitize session ID for Honcho
+        honcho_session_id = self._sanitize_id(key)
+
+        # Get or create peers
+        user_peer = self._get_or_create_peer(user_peer_id)
+        assistant_peer = self._get_or_create_peer(assistant_peer_id)
+
+        # Get or create Honcho session
+        honcho_session, existing_messages = self._get_or_create_honcho_session(
+            honcho_session_id, user_peer, assistant_peer
+        )
+
+        # Convert Honcho messages to local format
+        local_messages = []
+        for msg in existing_messages:
+            role = "assistant" if msg.peer_id == assistant_peer_id else "user"
+            local_messages.append({
+                "role": role,
+                "content": msg.content,
+                "timestamp": msg.created_at.isoformat() if msg.created_at else "",
+                "_synced": True,  # Already in Honcho
+            })
+
+        # Create local session wrapper with existing messages
+        session = HonchoSession(
+            key=key,
+            user_peer_id=user_peer_id,
+            assistant_peer_id=assistant_peer_id,
+            honcho_session_id=honcho_session_id,
+            messages=local_messages,
+        )
+
+        self._cache[key] = session
+        return session
+
+    def _flush_session(self, session: HonchoSession) -> bool:
+        """Internal: write unsynced messages to Honcho synchronously."""
+        if not session.messages:
+            return True
+
+        user_peer = self._get_or_create_peer(session.user_peer_id)
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+
+        if not honcho_session:
+            honcho_session, _ = self._get_or_create_honcho_session(
+                session.honcho_session_id, user_peer, assistant_peer
+            )
+
+        new_messages = [m for m in session.messages if not m.get("_synced")]
+        if not new_messages:
+            return True
+
+        honcho_messages = []
+        for msg in new_messages:
+            peer = user_peer if msg["role"] == "user" else assistant_peer
+            honcho_messages.append(peer.message(msg["content"]))
+
+        try:
+            honcho_session.add_messages(honcho_messages)
+            for msg in new_messages:
+                msg["_synced"] = True
+            logger.debug("Synced %d messages to Honcho for %s", len(honcho_messages), session.key)
+            self._cache[session.key] = session
+            return True
+        except Exception as e:
+            for msg in new_messages:
+                msg["_synced"] = False
+            logger.error("Failed to sync messages to Honcho: %s", e)
+            self._cache[session.key] = session
+            return False
+
+    def _async_writer_loop(self) -> None:
+        """Background daemon thread: drains the async write queue."""
+        while True:
+            try:
+                item = self._async_queue.get(timeout=5)
+                if item is _ASYNC_SHUTDOWN:
+                    break
+
+                first_error: Exception | None = None
+                try:
+                    success = self._flush_session(item)
+                except Exception as e:
+                    success = False
+                    first_error = e
+
+                if success:
+                    continue
+
+                if first_error is not None:
+                    logger.warning("Honcho async write failed, retrying once: %s", first_error)
+                else:
+                    logger.warning("Honcho async write failed, retrying once")
+
+                import time as _time
+                _time.sleep(2)
+
+                try:
+                    retry_success = self._flush_session(item)
+                except Exception as e2:
+                    logger.error("Honcho async write retry failed, dropping batch: %s", e2)
+                    continue
+
+                if not retry_success:
+                    logger.error("Honcho async write retry failed, dropping batch")
+            except queue.Empty:
+                continue
+            except Exception as e:
+                logger.error("Honcho async writer error: %s", e)
+
+    def save(self, session: HonchoSession) -> None:
+        """Save messages to Honcho, respecting write_frequency.
+
+        write_frequency modes:
+          "async"   — enqueue for background thread (zero blocking, zero token cost)
+          "turn"    — flush synchronously every turn
+          "session" — defer until flush_session() is called explicitly
+          N (int)   — flush every N turns
+        """
+        self._turn_counter += 1
+        wf = self._write_frequency
+
+        if wf == "async":
+            if self._async_queue is not None:
+                self._async_queue.put(session)
+        elif wf == "turn":
+            self._flush_session(session)
+        elif wf == "session":
+            # Accumulate; caller must call flush_all() at session end
+            pass
+        elif isinstance(wf, int) and wf > 0:
+            if self._turn_counter % wf == 0:
+                self._flush_session(session)
+
+    def flush_all(self) -> None:
+        """Flush all pending unsynced messages for all cached sessions.
+
+        Called at session end for "session" write_frequency, or to force
+        a sync before process exit regardless of mode.
+        """
+        for session in list(self._cache.values()):
+            try:
+                self._flush_session(session)
+            except Exception as e:
+                logger.error("Honcho flush_all error for %s: %s", session.key, e)
+
+        # Drain async queue synchronously if it exists
+        if self._async_queue is not None:
+            while not self._async_queue.empty():
+                try:
+                    item = self._async_queue.get_nowait()
+                    if item is not _ASYNC_SHUTDOWN:
+                        self._flush_session(item)
+                except queue.Empty:
+                    break
+
+    def shutdown(self) -> None:
+        """Gracefully shut down the async writer thread."""
+        if self._async_queue is not None and self._async_thread is not None:
+            self.flush_all()
+            self._async_queue.put(_ASYNC_SHUTDOWN)
+            self._async_thread.join(timeout=10)
+
+    def delete(self, key: str) -> bool:
+        """Delete a session from local cache."""
+        if key in self._cache:
+            del self._cache[key]
+            return True
+        return False
+
+    def new_session(self, key: str) -> HonchoSession:
+        """
+        Create a new session, preserving the old one for user modeling.
+
+        Creates a fresh session with a new ID while keeping the old
+        session's data in Honcho for continued user modeling.
+        """
+        import time
+
+        # Remove old session from caches (but don't delete from Honcho)
+        old_session = self._cache.pop(key, None)
+        if old_session:
+            self._sessions_cache.pop(old_session.honcho_session_id, None)
+
+        # Create new session with timestamp suffix
+        timestamp = int(time.time())
+        new_key = f"{key}:{timestamp}"
+
+        # get_or_create will create a fresh session
+        session = self.get_or_create(new_key)
+
+        # Cache under the original key so callers find it by the expected name
+        self._cache[key] = session
+
+        logger.info("Created new session for %s (honcho: %s)", key, session.honcho_session_id)
+        return session
+
+    _REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
+
+    def _dynamic_reasoning_level(self, query: str) -> str:
+        """
+        Pick a reasoning level based on message complexity.
+
+        Uses the configured default as a floor; bumps up for longer or
+        more complex messages so Honcho applies more inference where it matters.
+
+          < 120 chars  → default (typically "low")
+          120–400 chars → one level above default (cap at "high")
+          > 400 chars  → two levels above default (cap at "high")
+
+        "max" is never selected automatically — reserve it for explicit config.
+        """
+        levels = self._REASONING_LEVELS
+        default_idx = levels.index(self._dialectic_reasoning_level) if self._dialectic_reasoning_level in levels else 1
+        n = len(query)
+        if n < 120:
+            bump = 0
+        elif n < 400:
+            bump = 1
+        else:
+            bump = 2
+        # Cap at "high" (index 3) for auto-selection
+        idx = min(default_idx + bump, 3)
+        return levels[idx]
+
+    def dialectic_query(
+        self, session_key: str, query: str,
+        reasoning_level: str | None = None,
+        peer: str = "user",
+    ) -> str:
+        """
+        Query Honcho's dialectic endpoint about a peer.
+
+        Runs an LLM on Honcho's backend against the target peer's full
+        representation. Higher latency than context() — call async via
+        prefetch_dialectic() to avoid blocking the response.
+
+        Args:
+            session_key: The session key to query against.
+            query: Natural language question.
+            reasoning_level: Override the config default. If None, uses
+                             _dynamic_reasoning_level(query).
+            peer: Which peer to query — "user" (default) or "ai".
+
+        Returns:
+            Honcho's synthesized answer, or empty string on failure.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return ""
+
+        peer_id = session.assistant_peer_id if peer == "ai" else session.user_peer_id
+        target_peer = self._get_or_create_peer(peer_id)
+        level = reasoning_level or self._dynamic_reasoning_level(query)
+
+        try:
+            result = target_peer.chat(query, reasoning_level=level) or ""
+            # Apply Hermes-side char cap before caching
+            if result and self._dialectic_max_chars and len(result) > self._dialectic_max_chars:
+                result = result[:self._dialectic_max_chars].rsplit(" ", 1)[0] + " …"
+            return result
+        except Exception as e:
+            logger.warning("Honcho dialectic query failed: %s", e)
+            return ""
+
+    def prefetch_dialectic(self, session_key: str, query: str) -> None:
+        """
+        Fire a dialectic_query in a background thread, caching the result.
+
+        Non-blocking. The result is available via pop_dialectic_result()
+        on the next call (typically the following turn). Reasoning level
+        is selected dynamically based on query complexity.
+
+        Args:
+            session_key: The session key to query against.
+            query: The user's current message, used as the query.
+        """
+        def _run():
+            result = self.dialectic_query(session_key, query)
+            if result:
+                self.set_dialectic_result(session_key, result)
+
+        t = threading.Thread(target=_run, name="honcho-dialectic-prefetch", daemon=True)
+        t.start()
+
+    def set_dialectic_result(self, session_key: str, result: str) -> None:
+        """Store a prefetched dialectic result in a thread-safe way."""
+        if not result:
+            return
+        with self._prefetch_cache_lock:
+            self._dialectic_cache[session_key] = result
+
+    def pop_dialectic_result(self, session_key: str) -> str:
+        """
+        Return and clear the cached dialectic result for this session.
+
+        Returns empty string if no result is ready yet.
+        """
+        with self._prefetch_cache_lock:
+            return self._dialectic_cache.pop(session_key, "")
+
+    def prefetch_context(self, session_key: str, user_message: str | None = None) -> None:
+        """
+        Fire get_prefetch_context in a background thread, caching the result.
+
+        Non-blocking. Consumed next turn via pop_context_result(). This avoids
+        a synchronous HTTP round-trip blocking every response.
+        """
+        def _run():
+            result = self.get_prefetch_context(session_key, user_message)
+            if result:
+                self.set_context_result(session_key, result)
+
+        t = threading.Thread(target=_run, name="honcho-context-prefetch", daemon=True)
+        t.start()
+
+    def set_context_result(self, session_key: str, result: dict[str, str]) -> None:
+        """Store a prefetched context result in a thread-safe way."""
+        if not result:
+            return
+        with self._prefetch_cache_lock:
+            self._context_cache[session_key] = result
+
+    def pop_context_result(self, session_key: str) -> dict[str, str]:
+        """
+        Return and clear the cached context result for this session.
+
+        Returns empty dict if no result is ready yet (first turn).
+        """
+        with self._prefetch_cache_lock:
+            return self._context_cache.pop(session_key, {})
+
+    def get_prefetch_context(self, session_key: str, user_message: str | None = None) -> dict[str, str]:
+        """
+        Pre-fetch user and AI peer context from Honcho.
+
+        Fetches peer_representation and peer_card for both peers. search_query
+        is intentionally omitted — it would only affect additional excerpts
+        that this code does not consume, and passing the raw message exposes
+        conversation content in server access logs.
+
+        Args:
+            session_key: The session key to get context for.
+            user_message: Unused; kept for call-site compatibility.
+
+        Returns:
+            Dictionary with 'representation', 'card', 'ai_representation',
+            and 'ai_card' keys.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return {}
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return {}
+
+        result: dict[str, str] = {}
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=self._context_tokens,
+                peer_target=session.user_peer_id,
+                peer_perspective=session.assistant_peer_id,
+            )
+            card = ctx.peer_card or []
+            result["representation"] = ctx.peer_representation or ""
+            result["card"] = "\n".join(card) if isinstance(card, list) else str(card)
+        except Exception as e:
+            logger.warning("Failed to fetch user context from Honcho: %s", e)
+
+        # Also fetch AI peer's own representation so Hermes knows itself.
+        try:
+            ai_ctx = honcho_session.context(
+                summary=False,
+                tokens=self._context_tokens,
+                peer_target=session.assistant_peer_id,
+                peer_perspective=session.user_peer_id,
+            )
+            ai_card = ai_ctx.peer_card or []
+            result["ai_representation"] = ai_ctx.peer_representation or ""
+            result["ai_card"] = "\n".join(ai_card) if isinstance(ai_card, list) else str(ai_card)
+        except Exception as e:
+            logger.debug("Failed to fetch AI peer context from Honcho: %s", e)
+
+        return result
+
+    def migrate_local_history(self, session_key: str, messages: list[dict[str, Any]]) -> bool:
+        """
+        Upload local session history to Honcho as a file.
+
+        Used when Honcho activates mid-conversation to preserve prior context.
+
+        Args:
+            session_key: The session key (e.g., "telegram:123456").
+            messages: Local messages (dicts with role, content, timestamp).
+
+        Returns:
+            True if upload succeeded, False otherwise.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No local session cached for '%s', skipping migration", session_key)
+            return False
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            logger.warning("No Honcho session cached for '%s', skipping migration", session_key)
+            return False
+
+        user_peer = self._get_or_create_peer(session.user_peer_id)
+
+        content_bytes = self._format_migration_transcript(session_key, messages)
+        first_ts = messages[0].get("timestamp") if messages else None
+
+        try:
+            honcho_session.upload_file(
+                file=("prior_history.txt", content_bytes, "text/plain"),
+                peer=user_peer,
+                metadata={"source": "local_jsonl", "count": len(messages)},
+                created_at=first_ts,
+            )
+            logger.info("Migrated %d local messages to Honcho for %s", len(messages), session_key)
+            return True
+        except Exception as e:
+            logger.error("Failed to upload local history to Honcho for %s: %s", session_key, e)
+            return False
+
+    @staticmethod
+    def _format_migration_transcript(session_key: str, messages: list[dict[str, Any]]) -> bytes:
+        """Format local messages as an XML transcript for Honcho file upload."""
+        timestamps = [m.get("timestamp", "") for m in messages]
+        time_range = f"{timestamps[0]} to {timestamps[-1]}" if timestamps else "unknown"
+
+        lines = [
+            "<prior_conversation_history>",
+            "<context>",
+            "This conversation history occurred BEFORE the Honcho memory system was activated.",
+            "These messages are the preceding elements of this conversation session and should",
+            "be treated as foundational context for all subsequent interactions. The user and",
+            "assistant have already established rapport through these exchanges.",
+            "</context>",
+            "",
+            f'<transcript session_key="{session_key}" message_count="{len(messages)}"',
+            f'           time_range="{time_range}">',
+            "",
+        ]
+        for msg in messages:
+            ts = msg.get("timestamp", "?")
+            role = msg.get("role", "unknown")
+            content = msg.get("content") or ""
+            lines.append(f"[{ts}] {role}: {content}")
+
+        lines.append("")
+        lines.append("</transcript>")
+        lines.append("</prior_conversation_history>")
+
+        return "\n".join(lines).encode("utf-8")
+
+    def migrate_memory_files(self, session_key: str, memory_dir: str) -> bool:
+        """
+        Upload MEMORY.md and USER.md to Honcho as files.
+
+        Used when Honcho activates on an instance that already has locally
+        consolidated memory. Backwards compatible -- skips if files don't exist.
+
+        Args:
+            session_key: The session key to associate files with.
+            memory_dir: Path to the memories directory (~/.hermes/memories/).
+
+        Returns:
+            True if at least one file was uploaded, False otherwise.
+        """
+        from pathlib import Path
+        memory_path = Path(memory_dir)
+
+        if not memory_path.exists():
+            return False
+
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No local session cached for '%s', skipping memory migration", session_key)
+            return False
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            logger.warning("No Honcho session cached for '%s', skipping memory migration", session_key)
+            return False
+
+        user_peer = self._get_or_create_peer(session.user_peer_id)
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+
+        uploaded = False
+        files = [
+            (
+                "MEMORY.md",
+                "consolidated_memory.md",
+                "Long-term agent notes and preferences",
+                user_peer,
+                "user",
+            ),
+            (
+                "USER.md",
+                "user_profile.md",
+                "User profile and preferences",
+                user_peer,
+                "user",
+            ),
+            (
+                "SOUL.md",
+                "agent_soul.md",
+                "Agent persona and identity configuration",
+                assistant_peer,
+                "ai",
+            ),
+        ]
+
+        for filename, upload_name, description, target_peer, target_kind in files:
+            filepath = memory_path / filename
+            if not filepath.exists():
+                continue
+            content = filepath.read_text(encoding="utf-8").strip()
+            if not content:
+                continue
+
+            wrapped = (
+                f"<prior_memory_file>\n"
+                f"<context>\n"
+                f"This file was consolidated from local conversations BEFORE Honcho was activated.\n"
+                f"{description}. Treat as foundational context for this user.\n"
+                f"</context>\n"
+                f"\n"
+                f"{content}\n"
+                f"</prior_memory_file>\n"
+            )
+
+            try:
+                honcho_session.upload_file(
+                    file=(upload_name, wrapped.encode("utf-8"), "text/plain"),
+                    peer=target_peer,
+                    metadata={
+                        "source": "local_memory",
+                        "original_file": filename,
+                        "target_peer": target_kind,
+                    },
+                )
+                logger.info(
+                    "Uploaded %s to Honcho for %s (%s peer)",
+                    filename,
+                    session_key,
+                    target_kind,
+                )
+                uploaded = True
+            except Exception as e:
+                logger.error("Failed to upload %s to Honcho: %s", filename, e)
+
+        return uploaded
+
+    def get_peer_card(self, session_key: str) -> list[str]:
+        """
+        Fetch the user peer's card — a curated list of key facts.
+
+        Fast, no LLM reasoning. Returns raw structured facts Honcho has
+        inferred about the user (name, role, preferences, patterns).
+        Empty list if unavailable.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return []
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return []
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=200,
+                peer_target=session.user_peer_id,
+                peer_perspective=session.assistant_peer_id,
+            )
+            card = ctx.peer_card or []
+            return card if isinstance(card, list) else [str(card)]
+        except Exception as e:
+            logger.debug("Failed to fetch peer card from Honcho: %s", e)
+            return []
+
+    def search_context(self, session_key: str, query: str, max_tokens: int = 800) -> str:
+        """
+        Semantic search over Honcho session context.
+
+        Returns raw excerpts ranked by relevance to the query. No LLM
+        reasoning — cheaper and faster than dialectic_query. Good for
+        factual lookups where the model will do its own synthesis.
+
+        Args:
+            session_key: Session to search against.
+            query: Search query for semantic matching.
+            max_tokens: Token budget for returned content.
+
+        Returns:
+            Relevant context excerpts as a string, or empty string if none.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return ""
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return ""
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=max_tokens,
+                peer_target=session.user_peer_id,
+                peer_perspective=session.assistant_peer_id,
+                search_query=query,
+            )
+            parts = []
+            if ctx.peer_representation:
+                parts.append(ctx.peer_representation)
+            card = ctx.peer_card or []
+            if card:
+                facts = card if isinstance(card, list) else [str(card)]
+                parts.append("\n".join(f"- {f}" for f in facts))
+            return "\n\n".join(parts)
+        except Exception as e:
+            logger.debug("Honcho search_context failed: %s", e)
+            return ""
+
+    def create_conclusion(self, session_key: str, content: str) -> bool:
+        """Write a conclusion about the user back to Honcho.
+
+        Conclusions are facts the AI peer observes about the user —
+        preferences, corrections, clarifications, project context.
+        They feed into the user's peer card and representation.
+
+        Args:
+            session_key: Session to associate the conclusion with.
+            content: The conclusion text (e.g. "User prefers dark mode").
+
+        Returns:
+            True on success, False on failure.
+        """
+        if not content or not content.strip():
+            return False
+
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No session cached for '%s', skipping conclusion", session_key)
+            return False
+
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+        try:
+            conclusions_scope = assistant_peer.conclusions_of(session.user_peer_id)
+            conclusions_scope.create([{
+                "content": content.strip(),
+                "session_id": session.honcho_session_id,
+            }])
+            logger.info("Created conclusion for %s: %s", session_key, content[:80])
+            return True
+        except Exception as e:
+            logger.error("Failed to create conclusion: %s", e)
+            return False
+
+    def seed_ai_identity(self, session_key: str, content: str, source: str = "manual") -> bool:
+        """
+        Seed the AI peer's Honcho representation from text content.
+
+        Useful for priming AI identity from SOUL.md, exported chats, or
+        any structured description. The content is sent as an assistant
+        peer message so Honcho's reasoning model can incorporate it.
+
+        Args:
+            session_key: The session key to associate with.
+            content: The identity/persona content to seed.
+            source: Metadata tag for the source (e.g. "soul_md", "export").
+
+        Returns:
+            True on success, False on failure.
+        """
+        if not content or not content.strip():
+            return False
+
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No session cached for '%s', skipping AI seed", session_key)
+            return False
+
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            logger.warning("No Honcho session cached for '%s', skipping AI seed", session_key)
+            return False
+
+        try:
+            wrapped = (
+                f"<ai_identity_seed>\n"
+                f"<source>{source}</source>\n"
+                f"\n"
+                f"{content.strip()}\n"
+                f"</ai_identity_seed>"
+            )
+            honcho_session.add_messages([assistant_peer.message(wrapped)])
+            logger.info("Seeded AI identity from '%s' into %s", source, session_key)
+            return True
+        except Exception as e:
+            logger.error("Failed to seed AI identity: %s", e)
+            return False
+
+    def get_ai_representation(self, session_key: str) -> dict[str, str]:
+        """
+        Fetch the AI peer's current Honcho representation.
+
+        Returns:
+            Dict with 'representation' and 'card' keys, empty strings if unavailable.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return {"representation": "", "card": ""}
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return {"representation": "", "card": ""}
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=self._context_tokens,
+                peer_target=session.assistant_peer_id,
+                peer_perspective=session.user_peer_id,
+            )
+            ai_card = ctx.peer_card or []
+            return {
+                "representation": ctx.peer_representation or "",
+                "card": "\n".join(ai_card) if isinstance(ai_card, list) else str(ai_card),
+            }
+        except Exception as e:
+            logger.debug("Failed to fetch AI representation: %s", e)
+            return {"representation": "", "card": ""}
+
+    def list_sessions(self) -> list[dict[str, Any]]:
+        """List all cached sessions."""
+        return [
+            {
+                "key": s.key,
+                "created_at": s.created_at.isoformat(),
+                "updated_at": s.updated_at.isoformat(),
+                "message_count": len(s.messages),
+            }
+            for s in self._cache.values()
+        ]