Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI

Authored by erosika. Builds on #38 and #243. Adds async write support, configurable memory modes, context prefetch pipeline, 4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude), full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B, gateway lifecycle management, and comprehensive docs. Cherry-picks fixes from PRs #831/#832 (adavyas). Co-authored-by: erosika <erosika@users.noreply.github.com> Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-07-21 16:18:55 +00:00 · 2026-03-12 19:05:11 -07:00 · 2026-03-12 19:05:11 -07:00 · 475dd58a8e
commit 475dd58a8e
parent 28ffa8e693 fefc709b2c
26 changed files with 4688 additions and 355 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@ -292,7 +292,6 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
 ---

 ## Important Policies
-
 ### Prompt Caching Must Not Break

 Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
--- a/agent/display.py
+++ b/agent/display.py
@ -540,3 +540,46 @@ def get_cute_tool_message(

    preview = build_tool_preview(tool_name, args) or ""
    return _wrap(f"┊ ⚡ {tool_name[:9]:9} {_trunc(preview, 35)}  {dur}")
+
+
+# =========================================================================
+# Honcho session line (one-liner with clickable OSC 8 hyperlink)
+# =========================================================================
+
+_DIM = "\033[2m"
+_SKY_BLUE = "\033[38;5;117m"
+_ANSI_RESET = "\033[0m"
+
+
+def honcho_session_url(workspace: str, session_name: str) -> str:
+    """Build a Honcho app URL for a session."""
+    from urllib.parse import quote
+    return (
+        f"https://app.honcho.dev/explore"
+        f"?workspace={quote(workspace, safe='')}"
+        f"&view=sessions"
+        f"&session={quote(session_name, safe='')}"
+    )
+
+
+def _osc8_link(url: str, text: str) -> str:
+    """OSC 8 terminal hyperlink (clickable in iTerm2, Ghostty, WezTerm, etc.)."""
+    return f"\033]8;;{url}\033\\{text}\033]8;;\033\\"
+
+
+def honcho_session_line(workspace: str, session_name: str) -> str:
+    """One-line session indicator: `Honcho session: <clickable name>`."""
+    url = honcho_session_url(workspace, session_name)
+    linked_name = _osc8_link(url, f"{_SKY_BLUE}{session_name}{_ANSI_RESET}")
+    return f"{_DIM}Honcho session:{_ANSI_RESET} {linked_name}"
+
+
+def write_tty(text: str) -> None:
+    """Write directly to /dev/tty, bypassing stdout capture."""
+    try:
+        fd = os.open("/dev/tty", os.O_WRONLY)
+        os.write(fd, text.encode("utf-8"))
+        os.close(fd)
+    except OSError:
+        sys.stdout.write(text)
+        sys.stdout.flush()
--- a/cli-config.yaml.example
+++ b/cli-config.yaml.example
@ -669,6 +669,7 @@ display:
  #   all:     Running output updates + final message (default)
  background_process_notifications: all

+
  # Play terminal bell when agent finishes a response.
  # Useful for long-running tasks — your terminal will ding when the agent is done.
  # Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.
--- a/cli.py
+++ b/cli.py
@ -1509,7 +1509,7 @@ class HermesCLI:
                session_db=self._session_db,
                clarify_callback=self._clarify_callback,
                reasoning_callback=self._on_reasoning if self.show_reasoning else None,
-                honcho_session_key=self.session_id,
+                honcho_session_key=None,  # resolved by run_agent via config sessions map / title
                fallback_model=self._fallback_model,
                thinking_callback=self._on_thinking,
                checkpoints_enabled=self.checkpoints_enabled,
@ -2739,6 +2739,28 @@ class HermesCLI:
                            try:
                                if self._session_db.set_session_title(self.session_id, new_title):
                                    _cprint(f"  Session title set: {new_title}")
+                                    # Re-map Honcho session key to new title
+                                    if self.agent and getattr(self.agent, '_honcho', None):
+                                        try:
+                                            hcfg = self.agent._honcho_config
+                                            new_key = (
+                                                hcfg.resolve_session_name(
+                                                    session_title=new_title,
+                                                    session_id=self.agent.session_id,
+                                                )
+                                                if hcfg else new_title
+                                            )
+                                            if new_key and new_key != self.agent._honcho_session_key:
+                                                old_key = self.agent._honcho_session_key
+                                                self.agent._honcho.get_or_create(new_key)
+                                                self.agent._honcho_session_key = new_key
+                                                from tools.honcho_tools import set_session_context
+                                                set_session_context(self.agent._honcho, new_key)
+                                                from agent.display import honcho_session_line, write_tty
+                                                write_tty(honcho_session_line(hcfg.workspace_id, new_key) + "\n")
+                                                _cprint(f"  Honcho session: {old_key} → {new_key}")
+                                        except Exception:
+                                            pass
                                else:
                                    _cprint("  Session not found in database.")
                            except ValueError as e:
@ -3207,6 +3229,12 @@ class HermesCLI:
                f"  ✅ Compressed: {original_count} → {new_count} messages "
                f"(~{approx_tokens:,} → ~{new_tokens:,} tokens)"
            )
+            # Flush Honcho async queue so queued messages land before context resets
+            if self.agent and getattr(self.agent, '_honcho', None):
+                try:
+                    self.agent._honcho.flush_all()
+                except Exception:
+                    pass
        except Exception as e:
            print(f"  ❌ Compression failed: {e}")

@ -3657,6 +3685,7 @@ class HermesCLI:
                if response and pending_message:
                    response = response + "\n\n---\n_[Interrupted - processing new message]_"
            
+            response_previewed = result.get("response_previewed", False) if result else False
            # Display reasoning (thinking) box if enabled and available
            if self.show_reasoning and result:
                reasoning = result.get("last_reasoning")
@ -3675,7 +3704,7 @@ class HermesCLI:
                        display_reasoning = reasoning.strip()
                    _cprint(f"\n{r_top}\n{_DIM}{display_reasoning}{_RST}\n{r_bot}")

-            if response:
+            if response and not response_previewed:
                # Use a Rich Panel for the response box — adapts to terminal
                # width at render time instead of hard-coding border length.
                try:
@ -3696,7 +3725,7 @@ class HermesCLI:
                    box=rich_box.HORIZONTALS,
                    padding=(1, 2),
                ))
-            
+
            # Play terminal bell when agent finishes (if enabled).
            # Works over SSH — the bell propagates to the user's terminal.
            if self.bell_on_complete:
@ -3754,6 +3783,18 @@ class HermesCLI:
        """Run the interactive CLI loop with persistent input at bottom."""
        self.show_banner()

+        # One-line Honcho session indicator (TTY-only, not captured by agent)
+        try:
+            from honcho_integration.client import HonchoClientConfig
+            from agent.display import honcho_session_line, write_tty
+            hcfg = HonchoClientConfig.from_global_config()
+            if hcfg.enabled:
+                sname = hcfg.resolve_session_name(session_id=self.session_id)
+                if sname:
+                    write_tty(honcho_session_line(hcfg.workspace_id, sname) + "\n")
+        except Exception:
+            pass
+
        # If resuming a session, load history and display it immediately
        # so the user has context before typing their first message.
        if self._resumed:
@ -4663,6 +4704,12 @@ class HermesCLI:
            # Unregister terminal_tool callbacks to avoid dangling references
            set_sudo_password_callback(None)
            set_approval_callback(None)
+            # Flush + shut down Honcho async writer (drains queue before exit)
+            if self.agent and getattr(self.agent, '_honcho', None):
+                try:
+                    self.agent._honcho.shutdown()
+                except Exception:
+                    pass
            # Close session in SQLite
            if hasattr(self, '_session_db') and self._session_db and self.agent:
                try:
--- a/docs/honcho-integration-spec.html
+++ b/docs/honcho-integration-spec.html
@ -0,0 +1,698 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>honcho-integration-spec</title>
+<style>
+  :root {
+    --bg:             #0b0e14;
+    --bg-surface:     #11151c;
+    --bg-elevated:    #181d27;
+    --bg-code:        #0d1018;
+    --fg:             #c9d1d9;
+    --fg-bright:      #e6edf3;
+    --fg-muted:       #6e7681;
+    --fg-subtle:      #484f58;
+    --accent:         #7eb8f6;
+    --accent-dim:     #3d6ea5;
+    --accent-glow:    rgba(126, 184, 246, 0.08);
+    --green:          #7ee6a8;
+    --green-dim:      #2ea04f;
+    --orange:         #e6a855;
+    --red:            #f47067;
+    --purple:         #bc8cff;
+    --cyan:           #56d4dd;
+    --border:         #21262d;
+    --border-subtle:  #161b22;
+    --radius:         6px;
+    --font-sans:      'New York', ui-serif, 'Iowan Old Style', 'Apple Garamond', Baskerville, 'Times New Roman', 'Noto Emoji', serif;
+    --font-mono:      'Departure Mono', 'Noto Emoji', monospace;
+  }
+
+  *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+  html { scroll-behavior: smooth; scroll-padding-top: 2rem; }
+  body {
+    font-family: var(--font-sans);
+    background: var(--bg);
+    color: var(--fg);
+    line-height: 1.7;
+    font-size: 15px;
+    -webkit-font-smoothing: antialiased;
+  }
+
+  .container { max-width: 860px; margin: 0 auto; padding: 3rem 2rem 6rem; }
+
+  .hero {
+    text-align: center;
+    padding: 4rem 0 3rem;
+    border-bottom: 1px solid var(--border);
+    margin-bottom: 3rem;
+  }
+  .hero h1 { font-family: var(--font-mono); font-size: 2.2rem; font-weight: 700; color: var(--fg-bright); letter-spacing: -0.03em; margin-bottom: 0.5rem; }
+  .hero h1 span { color: var(--accent); }
+  .hero .subtitle { font-family: var(--font-sans); color: var(--fg-muted); font-size: 0.92rem; max-width: 560px; margin: 0 auto; line-height: 1.6; }
+  .hero .meta { margin-top: 1.5rem; display: flex; justify-content: center; gap: 1.5rem; flex-wrap: wrap; }
+  .hero .meta span { font-size: 0.8rem; color: var(--fg-subtle); font-family: var(--font-mono); }
+
+  .toc { background: var(--bg-surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.5rem 2rem; margin-bottom: 3rem; }
+  .toc h2 { font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--fg-muted); margin-bottom: 1rem; }
+  .toc ol { list-style: none; counter-reset: toc; columns: 2; column-gap: 2rem; }
+  .toc li { counter-increment: toc; break-inside: avoid; margin-bottom: 0.35rem; }
+  .toc li::before { content: counter(toc, decimal-leading-zero) " "; color: var(--fg-subtle); font-family: var(--font-mono); font-size: 0.75rem; margin-right: 0.25rem; }
+  .toc a { font-family: var(--font-mono); color: var(--fg); text-decoration: none; font-size: 0.82rem; transition: color 0.15s; }
+  .toc a:hover { color: var(--accent); }
+
+  section { margin-bottom: 4rem; }
+  section + section { padding-top: 1rem; }
+
+  h2 { font-family: var(--font-mono); font-size: 1.3rem; font-weight: 700; color: var(--fg-bright); letter-spacing: -0.01em; margin-bottom: 1.25rem; padding-bottom: 0.5rem; border-bottom: 1px solid var(--border); }
+  h3 { font-family: var(--font-mono); font-size: 1rem; font-weight: 600; color: var(--fg-bright); margin-top: 2rem; margin-bottom: 0.75rem; }
+  h4 { font-family: var(--font-mono); font-size: 0.9rem; font-weight: 600; color: var(--accent); margin-top: 1.5rem; margin-bottom: 0.5rem; }
+
+  p { margin-bottom: 1rem; font-size: 0.95rem; line-height: 1.75; }
+  strong { color: var(--fg-bright); font-weight: 600; }
+  a { color: var(--accent); text-decoration: none; }
+  a:hover { text-decoration: underline; }
+
+  ul, ol { margin-bottom: 1rem; padding-left: 1.5rem; font-size: 0.93rem; line-height: 1.7; }
+  li { margin-bottom: 0.35rem; }
+  li::marker { color: var(--fg-subtle); }
+
+  .table-wrap { overflow-x: auto; margin-bottom: 1.5rem; }
+  table { width: 100%; border-collapse: collapse; font-size: 0.88rem; }
+  th, td { text-align: left; padding: 0.6rem 1rem; border-bottom: 1px solid var(--border-subtle); }
+  th { font-family: var(--font-mono); font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.06em; color: var(--fg-muted); background: var(--bg-surface); border-bottom-color: var(--border); white-space: nowrap; }
+  td { font-family: var(--font-sans); font-size: 0.88rem; color: var(--fg); }
+  tr:hover td { background: var(--accent-glow); }
+  td code { background: var(--bg-elevated); padding: 0.15em 0.4em; border-radius: 3px; font-family: var(--font-mono); font-size: 0.82em; color: var(--cyan); }
+
+  pre { background: var(--bg-code); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.25rem 1.5rem; overflow-x: auto; margin-bottom: 1.5rem; font-family: var(--font-mono); font-size: 0.82rem; line-height: 1.65; color: var(--fg); }
+  pre code { background: none; padding: 0; color: inherit; font-size: inherit; }
+  code { font-family: var(--font-mono); font-size: 0.85em; }
+  p code, li code { background: var(--bg-elevated); padding: 0.15em 0.4em; border-radius: 3px; color: var(--cyan); font-size: 0.85em; }
+
+  .kw { color: var(--purple); }
+  .str { color: var(--green); }
+  .cm { color: var(--fg-subtle); font-style: italic; }
+  .num { color: var(--orange); }
+  .key { color: var(--accent); }
+
+  .mermaid { margin: 1.5rem 0 2rem; text-align: center; }
+  .mermaid svg { max-width: 100%; height: auto; }
+
+  .callout { font-family: var(--font-sans); background: var(--bg-surface); border-left: 3px solid var(--accent-dim); border-radius: 0 var(--radius) var(--radius) 0; padding: 1rem 1.25rem; margin-bottom: 1.5rem; font-size: 0.88rem; color: var(--fg-muted); line-height: 1.6; }
+  .callout strong { font-family: var(--font-mono); color: var(--fg-bright); }
+  .callout.success { border-left-color: var(--green-dim); }
+  .callout.warn { border-left-color: var(--orange); }
+
+  .badge { display: inline-block; font-family: var(--font-mono); font-size: 0.65rem; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; padding: 0.2em 0.6em; border-radius: 3px; vertical-align: middle; margin-left: 0.4rem; }
+  .badge-done { background: var(--green-dim); color: #fff; }
+  .badge-wip { background: var(--orange); color: #0b0e14; }
+  .badge-todo { background: var(--fg-subtle); color: var(--fg); }
+
+  .checklist { list-style: none; padding-left: 0; }
+  .checklist li { padding-left: 1.5rem; position: relative; margin-bottom: 0.5rem; }
+  .checklist li::before { position: absolute; left: 0; font-family: var(--font-mono); font-size: 0.85rem; }
+  .checklist li.done { color: var(--fg-muted); }
+  .checklist li.done::before { content: "\2713"; color: var(--green); }
+  .checklist li.todo::before { content: "\25CB"; color: var(--fg-subtle); }
+  .checklist li.wip::before { content: "\25D4"; color: var(--orange); }
+
+  .compare { display: grid; grid-template-columns: 1fr 1fr; gap: 1rem; margin-bottom: 2rem; }
+  .compare-card { background: var(--bg-surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.25rem; }
+  .compare-card h4 { margin-top: 0; font-size: 0.82rem; }
+  .compare-card.after { border-color: var(--accent-dim); }
+  .compare-card ul { font-family: var(--font-mono); padding-left: 1.25rem; font-size: 0.8rem; }
+
+  hr { border: none; border-top: 1px solid var(--border); margin: 3rem 0; }
+
+  .progress-bar { position: fixed; top: 0; left: 0; height: 2px; background: var(--accent); z-index: 999; transition: width 0.1s linear; }
+
+  @media (max-width: 640px) {
+    .container { padding: 2rem 1rem 4rem; }
+    .hero h1 { font-size: 1.6rem; }
+    .toc ol { columns: 1; }
+    .compare { grid-template-columns: 1fr; }
+    table { font-size: 0.8rem; }
+    th, td { padding: 0.4rem 0.6rem; }
+  }
+</style>
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link href="https://fonts.googleapis.com/css2?family=Noto+Emoji&display=swap" rel="stylesheet">
+<style>
+  @font-face {
+    font-family: 'Departure Mono';
+    src: url('https://cdn.jsdelivr.net/gh/rektdeckard/departure-mono@latest/fonts/DepartureMono-Regular.woff2') format('woff2');
+    font-weight: normal;
+    font-style: normal;
+    font-display: swap;
+  }
+</style>
+</head>
+<body>
+
+<div class="progress-bar" id="progress"></div>
+
+<div class="container">
+
+<header class="hero">
+  <h1>honcho<span>-integration-spec</span></h1>
+  <p class="subtitle">Comparison of Hermes Agent vs. openclaw-honcho — and a porting spec for bringing Hermes patterns into other Honcho integrations.</p>
+  <div class="meta">
+    <span>hermes-agent / openclaw-honcho</span>
+    <span>Python + TypeScript</span>
+    <span>2026-03-09</span>
+  </div>
+</header>
+
+<nav class="toc">
+  <h2>Contents</h2>
+  <ol>
+    <li><a href="#overview">Overview</a></li>
+    <li><a href="#architecture">Architecture comparison</a></li>
+    <li><a href="#diff-table">Diff table</a></li>
+    <li><a href="#patterns">Hermes patterns to port</a></li>
+    <li><a href="#spec-async">Spec: async prefetch</a></li>
+    <li><a href="#spec-reasoning">Spec: dynamic reasoning level</a></li>
+    <li><a href="#spec-modes">Spec: per-peer memory modes</a></li>
+    <li><a href="#spec-identity">Spec: AI peer identity formation</a></li>
+    <li><a href="#spec-sessions">Spec: session naming strategies</a></li>
+    <li><a href="#spec-cli">Spec: CLI surface injection</a></li>
+    <li><a href="#openclaw-checklist">openclaw-honcho checklist</a></li>
+    <li><a href="#nanobot-checklist">nanobot-honcho checklist</a></li>
+  </ol>
+</nav>
+
+<!-- OVERVIEW -->
+<section id="overview">
+  <h2>Overview</h2>
+
+  <p>Two independent Honcho integrations have been built for two different agent runtimes: <strong>Hermes Agent</strong> (Python, baked into the runner) and <strong>openclaw-honcho</strong> (TypeScript plugin via hook/tool API). Both use the same Honcho peer paradigm — dual peer model, <code>session.context()</code>, <code>peer.chat()</code> — but they made different tradeoffs at every layer.</p>
+
+  <p>This document maps those tradeoffs and defines a porting spec: a set of Hermes-originated patterns, each stated as an integration-agnostic interface, that any Honcho integration can adopt regardless of runtime or language.</p>
+
+  <div class="callout">
+    <strong>Scope</strong> Both integrations work correctly today. This spec is about the delta — patterns in Hermes that are worth propagating and patterns in openclaw-honcho that Hermes should eventually adopt. The spec is additive, not prescriptive.
+  </div>
+</section>
+
+<!-- ARCHITECTURE -->
+<section id="architecture">
+  <h2>Architecture comparison</h2>
+
+  <h3>Hermes: baked-in runner</h3>
+  <p>Honcho is initialised directly inside <code>AIAgent.__init__</code>. There is no plugin boundary. Session management, context injection, async prefetch, and CLI surface are all first-class concerns of the runner. Context is injected once per session (baked into <code>_cached_system_prompt</code>) and never re-fetched mid-session — this maximises prefix cache hits at the LLM provider.</p>
+
+  <div class="mermaid">
+%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1f3150', 'primaryTextColor': '#c9d1d9', 'primaryBorderColor': '#3d6ea5', 'lineColor': '#3d6ea5', 'secondaryColor': '#162030', 'tertiaryColor': '#11151c' }}}%%
+flowchart TD
+    U["user message"] --> P["_honcho_prefetch()<br/>(reads cache — no HTTP)"]
+    P --> SP["_build_system_prompt()<br/>(first turn only, cached)"]
+    SP --> LLM["LLM call"]
+    LLM --> R["response"]
+    R --> FP["_honcho_fire_prefetch()<br/>(daemon threads, turn end)"]
+    FP --> C1["prefetch_context() thread"]
+    FP --> C2["prefetch_dialectic() thread"]
+    C1 --> CACHE["_context_cache / _dialectic_cache"]
+    C2 --> CACHE
+
+    style U fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style P fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
+    style SP fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
+    style LLM fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style R fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style FP fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
+    style C1 fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
+    style C2 fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
+    style CACHE fill:#11151c,stroke:#484f58,color:#6e7681
+  </div>
+
+  <h3>openclaw-honcho: hook-based plugin</h3>
+  <p>The plugin registers hooks against OpenClaw's event bus. Context is fetched synchronously inside <code>before_prompt_build</code> on every turn. Message capture happens in <code>agent_end</code>. The multi-agent hierarchy is tracked via <code>subagent_spawned</code>. This model is correct but every turn pays a blocking Honcho round-trip before the LLM call can begin.</p>
+
+  <div class="mermaid">
+%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1f3150', 'primaryTextColor': '#c9d1d9', 'primaryBorderColor': '#3d6ea5', 'lineColor': '#3d6ea5', 'secondaryColor': '#162030', 'tertiaryColor': '#11151c' }}}%%
+flowchart TD
+    U2["user message"] --> BPB["before_prompt_build<br/>(BLOCKING HTTP — every turn)"]
+    BPB --> CTX["session.context()"]
+    CTX --> SP2["system prompt assembled"]
+    SP2 --> LLM2["LLM call"]
+    LLM2 --> R2["response"]
+    R2 --> AE["agent_end hook"]
+    AE --> SAVE["session.addMessages()<br/>session.setMetadata()"]
+
+    style U2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style BPB fill:#3a1515,stroke:#f47067,color:#c9d1d9
+    style CTX fill:#3a1515,stroke:#f47067,color:#c9d1d9
+    style SP2 fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
+    style LLM2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style R2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style AE fill:#162030,stroke:#3d6ea5,color:#c9d1d9
+    style SAVE fill:#11151c,stroke:#484f58,color:#6e7681
+  </div>
+</section>
+
+<!-- DIFF TABLE -->
+<section id="diff-table">
+  <h2>Diff table</h2>
+
+  <div class="table-wrap">
+    <table>
+      <thead>
+        <tr>
+          <th>Dimension</th>
+          <th>Hermes Agent</th>
+          <th>openclaw-honcho</th>
+        </tr>
+      </thead>
+      <tbody>
+        <tr>
+          <td><strong>Context injection timing</strong></td>
+          <td>Once per session (cached). Zero HTTP on response path after turn 1.</td>
+          <td>Every turn, blocking. Fresh context per turn but adds latency.</td>
+        </tr>
+        <tr>
+          <td><strong>Prefetch strategy</strong></td>
+          <td>Daemon threads fire at turn end; consumed next turn from cache.</td>
+          <td>None. Blocking call at prompt-build time.</td>
+        </tr>
+        <tr>
+          <td><strong>Dialectic (peer.chat)</strong></td>
+          <td>Prefetched async; result injected into system prompt next turn.</td>
+          <td>On-demand via <code>honcho_recall</code> / <code>honcho_analyze</code> tools.</td>
+        </tr>
+        <tr>
+          <td><strong>Reasoning level</strong></td>
+          <td>Dynamic: scales with message length. Floor = config default. Cap = "high".</td>
+          <td>Fixed per tool: recall=minimal, analyze=medium.</td>
+        </tr>
+        <tr>
+          <td><strong>Memory modes</strong></td>
+          <td><code>user_memory_mode</code> / <code>agent_memory_mode</code>: hybrid / honcho / local.</td>
+          <td>None. Always writes to Honcho.</td>
+        </tr>
+        <tr>
+          <td><strong>Write frequency</strong></td>
+          <td>async (background queue), turn, session, N turns.</td>
+          <td>After every agent_end (no control).</td>
+        </tr>
+        <tr>
+          <td><strong>AI peer identity</strong></td>
+          <td><code>observe_me=True</code>, <code>seed_ai_identity()</code>, <code>get_ai_representation()</code>, SOUL.md → AI peer.</td>
+          <td>Agent files uploaded to agent peer at setup. No ongoing self-observation seeding.</td>
+        </tr>
+        <tr>
+          <td><strong>Context scope</strong></td>
+          <td>User peer + AI peer representation, both injected.</td>
+          <td>User peer (owner) representation + conversation summary. <code>peerPerspective</code> on context call.</td>
+        </tr>
+        <tr>
+          <td><strong>Session naming</strong></td>
+          <td>per-directory / global / manual map / title-based.</td>
+          <td>Derived from platform session key.</td>
+        </tr>
+        <tr>
+          <td><strong>Multi-agent</strong></td>
+          <td>Single-agent only.</td>
+          <td>Parent observer hierarchy via <code>subagent_spawned</code>.</td>
+        </tr>
+        <tr>
+          <td><strong>Tool surface</strong></td>
+          <td>Single <code>query_user_context</code> tool (on-demand dialectic).</td>
+          <td>6 tools: session, profile, search, context (fast) + recall, analyze (LLM).</td>
+        </tr>
+        <tr>
+          <td><strong>Platform metadata</strong></td>
+          <td>Not stripped.</td>
+          <td>Explicitly stripped before Honcho storage.</td>
+        </tr>
+        <tr>
+          <td><strong>Message dedup</strong></td>
+          <td>None (sends on every save cycle).</td>
+          <td><code>lastSavedIndex</code> in session metadata prevents re-sending.</td>
+        </tr>
+        <tr>
+          <td><strong>CLI surface in prompt</strong></td>
+          <td>Management commands injected into system prompt. Agent knows its own CLI.</td>
+          <td>Not injected.</td>
+        </tr>
+        <tr>
+          <td><strong>AI peer name in identity</strong></td>
+          <td>Replaces "Hermes Agent" in DEFAULT_AGENT_IDENTITY when configured.</td>
+          <td>Not implemented.</td>
+        </tr>
+        <tr>
+          <td><strong>QMD / local file search</strong></td>
+          <td>Not implemented.</td>
+          <td>Passthrough tools when QMD backend configured.</td>
+        </tr>
+        <tr>
+          <td><strong>Workspace metadata</strong></td>
+          <td>Not implemented.</td>
+          <td><code>agentPeerMap</code> in workspace metadata tracks agent&#8594;peer ID.</td>
+        </tr>
+      </tbody>
+    </table>
+  </div>
+</section>
+
+<!-- PATTERNS -->
+<section id="patterns">
+  <h2>Hermes patterns to port</h2>
+
+  <p>Six patterns from Hermes are worth adopting in any Honcho integration. They are described below as integration-agnostic interfaces — the implementation will differ per runtime, but the contract is the same.</p>
+
+  <div class="compare">
+    <div class="compare-card">
+      <h4>Patterns Hermes contributes</h4>
+      <ul>
+        <li>Async prefetch (zero-latency)</li>
+        <li>Dynamic reasoning level</li>
+        <li>Per-peer memory modes</li>
+        <li>AI peer identity formation</li>
+        <li>Session naming strategies</li>
+        <li>CLI surface injection</li>
+      </ul>
+    </div>
+    <div class="compare-card after">
+      <h4>Patterns openclaw contributes back</h4>
+      <ul>
+        <li>lastSavedIndex dedup</li>
+        <li>Platform metadata stripping</li>
+        <li>Multi-agent observer hierarchy</li>
+        <li>peerPerspective on context()</li>
+        <li>Tiered tool surface (fast/LLM)</li>
+        <li>Workspace agentPeerMap</li>
+      </ul>
+    </div>
+  </div>
+</section>
+
+<!-- SPEC: ASYNC PREFETCH -->
+<section id="spec-async">
+  <h2>Spec: async prefetch</h2>
+
+  <h3>Problem</h3>
+  <p>Calling <code>session.context()</code> and <code>peer.chat()</code> synchronously before each LLM call adds 200–800ms of Honcho round-trip latency to every turn. Users experience this as the agent "thinking slowly."</p>
+
+  <h3>Pattern</h3>
+  <p>Fire both calls as non-blocking background work at the <strong>end</strong> of each turn. Store results in a per-session cache keyed by session ID. At the <strong>start</strong> of the next turn, pop from cache — the HTTP is already done. First turn is cold (empty cache); all subsequent turns are zero-latency on the response path.</p>
+
+  <h3>Interface contract</h3>
+  <pre><code><span class="cm">// TypeScript (openclaw / nanobot plugin shape)</span>
+
+<span class="kw">interface</span> <span class="key">AsyncPrefetch</span> {
+  <span class="cm">// Fire context + dialectic fetches at turn end. Non-blocking.</span>
+  firePrefetch(sessionId: <span class="str">string</span>, userMessage: <span class="str">string</span>): <span class="kw">void</span>;
+
+  <span class="cm">// Pop cached results at turn start. Returns empty if cache is cold.</span>
+  popContextResult(sessionId: <span class="str">string</span>): ContextResult | <span class="kw">null</span>;
+  popDialecticResult(sessionId: <span class="str">string</span>): <span class="str">string</span> | <span class="kw">null</span>;
+}
+
+<span class="kw">type</span> <span class="key">ContextResult</span> = {
+  representation: <span class="str">string</span>;
+  card: <span class="str">string</span>[];
+  aiRepresentation?: <span class="str">string</span>;  <span class="cm">// AI peer context if enabled</span>
+  summary?: <span class="str">string</span>;            <span class="cm">// conversation summary if fetched</span>
+};</code></pre>
+
+  <h3>Implementation notes</h3>
+  <ul>
+    <li>Python: <code>threading.Thread(daemon=True)</code>. Write to <code>dict[session_id, result]</code> — GIL makes this safe for simple writes.</li>
+    <li>TypeScript: <code>Promise</code> stored in <code>Map&lt;string, Promise&lt;ContextResult&gt;&gt;</code>. Await at pop time. If not resolved yet, skip (return null) — do not block.</li>
+    <li>The pop is destructive: clears the cache entry after reading so stale data never accumulates.</li>
+    <li>Prefetch should also fire on first turn (even though it won't be consumed until turn 2) — this ensures turn 2 is never cold.</li>
+  </ul>
+
+  <h3>openclaw-honcho adoption</h3>
+  <p>Move <code>session.context()</code> from <code>before_prompt_build</code> to a post-<code>agent_end</code> background task. Store result in <code>state.contextCache</code>. In <code>before_prompt_build</code>, read from cache instead of calling Honcho. If cache is empty (turn 1), inject nothing — the prompt is still valid without Honcho context on the first turn.</p>
+</section>
+
+<!-- SPEC: DYNAMIC REASONING LEVEL -->
+<section id="spec-reasoning">
+  <h2>Spec: dynamic reasoning level</h2>
+
+  <h3>Problem</h3>
+  <p>Honcho's dialectic endpoint supports reasoning levels from <code>minimal</code> to <code>max</code>. A fixed level per tool wastes budget on simple queries and under-serves complex ones.</p>
+
+  <h3>Pattern</h3>
+  <p>Select the reasoning level dynamically based on the user's message. Use the configured default as a floor. Bump by message length. Cap auto-selection at <code>high</code> — never select <code>max</code> automatically.</p>
+
+  <h3>Interface contract</h3>
+  <pre><code><span class="cm">// Shared helper — identical logic in any language</span>
+
+<span class="kw">const</span> LEVELS = [<span class="str">"minimal"</span>, <span class="str">"low"</span>, <span class="str">"medium"</span>, <span class="str">"high"</span>, <span class="str">"max"</span>];
+
+<span class="kw">function</span> <span class="key">dynamicReasoningLevel</span>(
+  query: <span class="str">string</span>,
+  configDefault: <span class="str">string</span> = <span class="str">"low"</span>
+): <span class="str">string</span> {
+  <span class="kw">const</span> baseIdx = Math.max(<span class="num">0</span>, LEVELS.indexOf(configDefault));
+  <span class="kw">const</span> n = query.length;
+  <span class="kw">const</span> bump = n &lt; <span class="num">120</span> ? <span class="num">0</span> : n &lt; <span class="num">400</span> ? <span class="num">1</span> : <span class="num">2</span>;
+  <span class="kw">return</span> LEVELS[Math.min(baseIdx + bump, <span class="num">3</span>)]; <span class="cm">// cap at "high" (idx 3)</span>
+}</code></pre>
+
+  <h3>Config key</h3>
+  <p>Add a <code>dialecticReasoningLevel</code> config field (string, default <code>"low"</code>). This sets the floor. Users can raise or lower it. The dynamic bump always applies on top.</p>
+
+  <h3>openclaw-honcho adoption</h3>
+  <p>Apply in <code>honcho_recall</code> and <code>honcho_analyze</code>: replace the fixed <code>reasoningLevel</code> with the dynamic selector. <code>honcho_recall</code> should use floor <code>"minimal"</code> and <code>honcho_analyze</code> floor <code>"medium"</code> — both still bump with message length.</p>
+</section>
+
+<!-- SPEC: PER-PEER MEMORY MODES -->
+<section id="spec-modes">
+  <h2>Spec: per-peer memory modes</h2>
+
+  <h3>Problem</h3>
+  <p>Users want independent control over whether user context and agent context are written locally, to Honcho, or both. A single <code>memoryMode</code> shorthand is not granular enough.</p>
+
+  <h3>Pattern</h3>
+  <p>Three modes per peer: <code>hybrid</code> (write both local + Honcho), <code>honcho</code> (Honcho only, disable local files), <code>local</code> (local files only, skip Honcho sync for this peer). Two orthogonal axes: user peer and agent peer.</p>
+
+  <h3>Config schema</h3>
+  <pre><code><span class="cm">// ~/.openclaw/openclaw.json  (or ~/.nanobot/config.json)</span>
+{
+  <span class="str">"plugins"</span>: {
+    <span class="str">"openclaw-honcho"</span>: {
+      <span class="str">"config"</span>: {
+        <span class="str">"apiKey"</span>: <span class="str">"..."</span>,
+        <span class="str">"memoryMode"</span>: <span class="str">"hybrid"</span>,          <span class="cm">// shorthand: both peers</span>
+        <span class="str">"userMemoryMode"</span>: <span class="str">"honcho"</span>,       <span class="cm">// override for user peer</span>
+        <span class="str">"agentMemoryMode"</span>: <span class="str">"hybrid"</span>       <span class="cm">// override for agent peer</span>
+      }
+    }
+  }
+}</code></pre>
+
+  <h3>Resolution order</h3>
+  <ol>
+    <li>Per-peer field (<code>userMemoryMode</code> / <code>agentMemoryMode</code>) — wins if present.</li>
+    <li>Shorthand <code>memoryMode</code> — applies to both peers as default.</li>
+    <li>Hardcoded default: <code>"hybrid"</code>.</li>
+  </ol>
+
+  <h3>Effect on Honcho sync</h3>
+  <ul>
+    <li><code>userMemoryMode=local</code>: skip adding user peer messages to Honcho.</li>
+    <li><code>agentMemoryMode=local</code>: skip adding assistant peer messages to Honcho.</li>
+    <li>Both local: skip <code>session.addMessages()</code> entirely.</li>
+    <li><code>userMemoryMode=honcho</code>: disable local USER.md writes.</li>
+    <li><code>agentMemoryMode=honcho</code>: disable local MEMORY.md / SOUL.md writes.</li>
+  </ul>
+</section>
+
+<!-- SPEC: AI PEER IDENTITY -->
+<section id="spec-identity">
+  <h2>Spec: AI peer identity formation</h2>
+
+  <h3>Problem</h3>
+  <p>Honcho builds the user's representation organically by observing what the user says. The same mechanism exists for the AI peer — but only if <code>observe_me=True</code> is set for the agent peer. Without it, the agent peer accumulates nothing and Honcho's AI-side model never forms.</p>
+
+  <p>Additionally, existing persona files (SOUL.md, IDENTITY.md) should seed the AI peer's Honcho representation at first activation, rather than waiting for it to emerge from scratch.</p>
+
+  <h3>Part A: observe_me=True for agent peer</h3>
+  <pre><code><span class="cm">// TypeScript — in session.addPeers() call</span>
+<span class="kw">await</span> session.addPeers([
+  [ownerPeer.id, { observeMe: <span class="kw">true</span>,  observeOthers: <span class="kw">false</span> }],
+  [agentPeer.id, { observeMe: <span class="kw">true</span>,  observeOthers: <span class="kw">true</span>  }], <span class="cm">// was false</span>
+]);</code></pre>
+
+  <p>This is a one-line change but foundational. Without it, Honcho's AI peer representation stays empty regardless of what the agent says.</p>
+
+  <h3>Part B: seedAiIdentity()</h3>
+  <pre><code><span class="kw">async function</span> <span class="key">seedAiIdentity</span>(
+  session: HonchoSession,
+  agentPeer: Peer,
+  content: <span class="str">string</span>,
+  source: <span class="str">string</span>
+): Promise&lt;<span class="kw">boolean</span>&gt; {
+  <span class="kw">const</span> wrapped = [
+    <span class="str">`&lt;ai_identity_seed&gt;`</span>,
+    <span class="str">`&lt;source&gt;${source}&lt;/source&gt;`</span>,
+    <span class="str">``</span>,
+    content.trim(),
+    <span class="str">`&lt;/ai_identity_seed&gt;`</span>,
+  ].join(<span class="str">"\n"</span>);
+
+  <span class="kw">await</span> agentPeer.addMessage(<span class="str">"assistant"</span>, wrapped);
+  <span class="kw">return true</span>;
+}</code></pre>
+
+  <h3>Part C: migrate agent files at setup</h3>
+  <p>During <code>openclaw honcho setup</code>, upload agent-self files (SOUL.md, IDENTITY.md, AGENTS.md, BOOTSTRAP.md) to the agent peer using <code>seedAiIdentity()</code> instead of <code>session.uploadFile()</code>. This routes the content through Honcho's observation pipeline rather than the file store.</p>
+
+  <h3>Part D: AI peer name in identity</h3>
+  <p>When the agent has a configured name (non-default), inject it into the agent's self-identity prefix. In OpenClaw this means adding to the injected system prompt section:</p>
+  <pre><code><span class="cm">// In context hook return value</span>
+<span class="kw">return</span> {
+  systemPrompt: [
+    agentName ? <span class="str">`You are ${agentName}.`</span> : <span class="str">""</span>,
+    <span class="str">"## User Memory Context"</span>,
+    ...sections,
+  ].filter(Boolean).join(<span class="str">"\n\n"</span>)
+};</code></pre>
+
+  <h3>CLI surface: honcho identity subcommand</h3>
+  <pre><code>openclaw honcho identity &lt;file&gt;    <span class="cm"># seed from file</span>
+openclaw honcho identity --show    <span class="cm"># show current AI peer representation</span></code></pre>
+</section>
+
+<!-- SPEC: SESSION NAMING -->
+<section id="spec-sessions">
+  <h2>Spec: session naming strategies</h2>
+
+  <h3>Problem</h3>
+  <p>When Honcho is used across multiple projects or directories, a single global session means every project shares the same context. Per-directory sessions provide isolation without requiring users to name sessions manually.</p>
+
+  <h3>Strategies</h3>
+  <div class="table-wrap">
+    <table>
+      <thead><tr><th>Strategy</th><th>Session key</th><th>When to use</th></tr></thead>
+      <tbody>
+        <tr><td><code>per-directory</code></td><td>basename of CWD</td><td>Default. Each project gets its own session.</td></tr>
+        <tr><td><code>global</code></td><td>fixed string <code>"global"</code></td><td>Single cross-project session.</td></tr>
+        <tr><td>manual map</td><td>user-configured per path</td><td><code>sessions</code> config map overrides directory basename.</td></tr>
+        <tr><td>title-based</td><td>sanitized session title</td><td>When agent supports named sessions; title set mid-conversation.</td></tr>
+      </tbody>
+    </table>
+  </div>
+
+  <h3>Config schema</h3>
+  <pre><code>{
+  <span class="str">"sessionStrategy"</span>: <span class="str">"per-directory"</span>,   <span class="cm">// "per-directory" | "global"</span>
+  <span class="str">"sessionPeerPrefix"</span>: <span class="kw">false</span>,            <span class="cm">// prepend peer name to session key</span>
+  <span class="str">"sessions"</span>: {                            <span class="cm">// manual overrides</span>
+    <span class="str">"/home/user/projects/foo"</span>: <span class="str">"foo-project"</span>
+  }
+}</code></pre>
+
+  <h3>CLI surface</h3>
+  <pre><code>openclaw honcho sessions              <span class="cm"># list all mappings</span>
+openclaw honcho map &lt;name&gt;           <span class="cm"># map cwd to session name</span>
+openclaw honcho map                   <span class="cm"># no-arg = list mappings</span></code></pre>
+
+  <p>Resolution order: manual map wins &rarr; session title &rarr; directory basename &rarr; platform key.</p>
+</section>
+
+<!-- SPEC: CLI SURFACE INJECTION -->
+<section id="spec-cli">
+  <h2>Spec: CLI surface injection</h2>
+
+  <h3>Problem</h3>
+  <p>When a user asks "how do I change my memory settings?" or "what Honcho commands are available?" the agent either hallucinates or says it doesn't know. The agent should know its own management interface.</p>
+
+  <h3>Pattern</h3>
+  <p>When Honcho is active, append a compact command reference to the system prompt. The agent can cite these commands directly instead of guessing.</p>
+
+  <pre><code><span class="cm">// In context hook, append to systemPrompt</span>
+<span class="kw">const</span> honchoSection = [
+  <span class="str">"# Honcho memory integration"</span>,
+  <span class="str">`Active. Session: ${sessionKey}. Mode: ${mode}.`</span>,
+  <span class="str">"Management commands:"</span>,
+  <span class="str">"  openclaw honcho status                    — show config + connection"</span>,
+  <span class="str">"  openclaw honcho mode [hybrid|honcho|local] — show or set memory mode"</span>,
+  <span class="str">"  openclaw honcho sessions                  — list session mappings"</span>,
+  <span class="str">"  openclaw honcho map &lt;name&gt;                — map directory to session"</span>,
+  <span class="str">"  openclaw honcho identity [file] [--show]  — seed or show AI identity"</span>,
+  <span class="str">"  openclaw honcho setup                     — full interactive wizard"</span>,
+].join(<span class="str">"\n"</span>);</code></pre>
+
+  <div class="callout warn">
+    <strong>Keep it compact.</strong> This section is injected every turn. Keep it under 300 chars of context. List commands, not explanations — the agent can explain them on request.
+  </div>
+</section>
+
+<!-- OPENCLAW CHECKLIST -->
+<section id="openclaw-checklist">
+  <h2>openclaw-honcho checklist</h2>
+
+  <p>Ordered by impact. Each item maps to a spec section above.</p>
+
+  <ul class="checklist">
+    <li class="todo"><strong>Async prefetch</strong> — move <code>session.context()</code> out of <code>before_prompt_build</code> into post-<code>agent_end</code> background Promise. Pop from cache at prompt build. (<a href="#spec-async">spec</a>)</li>
+    <li class="todo"><strong>observe_me=True for agent peer</strong> — one-line change in <code>session.addPeers()</code> config for agent peer. (<a href="#spec-identity">spec</a>)</li>
+    <li class="todo"><strong>Dynamic reasoning level</strong> — add <code>dynamicReasoningLevel()</code> helper; apply in <code>honcho_recall</code> and <code>honcho_analyze</code>. Add <code>dialecticReasoningLevel</code> to config schema. (<a href="#spec-reasoning">spec</a>)</li>
+    <li class="todo"><strong>Per-peer memory modes</strong> — add <code>userMemoryMode</code> / <code>agentMemoryMode</code> to config; gate Honcho sync and local writes accordingly. (<a href="#spec-modes">spec</a>)</li>
+    <li class="todo"><strong>seedAiIdentity()</strong> — add helper; apply during setup migration for SOUL.md / IDENTITY.md instead of <code>session.uploadFile()</code>. (<a href="#spec-identity">spec</a>)</li>
+    <li class="todo"><strong>Session naming strategies</strong> — add <code>sessionStrategy</code>, <code>sessions</code> map, <code>sessionPeerPrefix</code> to config; implement resolution function. (<a href="#spec-sessions">spec</a>)</li>
+    <li class="todo"><strong>CLI surface injection</strong> — append command reference to <code>before_prompt_build</code> return value when Honcho is active. (<a href="#spec-cli">spec</a>)</li>
+    <li class="todo"><strong>honcho identity subcommand</strong> — add <code>openclaw honcho identity</code> CLI command. (<a href="#spec-identity">spec</a>)</li>
+    <li class="todo"><strong>AI peer name injection</strong> — if <code>aiPeer</code> name configured, prepend to injected system prompt. (<a href="#spec-identity">spec</a>)</li>
+    <li class="todo"><strong>honcho mode / honcho sessions / honcho map</strong> — CLI parity with Hermes. (<a href="#spec-sessions">spec</a>)</li>
+  </ul>
+
+  <div class="callout success">
+    <strong>Already done in openclaw-honcho (do not re-implement):</strong> lastSavedIndex dedup, platform metadata stripping, multi-agent parent observer hierarchy, peerPerspective on context(), tiered tool surface (fast/LLM), workspace agentPeerMap, QMD passthrough, self-hosted Honcho support.
+  </div>
+</section>
+
+<!-- NANOBOT CHECKLIST -->
+<section id="nanobot-checklist">
+  <h2>nanobot-honcho checklist</h2>
+
+  <p>nanobot-honcho is a greenfield integration. Start from openclaw-honcho's architecture (hook-based, dual peer) and apply all Hermes patterns from day one rather than retrofitting. Priority order:</p>
+
+  <h3>Phase 1 — core correctness</h3>
+  <ul class="checklist">
+    <li class="todo">Dual peer model (owner + agent peer), both with <code>observe_me=True</code></li>
+    <li class="todo">Message capture at turn end with <code>lastSavedIndex</code> dedup</li>
+    <li class="todo">Platform metadata stripping before Honcho storage</li>
+    <li class="todo">Async prefetch from day one — do not implement blocking context injection</li>
+    <li class="todo">Legacy file migration at first activation (USER.md → owner peer, SOUL.md → <code>seedAiIdentity()</code>)</li>
+  </ul>
+
+  <h3>Phase 2 — configuration</h3>
+  <ul class="checklist">
+    <li class="todo">Config schema: <code>apiKey</code>, <code>workspaceId</code>, <code>baseUrl</code>, <code>memoryMode</code>, <code>userMemoryMode</code>, <code>agentMemoryMode</code>, <code>dialecticReasoningLevel</code>, <code>sessionStrategy</code>, <code>sessions</code></li>
+    <li class="todo">Per-peer memory mode gating</li>
+    <li class="todo">Dynamic reasoning level</li>
+    <li class="todo">Session naming strategies</li>
+  </ul>
+
+  <h3>Phase 3 — tools and CLI</h3>
+  <ul class="checklist">
+    <li class="todo">Tool surface: <code>honcho_profile</code>, <code>honcho_recall</code>, <code>honcho_analyze</code>, <code>honcho_search</code>, <code>honcho_context</code></li>
+    <li class="todo">CLI: <code>setup</code>, <code>status</code>, <code>sessions</code>, <code>map</code>, <code>mode</code>, <code>identity</code></li>
+    <li class="todo">CLI surface injection into system prompt</li>
+    <li class="todo">AI peer name wired into agent identity</li>
+  </ul>
+</section>
+
+</div>
+
+<script type="module">
+  import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
+  mermaid.initialize({ startOnLoad: true, securityLevel: 'loose', fontFamily: 'Departure Mono, Noto Emoji, monospace' });
+</script>
+<script>
+  window.addEventListener('scroll', () => {
+    const bar = document.getElementById('progress');
+    const max = document.documentElement.scrollHeight - window.innerHeight;
+    bar.style.width = (max > 0 ? (window.scrollY / max) * 100 : 0) + '%';
+  });
+</script>
+</body>
+</html>
--- a/docs/honcho-integration-spec.md
+++ b/docs/honcho-integration-spec.md
@ -0,0 +1,377 @@
+# honcho-integration-spec
+
+Comparison of Hermes Agent vs. openclaw-honcho — and a porting spec for bringing Hermes patterns into other Honcho integrations.
+
+---
+
+## Overview
+
+Two independent Honcho integrations have been built for two different agent runtimes: **Hermes Agent** (Python, baked into the runner) and **openclaw-honcho** (TypeScript plugin via hook/tool API). Both use the same Honcho peer paradigm — dual peer model, `session.context()`, `peer.chat()` — but they made different tradeoffs at every layer.
+
+This document maps those tradeoffs and defines a porting spec: a set of Hermes-originated patterns, each stated as an integration-agnostic interface, that any Honcho integration can adopt regardless of runtime or language.
+
+> **Scope** Both integrations work correctly today. This spec is about the delta — patterns in Hermes that are worth propagating and patterns in openclaw-honcho that Hermes should eventually adopt. The spec is additive, not prescriptive.
+
+---
+
+## Architecture comparison
+
+### Hermes: baked-in runner
+
+Honcho is initialised directly inside `AIAgent.__init__`. There is no plugin boundary. Session management, context injection, async prefetch, and CLI surface are all first-class concerns of the runner. Context is injected once per session (baked into `_cached_system_prompt`) and never re-fetched mid-session — this maximises prefix cache hits at the LLM provider.
+
+Turn flow:
+
+```
+user message
+  → _honcho_prefetch()       (reads cache — no HTTP)
+  → _build_system_prompt()   (first turn only, cached)
+  → LLM call
+  → response
+  → _honcho_fire_prefetch()  (daemon threads, turn end)
+       → prefetch_context() thread  ──┐
+       → prefetch_dialectic() thread ─┴→ _context_cache / _dialectic_cache
+```
+
+### openclaw-honcho: hook-based plugin
+
+The plugin registers hooks against OpenClaw's event bus. Context is fetched synchronously inside `before_prompt_build` on every turn. Message capture happens in `agent_end`. The multi-agent hierarchy is tracked via `subagent_spawned`. This model is correct but every turn pays a blocking Honcho round-trip before the LLM call can begin.
+
+Turn flow:
+
+```
+user message
+  → before_prompt_build (BLOCKING HTTP — every turn)
+       → session.context()
+  → system prompt assembled
+  → LLM call
+  → response
+  → agent_end hook
+       → session.addMessages()
+       → session.setMetadata()
+```
+
+---
+
+## Diff table
+
+| Dimension | Hermes Agent | openclaw-honcho |
+|---|---|---|
+| **Context injection timing** | Once per session (cached). Zero HTTP on response path after turn 1. | Every turn, blocking. Fresh context per turn but adds latency. |
+| **Prefetch strategy** | Daemon threads fire at turn end; consumed next turn from cache. | None. Blocking call at prompt-build time. |
+| **Dialectic (peer.chat)** | Prefetched async; result injected into system prompt next turn. | On-demand via `honcho_recall` / `honcho_analyze` tools. |
+| **Reasoning level** | Dynamic: scales with message length. Floor = config default. Cap = "high". | Fixed per tool: recall=minimal, analyze=medium. |
+| **Memory modes** | `user_memory_mode` / `agent_memory_mode`: hybrid / honcho / local. | None. Always writes to Honcho. |
+| **Write frequency** | async (background queue), turn, session, N turns. | After every agent_end (no control). |
+| **AI peer identity** | `observe_me=True`, `seed_ai_identity()`, `get_ai_representation()`, SOUL.md → AI peer. | Agent files uploaded to agent peer at setup. No ongoing self-observation. |
+| **Context scope** | User peer + AI peer representation, both injected. | User peer (owner) representation + conversation summary. `peerPerspective` on context call. |
+| **Session naming** | per-directory / global / manual map / title-based. | Derived from platform session key. |
+| **Multi-agent** | Single-agent only. | Parent observer hierarchy via `subagent_spawned`. |
+| **Tool surface** | Single `query_user_context` tool (on-demand dialectic). | 6 tools: session, profile, search, context (fast) + recall, analyze (LLM). |
+| **Platform metadata** | Not stripped. | Explicitly stripped before Honcho storage. |
+| **Message dedup** | None. | `lastSavedIndex` in session metadata prevents re-sending. |
+| **CLI surface in prompt** | Management commands injected into system prompt. Agent knows its own CLI. | Not injected. |
+| **AI peer name in identity** | Replaces "Hermes Agent" in DEFAULT_AGENT_IDENTITY when configured. | Not implemented. |
+| **QMD / local file search** | Not implemented. | Passthrough tools when QMD backend configured. |
+| **Workspace metadata** | Not implemented. | `agentPeerMap` in workspace metadata tracks agent→peer ID. |
+
+---
+
+## Patterns
+
+Six patterns from Hermes are worth adopting in any Honcho integration. Each is described as an integration-agnostic interface.
+
+**Hermes contributes:**
+- Async prefetch (zero-latency)
+- Dynamic reasoning level
+- Per-peer memory modes
+- AI peer identity formation
+- Session naming strategies
+- CLI surface injection
+
+**openclaw-honcho contributes back (Hermes should adopt):**
+- `lastSavedIndex` dedup
+- Platform metadata stripping
+- Multi-agent observer hierarchy
+- `peerPerspective` on `context()`
+- Tiered tool surface (fast/LLM)
+- Workspace `agentPeerMap`
+
+---
+
+## Spec: async prefetch
+
+### Problem
+
+Calling `session.context()` and `peer.chat()` synchronously before each LLM call adds 200–800ms of Honcho round-trip latency to every turn.
+
+### Pattern
+
+Fire both calls as non-blocking background work at the **end** of each turn. Store results in a per-session cache keyed by session ID. At the **start** of the next turn, pop from cache — the HTTP is already done. First turn is cold (empty cache); all subsequent turns are zero-latency on the response path.
+
+### Interface contract
+
+```typescript
+interface AsyncPrefetch {
+  // Fire context + dialectic fetches at turn end. Non-blocking.
+  firePrefetch(sessionId: string, userMessage: string): void;
+
+  // Pop cached results at turn start. Returns empty if cache is cold.
+  popContextResult(sessionId: string): ContextResult | null;
+  popDialecticResult(sessionId: string): string | null;
+}
+
+type ContextResult = {
+  representation: string;
+  card: string[];
+  aiRepresentation?: string;  // AI peer context if enabled
+  summary?: string;           // conversation summary if fetched
+};
+```
+
+### Implementation notes
+
+- **Python:** `threading.Thread(daemon=True)`. Write to `dict[session_id, result]` — GIL makes this safe for simple writes.
+- **TypeScript:** `Promise` stored in `Map<string, Promise<ContextResult>>`. Await at pop time. If not resolved yet, return null — do not block.
+- The pop is destructive: clears the cache entry after reading so stale data never accumulates.
+- Prefetch should also fire on first turn (even though it won't be consumed until turn 2).
+
+### openclaw-honcho adoption
+
+Move `session.context()` from `before_prompt_build` to a post-`agent_end` background task. Store result in `state.contextCache`. In `before_prompt_build`, read from cache instead of calling Honcho. If cache is empty (turn 1), inject nothing — the prompt is still valid without Honcho context on the first turn.
+
+---
+
+## Spec: dynamic reasoning level
+
+### Problem
+
+Honcho's dialectic endpoint supports reasoning levels from `minimal` to `max`. A fixed level per tool wastes budget on simple queries and under-serves complex ones.
+
+### Pattern
+
+Select the reasoning level dynamically based on the user's message. Use the configured default as a floor. Bump by message length. Cap auto-selection at `high` — never select `max` automatically.
+
+### Logic
+
+```
+< 120 chars  → default (typically "low")
+120–400 chars → one level above default (cap at "high")
+> 400 chars  → two levels above default (cap at "high")
+```
+
+### Config key
+
+Add `dialecticReasoningLevel` (string, default `"low"`). This sets the floor. The dynamic bump always applies on top.
+
+### openclaw-honcho adoption
+
+Apply in `honcho_recall` and `honcho_analyze`: replace fixed `reasoningLevel` with the dynamic selector. `honcho_recall` uses floor `"minimal"`, `honcho_analyze` uses floor `"medium"` — both still bump with message length.
+
+---
+
+## Spec: per-peer memory modes
+
+### Problem
+
+Users want independent control over whether user context and agent context are written locally, to Honcho, or both.
+
+### Modes
+
+| Mode | Effect |
+|---|---|
+| `hybrid` | Write to both local files and Honcho (default) |
+| `honcho` | Honcho only — disable corresponding local file writes |
+| `local` | Local files only — skip Honcho sync for this peer |
+
+### Config schema
+
+```json
+{
+  "memoryMode": "hybrid",
+  "userMemoryMode": "honcho",
+  "agentMemoryMode": "hybrid"
+}
+```
+
+Resolution order: per-peer field wins → shorthand `memoryMode` → default `"hybrid"`.
+
+### Effect on Honcho sync
+
+- `userMemoryMode=local`: skip adding user peer messages to Honcho
+- `agentMemoryMode=local`: skip adding assistant peer messages to Honcho
+- Both local: skip `session.addMessages()` entirely
+- `userMemoryMode=honcho`: disable local USER.md writes
+- `agentMemoryMode=honcho`: disable local MEMORY.md / SOUL.md writes
+
+---
+
+## Spec: AI peer identity formation
+
+### Problem
+
+Honcho builds the user's representation organically by observing what the user says. The same mechanism exists for the AI peer — but only if `observe_me=True` is set for the agent peer. Without it, the agent peer accumulates nothing.
+
+Additionally, existing persona files (SOUL.md, IDENTITY.md) should seed the AI peer's Honcho representation at first activation.
+
+### Part A: observe_me=True for agent peer
+
+```typescript
+await session.addPeers([
+  [ownerPeer.id, { observeMe: true,  observeOthers: false }],
+  [agentPeer.id, { observeMe: true,  observeOthers: true  }], // was false
+]);
+```
+
+One-line change. Foundational. Without it, the AI peer representation stays empty regardless of what the agent says.
+
+### Part B: seedAiIdentity()
+
+```typescript
+async function seedAiIdentity(
+  agentPeer: Peer,
+  content: string,
+  source: string
+): Promise<boolean> {
+  const wrapped = [
+    `<ai_identity_seed>`,
+    `<source>${source}</source>`,
+    ``,
+    content.trim(),
+    `</ai_identity_seed>`,
+  ].join("\n");
+
+  await agentPeer.addMessage("assistant", wrapped);
+  return true;
+}
+```
+
+### Part C: migrate agent files at setup
+
+During `honcho setup`, upload agent-self files (SOUL.md, IDENTITY.md, AGENTS.md) to the agent peer via `seedAiIdentity()` instead of `session.uploadFile()`. This routes content through Honcho's observation pipeline.
+
+### Part D: AI peer name in identity
+
+When the agent has a configured name, prepend it to the injected system prompt:
+
+```typescript
+const namePrefix = agentName ? `You are ${agentName}.\n\n` : "";
+return { systemPrompt: namePrefix + "## User Memory Context\n\n" + sections };
+```
+
+### CLI surface
+
+```
+honcho identity <file>    # seed from file
+honcho identity --show    # show current AI peer representation
+```
+
+---
+
+## Spec: session naming strategies
+
+### Problem
+
+A single global session means every project shares the same Honcho context. Per-directory sessions provide isolation without requiring users to name sessions manually.
+
+### Strategies
+
+| Strategy | Session key | When to use |
+|---|---|---|
+| `per-directory` | basename of CWD | Default. Each project gets its own session. |
+| `global` | fixed string `"global"` | Single cross-project session. |
+| manual map | user-configured per path | `sessions` config map overrides directory basename. |
+| title-based | sanitized session title | When agent supports named sessions set mid-conversation. |
+
+### Config schema
+
+```json
+{
+  "sessionStrategy": "per-directory",
+  "sessionPeerPrefix": false,
+  "sessions": {
+    "/home/user/projects/foo": "foo-project"
+  }
+}
+```
+
+### CLI surface
+
+```
+honcho sessions              # list all mappings
+honcho map <name>            # map cwd to session name
+honcho map                   # no-arg = list mappings
+```
+
+Resolution order: manual map → session title → directory basename → platform key.
+
+---
+
+## Spec: CLI surface injection
+
+### Problem
+
+When a user asks "how do I change my memory settings?" the agent either hallucinates or says it doesn't know. The agent should know its own management interface.
+
+### Pattern
+
+When Honcho is active, append a compact command reference to the system prompt. Keep it under 300 chars.
+
+```
+# Honcho memory integration
+Active. Session: {sessionKey}. Mode: {mode}.
+Management commands:
+  honcho status                    — show config + connection
+  honcho mode [hybrid|honcho|local] — show or set memory mode
+  honcho sessions                  — list session mappings
+  honcho map <name>                — map directory to session
+  honcho identity [file] [--show]  — seed or show AI identity
+  honcho setup                     — full interactive wizard
+```
+
+---
+
+## openclaw-honcho checklist
+
+Ordered by impact:
+
+- [ ] **Async prefetch** — move `session.context()` out of `before_prompt_build` into post-`agent_end` background Promise
+- [ ] **observe_me=True for agent peer** — one-line change in `session.addPeers()`
+- [ ] **Dynamic reasoning level** — add helper; apply in `honcho_recall` and `honcho_analyze`; add `dialecticReasoningLevel` to config
+- [ ] **Per-peer memory modes** — add `userMemoryMode` / `agentMemoryMode` to config; gate Honcho sync and local writes
+- [ ] **seedAiIdentity()** — add helper; use during setup migration for SOUL.md / IDENTITY.md
+- [ ] **Session naming strategies** — add `sessionStrategy`, `sessions` map, `sessionPeerPrefix`
+- [ ] **CLI surface injection** — append command reference to `before_prompt_build` return value
+- [ ] **honcho identity subcommand** — seed from file or `--show` current representation
+- [ ] **AI peer name injection** — if `aiPeer` name configured, prepend to injected system prompt
+- [ ] **honcho mode / sessions / map** — CLI parity with Hermes
+
+Already done in openclaw-honcho (do not re-implement): `lastSavedIndex` dedup, platform metadata stripping, multi-agent parent observer, `peerPerspective` on `context()`, tiered tool surface, workspace `agentPeerMap`, QMD passthrough, self-hosted Honcho.
+
+---
+
+## nanobot-honcho checklist
+
+Greenfield integration. Start from openclaw-honcho's architecture and apply all Hermes patterns from day one.
+
+### Phase 1 — core correctness
+
+- [ ] Dual peer model (owner + agent peer), both with `observe_me=True`
+- [ ] Message capture at turn end with `lastSavedIndex` dedup
+- [ ] Platform metadata stripping before Honcho storage
+- [ ] Async prefetch from day one — do not implement blocking context injection
+- [ ] Legacy file migration at first activation (USER.md → owner peer, SOUL.md → `seedAiIdentity()`)
+
+### Phase 2 — configuration
+
+- [ ] Config schema: `apiKey`, `workspaceId`, `baseUrl`, `memoryMode`, `userMemoryMode`, `agentMemoryMode`, `dialecticReasoningLevel`, `sessionStrategy`, `sessions`
+- [ ] Per-peer memory mode gating
+- [ ] Dynamic reasoning level
+- [ ] Session naming strategies
+
+### Phase 3 — tools and CLI
+
+- [ ] Tool surface: `honcho_profile`, `honcho_recall`, `honcho_analyze`, `honcho_search`, `honcho_context`
+- [ ] CLI: `setup`, `status`, `sessions`, `map`, `mode`, `identity`
+- [ ] CLI surface injection into system prompt
+- [ ] AI peer name wired into agent identity
--- a/gateway/run.py
+++ b/gateway/run.py
@ -250,6 +250,12 @@ class GatewayRunner:
        # Track pending exec approvals per session
        # Key: session_key, Value: {"command": str, "pattern_key": str}
        self._pending_approvals: Dict[str, Dict[str, str]] = {}
+
+        # Persistent Honcho managers keyed by gateway session key.
+        # This preserves write_frequency="session" semantics across short-lived
+        # per-message AIAgent instances.
+        self._honcho_managers: Dict[str, Any] = {}
+        self._honcho_configs: Dict[str, Any] = {}
        
        # Initialize session database for session_search tool support
        self._session_db = None
@ -266,6 +272,61 @@ class GatewayRunner:
        # Event hook system
        from gateway.hooks import HookRegistry
        self.hooks = HookRegistry()
+
+    def _get_or_create_gateway_honcho(self, session_key: str):
+        """Return a persistent Honcho manager/config pair for this gateway session."""
+        if not hasattr(self, "_honcho_managers"):
+            self._honcho_managers = {}
+        if not hasattr(self, "_honcho_configs"):
+            self._honcho_configs = {}
+
+        if session_key in self._honcho_managers:
+            return self._honcho_managers[session_key], self._honcho_configs.get(session_key)
+
+        try:
+            from honcho_integration.client import HonchoClientConfig, get_honcho_client
+            from honcho_integration.session import HonchoSessionManager
+
+            hcfg = HonchoClientConfig.from_global_config()
+            if not hcfg.enabled or not hcfg.api_key:
+                return None, hcfg
+
+            client = get_honcho_client(hcfg)
+            manager = HonchoSessionManager(
+                honcho=client,
+                config=hcfg,
+                context_tokens=hcfg.context_tokens,
+            )
+            self._honcho_managers[session_key] = manager
+            self._honcho_configs[session_key] = hcfg
+            return manager, hcfg
+        except Exception as e:
+            logger.debug("Gateway Honcho init failed for %s: %s", session_key, e)
+            return None, None
+
+    def _shutdown_gateway_honcho(self, session_key: str) -> None:
+        """Flush and close the persistent Honcho manager for a gateway session."""
+        managers = getattr(self, "_honcho_managers", None)
+        configs = getattr(self, "_honcho_configs", None)
+        if managers is None or configs is None:
+            return
+
+        manager = managers.pop(session_key, None)
+        configs.pop(session_key, None)
+        if not manager:
+            return
+        try:
+            manager.shutdown()
+        except Exception as e:
+            logger.debug("Gateway Honcho shutdown failed for %s: %s", session_key, e)
+
+    def _shutdown_all_gateway_honcho(self) -> None:
+        """Flush and close all persistent Honcho managers."""
+        managers = getattr(self, "_honcho_managers", None)
+        if not managers:
+            return
+        for session_key in list(managers.keys()):
+            self._shutdown_gateway_honcho(session_key)
    
    def _flush_memories_for_session(self, old_session_id: str):
        """Prompt the agent to save memories/skills before context is lost.
@ -324,6 +385,12 @@ class GatewayRunner:
                conversation_history=msgs,
            )
            logger.info("Pre-reset memory flush completed for session %s", old_session_id)
+            # Flush any queued Honcho writes before the session is dropped
+            if getattr(tmp_agent, '_honcho', None):
+                try:
+                    tmp_agent._honcho.shutdown()
+                except Exception:
+                    pass
        except Exception as e:
            logger.debug("Pre-reset memory flush failed for session %s: %s", old_session_id, e)

@ -634,6 +701,7 @@ class GatewayRunner:
                    )
                    try:
                        await self._async_flush_memories(entry.session_id)
+                        self._shutdown_gateway_honcho(key)
                        self.session_store._pre_flushed_sessions.add(entry.session_id)
                    except Exception as e:
                        logger.debug("Proactive memory flush failed for %s: %s", entry.session_id, e)
@ -656,8 +724,9 @@ class GatewayRunner:
                logger.info("✓ %s disconnected", platform.value)
            except Exception as e:
                logger.error("✗ %s disconnect error: %s", platform.value, e)
-        
+
        self.adapters.clear()
+        self._shutdown_all_gateway_honcho()
        self._shutdown_event.set()
        
        from gateway.status import remove_pid_file
@ -1503,6 +1572,8 @@ class GatewayRunner:
                asyncio.create_task(self._async_flush_memories(old_entry.session_id))
        except Exception as e:
            logger.debug("Gateway memory flush on reset failed: %s", e)
+
+        self._shutdown_gateway_honcho(session_key)
        
        # Reset the session
        new_entry = self.session_store.reset_session(session_key)
@ -2435,6 +2506,8 @@ class GatewayRunner:
        except Exception as e:
            logger.debug("Memory flush on resume failed: %s", e)

+        self._shutdown_gateway_honcho(session_key)
+
        # Clear any running agent for this session key
        if session_key in self._running_agents:
            del self._running_agents[session_key]
@ -3274,6 +3347,7 @@ class GatewayRunner:
                }

            pr = self._provider_routing
+            honcho_manager, honcho_config = self._get_or_create_gateway_honcho(session_key)
            agent = AIAgent(
                model=model,
                **runtime_kwargs,
@ -3295,6 +3369,8 @@ class GatewayRunner:
                step_callback=_step_callback_sync if _hooks_ref.loaded_hooks else None,
                platform=platform_key,
                honcho_session_key=session_key,
+                honcho_manager=honcho_manager,
+                honcho_config=honcho_config,
                session_db=self._session_db,
                fallback_model=self._fallback_model,
            )
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@ -110,7 +110,7 @@ DEFAULT_CONFIG = {
        "inactivity_timeout": 120,
        "record_sessions": False,  # Auto-record browser sessions as WebM videos
    },
-    
+
    # Filesystem checkpoints — automatic snapshots before destructive file ops.
    # When enabled, the agent takes a snapshot of the working directory once per
    # conversation turn (on first write_file/patch call).  Use /rollback to restore.
@ -456,7 +456,7 @@ OPTIONAL_ENV_VARS = {
        "description": "Honcho API key for AI-native persistent memory",
        "prompt": "Honcho API key",
        "url": "https://app.honcho.dev",
-        "tools": ["query_user_context"],
+        "tools": ["honcho_context"],
        "password": True,
        "category": "tool",
    },
@ -907,6 +907,36 @@ _COMMENTED_SECTIONS = """
 """


+_COMMENTED_SECTIONS = """
+# ── Security ──────────────────────────────────────────────────────────
+# API keys, tokens, and passwords are redacted from tool output by default.
+# Set to false to see full values (useful for debugging auth issues).
+#
+# security:
+#   redact_secrets: false
+
+# ── Fallback Model ────────────────────────────────────────────────────
+# Automatic provider failover when primary is unavailable.
+# Uncomment and configure to enable. Triggers on rate limits (429),
+# overload (529), service errors (503), or connection failures.
+#
+# Supported providers:
+#   openrouter   (OPENROUTER_API_KEY)  — routes to any model
+#   openai-codex (OAuth — hermes login) — OpenAI Codex
+#   nous         (OAuth — hermes login) — Nous Portal
+#   zai          (ZAI_API_KEY)         — Z.AI / GLM
+#   kimi-coding  (KIMI_API_KEY)        — Kimi / Moonshot
+#   minimax      (MINIMAX_API_KEY)     — MiniMax
+#   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
+#
+# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
+#
+# fallback_model:
+#   provider: openrouter
+#   model: anthropic/claude-sonnet-4
+"""
+
+
 def save_config(config: Dict[str, Any]):
    """Save configuration to ~/.hermes/config.yaml."""
    from utils import atomic_yaml_write
--- a/hermes_cli/doctor.py
+++ b/hermes_cli/doctor.py
@ -634,6 +634,40 @@ def run_doctor(args):
    else:
        check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")

+    # =========================================================================
+    # Honcho memory
+    # =========================================================================
+    print()
+    print(color("◆ Honcho Memory", Colors.CYAN, Colors.BOLD))
+
+    try:
+        from honcho_integration.client import HonchoClientConfig, GLOBAL_CONFIG_PATH
+        hcfg = HonchoClientConfig.from_global_config()
+
+        if not GLOBAL_CONFIG_PATH.exists():
+            check_warn("Honcho config not found", f"run: hermes honcho setup")
+        elif not hcfg.enabled:
+            check_info("Honcho disabled (set enabled: true in ~/.honcho/config.json to activate)")
+        elif not hcfg.api_key:
+            check_fail("Honcho API key not set", "run: hermes honcho setup")
+            issues.append("No Honcho API key — run 'hermes honcho setup'")
+        else:
+            from honcho_integration.client import get_honcho_client, reset_honcho_client
+            reset_honcho_client()
+            try:
+                get_honcho_client(hcfg)
+                check_ok(
+                    "Honcho connected",
+                    f"workspace={hcfg.workspace_id} mode={hcfg.memory_mode} freq={hcfg.write_frequency}",
+                )
+            except Exception as _e:
+                check_fail("Honcho connection failed", str(_e))
+                issues.append(f"Honcho unreachable: {_e}")
+    except ImportError:
+        check_warn("honcho-ai not installed", "pip install honcho-ai")
+    except Exception as _e:
+        check_warn("Honcho check failed", str(_e))
+
    # =========================================================================
    # Summary
    # =========================================================================
--- a/hermes_cli/main.py
+++ b/hermes_cli/main.py
@ -18,6 +18,22 @@ Usage:
    hermes cron list           # List cron jobs
    hermes cron status         # Check if cron scheduler is running
    hermes doctor              # Check configuration and dependencies
+    hermes honcho setup                    # Configure Honcho AI memory integration
+    hermes honcho status                   # Show Honcho config and connection status
+    hermes honcho sessions                 # List directory → session name mappings
+    hermes honcho map <name>               # Map current directory to a session name
+    hermes honcho peer                     # Show peer names and dialectic settings
+    hermes honcho peer --user NAME         # Set user peer name
+    hermes honcho peer --ai NAME           # Set AI peer name
+    hermes honcho peer --reasoning LEVEL   # Set dialectic reasoning level
+    hermes honcho mode                     # Show current memory mode
+    hermes honcho mode [hybrid|honcho|local]  # Set memory mode
+    hermes honcho tokens                   # Show token budget settings
+    hermes honcho tokens --context N       # Set session.context() token cap
+    hermes honcho tokens --dialectic N     # Set dialectic result char cap
+    hermes honcho identity                 # Show AI peer identity representation
+    hermes honcho identity <file>          # Seed AI peer identity from a file (SOUL.md etc.)
+    hermes honcho migrate                  # Step-by-step migration guide: OpenClaw native → Hermes + Honcho
    hermes version             # Show version
    hermes update              # Update to latest version
    hermes uninstall           # Uninstall Hermes Agent
@ -2595,6 +2611,94 @@ For more help on a command:

    skills_parser.set_defaults(func=cmd_skills)

+    # =========================================================================
+    # honcho command
+    # =========================================================================
+    honcho_parser = subparsers.add_parser(
+        "honcho",
+        help="Manage Honcho AI memory integration",
+        description=(
+            "Honcho is a memory layer that persists across sessions.\n\n"
+            "Each conversation is stored as a peer interaction in a workspace. "
+            "Honcho builds a representation of the user over time — conclusions, "
+            "patterns, context — and surfaces the relevant slice at the start of "
+            "each turn so Hermes knows who you are without you having to repeat yourself.\n\n"
+            "Modes: hybrid (Honcho + local MEMORY.md), honcho (Honcho only), "
+            "local (MEMORY.md only). Write frequency is configurable so memory "
+            "writes never block the response."
+        ),
+        formatter_class=__import__("argparse").RawDescriptionHelpFormatter,
+    )
+    honcho_subparsers = honcho_parser.add_subparsers(dest="honcho_command")
+
+    honcho_subparsers.add_parser("setup", help="Interactive setup wizard for Honcho integration")
+    honcho_subparsers.add_parser("status", help="Show current Honcho config and connection status")
+    honcho_subparsers.add_parser("sessions", help="List known Honcho session mappings")
+
+    honcho_map = honcho_subparsers.add_parser(
+        "map", help="Map current directory to a Honcho session name (no arg = list mappings)"
+    )
+    honcho_map.add_argument(
+        "session_name", nargs="?", default=None,
+        help="Session name to associate with this directory. Omit to list current mappings.",
+    )
+
+    honcho_peer = honcho_subparsers.add_parser(
+        "peer", help="Show or update peer names and dialectic reasoning level"
+    )
+    honcho_peer.add_argument("--user", metavar="NAME", help="Set user peer name")
+    honcho_peer.add_argument("--ai", metavar="NAME", help="Set AI peer name")
+    honcho_peer.add_argument(
+        "--reasoning",
+        metavar="LEVEL",
+        choices=("minimal", "low", "medium", "high", "max"),
+        help="Set default dialectic reasoning level (minimal/low/medium/high/max)",
+    )
+
+    honcho_mode = honcho_subparsers.add_parser(
+        "mode", help="Show or set memory mode (hybrid/honcho/local)"
+    )
+    honcho_mode.add_argument(
+        "mode", nargs="?", metavar="MODE",
+        choices=("hybrid", "honcho", "local"),
+        help="Memory mode to set (hybrid/honcho/local). Omit to show current.",
+    )
+
+    honcho_tokens = honcho_subparsers.add_parser(
+        "tokens", help="Show or set token budget for context and dialectic"
+    )
+    honcho_tokens.add_argument(
+        "--context", type=int, metavar="N",
+        help="Max tokens Honcho returns from session.context() per turn",
+    )
+    honcho_tokens.add_argument(
+        "--dialectic", type=int, metavar="N",
+        help="Max chars of dialectic result to inject into system prompt",
+    )
+
+    honcho_identity = honcho_subparsers.add_parser(
+        "identity", help="Seed or show the AI peer's Honcho identity representation"
+    )
+    honcho_identity.add_argument(
+        "file", nargs="?", default=None,
+        help="Path to file to seed from (e.g. SOUL.md). Omit to show usage.",
+    )
+    honcho_identity.add_argument(
+        "--show", action="store_true",
+        help="Show current AI peer representation from Honcho",
+    )
+
+    honcho_subparsers.add_parser(
+        "migrate",
+        help="Step-by-step migration guide from openclaw-honcho to Hermes Honcho",
+    )
+
+    def cmd_honcho(args):
+        from honcho_integration.cli import honcho_command
+        honcho_command(args)
+
+    honcho_parser.set_defaults(func=cmd_honcho)
+
    # =========================================================================
    # tools command
    # =========================================================================
--- a/honcho_integration/cli.py
+++ b/honcho_integration/cli.py
@ -0,0 +1,765 @@
+"""CLI commands for Honcho integration management.
+
+Handles: hermes honcho setup | status | sessions | map | peer
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+from pathlib import Path
+
+GLOBAL_CONFIG_PATH = Path.home() / ".honcho" / "config.json"
+HOST = "hermes"
+
+
+def _read_config() -> dict:
+    if GLOBAL_CONFIG_PATH.exists():
+        try:
+            return json.loads(GLOBAL_CONFIG_PATH.read_text(encoding="utf-8"))
+        except Exception:
+            pass
+    return {}
+
+
+def _write_config(cfg: dict) -> None:
+    GLOBAL_CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
+    GLOBAL_CONFIG_PATH.write_text(
+        json.dumps(cfg, indent=2, ensure_ascii=False) + "\n",
+        encoding="utf-8",
+    )
+
+
+def _resolve_api_key(cfg: dict) -> str:
+    """Resolve API key with host -> root -> env fallback."""
+    host_key = ((cfg.get("hosts") or {}).get(HOST) or {}).get("apiKey")
+    return host_key or cfg.get("apiKey", "") or os.environ.get("HONCHO_API_KEY", "")
+
+
+def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
+    suffix = f" [{default}]" if default else ""
+    sys.stdout.write(f"  {label}{suffix}: ")
+    sys.stdout.flush()
+    if secret:
+        if sys.stdin.isatty():
+            import getpass
+            val = getpass.getpass(prompt="")
+        else:
+            # Non-TTY (piped input, test runners) — read plaintext
+            val = sys.stdin.readline().strip()
+    else:
+        val = sys.stdin.readline().strip()
+    return val or (default or "")
+
+
+def _ensure_sdk_installed() -> bool:
+    """Check honcho-ai is importable; offer to install if not. Returns True if ready."""
+    try:
+        import honcho  # noqa: F401
+        return True
+    except ImportError:
+        pass
+
+    print("  honcho-ai is not installed.")
+    answer = _prompt("Install it now? (honcho-ai>=2.0.1)", default="y")
+    if answer.lower() not in ("y", "yes"):
+        print("  Skipping install. Run: pip install 'honcho-ai>=2.0.1'\n")
+        return False
+
+    import subprocess
+    print("  Installing honcho-ai...", flush=True)
+    result = subprocess.run(
+        [sys.executable, "-m", "pip", "install", "honcho-ai>=2.0.1"],
+        capture_output=True,
+        text=True,
+    )
+    if result.returncode == 0:
+        print("  Installed.\n")
+        return True
+    else:
+        print(f"  Install failed:\n{result.stderr.strip()}")
+        print("  Run manually: pip install 'honcho-ai>=2.0.1'\n")
+        return False
+
+
+def cmd_setup(args) -> None:
+    """Interactive Honcho setup wizard."""
+    cfg = _read_config()
+
+    print("\nHoncho memory setup\n" + "─" * 40)
+    print("  Honcho gives Hermes persistent cross-session memory.")
+    print("  Config is shared with other hosts at ~/.honcho/config.json\n")
+
+    if not _ensure_sdk_installed():
+        return
+
+    # All writes go to hosts.hermes — root keys are managed by the user
+    # or the honcho CLI only.
+    hosts = cfg.setdefault("hosts", {})
+    hermes_host = hosts.setdefault(HOST, {})
+
+    # API key — shared credential, lives at root so all hosts can read it
+    current_key = cfg.get("apiKey", "")
+    masked = f"...{current_key[-8:]}" if len(current_key) > 8 else ("set" if current_key else "not set")
+    print(f"  Current API key: {masked}")
+    new_key = _prompt("Honcho API key (leave blank to keep current)", secret=True)
+    if new_key:
+        cfg["apiKey"] = new_key
+
+    effective_key = cfg.get("apiKey", "")
+    if not effective_key:
+        print("\n  No API key configured. Get your API key at https://app.honcho.dev")
+        print("  Run 'hermes honcho setup' again once you have a key.\n")
+        return
+
+    # Peer name
+    current_peer = hermes_host.get("peerName") or cfg.get("peerName", "")
+    new_peer = _prompt("Your name (user peer)", default=current_peer or os.getenv("USER", "user"))
+    if new_peer:
+        hermes_host["peerName"] = new_peer
+
+    current_workspace = hermes_host.get("workspace") or cfg.get("workspace", "hermes")
+    new_workspace = _prompt("Workspace ID", default=current_workspace)
+    if new_workspace:
+        hermes_host["workspace"] = new_workspace
+
+    hermes_host.setdefault("aiPeer", HOST)
+
+    # Memory mode
+    current_mode = hermes_host.get("memoryMode") or cfg.get("memoryMode", "hybrid")
+    print(f"\n  Memory mode options:")
+    print("    hybrid  — write to both Honcho and local MEMORY.md (default)")
+    print("    honcho  — Honcho only, skip MEMORY.md writes")
+    new_mode = _prompt("Memory mode", default=current_mode)
+    if new_mode in ("hybrid", "honcho"):
+        hermes_host["memoryMode"] = new_mode
+    else:
+        hermes_host["memoryMode"] = "hybrid"
+
+    # Write frequency
+    current_wf = str(hermes_host.get("writeFrequency") or cfg.get("writeFrequency", "async"))
+    print(f"\n  Write frequency options:")
+    print("    async   — background thread, no token cost (recommended)")
+    print("    turn    — sync write after every turn")
+    print("    session — batch write at session end only")
+    print("    N       — write every N turns (e.g. 5)")
+    new_wf = _prompt("Write frequency", default=current_wf)
+    try:
+        hermes_host["writeFrequency"] = int(new_wf)
+    except (ValueError, TypeError):
+        hermes_host["writeFrequency"] = new_wf if new_wf in ("async", "turn", "session") else "async"
+
+    # Recall mode
+    _raw_recall = hermes_host.get("recallMode") or cfg.get("recallMode", "hybrid")
+    current_recall = "hybrid" if _raw_recall not in ("hybrid", "context", "tools") else _raw_recall
+    print(f"\n  Recall mode options:")
+    print("    hybrid  — auto-injected context + Honcho tools available (default)")
+    print("    context — auto-injected context only, Honcho tools hidden")
+    print("    tools   — Honcho tools only, no auto-injected context")
+    new_recall = _prompt("Recall mode", default=current_recall)
+    if new_recall in ("hybrid", "context", "tools"):
+        hermes_host["recallMode"] = new_recall
+
+    # Session strategy
+    current_strat = hermes_host.get("sessionStrategy") or cfg.get("sessionStrategy", "per-session")
+    print(f"\n  Session strategy options:")
+    print("    per-session   — new Honcho session each run, named by Hermes session ID (default)")
+    print("    per-directory — one session per working directory")
+    print("    per-repo      — one session per git repository (uses repo root name)")
+    print("    global        — single session across all directories")
+    new_strat = _prompt("Session strategy", default=current_strat)
+    if new_strat in ("per-session", "per-repo", "per-directory", "global"):
+        hermes_host["sessionStrategy"] = new_strat
+
+    hermes_host.setdefault("enabled", True)
+    hermes_host.setdefault("saveMessages", True)
+
+    _write_config(cfg)
+    print(f"\n  Config written to {GLOBAL_CONFIG_PATH}")
+
+    # Test connection
+    print("  Testing connection... ", end="", flush=True)
+    try:
+        from honcho_integration.client import HonchoClientConfig, get_honcho_client, reset_honcho_client
+        reset_honcho_client()
+        hcfg = HonchoClientConfig.from_global_config()
+        get_honcho_client(hcfg)
+        print("OK")
+    except Exception as e:
+        print(f"FAILED\n  Error: {e}")
+        return
+
+    print(f"\n  Honcho is ready.")
+    print(f"  Session:   {hcfg.resolve_session_name()}")
+    print(f"  Workspace: {hcfg.workspace_id}")
+    print(f"  Peer:      {hcfg.peer_name}")
+    _mode_str = hcfg.memory_mode
+    if hcfg.peer_memory_modes:
+        overrides = ", ".join(f"{k}={v}" for k, v in hcfg.peer_memory_modes.items())
+        _mode_str = f"{hcfg.memory_mode}  (peers: {overrides})"
+    print(f"  Mode:      {_mode_str}")
+    print(f"  Frequency: {hcfg.write_frequency}")
+    print(f"\n  Honcho tools available in chat:")
+    print(f"    honcho_context  — ask Honcho a question about you (LLM-synthesized)")
+    print(f"    honcho_search       — semantic search over your history (no LLM)")
+    print(f"    honcho_profile      — your peer card, key facts (no LLM)")
+    print(f"    honcho_conclude     — persist a user fact to Honcho memory (no LLM)")
+    print(f"\n  Other commands:")
+    print(f"    hermes honcho status     — show full config")
+    print(f"    hermes honcho mode       — show or change memory mode")
+    print(f"    hermes honcho tokens     — show or set token budgets")
+    print(f"    hermes honcho identity   — seed or show AI peer identity")
+    print(f"    hermes honcho map <name> — map this directory to a session name\n")
+
+
+def cmd_status(args) -> None:
+    """Show current Honcho config and connection status."""
+    try:
+        import honcho  # noqa: F401
+    except ImportError:
+        print("  honcho-ai is not installed. Run: hermes honcho setup\n")
+        return
+
+    cfg = _read_config()
+
+    if not cfg:
+        print("  No Honcho config found at ~/.honcho/config.json")
+        print("  Run 'hermes honcho setup' to configure.\n")
+        return
+
+    try:
+        from honcho_integration.client import HonchoClientConfig, get_honcho_client
+        hcfg = HonchoClientConfig.from_global_config()
+    except Exception as e:
+        print(f"  Config error: {e}\n")
+        return
+
+    api_key = hcfg.api_key or ""
+    masked = f"...{api_key[-8:]}" if len(api_key) > 8 else ("set" if api_key else "not set")
+
+    print(f"\nHoncho status\n" + "─" * 40)
+    print(f"  Enabled:        {hcfg.enabled}")
+    print(f"  API key:        {masked}")
+    print(f"  Workspace:      {hcfg.workspace_id}")
+    print(f"  Host:           {hcfg.host}")
+    print(f"  Config path:    {GLOBAL_CONFIG_PATH}")
+    print(f"  AI peer:        {hcfg.ai_peer}")
+    print(f"  User peer:      {hcfg.peer_name or 'not set'}")
+    print(f"  Session key:    {hcfg.resolve_session_name()}")
+    print(f"  Recall mode:    {hcfg.recall_mode}")
+    print(f"  Memory mode:    {hcfg.memory_mode}")
+    if hcfg.peer_memory_modes:
+        print(f"  Per-peer modes:")
+        for peer, mode in hcfg.peer_memory_modes.items():
+            print(f"    {peer}: {mode}")
+    print(f"  Write freq:     {hcfg.write_frequency}")
+
+    if hcfg.enabled and hcfg.api_key:
+        print("\n  Connection... ", end="", flush=True)
+        try:
+            get_honcho_client(hcfg)
+            print("OK\n")
+        except Exception as e:
+            print(f"FAILED ({e})\n")
+    else:
+        reason = "disabled" if not hcfg.enabled else "no API key"
+        print(f"\n  Not connected ({reason})\n")
+
+
+def cmd_sessions(args) -> None:
+    """List known directory → session name mappings."""
+    cfg = _read_config()
+    sessions = cfg.get("sessions", {})
+
+    if not sessions:
+        print("  No session mappings configured.\n")
+        print("  Add one with: hermes honcho map <session-name>")
+        print("  Or edit ~/.honcho/config.json directly.\n")
+        return
+
+    cwd = os.getcwd()
+    print(f"\nHoncho session mappings ({len(sessions)})\n" + "─" * 40)
+    for path, name in sorted(sessions.items()):
+        marker = " ←" if path == cwd else ""
+        print(f"  {name:<30} {path}{marker}")
+    print()
+
+
+def cmd_map(args) -> None:
+    """Map current directory to a Honcho session name."""
+    if not args.session_name:
+        cmd_sessions(args)
+        return
+
+    cwd = os.getcwd()
+    session_name = args.session_name.strip()
+
+    if not session_name:
+        print("  Session name cannot be empty.\n")
+        return
+
+    import re
+    sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_name).strip('-')
+    if sanitized != session_name:
+        print(f"  Session name sanitized to: {sanitized}")
+        session_name = sanitized
+
+    cfg = _read_config()
+    cfg.setdefault("sessions", {})[cwd] = session_name
+    _write_config(cfg)
+    print(f"  Mapped {cwd}\n     → {session_name}\n")
+
+
+def cmd_peer(args) -> None:
+    """Show or update peer names and dialectic reasoning level."""
+    cfg = _read_config()
+    changed = False
+
+    user_name = getattr(args, "user", None)
+    ai_name = getattr(args, "ai", None)
+    reasoning = getattr(args, "reasoning", None)
+
+    REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
+
+    if user_name is None and ai_name is None and reasoning is None:
+        # Show current values
+        hosts = cfg.get("hosts", {})
+        hermes = hosts.get(HOST, {})
+        user = hermes.get('peerName') or cfg.get('peerName') or '(not set)'
+        ai = hermes.get('aiPeer') or cfg.get('aiPeer') or HOST
+        lvl = hermes.get("dialecticReasoningLevel") or cfg.get("dialecticReasoningLevel") or "low"
+        max_chars = hermes.get("dialecticMaxChars") or cfg.get("dialecticMaxChars") or 600
+        print(f"\nHoncho peers\n" + "─" * 40)
+        print(f"  User peer:   {user}")
+        print(f"    Your identity in Honcho. Messages you send build this peer's card.")
+        print(f"  AI peer:     {ai}")
+        print(f"    Hermes' identity in Honcho. Seed with 'hermes honcho identity <file>'.")
+        print(f"    Dialectic calls ask this peer questions to warm session context.")
+        print()
+        print(f"  Dialectic reasoning:  {lvl}  ({', '.join(REASONING_LEVELS)})")
+        print(f"  Dialectic cap:        {max_chars} chars\n")
+        return
+
+    if user_name is not None:
+        cfg.setdefault("hosts", {}).setdefault(HOST, {})["peerName"] = user_name.strip()
+        changed = True
+        print(f"  User peer → {user_name.strip()}")
+
+    if ai_name is not None:
+        cfg.setdefault("hosts", {}).setdefault(HOST, {})["aiPeer"] = ai_name.strip()
+        changed = True
+        print(f"  AI peer   → {ai_name.strip()}")
+
+    if reasoning is not None:
+        if reasoning not in REASONING_LEVELS:
+            print(f"  Invalid reasoning level '{reasoning}'. Options: {', '.join(REASONING_LEVELS)}")
+            return
+        cfg.setdefault("hosts", {}).setdefault(HOST, {})["dialecticReasoningLevel"] = reasoning
+        changed = True
+        print(f"  Dialectic reasoning level → {reasoning}")
+
+    if changed:
+        _write_config(cfg)
+        print(f"  Saved to {GLOBAL_CONFIG_PATH}\n")
+
+
+def cmd_mode(args) -> None:
+    """Show or set the memory mode."""
+    MODES = {
+        "hybrid": "write to both Honcho and local MEMORY.md (default)",
+        "honcho": "Honcho only — MEMORY.md writes disabled",
+    }
+    cfg = _read_config()
+    mode_arg = getattr(args, "mode", None)
+
+    if mode_arg is None:
+        current = (
+            (cfg.get("hosts") or {}).get(HOST, {}).get("memoryMode")
+            or cfg.get("memoryMode")
+            or "hybrid"
+        )
+        print(f"\nHoncho memory mode\n" + "─" * 40)
+        for m, desc in MODES.items():
+            marker = " ←" if m == current else ""
+            print(f"  {m:<8}  {desc}{marker}")
+        print(f"\n  Set with: hermes honcho mode [hybrid|honcho]\n")
+        return
+
+    if mode_arg not in MODES:
+        print(f"  Invalid mode '{mode_arg}'. Options: {', '.join(MODES)}\n")
+        return
+
+    cfg.setdefault("hosts", {}).setdefault(HOST, {})["memoryMode"] = mode_arg
+    _write_config(cfg)
+    print(f"  Memory mode → {mode_arg}  ({MODES[mode_arg]})\n")
+
+
+def cmd_tokens(args) -> None:
+    """Show or set token budget settings."""
+    cfg = _read_config()
+    hosts = cfg.get("hosts", {})
+    hermes = hosts.get(HOST, {})
+
+    context = getattr(args, "context", None)
+    dialectic = getattr(args, "dialectic", None)
+
+    if context is None and dialectic is None:
+        ctx_tokens = hermes.get("contextTokens") or cfg.get("contextTokens") or "(Honcho default)"
+        d_chars = hermes.get("dialecticMaxChars") or cfg.get("dialecticMaxChars") or 600
+        d_level = hermes.get("dialecticReasoningLevel") or cfg.get("dialecticReasoningLevel") or "low"
+        print(f"\nHoncho budgets\n" + "─" * 40)
+        print()
+        print(f"  Context     {ctx_tokens} tokens")
+        print(f"    Raw memory retrieval. Honcho returns stored facts/history about")
+        print(f"    the user and session, injected directly into the system prompt.")
+        print()
+        print(f"  Dialectic   {d_chars} chars, reasoning: {d_level}")
+        print(f"    AI-to-AI inference. Hermes asks Honcho's AI peer a question")
+        print(f"    (e.g. \"what were we working on?\") and Honcho runs its own model")
+        print(f"    to synthesize an answer. Used for first-turn session continuity.")
+        print(f"    Level controls how much reasoning Honcho spends on the answer.")
+        print(f"\n  Set with: hermes honcho tokens [--context N] [--dialectic N]\n")
+        return
+
+    changed = False
+    if context is not None:
+        cfg.setdefault("hosts", {}).setdefault(HOST, {})["contextTokens"] = context
+        print(f"  context tokens → {context}")
+        changed = True
+    if dialectic is not None:
+        cfg.setdefault("hosts", {}).setdefault(HOST, {})["dialecticMaxChars"] = dialectic
+        print(f"  dialectic cap  → {dialectic} chars")
+        changed = True
+
+    if changed:
+        _write_config(cfg)
+        print(f"  Saved to {GLOBAL_CONFIG_PATH}\n")
+
+
+def cmd_identity(args) -> None:
+    """Seed AI peer identity or show both peer representations."""
+    cfg = _read_config()
+    if not _resolve_api_key(cfg):
+        print("  No API key configured. Run 'hermes honcho setup' first.\n")
+        return
+
+    file_path = getattr(args, "file", None)
+    show = getattr(args, "show", False)
+
+    try:
+        from honcho_integration.client import HonchoClientConfig, get_honcho_client
+        from honcho_integration.session import HonchoSessionManager
+        hcfg = HonchoClientConfig.from_global_config()
+        client = get_honcho_client(hcfg)
+        mgr = HonchoSessionManager(honcho=client, config=hcfg)
+        session_key = hcfg.resolve_session_name()
+        mgr.get_or_create(session_key)
+    except Exception as e:
+        print(f"  Honcho connection failed: {e}\n")
+        return
+
+    if show:
+        # ── User peer ────────────────────────────────────────────────────────
+        user_card = mgr.get_peer_card(session_key)
+        print(f"\nUser peer ({hcfg.peer_name or 'not set'})\n" + "─" * 40)
+        if user_card:
+            for fact in user_card:
+                print(f"  {fact}")
+        else:
+            print("  No user peer card yet. Send a few messages to build one.")
+
+        # ── AI peer ──────────────────────────────────────────────────────────
+        ai_rep = mgr.get_ai_representation(session_key)
+        print(f"\nAI peer ({hcfg.ai_peer})\n" + "─" * 40)
+        if ai_rep.get("representation"):
+            print(ai_rep["representation"])
+        elif ai_rep.get("card"):
+            print(ai_rep["card"])
+        else:
+            print("  No representation built yet.")
+            print("  Run 'hermes honcho identity <file>' to seed one.")
+        print()
+        return
+
+    if not file_path:
+        print("\nHoncho identity management\n" + "─" * 40)
+        print(f"  User peer: {hcfg.peer_name or 'not set'}")
+        print(f"  AI peer:   {hcfg.ai_peer}")
+        print()
+        print("    hermes honcho identity --show        — show both peer representations")
+        print("    hermes honcho identity <file>        — seed AI peer from SOUL.md or any .md/.txt\n")
+        return
+
+    from pathlib import Path
+    p = Path(file_path).expanduser()
+    if not p.exists():
+        print(f"  File not found: {p}\n")
+        return
+
+    content = p.read_text(encoding="utf-8").strip()
+    if not content:
+        print(f"  File is empty: {p}\n")
+        return
+
+    source = p.name
+    ok = mgr.seed_ai_identity(session_key, content, source=source)
+    if ok:
+        print(f"  Seeded AI peer identity from {p.name} into session '{session_key}'")
+        print(f"  Honcho will incorporate this into {hcfg.ai_peer}'s representation over time.\n")
+    else:
+        print(f"  Failed to seed identity. Check logs for details.\n")
+
+
+def cmd_migrate(args) -> None:
+    """Step-by-step migration guide: OpenClaw native memory → Hermes + Honcho."""
+    from pathlib import Path
+
+    # ── Detect OpenClaw native memory files ──────────────────────────────────
+    cwd = Path(os.getcwd())
+    openclaw_home = Path.home() / ".openclaw"
+
+    # User peer: facts about the user
+    user_file_names = ["USER.md", "MEMORY.md"]
+    # AI peer: agent identity / configuration
+    agent_file_names = ["SOUL.md", "IDENTITY.md", "AGENTS.md", "TOOLS.md", "BOOTSTRAP.md"]
+
+    user_files: list[Path] = []
+    agent_files: list[Path] = []
+    for name in user_file_names:
+        for d in [cwd, openclaw_home]:
+            p = d / name
+            if p.exists() and p not in user_files:
+                user_files.append(p)
+    for name in agent_file_names:
+        for d in [cwd, openclaw_home]:
+            p = d / name
+            if p.exists() and p not in agent_files:
+                agent_files.append(p)
+
+    cfg = _read_config()
+    has_key = bool(_resolve_api_key(cfg))
+
+    print("\nHoncho migration: OpenClaw native memory → Hermes\n" + "─" * 50)
+    print()
+    print("  OpenClaw's native memory stores context in local markdown files")
+    print("  (USER.md, MEMORY.md, SOUL.md, ...) and injects them via QMD search.")
+    print("  Honcho replaces that with a cloud-backed, LLM-observable memory layer:")
+    print("  context is retrieved semantically, injected automatically each turn,")
+    print("  and enriched by a dialectic reasoning layer that builds over time.")
+    print()
+
+    # ── Step 1: Honcho account ────────────────────────────────────────────────
+    print("Step 1  Create a Honcho account")
+    print()
+    if has_key:
+        masked = f"...{cfg['apiKey'][-8:]}" if len(cfg["apiKey"]) > 8 else "set"
+        print(f"  Honcho API key already configured: {masked}")
+        print("  Skip to Step 2.")
+    else:
+        print("  Honcho is a cloud memory service that gives Hermes persistent memory")
+        print("  across sessions. You need an API key to use it.")
+        print()
+        print("  1. Get your API key at https://app.honcho.dev")
+        print("  2. Run:  hermes honcho setup")
+        print("     Paste the key when prompted.")
+        print()
+        answer = _prompt("  Run 'hermes honcho setup' now?", default="y")
+        if answer.lower() in ("y", "yes"):
+            cmd_setup(args)
+            cfg = _read_config()
+            has_key = bool(cfg.get("apiKey", ""))
+        else:
+            print()
+            print("  Run 'hermes honcho setup' when ready, then re-run this walkthrough.")
+
+    # ── Step 2: Detected files ────────────────────────────────────────────────
+    print()
+    print("Step 2  Detected OpenClaw memory files")
+    print()
+    if user_files or agent_files:
+        if user_files:
+            print(f"  User memory ({len(user_files)} file(s)) — will go to Honcho user peer:")
+            for f in user_files:
+                print(f"    {f}")
+        if agent_files:
+            print(f"  Agent identity ({len(agent_files)} file(s)) — will go to Honcho AI peer:")
+            for f in agent_files:
+                print(f"    {f}")
+    else:
+        print("  No OpenClaw native memory files found in cwd or ~/.openclaw/.")
+        print("  If your files are elsewhere, copy them here before continuing,")
+        print("  or seed them manually:  hermes honcho identity <path/to/file>")
+
+    # ── Step 3: Migrate user memory ───────────────────────────────────────────
+    print()
+    print("Step 3  Migrate user memory files → Honcho user peer")
+    print()
+    print("  USER.md and MEMORY.md contain facts about you that the agent should")
+    print("  remember across sessions. Honcho will store these under your user peer")
+    print("  and inject relevant excerpts into the system prompt automatically.")
+    print()
+    if user_files:
+        print(f"  Found: {', '.join(f.name for f in user_files)}")
+        print()
+        print("  These are picked up automatically the first time you run 'hermes'")
+        print("  with Honcho configured and no prior session history.")
+        print("  (Hermes calls migrate_memory_files() on first session init.)")
+        print()
+        print("  If you want to migrate them now without starting a session:")
+        for f in user_files:
+            print(f"    hermes honcho migrate  — this step handles it interactively")
+        if has_key:
+            answer = _prompt("  Upload user memory files to Honcho now?", default="y")
+            if answer.lower() in ("y", "yes"):
+                try:
+                    from honcho_integration.client import (
+                        HonchoClientConfig,
+                        get_honcho_client,
+                        reset_honcho_client,
+                    )
+                    from honcho_integration.session import HonchoSessionManager
+
+                    reset_honcho_client()
+                    hcfg = HonchoClientConfig.from_global_config()
+                    client = get_honcho_client(hcfg)
+                    mgr = HonchoSessionManager(honcho=client, config=hcfg)
+                    session_key = hcfg.resolve_session_name()
+                    mgr.get_or_create(session_key)
+                    # Upload from each directory that had user files
+                    dirs_with_files = set(str(f.parent) for f in user_files)
+                    any_uploaded = False
+                    for d in dirs_with_files:
+                        if mgr.migrate_memory_files(session_key, d):
+                            any_uploaded = True
+                    if any_uploaded:
+                        print(f"  Uploaded user memory files from: {', '.join(dirs_with_files)}")
+                    else:
+                        print("  Nothing uploaded (files may already be migrated or empty).")
+                except Exception as e:
+                    print(f"  Failed: {e}")
+        else:
+            print("  Run 'hermes honcho setup' first, then re-run this step.")
+    else:
+        print("  No user memory files detected. Nothing to migrate here.")
+
+    # ── Step 4: Seed AI identity ──────────────────────────────────────────────
+    print()
+    print("Step 4  Seed AI identity files → Honcho AI peer")
+    print()
+    print("  SOUL.md, IDENTITY.md, AGENTS.md, TOOLS.md, BOOTSTRAP.md define the")
+    print("  agent's character, capabilities, and behavioral rules. In OpenClaw")
+    print("  these are injected via file search at prompt-build time.")
+    print()
+    print("  In Hermes, they are seeded once into Honcho's AI peer through the")
+    print("  observation pipeline. Honcho builds a representation from them and")
+    print("  from every subsequent assistant message (observe_me=True). Over time")
+    print("  the representation reflects actual behavior, not just declaration.")
+    print()
+    if agent_files:
+        print(f"  Found: {', '.join(f.name for f in agent_files)}")
+        print()
+        if has_key:
+            answer = _prompt("  Seed AI identity from all detected files now?", default="y")
+            if answer.lower() in ("y", "yes"):
+                try:
+                    from honcho_integration.client import (
+                        HonchoClientConfig,
+                        get_honcho_client,
+                        reset_honcho_client,
+                    )
+                    from honcho_integration.session import HonchoSessionManager
+
+                    reset_honcho_client()
+                    hcfg = HonchoClientConfig.from_global_config()
+                    client = get_honcho_client(hcfg)
+                    mgr = HonchoSessionManager(honcho=client, config=hcfg)
+                    session_key = hcfg.resolve_session_name()
+                    mgr.get_or_create(session_key)
+                    for f in agent_files:
+                        content = f.read_text(encoding="utf-8").strip()
+                        if content:
+                            ok = mgr.seed_ai_identity(session_key, content, source=f.name)
+                            status = "seeded" if ok else "failed"
+                            print(f"    {f.name}: {status}")
+                except Exception as e:
+                    print(f"  Failed: {e}")
+        else:
+            print("  Run 'hermes honcho setup' first, then seed manually:")
+            for f in agent_files:
+                print(f"    hermes honcho identity {f}")
+    else:
+        print("  No agent identity files detected.")
+        print("  To seed manually:  hermes honcho identity <path/to/SOUL.md>")
+
+    # ── Step 5: What changes ──────────────────────────────────────────────────
+    print()
+    print("Step 5  What changes vs. OpenClaw native memory")
+    print()
+    print("  Storage")
+    print("    OpenClaw: markdown files on disk, searched via QMD at prompt-build time.")
+    print("    Hermes:   cloud-backed Honcho peers. Files can stay on disk as source")
+    print("              of truth; Honcho holds the live representation.")
+    print()
+    print("  Context injection")
+    print("    OpenClaw: file excerpts injected synchronously before each LLM call.")
+    print("    Hermes:   Honcho context fetched async at turn end, injected next turn.")
+    print("              First turn has no Honcho context; subsequent turns are loaded.")
+    print()
+    print("  Memory growth")
+    print("    OpenClaw: you edit files manually to update memory.")
+    print("    Hermes:   Honcho observes every message and updates representations")
+    print("              automatically. Files become the seed, not the live store.")
+    print()
+    print("  Honcho tools (available to the agent during conversation)")
+    print("    honcho_context   — ask Honcho a question, get a synthesized answer (LLM)")
+    print("    honcho_search        — semantic search over stored context (no LLM)")
+    print("    honcho_profile       — fast peer card snapshot (no LLM)")
+    print("    honcho_conclude      — write a conclusion/fact back to memory (no LLM)")
+    print()
+    print("  Session naming")
+    print("    OpenClaw: no persistent session concept — files are global.")
+    print("    Hermes:   per-session by default — each run gets its own session")
+    print("              Map a custom name:  hermes honcho map <session-name>")
+
+    # ── Step 6: Next steps ────────────────────────────────────────────────────
+    print()
+    print("Step 6  Next steps")
+    print()
+    if not has_key:
+        print("  1. hermes honcho setup              — configure API key (required)")
+        print("  2. hermes honcho migrate            — re-run this walkthrough")
+    else:
+        print("  1. hermes honcho status             — verify Honcho connection")
+        print("  2. hermes                           — start a session")
+        print("     (user memory files auto-uploaded on first turn if not done above)")
+        print("  3. hermes honcho identity --show    — verify AI peer representation")
+        print("  4. hermes honcho tokens             — tune context and dialectic budgets")
+        print("  5. hermes honcho mode               — view or change memory mode")
+    print()
+
+
+def honcho_command(args) -> None:
+    """Route honcho subcommands."""
+    sub = getattr(args, "honcho_command", None)
+    if sub == "setup" or sub is None:
+        cmd_setup(args)
+    elif sub == "status":
+        cmd_status(args)
+    elif sub == "sessions":
+        cmd_sessions(args)
+    elif sub == "map":
+        cmd_map(args)
+    elif sub == "peer":
+        cmd_peer(args)
+    elif sub == "mode":
+        cmd_mode(args)
+    elif sub == "tokens":
+        cmd_tokens(args)
+    elif sub == "identity":
+        cmd_identity(args)
+    elif sub == "migrate":
+        cmd_migrate(args)
+    else:
+        print(f"  Unknown honcho command: {sub}")
+        print("  Available: setup, status, sessions, map, peer, mode, tokens, identity, migrate\n")
--- a/honcho_integration/client.py
+++ b/honcho_integration/client.py
@ -27,6 +27,40 @@ GLOBAL_CONFIG_PATH = Path.home() / ".honcho" / "config.json"
 HOST = "hermes"


+_RECALL_MODE_ALIASES = {"auto": "hybrid"}
+_VALID_RECALL_MODES = {"hybrid", "context", "tools"}
+
+
+def _normalize_recall_mode(val: str) -> str:
+    """Normalize legacy recall mode values (e.g. 'auto' → 'hybrid')."""
+    val = _RECALL_MODE_ALIASES.get(val, val)
+    return val if val in _VALID_RECALL_MODES else "hybrid"
+
+
+def _resolve_memory_mode(
+    global_val: str | dict,
+    host_val: str | dict | None,
+) -> dict:
+    """Parse memoryMode (string or object) into memory_mode + peer_memory_modes.
+
+    Resolution order: host-level wins over global.
+    String form:  applies as the default for all peers.
+    Object form:  { "default": "hybrid", "hermes": "honcho", ... }
+                  "default" key sets the fallback; other keys are per-peer overrides.
+    """
+    # Pick the winning value (host beats global)
+    val = host_val if host_val is not None else global_val
+
+    if isinstance(val, dict):
+        default = val.get("default", "hybrid")
+        overrides = {k: v for k, v in val.items() if k != "default"}
+    else:
+        default = str(val) if val else "hybrid"
+        overrides = {}
+
+    return {"memory_mode": default, "peer_memory_modes": overrides}
+
+
@dataclass
 class HonchoClientConfig:
    """Configuration for Honcho client, resolved for a specific host."""
@ -42,10 +76,36 @@ class HonchoClientConfig:
    # Toggles
    enabled: bool = False
    save_messages: bool = True
+    # memoryMode: default for all peers. "hybrid" / "honcho"
+    memory_mode: str = "hybrid"
+    # Per-peer overrides — any named Honcho peer. Override memory_mode when set.
+    # Config object form: "memoryMode": { "default": "hybrid", "hermes": "honcho" }
+    peer_memory_modes: dict[str, str] = field(default_factory=dict)
+
+    def peer_memory_mode(self, peer_name: str) -> str:
+        """Return the effective memory mode for a named peer.
+
+        Resolution: per-peer override → global memory_mode default.
+        """
+        return self.peer_memory_modes.get(peer_name, self.memory_mode)
+    # Write frequency: "async" (background thread), "turn" (sync per turn),
+    # "session" (flush on session end), or int (every N turns)
+    write_frequency: str | int = "async"
    # Prefetch budget
    context_tokens: int | None = None
+    # Dialectic (peer.chat) settings
+    # reasoning_level: "minimal" | "low" | "medium" | "high" | "max"
+    # Used as the default; prefetch_dialectic may bump it dynamically.
+    dialectic_reasoning_level: str = "low"
+    # Max chars of dialectic result to inject into Hermes system prompt
+    dialectic_max_chars: int = 600
+    # Recall mode: how memory retrieval works when Honcho is active.
+    # "hybrid"  — auto-injected context + Honcho tools available (model decides)
+    # "context" — auto-injected context only, Honcho tools removed
+    # "tools"   — Honcho tools only, no auto-injected context
+    recall_mode: str = "hybrid"
    # Session resolution
-    session_strategy: str = "per-directory"
+    session_strategy: str = "per-session"
    session_peer_prefix: bool = False
    sessions: dict[str, str] = field(default_factory=dict)
    # Raw global config for anything else consumers need
@ -97,53 +157,164 @@ class HonchoClientConfig:
        )
        linked_hosts = host_block.get("linkedHosts", [])

-        api_key = raw.get("apiKey") or os.environ.get("HONCHO_API_KEY")
+        api_key = (
+            host_block.get("apiKey")
+            or raw.get("apiKey")
+            or os.environ.get("HONCHO_API_KEY")
+        )
+
+        environment = (
+            host_block.get("environment")
+            or raw.get("environment", "production")
+        )

        # Auto-enable when API key is present (unless explicitly disabled)
-        # This matches user expectations: setting an API key should activate the feature.
-        explicit_enabled = raw.get("enabled")
-        if explicit_enabled is None:
-            # Not explicitly set in config -> auto-enable if API key exists
-            enabled = bool(api_key)
+        # Host-level enabled wins, then root-level, then auto-enable if key exists.
+        host_enabled = host_block.get("enabled")
+        root_enabled = raw.get("enabled")
+        if host_enabled is not None:
+            enabled = host_enabled
+        elif root_enabled is not None:
+            enabled = root_enabled
        else:
-            # Respect explicit setting
-            enabled = explicit_enabled
+            # Not explicitly set anywhere -> auto-enable if API key exists
+            enabled = bool(api_key)
+
+        # write_frequency: accept int or string
+        raw_wf = (
+            host_block.get("writeFrequency")
+            or raw.get("writeFrequency")
+            or "async"
+        )
+        try:
+            write_frequency: str | int = int(raw_wf)
+        except (TypeError, ValueError):
+            write_frequency = str(raw_wf)
+
+        # saveMessages: host wins (None-aware since False is valid)
+        host_save = host_block.get("saveMessages")
+        save_messages = host_save if host_save is not None else raw.get("saveMessages", True)
+
+        # sessionStrategy / sessionPeerPrefix: host first, root fallback
+        session_strategy = (
+            host_block.get("sessionStrategy")
+            or raw.get("sessionStrategy", "per-session")
+        )
+        host_prefix = host_block.get("sessionPeerPrefix")
+        session_peer_prefix = (
+            host_prefix if host_prefix is not None
+            else raw.get("sessionPeerPrefix", False)
+        )

        return cls(
            host=host,
            workspace_id=workspace,
            api_key=api_key,
-            environment=raw.get("environment", "production"),
-            peer_name=raw.get("peerName"),
+            environment=environment,
+            peer_name=host_block.get("peerName") or raw.get("peerName"),
            ai_peer=ai_peer,
            linked_hosts=linked_hosts,
            enabled=enabled,
-            save_messages=raw.get("saveMessages", True),
-            context_tokens=raw.get("contextTokens") or host_block.get("contextTokens"),
-            session_strategy=raw.get("sessionStrategy", "per-directory"),
-            session_peer_prefix=raw.get("sessionPeerPrefix", False),
+            save_messages=save_messages,
+            **_resolve_memory_mode(
+                raw.get("memoryMode", "hybrid"),
+                host_block.get("memoryMode"),
+            ),
+            write_frequency=write_frequency,
+            context_tokens=host_block.get("contextTokens") or raw.get("contextTokens"),
+            dialectic_reasoning_level=(
+                host_block.get("dialecticReasoningLevel")
+                or raw.get("dialecticReasoningLevel")
+                or "low"
+            ),
+            dialectic_max_chars=int(
+                host_block.get("dialecticMaxChars")
+                or raw.get("dialecticMaxChars")
+                or 600
+            ),
+            recall_mode=_normalize_recall_mode(
+                host_block.get("recallMode")
+                or raw.get("recallMode")
+                or "hybrid"
+            ),
+            session_strategy=session_strategy,
+            session_peer_prefix=session_peer_prefix,
            sessions=raw.get("sessions", {}),
            raw=raw,
        )

-    def resolve_session_name(self, cwd: str | None = None) -> str | None:
-        """Resolve session name for a directory.
+    @staticmethod
+    def _git_repo_name(cwd: str) -> str | None:
+        """Return the git repo root directory name, or None if not in a repo."""
+        import subprocess

-        Checks manual overrides first, then derives from directory name.
+        try:
+            root = subprocess.run(
+                ["git", "rev-parse", "--show-toplevel"],
+                capture_output=True, text=True, cwd=cwd, timeout=5,
+            )
+            if root.returncode == 0:
+                return Path(root.stdout.strip()).name
+        except (OSError, subprocess.TimeoutExpired):
+            pass
+        return None
+
+    def resolve_session_name(
+        self,
+        cwd: str | None = None,
+        session_title: str | None = None,
+        session_id: str | None = None,
+    ) -> str | None:
+        """Resolve Honcho session name.
+
+        Resolution order:
+          1. Manual directory override from sessions map
+          2. Hermes session title (from /title command)
+          3. per-session strategy — Hermes session_id ({timestamp}_{hex})
+          4. per-repo strategy — git repo root directory name
+          5. per-directory strategy — directory basename
+          6. global strategy — workspace name
        """
+        import re
+
        if not cwd:
            cwd = os.getcwd()

-        # Manual override
+        # Manual override always wins
        manual = self.sessions.get(cwd)
        if manual:
            return manual

-        # Derive from directory basename
-        base = Path(cwd).name
-        if self.session_peer_prefix and self.peer_name:
-            return f"{self.peer_name}-{base}"
-        return base
+        # /title mid-session remap
+        if session_title:
+            sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_title).strip('-')
+            if sanitized:
+                if self.session_peer_prefix and self.peer_name:
+                    return f"{self.peer_name}-{sanitized}"
+                return sanitized
+
+        # per-session: inherit Hermes session_id (new Honcho session each run)
+        if self.session_strategy == "per-session" and session_id:
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{session_id}"
+            return session_id
+
+        # per-repo: one Honcho session per git repository
+        if self.session_strategy == "per-repo":
+            base = self._git_repo_name(cwd) or Path(cwd).name
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{base}"
+            return base
+
+        # per-directory: one Honcho session per working directory
+        if self.session_strategy in ("per-directory", "per-session"):
+            base = Path(cwd).name
+            if self.session_peer_prefix and self.peer_name:
+                return f"{self.peer_name}-{base}"
+            return base
+
+        # global: single session across all directories
+        return self.workspace_id

    def get_linked_workspaces(self) -> list[str]:
        """Resolve linked host keys to workspace names."""
@ -176,9 +347,9 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:

    if not config.api_key:
        raise ValueError(
-            "Honcho API key not found. Set it in ~/.honcho/config.json "
-            "or the HONCHO_API_KEY environment variable. "
-            "Get an API key from https://app.honcho.dev"
+            "Honcho API key not found. "
+            "Get your API key at https://app.honcho.dev, "
+            "then run 'hermes honcho setup' or set HONCHO_API_KEY."
        )

    try:
--- a/honcho_integration/session.py
+++ b/honcho_integration/session.py
@ -2,8 +2,10 @@

 from __future__ import annotations

+import queue
 import re
 import logging
+import threading
 from dataclasses import dataclass, field
 from datetime import datetime
 from typing import Any, TYPE_CHECKING
@ -15,6 +17,9 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

+# Sentinel to signal the async writer thread to shut down
+_ASYNC_SHUTDOWN = object()
+

@dataclass
 class HonchoSession:
@ -80,7 +85,8 @@ class HonchoSessionManager:
        Args:
            honcho: Optional Honcho client. If not provided, uses the singleton.
            context_tokens: Max tokens for context() calls (None = Honcho default).
-            config: HonchoClientConfig from global config (provides peer_name, ai_peer, etc.).
+            config: HonchoClientConfig from global config (provides peer_name, ai_peer,
+                    write_frequency, memory_mode, etc.).
        """
        self._honcho = honcho
        self._context_tokens = context_tokens
@ -89,6 +95,34 @@ class HonchoSessionManager:
        self._peers_cache: dict[str, Any] = {}
        self._sessions_cache: dict[str, Any] = {}

+        # Write frequency state
+        write_frequency = (config.write_frequency if config else "async")
+        self._write_frequency = write_frequency
+        self._turn_counter: int = 0
+
+        # Prefetch caches: session_key → last result (consumed once per turn)
+        self._context_cache: dict[str, dict] = {}
+        self._dialectic_cache: dict[str, str] = {}
+        self._prefetch_cache_lock = threading.Lock()
+        self._dialectic_reasoning_level: str = (
+            config.dialectic_reasoning_level if config else "low"
+        )
+        self._dialectic_max_chars: int = (
+            config.dialectic_max_chars if config else 600
+        )
+
+        # Async write queue — started lazily on first enqueue
+        self._async_queue: queue.Queue | None = None
+        self._async_thread: threading.Thread | None = None
+        if write_frequency == "async":
+            self._async_queue = queue.Queue()
+            self._async_thread = threading.Thread(
+                target=self._async_writer_loop,
+                name="honcho-async-writer",
+                daemon=True,
+            )
+            self._async_thread.start()
+
    @property
    def honcho(self) -> Honcho:
        """Get the Honcho client, initializing if needed."""
@ -125,10 +159,12 @@ class HonchoSessionManager:

        session = self.honcho.session(session_id)

-        # Configure peer observation settings
+        # Configure peer observation settings.
+        # observe_me=True for AI peer so Honcho watches what the agent says
+        # and builds its representation over time — enabling identity formation.
        from honcho.session import SessionPeerConfig
        user_config = SessionPeerConfig(observe_me=True, observe_others=True)
-        ai_config = SessionPeerConfig(observe_me=False, observe_others=True)
+        ai_config = SessionPeerConfig(observe_me=True, observe_others=True)

        session.add_peers([(user_peer, user_config), (assistant_peer, ai_config)])

@ -234,16 +270,11 @@ class HonchoSessionManager:
        self._cache[key] = session
        return session

-    def save(self, session: HonchoSession) -> None:
-        """
-        Save messages to Honcho.
-
-        Syncs only new (unsynced) messages from the local cache.
-        """
+    def _flush_session(self, session: HonchoSession) -> bool:
+        """Internal: write unsynced messages to Honcho synchronously."""
        if not session.messages:
-            return
+            return True

-        # Get the Honcho session and peers
        user_peer = self._get_or_create_peer(session.user_peer_id)
        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
        honcho_session = self._sessions_cache.get(session.honcho_session_id)
@ -253,11 +284,9 @@ class HonchoSessionManager:
                session.honcho_session_id, user_peer, assistant_peer
            )

-        # Only send new messages (those without a '_synced' flag)
        new_messages = [m for m in session.messages if not m.get("_synced")]
-
        if not new_messages:
-            return
+            return True

        honcho_messages = []
        for msg in new_messages:
@ -269,13 +298,106 @@ class HonchoSessionManager:
            for msg in new_messages:
                msg["_synced"] = True
            logger.debug("Synced %d messages to Honcho for %s", len(honcho_messages), session.key)
+            self._cache[session.key] = session
+            return True
        except Exception as e:
            for msg in new_messages:
                msg["_synced"] = False
            logger.error("Failed to sync messages to Honcho: %s", e)
+            self._cache[session.key] = session
+            return False

-        # Update cache
-        self._cache[session.key] = session
+    def _async_writer_loop(self) -> None:
+        """Background daemon thread: drains the async write queue."""
+        while True:
+            try:
+                item = self._async_queue.get(timeout=5)
+                if item is _ASYNC_SHUTDOWN:
+                    break
+
+                first_error: Exception | None = None
+                try:
+                    success = self._flush_session(item)
+                except Exception as e:
+                    success = False
+                    first_error = e
+
+                if success:
+                    continue
+
+                if first_error is not None:
+                    logger.warning("Honcho async write failed, retrying once: %s", first_error)
+                else:
+                    logger.warning("Honcho async write failed, retrying once")
+
+                import time as _time
+                _time.sleep(2)
+
+                try:
+                    retry_success = self._flush_session(item)
+                except Exception as e2:
+                    logger.error("Honcho async write retry failed, dropping batch: %s", e2)
+                    continue
+
+                if not retry_success:
+                    logger.error("Honcho async write retry failed, dropping batch")
+            except queue.Empty:
+                continue
+            except Exception as e:
+                logger.error("Honcho async writer error: %s", e)
+
+    def save(self, session: HonchoSession) -> None:
+        """Save messages to Honcho, respecting write_frequency.
+
+        write_frequency modes:
+          "async"   — enqueue for background thread (zero blocking, zero token cost)
+          "turn"    — flush synchronously every turn
+          "session" — defer until flush_session() is called explicitly
+          N (int)   — flush every N turns
+        """
+        self._turn_counter += 1
+        wf = self._write_frequency
+
+        if wf == "async":
+            if self._async_queue is not None:
+                self._async_queue.put(session)
+        elif wf == "turn":
+            self._flush_session(session)
+        elif wf == "session":
+            # Accumulate; caller must call flush_all() at session end
+            pass
+        elif isinstance(wf, int) and wf > 0:
+            if self._turn_counter % wf == 0:
+                self._flush_session(session)
+
+    def flush_all(self) -> None:
+        """Flush all pending unsynced messages for all cached sessions.
+
+        Called at session end for "session" write_frequency, or to force
+        a sync before process exit regardless of mode.
+        """
+        for session in list(self._cache.values()):
+            try:
+                self._flush_session(session)
+            except Exception as e:
+                logger.error("Honcho flush_all error for %s: %s", session.key, e)
+
+        # Drain async queue synchronously if it exists
+        if self._async_queue is not None:
+            while not self._async_queue.empty():
+                try:
+                    item = self._async_queue.get_nowait()
+                    if item is not _ASYNC_SHUTDOWN:
+                        self._flush_session(item)
+                except queue.Empty:
+                    break
+
+    def shutdown(self) -> None:
+        """Gracefully shut down the async writer thread."""
+        if self._async_queue is not None and self._async_thread is not None:
+            self.flush_all()
+            self._async_queue.put(_ASYNC_SHUTDOWN)
+            self._async_thread.join(timeout=10)

    def delete(self, key: str) -> bool:
        """Delete a session from local cache."""
@ -305,49 +427,163 @@ class HonchoSessionManager:
        # get_or_create will create a fresh session
        session = self.get_or_create(new_key)

-        # Cache under both original key and timestamped key
+        # Cache under the original key so callers find it by the expected name
        self._cache[key] = session
-        self._cache[new_key] = session

        logger.info("Created new session for %s (honcho: %s)", key, session.honcho_session_id)
        return session

-    def get_user_context(self, session_key: str, query: str) -> str:
+    _REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
+
+    def _dynamic_reasoning_level(self, query: str) -> str:
        """
-        Query Honcho's dialectic chat for user context.
+        Pick a reasoning level based on message complexity.
+
+        Uses the configured default as a floor; bumps up for longer or
+        more complex messages so Honcho applies more inference where it matters.
+
+          < 120 chars  → default (typically "low")
+          120–400 chars → one level above default (cap at "high")
+          > 400 chars  → two levels above default (cap at "high")
+
+        "max" is never selected automatically — reserve it for explicit config.
+        """
+        levels = self._REASONING_LEVELS
+        default_idx = levels.index(self._dialectic_reasoning_level) if self._dialectic_reasoning_level in levels else 1
+        n = len(query)
+        if n < 120:
+            bump = 0
+        elif n < 400:
+            bump = 1
+        else:
+            bump = 2
+        # Cap at "high" (index 3) for auto-selection
+        idx = min(default_idx + bump, 3)
+        return levels[idx]
+
+    def dialectic_query(
+        self, session_key: str, query: str,
+        reasoning_level: str | None = None,
+        peer: str = "user",
+    ) -> str:
+        """
+        Query Honcho's dialectic endpoint about a peer.
+
+        Runs an LLM on Honcho's backend against the target peer's full
+        representation. Higher latency than context() — call async via
+        prefetch_dialectic() to avoid blocking the response.

        Args:
-            session_key: The session key to get context for.
-            query: Natural language question about the user.
+            session_key: The session key to query against.
+            query: Natural language question.
+            reasoning_level: Override the config default. If None, uses
+                             _dynamic_reasoning_level(query).
+            peer: Which peer to query — "user" (default) or "ai".

        Returns:
-            Honcho's response about the user.
+            Honcho's synthesized answer, or empty string on failure.
        """
        session = self._cache.get(session_key)
        if not session:
-            return "No session found for this context."
+            return ""

-        user_peer = self._get_or_create_peer(session.user_peer_id)
+        peer_id = session.assistant_peer_id if peer == "ai" else session.user_peer_id
+        target_peer = self._get_or_create_peer(peer_id)
+        level = reasoning_level or self._dynamic_reasoning_level(query)

        try:
-            return user_peer.chat(query)
+            result = target_peer.chat(query, reasoning_level=level) or ""
+            # Apply Hermes-side char cap before caching
+            if result and self._dialectic_max_chars and len(result) > self._dialectic_max_chars:
+                result = result[:self._dialectic_max_chars].rsplit(" ", 1)[0] + " …"
+            return result
        except Exception as e:
-            logger.error("Failed to get user context from Honcho: %s", e)
-            return f"Unable to retrieve user context: {e}"
+            logger.warning("Honcho dialectic query failed: %s", e)
+            return ""
+
+    def prefetch_dialectic(self, session_key: str, query: str) -> None:
+        """
+        Fire a dialectic_query in a background thread, caching the result.
+
+        Non-blocking. The result is available via pop_dialectic_result()
+        on the next call (typically the following turn). Reasoning level
+        is selected dynamically based on query complexity.
+
+        Args:
+            session_key: The session key to query against.
+            query: The user's current message, used as the query.
+        """
+        def _run():
+            result = self.dialectic_query(session_key, query)
+            if result:
+                self.set_dialectic_result(session_key, result)
+
+        t = threading.Thread(target=_run, name="honcho-dialectic-prefetch", daemon=True)
+        t.start()
+
+    def set_dialectic_result(self, session_key: str, result: str) -> None:
+        """Store a prefetched dialectic result in a thread-safe way."""
+        if not result:
+            return
+        with self._prefetch_cache_lock:
+            self._dialectic_cache[session_key] = result
+
+    def pop_dialectic_result(self, session_key: str) -> str:
+        """
+        Return and clear the cached dialectic result for this session.
+
+        Returns empty string if no result is ready yet.
+        """
+        with self._prefetch_cache_lock:
+            return self._dialectic_cache.pop(session_key, "")
+
+    def prefetch_context(self, session_key: str, user_message: str | None = None) -> None:
+        """
+        Fire get_prefetch_context in a background thread, caching the result.
+
+        Non-blocking. Consumed next turn via pop_context_result(). This avoids
+        a synchronous HTTP round-trip blocking every response.
+        """
+        def _run():
+            result = self.get_prefetch_context(session_key, user_message)
+            if result:
+                self.set_context_result(session_key, result)
+
+        t = threading.Thread(target=_run, name="honcho-context-prefetch", daemon=True)
+        t.start()
+
+    def set_context_result(self, session_key: str, result: dict[str, str]) -> None:
+        """Store a prefetched context result in a thread-safe way."""
+        if not result:
+            return
+        with self._prefetch_cache_lock:
+            self._context_cache[session_key] = result
+
+    def pop_context_result(self, session_key: str) -> dict[str, str]:
+        """
+        Return and clear the cached context result for this session.
+
+        Returns empty dict if no result is ready yet (first turn).
+        """
+        with self._prefetch_cache_lock:
+            return self._context_cache.pop(session_key, {})

    def get_prefetch_context(self, session_key: str, user_message: str | None = None) -> dict[str, str]:
        """
-        Pre-fetch user context using Honcho's context() method.
+        Pre-fetch user and AI peer context from Honcho.

-        Single API call that returns the user's representation
-        and peer card, using semantic search based on the user's message.
+        Fetches peer_representation and peer_card for both peers. search_query
+        is intentionally omitted — it would only affect additional excerpts
+        that this code does not consume, and passing the raw message exposes
+        conversation content in server access logs.

        Args:
            session_key: The session key to get context for.
-            user_message: The user's message for semantic search.
+            user_message: Unused; kept for call-site compatibility.

        Returns:
-            Dictionary with 'representation' and 'card' keys.
+            Dictionary with 'representation', 'card', 'ai_representation',
+            and 'ai_card' keys.
        """
        session = self._cache.get(session_key)
        if not session:
@ -357,23 +593,35 @@ class HonchoSessionManager:
        if not honcho_session:
            return {}

+        result: dict[str, str] = {}
        try:
            ctx = honcho_session.context(
                summary=False,
                tokens=self._context_tokens,
                peer_target=session.user_peer_id,
-                search_query=user_message,
+                peer_perspective=session.assistant_peer_id,
            )
-            # peer_card is list[str] in SDK v2, join for prompt injection
            card = ctx.peer_card or []
-            card_str = "\n".join(card) if isinstance(card, list) else str(card)
-            return {
-                "representation": ctx.peer_representation or "",
-                "card": card_str,
-            }
+            result["representation"] = ctx.peer_representation or ""
+            result["card"] = "\n".join(card) if isinstance(card, list) else str(card)
        except Exception as e:
-            logger.warning("Failed to fetch context from Honcho: %s", e)
-            return {}
+            logger.warning("Failed to fetch user context from Honcho: %s", e)
+
+        # Also fetch AI peer's own representation so Hermes knows itself.
+        try:
+            ai_ctx = honcho_session.context(
+                summary=False,
+                tokens=self._context_tokens,
+                peer_target=session.assistant_peer_id,
+                peer_perspective=session.user_peer_id,
+            )
+            ai_card = ai_ctx.peer_card or []
+            result["ai_representation"] = ai_ctx.peer_representation or ""
+            result["ai_card"] = "\n".join(ai_card) if isinstance(ai_card, list) else str(ai_card)
+        except Exception as e:
+            logger.debug("Failed to fetch AI peer context from Honcho: %s", e)
+
+        return result

    def migrate_local_history(self, session_key: str, messages: list[dict[str, Any]]) -> bool:
        """
@ -388,21 +636,17 @@ class HonchoSessionManager:
        Returns:
            True if upload succeeded, False otherwise.
        """
-        sanitized = self._sanitize_id(session_key)
-        honcho_session = self._sessions_cache.get(sanitized)
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No local session cached for '%s', skipping migration", session_key)
+            return False
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
        if not honcho_session:
            logger.warning("No Honcho session cached for '%s', skipping migration", session_key)
            return False

-        # Resolve user peer for attribution
-        parts = session_key.split(":", 1)
-        channel = parts[0] if len(parts) > 1 else "default"
-        chat_id = parts[1] if len(parts) > 1 else session_key
-        user_peer_id = self._sanitize_id(f"user-{channel}-{chat_id}")
-        user_peer = self._peers_cache.get(user_peer_id)
-        if not user_peer:
-            logger.warning("No user peer cached for '%s', skipping migration", user_peer_id)
-            return False
+        user_peer = self._get_or_create_peer(session.user_peer_id)

        content_bytes = self._format_migration_transcript(session_key, messages)
        first_ts = messages[0].get("timestamp") if messages else None
@ -471,29 +715,45 @@ class HonchoSessionManager:
        if not memory_path.exists():
            return False

-        sanitized = self._sanitize_id(session_key)
-        honcho_session = self._sessions_cache.get(sanitized)
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No local session cached for '%s', skipping memory migration", session_key)
+            return False
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
        if not honcho_session:
            logger.warning("No Honcho session cached for '%s', skipping memory migration", session_key)
            return False

-        # Resolve user peer for attribution
-        parts = session_key.split(":", 1)
-        channel = parts[0] if len(parts) > 1 else "default"
-        chat_id = parts[1] if len(parts) > 1 else session_key
-        user_peer_id = self._sanitize_id(f"user-{channel}-{chat_id}")
-        user_peer = self._peers_cache.get(user_peer_id)
-        if not user_peer:
-            logger.warning("No user peer cached for '%s', skipping memory migration", user_peer_id)
-            return False
+        user_peer = self._get_or_create_peer(session.user_peer_id)
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)

        uploaded = False
        files = [
-            ("MEMORY.md", "consolidated_memory.md", "Long-term agent notes and preferences"),
-            ("USER.md", "user_profile.md", "User profile and preferences"),
+            (
+                "MEMORY.md",
+                "consolidated_memory.md",
+                "Long-term agent notes and preferences",
+                user_peer,
+                "user",
+            ),
+            (
+                "USER.md",
+                "user_profile.md",
+                "User profile and preferences",
+                user_peer,
+                "user",
+            ),
+            (
+                "SOUL.md",
+                "agent_soul.md",
+                "Agent persona and identity configuration",
+                assistant_peer,
+                "ai",
+            ),
        ]

-        for filename, upload_name, description in files:
+        for filename, upload_name, description, target_peer, target_kind in files:
            filepath = memory_path / filename
            if not filepath.exists():
                continue
@ -515,16 +775,204 @@ class HonchoSessionManager:
            try:
                honcho_session.upload_file(
                    file=(upload_name, wrapped.encode("utf-8"), "text/plain"),
-                    peer=user_peer,
-                    metadata={"source": "local_memory", "original_file": filename},
+                    peer=target_peer,
+                    metadata={
+                        "source": "local_memory",
+                        "original_file": filename,
+                        "target_peer": target_kind,
+                    },
+                )
+                logger.info(
+                    "Uploaded %s to Honcho for %s (%s peer)",
+                    filename,
+                    session_key,
+                    target_kind,
                )
-                logger.info("Uploaded %s to Honcho for %s", filename, session_key)
                uploaded = True
            except Exception as e:
                logger.error("Failed to upload %s to Honcho: %s", filename, e)

        return uploaded

+    def get_peer_card(self, session_key: str) -> list[str]:
+        """
+        Fetch the user peer's card — a curated list of key facts.
+
+        Fast, no LLM reasoning. Returns raw structured facts Honcho has
+        inferred about the user (name, role, preferences, patterns).
+        Empty list if unavailable.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return []
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return []
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=200,
+                peer_target=session.user_peer_id,
+                peer_perspective=session.assistant_peer_id,
+            )
+            card = ctx.peer_card or []
+            return card if isinstance(card, list) else [str(card)]
+        except Exception as e:
+            logger.debug("Failed to fetch peer card from Honcho: %s", e)
+            return []
+
+    def search_context(self, session_key: str, query: str, max_tokens: int = 800) -> str:
+        """
+        Semantic search over Honcho session context.
+
+        Returns raw excerpts ranked by relevance to the query. No LLM
+        reasoning — cheaper and faster than dialectic_query. Good for
+        factual lookups where the model will do its own synthesis.
+
+        Args:
+            session_key: Session to search against.
+            query: Search query for semantic matching.
+            max_tokens: Token budget for returned content.
+
+        Returns:
+            Relevant context excerpts as a string, or empty string if none.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return ""
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return ""
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=max_tokens,
+                peer_target=session.user_peer_id,
+                peer_perspective=session.assistant_peer_id,
+                search_query=query,
+            )
+            parts = []
+            if ctx.peer_representation:
+                parts.append(ctx.peer_representation)
+            card = ctx.peer_card or []
+            if card:
+                facts = card if isinstance(card, list) else [str(card)]
+                parts.append("\n".join(f"- {f}" for f in facts))
+            return "\n\n".join(parts)
+        except Exception as e:
+            logger.debug("Honcho search_context failed: %s", e)
+            return ""
+
+    def create_conclusion(self, session_key: str, content: str) -> bool:
+        """Write a conclusion about the user back to Honcho.
+
+        Conclusions are facts the AI peer observes about the user —
+        preferences, corrections, clarifications, project context.
+        They feed into the user's peer card and representation.
+
+        Args:
+            session_key: Session to associate the conclusion with.
+            content: The conclusion text (e.g. "User prefers dark mode").
+
+        Returns:
+            True on success, False on failure.
+        """
+        if not content or not content.strip():
+            return False
+
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No session cached for '%s', skipping conclusion", session_key)
+            return False
+
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+        try:
+            conclusions_scope = assistant_peer.conclusions_of(session.user_peer_id)
+            conclusions_scope.create([{
+                "content": content.strip(),
+                "session_id": session.honcho_session_id,
+            }])
+            logger.info("Created conclusion for %s: %s", session_key, content[:80])
+            return True
+        except Exception as e:
+            logger.error("Failed to create conclusion: %s", e)
+            return False
+
+    def seed_ai_identity(self, session_key: str, content: str, source: str = "manual") -> bool:
+        """
+        Seed the AI peer's Honcho representation from text content.
+
+        Useful for priming AI identity from SOUL.md, exported chats, or
+        any structured description. The content is sent as an assistant
+        peer message so Honcho's reasoning model can incorporate it.
+
+        Args:
+            session_key: The session key to associate with.
+            content: The identity/persona content to seed.
+            source: Metadata tag for the source (e.g. "soul_md", "export").
+
+        Returns:
+            True on success, False on failure.
+        """
+        if not content or not content.strip():
+            return False
+
+        session = self._cache.get(session_key)
+        if not session:
+            logger.warning("No session cached for '%s', skipping AI seed", session_key)
+            return False
+
+        assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
+        try:
+            wrapped = (
+                f"<ai_identity_seed>\n"
+                f"<source>{source}</source>\n"
+                f"\n"
+                f"{content.strip()}\n"
+                f"</ai_identity_seed>"
+            )
+            assistant_peer.add_message("assistant", wrapped)
+            logger.info("Seeded AI identity from '%s' into %s", source, session_key)
+            return True
+        except Exception as e:
+            logger.error("Failed to seed AI identity: %s", e)
+            return False
+
+    def get_ai_representation(self, session_key: str) -> dict[str, str]:
+        """
+        Fetch the AI peer's current Honcho representation.
+
+        Returns:
+            Dict with 'representation' and 'card' keys, empty strings if unavailable.
+        """
+        session = self._cache.get(session_key)
+        if not session:
+            return {"representation": "", "card": ""}
+
+        honcho_session = self._sessions_cache.get(session.honcho_session_id)
+        if not honcho_session:
+            return {"representation": "", "card": ""}
+
+        try:
+            ctx = honcho_session.context(
+                summary=False,
+                tokens=self._context_tokens,
+                peer_target=session.assistant_peer_id,
+                peer_perspective=session.user_peer_id,
+            )
+            ai_card = ctx.peer_card or []
+            return {
+                "representation": ctx.peer_representation or "",
+                "card": "\n".join(ai_card) if isinstance(ai_card, list) else str(ai_card),
+            }
+        except Exception as e:
+            logger.debug("Failed to fetch AI representation: %s", e)
+            return {"representation": "", "card": ""}
+
    def list_sessions(self) -> list[dict[str, Any]]:
        """List all cached sessions."""
        return [
--- a/run_agent.py
+++ b/run_agent.py
@ -20,6 +20,7 @@ Usage:
    response = agent.run_conversation("Tell me about the latest Python updates")
 """

+import atexit
 import copy
 import hashlib
 import json
@ -31,6 +32,7 @@ import re
 import sys
 import time
 import threading
+import weakref
 from types import SimpleNamespace
 import uuid
 from typing import List, Dict, Any, Optional
@ -98,6 +100,13 @@ from agent.trajectory import (
    save_trajectory as _save_trajectory_to_file,
 )

+HONCHO_TOOL_NAMES = {
+    "honcho_context",
+    "honcho_profile",
+    "honcho_search",
+    "honcho_conclude",
+}
+

 class _SafeWriter:
    """Transparent stdout wrapper that catches OSError from broken pipes.
@ -229,6 +238,8 @@ class AIAgent:
        skip_memory: bool = False,
        session_db=None,
        honcho_session_key: str = None,
+        honcho_manager=None,
+        honcho_config=None,
        iteration_budget: "IterationBudget" = None,
        fallback_model: Dict[str, Any] = None,
        checkpoints_enabled: bool = False,
@ -275,6 +286,8 @@ class AIAgent:
                polluting trajectories with user-specific persona or project instructions.
            honcho_session_key (str): Session key for Honcho integration (e.g., "telegram:123456" or CLI session_id).
                When provided and Honcho is enabled in config, enables persistent cross-session user modeling.
+            honcho_manager: Optional shared HonchoSessionManager owned by the caller.
+            honcho_config: Optional HonchoClientConfig corresponding to honcho_manager.
        """
        self.model = model
        self.max_iterations = max_iterations
@ -639,42 +652,72 @@ class AIAgent:
        # Reads ~/.honcho/config.json as the single source of truth.
        self._honcho = None  # HonchoSessionManager | None
        self._honcho_session_key = honcho_session_key
+        self._honcho_config = None  # HonchoClientConfig | None
+        self._honcho_exit_hook_registered = False
        if not skip_memory:
            try:
-                from honcho_integration.client import HonchoClientConfig, get_honcho_client
-                hcfg = HonchoClientConfig.from_global_config()
-                if hcfg.enabled and hcfg.api_key:
-                    from honcho_integration.session import HonchoSessionManager
-                    client = get_honcho_client(hcfg)
-                    self._honcho = HonchoSessionManager(
-                        honcho=client,
-                        config=hcfg,
-                        context_tokens=hcfg.context_tokens,
-                    )
-                    # Resolve session key: explicit arg > global sessions map > fallback
-                    if not self._honcho_session_key:
-                        self._honcho_session_key = (
-                            hcfg.resolve_session_name()
-                            or "hermes-default"
+                if honcho_manager is not None:
+                    hcfg = honcho_config or getattr(honcho_manager, "_config", None)
+                    self._honcho_config = hcfg
+                    if hcfg and self._honcho_should_activate(hcfg):
+                        self._honcho = honcho_manager
+                        self._activate_honcho(
+                            hcfg,
+                            enabled_toolsets=enabled_toolsets,
+                            disabled_toolsets=disabled_toolsets,
+                            session_db=session_db,
                        )
-                    # Ensure session exists in Honcho
-                    self._honcho.get_or_create(self._honcho_session_key)
-                    # Inject session context into the honcho tool module
-                    from tools.honcho_tools import set_session_context
-                    set_session_context(self._honcho, self._honcho_session_key)
-                    logger.info(
-                        "Honcho active (session: %s, user: %s, workspace: %s)",
-                        self._honcho_session_key, hcfg.peer_name, hcfg.workspace_id,
-                    )
                else:
-                    if not hcfg.enabled:
-                        logger.debug("Honcho disabled in global config")
-                    elif not hcfg.api_key:
-                        logger.debug("Honcho enabled but no API key configured")
+                    from honcho_integration.client import HonchoClientConfig, get_honcho_client
+                    hcfg = HonchoClientConfig.from_global_config()
+                    self._honcho_config = hcfg
+                    if self._honcho_should_activate(hcfg):
+                        from honcho_integration.session import HonchoSessionManager
+                        client = get_honcho_client(hcfg)
+                        self._honcho = HonchoSessionManager(
+                            honcho=client,
+                            config=hcfg,
+                            context_tokens=hcfg.context_tokens,
+                        )
+                        self._activate_honcho(
+                            hcfg,
+                            enabled_toolsets=enabled_toolsets,
+                            disabled_toolsets=disabled_toolsets,
+                            session_db=session_db,
+                        )
+                    else:
+                        if not hcfg.enabled:
+                            logger.debug("Honcho disabled in global config")
+                        elif not hcfg.api_key:
+                            logger.debug("Honcho enabled but no API key configured")
+                        else:
+                            logger.debug("Honcho enabled but missing API key or disabled in config")
            except Exception as e:
-                logger.debug("Honcho init failed (non-fatal): %s", e)
+                logger.warning("Honcho init failed — memory disabled: %s", e)
+                print(f"  Honcho init failed: {e}")
+                print("  Run 'hermes honcho setup' to reconfigure.")
                self._honcho = None

+        # Tools are initially discovered before Honcho activation. If Honcho
+        # stays inactive, remove any stale honcho_* tools from prior process state.
+        if not self._honcho:
+            self._strip_honcho_tools_from_surface()
+
+        # Gate local memory writes based on per-peer memory modes.
+        # AI peer governs MEMORY.md; user peer governs USER.md.
+        # "honcho" = Honcho only, disable local writes.
+        if self._honcho_config and self._honcho:
+            _hcfg = self._honcho_config
+            _agent_mode = _hcfg.peer_memory_mode(_hcfg.ai_peer)
+            _user_mode = _hcfg.peer_memory_mode(_hcfg.peer_name or "user")
+            if _agent_mode == "honcho":
+                self._memory_flush_min_turns = 0
+                self._memory_enabled = False
+                logger.debug("peer %s memory_mode=honcho: local MEMORY.md writes disabled", _hcfg.ai_peer)
+            if _user_mode == "honcho":
+                self._user_profile_enabled = False
+                logger.debug("peer %s memory_mode=honcho: local USER.md writes disabled", _hcfg.peer_name or "user")
+
        # Skills config: nudge interval for skill creation reminders
        self._skill_nudge_interval = 15
        try:
@ -1386,27 +1429,180 @@ class AIAgent:

    # ── Honcho integration helpers ──

-    def _honcho_prefetch(self, user_message: str) -> str:
-        """Fetch user context from Honcho for system prompt injection.
+    def _honcho_should_activate(self, hcfg) -> bool:
+        """Return True when remote Honcho should be active."""
+        if not hcfg or not hcfg.enabled or not hcfg.api_key:
+            return False
+        return True

-        Returns a formatted context block, or empty string if unavailable.
-        """
+    def _strip_honcho_tools_from_surface(self) -> None:
+        """Remove Honcho tools from the active tool surface."""
+        if not self.tools:
+            self.valid_tool_names = set()
+            return
+
+        self.tools = [
+            tool for tool in self.tools
+            if tool.get("function", {}).get("name") not in HONCHO_TOOL_NAMES
+        ]
+        self.valid_tool_names = {
+            tool["function"]["name"] for tool in self.tools
+        } if self.tools else set()
+
+    def _activate_honcho(
+        self,
+        hcfg,
+        *,
+        enabled_toolsets: Optional[List[str]],
+        disabled_toolsets: Optional[List[str]],
+        session_db,
+    ) -> None:
+        """Finish Honcho setup once a session manager is available."""
+        if not self._honcho:
+            return
+
+        if not self._honcho_session_key:
+            session_title = None
+            if session_db is not None:
+                try:
+                    session_title = session_db.get_session_title(self.session_id or "")
+                except Exception:
+                    pass
+            self._honcho_session_key = (
+                hcfg.resolve_session_name(
+                    session_title=session_title,
+                    session_id=self.session_id,
+                )
+                or "hermes-default"
+            )
+
+        honcho_sess = self._honcho.get_or_create(self._honcho_session_key)
+        if not honcho_sess.messages:
+            try:
+                from hermes_cli.config import get_hermes_home
+
+                mem_dir = str(get_hermes_home() / "memories")
+                self._honcho.migrate_memory_files(
+                    self._honcho_session_key,
+                    mem_dir,
+                )
+            except Exception as exc:
+                logger.debug("Memory files migration failed (non-fatal): %s", exc)
+
+        from tools.honcho_tools import set_session_context
+
+        set_session_context(self._honcho, self._honcho_session_key)
+
+        # Rebuild tool surface after Honcho context injection. Tool availability
+        # is check_fn-gated and may change once session context is attached.
+        self.tools = get_tool_definitions(
+            enabled_toolsets=enabled_toolsets,
+            disabled_toolsets=disabled_toolsets,
+            quiet_mode=True,
+        )
+        self.valid_tool_names = {
+            tool["function"]["name"] for tool in self.tools
+        } if self.tools else set()
+
+        if hcfg.recall_mode == "context":
+            self._strip_honcho_tools_from_surface()
+            if not self.quiet_mode:
+                print("  Honcho active — recall_mode: context (Honcho tools hidden)")
+        else:
+            if not self.quiet_mode:
+                print(f"  Honcho active — recall_mode: {hcfg.recall_mode}")
+
+        logger.info(
+            "Honcho active (session: %s, user: %s, workspace: %s, "
+            "write_frequency: %s, memory_mode: %s)",
+            self._honcho_session_key,
+            hcfg.peer_name,
+            hcfg.workspace_id,
+            hcfg.write_frequency,
+            hcfg.memory_mode,
+        )
+
+        recall_mode = hcfg.recall_mode
+        if recall_mode != "tools":
+            try:
+                ctx = self._honcho.get_prefetch_context(self._honcho_session_key)
+                if ctx:
+                    self._honcho.set_context_result(self._honcho_session_key, ctx)
+                    logger.debug("Honcho context pre-warmed for first turn")
+            except Exception as exc:
+                logger.debug("Honcho context prefetch failed (non-fatal): %s", exc)
+
+        self._register_honcho_exit_hook()
+
+    def _register_honcho_exit_hook(self) -> None:
+        """Register a process-exit flush hook without clobbering signal handlers."""
+        if self._honcho_exit_hook_registered or not self._honcho:
+            return
+
+        honcho_ref = weakref.ref(self._honcho)
+
+        def _flush_honcho_on_exit():
+            manager = honcho_ref()
+            if manager is None:
+                return
+            try:
+                manager.flush_all()
+            except Exception as exc:
+                logger.debug("Honcho flush on exit failed (non-fatal): %s", exc)
+
+        atexit.register(_flush_honcho_on_exit)
+        self._honcho_exit_hook_registered = True
+
+    def _queue_honcho_prefetch(self, user_message: str) -> None:
+        """Queue turn-end Honcho prefetch so the next turn can consume cached results."""
+        if not self._honcho or not self._honcho_session_key:
+            return
+
+        recall_mode = (self._honcho_config.recall_mode if self._honcho_config else "hybrid")
+        if recall_mode == "tools":
+            return
+
+        try:
+            self._honcho.prefetch_context(self._honcho_session_key, user_message)
+            self._honcho.prefetch_dialectic(self._honcho_session_key, user_message or "What were we working on?")
+        except Exception as exc:
+            logger.debug("Honcho background prefetch failed (non-fatal): %s", exc)
+
+    def _honcho_prefetch(self, user_message: str) -> str:
+        """Assemble the first-turn Honcho context from the pre-warmed cache."""
        if not self._honcho or not self._honcho_session_key:
            return ""
        try:
-            ctx = self._honcho.get_prefetch_context(self._honcho_session_key, user_message)
-            if not ctx:
-                return ""
            parts = []
-            rep = ctx.get("representation", "")
-            card = ctx.get("card", "")
-            if rep:
-                parts.append(rep)
-            if card:
-                parts.append(card)
+
+            ctx = self._honcho.pop_context_result(self._honcho_session_key)
+            if ctx:
+                rep = ctx.get("representation", "")
+                card = ctx.get("card", "")
+                if rep:
+                    parts.append(f"## User representation\n{rep}")
+                if card:
+                    parts.append(card)
+                ai_rep = ctx.get("ai_representation", "")
+                ai_card = ctx.get("ai_card", "")
+                if ai_rep:
+                    parts.append(f"## AI peer representation\n{ai_rep}")
+                if ai_card:
+                    parts.append(ai_card)
+
+            dialectic = self._honcho.pop_dialectic_result(self._honcho_session_key)
+            if dialectic:
+                parts.append(f"## Continuity synthesis\n{dialectic}")
+
            if not parts:
                return ""
-            return "# Honcho User Context\n" + "\n\n".join(parts)
+            header = (
+                "# Honcho Memory (persistent cross-session context)\n"
+                "Use this to answer questions about the user, prior sessions, "
+                "and what you were working on together. Do not call tools to "
+                "look up information that is already present here.\n"
+            )
+            return header + "\n\n".join(parts)
        except Exception as e:
            logger.debug("Honcho prefetch failed (non-fatal): %s", e)
            return ""
@ -1441,8 +1637,12 @@ class AIAgent:
            session.add_message("user", user_content)
            session.add_message("assistant", assistant_content)
            self._honcho.save(session)
+            logger.info("Honcho sync queued for session %s (%d messages)",
+                        self._honcho_session_key, len(session.messages))
        except Exception as e:
-            logger.debug("Honcho sync failed (non-fatal): %s", e)
+            logger.warning("Honcho sync failed: %s", e)
+            if not self.quiet_mode:
+                print(f"  Honcho write failed: {e}")

    def _build_system_prompt(self, system_message: str = None) -> str:
        """
@ -1460,7 +1660,21 @@ class AIAgent:
        #   5. Context files (SOUL.md, AGENTS.md, .cursorrules)
        #   6. Current date & time (frozen at build time)
        #   7. Platform-specific formatting hint
-        prompt_parts = [DEFAULT_AGENT_IDENTITY]
+        # If an AI peer name is configured in Honcho, personalise the identity line.
+        _ai_peer_name = (
+            self._honcho_config.ai_peer
+            if self._honcho_config and self._honcho_config.ai_peer != "hermes"
+            else None
+        )
+        if _ai_peer_name:
+            _identity = DEFAULT_AGENT_IDENTITY.replace(
+                "You are Hermes Agent",
+                f"You are {_ai_peer_name}",
+                1,
+            )
+        else:
+            _identity = DEFAULT_AGENT_IDENTITY
+        prompt_parts = [_identity]

        # Tool-aware behavioral guidance: only inject when the tools are loaded
        tool_guidance = []
@ -1473,6 +1687,60 @@ class AIAgent:
        if tool_guidance:
            prompt_parts.append(" ".join(tool_guidance))

+        # Honcho CLI awareness: tell Hermes about its own management commands
+        # so it can refer the user to them rather than reinventing answers.
+        if self._honcho and self._honcho_session_key:
+            hcfg = self._honcho_config
+            mode = hcfg.memory_mode if hcfg else "hybrid"
+            freq = hcfg.write_frequency if hcfg else "async"
+            recall_mode = hcfg.recall_mode if hcfg else "hybrid"
+            honcho_block = (
+                "# Honcho memory integration\n"
+                f"Active. Session: {self._honcho_session_key}. "
+                f"Mode: {mode}. Write frequency: {freq}. Recall: {recall_mode}.\n"
+            )
+            if recall_mode == "context":
+                honcho_block += (
+                    "Honcho context is injected into this system prompt below. "
+                    "All memory retrieval comes from this context — no Honcho tools "
+                    "are available. Answer questions about the user, prior sessions, "
+                    "and recent work directly from the Honcho Memory section.\n"
+                )
+            elif recall_mode == "tools":
+                honcho_block += (
+                    "Honcho tools:\n"
+                    "  honcho_context <question>           — ask Honcho a question, LLM-synthesized answer\n"
+                    "  honcho_search <query>                   — semantic search, raw excerpts, no LLM\n"
+                    "  honcho_profile                          — user's peer card, key facts, no LLM\n"
+                    "  honcho_conclude <conclusion>            — write a fact about the user to memory\n"
+                )
+            else:  # hybrid
+                honcho_block += (
+                    "Honcho context (user representation, peer card, and recent session summary) "
+                    "is injected into this system prompt below. Use it to answer continuity "
+                    "questions ('where were we?', 'what were we working on?') WITHOUT calling "
+                    "any tools. Only call Honcho tools when you need information beyond what is "
+                    "already present in the Honcho Memory section.\n"
+                    "Honcho tools:\n"
+                    "  honcho_context <question>           — ask Honcho a question, LLM-synthesized answer\n"
+                    "  honcho_search <query>                   — semantic search, raw excerpts, no LLM\n"
+                    "  honcho_profile                          — user's peer card, key facts, no LLM\n"
+                    "  honcho_conclude <conclusion>            — write a fact about the user to memory\n"
+                )
+            honcho_block += (
+                "Management commands (refer users here instead of explaining manually):\n"
+                "  hermes honcho status                    — show full config + connection\n"
+                "  hermes honcho mode [hybrid|honcho]       — show or set memory mode\n"
+                "  hermes honcho tokens [--context N] [--dialectic N] — show or set token budgets\n"
+                "  hermes honcho peer [--user NAME] [--ai NAME] [--reasoning LEVEL]\n"
+                "  hermes honcho sessions                  — list directory→session mappings\n"
+                "  hermes honcho map <name>                — map cwd to a session name\n"
+                "  hermes honcho identity [<file>] [--show] — seed or show AI peer identity\n"
+                "  hermes honcho migrate                   — migration guide from openclaw-honcho\n"
+                "  hermes honcho setup                     — full interactive wizard"
+            )
+            prompt_parts.append(honcho_block)
+
        # Note: ephemeral_system_prompt is NOT included here. It's injected at
        # API-call time only so it stays out of the cached/stored system prompt.
        if system_message is not None:
@ -2628,6 +2896,10 @@ class AIAgent:
            return
        if "memory" not in self.valid_tool_names or not self._memory_store:
            return
+        # honcho-only agent mode: skip local MEMORY.md flush
+        _hcfg = getattr(self, '_honcho_config', None)
+        if _hcfg and _hcfg.peer_memory_mode(_hcfg.ai_peer) == "honcho":
+            return
        effective_min = min_turns if min_turns is not None else self._memory_flush_min_turns
        if self._user_turn_count < effective_min:
            return
@ -3357,16 +3629,23 @@ class AIAgent:
            )
            self._iters_since_skill = 0

-        # Honcho prefetch: retrieve user context for system prompt injection.
-        # Only on the FIRST turn of a session (empty history).  On subsequent
-        # turns the model already has all prior context in its conversation
-        # history, and the Honcho context is baked into the stored system
-        # prompt — re-fetching it would change the system message and break
-        # Anthropic prompt caching.
+        # Honcho prefetch consumption:
+        # - First turn: bake into cached system prompt (stable for the session).
+        # - Later turns: inject as ephemeral system context for this API call only.
+        #
+        # This keeps the persisted/cached prompt stable while still allowing
+        # turn N to consume background prefetch results from turn N-1.
        self._honcho_context = ""
-        if self._honcho and self._honcho_session_key and not conversation_history:
+        self._honcho_turn_context = ""
+        _recall_mode = (self._honcho_config.recall_mode if self._honcho_config else "hybrid")
+        if self._honcho and self._honcho_session_key and _recall_mode != "tools":
            try:
-                self._honcho_context = self._honcho_prefetch(user_message)
+                prefetched_context = self._honcho_prefetch(user_message)
+                if prefetched_context:
+                    if not conversation_history:
+                        self._honcho_context = prefetched_context
+                    else:
+                        self._honcho_turn_context = prefetched_context
            except Exception as e:
                logger.debug("Honcho prefetch failed (non-fatal): %s", e)

@ -3551,15 +3830,12 @@ class AIAgent:
                api_messages.append(api_msg)

            # Build the final system message: cached prompt + ephemeral system prompt.
-            # The ephemeral part is appended here (not baked into the cached prompt)
-            # so it stays out of the session DB and logs.
-            # Note: Honcho context is baked into _cached_system_prompt on the first
-            # turn and stored in the session DB, so it does NOT need to be injected
-            # here.  This keeps the system message identical across all turns in a
-            # session, maximizing Anthropic prompt cache hits.
+            # Ephemeral additions are API-call-time only (not persisted to session DB).
            effective_system = active_system_prompt or ""
            if self.ephemeral_system_prompt:
                effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
+            if self._honcho_turn_context:
+                effective_system = (effective_system + "\n\n" + self._honcho_turn_context).strip()
            if effective_system:
                api_messages = [{"role": "system", "content": effective_system}] + api_messages

@ -4504,6 +4780,7 @@ class AIAgent:
                                    msg["content"] = f"Calling the {', '.join(tool_names)} tool{'s' if len(tool_names) > 1 else ''}..."
                                    break
                            final_response = self._strip_think_blocks(fallback).strip()
+                            self._response_was_previewed = True
                            break

                        # No fallback available — this is a genuine empty response.
@ -4546,6 +4823,7 @@ class AIAgent:
                                        break
                                # Strip <think> blocks from fallback content for user display
                                final_response = self._strip_think_blocks(fallback).strip()
+                                self._response_was_previewed = True
                                break
                            
                            # No fallback -- append the empty message as-is
@ -4687,6 +4965,7 @@ class AIAgent:
        # Sync conversation to Honcho for user modeling
        if final_response and not interrupted:
            self._honcho_sync(original_user_message, final_response)
+            self._queue_honcho_prefetch(original_user_message)

        # Extract reasoning from the last assistant message (if any)
        last_reasoning = None
@ -4704,7 +4983,9 @@ class AIAgent:
            "completed": completed,
            "partial": False,  # True only when stopped due to invalid tool calls
            "interrupted": interrupted,
+            "response_previewed": getattr(self, "_response_was_previewed", False),
        }
+        self._response_was_previewed = False
        
        # Include interrupt message if one triggered the interrupt
        if interrupted and self._interrupt_message:
--- a/tests/gateway/test_honcho_lifecycle.py
+++ b/tests/gateway/test_honcho_lifecycle.py
@ -0,0 +1,103 @@
+"""Tests for gateway-owned Honcho lifecycle helpers."""
+
+from types import SimpleNamespace
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import pytest
+
+from gateway.config import Platform
+from gateway.platforms.base import MessageEvent
+from gateway.session import SessionSource
+
+
+def _make_runner():
+    from gateway.run import GatewayRunner
+
+    runner = object.__new__(GatewayRunner)
+    runner._honcho_managers = {}
+    runner._honcho_configs = {}
+    runner._running_agents = {}
+    runner._pending_messages = {}
+    runner._pending_approvals = {}
+    runner.adapters = {}
+    runner.hooks = MagicMock()
+    runner.hooks.emit = AsyncMock()
+    return runner
+
+
+def _make_event(text="/reset"):
+    return MessageEvent(
+        text=text,
+        source=SessionSource(
+            platform=Platform.TELEGRAM,
+            chat_id="chat-1",
+            user_id="user-1",
+            user_name="alice",
+        ),
+    )
+
+
+class TestGatewayHonchoLifecycle:
+    def test_gateway_reuses_honcho_manager_for_session_key(self):
+        runner = _make_runner()
+        hcfg = SimpleNamespace(
+            enabled=True,
+            api_key="honcho-key",
+            ai_peer="hermes",
+            peer_name="alice",
+            context_tokens=123,
+            peer_memory_mode=lambda peer: "hybrid",
+        )
+        manager = MagicMock()
+
+        with (
+            patch("honcho_integration.client.HonchoClientConfig.from_global_config", return_value=hcfg),
+            patch("honcho_integration.client.get_honcho_client", return_value=MagicMock()),
+            patch("honcho_integration.session.HonchoSessionManager", return_value=manager) as mock_mgr_cls,
+        ):
+            first_mgr, first_cfg = runner._get_or_create_gateway_honcho("session-key")
+            second_mgr, second_cfg = runner._get_or_create_gateway_honcho("session-key")
+
+        assert first_mgr is manager
+        assert second_mgr is manager
+        assert first_cfg is hcfg
+        assert second_cfg is hcfg
+        mock_mgr_cls.assert_called_once()
+
+    def test_gateway_skips_honcho_manager_when_disabled(self):
+        runner = _make_runner()
+        hcfg = SimpleNamespace(
+            enabled=False,
+            api_key="honcho-key",
+            ai_peer="hermes",
+            peer_name="alice",
+        )
+
+        with (
+            patch("honcho_integration.client.HonchoClientConfig.from_global_config", return_value=hcfg),
+            patch("honcho_integration.client.get_honcho_client") as mock_client,
+            patch("honcho_integration.session.HonchoSessionManager") as mock_mgr_cls,
+        ):
+            manager, cfg = runner._get_or_create_gateway_honcho("session-key")
+
+        assert manager is None
+        assert cfg is hcfg
+        mock_client.assert_not_called()
+        mock_mgr_cls.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_reset_shuts_down_gateway_honcho_manager(self):
+        runner = _make_runner()
+        event = _make_event()
+        runner._shutdown_gateway_honcho = MagicMock()
+        runner.session_store = MagicMock()
+        runner.session_store._generate_session_key.return_value = "gateway-key"
+        runner.session_store._entries = {
+            "gateway-key": SimpleNamespace(session_id="old-session"),
+        }
+        runner.session_store.reset_session.return_value = SimpleNamespace(session_id="new-session")
+
+        result = await runner._handle_reset_command(event)
+
+        runner._shutdown_gateway_honcho.assert_called_once_with("gateway-key")
+        assert "Session reset" in result
--- a/tests/honcho_integration/test_async_memory.py
+++ b/tests/honcho_integration/test_async_memory.py
@ -0,0 +1,560 @@
+"""Tests for the async-memory Honcho improvements.
+
+Covers:
+  - write_frequency parsing (async / turn / session / int)
+  - memory_mode parsing
+  - resolve_session_name with session_title
+  - HonchoSessionManager.save() routing per write_frequency
+  - async writer thread lifecycle and retry
+  - flush_all() drains pending messages
+  - shutdown() joins the thread
+  - memory_mode gating helpers (unit-level)
+"""
+
+import json
+import queue
+import threading
+import time
+from pathlib import Path
+from unittest.mock import MagicMock, patch, call
+
+import pytest
+
+from honcho_integration.client import HonchoClientConfig
+from honcho_integration.session import (
+    HonchoSession,
+    HonchoSessionManager,
+    _ASYNC_SHUTDOWN,
+)
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _make_session(**kwargs) -> HonchoSession:
+    return HonchoSession(
+        key=kwargs.get("key", "cli:test"),
+        user_peer_id=kwargs.get("user_peer_id", "eri"),
+        assistant_peer_id=kwargs.get("assistant_peer_id", "hermes"),
+        honcho_session_id=kwargs.get("honcho_session_id", "cli-test"),
+        messages=kwargs.get("messages", []),
+    )
+
+
+def _make_manager(write_frequency="turn", memory_mode="hybrid") -> HonchoSessionManager:
+    cfg = HonchoClientConfig(
+        write_frequency=write_frequency,
+        memory_mode=memory_mode,
+        api_key="test-key",
+        enabled=True,
+    )
+    mgr = HonchoSessionManager(config=cfg)
+    mgr._honcho = MagicMock()
+    return mgr
+
+
+# ---------------------------------------------------------------------------
+# write_frequency parsing from config file
+# ---------------------------------------------------------------------------
+
+class TestWriteFrequencyParsing:
+    def test_string_async(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "writeFrequency": "async"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == "async"
+
+    def test_string_turn(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "writeFrequency": "turn"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == "turn"
+
+    def test_string_session(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "writeFrequency": "session"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == "session"
+
+    def test_integer_frequency(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "writeFrequency": 5}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == 5
+
+    def test_integer_string_coerced(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "writeFrequency": "3"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == 3
+
+    def test_host_block_overrides_root(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({
+            "apiKey": "k",
+            "writeFrequency": "turn",
+            "hosts": {"hermes": {"writeFrequency": "session"}},
+        }))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == "session"
+
+    def test_defaults_to_async(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.write_frequency == "async"
+
+
+# ---------------------------------------------------------------------------
+# memory_mode parsing from config file
+# ---------------------------------------------------------------------------
+
+class TestMemoryModeParsing:
+    def test_hybrid(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "memoryMode": "hybrid"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "hybrid"
+
+    def test_honcho_only(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k", "memoryMode": "honcho"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "honcho"
+
+    def test_defaults_to_hybrid(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({"apiKey": "k"}))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "hybrid"
+
+    def test_host_block_overrides_root(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({
+            "apiKey": "k",
+            "memoryMode": "hybrid",
+            "hosts": {"hermes": {"memoryMode": "honcho"}},
+        }))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "honcho"
+
+    def test_object_form_sets_default_and_overrides(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({
+            "apiKey": "k",
+            "hosts": {"hermes": {"memoryMode": {
+                "default": "hybrid",
+                "hermes": "honcho",
+            }}},
+        }))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "hybrid"
+        assert cfg.peer_memory_mode("hermes") == "honcho"
+        assert cfg.peer_memory_mode("unknown") == "hybrid"  # falls through to default
+
+    def test_object_form_no_default_falls_back_to_hybrid(self, tmp_path):
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({
+            "apiKey": "k",
+            "hosts": {"hermes": {"memoryMode": {"hermes": "honcho"}}},
+        }))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "hybrid"
+        assert cfg.peer_memory_mode("hermes") == "honcho"
+        assert cfg.peer_memory_mode("other") == "hybrid"
+
+    def test_global_string_host_object_override(self, tmp_path):
+        """Host object form overrides global string."""
+        cfg_file = tmp_path / "config.json"
+        cfg_file.write_text(json.dumps({
+            "apiKey": "k",
+            "memoryMode": "honcho",
+            "hosts": {"hermes": {"memoryMode": {"default": "hybrid", "hermes": "honcho"}}},
+        }))
+        cfg = HonchoClientConfig.from_global_config(config_path=cfg_file)
+        assert cfg.memory_mode == "hybrid"  # host default wins over global "honcho"
+        assert cfg.peer_memory_mode("hermes") == "honcho"
+
+
+# ---------------------------------------------------------------------------
+# resolve_session_name with session_title
+# ---------------------------------------------------------------------------
+
+class TestResolveSessionNameTitle:
+    def test_manual_override_beats_title(self):
+        cfg = HonchoClientConfig(sessions={"/my/project": "manual-name"})
+        result = cfg.resolve_session_name("/my/project", session_title="the-title")
+        assert result == "manual-name"
+
+    def test_title_beats_dirname(self):
+        cfg = HonchoClientConfig()
+        result = cfg.resolve_session_name("/some/dir", session_title="my-project")
+        assert result == "my-project"
+
+    def test_title_with_peer_prefix(self):
+        cfg = HonchoClientConfig(peer_name="eri", session_peer_prefix=True)
+        result = cfg.resolve_session_name("/some/dir", session_title="aeris")
+        assert result == "eri-aeris"
+
+    def test_title_sanitized(self):
+        cfg = HonchoClientConfig()
+        result = cfg.resolve_session_name("/some/dir", session_title="my project/name!")
+        # trailing dashes stripped by .strip('-')
+        assert result == "my-project-name"
+
+    def test_title_all_invalid_chars_falls_back_to_dirname(self):
+        cfg = HonchoClientConfig()
+        result = cfg.resolve_session_name("/some/dir", session_title="!!! ###")
+        # sanitized to empty → falls back to dirname
+        assert result == "dir"
+
+    def test_none_title_falls_back_to_dirname(self):
+        cfg = HonchoClientConfig()
+        result = cfg.resolve_session_name("/some/dir", session_title=None)
+        assert result == "dir"
+
+    def test_empty_title_falls_back_to_dirname(self):
+        cfg = HonchoClientConfig()
+        result = cfg.resolve_session_name("/some/dir", session_title="")
+        assert result == "dir"
+
+    def test_per_session_uses_session_id(self):
+        cfg = HonchoClientConfig(session_strategy="per-session")
+        result = cfg.resolve_session_name("/some/dir", session_id="20260309_175514_9797dd")
+        assert result == "20260309_175514_9797dd"
+
+    def test_per_session_with_peer_prefix(self):
+        cfg = HonchoClientConfig(session_strategy="per-session", peer_name="eri", session_peer_prefix=True)
+        result = cfg.resolve_session_name("/some/dir", session_id="20260309_175514_9797dd")
+        assert result == "eri-20260309_175514_9797dd"
+
+    def test_per_session_no_id_falls_back_to_dirname(self):
+        cfg = HonchoClientConfig(session_strategy="per-session")
+        result = cfg.resolve_session_name("/some/dir", session_id=None)
+        assert result == "dir"
+
+    def test_title_beats_session_id(self):
+        cfg = HonchoClientConfig(session_strategy="per-session")
+        result = cfg.resolve_session_name("/some/dir", session_title="my-title", session_id="20260309_175514_9797dd")
+        assert result == "my-title"
+
+    def test_manual_beats_session_id(self):
+        cfg = HonchoClientConfig(session_strategy="per-session", sessions={"/some/dir": "pinned"})
+        result = cfg.resolve_session_name("/some/dir", session_id="20260309_175514_9797dd")
+        assert result == "pinned"
+
+    def test_global_strategy_returns_workspace(self):
+        cfg = HonchoClientConfig(session_strategy="global", workspace_id="my-workspace")
+        result = cfg.resolve_session_name("/some/dir")
+        assert result == "my-workspace"
+
+
+# ---------------------------------------------------------------------------
+# save() routing per write_frequency
+# ---------------------------------------------------------------------------
+
+class TestSaveRouting:
+    def _make_session_with_message(self, mgr=None):
+        sess = _make_session()
+        sess.add_message("user", "hello")
+        sess.add_message("assistant", "hi")
+        if mgr:
+            mgr._cache[sess.key] = sess
+        return sess
+
+    def test_turn_flushes_immediately(self):
+        mgr = _make_manager(write_frequency="turn")
+        sess = self._make_session_with_message(mgr)
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.save(sess)
+            mock_flush.assert_called_once_with(sess)
+
+    def test_session_mode_does_not_flush(self):
+        mgr = _make_manager(write_frequency="session")
+        sess = self._make_session_with_message(mgr)
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.save(sess)
+            mock_flush.assert_not_called()
+
+    def test_async_mode_enqueues(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = self._make_session_with_message(mgr)
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.save(sess)
+            # flush_session should NOT be called synchronously
+            mock_flush.assert_not_called()
+        assert not mgr._async_queue.empty()
+
+    def test_int_frequency_flushes_on_nth_turn(self):
+        mgr = _make_manager(write_frequency=3)
+        sess = self._make_session_with_message(mgr)
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.save(sess)  # turn 1
+            mgr.save(sess)  # turn 2
+            assert mock_flush.call_count == 0
+            mgr.save(sess)  # turn 3
+            assert mock_flush.call_count == 1
+
+    def test_int_frequency_skips_other_turns(self):
+        mgr = _make_manager(write_frequency=5)
+        sess = self._make_session_with_message(mgr)
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            for _ in range(4):
+                mgr.save(sess)
+            assert mock_flush.call_count == 0
+            mgr.save(sess)  # turn 5
+            assert mock_flush.call_count == 1
+
+
+# ---------------------------------------------------------------------------
+# flush_all()
+# ---------------------------------------------------------------------------
+
+class TestFlushAll:
+    def test_flushes_all_cached_sessions(self):
+        mgr = _make_manager(write_frequency="session")
+        s1 = _make_session(key="s1", honcho_session_id="s1")
+        s2 = _make_session(key="s2", honcho_session_id="s2")
+        s1.add_message("user", "a")
+        s2.add_message("user", "b")
+        mgr._cache = {"s1": s1, "s2": s2}
+
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.flush_all()
+            assert mock_flush.call_count == 2
+
+    def test_flush_all_drains_async_queue(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = _make_session()
+        sess.add_message("user", "pending")
+        mgr._async_queue.put(sess)
+
+        with patch.object(mgr, "_flush_session") as mock_flush:
+            mgr.flush_all()
+            # Called at least once for the queued item
+            assert mock_flush.call_count >= 1
+
+    def test_flush_all_tolerates_errors(self):
+        mgr = _make_manager(write_frequency="session")
+        sess = _make_session()
+        mgr._cache = {"key": sess}
+        with patch.object(mgr, "_flush_session", side_effect=RuntimeError("oops")):
+            # Should not raise
+            mgr.flush_all()
+
+
+# ---------------------------------------------------------------------------
+# async writer thread lifecycle
+# ---------------------------------------------------------------------------
+
+class TestAsyncWriterThread:
+    def test_thread_started_on_async_mode(self):
+        mgr = _make_manager(write_frequency="async")
+        assert mgr._async_thread is not None
+        assert mgr._async_thread.is_alive()
+        mgr.shutdown()
+
+    def test_no_thread_for_turn_mode(self):
+        mgr = _make_manager(write_frequency="turn")
+        assert mgr._async_thread is None
+        assert mgr._async_queue is None
+
+    def test_shutdown_joins_thread(self):
+        mgr = _make_manager(write_frequency="async")
+        assert mgr._async_thread.is_alive()
+        mgr.shutdown()
+        assert not mgr._async_thread.is_alive()
+
+    def test_async_writer_calls_flush(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = _make_session()
+        sess.add_message("user", "async msg")
+
+        flushed = []
+
+        def capture(s):
+            flushed.append(s)
+            return True
+
+        mgr._flush_session = capture
+        mgr._async_queue.put(sess)
+        # Give the daemon thread time to process
+        deadline = time.time() + 2.0
+        while not flushed and time.time() < deadline:
+            time.sleep(0.05)
+
+        mgr.shutdown()
+        assert len(flushed) == 1
+        assert flushed[0] is sess
+
+    def test_shutdown_sentinel_stops_loop(self):
+        mgr = _make_manager(write_frequency="async")
+        thread = mgr._async_thread
+        mgr.shutdown()
+        thread.join(timeout=3)
+        assert not thread.is_alive()
+
+
+# ---------------------------------------------------------------------------
+# async retry on failure
+# ---------------------------------------------------------------------------
+
+class TestAsyncWriterRetry:
+    def test_retries_once_on_failure(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = _make_session()
+        sess.add_message("user", "msg")
+
+        call_count = [0]
+
+        def flaky_flush(s):
+            call_count[0] += 1
+            if call_count[0] == 1:
+                raise ConnectionError("network blip")
+            # second call succeeds silently
+
+        mgr._flush_session = flaky_flush
+
+        with patch("time.sleep"):  # skip the 2s sleep in retry
+            mgr._async_queue.put(sess)
+            deadline = time.time() + 3.0
+            while call_count[0] < 2 and time.time() < deadline:
+                time.sleep(0.05)
+
+        mgr.shutdown()
+        assert call_count[0] == 2
+
+    def test_drops_after_two_failures(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = _make_session()
+        sess.add_message("user", "msg")
+
+        call_count = [0]
+
+        def always_fail(s):
+            call_count[0] += 1
+            raise RuntimeError("always broken")
+
+        mgr._flush_session = always_fail
+
+        with patch("time.sleep"):
+            mgr._async_queue.put(sess)
+            deadline = time.time() + 3.0
+            while call_count[0] < 2 and time.time() < deadline:
+                time.sleep(0.05)
+
+        mgr.shutdown()
+        # Should have tried exactly twice (initial + one retry) and not crashed
+        assert call_count[0] == 2
+        assert not mgr._async_thread.is_alive()
+
+    def test_retries_when_flush_reports_failure(self):
+        mgr = _make_manager(write_frequency="async")
+        sess = _make_session()
+        sess.add_message("user", "msg")
+
+        call_count = [0]
+
+        def fail_then_succeed(_session):
+            call_count[0] += 1
+            return call_count[0] > 1
+
+        mgr._flush_session = fail_then_succeed
+
+        with patch("time.sleep"):
+            mgr._async_queue.put(sess)
+            deadline = time.time() + 3.0
+            while call_count[0] < 2 and time.time() < deadline:
+                time.sleep(0.05)
+
+        mgr.shutdown()
+        assert call_count[0] == 2
+
+
+class TestMemoryFileMigrationTargets:
+    def test_soul_upload_targets_ai_peer(self, tmp_path):
+        mgr = _make_manager(write_frequency="turn")
+        session = _make_session(
+            key="cli:test",
+            user_peer_id="custom-user",
+            assistant_peer_id="custom-ai",
+            honcho_session_id="cli-test",
+        )
+        mgr._cache[session.key] = session
+
+        user_peer = MagicMock(name="user-peer")
+        ai_peer = MagicMock(name="ai-peer")
+        mgr._peers_cache[session.user_peer_id] = user_peer
+        mgr._peers_cache[session.assistant_peer_id] = ai_peer
+
+        honcho_session = MagicMock()
+        mgr._sessions_cache[session.honcho_session_id] = honcho_session
+
+        (tmp_path / "MEMORY.md").write_text("memory facts", encoding="utf-8")
+        (tmp_path / "USER.md").write_text("user profile", encoding="utf-8")
+        (tmp_path / "SOUL.md").write_text("ai identity", encoding="utf-8")
+
+        uploaded = mgr.migrate_memory_files(session.key, str(tmp_path))
+
+        assert uploaded is True
+        assert honcho_session.upload_file.call_count == 3
+
+        peer_by_upload_name = {}
+        for call_args in honcho_session.upload_file.call_args_list:
+            payload = call_args.kwargs["file"]
+            peer_by_upload_name[payload[0]] = call_args.kwargs["peer"]
+
+        assert peer_by_upload_name["consolidated_memory.md"] is user_peer
+        assert peer_by_upload_name["user_profile.md"] is user_peer
+        assert peer_by_upload_name["agent_soul.md"] is ai_peer
+
+
+# ---------------------------------------------------------------------------
+# HonchoClientConfig dataclass defaults for new fields
+# ---------------------------------------------------------------------------
+
+class TestNewConfigFieldDefaults:
+    def test_write_frequency_default(self):
+        cfg = HonchoClientConfig()
+        assert cfg.write_frequency == "async"
+
+    def test_memory_mode_default(self):
+        cfg = HonchoClientConfig()
+        assert cfg.memory_mode == "hybrid"
+
+    def test_write_frequency_set(self):
+        cfg = HonchoClientConfig(write_frequency="turn")
+        assert cfg.write_frequency == "turn"
+
+    def test_memory_mode_set(self):
+        cfg = HonchoClientConfig(memory_mode="honcho")
+        assert cfg.memory_mode == "honcho"
+
+    def test_peer_memory_mode_falls_back_to_global(self):
+        cfg = HonchoClientConfig(memory_mode="honcho")
+        assert cfg.peer_memory_mode("any-peer") == "honcho"
+
+    def test_peer_memory_mode_override(self):
+        cfg = HonchoClientConfig(memory_mode="hybrid", peer_memory_modes={"hermes": "honcho"})
+        assert cfg.peer_memory_mode("hermes") == "honcho"
+        assert cfg.peer_memory_mode("other") == "hybrid"
+
+
+class TestPrefetchCacheAccessors:
+    def test_set_and_pop_context_result(self):
+        mgr = _make_manager(write_frequency="turn")
+        payload = {"representation": "Known user", "card": "prefers concise replies"}
+
+        mgr.set_context_result("cli:test", payload)
+
+        assert mgr.pop_context_result("cli:test") == payload
+        assert mgr.pop_context_result("cli:test") == {}
+
+    def test_set_and_pop_dialectic_result(self):
+        mgr = _make_manager(write_frequency="turn")
+
+        mgr.set_dialectic_result("cli:test", "Resume with toolset cleanup")
+
+        assert mgr.pop_dialectic_result("cli:test") == "Resume with toolset cleanup"
+        assert mgr.pop_dialectic_result("cli:test") == ""
--- a/tests/honcho_integration/test_cli.py
+++ b/tests/honcho_integration/test_cli.py
@ -0,0 +1,29 @@
+"""Tests for Honcho CLI helpers."""
+
+from honcho_integration.cli import _resolve_api_key
+
+
+class TestResolveApiKey:
+    def test_prefers_host_scoped_key(self):
+        cfg = {
+            "apiKey": "root-key",
+            "hosts": {
+                "hermes": {
+                    "apiKey": "host-key",
+                }
+            },
+        }
+        assert _resolve_api_key(cfg) == "host-key"
+
+    def test_falls_back_to_root_key(self):
+        cfg = {
+            "apiKey": "root-key",
+            "hosts": {"hermes": {}},
+        }
+        assert _resolve_api_key(cfg) == "root-key"
+
+    def test_falls_back_to_env_key(self, monkeypatch):
+        monkeypatch.setenv("HONCHO_API_KEY", "env-key")
+        assert _resolve_api_key({}) == "env-key"
+        monkeypatch.delenv("HONCHO_API_KEY", raising=False)
+
--- a/tests/honcho_integration/test_client.py
+++ b/tests/honcho_integration/test_client.py
@ -25,7 +25,8 @@ class TestHonchoClientConfigDefaults:
        assert config.environment == "production"
        assert config.enabled is False
        assert config.save_messages is True
-        assert config.session_strategy == "per-directory"
+        assert config.session_strategy == "per-session"
+        assert config.recall_mode == "hybrid"
        assert config.session_peer_prefix is False
        assert config.linked_hosts == []
        assert config.sessions == {}
@ -134,6 +135,41 @@ class TestFromGlobalConfig:
        assert config.workspace_id == "root-ws"
        assert config.ai_peer == "root-ai"

+    def test_session_strategy_default_from_global_config(self, tmp_path):
+        """from_global_config with no sessionStrategy should match dataclass default."""
+        config_file = tmp_path / "config.json"
+        config_file.write_text(json.dumps({"apiKey": "key"}))
+        config = HonchoClientConfig.from_global_config(config_path=config_file)
+        assert config.session_strategy == "per-session"
+
+    def test_context_tokens_host_block_wins(self, tmp_path):
+        """Host block contextTokens should override root."""
+        config_file = tmp_path / "config.json"
+        config_file.write_text(json.dumps({
+            "apiKey": "key",
+            "contextTokens": 1000,
+            "hosts": {"hermes": {"contextTokens": 2000}},
+        }))
+        config = HonchoClientConfig.from_global_config(config_path=config_file)
+        assert config.context_tokens == 2000
+
+    def test_recall_mode_from_config(self, tmp_path):
+        """recallMode is read from config, host block wins."""
+        config_file = tmp_path / "config.json"
+        config_file.write_text(json.dumps({
+            "apiKey": "key",
+            "recallMode": "tools",
+            "hosts": {"hermes": {"recallMode": "context"}},
+        }))
+        config = HonchoClientConfig.from_global_config(config_path=config_file)
+        assert config.recall_mode == "context"
+
+    def test_recall_mode_default(self, tmp_path):
+        config_file = tmp_path / "config.json"
+        config_file.write_text(json.dumps({"apiKey": "key"}))
+        config = HonchoClientConfig.from_global_config(config_path=config_file)
+        assert config.recall_mode == "hybrid"
+
    def test_corrupt_config_falls_back_to_env(self, tmp_path):
        config_file = tmp_path / "config.json"
        config_file.write_text("not valid json{{{")
@ -177,6 +213,40 @@ class TestResolveSessionName:
        # Should use os.getcwd() basename
        assert result == Path.cwd().name

+    def test_per_repo_uses_git_root(self):
+        config = HonchoClientConfig(session_strategy="per-repo")
+        with patch.object(
+            HonchoClientConfig, "_git_repo_name", return_value="hermes-agent"
+        ):
+            result = config.resolve_session_name("/home/user/hermes-agent/subdir")
+        assert result == "hermes-agent"
+
+    def test_per_repo_with_peer_prefix(self):
+        config = HonchoClientConfig(
+            session_strategy="per-repo", peer_name="eri", session_peer_prefix=True
+        )
+        with patch.object(
+            HonchoClientConfig, "_git_repo_name", return_value="groudon"
+        ):
+            result = config.resolve_session_name("/home/user/groudon/src")
+        assert result == "eri-groudon"
+
+    def test_per_repo_falls_back_to_dirname_outside_git(self):
+        config = HonchoClientConfig(session_strategy="per-repo")
+        with patch.object(
+            HonchoClientConfig, "_git_repo_name", return_value=None
+        ):
+            result = config.resolve_session_name("/home/user/not-a-repo")
+        assert result == "not-a-repo"
+
+    def test_per_repo_manual_override_still_wins(self):
+        config = HonchoClientConfig(
+            session_strategy="per-repo",
+            sessions={"/home/user/proj": "custom-session"},
+        )
+        result = config.resolve_session_name("/home/user/proj")
+        assert result == "custom-session"
+

 class TestGetLinkedWorkspaces:
    def test_resolves_linked_hosts(self):
--- a/tests/test_real_interrupt_subagent.py
+++ b/tests/test_real_interrupt_subagent.py
@ -93,8 +93,8 @@ class TestRealSubagentInterrupt(unittest.TestCase):
                    mock_client.close = MagicMock()
                    MockOpenAI.return_value = mock_client

-                    # Also need to patch the system prompt builder
-                    with patch('run_agent.build_system_prompt', return_value="You are a test agent"):
+                    # Patch the instance method so it skips prompt assembly
+                    with patch.object(AIAgent, '_build_system_prompt', return_value="You are a test agent"):
                        # Signal when child starts
                        original_run = AIAgent.run_conversation

--- a/tests/test_run_agent.py
+++ b/tests/test_run_agent.py
@ -13,6 +13,7 @@ from unittest.mock import MagicMock, patch, PropertyMock

 import pytest

+from honcho_integration.client import HonchoClientConfig
 from run_agent import AIAgent
 from agent.prompt_builder import DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS

@ -1209,17 +1210,15 @@ class TestSystemPromptStability:

        assert "User prefers Python over JavaScript" in agent._cached_system_prompt

-    def test_honcho_prefetch_skipped_on_continuing_session(self):
-        """Honcho prefetch should not be called when conversation_history
-        is non-empty (continuing session)."""
+    def test_honcho_prefetch_runs_on_continuing_session(self):
+        """Honcho prefetch is consumed on continuing sessions via ephemeral context."""
        conversation_history = [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": "hi there"},
        ]
-
-        # The guard: `not conversation_history` is False when history exists
-        should_prefetch = not conversation_history
-        assert should_prefetch is False
+        recall_mode = "hybrid"
+        should_prefetch = bool(conversation_history) and recall_mode != "tools"
+        assert should_prefetch is True

    def test_honcho_prefetch_runs_on_first_turn(self):
        """Honcho prefetch should run when conversation_history is empty."""
@ -1228,6 +1227,190 @@ class TestSystemPromptStability:
        assert should_prefetch is True


+class TestHonchoActivation:
+    def test_disabled_config_skips_honcho_init(self):
+        hcfg = HonchoClientConfig(
+            enabled=False,
+            api_key="honcho-key",
+            peer_name="user",
+            ai_peer="hermes",
+        )
+
+        with (
+            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
+            patch("run_agent.check_toolset_requirements", return_value={}),
+            patch("run_agent.OpenAI"),
+            patch("honcho_integration.client.HonchoClientConfig.from_global_config", return_value=hcfg),
+            patch("honcho_integration.client.get_honcho_client") as mock_client,
+        ):
+            agent = AIAgent(
+                api_key="test-key-1234567890",
+                quiet_mode=True,
+                skip_context_files=True,
+                skip_memory=False,
+            )
+
+        assert agent._honcho is None
+        assert agent._honcho_config is hcfg
+        mock_client.assert_not_called()
+
+    def test_injected_honcho_manager_skips_fresh_client_init(self):
+        hcfg = HonchoClientConfig(
+            enabled=True,
+            api_key="honcho-key",
+            memory_mode="hybrid",
+            peer_name="user",
+            ai_peer="hermes",
+            recall_mode="hybrid",
+        )
+        manager = MagicMock()
+        manager._config = hcfg
+        manager.get_or_create.return_value = SimpleNamespace(messages=[])
+        manager.get_prefetch_context.return_value = {"representation": "Known user", "card": ""}
+
+        with (
+            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
+            patch("run_agent.check_toolset_requirements", return_value={}),
+            patch("run_agent.OpenAI"),
+            patch("honcho_integration.client.get_honcho_client") as mock_client,
+            patch("tools.honcho_tools.set_session_context"),
+        ):
+            agent = AIAgent(
+                api_key="test-key-1234567890",
+                quiet_mode=True,
+                skip_context_files=True,
+                skip_memory=False,
+                honcho_session_key="gateway-session",
+                honcho_manager=manager,
+                honcho_config=hcfg,
+            )
+
+        assert agent._honcho is manager
+        manager.get_or_create.assert_called_once_with("gateway-session")
+        manager.get_prefetch_context.assert_called_once_with("gateway-session")
+        manager.set_context_result.assert_called_once_with(
+            "gateway-session",
+            {"representation": "Known user", "card": ""},
+        )
+        mock_client.assert_not_called()
+
+    def test_recall_mode_context_suppresses_honcho_tools(self):
+        hcfg = HonchoClientConfig(
+            enabled=True,
+            api_key="honcho-key",
+            memory_mode="hybrid",
+            peer_name="user",
+            ai_peer="hermes",
+            recall_mode="context",
+        )
+        manager = MagicMock()
+        manager._config = hcfg
+        manager.get_or_create.return_value = SimpleNamespace(messages=[])
+        manager.get_prefetch_context.return_value = {"representation": "Known user", "card": ""}
+
+        with (
+            patch(
+                "run_agent.get_tool_definitions",
+                side_effect=[
+                    _make_tool_defs("web_search"),
+                    _make_tool_defs(
+                        "web_search",
+                        "honcho_context",
+                        "honcho_profile",
+                        "honcho_search",
+                        "honcho_conclude",
+                    ),
+                ],
+            ),
+            patch("run_agent.check_toolset_requirements", return_value={}),
+            patch("run_agent.OpenAI"),
+            patch("tools.honcho_tools.set_session_context"),
+        ):
+            agent = AIAgent(
+                api_key="test-key-1234567890",
+                quiet_mode=True,
+                skip_context_files=True,
+                skip_memory=False,
+                honcho_session_key="gateway-session",
+                honcho_manager=manager,
+                honcho_config=hcfg,
+            )
+
+        assert "web_search" in agent.valid_tool_names
+        assert "honcho_context" not in agent.valid_tool_names
+        assert "honcho_profile" not in agent.valid_tool_names
+        assert "honcho_search" not in agent.valid_tool_names
+        assert "honcho_conclude" not in agent.valid_tool_names
+
+    def test_inactive_honcho_strips_stale_honcho_tools(self):
+        hcfg = HonchoClientConfig(
+            enabled=False,
+            api_key="honcho-key",
+            peer_name="user",
+            ai_peer="hermes",
+        )
+
+        with (
+            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search", "honcho_context")),
+            patch("run_agent.check_toolset_requirements", return_value={}),
+            patch("run_agent.OpenAI"),
+            patch("honcho_integration.client.HonchoClientConfig.from_global_config", return_value=hcfg),
+            patch("honcho_integration.client.get_honcho_client") as mock_client,
+        ):
+            agent = AIAgent(
+                api_key="test-key-1234567890",
+                quiet_mode=True,
+                skip_context_files=True,
+                skip_memory=False,
+            )
+
+        assert agent._honcho is None
+        assert "web_search" in agent.valid_tool_names
+        assert "honcho_context" not in agent.valid_tool_names
+        mock_client.assert_not_called()
+
+
+class TestHonchoPrefetchScheduling:
+    def test_honcho_prefetch_includes_cached_dialectic(self, agent):
+        agent._honcho = MagicMock()
+        agent._honcho_session_key = "session-key"
+        agent._honcho.pop_context_result.return_value = {}
+        agent._honcho.pop_dialectic_result.return_value = "Continue with the migration checklist."
+
+        context = agent._honcho_prefetch("what next?")
+
+        assert "Continuity synthesis" in context
+        assert "migration checklist" in context
+
+    def test_queue_honcho_prefetch_skips_tools_mode(self, agent):
+        agent._honcho = MagicMock()
+        agent._honcho_session_key = "session-key"
+        agent._honcho_config = HonchoClientConfig(
+            enabled=True,
+            api_key="honcho-key",
+            recall_mode="tools",
+        )
+
+        agent._queue_honcho_prefetch("what next?")
+
+        agent._honcho.prefetch_context.assert_not_called()
+        agent._honcho.prefetch_dialectic.assert_not_called()
+
+    def test_queue_honcho_prefetch_runs_when_context_enabled(self, agent):
+        agent._honcho = MagicMock()
+        agent._honcho_session_key = "session-key"
+        agent._honcho_config = HonchoClientConfig(
+            enabled=True,
+            api_key="honcho-key",
+            recall_mode="hybrid",
+        )
+
+        agent._queue_honcho_prefetch("what next?")
+
+        agent._honcho.prefetch_context.assert_called_once_with("session-key", "what next?")
+        agent._honcho.prefetch_dialectic.assert_called_once_with("session-key", "what next?")
+
+
 # ---------------------------------------------------------------------------
 # Iteration budget pressure warnings
 # ---------------------------------------------------------------------------
--- a/tools/honcho_tools.py
+++ b/tools/honcho_tools.py
@ -1,8 +1,16 @@
-"""Honcho tool for querying user context via dialectic reasoning.
+"""Honcho tools for user context retrieval.

-Registers ``query_user_context`` -- an LLM-callable tool that asks Honcho
-about the current user's history, preferences, goals, and communication
-style. The session key is injected at runtime by the agent loop via
+Registers three complementary tools, ordered by capability:
+
+  honcho_context   — dialectic Q&A (LLM-powered, direct answers)
+  honcho_search        — semantic search (fast, no LLM, raw excerpts)
+  honcho_profile       — peer card (fast, no LLM, structured facts)
+
+Use honcho_context when you need Honcho to synthesize an answer.
+Use honcho_search or honcho_profile when you want raw data to reason
+over yourself.
+
+The session key is injected at runtime by the agent loop via
 ``set_session_context()``.
 """

@ -34,54 +42,6 @@ def clear_session_context() -> None:
    _session_key = None


-# ── Tool schema ──
-
-HONCHO_TOOL_SCHEMA = {
-    "name": "query_user_context",
-    "description": (
-        "Query Honcho to retrieve relevant context about the user based on their "
-        "history and preferences. Use this when you need to understand the user's "
-        "background, preferences, past interactions, or goals. This helps you "
-        "personalize your responses and provide more relevant assistance."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {
-                "type": "string",
-                "description": (
-                    "A natural language question about the user. Examples: "
-                    "'What are this user's main goals?', "
-                    "'What communication style does this user prefer?', "
-                    "'What topics has this user discussed recently?', "
-                    "'What is this user's technical expertise level?'"
-                ),
-            }
-        },
-        "required": ["query"],
-    },
-}
-
-
-# ── Tool handler ──
-
-def _handle_query_user_context(args: dict, **kw) -> str:
-    """Execute the Honcho context query."""
-    query = args.get("query", "")
-    if not query:
-        return json.dumps({"error": "Missing required parameter: query"})
-
-    if not _session_manager or not _session_key:
-        return json.dumps({"error": "Honcho is not active for this session."})
-
-    try:
-        result = _session_manager.get_user_context(_session_key, query)
-        return json.dumps({"result": result})
-    except Exception as e:
-        logger.error("Error querying Honcho user context: %s", e)
-        return json.dumps({"error": f"Failed to query user context: {e}"})
-
-
 # ── Availability check ──

 def _check_honcho_available() -> bool:
@ -89,14 +49,201 @@ def _check_honcho_available() -> bool:
    return _session_manager is not None and _session_key is not None


+# ── honcho_profile ──
+
+_PROFILE_SCHEMA = {
+    "name": "honcho_profile",
+    "description": (
+        "Retrieve the user's peer card from Honcho — a curated list of key facts "
+        "about them (name, role, preferences, communication style, patterns). "
+        "Fast, no LLM reasoning, minimal cost. "
+        "Use this at conversation start or when you need a quick factual snapshot. "
+        "Use honcho_context instead when you need Honcho to synthesize an answer."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {},
+        "required": [],
+    },
+}
+
+
+def _handle_honcho_profile(args: dict, **kw) -> str:
+    if not _session_manager or not _session_key:
+        return json.dumps({"error": "Honcho is not active for this session."})
+    try:
+        card = _session_manager.get_peer_card(_session_key)
+        if not card:
+            return json.dumps({"result": "No profile facts available yet. The user's profile builds over time through conversations."})
+        return json.dumps({"result": card})
+    except Exception as e:
+        logger.error("Error fetching Honcho peer card: %s", e)
+        return json.dumps({"error": f"Failed to fetch profile: {e}"})
+
+
+# ── honcho_search ──
+
+_SEARCH_SCHEMA = {
+    "name": "honcho_search",
+    "description": (
+        "Semantic search over Honcho's stored context about the user. "
+        "Returns raw excerpts ranked by relevance to your query — no LLM synthesis. "
+        "Cheaper and faster than honcho_context. "
+        "Good when you want to find specific past facts and reason over them yourself. "
+        "Use honcho_context when you need a direct synthesized answer."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "What to search for in Honcho's memory (e.g. 'programming languages', 'past projects', 'timezone').",
+            },
+            "max_tokens": {
+                "type": "integer",
+                "description": "Token budget for returned context (default 800, max 2000).",
+            },
+        },
+        "required": ["query"],
+    },
+}
+
+
+def _handle_honcho_search(args: dict, **kw) -> str:
+    query = args.get("query", "")
+    if not query:
+        return json.dumps({"error": "Missing required parameter: query"})
+    if not _session_manager or not _session_key:
+        return json.dumps({"error": "Honcho is not active for this session."})
+    max_tokens = min(int(args.get("max_tokens", 800)), 2000)
+    try:
+        result = _session_manager.search_context(_session_key, query, max_tokens=max_tokens)
+        if not result:
+            return json.dumps({"result": "No relevant context found."})
+        return json.dumps({"result": result})
+    except Exception as e:
+        logger.error("Error searching Honcho context: %s", e)
+        return json.dumps({"error": f"Failed to search context: {e}"})
+
+
+# ── honcho_context (dialectic — LLM-powered) ──
+
+_QUERY_SCHEMA = {
+    "name": "honcho_context",
+    "description": (
+        "Ask Honcho a natural language question and get a synthesized answer. "
+        "Uses Honcho's LLM (dialectic reasoning) — higher cost than honcho_profile or honcho_search. "
+        "Can query about any peer: the user (default), the AI assistant, or any named peer. "
+        "Examples: 'What are the user's main goals?', 'What has hermes been working on?', "
+        "'What is the user's technical expertise level?'"
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "A natural language question.",
+            },
+            "peer": {
+                "type": "string",
+                "description": "Which peer to query about: 'user' (default) or 'ai'. Omit for user.",
+            },
+        },
+        "required": ["query"],
+    },
+}
+
+
+def _handle_honcho_context(args: dict, **kw) -> str:
+    query = args.get("query", "")
+    if not query:
+        return json.dumps({"error": "Missing required parameter: query"})
+    if not _session_manager or not _session_key:
+        return json.dumps({"error": "Honcho is not active for this session."})
+    peer_target = args.get("peer", "user")
+    try:
+        result = _session_manager.dialectic_query(_session_key, query, peer=peer_target)
+        return json.dumps({"result": result or "No result from Honcho."})
+    except Exception as e:
+        logger.error("Error querying Honcho context: %s", e)
+        return json.dumps({"error": f"Failed to query context: {e}"})
+
+
+# ── honcho_conclude ──
+
+_CONCLUDE_SCHEMA = {
+    "name": "honcho_conclude",
+    "description": (
+        "Write a conclusion about the user back to Honcho's memory. "
+        "Conclusions are persistent facts that build the user's profile — "
+        "preferences, corrections, clarifications, project context, or anything "
+        "the user tells you that should be remembered across sessions. "
+        "Use this when the user explicitly states a preference, corrects you, "
+        "or shares something they want remembered. "
+        "Examples: 'User prefers dark mode', 'User's project uses Python 3.11', "
+        "'User corrected: their name is spelled Eri not Eric'."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "conclusion": {
+                "type": "string",
+                "description": "A factual statement about the user to persist in memory.",
+            }
+        },
+        "required": ["conclusion"],
+    },
+}
+
+
+def _handle_honcho_conclude(args: dict, **kw) -> str:
+    conclusion = args.get("conclusion", "")
+    if not conclusion:
+        return json.dumps({"error": "Missing required parameter: conclusion"})
+    if not _session_manager or not _session_key:
+        return json.dumps({"error": "Honcho is not active for this session."})
+    try:
+        ok = _session_manager.create_conclusion(_session_key, conclusion)
+        if ok:
+            return json.dumps({"result": f"Conclusion saved: {conclusion}"})
+        return json.dumps({"error": "Failed to save conclusion."})
+    except Exception as e:
+        logger.error("Error creating Honcho conclusion: %s", e)
+        return json.dumps({"error": f"Failed to save conclusion: {e}"})
+
+
 # ── Registration ──

 from tools.registry import registry

 registry.register(
-    name="query_user_context",
+    name="honcho_profile",
    toolset="honcho",
-    schema=HONCHO_TOOL_SCHEMA,
-    handler=_handle_query_user_context,
+    schema=_PROFILE_SCHEMA,
+    handler=_handle_honcho_profile,
+    check_fn=_check_honcho_available,
+)
+
+registry.register(
+    name="honcho_search",
+    toolset="honcho",
+    schema=_SEARCH_SCHEMA,
+    handler=_handle_honcho_search,
+    check_fn=_check_honcho_available,
+)
+
+registry.register(
+    name="honcho_context",
+    toolset="honcho",
+    schema=_QUERY_SCHEMA,
+    handler=_handle_honcho_context,
+    check_fn=_check_honcho_available,
+)
+
+registry.register(
+    name="honcho_conclude",
+    toolset="honcho",
+    schema=_CONCLUDE_SCHEMA,
+    handler=_handle_honcho_conclude,
    check_fn=_check_honcho_available,
 )
--- a/toolsets.py
+++ b/toolsets.py
@ -60,8 +60,8 @@ _HERMES_CORE_TOOLS = [
    "schedule_cronjob", "list_cronjobs", "remove_cronjob",
    # Cross-platform messaging (gated on gateway running via check_fn)
    "send_message",
-    # Honcho user context (gated on honcho being active via check_fn)
-    "query_user_context",
+    # Honcho memory tools (gated on honcho being active via check_fn)
+    "honcho_context", "honcho_profile", "honcho_search", "honcho_conclude",
    # Home Assistant smart home control (gated on HASS_TOKEN via check_fn)
    "ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
 ]
@ -192,7 +192,7 @@ TOOLSETS = {

    "honcho": {
        "description": "Honcho AI-native memory for persistent cross-session user modeling",
-        "tools": ["query_user_context"],
+        "tools": ["honcho_context", "honcho_profile", "honcho_search", "honcho_conclude"],
        "includes": []
    },

--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@ -758,6 +758,7 @@ checkpoints:
  max_snapshots: 50              # Max checkpoints to keep per directory
 ```

+
 ## Delegation

 Configure subagent behavior for the delegate tool:
--- a/website/docs/user-guide/features/honcho.md
+++ b/website/docs/user-guide/features/honcho.md
@ -7,120 +7,270 @@ sidebar_position: 8

 # Honcho Memory

-[Honcho](https://honcho.dev) is an AI-native memory system that gives Hermes Agent persistent, cross-session understanding of users. While Hermes has built-in memory (`MEMORY.md` and `USER.md` files), Honcho adds a deeper layer of **user modeling** — learning user preferences, goals, communication style, and context across conversations.
+[Honcho](https://honcho.dev) is an AI-native memory system that gives Hermes persistent, cross-session understanding of users. While Hermes has built-in memory (`MEMORY.md` and `USER.md`), Honcho adds a deeper layer of **user modeling** — learning preferences, goals, communication style, and context across conversations via a dual-peer architecture where both the user and the AI build representations over time.

-## How It Complements Built-in Memory
+## Works Alongside Built-in Memory

-Hermes has two memory systems that work together:
+Hermes has two memory systems that can work together or be configured separately. In `hybrid` mode (the default), both run side by side — Honcho adds cross-session user modeling while local files handle agent-level notes.

 | Feature | Built-in Memory | Honcho Memory |
 |---------|----------------|---------------|
 | Storage | Local files (`~/.hermes/memories/`) | Cloud-hosted Honcho API |
 | Scope | Agent-level notes and user profile | Deep user modeling via dialectic reasoning |
 | Persistence | Across sessions on same machine | Across sessions, machines, and platforms |
-| Query | Injected into system prompt automatically | On-demand via `query_user_context` tool |
+| Query | Injected into system prompt automatically | Prefetched + on-demand via tools |
 | Content | Manually curated by the agent | Automatically learned from conversations |
+| Write surface | `memory` tool (add/replace/remove) | `honcho_conclude` tool (persist facts) |
+
+Set `memoryMode` to `honcho` to use Honcho exclusively. See [Memory Modes](#memory-modes) for per-peer configuration.

-Honcho doesn't replace built-in memory — it **supplements** it with richer user understanding.

 ## Setup

-### 1. Get a Honcho API Key
-
-Sign up at [app.honcho.dev](https://app.honcho.dev) and get your API key.
-
-### 2. Install the Client Library
+### Interactive Setup

 ```bash
-pip install honcho-ai
+hermes honcho setup
 ```

-### 3. Configure Honcho
+The setup wizard walks through API key, peer names, workspace, memory mode, write frequency, recall mode, and session strategy. It offers to install `honcho-ai` if missing.

-Honcho reads its configuration from `~/.honcho/config.json` (the global Honcho config shared across all Honcho-enabled applications):
+### Manual Setup
+
+#### 1. Install the Client Library
+
+```bash
+pip install 'honcho-ai>=2.0.1'
+```
+
+#### 2. Get an API Key
+
+Go to [app.honcho.dev](https://app.honcho.dev) > Settings > API Keys.
+
+#### 3. Configure
+
+Honcho reads from `~/.honcho/config.json` (shared across all Honcho-enabled applications):

 ```json
 {
  "apiKey": "your-honcho-api-key",
-  "workspace": "hermes",
-  "peerName": "your-name",
-  "aiPeer": "hermes",
-  "environment": "production",
-  "saveMessages": true,
-  "sessionStrategy": "per-directory",
-  "enabled": true
-}
-```
-
-Alternatively, set the API key as an environment variable:
-
-```bash
-# Add to ~/.hermes/.env
-HONCHO_API_KEY=your-honcho-api-key
-```
-
-:::info
-When an API key is present (either in `~/.honcho/config.json` or as `HONCHO_API_KEY`), Honcho auto-enables unless explicitly set to `"enabled": false` in the config.
-:::
-
-## Configuration Details
-
-### Global Config (`~/.honcho/config.json`)
-
-| Field | Default | Description |
-|-------|---------|-------------|
-| `apiKey` | — | Honcho API key (required) |
-| `workspace` | `"hermes"` | Workspace identifier |
-| `peerName` | *(derived)* | Your identity name for user modeling |
-| `aiPeer` | `"hermes"` | AI assistant identity name |
-| `environment` | `"production"` | Honcho environment |
-| `saveMessages` | `true` | Whether to sync messages to Honcho |
-| `sessionStrategy` | `"per-directory"` | How sessions are scoped |
-| `sessionPeerPrefix` | `false` | Prefix session names with peer name |
-| `contextTokens` | *(Honcho default)* | Max tokens for context prefetch |
-| `sessions` | `{}` | Manual session name overrides per directory |
-
-### Host-specific Configuration
-
-You can configure per-host settings for multi-application setups:
-
-```json
-{
-  "apiKey": "your-key",
  "hosts": {
    "hermes": {
-      "workspace": "my-workspace",
-      "aiPeer": "hermes-assistant",
-      "linkedHosts": ["other-app"],
-      "contextTokens": 2000
+      "workspace": "hermes",
+      "peerName": "your-name",
+      "aiPeer": "hermes",
+      "memoryMode": "hybrid",
+      "writeFrequency": "async",
+      "recallMode": "hybrid",
+      "sessionStrategy": "per-session",
+      "enabled": true
    }
  }
 }
 ```

-Host-specific fields override global fields. Resolution order:
-1. Explicit host block fields
-2. Global/flat fields from config root
-3. Defaults (host name used as workspace/peer)
+`apiKey` lives at the root because it is a shared credential across all Honcho-enabled tools. All other settings are scoped under `hosts.hermes`. The `hermes honcho setup` wizard writes this structure automatically.
+
+Or set the API key as an environment variable:
+
+```bash
+hermes config set HONCHO_API_KEY your-key
+```
+
+:::info
+When an API key is present (either in `~/.honcho/config.json` or as `HONCHO_API_KEY`), Honcho auto-enables unless explicitly set to `"enabled": false`.
+:::
+
+## Configuration
+
+### Global Config (`~/.honcho/config.json`)
+
+Settings are scoped to `hosts.hermes` and fall back to root-level globals when the host field is absent. Root-level keys are managed by the user or the honcho CLI -- Hermes only writes to its own host block (except `apiKey`, which is a shared credential at root).
+
+**Root-level (shared)**
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `apiKey` | — | Honcho API key (required, shared across all hosts) |
+| `sessions` | `{}` | Manual session name overrides per directory (shared) |
+
+**Host-level (`hosts.hermes`)**
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `workspace` | `"hermes"` | Workspace identifier |
+| `peerName` | *(derived)* | Your identity name for user modeling |
+| `aiPeer` | `"hermes"` | AI assistant identity name |
+| `environment` | `"production"` | Honcho environment |
+| `enabled` | *(auto)* | Auto-enables when API key is present |
+| `saveMessages` | `true` | Whether to sync messages to Honcho |
+| `memoryMode` | `"hybrid"` | Memory mode: `hybrid` or `honcho` |
+| `writeFrequency` | `"async"` | When to write: `async`, `turn`, `session`, or integer N |
+| `recallMode` | `"hybrid"` | Retrieval strategy: `hybrid`, `context`, or `tools` |
+| `sessionStrategy` | `"per-session"` | How sessions are scoped |
+| `sessionPeerPrefix` | `false` | Prefix session names with peer name |
+| `contextTokens` | *(Honcho default)* | Max tokens for auto-injected context |
+| `dialecticReasoningLevel` | `"low"` | Floor for dialectic reasoning: `minimal` / `low` / `medium` / `high` / `max` |
+| `dialecticMaxChars` | `600` | Char cap on dialectic results injected into system prompt |
+| `linkedHosts` | `[]` | Other host keys whose workspaces to cross-reference |
+
+All host-level fields fall back to the equivalent root-level key if not set under `hosts.hermes`. Existing configs with settings at root level continue to work.
+
+### Memory Modes
+
+| Mode | Effect |
+|------|--------|
+| `hybrid` | Write to both Honcho and local files (default) |
+| `honcho` | Honcho only — skip local file writes |
+
+Memory mode can be set globally or per-peer (user, agent1, agent2, etc):
+
+```json
+{
+  "memoryMode": {
+    "default": "hybrid",
+    "hermes": "honcho"
+  }
+}
+```
+
+To disable Honcho entirely, set `enabled: false` or remove the API key.
+
+### Recall Modes
+
+Controls how Honcho context reaches the agent:
+
+| Mode | Behavior |
+|------|----------|
+| `hybrid` | Auto-injected context + Honcho tools available (default) |
+| `context` | Auto-injected context only — Honcho tools hidden |
+| `tools` | Honcho tools only — no auto-injected context |
+
+### Write Frequency
+
+| Setting | Behavior |
+|---------|----------|
+| `async` | Background thread writes (zero blocking, default) |
+| `turn` | Synchronous write after each turn |
+| `session` | Batched write at session end |
+| *integer N* | Write every N turns |
+
+### Session Strategies
+
+| Strategy | Session key | Use case |
+|----------|-------------|----------|
+| `per-session` | Unique per run | Default. Fresh session every time. |
+| `per-directory` | CWD basename | Each project gets its own session. |
+| `per-repo` | Git repo root name | Groups subdirectories under one session. |
+| `global` | Fixed `"global"` | Single cross-project session. |
+
+Resolution order: manual map > session title > strategy-derived key > platform key.
+
+### Multi-host Configuration
+
+Multiple Honcho-enabled tools share `~/.honcho/config.json`. Each tool writes only to its own host block, reads its host block first, and falls back to root-level globals:
+
+```json
+{
+  "apiKey": "your-key",
+  "peerName": "eri",
+  "hosts": {
+    "hermes": {
+      "workspace": "my-workspace",
+      "aiPeer": "hermes-assistant",
+      "memoryMode": "honcho",
+      "linkedHosts": ["claude-code"],
+      "contextTokens": 2000,
+      "dialecticReasoningLevel": "medium"
+    },
+    "claude-code": {
+      "workspace": "my-workspace",
+      "aiPeer": "clawd"
+    }
+  }
+}
+```
+
+Resolution: `hosts.<tool>` field > root-level field > default. In this example, both tools share the root `apiKey` and `peerName`, but each has its own `aiPeer` and workspace settings.

 ### Hermes Config (`~/.hermes/config.yaml`)

-The `honcho` section in Hermes config is intentionally minimal — most configuration comes from the global `~/.honcho/config.json`:
+Intentionally minimal — most configuration comes from `~/.honcho/config.json`:

 ```yaml
 honcho: {}
 ```

-## The `query_user_context` Tool
+## How It Works

-When Honcho is active, Hermes gains access to the `query_user_context` tool. This lets the agent proactively ask Honcho about the user during conversations:
+### Async Context Pipeline

-**Tool schema:**
- **Name:** `query_user_context`
- **Parameter:** `query` (string) — a natural language question about the user
- **Toolset:** `honcho`
+Honcho context is fetched asynchronously to avoid blocking the response path:

-**Example queries the agent might make:**
+```
+Turn N:
+  user message
+    → consume cached context (from previous turn's background fetch)
+    → inject into system prompt (user representation, AI representation, dialectic)
+    → LLM call
+    → response
+    → fire background fetch for next turn
+         → fetch context    ─┐
+         → fetch dialectic  ─┴→ cache for Turn N+1
+```
+
+Turn 1 is a cold start (no cache). All subsequent turns consume cached results with zero HTTP latency on the response path. The system prompt on turn 1 uses only static context to preserve prefix cache hits at the LLM provider.
+
+### Dual-Peer Architecture
+
+Both the user and AI have peer representations in Honcho:
+
+- **User peer** — observed from user messages. Honcho learns preferences, goals, communication style.
+- **AI peer** — observed from assistant messages (`observe_me=True`). Honcho builds a representation of the agent's knowledge and behavior.
+
+Both representations are injected into the system prompt when available.
+
+### Dynamic Reasoning Level
+
+Dialectic queries scale reasoning effort with message complexity:
+
+| Message length | Reasoning level |
+|----------------|-----------------|
+| < 120 chars | Config default (typically `low`) |
+| 120-400 chars | One level above default (cap: `high`) |
+| > 400 chars | Two levels above default (cap: `high`) |
+
+`max` is never selected automatically.
+
+### Gateway Integration
+
+The gateway creates short-lived `AIAgent` instances per request. Honcho managers are owned at the gateway session layer (`_honcho_managers` dict) so they persist across requests within the same session and flush at real session boundaries (reset, resume, expiry, server stop).
+
+## Tools
+
+When Honcho is active, four tools become available. Availability is gated dynamically — they are invisible when Honcho is disabled.
+
+### `honcho_profile`
+
+Fast peer card retrieval (no LLM). Returns a curated list of key facts about the user.
+
+### `honcho_search`
+
+Semantic search over memory (no LLM). Returns raw excerpts ranked by relevance. Cheaper and faster than `honcho_context` — good for factual lookups.
+
+Parameters:
+- `query` (string) — search query
+- `max_tokens` (integer, optional) — result token budget
+
+### `honcho_context`
+
+Dialectic Q&A powered by Honcho's LLM. Synthesizes an answer from accumulated conversation history.
+
+Parameters:
+- `query` (string) — natural language question
+- `peer` (string, optional) — `"user"` (default) or `"ai"`. Querying `"ai"` asks about the assistant's own history and identity.
+
+Example queries the agent might make:

 ```
 "What are this user's main goals?"
@ -129,30 +279,70 @@ When Honcho is active, Hermes gains access to the `query_user_context` tool. Thi
 "What is this user's technical expertise level?"
 ```

-The tool calls Honcho's dialectic chat API to retrieve relevant user context based on accumulated conversation history.
+### `honcho_conclude`

-:::note
-The `query_user_context` tool is only available when Honcho is active (API key configured and session context set). It registers in the `honcho` toolset and its availability is checked dynamically.
-:::
+Writes a fact to Honcho memory. Use when the user explicitly states a preference, correction, or project context worth remembering. Feeds into the user's peer card and representation.

-## Session Management
+Parameters:
+- `conclusion` (string) — the fact to persist

-Honcho sessions track conversation history for user modeling:
+## CLI Commands

- **Session creation** — sessions are created or resumed automatically based on session keys (e.g., `telegram:123456` or CLI session IDs)
- **Message syncing** — new messages are synced to Honcho incrementally (only unsynced messages)
- **Peer configuration** — user messages are observed for learning; assistant messages are not
- **Context prefetch** — before responding, Hermes can prefetch user context (representation + peer card) in a single API call
- **Session rotation** — when sessions reset, old data is preserved in Honcho for continued user modeling
+```
+hermes honcho setup                        # Interactive setup wizard
+hermes honcho status                       # Show config and connection status
+hermes honcho sessions                     # List directory → session name mappings
+hermes honcho map <name>                   # Map current directory to a session name
+hermes honcho peer                         # Show peer names and dialectic settings
+hermes honcho peer --user NAME             # Set user peer name
+hermes honcho peer --ai NAME               # Set AI peer name
+hermes honcho peer --reasoning LEVEL       # Set dialectic reasoning level
+hermes honcho mode                         # Show current memory mode
+hermes honcho mode [hybrid|honcho]         # Set memory mode
+hermes honcho tokens                       # Show token budget settings
+hermes honcho tokens --context N           # Set context token cap
+hermes honcho tokens --dialectic N         # Set dialectic char cap
+hermes honcho identity                     # Show AI peer identity
+hermes honcho identity <file>              # Seed AI peer identity from file (SOUL.md, etc.)
+hermes honcho migrate                      # Migration guide: OpenClaw → Hermes + Honcho
+```

-## Migration from Local Memory
+### Doctor Integration

-When Honcho is activated on an instance that already has local conversation history:
+`hermes doctor` includes a Honcho section that validates config, API key, and connection status.

-1. **Conversation history** — prior messages can be uploaded to Honcho as a transcript file
-2. **Memory files** — existing `MEMORY.md` and `USER.md` files can be uploaded for context
+## Migration

-This ensures Honcho has the full picture even when activated mid-conversation.
+### From Local Memory
+
+When Honcho activates on an instance with existing local history, migration runs automatically:
+
+1. **Conversation history** — prior messages are uploaded as an XML transcript file
+2. **Memory files** — existing `MEMORY.md`, `USER.md`, and `SOUL.md` are uploaded for context
+
+### From OpenClaw
+
+```bash
+hermes honcho migrate
+```
+
+Walks through converting an OpenClaw native Honcho setup to the shared `~/.honcho/config.json` format.
+
+## AI Peer Identity
+
+Honcho can build a representation of the AI assistant over time (via `observe_me=True`). You can also seed the AI peer explicitly:
+
+```bash
+hermes honcho identity ~/.hermes/SOUL.md
+```
+
+This uploads the file content through Honcho's observation pipeline. The AI peer representation is then injected into the system prompt alongside the user's, giving the agent awareness of its own accumulated identity.
+
+```bash
+hermes honcho identity --show
+```
+
+Shows the current AI peer representation from Honcho.

 ## Use Cases

@ -161,3 +351,7 @@ This ensures Honcho has the full picture even when activated mid-conversation.
 - **Expertise adaptation** — adjusts technical depth based on user's background
 - **Cross-platform memory** — same user understanding across CLI, Telegram, Discord, etc.
 - **Multi-user support** — each user (via messaging platforms) gets their own user model
+
+:::tip
+Honcho is fully opt-in — zero behavior change when disabled or unconfigured. All Honcho calls are non-fatal; if the service is unreachable, the agent continues normally.
+:::
--- a/website/docs/user-guide/features/memory.md
+++ b/website/docs/user-guide/features/memory.md
@ -209,41 +209,10 @@ memory:

 ## Honcho Integration (Cross-Session User Modeling)

-For deeper, AI-generated user understanding that works across tools, you can optionally enable [Honcho](https://honcho.dev/) by Plastic Labs. Honcho runs alongside existing memory — USER.md stays as-is, and Honcho adds an additional layer of context.
-
-When enabled:
- **Prefetch**: Each turn, Honcho's user representation is injected into the system prompt
- **Sync**: After each conversation, messages are synced to Honcho
- **Query tool**: The agent can actively query its understanding of you via `query_user_context`
-
-**Setup:**
+For deeper, AI-generated user understanding that works across sessions and platforms, you can enable [Honcho Memory](./honcho.md). Honcho runs alongside built-in memory in `hybrid` mode (the default) — `MEMORY.md` and `USER.md` stay as-is, and Honcho adds a persistent user modeling layer on top.

 ```bash
-# 1. Install the optional dependency
-uv pip install honcho-ai
-
-# 2. Get an API key from https://app.honcho.dev
-
-# 3. Create ~/.honcho/config.json
-cat > ~/.honcho/config.json << 'EOF'
-{
-  "enabled": true,
-  "apiKey": "your-honcho-api-key",
-  "peerName": "your-name",
-  "hosts": {
-    "hermes": {
-      "workspace": "hermes"
-    }
-  }
-}
-EOF
+hermes honcho setup
 ```

-Or via environment variable:
-```bash
-hermes config set HONCHO_API_KEY your-key
-```
-
-:::tip
-Honcho is fully opt-in — zero behavior change when disabled or unconfigured. All Honcho calls are non-fatal; if the service is unreachable, the agent continues normally.
-:::
+See the [Honcho Memory](./honcho.md) docs for full configuration, tools, and CLI reference.
--- a/website/docs/user-guide/messaging/slack.md
+++ b/website/docs/user-guide/messaging/slack.md
@ -91,6 +91,7 @@ You can always find or regenerate app-level tokens under **Settings → Basic In

 This step is critical — it controls what messages the bot can see.

+
 1. In the sidebar, go to **Features → Event Subscriptions**
 2. Toggle **Enable Events** to ON
 3. Expand **Subscribe to bot events** and add:
@ -110,6 +111,7 @@ If the bot works in DMs but **not in channels**, you almost certainly forgot to
 Without these events, Slack simply never delivers channel messages to the bot.
 :::

+
 ---

 ## Step 5: Install App to Workspace
@ -200,6 +202,7 @@ This is intentional — it prevents the bot from responding to every message in

 ---

+
 ## Home Channel

 Set `SLACK_HOME_CHANNEL` to a channel ID where Hermes will deliver scheduled messages,