feat(honcho): async memory integration with prefetch pipeline and recallMode

Adds full Honcho memory integration to Hermes:

- Session manager with async background writes, memory modes (honcho/hybrid/local),
  and dialectic prefetch for first-turn context warming
- Agent integration: prefetch pipeline, tool surface gated by recallMode,
  system prompt context injection, SIGTERM/SIGINT flush handlers
- CLI commands: setup, status, mode, tokens, peer, identity, migrate
- recallMode setting (auto | context | tools) for A/B testing retrieval strategies
- Session strategies: per-session, per-repo (git tree root), per-directory, global
- Polymorphic memoryMode config: string shorthand or per-peer object overrides
- 97 tests covering async writes, client config, session resolution, and memory modes
This commit is contained in:
Erosika 2026-03-09 15:58:22 -04:00
parent 8eefbef91c
commit 74c214e957
17 changed files with 2478 additions and 135 deletions

View file

@ -286,7 +286,6 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
---
## Important Policies
### Prompt Caching Must Not Break
Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**

View file

@ -665,6 +665,7 @@ display:
# all: Running output updates + final message (default)
background_process_notifications: all
# Play terminal bell when agent finishes a response.
# Useful for long-running tasks — your terminal will ding when the agent is done.
# Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.

39
cli.py
View file

@ -1440,7 +1440,7 @@ class HermesCLI:
platform="cli",
session_db=self._session_db,
clarify_callback=self._clarify_callback,
honcho_session_key=self.session_id,
honcho_session_key=None, # resolved by run_agent via config sessions map / title
fallback_model=self._fallback_model,
thinking_callback=self._on_thinking,
checkpoints_enabled=self.checkpoints_enabled,
@ -2573,6 +2573,26 @@ class HermesCLI:
try:
if self._session_db.set_session_title(self.session_id, new_title):
_cprint(f" Session title set: {new_title}")
# Re-map Honcho session key to new title
if self.agent and getattr(self.agent, '_honcho', None):
try:
hcfg = self.agent._honcho_config
new_key = (
hcfg.resolve_session_name(
session_title=new_title,
session_id=self.agent.session_id,
)
if hcfg else new_title
)
if new_key and new_key != self.agent._honcho_session_key:
old_key = self.agent._honcho_session_key
self.agent._honcho.get_or_create(new_key)
self.agent._honcho_session_key = new_key
from tools.honcho_tools import set_session_context
set_session_context(self.agent._honcho, new_key)
_cprint(f" Honcho session: {old_key}{new_key}")
except Exception:
pass
else:
_cprint(" Session not found in database.")
except ValueError as e:
@ -2886,6 +2906,12 @@ class HermesCLI:
f" ✅ Compressed: {original_count}{new_count} messages "
f"(~{approx_tokens:,} → ~{new_tokens:,} tokens)"
)
# Flush Honcho async queue so queued messages land before context resets
if self.agent and getattr(self.agent, '_honcho', None):
try:
self.agent._honcho.flush_all()
except Exception:
pass
except Exception as e:
print(f" ❌ Compression failed: {e}")
@ -3322,7 +3348,8 @@ class HermesCLI:
if response and pending_message:
response = response + "\n\n---\n_[Interrupted - processing new message]_"
if response:
response_previewed = result.get("response_previewed", False) if result else False
if response and not response_previewed:
# Use a Rich Panel for the response box — adapts to terminal
# width at render time instead of hard-coding border length.
try:
@ -3342,7 +3369,7 @@ class HermesCLI:
border_style=_resp_color,
padding=(1, 2),
))
# Play terminal bell when agent finishes (if enabled).
# Works over SSH — the bell propagates to the user's terminal.
if self.bell_on_complete:
@ -4254,6 +4281,12 @@ class HermesCLI:
# Unregister terminal_tool callbacks to avoid dangling references
set_sudo_password_callback(None)
set_approval_callback(None)
# Flush + shut down Honcho async writer (drains queue before exit)
if self.agent and getattr(self.agent, '_honcho', None):
try:
self.agent._honcho.shutdown()
except Exception:
pass
# Close session in SQLite
if hasattr(self, '_session_db') and self._session_db and self.agent:
try:

View file

@ -293,6 +293,12 @@ class GatewayRunner:
conversation_history=msgs,
)
logger.info("Pre-reset memory flush completed for session %s", old_session_id)
# Flush any queued Honcho writes before the session is dropped
if getattr(tmp_agent, '_honcho', None):
try:
tmp_agent._honcho.shutdown()
except Exception:
pass
except Exception as e:
logger.debug("Pre-reset memory flush failed for session %s: %s", old_session_id, e)

View file

@ -90,7 +90,7 @@ DEFAULT_CONFIG = {
"inactivity_timeout": 120,
"record_sessions": False, # Auto-record browser sessions as WebM videos
},
# Filesystem checkpoints — automatic snapshots before destructive file ops.
# When enabled, the agent takes a snapshot of the working directory once per
# conversation turn (on first write_file/patch call). Use /rollback to restore.
@ -849,6 +849,36 @@ _COMMENTED_SECTIONS = """
"""
_COMMENTED_SECTIONS = """
# ── Security ──────────────────────────────────────────────────────────
# API keys, tokens, and passwords are redacted from tool output by default.
# Set to false to see full values (useful for debugging auth issues).
#
# security:
# redact_secrets: false
# ── Fallback Model ────────────────────────────────────────────────────
# Automatic provider failover when primary is unavailable.
# Uncomment and configure to enable. Triggers on rate limits (429),
# overload (529), service errors (503), or connection failures.
#
# Supported providers:
# openrouter (OPENROUTER_API_KEY) — routes to any model
# openai-codex (OAuth — hermes login) — OpenAI Codex
# nous (OAuth — hermes login) — Nous Portal
# zai (ZAI_API_KEY) — Z.AI / GLM
# kimi-coding (KIMI_API_KEY) — Kimi / Moonshot
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
#
# fallback_model:
# provider: openrouter
# model: anthropic/claude-sonnet-4
"""
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
from utils import atomic_yaml_write

View file

@ -627,6 +627,40 @@ def run_doctor(args):
else:
check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")
# =========================================================================
# Honcho memory
# =========================================================================
print()
print(color("◆ Honcho Memory", Colors.CYAN, Colors.BOLD))
try:
from honcho_integration.client import HonchoClientConfig, GLOBAL_CONFIG_PATH
hcfg = HonchoClientConfig.from_global_config()
if not GLOBAL_CONFIG_PATH.exists():
check_warn("Honcho config not found", f"run: hermes honcho setup")
elif not hcfg.enabled:
check_info("Honcho disabled (set enabled: true in ~/.honcho/config.json to activate)")
elif not hcfg.api_key:
check_fail("Honcho API key not set", "run: hermes honcho setup")
issues.append("No Honcho API key — run 'hermes honcho setup'")
else:
from honcho_integration.client import get_honcho_client, reset_honcho_client
reset_honcho_client()
try:
get_honcho_client(hcfg)
check_ok(
"Honcho connected",
f"workspace={hcfg.workspace_id} mode={hcfg.memory_mode} freq={hcfg.write_frequency}",
)
except Exception as _e:
check_fail("Honcho connection failed", str(_e))
issues.append(f"Honcho unreachable: {_e}")
except ImportError:
check_warn("honcho-ai not installed", "pip install honcho-ai")
except Exception as _e:
check_warn("Honcho check failed", str(_e))
# =========================================================================
# Summary
# =========================================================================

View file

@ -18,6 +18,22 @@ Usage:
hermes cron list # List cron jobs
hermes cron status # Check if cron scheduler is running
hermes doctor # Check configuration and dependencies
hermes honcho setup # Configure Honcho AI memory integration
hermes honcho status # Show Honcho config and connection status
hermes honcho sessions # List directory → session name mappings
hermes honcho map <name> # Map current directory to a session name
hermes honcho peer # Show peer names and dialectic settings
hermes honcho peer --user NAME # Set user peer name
hermes honcho peer --ai NAME # Set AI peer name
hermes honcho peer --reasoning LEVEL # Set dialectic reasoning level
hermes honcho mode # Show current memory mode
hermes honcho mode [hybrid|honcho|local] # Set memory mode
hermes honcho tokens # Show token budget settings
hermes honcho tokens --context N # Set session.context() token cap
hermes honcho tokens --dialectic N # Set dialectic result char cap
hermes honcho identity # Show AI peer identity representation
hermes honcho identity <file> # Seed AI peer identity from a file (SOUL.md etc.)
hermes honcho migrate # Step-by-step migration guide: OpenClaw native → Hermes + Honcho
hermes version # Show version
hermes update # Update to latest version
hermes uninstall # Uninstall Hermes Agent
@ -2281,6 +2297,94 @@ For more help on a command:
skills_parser.set_defaults(func=cmd_skills)
# =========================================================================
# honcho command
# =========================================================================
honcho_parser = subparsers.add_parser(
"honcho",
help="Manage Honcho AI memory integration",
description=(
"Honcho is a memory layer that persists across sessions.\n\n"
"Each conversation is stored as a peer interaction in a workspace. "
"Honcho builds a representation of the user over time — conclusions, "
"patterns, context — and surfaces the relevant slice at the start of "
"each turn so Hermes knows who you are without you having to repeat yourself.\n\n"
"Modes: hybrid (Honcho + local MEMORY.md), honcho (Honcho only), "
"local (MEMORY.md only). Write frequency is configurable so memory "
"writes never block the response."
),
formatter_class=__import__("argparse").RawDescriptionHelpFormatter,
)
honcho_subparsers = honcho_parser.add_subparsers(dest="honcho_command")
honcho_subparsers.add_parser("setup", help="Interactive setup wizard for Honcho integration")
honcho_subparsers.add_parser("status", help="Show current Honcho config and connection status")
honcho_subparsers.add_parser("sessions", help="List known Honcho session mappings")
honcho_map = honcho_subparsers.add_parser(
"map", help="Map current directory to a Honcho session name (no arg = list mappings)"
)
honcho_map.add_argument(
"session_name", nargs="?", default=None,
help="Session name to associate with this directory. Omit to list current mappings.",
)
honcho_peer = honcho_subparsers.add_parser(
"peer", help="Show or update peer names and dialectic reasoning level"
)
honcho_peer.add_argument("--user", metavar="NAME", help="Set user peer name")
honcho_peer.add_argument("--ai", metavar="NAME", help="Set AI peer name")
honcho_peer.add_argument(
"--reasoning",
metavar="LEVEL",
choices=("minimal", "low", "medium", "high", "max"),
help="Set default dialectic reasoning level (minimal/low/medium/high/max)",
)
honcho_mode = honcho_subparsers.add_parser(
"mode", help="Show or set memory mode (hybrid/honcho/local)"
)
honcho_mode.add_argument(
"mode", nargs="?", metavar="MODE",
choices=("hybrid", "honcho", "local"),
help="Memory mode to set (hybrid/honcho/local). Omit to show current.",
)
honcho_tokens = honcho_subparsers.add_parser(
"tokens", help="Show or set token budget for context and dialectic"
)
honcho_tokens.add_argument(
"--context", type=int, metavar="N",
help="Max tokens Honcho returns from session.context() per turn",
)
honcho_tokens.add_argument(
"--dialectic", type=int, metavar="N",
help="Max chars of dialectic result to inject into system prompt",
)
honcho_identity = honcho_subparsers.add_parser(
"identity", help="Seed or show the AI peer's Honcho identity representation"
)
honcho_identity.add_argument(
"file", nargs="?", default=None,
help="Path to file to seed from (e.g. SOUL.md). Omit to show usage.",
)
honcho_identity.add_argument(
"--show", action="store_true",
help="Show current AI peer representation from Honcho",
)
honcho_subparsers.add_parser(
"migrate",
help="Step-by-step migration guide from openclaw-honcho to Hermes Honcho",
)
def cmd_honcho(args):
from honcho_integration.cli import honcho_command
honcho_command(args)
honcho_parser.set_defaults(func=cmd_honcho)
# =========================================================================
# tools command
# =========================================================================

749
honcho_integration/cli.py Normal file
View file

@ -0,0 +1,749 @@
"""CLI commands for Honcho integration management.
Handles: hermes honcho setup | status | sessions | map | peer
"""
from __future__ import annotations
import json
import os
import sys
from pathlib import Path
GLOBAL_CONFIG_PATH = Path.home() / ".honcho" / "config.json"
HOST = "hermes"
def _read_config() -> dict:
    """Load the shared Honcho config from ~/.honcho/config.json.

    Returns {} when the file is missing or unparseable. A parse failure is
    reported to stderr instead of being silently swallowed: callers follow
    a read with _write_config(), and a silent {} here would overwrite the
    user's (corrupt but possibly recoverable) config with a near-empty one.
    """
    if GLOBAL_CONFIG_PATH.exists():
        try:
            return json.loads(GLOBAL_CONFIG_PATH.read_text(encoding="utf-8"))
        except Exception as e:
            # Best-effort read is preserved (still returns {}), but the user
            # is told their config could not be parsed.
            print(f" Warning: could not parse {GLOBAL_CONFIG_PATH}: {e}", file=sys.stderr)
    return {}
def _write_config(cfg: dict) -> None:
    """Persist *cfg* as pretty-printed JSON to ~/.honcho/config.json.

    Creates ~/.honcho/ on first use; output keeps non-ASCII characters
    readable and ends with a trailing newline.
    """
    GLOBAL_CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
    serialized = json.dumps(cfg, indent=2, ensure_ascii=False)
    GLOBAL_CONFIG_PATH.write_text(serialized + "\n", encoding="utf-8")
def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
    """Write *label* (with an optional ``[default]`` hint) and read one line.

    Secret values are read via getpass when stdin is a TTY; piped input
    (tests, scripts) falls back to a plaintext readline. Returns the typed
    value, or *default* (or "") when the user just presses Enter.
    """
    hint = f" [{default}]" if default else ""
    sys.stdout.write(f" {label}{hint}: ")
    sys.stdout.flush()
    if secret and sys.stdin.isatty():
        import getpass
        entered = getpass.getpass(prompt="")
    else:
        # Non-TTY (piped input, test runners) — read plaintext even for secrets.
        entered = sys.stdin.readline().strip()
    return entered or default or ""
def _ensure_sdk_installed() -> bool:
    """Check honcho-ai is importable; offer to install if not. Returns True if ready."""
    try:
        import honcho  # noqa: F401
    except ImportError:
        pass
    else:
        return True
    print(" honcho-ai is not installed.")
    answer = _prompt("Install it now? (honcho-ai>=2.0.1)", default="y")
    if answer.lower() not in ("y", "yes"):
        print(" Skipping install. Run: pip install 'honcho-ai>=2.0.1'\n")
        return False
    import subprocess
    print(" Installing honcho-ai...", flush=True)
    proc = subprocess.run(
        [sys.executable, "-m", "pip", "install", "honcho-ai>=2.0.1"],
        capture_output=True,
        text=True,
    )
    # Non-zero return code: surface pip's stderr and tell the user how to
    # retry manually; success prints a short confirmation.
    if proc.returncode != 0:
        print(f" Install failed:\n{proc.stderr.strip()}")
        print(" Run manually: pip install 'honcho-ai>=2.0.1'\n")
        return False
    print(" Installed.\n")
    return True
def cmd_setup(args) -> None:
    """Interactive Honcho setup wizard.

    Walks through: SDK install check, API key, user peer name, workspace,
    memory mode, write frequency, recall mode, and session strategy; then
    writes ~/.honcho/config.json and performs a live connection test.
    Each prompt defaults to the current config value, so re-running the
    wizard with all-Enter keeps the existing setup.
    """
    cfg = _read_config()
    # NOTE(review): '"" * 40' renders nothing — a separator glyph (e.g. "─")
    # appears to have been lost; confirm the intended character.
    print("\nHoncho memory setup\n" + "" * 40)
    print(" Honcho gives Hermes persistent cross-session memory.")
    print(" Config is shared with other hosts at ~/.honcho/config.json\n")
    if not _ensure_sdk_installed():
        return
    # API key — never echoed; only the last 8 chars of an existing key are shown.
    current_key = cfg.get("apiKey", "")
    masked = f"...{current_key[-8:]}" if len(current_key) > 8 else ("set" if current_key else "not set")
    print(f" Current API key: {masked}")
    new_key = _prompt("Honcho API key (leave blank to keep current)", secret=True)
    if new_key:
        cfg["apiKey"] = new_key
    if not cfg.get("apiKey"):
        # No key at all (neither stored nor just entered) — bail before
        # prompting for anything else.
        print("\n No API key configured. Get one at https://app.honcho.dev")
        print(" Run 'hermes honcho setup' again once you have a key.\n")
        return
    # Peer name — defaults to the OS username when unset.
    current_peer = cfg.get("peerName", "")
    new_peer = _prompt("Your name (user peer)", default=current_peer or os.getenv("USER", "user"))
    if new_peer:
        cfg["peerName"] = new_peer
    # Host block — per-host settings live under cfg["hosts"]["hermes"].
    hosts = cfg.setdefault("hosts", {})
    hermes_host = hosts.setdefault(HOST, {})
    current_workspace = hermes_host.get("workspace") or cfg.get("workspace", "hermes")
    new_workspace = _prompt("Workspace ID", default=current_workspace)
    if new_workspace:
        hermes_host["workspace"] = new_workspace
        # Also update flat workspace if it was the primary one
        if cfg.get("workspace") == current_workspace:
            cfg["workspace"] = new_workspace
    hermes_host.setdefault("aiPeer", HOST)
    # Memory mode — invalid input silently falls back to "hybrid".
    current_mode = cfg.get("memoryMode", "hybrid")
    print(f"\n Memory mode options:")
    print(" hybrid — write to both Honcho and local MEMORY.md (default)")
    print(" honcho — Honcho only, skip MEMORY.md writes")
    print(" local — MEMORY.md only, Honcho disabled")
    new_mode = _prompt("Memory mode", default=current_mode)
    if new_mode in ("hybrid", "honcho", "local"):
        cfg["memoryMode"] = new_mode
    else:
        cfg["memoryMode"] = "hybrid"
    # Write frequency — either an int (every N turns) or one of the named
    # modes; anything else falls back to "async".
    current_wf = str(cfg.get("writeFrequency", "async"))
    print(f"\n Write frequency options:")
    print(" async — background thread, no token cost (recommended)")
    print(" turn — sync write after every turn")
    print(" session — batch write at session end only")
    print(" N — write every N turns (e.g. 5)")
    new_wf = _prompt("Write frequency", default=current_wf)
    try:
        cfg["writeFrequency"] = int(new_wf)
    except (ValueError, TypeError):
        cfg["writeFrequency"] = new_wf if new_wf in ("async", "turn", "session") else "async"
    # Recall mode — invalid input leaves the stored value unchanged.
    current_recall = cfg.get("recallMode", "auto")
    print(f"\n Recall mode options:")
    print(" auto — pre-warmed context + memory tools available (default)")
    print(" context — pre-warmed context only, memory tools suppressed")
    print(" tools — no pre-loaded context, rely on tool calls only")
    new_recall = _prompt("Recall mode", default=current_recall)
    if new_recall in ("auto", "context", "tools"):
        cfg["recallMode"] = new_recall
    # Session strategy — invalid input leaves the stored value unchanged.
    current_strat = cfg.get("sessionStrategy", "per-session")
    print(f"\n Session strategy options:")
    print(" per-session — new Honcho session each run, named by Hermes session ID (default)")
    print(" per-repo — one session per git repository (uses repo root name)")
    print(" per-directory — one session per working directory")
    print(" global — single session across all directories")
    new_strat = _prompt("Session strategy", default=current_strat)
    if new_strat in ("per-session", "per-repo", "per-directory", "global"):
        cfg["sessionStrategy"] = new_strat
    cfg.setdefault("enabled", True)
    cfg.setdefault("saveMessages", True)
    # Persist before the connection test so the test reads the new values.
    _write_config(cfg)
    print(f"\n Config written to {GLOBAL_CONFIG_PATH}")
    # Test connection
    print(" Testing connection... ", end="", flush=True)
    try:
        from honcho_integration.client import HonchoClientConfig, get_honcho_client, reset_honcho_client
        # Drop any cached client so the test uses the just-written config.
        reset_honcho_client()
        hcfg = HonchoClientConfig.from_global_config()
        get_honcho_client(hcfg)
        print("OK")
    except Exception as e:
        print(f"FAILED\n Error: {e}")
        return
    # Summary — only reached after a successful connection, so hcfg is bound.
    print(f"\n Honcho is ready.")
    print(f" Session: {hcfg.resolve_session_name()}")
    print(f" Workspace: {hcfg.workspace_id}")
    print(f" Peer: {hcfg.peer_name}")
    _mode_str = hcfg.memory_mode
    if hcfg.peer_memory_modes:
        overrides = ", ".join(f"{k}={v}" for k, v in hcfg.peer_memory_modes.items())
        _mode_str = f"{hcfg.memory_mode} (peers: {overrides})"
    print(f" Mode: {_mode_str}")
    print(f" Frequency: {hcfg.write_frequency}")
    print(f"\n Tools available in chat:")
    print(f" query_user_context — ask Honcho a question about you (LLM-synthesized)")
    print(f" honcho_search — semantic search over your history (no LLM)")
    print(f" honcho_profile — your peer card, key facts (no LLM)")
    print(f"\n Other commands:")
    print(f" hermes honcho status — show full config")
    print(f" hermes honcho mode — show or change memory mode")
    print(f" hermes honcho tokens — show or set token budgets")
    print(f" hermes honcho identity — seed or show AI peer identity")
    print(f" hermes honcho map <name> — map this directory to a session name\n")
def cmd_status(args) -> None:
    """Show current Honcho config and connection status."""
    try:
        import honcho  # noqa: F401
    except ImportError:
        print(" honcho-ai is not installed. Run: hermes honcho setup\n")
        return
    cfg = _read_config()
    if not cfg:
        print(" No Honcho config found at ~/.honcho/config.json")
        print(" Run 'hermes honcho setup' to configure.\n")
        return
    try:
        from honcho_integration.client import HonchoClientConfig, get_honcho_client
        hcfg = HonchoClientConfig.from_global_config()
    except Exception as e:
        print(f" Config error: {e}\n")
        return
    # Show at most the trailing 8 chars of the API key.
    key = hcfg.api_key or ""
    if len(key) > 8:
        masked = f"...{key[-8:]}"
    elif key:
        masked = "set"
    else:
        masked = "not set"
    print(f"\nHoncho status\n" + "" * 40)
    print(f" Enabled: {hcfg.enabled}")
    print(f" API key: {masked}")
    print(f" Workspace: {hcfg.workspace_id}")
    print(f" Host: {hcfg.host}")
    print(f" Config path: {GLOBAL_CONFIG_PATH}")
    print(f" AI peer: {hcfg.ai_peer}")
    print(f" User peer: {hcfg.peer_name or 'not set'}")
    print(f" Session key: {hcfg.resolve_session_name()}")
    print(f" Recall mode: {hcfg.recall_mode}")
    print(f" Memory mode: {hcfg.memory_mode}")
    if hcfg.peer_memory_modes:
        print(f" Per-peer modes:")
        for peer, mode in hcfg.peer_memory_modes.items():
            print(f" {peer}: {mode}")
    print(f" Write freq: {hcfg.write_frequency}")
    # Only attempt a live connection when both enabled and keyed.
    if not (hcfg.enabled and hcfg.api_key):
        reason = "disabled" if not hcfg.enabled else "no API key"
        print(f"\n Not connected ({reason})\n")
        return
    print("\n Connection... ", end="", flush=True)
    try:
        get_honcho_client(hcfg)
    except Exception as e:
        print(f"FAILED ({e})\n")
    else:
        print("OK\n")
def cmd_sessions(args) -> None:
    """List known directory → session name mappings.

    Reads the "sessions" map (directory path → session name) from
    ~/.honcho/config.json and prints one line per mapping, sorted by path.
    """
    cfg = _read_config()
    sessions = cfg.get("sessions", {})
    if not sessions:
        print(" No session mappings configured.\n")
        print(" Add one with: hermes honcho map <session-name>")
        print(" Or edit ~/.honcho/config.json directly.\n")
        return
    cwd = os.getcwd()
    print(f"\nHoncho session mappings ({len(sessions)})\n" + "" * 40)
    for path, name in sorted(sessions.items()):
        # NOTE(review): both branches are "" — a current-directory marker
        # glyph (e.g. "←") appears to have been lost in this source; the
        # cwd comparison currently has no visible effect. Confirm intent.
        marker = "" if path == cwd else ""
        print(f" {name:<30} {path}{marker}")
    print()
def cmd_map(args) -> None:
    """Map current directory to a Honcho session name.

    With no argument, falls through to listing existing mappings. Names are
    sanitized to [a-zA-Z0-9_-] with disallowed chars replaced by '-'; a name
    that sanitizes to nothing (e.g. "!!!") is rejected instead of silently
    mapping the directory to an empty session name.
    """
    if not args.session_name:
        cmd_sessions(args)
        return
    cwd = os.getcwd()
    session_name = args.session_name.strip()
    if not session_name:
        print(" Session name cannot be empty.\n")
        return
    import re
    sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_name).strip('-')
    if not sanitized:
        # Fix: previously a name of only disallowed chars sanitized to ""
        # and the directory was mapped to an empty session name.
        print(" Session name cannot be empty.\n")
        return
    if sanitized != session_name:
        print(f" Session name sanitized to: {sanitized}")
        session_name = sanitized
    cfg = _read_config()
    cfg.setdefault("sessions", {})[cwd] = session_name
    _write_config(cfg)
    print(f" Mapped {cwd}\n{session_name}\n")
def cmd_peer(args) -> None:
    """Show or update peer names and dialectic reasoning level."""
    REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
    cfg = _read_config()
    want_user = getattr(args, "user", None)
    want_ai = getattr(args, "ai", None)
    want_level = getattr(args, "reasoning", None)
    if (want_user, want_ai, want_level) == (None, None, None):
        # Display mode — no flags were given.
        hermes = cfg.get("hosts", {}).get(HOST, {})
        print(f"\nHoncho peer config\n" + "" * 40)
        print(f" User peer: {cfg.get('peerName') or '(not set)'}")
        print(f" AI peer: {hermes.get('aiPeer') or cfg.get('aiPeer') or HOST}")
        lvl = hermes.get("dialecticReasoningLevel") or cfg.get("dialecticReasoningLevel") or "low"
        max_chars = hermes.get("dialecticMaxChars") or cfg.get("dialecticMaxChars") or 600
        print(f" Dialectic level: {lvl} (options: {', '.join(REASONING_LEVELS)})")
        print(f" Dialectic cap: {max_chars} chars\n")
        return
    changed = False
    if want_user is not None:
        cfg["peerName"] = want_user.strip()
        changed = True
        print(f" User peer → {cfg['peerName']}")
    if want_ai is not None:
        cfg.setdefault("hosts", {}).setdefault(HOST, {})["aiPeer"] = want_ai.strip()
        changed = True
        print(f" AI peer → {want_ai.strip()}")
    if want_level is not None:
        # Defensive re-check; argparse choices already restrict the value.
        # Note: an invalid level returns WITHOUT saving earlier changes,
        # matching the original behavior.
        if want_level not in REASONING_LEVELS:
            print(f" Invalid reasoning level '{want_level}'. Options: {', '.join(REASONING_LEVELS)}")
            return
        cfg.setdefault("hosts", {}).setdefault(HOST, {})["dialecticReasoningLevel"] = want_level
        changed = True
        print(f" Dialectic reasoning level → {want_level}")
    if changed:
        _write_config(cfg)
        print(f" Saved to {GLOBAL_CONFIG_PATH}\n")
def cmd_mode(args) -> None:
    """Show or set the memory mode."""
    MODES = {
        "hybrid": "write to both Honcho and local MEMORY.md (default)",
        "honcho": "Honcho only — MEMORY.md writes disabled",
        "local": "MEMORY.md only — Honcho disabled",
    }
    cfg = _read_config()
    requested = getattr(args, "mode", None)
    if requested is None:
        # Display mode: host-level override wins over the flat key.
        host_cfg = (cfg.get("hosts") or {}).get(HOST, {})
        current = host_cfg.get("memoryMode") or cfg.get("memoryMode") or "hybrid"
        print(f"\nHoncho memory mode\n" + "" * 40)
        for name, desc in MODES.items():
            marker = "" if name == current else ""
            print(f" {name:<8} {desc}{marker}")
        print(f"\n Set with: hermes honcho mode [hybrid|honcho|local]\n")
        return
    # Defensive re-check; argparse choices already restrict the value.
    if requested not in MODES:
        print(f" Invalid mode '{requested}'. Options: {', '.join(MODES)}\n")
        return
    cfg.setdefault("hosts", {}).setdefault(HOST, {})["memoryMode"] = requested
    _write_config(cfg)
    print(f" Memory mode → {requested} ({MODES[requested]})\n")
def cmd_tokens(args) -> None:
    """Show or set token budget settings."""
    cfg = _read_config()
    host_cfg = cfg.get("hosts", {}).get(HOST, {})
    ctx_arg = getattr(args, "context", None)
    dia_arg = getattr(args, "dialectic", None)
    if ctx_arg is None and dia_arg is None:
        # Display mode: host-level values win over flat keys.
        ctx_tokens = host_cfg.get("contextTokens") or cfg.get("contextTokens") or "(Honcho default)"
        d_chars = host_cfg.get("dialecticMaxChars") or cfg.get("dialecticMaxChars") or 600
        d_level = host_cfg.get("dialecticReasoningLevel") or cfg.get("dialecticReasoningLevel") or "low"
        print(f"\nHoncho token settings\n" + "" * 40)
        print(f" context tokens: {ctx_tokens}")
        print(f" Max tokens Honcho returns from session.context() per turn.")
        print(f" Injected into Hermes system prompt — counts against your LLM budget.")
        print(f" dialectic cap: {d_chars} chars")
        print(f" Max chars of peer.chat() result injected per turn.")
        print(f" dialectic level: {d_level} (controls Honcho-side inference depth)")
        print(f"\n Set with: hermes honcho tokens [--context N] [--dialectic N]\n")
        return
    changed = False
    if ctx_arg is not None:
        cfg.setdefault("hosts", {}).setdefault(HOST, {})["contextTokens"] = ctx_arg
        print(f" context tokens → {ctx_arg}")
        changed = True
    if dia_arg is not None:
        cfg.setdefault("hosts", {}).setdefault(HOST, {})["dialecticMaxChars"] = dia_arg
        print(f" dialectic cap → {dia_arg} chars")
        changed = True
    if changed:
        _write_config(cfg)
        print(f" Saved to {GLOBAL_CONFIG_PATH}\n")
def cmd_identity(args) -> None:
    """Seed AI peer identity or show both peer representations.

    Three modes, selected by args:
      --show   : print the user peer card and the AI peer representation
      <file>   : seed the AI peer from a text file (SOUL.md etc.)
      (no args): print usage hints

    Requires a configured API key and connects to Honcho up front in all
    modes (the session is created before any read or seed call).
    """
    cfg = _read_config()
    if not cfg.get("apiKey"):
        print(" No API key configured. Run 'hermes honcho setup' first.\n")
        return
    file_path = getattr(args, "file", None)
    show = getattr(args, "show", False)
    try:
        from honcho_integration.client import HonchoClientConfig, get_honcho_client
        from honcho_integration.session import HonchoSessionManager
        hcfg = HonchoClientConfig.from_global_config()
        client = get_honcho_client(hcfg)
        mgr = HonchoSessionManager(honcho=client, config=hcfg)
        session_key = hcfg.resolve_session_name()
        # Ensure the session exists before any card/representation access.
        mgr.get_or_create(session_key)
    except Exception as e:
        print(f" Honcho connection failed: {e}\n")
        return
    if show:
        # ── User peer ────────────────────────────────────────────────────────
        user_card = mgr.get_peer_card(session_key)
        print(f"\nUser peer ({hcfg.peer_name or 'not set'})\n" + "" * 40)
        if user_card:
            for fact in user_card:
                print(f" {fact}")
        else:
            print(" No user peer card yet. Send a few messages to build one.")
        # ── AI peer ──────────────────────────────────────────────────────────
        # Prefer the full representation; fall back to the peer card.
        ai_rep = mgr.get_ai_representation(session_key)
        print(f"\nAI peer ({hcfg.ai_peer})\n" + "" * 40)
        if ai_rep.get("representation"):
            print(ai_rep["representation"])
        elif ai_rep.get("card"):
            print(ai_rep["card"])
        else:
            print(" No representation built yet.")
            print(" Run 'hermes honcho identity <file>' to seed one.")
        print()
        return
    if not file_path:
        # Usage mode — neither --show nor a file was given.
        print("\nHoncho identity management\n" + "" * 40)
        print(f" User peer: {hcfg.peer_name or 'not set'}")
        print(f" AI peer: {hcfg.ai_peer}")
        print()
        print(" hermes honcho identity --show — show both peer representations")
        print(" hermes honcho identity <file> — seed AI peer from SOUL.md or any .md/.txt\n")
        return
    # NOTE(review): Path is already imported at module top; this local
    # import is redundant but harmless.
    from pathlib import Path
    p = Path(file_path).expanduser()
    if not p.exists():
        print(f" File not found: {p}\n")
        return
    content = p.read_text(encoding="utf-8").strip()
    if not content:
        print(f" File is empty: {p}\n")
        return
    source = p.name
    ok = mgr.seed_ai_identity(session_key, content, source=source)
    if ok:
        print(f" Seeded AI peer identity from {p.name} into session '{session_key}'")
        print(f" Honcho will incorporate this into {hcfg.ai_peer}'s representation over time.\n")
    else:
        print(f" Failed to seed identity. Check logs for details.\n")
def cmd_migrate(args) -> None:
"""Step-by-step migration guide: OpenClaw native memory → Hermes + Honcho."""
from pathlib import Path
# ── Detect OpenClaw native memory files ──────────────────────────────────
cwd = Path(os.getcwd())
openclaw_home = Path.home() / ".openclaw"
# User peer: facts about the user
user_file_names = ["USER.md", "MEMORY.md"]
# AI peer: agent identity / configuration
agent_file_names = ["SOUL.md", "IDENTITY.md", "AGENTS.md", "TOOLS.md", "BOOTSTRAP.md"]
user_files: list[Path] = []
agent_files: list[Path] = []
for name in user_file_names:
for d in [cwd, openclaw_home]:
p = d / name
if p.exists() and p not in user_files:
user_files.append(p)
for name in agent_file_names:
for d in [cwd, openclaw_home]:
p = d / name
if p.exists() and p not in agent_files:
agent_files.append(p)
cfg = _read_config()
has_key = bool(cfg.get("apiKey", ""))
print("\nHoncho migration: OpenClaw native memory → Hermes\n" + "" * 50)
print()
print(" OpenClaw's native memory stores context in local markdown files")
print(" (USER.md, MEMORY.md, SOUL.md, ...) and injects them via QMD search.")
print(" Honcho replaces that with a cloud-backed, LLM-observable memory layer:")
print(" context is retrieved semantically, injected automatically each turn,")
print(" and enriched by a dialectic reasoning layer that builds over time.")
print()
# ── Step 1: Honcho account ────────────────────────────────────────────────
print("Step 1 Create a Honcho account")
print()
if has_key:
masked = f"...{cfg['apiKey'][-8:]}" if len(cfg["apiKey"]) > 8 else "set"
print(f" Honcho API key already configured: {masked}")
print(" Skip to Step 2.")
else:
print(" Honcho is a cloud memory service. You need a free account to use it.")
print()
print(" 1. Go to https://app.honcho.dev and create an account.")
print(" 2. Copy your API key from the dashboard.")
print(" 3. Run: hermes honcho setup")
print(" This will store the key and create a workspace for this project.")
print()
answer = _prompt(" Run 'hermes honcho setup' now?", default="y")
if answer.lower() in ("y", "yes"):
cmd_setup(args)
cfg = _read_config()
has_key = bool(cfg.get("apiKey", ""))
else:
print()
print(" Run 'hermes honcho setup' when ready, then re-run this walkthrough.")
# ── Step 2: Detected files ────────────────────────────────────────────────
print()
print("Step 2 Detected OpenClaw memory files")
print()
if user_files or agent_files:
if user_files:
print(f" User memory ({len(user_files)} file(s)) — will go to Honcho user peer:")
for f in user_files:
print(f" {f}")
if agent_files:
print(f" Agent identity ({len(agent_files)} file(s)) — will go to Honcho AI peer:")
for f in agent_files:
print(f" {f}")
else:
print(" No OpenClaw native memory files found in cwd or ~/.openclaw/.")
print(" If your files are elsewhere, copy them here before continuing,")
print(" or seed them manually: hermes honcho identity <path/to/file>")
# ── Step 3: Migrate user memory ───────────────────────────────────────────
print()
print("Step 3 Migrate user memory files → Honcho user peer")
print()
print(" USER.md and MEMORY.md contain facts about you that the agent should")
print(" remember across sessions. Honcho will store these under your user peer")
print(" and inject relevant excerpts into the system prompt automatically.")
print()
if user_files:
print(f" Found: {', '.join(f.name for f in user_files)}")
print()
print(" These are picked up automatically the first time you run 'hermes'")
print(" with Honcho configured and no prior session history.")
print(" (Hermes calls migrate_memory_files() on first session init.)")
print()
print(" If you want to migrate them now without starting a session:")
for f in user_files:
print(f" hermes honcho migrate — this step handles it interactively")
if has_key:
answer = _prompt(" Upload user memory files to Honcho now?", default="y")
if answer.lower() in ("y", "yes"):
try:
from honcho_integration.client import (
HonchoClientConfig,
get_honcho_client,
reset_honcho_client,
)
from honcho_integration.session import HonchoSessionManager
reset_honcho_client()
hcfg = HonchoClientConfig.from_global_config()
client = get_honcho_client(hcfg)
mgr = HonchoSessionManager(honcho=client, config=hcfg)
session_key = hcfg.resolve_session_name()
mgr.get_or_create(session_key)
# Upload from each directory that had user files
dirs_with_files = set(str(f.parent) for f in user_files)
any_uploaded = False
for d in dirs_with_files:
if mgr.migrate_memory_files(session_key, d):
any_uploaded = True
if any_uploaded:
print(f" Uploaded user memory files from: {', '.join(dirs_with_files)}")
else:
print(" Nothing uploaded (files may already be migrated or empty).")
except Exception as e:
print(f" Failed: {e}")
else:
print(" Run 'hermes honcho setup' first, then re-run this step.")
else:
print(" No user memory files detected. Nothing to migrate here.")
# ── Step 4: Seed AI identity ──────────────────────────────────────────────
print()
print("Step 4 Seed AI identity files → Honcho AI peer")
print()
print(" SOUL.md, IDENTITY.md, AGENTS.md, TOOLS.md, BOOTSTRAP.md define the")
print(" agent's character, capabilities, and behavioral rules. In OpenClaw")
print(" these are injected via file search at prompt-build time.")
print()
print(" In Hermes, they are seeded once into Honcho's AI peer through the")
print(" observation pipeline. Honcho builds a representation from them and")
print(" from every subsequent assistant message (observe_me=True). Over time")
print(" the representation reflects actual behavior, not just declaration.")
print()
if agent_files:
print(f" Found: {', '.join(f.name for f in agent_files)}")
print()
if has_key:
answer = _prompt(" Seed AI identity from all detected files now?", default="y")
if answer.lower() in ("y", "yes"):
try:
from honcho_integration.client import (
HonchoClientConfig,
get_honcho_client,
reset_honcho_client,
)
from honcho_integration.session import HonchoSessionManager
reset_honcho_client()
hcfg = HonchoClientConfig.from_global_config()
client = get_honcho_client(hcfg)
mgr = HonchoSessionManager(honcho=client, config=hcfg)
session_key = hcfg.resolve_session_name()
mgr.get_or_create(session_key)
for f in agent_files:
content = f.read_text(encoding="utf-8").strip()
if content:
ok = mgr.seed_ai_identity(session_key, content, source=f.name)
status = "seeded" if ok else "failed"
print(f" {f.name}: {status}")
except Exception as e:
print(f" Failed: {e}")
else:
print(" Run 'hermes honcho setup' first, then seed manually:")
for f in agent_files:
print(f" hermes honcho identity {f}")
else:
print(" No agent identity files detected.")
print(" To seed manually: hermes honcho identity <path/to/SOUL.md>")
# ── Step 5: What changes ──────────────────────────────────────────────────
print()
print("Step 5 What changes vs. OpenClaw native memory")
print()
print(" Storage")
print(" OpenClaw: markdown files on disk, searched via QMD at prompt-build time.")
print(" Hermes: cloud-backed Honcho peers. Files can stay on disk as source")
print(" of truth; Honcho holds the live representation.")
print()
print(" Context injection")
print(" OpenClaw: file excerpts injected synchronously before each LLM call.")
print(" Hermes: Honcho context prefetched async at turn end, injected next turn.")
print(" First turn has no Honcho context; subsequent turns are loaded.")
print()
print(" Memory growth")
print(" OpenClaw: you edit files manually to update memory.")
print(" Hermes: Honcho observes every message and updates representations")
print(" automatically. Files become the seed, not the live store.")
print()
print(" Tool surface (available to the agent during conversation)")
print(" query_user_context — ask Honcho a question, get a synthesized answer (LLM)")
print(" honcho_search — semantic search over stored context (no LLM)")
print(" honcho_profile — fast peer card snapshot (no LLM)")
print()
print(" Session naming")
print(" OpenClaw: no persistent session concept — files are global.")
print(" Hermes: per-session by default — each run gets a new Honcho session")
print(" Map a custom name: hermes honcho map <session-name>")
# ── Step 6: Next steps ────────────────────────────────────────────────────
print()
print("Step 6 Next steps")
print()
if not has_key:
print(" 1. hermes honcho setup — configure API key (required)")
print(" 2. hermes honcho migrate — re-run this walkthrough")
else:
print(" 1. hermes honcho status — verify Honcho connection")
print(" 2. hermes — start a session")
print(" (user memory files auto-uploaded on first turn if not done above)")
print(" 3. hermes honcho identity --show — verify AI peer representation")
print(" 4. hermes honcho tokens — tune context and dialectic budgets")
print(" 5. hermes honcho mode — view or change memory mode")
print()
def honcho_command(args) -> None:
    """Route honcho subcommands to their handler functions."""
    handlers = {
        "setup": cmd_setup,
        "status": cmd_status,
        "sessions": cmd_sessions,
        "map": cmd_map,
        "peer": cmd_peer,
        "mode": cmd_mode,
        "tokens": cmd_tokens,
        "identity": cmd_identity,
        "migrate": cmd_migrate,
    }
    sub = getattr(args, "honcho_command", None)
    # Bare "hermes honcho" behaves like "hermes honcho setup".
    if sub is None:
        cmd_setup(args)
        return
    handler = handlers.get(sub)
    if handler is not None:
        handler(args)
    else:
        print(f" Unknown honcho command: {sub}")
        print(" Available: setup, status, sessions, map, peer, mode, tokens, identity, migrate\n")

View file

@ -27,6 +27,30 @@ GLOBAL_CONFIG_PATH = Path.home() / ".honcho" / "config.json"
HOST = "hermes"
def _resolve_memory_mode(
global_val: str | dict,
host_val: str | dict | None,
) -> dict:
"""Parse memoryMode (string or object) into memory_mode + peer_memory_modes.
Resolution order: host-level wins over global.
String form: applies as the default for all peers.
Object form: { "default": "hybrid", "hermes": "honcho", ... }
"default" key sets the fallback; other keys are per-peer overrides.
"""
# Pick the winning value (host beats global)
val = host_val if host_val is not None else global_val
if isinstance(val, dict):
default = val.get("default", "hybrid")
overrides = {k: v for k, v in val.items() if k != "default"}
else:
default = str(val) if val else "hybrid"
overrides = {}
return {"memory_mode": default, "peer_memory_modes": overrides}
@dataclass
class HonchoClientConfig:
"""Configuration for Honcho client, resolved for a specific host."""
@ -42,10 +66,36 @@ class HonchoClientConfig:
# Toggles
enabled: bool = False
save_messages: bool = True
# memoryMode: default for all peers. "hybrid" / "honcho" / "local"
memory_mode: str = "hybrid"
# Per-peer overrides — any named Honcho peer. Override memory_mode when set.
# Config object form: "memoryMode": { "default": "hybrid", "hermes": "honcho" }
peer_memory_modes: dict[str, str] = field(default_factory=dict)
def peer_memory_mode(self, peer_name: str) -> str:
"""Return the effective memory mode for a named peer.
Resolution: per-peer override global memory_mode default.
"""
return self.peer_memory_modes.get(peer_name, self.memory_mode)
# Write frequency: "async" (background thread), "turn" (sync per turn),
# "session" (flush on session end), or int (every N turns)
write_frequency: str | int = "async"
# Prefetch budget
context_tokens: int | None = None
# Dialectic (peer.chat) settings
# reasoning_level: "minimal" | "low" | "medium" | "high" | "max"
# Used as the default; prefetch_dialectic may bump it dynamically.
dialectic_reasoning_level: str = "low"
# Max chars of dialectic result to inject into Hermes system prompt
dialectic_max_chars: int = 600
# Recall mode: how memory retrieval works when Honcho is active.
# "auto" — pre-warmed context + memory tools available (model decides)
# "context" — pre-warmed context only, honcho memory tools removed
# "tools" — no pre-loaded context, rely on tool calls only
recall_mode: str = "auto"
# Session resolution
session_strategy: str = "per-directory"
session_strategy: str = "per-session"
session_peer_prefix: bool = False
sessions: dict[str, str] = field(default_factory=dict)
# Raw global config for anything else consumers need
@ -109,6 +159,17 @@ class HonchoClientConfig:
# Respect explicit setting
enabled = explicit_enabled
# write_frequency: accept int or string
raw_wf = (
host_block.get("writeFrequency")
or raw.get("writeFrequency")
or "async"
)
try:
write_frequency: str | int = int(raw_wf)
except (TypeError, ValueError):
write_frequency = str(raw_wf)
return cls(
host=host,
workspace_id=workspace,
@ -119,31 +180,105 @@ class HonchoClientConfig:
linked_hosts=linked_hosts,
enabled=enabled,
save_messages=raw.get("saveMessages", True),
context_tokens=raw.get("contextTokens") or host_block.get("contextTokens"),
session_strategy=raw.get("sessionStrategy", "per-directory"),
**_resolve_memory_mode(
raw.get("memoryMode", "hybrid"),
host_block.get("memoryMode"),
),
write_frequency=write_frequency,
context_tokens=host_block.get("contextTokens") or raw.get("contextTokens"),
dialectic_reasoning_level=(
host_block.get("dialecticReasoningLevel")
or raw.get("dialecticReasoningLevel")
or "low"
),
dialectic_max_chars=int(
host_block.get("dialecticMaxChars")
or raw.get("dialecticMaxChars")
or 600
),
recall_mode=(
host_block.get("recallMode")
or raw.get("recallMode")
or "auto"
),
session_strategy=raw.get("sessionStrategy", "per-session"),
session_peer_prefix=raw.get("sessionPeerPrefix", False),
sessions=raw.get("sessions", {}),
raw=raw,
)
def resolve_session_name(self, cwd: str | None = None) -> str | None:
"""Resolve session name for a directory.
@staticmethod
def _git_repo_name(cwd: str) -> str | None:
    """Return the git repo root directory name, or None if not in a repo.

    Shells out to `git rev-parse --show-toplevel` with a 5s timeout.
    Returns None when git is missing, times out, or cwd is not inside
    a repository.
    """
    # NOTE: the rendered block contained a stray removed-diff docstring
    # line mid-function; removed here. Keep only the operation that can
    # raise inside the try.
    import subprocess

    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True, text=True, cwd=cwd, timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode == 0:
        return Path(proc.stdout.strip()).name
    return None
def resolve_session_name(
    self,
    cwd: str | None = None,
    session_title: str | None = None,
    session_id: str | None = None,
) -> str | None:
    """Resolve Honcho session name.

    Resolution order:
      1. Manual directory override from sessions map
      2. Hermes session title (from /title command)
      3. per-session strategy — Hermes session_id ({timestamp}_{hex})
      4. per-repo strategy — git repo root directory name
      5. per-directory strategy — directory basename
      6. global strategy — workspace name

    Args:
        cwd: Working directory to resolve against; defaults to os.getcwd().
        session_title: Optional human title set via /title.
        session_id: Hermes session id, used by the per-session strategy.

    Returns:
        The resolved Honcho session name (workspace name for "global").
    """
    import re

    if not cwd:
        cwd = os.getcwd()

    def _prefixed(base: str) -> str:
        # Optional "<peer>-" prefix so multiple peers can share a workspace.
        if self.session_peer_prefix and self.peer_name:
            return f"{self.peer_name}-{base}"
        return base

    # Manual override always wins
    manual = self.sessions.get(cwd)
    if manual:
        return manual
    # /title mid-session remap
    if session_title:
        sanitized = re.sub(r'[^a-zA-Z0-9_-]', '-', session_title).strip('-')
        if sanitized:
            return _prefixed(sanitized)
    # per-session: inherit Hermes session_id (new Honcho session each run)
    if self.session_strategy == "per-session" and session_id:
        return _prefixed(session_id)
    # per-repo: one Honcho session per git repository
    if self.session_strategy == "per-repo":
        return _prefixed(self._git_repo_name(cwd) or Path(cwd).name)
    # per-directory (or per-session without an id): one session per cwd
    if self.session_strategy in ("per-directory", "per-session"):
        return _prefixed(Path(cwd).name)
    # global: single session across all directories
    return self.workspace_id
def get_linked_workspaces(self) -> list[str]:
"""Resolve linked host keys to workspace names."""

View file

@ -2,8 +2,10 @@
from __future__ import annotations
import queue
import re
import logging
import threading
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, TYPE_CHECKING
@ -15,6 +17,9 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
# Sentinel to signal the async writer thread to shut down
_ASYNC_SHUTDOWN = object()
@dataclass
class HonchoSession:
@ -80,7 +85,8 @@ class HonchoSessionManager:
Args:
honcho: Optional Honcho client. If not provided, uses the singleton.
context_tokens: Max tokens for context() calls (None = Honcho default).
config: HonchoClientConfig from global config (provides peer_name, ai_peer, etc.).
config: HonchoClientConfig from global config (provides peer_name, ai_peer,
write_frequency, memory_mode, etc.).
"""
self._honcho = honcho
self._context_tokens = context_tokens
@ -89,6 +95,33 @@ class HonchoSessionManager:
self._peers_cache: dict[str, Any] = {}
self._sessions_cache: dict[str, Any] = {}
# Write frequency state
write_frequency = (config.write_frequency if config else "async")
self._write_frequency = write_frequency
self._turn_counter: int = 0
# Prefetch caches: session_key → last result (consumed once per turn)
self._context_cache: dict[str, dict] = {}
self._dialectic_cache: dict[str, str] = {}
self._dialectic_reasoning_level: str = (
config.dialectic_reasoning_level if config else "low"
)
self._dialectic_max_chars: int = (
config.dialectic_max_chars if config else 600
)
# Async write queue — started lazily on first enqueue
self._async_queue: queue.Queue | None = None
self._async_thread: threading.Thread | None = None
if write_frequency == "async":
self._async_queue = queue.Queue()
self._async_thread = threading.Thread(
target=self._async_writer_loop,
name="honcho-async-writer",
daemon=True,
)
self._async_thread.start()
@property
def honcho(self) -> Honcho:
"""Get the Honcho client, initializing if needed."""
@ -125,10 +158,12 @@ class HonchoSessionManager:
session = self.honcho.session(session_id)
# Configure peer observation settings
# Configure peer observation settings.
# observe_me=True for AI peer so Honcho watches what the agent says
# and builds its representation over time — enabling identity formation.
from honcho.session import SessionPeerConfig
user_config = SessionPeerConfig(observe_me=True, observe_others=True)
ai_config = SessionPeerConfig(observe_me=False, observe_others=True)
ai_config = SessionPeerConfig(observe_me=True, observe_others=True)
session.add_peers([(user_peer, user_config), (assistant_peer, ai_config)])
@ -234,16 +269,11 @@ class HonchoSessionManager:
self._cache[key] = session
return session
def save(self, session: HonchoSession) -> None:
"""
Save messages to Honcho.
Syncs only new (unsynced) messages from the local cache.
"""
def _flush_session(self, session: HonchoSession) -> None:
"""Internal: write unsynced messages to Honcho synchronously."""
if not session.messages:
return
# Get the Honcho session and peers
user_peer = self._get_or_create_peer(session.user_peer_id)
assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
honcho_session = self._sessions_cache.get(session.honcho_session_id)
@ -253,9 +283,7 @@ class HonchoSessionManager:
session.honcho_session_id, user_peer, assistant_peer
)
# Only send new messages (those without a '_synced' flag)
new_messages = [m for m in session.messages if not m.get("_synced")]
if not new_messages:
return
@ -274,9 +302,83 @@ class HonchoSessionManager:
msg["_synced"] = False
logger.error("Failed to sync messages to Honcho: %s", e)
# Update cache
self._cache[session.key] = session
def _async_writer_loop(self) -> None:
    """Background daemon thread: drains the async write queue.

    Blocks up to 5s per get so the daemon wakes periodically. Each failed
    flush is retried once after a 2s pause; if the retry also fails the
    batch is dropped (messages keep their unsynced flag for a later
    flush_all). Exits when the _ASYNC_SHUTDOWN sentinel is dequeued.
    """
    # Hoisted: previously `import time as _time` ran inside the except
    # branch on every retry.
    import time

    while True:
        try:
            item = self._async_queue.get(timeout=5)
        except queue.Empty:
            continue
        except Exception as e:  # unexpected — keep the writer alive
            logger.error("Honcho async writer error: %s", e)
            continue
        if item is _ASYNC_SHUTDOWN:
            break
        try:
            self._flush_session(item)
        except Exception as e:
            logger.warning("Honcho async write failed, retrying once: %s", e)
            time.sleep(2)
            try:
                self._flush_session(item)
            except Exception as e2:
                logger.error("Honcho async write retry failed, dropping batch: %s", e2)
def save(self, session: HonchoSession) -> None:
    """Save messages to Honcho, respecting write_frequency.

    write_frequency modes:
        "async"   — enqueue for the background thread (no blocking)
        "turn"    — flush synchronously every turn
        "session" — defer until flush_all() is called explicitly
        N (int)   — flush every N turns
    """
    self._turn_counter += 1
    mode = self._write_frequency
    if mode == "async":
        if self._async_queue is not None:
            self._async_queue.put(session)
        return
    if mode == "turn":
        self._flush_session(session)
        return
    if mode == "session":
        # Accumulate only; caller triggers flush_all() at session end.
        return
    if isinstance(mode, int) and mode > 0 and self._turn_counter % mode == 0:
        self._flush_session(session)
def flush_all(self) -> None:
    """Flush all pending unsynced messages for all cached sessions.

    Called at session end for "session" write_frequency, or to force
    a sync before process exit regardless of mode.
    """
    for sess in list(self._cache.values()):
        try:
            self._flush_session(sess)
        except Exception as e:
            logger.error("Honcho flush_all error for %s: %s", sess.key, e)
    # Drain the async queue synchronously if it exists.
    pending = self._async_queue
    if pending is None:
        return
    while not pending.empty():
        try:
            item = pending.get_nowait()
            if item is not _ASYNC_SHUTDOWN:
                self._flush_session(item)
        except queue.Empty:
            break
def shutdown(self) -> None:
    """Gracefully shut down the async writer thread."""
    # Nothing to do unless the async pipeline was started.
    if self._async_queue is None or self._async_thread is None:
        return
    self.flush_all()
    self._async_queue.put(_ASYNC_SHUTDOWN)
    self._async_thread.join(timeout=10)
def delete(self, key: str) -> bool:
"""Delete a session from local cache."""
if key in self._cache:
@ -305,49 +407,141 @@ class HonchoSessionManager:
# get_or_create will create a fresh session
session = self.get_or_create(new_key)
# Cache under both original key and timestamped key
# Cache under the original key so callers find it by the expected name
self._cache[key] = session
self._cache[new_key] = session
logger.info("Created new session for %s (honcho: %s)", key, session.honcho_session_id)
return session
def get_user_context(self, session_key: str, query: str) -> str:
_REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")
def _dynamic_reasoning_level(self, query: str) -> str:
    """
    Pick a reasoning level based on message complexity.

    Uses the configured default as a floor; bumps up for longer or
    more complex messages so Honcho applies more inference where it matters.

    < 120 chars: default (typically "low")
    120-400 chars: one level above default (cap at "high")
    > 400 chars: two levels above default (cap at "high")

    "max" is never selected automatically; reserve it for explicit config.
    """
    levels = self._REASONING_LEVELS
    # Unrecognized configured level falls back to index 1 ("low").
    default_idx = levels.index(self._dialectic_reasoning_level) if self._dialectic_reasoning_level in levels else 1
    n = len(query)
    if n < 120:
        bump = 0
    elif n < 400:
        bump = 1
    else:
        bump = 2
    # Cap at "high" (index 3) for auto-selection
    idx = min(default_idx + bump, 3)
    return levels[idx]
def dialectic_query(self, session_key: str, query: str, reasoning_level: str | None = None) -> str:
    """
    Query Honcho's dialectic endpoint about the user.

    Runs an LLM on Honcho's backend against the user peer's full
    representation. Higher latency than context(); call async via
    prefetch_dialectic() to avoid blocking the response.

    Args:
        session_key: The session key to query against.
        query: Natural language question about the user.
        reasoning_level: Override the config default. If None, uses
            _dynamic_reasoning_level(query).

    Returns:
        Honcho's synthesized answer, or empty string on failure.
    """
    # The rendered block interleaved stale removed-diff lines (duplicate
    # returns / unreachable statements); this is the post-change version.
    session = self._cache.get(session_key)
    if not session:
        return ""
    user_peer = self._get_or_create_peer(session.user_peer_id)
    level = reasoning_level or self._dynamic_reasoning_level(query)
    try:
        result = user_peer.chat(query, reasoning_level=level) or ""
        # Apply Hermes-side char cap before caching; trim at a word
        # boundary and mark the truncation with an ellipsis.
        if result and self._dialectic_max_chars and len(result) > self._dialectic_max_chars:
            result = result[:self._dialectic_max_chars].rsplit(" ", 1)[0] + "…"
        return result
    except Exception as e:
        logger.warning("Honcho dialectic query failed: %s", e)
        return ""
def prefetch_dialectic(self, session_key: str, query: str) -> None:
    """
    Fire a dialectic_query in a background thread, caching the result.

    Non-blocking. The result is available via pop_dialectic_result()
    on the next call (typically the following turn). Reasoning level
    is selected dynamically based on query complexity.

    Args:
        session_key: The session key to query against.
        query: The user's current message, used as the query.
    """
    def _worker() -> None:
        # Only cache non-empty answers; failures yield "" and are dropped.
        answer = self.dialectic_query(session_key, query)
        if answer:
            self._dialectic_cache[session_key] = answer

    threading.Thread(
        target=_worker, name="honcho-dialectic-prefetch", daemon=True
    ).start()
def pop_dialectic_result(self, session_key: str) -> str:
    """
    Return and clear the cached dialectic result for this session.

    Returns empty string if no result is ready yet.
    """
    try:
        return self._dialectic_cache.pop(session_key)
    except KeyError:
        return ""
def prefetch_context(self, session_key: str, user_message: str | None = None) -> None:
    """
    Fire get_prefetch_context in a background thread, caching the result.

    Non-blocking. Consumed next turn via pop_context_result(). This avoids
    a synchronous HTTP round-trip blocking every response.
    """
    def _worker() -> None:
        # Empty dict means the fetch failed or nothing is stored; skip caching.
        ctx = self.get_prefetch_context(session_key, user_message)
        if ctx:
            self._context_cache[session_key] = ctx

    threading.Thread(
        target=_worker, name="honcho-context-prefetch", daemon=True
    ).start()
def pop_context_result(self, session_key: str) -> dict[str, str]:
    """
    Return and clear the cached context result for this session.

    Returns empty dict if no result is ready yet (first turn).
    """
    try:
        return self._context_cache.pop(session_key)
    except KeyError:
        return {}
def get_prefetch_context(self, session_key: str, user_message: str | None = None) -> dict[str, str]:
"""
Pre-fetch user context using Honcho's context() method.
Pre-fetch user and AI peer context from Honcho.
Single API call that returns the user's representation
and peer card, using semantic search based on the user's message.
Fetches peer_representation and peer_card for both peers. search_query
is intentionally omitted it would only affect additional excerpts
that this code does not consume, and passing the raw message exposes
conversation content in server access logs.
Args:
session_key: The session key to get context for.
user_message: The user's message for semantic search.
user_message: Unused; kept for call-site compatibility.
Returns:
Dictionary with 'representation' and 'card' keys.
Dictionary with 'representation', 'card', 'ai_representation',
and 'ai_card' keys.
"""
session = self._cache.get(session_key)
if not session:
@ -357,23 +551,35 @@ class HonchoSessionManager:
if not honcho_session:
return {}
result: dict[str, str] = {}
try:
ctx = honcho_session.context(
summary=False,
tokens=self._context_tokens,
peer_target=session.user_peer_id,
search_query=user_message,
peer_perspective=session.assistant_peer_id,
)
# peer_card is list[str] in SDK v2, join for prompt injection
card = ctx.peer_card or []
card_str = "\n".join(card) if isinstance(card, list) else str(card)
return {
"representation": ctx.peer_representation or "",
"card": card_str,
}
result["representation"] = ctx.peer_representation or ""
result["card"] = "\n".join(card) if isinstance(card, list) else str(card)
except Exception as e:
logger.warning("Failed to fetch context from Honcho: %s", e)
return {}
logger.warning("Failed to fetch user context from Honcho: %s", e)
# Also fetch AI peer's own representation so Hermes knows itself.
try:
ai_ctx = honcho_session.context(
summary=False,
tokens=self._context_tokens,
peer_target=session.assistant_peer_id,
peer_perspective=session.user_peer_id,
)
ai_card = ai_ctx.peer_card or []
result["ai_representation"] = ai_ctx.peer_representation or ""
result["ai_card"] = "\n".join(ai_card) if isinstance(ai_card, list) else str(ai_card)
except Exception as e:
logger.debug("Failed to fetch AI peer context from Honcho: %s", e)
return result
def migrate_local_history(self, session_key: str, messages: list[dict[str, Any]]) -> bool:
"""
@ -491,6 +697,7 @@ class HonchoSessionManager:
files = [
("MEMORY.md", "consolidated_memory.md", "Long-term agent notes and preferences"),
("USER.md", "user_profile.md", "User profile and preferences"),
("SOUL.md", "agent_soul.md", "Agent persona and identity configuration"),
]
for filename, upload_name, description in files:
@ -525,6 +732,150 @@ class HonchoSessionManager:
return uploaded
def get_peer_card(self, session_key: str) -> list[str]:
    """
    Fetch the user peer's card: a curated list of key facts.

    Fast, no LLM reasoning. Returns raw structured facts Honcho has
    inferred about the user (name, role, preferences, patterns).
    Empty list if unavailable.
    """
    sess = self._cache.get(session_key)
    if not sess:
        return []
    remote = self._sessions_cache.get(sess.honcho_session_id)
    if not remote:
        return []
    try:
        ctx = remote.context(
            summary=False,
            tokens=200,
            peer_target=sess.user_peer_id,
            peer_perspective=sess.assistant_peer_id,
        )
        facts = ctx.peer_card or []
        if isinstance(facts, list):
            return facts
        return [str(facts)]
    except Exception as e:
        logger.debug("Failed to fetch peer card from Honcho: %s", e)
        return []
def search_context(self, session_key: str, query: str, max_tokens: int = 800) -> str:
    """
    Semantic search over Honcho session context.

    Returns raw excerpts ranked by relevance to the query. No LLM
    reasoning, so it is cheaper and faster than dialectic_query. Good for
    factual lookups where the model will do its own synthesis.

    Args:
        session_key: Session to search against.
        query: Search query for semantic matching.
        max_tokens: Token budget for returned content.

    Returns:
        Relevant context excerpts as a string, or empty string if none.
    """
    sess = self._cache.get(session_key)
    if not sess:
        return ""
    remote = self._sessions_cache.get(sess.honcho_session_id)
    if not remote:
        return ""
    try:
        ctx = remote.context(
            summary=False,
            tokens=max_tokens,
            peer_target=sess.user_peer_id,
            peer_perspective=sess.assistant_peer_id,
            search_query=query,
        )
        chunks: list[str] = []
        if ctx.peer_representation:
            chunks.append(ctx.peer_representation)
        raw_card = ctx.peer_card or []
        if raw_card:
            facts = raw_card if isinstance(raw_card, list) else [str(raw_card)]
            chunks.append("\n".join(f"- {f}" for f in facts))
        return "\n\n".join(chunks)
    except Exception as e:
        logger.debug("Honcho search_context failed: %s", e)
        return ""
def seed_ai_identity(self, session_key: str, content: str, source: str = "manual") -> bool:
    """
    Seed the AI peer's Honcho representation from text content.

    Useful for priming AI identity from SOUL.md, exported chats, or
    any structured description. The content is sent as an assistant
    peer message so Honcho's reasoning model can incorporate it.

    Args:
        session_key: The session key to associate with.
        content: The identity/persona content to seed.
        source: Metadata tag for the source (e.g. "soul_md", "export").

    Returns:
        True on success, False on failure.
    """
    if not content or not content.strip():
        return False
    sess = self._cache.get(session_key)
    if not sess:
        logger.warning("No session cached for '%s', skipping AI seed", session_key)
        return False
    ai_peer = self._get_or_create_peer(sess.assistant_peer_id)
    # Wrap the payload so Honcho can distinguish seeds from live messages.
    wrapped = (
        f"<ai_identity_seed>\n"
        f"<source>{source}</source>\n"
        f"\n"
        f"{content.strip()}\n"
        f"</ai_identity_seed>"
    )
    try:
        ai_peer.add_message("assistant", wrapped)
    except Exception as e:
        logger.error("Failed to seed AI identity: %s", e)
        return False
    logger.info("Seeded AI identity from '%s' into %s", source, session_key)
    return True
def get_ai_representation(self, session_key: str) -> dict[str, str]:
    """
    Fetch the AI peer's current Honcho representation.

    Returns:
        Dict with 'representation' and 'card' keys, empty strings if unavailable.
    """
    sess = self._cache.get(session_key)
    if not sess:
        return {"representation": "", "card": ""}
    remote = self._sessions_cache.get(sess.honcho_session_id)
    if not remote:
        return {"representation": "", "card": ""}
    try:
        ctx = remote.context(
            summary=False,
            tokens=self._context_tokens,
            peer_target=sess.assistant_peer_id,
            peer_perspective=sess.user_peer_id,
        )
        raw_card = ctx.peer_card or []
        card_text = "\n".join(raw_card) if isinstance(raw_card, list) else str(raw_card)
        return {
            "representation": ctx.peer_representation or "",
            "card": card_text,
        }
    except Exception as e:
        logger.debug("Failed to fetch AI representation: %s", e)
        return {"representation": "", "card": ""}
def list_sessions(self) -> list[dict[str, Any]]:
"""List all cached sessions."""
return [

View file

@ -545,10 +545,12 @@ class AIAgent:
# Reads ~/.honcho/config.json as the single source of truth.
self._honcho = None # HonchoSessionManager | None
self._honcho_session_key = honcho_session_key
self._honcho_config = None # HonchoClientConfig | None
if not skip_memory:
try:
from honcho_integration.client import HonchoClientConfig, get_honcho_client
hcfg = HonchoClientConfig.from_global_config()
self._honcho_config = hcfg
if hcfg.enabled and hcfg.api_key:
from honcho_integration.session import HonchoSessionManager
client = get_honcho_client(hcfg)
@ -557,30 +559,144 @@ class AIAgent:
config=hcfg,
context_tokens=hcfg.context_tokens,
)
# Resolve session key: explicit arg > global sessions map > fallback
# Resolve session key: explicit arg > sessions map > title > per-session id > directory
if not self._honcho_session_key:
# Pull title from SessionDB if available
session_title = None
if session_db is not None:
try:
session_title = session_db.get_session_title(session_id or "")
except Exception:
pass
self._honcho_session_key = (
hcfg.resolve_session_name()
hcfg.resolve_session_name(
session_title=session_title,
session_id=self.session_id,
)
or "hermes-default"
)
# Ensure session exists in Honcho
self._honcho.get_or_create(self._honcho_session_key)
# Ensure session exists in Honcho; migrate local data on first activation
honcho_sess = self._honcho.get_or_create(self._honcho_session_key)
if not honcho_sess.messages:
# New Honcho session — migrate any existing local data
_conv = getattr(self, 'conversation_history', None) or []
if _conv:
try:
self._honcho.migrate_local_history(
self._honcho_session_key, _conv
)
logger.info("Migrated %d local messages to Honcho", len(_conv))
except Exception as _e:
logger.debug("Local history migration failed (non-fatal): %s", _e)
try:
from hermes_cli.config import get_hermes_home
_mem_dir = str(get_hermes_home() / "memories")
self._honcho.migrate_memory_files(
self._honcho_session_key, _mem_dir
)
except Exception as _e:
logger.debug("Memory files migration failed (non-fatal): %s", _e)
# Inject session context into the honcho tool module
from tools.honcho_tools import set_session_context
set_session_context(self._honcho, self._honcho_session_key)
# In "context" mode, skip honcho tool registration entirely —
# all memory retrieval comes from the pre-warmed system prompt.
if hcfg.recall_mode != "context":
# Rebuild tool definitions now that Honcho check_fn will pass.
# (Tools were built before Honcho init, so query_user_context
# was filtered out by _check_honcho_available() returning False.)
self.tools = get_tool_definitions(
enabled_toolsets=enabled_toolsets,
disabled_toolsets=disabled_toolsets,
quiet_mode=True, # already printed tool list above
)
self.valid_tool_names = {
tool["function"]["name"] for tool in self.tools
} if self.tools else set()
if not self.quiet_mode:
print(f" Honcho active — recall_mode: {hcfg.recall_mode}")
else:
if not self.quiet_mode:
print(" Honcho active — recall_mode: context (tools suppressed)")
logger.info(
"Honcho active (session: %s, user: %s, workspace: %s)",
"Honcho active (session: %s, user: %s, workspace: %s, "
"write_frequency: %s, memory_mode: %s)",
self._honcho_session_key, hcfg.peer_name, hcfg.workspace_id,
hcfg.write_frequency, hcfg.memory_mode,
)
# Warm caches when recall_mode allows pre-loaded context.
# "tools" mode skips warm entirely (tool calls handle recall).
_recall_mode = hcfg.recall_mode
if _recall_mode != "tools":
try:
_ctx = self._honcho.get_prefetch_context(self._honcho_session_key)
if _ctx:
self._honcho._context_cache[self._honcho_session_key] = _ctx
logger.debug("Honcho context pre-warmed for first turn")
except Exception as _e:
logger.debug("Honcho context prefetch failed (non-fatal): %s", _e)
try:
_cwd = os.path.basename(os.getcwd())
_dialectic = self._honcho.dialectic_query(
self._honcho_session_key,
f"What has the user been working on recently in {_cwd}? "
"Summarize the current project context and where we left off.",
)
if _dialectic:
self._honcho._dialectic_cache[self._honcho_session_key] = _dialectic
logger.debug("Honcho dialectic pre-warmed for first turn")
except Exception as _e:
logger.debug("Honcho dialectic prefetch failed (non-fatal): %s", _e)
# Register SIGTERM/SIGINT handlers to flush pending async writes
# before the process exits. signal.signal() only works on the main
# thread; AIAgent may be initialised from a worker thread in cli.py.
import signal as _signal
import threading as _threading
_honcho_ref = self._honcho
if _threading.current_thread() is _threading.main_thread():
def _honcho_flush_handler(signum, frame):
try:
_honcho_ref.flush_all()
except Exception:
pass
if signum == _signal.SIGINT:
raise KeyboardInterrupt
raise SystemExit(0)
_signal.signal(_signal.SIGTERM, _honcho_flush_handler)
_signal.signal(_signal.SIGINT, _honcho_flush_handler)
else:
if not hcfg.enabled:
logger.debug("Honcho disabled in global config")
elif not hcfg.api_key:
logger.debug("Honcho enabled but no API key configured")
except Exception as e:
logger.debug("Honcho init failed (non-fatal): %s", e)
logger.warning("Honcho init failed — memory disabled: %s", e)
print(f" Honcho init failed: {e}")
print(" Run 'hermes honcho setup' to reconfigure.")
self._honcho = None
# Gate local memory writes based on per-peer memory modes.
# AI peer governs MEMORY.md; user peer governs USER.md.
# "honcho" = Honcho only, disable local; "local" = local only, no Honcho sync.
if self._honcho_config and self._honcho:
_hcfg = self._honcho_config
_agent_mode = _hcfg.peer_memory_mode(_hcfg.ai_peer)
_user_mode = _hcfg.peer_memory_mode(_hcfg.peer_name or "user")
if _agent_mode == "honcho":
self._memory_flush_min_turns = 0
self._memory_enabled = False
logger.debug("peer %s memory_mode=honcho: local MEMORY.md writes disabled", _hcfg.ai_peer)
if _user_mode == "honcho":
self._user_profile_enabled = False
logger.debug("peer %s memory_mode=honcho: local USER.md writes disabled", _hcfg.peer_name or "user")
# Skills config: nudge interval for skill creation reminders
self._skill_nudge_interval = 15
try:
@ -1318,30 +1434,59 @@ class AIAgent:
# ── Honcho integration helpers ──
def _honcho_prefetch(self, user_message: str) -> str:
"""Fetch user context from Honcho for system prompt injection.
"""Assemble Honcho context from cached background fetches.
Returns a formatted context block, or empty string if unavailable.
Both session.context() and peer.chat() (dialectic) are fired as
background threads at the end of each turn via _honcho_fire_prefetch().
This method just reads the cached results no blocking HTTP calls.
First turn uses synchronously pre-warmed caches from init.
Subsequent turns use async prefetch results from the previous turn end.
"""
if not self._honcho or not self._honcho_session_key:
return ""
try:
ctx = self._honcho.get_prefetch_context(self._honcho_session_key, user_message)
if not ctx:
return ""
parts = []
rep = ctx.get("representation", "")
card = ctx.get("card", "")
if rep:
parts.append(rep)
if card:
parts.append(card)
ctx = self._honcho.pop_context_result(self._honcho_session_key)
if ctx:
rep = ctx.get("representation", "")
card = ctx.get("card", "")
if rep:
parts.append(f"## User representation\n{rep}")
if card:
parts.append(card)
ai_rep = ctx.get("ai_representation", "")
ai_card = ctx.get("ai_card", "")
if ai_rep:
parts.append(f"## AI peer representation\n{ai_rep}")
if ai_card:
parts.append(ai_card)
dialectic = self._honcho.pop_dialectic_result(self._honcho_session_key)
if dialectic:
parts.append(f"[Honcho dialectic]\n{dialectic}")
if not parts:
return ""
return "# Honcho User Context\n" + "\n\n".join(parts)
header = (
"# Honcho Memory (persistent cross-session context)\n"
"Use this to answer questions about the user, prior sessions, "
"and what you were working on together. Do not call tools to "
"look up information that is already present here.\n"
)
return header + "\n\n".join(parts)
except Exception as e:
logger.debug("Honcho prefetch failed (non-fatal): %s", e)
return ""
def _honcho_fire_prefetch(self, user_message: str) -> None:
"""Fire both Honcho background fetches for the next turn (non-blocking)."""
if not self._honcho or not self._honcho_session_key:
return
self._honcho.prefetch_context(self._honcho_session_key, user_message)
self._honcho.prefetch_dialectic(self._honcho_session_key, user_message)
def _honcho_save_user_observation(self, content: str) -> str:
"""Route a memory tool target=user add to Honcho.
@ -1367,13 +1512,24 @@ class AIAgent:
"""Sync the user/assistant message pair to Honcho."""
if not self._honcho or not self._honcho_session_key:
return
# Skip Honcho sync only if BOTH peer modes are local
_cfg = self._honcho_config
if _cfg and all(
_cfg.peer_memory_mode(p) == "local"
for p in (_cfg.ai_peer, _cfg.peer_name or "user")
):
return
try:
session = self._honcho.get_or_create(self._honcho_session_key)
session.add_message("user", user_content)
session.add_message("assistant", assistant_content)
self._honcho.save(session)
logger.info("Honcho sync queued for session %s (%d messages)",
self._honcho_session_key, len(session.messages))
except Exception as e:
logger.debug("Honcho sync failed (non-fatal): %s", e)
logger.warning("Honcho sync failed: %s", e)
if not self.quiet_mode:
print(f" Honcho write failed: {e}")
def _build_system_prompt(self, system_message: str = None) -> str:
"""
@ -1391,7 +1547,21 @@ class AIAgent:
# 5. Context files (SOUL.md, AGENTS.md, .cursorrules)
# 6. Current date & time (frozen at build time)
# 7. Platform-specific formatting hint
prompt_parts = [DEFAULT_AGENT_IDENTITY]
# If an AI peer name is configured in Honcho, personalise the identity line.
_ai_peer_name = (
self._honcho_config.ai_peer
if self._honcho_config and self._honcho_config.ai_peer != "hermes"
else None
)
if _ai_peer_name:
_identity = DEFAULT_AGENT_IDENTITY.replace(
"You are Hermes Agent",
f"You are {_ai_peer_name}",
1,
)
else:
_identity = DEFAULT_AGENT_IDENTITY
prompt_parts = [_identity]
# Tool-aware behavioral guidance: only inject when the tools are loaded
tool_guidance = []
@ -1404,6 +1574,58 @@ class AIAgent:
if tool_guidance:
prompt_parts.append(" ".join(tool_guidance))
# Honcho CLI awareness: tell Hermes about its own management commands
# so it can refer the user to them rather than reinventing answers.
if self._honcho and self._honcho_session_key:
hcfg = self._honcho_config
mode = hcfg.memory_mode if hcfg else "hybrid"
freq = hcfg.write_frequency if hcfg else "async"
recall_mode = hcfg.recall_mode if hcfg else "auto"
honcho_block = (
"# Honcho memory integration\n"
f"Active. Session: {self._honcho_session_key}. "
f"Mode: {mode}. Write frequency: {freq}. Recall: {recall_mode}.\n"
)
if recall_mode == "context":
honcho_block += (
"Honcho context is pre-loaded into this system prompt below. "
"All memory retrieval comes from this context — no memory tools "
"are available. Answer questions about the user, prior sessions, "
"and recent work directly from the Honcho Memory section.\n"
)
elif recall_mode == "tools":
honcho_block += (
"Memory tools (most capable first; use cheaper tools when sufficient):\n"
" query_user_context <question> — dialectic Q&A, LLM-synthesized answer\n"
" honcho_search <query> — semantic search, raw excerpts, no LLM\n"
" honcho_profile — peer card, key facts, no LLM\n"
)
else: # auto
honcho_block += (
"Honcho context (user representation, peer card, and recent session summary) "
"is pre-loaded into this system prompt below. Use it to answer continuity "
"questions ('where were we?', 'what were we working on?') WITHOUT calling "
"any tools. Only call memory tools when you need information beyond what is "
"already present in the Honcho Memory section.\n"
"Memory tools (most capable first; use cheaper tools when sufficient):\n"
" query_user_context <question> — dialectic Q&A, LLM-synthesized answer\n"
" honcho_search <query> — semantic search, raw excerpts, no LLM\n"
" honcho_profile — peer card, key facts, no LLM\n"
)
honcho_block += (
"Management commands (refer users here instead of explaining manually):\n"
" hermes honcho status — show full config + connection\n"
" hermes honcho mode [hybrid|honcho|local] — show or set memory mode\n"
" hermes honcho tokens [--context N] [--dialectic N] — show or set token budgets\n"
" hermes honcho peer [--user NAME] [--ai NAME] [--reasoning LEVEL]\n"
" hermes honcho sessions — list directory→session mappings\n"
" hermes honcho map <name> — map cwd to a session name\n"
" hermes honcho identity [<file>] [--show] — seed or show AI peer identity\n"
" hermes honcho migrate — migration guide from openclaw-honcho\n"
" hermes honcho setup — full interactive wizard"
)
prompt_parts.append(honcho_block)
# Note: ephemeral_system_prompt is NOT included here. It's injected at
# API-call time only so it stays out of the cached/stored system prompt.
if system_message is not None:
@ -2530,6 +2752,10 @@ class AIAgent:
return
if "memory" not in self.valid_tool_names or not self._memory_store:
return
# honcho-only agent mode: skip local MEMORY.md flush
_hcfg = getattr(self, '_honcho_config', None)
if _hcfg and _hcfg.peer_memory_mode(_hcfg.ai_peer) == "honcho":
return
effective_min = min_turns if min_turns is not None else self._memory_flush_min_turns
if self._user_turn_count < effective_min:
return
@ -3153,18 +3379,16 @@ class AIAgent:
)
self._iters_since_skill = 0
# Honcho prefetch: retrieve user context for system prompt injection.
# Only on the FIRST turn of a session (empty history). On subsequent
# turns the model already has all prior context in its conversation
# history, and the Honcho context is baked into the stored system
# prompt — re-fetching it would change the system message and break
# Anthropic prompt caching.
# Honcho: read cached context from last turn's background fetch (non-blocking),
# then fire both fetches for next turn. Skip in "tools" mode (no context injection).
self._honcho_context = ""
if self._honcho and self._honcho_session_key and not conversation_history:
_recall_mode = (self._honcho_config.recall_mode if self._honcho_config else "auto")
if self._honcho and self._honcho_session_key and not conversation_history and _recall_mode != "tools":
try:
self._honcho_context = self._honcho_prefetch(user_message)
except Exception as e:
logger.debug("Honcho prefetch failed (non-fatal): %s", e)
self._honcho_fire_prefetch(user_message)
# Add user message
user_msg = {"role": "user", "content": user_message}
@ -4240,6 +4464,7 @@ class AIAgent:
msg["content"] = f"Calling the {', '.join(tool_names)} tool{'s' if len(tool_names) > 1 else ''}..."
break
final_response = self._strip_think_blocks(fallback).strip()
self._response_was_previewed = True
break
# No fallback available — this is a genuine empty response.
@ -4282,6 +4507,7 @@ class AIAgent:
break
# Strip <think> blocks from fallback content for user display
final_response = self._strip_think_blocks(fallback).strip()
self._response_was_previewed = True
break
# No fallback -- append the empty message as-is
@ -4438,7 +4664,9 @@ class AIAgent:
"completed": completed,
"partial": False, # True only when stopped due to invalid tool calls
"interrupted": interrupted,
"response_previewed": getattr(self, "_response_was_previewed", False),
}
self._response_was_previewed = False
# Include interrupt message if one triggered the interrupt
if interrupted and self._interrupt_message:

View file

@ -0,0 +1,489 @@
"""Tests for the async-memory Honcho improvements.
Covers:
- write_frequency parsing (async / turn / session / int)
- memory_mode parsing
- resolve_session_name with session_title
- HonchoSessionManager.save() routing per write_frequency
- async writer thread lifecycle and retry
- flush_all() drains pending messages
- shutdown() joins the thread
- memory_mode gating helpers (unit-level)
"""
import json
import queue
import threading
import time
from pathlib import Path
from unittest.mock import MagicMock, patch, call
import pytest
from honcho_integration.client import HonchoClientConfig
from honcho_integration.session import (
HonchoSession,
HonchoSessionManager,
_ASYNC_SHUTDOWN,
)
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_session(**kwargs) -> HonchoSession:
    """Build a HonchoSession with test defaults; known fields overridable via kwargs."""
    defaults = {
        "key": "cli:test",
        "user_peer_id": "eri",
        "assistant_peer_id": "hermes",
        "honcho_session_id": "cli-test",
        "messages": [],
    }
    # Only the known fields are honored; unrelated kwargs are ignored,
    # matching dict.get() semantics.
    return HonchoSession(**{name: kwargs.get(name, value) for name, value in defaults.items()})
def _make_manager(write_frequency="turn", memory_mode="hybrid") -> HonchoSessionManager:
    """Create a HonchoSessionManager wired to a mocked Honcho client."""
    config = HonchoClientConfig(
        write_frequency=write_frequency,
        memory_mode=memory_mode,
        api_key="test-key",
        enabled=True,
    )
    manager = HonchoSessionManager(config=config)
    manager._honcho = MagicMock()  # no real network client in tests
    return manager
# ---------------------------------------------------------------------------
# write_frequency parsing from config file
# ---------------------------------------------------------------------------
class TestWriteFrequencyParsing:
    """writeFrequency parsing from config.json: strings, ints, host overrides."""

    @staticmethod
    def _load(tmp_path, payload):
        # Write the payload as config.json and round-trip it through the loader.
        cfg_path = tmp_path / "config.json"
        cfg_path.write_text(json.dumps(payload))
        return HonchoClientConfig.from_global_config(config_path=cfg_path)

    def test_string_async(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "writeFrequency": "async"})
        assert cfg.write_frequency == "async"

    def test_string_turn(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "writeFrequency": "turn"})
        assert cfg.write_frequency == "turn"

    def test_string_session(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "writeFrequency": "session"})
        assert cfg.write_frequency == "session"

    def test_integer_frequency(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "writeFrequency": 5})
        assert cfg.write_frequency == 5

    def test_integer_string_coerced(self, tmp_path):
        # A numeric JSON string ("3") is coerced to the integer 3.
        cfg = self._load(tmp_path, {"apiKey": "k", "writeFrequency": "3"})
        assert cfg.write_frequency == 3

    def test_host_block_overrides_root(self, tmp_path):
        cfg = self._load(tmp_path, {
            "apiKey": "k",
            "writeFrequency": "turn",
            "hosts": {"hermes": {"writeFrequency": "session"}},
        })
        assert cfg.write_frequency == "session"

    def test_defaults_to_async(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k"})
        assert cfg.write_frequency == "async"
# ---------------------------------------------------------------------------
# memory_mode parsing from config file
# ---------------------------------------------------------------------------
class TestMemoryModeParsing:
    """memoryMode parsing: string shorthand, host overrides, per-peer object form."""

    @staticmethod
    def _load(tmp_path, payload):
        # Write the payload as config.json and round-trip it through the loader.
        cfg_path = tmp_path / "config.json"
        cfg_path.write_text(json.dumps(payload))
        return HonchoClientConfig.from_global_config(config_path=cfg_path)

    def test_hybrid(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "memoryMode": "hybrid"})
        assert cfg.memory_mode == "hybrid"

    def test_honcho_only(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "memoryMode": "honcho"})
        assert cfg.memory_mode == "honcho"

    def test_local_only(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k", "memoryMode": "local"})
        assert cfg.memory_mode == "local"

    def test_defaults_to_hybrid(self, tmp_path):
        cfg = self._load(tmp_path, {"apiKey": "k"})
        assert cfg.memory_mode == "hybrid"

    def test_host_block_overrides_root(self, tmp_path):
        cfg = self._load(tmp_path, {
            "apiKey": "k",
            "memoryMode": "hybrid",
            "hosts": {"hermes": {"memoryMode": "honcho"}},
        })
        assert cfg.memory_mode == "honcho"

    def test_object_form_sets_default_and_overrides(self, tmp_path):
        cfg = self._load(tmp_path, {
            "apiKey": "k",
            "hosts": {"hermes": {"memoryMode": {
                "default": "hybrid",
                "hermes": "honcho",
                "sentinel": "local",
            }}},
        })
        assert cfg.memory_mode == "hybrid"
        assert cfg.peer_memory_mode("hermes") == "honcho"
        assert cfg.peer_memory_mode("sentinel") == "local"
        assert cfg.peer_memory_mode("unknown") == "hybrid"  # falls through to default

    def test_object_form_no_default_falls_back_to_hybrid(self, tmp_path):
        cfg = self._load(tmp_path, {
            "apiKey": "k",
            "hosts": {"hermes": {"memoryMode": {"hermes": "honcho"}}},
        })
        assert cfg.memory_mode == "hybrid"
        assert cfg.peer_memory_mode("hermes") == "honcho"
        assert cfg.peer_memory_mode("other") == "hybrid"

    def test_global_string_host_object_override(self, tmp_path):
        """Host object form overrides global string."""
        cfg = self._load(tmp_path, {
            "apiKey": "k",
            "memoryMode": "local",
            "hosts": {"hermes": {"memoryMode": {"default": "hybrid", "hermes": "honcho"}}},
        })
        assert cfg.memory_mode == "hybrid"  # host default wins over global "local"
        assert cfg.peer_memory_mode("hermes") == "honcho"
# ---------------------------------------------------------------------------
# resolve_session_name with session_title
# ---------------------------------------------------------------------------
class TestResolveSessionNameTitle:
    """resolve_session_name precedence: manual map > title > session id > dirname."""

    def test_manual_override_beats_title(self):
        cfg = HonchoClientConfig(sessions={"/my/project": "manual-name"})
        assert cfg.resolve_session_name("/my/project", session_title="the-title") == "manual-name"

    def test_title_beats_dirname(self):
        assert HonchoClientConfig().resolve_session_name(
            "/some/dir", session_title="my-project"
        ) == "my-project"

    def test_title_with_peer_prefix(self):
        cfg = HonchoClientConfig(peer_name="eri", session_peer_prefix=True)
        assert cfg.resolve_session_name("/some/dir", session_title="aeris") == "eri-aeris"

    def test_title_sanitized(self):
        # Invalid characters collapse to dashes; trailing dashes stripped by .strip('-').
        assert HonchoClientConfig().resolve_session_name(
            "/some/dir", session_title="my project/name!"
        ) == "my-project-name"

    def test_title_all_invalid_chars_falls_back_to_dirname(self):
        # Title sanitizes to empty -> fall back to the directory name.
        assert HonchoClientConfig().resolve_session_name(
            "/some/dir", session_title="!!! ###"
        ) == "dir"

    def test_none_title_falls_back_to_dirname(self):
        assert HonchoClientConfig().resolve_session_name("/some/dir", session_title=None) == "dir"

    def test_empty_title_falls_back_to_dirname(self):
        assert HonchoClientConfig().resolve_session_name("/some/dir", session_title="") == "dir"

    def test_per_session_uses_session_id(self):
        cfg = HonchoClientConfig(session_strategy="per-session")
        assert cfg.resolve_session_name(
            "/some/dir", session_id="20260309_175514_9797dd"
        ) == "20260309_175514_9797dd"

    def test_per_session_with_peer_prefix(self):
        cfg = HonchoClientConfig(
            session_strategy="per-session", peer_name="eri", session_peer_prefix=True
        )
        assert cfg.resolve_session_name(
            "/some/dir", session_id="20260309_175514_9797dd"
        ) == "eri-20260309_175514_9797dd"

    def test_per_session_no_id_falls_back_to_dirname(self):
        cfg = HonchoClientConfig(session_strategy="per-session")
        assert cfg.resolve_session_name("/some/dir", session_id=None) == "dir"

    def test_title_beats_session_id(self):
        cfg = HonchoClientConfig(session_strategy="per-session")
        assert cfg.resolve_session_name(
            "/some/dir", session_title="my-title", session_id="20260309_175514_9797dd"
        ) == "my-title"

    def test_manual_beats_session_id(self):
        cfg = HonchoClientConfig(
            session_strategy="per-session", sessions={"/some/dir": "pinned"}
        )
        assert cfg.resolve_session_name(
            "/some/dir", session_id="20260309_175514_9797dd"
        ) == "pinned"

    def test_global_strategy_returns_workspace(self):
        cfg = HonchoClientConfig(session_strategy="global", workspace_id="my-workspace")
        assert cfg.resolve_session_name("/some/dir") == "my-workspace"
# ---------------------------------------------------------------------------
# save() routing per write_frequency
# ---------------------------------------------------------------------------
class TestSaveRouting:
    """save() routing per write_frequency: turn, session, async, every-N-turns."""

    def _make_session_with_message(self, mgr=None):
        # One user/assistant exchange; optionally registered in the manager cache.
        sess = _make_session()
        sess.add_message("user", "hello")
        sess.add_message("assistant", "hi")
        if mgr is not None:
            mgr._cache[sess.key] = sess
        return sess

    def test_turn_flushes_immediately(self):
        mgr = _make_manager(write_frequency="turn")
        sess = self._make_session_with_message(mgr)
        with patch.object(mgr, "_flush_session") as flush:
            mgr.save(sess)
        flush.assert_called_once_with(sess)

    def test_session_mode_does_not_flush(self):
        mgr = _make_manager(write_frequency="session")
        sess = self._make_session_with_message(mgr)
        with patch.object(mgr, "_flush_session") as flush:
            mgr.save(sess)
        flush.assert_not_called()

    def test_async_mode_enqueues(self):
        mgr = _make_manager(write_frequency="async")
        sess = self._make_session_with_message(mgr)
        with patch.object(mgr, "_flush_session") as flush:
            mgr.save(sess)
        # No synchronous flush; the session goes onto the writer queue instead.
        flush.assert_not_called()
        assert not mgr._async_queue.empty()

    def test_int_frequency_flushes_on_nth_turn(self):
        mgr = _make_manager(write_frequency=3)
        sess = self._make_session_with_message(mgr)
        with patch.object(mgr, "_flush_session") as flush:
            for _ in range(2):  # turns 1 and 2
                mgr.save(sess)
            assert flush.call_count == 0
            mgr.save(sess)  # turn 3 triggers the flush
            assert flush.call_count == 1

    def test_int_frequency_skips_other_turns(self):
        mgr = _make_manager(write_frequency=5)
        sess = self._make_session_with_message(mgr)
        with patch.object(mgr, "_flush_session") as flush:
            for _ in range(4):
                mgr.save(sess)
            assert flush.call_count == 0
            mgr.save(sess)  # turn 5
            assert flush.call_count == 1
# ---------------------------------------------------------------------------
# flush_all()
# ---------------------------------------------------------------------------
class TestFlushAll:
    """flush_all() flushes every cached session, drains the async queue,
    and never propagates flush errors."""

    def test_flushes_all_cached_sessions(self):
        mgr = _make_manager(write_frequency="session")
        s1 = _make_session(key="s1", honcho_session_id="s1")
        s2 = _make_session(key="s2", honcho_session_id="s2")
        s1.add_message("user", "a")
        s2.add_message("user", "b")
        mgr._cache = {"s1": s1, "s2": s2}
        with patch.object(mgr, "_flush_session") as mock_flush:
            mgr.flush_all()
        assert mock_flush.call_count == 2

    def test_flush_all_drains_async_queue(self):
        mgr = _make_manager(write_frequency="async")
        sess = _make_session()
        sess.add_message("user", "pending")
        mgr._async_queue.put(sess)
        with patch.object(mgr, "_flush_session") as mock_flush:
            mgr.flush_all()
        # Called at least once for the queued item
        assert mock_flush.call_count >= 1

    def test_flush_all_tolerates_errors(self):
        mgr = _make_manager(write_frequency="session")
        sess = _make_session()
        # Give the session a pending message so the flush path is actually
        # exercised -- an implementation that skips empty sessions would
        # otherwise never hit the raising mock, making this test vacuous.
        sess.add_message("user", "boom")
        mgr._cache = {"key": sess}
        with patch.object(mgr, "_flush_session", side_effect=RuntimeError("oops")):
            # Should not raise
            mgr.flush_all()
# ---------------------------------------------------------------------------
# async writer thread lifecycle
# ---------------------------------------------------------------------------
class TestAsyncWriterThread:
    """Lifecycle of the background writer thread created in async mode."""

    def test_thread_started_on_async_mode(self):
        manager = _make_manager(write_frequency="async")
        assert manager._async_thread is not None
        assert manager._async_thread.is_alive()
        manager.shutdown()

    def test_no_thread_for_turn_mode(self):
        manager = _make_manager(write_frequency="turn")
        assert manager._async_thread is None
        assert manager._async_queue is None

    def test_shutdown_joins_thread(self):
        manager = _make_manager(write_frequency="async")
        assert manager._async_thread.is_alive()
        manager.shutdown()
        assert not manager._async_thread.is_alive()

    def test_async_writer_calls_flush(self):
        manager = _make_manager(write_frequency="async")
        sess = _make_session()
        sess.add_message("user", "async msg")
        flushed = []
        seen = threading.Event()

        def capture(s):
            flushed.append(s)
            seen.set()

        manager._flush_session = capture
        manager._async_queue.put(sess)
        # Give the daemon thread up to 2s to pick the item up.
        seen.wait(timeout=2.0)
        manager.shutdown()
        assert len(flushed) == 1
        assert flushed[0] is sess

    def test_shutdown_sentinel_stops_loop(self):
        manager = _make_manager(write_frequency="async")
        worker = manager._async_thread
        manager.shutdown()
        worker.join(timeout=3)
        assert not worker.is_alive()
# ---------------------------------------------------------------------------
# async retry on failure
# ---------------------------------------------------------------------------
class TestAsyncWriterRetry:
    """The async writer retries a failed flush once, then drops the message.

    Note: ``patch("time.sleep")`` replaces time.sleep globally -- including
    the sleep the test's own poll loop uses -- so each test captures a
    reference to the real sleep *before* patching. Otherwise the poll loop
    busy-spins on a MagicMock until the deadline instead of yielding the CPU.
    """

    def test_retries_once_on_failure(self):
        mgr = _make_manager(write_frequency="async")
        sess = _make_session()
        sess.add_message("user", "msg")
        call_count = [0]

        def flaky_flush(s):
            call_count[0] += 1
            if call_count[0] == 1:
                raise ConnectionError("network blip")
            # second call succeeds silently

        mgr._flush_session = flaky_flush
        real_sleep = time.sleep  # keep a usable sleep while time.sleep is patched
        with patch("time.sleep"):  # skip the 2s sleep in retry
            mgr._async_queue.put(sess)
            deadline = time.time() + 3.0
            while call_count[0] < 2 and time.time() < deadline:
                real_sleep(0.05)
            mgr.shutdown()
        assert call_count[0] == 2

    def test_drops_after_two_failures(self):
        mgr = _make_manager(write_frequency="async")
        sess = _make_session()
        sess.add_message("user", "msg")
        call_count = [0]

        def always_fail(s):
            call_count[0] += 1
            raise RuntimeError("always broken")

        mgr._flush_session = always_fail
        real_sleep = time.sleep  # keep a usable sleep while time.sleep is patched
        with patch("time.sleep"):
            mgr._async_queue.put(sess)
            deadline = time.time() + 3.0
            while call_count[0] < 2 and time.time() < deadline:
                real_sleep(0.05)
            mgr.shutdown()
        # Should have tried exactly twice (initial + one retry) and not crashed
        assert call_count[0] == 2
        assert not mgr._async_thread.is_alive()
# ---------------------------------------------------------------------------
# HonchoClientConfig dataclass defaults for new fields
# ---------------------------------------------------------------------------
class TestNewConfigFieldDefaults:
    """Dataclass defaults and overrides for write_frequency / memory_mode."""

    def test_write_frequency_default(self):
        assert HonchoClientConfig().write_frequency == "async"

    def test_memory_mode_default(self):
        assert HonchoClientConfig().memory_mode == "hybrid"

    def test_write_frequency_set(self):
        assert HonchoClientConfig(write_frequency="turn").write_frequency == "turn"

    def test_memory_mode_set(self):
        assert HonchoClientConfig(memory_mode="honcho").memory_mode == "honcho"

    def test_peer_memory_mode_falls_back_to_global(self):
        # No per-peer override -> any peer resolves to the global mode.
        assert HonchoClientConfig(memory_mode="honcho").peer_memory_mode("any-peer") == "honcho"

    def test_peer_memory_mode_override(self):
        cfg = HonchoClientConfig(
            memory_mode="hybrid", peer_memory_modes={"hermes": "local"}
        )
        assert cfg.peer_memory_mode("hermes") == "local"
        assert cfg.peer_memory_mode("other") == "hybrid"

View file

@ -25,7 +25,8 @@ class TestHonchoClientConfigDefaults:
assert config.environment == "production"
assert config.enabled is False
assert config.save_messages is True
assert config.session_strategy == "per-directory"
assert config.session_strategy == "per-session"
assert config.recall_mode == "auto"
assert config.session_peer_prefix is False
assert config.linked_hosts == []
assert config.sessions == {}
@ -134,6 +135,41 @@ class TestFromGlobalConfig:
assert config.workspace_id == "root-ws"
assert config.ai_peer == "root-ai"
def test_session_strategy_default_from_global_config(self, tmp_path):
    """from_global_config with no sessionStrategy should match dataclass default."""
    path = tmp_path / "config.json"
    path.write_text(json.dumps({"apiKey": "key"}))
    loaded = HonchoClientConfig.from_global_config(config_path=path)
    assert loaded.session_strategy == "per-session"
def test_context_tokens_host_block_wins(self, tmp_path):
    """Host block contextTokens should override root."""
    payload = {
        "apiKey": "key",
        "contextTokens": 1000,
        "hosts": {"hermes": {"contextTokens": 2000}},
    }
    path = tmp_path / "config.json"
    path.write_text(json.dumps(payload))
    loaded = HonchoClientConfig.from_global_config(config_path=path)
    assert loaded.context_tokens == 2000
def test_recall_mode_from_config(self, tmp_path):
    """recallMode is read from config, host block wins."""
    payload = {
        "apiKey": "key",
        "recallMode": "tools",
        "hosts": {"hermes": {"recallMode": "context"}},
    }
    path = tmp_path / "config.json"
    path.write_text(json.dumps(payload))
    loaded = HonchoClientConfig.from_global_config(config_path=path)
    assert loaded.recall_mode == "context"
def test_recall_mode_default(self, tmp_path):
    # No recallMode key in the file -> the "auto" default applies.
    path = tmp_path / "config.json"
    path.write_text(json.dumps({"apiKey": "key"}))
    loaded = HonchoClientConfig.from_global_config(config_path=path)
    assert loaded.recall_mode == "auto"
def test_corrupt_config_falls_back_to_env(self, tmp_path):
config_file = tmp_path / "config.json"
config_file.write_text("not valid json{{{")
@ -177,6 +213,40 @@ class TestResolveSessionName:
# Should use os.getcwd() basename
assert result == Path.cwd().name
def test_per_repo_uses_git_root(self):
    # per-repo strategy names the session after the git repo, not the cwd.
    cfg = HonchoClientConfig(session_strategy="per-repo")
    with patch.object(HonchoClientConfig, "_git_repo_name", return_value="hermes-agent"):
        assert cfg.resolve_session_name("/home/user/hermes-agent/subdir") == "hermes-agent"
def test_per_repo_with_peer_prefix(self):
    """session_peer_prefix prepends the peer name to the repo-derived session."""
    cfg = HonchoClientConfig(
        session_strategy="per-repo", peer_name="eri", session_peer_prefix=True
    )
    with patch.object(
        HonchoClientConfig, "_git_repo_name", return_value="groudon"
    ):
        name = cfg.resolve_session_name("/home/user/groudon/src")
    assert name == "eri-groudon"
def test_per_repo_falls_back_to_dirname_outside_git(self):
    """Outside a git repo, per-repo degrades to the directory basename."""
    cfg = HonchoClientConfig(session_strategy="per-repo")
    with patch.object(HonchoClientConfig, "_git_repo_name", return_value=None):
        name = cfg.resolve_session_name("/home/user/not-a-repo")
    assert name == "not-a-repo"
def test_per_repo_manual_override_still_wins(self):
    """An explicit sessions-map entry beats the per-repo strategy."""
    cfg = HonchoClientConfig(
        session_strategy="per-repo",
        sessions={"/home/user/proj": "custom-session"},
    )
    resolved = cfg.resolve_session_name("/home/user/proj")
    assert resolved == "custom-session"
class TestGetLinkedWorkspaces:
def test_resolves_linked_hosts(self):

View file

@ -1640,6 +1640,25 @@ def _cleanup_old_recordings(max_age_hours=72):
logger.debug("Recording cleanup error (non-critical): %s", e)
def _cleanup_old_recordings(max_age_hours=72):
"""Remove browser recordings older than max_age_hours to prevent disk bloat."""
import time
try:
hermes_home = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
recordings_dir = hermes_home / "browser_recordings"
if not recordings_dir.exists():
return
cutoff = time.time() - (max_age_hours * 3600)
for f in recordings_dir.glob("session_*.webm"):
try:
if f.stat().st_mtime < cutoff:
f.unlink()
except Exception:
pass
except Exception:
pass
# ============================================================================
# Cleanup and Management Functions
# ============================================================================

View file

@ -1,8 +1,16 @@
"""Honcho tool for querying user context via dialectic reasoning.
"""Honcho tools for user context retrieval.
Registers ``query_user_context`` -- an LLM-callable tool that asks Honcho
about the current user's history, preferences, goals, and communication
style. The session key is injected at runtime by the agent loop via
Registers three complementary tools, ordered by capability:
query_user_context dialectic Q&A (LLM-powered, direct answers)
honcho_search semantic search (fast, no LLM, raw excerpts)
honcho_profile peer card (fast, no LLM, structured facts)
Use query_user_context when you need Honcho to synthesize an answer.
Use honcho_search or honcho_profile when you want raw data to reason
over yourself.
The session key is injected at runtime by the agent loop via
``set_session_context()``.
"""
@ -34,54 +42,6 @@ def clear_session_context() -> None:
_session_key = None
# ── Tool schema ──
# JSON tool schema for `query_user_context`: a single required free-text
# `query` string that `_handle_query_user_context` forwards to Honcho.
HONCHO_TOOL_SCHEMA = {
    "name": "query_user_context",
    "description": (
        "Query Honcho to retrieve relevant context about the user based on their "
        "history and preferences. Use this when you need to understand the user's "
        "background, preferences, past interactions, or goals. This helps you "
        "personalize your responses and provide more relevant assistance."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": (
                    "A natural language question about the user. Examples: "
                    "'What are this user's main goals?', "
                    "'What communication style does this user prefer?', "
                    "'What topics has this user discussed recently?', "
                    "'What is this user's technical expertise level?'"
                ),
            }
        },
        "required": ["query"],
    },
}
# ── Tool handler ──
def _handle_query_user_context(args: dict, **kw) -> str:
"""Execute the Honcho context query."""
query = args.get("query", "")
if not query:
return json.dumps({"error": "Missing required parameter: query"})
if not _session_manager or not _session_key:
return json.dumps({"error": "Honcho is not active for this session."})
try:
result = _session_manager.get_user_context(_session_key, query)
return json.dumps({"result": result})
except Exception as e:
logger.error("Error querying Honcho user context: %s", e)
return json.dumps({"error": f"Failed to query user context: {e}"})
# ── Availability check ──
def _check_honcho_available() -> bool:
@ -89,14 +49,145 @@ def _check_honcho_available() -> bool:
return _session_manager is not None and _session_key is not None
# ── honcho_profile ──
# JSON tool schema for `honcho_profile`: takes no parameters — the handler
# returns the user's peer card (curated facts) without LLM reasoning.
_PROFILE_SCHEMA = {
    "name": "honcho_profile",
    "description": (
        "Retrieve the user's peer card from Honcho — a curated list of key facts "
        "about them (name, role, preferences, communication style, patterns). "
        "Fast, no LLM reasoning, minimal cost. "
        "Use this at conversation start or when you need a quick factual snapshot. "
        "Use query_user_context instead when you need Honcho to synthesize an answer."
    ),
    "parameters": {
        "type": "object",
        "properties": {},
        "required": [],
    },
}
def _handle_honcho_profile(args: dict, **kw) -> str:
    """Fetch the user's Honcho peer card and wrap it in a JSON envelope.

    Returns a JSON string with a "result" key (the card, or a friendly
    placeholder when no facts exist yet) or an "error" key on failure.
    """
    if not (_session_manager and _session_key):
        return json.dumps({"error": "Honcho is not active for this session."})
    try:
        facts = _session_manager.get_peer_card(_session_key)
    except Exception as e:
        logger.error("Error fetching Honcho peer card: %s", e)
        return json.dumps({"error": f"Failed to fetch profile: {e}"})
    if facts:
        return json.dumps({"result": facts})
    return json.dumps({"result": "No profile facts available yet. The user's profile builds over time through conversations."})
# ── honcho_search ──
# JSON tool schema for `honcho_search`: a required `query` string plus an
# optional integer `max_tokens` budget (the handler defaults it to 800 and
# caps it at 2000).
_SEARCH_SCHEMA = {
    "name": "honcho_search",
    "description": (
        "Semantic search over Honcho's stored context about the user. "
        "Returns raw excerpts ranked by relevance to your query — no LLM synthesis. "
        "Cheaper and faster than query_user_context. "
        "Good when you want to find specific past facts and reason over them yourself. "
        "Use query_user_context when you need a direct synthesized answer."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "What to search for in Honcho's memory (e.g. 'programming languages', 'past projects', 'timezone').",
            },
            "max_tokens": {
                "type": "integer",
                "description": "Token budget for returned context (default 800, max 2000).",
            },
        },
        "required": ["query"],
    },
}
def _handle_honcho_search(args: dict, **kw) -> str:
query = args.get("query", "")
if not query:
return json.dumps({"error": "Missing required parameter: query"})
if not _session_manager or not _session_key:
return json.dumps({"error": "Honcho is not active for this session."})
max_tokens = min(int(args.get("max_tokens", 800)), 2000)
try:
result = _session_manager.search_context(_session_key, query, max_tokens=max_tokens)
if not result:
return json.dumps({"result": "No relevant context found."})
return json.dumps({"result": result})
except Exception as e:
logger.error("Error searching Honcho context: %s", e)
return json.dumps({"error": f"Failed to search context: {e}"})
# ── query_user_context (dialectic — LLM-powered) ──
# JSON tool schema for `query_user_context` (dialectic): a single required
# free-text `query` string; the handler forwards it to Honcho's LLM-backed
# dialectic endpoint for a synthesized answer.
_QUERY_SCHEMA = {
    "name": "query_user_context",
    "description": (
        "Ask Honcho a natural language question about the user and get a synthesized answer. "
        "Uses Honcho's LLM (dialectic reasoning) — higher cost than honcho_profile or honcho_search. "
        "Use this when you need a direct answer synthesized from the user's full history. "
        "Examples: 'What are this user's main goals?', 'How does this user prefer to communicate?', "
        "'What is this user's technical expertise level?'"
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "A natural language question about the user.",
            }
        },
        "required": ["query"],
    },
}
def _handle_query_user_context(args: dict, **kw) -> str:
query = args.get("query", "")
if not query:
return json.dumps({"error": "Missing required parameter: query"})
if not _session_manager or not _session_key:
return json.dumps({"error": "Honcho is not active for this session."})
try:
result = _session_manager.dialectic_query(_session_key, query)
return json.dumps({"result": result or "No result from Honcho."})
except Exception as e:
logger.error("Error querying Honcho user context: %s", e)
return json.dumps({"error": f"Failed to query user context: {e}"})
# ── Registration ──
# Register all three honcho tools with the shared registry. Each shares the
# same availability gate so they appear/disappear together.

from tools.registry import registry

registry.register(
    name="honcho_profile",
    toolset="honcho",
    schema=_PROFILE_SCHEMA,
    handler=_handle_honcho_profile,
    check_fn=_check_honcho_available,
)

registry.register(
    name="honcho_search",
    toolset="honcho",
    schema=_SEARCH_SCHEMA,
    handler=_handle_honcho_search,
    check_fn=_check_honcho_available,
)

# Note: the legacy HONCHO_TOOL_SCHEMA constant was replaced by _QUERY_SCHEMA;
# the register call must pass exactly one schema= keyword (a duplicated
# schema= kwarg is a SyntaxError, and the old name no longer exists).
registry.register(
    name="query_user_context",
    toolset="honcho",
    schema=_QUERY_SCHEMA,
    handler=_handle_query_user_context,
    check_fn=_check_honcho_available,
)

View file

@ -673,6 +673,7 @@ checkpoints:
max_snapshots: 50 # Max checkpoints to keep per directory
```
## Delegation
Configure subagent behavior for the delegate tool:

View file

@ -91,6 +91,7 @@ You can always find or regenerate app-level tokens under **Settings → Basic In
This step is critical — it controls what messages the bot can see.
1. In the sidebar, go to **Features → Event Subscriptions**
2. Toggle **Enable Events** to ON
3. Expand **Subscribe to bot events** and add:
@ -110,6 +111,7 @@ If the bot works in DMs but **not in channels**, you almost certainly forgot to
Without these events, Slack simply never delivers channel messages to the bot.
:::
---
## Step 5: Install App to Workspace
@ -200,6 +202,7 @@ This is intentional — it prevents the bot from responding to every message in
---
## Home Channel
Set `SLACK_HOME_CHANNEL` to a channel ID where Hermes will deliver scheduled messages,