From 098efde848a1253033fedf04e8184ef843115e11 Mon Sep 17 00:00:00 2001 From: Erosika Date: Sat, 18 Apr 2026 12:45:04 -0400 Subject: [PATCH] docs(honcho): wizard cadence default 2, prewarm/depth + observation + multi-peer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1 so unset → every turn) - honcho.md: fix stale dialecticCadence default in tables, add Session-Start Prewarm subsection (depth runs at init), add Query-Adaptive Reasoning Level subsection, expand Observation section with directional vs unified semantics and per-peer patterns - memory-providers.md: fix stale default, rename Multi-agent/Profiles to Multi-peer setup, add concrete walkthrough for new profiles and sync, document observation toggles + presets, link to honcho.md - SKILL.md: fix stale defaults, add Depth at session start callout --- .../autonomous-ai-agents/honcho/SKILL.md | 8 ++- plugins/memory/honcho/cli.py | 6 +- website/docs/user-guide/features/honcho.md | 47 ++++++++++++++- .../user-guide/features/memory-providers.md | 59 ++++++++++++++++--- 4 files changed, 103 insertions(+), 17 deletions(-) diff --git a/optional-skills/autonomous-ai-agents/honcho/SKILL.md b/optional-skills/autonomous-ai-agents/honcho/SKILL.md index c60d2c635..e79875aa0 100644 --- a/optional-skills/autonomous-ai-agents/honcho/SKILL.md +++ b/optional-skills/autonomous-ai-agents/honcho/SKILL.md @@ -145,10 +145,10 @@ Controls **how often** dialectic and context calls happen. | Key | Default | Description | |-----|---------|-------------| | `contextCadence` | `1` | Min turns between context API calls | -| `dialecticCadence` | `3` | Min turns between dialectic API calls | +| `dialecticCadence` | `1` (wizard: `2`) | Min turns between dialectic API calls. 
Unset → every turn; wizard pre-fills `2` |
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` for base context injection |

-Higher cadence values reduce API calls and cost. `dialecticCadence: 3` (default) means the dialectic engine fires at most every 3rd turn.
+Higher cadence values make the dialectic engine fire less often. `dialecticCadence: 2` means the engine fires every other turn. Setting it to `1` fires every turn.

### Depth (how many)

@@ -180,6 +180,8 @@ If `dialecticDepthLevels` is omitted, rounds use **proportional levels** derived

This keeps earlier passes cheap while using full depth on the final synthesis.

+**Depth at session start.** The session-start prewarm runs the full configured `dialecticDepth` in the background before turn 1. A single-pass prewarm on a cold peer often returns thin output — multi-pass depth runs the audit/reconcile cycle before the user ever speaks. Turn 1 consumes the prewarm result directly; if prewarm hasn't landed in time, turn 1 falls back to a synchronous call with a bounded timeout.
+
### Level (how hard)

Controls the **intensity** of each dialectic reasoning round.

@@ -368,7 +370,7 @@ Config file: `$HERMES_HOME/honcho.json` (profile-local) or `~/.honcho/config.jso
| `contextTokens` | uncapped | Max tokens for the combined base context injection (summary + representation + card). Opt-in cap — omit to leave uncapped, set to an integer to bound injection size. |
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` |
| `contextCadence` | `1` | Min turns between context API calls |
-| `dialecticCadence` | `3` | Min turns between dialectic LLM calls |
+| `dialecticCadence` | `1` (wizard: `2`) | Min turns between dialectic LLM calls |

The `contextTokens` budget is enforced at injection time. If the session summary + representation + card exceed the budget, Honcho trims the summary first, then the representation, preserving the card. This prevents context blowup in long sessions. 
diff --git a/plugins/memory/honcho/cli.py b/plugins/memory/honcho/cli.py index 5cd25bfba..c73dd66f3 100644 --- a/plugins/memory/honcho/cli.py +++ b/plugins/memory/honcho/cli.py @@ -460,17 +460,17 @@ def cmd_setup(args) -> None: pass # keep current # --- 7b. Dialectic cadence --- - current_dialectic = str(hermes_host.get("dialecticCadence") or cfg.get("dialecticCadence") or "1") + current_dialectic = str(hermes_host.get("dialecticCadence") or cfg.get("dialecticCadence") or "2") print("\n Dialectic cadence:") print(" How often Honcho rebuilds its user model (LLM call on Honcho backend).") - print(" 1 = every turn (default), 3+ = sparse.") + print(" 1 = every turn, 2 = every other turn (wizard default), 3+ = sparse.") new_dialectic = _prompt("Dialectic cadence", default=current_dialectic) try: val = int(new_dialectic) if val >= 1: hermes_host["dialecticCadence"] = val except (ValueError, TypeError): - hermes_host["dialecticCadence"] = 1 + hermes_host["dialecticCadence"] = 2 # --- 8. Session strategy --- current_strat = hermes_host.get("sessionStrategy") or cfg.get("sessionStrategy", "per-session") diff --git a/website/docs/user-guide/features/honcho.md b/website/docs/user-guide/features/honcho.md index 2040949d2..bf4b5c6bc 100644 --- a/website/docs/user-guide/features/honcho.md +++ b/website/docs/user-guide/features/honcho.md @@ -77,7 +77,7 @@ Cost and depth are controlled by three independent knobs: | Knob | Controls | Default | |------|----------|---------| | `contextCadence` | Turns between `context()` API calls (base layer refresh) | `1` | -| `dialecticCadence` | Turns between `peer.chat()` LLM calls (dialectic layer refresh) | `3` | +| `dialecticCadence` | Turns between `peer.chat()` LLM calls (dialectic layer refresh) | `1` (code default) / `2` (setup wizard default) | | `dialecticDepth` | Number of `.chat()` passes per dialectic invocation (1–3) | `1` | These are orthogonal — you can have frequent context refreshes with infrequent dialectic, or deep multi-pass 
dialectic at low frequency. Example: `contextCadence: 1, dialecticCadence: 5, dialecticDepth: 2` refreshes base context every turn, runs dialectic every 5 turns, and each dialectic run makes 2 passes. @@ -94,6 +94,14 @@ Each pass uses a proportional reasoning level (lighter early passes, base level Passes bail out early if the prior pass returned strong signal (long, structured output), so depth 3 doesn't always mean 3 LLM calls. +### Session-Start Prewarm + +On session init, Honcho fires a dialectic call in the background at the full configured `dialecticDepth` and hands the result directly to turn 1's context assembly. A single-pass prewarm on a cold peer often returns thin output — multi-pass depth runs the audit/reconcile cycle before the user ever speaks. If prewarm hasn't landed by turn 1, turn 1 falls back to a synchronous call with a bounded timeout. + +### Query-Adaptive Reasoning Level + +The auto-injected dialectic scales `dialecticReasoningLevel` by query length: +1 level at ≥120 chars, +2 at ≥400, clamped at `reasoningLevelCap` (default `"high"`). Disable with `reasoningHeuristic: false` to pin every auto call to `dialecticReasoningLevel`. `"max"` is reserved for explicit tool-path selection via `honcho_reasoning`. + ## Configuration Options Honcho is configured in `~/.honcho/config.json` (global) or `$HERMES_HOME/honcho.json` (profile-local). The setup wizard handles this for you. @@ -104,7 +112,7 @@ Honcho is configured in `~/.honcho/config.json` (global) or `$HERMES_HOME/honcho |-----|---------|-------------| | `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Set to an integer (e.g. 1200) to cap. Truncates at word boundaries | | `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) | -| `dialecticCadence` | `3` | Minimum turns between `peer.chat()` LLM calls (dialectic layer). 
In `tools` mode, irrelevant — model calls explicitly | +| `dialecticCadence` | `1` (wizard sets `2`) | Minimum turns between `peer.chat()` LLM calls (dialectic layer). Code default fires every turn when the key is unset; the setup wizard pre-fills `2`. In `tools` mode, irrelevant — model calls explicitly | | `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped to 1–3 | | `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. Overrides proportional defaults | | `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` | @@ -142,6 +150,41 @@ Honcho is configured in `~/.honcho/config.json` (global) or `$HERMES_HOME/honcho In `tools` mode, the model is fully in control — it calls `honcho_reasoning` when it wants, at whatever `reasoning_level` it picks. Cadence and budget settings only apply to modes with auto-injection (`hybrid` and `context`). +## Observation (Directional vs. Unified) + +Honcho models a conversation as peers exchanging messages. Each peer has two observation toggles that map 1:1 to Honcho's `SessionPeerConfig`: + +| Toggle | Effect | +|--------|--------| +| `observeMe` | Honcho builds a representation of this peer from its own messages | +| `observeOthers` | This peer observes the other peer's messages (feeds cross-peer reasoning) | + +Two peers × two toggles = four flags. `observationMode` is a shorthand preset: + +| Preset | User flags | AI flags | Semantics | +|--------|-----------|----------|-----------| +| `"directional"` (default) | me: on, others: on | me: on, others: on | Full mutual observation. Enables cross-peer dialectic — "what does the AI know about the user, based on what the user said and the AI replied." | +| `"unified"` | me: on, others: off | me: off, others: on | Shared-pool semantics — the AI observes the user's messages only, the user peer only self-models. Single-observer pool. 
| + +Override the preset with an explicit `observation` block for per-peer control: + +```json +"observation": { + "user": { "observeMe": true, "observeOthers": true }, + "ai": { "observeMe": true, "observeOthers": false } +} +``` + +Common patterns: + +| Intent | Config | +|--------|--------| +| Full observation (most users) | `"observationMode": "directional"` | +| AI shouldn't re-model the user from its own replies | `"ai": {"observeMe": true, "observeOthers": false}` | +| Strong persona the AI peer shouldn't update from self-observation | `"ai": {"observeMe": false, "observeOthers": true}` | + +Server-side toggles set via the Honcho dashboard win over local defaults — Hermes syncs them back at session init. + ## Tools When Honcho is active as the memory provider, five tools become available: diff --git a/website/docs/user-guide/features/memory-providers.md b/website/docs/user-guide/features/memory-providers.md index f571c7d48..b2469a13e 100644 --- a/website/docs/user-guide/features/memory-providers.md +++ b/website/docs/user-guide/features/memory-providers.md @@ -82,7 +82,7 @@ hermes memory setup # select "honcho" | `workspace` | host key | Shared workspace ID | | `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Truncates at word boundaries | | `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) | -| `dialecticCadence` | `3` | Minimum turns between `peer.chat()` LLM calls. Only applies to `hybrid`/`context` modes | +| `dialecticCadence` | `1` (wizard sets `2`) | Minimum turns between `peer.chat()` LLM calls. Unset → every turn; wizard pre-fills `2`. Only applies to `hybrid`/`context` modes | | `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped 1–3. Pass 0: cold/warm prompt, pass 1: self-audit, pass 2: reconciliation | | `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. 
Overrides proportional defaults | | `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` | @@ -140,23 +140,64 @@ hermes memory setup # select "honcho" If you previously used `hermes honcho setup`, your config and all server-side data are intact. Just re-enable through the setup wizard again or manually set `memory.provider: honcho` to reactivate via the new system. ::: -**Multi-agent / Profiles:** +**Multi-peer setup:** -Each Hermes profile gets its own Honcho AI peer while sharing the same workspace -- all profiles see the same user representation, but each agent builds its own identity and observations. +Honcho models conversations as peers exchanging messages — one user peer plus one AI peer per Hermes profile, all sharing a workspace. The workspace is the shared environment: the user peer is global across profiles, each AI peer is its own identity. Every AI peer builds an independent representation / card from its own observations, so a `coder` profile stays code-oriented while a `writer` profile stays editorial against the same user. + +The mapping: + +| Concept | What it is | +|---------|-----------| +| **Workspace** | Shared environment. All Hermes profiles under one workspace see the same user identity. | +| **User peer** (`peerName`) | The human. Shared across profiles in the workspace. | +| **AI peer** (`aiPeer`) | One per Hermes profile. Host key `hermes` → default; `hermes.` for others. | +| **Observation** | Per-peer toggles controlling what Honcho models from whose messages. `directional` (default, all four on) or `unified` (single-observer pool). 
| + +### New profile, fresh Honcho peer ```bash -hermes profile create coder --clone # creates honcho peer "coder", inherits config from default +hermes profile create coder --clone ``` -What `--clone` does: creates a `hermes.coder` host block in `honcho.json` with `aiPeer: "coder"`, shared `workspace`, inherited `peerName`, `recallMode`, `writeFrequency`, `observation`, etc. The peer is eagerly created in Honcho so it exists before first message. +`--clone` creates a `hermes.coder` host block in `honcho.json` with `aiPeer: "coder"`, shared `workspace`, inherited `peerName`, `recallMode`, `writeFrequency`, `observation`, etc. The AI peer is eagerly created in Honcho so it exists before the first message. -For profiles created before Honcho was set up: +### Existing profiles, backfill Honcho peers ```bash -hermes honcho sync # scans all profiles, creates host blocks for any missing ones +hermes honcho sync ``` -This inherits settings from the default `hermes` host block and creates new AI peers for each profile. Idempotent -- skips profiles that already have a host block. +Scans every Hermes profile, creates host blocks for any profile without one, inherits settings from the default `hermes` block, and creates the new AI peers eagerly. Idempotent — skips profiles that already have a host block. + +### Per-profile observation + +Each host block can override the observation config independently. 
Example: a code-focused profile where the AI peer observes the user but doesn't self-model: + +```json +"hermes.coder": { + "aiPeer": "coder", + "observation": { + "user": { "observeMe": true, "observeOthers": true }, + "ai": { "observeMe": false, "observeOthers": true } + } +} +``` + +**Observation toggles (one set per peer):** + +| Toggle | Effect | +|--------|--------| +| `observeMe` | Honcho builds a representation of this peer from its own messages | +| `observeOthers` | This peer observes the other peer's messages (feeds cross-peer reasoning) | + +Presets via `observationMode`: + +- **`"directional"`** (default) — all four flags on. Full mutual observation; enables cross-peer dialectic. +- **`"unified"`** — user `observeMe: true`, AI `observeOthers: true`, rest false. Single-observer pool; AI models the user but not itself, user peer only self-models. + +Server-side toggles set via the [Honcho dashboard](https://app.honcho.dev) win over local defaults — synced back at session init. + +See the [Honcho page](./honcho.md#observation-directional-vs-unified) for the full observation reference.
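The two presets expand mechanically into the four flags. Here is a hypothetical helper that mirrors the preset tables above; it is a sketch for illustration, not the actual Hermes or Honcho implementation, and the `observation_flags` name is an assumption:

```python
def observation_flags(mode: str) -> dict:
    """Expand an observationMode preset into per-peer observation flags."""
    if mode == "directional":
        # Full mutual observation: all four flags on.
        return {
            "user": {"observeMe": True, "observeOthers": True},
            "ai": {"observeMe": True, "observeOthers": True},
        }
    if mode == "unified":
        # Single-observer pool: the AI observes the user,
        # the user peer only self-models.
        return {
            "user": {"observeMe": True, "observeOthers": False},
            "ai": {"observeMe": False, "observeOthers": True},
        }
    raise ValueError(f"unknown observationMode: {mode!r}")


print(observation_flags("unified")["ai"])
# {'observeMe': False, 'observeOthers': True}
```

An explicit `observation` block in a host config overrides whatever the preset would produce, one flag at a time.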
Full honcho.json example (multi-profile) @@ -181,7 +222,7 @@ This inherits settings from the default `hermes` host block and creates new AI p }, "dialecticReasoningLevel": "low", "dialecticDynamic": true, - "dialecticCadence": 3, + "dialecticCadence": 2, "dialecticDepth": 1, "dialecticMaxChars": 600, "contextCadence": 1,