mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-24 10:52:21 +00:00
# Honcho Memory Provider

AI-native cross-session user modeling with multi-pass dialectic reasoning, session summaries, bidirectional peer tools, and persistent conclusions.

> **Honcho docs:** <https://docs.honcho.dev/v3/guides/integrations/hermes>

## Requirements

- `pip install honcho-ai`
- A Honcho Cloud account — connect via OAuth sign-in or an API key from
  [app.honcho.dev](https://app.honcho.dev) — or a self-hosted instance

## Setup

```bash
hermes memory setup honcho    # configure Honcho directly (works on a fresh install)
hermes memory setup           # generic picker, choose Honcho from the list
```

For cloud, the wizard asks **OAuth or API key**. OAuth opens a browser
sign-in and stores the grant itself — nothing to copy; tokens refresh
automatically. The desktop app offers the same flow as a **Connect** link
next to the memory-provider dropdown.

Or manually:
```bash
hermes config set memory.provider honcho
echo "HONCHO_API_KEY=***" >> ~/.hermes/.env
```

> `hermes honcho setup` also works, but only **after** Honcho is the active
> memory provider — the `honcho` subcommand is registered for the active
> provider only. On a fresh install, use `hermes memory setup honcho`.

## Architecture Overview

### Two-Layer Context Injection

Context is injected into the **user message** at API-call time (not the system prompt) to preserve prompt caching. Only a static mode header goes in the system prompt. The injected block is wrapped in `<memory-context>` fences with a system note clarifying it's background data, not new user input.

Two independent layers, each on its own cadence:

**Layer 1 — Base context** (refreshed every `contextCadence` turns):
1. **SESSION SUMMARY** — from `session.context(summary=True)`, placed first
2. **User Representation** — Honcho's evolving model of the user
3. **User Peer Card** — key facts snapshot
4. **AI Self-Representation** — Honcho's model of the AI peer
5. **AI Identity Card** — AI peer facts

**Layer 2 — Dialectic supplement** (fired every `dialecticCadence` turns):
Multi-pass `.chat()` reasoning about the user, appended after base context.

Both layers are joined, then truncated to fit the `contextTokens` budget via `_truncate_to_budget` (tokens × 4 chars, word-boundary safe).

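
As a rough illustration of the budget step above, the sketch below truncates a joined context string to `contextTokens × 4` characters without splitting a word. It is not the project's `_truncate_to_budget`: the name `truncate_to_budget`, its parameters, and the reading of "word-boundary safe" as "back up to the previous whitespace" are assumptions.

```python
def truncate_to_budget(text: str, context_tokens: int, chars_per_token: int = 4) -> str:
    """Hypothetical stand-in for the budget truncation described above."""
    limit = max(context_tokens, 0) * chars_per_token
    if len(text) <= limit:
        return text
    if limit == 0:
        return ""
    cut = text[:limit]
    # The cut is mid-word only if characters on both sides of it are non-whitespace.
    mid_word = not cut[-1].isspace() and not text[limit].isspace()
    if mid_word:
        parts = cut.rsplit(None, 1)
        # A single unbroken word longer than the budget is hard-cut instead.
        if len(parts) == 2:
            cut = parts[0]
    return cut.rstrip()
```

With a budget of 2 tokens (8 characters), `truncate_to_budget("alpha beta gamma", 2)` keeps only `alpha`, because cutting at 8 characters would split `beta`.
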
### Cold Start vs Warm Session Prompts

Dialectic pass 0 automatically selects its prompt based on session state:

- **Cold** (no base context cached): "Who is this person? What are their preferences, goals, and working style? Focus on facts that would help an AI assistant be immediately useful."
- **Warm** (base context exists): "Given what's been discussed in this session so far, what context about this user is most relevant to the current conversation? Prioritize active context over biographical facts."

Not configurable — determined automatically.

### Dialectic Depth (Multi-Pass Reasoning)

`dialecticDepth` (1–3, clamped) controls how many `.chat()` calls fire per dialectic cycle:

| Depth | Passes | Behavior |
|-------|--------|----------|
| 1 | single `.chat()` | Base query only (cold or warm prompt) |
| 2 | audit + synthesis | Pass 0 result is self-audited; pass 1 does targeted synthesis. Conditional bail-out if pass 0 returns strong signal (>300 chars or structured with bullets/sections >100 chars) |
| 3 | audit + synthesis + reconciliation | Pass 2 reconciles contradictions across prior passes into a final synthesis |

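
The bail-out condition in the depth-2 row can be sketched as a small predicate. This is a minimal sketch, not the provider's code: the name `is_strong_signal` and the regular expression used to detect "bullets/sections" (list markers, numbered items, Markdown headings) are assumptions; only the 300 and 100 character thresholds come from the table.

```python
import re

# Assumed notion of "structured": a line starting with a bullet, a numbered
# item, or a Markdown heading.
_STRUCTURE = re.compile(r"^\s*(?:[-*•]\s+|\d+\.\s+|#{1,6}\s+)", re.MULTILINE)


def is_strong_signal(result: str) -> bool:
    """Return True when a pass-0 result is strong enough to skip later passes."""
    text = result.strip()
    if len(text) > 300:
        return True
    return len(text) > 100 and bool(_STRUCTURE.search(text))
```
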
### Proportional Reasoning Levels

When `dialecticDepthLevels` is not set, each pass uses a proportional level relative to `dialecticReasoningLevel` (the "base"):

| Depth | Pass levels |
|-------|-------------|
| 1 | [base] |
| 2 | [minimal, base] |
| 3 | [minimal, base, low] |

Override with `dialecticDepthLevels`: an explicit array of reasoning level strings per pass.

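
The table above can be read as a lookup keyed by depth. The sketch below is illustrative only: `pass_levels` is a hypothetical helper, and it assumes an override array is used as given, with one entry per pass.

```python
from typing import List, Optional


def pass_levels(depth: int, base: str, overrides: Optional[List[str]] = None) -> List[str]:
    """Reasoning level for each dialectic pass, following the table above."""
    depth = max(1, min(3, int(depth)))  # dialecticDepth is clamped to 1-3
    if overrides:
        # Assumption: dialecticDepthLevels replaces the proportional defaults outright.
        return list(overrides)
    proportional = {
        1: [base],
        2: ["minimal", base],
        3: ["minimal", base, "low"],
    }
    return proportional[depth]
```
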
### Query-Adaptive Reasoning Level

The auto-injected dialectic scales `dialecticReasoningLevel` by query length: +1 level at ≥120 chars, +2 at ≥400, clamped at `reasoningLevelCap` (default `"high"`). Disable with `reasoningHeuristic: false` to pin every auto call to `dialecticReasoningLevel`.

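
A minimal sketch of that scaling rule, assuming the five levels are ordered `minimal < low < medium < high < max`. The function name `adaptive_level` is hypothetical, and how a base level that already sits above the cap is treated is not stated in this README; the sketch simply leaves such a base unchanged.

```python
LEVELS = ["minimal", "low", "medium", "high", "max"]


def adaptive_level(base: str, query: str, cap: str = "high", heuristic: bool = True) -> str:
    """Scale the base reasoning level by query length, clamped at the cap."""
    if not heuristic:
        return base  # reasoningHeuristic: false pins the level
    base_idx, cap_idx = LEVELS.index(base), LEVELS.index(cap)
    if base_idx >= cap_idx:
        return base  # assumption: the heuristic never lowers the base
    bump = 2 if len(query) >= 400 else 1 if len(query) >= 120 else 0
    return LEVELS[min(base_idx + bump, cap_idx)]
```

With the defaults (`"low"` base, `"high"` cap), a 120-character query runs at `"medium"` and a 400-character query at `"high"`.
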
### Three Orthogonal Dialectic Knobs

| Knob | Controls | Type |
|------|----------|------|
| `dialecticCadence` | How often — minimum turns between dialectic firings | int |
| `dialecticDepth` | How many — passes per firing (1–3) | int |
| `dialecticReasoningLevel` | How hard — base reasoning level per `.chat()` call | string |

### Input Sanitization

`run_conversation` strips leaked `<memory-context>` blocks from user input before processing. When `saveMessages` persists a turn that included injected context, the block can reappear in subsequent turns via message history. The sanitizer removes `<memory-context>` blocks plus associated system notes.

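
The idea behind the sanitizer can be sketched with a single regular expression. This is not the code in `run_conversation`: the helper name `strip_memory_context` is hypothetical, and it assumes the block closes with `</memory-context>` and that the system note sits inside the fences.

```python
import re

# Non-greedy match so that several leaked blocks are removed one by one.
_MEMORY_BLOCK = re.compile(r"<memory-context>.*?</memory-context>\s*", re.DOTALL)


def strip_memory_context(user_input: str) -> str:
    """Remove leaked <memory-context> blocks from a user message."""
    return _MEMORY_BLOCK.sub("", user_input).strip()
```
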
## Tools

Five bidirectional tools. All accept an optional `peer` parameter (`"user"` or `"ai"`, default `"user"`).

| Tool | LLM call? | Description |
|------|-----------|-------------|
| `honcho_profile` | No | Peer card — key facts snapshot |
| `honcho_search` | No | Semantic search over stored context (800 tok default, 2000 max) |
| `honcho_context` | No | Full session context: summary, representation, card, messages |
| `honcho_reasoning` | Yes | LLM-synthesized answer via dialectic `.chat()` |
| `honcho_conclude` | No | Write a persistent fact/conclusion about the user |

Tool visibility depends on `recallMode`: hidden in `context` mode, always present in `tools` and `hybrid`.

## Config Resolution

Config is read from the first file that exists:

| Priority | Path | Scope |
|----------|------|-------|
| 1 | `$HERMES_HOME/honcho.json` | Profile-local (isolated Hermes instances) |
| 2 | `~/.hermes/honcho.json` | Default profile (shared host blocks) |
| 3 | `~/.honcho/config.json` | Global (cross-app interop) |

Host key is derived from the active Hermes profile: `hermes` (default) or `hermes_<profile>`.

For every key, resolution order is: **host block > root > env var > default**.

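
The per-key resolution order can be pictured as a chain of lookups. The sketch below is an illustration under stated assumptions, not the provider's loader: `host_key` and `resolve_key` are hypothetical names, the config is assumed to be the parsed `honcho.json` dict with a `hosts` map, and the `environ` parameter exists only to make the example easy to exercise.

```python
import os
from typing import Any, Mapping, Optional


def host_key(profile: Optional[str] = None) -> str:
    """`hermes` for the default profile, `hermes_<profile>` otherwise."""
    return f"hermes_{profile}" if profile else "hermes"


def resolve_key(config: Mapping[str, Any], host: str, key: str,
                env_var: Optional[str] = None, default: Any = None,
                environ: Optional[Mapping[str, str]] = None) -> Any:
    """Resolve one setting: host block > root > env var > default."""
    env = os.environ if environ is None else environ
    host_block = config.get("hosts", {}).get(host, {})
    if key in host_block:
        return host_block[key]
    if key in config:
        return config[key]
    if env_var and env_var in env:
        return env[env_var]
    return default
```

For example, with a root `recallMode` of `"hybrid"` and a `hermes_coder` host block that sets `"tools"`, the coder profile resolves to `"tools"` while the default profile still gets `"hybrid"`.
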
## Full Configuration Reference

### Identity & Connection

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `apiKey` | string | — | API key. Falls back to `HONCHO_API_KEY` env var. When connected via OAuth, holds the auto-refreshing access token instead |
| `oauth` | object | — | OAuth grant (refresh token, expiry, client, token endpoint). Written by the Connect/sign-in flows and rotated automatically — not hand-edited. Optional: an API key alone works without it |
| `baseUrl` | string | — | Base URL for self-hosted Honcho. Local URLs auto-skip API key auth |
| `environment` | string | `"production"` | SDK environment mapping |
| `enabled` | bool | auto | Master toggle. Auto-enables when `apiKey` or `baseUrl` present |
| `workspace` | string | host key | Honcho workspace ID. Shared environment — all profiles in the same workspace can see the same user identity and related memories |
| `peerName` | string | — | User peer identity |
| `aiPeer` | string | host key | AI peer identity |

### Identity Mapping (Gateway Multi-User)

In gateway deployments (Telegram, Discord, Slack, etc.) each user arrives with a platform-native runtime ID (Telegram UID, Discord snowflake, Slack user ID). The three keys below control how those runtime IDs map to Honcho peers. The resolver is config-driven and deterministic — no automatic merging or runtime inference.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `pinUserPeer` | bool | `false` | When `true`, every gateway runtime user collapses to `peerName`. For single-operator deployments where all your platforms (and any other users) should share one peer |
| `userPeerAliases` | object | `{}` | Map of runtime IDs to peer IDs (`{"7654321": "alice"}`). Many-to-one is the intended pattern — alias all your runtime IDs to one peer name. One-to-many is not supported; one runtime ID resolves to exactly one peer |
| `runtimePeerPrefix` | string | `""` | Prepended to unknown runtime IDs to namespace them (e.g. `"telegram_"` → `telegram_7654321`). Used only when no alias matches. Prevents collisions between platforms whose runtime IDs share the same shape |

> **Deprecated:** `pinPeerName` is a legacy alias for `pinUserPeer`, still read for back-compat (`pinUserPeer` wins where both are set). `hermes honcho setup` migrates it onto `pinUserPeer` on touch and never writes it.

**Resolver ladder** (first match wins):

```
1. pinUserPeer / pinPeerName=true   → return peerName (ignore runtime ID)
2. userPeerAliases[runtime_id]      → return aliased peer
3. userPeerAliases[runtime_id_alt]  → check alt-ID too (Telegram UID + username, etc.)
4. runtimePeerPrefix + runtime_id   → namespaced peer, with sha256 collision escalation
5. raw sanitized runtime_id         → fallback peer
6. peerName                         → no runtime ID at all (CLI/TUI)
7. session-key fallback             → no config either
```

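
The ladder above translates fairly directly into code. What follows is a simplified sketch, not the actual resolver: `resolve_user_peer` and `_sanitize` are hypothetical names, the sanitization rule (replace anything outside `[A-Za-z0-9_-]` with `_`) is an assumption, and the sha256 collision escalation from step 4 is left out because its details are not described here.

```python
import re
from typing import Any, Mapping, Optional


def _sanitize(runtime_id: Any) -> str:
    # Assumed sanitization rule, for illustration only.
    return re.sub(r"[^A-Za-z0-9_-]", "_", str(runtime_id))


def resolve_user_peer(cfg: Mapping[str, Any], runtime_id: Any = None,
                      runtime_id_alt: Any = None,
                      session_key: Optional[str] = None) -> Optional[str]:
    """Walk the resolver ladder; the first matching step wins."""
    peer_name = cfg.get("peerName")
    # Step 1: pinUserPeer wins over the legacy pinPeerName when both are set.
    pinned = cfg.get("pinUserPeer", cfg.get("pinPeerName", False))
    if pinned and peer_name:
        return peer_name
    # Steps 2-3: explicit aliases, primary ID first, then the alternate ID.
    aliases = cfg.get("userPeerAliases", {})
    for rid in (runtime_id, runtime_id_alt):
        if rid is not None and str(rid) in aliases:
            return aliases[str(rid)]
    # Steps 4-5: namespaced or raw sanitized runtime ID (an empty prefix gives the raw ID).
    if runtime_id is not None:
        return cfg.get("runtimePeerPrefix", "") + _sanitize(runtime_id)
    # Steps 6-7: no runtime ID at all (CLI/TUI), then the session key.
    return peer_name or session_key
```
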
**Why no `pinAiPeer`?** The AI peer is already pinned by construction — `aiPeer` is the only AI-side identity setting and the resolver never overrides it. Only the user-side peer has the runtime-vs-config tension that `pinUserPeer` resolves.

**Host vs root semantics.** All three keys are accepted at both root and `hosts.<host>` levels. Host-level wins. For maps and prefixes, host-level *replaces* the root value as a whole (not merge), so a host can intentionally own its identity universe or wipe it with `userPeerAliases: {}` / `runtimePeerPrefix: ""`.

**Setup — gateway identity tree.** `hermes honcho setup` only asks about identity mapping when it detects a connected gateway platform (it inspects the gateway config; off-gateway the step is skipped because these keys do nothing without a runtime user ID). When it runs, it asks *who talks to this gateway?* and derives the keys:

- **just me** → `pinUserPeer: true`. Every non-agent gateway user collapses to `peerName`; the pin overrides all aliases, so pick this only when no user-side identity needs its own peer. Personal use where you connect Hermes to your own Telegram/Discord/etc. If separate agents reach the gateway and each needs a distinct peer, do **not** pin — leave `pinUserPeer: false` and map them via `userPeerAliases` (the `[e]` editor).
- **me + other people, pooled** → `pinUserPeer: false` + `userPeerAliases` mapping your runtime IDs to `peerName`. You stay on the shared history; everyone else gets their own peer.
- **me + other people / only other people** → `pinUserPeer: false`, optional `runtimePeerPrefix`. Each runtime user → own peer. For bots serving many humans.

Pick **[e]** at the prompt to set the three keys directly instead of going through the tree.

**Un-pinning (single → per-user).** Flipping `pinUserPeer` from `true` to `false` does not migrate data. Memory accumulated under `peerName` while pinned stays there; runtime users now resolve to fresh, empty peers. To preserve your own continuity, choose the **pooled** path — alias your runtime IDs back to `peerName` so your turns keep landing on the pooled history while other users get their own peers. The wizard offers this steer automatically when it detects you're un-pinning a previously pinned profile.

### Memory & Recall

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `recallMode` | string | `"hybrid"` | `"hybrid"` (auto-inject + tools), `"context"` (auto-inject only, tools hidden), `"tools"` (tools only, no injection). Legacy `"auto"` → `"hybrid"` |
| `observationMode` | string | `"directional"` | Preset: `"directional"` (all on) or `"unified"` (user observes self, AI observes others). Use `observation` object for granular control |
| `observation` | object | — | Per-peer observation config (see Observation section) |

### Write Behavior

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `writeFrequency` | string/int | `"async"` | `"async"` (background), `"turn"` (sync per turn), `"session"` (batch on end), or integer N (every N turns) |
| `saveMessages` | bool | `true` | Persist messages to Honcho API |

### Session Resolution

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `sessionStrategy` | string | `"per-directory"` | `"per-directory"`, `"per-session"`, `"per-repo"` (git root), `"global"` |
| `sessionPeerPrefix` | bool | `false` | Prepend peer name to session keys |
| `sessions` | object | `{}` | Manual directory-to-session-name mappings |

#### Session Name Resolution

The Honcho session name determines which conversation bucket memory lands in. Resolution follows a priority chain — first match wins:

| Priority | Source | Example session name |
|----------|--------|---------------------|
| 1 | Manual map (`sessions` config) | `"myproject-main"` |
| 2 | `/title` command (mid-session rename) | `"refactor-auth"` |
| 3 | Gateway session key (Telegram, Discord, etc.) | `"agent-main-telegram-dm-8439114563"` |
| 4 | `per-session` strategy | Hermes session ID (`20260415_a3f2b1`) |
| 5 | `per-repo` strategy | Git root directory name (`hermes-agent`) |
| 6 | `per-directory` strategy | Current directory basename (`src`) |
| 7 | `global` strategy | Workspace name (`hermes`) |

Gateway platforms always resolve via priority 3 (per-chat isolation) regardless of `sessionStrategy`. The strategy setting only affects CLI sessions.

If `sessionPeerPrefix` is `true`, the peer name is prepended: `alice-hermes-agent`.

#### What each strategy produces

- **`per-directory`** — basename of `$PWD`. Opening hermes in `~/code/myapp` and `~/code/other` gives two separate sessions. Same directory = same session across runs.
- **`per-repo`** — git root directory name. All subdirectories within a repo share one session. Falls back to `per-directory` if not inside a git repo.
- **`per-session`** — Hermes session ID (timestamp + hex). Every `hermes` invocation starts a fresh Honcho session. Falls back to `per-directory` if no session ID is available.
- **`global`** — workspace name. One session for everything. Memory accumulates across all directories and runs.

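
The four strategies and their fallbacks can be summarized in one function. This is a sketch of the documented behavior only: `strategy_session_name` is a hypothetical helper, and the caller is assumed to supply the git root (for example from `git rev-parse --show-toplevel`) and the Hermes session ID when they exist. Manual maps, titles, and gateway keys are deliberately left out.

```python
from pathlib import PurePosixPath
from typing import Optional


def strategy_session_name(strategy: str, cwd: str, workspace: str,
                          session_id: Optional[str] = None,
                          git_root: Optional[str] = None) -> str:
    """Session name produced by a sessionStrategy, with the documented fallbacks."""
    directory = PurePosixPath(cwd).name  # per-directory: basename of $PWD
    if strategy == "global":
        return workspace
    if strategy == "per-session":
        return session_id or directory  # falls back to per-directory
    if strategy == "per-repo":
        return PurePosixPath(git_root).name if git_root else directory
    return directory
```
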
### Multi-Profile Pattern

Multiple Hermes profiles can share one workspace while maintaining separate AI identities. Config resolution is **host block > root > env var > default** — host blocks inherit from root, so shared settings only need to be declared once:

```json
{
  "apiKey": "***",
  "workspace": "hermes",
  "peerName": "yourname",
  "hosts": {
    "hermes": {
      "aiPeer": "hermes",
      "recallMode": "hybrid",
      "sessionStrategy": "per-directory"
    },
    "hermes_coder": {
      "aiPeer": "coder",
      "recallMode": "tools",
      "sessionStrategy": "per-repo"
    }
  }
}
```

Both profiles see the same user (`yourname`) in the same shared environment (`hermes`), but each AI peer builds its own observations, conclusions, and behavior patterns. The coder's memory stays code-oriented; the main agent's stays broad.

Host key is derived from the active Hermes profile: `hermes` (default) or `hermes_<profile>` (e.g. `hermes -p coder` -> host key `hermes_coder`). Older `hermes.<profile>` host blocks are still read for compatibility and are migrated when the CLI writes profile-scoped Honcho config.

### Dialectic & Reasoning

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `dialecticDepth` | int | `1` | Passes per dialectic cycle (1–3, clamped). 1=single query, 2=audit+synthesis, 3=audit+synthesis+reconciliation |
| `dialecticDepthLevels` | array | — | Optional array of reasoning level strings per pass. Overrides proportional defaults. Example: `["minimal", "low", "medium"]` |
| `dialecticReasoningLevel` | string | `"low"` | Base reasoning level for `.chat()`: `"minimal"`, `"low"`, `"medium"`, `"high"`, `"max"` |
| `dialecticDynamic` | bool | `true` | When `true`, model can override reasoning level per-call via `honcho_reasoning` tool. When `false`, always uses `dialecticReasoningLevel` |
| `dialecticMaxChars` | int | `600` | Max chars of dialectic result included in the injected context |
| `dialecticMaxInputChars` | int | `10000` | Max chars for dialectic query input to `.chat()`. Honcho cloud limit: 10k |
| `reasoningHeuristic` | bool | `true` | Query-adaptive: auto-scale the auto-injected dialectic's level up by query length (+1 at ≥120 chars, +2 at ≥400), clamped at `reasoningLevelCap`. `false` pins every auto call to `dialecticReasoningLevel` |
| `reasoningLevelCap` | string | `"high"` | Ceiling for `reasoningHeuristic` scaling: `"minimal"`, `"low"`, `"medium"`, `"high"`, `"max"` |

### Token Budgets

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `contextTokens` | int | SDK default | Token budget for `context()` API calls. Also gates prefetch truncation (tokens × 4 chars) |
| `messageMaxChars` | int | `25000` | Max chars per message sent via `add_messages()`. Exceeding this triggers chunking with `[continued]` markers. Honcho cloud limit: 25k |

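
To make the `messageMaxChars` chunking concrete, here is one possible way to split an oversized message. It is a sketch under assumptions, not the provider's chunker: `chunk_message` is a hypothetical name, and placing the `[continued]` marker as a prefix on every follow-up chunk (counted against the limit) is a guess at the marker's position.

```python
from typing import List


def chunk_message(text: str, max_chars: int = 25000, marker: str = "[continued]") -> List[str]:
    """Split text into chunks of at most max_chars, marking follow-up chunks."""
    if len(text) <= max_chars:
        return [text]
    prefix = marker + " "
    if max_chars <= len(prefix):
        raise ValueError("max_chars must be larger than the marker prefix")
    chunks = [text[:max_chars]]
    rest = text[max_chars:]
    step = max_chars - len(prefix)  # room left once the marker is added
    while rest:
        chunks.append(prefix + rest[:step])
        rest = rest[step:]
    return chunks
```
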
### Cadence (Cost Control)

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `contextCadence` | int | `1` | Minimum turns between base context refreshes (session summary + representation + card) |
| `dialecticCadence` | int | `1` | Minimum turns between dialectic `.chat()` firings |
| `injectionFrequency` | string | `"every-turn"` | `"every-turn"` or `"first-turn"` (inject context on the first user message only, skip from turn 2 onward) |

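
A small sketch of how a cadence gate might behave. The helper names are hypothetical, and reading "minimum turns between" as "the turn counter has advanced by at least the cadence since the last firing" is an assumption; under that reading a `dialecticCadence` of 3 fires on turns 1, 4, 7, and so on.

```python
from typing import Optional


def should_fire(turn: int, last_fired_turn: Optional[int], cadence: int) -> bool:
    """True when enough turns have passed since the last refresh or firing."""
    if last_fired_turn is None:
        return True  # nothing has fired yet in this session
    return turn - last_fired_turn >= max(int(cadence), 1)


def should_inject(turn: int, injection_frequency: str = "every-turn") -> bool:
    """`first-turn` injects on turn 1 only; `every-turn` always injects."""
    return injection_frequency == "every-turn" or turn == 1
```
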
### Observation (Granular)

Maps 1:1 to Honcho's per-peer `SessionPeerConfig`. When present, overrides `observationMode` preset.

```json
"observation": {
  "user": { "observeMe": true, "observeOthers": true },
  "ai":   { "observeMe": true, "observeOthers": true }
}
```

| Field | Default | Description |
|-------|---------|-------------|
| `user.observeMe` | `true` | User peer self-observation (Honcho builds user representation) |
| `user.observeOthers` | `true` | User peer observes AI messages |
| `ai.observeMe` | `true` | AI peer self-observation (Honcho builds AI representation) |
| `ai.observeOthers` | `true` | AI peer observes user messages (enables cross-peer dialectic) |

Presets:
- `"directional"` (default): all four `true`
- `"unified"`: user `observeMe=true`, AI `observeOthers=true`, rest `false`

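
The relationship between the presets and the granular object can be sketched as follows. `resolve_observation` is a hypothetical helper; it assumes that an `observation` object, when present, replaces the preset entirely and that any field it omits takes the `true` default from the field table.

```python
import copy
from typing import Any, Dict, Mapping, Optional

PRESETS: Dict[str, Dict[str, Dict[str, bool]]] = {
    "directional": {
        "user": {"observeMe": True, "observeOthers": True},
        "ai": {"observeMe": True, "observeOthers": True},
    },
    "unified": {
        "user": {"observeMe": True, "observeOthers": False},
        "ai": {"observeMe": False, "observeOthers": True},
    },
}


def resolve_observation(mode: str = "directional",
                        observation: Optional[Mapping[str, Any]] = None) -> Dict[str, Dict[str, bool]]:
    """Effective per-peer observation flags for a config."""
    if not observation:
        return copy.deepcopy(PRESETS[mode])
    # Granular config overrides the preset; missing fields default to true.
    resolved = copy.deepcopy(PRESETS["directional"])
    for peer in ("user", "ai"):
        resolved[peer].update(observation.get(peer, {}))
    return resolved
```
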
### Hardcoded Limits

| Limit | Value |
|-------|-------|
| Search tool max tokens | 2000 (hard cap), 800 (default) |
| Peer card fetch tokens | 200 |

## Environment Variables

| Variable | Fallback for |
|----------|-------------|
| `HONCHO_API_KEY` | `apiKey` |
| `HONCHO_BASE_URL` | `baseUrl` |
| `HONCHO_ENVIRONMENT` | `environment` |
| `HERMES_HONCHO_HOST` | Host key override |
| `HONCHO_OAUTH_DASHBOARD` | OAuth authorize origin (default: cloud dashboard; local-dev `localhost:3000`) |
| `HONCHO_OAUTH_AUTHORIZE_URL` | Full authorize URL (overrides the dashboard origin) |
| `HONCHO_OAUTH_TOKEN_URL` | Token endpoint (default: cloud API; local-dev `localhost:8000`) |
| `HONCHO_OAUTH_CLIENT_ID` | OAuth client (default `hermes-agent`) |
| `HONCHO_OAUTH_SCOPE` | Requested scope (default `write`) |

## CLI Commands

| Command | Description |
|---------|-------------|
| `hermes memory setup honcho` | Configure Honcho directly — works on a fresh install |
| `hermes honcho setup` | Interactive setup wizard (only registered once Honcho is the active provider; redirects to `hermes memory setup`) |
| `hermes honcho status` | Show resolved config for active profile |
| `hermes honcho enable` / `disable` | Toggle Honcho for active profile |
| `hermes honcho mode <mode>` | Change recall or observation mode |
| `hermes honcho peer --user <name>` | Update user peer name |
| `hermes honcho peer --ai <name>` | Update AI peer name |
| `hermes honcho tokens --context <N>` | Set context token budget |
| `hermes honcho tokens --dialectic <N>` | Set dialectic max chars |
| `hermes honcho map <name>` | Map current directory to a session name |
| `hermes honcho sync` | Create host blocks for all Hermes profiles |

## Example Config

```json
{
  "apiKey": "***",
  "workspace": "hermes",
  "peerName": "username",
  "contextCadence": 2,
  "dialecticCadence": 3,
  "dialecticDepth": 2,
  "hosts": {
    "hermes": {
      "enabled": true,
      "aiPeer": "hermes",
      "recallMode": "hybrid",
      "observation": {
        "user": { "observeMe": true, "observeOthers": true },
        "ai": { "observeMe": true, "observeOthers": true }
      },
      "writeFrequency": "async",
      "sessionStrategy": "per-directory",
      "dialecticReasoningLevel": "low",
      "dialecticDepth": 2,
      "dialecticMaxChars": 600,
      "saveMessages": true
    },
    "hermes_coder": {
      "enabled": true,
      "aiPeer": "coder",
      "sessionStrategy": "per-repo",
      "dialecticDepth": 1,
      "dialecticDepthLevels": ["low"],
      "observation": {
        "user": { "observeMe": true, "observeOthers": false },
        "ai": { "observeMe": true, "observeOthers": true }
      }
    }
  },
  "sessions": {
    "/home/user/myproject": "myproject-main"
  }
}
```