mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:
- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
3 for cost, but existing installs with no explicit config silently went
from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
behavior; 3+ remains available for cost-conscious setups. Docs + wizard
+ status output updated to match.
- Session-start prewarm now consumed. Previously fired a .chat() on init
whose result landed in HonchoSessionManager._dialectic_cache and was
never read — pop_dialectic_result had zero call sites. Turn 1 paid for
a duplicate synchronous dialectic. Prewarm now writes directly to the
plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
no extra call.
- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
weak output on cold peers; the multi-pass audit/reconcile cycle is
exactly the case dialecticDepth was built for. Prewarm now runs the
full configured depth in the background.
- Silent dialectic failure no longer burns the cadence window.
_last_dialectic_turn now advances only when the result is non-empty.
Empty result → next eligible turn retries immediately instead of
waiting the full cadence gap.
- Thread pile-up guard. queue_prefetch skips when a prior dialectic
thread is still in-flight, preventing stacked races on _prefetch_result.
- First-turn sync timeout is recoverable. Previously on timeout the
background thread's result was stored in a dead local list. Now the
thread writes into _prefetch_result under lock so the next turn
picks it up.
- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
guard let first-turn sync + same-turn queue_prefetch both fire.
Gate now always applies.
- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
+2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
`reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
(string, default "high"; previously parsed but never enforced).
Respects dialecticDepthLevels and proportional lighter-early passes.
- Restored short-prompt skip, dropped in ef7f3156. One-word
acknowledgements ("ok", "y", "thanks") and slash commands bypass
both injection and dialectic fire.
- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
set_dialectic_result, pop_dialectic_result — all unused after prewarm
refactor.
Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
This commit is contained in:
parent
bf5d7462ba
commit
78586ce036
10 changed files with 665 additions and 107 deletions
|
|
@ -77,7 +77,7 @@ Cost and depth are controlled by three independent knobs:
|
|||
| Knob | Controls | Default |
|
||||
|------|----------|---------|
|
||||
| `contextCadence` | Turns between `context()` API calls (base layer refresh) | `1` |
|
||||
| `dialecticCadence` | Turns between `peer.chat()` LLM calls (dialectic layer refresh) | `3` |
|
||||
| `dialecticCadence` | Turns between `peer.chat()` LLM calls (dialectic layer refresh) | `1` |
|
||||
| `dialecticDepth` | Number of `.chat()` passes per dialectic invocation (1–3) | `1` |
|
||||
|
||||
These are orthogonal — you can have frequent context refreshes with infrequent dialectic, or deep multi-pass dialectic at low frequency. Example: `contextCadence: 1, dialecticCadence: 5, dialecticDepth: 2` refreshes base context every turn, runs dialectic every 5 turns, and each dialectic run makes 2 passes.
|
||||
|
|
@ -104,7 +104,7 @@ Honcho is configured in `~/.honcho/config.json` (global) or `$HERMES_HOME/honcho
|
|||
|-----|---------|-------------|
|
||||
| `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Set to an integer (e.g. 1200) to cap. Truncates at word boundaries |
|
||||
| `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) |
|
||||
| `dialecticCadence` | `3` | Minimum turns between `peer.chat()` LLM calls (dialectic layer). In `tools` mode, irrelevant — model calls explicitly |
|
||||
| `dialecticCadence` | `1` | Minimum turns between `peer.chat()` LLM calls (dialectic layer). In `tools` mode, irrelevant — model calls explicitly |
|
||||
| `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped to 1–3 |
|
||||
| `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. Overrides proportional defaults |
|
||||
| `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` |
|
||||
|
|
|
|||
|
|
@ -82,7 +82,7 @@ hermes memory setup # select "honcho"
|
|||
| `workspace` | host key | Shared workspace ID |
|
||||
| `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Truncates at word boundaries |
|
||||
| `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) |
|
||||
| `dialecticCadence` | `3` | Minimum turns between `peer.chat()` LLM calls. Only applies to `hybrid`/`context` modes |
|
||||
| `dialecticCadence` | `1` | Minimum turns between `peer.chat()` LLM calls. Only applies to `hybrid`/`context` modes |
|
||||
| `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped 1–3. Pass 0: cold/warm prompt, pass 1: self-audit, pass 2: reconciliation |
|
||||
| `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. Overrides proportional defaults |
|
||||
| `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` |
|
||||
|
|
@ -181,7 +181,7 @@ This inherits settings from the default `hermes` host block and creates new AI p
|
|||
},
|
||||
"dialecticReasoningLevel": "low",
|
||||
"dialecticDynamic": true,
|
||||
"dialecticCadence": 3,
|
||||
"dialecticCadence": 1,
|
||||
"dialecticDepth": 1,
|
||||
"dialecticMaxChars": 600,
|
||||
"contextCadence": 1,
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue