mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:
- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
3 for cost, but existing installs with no explicit config silently went
from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
behavior; 3+ remains available for cost-conscious setups. Docs + wizard
+ status output updated to match.
- Session-start prewarm now consumed. Previously fired a .chat() on init
whose result landed in HonchoSessionManager._dialectic_cache and was
never read — pop_dialectic_result had zero call sites. Turn 1 paid for
a duplicate synchronous dialectic. Prewarm now writes directly to the
plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
no extra call.
- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
weak output on cold peers; the multi-pass audit/reconcile cycle is
exactly the case dialecticDepth was built for. Prewarm now runs the
full configured depth in the background.
- Silent dialectic failure no longer burns the cadence window.
_last_dialectic_turn now advances only when the result is non-empty.
Empty result → next eligible turn retries immediately instead of
waiting the full cadence gap.
- Thread pile-up guard. queue_prefetch skips when a prior dialectic
thread is still in-flight, preventing stacked races on _prefetch_result.
- First-turn sync timeout is recoverable. Previously on timeout the
background thread's result was stored in a dead local list. Now the
thread writes into _prefetch_result under lock so the next turn
picks it up.
- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
guard let first-turn sync + same-turn queue_prefetch both fire.
Gate now always applies.
- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
+2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
`reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
(string, default "high"; previously parsed but never enforced).
Respects dialecticDepthLevels and proportional lighter-early passes.
- Restored short-prompt skip, dropped in ef7f3156. One-word
acknowledgements ("ok", "y", "thanks") and slash commands bypass
both injection and dialectic fire.
- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
set_dialectic_result, pop_dialectic_result — all unused after prewarm
refactor.
Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
This commit is contained in:
parent
bf5d7462ba
commit
78586ce036
10 changed files with 665 additions and 107 deletions
|
|
@ -145,10 +145,10 @@ Controls **how often** dialectic and context calls happen.
|
|||
| Key | Default | Description |
|
||||
|-----|---------|-------------|
|
||||
| `contextCadence` | `1` | Min turns between context API calls |
|
||||
| `dialecticCadence` | `3` | Min turns between dialectic API calls |
|
||||
| `dialecticCadence` | `1` | Min turns between dialectic API calls |
|
||||
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` for base context injection |
|
||||
|
||||
Higher cadence values reduce API calls and cost. `dialecticCadence: 3` (default) means the dialectic engine fires at most every 3rd turn.
|
||||
Higher cadence values reduce API calls and cost. `dialecticCadence: 1` (default) fires every turn; set to `3` or higher to throttle for cost.
|
||||
|
||||
### Depth (how many)
|
||||
|
||||
|
|
@ -368,7 +368,7 @@ Config file: `$HERMES_HOME/honcho.json` (profile-local) or `~/.honcho/config.jso
|
|||
| `contextTokens` | uncapped | Max tokens for the combined base context injection (summary + representation + card). Opt-in cap — omit to leave uncapped, set to an integer to bound injection size. |
|
||||
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` |
|
||||
| `contextCadence` | `1` | Min turns between context API calls |
|
||||
| `dialecticCadence` | `3` | Min turns between dialectic LLM calls |
|
||||
| `dialecticCadence` | `1` | Min turns between dialectic LLM calls |
|
||||
|
||||
The `contextTokens` budget is enforced at injection time. If the session summary + representation + card exceed the budget, Honcho trims the summary first, then the representation, preserving the card. This prevents context blowup in long sessions.
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue