mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-26 01:01:40 +00:00
fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:
- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
3 for cost, but existing installs with no explicit config silently went
from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
behavior; 3+ remains available for cost-conscious setups. Docs + wizard
+ status output updated to match.
- Session-start prewarm now consumed. Previously fired a .chat() on init
whose result landed in HonchoSessionManager._dialectic_cache and was
never read — pop_dialectic_result had zero call sites. Turn 1 paid for
a duplicate synchronous dialectic. Prewarm now writes directly to the
plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
no extra call.
- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
weak output on cold peers; the multi-pass audit/reconcile cycle is
exactly the case dialecticDepth was built for. Prewarm now runs the
full configured depth in the background.
- Silent dialectic failure no longer burns the cadence window.
_last_dialectic_turn now advances only when the result is non-empty.
Empty result → next eligible turn retries immediately instead of
waiting the full cadence gap.
- Thread pile-up guard. queue_prefetch skips when a prior dialectic
thread is still in-flight, preventing stacked races on _prefetch_result.
- First-turn sync timeout is recoverable. Previously on timeout the
background thread's result was stored in a dead local list. Now the
thread writes into _prefetch_result under lock so the next turn
picks it up.
- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
guard let first-turn sync + same-turn queue_prefetch both fire.
Gate now always applies.
- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
+2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
`reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
(string, default "high"; previously parsed but never enforced).
Respects dialecticDepthLevels and proportional lighter-early passes.
- Restored short-prompt skip, dropped in ef7f3156. One-word
acknowledgements ("ok", "y", "thanks") and slash commands bypass
both injection and dialectic fire.
- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
set_dialectic_result, pop_dialectic_result — all unused after prewarm
refactor.
Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
This commit is contained in:
parent
bf5d7462ba
commit
78586ce036
10 changed files with 665 additions and 107 deletions
|
|
@ -100,9 +100,11 @@ class HonchoSessionManager:
|
|||
self._write_frequency = write_frequency
|
||||
self._turn_counter: int = 0
|
||||
|
||||
# Prefetch caches: session_key → last result (consumed once per turn)
|
||||
# Prefetch cache: session_key → last context result (consumed once per turn).
|
||||
# Dialectic results are cached on the plugin side (HonchoMemoryProvider
|
||||
# ._prefetch_result) so session-start prewarm and turn-driven fires share
|
||||
# one source of truth; see __init__.py _do_session_init for the prewarm.
|
||||
self._context_cache: dict[str, dict] = {}
|
||||
self._dialectic_cache: dict[str, str] = {}
|
||||
self._prefetch_cache_lock = threading.Lock()
|
||||
self._dialectic_reasoning_level: str = (
|
||||
config.dialectic_reasoning_level if config else "low"
|
||||
|
|
@ -499,8 +501,8 @@ class HonchoSessionManager:
|
|||
Query Honcho's dialectic endpoint about a peer.
|
||||
|
||||
Runs an LLM on Honcho's backend against the target peer's full
|
||||
representation. Higher latency than context() — call async via
|
||||
prefetch_dialectic() to avoid blocking the response.
|
||||
representation. Higher latency than context() — callers run this in
|
||||
a background thread (see HonchoMemoryProvider) to avoid blocking.
|
||||
|
||||
Args:
|
||||
session_key: The session key to query against.
|
||||
|
|
@ -555,42 +557,6 @@ class HonchoSessionManager:
|
|||
logger.warning("Honcho dialectic query failed: %s", e)
|
||||
return ""
|
||||
|
||||
def prefetch_dialectic(self, session_key: str, query: str) -> None:
|
||||
"""
|
||||
Fire a dialectic_query in a background thread, caching the result.
|
||||
|
||||
Non-blocking. The result is available via pop_dialectic_result()
|
||||
on the next call (typically the following turn). Reasoning level
|
||||
is selected dynamically based on query complexity.
|
||||
|
||||
Args:
|
||||
session_key: The session key to query against.
|
||||
query: The user's current message, used as the query.
|
||||
"""
|
||||
def _run():
|
||||
result = self.dialectic_query(session_key, query)
|
||||
if result:
|
||||
self.set_dialectic_result(session_key, result)
|
||||
|
||||
t = threading.Thread(target=_run, name="honcho-dialectic-prefetch", daemon=True)
|
||||
t.start()
|
||||
|
||||
def set_dialectic_result(self, session_key: str, result: str) -> None:
|
||||
"""Store a prefetched dialectic result in a thread-safe way."""
|
||||
if not result:
|
||||
return
|
||||
with self._prefetch_cache_lock:
|
||||
self._dialectic_cache[session_key] = result
|
||||
|
||||
def pop_dialectic_result(self, session_key: str) -> str:
|
||||
"""
|
||||
Return and clear the cached dialectic result for this session.
|
||||
|
||||
Returns empty string if no result is ready yet.
|
||||
"""
|
||||
with self._prefetch_cache_lock:
|
||||
return self._dialectic_cache.pop(session_key, "")
|
||||
|
||||
def prefetch_context(self, session_key: str, user_message: str | None = None) -> None:
|
||||
"""
|
||||
Fire get_prefetch_context in a background thread, caching the result.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue