hermes-agent/plugins
Erosika f512fdf697 feat(honcho): wire fire-and-forget worker + adaptive timeout + breaker into provider
Replaces the per-turn threading.Thread(target=_sync).start() pattern in
HonchoMemoryProvider with a persistent SyncWorker.  sync_turn() and
on_memory_write() both enqueue SyncTasks on the shared worker and return
immediately — run_conversation's post-response path is no longer coupled
to Honcho latency.

Three behavioural changes land here:

  Layer 1 — fire-and-forget sync
    No more join(timeout=5.0) on the prior turn's thread.  Back-to-back
    sync_turn() calls return in microseconds regardless of backend
    latency.  The worker runs tasks serially per provider (intentional:
    session writes must be ordered) and uses a bounded queue with
    oldest-drop backpressure.
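
    The Layer 1 behaviour can be sketched roughly as below. This is a
    minimal illustration, not the code in this commit: SketchWorker, its
    enqueue()/shutdown() signatures, the on_drop callback and the default
    queue size are all hypothetical names and values chosen for the
    example. It shows one daemon thread draining a bounded FIFO, with the
    oldest pending task evicted when the queue is full.

```python
import threading
from collections import deque
from typing import Callable, Deque, Optional, Tuple

Task = Tuple[Callable[[], None], Optional[Callable[[], None]]]


class SketchWorker:
    """Hypothetical sketch: one daemon thread draining a bounded FIFO."""

    def __init__(self, maxsize: int = 64) -> None:  # size is an assumption
        self._maxsize = maxsize
        self._tasks: Deque[Task] = deque()
        self._cond = threading.Condition()
        self._closed = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, fn: Callable[[], None],
                on_drop: Optional[Callable[[], None]] = None) -> None:
        """Return immediately; when full, evict the oldest pending task."""
        dropped: Optional[Task] = None
        with self._cond:
            if self._closed:
                raise RuntimeError("worker is shut down")
            if len(self._tasks) >= self._maxsize:
                dropped = self._tasks.popleft()  # oldest-drop backpressure
            self._tasks.append((fn, on_drop))
            self._cond.notify()
        # Notify outside the lock so a slow callback cannot block producers.
        if dropped is not None and dropped[1] is not None:
            dropped[1]()

    def _run(self) -> None:
        while True:
            with self._cond:
                while not self._tasks and not self._closed:
                    self._cond.wait()
                if not self._tasks:
                    return  # closed and fully drained
                fn, _ = self._tasks.popleft()
            try:
                fn()  # tasks run one at a time, in enqueue order
            except Exception:
                pass  # a failing task must not kill the worker thread

    def shutdown(self, timeout: Optional[float] = None) -> None:
        """Stop accepting tasks, finish what is queued, then join."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        self._thread.join(timeout)
```

    Because enqueue() only appends to a deque under a short lock, the
    caller's cost is independent of how long the task itself takes.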

  Layer 2 — adaptive timeout
    SyncWorker feeds successful call latencies into HonchoLatencyTracker.
    After each turn, _drain_backlog_if_healthy() invokes
    rebuild_honcho_client_with_timeout() which rebuilds the SDK client
    iff the tracker's p95-derived timeout differs by more than 20% from the active
    one.  Hosted Honcho converges on ~1-3s timeouts; self-hosted cold
    starts scale naturally.  30s default still applies during warmup.
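
    As a rough illustration of Layer 2, the sketch below derives a
    timeout from a rolling p95 and decides whether a rebuild is
    warranted. Only the 30s warmup default and the 20% threshold come
    from the description above; LatencyWindow, should_rebuild(), the
    window size, minimum sample count, headroom multiplier, floor and
    ceiling are assumptions made for the example and may not match
    HonchoLatencyTracker.

```python
from collections import deque
from typing import Deque

DEFAULT_TIMEOUT = 30.0  # from the description: applies during warmup


class LatencyWindow:
    """Hypothetical rolling window of successful call latencies."""

    def __init__(self, window: int = 50, min_samples: int = 10,
                 headroom: float = 3.0, floor: float = 1.0,
                 ceiling: float = 30.0) -> None:
        # All five parameters are illustrative assumptions.
        self._samples: Deque[float] = deque(maxlen=window)
        self._min_samples = min_samples
        self._headroom = headroom
        self._floor = floor
        self._ceiling = ceiling

    def record(self, seconds: float) -> None:
        self._samples.append(seconds)

    def suggested_timeout(self) -> float:
        n = len(self._samples)
        if n < self._min_samples:
            return DEFAULT_TIMEOUT  # not enough data yet
        ordered = sorted(self._samples)
        # Nearest-rank p95, computed with integer arithmetic.
        idx = (95 * n + 99) // 100 - 1
        p95 = ordered[idx]
        return min(max(p95 * self._headroom, self._floor), self._ceiling)


def should_rebuild(active: float, suggested: float,
                   threshold: float = 0.20) -> bool:
    """True when the suggestion differs from the active timeout by >20%."""
    return abs(suggested - active) / active > threshold
```

    The threshold avoids rebuilding the client on every small
    fluctuation; only a material shift in observed latency triggers it.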

  Layer 3 — circuit breaker + in-memory backlog
    CircuitBreaker trips open after 3 consecutive failures; while it is
    open, SyncWorker rejects tasks by invoking their on_failure callback.
    The provider wraps each task's on_failure with _enqueue_with_backlog()
    so breaker-open and queue-full tasks land in a bounded backlog (256
    tasks max).  On recovery (probe succeeds, state → closed), the next
    sync_turn() drains the backlog through the worker.  Tasks that
    crashed inside Honcho itself are NOT backlogged — replay won't help.
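
    The Layer 3 flow could look roughly like the sketch below. It is
    illustrative only: SketchBreaker, SketchDispatcher, the cooldown
    value, the injectable clock and the half-open probing rule are
    assumptions, and tasks run inline here rather than through the
    worker. The 3-failure threshold and 256-task cap are taken from the
    description above. Whether the real backlog evicts the oldest entry
    or refuses new ones when full is not stated; this sketch uses
    deque(maxlen=...), which evicts the oldest.

```python
import time
from collections import deque
from typing import Callable, Deque


class SketchBreaker:
    """Hypothetical breaker: closed -> open -> half_open -> closed."""

    def __init__(self, failure_threshold: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown  # assumed value, not from the source
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = "closed"

    def allow(self) -> bool:
        if self.state == "open" and self._clock() - self._opened_at >= self._cooldown:
            self.state = "half_open"  # let a probe through
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        if self.state == "half_open" or self._failures >= self._threshold:
            self.state = "open"
            self._opened_at = self._clock()


class SketchDispatcher:
    """Parks tasks while the breaker is open; drains them after recovery."""

    def __init__(self, breaker: SketchBreaker, backlog_max: int = 256) -> None:
        self.breaker = breaker
        self.backlog: Deque[Callable[[], None]] = deque(maxlen=backlog_max)

    def submit(self, task: Callable[[], None]) -> None:
        pending = []
        if self.breaker.allow() and self.breaker.state == "closed":
            pending = list(self.backlog)  # replay parked tasks first
            self.backlog.clear()
        pending.append(task)
        for t in pending:
            if self.breaker.allow():
                self._run(t)
            else:
                self.backlog.append(t)  # breaker open: park for later

    def _run(self, task: Callable[[], None]) -> None:
        try:
            task()
        except Exception:
            self.breaker.record_failure()  # crashed tasks are not backlogged
        else:
            self.breaker.record_success()
```

    Replaying the backlog before the new task keeps writes in their
    original order once the breaker has closed again.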

Updates one existing test (test_session.py) that poked at the now-removed
_sync_thread attribute; it now calls the worker's shutdown() instead.

5 new integration tests verify the provider-level wiring:
  - sync_turn returns in < 100ms even when flush blocks 2s
  - 5 back-to-back sync_turns in < 200ms total (old code: up to 25s)
  - breaker-open enqueue lands in backlog, not on the worker
  - recovery drains backlog + new task on next sync_turn
  - backlog respects _BACKLOG_MAX and stops growing during long outages

No change to run_conversation or any agent-facing API.
2026-04-24 18:55:40 -04:00
context_engine fix: robust context engine interface — config selection, plugin discovery, ABC completeness 2026-04-10 19:15:50 -07:00
disk-cleanup docs(plugins): rename disk-guardian to disk-cleanup + bundled-plugins docs 2026-04-20 04:46:45 -07:00
example-dashboard/dashboard feat: dashboard plugin system — extend the web UI with custom tabs 2026-04-16 04:10:06 -07:00
image_gen fix(xai-image): drop unreachable editing code path 2026-04-23 15:13:34 -07:00
memory feat(honcho): wire fire-and-forget worker + adaptive timeout + breaker into provider 2026-04-24 18:55:40 -04:00
spotify refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174) 2026-04-24 07:06:11 -07:00
strike-freedom-cockpit feat(dashboard): reskin extension points for themes and plugins (#14776) 2026-04-23 15:31:01 -07:00
__init__.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00