hermes-agent/tests/honcho_plugin
Erosika f512fdf697 feat(honcho): wire fire-and-forget worker + adaptive timeout + breaker into provider
Replaces the per-turn threading.Thread(target=_sync).start() pattern in
HonchoMemoryProvider with a persistent SyncWorker.  sync_turn() and
on_memory_write() both enqueue SyncTasks on the shared worker and return
immediately — run_conversation's post-response path is no longer coupled
to Honcho latency.

Three behavioural changes land here:

  Layer 1 — fire-and-forget sync
    No more join(timeout=5.0) on the prior turn's thread.  Back-to-back
    sync_turn() calls return in microseconds regardless of backend
    latency.  The worker runs tasks serially per provider (intentional:
    session writes must be ordered) and uses a bounded queue with
    oldest-drop backpressure.
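A minimal sketch of what such a worker could look like. SyncWorker, SyncTask, enqueue semantics, and shutdown() are named in this commit; the internals below (field names, queue size, the 0.1s poll interval) are illustrative assumptions, not the actual implementation:

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SyncTask:
    # Hypothetical shape: the work to run, plus a failure callback.
    run: Callable[[], None]
    on_failure: Callable[[Exception], None] = field(default=lambda exc: None)

class SyncWorker:
    """Single background thread; enqueue() never blocks the caller."""

    def __init__(self, maxsize: int = 64):
        self._queue: queue.Queue = queue.Queue(maxsize=maxsize)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def enqueue(self, task: SyncTask) -> None:
        # Oldest-drop backpressure: when full, evict the oldest task
        # instead of blocking the caller's hot path.
        while True:
            try:
                self._queue.put_nowait(task)
                return
            except queue.Full:
                try:
                    self._queue.get_nowait()  # drop oldest
                except queue.Empty:
                    pass

    def _loop(self) -> None:
        while not self._stop.is_set():
            try:
                task = self._queue.get(timeout=0.1)
            except queue.Empty:
                continue
            try:
                task.run()  # serial execution keeps session writes ordered
            except Exception as exc:
                task.on_failure(exc)

    def shutdown(self) -> None:
        self._stop.set()
        self._thread.join(timeout=5.0)
```

The caller's cost is one put_nowait(); backend latency is entirely absorbed by the worker thread.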

  Layer 2 — adaptive timeout
    SyncWorker feeds successful call latencies into HonchoLatencyTracker.
    After each turn, _drain_backlog_if_healthy() invokes
    rebuild_honcho_client_with_timeout() which rebuilds the SDK client
    iff the tracker's p95-derived timeout differs >20% from the active
    one.  Hosted Honcho converges on ~1-3s timeouts; self-hosted cold
    starts scale naturally.  30s default still applies during warmup.
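The p95 logic could be sketched as below. The 30s warmup default, the >20% rebuild threshold, and the HonchoLatencyTracker name come from this commit; the window size, headroom multiplier, warmup sample count, and method names are assumptions for illustration:

```python
import statistics
from collections import deque

_DEFAULT_TIMEOUT = 30.0  # warmup default, per the commit message
_MIN_SAMPLES = 20        # assumed warmup threshold

class HonchoLatencyTracker:
    """Rolling window of successful call latencies; derives a timeout from p95."""

    def __init__(self, window: int = 100, headroom: float = 2.0):
        self._samples = deque(maxlen=window)
        self._headroom = headroom

    def record(self, seconds: float) -> None:
        self._samples.append(seconds)

    def suggested_timeout(self) -> float:
        if len(self._samples) < _MIN_SAMPLES:
            return _DEFAULT_TIMEOUT  # 30s still applies during warmup
        # quantiles(n=20) yields 19 cut points; the last one is p95.
        p95 = statistics.quantiles(self._samples, n=20)[-1]
        return p95 * self._headroom

def should_rebuild(active_timeout: float, suggested: float) -> bool:
    # Rebuild the SDK client only when the p95-derived timeout drifts
    # more than 20% from the active one.
    return abs(suggested - active_timeout) / active_timeout > 0.20
```

With ~0.5s backend latencies this converges on a low single-digit-second timeout, matching the hosted-Honcho behaviour described above.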

  Layer 3 — circuit breaker + in-memory backlog
    CircuitBreaker trips open after 3 consecutive failures; SyncWorker
    refuses breaker-open tasks via their on_failure callback.  Provider
    wraps each task's on_failure with _enqueue_with_backlog() so
    breaker-open and queue-full tasks land in a bounded backlog (256
    tasks max).  On recovery (probe succeeds, state → closed), the next
    sync_turn() drains the backlog through the worker.  Tasks that
    crashed inside Honcho itself are NOT backlogged — replay won't help.
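A sketch of the breaker-plus-backlog pair. The 3-failure trip threshold, the 256-task _BACKLOG_MAX, and the CircuitBreaker name are stated in this commit; the cooldown value, state representation, and helper names are hypothetical:

```python
import time
from collections import deque

_BACKLOG_MAX = 256  # bounded backlog cap, per the commit message

class CircuitBreaker:
    """Trips open after N consecutive failures; probes again after a cooldown."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self._threshold = threshold
        self._cooldown = cooldown
        self._failures = 0
        self._opened_at = None

    def record_success(self) -> None:
        # Probe succeeded: state -> closed, failure count reset.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold and self._opened_at is None:
            self._opened_at = time.monotonic()

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # After the cooldown, let one probe through (half-open).
        return time.monotonic() - self._opened_at < self._cooldown

class Backlog:
    """Bounded holding pen for breaker-open / queue-full tasks."""

    def __init__(self) -> None:
        # maxlen evicts the oldest entry once the cap is hit, so the
        # backlog stops growing during long outages.
        self._tasks = deque(maxlen=_BACKLOG_MAX)

    def add(self, task) -> None:
        self._tasks.append(task)

    def drain_into(self, worker) -> None:
        while self._tasks:
            worker.enqueue(self._tasks.popleft())
```

Note the asymmetry the commit calls out: only transport-level failures are worth backlogging; a task that crashed inside Honcho itself would fail again on replay.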

Updates one existing test (test_session.py) that poked at the now-
removed _sync_thread attribute; it now exercises the worker's
shutdown() instead.

5 new integration tests verify the provider-level wiring:
  - sync_turn returns in < 100ms even when flush blocks 2s
  - 5 back-to-back sync_turns in < 200ms total (old code: up to 25s)
  - breaker-open enqueue lands in backlog, not on the worker
  - recovery drains backlog + new task on next sync_turn
  - backlog respects _BACKLOG_MAX and stops growing during long outages
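The first timing assertion can be illustrated with a self-contained stand-in (the real test wires the provider; everything here — the queue, the simulated slow flush, the timing bound — is a scaled-down sketch, not the actual test code):

```python
import queue
import threading
import time

def test_enqueue_returns_fast_while_backend_blocks():
    # Stand-in for the worker: a queue fed by the caller and a
    # background thread whose "flush" blocks (0.5s here, 2s in the
    # real test) per task.
    q = queue.Queue()
    done = threading.Event()

    def worker():
        q.get()           # receive the task
        time.sleep(0.5)   # simulate a slow Honcho backend
        done.set()

    threading.Thread(target=worker, daemon=True).start()

    start = time.perf_counter()
    q.put(lambda: None)   # enqueue; the caller never joins the worker
    elapsed = time.perf_counter() - start

    assert elapsed < 0.1            # caller unaffected by backend latency
    assert done.wait(timeout=5.0)   # ...yet the sync still completes
```

The old pattern failed this bound because each sync_turn() joined the previous thread with a 5s timeout.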

No change to run_conversation or any agent-facing API.
2026-04-24 18:55:40 -04:00
__init__.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_async_memory.py fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption 2026-04-18 22:50:55 -07:00
test_cli.py style(honcho): hoist hashlib import; validate baseUrl scheme before 'local' sentinel 2026-04-24 18:48:10 -04:00
test_client.py fix(plugins/memory/honcho): default Honcho SDK HTTP timeout to 30s 2026-04-24 18:48:10 -04:00
test_pin_peer_name.py fix(honcho): pinPeerName opt-in keeps memory unified across platforms (#14984) 2026-04-24 18:48:10 -04:00
test_provider_sync_integration.py feat(honcho): wire fire-and-forget worker + adaptive timeout + breaker into provider 2026-04-24 18:55:40 -04:00
test_session.py feat(honcho): wire fire-and-forget worker + adaptive timeout + breaker into provider 2026-04-24 18:55:40 -04:00
test_sync_worker.py feat(honcho): SyncWorker + HonchoLatencyTracker + CircuitBreaker primitives 2026-04-24 18:50:32 -04:00