feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623)

* feat(memory): add pluggable memory provider interface with profile isolation

Introduces a pluggable MemoryProvider ABC so external memory backends can
integrate with Hermes without modifying core files. Each backend becomes a
plugin implementing a standard interface, orchestrated by MemoryManager.

Key architecture:
- agent/memory_provider.py — ABC with core + optional lifecycle hooks
- agent/memory_manager.py — single integration point in the agent loop
- agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md
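
A minimal sketch of what such an interface could look like. The hook names
(is_available, initialize, prefetch, sync_turn, on_session_end) come from this
commit message; the signatures, the get_tool_schemas name, and the EchoProvider
class are assumptions for illustration, not the actual contents of
agent/memory_provider.py.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class MemoryProvider(ABC):
    """Illustrative interface; the real ABC may differ in names and signatures."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Short identifier used to select the provider."""

    @abstractmethod
    def is_available(self) -> bool:
        """Cheap availability check (for example env vars only, no network)."""

    @abstractmethod
    def initialize(self, **kwargs: Any) -> None:
        """Receive session context such as session_id and hermes_home."""

    # Optional lifecycle hooks default to no-ops so plugins override only
    # what they need.
    def prefetch(self, query: str, **kwargs: Any) -> str:
        return ""

    def sync_turn(self, user_msg: str, assistant_msg: str, **kwargs: Any) -> None:
        pass

    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
        pass

    def get_tool_schemas(self) -> List[Dict[str, Any]]:
        return []


class EchoProvider(MemoryProvider):
    """Hypothetical in-memory provider, used only to exercise the interface."""

    def __init__(self) -> None:
        self.turns: List[tuple] = []
        self.home = None

    @property
    def name(self) -> str:
        return "echo"

    def is_available(self) -> bool:
        return True

    def initialize(self, **kwargs: Any) -> None:
        self.home = kwargs.get("hermes_home")

    def sync_turn(self, user_msg: str, assistant_msg: str, **kwargs: Any) -> None:
        self.turns.append((user_msg, assistant_msg))

    def prefetch(self, query: str, **kwargs: Any) -> str:
        # Naive substring recall over stored user messages.
        hits = [u for u, _ in self.turns if query.lower() in u.lower()]
        return "\n".join(hits)
```

Keeping the optional hooks as concrete no-ops means a new backend only has to
implement the three abstract members to be loadable.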

Profile isolation fixes applied to all 6 shipped plugins:
- Cognitive Memory: use get_hermes_home() instead of raw env var
- Hindsight Memory: check $HERMES_HOME/hindsight/config.json first,
  fall back to legacy ~/.hindsight/ for backward compat
- Hermes Memory Store: replace hardcoded ~/.hermes paths with
  get_hermes_home() for config loading and DB path defaults
- Mem0 Memory: use get_hermes_home() instead of raw env var
- RetainDB Memory: auto-derive profile-scoped project name from
  hermes_home path (hermes-<profile>), explicit env var overrides
- OpenViking Memory: read-only, no local state, isolation via .env

MemoryManager.initialize_all() now injects hermes_home into kwargs so
every provider can resolve profile-scoped storage without importing
get_hermes_home() themselves.
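
The injection described above can be sketched as follows. This is a simplified
stand-in, assuming get_hermes_home() honors the HERMES_HOME environment
variable and falls back to ~/.hermes; RecordingProvider is a hypothetical
helper that only records what it receives.

```python
import os
from pathlib import Path


def get_hermes_home() -> Path:
    # Stand-in for the real helper; assumed to honor HERMES_HOME.
    return Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))


class RecordingProvider:
    """Hypothetical provider that stores the kwargs passed to initialize()."""

    def __init__(self) -> None:
        self.kwargs = {}

    def initialize(self, **kwargs) -> None:
        self.kwargs = kwargs


class MemoryManager:
    def __init__(self) -> None:
        self.providers = []

    def add_provider(self, provider) -> None:
        self.providers.append(provider)

    def initialize_all(self, **kwargs) -> None:
        # Inject the profile-scoped home unless the caller already supplied it.
        kwargs.setdefault("hermes_home", str(get_hermes_home()))
        for provider in self.providers:
            provider.initialize(**kwargs)
```

A provider can then build its storage path from kwargs["hermes_home"] without
importing anything from the core.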

Plugin system: adds register_memory_provider() to PluginContext and
get_plugin_memory_providers() accessor.

Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration).

* refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider

Remove cognitive-memory plugin (#727) — core mechanics are broken:
decay runs 24x too fast (hourly not daily), prefetch uses row ID as
timestamp, search limited by importance not similarity.

Rewrite openviking-memory plugin from a read-only search wrapper into
a full bidirectional memory provider using the complete OpenViking
session lifecycle API:

- sync_turn: records user/assistant messages to OpenViking session
  (threaded, non-blocking)
- on_session_end: commits session to trigger automatic memory extraction
  into 6 categories (profile, preferences, entities, events, cases,
  patterns)
- prefetch: background semantic search via find() endpoint
- on_memory_write: mirrors built-in memory writes to the session
- is_available: checks env var only, no network calls (ABC compliance)

Tools expanded from 3 to 5:
- viking_search: semantic search with mode/scope/limit
- viking_read: tiered content (abstract ~100tok / overview ~2k / full)
- viking_browse: filesystem-style navigation (list/tree/stat)
- viking_remember: explicit memory storage via session
- viking_add_resource: ingest URLs/docs into knowledge base

Uses direct HTTP via httpx (no openviking SDK dependency needed).
Response truncation on viking_read to prevent context flooding.

* fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker

- Remove redundant mem0_context tool (identical to mem0_search with
  rerank=true, top_k=5 — wastes a tool slot and confuses the model)
- Thread sync_turn so it's non-blocking — Mem0's server-side LLM
  extraction can take 5-10s, was stalling the agent after every turn
- Add threading.Lock around _get_client() for thread-safe lazy init
  (prefetch and sync threads could race on first client creation)
- Add circuit breaker: after 5 consecutive API failures, pause calls
  for 120s instead of hammering a down server every turn. Auto-resets
  after cooldown. Logs a warning when tripped.
- Track success/failure in prefetch, sync_turn, and all tool calls
- Wait for previous sync to finish before starting a new one (prevents
  unbounded thread accumulation on rapid turns)
- Clean up shutdown to join both prefetch and sync threads
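
The circuit breaker behaviour listed above (5 consecutive failures, 120s
pause, automatic reset) can be sketched like this. The class and method names
are hypothetical and the injectable clock exists only to make the sketch
testable; the plugin's actual implementation may be structured differently.

```python
import threading
import time


class CircuitBreaker:
    """Pause calls after `threshold` consecutive failures for `cooldown` seconds."""

    def __init__(self, threshold: int = 5, cooldown: float = 120.0,
                 clock=time.monotonic) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock
        self._lock = threading.Lock()
        self._failures = 0
        self._open_until = 0.0

    def allow(self) -> bool:
        """Return True if a call may proceed right now."""
        with self._lock:
            if self._failures < self.threshold:
                return True
            if self._clock() >= self._open_until:
                self._failures = 0  # auto-reset once the cooldown has elapsed
                return True
            return False

    def record_success(self) -> None:
        with self._lock:
            self._failures = 0

    def record_failure(self) -> None:
        with self._lock:
            self._failures += 1
            if self._failures == self.threshold:
                # Trip the breaker: block calls until the cooldown expires.
                self._open_until = self._clock() + self.cooldown
```

Callers would check allow() before each API call and report the outcome with
record_success() or record_failure().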

* fix(memory): enforce single external memory provider limit

MemoryManager now rejects a second non-builtin provider with a warning.
Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE
external plugin provider is allowed at a time. This prevents tool
schema bloat (some providers add 3-5 tools each) and conflicting
memory backends.

The warning message directs users to configure memory.provider in
config.yaml to select which provider to activate.

Updated all 47 tests to use builtin + one external pattern instead
of multiple externals. Added test_second_external_rejected to verify
the enforcement.
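
A rough sketch of the enforcement, assuming the built-in provider is
identified by the name "builtin" (an assumption; the real check may use a
different marker) and that add_provider reports acceptance with a boolean.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


class MemoryManager:
    def __init__(self) -> None:
        self.providers = []
        self._external_name = None

    def add_provider(self, provider) -> bool:
        """Accept the built-in provider always, and at most one external one."""
        if provider.name == "builtin":
            self.providers.append(provider)
            return True
        if self._external_name is not None:
            logger.warning(
                "Memory provider '%s' rejected: '%s' is already active. "
                "Set memory.provider in config.yaml to choose one.",
                provider.name, self._external_name,
            )
            return False
        self._external_name = provider.name
        self.providers.append(provider)
        return True


# Usage: the second external provider is refused.
manager = MemoryManager()
manager.add_provider(SimpleNamespace(name="builtin"))
manager.add_provider(SimpleNamespace(name="mem0"))
accepted = manager.add_provider(SimpleNamespace(name="honcho"))
```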

* feat(memory): add ByteRover memory provider plugin

Implements the ByteRover integration (from PR #3499 by hieuntg81) as a
MemoryProvider plugin instead of direct run_agent.py modifications.

ByteRover provides persistent memory via the brv CLI — a hierarchical
knowledge tree with tiered retrieval (fuzzy text then LLM-driven search).
Local-first with optional cloud sync.

Plugin capabilities:
- prefetch: background brv query for relevant context
- sync_turn: curate conversation turns (threaded, non-blocking)
- on_memory_write: mirror built-in memory writes to brv
- on_pre_compress: extract insights before context compression

Tools (3):
- brv_query: search the knowledge tree
- brv_curate: store facts/decisions/patterns
- brv_status: check CLI version and context tree state

Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped
per profile). Binary resolution cached with thread-safe double-checked
locking. All write operations threaded to avoid blocking the agent
(curate can take 120s with LLM processing).
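
Double-checked locking for a cached binary lookup could look roughly like the
following. CachedBinary and its finder parameter are hypothetical names; the
only real API used is shutil.which from the standard library.

```python
import shutil
import threading
from typing import Callable, Optional


class CachedBinary:
    """Resolve an executable path once, safely across threads."""

    def __init__(self, name: str,
                 finder: Callable[[str], Optional[str]] = shutil.which) -> None:
        self._name = name
        self._finder = finder
        self._lock = threading.Lock()
        self._resolved = False
        self._path: Optional[str] = None

    def resolve(self) -> Optional[str]:
        # First check without the lock: the common case after warm-up.
        if self._resolved:
            return self._path
        with self._lock:
            # Second check under the lock: another thread may have won the race.
            if not self._resolved:
                self._path = self._finder(self._name)
                self._resolved = True  # set last so readers see a valid path
        return self._path
```

A "not found" result (None) is cached too, so a missing CLI is looked up once
rather than on every turn.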

* fix(memory): thread remaining sync_turns, fix holographic, add config key

Plugin fixes:
- Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread)
- RetainDB: thread sync_turn (was blocking on HTTP POST)
- Both: shutdown now joins sync threads alongside prefetch threads
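
The threading pattern shared by these fixes (non-blocking sync_turn, waiting
for the previous sync, joining on shutdown) can be sketched as below.
ThreadedSync, the send callable, and the 5 second join timeout are
illustrative assumptions, not the plugins' actual code.

```python
import threading
from typing import Callable, Optional


class ThreadedSync:
    """Run each sync in a background thread, at most one at a time."""

    def __init__(self, send: Callable[[str, str], None],
                 join_timeout: float = 5.0) -> None:
        self._send = send
        self._join_timeout = join_timeout
        self._thread: Optional[threading.Thread] = None

    def sync_turn(self, user_msg: str, assistant_msg: str) -> None:
        # Wait (bounded) for the previous sync so threads cannot pile up.
        prev = self._thread
        if prev is not None and prev.is_alive():
            prev.join(timeout=self._join_timeout)
        self._thread = threading.Thread(
            target=self._run, args=(user_msg, assistant_msg), daemon=True
        )
        self._thread.start()

    def _run(self, user_msg: str, assistant_msg: str) -> None:
        try:
            self._send(user_msg, assistant_msg)
        except Exception:
            pass  # a failing backend must not surface in the agent loop

    def shutdown(self) -> None:
        if self._thread is not None:
            self._thread.join(timeout=self._join_timeout)
```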

Holographic retrieval fixes:
- reason(): removed dead intersection_key computation (bundled but never
  used in scoring). Now reuses pre-computed entity_residuals directly,
  moved role_content encoding outside the inner loop.
- contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above
  500 facts, only checks the most recently updated ones to avoid O(n^2)
  explosion (~125K comparisons at 500 is acceptable).
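
The ~125K figure follows from the pair count n(n-1)/2: at n = 500 that is
124,750 comparisons. A small sketch of the guard, assuming each fact carries
an updated_at field (a hypothetical field name) to rank recency:

```python
from itertools import combinations

_MAX_CONTRADICT_FACTS = 500


def candidate_pairs(facts):
    """Yield fact pairs to compare, capped to the most recently updated facts."""
    if len(facts) > _MAX_CONTRADICT_FACTS:
        facts = sorted(facts, key=lambda f: f["updated_at"], reverse=True)
        facts = facts[:_MAX_CONTRADICT_FACTS]
    return combinations(facts, 2)


# 600 facts are capped to the newest 500 before pairing.
facts = [{"id": i, "updated_at": i} for i in range(600)]
pairs = list(candidate_pairs(facts))
```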

Config:
- Added memory.provider key to DEFAULT_CONFIG ("" = builtin only).
  No version bump needed (deep_merge handles new keys automatically).
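
Why no version bump is needed can be illustrated with a generic recursive
merge. This is a sketch of the usual technique, not the project's deep_merge;
the "enabled" key is a hypothetical existing setting.

```python
def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return defaults updated with overrides, recursing into nested dicts."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


DEFAULT_CONFIG = {"memory": {"provider": "", "enabled": True}}
# An older user config that predates the memory.provider key.
user_config = {"memory": {"enabled": False}}
effective = deep_merge(DEFAULT_CONFIG, user_config)
```

The user's file never mentions memory.provider, yet the merged result carries
the new default, so existing configs keep working unchanged.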

* feat(memory): extract Honcho as a MemoryProvider plugin

Creates plugins/honcho-memory/ as a thin adapter over the existing
honcho_integration/ package. All 4 Honcho tools (profile, search,
context, conclude) move from the normal tool registry to the
MemoryProvider interface.

The plugin delegates all work to HonchoSessionManager — no Honcho
logic is reimplemented. It uses the existing config chain:
$HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars.
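
The file part of that chain amounts to "first existing candidate wins". A
sketch under that assumption, with hypothetical helper names and the env-var
fallback left to the caller:

```python
from pathlib import Path
from typing import Iterable, List, Optional


def honcho_config_candidates(hermes_home: Path, user_home: Path) -> List[Path]:
    """Candidate config files in priority order (profile-scoped first)."""
    return [
        hermes_home / "honcho.json",
        user_home / ".honcho" / "config.json",
    ]


def first_existing(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that exists as a file, else None."""
    for path in candidates:
        if path.is_file():
            return path
    return None  # caller falls back to environment variables
```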

Lifecycle hooks:
- initialize: creates HonchoSessionManager via existing client factory
- prefetch: background dialectic query
- sync_turn: records messages + flushes to API (threaded)
- on_memory_write: mirrors user profile writes as conclusions
- on_session_end: flushes all pending messages

This is a prerequisite for the MemoryManager wiring in run_agent.py.
Once wired, Honcho goes through the same provider interface as all
other memory plugins, and the scattered Honcho code in run_agent.py
can be consolidated into the single MemoryManager integration point.

* feat(memory): wire MemoryManager into run_agent.py

Adds 8 integration points for the external memory provider plugin,
all purely additive (zero existing code modified):

1. Init (~L1130): Create MemoryManager, find matching plugin provider
   from memory.provider config, initialize with session context
2. Tool injection (~L1160): Append provider tool schemas to self.tools
   and self.valid_tool_names after memory_manager init
3. System prompt (~L2705): Add external provider's system_prompt_block
   alongside existing MEMORY.md/USER.md blocks
4. Tool routing (~L5362): Route provider tool calls through
   memory_manager.handle_tool_call() before the catchall handler
5. Memory write bridge (~L5353): Notify external provider via
   on_memory_write() when the built-in memory tool writes
6. Pre-compress (~L5233): Call on_pre_compress() before context
   compression discards messages
7. Prefetch (~L6421): Inject provider prefetch results into the
   current-turn user message (same pattern as Honcho turn context)
8. Turn sync + session end (~L8161, ~L8172): sync_all() after each
   completed turn, queue_prefetch_all() for next turn, on_session_end()
   + shutdown_all() at conversation end

All hooks are wrapped in try/except — a failing provider never breaks
the agent. The existing memory system, Honcho integration, and all
other code paths are completely untouched.
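
The try/except wrapping can be pictured as a small guard around every hook
call. call_hook and the Flaky class are hypothetical; in run_agent.py the
guards are written inline at each integration point.

```python
import logging

logger = logging.getLogger(__name__)


def call_hook(provider, hook_name: str, *args, default=None, **kwargs):
    """Invoke provider.<hook_name>; never let a provider error propagate."""
    hook = getattr(provider, hook_name, None)
    if hook is None:
        return default
    try:
        return hook(*args, **kwargs)
    except Exception as exc:
        logger.warning("Memory provider hook %s failed: %s", hook_name, exc)
        return default


class Flaky:
    """Hypothetical provider whose prefetch always fails."""

    def prefetch(self, query):
        raise RuntimeError("backend down")

    def system_prompt_block(self):
        return "memory: on"


context = call_hook(Flaky(), "prefetch", "what did we decide?", default="")
```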

Full suite: 7222 passed, 4 pre-existing failures.

* refactor(memory): remove legacy Honcho integration from core

Extracts all Honcho-specific code from run_agent.py, model_tools.py,
toolsets.py, and gateway/run.py. Honcho is now exclusively available
as a memory provider plugin (plugins/honcho-memory/).

Removed from run_agent.py (-457 lines):
- Honcho init block (session manager creation, activation, config)
- 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools,
  _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch,
  _honcho_prefetch, _honcho_save_user_observation, _honcho_sync
- _inject_honcho_turn_context module-level function
- Honcho system prompt block (tool descriptions, CLI commands)
- Honcho context injection in api_messages building
- Honcho params from __init__ (honcho_session_key, honcho_manager,
  honcho_config)
- HONCHO_TOOL_NAMES constant
- All honcho-specific tool dispatch forwarding

Removed from other files:
- model_tools.py: honcho_tools import, honcho params from handle_function_call
- toolsets.py: honcho toolset definition, honcho tools from core tools list
- gateway/run.py: honcho params from AIAgent constructor calls

Removed tests (-339 lines):
- 9 Honcho-specific test methods from test_run_agent.py
- TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py

Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that
were accidentally removed during the honcho function extraction.

The honcho_integration/ package is kept intact — the plugin delegates
to it. tools/honcho_tools.py registry entries are now dead code (import
commented out in model_tools.py) but the file is preserved for reference.

Full suite: 7207 passed, 4 pre-existing failures. Zero regressions.

* refactor(memory): restructure plugins, add CLI, clean gateway, migration notice

Plugin restructure:
- Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/
  (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb)
- New plugins/memory/__init__.py discovery module that scans the directory
  directly, loading providers by name without the general plugin system
- run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers()

CLI wiring:
- hermes memory setup — interactive curses picker + config wizard
- hermes memory status — show active provider, config, availability
- hermes memory off — disable external provider (built-in only)
- hermes honcho — now shows migration notice pointing to hermes memory setup

Gateway cleanup:
- Remove _get_or_create_gateway_honcho (already removed in prev commit)
- Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods
- Remove all calls to shutdown methods (4 call sites)
- Remove _honcho_managers/_honcho_configs dict references

Dead code removal:
- Delete tools/honcho_tools.py (279 lines, import was already commented out)
- Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods)
- Remove if False placeholder from run_agent.py

Migration:
- Honcho migration notice on startup: detects existing honcho.json or
  ~/.honcho/config.json, prints guidance to run hermes memory setup.
  Only fires when memory.provider is not set and not in quiet mode.

Full suite: 7203 passed, 4 pre-existing failures. Zero regressions.

* feat(memory): standardize plugin config + add per-plugin documentation

Config architecture:
- Add save_config(values, hermes_home) to MemoryProvider ABC
- Honcho: writes to $HERMES_HOME/honcho.json (SDK native)
- Mem0: writes to $HERMES_HOME/mem0.json
- Hindsight: writes to $HERMES_HOME/hindsight/config.json
- Holographic: writes to config.yaml under plugins.hermes-memory-store
- OpenViking/RetainDB/ByteRover: env-var only (default no-op)

Setup wizard (hermes memory setup):
- Now calls provider.save_config() for non-secret config
- Secrets still go to .env via env vars
- Only memory.provider activation key goes to config.yaml

Documentation:
- README.md for each of the 7 providers in plugins/memory/<name>/
- Requirements, setup (wizard + manual), config reference, tools table
- Consistent format across all providers

The contract for new memory plugins:
- get_config_schema() declares all fields (REQUIRED)
- save_config() writes native config (REQUIRED if not env-var-only)
- Secrets use env_var field in schema, written to .env by wizard
- README.md in the plugin directory

* docs: add memory providers user guide + developer guide

New pages:
- user-guide/features/memory-providers.md — comprehensive guide covering
  all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight,
  Holographic, RetainDB, ByteRover). Each with setup, config, tools,
  cost, and unique features. Includes comparison table and profile
  isolation notes.
- developer-guide/memory-provider-plugin.md — how to build a new memory
  provider plugin. Covers ABC, required methods, config schema,
  save_config, threading contract, profile isolation, testing.

Updated pages:
- user-guide/features/memory.md — replaced Honcho section with link to
  new Memory Providers page
- user-guide/features/honcho.md — replaced with migration redirect to
  the new Memory Providers page
- sidebars.ts — added both new pages to navigation

* fix(memory): auto-migrate Honcho users to memory provider plugin

When honcho.json or ~/.honcho/config.json exists but memory.provider
is not set, automatically set memory.provider: honcho in config.yaml
and activate the plugin. The plugin reads the same config files, so
all data and credentials are preserved. Zero user action needed.

Persists the migration to config.yaml so it only fires once. Prints
a one-line confirmation in non-quiet mode.

* fix(memory): only auto-migrate Honcho when enabled + credentialed

Check HonchoClientConfig.enabled AND (api_key OR base_url) before
auto-migrating — not just file existence. Prevents false activation
for users who disabled Honcho, stopped using it (config lingers),
or have ~/.honcho/ from a different tool.

* feat(memory): auto-install pip dependencies during hermes memory setup

Reads pip_dependencies from plugin.yaml, checks which are missing,
installs them via pip before config walkthrough. Also shows install
guidance for external_dependencies (e.g. brv CLI for ByteRover).

Updated all 7 plugin.yaml files with pip_dependencies:
- honcho: honcho-ai
- mem0: mem0ai
- openviking: httpx
- hindsight: hindsight-client
- holographic: (none)
- retaindb: requests
- byterover: (external_dependencies for brv CLI)

* fix: remove remaining Honcho crash risks from cli.py and gateway

cli.py: removed Honcho session re-mapping block (would crash importing
deleted tools/honcho_tools.py), Honcho flush on compress, Honcho
session display on startup, Honcho shutdown on exit, honcho_session_key
AIAgent param.

gateway/run.py: removed honcho_session_key params from helper methods,
sync_honcho param, _honcho.shutdown() block.

tests: fixed test_cron_session_with_honcho_key_skipped (was passing
removed honcho_key param to _flush_memories_for_session).

* fix: include plugins/ in pyproject.toml package list

Without this, plugins/memory/ wouldn't be included in non-editable
installs. Hermes always runs from the repo checkout so this is
belt-and-suspenders, but prevents breakage if the install method changes.

* fix(memory): correct pip-to-import name mapping for dep checks

The heuristic dep.replace('-', '_') fails for packages where the pip
name differs from the import name: honcho-ai→honcho, mem0ai→mem0,
hindsight-client→hindsight_client. Added explicit mapping table so
hermes memory setup doesn't try to reinstall already-installed packages.
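
A sketch of the mapping-table approach, using the three pairs named above and
the standard-library importlib.util.find_spec for the installed check. The
function names are hypothetical, and version specifiers in dependency strings
are not handled here.

```python
import importlib.util
from typing import List

# Pip distribution name -> importable module name, as listed in this commit.
_PIP_TO_IMPORT = {
    "honcho-ai": "honcho",
    "mem0ai": "mem0",
    "hindsight-client": "hindsight_client",
}


def import_name(pip_name: str) -> str:
    """Map a pip name to its import name, falling back to the dash heuristic."""
    return _PIP_TO_IMPORT.get(pip_name, pip_name.replace("-", "_"))


def missing_dependencies(pip_names: List[str]) -> List[str]:
    """Return the pip names whose module cannot be found in this environment."""
    return [
        dep for dep in pip_names
        if importlib.util.find_spec(import_name(dep)) is None
    ]
```

Only the names returned by missing_dependencies() would then be handed to pip.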

* chore: remove dead code from old plugin memory registration path

- hermes_cli/plugins.py: removed register_memory_provider(),
  _memory_providers list, get_plugin_memory_providers() — memory
  providers now use plugins/memory/ discovery, not the general plugin system
- hermes_cli/main.py: stripped 74 lines of dead honcho argparse
  subparsers (setup, status, sessions, map, peer, mode, tokens,
  identity, migrate) — kept only the migration redirect
- agent/memory_provider.py: updated docstring to reflect new
  registration path
- tests: replaced TestPluginMemoryProviderRegistration with
  TestPluginMemoryDiscovery that tests the actual plugins/memory/
  discovery system. Added 3 new tests (discover, load, nonexistent).

* chore: delete dead honcho_integration/cli.py and its tests

cli.py (794 lines) was the old 'hermes honcho' command handler — nobody
calls it since cmd_honcho was replaced with a migration redirect.

Deleted tests that imported from removed code:
- tests/honcho_integration/test_cli.py (tested _resolve_api_key)
- tests/honcho_integration/test_config_isolation.py (tested CLI config paths)
- tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py)

Remaining honcho_integration/ files (actively used by the plugin):
- client.py (445 lines) — config loading, SDK client creation
- session.py (991 lines) — session management, queries, flush

* refactor: move honcho_integration/ into the honcho plugin

Moves client.py (445 lines) and session.py (991 lines) from the
top-level honcho_integration/ package into plugins/memory/honcho/.
No Honcho code remains in the main codebase.

- plugins/memory/honcho/client.py — config loading, SDK client creation
- plugins/memory/honcho/session.py — session management, queries, flush
- Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py,
  plugin __init__.py, session.py cross-import, all tests
- Removed honcho_integration/ package and pyproject.toml entry
- Renamed tests/honcho_integration/ → tests/honcho_plugin/

* docs: update architecture + gateway-internals for memory provider system

- architecture.md: replaced honcho_integration/ with plugins/memory/
- gateway-internals.md: replaced Honcho-specific session routing and
  flush lifecycle docs with generic memory provider interface docs

* fix: update stale mock path for resolve_active_host after honcho plugin migration

* fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore

Review feedback from Honcho devs (erosika):

P0 — Provider lifecycle:
- Remove on_session_end() + shutdown_all() from run_conversation() tail
  (was killing providers after every turn in multi-turn sessions)
- Add shutdown_memory_provider() method on AIAgent for callers
- Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry

Bug fixes:
- Remove sync_honcho=False kwarg from /btw callsites (TypeError crash)
- Fix doctor.py references to dead 'hermes honcho setup' command
- Cache prefetch_all() before tool loop (was re-calling every iteration)

ABC contract hardening (all backwards-compatible):
- Add session_id kwarg to prefetch/sync_turn/queue_prefetch
- Make on_pre_compress() return str (provider insights in compression)
- Add **kwargs to on_turn_start() for runtime context
- Add on_delegation() hook for parent-side subagent observation
- Document agent_context/agent_identity/agent_workspace kwargs on
  initialize() (prevents cron corruption, enables profile scoping)
- Fix docstring: single external provider, not multiple

Honcho CLI restoration:
- Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py
  with imports adapted to plugin path)
- Restore full hermes honcho command with all subcommands (status, peer,
  mode, tokens, identity, enable/disable, sync, peers, --target-profile)
- Restore auto-clone on profile creation + sync on hermes update
- hermes honcho setup now redirects to hermes memory setup

* fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type

- Wire on_delegation() in delegate_tool.py — parent's memory provider
  is notified with task+result after each subagent completes
- Add skip_memory=True to cron scheduler (prevents cron system prompts
  from corrupting user representations — closes #4052)
- Add skip_memory=True to gateway flush agent (throwaway agent shouldn't
  activate memory provider)
- Fix ByteRover on_pre_compress() return type: None -> str

* fix(honcho): port profile isolation fixes from PR #4632

Ports 5 bug fixes found during profile testing (erosika's PR #4632):

1. 3-tier config resolution — resolve_config_path() now checks
   $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json
   (non-default profiles couldn't find shared host blocks)

2. Thread host=_host_key() through from_global_config() in cmd_setup,
   cmd_status, cmd_identity (--target-profile was being ignored)

3. Use bare profile name as aiPeer (not host key with dots) — Honcho's
   peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid

4. Wrap add_peers() in try/except — was fatal on new AI peers, killed
   all message uploads for the session

5. Gate Honcho clone behind --clone/--clone-all on profile create
   (bare create should be blank-slate)

Also: sanitize assistant_peer_id via _sanitize_id()

* fix(tests): add module cleanup fixture to test_cli_provider_resolution

test_cli_provider_resolution._import_cli() wipes tools.*, cli, and
run_agent from sys.modules to force fresh imports, but had no cleanup.
This poisoned all subsequent tests on the same xdist worker — mocks
targeting tools.file_tools, tools.send_message_tool, etc. patched the
NEW module object while already-imported functions still referenced
the OLD one. Caused ~25 cascade failures: send_message KeyError,
process_registry FileNotFoundError, file_read_guards timeouts,
read_loop_detection file-not-found, mcp_oauth None port, and
provider_parity/codex_execution stale tool lists.

Fix: autouse fixture saves all affected modules before each test and
restores them after, matching the pattern in
test_managed_browserbase_and_modal.py.
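
The save/restore idea can be sketched with a context manager over
sys.modules. The prefix and name lists mirror the modules mentioned above;
preserved_modules is a hypothetical name, and in the test suite the same
logic would live in a pytest autouse fixture rather than a context manager.

```python
import sys
from contextlib import contextmanager

_EXACT = ("cli", "run_agent", "tools")
_PREFIXES = ("tools.",)


def _affected(name: str) -> bool:
    return name in _EXACT or name.startswith(_PREFIXES)


@contextmanager
def preserved_modules():
    """Snapshot affected sys.modules entries and restore them afterwards."""
    saved = {n: m for n, m in sys.modules.items() if _affected(n)}
    try:
        yield
    finally:
        # Drop modules imported during the block, then put the originals back
        # so mocks in later tests patch the objects other code still holds.
        for name in [n for n in list(sys.modules) if _affected(n)]:
            if name not in saved:
                del sys.modules[name]
        sys.modules.update(saved)
```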
Teknium 2026-04-02 15:33:51 -07:00 committed by GitHub
parent e0b2bdb089
commit 924bc67eee
GPG key ID: B5690EEEBB952194
69 changed files with 7501 additions and 2317 deletions


@ -103,12 +103,6 @@ from agent.trajectory import (
)
from utils import atomic_json_write, env_var_enabled
HONCHO_TOOL_NAMES = {
"honcho_context",
"honcho_profile",
"honcho_search",
"honcho_conclude",
}
class _SafeWriter:
@ -221,9 +215,6 @@ _PARALLEL_SAFE_TOOLS = frozenset({
"ha_get_state",
"ha_list_entities",
"ha_list_services",
"honcho_context",
"honcho_profile",
"honcho_search",
"read_file",
"search_files",
"session_search",
@ -340,46 +331,15 @@ def _paths_overlap(left: Path, right: Path) -> bool:
return left_parts[:common_len] == right_parts[:common_len]
def _inject_honcho_turn_context(content, turn_context: str):
"""Append Honcho recall to the current-turn user message without mutating history.
The returned content is sent to the API for this turn only. Keeping Honcho
recall out of the system prompt preserves the stable cache prefix while
still giving the model continuity context.
"""
if not turn_context:
return content
_SURROGATE_RE = re.compile(r'[\ud800-\udfff]')
note = (
"[System note: The following Honcho memory was retrieved from prior "
"sessions. It is continuity context for this turn only, not new user "
"input.]\n\n"
f"{turn_context}"
)
if isinstance(content, list):
return list(content) + [{"type": "text", "text": note}]
text = "" if content is None else str(content)
if not text.strip():
return note
return f"{text}\n\n{note}"
# Budget warning text patterns injected by _get_budget_warning().
_BUDGET_WARNING_RE = re.compile(
r"\[BUDGET(?:\s+WARNING)?:\s+Iteration\s+\d+/\d+\..*?\]",
re.DOTALL,
)
# Regex to match lone surrogate code points (U+D800..U+DFFF).
# These are invalid in UTF-8 and cause UnicodeEncodeError when the OpenAI SDK
# serialises messages to JSON. Common source: clipboard paste from Google Docs
# or other rich-text editors on some platforms.
_SURROGATE_RE = re.compile(r'[\ud800-\udfff]')
def _sanitize_surrogates(text: str) -> str:
"""Replace lone surrogate code points with U+FFFD (replacement character).
@ -507,9 +467,6 @@ class AIAgent:
skip_context_files: bool = False,
skip_memory: bool = False,
session_db=None,
honcho_session_key: str = None,
honcho_manager=None,
honcho_config=None,
iteration_budget: "IterationBudget" = None,
fallback_model: Dict[str, Any] = None,
credential_pool=None,
@ -556,10 +513,6 @@ class AIAgent:
skip_context_files (bool): If True, skip auto-injection of SOUL.md, AGENTS.md, and .cursorrules
into the system prompt. Use this for batch processing and data generation to avoid
polluting trajectories with user-specific persona or project instructions.
honcho_session_key (str): Session key for Honcho integration (e.g., "telegram:123456" or CLI session_id).
When provided and Honcho is enabled in config, enables persistent cross-session user modeling.
honcho_manager: Optional shared HonchoSessionManager owned by the caller.
honcho_config: Optional HonchoClientConfig corresponding to honcho_manager.
"""
_install_safe_stdio()
@ -1070,75 +1023,80 @@ class AIAgent:
except Exception:
pass # Memory is optional -- don't break agent init
# Honcho AI-native memory (cross-session user modeling)
# Reads $HERMES_HOME/honcho.json (instance) or ~/.honcho/config.json (global).
self._honcho = None # HonchoSessionManager | None
self._honcho_session_key = honcho_session_key
self._honcho_config = None # HonchoClientConfig | None
self._honcho_exit_hook_registered = False
# Memory provider plugin (external — one at a time, alongside built-in)
# Reads memory.provider from config to select which plugin to activate.
self._memory_manager = None
if not skip_memory:
try:
if honcho_manager is not None:
hcfg = honcho_config or getattr(honcho_manager, "_config", None)
self._honcho_config = hcfg
if hcfg and self._honcho_should_activate(hcfg):
self._honcho = honcho_manager
self._activate_honcho(
hcfg,
enabled_toolsets=enabled_toolsets,
disabled_toolsets=disabled_toolsets,
session_db=session_db,
)
else:
from honcho_integration.client import HonchoClientConfig, get_honcho_client
hcfg = HonchoClientConfig.from_global_config()
self._honcho_config = hcfg
if self._honcho_should_activate(hcfg):
from honcho_integration.session import HonchoSessionManager
client = get_honcho_client(hcfg)
self._honcho = HonchoSessionManager(
honcho=client,
config=hcfg,
context_tokens=hcfg.context_tokens,
)
self._activate_honcho(
hcfg,
enabled_toolsets=enabled_toolsets,
disabled_toolsets=disabled_toolsets,
session_db=session_db,
)
_mem_provider_name = mem_config.get("provider", "") if mem_config else ""
# Auto-migrate: if Honcho was actively configured (enabled +
# credentials) but memory.provider is not set, activate the
# honcho plugin automatically. Just having the config file
# is not enough — the user may have disabled Honcho or the
# file may be from a different tool.
if not _mem_provider_name:
try:
from plugins.memory.honcho.client import HonchoClientConfig as _HCC
_hcfg = _HCC.from_global_config()
if _hcfg.enabled and (_hcfg.api_key or _hcfg.base_url):
_mem_provider_name = "honcho"
# Persist so this only auto-migrates once
try:
from hermes_cli.config import load_config as _lc, save_config as _sc
_cfg = _lc()
_cfg.setdefault("memory", {})["provider"] = "honcho"
_sc(_cfg)
except Exception:
pass
if not self.quiet_mode:
print(" ✓ Auto-migrated Honcho to memory provider plugin.")
print(" Your config and data are preserved.\n")
except Exception:
pass
if _mem_provider_name:
from agent.memory_manager import MemoryManager as _MemoryManager
from plugins.memory import load_memory_provider as _load_mem
self._memory_manager = _MemoryManager()
_mp = _load_mem(_mem_provider_name)
if _mp and _mp.is_available():
self._memory_manager.add_provider(_mp)
if self._memory_manager.providers:
from hermes_constants import get_hermes_home as _ghh
_init_kwargs = {
"session_id": self.session_id,
"platform": platform or "cli",
"hermes_home": str(_ghh()),
"agent_context": "primary",
}
# Profile identity for per-profile provider scoping
try:
from hermes_cli.profiles import get_active_profile_name
_profile = get_active_profile_name()
_init_kwargs["agent_identity"] = _profile
_init_kwargs["agent_workspace"] = "hermes"
except Exception:
pass
self._memory_manager.initialize_all(**_init_kwargs)
logger.info("Memory provider '%s' activated", _mem_provider_name)
else:
if not hcfg.enabled:
logger.debug("Honcho disabled in global config")
elif not (hcfg.api_key or hcfg.base_url):
logger.debug("Honcho enabled but no API key or base URL configured")
else:
logger.debug("Honcho enabled but missing API key or disabled in config")
except Exception as e:
logger.warning("Honcho init failed — memory disabled: %s", e)
print(f" Honcho init failed: {e}")
print(" Run 'hermes honcho setup' to reconfigure.")
self._honcho = None
logger.debug("Memory provider '%s' not found or not available", _mem_provider_name)
self._memory_manager = None
except Exception as _mpe:
logger.warning("Memory provider plugin init failed: %s", _mpe)
self._memory_manager = None
# Tools are initially discovered before Honcho activation. If Honcho
# stays inactive, remove any stale honcho_* tools from prior process state.
if not self._honcho:
self._strip_honcho_tools_from_surface()
# Gate local memory writes based on per-peer memory modes.
# AI peer governs MEMORY.md; user peer governs USER.md.
# "honcho" = Honcho only, disable local writes.
if self._honcho_config and self._honcho:
_hcfg = self._honcho_config
_agent_mode = _hcfg.peer_memory_mode(_hcfg.ai_peer)
_user_mode = _hcfg.peer_memory_mode(_hcfg.peer_name or "user")
if _agent_mode == "honcho":
self._memory_flush_min_turns = 0
self._memory_enabled = False
logger.debug("peer %s memory_mode=honcho: local MEMORY.md writes disabled", _hcfg.ai_peer)
if _user_mode == "honcho":
self._user_profile_enabled = False
logger.debug("peer %s memory_mode=honcho: local USER.md writes disabled", _hcfg.peer_name or "user")
# Inject memory provider tool schemas into the tool surface
if self._memory_manager and self.tools is not None:
for _schema in self._memory_manager.get_all_tool_schemas():
_wrapped = {"type": "function", "function": _schema}
self.tools.append(_wrapped)
_tname = _schema.get("name", "")
if _tname:
self.valid_tool_names.add(_tname)
# Skills config: nudge interval for skill creation reminders
self._skill_nudge_interval = 10
@ -2382,6 +2340,23 @@ class AIAgent:
self._interrupt_requested = False
self._interrupt_message = None
_set_interrupt(False)
def shutdown_memory_provider(self, messages: list = None) -> None:
"""Shut down the memory provider — call at actual session boundaries.
This calls on_session_end() then shutdown_all() on the memory
manager. NOT called per-turn; only at CLI exit, /reset, gateway
session expiry, etc.
"""
if self._memory_manager:
try:
self._memory_manager.on_session_end(messages or [])
except Exception:
pass
try:
self._memory_manager.shutdown_all()
except Exception:
pass
def _hydrate_todo_store(self, history: List[Dict[str, Any]]) -> None:
"""
@ -2420,228 +2395,14 @@ class AIAgent:
"""Check if an interrupt has been requested."""
return self._interrupt_requested
# ── Honcho integration helpers ──
def _honcho_should_activate(self, hcfg) -> bool:
"""Return True when Honcho should be active.
Self-hosted Honcho may be configured with a base_url and no API key,
so activation should accept either credential style.
"""
if not hcfg or not hcfg.enabled:
return False
if not (hcfg.api_key or hcfg.base_url):
return False
return True
def _strip_honcho_tools_from_surface(self) -> None:
"""Remove Honcho tools from the active tool surface."""
if not self.tools:
self.valid_tool_names = set()
return
self.tools = [
tool for tool in self.tools
if tool.get("function", {}).get("name") not in HONCHO_TOOL_NAMES
]
self.valid_tool_names = {
tool["function"]["name"] for tool in self.tools
} if self.tools else set()
def _activate_honcho(
self,
hcfg,
*,
enabled_toolsets: Optional[List[str]],
disabled_toolsets: Optional[List[str]],
session_db,
) -> None:
"""Finish Honcho setup once a session manager is available."""
if not self._honcho:
return
if not self._honcho_session_key:
session_title = None
if session_db is not None:
try:
session_title = session_db.get_session_title(self.session_id or "")
except Exception:
pass
self._honcho_session_key = (
hcfg.resolve_session_name(
session_title=session_title,
session_id=self.session_id,
)
or "hermes-default"
)
honcho_sess = self._honcho.get_or_create(self._honcho_session_key)
if not honcho_sess.messages:
try:
from hermes_cli.config import get_hermes_home
mem_dir = str(get_hermes_home() / "memories")
self._honcho.migrate_memory_files(
self._honcho_session_key,
mem_dir,
)
except Exception as exc:
logger.debug("Memory files migration failed (non-fatal): %s", exc)
from tools.honcho_tools import set_session_context
set_session_context(self._honcho, self._honcho_session_key)
# Rebuild tool surface after Honcho context injection. Tool availability
# is check_fn-gated and may change once session context is attached.
self.tools = get_tool_definitions(
enabled_toolsets=enabled_toolsets,
disabled_toolsets=disabled_toolsets,
quiet_mode=True,
)
self.valid_tool_names = {
tool["function"]["name"] for tool in self.tools
} if self.tools else set()
if hcfg.recall_mode == "context":
self._strip_honcho_tools_from_surface()
if not self.quiet_mode:
print(" Honcho active — recall_mode: context (Honcho tools hidden)")
else:
if not self.quiet_mode:
print(f" Honcho active — recall_mode: {hcfg.recall_mode}")
logger.info(
"Honcho active (session: %s, user: %s, workspace: %s, "
"write_frequency: %s, memory_mode: %s)",
self._honcho_session_key,
hcfg.peer_name,
hcfg.workspace_id,
hcfg.write_frequency,
hcfg.memory_mode,
)
recall_mode = hcfg.recall_mode
if recall_mode != "tools":
try:
ctx = self._honcho.get_prefetch_context(self._honcho_session_key)
if ctx:
self._honcho.set_context_result(self._honcho_session_key, ctx)
logger.debug("Honcho context pre-warmed for first turn")
except Exception as exc:
logger.debug("Honcho context prefetch failed (non-fatal): %s", exc)
self._register_honcho_exit_hook()
def _register_honcho_exit_hook(self) -> None:
"""Register a process-exit flush hook without clobbering signal handlers."""
if self._honcho_exit_hook_registered or not self._honcho:
return
honcho_ref = weakref.ref(self._honcho)
def _flush_honcho_on_exit():
manager = honcho_ref()
if manager is None:
return
try:
manager.flush_all()
except (Exception, KeyboardInterrupt) as exc:
logger.debug("Honcho flush on exit failed (non-fatal): %s", exc)
atexit.register(_flush_honcho_on_exit)
self._honcho_exit_hook_registered = True
def _queue_honcho_prefetch(self, user_message: str) -> None:
"""Queue turn-end Honcho prefetch so the next turn can consume cached results."""
if not self._honcho or not self._honcho_session_key:
return
recall_mode = (self._honcho_config.recall_mode if self._honcho_config else "hybrid")
if recall_mode == "tools":
return
try:
self._honcho.prefetch_context(self._honcho_session_key, user_message)
self._honcho.prefetch_dialectic(self._honcho_session_key, user_message or "What were we working on?")
except Exception as exc:
logger.debug("Honcho background prefetch failed (non-fatal): %s", exc)
def _honcho_prefetch(self, user_message: str) -> str:
"""Assemble the first-turn Honcho context from the pre-warmed cache."""
if not self._honcho or not self._honcho_session_key:
return ""
try:
parts = []
ctx = self._honcho.pop_context_result(self._honcho_session_key)
if ctx:
rep = ctx.get("representation", "")
card = ctx.get("card", "")
if rep:
parts.append(f"## User representation\n{rep}")
if card:
parts.append(card)
ai_rep = ctx.get("ai_representation", "")
ai_card = ctx.get("ai_card", "")
if ai_rep:
parts.append(f"## AI peer representation\n{ai_rep}")
if ai_card:
parts.append(ai_card)
dialectic = self._honcho.pop_dialectic_result(self._honcho_session_key)
if dialectic:
parts.append(f"## Continuity synthesis\n{dialectic}")
if not parts:
return ""
header = (
"# Honcho Memory (persistent cross-session context)\n"
"Use this to answer questions about the user, prior sessions, "
"and what you were working on together. Do not call tools to "
"look up information that is already present here.\n"
)
return header + "\n\n".join(parts)
except Exception as e:
logger.debug("Honcho prefetch failed (non-fatal): %s", e)
return ""
def _honcho_save_user_observation(self, content: str) -> str:
"""Route a memory tool target=user add to Honcho.
Sends the content as a user peer message so Honcho's reasoning
model can incorporate it into the user representation.
"""
if not content or not content.strip():
return json.dumps({"success": False, "error": "Content cannot be empty."})
try:
session = self._honcho.get_or_create(self._honcho_session_key)
session.add_message("user", f"[observation] {content.strip()}")
self._honcho.save(session)
return json.dumps({
"success": True,
"target": "user",
"message": "Saved to Honcho user model.",
})
except Exception as e:
logger.debug("Honcho user observation failed: %s", e)
return json.dumps({"success": False, "error": f"Honcho save failed: {e}"})
def _honcho_sync(self, user_content: str, assistant_content: str) -> None:
"""Sync the user/assistant message pair to Honcho."""
if not self._honcho or not self._honcho_session_key:
return
try:
session = self._honcho.get_or_create(self._honcho_session_key)
session.add_message("user", user_content)
session.add_message("assistant", assistant_content)
self._honcho.save(session)
logger.info("Honcho sync queued for session %s (%d messages)",
self._honcho_session_key, len(session.messages))
except Exception as e:
logger.warning("Honcho sync failed: %s", e)
if not self.quiet_mode:
print(f" Honcho write failed: {e}")
def _build_system_prompt(self, system_message: str = None) -> str:
"""
@ -2671,8 +2432,8 @@ class AIAgent:
if not _soul_loaded:
# Fallback to hardcoded identity
_ai_peer_name = (
self._honcho_config.ai_peer
if self._honcho_config and self._honcho_config.ai_peer != "hermes"
None
if False
else None
)
if _ai_peer_name:
@ -2728,59 +2489,7 @@ class AIAgent:
if "gemini" in _model_lower or "gemma" in _model_lower:
prompt_parts.append(GOOGLE_MODEL_OPERATIONAL_GUIDANCE)
# Honcho CLI awareness: tell Hermes about its own management commands
# so it can refer the user to them rather than reinventing answers.
if self._honcho and self._honcho_session_key:
hcfg = self._honcho_config
mode = hcfg.memory_mode if hcfg else "hybrid"
freq = hcfg.write_frequency if hcfg else "async"
recall_mode = hcfg.recall_mode if hcfg else "hybrid"
honcho_block = (
"# Honcho memory integration\n"
f"Active. Session: {self._honcho_session_key}. "
f"Mode: {mode}. Write frequency: {freq}. Recall: {recall_mode}.\n"
)
if recall_mode == "context":
honcho_block += (
"Honcho context is injected into this system prompt below. "
"All memory retrieval comes from this context — no Honcho tools "
"are available. Answer questions about the user, prior sessions, "
"and recent work directly from the Honcho Memory section.\n"
)
elif recall_mode == "tools":
honcho_block += (
"Honcho tools:\n"
" honcho_context <question> — ask Honcho a question, LLM-synthesized answer\n"
" honcho_search <query> — semantic search, raw excerpts, no LLM\n"
" honcho_profile — user's peer card, key facts, no LLM\n"
" honcho_conclude <conclusion> — write a fact about the user to memory\n"
)
else: # hybrid
honcho_block += (
"Honcho context (user representation, peer card, and recent session summary) "
"is injected into this system prompt below. Use it to answer continuity "
"questions ('where were we?', 'what were we working on?') WITHOUT calling "
"any tools. Only call Honcho tools when you need information beyond what is "
"already present in the Honcho Memory section.\n"
"Honcho tools:\n"
" honcho_context <question> — ask Honcho a question, LLM-synthesized answer\n"
" honcho_search <query> — semantic search, raw excerpts, no LLM\n"
" honcho_profile — user's peer card, key facts, no LLM\n"
" honcho_conclude <conclusion> — write a fact about the user to memory\n"
)
honcho_block += (
"Management commands (refer users here instead of explaining manually):\n"
" hermes honcho status — show full config + connection\n"
" hermes honcho mode [hybrid|honcho] — show or set memory mode\n"
" hermes honcho tokens [--context N] [--dialectic N] — show or set token budgets\n"
" hermes honcho peer [--user NAME] [--ai NAME] [--reasoning LEVEL]\n"
" hermes honcho sessions — list directory→session mappings\n"
" hermes honcho map <name> — map cwd to a session name\n"
" hermes honcho identity [<file>] [--show] — seed or show AI peer identity\n"
" hermes honcho migrate — migration guide from openclaw-honcho\n"
" hermes honcho setup — full interactive wizard"
)
prompt_parts.append(honcho_block)
# Note: ephemeral_system_prompt is NOT included here. It's injected at
# API-call time only so it stays out of the cached/stored system prompt.
@ -2792,12 +2501,21 @@ class AIAgent:
mem_block = self._memory_store.format_for_system_prompt("memory")
if mem_block:
prompt_parts.append(mem_block)
# USER.md is always included when enabled -- Honcho prefetch is additive.
# USER.md is always included when enabled.
if self._user_profile_enabled:
user_block = self._memory_store.format_for_system_prompt("user")
if user_block:
prompt_parts.append(user_block)
# External memory provider system prompt block (additive to built-in)
if self._memory_manager:
try:
_ext_mem_block = self._memory_manager.build_system_prompt()
if _ext_mem_block:
prompt_parts.append(_ext_mem_block)
except Exception:
pass
has_skills_tools = any(name in self.valid_tool_names for name in ['skills_list', 'skill_view', 'skill_manage'])
if has_skills_tools:
avail_toolsets = {
@ -5607,10 +5325,6 @@ class AIAgent:
return
if "memory" not in self.valid_tool_names or not self._memory_store:
return
# honcho-only agent mode: skip local MEMORY.md flush
_hcfg = getattr(self, '_honcho_config', None)
if _hcfg and _hcfg.peer_memory_mode(_hcfg.ai_peer) == "honcho":
return
effective_min = min_turns if min_turns is not None else self._memory_flush_min_turns
if self._user_turn_count < effective_min:
return
@ -5734,8 +5448,6 @@ class AIAgent:
old_text=args.get("old_text"),
store=self._memory_store,
)
if self._honcho and flush_target == "user" and args.get("action") == "add":
self._honcho_save_user_observation(args.get("content", ""))
if not self.quiet_mode:
print(f" 🧠 Memory flush: saved to {args.get('target', 'memory')}")
except Exception as e:
@ -5761,6 +5473,13 @@ class AIAgent:
# Pre-compression memory flush: let the model save memories before they're lost
self.flush_memories(messages, min_turns=0)
# Notify external memory provider before compression discards context
if self._memory_manager:
try:
self._memory_manager.on_pre_compress(messages)
except Exception:
pass
compressed = self.context_compressor.compress(messages, current_tokens=approx_tokens)
todo_snapshot = self._todo_store.format_for_injection()
@ -5887,10 +5606,19 @@ class AIAgent:
old_text=function_args.get("old_text"),
store=self._memory_store,
)
# Also send user observations to Honcho when active
if self._honcho and target == "user" and function_args.get("action") == "add":
self._honcho_save_user_observation(function_args.get("content", ""))
# Bridge: notify external memory provider of built-in memory writes
if self._memory_manager and function_args.get("action") in ("add", "replace"):
try:
self._memory_manager.on_memory_write(
function_args.get("action", ""),
target,
function_args.get("content", ""),
)
except Exception:
pass
return result
elif self._memory_manager and self._memory_manager.has_tool(function_name):
return self._memory_manager.handle_tool_call(function_name, function_args)
elif function_name == "clarify":
from tools.clarify_tool import clarify_tool as _clarify_tool
return _clarify_tool(
@ -5912,8 +5640,6 @@ class AIAgent:
return handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
def _execute_tool_calls_concurrent(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
@ -6237,9 +5963,6 @@ class AIAgent:
old_text=function_args.get("old_text"),
store=self._memory_store,
)
# Also send user observations to Honcho when active
if self._honcho and target == "user" and function_args.get("action") == "add":
self._honcho_save_user_observation(function_args.get("content", ""))
tool_duration = time.time() - tool_start_time
if self.quiet_mode:
self._vprint(f" {_get_cute_tool_message_impl('memory', function_args, tool_duration, result=function_result)}")
@ -6299,8 +6022,6 @@ class AIAgent:
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
_spinner_result = function_result
except Exception as tool_error:
@ -6318,8 +6039,6 @@ class AIAgent:
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
except Exception as tool_error:
function_result = f"Error executing tool '{function_name}': {tool_error}"
@ -6633,7 +6352,6 @@ class AIAgent:
task_id: str = None,
stream_callback: Optional[callable] = None,
persist_user_message: Optional[str] = None,
sync_honcho: bool = True,
) -> Dict[str, Any]:
"""
Run a complete conversation with tool calling until completion.
@ -6649,8 +6367,7 @@ class AIAgent:
persist_user_message: Optional clean user message to store in
transcripts/history when user_message contains API-only
synthetic prefixes.
sync_honcho: When False, skip writing the final synthetic turn back
to Honcho or queuing follow-up prefetch work.
or queuing follow-up prefetch work.
Returns:
Dict: Complete conversation result with final response and message history
@ -6734,7 +6451,6 @@ class AIAgent:
self._user_turn_count += 1
# Preserve the original user message (no nudge injection).
# Honcho should receive the actual user input, not system nudges.
original_user_message = persist_user_message if persist_user_message is not None else user_message
# Track memory nudge trigger (turn-based, checked here).
@ -6749,27 +6465,6 @@ class AIAgent:
_should_review_memory = True
self._turns_since_memory = 0
# Honcho prefetch consumption:
# - First turn: bake into cached system prompt (stable for the session).
# - Later turns: attach recall to the current-turn user message at
# API-call time only (never persisted to history / session DB).
#
# This keeps the system-prefix cache stable while still allowing turn N
# to consume background prefetch results from turn N-1.
self._honcho_context = ""
self._honcho_turn_context = ""
_recall_mode = (self._honcho_config.recall_mode if self._honcho_config else "hybrid")
if self._honcho and self._honcho_session_key and _recall_mode != "tools":
try:
prefetched_context = self._honcho_prefetch(original_user_message)
if prefetched_context:
if not conversation_history:
self._honcho_context = prefetched_context
else:
self._honcho_turn_context = prefetched_context
except Exception as e:
logger.debug("Honcho prefetch failed (non-fatal): %s", e)
# Add user message
user_msg = {"role": "user", "content": user_message}
messages.append(user_msg)
@ -6807,13 +6502,6 @@ class AIAgent:
else:
# First turn of a new session — build from scratch.
self._cached_system_prompt = self._build_system_prompt(system_message)
# Bake Honcho context into the prompt so it's stable for
# the entire session (not re-fetched per turn).
if self._honcho_context:
self._cached_system_prompt = (
self._cached_system_prompt + "\n\n" + self._honcho_context
).strip()
# Plugin hook: on_session_start
# Fired once when a brand-new session is created (not on
# continuation). Plugins can use this to initialise
@ -6935,7 +6623,18 @@ class AIAgent:
# Clear any stale interrupt state at start
self.clear_interrupt()
# External memory provider: prefetch once before the tool loop.
# Reuse the cached result on every iteration to avoid re-calling
# prefetch_all() on each tool call (10 tool calls = 10x latency + cost).
_ext_prefetch_cache = ""
if self._memory_manager:
try:
_query = user_message if isinstance(user_message, str) else ""
_ext_prefetch_cache = self._memory_manager.prefetch_all(_query) or ""
except Exception:
pass
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
# Reset per-turn checkpoint dedup so each iteration can take one snapshot
self._checkpoint_mgr.new_turn()
@ -6984,10 +6683,11 @@ class AIAgent:
for idx, msg in enumerate(messages):
api_msg = msg.copy()
if idx == current_turn_user_idx and msg.get("role") == "user" and self._honcho_turn_context:
api_msg["content"] = _inject_honcho_turn_context(
api_msg.get("content", ""), self._honcho_turn_context
)
# External memory provider prefetch: inject cached recalled context
if idx == current_turn_user_idx and msg.get("role") == "user" and _ext_prefetch_cache:
_base = api_msg.get("content", "")
if isinstance(_base, str):
api_msg["content"] = _base + "\n\n" + _ext_prefetch_cache
# For ALL assistant messages, pass reasoning back to the API
# This ensures multi-turn reasoning context is preserved
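The prefetch-once, inject-per-call behaviour of the two hunks above can be sketched as a standalone function. `build_api_messages` and its parameter names are hypothetical; the copy-then-append logic mirrors the diff, where recalled context is attached only to the API copy of the current-turn user message and never to the stored history.

```python
from typing import Any, Dict, List


def build_api_messages(
    messages: List[Dict[str, Any]],
    current_turn_user_idx: int,
    prefetch_cache: str,
) -> List[Dict[str, Any]]:
    """Return per-call message copies with recalled context appended."""
    api_messages = []
    for idx, msg in enumerate(messages):
        api_msg = msg.copy()  # shallow copy so the stored history is untouched
        if idx == current_turn_user_idx and msg.get("role") == "user" and prefetch_cache:
            base = api_msg.get("content", "")
            # Non-string content (e.g. multimodal parts) is left unchanged.
            if isinstance(base, str):
                api_msg["content"] = base + "\n\n" + prefetch_cache
        api_messages.append(api_msg)
    return api_messages


history = [
    {"role": "assistant", "content": "Hello."},
    {"role": "user", "content": "Where were we?"},
]
api = build_api_messages(history, 1, "Recalled: refactoring the memory plugin.")
```

Because the cache string is computed once before the tool loop, each iteration reuses it instead of calling the provider again, and the system prompt prefix stays byte-identical across calls.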
@ -7016,8 +6716,8 @@ class AIAgent:
# Build the final system message: cached prompt + ephemeral system prompt.
# Ephemeral additions are API-call-time only (not persisted to session DB).
# Honcho later-turn recall is intentionally kept OUT of the system prompt
# so the stable cache prefix remains unchanged.
# External recall context is injected into the user message, not the system
# prompt, so the stable cache prefix remains unchanged.
effective_system = active_system_prompt or ""
if self.ephemeral_system_prompt:
effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
@ -8730,10 +8430,6 @@ class AIAgent:
# Persist session to both JSON log and SQLite
self._persist_session(messages, conversation_history)
# Sync conversation to Honcho for user modeling
if final_response and not interrupted and sync_honcho:
self._honcho_sync(original_user_message, final_response)
self._queue_honcho_prefetch(original_user_message)
# Plugin hook: post_llm_call
# Fired once per turn after the tool-calling loop completes.
@ -8807,6 +8503,14 @@ class AIAgent:
_should_review_skills = True
self._iters_since_skill = 0
# External memory provider: sync the completed turn + queue next prefetch
if self._memory_manager and final_response and user_message:
try:
self._memory_manager.sync_all(user_message, final_response)
self._memory_manager.queue_prefetch_all(user_message)
except Exception:
pass
# Background memory/skill review — runs AFTER the response is delivered
# so it never competes with the user's task for model attention.
if final_response and not interrupted and (_should_review_memory or _should_review_skills):
@ -8819,6 +8523,13 @@ class AIAgent:
except Exception:
pass # Background review is best-effort
# Note: Memory provider on_session_end() + shutdown_all() are NOT
# called here — run_conversation() is called once per user message in
# multi-turn sessions. Shutting down after every turn would kill the
# provider before the second message. Actual session-end cleanup is
# handled by the CLI (atexit / /reset) and gateway (session expiry /
# _reset_session).
# Plugin hook: on_session_end
# Fired at the very end of every run_conversation call.
# Plugins can use this for cleanup, flushing buffers, etc.