From 25757d631b493381c22efe45984655b06ae97651 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Nicol=C3=B2=20Boschi?= Date: Thu, 9 Apr 2026 07:27:31 +0200 Subject: [PATCH] feat(hindsight): feature parity, setup wizard, and config improvements MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Port missing features from the hindsight-hermes external integration package into the native plugin. Only touches plugin files — no core changes. Features: - Tags on retain/recall (tags, recall_tags, recall_tags_match) - Recall config (recall_max_tokens, recall_max_input_chars, recall_types, recall_prompt_preamble) - Retain controls (retain_every_n_turns, auto_retain, auto_recall, retain_async via aretain_batch, retain_context) - Bank config via Banks API (bank_mission, bank_retain_mission) - Structured JSON retain with per-message timestamps - Full session accumulation with document_id for dedup - Custom post_setup() wizard with curses picker - Mode-aware dep install (hindsight-client for cloud, hindsight-all for local) - local_external mode and openai_compatible LLM provider - OpenRouter support with auto base URL - Auto-upgrade of hindsight-client to >=0.4.22 on session start - Comprehensive debug logging across all operations - 46 unit tests - Updated README and website docs --- plugins/memory/hindsight/README.md | 74 ++- plugins/memory/hindsight/__init__.py | 449 +++++++++++-- plugins/memory/hindsight/plugin.yaml | 6 +- .../plugins/memory/test_hindsight_provider.py | 598 ++++++++++++++++++ .../user-guide/features/memory-providers.md | 18 +- 5 files changed, 1072 insertions(+), 73 deletions(-) create mode 100644 tests/plugins/memory/test_hindsight_provider.py diff --git a/plugins/memory/hindsight/README.md b/plugins/memory/hindsight/README.md index 3a1df59e4..024a99303 100644 --- a/plugins/memory/hindsight/README.md +++ b/plugins/memory/hindsight/README.md @@ -1,11 +1,12 @@ # Hindsight Memory Provider -Long-term memory with knowledge graph, 
entity resolution, and multi-strategy retrieval. Supports cloud and local (embedded) modes. +Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. Supports cloud, local embedded, and local external modes. ## Requirements - **Cloud:** API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io) -- **Local:** API key for a supported LLM provider (OpenAI, Anthropic, Gemini, Groq, MiniMax, or Ollama). Embeddings and reranking run locally — no additional API keys needed. +- **Local Embedded:** API key for a supported LLM provider (OpenAI, Anthropic, Gemini, Groq, OpenRouter, MiniMax, Ollama, or any OpenAI-compatible endpoint). Embeddings and reranking run locally — no additional API keys needed. +- **Local External:** A running Hindsight instance (Docker or self-hosted) reachable over HTTP. ## Setup @@ -21,17 +22,28 @@ hermes config set memory.provider hindsight echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env ``` -### Cloud Mode +### Cloud Connects to the Hindsight Cloud API. Requires an API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io). -### Local Mode +### Local Embedded -Runs an embedded Hindsight server with built-in PostgreSQL. Requires an LLM API key (e.g. Groq, OpenAI, Anthropic) for memory extraction and synthesis. The daemon starts automatically in the background on first use and stops after 5 minutes of inactivity. +Hermes spins up a local Hindsight daemon with built-in PostgreSQL. Requires an LLM API key for memory extraction and synthesis. The daemon starts automatically in the background on first use and stops after 5 minutes of inactivity. + +Supports any OpenAI-compatible LLM endpoint (llama.cpp, vLLM, LM Studio, etc.) — pick `openai_compatible` as the provider and enter the base URL. 
Daemon startup logs: `~/.hermes/logs/hindsight-embed.log` Daemon runtime logs: `~/.hindsight/profiles/.log` +To open the Hindsight web UI (local embedded mode only): +```bash +hindsight-embed -p hermes ui start +``` + +### Local External + +Points the plugin at an existing Hindsight instance you're already running (Docker, self-hosted, etc.). No daemon management — just a URL and an optional API key. + ## Config Config file: `~/.hermes/hindsight/config.json` @@ -40,40 +52,58 @@ Config file: `~/.hermes/hindsight/config.json` | Key | Default | Description | |-----|---------|-------------| -| `mode` | `cloud` | `cloud` or `local` | -| `api_url` | `https://api.hindsight.vectorize.io` | API URL (cloud mode) | -| `api_url` | `http://localhost:8888` | API URL (local mode, unused — daemon manages its own port) | +| `mode` | `cloud` | `cloud`, `local_embedded`, or `local_external` | +| `api_url` | `https://api.hindsight.vectorize.io` | API URL (cloud and local_external modes) | -### Memory +### Memory Bank | Key | Default | Description | |-----|---------|-------------| | `bank_id` | `hermes` | Memory bank name | -| `budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` | +| `bank_mission` | — | Reflect mission (identity/framing for reflect reasoning). Applied via Banks API. | +| `bank_retain_mission` | — | Retain mission (steers what gets extracted). Applied via Banks API. 
| + +### Recall + +| Key | Default | Description | +|-----|---------|-------------| +| `recall_budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` | +| `recall_prefetch_method` | `recall` | Auto-recall method: `recall` (raw facts) or `reflect` (LLM synthesis) | +| `recall_max_tokens` | `4096` | Maximum tokens for recall results | +| `recall_max_input_chars` | `800` | Maximum input query length for auto-recall | +| `recall_prompt_preamble` | — | Custom preamble for recalled memories in context | +| `recall_tags` | — | Tags to filter when searching memories | +| `recall_tags_match` | `any` | Tag matching mode: `any` / `all` / `any_strict` / `all_strict` | +| `auto_recall` | `true` | Automatically recall memories before each turn | + +### Retain + +| Key | Default | Description | +|-----|---------|-------------| +| `auto_retain` | `true` | Automatically retain conversation turns | +| `retain_async` | `true` | Process retain asynchronously on the Hindsight server | +| `retain_every_n_turns` | `1` | Retain every N turns (1 = every turn) | +| `retain_context` | `conversation between Hermes Agent and the User` | Context label for retained memories | +| `tags` | — | Tags applied when storing memories | ### Integration | Key | Default | Description | |-----|---------|-------------| | `memory_mode` | `hybrid` | How memories are integrated into the agent | -| `prefetch_method` | `recall` | Method for automatic context injection | **memory_mode:** - `hybrid` — automatic context injection + tools available to the LLM - `context` — automatic injection only, no tools exposed - `tools` — tools only, no automatic injection -**prefetch_method:** -- `recall` — injects raw memory facts (fast) -- `reflect` — injects LLM-synthesized summary (slower, more coherent) - -### Local Mode LLM +### Local Embedded LLM | Key | Default | Description | |-----|---------|-------------| -| `llm_provider` | `openai` | LLM provider: `openai`, `anthropic`, `gemini`, `groq`, `minimax`, `ollama` 
| -| `llm_model` | per-provider | Model name (e.g. `gpt-4o-mini`, `openai/gpt-oss-120b`) | -| `llm_base_url` | — | LLM Base URL override (e.g. `https://openrouter.ai/api/v1`) | +| `llm_provider` | `openai` | `openai`, `anthropic`, `gemini`, `groq`, `openrouter`, `minimax`, `ollama`, `lmstudio`, `openai_compatible` | +| `llm_model` | per-provider | Model name (e.g. `gpt-4o-mini`, `qwen/qwen3.5-9b`) | +| `llm_base_url` | — | Endpoint URL for `openai_compatible` (e.g. `http://192.168.1.10:8080/v1`) | The LLM API key is stored in `~/.hermes/.env` as `HINDSIGHT_LLM_API_KEY`. @@ -97,4 +127,8 @@ Available in `hybrid` and `tools` memory modes: | `HINDSIGHT_API_URL` | Override API endpoint | | `HINDSIGHT_BANK_ID` | Override bank name | | `HINDSIGHT_BUDGET` | Override recall budget | -| `HINDSIGHT_MODE` | Override mode (`cloud` / `local`) | +| `HINDSIGHT_MODE` | Override mode (`cloud`, `local_embedded`, `local_external`) | + +## Client Version + +Requires `hindsight-client >= 0.4.22`. The plugin auto-upgrades on session start if an older version is detected. 
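The config tables in the README above can be combined into a single file. A sketch of `~/.hermes/hindsight/config.json` for local embedded mode against an OpenAI-compatible endpoint — every key name is taken from the tables above, but the endpoint URL is an illustrative placeholder and `llm_model` uses the documented placeholder default:

```json
{
  "mode": "local_embedded",
  "bank_id": "hermes",
  "recall_budget": "mid",
  "memory_mode": "hybrid",
  "recall_prefetch_method": "recall",
  "llm_provider": "openai_compatible",
  "llm_base_url": "http://192.168.1.10:8080/v1",
  "llm_model": "your-model-name",
  "auto_recall": true,
  "auto_retain": true,
  "retain_every_n_turns": 1,
  "recall_max_tokens": 4096
}
```

Note the LLM API key is not stored in this file; per the README it lives in `~/.hermes/.env` as `HINDSIGHT_LLM_API_KEY`.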
diff --git a/plugins/memory/hindsight/__init__.py b/plugins/memory/hindsight/__init__.py index c87497745..c39679b73 100644 --- a/plugins/memory/hindsight/__init__.py +++ b/plugins/memory/hindsight/__init__.py @@ -28,21 +28,25 @@ from hermes_constants import get_hermes_home from typing import Any, Dict, List from agent.memory_provider import MemoryProvider +from hermes_constants import get_hermes_home from tools.registry import tool_error logger = logging.getLogger(__name__) _DEFAULT_API_URL = "https://api.hindsight.vectorize.io" _DEFAULT_LOCAL_URL = "http://localhost:8888" +_MIN_CLIENT_VERSION = "0.4.22" _VALID_BUDGETS = {"low", "mid", "high"} _PROVIDER_DEFAULT_MODELS = { "openai": "gpt-4o-mini", "anthropic": "claude-haiku-4-5", "gemini": "gemini-2.5-flash", "groq": "openai/gpt-oss-120b", + "openrouter": "qwen/qwen3.5-9b", "minimax": "MiniMax-M2.7", "ollama": "gemma3:12b", "lmstudio": "local-model", + "openai_compatible": "your-model-name", } @@ -188,6 +192,7 @@ class HindsightMemoryProvider(MemoryProvider): self._bank_id = "hermes" self._budget = "mid" self._mode = "cloud" + self._llm_base_url = "" self._memory_mode = "hybrid" # "context", "tools", or "hybrid" self._prefetch_method = "recall" # "recall" or "reflect" self._client = None @@ -195,6 +200,31 @@ class HindsightMemoryProvider(MemoryProvider): self._prefetch_lock = threading.Lock() self._prefetch_thread = None self._sync_thread = None + self._session_id = "" + + # Tags + self._tags: list[str] | None = None + self._recall_tags: list[str] | None = None + self._recall_tags_match = "any" + + # Retain controls + self._auto_retain = True + self._retain_every_n_turns = 1 + self._retain_context = "conversation between Hermes Agent and the User" + self._turn_counter = 0 + self._session_turns: list[str] = [] # accumulates ALL turns for the session + + # Recall controls + self._auto_recall = True + self._recall_max_tokens = 4096 + self._recall_types: list[str] | None = None + self._recall_prompt_preamble = "" + 
self._recall_max_input_chars = 800 + + # Bank + self._bank_mission = "" + self._bank_retain_mission: str | None = None + self._retain_async = True @property def name(self) -> str: @@ -204,7 +234,7 @@ class HindsightMemoryProvider(MemoryProvider): try: cfg = _load_config() mode = cfg.get("mode", "cloud") - if mode == "local": + if mode in ("local", "local_embedded", "local_external"): return True has_key = bool(cfg.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", "")) has_url = bool(cfg.get("api_url") or os.environ.get("HINDSIGHT_API_URL", "")) @@ -228,73 +258,306 @@ class HindsightMemoryProvider(MemoryProvider): existing.update(values) config_path.write_text(json.dumps(existing, indent=2)) + def post_setup(self, hermes_home: str, config: dict) -> None: + """Custom setup wizard — installs only the deps needed for the selected mode.""" + import getpass + import subprocess + import shutil + import sys + from pathlib import Path + + from hermes_cli.config import save_config + + from hermes_cli.memory_setup import _curses_select + + print("\n Configuring Hindsight memory:\n") + + # Step 1: Mode selection + mode_items = [ + ("Cloud", "Hindsight Cloud API (lightweight, just needs an API key)"), + ("Local Embedded", "Run Hindsight locally (downloads ~200MB, needs LLM key)"), + ("Local External", "Connect to an existing Hindsight instance"), + ] + mode_idx = _curses_select(" Select mode", mode_items, default=0) + mode = ["cloud", "local_embedded", "local_external"][mode_idx] + + provider_config: dict = {"mode": mode} + env_writes: dict = {} + + # Step 2: Install/upgrade deps for selected mode + _MIN_CLIENT_VERSION = "0.4.22" + cloud_dep = f"hindsight-client>={_MIN_CLIENT_VERSION}" + local_dep = "hindsight-all" + if mode == "local_embedded": + deps_to_install = [local_dep] + elif mode == "local_external": + deps_to_install = [cloud_dep] + else: + deps_to_install = [cloud_dep] + + print(f"\n Checking dependencies...") + uv_path = shutil.which("uv") + if not uv_path: + 
print(" ⚠ uv not found — install it: curl -LsSf https://astral.sh/uv/install.sh | sh") + print(f" Then run manually: uv pip install --python {sys.executable} {' '.join(deps_to_install)}") + else: + try: + subprocess.run( + [uv_path, "pip", "install", "--python", sys.executable, "--quiet", "--upgrade"] + deps_to_install, + check=True, timeout=120, capture_output=True, + ) + print(f" ✓ Dependencies up to date") + except Exception as e: + print(f" ⚠ Install failed: {e}") + print(f" Run manually: uv pip install --python {sys.executable} {' '.join(deps_to_install)}") + + # Step 3: Mode-specific config + if mode == "cloud": + print(f"\n Get your API key at https://ui.hindsight.vectorize.io\n") + existing_key = os.environ.get("HINDSIGHT_API_KEY", "") + if existing_key: + masked = f"...{existing_key[-4:]}" if len(existing_key) > 4 else "set" + sys.stdout.write(f" API key (current: {masked}, blank to keep): ") + sys.stdout.flush() + api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip() + else: + sys.stdout.write(" API key: ") + sys.stdout.flush() + api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip() + if api_key: + env_writes["HINDSIGHT_API_KEY"] = api_key + + val = input(f" API URL [{_DEFAULT_API_URL}]: ").strip() + if val: + provider_config["api_url"] = val + + elif mode == "local_external": + val = input(f" Hindsight API URL [{_DEFAULT_LOCAL_URL}]: ").strip() + provider_config["api_url"] = val or _DEFAULT_LOCAL_URL + + sys.stdout.write(" API key (optional, blank to skip): ") + sys.stdout.flush() + api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip() + if api_key: + env_writes["HINDSIGHT_API_KEY"] = api_key + + else: # local_embedded + providers_list = list(_PROVIDER_DEFAULT_MODELS.keys()) + llm_items = [ + (p, f"default model: {_PROVIDER_DEFAULT_MODELS[p]}") + for p in providers_list + ] + llm_idx = _curses_select(" Select LLM provider", llm_items, 
default=0) + llm_provider = providers_list[llm_idx] + + provider_config["llm_provider"] = llm_provider + + if llm_provider == "openai_compatible": + val = input(" LLM endpoint URL (e.g. http://192.168.1.10:8080/v1): ").strip() + if val: + provider_config["llm_base_url"] = val + elif llm_provider == "openrouter": + provider_config["llm_base_url"] = "https://openrouter.ai/api/v1" + + default_model = _PROVIDER_DEFAULT_MODELS.get(llm_provider, "gpt-4o-mini") + val = input(f" LLM model [{default_model}]: ").strip() + provider_config["llm_model"] = val or default_model + + sys.stdout.write(" LLM API key: ") + sys.stdout.flush() + llm_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip() + if llm_key: + env_writes["HINDSIGHT_LLM_API_KEY"] = llm_key + + # Step 4: Save everything + provider_config["bank_id"] = "hermes" + provider_config["recall_budget"] = "mid" + bank_id = "hermes" + config["memory"]["provider"] = "hindsight" + save_config(config) + + self.save_config(provider_config, hermes_home) + + if env_writes: + env_path = Path(hermes_home) / ".env" + env_path.parent.mkdir(parents=True, exist_ok=True) + existing_lines = [] + if env_path.exists(): + existing_lines = env_path.read_text().splitlines() + updated_keys = set() + new_lines = [] + for line in existing_lines: + key_match = line.split("=", 1)[0].strip() if "=" in line and not line.startswith("#") else None + if key_match and key_match in env_writes: + new_lines.append(f"{key_match}={env_writes[key_match]}") + updated_keys.add(key_match) + else: + new_lines.append(line) + for k, v in env_writes.items(): + if k not in updated_keys: + new_lines.append(f"{k}={v}") + env_path.write_text("\n".join(new_lines) + "\n") + + print(f"\n ✓ Hindsight memory configured ({mode} mode)") + if env_writes: + print(f" API keys saved to .env") + print(f"\n Start a new session to activate.\n") + def get_config_schema(self): return [ - {"key": "mode", "description": "Cloud API or local embedded 
mode", "default": "cloud", "choices": ["cloud", "local"]}, - {"key": "api_url", "description": "Hindsight API URL", "default": _DEFAULT_API_URL, "when": {"mode": "cloud"}}, + {"key": "mode", "description": "Connection mode", "default": "cloud", "choices": ["cloud", "local_embedded", "local_external"]}, + # Cloud mode + {"key": "api_url", "description": "Hindsight Cloud API URL", "default": _DEFAULT_API_URL, "when": {"mode": "cloud"}}, {"key": "api_key", "description": "Hindsight Cloud API key", "secret": True, "env_var": "HINDSIGHT_API_KEY", "url": "https://ui.hindsight.vectorize.io", "when": {"mode": "cloud"}}, - {"key": "llm_provider", "description": "LLM provider for local mode", "default": "openai", "choices": ["openai", "anthropic", "gemini", "groq", "minimax", "ollama"], "when": {"mode": "local"}}, - {"key": "llm_api_key", "description": "LLM API key for local Hindsight", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY", "when": {"mode": "local"}}, - {"key": "llm_base_url", "description": "LLM Base URL (e.g. for OpenRouter)", "default": "", "env_var": "HINDSIGHT_API_LLM_BASE_URL", "when": {"mode": "local"}}, - {"key": "llm_model", "description": "LLM model for local mode", "default": "gpt-4o-mini", "default_from": {"field": "llm_provider", "map": _PROVIDER_DEFAULT_MODELS}, "when": {"mode": "local"}}, + # Local external mode + {"key": "api_url", "description": "Hindsight API URL", "default": _DEFAULT_LOCAL_URL, "when": {"mode": "local_external"}}, + {"key": "api_key", "description": "API key (optional)", "secret": True, "env_var": "HINDSIGHT_API_KEY", "when": {"mode": "local_external"}}, + # Local embedded mode + {"key": "llm_provider", "description": "LLM provider", "default": "openai", "choices": ["openai", "anthropic", "gemini", "groq", "openrouter", "minimax", "ollama", "lmstudio", "openai_compatible"], "when": {"mode": "local_embedded"}}, + {"key": "llm_base_url", "description": "Endpoint URL (e.g. 
http://192.168.1.10:8080/v1)", "default": "", "when": {"mode": "local_embedded", "llm_provider": "openai_compatible"}}, + {"key": "llm_api_key", "description": "LLM API key (optional for openai_compatible)", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY", "when": {"mode": "local_embedded"}}, + {"key": "llm_model", "description": "LLM model", "default": "gpt-4o-mini", "default_from": {"field": "llm_provider", "map": _PROVIDER_DEFAULT_MODELS}, "when": {"mode": "local_embedded"}}, {"key": "bank_id", "description": "Memory bank name", "default": "hermes"}, - {"key": "budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]}, + {"key": "bank_mission", "description": "Mission/purpose description for the memory bank"}, + {"key": "bank_retain_mission", "description": "Custom extraction prompt for memory retention"}, + {"key": "recall_budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]}, {"key": "memory_mode", "description": "Memory integration mode", "default": "hybrid", "choices": ["hybrid", "context", "tools"]}, - {"key": "prefetch_method", "description": "Auto-recall method", "default": "recall", "choices": ["recall", "reflect"]}, + {"key": "recall_prefetch_method", "description": "Auto-recall method", "default": "recall", "choices": ["recall", "reflect"]}, + {"key": "tags", "description": "Tags applied when storing memories (comma-separated)", "default": ""}, + {"key": "recall_tags", "description": "Tags to filter when searching memories (comma-separated)", "default": ""}, + {"key": "recall_tags_match", "description": "Tag matching mode for recall", "default": "any", "choices": ["any", "all", "any_strict", "all_strict"]}, + {"key": "auto_recall", "description": "Automatically recall memories before each turn", "default": True}, + {"key": "auto_retain", "description": "Automatically retain conversation turns", "default": True}, + {"key": "retain_every_n_turns", "description": 
"Retain every N turns (1 = every turn)", "default": 1}, + {"key": "retain_async","description": "Process retain asynchronously on the Hindsight server", "default": True}, + {"key": "retain_context", "description": "Context label for retained memories", "default": "conversation between Hermes Agent and the User"}, + {"key": "recall_max_tokens", "description": "Maximum tokens for recall results", "default": 4096}, + {"key": "recall_max_input_chars", "description": "Maximum input query length for auto-recall", "default": 800}, + {"key": "recall_prompt_preamble", "description": "Custom preamble for recalled memories in context"}, ] def _get_client(self): """Return the cached Hindsight client (created once, reused).""" if self._client is None: - if self._mode == "local": + if self._mode == "local_embedded": from hindsight import HindsightEmbedded - # Disable __del__ on the class to prevent "attached to a - # different loop" errors during GC — we handle cleanup in - # shutdown() instead. HindsightEmbedded.__del__ = lambda self: None + llm_provider = self._config.get("llm_provider", "") + if llm_provider in ("openai_compatible", "openrouter"): + llm_provider = "openai" + logger.debug("Creating HindsightEmbedded client (profile=%s, provider=%s)", + self._config.get("profile", "hermes"), llm_provider) kwargs = dict( profile=self._config.get("profile", "hermes"), - llm_provider=self._config.get("llm_provider", ""), - llm_api_key=self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""), + llm_provider=llm_provider, + llm_api_key=self._config.get("llmApiKey") or self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""), llm_model=self._config.get("llm_model", ""), ) - base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "") - if base_url: - kwargs["llm_base_url"] = base_url + if self._llm_base_url: + kwargs["llm_base_url"] = self._llm_base_url self._client = HindsightEmbedded(**kwargs) else: from 
hindsight_client import Hindsight kwargs = {"base_url": self._api_url, "timeout": 30.0} if self._api_key: kwargs["api_key"] = self._api_key + logger.debug("Creating Hindsight cloud client (url=%s, has_key=%s)", + self._api_url, bool(self._api_key)) self._client = Hindsight(**kwargs) return self._client def initialize(self, session_id: str, **kwargs) -> None: + self._session_id = session_id + + # Check client version and auto-upgrade if needed + try: + from importlib.metadata import version as pkg_version + from packaging.version import Version + installed = pkg_version("hindsight-client") + if Version(installed) < Version(_MIN_CLIENT_VERSION): + logger.warning("hindsight-client %s is outdated (need >=%s), attempting upgrade...", + installed, _MIN_CLIENT_VERSION) + import shutil, subprocess, sys + uv_path = shutil.which("uv") + if uv_path: + try: + subprocess.run( + [uv_path, "pip", "install", "--python", sys.executable, + "--quiet", "--upgrade", f"hindsight-client>={_MIN_CLIENT_VERSION}"], + check=True, timeout=120, capture_output=True, + ) + logger.info("hindsight-client upgraded to >=%s", _MIN_CLIENT_VERSION) + except Exception as e: + logger.warning("Auto-upgrade failed: %s. Run: uv pip install 'hindsight-client>=%s'", + e, _MIN_CLIENT_VERSION) + else: + logger.warning("uv not found. 
Run: pip install 'hindsight-client>=%s'", _MIN_CLIENT_VERSION) + except Exception: + pass # packaging not available or other issue — proceed anyway + self._config = _load_config() self._mode = self._config.get("mode", "cloud") - self._api_key = self._config.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", "") - default_url = _DEFAULT_LOCAL_URL if self._mode == "local" else _DEFAULT_API_URL + # "local" is a legacy alias for "local_embedded" + if self._mode == "local": + self._mode = "local_embedded" + self._api_key = self._config.get("apiKey") or self._config.get("api_key") or os.environ.get("HINDSIGHT_API_KEY", "") + default_url = _DEFAULT_LOCAL_URL if self._mode in ("local_embedded", "local_external") else _DEFAULT_API_URL self._api_url = self._config.get("api_url") or os.environ.get("HINDSIGHT_API_URL", default_url) + self._llm_base_url = self._config.get("llm_base_url", "") banks = self._config.get("banks", {}).get("hermes", {}) self._bank_id = self._config.get("bank_id") or banks.get("bankId", "hermes") - budget = self._config.get("budget") or banks.get("budget", "mid") + budget = self._config.get("recall_budget") or self._config.get("budget") or banks.get("budget", "mid") self._budget = budget if budget in _VALID_BUDGETS else "mid" memory_mode = self._config.get("memory_mode", "hybrid") self._memory_mode = memory_mode if memory_mode in ("context", "tools", "hybrid") else "hybrid" - prefetch_method = self._config.get("prefetch_method", "recall") + prefetch_method = self._config.get("recall_prefetch_method", "recall") self._prefetch_method = prefetch_method if prefetch_method in ("recall", "reflect") else "recall" - logger.info("Hindsight initialized: mode=%s, api_url=%s, bank=%s, budget=%s, memory_mode=%s, prefetch_method=%s", - self._mode, self._api_url, self._bank_id, self._budget, self._memory_mode, self._prefetch_method) + # Bank options + self._bank_mission = self._config.get("bank_mission", "") + self._bank_retain_mission = 
self._config.get("bank_retain_mission") or None + + # Tags + self._tags = self._config.get("tags") or None + self._recall_tags = self._config.get("recall_tags") or None + self._recall_tags_match = self._config.get("recall_tags_match", "any") + + # Retain controls + self._auto_retain = self._config.get("auto_retain", True) + self._retain_every_n_turns = max(1, int(self._config.get("retain_every_n_turns", 1))) + self._retain_context = self._config.get("retain_context", "conversation between Hermes Agent and the User") + + # Recall controls + self._auto_recall = self._config.get("auto_recall", True) + self._recall_max_tokens = int(self._config.get("recall_max_tokens", 4096)) + self._recall_types = self._config.get("recall_types") or None + self._recall_prompt_preamble = self._config.get("recall_prompt_preamble", "") + self._recall_max_input_chars = int(self._config.get("recall_max_input_chars", 800)) + self._retain_async = self._config.get("retain_async", True) + + _client_version = "unknown" + try: + from importlib.metadata import version as pkg_version + _client_version = pkg_version("hindsight-client") + except Exception: + pass + logger.info("Hindsight initialized: mode=%s, api_url=%s, bank=%s, budget=%s, memory_mode=%s, prefetch_method=%s, client=%s", + self._mode, self._api_url, self._bank_id, self._budget, self._memory_mode, self._prefetch_method, _client_version) + logger.debug("Hindsight config: auto_retain=%s, auto_recall=%s, retain_every_n=%d, " + "retain_async=%s, retain_context=%s, " + "recall_max_tokens=%d, recall_max_input_chars=%d, tags=%s, recall_tags=%s", + self._auto_retain, self._auto_recall, self._retain_every_n_turns, + self._retain_async, self._retain_context, + self._recall_max_tokens, self._recall_max_input_chars, + self._tags, self._recall_tags) # For local mode, start the embedded daemon in the background so it # doesn't block the chat. Redirect stdout/stderr to a log file to # prevent rich startup output from spamming the terminal. 
- if self._mode == "local": + if self._mode == "local_embedded": def _start_daemon(): import traceback log_dir = get_hermes_home() / "logs" @@ -320,6 +583,8 @@ class HindsightMemoryProvider(MemoryProvider): current_provider = self._config.get("llm_provider", "") current_model = self._config.get("llm_model", "") current_base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "") + # Map openai_compatible/openrouter → openai for the daemon (OpenAI wire format) + daemon_provider = "openai" if current_provider in ("openai_compatible", "openrouter") else current_provider # Read saved profile config saved = {} @@ -330,7 +595,7 @@ class HindsightMemoryProvider(MemoryProvider): saved[k.strip()] = v.strip() config_changed = ( - saved.get("HINDSIGHT_API_LLM_PROVIDER") != current_provider or + saved.get("HINDSIGHT_API_LLM_PROVIDER") != daemon_provider or saved.get("HINDSIGHT_API_LLM_MODEL") != current_model or saved.get("HINDSIGHT_API_LLM_API_KEY") != current_key or saved.get("HINDSIGHT_API_LLM_BASE_URL", "") != current_base_url @@ -340,7 +605,7 @@ class HindsightMemoryProvider(MemoryProvider): # Write updated profile .env profile_env.parent.mkdir(parents=True, exist_ok=True) env_lines = ( - f"HINDSIGHT_API_LLM_PROVIDER={current_provider}\n" + f"HINDSIGHT_API_LLM_PROVIDER={daemon_provider}\n" f"HINDSIGHT_API_LLM_API_KEY={current_key}\n" f"HINDSIGHT_API_LLM_MODEL={current_model}\n" f"HINDSIGHT_API_LOG_LEVEL=info\n" @@ -388,47 +653,118 @@ class HindsightMemoryProvider(MemoryProvider): def prefetch(self, query: str, *, session_id: str = "") -> str: if self._prefetch_thread and self._prefetch_thread.is_alive(): + logger.debug("Prefetch: waiting for background thread to complete") self._prefetch_thread.join(timeout=3.0) with self._prefetch_lock: result = self._prefetch_result self._prefetch_result = "" if not result: + logger.debug("Prefetch: no results available") return "" - return f"## Hindsight Memory\n{result}" + logger.debug("Prefetch: 
returning %d chars of context", len(result)) + header = self._recall_prompt_preamble or ( + "# Hindsight Memory (persistent cross-session context)\n" + "Use this to answer questions about the user and prior sessions. " + "Do not call tools to look up information that is already present here." + ) + return f"{header}\n\n{result}" def queue_prefetch(self, query: str, *, session_id: str = "") -> None: if self._memory_mode == "tools": + logger.debug("Prefetch: skipped (tools-only mode)") return + if not self._auto_recall: + logger.debug("Prefetch: skipped (auto_recall disabled)") + return + # Truncate query to max chars + if self._recall_max_input_chars and len(query) > self._recall_max_input_chars: + query = query[:self._recall_max_input_chars] + def _run(): try: client = self._get_client() if self._prefetch_method == "reflect": + logger.debug("Prefetch: calling reflect (bank=%s, query_len=%d)", self._bank_id, len(query)) resp = _run_sync(client.areflect(bank_id=self._bank_id, query=query, budget=self._budget)) text = resp.text or "" else: - resp = _run_sync(client.arecall(bank_id=self._bank_id, query=query, budget=self._budget)) - text = "\n".join(r.text for r in resp.results if r.text) if resp.results else "" + recall_kwargs: dict = { + "bank_id": self._bank_id, "query": query, + "budget": self._budget, "max_tokens": self._recall_max_tokens, + } + if self._recall_tags: + recall_kwargs["tags"] = self._recall_tags + recall_kwargs["tags_match"] = self._recall_tags_match + if self._recall_types: + recall_kwargs["types"] = self._recall_types + logger.debug("Prefetch: calling recall (bank=%s, query_len=%d, budget=%s)", + self._bank_id, len(query), self._budget) + resp = _run_sync(client.arecall(**recall_kwargs)) + num_results = len(resp.results) if resp.results else 0 + logger.debug("Prefetch: recall returned %d results", num_results) + text = "\n".join(f"- {r.text}" for r in resp.results if r.text) if resp.results else "" if text: with self._prefetch_lock: 
self._prefetch_result = text except Exception as e: - logger.debug("Hindsight prefetch failed: %s", e) + logger.debug("Hindsight prefetch failed: %s", e, exc_info=True) self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="hindsight-prefetch") self._prefetch_thread.start() def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None: - """Retain conversation turn in background (non-blocking).""" - combined = f"User: {user_content}\nAssistant: {assistant_content}" + """Retain conversation turn in background (non-blocking). + + Respects retain_every_n_turns for batching. + """ + if not self._auto_retain: + logger.debug("sync_turn: skipped (auto_retain disabled)") + return + + from datetime import datetime, timezone + now = datetime.now(timezone.utc).isoformat() + + messages = [ + {"role": "user", "content": user_content, "timestamp": now}, + {"role": "assistant", "content": assistant_content, "timestamp": now}, + ] + + turn = json.dumps(messages) + self._session_turns.append(turn) + self._turn_counter += 1 + + # Only retain every N turns + if self._turn_counter % self._retain_every_n_turns != 0: + logger.debug("sync_turn: buffered turn %d (will retain at turn %d)", + self._turn_counter, self._turn_counter + (self._retain_every_n_turns - self._turn_counter % self._retain_every_n_turns)) + return + + logger.debug("sync_turn: retaining %d turns, total session content %d chars", + len(self._session_turns), sum(len(t) for t in self._session_turns)) + # Send the ENTIRE session as a single JSON array (document_id deduplicates). + # Each element in _session_turns is a JSON string of that turn's messages. 
+ content = "[" + ",".join(self._session_turns) + "]" def _sync(): try: client = self._get_client() - _run_sync(client.aretain( - bank_id=self._bank_id, content=combined, context="conversation" + item: dict = { + "content": content, + "context": self._retain_context, + } + if self._tags: + item["tags"] = self._tags + logger.debug("Hindsight retain: bank=%s, doc=%s, async=%s, content_len=%d, num_turns=%d", + self._bank_id, self._session_id, self._retain_async, len(content), len(self._session_turns)) + _run_sync(client.aretain_batch( + bank_id=self._bank_id, + items=[item], + document_id=self._session_id, + retain_async=self._retain_async, )) + logger.debug("Hindsight retain succeeded") except Exception as e: - logger.warning("Hindsight sync failed: %s", e) + logger.warning("Hindsight sync failed: %s", e, exc_info=True) if self._sync_thread and self._sync_thread.is_alive(): self._sync_thread.join(timeout=5.0) @@ -453,12 +789,18 @@ class HindsightMemoryProvider(MemoryProvider): return tool_error("Missing required parameter: content") context = args.get("context") try: - _run_sync(client.aretain( - bank_id=self._bank_id, content=content, context=context - )) + retain_kwargs: dict = { + "bank_id": self._bank_id, "content": content, "context": context, + } + if self._tags: + retain_kwargs["tags"] = self._tags + logger.debug("Tool hindsight_retain: bank=%s, content_len=%d, context=%s", + self._bank_id, len(content), context) + _run_sync(client.aretain(**retain_kwargs)) + logger.debug("Tool hindsight_retain: success") return json.dumps({"result": "Memory stored successfully."}) except Exception as e: - logger.warning("hindsight_retain failed: %s", e) + logger.warning("hindsight_retain failed: %s", e, exc_info=True) return tool_error(f"Failed to store memory: {e}") elif tool_name == "hindsight_recall": @@ -466,15 +808,26 @@ class HindsightMemoryProvider(MemoryProvider): if not query: return tool_error("Missing required parameter: query") try: - resp = 
_run_sync(client.arecall( - bank_id=self._bank_id, query=query, budget=self._budget - )) + recall_kwargs: dict = { + "bank_id": self._bank_id, "query": query, "budget": self._budget, + "max_tokens": self._recall_max_tokens, + } + if self._recall_tags: + recall_kwargs["tags"] = self._recall_tags + recall_kwargs["tags_match"] = self._recall_tags_match + if self._recall_types: + recall_kwargs["types"] = self._recall_types + logger.debug("Tool hindsight_recall: bank=%s, query_len=%d, budget=%s", + self._bank_id, len(query), self._budget) + resp = _run_sync(client.arecall(**recall_kwargs)) + num_results = len(resp.results) if resp.results else 0 + logger.debug("Tool hindsight_recall: %d results", num_results) if not resp.results: return json.dumps({"result": "No relevant memories found."}) lines = [f"{i}. {r.text}" for i, r in enumerate(resp.results, 1)] return json.dumps({"result": "\n".join(lines)}) except Exception as e: - logger.warning("hindsight_recall failed: %s", e) + logger.warning("hindsight_recall failed: %s", e, exc_info=True) return tool_error(f"Failed to search memory: {e}") elif tool_name == "hindsight_reflect": @@ -482,24 +835,28 @@ class HindsightMemoryProvider(MemoryProvider): if not query: return tool_error("Missing required parameter: query") try: + logger.debug("Tool hindsight_reflect: bank=%s, query_len=%d, budget=%s", + self._bank_id, len(query), self._budget) resp = _run_sync(client.areflect( bank_id=self._bank_id, query=query, budget=self._budget )) + logger.debug("Tool hindsight_reflect: response_len=%d", len(resp.text or "")) return json.dumps({"result": resp.text or "No relevant memories found."}) except Exception as e: - logger.warning("hindsight_reflect failed: %s", e) + logger.warning("hindsight_reflect failed: %s", e, exc_info=True) return tool_error(f"Failed to reflect: {e}") return tool_error(f"Unknown tool: {tool_name}") def shutdown(self) -> None: + logger.debug("Hindsight shutdown: waiting for background threads") global _loop, 
_loop_thread for t in (self._prefetch_thread, self._sync_thread): if t and t.is_alive(): t.join(timeout=5.0) if self._client is not None: try: - if self._mode == "local": + if self._mode == "local_embedded": # Use the public close() API. The RuntimeError from # aiohttp's "attached to a different loop" is expected # and harmless — the daemon keeps running independently. diff --git a/plugins/memory/hindsight/plugin.yaml b/plugins/memory/hindsight/plugin.yaml index 798518992..b12c09142 100644 --- a/plugins/memory/hindsight/plugin.yaml +++ b/plugins/memory/hindsight/plugin.yaml @@ -2,9 +2,7 @@ name: hindsight version: 1.0.0 description: "Hindsight — long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval." pip_dependencies: - - hindsight-client - - hindsight-all -requires_env: - - HINDSIGHT_API_KEY + - "hindsight-client>=0.4.22" +requires_env: [] hooks: - on_session_end diff --git a/tests/plugins/memory/test_hindsight_provider.py b/tests/plugins/memory/test_hindsight_provider.py new file mode 100644 index 000000000..5548a29ad --- /dev/null +++ b/tests/plugins/memory/test_hindsight_provider.py @@ -0,0 +1,598 @@ +"""Tests for the Hindsight memory provider plugin. + +Tests cover config loading, tool handlers (tags, max_tokens, types), +prefetch (auto_recall, preamble, query truncation), sync_turn (auto_retain, +turn counting, tags), and schema completeness. 
+""" + +import json +import threading +from types import SimpleNamespace +from unittest.mock import AsyncMock, MagicMock, patch + +import pytest + +from plugins.memory.hindsight import ( + HindsightMemoryProvider, + RECALL_SCHEMA, + REFLECT_SCHEMA, + RETAIN_SCHEMA, + _load_config, +) + + +# --------------------------------------------------------------------------- +# Fixtures +# --------------------------------------------------------------------------- + + +@pytest.fixture(autouse=True) +def _clean_env(monkeypatch): + """Ensure no stale env vars leak between tests.""" + for key in ( + "HINDSIGHT_API_KEY", "HINDSIGHT_API_URL", "HINDSIGHT_BANK_ID", + "HINDSIGHT_BUDGET", "HINDSIGHT_MODE", "HINDSIGHT_LLM_API_KEY", + ): + monkeypatch.delenv(key, raising=False) + + +def _make_mock_client(): + """Create a mock Hindsight client with async methods.""" + client = MagicMock() + client.aretain = AsyncMock() + client.arecall = AsyncMock( + return_value=SimpleNamespace( + results=[ + SimpleNamespace(text="Memory 1"), + SimpleNamespace(text="Memory 2"), + ] + ) + ) + client.areflect = AsyncMock( + return_value=SimpleNamespace(text="Synthesized answer") + ) + client.aretain_batch = AsyncMock() + client.aclose = AsyncMock() + return client + + +@pytest.fixture() +def provider(tmp_path, monkeypatch): + """Create an initialized HindsightMemoryProvider with a mock client.""" + config = { + "mode": "cloud", + "apiKey": "test-key", + "api_url": "http://localhost:9999", + "bank_id": "test-bank", + "budget": "mid", + "memory_mode": "hybrid", + } + config_path = tmp_path / "hindsight" / "config.json" + config_path.parent.mkdir(parents=True, exist_ok=True) + config_path.write_text(json.dumps(config)) + + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", lambda: tmp_path + ) + + p = HindsightMemoryProvider() + p.initialize(session_id="test-session", hermes_home=str(tmp_path), platform="cli") + p._client = _make_mock_client() + return p + + +@pytest.fixture() +def 
provider_with_config(tmp_path, monkeypatch): + """Create a provider factory that accepts custom config overrides.""" + def _make(**overrides): + config = { + "mode": "cloud", + "apiKey": "test-key", + "api_url": "http://localhost:9999", + "bank_id": "test-bank", + "budget": "mid", + "memory_mode": "hybrid", + } + config.update(overrides) + config_path = tmp_path / "hindsight" / "config.json" + config_path.parent.mkdir(parents=True, exist_ok=True) + config_path.write_text(json.dumps(config)) + + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", lambda: tmp_path + ) + + p = HindsightMemoryProvider() + p.initialize(session_id="test-session", hermes_home=str(tmp_path), platform="cli") + p._client = _make_mock_client() + return p + return _make + + +# --------------------------------------------------------------------------- +# Schema tests +# --------------------------------------------------------------------------- + + +class TestSchemas: + def test_retain_schema_has_content(self): + assert RETAIN_SCHEMA["name"] == "hindsight_retain" + assert "content" in RETAIN_SCHEMA["parameters"]["properties"] + assert "content" in RETAIN_SCHEMA["parameters"]["required"] + + def test_recall_schema_has_query(self): + assert RECALL_SCHEMA["name"] == "hindsight_recall" + assert "query" in RECALL_SCHEMA["parameters"]["properties"] + assert "query" in RECALL_SCHEMA["parameters"]["required"] + + def test_reflect_schema_has_query(self): + assert REFLECT_SCHEMA["name"] == "hindsight_reflect" + assert "query" in REFLECT_SCHEMA["parameters"]["properties"] + + def test_get_tool_schemas_returns_three(self, provider): + schemas = provider.get_tool_schemas() + assert len(schemas) == 3 + names = {s["name"] for s in schemas} + assert names == {"hindsight_retain", "hindsight_recall", "hindsight_reflect"} + + def test_context_mode_returns_no_tools(self, provider_with_config): + p = provider_with_config(memory_mode="context") + assert p.get_tool_schemas() == [] + + +# 
--------------------------------------------------------------------------- +# Config tests +# --------------------------------------------------------------------------- + + +class TestConfig: + def test_default_values(self, provider): + assert provider._auto_retain is True + assert provider._auto_recall is True + assert provider._retain_every_n_turns == 1 + assert provider._recall_max_tokens == 4096 + assert provider._recall_max_input_chars == 800 + assert provider._tags is None + assert provider._recall_tags is None + assert provider._bank_mission == "" + assert provider._bank_retain_mission is None + assert provider._retain_context == "conversation between Hermes Agent and the User" + + def test_custom_config_values(self, provider_with_config): + p = provider_with_config( + tags=["tag1", "tag2"], + recall_tags=["recall-tag"], + recall_tags_match="all", + auto_retain=False, + auto_recall=False, + retain_every_n_turns=3, + retain_context="custom-ctx", + bank_retain_mission="Extract key facts", + recall_max_tokens=2048, + recall_types=["world", "experience"], + recall_prompt_preamble="Custom preamble:", + recall_max_input_chars=500, + bank_mission="Test agent mission", + ) + assert p._tags == ["tag1", "tag2"] + assert p._recall_tags == ["recall-tag"] + assert p._recall_tags_match == "all" + assert p._auto_retain is False + assert p._auto_recall is False + assert p._retain_every_n_turns == 3 + assert p._retain_context == "custom-ctx" + assert p._bank_retain_mission == "Extract key facts" + assert p._recall_max_tokens == 2048 + assert p._recall_types == ["world", "experience"] + assert p._recall_prompt_preamble == "Custom preamble:" + assert p._recall_max_input_chars == 500 + assert p._bank_mission == "Test agent mission" + + def test_config_from_env_fallback(self, tmp_path, monkeypatch): + """When no config file exists, falls back to env vars.""" + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", + lambda: tmp_path / "nonexistent", + ) + 
monkeypatch.setenv("HINDSIGHT_MODE", "cloud") + monkeypatch.setenv("HINDSIGHT_API_KEY", "env-key") + monkeypatch.setenv("HINDSIGHT_BANK_ID", "env-bank") + monkeypatch.setenv("HINDSIGHT_BUDGET", "high") + + cfg = _load_config() + assert cfg["apiKey"] == "env-key" + assert cfg["banks"]["hermes"]["bankId"] == "env-bank" + assert cfg["banks"]["hermes"]["budget"] == "high" + + +# --------------------------------------------------------------------------- +# Tool handler tests +# --------------------------------------------------------------------------- + + +class TestToolHandlers: + def test_retain_success(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_retain", {"content": "user likes dark mode"} + )) + assert result["result"] == "Memory stored successfully." + provider._client.aretain.assert_called_once() + call_kwargs = provider._client.aretain.call_args.kwargs + assert call_kwargs["bank_id"] == "test-bank" + assert call_kwargs["content"] == "user likes dark mode" + + def test_retain_with_tags(self, provider_with_config): + p = provider_with_config(tags=["pref", "ui"]) + p.handle_tool_call("hindsight_retain", {"content": "likes dark mode"}) + call_kwargs = p._client.aretain.call_args.kwargs + assert call_kwargs["tags"] == ["pref", "ui"] + + def test_retain_without_tags(self, provider): + provider.handle_tool_call("hindsight_retain", {"content": "hello"}) + call_kwargs = provider._client.aretain.call_args.kwargs + assert "tags" not in call_kwargs + + def test_retain_missing_content(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_retain", {} + )) + assert "error" in result + + def test_recall_success(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_recall", {"query": "dark mode"} + )) + assert "Memory 1" in result["result"] + assert "Memory 2" in result["result"] + + def test_recall_passes_max_tokens(self, provider_with_config): + p = 
provider_with_config(recall_max_tokens=2048) + p.handle_tool_call("hindsight_recall", {"query": "test"}) + call_kwargs = p._client.arecall.call_args.kwargs + assert call_kwargs["max_tokens"] == 2048 + + def test_recall_passes_tags(self, provider_with_config): + p = provider_with_config(recall_tags=["tag1"], recall_tags_match="all") + p.handle_tool_call("hindsight_recall", {"query": "test"}) + call_kwargs = p._client.arecall.call_args.kwargs + assert call_kwargs["tags"] == ["tag1"] + assert call_kwargs["tags_match"] == "all" + + def test_recall_passes_types(self, provider_with_config): + p = provider_with_config(recall_types=["world", "experience"]) + p.handle_tool_call("hindsight_recall", {"query": "test"}) + call_kwargs = p._client.arecall.call_args.kwargs + assert call_kwargs["types"] == ["world", "experience"] + + def test_recall_no_results(self, provider): + provider._client.arecall.return_value = SimpleNamespace(results=[]) + result = json.loads(provider.handle_tool_call( + "hindsight_recall", {"query": "test"} + )) + assert result["result"] == "No relevant memories found." 
+ + def test_recall_missing_query(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_recall", {} + )) + assert "error" in result + + def test_reflect_success(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_reflect", {"query": "summarize"} + )) + assert result["result"] == "Synthesized answer" + + def test_reflect_missing_query(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_reflect", {} + )) + assert "error" in result + + def test_unknown_tool(self, provider): + result = json.loads(provider.handle_tool_call( + "hindsight_unknown", {} + )) + assert "error" in result + + def test_retain_error_handling(self, provider): + provider._client.aretain.side_effect = RuntimeError("connection failed") + result = json.loads(provider.handle_tool_call( + "hindsight_retain", {"content": "test"} + )) + assert "error" in result + assert "connection failed" in result["error"] + + def test_recall_error_handling(self, provider): + provider._client.arecall.side_effect = RuntimeError("timeout") + result = json.loads(provider.handle_tool_call( + "hindsight_recall", {"query": "test"} + )) + assert "error" in result + + +# --------------------------------------------------------------------------- +# Prefetch tests +# --------------------------------------------------------------------------- + + +class TestPrefetch: + def test_prefetch_returns_empty_when_no_result(self, provider): + assert provider.prefetch("test") == "" + + def test_prefetch_default_preamble(self, provider): + provider._prefetch_result = "- some memory" + result = provider.prefetch("test") + assert "Hindsight Memory" in result + assert "- some memory" in result + + def test_prefetch_custom_preamble(self, provider_with_config): + p = provider_with_config(recall_prompt_preamble="Custom header:") + p._prefetch_result = "- memory line" + result = p.prefetch("test") + assert result.startswith("Custom header:") + assert "- memory line" in 
result + + def test_queue_prefetch_skipped_in_tools_mode(self, provider_with_config): + p = provider_with_config(memory_mode="tools") + p.queue_prefetch("test") + # Should not start a thread + assert p._prefetch_thread is None + + def test_queue_prefetch_skipped_when_auto_recall_off(self, provider_with_config): + p = provider_with_config(auto_recall=False) + p.queue_prefetch("test") + assert p._prefetch_thread is None + + def test_queue_prefetch_truncates_query(self, provider_with_config): + p = provider_with_config(recall_max_input_chars=10) + # Mock _run_sync to capture the query + original_query = None + + def _capture_recall(**kwargs): + nonlocal original_query + original_query = kwargs.get("query", "") + return SimpleNamespace(results=[]) + + p._client.arecall = AsyncMock(side_effect=_capture_recall) + + long_query = "a" * 100 + p.queue_prefetch(long_query) + if p._prefetch_thread: + p._prefetch_thread.join(timeout=5.0) + + # The query passed to arecall should be truncated + if original_query is not None: + assert len(original_query) <= 10 + + def test_queue_prefetch_passes_recall_params(self, provider_with_config): + p = provider_with_config( + recall_tags=["t1"], + recall_tags_match="all", + recall_max_tokens=1024, + recall_types=["world"], + ) + p.queue_prefetch("test query") + if p._prefetch_thread: + p._prefetch_thread.join(timeout=5.0) + + call_kwargs = p._client.arecall.call_args.kwargs + assert call_kwargs["max_tokens"] == 1024 + assert call_kwargs["tags"] == ["t1"] + assert call_kwargs["tags_match"] == "all" + assert call_kwargs["types"] == ["world"] + + +# --------------------------------------------------------------------------- +# sync_turn tests +# --------------------------------------------------------------------------- + + +class TestSyncTurn: + def _get_retain_kwargs(self, provider): + """Helper to get the kwargs from the aretain_batch call.""" + return provider._client.aretain_batch.call_args.kwargs + + def _get_retain_content(self, 
provider): + """Helper to get the raw content string from the first item.""" + kwargs = self._get_retain_kwargs(provider) + return kwargs["items"][0]["content"] + + def _get_retain_messages(self, provider): + """Helper to parse the first turn's messages from retained content. + + Content is a JSON array of turns: [[msgs...], [msgs...], ...] + For single-turn tests, returns the first turn's messages. + """ + content = self._get_retain_content(provider) + turns = json.loads(content) + return turns[0] if len(turns) == 1 else turns + + def test_sync_turn_retains(self, provider): + provider.sync_turn("hello", "hi there") + if provider._sync_thread: + provider._sync_thread.join(timeout=5.0) + provider._client.aretain_batch.assert_called_once() + messages = self._get_retain_messages(provider) + assert len(messages) == 2 + assert messages[0]["role"] == "user" + assert messages[0]["content"] == "hello" + assert "timestamp" in messages[0] + assert messages[1]["role"] == "assistant" + assert messages[1]["content"] == "hi there" + assert "timestamp" in messages[1] + + def test_sync_turn_skipped_when_auto_retain_off(self, provider_with_config): + p = provider_with_config(auto_retain=False) + p.sync_turn("hello", "hi") + assert p._sync_thread is None + p._client.aretain_batch.assert_not_called() + + def test_sync_turn_with_tags(self, provider_with_config): + p = provider_with_config(tags=["conv", "session1"]) + p.sync_turn("hello", "hi") + if p._sync_thread: + p._sync_thread.join(timeout=5.0) + item = p._client.aretain_batch.call_args.kwargs["items"][0] + assert item["tags"] == ["conv", "session1"] + + def test_sync_turn_uses_aretain_batch(self, provider): + """sync_turn should use aretain_batch with retain_async.""" + provider.sync_turn("hello", "hi") + if provider._sync_thread: + provider._sync_thread.join(timeout=5.0) + provider._client.aretain_batch.assert_called_once() + call_kwargs = provider._client.aretain_batch.call_args.kwargs + assert call_kwargs["document_id"] == 
"test-session" + assert call_kwargs["retain_async"] is True + assert len(call_kwargs["items"]) == 1 + assert call_kwargs["items"][0]["context"] == "conversation between Hermes Agent and the User" + + def test_sync_turn_custom_context(self, provider_with_config): + p = provider_with_config(retain_context="my-agent") + p.sync_turn("hello", "hi") + if p._sync_thread: + p._sync_thread.join(timeout=5.0) + item = p._client.aretain_batch.call_args.kwargs["items"][0] + assert item["context"] == "my-agent" + + def test_sync_turn_every_n_turns(self, provider_with_config): + """With retain_every_n_turns=3, only retains on every 3rd turn.""" + p = provider_with_config(retain_every_n_turns=3) + + p.sync_turn("turn1-user", "turn1-asst") + assert p._sync_thread is None # not retained yet + + p.sync_turn("turn2-user", "turn2-asst") + assert p._sync_thread is None # not retained yet + + p.sync_turn("turn3-user", "turn3-asst") + assert p._sync_thread is not None # retained! + p._sync_thread.join(timeout=5.0) + + p._client.aretain_batch.assert_called_once() + content = p._client.aretain_batch.call_args.kwargs["items"][0]["content"] + # Should contain all 3 turns + assert "turn1-user" in content + assert "turn2-user" in content + assert "turn3-user" in content + + def test_sync_turn_accumulates_full_session(self, provider_with_config): + """Each retain sends the ENTIRE session, not just the latest batch.""" + p = provider_with_config(retain_every_n_turns=2) + + p.sync_turn("turn1-user", "turn1-asst") + p.sync_turn("turn2-user", "turn2-asst") + if p._sync_thread: + p._sync_thread.join(timeout=5.0) + + p._client.aretain_batch.reset_mock() + + p.sync_turn("turn3-user", "turn3-asst") + p.sync_turn("turn4-user", "turn4-asst") + if p._sync_thread: + p._sync_thread.join(timeout=5.0) + + content = p._client.aretain_batch.call_args.kwargs["items"][0]["content"] + # Should contain ALL turns from the session + assert "turn1-user" in content + assert "turn2-user" in content + assert "turn3-user" 
in content + assert "turn4-user" in content + + def test_sync_turn_passes_document_id(self, provider): + """sync_turn should pass session_id as document_id for dedup.""" + provider.sync_turn("hello", "hi") + if provider._sync_thread: + provider._sync_thread.join(timeout=5.0) + call_kwargs = provider._client.aretain_batch.call_args.kwargs + assert call_kwargs["document_id"] == "test-session" + + def test_sync_turn_error_does_not_raise(self, provider): + """Errors in sync_turn should be swallowed (non-blocking).""" + provider._client.aretain_batch.side_effect = RuntimeError("network error") + provider.sync_turn("hello", "hi") + if provider._sync_thread: + provider._sync_thread.join(timeout=5.0) + # Should not raise + + +# --------------------------------------------------------------------------- +# System prompt tests +# --------------------------------------------------------------------------- + + +class TestSystemPrompt: + def test_hybrid_mode_prompt(self, provider): + block = provider.system_prompt_block() + assert "Hindsight Memory" in block + assert "hindsight_recall" in block + assert "automatically injected" in block + + def test_context_mode_prompt(self, provider_with_config): + p = provider_with_config(memory_mode="context") + block = p.system_prompt_block() + assert "context mode" in block + assert "hindsight_recall" not in block + + def test_tools_mode_prompt(self, provider_with_config): + p = provider_with_config(memory_mode="tools") + block = p.system_prompt_block() + assert "tools mode" in block + assert "hindsight_recall" in block + + +# --------------------------------------------------------------------------- +# Config schema tests +# --------------------------------------------------------------------------- + + +class TestConfigSchema: + def test_schema_has_all_new_fields(self, provider): + schema = provider.get_config_schema() + keys = {f["key"] for f in schema} + expected_keys = { + "mode", "api_url", "api_key", "llm_provider", "llm_api_key", 
+ "llm_model", "bank_id", "bank_mission", "bank_retain_mission", + "recall_budget", "memory_mode", "recall_prefetch_method", + "tags", "recall_tags", "recall_tags_match", + "auto_recall", "auto_retain", + "retain_every_n_turns", "retain_async", + "retain_context", + "recall_max_tokens", "recall_max_input_chars", + "recall_prompt_preamble", + } + assert expected_keys.issubset(keys), f"Missing: {expected_keys - keys}" + + +# --------------------------------------------------------------------------- +# Availability tests +# --------------------------------------------------------------------------- + + +class TestAvailability: + def test_available_with_api_key(self, tmp_path, monkeypatch): + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", + lambda: tmp_path / "nonexistent", + ) + monkeypatch.setenv("HINDSIGHT_API_KEY", "test-key") + p = HindsightMemoryProvider() + assert p.is_available() + + def test_not_available_without_config(self, tmp_path, monkeypatch): + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", + lambda: tmp_path / "nonexistent", + ) + p = HindsightMemoryProvider() + assert not p.is_available() + + def test_available_in_local_mode(self, tmp_path, monkeypatch): + monkeypatch.setattr( + "plugins.memory.hindsight.get_hermes_home", + lambda: tmp_path / "nonexistent", + ) + monkeypatch.setenv("HINDSIGHT_MODE", "local") + p = HindsightMemoryProvider() + assert p.is_available() diff --git a/website/docs/user-guide/features/memory-providers.md b/website/docs/user-guide/features/memory-providers.md index ad0a17ae4..e76a05414 100644 --- a/website/docs/user-guide/features/memory-providers.md +++ b/website/docs/user-guide/features/memory-providers.md @@ -263,12 +263,12 @@ echo "MEM0_API_KEY=your-key" >> ~/.hermes/.env ### Hindsight -Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. The `hindsight_reflect` tool provides cross-memory synthesis that no other provider offers. 
+Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. The `hindsight_reflect` tool provides cross-memory synthesis that no other provider offers. Automatically retains full conversation turns (including tool calls) with session-level document tracking.
 
 | | |
 |---|---|
 | **Best for** | Knowledge graph-based recall with entity relationships |
-| **Requires** | Cloud: `pip install hindsight-client` + API key. Local: `pip install hindsight` + LLM key |
+| **Requires** | Cloud: API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io). Local: LLM API key (OpenAI, Groq, OpenRouter, etc.) |
 | **Data storage** | Hindsight Cloud or local embedded PostgreSQL |
 | **Cost** | Hindsight pricing (cloud) or free (local) |
 
@@ -282,13 +282,25 @@ hermes config set memory.provider hindsight
 echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env
 ```
 
+The setup wizard installs only the dependencies needed for the selected mode (`hindsight-client` for cloud, `hindsight-all` for local). The plugin requires `hindsight-client >= 0.4.22` and auto-upgrades it on session start if outdated.
+ +**Local mode UI:** `hindsight-embed -p hermes ui start` + **Config:** `$HERMES_HOME/hindsight/config.json` | Key | Default | Description | |-----|---------|-------------| | `mode` | `cloud` | `cloud` or `local` | | `bank_id` | `hermes` | Memory bank identifier | -| `budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` | +| `recall_budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` | +| `memory_mode` | `hybrid` | `hybrid` (context + tools), `context` (auto-inject only), `tools` (tools only) | +| `auto_retain` | `true` | Automatically retain conversation turns | +| `auto_recall` | `true` | Automatically recall memories before each turn | +| `retain_async` | `true` | Process retain asynchronously on the server | +| `tags` | — | Tags applied when storing memories | +| `recall_tags` | — | Tags to filter on recall | + +See [plugin README](https://github.com/NousResearch/hermes-agent/blob/main/plugins/memory/hindsight/README.md) for the full configuration reference. ---