mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(hindsight): feature parity, setup wizard, and config improvements
Port missing features from the hindsight-hermes external integration package into the native plugin. Only touches plugin files — no core changes.

Features:
- Tags on retain/recall (tags, recall_tags, recall_tags_match)
- Recall config (recall_max_tokens, recall_max_input_chars, recall_types, recall_prompt_preamble)
- Retain controls (retain_every_n_turns, auto_retain, auto_recall, retain_async via aretain_batch, retain_context)
- Bank config via Banks API (bank_mission, bank_retain_mission)
- Structured JSON retain with per-message timestamps
- Full session accumulation with document_id for dedup
- Custom post_setup() wizard with curses picker
- Mode-aware dep install (hindsight-client for cloud, hindsight-all for local)
- local_external mode and openai_compatible LLM provider
- OpenRouter support with auto base URL
- Auto-upgrade of hindsight-client to >=0.4.22 on session start
- Comprehensive debug logging across all operations
- 46 unit tests
- Updated README and website docs
This commit is contained in:
parent d97f6cec7f
commit 25757d631b

5 changed files with 1072 additions and 73 deletions
@@ -1,11 +1,12 @@
# Hindsight Memory Provider

Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. Supports cloud and local (embedded) modes.
Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. Supports cloud, local embedded, and local external modes.

## Requirements

- **Cloud:** API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io)
- **Local:** API key for a supported LLM provider (OpenAI, Anthropic, Gemini, Groq, MiniMax, or Ollama). Embeddings and reranking run locally — no additional API keys needed.
- **Local Embedded:** API key for a supported LLM provider (OpenAI, Anthropic, Gemini, Groq, OpenRouter, MiniMax, Ollama, or any OpenAI-compatible endpoint). Embeddings and reranking run locally — no additional API keys needed.
- **Local External:** A running Hindsight instance (Docker or self-hosted) reachable over HTTP.

## Setup

@@ -21,17 +22,28 @@ hermes config set memory.provider hindsight
echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env
```

### Cloud Mode
### Cloud

Connects to the Hindsight Cloud API. Requires an API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io).

### Local Mode
### Local Embedded

Runs an embedded Hindsight server with built-in PostgreSQL. Requires an LLM API key (e.g. Groq, OpenAI, Anthropic) for memory extraction and synthesis. The daemon starts automatically in the background on first use and stops after 5 minutes of inactivity.
Hermes spins up a local Hindsight daemon with built-in PostgreSQL. Requires an LLM API key for memory extraction and synthesis. The daemon starts automatically in the background on first use and stops after 5 minutes of inactivity.

Supports any OpenAI-compatible LLM endpoint (llama.cpp, vLLM, LM Studio, etc.) — pick `openai_compatible` as the provider and enter the base URL.

Daemon startup logs: `~/.hermes/logs/hindsight-embed.log`
Daemon runtime logs: `~/.hindsight/profiles/<profile>.log`

To open the Hindsight web UI (local embedded mode only):

```bash
hindsight-embed -p hermes ui start
```

### Local External

Points the plugin at an existing Hindsight instance you're already running (Docker, self-hosted, etc.). No daemon management — just a URL and an optional API key.

## Config

Config file: `~/.hermes/hindsight/config.json`
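For orientation, a minimal config file might look like the sketch below. The keys are documented in the tables that follow; the particular values shown here are illustrative, not defaults you must use:

```json
{
  "mode": "local_embedded",
  "bank_id": "hermes",
  "recall_budget": "mid",
  "memory_mode": "hybrid",
  "llm_provider": "groq",
  "llm_model": "openai/gpt-oss-120b",
  "auto_retain": true,
  "retain_every_n_turns": 1
}
```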

@@ -40,40 +52,58 @@ Config file: `~/.hermes/hindsight/config.json`

| Key | Default | Description |
|-----|---------|-------------|
| `mode` | `cloud` | `cloud` or `local` |
| `api_url` | `https://api.hindsight.vectorize.io` | API URL (cloud mode) |
| `api_url` | `http://localhost:8888` | API URL (local mode, unused — daemon manages its own port) |
| `mode` | `cloud` | `cloud`, `local_embedded`, or `local_external` |
| `api_url` | `https://api.hindsight.vectorize.io` | API URL (cloud and local_external modes) |

### Memory
### Memory Bank

| Key | Default | Description |
|-----|---------|-------------|
| `bank_id` | `hermes` | Memory bank name |
| `budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` |
| `bank_mission` | — | Reflect mission (identity/framing for reflect reasoning). Applied via Banks API. |
| `bank_retain_mission` | — | Retain mission (steers what gets extracted). Applied via Banks API. |

### Recall

| Key | Default | Description |
|-----|---------|-------------|
| `recall_budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` |
| `recall_prefetch_method` | `recall` | Auto-recall method: `recall` (raw facts) or `reflect` (LLM synthesis) |
| `recall_max_tokens` | `4096` | Maximum tokens for recall results |
| `recall_max_input_chars` | `800` | Maximum input query length for auto-recall |
| `recall_prompt_preamble` | — | Custom preamble for recalled memories in context |
| `recall_tags` | — | Tags to filter when searching memories |
| `recall_tags_match` | `any` | Tag matching mode: `any` / `all` / `any_strict` / `all_strict` |
| `auto_recall` | `true` | Automatically recall memories before each turn |
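The recall settings above all feed a single recall call. A rough sketch of how they might combine into call arguments, mirroring how the plugin builds its kwargs internally (the `build_recall_kwargs` helper is illustrative, not part of the plugin API):

```python
# Illustrative: map recall config keys onto arguments for a recall call.
def build_recall_kwargs(bank_id: str, query: str, cfg: dict) -> dict:
    kwargs = {
        "bank_id": bank_id,
        "query": query,
        "budget": cfg.get("recall_budget", "mid"),
        "max_tokens": cfg.get("recall_max_tokens", 4096),
    }
    # Tag filters are only sent when recall_tags is configured.
    if cfg.get("recall_tags"):
        kwargs["tags"] = cfg["recall_tags"]
        kwargs["tags_match"] = cfg.get("recall_tags_match", "any")
    return kwargs


config = {
    "recall_budget": "high",
    "recall_tags": ["work", "project-x"],
    "recall_tags_match": "all",
}
kwargs = build_recall_kwargs("hermes", "what is the user working on?", config)
print(kwargs["budget"])      # high
print(kwargs["tags_match"])  # all
```

When no tags are configured, the `tags` / `tags_match` keys are omitted entirely rather than sent empty.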

### Retain

| Key | Default | Description |
|-----|---------|-------------|
| `auto_retain` | `true` | Automatically retain conversation turns |
| `retain_async` | `true` | Process retain asynchronously on the Hindsight server |
| `retain_every_n_turns` | `1` | Retain every N turns (1 = every turn) |
| `retain_context` | `conversation between Hermes Agent and the User` | Context label for retained memories |
| `tags` | — | Tags applied when storing memories |

### Integration

| Key | Default | Description |
|-----|---------|-------------|
| `memory_mode` | `hybrid` | How memories are integrated into the agent |
| `prefetch_method` | `recall` | Method for automatic context injection |

**memory_mode:**
- `hybrid` — automatic context injection + tools available to the LLM
- `context` — automatic injection only, no tools exposed
- `tools` — tools only, no automatic injection

**prefetch_method:**
- `recall` — injects raw memory facts (fast)
- `reflect` — injects LLM-synthesized summary (slower, more coherent)
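The two knobs are independent: `memory_mode` decides where memories surface (injected context, tools, or both), while `prefetch_method` only matters when injection is active. A small illustrative sketch of that gating (the `integration_plan` helper is hypothetical, not plugin code):

```python
# Illustrative: which memory surfaces are active for a given memory_mode.
def integration_plan(memory_mode: str, prefetch_method: str) -> dict:
    inject = memory_mode in ("hybrid", "context")  # automatic context injection
    tools = memory_mode in ("hybrid", "tools")     # recall/retain tools exposed
    return {
        "inject_context": inject,
        "expose_tools": tools,
        # prefetch_method is irrelevant when injection is off
        "prefetch_method": prefetch_method if inject else None,
    }


print(integration_plan("context", "reflect"))
# {'inject_context': True, 'expose_tools': False, 'prefetch_method': 'reflect'}
```

In `tools` mode the prefetch path is skipped entirely, which is why disabling injection also silences the auto-recall debug logs.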

### Local Mode LLM
### Local Embedded LLM

| Key | Default | Description |
|-----|---------|-------------|
| `llm_provider` | `openai` | LLM provider: `openai`, `anthropic`, `gemini`, `groq`, `minimax`, `ollama` |
| `llm_model` | per-provider | Model name (e.g. `gpt-4o-mini`, `openai/gpt-oss-120b`) |
| `llm_base_url` | — | LLM Base URL override (e.g. `https://openrouter.ai/api/v1`) |
| `llm_provider` | `openai` | `openai`, `anthropic`, `gemini`, `groq`, `openrouter`, `minimax`, `ollama`, `lmstudio`, `openai_compatible` |
| `llm_model` | per-provider | Model name (e.g. `gpt-4o-mini`, `qwen/qwen3.5-9b`) |
| `llm_base_url` | — | Endpoint URL for `openai_compatible` (e.g. `http://192.168.1.10:8080/v1`) |

The LLM API key is stored in `~/.hermes/.env` as `HINDSIGHT_LLM_API_KEY`.

@@ -97,4 +127,8 @@ Available in `hybrid` and `tools` memory modes:

| `HINDSIGHT_API_URL` | Override API endpoint |
| `HINDSIGHT_BANK_ID` | Override bank name |
| `HINDSIGHT_BUDGET` | Override recall budget |
| `HINDSIGHT_MODE` | Override mode (`cloud` / `local`) |
| `HINDSIGHT_MODE` | Override mode (`cloud`, `local_embedded`, `local_external`) |

## Client Version

Requires `hindsight-client >= 0.4.22`. The plugin auto-upgrades on session start if an older version is detected.
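The version gate has to compare release numbers numerically rather than as strings (the plugin itself uses `packaging.version` for this, as the diff below shows). A minimal dependency-free sketch of the check; note this simple tuple parse does not handle pre-release suffixes like `0.4.22rc1`:

```python
MIN_CLIENT_VERSION = "0.4.22"

def parse_version(v: str) -> tuple:
    # Numeric compare: as plain strings, "0.4.9" would sort ABOVE "0.4.22".
    return tuple(int(p) for p in v.split("."))

def needs_upgrade(installed: str) -> bool:
    return parse_version(installed) < parse_version(MIN_CLIENT_VERSION)


print(needs_upgrade("0.4.9"))   # True
print(needs_upgrade("0.5.0"))   # False
```

When the check fires, the plugin shells out to `uv pip install --upgrade 'hindsight-client>=0.4.22'` in the background; failures are logged with a manual-install hint rather than blocking the session.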

@@ -28,21 +28,25 @@ from hermes_constants import get_hermes_home
from typing import Any, Dict, List

from agent.memory_provider import MemoryProvider
from hermes_constants import get_hermes_home
from tools.registry import tool_error

logger = logging.getLogger(__name__)

_DEFAULT_API_URL = "https://api.hindsight.vectorize.io"
_DEFAULT_LOCAL_URL = "http://localhost:8888"
_MIN_CLIENT_VERSION = "0.4.22"
_VALID_BUDGETS = {"low", "mid", "high"}
_PROVIDER_DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "anthropic": "claude-haiku-4-5",
    "gemini": "gemini-2.5-flash",
    "groq": "openai/gpt-oss-120b",
    "openrouter": "qwen/qwen3.5-9b",
    "minimax": "MiniMax-M2.7",
    "ollama": "gemma3:12b",
    "lmstudio": "local-model",
    "openai_compatible": "your-model-name",
}

@@ -188,6 +192,7 @@ class HindsightMemoryProvider(MemoryProvider):
        self._bank_id = "hermes"
        self._budget = "mid"
        self._mode = "cloud"
        self._llm_base_url = ""
        self._memory_mode = "hybrid"  # "context", "tools", or "hybrid"
        self._prefetch_method = "recall"  # "recall" or "reflect"
        self._client = None

@@ -195,6 +200,31 @@ class HindsightMemoryProvider(MemoryProvider):
        self._prefetch_lock = threading.Lock()
        self._prefetch_thread = None
        self._sync_thread = None
        self._session_id = ""

        # Tags
        self._tags: list[str] | None = None
        self._recall_tags: list[str] | None = None
        self._recall_tags_match = "any"

        # Retain controls
        self._auto_retain = True
        self._retain_every_n_turns = 1
        self._retain_context = "conversation between Hermes Agent and the User"
        self._turn_counter = 0
        self._session_turns: list[str] = []  # accumulates ALL turns for the session

        # Recall controls
        self._auto_recall = True
        self._recall_max_tokens = 4096
        self._recall_types: list[str] | None = None
        self._recall_prompt_preamble = ""
        self._recall_max_input_chars = 800

        # Bank
        self._bank_mission = ""
        self._bank_retain_mission: str | None = None
        self._retain_async = True

    @property
    def name(self) -> str:

@@ -204,7 +234,7 @@ class HindsightMemoryProvider(MemoryProvider):
        try:
            cfg = _load_config()
            mode = cfg.get("mode", "cloud")
            if mode == "local":
            if mode in ("local", "local_embedded", "local_external"):
                return True
            has_key = bool(cfg.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", ""))
            has_url = bool(cfg.get("api_url") or os.environ.get("HINDSIGHT_API_URL", ""))

@@ -228,73 +258,306 @@ class HindsightMemoryProvider(MemoryProvider):
        existing.update(values)
        config_path.write_text(json.dumps(existing, indent=2))

    def post_setup(self, hermes_home: str, config: dict) -> None:
        """Custom setup wizard — installs only the deps needed for the selected mode."""
        import getpass
        import subprocess
        import shutil
        import sys
        from pathlib import Path

        from hermes_cli.config import save_config

        from hermes_cli.memory_setup import _curses_select

        print("\n Configuring Hindsight memory:\n")

        # Step 1: Mode selection
        mode_items = [
            ("Cloud", "Hindsight Cloud API (lightweight, just needs an API key)"),
            ("Local Embedded", "Run Hindsight locally (downloads ~200MB, needs LLM key)"),
            ("Local External", "Connect to an existing Hindsight instance"),
        ]
        mode_idx = _curses_select(" Select mode", mode_items, default=0)
        mode = ["cloud", "local_embedded", "local_external"][mode_idx]

        provider_config: dict = {"mode": mode}
        env_writes: dict = {}

        # Step 2: Install/upgrade deps for selected mode
        _MIN_CLIENT_VERSION = "0.4.22"
        cloud_dep = f"hindsight-client>={_MIN_CLIENT_VERSION}"
        local_dep = "hindsight-all"
        if mode == "local_embedded":
            deps_to_install = [local_dep]
        elif mode == "local_external":
            deps_to_install = [cloud_dep]
        else:
            deps_to_install = [cloud_dep]

        print(f"\n Checking dependencies...")
        uv_path = shutil.which("uv")
        if not uv_path:
            print(" ⚠ uv not found — install it: curl -LsSf https://astral.sh/uv/install.sh | sh")
            print(f" Then run manually: uv pip install --python {sys.executable} {' '.join(deps_to_install)}")
        else:
            try:
                subprocess.run(
                    [uv_path, "pip", "install", "--python", sys.executable, "--quiet", "--upgrade"] + deps_to_install,
                    check=True, timeout=120, capture_output=True,
                )
                print(f" ✓ Dependencies up to date")
            except Exception as e:
                print(f" ⚠ Install failed: {e}")
                print(f" Run manually: uv pip install --python {sys.executable} {' '.join(deps_to_install)}")

        # Step 3: Mode-specific config
        if mode == "cloud":
            print(f"\n Get your API key at https://ui.hindsight.vectorize.io\n")
            existing_key = os.environ.get("HINDSIGHT_API_KEY", "")
            if existing_key:
                masked = f"...{existing_key[-4:]}" if len(existing_key) > 4 else "set"
                sys.stdout.write(f" API key (current: {masked}, blank to keep): ")
                sys.stdout.flush()
                api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
            else:
                sys.stdout.write(" API key: ")
                sys.stdout.flush()
                api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
            if api_key:
                env_writes["HINDSIGHT_API_KEY"] = api_key

            val = input(f" API URL [{_DEFAULT_API_URL}]: ").strip()
            if val:
                provider_config["api_url"] = val

        elif mode == "local_external":
            val = input(f" Hindsight API URL [{_DEFAULT_LOCAL_URL}]: ").strip()
            provider_config["api_url"] = val or _DEFAULT_LOCAL_URL

            sys.stdout.write(" API key (optional, blank to skip): ")
            sys.stdout.flush()
            api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
            if api_key:
                env_writes["HINDSIGHT_API_KEY"] = api_key

        else:  # local_embedded
            providers_list = list(_PROVIDER_DEFAULT_MODELS.keys())
            llm_items = [
                (p, f"default model: {_PROVIDER_DEFAULT_MODELS[p]}")
                for p in providers_list
            ]
            llm_idx = _curses_select(" Select LLM provider", llm_items, default=0)
            llm_provider = providers_list[llm_idx]

            provider_config["llm_provider"] = llm_provider

            if llm_provider == "openai_compatible":
                val = input(" LLM endpoint URL (e.g. http://192.168.1.10:8080/v1): ").strip()
                if val:
                    provider_config["llm_base_url"] = val
            elif llm_provider == "openrouter":
                provider_config["llm_base_url"] = "https://openrouter.ai/api/v1"

            default_model = _PROVIDER_DEFAULT_MODELS.get(llm_provider, "gpt-4o-mini")
            val = input(f" LLM model [{default_model}]: ").strip()
            provider_config["llm_model"] = val or default_model

            sys.stdout.write(" LLM API key: ")
            sys.stdout.flush()
            llm_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
            if llm_key:
                env_writes["HINDSIGHT_LLM_API_KEY"] = llm_key

        # Step 4: Save everything
        provider_config["bank_id"] = "hermes"
        provider_config["recall_budget"] = "mid"
        bank_id = "hermes"
        config["memory"]["provider"] = "hindsight"
        save_config(config)

        self.save_config(provider_config, hermes_home)

        if env_writes:
            env_path = Path(hermes_home) / ".env"
            env_path.parent.mkdir(parents=True, exist_ok=True)
            existing_lines = []
            if env_path.exists():
                existing_lines = env_path.read_text().splitlines()
            updated_keys = set()
            new_lines = []
            for line in existing_lines:
                key_match = line.split("=", 1)[0].strip() if "=" in line and not line.startswith("#") else None
                if key_match and key_match in env_writes:
                    new_lines.append(f"{key_match}={env_writes[key_match]}")
                    updated_keys.add(key_match)
                else:
                    new_lines.append(line)
            for k, v in env_writes.items():
                if k not in updated_keys:
                    new_lines.append(f"{k}={v}")
            env_path.write_text("\n".join(new_lines) + "\n")

        print(f"\n ✓ Hindsight memory configured ({mode} mode)")
        if env_writes:
            print(f" API keys saved to .env")
        print(f"\n Start a new session to activate.\n")

    def get_config_schema(self):
        return [
            {"key": "mode", "description": "Cloud API or local embedded mode", "default": "cloud", "choices": ["cloud", "local"]},
            {"key": "api_url", "description": "Hindsight API URL", "default": _DEFAULT_API_URL, "when": {"mode": "cloud"}},
            {"key": "mode", "description": "Connection mode", "default": "cloud", "choices": ["cloud", "local_embedded", "local_external"]},
            # Cloud mode
            {"key": "api_url", "description": "Hindsight Cloud API URL", "default": _DEFAULT_API_URL, "when": {"mode": "cloud"}},
            {"key": "api_key", "description": "Hindsight Cloud API key", "secret": True, "env_var": "HINDSIGHT_API_KEY", "url": "https://ui.hindsight.vectorize.io", "when": {"mode": "cloud"}},
            {"key": "llm_provider", "description": "LLM provider for local mode", "default": "openai", "choices": ["openai", "anthropic", "gemini", "groq", "minimax", "ollama"], "when": {"mode": "local"}},
            {"key": "llm_api_key", "description": "LLM API key for local Hindsight", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY", "when": {"mode": "local"}},
            {"key": "llm_base_url", "description": "LLM Base URL (e.g. for OpenRouter)", "default": "", "env_var": "HINDSIGHT_API_LLM_BASE_URL", "when": {"mode": "local"}},
            {"key": "llm_model", "description": "LLM model for local mode", "default": "gpt-4o-mini", "default_from": {"field": "llm_provider", "map": _PROVIDER_DEFAULT_MODELS}, "when": {"mode": "local"}},
            # Local external mode
            {"key": "api_url", "description": "Hindsight API URL", "default": _DEFAULT_LOCAL_URL, "when": {"mode": "local_external"}},
            {"key": "api_key", "description": "API key (optional)", "secret": True, "env_var": "HINDSIGHT_API_KEY", "when": {"mode": "local_external"}},
            # Local embedded mode
            {"key": "llm_provider", "description": "LLM provider", "default": "openai", "choices": ["openai", "anthropic", "gemini", "groq", "openrouter", "minimax", "ollama", "lmstudio", "openai_compatible"], "when": {"mode": "local_embedded"}},
            {"key": "llm_base_url", "description": "Endpoint URL (e.g. http://192.168.1.10:8080/v1)", "default": "", "when": {"mode": "local_embedded", "llm_provider": "openai_compatible"}},
            {"key": "llm_api_key", "description": "LLM API key (optional for openai_compatible)", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY", "when": {"mode": "local_embedded"}},
            {"key": "llm_model", "description": "LLM model", "default": "gpt-4o-mini", "default_from": {"field": "llm_provider", "map": _PROVIDER_DEFAULT_MODELS}, "when": {"mode": "local_embedded"}},
            {"key": "bank_id", "description": "Memory bank name", "default": "hermes"},
            {"key": "budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]},
            {"key": "bank_mission", "description": "Mission/purpose description for the memory bank"},
            {"key": "bank_retain_mission", "description": "Custom extraction prompt for memory retention"},
            {"key": "recall_budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]},
            {"key": "memory_mode", "description": "Memory integration mode", "default": "hybrid", "choices": ["hybrid", "context", "tools"]},
            {"key": "prefetch_method", "description": "Auto-recall method", "default": "recall", "choices": ["recall", "reflect"]},
            {"key": "recall_prefetch_method", "description": "Auto-recall method", "default": "recall", "choices": ["recall", "reflect"]},
            {"key": "tags", "description": "Tags applied when storing memories (comma-separated)", "default": ""},
            {"key": "recall_tags", "description": "Tags to filter when searching memories (comma-separated)", "default": ""},
            {"key": "recall_tags_match", "description": "Tag matching mode for recall", "default": "any", "choices": ["any", "all", "any_strict", "all_strict"]},
            {"key": "auto_recall", "description": "Automatically recall memories before each turn", "default": True},
            {"key": "auto_retain", "description": "Automatically retain conversation turns", "default": True},
            {"key": "retain_every_n_turns", "description": "Retain every N turns (1 = every turn)", "default": 1},
            {"key": "retain_async", "description": "Process retain asynchronously on the Hindsight server", "default": True},
            {"key": "retain_context", "description": "Context label for retained memories", "default": "conversation between Hermes Agent and the User"},
            {"key": "recall_max_tokens", "description": "Maximum tokens for recall results", "default": 4096},
            {"key": "recall_max_input_chars", "description": "Maximum input query length for auto-recall", "default": 800},
            {"key": "recall_prompt_preamble", "description": "Custom preamble for recalled memories in context"},
        ]

    def _get_client(self):
        """Return the cached Hindsight client (created once, reused)."""
        if self._client is None:
            if self._mode == "local":
            if self._mode == "local_embedded":
                from hindsight import HindsightEmbedded
                # Disable __del__ on the class to prevent "attached to a
                # different loop" errors during GC — we handle cleanup in
                # shutdown() instead.
                HindsightEmbedded.__del__ = lambda self: None
                llm_provider = self._config.get("llm_provider", "")
                if llm_provider in ("openai_compatible", "openrouter"):
                    llm_provider = "openai"
                logger.debug("Creating HindsightEmbedded client (profile=%s, provider=%s)",
                             self._config.get("profile", "hermes"), llm_provider)
                kwargs = dict(
                    profile=self._config.get("profile", "hermes"),
                    llm_provider=self._config.get("llm_provider", ""),
                    llm_api_key=self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""),
                    llm_provider=llm_provider,
                    llm_api_key=self._config.get("llmApiKey") or self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""),
                    llm_model=self._config.get("llm_model", ""),
                )
                base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "")
                if base_url:
                    kwargs["llm_base_url"] = base_url
                if self._llm_base_url:
                    kwargs["llm_base_url"] = self._llm_base_url
                self._client = HindsightEmbedded(**kwargs)
            else:
                from hindsight_client import Hindsight
                kwargs = {"base_url": self._api_url, "timeout": 30.0}
                if self._api_key:
                    kwargs["api_key"] = self._api_key
                logger.debug("Creating Hindsight cloud client (url=%s, has_key=%s)",
                             self._api_url, bool(self._api_key))
                self._client = Hindsight(**kwargs)
        return self._client

    def initialize(self, session_id: str, **kwargs) -> None:
        self._session_id = session_id

        # Check client version and auto-upgrade if needed
        try:
            from importlib.metadata import version as pkg_version
            from packaging.version import Version
            installed = pkg_version("hindsight-client")
            if Version(installed) < Version(_MIN_CLIENT_VERSION):
                logger.warning("hindsight-client %s is outdated (need >=%s), attempting upgrade...",
                               installed, _MIN_CLIENT_VERSION)
                import shutil, subprocess, sys
                uv_path = shutil.which("uv")
                if uv_path:
                    try:
                        subprocess.run(
                            [uv_path, "pip", "install", "--python", sys.executable,
                             "--quiet", "--upgrade", f"hindsight-client>={_MIN_CLIENT_VERSION}"],
                            check=True, timeout=120, capture_output=True,
                        )
                        logger.info("hindsight-client upgraded to >=%s", _MIN_CLIENT_VERSION)
                    except Exception as e:
                        logger.warning("Auto-upgrade failed: %s. Run: uv pip install 'hindsight-client>=%s'",
                                       e, _MIN_CLIENT_VERSION)
                else:
                    logger.warning("uv not found. Run: pip install 'hindsight-client>=%s'", _MIN_CLIENT_VERSION)
        except Exception:
            pass  # packaging not available or other issue — proceed anyway

        self._config = _load_config()
        self._mode = self._config.get("mode", "cloud")
        self._api_key = self._config.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", "")
        default_url = _DEFAULT_LOCAL_URL if self._mode == "local" else _DEFAULT_API_URL
        # "local" is a legacy alias for "local_embedded"
        if self._mode == "local":
            self._mode = "local_embedded"
        self._api_key = self._config.get("apiKey") or self._config.get("api_key") or os.environ.get("HINDSIGHT_API_KEY", "")
        default_url = _DEFAULT_LOCAL_URL if self._mode in ("local_embedded", "local_external") else _DEFAULT_API_URL
        self._api_url = self._config.get("api_url") or os.environ.get("HINDSIGHT_API_URL", default_url)
        self._llm_base_url = self._config.get("llm_base_url", "")

        banks = self._config.get("banks", {}).get("hermes", {})
        self._bank_id = self._config.get("bank_id") or banks.get("bankId", "hermes")
        budget = self._config.get("budget") or banks.get("budget", "mid")
        budget = self._config.get("recall_budget") or self._config.get("budget") or banks.get("budget", "mid")
        self._budget = budget if budget in _VALID_BUDGETS else "mid"

        memory_mode = self._config.get("memory_mode", "hybrid")
        self._memory_mode = memory_mode if memory_mode in ("context", "tools", "hybrid") else "hybrid"

        prefetch_method = self._config.get("prefetch_method", "recall")
        prefetch_method = self._config.get("recall_prefetch_method", "recall")
        self._prefetch_method = prefetch_method if prefetch_method in ("recall", "reflect") else "recall"

        logger.info("Hindsight initialized: mode=%s, api_url=%s, bank=%s, budget=%s, memory_mode=%s, prefetch_method=%s",
                    self._mode, self._api_url, self._bank_id, self._budget, self._memory_mode, self._prefetch_method)
        # Bank options
        self._bank_mission = self._config.get("bank_mission", "")
        self._bank_retain_mission = self._config.get("bank_retain_mission") or None

        # Tags
        self._tags = self._config.get("tags") or None
        self._recall_tags = self._config.get("recall_tags") or None
        self._recall_tags_match = self._config.get("recall_tags_match", "any")

        # Retain controls
        self._auto_retain = self._config.get("auto_retain", True)
        self._retain_every_n_turns = max(1, int(self._config.get("retain_every_n_turns", 1)))
        self._retain_context = self._config.get("retain_context", "conversation between Hermes Agent and the User")

        # Recall controls
        self._auto_recall = self._config.get("auto_recall", True)
        self._recall_max_tokens = int(self._config.get("recall_max_tokens", 4096))
        self._recall_types = self._config.get("recall_types") or None
        self._recall_prompt_preamble = self._config.get("recall_prompt_preamble", "")
        self._recall_max_input_chars = int(self._config.get("recall_max_input_chars", 800))
        self._retain_async = self._config.get("retain_async", True)

        _client_version = "unknown"
        try:
            from importlib.metadata import version as pkg_version
            _client_version = pkg_version("hindsight-client")
        except Exception:
            pass
        logger.info("Hindsight initialized: mode=%s, api_url=%s, bank=%s, budget=%s, memory_mode=%s, prefetch_method=%s, client=%s",
                    self._mode, self._api_url, self._bank_id, self._budget, self._memory_mode, self._prefetch_method, _client_version)
        logger.debug("Hindsight config: auto_retain=%s, auto_recall=%s, retain_every_n=%d, "
                     "retain_async=%s, retain_context=%s, "
                     "recall_max_tokens=%d, recall_max_input_chars=%d, tags=%s, recall_tags=%s",
                     self._auto_retain, self._auto_recall, self._retain_every_n_turns,
                     self._retain_async, self._retain_context,
                     self._recall_max_tokens, self._recall_max_input_chars,
                     self._tags, self._recall_tags)

        # For local mode, start the embedded daemon in the background so it
        # doesn't block the chat. Redirect stdout/stderr to a log file to
        # prevent rich startup output from spamming the terminal.
        if self._mode == "local":
        if self._mode == "local_embedded":
            def _start_daemon():
                import traceback
                log_dir = get_hermes_home() / "logs"

@@ -320,6 +583,8 @@ class HindsightMemoryProvider(MemoryProvider):
                current_provider = self._config.get("llm_provider", "")
                current_model = self._config.get("llm_model", "")
                current_base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "")
                # Map openai_compatible/openrouter → openai for the daemon (OpenAI wire format)
                daemon_provider = "openai" if current_provider in ("openai_compatible", "openrouter") else current_provider

                # Read saved profile config
                saved = {}

@@ -330,7 +595,7 @@ class HindsightMemoryProvider(MemoryProvider):
                        saved[k.strip()] = v.strip()

                config_changed = (
                    saved.get("HINDSIGHT_API_LLM_PROVIDER") != current_provider or
                    saved.get("HINDSIGHT_API_LLM_PROVIDER") != daemon_provider or
                    saved.get("HINDSIGHT_API_LLM_MODEL") != current_model or
                    saved.get("HINDSIGHT_API_LLM_API_KEY") != current_key or
                    saved.get("HINDSIGHT_API_LLM_BASE_URL", "") != current_base_url

@@ -340,7 +605,7 @@ class HindsightMemoryProvider(MemoryProvider):

                # Write updated profile .env
                profile_env.parent.mkdir(parents=True, exist_ok=True)
                env_lines = (
                    f"HINDSIGHT_API_LLM_PROVIDER={current_provider}\n"
                    f"HINDSIGHT_API_LLM_PROVIDER={daemon_provider}\n"
                    f"HINDSIGHT_API_LLM_API_KEY={current_key}\n"
                    f"HINDSIGHT_API_LLM_MODEL={current_model}\n"
                    f"HINDSIGHT_API_LOG_LEVEL=info\n"

@@ -388,47 +653,118 @@ class HindsightMemoryProvider(MemoryProvider):

    def prefetch(self, query: str, *, session_id: str = "") -> str:
        if self._prefetch_thread and self._prefetch_thread.is_alive():
            logger.debug("Prefetch: waiting for background thread to complete")
            self._prefetch_thread.join(timeout=3.0)
        with self._prefetch_lock:
            result = self._prefetch_result
            self._prefetch_result = ""
        if not result:
            logger.debug("Prefetch: no results available")
            return ""
        return f"## Hindsight Memory\n{result}"
        logger.debug("Prefetch: returning %d chars of context", len(result))
        header = self._recall_prompt_preamble or (
            "# Hindsight Memory (persistent cross-session context)\n"
            "Use this to answer questions about the user and prior sessions. "
            "Do not call tools to look up information that is already present here."
        )
        return f"{header}\n\n{result}"

    def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
        if self._memory_mode == "tools":
            logger.debug("Prefetch: skipped (tools-only mode)")
            return
        if not self._auto_recall:
            logger.debug("Prefetch: skipped (auto_recall disabled)")
            return
        # Truncate query to max chars
        if self._recall_max_input_chars and len(query) > self._recall_max_input_chars:
            query = query[:self._recall_max_input_chars]

        def _run():
            try:
                client = self._get_client()
                if self._prefetch_method == "reflect":
                    logger.debug("Prefetch: calling reflect (bank=%s, query_len=%d)", self._bank_id, len(query))
                    resp = _run_sync(client.areflect(bank_id=self._bank_id, query=query, budget=self._budget))
                    text = resp.text or ""
                else:
                    resp = _run_sync(client.arecall(bank_id=self._bank_id, query=query, budget=self._budget))
                    text = "\n".join(r.text for r in resp.results if r.text) if resp.results else ""
                    recall_kwargs: dict = {
                        "bank_id": self._bank_id, "query": query,
                        "budget": self._budget, "max_tokens": self._recall_max_tokens,
                    }
                    if self._recall_tags:
                        recall_kwargs["tags"] = self._recall_tags
                        recall_kwargs["tags_match"] = self._recall_tags_match
                    if self._recall_types:
                        recall_kwargs["types"] = self._recall_types
                    logger.debug("Prefetch: calling recall (bank=%s, query_len=%d, budget=%s)",
                                 self._bank_id, len(query), self._budget)
                    resp = _run_sync(client.arecall(**recall_kwargs))
                    num_results = len(resp.results) if resp.results else 0
                    logger.debug("Prefetch: recall returned %d results", num_results)
|
||||
text = "\n".join(f"- {r.text}" for r in resp.results if r.text) if resp.results else ""
|
||||
if text:
|
||||
with self._prefetch_lock:
|
||||
self._prefetch_result = text
|
||||
except Exception as e:
|
||||
logger.debug("Hindsight prefetch failed: %s", e)
|
||||
logger.debug("Hindsight prefetch failed: %s", e, exc_info=True)
|
||||
|
||||
self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="hindsight-prefetch")
|
||||
self._prefetch_thread.start()
|
||||
|
||||
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
|
||||
"""Retain conversation turn in background (non-blocking)."""
|
||||
combined = f"User: {user_content}\nAssistant: {assistant_content}"
|
||||
"""Retain conversation turn in background (non-blocking).
|
||||
|
||||
Respects retain_every_n_turns for batching.
|
||||
"""
|
||||
if not self._auto_retain:
|
||||
logger.debug("sync_turn: skipped (auto_retain disabled)")
|
||||
return
|
||||
|
||||
from datetime import datetime, timezone
|
||||
now = datetime.now(timezone.utc).isoformat()
|
||||
|
||||
messages = [
|
||||
{"role": "user", "content": user_content, "timestamp": now},
|
||||
{"role": "assistant", "content": assistant_content, "timestamp": now},
|
||||
]
|
||||
|
||||
turn = json.dumps(messages)
|
||||
self._session_turns.append(turn)
|
||||
self._turn_counter += 1
|
||||
|
||||
# Only retain every N turns
|
||||
if self._turn_counter % self._retain_every_n_turns != 0:
|
||||
logger.debug("sync_turn: buffered turn %d (will retain at turn %d)",
|
||||
self._turn_counter, self._turn_counter + (self._retain_every_n_turns - self._turn_counter % self._retain_every_n_turns))
|
||||
return
|
||||
|
||||
logger.debug("sync_turn: retaining %d turns, total session content %d chars",
|
||||
len(self._session_turns), sum(len(t) for t in self._session_turns))
|
||||
# Send the ENTIRE session as a single JSON array (document_id deduplicates).
|
||||
# Each element in _session_turns is a JSON string of that turn's messages.
|
||||
content = "[" + ",".join(self._session_turns) + "]"
|
||||
|
||||
def _sync():
|
||||
try:
|
||||
client = self._get_client()
|
||||
_run_sync(client.aretain(
|
||||
bank_id=self._bank_id, content=combined, context="conversation"
|
||||
item: dict = {
|
||||
"content": content,
|
||||
"context": self._retain_context,
|
||||
}
|
||||
if self._tags:
|
||||
item["tags"] = self._tags
|
||||
logger.debug("Hindsight retain: bank=%s, doc=%s, async=%s, content_len=%d, num_turns=%d",
|
||||
self._bank_id, self._session_id, self._retain_async, len(content), len(self._session_turns))
|
||||
_run_sync(client.aretain_batch(
|
||||
bank_id=self._bank_id,
|
||||
items=[item],
|
||||
document_id=self._session_id,
|
||||
retain_async=self._retain_async,
|
||||
))
|
||||
logger.debug("Hindsight retain succeeded")
|
||||
except Exception as e:
|
||||
logger.warning("Hindsight sync failed: %s", e)
|
||||
logger.warning("Hindsight sync failed: %s", e, exc_info=True)
|
||||
|
||||
if self._sync_thread and self._sync_thread.is_alive():
|
||||
self._sync_thread.join(timeout=5.0)
|
||||
|
|
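The session-accumulation scheme used by `sync_turn` can be sketched standalone (the `encode_turn` helper is hypothetical; the real provider buffers these strings in `self._session_turns` on the instance):

```python
import json
from datetime import datetime, timezone


def encode_turn(user_content: str, assistant_content: str) -> str:
    # Each turn becomes a JSON array of role/content/timestamp messages.
    now = datetime.now(timezone.utc).isoformat()
    return json.dumps([
        {"role": "user", "content": user_content, "timestamp": now},
        {"role": "assistant", "content": assistant_content, "timestamp": now},
    ])


session_turns = [encode_turn("hello", "hi there"), encode_turn("more?", "sure")]
# The full session is one JSON array of turns; retaining it under a stable
# document_id lets the server deduplicate earlier partial uploads.
content = "[" + ",".join(session_turns) + "]"
turns = json.loads(content)
print(len(turns), turns[0][0]["role"])  # → 2 user
```

Because the accumulated string is itself valid JSON, the server can parse per-message timestamps without any extra framing.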
@@ -453,12 +789,18 @@ class HindsightMemoryProvider(MemoryProvider):
                 return tool_error("Missing required parameter: content")
             context = args.get("context")
             try:
-                _run_sync(client.aretain(
-                    bank_id=self._bank_id, content=content, context=context
-                ))
+                retain_kwargs: dict = {
+                    "bank_id": self._bank_id, "content": content, "context": context,
+                }
+                if self._tags:
+                    retain_kwargs["tags"] = self._tags
+                logger.debug("Tool hindsight_retain: bank=%s, content_len=%d, context=%s",
+                             self._bank_id, len(content), context)
+                _run_sync(client.aretain(**retain_kwargs))
+                logger.debug("Tool hindsight_retain: success")
                 return json.dumps({"result": "Memory stored successfully."})
             except Exception as e:
-                logger.warning("hindsight_retain failed: %s", e)
+                logger.warning("hindsight_retain failed: %s", e, exc_info=True)
                 return tool_error(f"Failed to store memory: {e}")

         elif tool_name == "hindsight_recall":
@@ -466,15 +808,26 @@ class HindsightMemoryProvider(MemoryProvider):
             if not query:
                 return tool_error("Missing required parameter: query")
             try:
-                resp = _run_sync(client.arecall(
-                    bank_id=self._bank_id, query=query, budget=self._budget
-                ))
+                recall_kwargs: dict = {
+                    "bank_id": self._bank_id, "query": query, "budget": self._budget,
+                    "max_tokens": self._recall_max_tokens,
+                }
+                if self._recall_tags:
+                    recall_kwargs["tags"] = self._recall_tags
+                    recall_kwargs["tags_match"] = self._recall_tags_match
+                if self._recall_types:
+                    recall_kwargs["types"] = self._recall_types
+                logger.debug("Tool hindsight_recall: bank=%s, query_len=%d, budget=%s",
+                             self._bank_id, len(query), self._budget)
+                resp = _run_sync(client.arecall(**recall_kwargs))
+                num_results = len(resp.results) if resp.results else 0
+                logger.debug("Tool hindsight_recall: %d results", num_results)
                 if not resp.results:
                     return json.dumps({"result": "No relevant memories found."})
                 lines = [f"{i}. {r.text}" for i, r in enumerate(resp.results, 1)]
                 return json.dumps({"result": "\n".join(lines)})
             except Exception as e:
-                logger.warning("hindsight_recall failed: %s", e)
+                logger.warning("hindsight_recall failed: %s", e, exc_info=True)
                 return tool_error(f"Failed to search memory: {e}")

         elif tool_name == "hindsight_reflect":
@@ -482,24 +835,28 @@ class HindsightMemoryProvider(MemoryProvider):
             if not query:
                 return tool_error("Missing required parameter: query")
             try:
+                logger.debug("Tool hindsight_reflect: bank=%s, query_len=%d, budget=%s",
+                             self._bank_id, len(query), self._budget)
                 resp = _run_sync(client.areflect(
                     bank_id=self._bank_id, query=query, budget=self._budget
                 ))
+                logger.debug("Tool hindsight_reflect: response_len=%d", len(resp.text or ""))
                 return json.dumps({"result": resp.text or "No relevant memories found."})
             except Exception as e:
-                logger.warning("hindsight_reflect failed: %s", e)
+                logger.warning("hindsight_reflect failed: %s", e, exc_info=True)
                 return tool_error(f"Failed to reflect: {e}")

         return tool_error(f"Unknown tool: {tool_name}")

     def shutdown(self) -> None:
+        logger.debug("Hindsight shutdown: waiting for background threads")
         global _loop, _loop_thread
         for t in (self._prefetch_thread, self._sync_thread):
             if t and t.is_alive():
                 t.join(timeout=5.0)
         if self._client is not None:
             try:
-                if self._mode == "local":
+                if self._mode == "local_embedded":
                     # Use the public close() API. The RuntimeError from
                     # aiohttp's "attached to a different loop" is expected
                     # and harmless — the daemon keeps running independently.
@@ -2,9 +2,7 @@ name: hindsight
 version: 1.0.0
 description: "Hindsight — long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval."
 pip_dependencies:
-  - hindsight-client
-  - hindsight-all
-requires_env:
-  - HINDSIGHT_API_KEY
+  - "hindsight-client>=0.4.22"
+requires_env: []
 hooks:
   - on_session_end

tests/plugins/memory/test_hindsight_provider.py (new file, 598 lines)

@@ -0,0 +1,598 @@
"""Tests for the Hindsight memory provider plugin.
|
||||
|
||||
Tests cover config loading, tool handlers (tags, max_tokens, types),
|
||||
prefetch (auto_recall, preamble, query truncation), sync_turn (auto_retain,
|
||||
turn counting, tags), and schema completeness.
|
||||
"""
|
||||
|
||||
import json
|
||||
import threading
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from plugins.memory.hindsight import (
|
||||
HindsightMemoryProvider,
|
||||
RECALL_SCHEMA,
|
||||
REFLECT_SCHEMA,
|
||||
RETAIN_SCHEMA,
|
||||
_load_config,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Fixtures
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _clean_env(monkeypatch):
|
||||
"""Ensure no stale env vars leak between tests."""
|
||||
for key in (
|
||||
"HINDSIGHT_API_KEY", "HINDSIGHT_API_URL", "HINDSIGHT_BANK_ID",
|
||||
"HINDSIGHT_BUDGET", "HINDSIGHT_MODE", "HINDSIGHT_LLM_API_KEY",
|
||||
):
|
||||
monkeypatch.delenv(key, raising=False)
|
||||
|
||||
|
||||
def _make_mock_client():
|
||||
"""Create a mock Hindsight client with async methods."""
|
||||
client = MagicMock()
|
||||
client.aretain = AsyncMock()
|
||||
client.arecall = AsyncMock(
|
||||
return_value=SimpleNamespace(
|
||||
results=[
|
||||
SimpleNamespace(text="Memory 1"),
|
||||
SimpleNamespace(text="Memory 2"),
|
||||
]
|
||||
)
|
||||
)
|
||||
client.areflect = AsyncMock(
|
||||
return_value=SimpleNamespace(text="Synthesized answer")
|
||||
)
|
||||
client.aretain_batch = AsyncMock()
|
||||
client.aclose = AsyncMock()
|
||||
return client
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def provider(tmp_path, monkeypatch):
|
||||
"""Create an initialized HindsightMemoryProvider with a mock client."""
|
||||
config = {
|
||||
"mode": "cloud",
|
||||
"apiKey": "test-key",
|
||||
"api_url": "http://localhost:9999",
|
||||
"bank_id": "test-bank",
|
||||
"budget": "mid",
|
||||
"memory_mode": "hybrid",
|
||||
}
|
||||
config_path = tmp_path / "hindsight" / "config.json"
|
||||
config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
config_path.write_text(json.dumps(config))
|
||||
|
||||
monkeypatch.setattr(
|
||||
"plugins.memory.hindsight.get_hermes_home", lambda: tmp_path
|
||||
)
|
||||
|
||||
p = HindsightMemoryProvider()
|
||||
p.initialize(session_id="test-session", hermes_home=str(tmp_path), platform="cli")
|
||||
p._client = _make_mock_client()
|
||||
return p
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def provider_with_config(tmp_path, monkeypatch):
|
||||
"""Create a provider factory that accepts custom config overrides."""
|
||||
def _make(**overrides):
|
||||
config = {
|
||||
"mode": "cloud",
|
||||
"apiKey": "test-key",
|
||||
"api_url": "http://localhost:9999",
|
||||
"bank_id": "test-bank",
|
||||
"budget": "mid",
|
||||
"memory_mode": "hybrid",
|
||||
}
|
||||
config.update(overrides)
|
||||
config_path = tmp_path / "hindsight" / "config.json"
|
||||
config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
config_path.write_text(json.dumps(config))
|
||||
|
||||
monkeypatch.setattr(
|
||||
"plugins.memory.hindsight.get_hermes_home", lambda: tmp_path
|
||||
)
|
||||
|
||||
p = HindsightMemoryProvider()
|
||||
p.initialize(session_id="test-session", hermes_home=str(tmp_path), platform="cli")
|
||||
p._client = _make_mock_client()
|
||||
return p
|
||||
return _make
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
# Schema tests
# ---------------------------------------------------------------------------


class TestSchemas:
    def test_retain_schema_has_content(self):
        assert RETAIN_SCHEMA["name"] == "hindsight_retain"
        assert "content" in RETAIN_SCHEMA["parameters"]["properties"]
        assert "content" in RETAIN_SCHEMA["parameters"]["required"]

    def test_recall_schema_has_query(self):
        assert RECALL_SCHEMA["name"] == "hindsight_recall"
        assert "query" in RECALL_SCHEMA["parameters"]["properties"]
        assert "query" in RECALL_SCHEMA["parameters"]["required"]

    def test_reflect_schema_has_query(self):
        assert REFLECT_SCHEMA["name"] == "hindsight_reflect"
        assert "query" in REFLECT_SCHEMA["parameters"]["properties"]

    def test_get_tool_schemas_returns_three(self, provider):
        schemas = provider.get_tool_schemas()
        assert len(schemas) == 3
        names = {s["name"] for s in schemas}
        assert names == {"hindsight_retain", "hindsight_recall", "hindsight_reflect"}

    def test_context_mode_returns_no_tools(self, provider_with_config):
        p = provider_with_config(memory_mode="context")
        assert p.get_tool_schemas() == []


# ---------------------------------------------------------------------------
# Config tests
# ---------------------------------------------------------------------------


class TestConfig:
    def test_default_values(self, provider):
        assert provider._auto_retain is True
        assert provider._auto_recall is True
        assert provider._retain_every_n_turns == 1
        assert provider._recall_max_tokens == 4096
        assert provider._recall_max_input_chars == 800
        assert provider._tags is None
        assert provider._recall_tags is None
        assert provider._bank_mission == ""
        assert provider._bank_retain_mission is None
        assert provider._retain_context == "conversation between Hermes Agent and the User"

    def test_custom_config_values(self, provider_with_config):
        p = provider_with_config(
            tags=["tag1", "tag2"],
            recall_tags=["recall-tag"],
            recall_tags_match="all",
            auto_retain=False,
            auto_recall=False,
            retain_every_n_turns=3,
            retain_context="custom-ctx",
            bank_retain_mission="Extract key facts",
            recall_max_tokens=2048,
            recall_types=["world", "experience"],
            recall_prompt_preamble="Custom preamble:",
            recall_max_input_chars=500,
            bank_mission="Test agent mission",
        )
        assert p._tags == ["tag1", "tag2"]
        assert p._recall_tags == ["recall-tag"]
        assert p._recall_tags_match == "all"
        assert p._auto_retain is False
        assert p._auto_recall is False
        assert p._retain_every_n_turns == 3
        assert p._retain_context == "custom-ctx"
        assert p._bank_retain_mission == "Extract key facts"
        assert p._recall_max_tokens == 2048
        assert p._recall_types == ["world", "experience"]
        assert p._recall_prompt_preamble == "Custom preamble:"
        assert p._recall_max_input_chars == 500
        assert p._bank_mission == "Test agent mission"

    def test_config_from_env_fallback(self, tmp_path, monkeypatch):
        """When no config file exists, falls back to env vars."""
        monkeypatch.setattr(
            "plugins.memory.hindsight.get_hermes_home",
            lambda: tmp_path / "nonexistent",
        )
        monkeypatch.setenv("HINDSIGHT_MODE", "cloud")
        monkeypatch.setenv("HINDSIGHT_API_KEY", "env-key")
        monkeypatch.setenv("HINDSIGHT_BANK_ID", "env-bank")
        monkeypatch.setenv("HINDSIGHT_BUDGET", "high")

        cfg = _load_config()
        assert cfg["apiKey"] == "env-key"
        assert cfg["banks"]["hermes"]["bankId"] == "env-bank"
        assert cfg["banks"]["hermes"]["budget"] == "high"
# ---------------------------------------------------------------------------
# Tool handler tests
# ---------------------------------------------------------------------------


class TestToolHandlers:
    def test_retain_success(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_retain", {"content": "user likes dark mode"}
        ))
        assert result["result"] == "Memory stored successfully."
        provider._client.aretain.assert_called_once()
        call_kwargs = provider._client.aretain.call_args.kwargs
        assert call_kwargs["bank_id"] == "test-bank"
        assert call_kwargs["content"] == "user likes dark mode"

    def test_retain_with_tags(self, provider_with_config):
        p = provider_with_config(tags=["pref", "ui"])
        p.handle_tool_call("hindsight_retain", {"content": "likes dark mode"})
        call_kwargs = p._client.aretain.call_args.kwargs
        assert call_kwargs["tags"] == ["pref", "ui"]

    def test_retain_without_tags(self, provider):
        provider.handle_tool_call("hindsight_retain", {"content": "hello"})
        call_kwargs = provider._client.aretain.call_args.kwargs
        assert "tags" not in call_kwargs

    def test_retain_missing_content(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_retain", {}
        ))
        assert "error" in result

    def test_recall_success(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_recall", {"query": "dark mode"}
        ))
        assert "Memory 1" in result["result"]
        assert "Memory 2" in result["result"]

    def test_recall_passes_max_tokens(self, provider_with_config):
        p = provider_with_config(recall_max_tokens=2048)
        p.handle_tool_call("hindsight_recall", {"query": "test"})
        call_kwargs = p._client.arecall.call_args.kwargs
        assert call_kwargs["max_tokens"] == 2048

    def test_recall_passes_tags(self, provider_with_config):
        p = provider_with_config(recall_tags=["tag1"], recall_tags_match="all")
        p.handle_tool_call("hindsight_recall", {"query": "test"})
        call_kwargs = p._client.arecall.call_args.kwargs
        assert call_kwargs["tags"] == ["tag1"]
        assert call_kwargs["tags_match"] == "all"

    def test_recall_passes_types(self, provider_with_config):
        p = provider_with_config(recall_types=["world", "experience"])
        p.handle_tool_call("hindsight_recall", {"query": "test"})
        call_kwargs = p._client.arecall.call_args.kwargs
        assert call_kwargs["types"] == ["world", "experience"]

    def test_recall_no_results(self, provider):
        provider._client.arecall.return_value = SimpleNamespace(results=[])
        result = json.loads(provider.handle_tool_call(
            "hindsight_recall", {"query": "test"}
        ))
        assert result["result"] == "No relevant memories found."

    def test_recall_missing_query(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_recall", {}
        ))
        assert "error" in result

    def test_reflect_success(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_reflect", {"query": "summarize"}
        ))
        assert result["result"] == "Synthesized answer"

    def test_reflect_missing_query(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_reflect", {}
        ))
        assert "error" in result

    def test_unknown_tool(self, provider):
        result = json.loads(provider.handle_tool_call(
            "hindsight_unknown", {}
        ))
        assert "error" in result

    def test_retain_error_handling(self, provider):
        provider._client.aretain.side_effect = RuntimeError("connection failed")
        result = json.loads(provider.handle_tool_call(
            "hindsight_retain", {"content": "test"}
        ))
        assert "error" in result
        assert "connection failed" in result["error"]

    def test_recall_error_handling(self, provider):
        provider._client.arecall.side_effect = RuntimeError("timeout")
        result = json.loads(provider.handle_tool_call(
            "hindsight_recall", {"query": "test"}
        ))
        assert "error" in result
# ---------------------------------------------------------------------------
# Prefetch tests
# ---------------------------------------------------------------------------


class TestPrefetch:
    def test_prefetch_returns_empty_when_no_result(self, provider):
        assert provider.prefetch("test") == ""

    def test_prefetch_default_preamble(self, provider):
        provider._prefetch_result = "- some memory"
        result = provider.prefetch("test")
        assert "Hindsight Memory" in result
        assert "- some memory" in result

    def test_prefetch_custom_preamble(self, provider_with_config):
        p = provider_with_config(recall_prompt_preamble="Custom header:")
        p._prefetch_result = "- memory line"
        result = p.prefetch("test")
        assert result.startswith("Custom header:")
        assert "- memory line" in result

    def test_queue_prefetch_skipped_in_tools_mode(self, provider_with_config):
        p = provider_with_config(memory_mode="tools")
        p.queue_prefetch("test")
        # Should not start a thread
        assert p._prefetch_thread is None

    def test_queue_prefetch_skipped_when_auto_recall_off(self, provider_with_config):
        p = provider_with_config(auto_recall=False)
        p.queue_prefetch("test")
        assert p._prefetch_thread is None

    def test_queue_prefetch_truncates_query(self, provider_with_config):
        p = provider_with_config(recall_max_input_chars=10)
        # Mock _run_sync to capture the query
        original_query = None

        def _capture_recall(**kwargs):
            nonlocal original_query
            original_query = kwargs.get("query", "")
            return SimpleNamespace(results=[])

        p._client.arecall = AsyncMock(side_effect=_capture_recall)

        long_query = "a" * 100
        p.queue_prefetch(long_query)
        if p._prefetch_thread:
            p._prefetch_thread.join(timeout=5.0)

        # The query passed to arecall should be truncated
        if original_query is not None:
            assert len(original_query) <= 10

    def test_queue_prefetch_passes_recall_params(self, provider_with_config):
        p = provider_with_config(
            recall_tags=["t1"],
            recall_tags_match="all",
            recall_max_tokens=1024,
            recall_types=["world"],
        )
        p.queue_prefetch("test query")
        if p._prefetch_thread:
            p._prefetch_thread.join(timeout=5.0)

        call_kwargs = p._client.arecall.call_args.kwargs
        assert call_kwargs["max_tokens"] == 1024
        assert call_kwargs["tags"] == ["t1"]
        assert call_kwargs["tags_match"] == "all"
        assert call_kwargs["types"] == ["world"]
# ---------------------------------------------------------------------------
# sync_turn tests
# ---------------------------------------------------------------------------


class TestSyncTurn:
    def _get_retain_kwargs(self, provider):
        """Helper to get the kwargs from the aretain_batch call."""
        return provider._client.aretain_batch.call_args.kwargs

    def _get_retain_content(self, provider):
        """Helper to get the raw content string from the first item."""
        kwargs = self._get_retain_kwargs(provider)
        return kwargs["items"][0]["content"]

    def _get_retain_messages(self, provider):
        """Helper to parse the first turn's messages from retained content.

        Content is a JSON array of turns: [[msgs...], [msgs...], ...]
        For single-turn tests, returns the first turn's messages.
        """
        content = self._get_retain_content(provider)
        turns = json.loads(content)
        return turns[0] if len(turns) == 1 else turns

    def test_sync_turn_retains(self, provider):
        provider.sync_turn("hello", "hi there")
        if provider._sync_thread:
            provider._sync_thread.join(timeout=5.0)
        provider._client.aretain_batch.assert_called_once()
        messages = self._get_retain_messages(provider)
        assert len(messages) == 2
        assert messages[0]["role"] == "user"
        assert messages[0]["content"] == "hello"
        assert "timestamp" in messages[0]
        assert messages[1]["role"] == "assistant"
        assert messages[1]["content"] == "hi there"
        assert "timestamp" in messages[1]

    def test_sync_turn_skipped_when_auto_retain_off(self, provider_with_config):
        p = provider_with_config(auto_retain=False)
        p.sync_turn("hello", "hi")
        assert p._sync_thread is None
        p._client.aretain_batch.assert_not_called()

    def test_sync_turn_with_tags(self, provider_with_config):
        p = provider_with_config(tags=["conv", "session1"])
        p.sync_turn("hello", "hi")
        if p._sync_thread:
            p._sync_thread.join(timeout=5.0)
        item = p._client.aretain_batch.call_args.kwargs["items"][0]
        assert item["tags"] == ["conv", "session1"]

    def test_sync_turn_uses_aretain_batch(self, provider):
        """sync_turn should use aretain_batch with retain_async."""
        provider.sync_turn("hello", "hi")
        if provider._sync_thread:
            provider._sync_thread.join(timeout=5.0)
        provider._client.aretain_batch.assert_called_once()
        call_kwargs = provider._client.aretain_batch.call_args.kwargs
        assert call_kwargs["document_id"] == "test-session"
        assert call_kwargs["retain_async"] is True
        assert len(call_kwargs["items"]) == 1
        assert call_kwargs["items"][0]["context"] == "conversation between Hermes Agent and the User"

    def test_sync_turn_custom_context(self, provider_with_config):
        p = provider_with_config(retain_context="my-agent")
        p.sync_turn("hello", "hi")
        if p._sync_thread:
            p._sync_thread.join(timeout=5.0)
        item = p._client.aretain_batch.call_args.kwargs["items"][0]
        assert item["context"] == "my-agent"

    def test_sync_turn_every_n_turns(self, provider_with_config):
        """With retain_every_n_turns=3, only retains on every 3rd turn."""
        p = provider_with_config(retain_every_n_turns=3)

        p.sync_turn("turn1-user", "turn1-asst")
        assert p._sync_thread is None  # not retained yet

        p.sync_turn("turn2-user", "turn2-asst")
        assert p._sync_thread is None  # not retained yet

        p.sync_turn("turn3-user", "turn3-asst")
        assert p._sync_thread is not None  # retained!
        p._sync_thread.join(timeout=5.0)

        p._client.aretain_batch.assert_called_once()
        content = p._client.aretain_batch.call_args.kwargs["items"][0]["content"]
        # Should contain all 3 turns
        assert "turn1-user" in content
        assert "turn2-user" in content
        assert "turn3-user" in content

    def test_sync_turn_accumulates_full_session(self, provider_with_config):
        """Each retain sends the ENTIRE session, not just the latest batch."""
        p = provider_with_config(retain_every_n_turns=2)

        p.sync_turn("turn1-user", "turn1-asst")
        p.sync_turn("turn2-user", "turn2-asst")
        if p._sync_thread:
            p._sync_thread.join(timeout=5.0)

        p._client.aretain_batch.reset_mock()

        p.sync_turn("turn3-user", "turn3-asst")
        p.sync_turn("turn4-user", "turn4-asst")
        if p._sync_thread:
            p._sync_thread.join(timeout=5.0)

        content = p._client.aretain_batch.call_args.kwargs["items"][0]["content"]
        # Should contain ALL turns from the session
        assert "turn1-user" in content
        assert "turn2-user" in content
        assert "turn3-user" in content
        assert "turn4-user" in content

    def test_sync_turn_passes_document_id(self, provider):
        """sync_turn should pass session_id as document_id for dedup."""
        provider.sync_turn("hello", "hi")
        if provider._sync_thread:
            provider._sync_thread.join(timeout=5.0)
        call_kwargs = provider._client.aretain_batch.call_args.kwargs
        assert call_kwargs["document_id"] == "test-session"

    def test_sync_turn_error_does_not_raise(self, provider):
        """Errors in sync_turn should be swallowed (non-blocking)."""
        provider._client.aretain_batch.side_effect = RuntimeError("network error")
        provider.sync_turn("hello", "hi")
        if provider._sync_thread:
            provider._sync_thread.join(timeout=5.0)
        # Should not raise
# ---------------------------------------------------------------------------
# System prompt tests
# ---------------------------------------------------------------------------


class TestSystemPrompt:
    def test_hybrid_mode_prompt(self, provider):
        block = provider.system_prompt_block()
        assert "Hindsight Memory" in block
        assert "hindsight_recall" in block
        assert "automatically injected" in block

    def test_context_mode_prompt(self, provider_with_config):
        p = provider_with_config(memory_mode="context")
        block = p.system_prompt_block()
        assert "context mode" in block
        assert "hindsight_recall" not in block

    def test_tools_mode_prompt(self, provider_with_config):
        p = provider_with_config(memory_mode="tools")
        block = p.system_prompt_block()
        assert "tools mode" in block
        assert "hindsight_recall" in block


# ---------------------------------------------------------------------------
# Config schema tests
# ---------------------------------------------------------------------------


class TestConfigSchema:
    def test_schema_has_all_new_fields(self, provider):
        schema = provider.get_config_schema()
        keys = {f["key"] for f in schema}
        expected_keys = {
            "mode", "api_url", "api_key", "llm_provider", "llm_api_key",
            "llm_model", "bank_id", "bank_mission", "bank_retain_mission",
            "recall_budget", "memory_mode", "recall_prefetch_method",
            "tags", "recall_tags", "recall_tags_match",
            "auto_recall", "auto_retain",
            "retain_every_n_turns", "retain_async",
            "retain_context",
            "recall_max_tokens", "recall_max_input_chars",
            "recall_prompt_preamble",
        }
        assert expected_keys.issubset(keys), f"Missing: {expected_keys - keys}"

# ---------------------------------------------------------------------------
# Availability tests
# ---------------------------------------------------------------------------


class TestAvailability:
    def test_available_with_api_key(self, tmp_path, monkeypatch):
        monkeypatch.setattr(
            "plugins.memory.hindsight.get_hermes_home",
            lambda: tmp_path / "nonexistent",
        )
        monkeypatch.setenv("HINDSIGHT_API_KEY", "test-key")
        p = HindsightMemoryProvider()
        assert p.is_available()

    def test_not_available_without_config(self, tmp_path, monkeypatch):
        monkeypatch.setattr(
            "plugins.memory.hindsight.get_hermes_home",
            lambda: tmp_path / "nonexistent",
        )
        p = HindsightMemoryProvider()
        assert not p.is_available()

    def test_available_in_local_mode(self, tmp_path, monkeypatch):
        monkeypatch.setattr(
            "plugins.memory.hindsight.get_hermes_home",
            lambda: tmp_path / "nonexistent",
        )
        monkeypatch.setenv("HINDSIGHT_MODE", "local")
        p = HindsightMemoryProvider()
        assert p.is_available()
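The availability rules these tests exercise — a config file on disk, an API key in the environment, or local mode requested — can be written as a standalone predicate. This is a simplified sketch of the decision logic the tests imply, not the provider's real code:

```python
import os


def is_available(config_path: str, env: dict) -> bool:
    """Usable if a config file exists, an API key is set, or local mode is requested."""
    if os.path.exists(config_path):
        return True
    if env.get("HINDSIGHT_API_KEY"):
        return True
    return env.get("HINDSIGHT_MODE") == "local"


print(is_available("/nonexistent", {"HINDSIGHT_API_KEY": "test-key"}))  # → True
print(is_available("/nonexistent", {}))                                 # → False
print(is_available("/nonexistent", {"HINDSIGHT_MODE": "local"}))        # → True
```

Taking the environment as a parameter rather than reading `os.environ` directly is what makes the real checks easy to drive with `monkeypatch.setenv` in the tests above.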
@@ -263,12 +263,12 @@ echo "MEM0_API_KEY=your-key" >> ~/.hermes/.env

### Hindsight

-Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. The `hindsight_reflect` tool provides cross-memory synthesis that no other provider offers.
+Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. The `hindsight_reflect` tool provides cross-memory synthesis that no other provider offers. Automatically retains full conversation turns (including tool calls) with session-level document tracking.

| | |
|---|---|
| **Best for** | Knowledge graph-based recall with entity relationships |
-| **Requires** | Cloud: `pip install hindsight-client` + API key. Local: `pip install hindsight` + LLM key |
+| **Requires** | Cloud: API key from [ui.hindsight.vectorize.io](https://ui.hindsight.vectorize.io). Local: LLM API key (OpenAI, Groq, OpenRouter, etc.) |
| **Data storage** | Hindsight Cloud or local embedded PostgreSQL |
| **Cost** | Hindsight pricing (cloud) or free (local) |

@@ -282,13 +282,25 @@ hermes config set memory.provider hindsight
echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env
```

+The setup wizard installs dependencies automatically and only installs what's needed for the selected mode (`hindsight-client` for cloud, `hindsight-all` for local). Requires `hindsight-client >= 0.4.22` (auto-upgraded on session start if outdated).
+
**Local mode UI:** `hindsight-embed -p hermes ui start`

**Config:** `$HERMES_HOME/hindsight/config.json`

| Key | Default | Description |
|-----|---------|-------------|
| `mode` | `cloud` | `cloud` or `local` |
| `bank_id` | `hermes` | Memory bank identifier |
-| `budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` |
+| `recall_budget` | `mid` | Recall thoroughness: `low` / `mid` / `high` |
| `memory_mode` | `hybrid` | `hybrid` (context + tools), `context` (auto-inject only), `tools` (tools only) |
+| `auto_retain` | `true` | Automatically retain conversation turns |
+| `auto_recall` | `true` | Automatically recall memories before each turn |
+| `retain_async` | `true` | Process retain asynchronously on the server |
+| `tags` | — | Tags applied when storing memories |
+| `recall_tags` | — | Tags to filter on recall |

See [plugin README](https://github.com/NousResearch/hermes-agent/blob/main/plugins/memory/hindsight/README.md) for the full configuration reference.

---
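For orientation, a `config.json` combining the defaults documented in the table above might look like this (values are illustrative, and only the keys shown in the table are included):

```json
{
  "mode": "cloud",
  "bank_id": "hermes",
  "recall_budget": "mid",
  "memory_mode": "hybrid",
  "auto_retain": true,
  "auto_recall": true,
  "retain_async": true,
  "tags": ["hermes"],
  "recall_tags": ["hermes"]
}
```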