Hindsight Memory Provider
Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. Supports cloud, local embedded, and local external modes.
Requirements
- Cloud: API key from ui.hindsight.vectorize.io
- Local Embedded: API key for a supported LLM provider (OpenAI, Anthropic, Gemini, Groq, OpenRouter, MiniMax, Ollama, or any OpenAI-compatible endpoint). Embeddings and reranking run locally — no additional API keys needed.
- Local External: A running Hindsight instance (Docker or self-hosted) reachable over HTTP.
Setup
hermes memory setup # select "hindsight"
The setup wizard will install dependencies automatically via uv and walk you through configuration.
Or manually (cloud mode with defaults):
hermes config set memory.provider hindsight
echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env
Cloud
Connects to the Hindsight Cloud API. Requires an API key from ui.hindsight.vectorize.io.
Local Embedded
Hermes spins up a local Hindsight daemon with built-in PostgreSQL. Requires an LLM API key for memory extraction and synthesis. The daemon starts automatically in the background on first use and stops after 5 minutes of inactivity.
Supports any OpenAI-compatible LLM endpoint (llama.cpp, vLLM, LM Studio, etc.) — pick openai_compatible as the provider and enter the base URL.
Daemon startup logs: ~/.hermes/logs/hindsight-embed.log
Daemon runtime logs: ~/.hindsight/profiles/<profile>.log
To open the Hindsight web UI (local embedded mode only):
hindsight-embed -p hermes ui start
Local External
Points the plugin at an existing Hindsight instance you're already running (Docker, self-hosted, etc.). No daemon management — just a URL and an optional API key.
Config
Config file: ~/.hermes/hindsight/config.json
Connection
| Key | Default | Description |
|---|---|---|
| mode | cloud | cloud, local_embedded, or local_external |
| api_url | https://api.hindsight.vectorize.io | API URL (cloud and local_external modes) |
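As an illustration of the connection settings, here is a minimal sketch that builds and writes a config file from Python. The key names, defaults, and allowed modes come from the table above; the helper names `build_connection_config` and `write_config` are hypothetical, and the flat JSON layout (keys at the top level rather than grouped by section) is an assumption.

```python
import json
from pathlib import Path

# Modes listed in the Connection table.
ALLOWED_MODES = {"cloud", "local_embedded", "local_external"}


def build_connection_config(mode="cloud",
                            api_url="https://api.hindsight.vectorize.io"):
    """Return a dict with the connection keys (hypothetical helper)."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return {"mode": mode, "api_url": api_url}


def write_config(config, path):
    """Write the config as JSON, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path
```

For example, `write_config(build_connection_config(), Path.home() / ".hermes/hindsight/config.json")` would write a cloud-mode file at the path given under Config. The setup wizard remains the supported way to create this file.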
Memory Bank
| Key | Default | Description |
|---|---|---|
| bank_id | hermes | Memory bank name (static fallback used when bank_id_template is unset or resolves empty) |
| bank_id_template | — | Optional template to derive the bank name dynamically. Placeholders: {profile}, {workspace}, {platform}, {user}, {session}. Example: hermes-{profile} isolates memory per active Hermes profile. Empty placeholders collapse cleanly (e.g. hermes-{user} with no user becomes hermes). |
| bank_mission | — | Reflect mission (identity/framing for reflect reasoning). Applied via Banks API. |
| bank_retain_mission | — | Retain mission (steers what gets extracted). Applied via Banks API. |
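The template behavior described for bank_id_template can be sketched as follows. This is not the plugin's actual code: the function name `resolve_bank_id` and the exact collapsing rule (merging repeated `-`/`_` separators and trimming them from the ends) are assumptions chosen to reproduce the documented example.

```python
import re


def resolve_bank_id(template, values, fallback="hermes"):
    """Resolve a bank name from a template (hypothetical sketch).

    `values` maps placeholder names such as "profile" or "user" to
    strings; missing or empty values are treated as empty.
    """
    if not template:
        return fallback  # template unset: use the static bank_id

    def substitute(match):
        return (values.get(match.group(1)) or "").strip()

    name = re.sub(r"\{(\w+)\}", substitute, template)
    # Assumed collapsing rule: merge separators left behind by empty
    # placeholders, then trim them from both ends.
    name = re.sub(r"[-_]{2,}", "-", name).strip("-_")
    return name or fallback  # resolves empty: use the static bank_id
```

With this sketch, `resolve_bank_id("hermes-{profile}", {"profile": "work"})` gives `hermes-work`, and `resolve_bank_id("hermes-{user}", {})` gives `hermes`, matching the example in the table.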
Recall
| Key | Default | Description |
|---|---|---|
| recall_budget | mid | Recall thoroughness: low / mid / high |
| recall_prefetch_method | recall | Auto-recall method: recall (raw facts) or reflect (LLM synthesis) |
| recall_max_tokens | 4096 | Maximum tokens for recall results |
| recall_max_input_chars | 800 | Maximum input query length for auto-recall |
| recall_prompt_preamble | — | Custom preamble for recalled memories in context |
| recall_tags | — | Tags to filter when searching memories |
| recall_tags_match | any | Tag matching mode: any / all / any_strict / all_strict |
| auto_recall | true | Automatically recall memories before each turn |
Retain
| Key | Default | Description |
|---|---|---|
| auto_retain | true | Automatically retain conversation turns |
| retain_async | true | Process retain asynchronously on the Hindsight server |
| retain_every_n_turns | 1 | Retain every N turns (1 = every turn) |
| retain_context | conversation between Hermes Agent and the User | Context label for retained memories |
| retain_tags | — | Default tags applied to retained memories; merged with per-call tool tags |
| retain_source | — | Optional metadata.source attached to retained memories |
| retain_user_prefix | User | Label used before user turns in auto-retained transcripts |
| retain_assistant_prefix | Assistant | Label used before assistant turns in auto-retained transcripts |
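To make the interaction between retain_every_n_turns and the two prefix settings concrete, here is a rough sketch. The function names, the 1-based turn counter, and the `Prefix: text` transcript layout are all assumptions for illustration; the plugin's real transcript format may differ.

```python
def should_retain(turn_number, every_n=1):
    """Return True when a 1-based turn number falls on the retain interval."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return turn_number % every_n == 0


def format_turn(user_text, assistant_text,
                user_prefix="User", assistant_prefix="Assistant"):
    """Render one exchange using the configured labels (assumed layout)."""
    return f"{user_prefix}: {user_text}\n{assistant_prefix}: {assistant_text}"
```

With the default of 1, `should_retain` is true on every turn; with 3, it is true on turns 3, 6, 9, and so on.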
Integration
| Key | Default | Description |
|---|---|---|
| memory_mode | hybrid | How memories are integrated into the agent |
memory_mode:
- hybrid: automatic context injection + tools available to the LLM
- context: automatic injection only, no tools exposed
- tools: tools only, no automatic injection
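The three modes amount to two independent switches: whether context is injected automatically and whether tools are exposed. A small sketch of that mapping follows; the `ModeBehavior` class and `tools_for_mode` function are hypothetical names, while the tool names are the ones listed under Tools in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeBehavior:
    inject_context: bool  # automatic context injection before a turn
    expose_tools: bool    # hindsight_* tools visible to the LLM


# Mapping taken from the memory_mode descriptions above.
MODES = {
    "hybrid": ModeBehavior(inject_context=True, expose_tools=True),
    "context": ModeBehavior(inject_context=True, expose_tools=False),
    "tools": ModeBehavior(inject_context=False, expose_tools=True),
}

TOOL_NAMES = ("hindsight_retain", "hindsight_recall", "hindsight_reflect")


def tools_for_mode(mode):
    """Return the tool names exposed in a given memory_mode."""
    return list(TOOL_NAMES) if MODES[mode].expose_tools else []
```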
Local Embedded LLM
| Key | Default | Description |
|---|---|---|
| llm_provider | openai | openai, anthropic, gemini, groq, openrouter, minimax, ollama, lmstudio, openai_compatible |
| llm_model | per-provider | Model name (e.g. gpt-4o-mini, qwen/qwen3.5-9b) |
| llm_base_url | — | Endpoint URL for openai_compatible (e.g. http://192.168.1.10:8080/v1) |
The LLM API key is stored in ~/.hermes/.env as HINDSIGHT_LLM_API_KEY.
Tools
Available in hybrid and tools memory modes:
| Tool | Description |
|---|---|
| hindsight_retain | Store information with auto entity extraction; supports optional per-call tags |
| hindsight_recall | Multi-strategy search (semantic + entity graph) |
| hindsight_reflect | Cross-memory synthesis (LLM-powered) |
Environment Variables
| Variable | Description |
|---|---|
| HINDSIGHT_API_KEY | API key for Hindsight Cloud |
| HINDSIGHT_LLM_API_KEY | LLM API key for local mode |
| HINDSIGHT_API_LLM_BASE_URL | LLM Base URL for local mode (e.g. OpenRouter) |
| HINDSIGHT_API_URL | Override API endpoint |
| HINDSIGHT_BANK_ID | Override bank name |
| HINDSIGHT_BUDGET | Override recall budget |
| HINDSIGHT_MODE | Override mode (cloud, local_embedded, local_external) |
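The "Override" variables can be thought of as a layer applied on top of the config file. The sketch below shows one way such a merge could work. The pairing of each variable with a config key (for example HINDSIGHT_BUDGET with recall_budget) is an assumption inferred from the descriptions, and `apply_env_overrides` is a hypothetical name, not a plugin function.

```python
import os

# Assumed pairing of override variables with config.json keys.
ENV_OVERRIDES = {
    "HINDSIGHT_API_URL": "api_url",
    "HINDSIGHT_BANK_ID": "bank_id",
    "HINDSIGHT_BUDGET": "recall_budget",
    "HINDSIGHT_MODE": "mode",
}


def apply_env_overrides(config, environ=None):
    """Return a copy of `config` with any set override variables applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for var, key in ENV_OVERRIDES.items():
        value = environ.get(var)
        if value:  # unset or empty variables leave the file value alone
            merged[key] = value
    return merged
```

Passing `environ` explicitly keeps the function easy to test; in normal use it falls back to the process environment.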
Client Version
Requires hindsight-client >= 0.4.22. The plugin auto-upgrades on session start if an older version is detected.
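A version gate like the one described could be approximated as below. This is a simplified sketch, not the plugin's upgrade logic: the helper names are hypothetical, and the parser only compares the first three numeric components, ignoring pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (0, 4, 22)  # minimum stated in this README


def parse_version(text):
    """Parse 'X.Y.Z' into a 3-tuple of ints, ignoring trailing suffixes."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_client_version(package="hindsight-client"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def needs_upgrade(installed):
    """True when the client is missing or older than MIN_VERSION."""
    return installed is None or parse_version(installed) < MIN_VERSION
```

Comparing tuples rather than strings matters here: as strings, "0.10.0" sorts before "0.4.22", while as tuples (0, 10, 0) is correctly the newer version.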