Mirror built-in memory writes to external providers only after the native memory tool succeeds and is not staged for approval. Keep OpenViking's built-in memory mirroring add-only, since Hermes native memory entries do not yet have stable OpenViking file URIs for replace/remove.
Add a narrow viking_forget tool for exact user memory file deletion and document the current OpenViking write/delete behavior.
* fix: update to version 3 endpoints and adding update and delete tool
* chore: removing the test md file
* fix: prevent circuit breaker on client errors in Mem0 provider
* chore: add telemetry for platform version
* feat: add OSS mode support to Mem0 memory provider
* chore: bump mem0ai dependency to >=2.0.1 in memory plugin
* refactor: enhance dependency checks and embedder config in mem0 backend
* refactor: adjust fact storage message for OSS mode
* refactor: expand user paths, add collection recreation on dimension change for Qdrant
* fix(mem0): make MEM0_USER_ID override gateway-native ids and tag writes with channel
When MEM0_USER_ID was configured (env or mem0.json), the gateway-native id
from kwargs (Telegram numeric id, Discord snowflake, ...) still won, so the
same human ended up under different user_ids per channel and memories never
merged across CLI / Telegram / Slack / Discord. Mirrors openclaw's cfg.userId
pattern: configured override wins, gateway-native id is the fallback.
The legacy "hermes-user" placeholder default written by the setup wizard is
treated as unset to avoid silently bucketing every gateway user together.
Also tag every write with metadata.channel (cli/telegram/discord/...) so the
dashboard can offer per-channel filtered views without coupling identity to
the channel; document the read/write filter asymmetry as intentional
(reads scope to user_id only for cross-agent recall).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: improve Mem0 memory provider backend, pagination, config, and error handling
* refactor: update mem0 telemetry code, docs, and bump version
* fix(mem0): make get_config_schema() return unified schema with mode-aware required flag
Schema always includes api_key field so picker shows "API key / local" for
both modes. In OSS mode api_key.required=False so status won't mislead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: improve mem0 telemetry, add env var key and OSS mode detection
* chore: bump mem0ai lower bound to 2.0.4 (latest SDK release)
* refactor: set telemetry sample rate to 1.0 and update docs for opt‑out
* fix(mem0): resolve 15 correctness, thread-safety, and resource bugs
Thread safety:
- Protect circuit breaker counters with _breaker_lock (race between
prefetch/sync daemon threads and main thread)
- Wrap sync_turn thread creation in _sync_lock; skip if previous sync
is still alive after 5 s join to prevent duplicate memory ingestion
- Guard _schedule_flush timer creation under _queue_lock (TOCTOU race)
- Capture local `backend` reference in prefetch/sync closures so
shutdown() nulling self._backend cannot crash in-flight threads
Correctness:
- Fix bool("false")==True for rerank param; parse string values explicitly
- Guard page/top_k with max(1,...) and move int() inside try blocks
- Fix fact_count=0 always in OSS mode (Memory.add returns list, not dict)
- Fix prefetch() not clearing result when thread still alive after timeout
- Fix atexit.register accumulating on repeated initialize() calls
Backend / setup:
- Handle Qdrant named-vector collections in _recreate_collection_if_dims_changed
(vectors is a dict; .size access raised AttributeError, swallowed silently)
- Wrap QdrantClient and psycopg2 conn/cursor in try/finally to prevent leaks
- Resolve ollama_bin at top of _ensure_ollama; use it for ollama pull
- Fix embedder key lookup when LLM provider has no env_var (e.g. ollama)
Also: remove _telemetry_enabled cache (env var check is cheap), bump
required mem0ai to >=2.0.7, minor README wording fix.
* fix(mem0): fix brittle qdrant path test + add telemetry sample-rate docs
- Replace generator-throw lambda with a proper def in
test_qdrant_path_not_writable; use tmp_path instead of a hardcoded
/nonexistent path so the test is root-safe
- Add MEM0_TELEMETRY_SAMPLE_RATE to memory-providers.md (was only
in the plugin README, not the user-guide docs)
* revert: remove MEM0_TELEMETRY_SAMPLE_RATE from user-guide docs
* refactor: remove telemetry from mem0 plugin and update documentation
* fix(mem0): set stdin=DEVNULL on setup subprocess calls
The TUI stdin guard (scripts/check_subprocess_stdin.py) requires every
subprocess call in plugin code to set stdin= so it can't inherit the
gateway's JSON-RPC stdin fd. Muzzle the docker/ollama calls in the OSS
setup wizard with stdin=subprocess.DEVNULL (none need interactive input).
Also covers the docker-inspect call the linter's regex misses.
---------
Co-authored-by: chaithanyak42 <chaithanya.kumar42a@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses reviewer feedback on #13377:
1. Restore all stripped docstrings (_load_config, _is_breaker_open,
sync_turn, register, _get_client, _read_filters, _write_filters,
_unwrap_results, save_config) and section dividers
2. Revert api_key to required:true in schema — self-hosted Mem0 also
requires auth by default; validation in _get_client() handles the
either/or logic separately from the schema
3. Confirm secret:true remains on api_key (already correct)
The mem0 plugin previously hardcoded api.mem0.ai as the endpoint.
This adds a `host` config key and MEM0_HOST env var so users can
point the plugin at a self-hosted Mem0 instance.
Changes:
- _load_config(): read MEM0_HOST env var
- is_available(): accept host OR api_key (self-hosted may not need a real key)
- get_config_schema(): add host field
- initialize(): read host from config
- _get_client(): pass host kwarg to MemoryClient when set
- system_prompt_block(): show target (cloud vs URL)
- README: document self-hosted setup
On resource-contended hosts the embedded Hindsight daemon can exceed a
single 2s /health check; upstream then waits a grace window before
treating it as stale and killing+restarting it (hindsight-embed reads
HINDSIGHT_EMBED_PORT_HEALTH_GRACE_TIMEOUT, default 30s, into a
module-level constant at import time). Users on busy boxes had no
Hermes-side way to raise it short of hand-setting an env var.
Add a 'port_health_grace_timeout' config.json option to the Hindsight
plugin. When set, initialize() exports it to the process env BEFORE
daemon_embed_manager is imported (the import-time read is the contract).
setdefault() so an explicit operator env override always wins. Exposed
in 'hermes memory setup' for local_embedded mode.
Follow-up to #50308 / issue #13125 comment thread.
hermes backup only walks HERMES_HOME, so memory providers that keep
config/credentials in home-anchored dotdirs (honcho -> ~/.honcho,
hindsight -> ~/.hindsight, openviking -> ~/.openviking) lost that data
across a backup/import cycle — the peer IDs, session pairings, and API
keys never made it into the archive.
Add an optional MemoryProvider.backup_paths() hook (default []). The
active provider declares its external paths; backup resolves them from
config only (no init, no network), archives the ones under the home dir
into a reserved _external/ subtree encoded relative to home, and import
restores them to their original location with a home-anchored traversal
guard and 0600 on credential-shaped files. Paths outside home are
skipped as non-portable.
honcho, hindsight, and openviking override the hook. E2E-validated full
backup->import cycle plus 7 new tests.
PostgreSQL's initdb refuses to run as root, so the embedded Hindsight
daemon could never initialize its data directory under root. The
daemon-start thread would fail, retry, and loop forever — each cycle
reloading embedding models (~958MB RAM, ~33% CPU) with no user-visible
error, leaving Hermes sluggish on a common VPS/cloud root setup.
initialize() now detects root (os.geteuid() == 0) before spawning the
daemon thread, disables local_embedded mode, and surfaces a clear
warning to both the log and the terminal so the user knows to run as a
non-root user or switch to cloud / local_external mode.
Closes#13125.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
The lazy_deps pin (memory.hindsight -> hindsight-client==0.6.1) was newer
than the plugin's stated floor (>=0.4.22). Align _MIN_CLIENT_VERSION,
the setup wizard dep string, plugin.yaml, and the README to 0.6.1 so the
floor check, auto-upgrade target, and runtime lazy-install all agree.
Also drops the redundant local _MIN_CLIENT_VERSION redefinition in
post_setup.
The batch tool_status values ('completed'/'error'/'pending') and the inbound
status alias sets were inline magic strings, duplicated across two checks in
_tool_result_status. Hoist them to module-level constants
(_TOOL_STATUS_* + _TOOL_STATUS_{ERROR,COMPLETED}_ALIASES) so the canonical
wire values and the alias->canonical mapping live in one place. Emitted
values are unchanged.
_messages_to_openviking_batch's pre-scan already parses and caches each
tool call's arguments into tool_calls_by_id. The pending-tool-call branch
re-parsed them via _tool_call_input(), a second parse and a second source
of truth. Reuse the cached tool_input when the id was cached (non-empty),
falling back to a parse only for the uncached empty-id case so arguments
are never dropped. No behavior change.
_OPENVIKING_RECALL_TOOL_NAMES hardcoded the three read-tool names as string
literals, which can silently desync from the *_SCHEMA["name"] constants on a
rename (the same drift the adjacent _CATEGORY_SUBDIR_MAP comment warns about).
Derive the set from SEARCH/READ/BROWSE_SCHEMA["name"] instead. Write tools
(viking_remember / viking_add_resource) remain intentionally excluded. Set
contents are unchanged.
Two follow-up fixes on top of the cherry-picked structured-sync work:
- _messages_to_openviking_batch only added a recall tool result's id to
skipped_tool_ids when the id was non-empty. An empty tool_call_id (which
the canonical transcript can carry; agent_runtime_helpers defaults it to
"") poisoned the skip set with "", silently dropping any *other* tool
result that also lacked an id. Move the recall-skip add inside the
existing `if tool_id:` guard. Adds a regression test (mutation-checked:
fails on pre-fix code, passes after).
- _sync_trace_enabled() open-coded the canonical truthy-env check; reuse
utils.env_var_enabled (byte-identical {1,true,yes,on} semantics).
Follow-up cleanup on the OpenViking setup path merged in #48262:
- _write_ovcli_config now uses utils.atomic_json_write(path, data, mode=0o600)
instead of the local _precreate_secret_file + write_text + chmod sequence.
The shared helper (already used by honcho/mem0/supermemory/hindsight) writes
via temp-file + fchmod(0600) + fsync + os.replace, so the ovcli.conf is
written atomically (no half-written secret file on crash) and with no
chmod-after-write TOCTOU window. _precreate_secret_file stays for the .env
writer path.
- Remove dead _DEFAULT_ACCOUNT/_DEFAULT_USER constants (0 references; the
empty->'default' tenant fallback lives in the _VikingClient constructor).
Tests: tests/plugins/memory/test_openviking_provider.py + test_memory_setup.py
+ openviking_plugin/test_openviking.py -> 130 passed; ruff clean.
Resolves conflicts from the OpenViking churn that merged after #32445 was
opened (#48042/#47662 session-switch + write hardening, #47311/#47973):
- plugins/memory/openviking/__init__.py: keep both __init__ field groups
(the PR's _runtime_start_* alongside main's _prefetch_threads/_shutting_down).
- tests/plugins/memory/test_openviking_provider.py: keep BOTH the PR's new
setup-validation tests and main's session-switch/concurrency tests (disjoint
additions to the same region).
Two fixes layered while reconciling (contributor work otherwise preserved):
- Restore the merged tenant-header contract (#22414/#21232). The PR had changed
_VikingClient defaults to '' and made empty account/user OMIT the tenant
headers; main's contract is that empty falls back to 'default' and the
X-OpenViking-Account/User headers are ALWAYS sent (ROOT API keys need them).
Reverted the constructor to 'account or os.environ.get(..., "default")' and
updated the two PR tests that asserted the omit-when-empty behavior.
- Close a secret-file TOCTOU in the setup writers. _write_env_vars and
_write_ovcli_config wrote the api_key/root_api_key file and chmod 0600
AFTERWARD, leaving a world-readable window on newly-created files. Added
_precreate_secret_file() to create with 0600 before any secret bytes land.
Follow-up hardening on @ehz0ah / @harshitAgr's session-switch work (#28296):
- on_session_switch no longer runs the old-session writer-drain + pending-token
GET + commit POST inline on the caller's command thread. /new, /branch,
/resume, /undo call it synchronously, so a slow drain (up to 10s) or wedged
commit blocked the user-facing command — the same hazard #41945 fixed for
end-of-turn sync. State now rotates synchronously (cheap) and the old-session
commit is offloaded to a daemon finalizer (generalized _finalize_session_async).
- Guard the (_session_id, _turn_count) pair with _session_state_lock: sync_turn
runs on the memory-manager executor thread while the session hooks run on the
command thread, so the snapshot+reset vs increment was a cross-thread race.
- _session_needs_commit checks the committed-session guard BEFORE the
turn_count>0 shortcut, closing a double-commit window when a racing sync_turn
re-increments after commit+reset.
- Add a _shutting_down flag so deferred finalizers stop POSTing against a
torn-down client; track all prefetch threads in a set so invalidate/shutdown
join every one, not just the latest slot.
Tests: regression for the non-blocking switch (asserts the caller returns while
a slow drain is parked off-thread) and the committed-guard ordering; updated the
deferred-commit test to the unified finalizer contract.
sync_turn's bounded join could drop a still-alive previous worker by
replacing the single _sync_thread slot. The dropped worker kept POSTing
under the old sid but was no longer visible to on_session_end /
on_session_switch, so the commit could fire while orphaned writes were
still in flight — those writes landed past the commit boundary and were
never extracted.
Replace the single _sync_thread slot with _inflight_writers:
Dict[sid, Set[Thread]]. Writers self-register on spawn (sync_turn,
on_memory_write) and self-deregister on exit. The commit path drains
_drain_writers(sid, 10.0) and skips the commit if any writer for that
sid is still alive after the bounded budget.
Also trim inline review-rationale comments to short invariants per
reviewer style ask: "commit only after session writes drain" and
"drop prefetch results from older switch generations."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7537ee6f5b)
Three follow-ups from review on #28296:
1. Sync worker outliving the bounded join. Each sync_turn POST has
_TIMEOUT=30s and there are two per turn, but on_session_end and
on_session_switch only join for 10s. If the worker is still alive
after the join, committing the old session orphans the worker's
late writes past the commit boundary — they land in an already-
committed session and never get extracted. Both hooks now re-check
is_alive() after the join and skip the commit when the worker
hasn't drained.
2. on_memory_write late session_id capture. Same shape as the
pre-fix sync_turn: f-string for the post path read self._session_id
inside the worker, so a switch between thread spawn and post call
landed the memory note in the new session. Snapshot sid at call
time, same pattern as sync_turn.
3. Stale prefetch repopulating the new session. The pre-switch
drain+clear only protects against workers that finish before the
join completes; one finishing after the clear would write its
result into the new generation's slot. Added a monotonic
_prefetch_generation; workers capture it at spawn and refuse to
write if it has advanced.
Tests: existing in-flight-sync test updated to drain (it tested the
join-before-commit happy path); four new tests cover hung-writer skip
on end + switch, on_memory_write sid capture, and prefetch generation
gating. 177/177 memory tests pass.
(cherry picked from commit 3791a87dbe)
Two hardening fixes prompted by review on #28296:
1. sync_turn() now snapshots the target session id before spawning the
worker. The previous code read self._session_id inside the worker, so
a worker delayed past on_session_switch's bounded join could read the
rotated-in NEW id and write the OLD turn's messages into the wrong
session.
2. on_session_end() resets _turn_count to 0 after a successful commit,
making the old-session commit path idempotent with the new switch
hook. /new and compression call commit_memory_session() (which fires
on_session_end) immediately before on_session_switch; without this,
the old session would be committed twice. On commit failure we leave
_turn_count > 0 so on_session_switch retries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 2ea8d5c537)
OpenVikingMemoryProvider only overrides on_session_end and inherits the
base-class no-op for on_session_switch. When the agent rotates session_id
(via /new, /branch, /reset, /resume, or context compression), the
provider's cached _session_id stays at the value initialize() captured.
All subsequent sync_turn writes then land in the already-closed old
session, and on_session_end tries to commit it a second time — the new
session never accumulates messages and never triggers memory extraction.
The fix mirrors the pattern Hindsight uses (#17508):
1. Wait for any in-flight sync thread to drain under the OLD _session_id
before we mutate it, otherwise the commit below races the last
message write.
2. Commit the old session if it accumulated turns — same extraction
semantics as on_session_end. Skip if empty (nothing to extract).
3. Drain in-flight prefetch from the old session and clear its cached
result so the new session doesn't see stale recall.
4. Rotate _session_id to the new value and reset _turn_count.
Commit failures are swallowed (logged at WARN) so a flaky server can't
strand the provider on the old session forever — same posture as the
existing on_session_end commit.
(cherry picked from commit a1e7185e8a)
Generalizes #32663 (@ehz0ah). The slash-skill scaffolding pollution
affected every auto-syncing memory provider — mem0, hindsight, retaindb,
byterover, honcho, supermemory all store/embed the raw user turn, so a
/skill invocation poisoned their stores with the full skill body, not just
openviking.
- Lift the contributor's parser into agent/skill_commands.py as the canonical
extract_user_instruction_from_skill_message(), co-located with the message
builders so the markers can't drift.
- Strip once in MemoryManager.{prefetch_all,queue_prefetch_all,sync_all} —
fixes the whole provider fan-out, bare /skill turns are skipped entirely.
- OpenViking's _derive_openviking_user_text() now delegates to the shared
helper as defense-in-depth (no duplicated marker literals).
- Marker-drift regression now asserts against the canonical skill_commands
constants; add manager-level coverage proving every provider gets clean text.
Support linking, copying, and creating ovcli.conf during OpenViking memory setup.
Make setup cancellation write nothing and cover OpenViking/Hindsight picker cancellation paths.
'everyone collapses to your peer' read as a promise about all traffic.
pinUserPeer pins the user-side peer and is checked before userPeerAliases
(session.py:335), so a pin overrides every alias — including agent peers.
For a multi-agent operator that silently pools distinct agents onto one
peer, the opposite of intent.
Scopes the wording to 'every non-agent gateway user', notes the pin
overrides aliases, and points agent-mesh operators at pinUserPeer:false +
userPeerAliases instead. Same correction in the wizard menu/echo text,
the plugin README, and the website Honcho page.
Adds an observation_scopes config key (and HINDSIGHT_RETAIN_OBSERVATION_SCOPES
env var) so retained memories can opt into per_tag / all_combinations /
custom scoping instead of Hindsight's default combined pass.
Threaded through _build_retain_kwargs so all three retain paths honor it:
auto-retain and flush-on-switch already use aretain_batch; the tool retain
path is switched from aretain to aretain_batch (functionally equivalent,
aretain just wraps a single-item batch) since aretain doesn't accept the
observation_scopes parameter.
Drop pinPeerName from the key table (now a deprecated-alias note), and replace
the single/multi/hybrid 'deployment shapes' section with the gateway-gated
intent tree the wizard actually presents, including the [e] raw-edit hatch and
the un-pin pooling steer.
The single/multi/hybrid 'deployment shape' was a misnomer: these keys only
affect the gateway (the one entrypoint supplying a runtime user ID), and the
three preset names stamped a lossy taxonomy onto three orthogonal knobs while
hiding which keys got written.
Replace it with an intent-led tree gated on gateway detection:
- _gateway_platforms() lazily inspects the gateway config (best-effort, no
hard dependency); the step auto-skips when no platform is connected.
- 'who talks to this?' → just me / me+others (pooled?) / only others, deriving
pinUserPeer + userPeerAliases + runtimePeerPrefix and echoing the result.
- [e] drops to a raw-knob editor for power users.
- The single→multi orphan guard survives as a pooling steer.
The setup wizard wrote the legacy pinPeerName even though pinUserPeer is
the canonical key that outranks it in the resolver — so it had to scrub
the canonical key afterward to stop it winning. Write pinUserPeer directly
and migrate any legacy pinPeerName onto it on touch (setup load + clone),
which removes the precedence-fighting entirely.
Resolver still reads pinPeerName as a back-compat alias; that's deferred.
When Hermes runs in TUI mode, the gateway child process communicates with
the Node.js parent over a JSON-RPC protocol on stdin. Subprocess calls that
inherit this stdin fd can trigger a race condition where the child's stdin
read returns EOF, causing the gateway to exit cleanly (exit code 0) mid-tool-
execution.
This is the same root cause as issue #14036 (byterover plugin) and PR #39257
(SSH environment backend). This commit applies the fix — stdin=subprocess.DEVNULL
— to all 85 subprocess.run() and subprocess.Popen() calls that execute inside
the TUI gateway child process.
Scope: TUI-context code only (agent/, tools/, plugins/, tui_gateway/server.py).
CLI code (cli.py, hermes_cli/), tests, scripts, and gateway process management
are excluded — they don't run inside the TUI child and inherit the terminal's
stdin, not the JSON-RPC pipe.
85 call sites across 28 files. All files pass syntax check.
* fix(plugins): add thread-safe lazy-singleton helpers, fix honcho TOCTOU (#24759)
get_honcho_client() and fal's _load_fal_client() used unlocked
check-then-init: racing threads both ran the expensive build and the
loser's client (open connection) leaked.
Rather than one-off locks, add plugins/plugin_utils.py with two
reusable primitives every plugin author can drop in:
- lazy_singleton: decorator for zero-arg accessors
- SingletonSlot: manual slot for config-keyed accessors (first wins)
Both use double-checked locking; factory runs at most once; failed
builds aren't cached. honcho is the reference consumer; fal's sibling
TOCTOU gets a matching double-checked lock. Plugin dev guide documents
the pattern so future plugins don't reintroduce the race.
Closes#24759
* test(honcho): update reset test for SingletonSlot internals
test_reset_clears_singleton poked the removed _honcho_client module
global directly. Assert through the slot's public peek() surface
instead, matching the #24759 refactor.
hermes doctor and hermes honcho status warned 'Honcho config not found'
whenever ~/.honcho/config.json was absent, even though HONCHO_API_KEY in
.env resolves a working config via HonchoClientConfig.from_global_config()
-> from_env(). Both now check hcfg.api_key/base_url before warning.
Co-authored-by: oxngon <98992931+oxngon@users.noreply.github.com>
_build_memory_uri produced URIs of the form:
viking://user/{user}/memories/{subdir}/mem_{slug}.md
The /agent/{agent}/ segment was missing, causing every agent under
the same user to write into the same flat namespace. In multi-agent
deployments agents silently overwrite each other's memories and
vector retrieval cross-pollinates results.
self._agent was already populated correctly (from OPENVIKING_AGENT
env var, default 'hermes') and sent via X-OpenViking-Agent header —
it was simply not interpolated into the URI.
Fix: add the missing segment so URIs follow the documented shape:
viking://user/{user}/agent/{agent}/memories/{subdir}/mem_{slug}.md
Tests: 4 new regression tests in TestOpenVikingMemoryUriBuilder,
13/13 passed (9 existing + 4 new).
User-installed memory providers load under the synthetic
_hermes_user_memory.<name> package, but the loader never registered that
parent namespace in sys.modules (it only registers "plugins" and
"plugins.memory" for bundled providers). As a result any external provider
using a relative import failed to load:
from . import config
ModuleNotFoundError: No module named '_hermes_user_memory'
The same gap in discover_plugin_cli_commands() meant an external provider's
cli.py with a relative import could never be discovered, so the documented
"hermes <plugin>" CLI integration did not work for standalone plugins.
Register the synthetic parent namespace before loading user-installed
providers, mirror it for cli.py discovery (including the per-provider parent
package, without executing the plugin's __init__.py), and make
_load_provider_from_dir() reuse only modules actually loaded from disk so a
parent shell registered by CLI discovery is never mistaken for the loaded
provider.
Regressions cover: a flat provider with a sibling relative import, a provider
with its implementation in a nested subpackage (including a namespace
intermediate directory), cli.py discovery with a relative import, and
provider load after CLI discovery ran first.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>