Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by #17409.
1. Buffered turns silently lost when retain_every_n_turns > 1.
on_session_switch unconditionally cleared _session_turns without
flushing. Users who batched every N>1 turns and switched mid-batch
(/reset, /new, /resume, /branch, or context compression) had those
buffered turns disappear. Same data-loss class as the shutdown race,
different lifecycle event.
Note commit_memory_session() -> on_session_end() runs *before*
on_session_switch on /reset, but Hindsight doesn't implement
on_session_end so the buffer survives that step and dies at clear
time. /resume, /branch, and compression skip commit_memory_session
entirely so an on_session_end impl wouldn't help them anyway.
Fix: snapshot the old _session_id, _document_id, _parent_session_id,
_turn_index, and _session_turns; spawn one final retain that lands
under the OLD document_id; then rotate state. Metadata is built
synchronously against the old self._* so session_id / lineage tags
on the flushed item all reference the prior session consistently.
2. Stale _prefetch_result leaks across switch.
If queue_prefetch ran in the old session and the result hadn't been
consumed by prefetch() yet, on_session_switch left the cached recall
text in place. The next session's first prefetch() call would return
text mined from the prior session's bank/query.
Fix: join any in-flight _prefetch_thread (3s bounded — matches
shutdown()), then clear _prefetch_result under _prefetch_lock before
rotating session_id.
Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
- buffered turns flushed under OLD document_id with OLD lineage tags
- empty buffer => no spurious retain
- _prefetch_result cleared on switch
- in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
attrs the new flush path reads (_retain_source, _platform, _bank_id,
prefetch state, etc.) and stub _run_hindsight_operation so existing
switch-state assertions keep passing without network setup.
The plugin used to spawn one daemon thread per sync_turn() to do the
aretain_batch network write. On CLI exit, that pattern raced interpreter
shutdown — the last retain could reach aiohttp after asyncio's
"cannot schedule new futures" guard had fired, producing noisy logs and
silently losing the final unsaved turn:
WARNING ... Hindsight sync failed: cannot schedule new futures after
interpreter shutdown
ERROR asyncio: Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x...>
Switch to a single-writer model: each provider owns one long-lived
writer thread plus a queue. sync_turn() snapshots state and enqueues a
job; the writer drains sequentially. Once shutdown() is called:
- new sync_turn() / queue_prefetch() calls are dropped, not enqueued
- a sentinel wakes the writer so it finishes in-flight work
- shutdown joins the writer (10s) before nulling the client
Also register an idempotent atexit hook from the first sync_turn(), so
exit paths that don't go through MemoryManager.shutdown_all() (Ctrl-C,
abrupt exit) still get a chance to drain.
Tests: keep _sync_thread as a legacy alias to the writer, swap join()
calls to _retain_queue.join() (canonical wait-for-drain), add a new
TestShutdownRace suite covering single-writer reuse, post-shutdown drop,
queue draining, and shutdown idempotency.
HindsightEmbedded.close() delegates to its sync client.close(). When Hermes
created/used that client on the shared async loop, closing it from the main
thread raises 'attached to a different loop' before aiohttp releases the
session — so the ClientSession / TCPConnector leak past provider teardown.
Close the embedded inner async client on the shared loop first via
_run_sync(inner_client.aclose()), then let the wrapper's sync close()
do its daemon/UI bookkeeping.
Salvage of #14605: test placement rebased — appended TestShutdown class
after TestSharedEventLoopLifecycle (which landed on main after the PR was
written). Original author attribution preserved.
Adds an optional bank_id_template config that derives the bank name at
initialize() time from runtime context. Existing users with a static
bank_id keep the current behavior (template is empty by default).
Supported placeholders:
{profile} — active Hermes profile (agent_identity kwarg)
{workspace} — Hermes workspace (agent_workspace kwarg)
{platform} — cli, telegram, discord, etc.
{user} — platform user id (gateway sessions)
{session} — session id
Unsafe characters in placeholder values are sanitized, and empty
placeholders collapse cleanly (e.g. "hermes-{user}" with no user
becomes "hermes"). If the template renders empty, the static bank_id
is used as a fallback.
Common uses:
bank_id_template: hermes-{profile} # isolate per Hermes profile
bank_id_template: {workspace}-{profile} # workspace + profile scoping
bank_id_template: hermes-{user} # per-user banks for gateway
Reusing session_id as document_id caused data loss on /resume: when
the session is loaded again, _session_turns starts empty and the next
retain replaces the entire previously stored content.
Now each process lifecycle gets its own document_id formed as
{session_id}-{startup_timestamp}, so:
- Same session, same process: turns accumulate into one document (existing behavior)
- Resume (new process, same session): writes a new document, old one preserved
- Forks: child process gets its own document; parent's doc is untouched
Also adds session lineage tags so all processes for the same session
(or its parent) can still be filtered together via recall:
- session:<session_id> on every retain
- parent:<parent_session_id> when initialized with parent_session_id
Closes#6602
The existing test_local_embedded_setup_materializes_profile_env expected
exact equality on ~/.hermes/.env content; the new HINDSIGHT_TIMEOUT=120
line from the timeout feature now appears in that file. Append it to the
expected string so the test reflects the new post_setup output.
The module-global `_loop` / `_loop_thread` pair is shared across every
`HindsightMemoryProvider` instance in the process — the plugin loader
creates one provider per `AIAgent`, and the gateway creates one `AIAgent`
per concurrent chat session (Telegram/Discord/Slack/CLI).
`HindsightMemoryProvider.shutdown()` stopped the shared loop when any one
session ended. That stranded the aiohttp `ClientSession` and `TCPConnector`
owned by every sibling provider on a now-dead loop — they were never
reachable for close and surfaced as the `Unclosed client session` /
`Unclosed connector` warnings reported in #11923.
Fix: stop stopping the shared loop in `shutdown()`. Per-provider cleanup
still closes that provider's own client via `self._client.aclose()`. The
loop runs on a daemon thread and is reclaimed on process exit; keeping
it alive between provider shutdowns means sibling providers can drain
their own sessions cleanly.
Regression tests in `tests/plugins/memory/test_hindsight_provider.py`
(`TestSharedEventLoopLifecycle`):
- `test_shutdown_does_not_stop_shared_event_loop` — two providers share
the loop; shutting down one leaves the loop live for the other. This
test reproduces the #11923 leak on `main` and passes with the fix.
- `test_client_aclose_called_on_cloud_mode_shutdown` — each provider's
own aiohttp session is still closed via `aclose()`.
Fixes#11923.
- Add configurable retain_tags / retain_source / retain_user_prefix /
retain_assistant_prefix knobs for native Hindsight.
- Thread gateway session identity (user_name, chat_id, chat_name,
chat_type, thread_id) through AIAgent and MemoryManager into
MemoryProvider.initialize kwargs so providers can scope and tag
retained memories.
- Hindsight attaches the new identity fields as retain metadata,
merges per-call tool tags with configured default tags, and uses
the configurable transcript labels for auto-retained turns.
Co-authored-by: Abner <abner.the.foreman@agentmail.to>
Port missing features from the hindsight-hermes external integration
package into the native plugin. Only touches plugin files — no core
changes.
Features:
- Tags on retain/recall (tags, recall_tags, recall_tags_match)
- Recall config (recall_max_tokens, recall_max_input_chars, recall_types,
recall_prompt_preamble)
- Retain controls (retain_every_n_turns, auto_retain, auto_recall,
retain_async via aretain_batch, retain_context)
- Bank config via Banks API (bank_mission, bank_retain_mission)
- Structured JSON retain with per-message timestamps
- Full session accumulation with document_id for dedup
- Custom post_setup() wizard with curses picker
- Mode-aware dep install (hindsight-client for cloud, hindsight-all for local)
- local_external mode and openai_compatible LLM provider
- OpenRouter support with auto base URL
- Auto-upgrade of hindsight-client to >=0.4.22 on session start
- Comprehensive debug logging across all operations
- 46 unit tests
- Updated README and website docs
Based on PR #5413 spec by MaheshtheDev (Mahesh Sanikommu).
Changes:
- Add search_mode config (hybrid/memories/documents) passed to SDK
- Add {identity} template support in container_tag for profile-scoped containers
- Add SUPERMEMORY_CONTAINER_TAG env var override (priority over config)
- Add multi-container mode: enable_custom_container_tags, custom_containers,
custom_container_instructions in supermemory.json
- Dynamic tool schemas when multi-container enabled (optional container_tag param)
- Whitelist validation for custom container tags in tool calls
- Simplify get_config_schema() to only prompt for API key during setup
- Defer container_tag sanitization to initialize() (after template resolution)
- Add custom_id support to documents.add calls
- Update README with multi-container docs, search_mode, identity template,
support links (Discord, email)
- Update memory-providers.md with new features and multi-container example
- Update memory-provider-plugin.md with minimal vs full schema guidance
- Add 12 new tests covering identity template, search_mode, multi-container,
config schema, and env var override
Consolidated salvage from PRs #5301 (qaqcvc), #5339 (lance0),
#5058 and #5098 (maymuneth).
Mem0 API v2 compatibility (#5301):
- All reads use filters={user_id: ...} instead of bare user_id= kwarg
- All writes use filters with user_id + agent_id for attribution
- Response unwrapping for v2 dict format {results: [...]}
- Split _read_filters() vs _write_filters() — reads are user-scoped
only for cross-session recall, writes include agent_id
- Preserved 'hermes-user' default (no breaking change for existing users)
- Omitted run_id scoping from #5301 — cross-session memory is Mem0's
core value, session-scoping reads would defeat that purpose
Memory prefetch context fencing (#5339):
- Wraps prefetched memory in <memory-context> fenced blocks with system
note marking content as recalled context, NOT user input
- Sanitizes provider output to strip fence-escape sequences, preventing
injection where memory content breaks out of the fence
- API-call-time only — never persisted to session history
Secret redaction (#5058, #5098):
- Added prefix patterns for Groq (gsk_), Matrix (syt_), RetainDB
(retaindb_), Hindsight (hsk-), Mem0 (mem0_), ByteRover (brv_)