Follow-up to the canonical-identity session-key fix: pull the
JID/LID normalize/expand/canonical helpers into gateway/whatsapp_identity.py
instead of living in two places. gateway/session.py (session-key build) and
gateway/run.py (authorisation allowlist) now both import from the shared
module, so the two resolution paths can't drift apart.
Also switches the auth path from module-level _hermes_home (cached at
import time) to dynamic get_hermes_home() lookup, which matches the
session-key path and correctly reflects HERMES_HOME env overrides. The
lone test that monkeypatched gateway.run._hermes_home for the WhatsApp
auth path is updated to set HERMES_HOME env var instead; all other
tests that monkeypatch _hermes_home for unrelated paths (update,
restart drain, shutdown marker, etc.) still work — the module-level
_hermes_home is untouched.
Hermes' WhatsApp bridge routinely surfaces the same person under either
a phone-format JID (60123456789@s.whatsapp.net) or a LID (…@lid),
and may flip between the two for a single human within the same
conversation. Before this change, build_session_key used the raw
identifier verbatim, so the bridge reshuffling an alias form produced
two distinct session keys for the same person — in two places:
1. DM chat_id — a user's DM sessions split in half, transcripts and
per-sender state diverge.
2. Group participant_id (with group_sessions_per_user enabled) — a
member's per-user session inside a group splits in half for the
same reason.
Add a canonicalizer that walks the bridge's lid-mapping-*.json files
and picks the shortest/numeric-preferred alias as the stable identity.
build_session_key now routes both the DM chat_id and the group
participant_id through this helper when the platform is WhatsApp.
All other platforms and chat types are untouched.
Expose canonical_whatsapp_identifier and normalize_whatsapp_identifier
as public helpers. Plugins that need per-sender behaviour (role-based
routing, per-contact authorization, policy gating) need the same
identity resolution Hermes uses internally; without a public helper,
each plugin would have to re-implement the walker against the bridge's
internal on-disk format. Keeping this alongside build_session_key
makes it authoritative and one refactor away if the bridge ever
changes shape.
_expand_whatsapp_aliases stays private — it's an implementation detail
of how the mapping files are walked, not a contract callers should
depend on.
Feishu's open_id is app-scoped (same user gets different open_ids per
bot app), not a canonical identity. Functionally correct for single-bot
mode but semantically misleading.
- Add comprehensive Feishu identity model documentation to module docstring
- Prefer user_id (tenant-scoped) over open_id (app-scoped) in
_resolve_sender_profile when both are available
- Document bot_open_id usage for @mention matching
- Update user_id_alt comment in SessionSource to be platform-generic
Ref: closes analysis from PR #8388 (closed as over-scoped)
Generalize shared multi-user session handling so non-thread group sessions
(group_sessions_per_user=False) get the same treatment as shared threads:
inbound messages are prefixed with [sender name], and the session prompt
shows a multi-user note instead of pinning a single **User:** line into
the cached system prompt.
Before: build_session_key already treated these as shared sessions, but
_prepare_inbound_message_text and build_session_context_prompt only
recognized shared threads — creating cross-user attribution drift and
prompt-cache contamination in shared groups.
- Add is_shared_multi_user_session() helper alongside build_session_key()
so both the session key and the multi-user branches are driven by the
same rules (DMs never shared, threads shared unless
thread_sessions_per_user, groups shared unless group_sessions_per_user).
- Add shared_multi_user_session field to SessionContext, populated by
build_session_context() from config.
- Use context.shared_multi_user_session in the prompt builder (label is
'Multi-user thread' when a thread is present, 'Multi-user session'
otherwise).
- Use the helper in _prepare_inbound_message_text so non-thread shared
groups also get [sender] prefixes.
Default behavior unchanged: DMs stay single-user, groups with
group_sessions_per_user=True still show the user normally, shared threads
keep their existing multi-user behavior.
Tests (65 passed):
- tests/gateway/test_session.py: new shared non-thread group prompt case.
- tests/gateway/test_shared_group_sender_prefix.py: inbound preprocessing
for shared non-thread groups and default groups.
SessionStore.prune_old_entries was calling
self._has_active_processes_fn(entry.session_id) but the callback wired
up in gateway/run.py is process_registry.has_active_for_session, which
compares against session_key, not session_id. Every other caller in
session.py (_is_session_expired, _should_reset) already passes
session_key, so prune was the only outlier — and because session_id and
session_key live in different namespaces, the guard never fired.
Result in production: sessions with live background processes (queued
cron output, detached agents, long-running Bash) were pruned out of
_entries despite the docstring promising they'd be preserved. When the
process finished and tried to deliver output, the session_key to
session_id mapping was gone and the work was effectively orphaned.
Also update the existing test_prune_skips_entries_with_active_processes,
which was checking the wrong interface (its mock callback took session_id
so it agreed with the buggy implementation). The test now uses a
session_key-based mock, matching the production callback's real contract,
and a new regression guard test pins the behaviour.
Swallowed exceptions inside the prune loop now log at debug level instead
of silently disappearing.
The shutdown banner promised "send any message after restart to resume
where you left off" but the code did the opposite: a drain-timeout
restart skipped the .clean_shutdown marker, which made the next startup
call suspend_recently_active(), which marked the session suspended,
which made get_or_create_session() spawn a fresh session_id with a
'Session automatically reset. Use /resume...' notice — contradicting
the banner.
Introduce a resume_pending state on SessionEntry that is distinct from
suspended. Drain-timeout shutdown flags active sessions resume_pending
instead of letting startup-wide suspension destroy them. The next
message on the same session_key preserves the session_id, reloads the
transcript, and the agent receives a reason-aware restart-resume
system note that subsumes the existing tool-tail auto-continue note
(PR #9934).
Terminal escalation still flows through the existing
.restart_failure_counts stuck-loop counter (PR #7536, threshold 3) —
no parallel counter on SessionEntry. suspended still wins over
resume_pending in get_or_create_session() so genuinely stuck sessions
converge to a clean slate.
Spec: PR #11852 (BrennerSpear). Implementation follows the spec with
the approved correction (reuse .restart_failure_counts rather than
adding a resume_attempts field).
Changes:
- gateway/session.py: SessionEntry.resume_pending/resume_reason/
last_resume_marked_at + to_dict/from_dict; SessionStore
.mark_resume_pending()/clear_resume_pending(); get_or_create_session()
returns existing entry when resume_pending (suspended still wins);
suspend_recently_active() skips resume_pending entries.
- gateway/run.py: _stop_impl() drain-timeout branch marks active
sessions resume_pending before _interrupt_running_agents();
_run_agent() injects reason-aware restart-resume system note that
subsumes the tool-tail case; successful-turn cleanup also clears
resume_pending next to _clear_restart_failure_count();
_notify_active_sessions_of_shutdown() softens the restart banner to
'I'll try to resume where you left off' (honest about stuck-loop
escalation).
- tests/gateway/test_restart_resume_pending.py: 29 new tests covering
SessionEntry roundtrip, mark/clear helpers, get_or_create_session
precedence (suspended > resume_pending), suspend_recently_active
skip, drain-timeout mark reason (restart vs shutdown), system-note
injection decision tree (including tool-tail subsumption), banner
wording, and stuck-loop escalation override.
SessionStore._entries grew unbounded. Every unique
(platform, chat_id, thread_id, user_id) tuple ever seen was kept in
RAM and rewritten to sessions.json on every message. A Discord bot
in 100 servers x 100 channels x ~100 rotating users accumulates on
the order of 10^5 entries after a few months; each sessions.json
write becomes an O(n) fsync. Nothing trimmed this — there was no
TTL, no cap, no eviction path.
Changes
-------
* SessionStore.prune_old_entries(max_age_days) — drops entries whose
updated_at is older than the cutoff. Preserves:
- suspended entries (user paused them via /stop for later resume)
- entries with an active background process attached
Pruning is functionally identical to a natural reset-policy expiry:
SQLite transcript stays, session_key -> session_id mapping dropped,
returning user gets a fresh session.
* GatewayConfig.session_store_max_age_days (default 90; 0 disables).
Serialized in to_dict/from_dict, coerced from bad types / negatives
to safe defaults. No migration needed — missing field -> 90 days.
* _session_expiry_watcher calls prune_old_entries once per hour
(first tick is immediate). Uses the existing watcher loop so no
new background task is created.
Why not more aggressive
-----------------------
90 days is long enough that legitimate long-idle users (seasonal,
vacation, etc.) aren't surprised — pruning just means they get a
fresh session on return, same outcome they'd get from any other
reset-policy trigger. Admins can lower it via config; 0 disables.
Tests
-----
tests/gateway/test_session_store_prune.py — 17 cases covering:
* entry age based on updated_at, not created_at
* max_age_days=0 disables; negative coerces to 0
* suspended + active-process entries are skipped
* _save fires iff something was removed
* disk JSON reflects post-prune state
* thread safety against concurrent readers
* config field roundtrips + graceful fallback on bad values
* watcher gate logic (first tick prunes, subsequent within 1h don't)
119 broader session/gateway tests remain green.
Fixes#4466.
Root cause: two sequential authorization gates both independently rejected
bot messages, making DISCORD_ALLOW_BOTS completely ineffective.
Gate 1 — `discord.py` `on_message`:
_is_allowed_user ran BEFORE the bot filter, so bot senders were dropped
before the DISCORD_ALLOW_BOTS policy was ever evaluated.
Gate 2 — `gateway/run.py` _is_user_authorized:
The gateway-level allowlist check rejected bot IDs with 'Unauthorized
user: <bot_id>' even if they passed Gate 1.
Fix:
gateway/platforms/discord.py — reorder on_message so DISCORD_ALLOW_BOTS
runs BEFORE _is_allowed_user. Bots permitted by the filter skip the
user allowlist; non-bots are still checked.
gateway/session.py — add is_bot: bool = False to SessionSource so the
gateway layer can distinguish bot senders.
gateway/platforms/base.py — expose is_bot parameter in build_source.
gateway/platforms/discord.py _handle_message — set is_bot=True when
building the SessionSource for bot authors.
gateway/run.py _is_user_authorized — when source.is_bot is True AND
DISCORD_ALLOW_BOTS is 'mentions' or 'all', return True early. Platform
filter already validated the message at on_message; don't re-reject.
Behavior matrix:
| Config | Before | After |
| DISCORD_ALLOW_BOTS=none (default) | Blocked | Blocked |
| DISCORD_ALLOW_BOTS=all | Blocked | Allowed |
| DISCORD_ALLOW_BOTS=mentions + @mention | Blocked | Allowed |
| DISCORD_ALLOW_BOTS=mentions, no mention | Blocked | Blocked |
| Human in DISCORD_ALLOWED_USERS | Allowed | Allowed |
| Human NOT in DISCORD_ALLOWED_USERS | Blocked | Blocked |
Co-authored-by: Hermes Maintainer <hermes@nousresearch.com>
Fixes#7952 — Matrix E2EE completely broken after mautrix migration.
- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
persists reliably across restarts without fragile serialization.
- Add handle_sync() call on initial sync response so to-device events
(queued Megolm key shares) are dispatched to OlmMachine instead of
being silently dropped.
- Add _verify_device_keys_on_server() after loading crypto state.
Detects missing keys (re-uploads), stale keys from migration
(attempts re-upload), and corrupted state (refuses E2EE).
- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
mautrix crypto's StateStore interface (is_encrypted,
get_encryption_info, find_shared_rooms).
- Remove redundant share_keys() call from sync loop — OlmMachine
already handles this via DEVICE_OTK_COUNT event handler.
- Fix datetime vs float TypeError in session.py suspend_recently_active()
that crashed gateway startup.
- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.
- Update test mocks for PgCryptoStore/Database and add query_keys mock
for key verification. 174 tests pass.
- Add E2EE upgrade/migration docs to Matrix user guide.
Cherry-picked from PR #7747 with follow-up fixes:
- Narrowed suspend_all_active() to suspend_recently_active() — only
suspends sessions updated within the last 2 minutes (likely in-flight),
not all sessions which would unnecessarily reset idle users
- /stop with no running agent no longer suspends the session; only
actual force-stops mark the session for reset
Automated dead code audit using vulture + coverage.py + ast-grep intersection,
confirmed by Opus deep verification pass. Every symbol verified to have zero
production callers (test imports excluded from reachability analysis).
Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines
of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py,
hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py).
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
The session store was copying the ENTIRE parent DM transcript into new
thread sessions. This caused unrelated conversations to bleed across
threads in Slack DMs.
The Slack adapter already handles thread context correctly via
_fetch_thread_context() (conversations.replies API), which fetches
only the actual thread messages. The session-level seeding was both
redundant and harmful.
No other platform (Telegram, Discord) uses DM threads, so the seeding
code path was only triggered by Slack — where it conflicted with the
adapter-level context.
Tests updated to assert thread isolation: all thread sessions start
empty, platform adapters are responsible for injecting thread context.
Salvage of PR #5868 (jarvisxyz). Reported by norbert on Discord.
Threads (Telegram forum topics, Discord threads, Slack threads) now default
to shared sessions where all participants see the same conversation. This is
the expected UX for threaded conversations where multiple users @mention the
bot and interact collaboratively.
Changes:
- build_session_key(): when thread_id is present, user_id is no longer
appended to the session key (threads are shared by default)
- New config: thread_sessions_per_user (default: false) — opt-in to restore
per-user isolation in threads if needed
- Sender attribution: messages in shared threads are prefixed with
[sender name] so the agent can tell participants apart
- System prompt: shared threads show 'Multi-user thread' note instead of
a per-turn User line (avoids busting prompt cache)
- Wired through all callers: gateway/run.py, base.py, telegram.py, feishu.py
- Regular group messages (no thread) remain per-user isolated (unchanged)
- DM threads are unaffected (they have their own keying logic)
Closes community request from demontut_ re: thread-based shared sessions.
* fix: force-close TCP sockets on client cleanup, detect and recover dead connections
When a provider drops connections mid-stream (e.g. OpenRouter outage),
httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These
zombie connections accumulate and can prevent recovery without restarting.
Changes:
- _force_close_tcp_sockets: walks the httpx connection pool and issues
socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket
when a client is closed, preventing CLOSE-WAIT accumulation
- _cleanup_dead_connections: probes the primary client's pool for dead
sockets (recv MSG_PEEK), rebuilds the client if any are found
- Pre-turn health check at the start of each run_conversation call that
auto-recovers with a user-facing status message
- Primary client rebuild after stale stream detection to purge pool
- User-facing messages on streaming connection failures:
"Connection to provider dropped — Reconnecting (attempt 2/3)"
"Connection failed after 3 attempts — try again in a moment"
Made-with: Cursor
* fix: pool entry missing base_url for openrouter, clean error messages
- _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback
when pool entry has no runtime_base_url (pool entries from auth.json
credential_pool often omit base_url)
- Replace Rich console.print for auth errors with plain print() to
prevent ANSI escape code mangling through prompt_toolkit's stdout patch
- Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT
accumulation after provider outages
- Pre-turn dead connection detection with auto-recovery and user message
- Primary client rebuild after stale stream detection
- User-facing status messages on streaming connection failures/retries
Made-with: Cursor
* fix(gateway): persist memory flush state to prevent redundant re-flushes on restart
The _session_expiry_watcher tracked flushed sessions in an in-memory set
(_pre_flushed_sessions) that was lost on gateway restart. Expired sessions
remained in sessions.json and were re-discovered every restart, causing
redundant AIAgent runs that burned API credits and blocked the event loop.
Fix: Add a memory_flushed boolean field to SessionEntry, persisted in
sessions.json. The watcher sets it after a successful flush. On restart,
the flag survives and the watcher skips already-flushed sessions.
- Add memory_flushed field to SessionEntry with to_dict/from_dict support
- Old sessions.json entries without the field default to False (backward compat)
- Remove the ephemeral _pre_flushed_sessions set from SessionStore
- Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior
The gateway's update_session() used += for token counts, but the cached
agent's session_prompt_tokens / session_completion_tokens are cumulative
totals that grow across messages. Each update_session call re-added the
running total, inflating usage stats with every message (1.7x after 3
messages, worse over longer conversations).
Fix: change += to = for in-memory entry fields, add set_token_counts()
to SessionDB that uses direct assignment instead of SQL increment, and
switch the gateway to call it.
CLI mode continues using update_token_counts() (increment) since it
tracks per-API-call deltas — that path is unchanged.
Based on analysis from PR #3222 by @zaycruz (closed).
Co-authored-by: zaycruz <zay@users.noreply.github.com>
The cached agent accumulates session_input_tokens across messages, so
run_conversation() returns cumulative totals. But update_session() used
+= (increment), double-counting on every message after the first.
- session.py: change in-memory entry updates from += to = (direct
assignment for cumulative values)
- hermes_state.py: add absolute=True flag to update_token_counts()
that uses SET column = ? instead of SET column = column + ?
- session.py: pass absolute=True to the DB call
CLI path is unchanged — it passes per-API-call deltas directly to
update_token_counts() with the default absolute=False (increment).
Reported by @zaycruz in #3222. Closes#3222.
rewrite_transcript (used by /retry, /undo, /compress) was calling
append_message without reasoning, reasoning_details, or
codex_reasoning_items — permanently dropping them from SQLite.
Co-authored-by: alireza78a <alireza78.crypto@gmail.com>
* fix(session-db): survive CLI/gateway concurrent write contention
Closes#3139
Three layered fixes for the scenario where CLI and gateway write to
state.db concurrently, causing create_session() to fail with
'database is locked' and permanently disabling session_search on the
gateway side.
1. Increase SQLite connection timeout: 10s -> 30s
hermes_state.py: longer window for the WAL writer to finish a batch
flush before the other process gives up entirely.
2. INSERT OR IGNORE in create_session
hermes_state.py: prevents IntegrityError on duplicate session IDs
(e.g. gateway restarts while CLI session is still alive).
3. Don't null out _session_db on create_session failure (main fix)
run_agent.py: a transient lock at agent startup must not permanently
disable session_search for the lifetime of that agent instance.
_session_db now stays alive so subsequent flushes and searches work
once the lock clears.
4. New ensure_session() helper + call it during flush
hermes_state.py: INSERT OR IGNORE for a minimal session row.
run_agent.py _flush_messages_to_session_db: calls ensure_session()
before appending messages, so the FK constraint is satisfied even
when create_session() failed at startup. No-op when the row exists.
* fix(state): release lock between context queries in search_messages
The context-window queries (one per FTS5 match) were running inside
the same lock acquisition as the primary FTS5 query, holding the lock
for O(N) sequential SQLite round-trips. Move per-match context fetches
outside the outer lock block so each acquires the lock independently,
keeping critical sections short and allowing other threads to interleave.
* fix(session): prefer longer source in load_transcript to prevent legacy truncation
When a long-lived session pre-dates SQLite storage (e.g. sessions
created before the DB layer was introduced, or after a clean
deployment that reset the DB), _flush_messages_to_session_db only
writes the *new* messages from the current turn to SQLite — it skips
messages already present in conversation_history, assuming they are
already persisted.
That assumption fails for legacy JSONL-only sessions:
Turn N (first after DB migration):
load_transcript(id) → SQLite: 0 → falls back to JSONL: 994 ✓
_flush_messages_to_session_db: skip first 994, write 2 new → SQLite: 2
Turn N+1:
load_transcript(id) → SQLite: 2 → returns immediately ✗
Agent sees 2 messages of history instead of 996
The same pattern causes the reported symptom: session JSON truncated
to 4 messages (_save_session_log writes agent.messages which only has
2 history + 2 new = 4).
Fix: always load both sources and return whichever is longer. For a
fully-migrated session SQLite will always be ≥ JSONL, so there is no
regression. For a legacy session that hasn't been bootstrapped yet,
JSONL wins and the full history is restored.
Closes#3212
* test: add load_transcript source preference tests for #3212
Covers: JSONL longer returns JSONL, SQLite longer returns SQLite,
SQLite empty falls back to JSONL, both empty returns empty, equal
length prefers SQLite (richer reasoning fields).
---------
Co-authored-by: Mibayy <mibayy@hermes.ai>
Co-authored-by: kewe63 <kewe.3217@gmail.com>
Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
SessionStore._entries was read and mutated without synchronisation,
causing race conditions when multiple platforms (Telegram + Discord)
received messages concurrently on the same gateway process. Two threads
could simultaneously pass the session_key check and create duplicate
sessions for the same user, splitting conversation history.
- Added threading.Lock to protect all _entries / _loaded mutations
- Split _ensure_loaded() into public wrapper + internal _ensure_loaded_locked()
- SQLite I/O is performed outside the lock to avoid blocking during
slow disk operations
- _save() stays inside the lock since it reads _entries for serialization
Cherry-picked from PR #3012 by Kewe63. Removed unrelated changes
(delivery.py case-sensitivity, hermes_state.py schema tracking) and
stripped the UTC timezone switch to keep the change focused on threading.
Co-authored-by: Kewe63 <Kewe63@users.noreply.github.com>
feat: persist reasoning across gateway session turns (schema v6)
Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.
When a session expires (daily schedule or idle timeout) and is
automatically reset, send a notification to the user explaining
what happened:
◐ Session automatically reset (inactive for 24h).
Conversation history cleared.
Use /resume to browse and restore a previous session.
Adjust reset timing in config.yaml under session_reset.
Notifications are suppressed when:
- The expired session had no activity (no tokens used)
- The platform is excluded (api_server, webhook by default)
- notify: false in config
Changes:
- session.py: _should_reset() returns reason string ('idle'/'daily')
instead of bool; SessionEntry gains auto_reset_reason and
reset_had_activity fields; old entry's total_tokens checked
- config.py: SessionResetPolicy gains notify (bool, default: true)
and notify_exclude_platforms (default: api_server, webhook)
- run.py: sends notification via adapter.send() before processing
the user's message, with activity + platform checks
- 13 new tests
Config (config.yaml):
session_reset:
notify: true
notify_exclude_platforms: [api_server, webhook]
Wrap json.loads() in load_transcript() with try/except JSONDecodeError
so that partial JSONL lines (from mid-write crashes like OOM/SIGKILL)
are skipped with a warning instead of crashing the entire transcript
load. The rest of the history loads fine.
Adds a logger.warning with the session ID and truncated corrupt line
content for debugging visibility.
Salvaged from PR #1193 by alireza78a.
Closes#1193
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.
Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.
Include participant identifiers in non-DM session keys when available so group and channel conversations no longer share one transcript across every active user in the chat.
Gateway sessions end up with model=NULL because the session row is
created before AIAgent is constructed. After the agent responds,
update_session() writes token counts but never fills in the model.
Thread agent.model through _run_agent()'s return dict into
update_session() → update_token_counts(). The SQL uses
COALESCE(model, ?) so it only fills NULL rows — never overwrites
a model already set at creation time (e.g. CLI sessions).
If the agent falls back to a different provider, agent.model is
updated in-place by _try_activate_fallback(), so the recorded value
reflects whichever model actually produced the response.
Fixes#987
Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB
enum) per maintainer request — Nous is building their own official chat UI.
Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent
indefinite hang when audio device stalls (consistent with play_beep()).
Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability
checks instead of module-level imports that trigger heavy native library
loading (CUDA/cuDNN) at import time.
Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs
to Telegram send_voice() so all adapters accept metadata uniformly.
Fix 4: Make session loading resilient to removed platform enum values — skip
entries with unknown platforms instead of crashing the entire gateway.
Tell the agent what it CANNOT do on Slack and Discord — no searching
channel history, no pinning messages, no managing channels/roles.
Prevents the agent from hallucinating capabilities it doesn't have
and promising actions it can't deliver.
Addresses user feedback: agent says 'I'll search your Slack history'
then goes silent because no Slack-specific tools exist.
Three bugs fixed in the Slack adapter:
1. Tool progress messages leaked to main channel instead of thread.
Root cause: metadata key mismatch — gateway uses 'thread_id' but
Slack adapter checked for 'thread_ts'. Added _resolve_thread_ts()
helper that checks both keys with correct precedence.
2. Bot responses could escape threads for replies.
Root cause: reply_to was set to the child message's ts, but Slack
API needs the parent message's ts for thread_ts. Now metadata
thread_id (always the parent ts) takes priority over reply_to.
3. All Slack DMs shared one session key ('agent:main:slack:dm'),
so a long-running task blocked all other DM conversations.
Fix: DMs with thread_id now get per-thread session keys. Top-level
DMs still share one session for conversation continuity.
Additional fix: All Slack media methods (send_image, send_voice,
send_video, send_document, send_image_file) now accept metadata
parameter for thread routing. Previously they only accepted reply_to,
which caused media to silently fail to post in threads.
Session key behavior after this change:
- Slack channel @mention: creates thread, thread = session
- Slack thread reply: stays in thread, same session
- Slack DM (top-level): one continuous session
- Slack DM (threaded): per-thread session
- Other platforms: unchanged
Follow-up to 58dbd81 — ensures smooth transition for existing users:
- Backward compat: old session files without last_prompt_tokens
default to 0 via data.get('last_prompt_tokens', 0)
- /compress, /undo, /retry: reset last_prompt_tokens to 0 after
rewriting transcripts (stale token counts would under-report)
- Auto-compression hygiene: reset last_prompt_tokens after rewriting
- update_session: use None sentinel (not 0) as default so callers
can explicitly reset to 0 while normal calls don't clobber
- 6 new tests covering: default value, serialization roundtrip,
old-format migration, set/reset/no-change semantics
- /reset: new SessionEntry naturally gets last_prompt_tokens=0
2942 tests pass.
Root cause of aggressive gateway compression vs CLI:
- CLI: single AIAgent persists across conversation, uses real API-reported
prompt_tokens for compression decisions — accurate
- Gateway: each message creates fresh AIAgent, token count discarded after,
next message pre-check falls back to rough str(msg)//4 estimate which
overestimates 30-50% on tool-heavy conversations
Fix:
- Add last_prompt_tokens field to SessionEntry — stores the actual
API-reported prompt token count from the most recent agent turn
- After run_conversation(), extract context_compressor.last_prompt_tokens
and persist it via update_session()
- Gateway pre-check now uses stored actual tokens when available (exact
same accuracy as CLI), falling back to rough estimate with 1.4x safety
factor only for the first message of a session
This makes gateway compression behave identically to CLI compression
for all turns after the first. Reported by TigerHix.
Three separate code paths all wrote to the same SQLite state.db with
no deduplication, inflating session transcripts by 3-4x:
1. _log_msg_to_db() — wrote each message individually after append
2. _flush_messages_to_session_db() — re-wrote ALL new messages at
every _persist_session() call (~18 exit points), with no tracking
of what was already written
3. gateway append_to_transcript() — wrote everything a third time
after the agent returned
Since load_transcript() prefers SQLite over JSONL, the inflated data
was loaded on every session resume, causing proportional token waste.
Fix:
- Remove _log_msg_to_db() and all 16 call sites (redundant with flush)
- Add _last_flushed_db_idx tracking in _flush_messages_to_session_db()
so repeated _persist_session() calls only write truly new messages
- Reset flush cursor on compression (new session ID)
- Add skip_db parameter to SessionStore.append_to_transcript() so the
gateway skips SQLite writes when the agent already persisted them
- Gateway now passes skip_db=True for agent-managed messages, still
writes to JSONL as backup
Verified: a 12-message CLI session with tool calls produces exactly
12 SQLite rows with zero duplicates (previously would be 36-48).
Tests: 9 new tests covering flush deduplication, skip_db behavior,
compression reset, and initialization. Full suite passes (2869 tests).
Authored by alireza78a.
Replaces open('w') + json.dump with tempfile.mkstemp + os.replace atomic
write pattern, matching the existing pattern in cron/jobs.py. Prevents
silent session loss if the process crashes or gets OOM-killed mid-write.
Resolved conflict: kept encoding='utf-8' from HEAD in the new fdopen call.
Authored by @ch3ronsa. Fixes UnicodeEncodeError/UnicodeDecodeError on
Windows with non-UTF-8 system locales (e.g. Turkish cp1254).
Adds encoding='utf-8' to 10 open() calls across gateway/session.py,
gateway/channel_directory.py, and gateway/mirror.py.