Add client-side support for Telegram's Managed Bots feature, enabling
zero-copy-paste bot creation during hermes setup:
- New module: hermes_cli/telegram_managed_bot.py
- QR code terminal rendering via qrcode library
- Deep link generation (t.me/newbot/{manager}/{username})
- Pairing protocol client (nonce registration + token polling)
- Full auto-setup orchestrator with animated progress
- Setup wizard (hermes_cli/setup.py)
- Telegram setup now offers Automatic vs Manual choice
- Automatic: scan QR → confirm in Telegram → token saved
- Falls back to manual if auto-setup fails or is declined
- Dependencies: qrcode>=8.0 (pure Python, no PIL needed)
Requires a Nous-hosted manager bot + pairing API (Cloudflare Worker)
to complete the flow. See linked issue for backend infrastructure spec.
* fix: show correct env var name in provider API key error (#9506)
The error message for missing provider API keys dynamically built
the env var name as PROVIDER_API_KEY (e.g. ALIBABA_API_KEY), but
some providers use different names (alibaba uses DASHSCOPE_API_KEY).
Users following the error message set the wrong variable.
Fix: look up the actual env var from PROVIDER_REGISTRY before
building the error. Falls back to the dynamic name if the registry
lookup fails.
Closes#9506
* fix: five HERMES_HOME profile-isolation leaks (#5947)
Bug A: Thread session_title from session_db to memory provider init kwargs
so honcho can derive chat-scoped session keys instead of falling back to
cwd-based naming that merges all gateway users into one session.
Bug B: Replace 14 hardcoded ~/.hermes/skills/ paths across 10 skill files
with HERMES_HOME-aware alternatives (${HERMES_HOME:-$HOME/.hermes} in
shell, os.environ.get('HERMES_HOME', ...) in Python).
Bug C: install.sh now respects HERMES_HOME env var and adds --hermes-home
flag. Previously --dir only set INSTALL_DIR while HERMES_HOME was always
hardcoded to $HOME/.hermes.
Bug D: Remove hardcoded ~/.hermes/honcho.json fallback in resolve_config_path().
Non-default profiles no longer silently inherit the default profile's honcho
config. Falls through to ~/.honcho/config.json (global) instead.
Bug E: Guard _edit_skill, _patch_skill, _delete_skill, _write_file, and
_remove_file against writing to skills found in external_dirs. Skills
outside the local SKILLS_DIR are now read-only from the agent's perspective.
Closes#5947
procps-ng 4.0.4 in Docker rejects BSD-style 'ps eww -ax' with a
'must set personality' error, causing find_gateway_pids() to return
empty and falsely report the gateway as not running.
Fix: replace 'ps eww -ax' with 'ps -A eww'. -A is the POSIX
equivalent of BSD -ax (select all processes), and the eww modifiers
(show environment + wide output) still work as BSD flags alongside
the POSIX -A flag. This preserves the HERMES_HOME= environment
visibility needed for profile-aware PID matching.
Closes#9723
When Nous returns a 429, the retry amplification chain burns up to 9
API requests per conversation turn (3 SDK retries × 3 Hermes retries),
each counting against RPH and deepening the rate limit. With multiple
concurrent sessions (cron + gateway + auxiliary), this creates a spiral
where retries keep the limit tapped indefinitely.
New module: agent/nous_rate_guard.py
- Shared file-based rate limit state (~/.hermes/rate_limits/nous.json)
- Parses reset time from x-ratelimit-reset-requests-1h, x-ratelimit-
reset-requests, retry-after headers, or error context
- Falls back to 5-minute default cooldown if no header data
- Atomic writes (tempfile + rename) for cross-process safety
- Auto-cleanup of expired state files
run_agent.py changes:
- Top-of-retry-loop guard: when another session already recorded Nous
as rate-limited, skip the API call entirely. Try fallback provider
first, then return a clear message with the reset time.
- On 429 from Nous: record rate limit state and skip further retries
(sets retry_count = max_retries to trigger fallback path)
- On success from Nous: clear the rate limit state so other sessions
know they can resume
auxiliary_client.py changes:
- _try_nous() checks rate guard before attempting Nous in the auxiliary
fallback chain. When rate-limited, returns (None, None) so the chain
skips to the next provider instead of piling more requests onto Nous.
This eliminates three sources of amplification:
1. Hermes-level retries (saves 6 of 9 calls per turn)
2. Cross-session retries (cron + gateway all skip Nous)
3. Auxiliary fallback to Nous (compression/session_search skip too)
Includes 24 tests covering the rate guard module, header parsing,
state lifecycle, and auxiliary client integration.
Extract resolve_channel_prompt() shared helper into
gateway/platforms/base.py. Refactor Discord to use it.
Wire channel_prompts into Telegram (groups + forum topics),
Slack (channels), and Mattermost (channels).
Config bridging now applies to all platforms (not just Discord).
Added channel_prompts defaults to telegram/slack/mattermost
config sections.
Docs added to all four platform pages with platform-specific
examples (topic inheritance for Telegram, channel IDs for Slack,
etc.).
Move _ensure_discord_mock() from module level to _make_adapter() so it
doesn't poison sys.modules for other discord test files. Use
types.ModuleType instead of MagicMock for the mock module to avoid
auto-generated __file__ attribute confusing hasattr checks.
Add BrennerSpear to AUTHOR_MAP.
- Remove double str() normalization in _resolve_channel_prompt since
config bridging already handles numeric YAML key conversion
- Remove dead prompts.get(str(key)) fallback that could never match
after keys were already normalized to strings
- Replace getattr(event, "channel_prompt", None) with direct attribute
access since channel_prompt is a declared dataclass field
- Update test to verify normalization responsibility lives in config bridging
The error message for missing provider API keys dynamically built
the env var name as PROVIDER_API_KEY (e.g. ALIBABA_API_KEY), but
some providers use different names (alibaba uses DASHSCOPE_API_KEY).
Users following the error message set the wrong variable.
Fix: look up the actual env var from PROVIDER_REGISTRY before
building the error. Falls back to the dynamic name if the registry
lookup fails.
Closes#9506
The GPT-5 auto-upgrade logic unconditionally overrode api_mode to
codex_responses for any model starting with gpt-5, even when the
user explicitly set api_mode=chat_completions. Custom proxies that
serve GPT-5 via /chat/completions became unusable.
Fix: check api_mode is None before the override fires. If the caller
passed any explicit api_mode, it is final -- no auto-upgrade.
Closes#10473
When proxy env vars (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY) contain
malformed URLs — e.g. 'http://127.0.0.1:6153export' from a broken
shell config — the OpenAI/httpx client throws a cryptic 'Invalid port'
error that doesn't identify the offending variable.
Add _validate_proxy_env_urls() and _validate_base_url() in
auxiliary_client.py, called from resolve_provider_client() and
_create_openai_client() to fail fast with a clear, actionable error
message naming the broken env var or URL.
Closes#6360
Co-authored-by: MestreY0d4-Uninter <MestreY0d4-Uninter@users.noreply.github.com>
Found via trace data audit: JWT tokens (eyJ...) and Discord snowflake
mentions (<@ID>) were passing through unredacted.
JWT pattern: matches 1/2/3-part tokens starting with eyJ (base64 for '{').
Zero false-positive risk — no normal text matches eyJ + 10+ base64url chars.
Discord pattern: matches <@digits> and <@!digits> with 17-20 digit snowflake
IDs. Syntactically unique to Discord's mention format.
Both patterns follow the same structural-uniqueness standard as existing
prefix patterns (sk-, ghp_, AKIA, etc.).
_parse_session_key() blindly assigned parts[5] as thread_id for all
chat types. For group sessions with per-user isolation, parts[5] is
a user_id, not a thread_id. This could cause shutdown notifications
to route with incorrect thread metadata.
Only return thread_id for chat types where the 6th element is
unambiguous: dm and thread. For group/channel sessions, omit
thread_id since the suffix may be a user_id.
Based on the approach from PR #9938 by @Ruzzgar.
OpenCode Go does not expose a shared /models endpoint, so the doctor
probe was always failing and producing a false warning. Set the default
URL to None and disable the health check for this provider.
Both /queue and /quit registered 'q' as an alias. Since /quit appeared
later in COMMAND_REGISTRY, _build_command_lookup() silently overwrote
/queue's claim, making the documented /queue shorthand unusable.
Fix: remove 'q' from /quit's aliases. /quit already has 'exit' as an
alias plus the full '/quit' command. /queue has no other short alias.
Closes#10467
The recovery block previously only retried (continue) when one of the
per-component sanitization checks (messages, tools, system prompt,
headers, credentials) found and stripped non-ASCII content. When the
non-ASCII lived only in api_messages' reasoning_content field (which
is built from messages['reasoning'] and not checked by the original
_sanitize_messages_non_ascii), all checks returned False and the
recovery fell through to the normal error path — burning a retry
attempt despite _force_ascii_payload being set.
Now the recovery always continues (retries) when _is_ascii_codec is
detected. The _force_ascii_payload flag guarantees the next iteration
runs _sanitize_structure_non_ascii(api_kwargs) on the full API payload,
catching any remaining non-ASCII regardless of where it lives.
Also adds test for the 'reasoning' field on canonical messages.
Fixes#6843
The ASCII-locale recovery path in run_agent.py sanitized the canonical
'messages' list but left 'api_messages' untouched. api_messages is a
separate API-copy built before the retry loop and may carry extra fields
(reasoning_content, extra_body entries) that are not present in
'messages'. This caused the retry to still raise UnicodeEncodeError even
after the 'System encoding is ASCII — stripped...' log line appeared.
Two changes:
- _sanitize_messages_non_ascii now walks all extra top-level string fields
in each message dict (any key not in {content, name, tool_calls, role})
so reasoning_content and future extras are cleaned in both 'messages'
and 'api_messages'.
- The ASCII-codec recovery block now also calls sanitize on api_messages
and api_kwargs so no non-ASCII survives into the next retry attempt.
Adds regression tests covering:
- reasoning_content with non-ASCII in api_messages
- extra_body with non-ASCII in api_kwargs
- canonical messages clean but api_messages dirty
Fixes#6843
Commentary messages (interim assistant status updates like "Using browser
tool...") are sent via _send_commentary(), which was incorrectly setting
_already_sent = True on success. This caused the final response to be
suppressed when there were multiple tool calls, because the gateway checks
already_sent to decide whether to skip re-sending the response.
The fix: commentary messages are interim status updates, not the final
response, so _already_sent should not be set when they succeed. This
ensures the final response is always delivered regardless of how many
commentary messages were sent during the turn.
Fixes: #10454
Route kimi-coding-cn through _resolve_kimi_base_url() in both
get_api_key_provider_status() and resolve_api_key_provider_credentials()
so CN users with sk-kimi- prefixed keys get auto-detected to the Kimi
Coding Plan endpoint, matching the existing behavior for kimi-coding.
Also update the kimi-coding display label to accurately reflect the
dual-endpoint setup (Kimi Coding Plan + Moonshot API).
Salvaged from PR #10525 by kkikione999.
_install_tirith() uses shutil.move() to place the binary from tmpdir
to ~/.hermes/bin/. When these are on different filesystems (common in
Docker, NFS), shutil.move() falls back to copy2 + unlink, but copy2's
metadata step can raise PermissionError. This exception propagated
past the fail_open guard, crashing the terminal tool entirely.
Additionally, a failed install could leave a non-executable tirith
binary at the destination, causing a retry loop on every subsequent
terminal command.
Fix:
- Catch OSError from shutil.move() and fall back to shutil.copy()
(skips metadata/xattr copying that causes PermissionError)
- If even copy fails, clean up the partial dest file to prevent
the non-executable retry loop
- Return (None, 'cross_device_copy_failed') so the failure routes
through the existing install-failure caching and fail_open logic
Closes#10127
Clarifies that tool-level access restrictions are not security boundaries
when the agent has unrestricted terminal access. Deny lists only matter
when paired with equivalent terminal-side restrictions (like WRITE_DENIED_PATHS
pairs with the dangerous command approval system).
After clear_session_vars() reset contextvars to their default (''),
get_session_env() treated the empty string as falsy and fell through
to os.environ — resurrecting stale HERMES_SESSION_* values from CLI
startup, cron, or previous sessions. This broke session isolation
in the gateway where concurrent messages could see each other's
stale environment values.
Fix: use a sentinel (_UNSET) as the contextvar default instead of ''.
get_session_env() now checks 'value is not _UNSET' instead of
truthiness. Three states are cleanly distinguished:
- _UNSET (never set): fall back to os.environ (CLI/cron compat)
- '' (explicitly cleared): return '' — no os.environ fallback
- 'telegram' (actively set): return the value
clear_session_vars() now uses var.set('') instead of var.reset(token)
to mark vars as explicitly cleared rather than reverting to _UNSET.
Closes#10304
When a model (e.g. mimo-v2-pro) streams intermediate text alongside tool
calls ("Let me search for that") but then returns empty after processing
tool results, the stream consumer already_sent flag is True from the
earlier text delivery. The gateway suppression check
(already_sent=True, failed=False → return None) would swallow the final
response, leaving the user staring at silence after the search.
Two changes:
1. gateway/run.py return path: skip already_sent suppression when the
final_response is "(empty)" or empty — the user needs to know the
agent finished even if streaming sent partial content earlier.
2. gateway/run.py response handler: convert the internal "(empty)"
sentinel to a user-friendly warning instead of delivering the raw
sentinel string.
Tests added for all empty/None/sentinel cases plus preserved existing
suppression behavior for normal non-empty responses.
Memory provider discovery (discover_memory_providers, load_memory_provider)
only scanned the bundled plugins/memory/ directory. User-installed providers
at $HERMES_HOME/plugins/<name>/ were invisible, forcing users to symlink
into the repo source tree — which broke on hermes update and created a
dual-registration path causing duplicate tool names (400 errors on strict
providers like Xiaomi MiMo).
Changes:
- Add _get_user_plugins_dir(), _is_memory_provider_dir(), _iter_provider_dirs(),
and find_provider_dir() helpers to plugins/memory/__init__.py
- discover_memory_providers() now scans both bundled and user dirs
- load_memory_provider() uses find_provider_dir() (bundled-first)
- discover_plugin_cli_commands() uses find_provider_dir()
- _install_dependencies() in memory_setup.py uses find_provider_dir()
- User plugins use _hermes_user_memory namespace to avoid sys.modules collisions
- Non-memory user plugins filtered via source text heuristic
- Bundled providers always take precedence on name collisions
Fixes#4956, #9099. Supersedes #4987, #9123, #9130, #9132, #9982.
Discord's _register_slash_commands() had a hardcoded list of ~27 commands
while COMMAND_REGISTRY defines 34+ gateway-available commands. Missing
commands (debug, branch, rollback, snapshot, profile, yolo, fast, reload,
commands) were invisible in Discord's / autocomplete — users couldn't
discover them.
Add a dynamic catch-all loop after the explicit registrations that
iterates COMMAND_REGISTRY, skips already-registered commands, and
auto-registers the rest using discord.app_commands.Command(). Commands
with args_hint get an optional string parameter; parameterless commands
get a simple callback.
This ensures any future commands added to COMMAND_REGISTRY automatically
appear on Discord without needing a manual entry in discord.py.
Telegram and Slack already derive dynamically from COMMAND_REGISTRY
via telegram_bot_commands() and slack_subcommand_map() — no changes
needed there.
update_job() assumed the schedule value was always a pre-parsed dict
and called .get() on it directly. When the API passes a raw string
like "every 10m", this crashed with AttributeError.
The create path already handles this correctly by calling
parse_schedule() on the incoming string. The fix adds the same
normalization to the update path: if the schedule is a string,
parse it into a dict before proceeding.
Closes#10129
When a user runs /browser connect to attach browser tools to their real
Chrome instance via CDP, the BROWSER_CDP_URL env var is set. However,
every browser tool function checks _is_camofox_mode() first, which
short-circuits to the Camofox backend before _get_session_info() ever
checks for the CDP override.
Fix: is_camofox_mode() now returns False when BROWSER_CDP_URL is set,
so the explicit CDP connection takes priority. This is the correct
behavior — /browser connect is an intentional user override.
Reported by SkyLinx on Discord.
Models (especially open-source like qwen3.5-plus) may send non-int values
for the limit parameter — None (JSON null), string, or even a type object.
This caused TypeError: '<=' not supported between instances of 'int' and
'type' when the value reached min()/comparison operations.
Changes:
- Add defensive int coercion at session_search() entry with fallback to 3
- Clamp limit to [1, 5] range (was only capped at 5, not floored)
- Add tests for None, type object, string, negative, and zero limit values
Reported by community user ludoSifu via Discord.
Memory provider plugins (e.g. Mnemosyne) can register tools via two paths:
1. Plugin system (ctx.register_tool) → tool registry → get_tool_definitions()
2. Memory manager → get_all_tool_schemas() → direct append in AIAgent.__init__
Path 2 blindly appended without checking if path 1 already added the same
tool names. This created duplicate function names in the tools array sent
to the API. Most providers silently handle duplicates, but Xiaomi MiMo
(via Nous Portal) strictly rejects them with a 400 Bad Request.
Fix: build a set of existing tool names before memory manager injection
and skip any tool whose name is already present.
Confirmed via live testing against Nous Portal:
- Unique tool names → 200 OK
- Duplicate tool names → 400 'Provider returned error'
Python's json.dumps() defaults to ensure_ascii=True, escaping non-ASCII
characters to \uXXXX sequences. For CJK characters this inflates
token count 3-4x — a single Chinese character like '中' becomes
'\u4e2d' (6 chars vs 3 bytes, ~6 tokens vs ~1 token).
Since MCP tool results feed directly into the model's conversation
context, this silently multiplied API costs for Chinese, Japanese,
and Korean users.
Fix: add ensure_ascii=False to all 20 json.dumps calls in mcp_tool.py.
Raw UTF-8 is valid JSON per RFC 8259 and all downstream consumers
(LLM APIs, display) handle it correctly.
Closes#10234
- Pastes uploaded by /debug now auto-delete after 1 hour via a detached
background process that sends DELETE to paste.rs
- CLI: shows privacy notice listing what data will be uploaded
- Gateway: only uploads summary report (system info + log tails), NOT
full log files containing conversation content
- Added 'hermes debug delete <url>' for immediate manual deletion
- 16 new tests covering auto-delete scheduling, paste deletion, privacy
notices, and the delete subcommand
Addresses user privacy concern where /debug uploaded full conversation
logs to a public paste service with no warning or expiry.
Two gateway fixes:
1. MessageDeduplicator.is_duplicate() now checks TTL at query time (#10306)
Previously, is_duplicate() returned True for any previously seen ID
without checking its age — expired entries were only purged when cache
size exceeded max_size. On normal workloads that never overflow, message
IDs stayed deduplicated forever instead of expiring after the TTL.
Fix: check `now - timestamp < ttl` before returning True. Expired
entries are removed and treated as new messages.
2. Gateway --config flag now uses yaml.safe_load() (#10216)
The --config CLI flag in gateway/run.py main() used json.load() to
parse config files. YAML is the only documented config format and
every other config loader uses yaml.safe_load(). A YAML config file
passed via --config would crash with json.JSONDecodeError.
Closes#10306Closes#10216
The on_memory_write bridge that notifies external memory providers
(ClawMem, retaindb, supermemory, etc.) of built-in memory writes was
only present in the concurrent tool execution path (_invoke_tool).
The sequential path (_execute_tool_calls_sequential) — which handles
all single tool calls, the common case — was missing it entirely.
This meant external memory providers silently missed every single-call
memory write, which is the vast majority of memory operations.
Fix: add the identical bridge block to the sequential path, right
after the memory_tool call returns.
Closes#10174
Multiple gaps in activity tracking could cause the gateway's inactivity
timeout to fire while the agent is actively working:
1. Streaming wait loop had no periodic heartbeat — the outer thread only
touched activity when the stale-stream detector fired (180-300s), and
for local providers (Ollama) the stale timeout was infinity, meaning
zero heartbeats. Now touches activity every 30s.
2. Concurrent tool execution never set the activity callback on worker
threads (threading.local invisible across threads) and never set
_current_tool. Workers now set the callback, and the concurrent wait
uses a polling loop with 30s heartbeats.
3. Modal backend's execute() override had its own polling loop without
any activity callback. Now matches _wait_for_process cadence (10s).
The _last_content_with_tools fallback was firing indiscriminately for ALL
content+tool turns, including mid-task narration alongside substantive
tools (terminal, search_files, etc.). This caused the agent to exit
the loop with 'I'll scan the directory...' as the final answer instead
of nudging the model to continue processing tool results.
The fix restricts the fallback to housekeeping-only turns (memory, todo,
skill_manage, session_search) where the content genuinely IS the final
answer. When substantive tools are present, the existing post-tool
nudge mechanism now fires instead, prompting the model to continue.
Affected models: xiaomi/mimo-v2-pro, GLM-5, and other weaker models
that intermittently return empty after tool results.
Reported by user Renaissance on Discord.
The _client_cache used event loop id() as part of the cache key, so
every new worker-thread event loop created a new entry for the same
provider config. In long-running gateways where threads are recycled
frequently, this caused unbounded cache growth — each stale entry
held an unclosed AsyncOpenAI client with its httpx connection pool,
eventually exhausting file descriptors.
Fix: remove loop_id from the cache key and instead validate on each
async cache hit that the cached loop is the current, open loop. If
the loop changed or was closed, the stale entry is replaced in-place
rather than creating an additional entry. This bounds cache growth
to at most one entry per unique provider config.
Also adds a _CLIENT_CACHE_MAX_SIZE (64) safety belt with FIFO
eviction as defense-in-depth against any remaining unbounded growth.
Cross-loop safety is preserved: different event loops still get
different client instances (validated by existing test suite).
Closes#10200
OV transparently handles message history across /new and /compress: old
messages stay in the same session and extraction is idempotent, so there's
no need to rebind providers to a new session_id. The only thing the
session boundary actually needs is to trigger extraction.
- MemoryProvider / MemoryManager: remove on_session_reset hook
- OpenViking: remove on_session_reset override (nothing to do)
- AIAgent: replace rotate_memory_session with commit_memory_session
(just calls on_session_end, no rebind)
- cli.py / run_agent.py: single commit_memory_session call at the
session boundary before session_id rotates
- tests: replace on_session_reset coverage with routing tests for
MemoryManager.on_session_end
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hasattr-forked OpenViking-specific paths with a proper base-class
hook. Collapse the two agent wrappers into a single rotate_memory_session
so callers don't orchestrate commit + rebind themselves.
- MemoryProvider: add on_session_reset(new_session_id) as a default no-op
- MemoryManager: on_session_reset fans out unconditionally (no hasattr,
no builtin skip — base no-op covers it)
- OpenViking: rename reset_session -> on_session_reset; drop the explicit
POST /api/v1/sessions (OV auto-creates on first message) and the two
debug raise_for_status wrappers
- AIAgent: collapse commit_memory_session + reinitialize_memory_session
into rotate_memory_session(new_sid, messages)
- cli.py / run_agent.py: replace hasattr blocks and the split calls with
a single unconditional rotate_memory_session call; compression path
now passes the real messages list instead of []
- tests: align with on_session_reset, assert reset does NOT POST /sessions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The OpenViking memory provider extracts memories when its session is
committed (POST /api/v1/sessions/{id}/commit). Before this fix, the
CLI had two code paths that changed the active session_id without ever
committing the outgoing OpenViking session:
1. /new (new_session() in cli.py) — called flush_memories() to write
MEMORY.md, then immediately discarded the old session_id. The
accumulated OpenViking session was never committed, so all context
from that session was lost before extraction could run.
2. /compress and auto-compress (_compress_context() in run_agent.py) —
split the SQLite session (new session_id) but left the OpenViking
provider pointing at the old session_id with no commit, meaning all
messages synced to OpenViking were silently orphaned.
The gateway already handles session commit on /new and /reset via
shutdown_memory_provider() on the cached agent; the CLI path did not.
Fix: introduce a lightweight session-transition lifecycle alongside
the existing full shutdown path:
- OpenVikingMemoryProvider.reset_session(new_session_id): waits for
in-flight background threads, resets per-session counters, and
creates the new OV session via POST /api/v1/sessions — without
tearing down the HTTP client (avoids connection overhead on /new).
- MemoryManager.restart_session(new_session_id): calls reset_session()
on providers that implement it; falls back to initialize() for
providers that do not. Skips the builtin provider (no per-session
state).
- AIAgent.commit_memory_session(messages): wraps
memory_manager.on_session_end() without shutdown — commits OV session
for extraction but leaves the provider alive for the next session.
- AIAgent.reinitialize_memory_session(new_session_id): wraps
memory_manager.restart_session() — transitions all external providers
to the new session after session_id has been assigned.
Call sites:
- cli.py new_session(): commit BEFORE session_id changes, reinitialize
AFTER — ensuring OV extraction runs on the correct session and the
new session is immediately ready for the next turn.
- run_agent._compress_context(): same pattern, inside the
if self._session_db: block where the session_id split happens.
/compress and auto-compress are functionally identical at this layer:
both call _compress_context(), so both are fixed by the same change.
Tests added to tests/agent/test_memory_provider.py:
- TestMemoryManagerRestartSession: reset_session() routing, builtin
skip, initialize() fallback, failure tolerance, empty-manager noop.
- TestOpenVikingResetSession: session_id update, per-session state
clear, POST /api/v1/sessions call, API failure tolerance, no-client
noop.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>