Translate all nio SDK calls to mautrix equivalents while preserving the
adapter structure, business logic, and all features (E2EE, reactions,
threading, mention gating, text batching, media caching, voice MSC3245).
Key changes:
- nio.AsyncClient -> mautrix.client.Client + HTTPAPI + MemoryStateStore
- Manual E2EE key management -> OlmMachine with auto key lifecycle
- isinstance(resp, nio.XxxResponse) -> mautrix returns values directly
- add_event_callback per type -> single ROOM_MESSAGE handler with
msgtype dispatch
- Room state (member_count, display_name) via async state store lookups
- Upload/download return ContentURI/bytes directly (no wrapper objects)
When two gateway messages arrived concurrently, _set_session_env wrote
HERMES_SESSION_PLATFORM/CHAT_ID/CHAT_NAME/THREAD_ID into the process-global
os.environ. Because asyncio tasks share the same process, Message B would
overwrite Message A's values mid-flight, causing background-task notifications
and tool calls to route to the wrong thread/chat.
Replace os.environ with Python's contextvars.ContextVar. Each asyncio task
(and any run_in_executor thread it spawns) gets its own copy, so concurrent
messages never interfere.
Changes:
- New gateway/session_context.py with ContextVar definitions, set/clear/get
helpers, and os.environ fallback for CLI/cron/test backward compatibility
- gateway/run.py: _set_session_env returns reset tokens, _clear_session_env
accepts them for proper cleanup in finally blocks
- All tool consumers updated: cronjob_tools, send_message_tool, skills_tool,
terminal_tool (both notify_on_complete AND check_interval blocks), tts_tool,
agent/skill_utils, agent/prompt_builder
- Tests updated for new contextvar-based API
Fixes#7358
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
Two fixes from PR review:
1. Session expiry was looking in _running_agents for the cached agent,
but idle expired sessions live in _agent_cache. Now checks
_agent_cache first, falls back to _running_agents.
2. Global cleanup in stop() was missing process_registry.kill_all(),
so background processes from agents evicted without close() (branch,
fallback) survived shutdown.
Use getattr guard for _agent_cache_lock in _handle_reset_command
because test fixtures may create GatewayRunner without calling
__init__, leaving the attribute unset.
Fixes e2e test failure: test_new_resets_session,
test_new_then_status_reflects_reset, test_new_is_idempotent.
Add 9 tests covering the full zombie process prevention chain:
- TestZombieReproduction: demonstrates that processes survive when
references are dropped without explicit cleanup (the original bug)
- TestAgentCloseMethod: verifies close() calls all cleanup functions,
is idempotent, propagates to children, and continues cleanup even
when individual steps fail
- TestGatewayCleanupWiring: verifies stop() calls close() and that
_evict_cached_agent() does NOT call close() (since it's also used
for non-destructive cache refreshes)
- TestDelegationCleanup: calls the real _run_single_child function and
verifies close() is called on the child agent
Ref: #7131
Wire AIAgent.close() into every gateway code path where an agent's
session is actually ending:
- stop(): close all running agents after interrupt + memory shutdown,
then call cleanup_all_environments() and cleanup_all_browsers() as
a global catch-all
- _session_expiry_watcher(): close agents when sessions expire after
the 5-minute idle timeout
- _handle_reset_command(): close the old agent before evicting it from
cache on /new or /reset
Note: _evict_cached_agent() intentionally does NOT call close() because
it is also used for non-destructive cache refreshes (model switch,
branch, fallback) where tool resources should persist.
Ref: #7131
- Remove sys.path.insert hack (leftover from standalone dev)
- Add token lock (acquire_scoped_lock/release_scoped_lock) in
connect()/disconnect() to prevent duplicate pollers across profiles
- Fix get_connected_platforms: WEIXIN check must precede generic
token/api_key check (requires both token AND account_id)
- Add WEIXIN_HOME_CHANNEL_NAME to _EXTRA_ENV_KEYS
- Add gateway setup wizard with QR login flow
- Add platform status check for partially configured state
- Add weixin.md docs page with full adapter documentation
- Update environment-variables.md reference with all 11 env vars
- Update sidebars.ts to include weixin docs page
- Wire all gateway integration points onto current main
Salvaged from PR #6747 by Zihan Huang.
Simplified implementation of the feature from PR #6842 (RunzhouLi).
Allows Discord channels/forum threads to auto-bind skills via config:
discord:
channel_skill_bindings:
- id: "123456"
skills: ["skill-a", "skill-b"]
The run.py auto-skill loader now handles both str and list[str],
loading multiple skills in order and concatenating their payloads.
Forum threads inherit their parent channel's bindings.
Co-authored-by: RunzhouLi <RunzhouLi@users.noreply.github.com>
When /background was sent during an active run, it was not in the
platform adapter's bypass list and fell through to the interrupt path
instead of spawning a parallel background task.
Add "background" to the active-session command bypass in the platform
adapter, and add an early return in the gateway runner's running-agent
guard to route /background to _handle_background_command() before it
reaches the default interrupt logic.
Fixes#6827
Automated dead code audit using vulture + coverage.py + ast-grep intersection,
confirmed by Opus deep verification pass. Every symbol verified to have zero
production callers (test imports excluded from reachability analysis).
Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines
of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py,
hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py).
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
Legacy flat stt.model config key (from cli-config.yaml.example and older
versions) was passed as a model override to transcribe_audio() by the
gateway, bypassing provider-specific model resolution. When the provider
was 'local' (faster-whisper), this caused:
ValueError: Invalid model size 'whisper-1'
Changes:
- gateway/run.py, discord.py: stop passing model override — let
transcribe_audio() handle provider-specific model resolution internally
- get_stt_model_from_config(): now provider-aware, reads from the correct
nested section (stt.local.model, stt.openai.model, etc.); ignores
legacy flat key for local provider to prevent model name mismatch
- cli-config.yaml.example: updated STT section to show nested provider
config structure instead of legacy flat key
- config migration v13→v14: moves legacy stt.model to the correct
provider section and removes the flat key
Reported by community user on Discord.
Custom providers defined in config.yaml under were
completely invisible to the /model command in both gateway (Telegram,
Discord, etc.) and CLI. The provider listing skipped them and explicit
switching via --provider failed with "Unknown provider".
Root cause: gateway/run.py, cli.py, and model_switch.py only read the
dict from config, ignoring entirely.
Changes:
- providers.py: add resolve_custom_provider() and extend
resolve_provider_full() to check custom_providers after user_providers
- model_switch.py: propagate custom_providers through switch_model(),
list_authenticated_providers(), and get_authenticated_provider_slugs();
add custom provider section to provider listings
- gateway/run.py: read custom_providers from config, pass to all
model-switch calls
- cli.py: hoist config loading, pass custom_providers to listing and
switch calls
Tests: 4 new regression tests covering listing, resolution, and gateway
command handler. All 71 tests pass.
The gateway /model command stored session overrides in
_session_model_overrides but run_sync() never consulted them when
resolving the model and runtime for the next message. It always read
from config.yaml, so the switch was lost as soon as a new agent was
created.
Two fixes:
1. In run_sync(), apply _session_model_overrides after resolving from
config.yaml/env — the override takes precedence for model, provider,
api_key, base_url, and api_mode.
2. In post-run fallback detection, check whether the model mismatch
(agent.model != config_model) is due to an intentional /model switch
before evicting the cached agent. Without this, the first message
after /model would work (cached agent reused) but the fallback
detector would evict it, causing the next message to revert.
Affects all gateway platforms (Telegram, Discord, Slack, WhatsApp,
Signal, Matrix, BlueBubbles, HomeAssistant) since they all share
GatewayRunner._run_agent().
Fixes#6213
The gateway /usage handler only looked in _running_agents for the agent
object, which is only populated while the agent is actively processing a
message. Between turns (when users actually type /usage), the dict is
empty and the handler fell through to a rough message-count estimate.
The agent object actually lives in _agent_cache between turns (kept for
prompt caching). This fix checks both dicts, with _running_agents taking
priority (mid-turn) and _agent_cache as the between-turns fallback.
Also brings the gateway output to parity with the CLI /usage:
- Model name
- Detailed token breakdown (input, output, cache read, cache write)
- Cost estimation (estimated amount or 'included' for subscriptions)
- Cache token lines hidden when zero (cleaner output)
This fixes Nous Portal rate limit headers not showing up for gateway
users — the data was being captured correctly but the handler could
never see it.
Parse x-ratelimit-* headers from inference API responses (Nous Portal,
OpenRouter, OpenAI-compatible) and display them in the /usage command.
- New agent/rate_limit_tracker.py: parse 12 rate limit headers (RPM/RPH/
TPM/TPH limits, remaining, reset timers), format as progress bars (CLI)
or compact one-liner (gateway)
- Hook into streaming path in run_agent.py: stream.response.headers is
available on the OpenAI SDK Stream object before chunks are consumed
- CLI /usage: appends rate limit section with progress bars + warnings
when any bucket exceeds 80%
- Gateway /usage: appends compact rate limit summary
- 24 unit tests covering parsing, formatting, edge cases
Headers captured per response:
x-ratelimit-{limit,remaining,reset}-{requests,tokens}{,-1h}
Example CLI display:
Nous Rate Limits (captured just now):
Requests/min [░░░░░░░░░░░░░░░░░░░░] 0.1% 1/800 used (799 left, resets in 59s)
Tokens/hr [░░░░░░░░░░░░░░░░░░░░] 0.0% 49/336.0M (336.0M left, resets in 52m)
When a background process with notify_on_complete=True finishes, the
gateway injects a synthetic MessageEvent to notify the session. This
event was constructed without user_id, causing _is_user_authorized()
to reject it and — for DM-origin sessions — trigger the pairing flow,
sending "Hi~ I don't recognize you yet!" with a pairing code to the
chat owner.
Add an `internal` flag to MessageEvent that bypasses authorization
checks for system-generated synthetic events. Only the process watcher
sets this flag; no external/adapter code path can produce it.
Includes 4 regression tests covering the fix and the normal pairing path.
- Bind exception in warning send handler (was using stale _ne from outer scope)
- Calculate remaining time until timeout correctly: (timeout - warning) // 60
instead of warning // 60 (which equals elapsed time, not remaining)
Introduce gateway_timeout_warning (default 900s) as a pre-timeout alert
layer. When inactivity reaches the warning threshold, a single
notification is sent to the user offering to wait or reset. If
inactivity continues to the gateway_timeout (default 1800s), the full
timeout fires as before.
This gives users a chance to intervene before work is lost on slow
API providers without disabling the safety timeout entirely.
Config: agent.gateway_timeout_warning in config.yaml, or
HERMES_AGENT_TIMEOUT_WARNING env var (0 = disable warning).
Fixes#4647 — Signal replies duplicated when gateway streaming is enabled.
Root cause: stream_consumer.py did not handle the case where send() returns
success=True but no message_id (Signal behavior). Every stream delta produced
a separate send() call (7+ messages instead of 2), plus the gateway sent
another full duplicate since already_sent was never set.
Changes:
- stream_consumer.py: Add elif branch for success-without-message_id — enters
fallback mode (sets already_sent, disables editing, sends only continuation)
- signal.py send(): Extract timestamp from signal-cli RPC result as message_id
so stream consumer follows normal edit→fallback path
- signal.py: Add public stop_typing() delegating to _stop_typing_indicator()
so base adapter's _keep_typing finally block can clean up typing tasks
- gateway/run.py: Per-platform tool_progress_overrides (#6164) — lets users
set e.g. signal: off while keeping telegram: all
- hermes_cli/config.py: Add tool_progress_overrides to DEFAULT_CONFIG
Refs: #4647, #6164
- Fire on_session_finalize and on_session_reset in gateway _handle_reset_command()
- Fire on_session_finalize during gateway stop() for each active agent
- Move CLI test from tests/ root to tests/cli/ (matches recent restructure)
- Add 5 gateway tests covering reset hooks, ordering, shutdown, and error handling
- Place on_session_reset after new session is guaranteed to exist (covers
the get_or_create_session fallback path)
Gateway and cron had inconsistent reasoning_effort resolution:
- CLI: config.yaml only (correct)
- Gateway: config.yaml first, env var fallback
- Cron: env var first, config.yaml fallback
All three now read exclusively from agent.reasoning_effort in config.yaml.
Removed HERMES_REASONING_EFFORT env var support entirely — .env is for
secrets only, not behavioral config.
When a Discord voice message arrives, the adapter sets event.text to
"(The user sent a message with no text content)" since voice messages
have no text content. The transcription enrichment in
_enrich_message_with_transcription() then prepends the transcript but
leaves the placeholder intact, causing the agent to receive both:
[The user sent a voice message~ Here's what they said: "..."]
(The user sent a message with no text content)
The agent sees this as two separate user turns — one transcribed
and one empty — creating confusing duplicate messages.
Fix: when the transcription succeeds and user_text is only the empty
placeholder, return just the transcript without the redundant placeholder.
Previously, all/new tool progress modes always hard-truncated previews
to 40 chars, ignoring the display.tool_preview_length config. This made
it impossible for gateway users to see meaningful command/path info
without switching to verbose mode (which shows too much detail).
Now all/new modes read tool_preview_length from config:
- tool_preview_length: 0 (default/unset) → 40 chars (no regression)
- tool_preview_length: 120 → 120-char previews in all/new mode
- verbose mode: unchanged (already respected the config)
Users who want longer previews can set:
display:
tool_preview_length: 120
Reported by demontut_ on Discord.
Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)
The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).
Memory plugins (Mem0, Honcho) used static identifiers ('hermes-user',
config peerName) meaning all gateway users shared the same memory bucket.
Changes:
- AIAgent.__init__: add user_id parameter, store as self._user_id
- run_agent.py: include user_id in _init_kwargs passed to memory providers
- gateway/run.py: pass source.user_id to AIAgent in primary + background paths
- Mem0 plugin: prefer kwargs user_id over config default
- Honcho plugin: override cfg.peer_name with gateway user_id when present
CLI sessions (user_id=None) preserve existing defaults. Only gateway
sessions with a real platform user_id get per-user memory scoping.
Reported by plev333.
When the agent waits for dangerous-command approval, the typing
indicator (_keep_typing loop) kept refreshing. On Slack's Assistant
API this is critical: assistant_threads_setStatus disables the
compose box, preventing users from typing /approve or /deny.
- Add _typing_paused set + pause/resume methods to BasePlatformAdapter
- _keep_typing skips send_typing when chat_id is paused
- _approval_notify_sync pauses typing before sending approval prompt
- _handle_approve_command / _handle_deny_command resume typing after
Benefits all platforms — no reason to show 'is thinking...' while
the agent is idle waiting for human input.
Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.
Changes by category:
Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
(vs availability checks which were left alone)
Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
source, new_server_names, verify, pconfig, default_terminal,
result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
(12 instances of add_parser() where result was never used)
Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction
Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports
Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values
Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
x.startswith('a') or x.startswith('b') consolidated to
x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
send_message_tool.py
Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls
Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
* fix: repair 57 failing CI tests across 14 files
Categories of fixes:
**Test isolation under xdist (-n auto):**
- test_hermes_logging: Strip ALL RotatingFileHandlers before each test
to prevent handlers leaked from other xdist workers from polluting counts
- test_code_execution: Force TERMINAL_ENV=local in setUp — prevents Modal
AuthError when another test leaks TERMINAL_ENV=modal
- test_timezone: Same TERMINAL_ENV fix for execute_code timezone tests
- test_codex_execution_paths: Mock _resolve_turn_agent_config to ensure
model resolution works regardless of xdist worker state
**Matrix adapter tests (nio not installed in CI):**
- Add _make_fake_nio() helper with real response classes for isinstance()
checks in production code
- Replace MagicMock(spec=nio.XxxResponse) with fake_nio instances
- Wrap production method calls with patch.dict('sys.modules', {'nio': ...})
so import nio succeeds in method bodies
- Use try/except instead of pytest.importorskip for nio.crypto imports
(importorskip can be fooled by MagicMock in sys.modules)
- test_matrix_voice: Skip entire file if nio is a mock, not just missing
**Stale test expectations:**
- test_cli_provider_resolution: _prompt_provider_choice now takes **kwargs
(default param added); mock getpass.getpass alongside input
- test_anthropic_oauth_flow: Mock getpass.getpass (code switched from input)
- test_gemini_provider: Mock models.dev + OpenRouter API lookups to test
hardcoded defaults without external API variance
- test_code_execution: Add notify_on_complete to blocked terminal params
- test_setup_openclaw_migration: Mock prompt_choice to select 'Full setup'
(new quick-setup path leads to _require_tty → sys.exit in CI)
- test_skill_manager_tool: Patch get_all_skills_dirs alongside SKILLS_DIR
so _find_skill searches tmp_path, not real ~/.hermes/skills/
**Missing attributes in object.__new__ test runners:**
- test_platform_reconnect: Add session_store to _make_runner()
- test_session_race_guard: Add hooks, _running_agents_ts, session_store,
delivery_router to _make_runner()
**Production bug fix (gateway/run.py):**
- Fix sentinel eviction race: _AGENT_PENDING_SENTINEL was immediately
evicted by the stale-detection logic because sentinels have no
get_activity_summary() method, causing _stale_idle=inf >= timeout.
Guard _should_evict with 'is not _AGENT_PENDING_SENTINEL'.
* fix: address remaining CI failures
- test_setup_openclaw_migration: Also mock _offer_launch_chat (called at
end of both quick and full setup paths)
- test_code_execution: Move TERMINAL_ENV=local to module level to protect
ALL test classes (TestEnvVarFiltering, TestExecuteCodeEdgeCases,
TestInterruptHandling, TestHeadTailTruncation) from xdist env leaks
- test_matrix: Use try/except for nio.crypto imports (importorskip can be
fooled by MagicMock in sys.modules under xdist)
* feat: notify_on_complete for background processes
When terminal(background=true, notify_on_complete=true), the system
auto-triggers a new agent turn when the process exits — no polling needed.
Changes:
- ProcessSession: add notify_on_complete field
- ProcessRegistry: add completion_queue, populate on _move_to_finished()
- Terminal tool: add notify_on_complete parameter to schema + handler
- CLI: drain completion_queue after agent turn AND during idle loop
- Gateway: enhanced _run_process_watcher injects synthetic MessageEvent
on completion, triggering a full agent turn
- Checkpoint persistence includes notify_on_complete for crash recovery
- code_execution_tool: block notify_on_complete in sandbox scripts
- 15 new tests covering queue mechanics, checkpoint round-trip, schema
* docs: update terminal tool descriptions for notify_on_complete
- background: remove 'ONLY for servers' language, describe both patterns
(long-lived processes AND long-running tasks with notify_on_complete)
- notify_on_complete: more prescriptive about when to use it
- TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds'
guidance that contradicted the new feature
* fix(gateway): /stop and /new bypass Level 1 active-session guard
The base adapter's Level 1 guard intercepted ALL messages while an
agent was running, including /stop and /new. These commands were queued
as pending messages instead of being dispatched to the gateway runner's
Level 2 handler. When the agent eventually stopped (via the interrupt
mechanism), the command text leaked into the conversation as a user
message — the model would receive '/stop' as input and respond to it.
Fix: Add /stop, /new, and /reset to the bypass set in base.py alongside
/approve, /deny, and /status. Consolidate the three separate bypass
blocks into one. Commands in the bypass set are dispatched inline to the
gateway runner, where Level 2 handles them correctly (hard-kill for
/stop, session reset for /new).
Also add a safety net in _run_agent's pending-message processing: if the
pending text resolves to a known slash command, discard it instead of
passing it to the agent. This catches edge cases where command text
leaks through the interrupt_message fallback.
Refs: #5244
* test: regression tests for command bypass of active-session guard
17 tests covering:
- /stop, /new, /reset bypass the Level 1 guard when agent is running
- /approve, /deny, /status bypass (existing behavior, now tested)
- Regular text and unknown commands still queued (not bypassed)
- File paths like '/path/to/file' not treated as commands
- Telegram @botname suffix handled correctly
- Safety net command resolution (resolve_command detects known commands)
- Raise max_models from 8 to 50 so all curated models come through
- Add _build_model_keyboard() helper with 8-per-page pagination
- Next ▶ / ◀ Prev buttons with page counter (e.g. 2/4)
- mg:<page> callback data for page navigation
- Catch-all query.answer() for noop buttons
/model with no args now shows an interactive UI on Telegram and Discord
instead of a text list:
Telegram: Inline keyboard buttons — two-step drill-down.
Step 1: Provider buttons with model counts (e.g. 'OpenRouter (15)')
Step 2: Model buttons within the selected provider
Edits the same message in-place as the user navigates.
Back/Cancel buttons for navigation.
Discord: Embed + Select dropdown menus via discord.ui.View.
Step 1: Provider dropdown with model counts
Step 2: Model dropdown within the selected provider
Back/Cancel buttons. Auth-gated to allowed users.
Platforms without picker support (Slack, WhatsApp, Signal, etc.)
fall back to the existing text list.
/model <name> continues to work as a direct text switch on all
platforms — the interactive picker is only for bare /model.
Implementation:
- TelegramAdapter.send_model_picker() + _handle_model_picker_callback()
with compact callback_data (mp:/mm:/mb/mx, all within 64-byte limit)
- DiscordAdapter.send_model_picker() + ModelPickerView (discord.ui.View)
with Select menus (up to 25 options per dropdown)
- GatewayRunner._handle_model_command() detects adapter capability via
getattr(type(adapter), 'send_model_picker', None) (safe with mocks)
and sends picker with async callback closure for the switch logic
- Callback performs full switch: switch_model(), cached agent update,
session override, pending model note — same as /model <name>
When a user replies in a Slack thread where the bot has an active
conversation session, the bot now processes the message even without
an explicit @mention. This improves UX for ongoing threaded
discussions.
Changes:
- Added set_session_store() to BasePlatformAdapter for adapters to
check active sessions
- Modified SlackAdapter to detect thread replies and check if a
session exists for that thread before requiring @mentions
- Updated GatewayRunner to inject the session store into adapters
- Added comprehensive tests for the new behavior
Fixes: Thread replies without @jarvis are now processed if there is
an active session, matching user expectations for conversation flow
The _async_flush_memories() helper accepts (session_id) but both the
/new and /resume handlers passed two arguments (session_id, session_key).
The TypeError was silently swallowed at DEBUG level, so memory extraction
never ran when users typed /new or /resume.
One call site (the session expiry watcher) was already fixed in 9c96f669,
but /new and /resume were missed.
- gateway/run.py:3247 — remove stray session_key from /new handler
- gateway/run.py:4989 — remove stray session_key from /resume handler
- tests/gateway/test_resume_command.py:222 — update test assertion
The canonical config key for model name is model.default (used by setup,
auth, runtime_provider, profile list, and CLI startup). But /model --global
wrote to model.name in both gateway and CLI paths.
This caused:
- hermes profile list showing the old model (reads model.default)
- Gateway restart reverting to the old model (_resolve_gateway_model reads model.default)
- CLI startup using the old model (main.py reads model.default)
The only reason it appeared to work in Telegram was the cached agent
staying alive with the in-place switch.
Fix: change all 3 write/read sites to use model.default.