Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new,
/bg, /reset, ...) is now a first-class Slack slash command instead of
a /hermes <subcommand>. Users get the same autocomplete-driven slash
picker experience Slack users expect and that Discord and Telegram
already provide.
Previously Slack registered ONE native slash (/hermes) and split on
the first word, so typing /btw in Slack's composer got 'couldn't find
an app for /btw' because the workspace manifest never declared it.
Changes
- hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest()
generate a Slack manifest from the registry (canonical names +
aliases + plugin commands), clamped to Slack's 50-slash cap with
/hermes reserved as the catch-all.
- gateway/platforms/slack.py: single regex matcher dispatches every
registered slash to _handle_slash_command, which dispatches on
command['command']. Legacy /hermes <subcommand> keeps working for
backward compat with older workspace manifests.
- hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack
manifest' command prints/writes a full manifest (display info,
OAuth scopes, event subs, socket mode, slash commands) ready to
paste into 'Create from manifest' or Features → App Manifest.
- hermes_cli/setup.py: _setup_slack() now writes the manifest up-front
and points users at the 'From an app manifest' flow; also offers
to refresh the manifest on reconfigure for picking up new commands.
- Tests: 14 new tests covering native-slash dispatch (/btw, /stop,
/model), legacy /hermes <sub> compat, manifest structure, and
telegram<->slack parity (every Telegram command must also register
as a Slack slash). Existing /hermes-registration test updated to
assert the new regex matches /hermes, /btw, /stop, /model, /help.
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest
flow in Step 1; cli-commands.md documents 'hermes slack manifest'.
Users pick up the new slashes by running 'hermes slack manifest --write'
and pasting into Features → App Manifest → Edit in their Slack app
config, then Save (Slack prompts for reinstall if scopes changed).
Address Copilot review findings:
1. Gate _last_activity_desc on interrupt_depth == 0 alongside _last_activity_ts.
Both fields are semantically paired — desc describes the activity *at* ts.
Updating desc without ts made get_activity_summary() report "starting new
turn (cached)" for 20+ minutes while the timestamp showed the true stale
duration, producing misleading diagnostic output.
2. Monkeypatch gateway.run.time.time to a fixed epoch in tests that assert
on _last_activity_ts values. Real time.time() comparisons were latently
flaky under slow CI or NTP adjustments. _FAKE_NOW = 10_000.0 is used
as the reference; assertions are now exact equality rather than >=.
3. Add test_fresh_turn_resets_desc and test_interrupt_turn_preserves_desc to
directly cover the gated desc behaviour introduced by (1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_last_activity_ts was unconditionally reset to time.time() on every
_agent_cache hit. For interrupt-recursive _run_agent calls
(_interrupt_depth > 0) this silently reset the inactivity watchdog's
idle clock on each re-entry, preventing the 30-min timeout from ever
firing when a turn got stuck in an interrupt loop. A stuck session
would emit "Still working... iteration 0/60, starting new turn (cached)"
heartbeats indefinitely instead of timing out.
Gate the reset on _interrupt_depth == 0 only. Fresh external turns
still receive the reset so a session idle for 29 min doesn't trip the
watchdog before the new turn makes its first API call (#9051).
The per-turn reset logic is extracted into a static helper
_init_cached_agent_for_turn() to make it directly testable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to #6616 covering the remaining user-injected prompt markers that
the original PR did not touch (reporter's second comment on #6576 explicitly
flagged these). Azure OpenAI Default/DefaultV2 content filters treat any
bracketed [SYSTEM: ...] as prompt-injection and reject with HTTP 400.
Remaining call sites renamed:
- cli.py: background-process notifications (watch_disabled, watch_match,
completion), MCP reload notice (4 live + 1 docstring)
- gateway/run.py: same notification paths + auto-loaded skill banner +
MCP reload notice (5 live + 1 docstring)
- tools/process_registry.py: comment reference
Not renamed:
- environments/hermes_base_env.py '[SYSTEM]\n{content}' — RL training
trajectory rendering only, never sent to Azure, part of a symmetric
[USER]/[ASSISTANT]/[TOOL] scheme.
AUTHOR_MAP: buraysandro9@gmail.com -> ygd58.
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.
Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).
What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
the running-agent guard (board writes don't touch agent state), so
/kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
updated, sidebar entry added.
Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
can serve many businesses with data isolation by workspace path.
Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetic via scripts/run_tests.sh.
Follow-up to PR #16053 (/btw as /background alias). Cleans up the
plumbing added exclusively for the old ephemeral /btw handler and
repairs a broken btw bypass that landed between my refactor and this
follow-up.
run_agent.py:
- Remove persist_session kwarg, instance attr, and _persist_session
short-circuit. Only /btw ever passed persist_session=False; with
/btw gone the default (always persist) is the only behavior anyone
ever wanted.
gateway/run.py:
- Remove the unreachable 'if _cmd_def_inner.name == "btw"' block
(PR #16059). Canonical name for a /btw message is 'background' after
alias resolution — the comparison could never be true, and it called
_handle_btw_command which no longer exists. The /background branch
above it already dispatches /btw correctly.
tests/gateway/test_running_agent_session_toggles.py:
- Fix test_btw_dispatches_mid_run to mock _handle_background_command
(the real dispatch target for /btw) instead of the deleted
_handle_btw_command.
/btw spawns a parallel ephemeral side-question task (self-guarded against
concurrent /btw on the same chat) — exactly like /background. But it was
missing from the running-agent bypass list in _handle_message(), so it
fell through to the catch-all and returned:
⏳ Agent is running — /btw can't run mid-turn. Wait for the current
response or /stop first.
That's the opposite of what /btw is for — asking a side question while
the main turn is still working. Add the bypass next to /background and a
regression test covering the mid-turn dispatch path.
Reported by @IuriiTiunov on Telegram.
The ephemeral no-tools side-question variant of /btw confused users who
expected 'by-the-way' to mean 'run this off to the side with tools' —
they'd type /btw and get a toolless agent that couldn't do the work.
/bg worked because it was /background with full tools.
Collapse the two: /btw and /bg both alias to /background. One command,
one behavior, no more gotchas about which variant has tools.
Removed:
- _handle_btw_command in cli.py and gateway/run.py
- _run_btw_task + _active_btw_tasks state in gateway/run.py
- prompt.btw JSON-RPC method + btw.complete event in tui_gateway
- BtwStartResponse type + btw.complete case in ui-tui
- Standalone /btw slash tree registration in Discord
- Standalone btw CommandDef in hermes_cli/commands.py
Updated:
- background CommandDef aliases: (bg,) -> (bg, btw)
- TUI session.ts: local btw handler merged into background
- Docs and tips updated to describe /btw as a /background alias
Instead of a blocking first-run questionnaire, show a one-time hint the first
time the user hits each behavior fork:
1. First message while the agent is working — appends a hint to the busy-ack
explaining the /busy queue vs /busy interrupt knob, phrased to match the
mode that was just applied (don't tell a queue-mode user to switch to
queue).
2. First tool that runs for >= 30s in the noisiest progress mode
(tool_progress: all) — prints a hint about /verbose to cycle display
modes (all -> new -> off -> verbose). Gated on /verbose actually being
usable on the surface: always shown on CLI; on gateway only shown when
display.tool_progress_command is enabled.
Each hint is latched in config.yaml under onboarding.seen.<flag>, so it
fires exactly once per install across CLI, gateway, and cron, then never
again. Users can wipe the section to re-see hints.
New:
- agent/onboarding.py — is_seen / mark_seen / hint strings, shared by
both CLI and gateway.
- onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in
load_cli_config defaults (cli.py). No _config_version bump — deep
merge handles new keys.
Wired:
- gateway/run.py: _handle_active_session_busy_message appends the hint
after building the ack. progress_callback tracks tool.completed
duration and queues the tool-progress hint into the progress bubble.
- cli.py: CLI input loop appends the busy-input hint on the first busy
Enter; _on_tool_progress appends the tool-progress hint on the first
>=30s tool completion. In-memory CLI_CONFIG is also updated so
subsequent fires in the same process are suppressed immediately.
All writes go through atomic_yaml_write and are wrapped in try/except
so onboarding can never break the input/busy-ack paths.
The base adapter's auto-TTS path fired on any voice message unless the
chat had explicitly run /voice off — it never read voice.auto_tts from
config.yaml, so users who set auto_tts: false still got audio replies.
Gate the base adapter on a three-layer decision instead:
1. chat in _auto_tts_enabled_chats (explicit /voice on|tts) → fire
2. chat in _auto_tts_disabled_chats (explicit /voice off) → suppress
3. else → voice.auto_tts global default
Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default
and mirrors /voice on|tts chats into _auto_tts_enabled_chats via the
existing _sync_voice_mode_state_to_adapter path. /voice off still wins.
Closes#16007.
When the LLM response carries N parallel tool calls, the agent fires
N tool.started events back-to-back before its interrupt check runs.
A user sending /stop mid-batch would see the '⚡ Interrupting current
task' ack followed by a trail of 🔍 web_search bubbles for the remaining
events in the batch — making the interrupt feel ignored.
progress_callback and the drain loop in send_progress_messages now
check agent.is_interrupted (via agent_holder[0], the existing
cross-scope handle). Events that arrive after interrupt are dropped
at both the queueing and rendering stages. The '⚡ Interrupting'
message is sent through a separate adapter path and is unaffected.
Fixes#15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback.
## What changed
New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching.
`agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe).
Wired through five call sites that previously either duplicated the lookup or ignored it entirely:
- `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning)
- `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch
- `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg
- `gateway/run.py` /model confirmation (picker callback + text path)
- `gateway/run.py` `_format_session_info` (/info)
## Context probe tiers
`CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency.
## Repro (from #15779)
```yaml
custom_providers:
- name: my-custom-endpoint
base_url: https://example.invalid/v1
model: gpt-5.5
models:
gpt-5.5:
context_length: 1050000
```
`/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000".
## Tests
- `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants
- `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver
- `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K)
- `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier
task.cancel() can't preempt the run_in_executor thread running
run_conversation(), so we rely on agent.interrupt() to wake the loop.
Without a timeout, a slow/unresponsive interrupt blocks the HTTP
response indefinitely. Wrap the await in wait_for(shield(task), 5.0)
and log a warning on timeout.
Also tidy one extra space in the module docstring's /stop entry.
Add ability to interrupt a running agent via the runs API. Previously
/v1/runs could start a run and subscribe to events, but there was no
way to cancel it. The new endpoint stores agent and task references
during execution, calls agent.interrupt() to stop LLM calls, then
cancels the asyncio task.
Includes 15 tests covering start, events, and stop scenarios.
When a user sends /stop during a streaming API call, the outer poll loop
detects _interrupt_requested and closes the HTTP connection. However, the
inner _call() thread catches the connection error and enters its retry
loop — opening a FRESH connection without checking the interrupt flag.
On slow providers like ollama-cloud, each retry attempt blocks for the
full stream-read timeout (120s+). With 3 retry attempts this caused
510+ second delays between /stop and actual response — the agent appeared
completely unresponsive despite the stop being acknowledged.
Fix: add an _interrupt_requested check at the top of the streaming retry
loop so the agent exits immediately instead of retrying.
Also fix log truncation: all session key logging in gateway/run.py used
[:20] or [:30] slices, which truncated 'agent:main:telegram:dm:5690190437'
(33 chars) to 'agent:main:telegram:' — losing the identifying chat type
and user ID. Replace with full keys to make logs debuggable.
Reported by user Sidharth Pulipaka via Telegram on ollama-cloud provider.
The AIAgent.flush_memories pre-compression save, the gateway
_flush_memories_for_session, and everything feeding them are
obsolete now that the background memory/skill review handles
persistent memory extraction.
Problems with flush_memories:
- Pre-dates the background review loop. It was the only memory-save
path when introduced; the background review now fires every 10 user
turns on CLI and gateway alike, which is far more frequent than
compression or session reset ever triggered flush.
- Blocking and synchronous. Pre-compression flush ran on the live agent
before compression, blocking the user-visible response.
- Cache-breaking. Flush built a temporary conversation prefix
(system prompt + memory-only tool list) that diverged from the live
conversation's cached prefix, invalidating prompt caching. The
gateway variant spawned a fresh AIAgent with its own clean prompt
for each finalized session — still cache-breaking, just in a
different process.
- Redundant. Background review runs in the live conversation's
session context, gets the same content, writes to the same memory
store, and doesn't break the cache. Everything flush_memories
claimed to preserve is already covered.
What this removes:
- AIAgent.flush_memories() method (~248 LOC in run_agent.py)
- Pre-compression flush call in _compress_context
- flush_memories call sites in cli.py (/new + exit)
- GatewayRunner._flush_memories_for_session + _async_flush_memories
(and the 3 call sites: session expiry watcher, /new, /resume)
- 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks,
hermes tools UI task list, auxiliary_client docstrings
- _memory_flush_min_turns config + init
- #15631's headroom-deduction math in
_check_compression_model_feasibility (headroom was only needed
because flush dragged the full main-agent system prompt along;
the compression summariser sends a single user-role prompt so
new_threshold = aux_context is safe again)
- The dedicated test files and assertions that exercised
flush-specific paths
What this renames (with read-time backcompat on sessions.json):
- SessionEntry.memory_flushed -> SessionEntry.expiry_finalized.
The session-expiry watcher still uses the flag to avoid re-running
finalize/eviction on the same expired session; the new name
reflects what it now actually gates. from_dict() reads
'expiry_finalized' first, falls back to the legacy 'memory_flushed'
key so existing sessions.json files upgrade seamlessly.
Supersedes #15631 and #15638.
Tested: 383 targeted tests pass across run_agent/, agent/, cli/,
and gateway/ session-boundary suites. No behavior regressions —
background memory review continues to handle persistent memory
extraction on both CLI and gateway.
Both discord (read/participate) and discord_admin (server admin) are now
configurable via `hermes tools` with default-OFF. Previously the core
discord tool (fetch_messages, search_members, create_thread) auto-loaded
on every Discord install with DISCORD_BOT_TOKEN set — 19 tools the user
never opted into.
Adds a platform-scoping mechanism (_TOOLSET_PLATFORM_RESTRICTIONS) so
the discord toolsets only show up in the Discord platform's checklist,
not on CLI/Telegram/Slack/etc. Applied at four gates:
- _prompt_toolset_checklist: checklist filter
- _get_platform_tools: resolution filter (both branches)
- _save_platform_tools: save-time filter (covers 'Configure all
platforms' and hand-edited config.yaml)
- tools_disable_enable_command: rejects `hermes tools enable discord`
on non-Discord platforms with a clear error
build_session_context_prompt now injects the Discord IDs block only
when both conditions hold: the discord/discord_admin toolset is
enabled AND DISCORD_BOT_TOKEN is set. Toolset alone isn't enough —
the tool's check_fn gates on the token at registry time, so opting
in without a token yields no tools and the IDs block would lie.
Otherwise keep the stale-API disclaimer.
When DISCORD_BOT_TOKEN is set — meaning the discord tool actually
loads — emit a dedicated IDs block in the session context prompt so
the agent can call ``fetch_messages``, ``pin_message``, etc. with
real identifiers instead of probing.
Currently only ``thread_id`` was exposed as a raw ID (via the
``description`` string). The agent in a Discord thread had to guess
that the thread ID doubles as a channel ID for the REST API (it
does), and it had no way to reference the parent channel, the guild,
or the triggering message at all.
The block adapts to context:
- Thread: guild / parent channel / thread / message
- Channel: guild / channel / message
- (DM has no guild/channel IDs worth listing; only message)
Discord isn't in _PII_SAFE_PLATFORMS, so IDs ship unredacted.
The Discord platform note in the session context prompt claimed the
agent has no server-management APIs — pre-dating the discord tool.
With a bot token configured the agent actually has fetch_messages,
search_members, create_thread, and optionally the discord_admin tool;
telling the model otherwise causes it to refuse or apologise for
calls it is fully able to make.
Gate the disclaimer on DISCORD_BOT_TOKEN being unset, matching the
tool's own ``check_fn``. Without a token the note still appears and
remains accurate; with a token the model is no longer gaslit into
refusing valid tool calls.
Discord knows all four identifiers for every inbound message — guild,
channel (or thread), parent channel when in a thread, and the
triggering message. Pass them into ``SessionSource`` via the new
``build_source()`` kwargs so downstream code (context-prompt builder,
delivery, logging) can use them without re-resolving from discord.py
objects.
For auto-threaded messages, remember the original channel as the
parent before swapping ``chat_id`` to the freshly created thread.
Behavioural: still a no-op — nothing consumes these fields yet.
Groundwork for injecting raw platform identifiers into the agent's
system prompt. Currently only `thread_id` is exposed as a raw ID —
callers in a Discord thread had to guess `channel_id == thread_id`
(which happens to work because threads are channels in Discord's REST
API) and had no way to reference the parent channel, guild, or the
triggering message.
Adds three optional fields:
- `guild_id` — Discord guild / Slack workspace / Matrix server scope
- `parent_chat_id` — parent channel when chat_id refers to a thread
- `message_id` — ID of the triggering message (pin/reply/react)
Extends `BasePlatformAdapter.build_source()` to accept + forward them
and teaches `to_dict`/`from_dict` to serialize them. Behaviourally a
no-op: nothing reads the fields yet and they default to None.
Two adjustments to make CI pass:
- In gateway/platforms/matrix.py: `DeviceID` is `NewType("DeviceID", str)`,
so passing `client.device_id` directly (already a str) works identically
at runtime. The explicit import was cosmetic and tripped CI environments
where `mautrix.types` doesn't re-export DeviceID at the expected path
("cannot import name 'DeviceID' from 'mautrix.types' (unknown location)").
- In tests/gateway/test_matrix.py: add `put_device_id` to the hand-written
`PgCryptoStore` fake so the three encryption-path tests
(test_connect_with_access_token_and_encryption,
test_connect_uses_configured_device_id_over_whoami,
test_connect_registers_encrypted_event_handler_when_encryption_on) can
exercise the new crypto-store binding without AttributeError.
PgCryptoStore.__init__ defaults _device_id to "" and put_account writes
that blank value into crypto_account. The UPSERT's ON CONFLICT DO UPDATE
clause deliberately does not touch device_id, so once the row is written
blank it stays blank forever — breaking every downstream device-scoped
olm operation. Peers' to-device olm ciphertext can't match our identity
key, no megolm sessions ever land, and the user sees "hermes is in the
room but never responds to encrypted messages".
Fix: call put_device_id(client.device_id) immediately after
crypto_store.open() and before olm.load(). This sets the store's
in-memory _device_id so the first put_account INSERT writes the correct
value from the start.
Observable symptoms without the fix, on a fresh crypto.db:
- crypto_account.device_id = ""
- crypto_tracked_user: 0 rows
- crypto_device: 0 rows
- crypto_olm_session: 0 rows
- crypto_megolm_inbound_session: 0 rows
- "No one-time keys nor device keys got when trying to share keys"
warning on every startup
- "olm event doesn't contain ciphertext for this device" DecryptionError
on any inbound to-device event
- Encrypted room messages arrive but never decrypt
After the fix (wiped crypto.db + restart):
- device_id populated with actual runtime device (e.g. CZIKTRFLOV)
- all counts populate from sync as expected
- encrypted DMs flow normally
Who hits this: anyone with a fresh crypto.db — includes first-time matrix
E2EE setup, nio→mautrix migrations (since matrix.py removes the legacy
pickle on startup, creating a fresh SQLite store), and anyone who wipes
crypto.db to start over. Existing installs that somehow already have a
non-blank device_id would be unaffected, but no prior code path writes
it correctly, so that set is likely empty.
/model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because
the display block used ModelInfo.context_window directly from models.dev.
Codex OAuth actually enforces 272K for the same slug, and the agent's
compressor already runs at 272K via get_model_context_length() — so the
banner + real context budget said 272K while /model lied with 1M.
Route the display context through a new resolve_display_context_length()
helper that always prefers agent.model_metadata.get_model_context_length
(which knows about Codex OAuth, Copilot, Nous caps) and only falls back
to models.dev when that returns nothing.
Fix applied to all 3 /model display sites:
cli.py _handle_model_switch
gateway/run.py picker on_model_selected callback
gateway/run.py text-fallback confirmation
Reported by @emilstridell (Telegram, April 2026).
Extends PR #15171 to also cover the server-side cancellation path (aiohttp
shutdown, request-level timeout) — previously only ConnectionResetError
triggered the incomplete-snapshot write, so cancellations left the store
stuck at the in_progress snapshot written on response.created.
Factors the incomplete-snapshot build into a _persist_incomplete_if_needed()
helper called from both the ConnectionResetError and CancelledError
branches; the CancelledError handler re-raises so cooperative cancellation
semantics are preserved.
Adds two regression tests that drive _write_sse_responses directly (the
TestClient disconnect path races the server handler, which makes the
end-to-end assertion flaky).
When display.busy_input_mode is 'queue', the runner-level PRIORITY block
in _handle_message was still calling running_agent.interrupt() for every
text follow-up to an active session. The adapter-level busy handler
already honors queue mode (commit 9d147f7fd), but this runner-level path
was an unconditional interrupt regardless of config.
Adds a queue-mode branch that queues the follow-up via
_queue_or_replace_pending_event() and returns without interrupting.
Salvages the useful part of #12070 (@knockyai). The config fan-out to
per-platform extra was redundant — runner already loads busy_input_mode
directly via _load_busy_input_mode().
Follow-up to the canonical-identity session-key fix: pull the
JID/LID normalize/expand/canonical helpers into gateway/whatsapp_identity.py
instead of living in two places. gateway/session.py (session-key build) and
gateway/run.py (authorisation allowlist) now both import from the shared
module, so the two resolution paths can't drift apart.
Also switches the auth path from module-level _hermes_home (cached at
import time) to dynamic get_hermes_home() lookup, which matches the
session-key path and correctly reflects HERMES_HOME env overrides. The
lone test that monkeypatched gateway.run._hermes_home for the WhatsApp
auth path is updated to set HERMES_HOME env var instead; all other
tests that monkeypatch _hermes_home for unrelated paths (update,
restart drain, shutdown marker, etc.) still work — the module-level
_hermes_home is untouched.
Hermes' WhatsApp bridge routinely surfaces the same person under either
a phone-format JID (60123456789@s.whatsapp.net) or a LID (…@lid),
and may flip between the two for a single human within the same
conversation. Before this change, build_session_key used the raw
identifier verbatim, so the bridge reshuffling an alias form produced
two distinct session keys for the same person — in two places:
1. DM chat_id — a user's DM sessions split in half, transcripts and
per-sender state diverge.
2. Group participant_id (with group_sessions_per_user enabled) — a
member's per-user session inside a group splits in half for the
same reason.
Add a canonicalizer that walks the bridge's lid-mapping-*.json files
and picks the shortest/numeric-preferred alias as the stable identity.
build_session_key now routes both the DM chat_id and the group
participant_id through this helper when the platform is WhatsApp.
All other platforms and chat types are untouched.
Expose canonical_whatsapp_identifier and normalize_whatsapp_identifier
as public helpers. Plugins that need per-sender behaviour (role-based
routing, per-contact authorization, policy gating) need the same
identity resolution Hermes uses internally; without a public helper,
each plugin would have to re-implement the walker against the bridge's
internal on-disk format. Keeping this alongside build_session_key
makes it authoritative and one refactor away if the bridge ever
changes shape.
_expand_whatsapp_aliases stays private — it's an implementation detail
of how the mapping files are walked, not a contract callers should
depend on.
When the primary provider raises AuthError (expired OAuth token,
revoked API key), the error was re-raised before AIAgent was created,
so fallback_model was never consulted. Now both gateway/run.py and
cron/scheduler.py catch AuthError specifically and attempt to resolve
credentials from the fallback_providers/fallback_model config chain
before propagating the error.
Closes#7230
Keep Discord Copilot model switching responsive and current by refreshing picker data from the live catalog when possible, correcting the curated fallback list, and clearing stale controls before the switch completes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Follow-up to the allowed_channels wildcard fix in the preceding commit.
The same '*' literal trap affected two other Discord channel config lists:
- DISCORD_IGNORED_CHANNELS: '*' was stored as the literal string in the
ignored set, and the intersection check never matched real channel IDs,
so '*' was a no-op instead of silencing every channel.
- DISCORD_FREE_RESPONSE_CHANNELS: same shape — '*' never matched, so
the bot still required a mention everywhere.
Add a '*' short-circuit to both checks, matching the allowed_channels
semantics. Extend tests/gateway/test_discord_allowed_channels.py with
regression coverage for all three lists.
Refs: #14920
allowed_channels: "*" in config (or DISCORD_ALLOWED_CHANNELS="*" env var)
is meant to allow all channels, but the check was comparing numeric channel
IDs against the literal string set {"*"} via set intersection — always empty,
so every message was silently dropped.
Add a "*" short-circuit before the set intersection, consistent with every
other platform's allowlist handling (Signal, Slack, Telegram all do this).
Fixes#14920
Introduces a new plugin hook `pre_gateway_dispatch` fired once per
incoming MessageEvent in `_handle_message`, after the internal-event
guard but before the auth / pairing chain. Plugins may return a dict
to influence flow:
{"action": "skip", "reason": "..."} -> drop (no reply)
{"action": "rewrite", "text": "..."} -> replace event.text
{"action": "allow"} / None -> normal dispatch
Motivation: gateway-level message-flow patterns that don't fit cleanly
into any single adapter — e.g. listen-only group-chat windows (buffer
ambient messages, collapse on @mention), or human-handover silent
ingest (record messages while an owner handles the chat manually).
Today these require forking core; with this hook they can live in a
single profile-agnostic plugin.
Hook runs BEFORE auth so plugins can handle unauthorized senders
(e.g. customer-service handover ingest) without triggering the
pairing-code flow. Exceptions in plugin callbacks are caught and
logged; the first non-None action dict wins, remaining results are
ignored.
Includes:
- `VALID_HOOKS` entry + inline doc in `hermes_cli/plugins.py`
- Invocation block in `gateway/run.py::_handle_message`
- 5 new tests in `tests/gateway/test_pre_gateway_dispatch.py`
(skip, rewrite, allow, exception safety, internal-event bypass)
- 2 additional tests in `tests/hermes_cli/test_plugins.py`
- Table entry in `website/docs/user-guide/features/plugins.md`
Made-with: Cursor
Manual /compress crashed with 'LCMEngine' object has no attribute
'_align_boundary_forward' when any context-engine plugin was active.
The gateway handler reached into _align_boundary_forward and
_find_tail_cut_by_tokens on tmp_agent.context_compressor, but those
are ContextCompressor-specific — not part of the generic ContextEngine
ABC — so every plugin engine (LCM, etc.) raised AttributeError.
- Add optional has_content_to_compress(messages) to ContextEngine ABC
with a safe default of True (always attempt).
- Override it in the built-in ContextCompressor using the existing
private helpers — preserves exact prior behavior for 'compressor'.
- Rewrite gateway /compress preflight to call the ABC method, deleting
the private-helper reach-in.
- Add focus_topic to the ABC compress() signature. Make _compress_context
retry without focus_topic on TypeError so older strict-sig plugins
don't crash on manual /compress <focus>.
- Regression test with a fake ContextEngine subclass that only
implements the ABC (mirrors LCM's surface).
Reported by @selfhostedsoul (Discord, Apr 22).
Follow-up to Magaav's safe sync policy. Two gaps in the canonicalizer
caused false diffs or silent drift:
1. discord.py's AppCommand.to_dict() omits nsfw, dm_permission, and
default_member_permissions — those live only on attributes. The
canonicalizer was reading them via payload.get() and getting defaults
(False/True/None), while the desired side from Command.to_dict(tree)
had the real values. Any command using non-default permissions
false-diffed on every startup. Pull them from the AppCommand
attributes via _existing_command_to_payload().
2. contexts and integration_types weren't canonicalized at all, so
drift in either was silently ignored. Added both to
_canonicalize_app_command_payload (sorted for stable compare).
Also normalized default_member_permissions to str-or-None since the
server emits strings but discord.py stores ints locally.
Added regression tests for both gaps.
Replaces blind tree.sync() on every Discord reconnect with a diff-based
reconcile. In safe mode (default), fetch existing global commands,
compare desired vs existing payloads, skip unchanged, PATCH changed,
recreate when non-patchable metadata differs, POST missing, and delete
stale commands one-by-one. Keeps 'bulk' for legacy behavior and 'off'
to skip startup sync entirely.
Fixes restart-heavy workflows that burn Discord's command write budget
and can surface 429s when iterating on native slash commands.
Env var: DISCORD_COMMAND_SYNC_POLICY (safe|bulk|off), default 'safe'.
Co-authored-by: Codex <codex@openai.invalid>
cmd_update no longer SIGKILLs in-flight agent runs, and users get
'still working' status every 3 min instead of 10. Two long-standing
sources of '@user — agent gives up mid-task' reports on Telegram and
other gateways.
Drain-aware update:
- New helper hermes_cli.gateway._graceful_restart_via_sigusr1(pid,
drain_timeout) sends SIGUSR1 to the gateway and polls os.kill(pid,
0) until the process exits or the budget expires.
- cmd_update's systemd loop now reads MainPID via 'systemctl show
--property=MainPID --value' and tries the graceful path first. The
gateway's existing SIGUSR1 handler -> request_restart(via_service=
True) -> drain -> exit(75) is wired in gateway/run.py and is
respawned by systemd's Restart=on-failure (and the explicit
RestartForceExitStatus=75 on newer units).
- Falls back to 'systemctl restart' when MainPID is unknown, the
drain budget elapses, or the unit doesn't respawn after exit (older
units missing Restart=on-failure). Old install behavior preserved.
- Drain budget = max(restart_drain_timeout, 30s) + 15s margin so the
drain loop in run_agent + final exit have room before fallback
fires. Composes with #14728's tool-subprocess reaping.
Notification interval:
- agent.gateway_notify_interval default 600 -> 180.
- HERMES_AGENT_NOTIFY_INTERVAL env-var fallback in gateway/run.py
matched.
- 9-minute weak-model spinning runs now ping at 3 min and 6 min
instead of 27 seconds before completion, removing the 'is the bot
dead?' reflex that drives gateway-restart cycles.
Tests:
- Two new tests in tests/hermes_cli/test_update_gateway_restart.py:
one asserts SIGUSR1 is sent and 'systemctl restart' is NOT called
when MainPID is known and the helper succeeds; one asserts the
fallback fires when the helper returns False.
- E2E: spawned detached bash processes confirm the helper returns
True on SIGUSR1-handling exit (~0.5s) and False on SIGUSR1-ignoring
processes (timeout). Verified non-existent PID and pid=0 edge cases.
- 41/41 in test_update_gateway_restart.py (was 39, +2 new).
- 154/154 in shutdown-related suites including #14728's new tests.
Reported by @GeoffWellman and @ANT_1515 on X.