hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-30 01:41:43 +00:00

Author	SHA1	Message	Date
Tranquil-Flow	bf05b8f4a2	fix(gateway): clean up cached agents on shutdown (#11205)	2026-04-26 12:51:53 -07:00
Zainan Victor Zhou	778fd1898e	fix(slack): surface attachment access diagnostics Translate Slack attachment failures into actionable user-facing notices instead of generic download errors. When a scope/auth/permission issue breaks attachment processing, the user sees: [Slack attachment notice] - Slack attachment access failed for photo.jpg. Missing scope: files:read. Update the Slack app scopes/settings and reinstall the app to the workspace. Two helpers do the translation: _describe_slack_api_error — handles SlackApiError responses (missing_scope, invalid_auth, file_not_found, access_denied, etc.) _describe_slack_download_failure — handles httpx.HTTPStatusError (401/403/404) and Slack-returns-HTML-sign-in fallbacks Wired into three existing call sites: - the Slack Connect files.info path (PR #11111) so scope errors surface instead of being logged as generic "files.info failed" - the image, audio, and document download paths so 401/403 and HTML-body responses translate into actionable notices Adjustment from original PR: dropped _probe_slack_file_access_issue, the proactive pre-download files.info probe. It added one extra Slack API call per attachment even on healthy ones, and overlapped with the existing files.info call from PR #11111. The post-failure translation path covers the same user-facing diagnostic value without the per-message tax. Also documents files:read scope more prominently in the Slack setup guide and troubleshooting table. Contributed back from https://github.com/xinbenlv/zn-hermes-agent. Closes #7015. Co-authored-by: xinbenlv <zzn+pa@zzn.im>	2026-04-26 12:47:43 -07:00
kunlabs	f9885130b4	fix(slack): download files in Slack Connect channels Slack Connect channels return file objects with file_access="check_file_info" and no url_private_download field (see https://docs.slack.dev/reference/objects/file-object/#slack_connect_files). These stub objects must be resolved via files.info before download can proceed. Without this the agent silently skips attachments posted in Slack Connect channels. Call files.info on every file whose file_access is check_file_info, replace the stub with the full file object, and let the existing download path continue. Warn and skip on files.info failures. Closes #11095.	2026-04-26 12:35:16 -07:00
flobo3	f414df3a56	fix(slack): include team_id in thread-context cache key	2026-04-26 12:35:16 -07:00
Satoshi-agi	c0d25df311	fix(slack): preserve thread-parent context when cron/bot posted the parent The Slack thread-context fetcher used to drop every message with a bot_id, which silently erased the thread parent whenever a cron job (or any other bot) had posted it. As a result, replies to a cron-posted summary lost all context and the agent answered as if from a blank thread. Changes: 1. gateway/platforms/slack.py::_fetch_thread_context - Keep the thread parent even when it was posted by a bot (e.g. cron summaries, third-party integrations). - Only skip our own prior bot replies to avoid circular context, matching the per-workspace bot user id via _team_bot_user_ids so multi-workspace deployments stay correct. - Keep non-self bot children (useful third-party context). 2. gateway/platforms/slack.py::_handle_slack_message - Populate MessageEvent.reply_to_text for thread replies (parity with Telegram/Discord/Feishu/WeCom). gateway.run uses this field to inject a [Replying to: "..."] prefix when the parent is not already in the session history, which is exactly the scenario triggered by cron-generated thread parents. - New helper _fetch_thread_parent_text reuses the existing thread- context cache (and its 60s TTL) to avoid duplicate conversations.replies calls; falls back to a cheap limit=1 fetch when the cache is cold. Tests: - Updated TestSlackThreadContext::test_skips_bot_messages to reflect the new behaviour (self-bot child dropped, third-party bot kept). - Added: * test_fetch_thread_context_includes_bot_parent * test_fetch_thread_context_excludes_self_bot_replies * test_fetch_thread_context_multi_workspace * test_fetch_thread_context_current_ts_excluded (regression guard) * test_fetch_thread_parent_text_from_cache * test_slack_reply_to_text_set_on_thread_reply * test_slack_reply_to_text_none_for_top_level_message Full Slack suite: 176 passed (was 169).	2026-04-26 12:35:16 -07:00
hhuang91	802c7acb81	fix(Slack): resolve Slack channels by raw ID and enumerate joined channels send_message(target='slack:<channel_id>') failed with "Could not resolve" because _parse_target_ref had no Slack branch — Slack's uppercase alphanumeric IDs fell through to channel-name resolution, which only matched by name. As a fallback, the agent would retry with bare target='slack' and post to the home channel instead. Three fixes: - _parse_target_ref recognizes Slack IDs (C/G/D/U/W prefix) as explicit targets so the name-resolver is bypassed entirely. - resolve_channel_name tries a case-sensitive raw-ID match before the existing name match, so any platform's IDs resolve cleanly. - _build_slack now actually calls users.conversations against each workspace's AsyncWebClient (paginated), instead of only returning session-history entries. This populates the directory with public and private channels the bot has joined, so action='list' shows them and they can also be addressed by name. Errors from one workspace don't block others. build_channel_directory becomes async (Slack web calls require it). The two async-context callers in gateway/run.py are awaited; the cron ticker thread call bridges via asyncio.run_coroutine_threadsafe. Slack bot needs channels:read and groups:read scopes for full enumeration; missing scopes degrade gracefully per-workspace. addressing #15927	2026-04-26 12:29:02 -07:00
Honza Stepanovsky	50dd67c680	fix(slack): skip _mentioned_threads registration when strict_mention is on Extends the strict_mention feature so an @mention in strict mode no longer persistently tags the thread as 'mentioned'. Without this, the thread's first mention would permanently auto-trigger the bot on every subsequent message — which is exactly what strict_mention is designed to prevent. Closes the agent-to-agent ack loop hole hhhonzik identified in #14117. Co-authored-by: hhhonzik <me@janstepanovsky.cz>	2026-04-26 12:23:20 -07:00
Ching	aea4a90f0e	feat(slack): add opt-in slack.strict_mention gate for channel threads Adds a strict_mention config option that, when enabled, requires an explicit @-mention on every message in channel threads. Disables the 'once mentioned, forever in the thread' and session-presence auto-triggers. - New _slack_strict_mention() helper (config.extra + SLACK_STRICT_MENTION env) - Bridged top-level slack.strict_mention yaml to SLACK_STRICT_MENTION env, matching require_mention/allow_bots bridging - Unit tests for the helper + config bridge	2026-04-26 12:23:20 -07:00
Teknium	4b5a88d714	fix(slack): honor reply_in_thread=false for top-level channel messages Top-level channel messages arrive at _resolve_thread_ts with metadata.thread_id set to the message's own ts, because the inbound handler in _handle_message_event uses 'event.ts' as a session-keying fallback when event.thread_ts is absent. That made metadata alone insufficient to distinguish a real thread reply from a top-level message, so reply_in_thread=false only took effect in DMs. Use reply_to (== incoming message_id == ts for top-level messages) as the tiebreaker: when metadata.thread_id == reply_to the 'thread' is the synthetic session-keying fallback, not a real parent, so we reply directly in the channel. Real thread replies (reply_to != thread_id) still resolve to the parent thread and preserve conversation context. Closes #9268.	2026-04-26 12:04:46 -07:00
bde3249023	b1be86ef96	fix(gateway): bridge slack.reply_in_thread config	2026-04-26 12:04:46 -07:00
Zhi Yan Liu	d993a3f450	fix(gateway): use /hermes sethome in onboarding hint on Slack Slack's adapter registers a single parent slash command /hermes and dispatches subcommands via slack_subcommand_map(). Bare /sethome is not a registered command on Slack and fails with 'app did not respond', logging 'Unhandled request' in slack_bolt.AsyncApp. Show /hermes sethome in the first-run onboarding hint when the source platform is Slack; keep /sethome for Telegram, Discord, Matrix, Mattermost, and other platforms that register it directly. Fixes #14632	2026-04-26 11:56:23 -07:00
Teknium	1dfcc2ffc3	fix(gateway): /queue is now a true FIFO — each invocation gets its own turn (#16175) Repeated /queue commands now each produce a full agent turn, in order, with no merging. Previously the second /queue overwrote the first because the handler wrote directly into the adapter's single-slot _pending_messages dict. - GatewayRunner grows a _queued_events overflow buffer (dict of list). - /queue puts new items in the adapter's next-up slot when free, otherwise appends to the overflow. After each run's drain consumes the slot, the next overflow item is promoted so the recursive run picks it up. - /new and /reset clear the overflow. - /status now reports queue depth when non-zero. - Ack message shows the depth once it exceeds 1. Helpers (_enqueue_fifo, _promote_queued_event, _queue_depth) use the getattr default-fallback pattern so existing tests that build bare GatewayRunner instances via object.__new__ keep working.	2026-04-26 11:55:09 -07:00
Teknium	087e74d4d7	feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164) Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new, /bg, /reset, ...) is now a first-class Slack slash command instead of a /hermes <subcommand>. Users get the same autocomplete-driven slash picker experience Slack users expect and that Discord and Telegram already provide. Previously Slack registered ONE native slash (/hermes) and split on the first word, so typing /btw in Slack's composer got 'couldn't find an app for /btw' because the workspace manifest never declared it. Changes - hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest() generate a Slack manifest from the registry (canonical names + aliases + plugin commands), clamped to Slack's 50-slash cap with /hermes reserved as the catch-all. - gateway/platforms/slack.py: single regex matcher dispatches every registered slash to _handle_slash_command, which dispatches on command['command']. Legacy /hermes <subcommand> keeps working for backward compat with older workspace manifests. - hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack manifest' command prints/writes a full manifest (display info, OAuth scopes, event subs, socket mode, slash commands) ready to paste into 'Create from manifest' or Features → App Manifest. - hermes_cli/setup.py: _setup_slack() now writes the manifest up-front and points users at the 'From an app manifest' flow; also offers to refresh the manifest on reconfigure for picking up new commands. - Tests: 14 new tests covering native-slash dispatch (/btw, /stop, /model), legacy /hermes <sub> compat, manifest structure, and telegram<->slack parity (every Telegram command must also register as a Slack slash). Existing /hermes-registration test updated to assert the new regex matches /hermes, /btw, /stop, /model, /help.
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest flow in Step 1; cli-commands.md documents 'hermes slack manifest'. Users pick up the new slashes by running 'hermes slack manifest --write' and pasting into Features → App Manifest → Edit in their Slack app config, then Save (Slack prompts for reinstall if scopes changed).	2026-04-26 11:38:32 -07:00
briandevans	4e356098d2	fixup! fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654) Address Copilot review findings: 1. Gate _last_activity_desc on interrupt_depth == 0 alongside _last_activity_ts. Both fields are semantically paired — desc describes the activity at ts. Updating desc without ts made get_activity_summary() report "starting new turn (cached)" for 20+ minutes while the timestamp showed the true stale duration, producing misleading diagnostic output. 2. Monkeypatch gateway.run.time.time to a fixed epoch in tests that assert on _last_activity_ts values. Real time.time() comparisons were latently flaky under slow CI or NTP adjustments. _FAKE_NOW = 10_000.0 is used as the reference; assertions are now exact equality rather than >=. 3. Add test_fresh_turn_resets_desc and test_interrupt_turn_preserves_desc to directly cover the gated desc behaviour introduced by (1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 08:45:44 -07:00
briandevans	de24315978	fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654) _last_activity_ts was unconditionally reset to time.time() on every _agent_cache hit. For interrupt-recursive _run_agent calls (_interrupt_depth > 0) this silently reset the inactivity watchdog's idle clock on each re-entry, preventing the 30-min timeout from ever firing when a turn got stuck in an interrupt loop. A stuck session would emit "Still working... iteration 0/60, starting new turn (cached)" heartbeats indefinitely instead of timing out. Gate the reset on _interrupt_depth == 0 only. Fresh external turns still receive the reset so a session idle for 29 min doesn't trip the watchdog before the new turn makes its first API call (#9051). The per-turn reset logic is extracted into a static helper _init_cached_agent_for_turn() to make it directly testable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 08:45:44 -07:00
Teknium	20cb706e03	chore: extend [SYSTEM:→[IMPORTANT: rename + AUTHOR_MAP Follow-up to #6616 covering the remaining user-injected prompt markers that the original PR did not touch (reporter's second comment on #6576 explicitly flagged these). Azure OpenAI Default/DefaultV2 content filters treat any bracketed [SYSTEM: ...] as prompt-injection and reject with HTTP 400. Remaining call sites renamed: - cli.py: background-process notifications (watch_disabled, watch_match, completion), MCP reload notice (4 live + 1 docstring) - gateway/run.py: same notification paths + auto-loaded skill banner + MCP reload notice (5 live + 1 docstring) - tools/process_registry.py: comment reference Not renamed: - environments/hermes_base_env.py '[SYSTEM]\n{content}' — RL training trajectory rendering only, never sent to Azure, part of a symmetric [USER]/[ASSISTANT]/[TOOL] scheme. AUTHOR_MAP: buraysandro9@gmail.com -> ygd58.	2026-04-26 08:44:58 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081)" (#16098) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00
Teknium	454d883e69	refactor: drop persist_session plumbing + fix broken btw mid-turn bypass (#16075) Follow-up to PR #16053 (/btw as /background alias). Cleans up the plumbing added exclusively for the old ephemeral /btw handler and repairs a broken btw bypass that landed between my refactor and this follow-up. run_agent.py: - Remove persist_session kwarg, instance attr, and _persist_session short-circuit. Only /btw ever passed persist_session=False; with /btw gone the default (always persist) is the only behavior anyone ever wanted. gateway/run.py: - Remove the unreachable 'if _cmd_def_inner.name == "btw"' block (PR #16059). Canonical name for a /btw message is 'background' after alias resolution — the comparison could never be true, and it called _handle_btw_command which no longer exists. The /background branch above it already dispatches /btw correctly. tests/gateway/test_running_agent_session_toggles.py: - Fix test_btw_dispatches_mid_run to mock _handle_background_command (the real dispatch target for /btw) instead of the deleted _handle_btw_command.	2026-04-26 07:15:23 -07:00
Teknium	70f56e7605	fix(gateway): let /btw dispatch mid-turn instead of being rejected /btw spawns a parallel ephemeral side-question task (self-guarded against concurrent /btw on the same chat) — exactly like /background. But it was missing from the running-agent bypass list in _handle_message(), so it fell through to the catch-all and returned: ⏳ Agent is running — /btw can't run mid-turn. Wait for the current response or /stop first. That's the opposite of what /btw is for — asking a side question while the main turn is still working. Add the bypass next to /background and a regression test covering the mid-turn dispatch path. Reported by @IuriiTiunov on Telegram.	2026-04-26 07:11:10 -07:00
Teknium	7fa70b6c87	refactor: /btw is now an alias for /background (#16053) The ephemeral no-tools side-question variant of /btw confused users who expected 'by-the-way' to mean 'run this off to the side with tools' — they'd type /btw and get a toolless agent that couldn't do the work. /bg worked because it was /background with full tools. Collapse the two: /btw and /bg both alias to /background. One command, one behavior, no more gotchas about which variant has tools. Removed: - _handle_btw_command in cli.py and gateway/run.py - _run_btw_task + _active_btw_tasks state in gateway/run.py - prompt.btw JSON-RPC method + btw.complete event in tui_gateway - BtwStartResponse type + btw.complete case in ui-tui - Standalone /btw slash tree registration in Discord - Standalone btw CommandDef in hermes_cli/commands.py Updated: - background CommandDef aliases: (bg,) -> (bg, btw) - TUI session.ts: local btw handler merged into background - Docs and tips updated to describe /btw as a /background alias	2026-04-26 07:11:08 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	4bda9dcade	fix(gateway): honor voice.auto_tts config in auto-TTS gate (#16007) (#16039) The base adapter's auto-TTS path fired on any voice message unless the chat had explicitly run /voice off — it never read voice.auto_tts from config.yaml, so users who set auto_tts: false still got audio replies. Gate the base adapter on a three-layer decision instead: 1. chat in _auto_tts_enabled_chats (explicit /voice on|tts) → fire 2. chat in _auto_tts_disabled_chats (explicit /voice off) → suppress 3. else → voice.auto_tts global default Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default and mirrors /voice on|tts chats into _auto_tts_enabled_chats via the existing _sync_voice_mode_state_to_adapter path. /voice off still wins. Closes #16007.	2026-04-26 05:52:05 -07:00
Teknium	35c57cc46b	fix(gateway): suppress tool-progress bubbles after interrupt (#16034) When the LLM response carries N parallel tool calls, the agent fires N tool.started events back-to-back before its interrupt check runs. A user sending /stop mid-batch would see the '⚡ Interrupting current task' ack followed by a trail of 🔍 web_search bubbles for the remaining events in the batch — making the interrupt feel ignored. progress_callback and the drain loop in send_progress_messages now check agent.is_interrupted (via agent_holder[0], the existing cross-scope handle). Events that arrive after interrupt are dropped at both the queueing and rendering stages. The '⚡ Interrupting' message is sent through a separate adapter path and is unaffected.	2026-04-26 05:47:37 -07:00
Teknium	125de02056	fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844) Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from #15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000".
## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier	2026-04-25 18:47:53 -07:00
Teknium	01535a4732	fix(api_server): cap stop-run wait at 5s so interrupt can't hang handler task.cancel() can't preempt the run_in_executor thread running run_conversation(), so we rely on agent.interrupt() to wake the loop. Without a timeout, a slow/unresponsive interrupt blocks the HTTP response indefinitely. Wrap the await in wait_for(shield(task), 5.0) and log a warning on timeout. Also tidy one extra space in the module docstring's /stop entry.	2026-04-25 18:40:35 -07:00
ekko	0a15dbdc43	feat(api_server): add POST /v1/runs/{run_id}/stop endpoint Add ability to interrupt a running agent via the runs API. Previously /v1/runs could start a run and subscribe to events, but there was no way to cancel it. The new endpoint stores agent and task references during execution, calls agent.interrupt() to stop LLM calls, then cancels the asyncio task. Includes 15 tests covering start, events, and stop scenarios.	2026-04-25 18:40:35 -07:00
nerijusas	81e01f6ee9	fix(agent): preserve Codex message items for replay	2026-04-25 18:22:06 -07:00
Iris Jin	25ba6a4a74	fix(gateway): make reasoning session-scoped by default	2026-04-25 18:01:31 -07:00
kshitijk4poor	7c17accb29	fix: /stop now immediately aborts streaming retry loop When a user sends /stop during a streaming API call, the outer poll loop detects _interrupt_requested and closes the HTTP connection. However, the inner _call() thread catches the connection error and enters its retry loop — opening a FRESH connection without checking the interrupt flag. On slow providers like ollama-cloud, each retry attempt blocks for the full stream-read timeout (120s+). With 3 retry attempts this caused 510+ second delays between /stop and actual response — the agent appeared completely unresponsive despite the stop being acknowledged. Fix: add an _interrupt_requested check at the top of the streaming retry loop so the agent exits immediately instead of retrying. Also fix log truncation: all session key logging in gateway/run.py used [:20] or [:30] slices, which truncated 'agent:main:telegram:dm:5690190437' (33 chars) to 'agent:main:telegram:' — losing the identifying chat type and user ID. Replace with full keys to make logs debuggable. Reported by user Sidharth Pulipaka via Telegram on ollama-cloud provider.	2026-04-25 09:51:39 -07:00
Teknium	ea01bdcebe	refactor(memory): remove flush_memories entirely (#15696) The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered.
What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway.	2026-04-25 08:21:14 -07:00
Teknium	6ed37e0f42	feat(tools): make discord/discord_admin opt-in, Discord-only Both discord (read/participate) and discord_admin (server admin) are now configurable via `hermes tools` with default-OFF. Previously the core discord tool (fetch_messages, search_members, create_thread) auto-loaded on every Discord install with DISCORD_BOT_TOKEN set — 19 tools the user never opted into. Adds a platform-scoping mechanism (_TOOLSET_PLATFORM_RESTRICTIONS) so the discord toolsets only show up in the Discord platform's checklist, not on CLI/Telegram/Slack/etc. Applied at four gates: - _prompt_toolset_checklist: checklist filter - _get_platform_tools: resolution filter (both branches) - _save_platform_tools: save-time filter (covers 'Configure all platforms' and hand-edited config.yaml) - tools_disable_enable_command: rejects `hermes tools enable discord` on non-Discord platforms with a clear error build_session_context_prompt now injects the Discord IDs block only when both conditions hold: the discord/discord_admin toolset is enabled AND DISCORD_BOT_TOKEN is set. Toolset alone isn't enough — the tool's check_fn gates on the token at registry time, so opting in without a token yields no tools and the IDs block would lie. Otherwise keep the stale-API disclaimer.	2026-04-25 04:51:11 -07:00
alt-glitch	591deeb928	feat(session): inject Discord IDs block when discord tool is loaded When DISCORD_BOT_TOKEN is set — meaning the discord tool actually loads — emit a dedicated IDs block in the session context prompt so the agent can call ``fetch_messages``, ``pin_message``, etc. with real identifiers instead of probing. Currently only ``thread_id`` was exposed as a raw ID (via the ``description`` string). The agent in a Discord thread had to guess that the thread ID doubles as a channel ID for the REST API (it does), and it had no way to reference the parent channel, the guild, or the triggering message at all. The block adapts to context: - Thread: guild / parent channel / thread / message - Channel: guild / channel / message - (DM has no guild/channel IDs worth listing; only message) Discord isn't in _PII_SAFE_PLATFORMS, so IDs ship unredacted.	2026-04-25 04:51:11 -07:00
alt-glitch	5ae07e7b5c	fix(session): gate stale "no Discord APIs" note on DISCORD_BOT_TOKEN The Discord platform note in the session context prompt claimed the agent has no server-management APIs — pre-dating the discord tool. With a bot token configured the agent actually has fetch_messages, search_members, create_thread, and optionally the discord_admin tool; telling the model otherwise causes it to refuse or apologise for calls it is fully able to make. Gate the disclaimer on DISCORD_BOT_TOKEN being unset, matching the tool's own ``check_fn``. Without a token the note still appears and remains accurate; with a token the model is no longer gaslit into refusing valid tool calls.	2026-04-25 04:51:11 -07:00
alt-glitch	47b02e961c	feat(discord): populate guild_id, parent_chat_id, message_id on SessionSource Discord knows all four identifiers for every inbound message — guild, channel (or thread), parent channel when in a thread, and the triggering message. Pass them into ``SessionSource`` via the new ``build_source()`` kwargs so downstream code (context-prompt builder, delivery, logging) can use them without re-resolving from discord.py objects. For auto-threaded messages, remember the original channel as the parent before swapping ``chat_id`` to the freshly created thread. Behavioural: still a no-op — nothing consumes these fields yet.	2026-04-25 04:51:11 -07:00
alt-glitch	0702231dd8	feat(session): add guild_id/parent_chat_id/message_id to SessionSource Groundwork for injecting raw platform identifiers into the agent's system prompt. Currently only `thread_id` is exposed as a raw ID — callers in a Discord thread had to guess `channel_id == thread_id` (which happens to work because threads are channels in Discord's REST API) and had no way to reference the parent channel, guild, or the triggering message. Adds three optional fields: - `guild_id` — Discord guild / Slack workspace / Matrix server scope - `parent_chat_id` — parent channel when chat_id refers to a thread - `message_id` — ID of the triggering message (pin/reply/react) Extends `BasePlatformAdapter.build_source()` to accept + forward them and teaches `to_dict`/`from_dict` to serialize them. Behaviourally a no-op: nothing reads the fields yet and they default to None.	2026-04-25 04:51:11 -07:00
Clifford Garwood	2182de55bb	fix(matrix): drop needless DeviceID import + mock put_device_id in tests Two adjustments to make CI pass: - In gateway/platforms/matrix.py: `DeviceID` is `NewType("DeviceID", str)`, so passing `client.device_id` directly (already a str) works identically at runtime. The explicit import was cosmetic and tripped CI environments where `mautrix.types` doesn't re-export DeviceID at the expected path ("cannot import name 'DeviceID' from 'mautrix.types' (unknown location)"). - In tests/gateway/test_matrix.py: add `put_device_id` to the hand-written `PgCryptoStore` fake so the three encryption-path tests (test_connect_with_access_token_and_encryption, test_connect_uses_configured_device_id_over_whoami, test_connect_registers_encrypted_event_handler_when_encryption_on) can exercise the new crypto-store binding without AttributeError.	2026-04-25 07:17:03 +05:30
Clifford Garwood	3cf13747b7	fix(matrix): bind PgCryptoStore device_id so fresh E2EE installs work PgCryptoStore.__init__ defaults _device_id to "" and put_account writes that blank value into crypto_account. The UPSERT's ON CONFLICT DO UPDATE clause deliberately does not touch device_id, so once the row is written blank it stays blank forever — breaking every downstream device-scoped olm operation. Peers' to-device olm ciphertext can't match our identity key, no megolm sessions ever land, and the user sees "hermes is in the room but never responds to encrypted messages". Fix: call put_device_id(client.device_id) immediately after crypto_store.open() and before olm.load(). This sets the store's in-memory _device_id so the first put_account INSERT writes the correct value from the start. Observable symptoms without the fix, on a fresh crypto.db: - crypto_account.device_id = "" - crypto_tracked_user: 0 rows - crypto_device: 0 rows - crypto_olm_session: 0 rows - crypto_megolm_inbound_session: 0 rows - "No one-time keys nor device keys got when trying to share keys" warning on every startup - "olm event doesn't contain ciphertext for this device" DecryptionError on any inbound to-device event - Encrypted room messages arrive but never decrypt After the fix (wiped crypto.db + restart): - device_id populated with actual runtime device (e.g. CZIKTRFLOV) - all counts populate from sync as expected - encrypted DMs flow normally Who hits this: anyone with a fresh crypto.db — includes first-time matrix E2EE setup, nio→mautrix migrations (since matrix.py removes the legacy pickle on startup, creating a fresh SQLite store), and anyone who wipes crypto.db to start over. Existing installs that somehow already have a non-blank device_id would be unaffected, but no prior code path writes it correctly, so that set is likely empty.	2026-04-25 07:17:03 +05:30
Teknium	05d8f11085	fix(/model): show provider-enforced context length, not raw models.dev (#15438) /model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because the display block used ModelInfo.context_window directly from models.dev. Codex OAuth actually enforces 272K for the same slug, and the agent's compressor already runs at 272K via get_model_context_length(), so the banner and the real context budget said 272K while /model lied with 1M. Route the display context through a new resolve_display_context_length() helper that always prefers agent.model_metadata.get_model_context_length (which knows about Codex OAuth, Copilot, Nous caps) and only falls back to models.dev when that returns nothing. Fix applied to all 3 /model display sites: cli.py _handle_model_switch; the gateway/run.py picker on_model_selected callback; and the gateway/run.py text-fallback confirmation. Reported by @emilstridell (Telegram, April 2026).	2026-04-24 17:21:38 -07:00
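The fallback order described in the /model fix above can be sketched as follows. The helper name matches the commit message, but the signature is an assumption: the real helper presumably calls `get_model_context_length` directly, whereas this sketch takes the lookup as a parameter so that it is self-contained. The numbers in the usage note are the ones quoted in the commit, with 272K written as 272,000 for illustration.

```python
from typing import Callable, Optional


def resolve_display_context_length(
    model: str,
    provider_lookup: Callable[[str], Optional[int]],
    models_dev_context_window: Optional[int],
) -> Optional[int]:
    """Prefer the provider-enforced limit; fall back to the models.dev value."""
    enforced = provider_lookup(model)
    if enforced:  # None or 0 means the provider-aware lookup had no answer
        return enforced
    return models_dev_context_window
```

With a lookup that reports 272,000 for `gpt-5.5`, the display shows 272,000 even when models.dev lists 1,050,000; a model the lookup does not know falls through to the models.dev figure.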
simbam99	19a3e2ce8e	fix(gateway): follow compression continuations during /resume	2026-04-24 16:42:31 -07:00
Benjamin Sehl	f731c2c2bd	fix(gateway/bluebubbles): align iMessage delivery with non-editable UX	2026-04-24 16:04:37 -07:00
Teknium	36d68bcb82	fix(api-server): persist incomplete snapshot on asyncio.CancelledError too Extends PR #15171 to also cover the server-side cancellation path (aiohttp shutdown, request-level timeout) — previously only ConnectionResetError triggered the incomplete-snapshot write, so cancellations left the store stuck at the in_progress snapshot written on response.created. Factors the incomplete-snapshot build into a _persist_incomplete_if_needed() helper called from both the ConnectionResetError and CancelledError branches; the CancelledError handler re-raises so cooperative cancellation semantics are preserved. Adds two regression tests that drive _write_sse_responses directly (the TestClient disconnect path races the server handler, which makes the end-to-end assertion flaky).	2026-04-24 15:22:19 -07:00
UgwujaGeorge	a29bad2a3c	fix(api-server): persist response snapshot on client disconnect when store=True	2026-04-24 15:22:19 -07:00
Yukipukii1	8ea389a7f8	fix(gateway/config): coerce quoted boolean values in config parsing	2026-04-24 15:20:05 -07:00
knockyai	3e6c108565	fix(gateway): honor queue mode in runner PRIORITY interrupt path When display.busy_input_mode is 'queue', the runner-level PRIORITY block in _handle_message was still calling running_agent.interrupt() for every text follow-up to an active session. The adapter-level busy handler already honors queue mode (commit `9d147f7fd`), but this runner-level path was an unconditional interrupt regardless of config. Adds a queue-mode branch that queues the follow-up via _queue_or_replace_pending_event() and returns without interrupting. Salvages the useful part of #12070 (@knockyai). The config fan-out to per-platform extra was redundant — runner already loads busy_input_mode directly via _load_busy_input_mode().	2026-04-24 15:18:34 -07:00
helix4u	e7590f92a2	fix(telegram): honor no_proxy for explicit proxy setup	2026-04-24 14:31:04 -07:00
Teknium	62c14d5513	refactor(gateway): extract WhatsApp identity helpers into shared module Follow-up to the canonical-identity session-key fix: pull the JID/LID normalize/expand/canonical helpers into gateway/whatsapp_identity.py instead of living in two places. gateway/session.py (session-key build) and gateway/run.py (authorisation allowlist) now both import from the shared module, so the two resolution paths can't drift apart. Also switches the auth path from module-level _hermes_home (cached at import time) to dynamic get_hermes_home() lookup, which matches the session-key path and correctly reflects HERMES_HOME env overrides. The lone test that monkeypatched gateway.run._hermes_home for the WhatsApp auth path is updated to set HERMES_HOME env var instead; all other tests that monkeypatch _hermes_home for unrelated paths (update, restart drain, shutdown marker, etc.) still work — the module-level _hermes_home is untouched.	2026-04-24 07:55:55 -07:00
Keira Voss	10deb1b87d	fix(gateway): canonicalize WhatsApp identity in session keys Hermes' WhatsApp bridge routinely surfaces the same person under either a phone-format JID (60123456789@s.whatsapp.net) or a LID (…@lid), and may flip between the two for a single human within the same conversation. Before this change, build_session_key used the raw identifier verbatim, so the bridge reshuffling an alias form produced two distinct session keys for the same person — in two places: 1. DM chat_id — a user's DM sessions split in half, transcripts and per-sender state diverge. 2. Group participant_id (with group_sessions_per_user enabled) — a member's per-user session inside a group splits in half for the same reason. Add a canonicalizer that walks the bridge's lid-mapping-*.json files and picks the shortest/numeric-preferred alias as the stable identity. build_session_key now routes both the DM chat_id and the group participant_id through this helper when the platform is WhatsApp. All other platforms and chat types are untouched. Expose canonical_whatsapp_identifier and normalize_whatsapp_identifier as public helpers. Plugins that need per-sender behaviour (role-based routing, per-contact authorization, policy gating) need the same identity resolution Hermes uses internally; without a public helper, each plugin would have to re-implement the walker against the bridge's internal on-disk format. Keeping this alongside build_session_key makes it authoritative and one refactor away if the bridge ever changes shape. _expand_whatsapp_aliases stays private — it's an implementation detail of how the mapping files are walked, not a contract callers should depend on.	2026-04-24 07:55:55 -07:00
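The alias selection rule mentioned above ("shortest/numeric-preferred") could be implemented along these lines. This sketch makes several assumptions: it covers only the final selection step and not the walk over the bridge's lid-mapping files, it expects aliases that have already been normalized to bare identifiers, and it reads the rule as "numeric first, then shortest, then lexicographic for determinism". The function name `pick_canonical_alias` is hypothetical.

```python
from typing import Iterable


def pick_canonical_alias(aliases: Iterable[str]) -> str:
    """Choose one stable identifier from a set of equivalent aliases."""
    candidates = sorted(
        set(aliases),
        # Sort key: all-digit aliases first, then shorter ones, then by text.
        key=lambda alias: (not alias.isdigit(), len(alias), alias),
    )
    if not candidates:
        raise ValueError("no aliases to choose from")
    return candidates[0]
```

Because the result depends only on the set of aliases and not on which form the bridge surfaced first, both the session-key path and an authorization path would resolve the same person to the same key.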
Blind Dev	591aa159aa	feat: allow Telegram chat allowlists for groups and forums (#15027) * feat: allow Telegram chat allowlists for groups and forums * chore: map web3blind noreply email for release attribution --------- Co-authored-by: web3blind <web3blind@users.noreply.github.com>	2026-04-24 07:23:14 -07:00
Stefan Dimitrov	260ae62134	Invoke session finalize hooks on expiry flush	2026-04-24 05:40:52 -07:00

1220 commits