hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
UgwujaGeorge	b7ad3f478f	fix(yuanbao): enforce owner identity check on group slash commands The bot-owner identity check inside OwnerCommandMiddleware was commented out and replaced with a hardcoded `is_owner = True`, so any group member could trigger allowlisted privileged commands (/approve, /deny, /stop, /reset, /retry, /undo, /new, /background, /bg, /btw, /queue, /q) by sending the slash command without @-mentioning the bot. The most severe case is /approve: a non-owner could approve a dangerous tool call the bot was waiting on the owner to confirm. Re-enable the documented identity check (push.from_account == push.bot_owner_id) so only the configured owner can issue these commands.	2026-04-30 23:57:55 -07:00
Mikey O'Brien	1be3b74cfb	fix(gateway): honor MATRIX_HOME_ROOM in onboarding	2026-04-30 23:13:34 -07:00
Teknium	265bd59c1d	feat: /goal — persistent cross-turn goals (Ralph loop) (#18262 ) Add a standing-goal slash command that keeps Hermes working toward a user-stated objective across turns until it is achieved, paused, or the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI 0.128.0's /goal. After each turn, a lightweight auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, and we're under the turn budget (default 20), Hermes feeds a continuation prompt back into the same session as a normal user message. Any real user message preempts the continuation loop automatically. Judge failures fail OPEN (continue) so a flaky judge never wedges progress — the turn budget is the real backstop. ### Commands - `/goal <text>` — set a standing goal (kicks off the first turn) - `/goal` or `/goal status` — show current state - `/goal pause` — pause the continuation loop - `/goal resume` — resume (resets turn counter) - `/goal clear` — drop the goal Works on both CLI and gateway platforms via the central CommandDef registry. ### Design invariants preserved - Prompt cache: continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap. - Role alternation: continuation is a user turn, never injected mid-tool-loop. - Session persistence: goal state lives in SessionDB.state_meta keyed by `goal:<session_id>`, so `/resume` picks it up. - Mid-run safety: on the gateway, `/goal status\|pause\|clear` are allowed mid-run (control-plane only); setting a new goal requires `/stop` first so we don't race a second continuation prompt against the current turn. 
### Files - `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state - `hermes_cli/commands.py` — CommandDef entry - `hermes_cli/config.py` — `goals.max_turns` default - `hermes_cli/web_server.py` — dashboard category merge - `cli.py` — /goal handler + post-turn continuation hook in process_loop - `gateway/run.py` — /goal handler + post-turn continuation hook wrapping _handle_message_with_agent - `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing, fail-open semantics, lifecycle, persistence, budget exhaustion - `website/docs/reference/slash-commands.md` — docs entry	2026-04-30 23:10:20 -07:00
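The continuation decision described in this commit (turn budget as the backstop, judge failures failing open) can be sketched roughly as below. This is a minimal illustration, not the code in `hermes_cli/goals.py`: the `should_continue` function, its parameters, and the shape of the `judge` callable are all hypothetical names chosen for the example.

```python
from typing import Callable


def should_continue(
    judge: Callable[[str, str], bool],  # hypothetical: returns True when the goal is satisfied
    goal: str,
    last_response: str,
    turns_used: int,
    max_turns: int = 20,  # the commit states a default budget of 20 turns
) -> bool:
    """Decide whether to feed another continuation prompt into the session."""
    # The turn budget is the hard backstop and is checked first.
    if turns_used >= max_turns:
        return False
    try:
        satisfied = judge(goal, last_response)
    except Exception:
        # Fail open: a flaky judge is treated as "not satisfied yet",
        # so the loop continues until the budget runs out.
        satisfied = False
    return not satisfied
```

Checking the budget before calling the judge means an exhausted goal never spends an extra auxiliary-model call.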
Teknium	4caad285a6	feat(gateway): auto-delete slash-command system notices after TTL (#18266 ) Adds opt-in auto-deletion for slash-command reply messages like "New session started!", "Restarting gateway…", "Stopped.", and YOLO toggles. After the TTL elapses the gateway calls the adapter's delete_message; on platforms without a delete API (everything except Telegram today) the TTL is silently ignored and the message stays. Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful real-time, but system notices clutter the thread once the agent finishes. Implementation: - EphemeralReply(str) sentinel in gateway/platforms/base.py. Subclasses str so existing 'X' in response / response.startswith(...) checks in tests and call sites keep working unchanged; isinstance() still distinguishes it for the send path. - _process_message_background and both busy-session bypass paths (in base.py) call _unwrap_ephemeral() on the handler return, send the unwrapped text, and schedule a detached delete task when the TTL > 0 AND the adapter class overrides delete_message. - display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG. Handler can pass ttl_seconds explicitly to override. - Wrapped the highest-noise return sites: /new, /reset, /stop, /yolo on/off, /restart success + "already in progress". Draining notices and /help output left as plain strings — those are informational and users want to read them. Backward-compat: default TTL 0 → no scheduling, no behavior change for existing users. Platforms without delete_message silently no-op.	2026-04-30 23:05:48 -07:00
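The `str`-subclass sentinel idea from this commit can be illustrated with a short sketch. The constructor signature, the `ttl_seconds` attribute, and the `unwrap_ephemeral` helper below are assumptions made for the example; the real `EphemeralReply` and `_unwrap_ephemeral` in `gateway/platforms/base.py` may differ.

```python
from typing import Optional, Tuple


class EphemeralReply(str):
    """A reply that behaves like a plain string but carries an optional TTL."""

    ttl_seconds: Optional[float]

    def __new__(cls, text: str, ttl_seconds: Optional[float] = None):
        obj = super().__new__(cls, text)
        obj.ttl_seconds = ttl_seconds
        return obj


def unwrap_ephemeral(response: str, default_ttl: float = 0.0) -> Tuple[str, float]:
    """Return (plain text, ttl). A ttl of 0 means the message is never deleted."""
    if isinstance(response, EphemeralReply):
        ttl = default_ttl if response.ttl_seconds is None else response.ttl_seconds
        return str(response), ttl
    # Ordinary strings are never scheduled for deletion.
    return response, 0.0
```

Because the sentinel is still a `str`, existing checks such as `"Stopped" in response` or `response.startswith(...)` keep working, while `isinstance()` lets the send path tell the two apart.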
Teknium	f0dc919f92	fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. 
Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217	2026-04-30 23:03:54 -07:00
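To make the gap between the two estimates concrete, here is a rough sketch of a chars/4 estimator with and without the system prompt and tool schemas. The function names and bodies are hypothetical stand-ins, not the repository's `estimate_messages_tokens_rough()` or `estimate_request_tokens_rough()`; serializing tool schemas with `json.dumps` is an assumption.

```python
import json
from typing import Any, Dict, List, Optional

CHARS_PER_TOKEN = 4  # the chars/4 heuristic mentioned in the commit message


def _transcript_chars(messages: List[Dict[str, Any]]) -> int:
    return sum(len(str(m.get("content") or "")) for m in messages)


def estimate_messages_tokens(messages: List[Dict[str, Any]]) -> int:
    """Transcript-only estimate: counts message content and nothing else."""
    return _transcript_chars(messages) // CHARS_PER_TOKEN


def estimate_request_tokens(
    messages: List[Dict[str, Any]],
    system_prompt: str = "",
    tools: Optional[List[Dict[str, Any]]] = None,
) -> int:
    """Request-level estimate: transcript plus system prompt plus tool schemas."""
    chars = _transcript_chars(messages) + len(system_prompt)
    chars += sum(len(json.dumps(tool)) for tool in (tools or []))
    return chars // CHARS_PER_TOKEN
```

With a 180-character transcript the first function reports 45 tokens, while adding a 15,000-character system prompt alone pushes the second to 3,795, which is the kind of mismatch the banner was hiding.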
Oxidane-bot	8d7500d80d	fix(gateway): snapshot callback generation after agent binds it, not before _process_message_background snapshotted callback_generation from the interrupt event at the TOP of the task — before the handler ran. _hermes_run_generation is only set on the event by GatewayRunner._bind_adapter_run_generation during _handle_message_with_agent, which runs DURING the handler await. The early snapshot always captured None, which then flowed into pop_post_delivery_callback(..., generation=None) in the finally block. In pop_post_delivery_callback, generation=None with a tuple-registered entry (generation, callback) bypasses the ownership check — it pops and fires the callback regardless of which run owns it. Result: a stale run could fire a fresher run's post-delivery callback (e.g. a background-review notification attributed to the wrong turn). Fix: move the snapshot into the finally block, after the handler has run and _hermes_run_generation has been bound to the current run. Regression test added: simulates a stale handler at generation=1 and a fresher callback registered at generation=2. Pre-fix: snapshot=None → pop fires the generation=2 callback under generation=1's ownership ("newer" fires). Post-fix: snapshot=1 → pop skips the mismatched entry, callback stays in the dict for the correct run to claim. Verified: test FAILS on current main (captures "newer" in fired list), PASSES with this fix. Salvaged from PR #12565 (the callback-ownership portion only; the /status totals portion was already fixed on main in `7abc9ce4d` via #17158). Co-authored-by: Oxidane-bot <1317078257maroon@gmail.com>	2026-04-30 20:41:18 -07:00
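The ownership bypass described above can be reproduced in a few lines. This is a simplified model written for illustration: `pop_callback` and the plain-dict registry are hypothetical, and only the "`generation=None` skips the ownership check" behavior is taken from the commit message.

```python
from typing import Callable, Dict, Optional, Tuple

Registry = Dict[str, Tuple[int, Callable[[], None]]]


def pop_callback(
    registry: Registry, session_key: str, generation: Optional[int] = None
) -> Optional[Callable[[], None]]:
    """Pop the callback for a session if the caller is allowed to claim it."""
    entry = registry.get(session_key)
    if entry is None:
        return None
    entry_generation, callback = entry
    # With generation=None the ownership check is skipped entirely,
    # which is why an early (always-None) snapshot was dangerous.
    if generation is not None and entry_generation != generation:
        return None
    del registry[session_key]
    return callback
```

A stale run holding snapshot `1` leaves a generation-2 entry in place, whereas a snapshot of `None` pops and returns it regardless of owner.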
Teknium	27ec74c68a	fix: coerce show_reasoning and guard_agent_created config bools Widens #16528 to two sibling sites that had the same quoted-boolean bug: a YAML string "false" (or "0", "no", "off") silently evaluated truthy under bool() / if-check. - gateway/run.py _load_show_reasoning: is_truthy_value wrap - tools/skill_manager_tool.py _guard_agent_created_enabled: is_truthy_value wrap - regression tests for both	2026-04-30 20:40:46 -07:00
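The quoted-boolean pitfall is easy to demonstrate: in Python, `bool("false")` is `True` because any non-empty string is truthy. The sketch below shows one way such a coercion helper could look. `coerce_bool` is a hypothetical stand-in for the `is_truthy_value` helper named in the commit, and the exact set of accepted spellings is an assumption.

```python
from typing import Any

_TRUE_STRINGS = {"1", "true", "yes", "on"}  # assumed spellings for the example


def coerce_bool(value: Any, default: bool = False) -> bool:
    """Interpret config values so that quoted YAML strings behave like booleans."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        # "false", "0", "no", "off" (and anything unrecognized) become False.
        return value.strip().lower() in _TRUE_STRINGS
    return bool(value)
```

Wrapping each config read in a helper like this keeps `show_reasoning: "false"` from silently enabling the feature.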
johnncenae	bb706c3f38	fix(gateway): coerce tool_progress_command as a real boolean	2026-04-30 20:40:46 -07:00
simbam99	7ba1a2b3df	fix(gateway): preserve assistant metadata when branching sessions	2026-04-30 20:40:28 -07:00
simbam99	ccfe6a47c3	fix(gateway): coerce StreamingConfig booleans and malformed numerics safely	2026-04-30 20:37:49 -07:00
hharry11	158eb32686	fix(gateway): preserve document type when merging queued events	2026-04-30 20:37:27 -07:00
Roy-oss1	b94cb8e2c4	feat(feishu): operator-configurable bot admission and mention policy Add two operator-facing toggles for inbound Feishu admission, enabling bot-to-bot scenarios such as A2A orchestration and inter-bot notifications: FEISHU_ALLOW_BOTS=none|mentions|all (default: none) Accept messages from other bots. `mentions` requires the peer bot to @-mention Hermes; `all` admits every peer-bot message. FEISHU_REQUIRE_MENTION=true|false (default: true) Whether group messages must @-mention the bot. Override per-chat via `group_rules.<chat_id>.require_mention` in config.yaml. Defaults preserve prior behavior. Self-echo protection is always on: when the bot's identity is unresolved (auto-detection failed and FEISHU_BOT_OPEN_ID unset), peer-bot messages are rejected fail-closed to avoid feedback loops. Admitted peer bots bypass the human-user allowlist (FEISHU_ALLOWED_USERS) to match existing Discord behavior; humans still need an explicit allowlist entry. yaml feishu.allow_bots is bridged to the env var so the adapter and gateway auth layer share one source of truth. Resolving peer-bot display names requires the application:bot.basic_info:read scope; without it, peers still route but appear as their open_id. Test: tests/gateway/test_feishu_bot_admission.py covers the admission pipeline, group-policy bot-bypass, hydration, and event-dispatch plumbing as a parametrized matrix. Change-Id: I363cccb578c2a5c8b8bf0f0a890c01c89909e256	2026-04-30 20:30:31 -07:00
buray	fa9fd26acb	fix(gateway): re-inject topic-bound skill after /new or /reset reset_session() creates a fresh SessionEntry with created_at == updated_at, but get_or_create_session() bumps updated_at on the next inbound message, causing _is_new_session in _handle_message_with_agent to evaluate False. The topic/channel skill auto-load gate (group_topics, channel_skill_bindings) silently skips the first message after a manual reset. Add an is_fresh_reset flag on SessionEntry, set by reset_session() and consumed once by the message handler. Kept distinct from was_auto_reset because that flag also drives a 'session expired due to inactivity' user-facing notice and a context-note prepend — both wrong for an explicit /new or /reset. Persisted through to_dict/from_dict so the flag survives gateway restart between /reset and the next message. Fixes #6508 Co-authored-by: warabe1122 <45554392+warabe1122@users.noreply.github.com> Co-authored-by: willy-scr <187001140+willy-scr@users.noreply.github.com>	2026-04-30 20:29:19 -07:00
Jezza Hehn	7abc9ce4df	fix(gateway): read /status token totals from SessionDB (#17158) /status was reading session_entry.total_tokens from the in-memory SessionStore (gateway/session.py), which the agent never writes to — so the token count was always 0. The agent already persists token deltas to the SQLite SessionDB (run_agent.py:11497) for every platform with a session_id. Route /status through that single source of truth instead of duplicating token writes into a second store. Fix: - gateway/run.py: _handle_status_command now calls self._session_db.get_session(session_id) and sums the five token component columns (input/output/cache_read/cache_write/reasoning). Falls back to 0 when no SessionDB is configured or no row exists. - Two new regression tests covering the populated-row and missing-row paths. Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>	2026-04-30 20:28:50 -07:00
Teknium	a178081468	fix(gateway): use _session_key_for_source for native image buffer write Minor follow-up to the native-image-buffer isolation fix. The write site in _prepare_inbound_message_text was calling build_session_key directly, while every other call site in gateway/run.py uses the _session_key_for_source helper — which consults session_store._generate_session_key first and falls back to build_session_key. Keeping the write key and consume key on the same helper prevents key drift if the session store ever overrides the default keying behavior.	2026-04-30 20:26:35 -07:00
Yukipukii1	bdb7edd89e	fix(gateway): isolate pending native image paths by session	2026-04-30 20:26:35 -07:00
Teknium	9a75743496	fix(gateway): apply agent.disabled_toolsets in gateway message loop Widens the cherry-picked fix from @jatingodnani (#17343) to the gateway path. On main, user_config.agent.disabled_toolsets was only honored by _get_platform_tools' name-level subtraction — it did not catch tools pulled in implicitly by a composite toolset (browser includes web_search, hermes-* platforms include most tools). Changes: - gateway/run.py: resolve disabled_toolsets alongside enabled_toolsets and pass to AIAgent at both user-facing construction sites (normal message loop + single-turn cron-like path). Hygiene/compression agents (fixed enabled_toolsets=[memory]) are intentionally untouched. - gateway/run.py: add (agent, disabled_toolsets) to _CACHE_BUSTING_CONFIG_KEYS so editing the list in config.yaml invalidates the cached AIAgent on the next message. - cli.py: drop unused 'import platform' left over from PR #17343's import churn; restore 'import sys' used throughout the file. - model_tools.py: drop unused 'import os, sys' added by PR #17343; fix comment reference from #15291 (unrelated OAuth issue) to #17309. Co-authored-by: jatin godnani <godnanijatin@gmail.com>	2026-04-30 20:24:39 -07:00
Teknium	01cc701e54	docs + nit: busy_ack_enabled follow-ups - Move the disabled-ack guard above the debounce so we don't stamp _busy_ack_ts[session_key] when no ack was actually sent. Harmless (never read when disabled) but cosmetically off. - Document display.busy_ack_enabled in user-guide/messaging/index.md and HERMES_GATEWAY_BUSY_ACK_ENABLED in reference/environment-variables.md. - Add JezzaHehn to scripts/release.py AUTHOR_MAP for contributor credit. Follow-up to #17491 (Jezza Hehn).	2026-04-30 20:22:30 -07:00
Jezza Hehn	2b512cbca4	feat(gateway): add busy_ack_enabled config option to suppress ack messages When a user sends a message while the gateway is busy processing, an acknowledgment message is sent. This can be spammy for users who send rapid messages. Add display.busy_ack_enabled config option (default: true) to allow users to suppress these busy-input acknowledgment messages. Fixes #17457	2026-04-30 20:22:30 -07:00
Yukipukii1	25cbe3e1d6	fix(gateway): preserve thread routing for /update progress and prompts	2026-04-30 20:19:23 -07:00
Yukipukii1	38875d00a7	fix(gateway): ensure platform configs honor home_channel env overrides	2026-04-30 20:18:33 -07:00
hharry11	2997ef9446	fix(api-server): use session-scoped task IDs for tool isolation	2026-04-30 19:59:38 -07:00
johnncenae	a83d579d5b	fix(telegram): enforce gateway auth for inline approval callbacks	2026-04-30 19:59:31 -07:00
Teknium	f43b126677	fix(gateway): atomic writes for sibling recovery/dedup state files Widen PR #17842's atomic-write fix to two sibling sites that exhibit the same 'partial JSON on interrupted write' class of bug: - gateway/platforms/feishu.py: dedup state (_dedup_state_path) - gateway/platforms/helpers.py: ParticipatedThreadTracker save Both are small recovery/coordination files that get rewritten frequently and break cross-restart dedup if left partial.	2026-04-30 19:58:16 -07:00
johnncenae	1ef9e88549	fix(gateway): write restart markers atomically and fix Windows lock collisions	2026-04-30 19:58:16 -07:00
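The "partial JSON on interrupted write" class of bug is usually avoided with a write-to-temp-then-rename pattern. The following is a generic sketch of that technique using only the standard library; `atomic_write_json` is a hypothetical name and is not claimed to match the helper used in the gateway.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Union


def atomic_write_json(path: Union[str, Path], data: Any) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    path = Path(path)
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in atomically on POSIX and on Windows.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temp file; the original target is left untouched.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```

If serialization fails halfway, only the temp file is affected, so a restart still finds the previous, complete state file.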
Chris Danis	f61695ee73	fix(signal): skip contentless envelopes (profile key updates, empty messages) Signal-cli sends dataMessage wrappers for profile key updates and other metadata events that have no actual text content. These were reaching the gateway as msg='' and triggering full agent turns for nothing. Add early return in _handle_envelope() when both message field is empty/missing/whitespace AND there are no attachments. Messages with media attachments but no text still flow through. - 12 lines added to gateway/platforms/signal.py - 5 new tests in TestSignalContentlessEnvelope class	2026-04-30 19:42:59 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. 
- CLI — `hermes kanban init\|create\|list\|show\|assign\|link\|unlink\| claim\|comment\|complete\|block\|unblock\|archive\|tail\|dispatch\|context\| init\|gc\|watch\|stats\|notify\|log\|heartbeat\|runs\|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. 
Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first-class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
Leone Parise	eda1d516dc	fix(skills): exclude .archive from skill index walk Archived skills (moved to ~/.hermes/skills/.archive/ by the curator) were still surfaced in the <available_skills> system prompt under a fake '.archive' category, causing the agent to load and try to use deprecated skills. The os.walk in iter_skill_index_files() only excluded .git/.github/.hub. Add '.archive' to EXCLUDED_SKILL_DIRS, and to the two other places that hardcode the same exclusion tuple (gateway/run.py and agent/skill_commands.py).	2026-04-30 04:59:22 -07:00
konsisumer	d1d0ef6dbd	fix(gateway): persist user message on transient agent failures (#7100) The #1630 fix introduced a blanket ``agent_failed_early`` transcript skip to prevent context-overflow sessions from looping. That guard also triggers for unrelated transient failures (429 rate limits, read timeouts, connection resets, provider 5xx) which have nothing to do with session size — and it silently drops the user's message, so the agent has no memory of the last turn on retry. Split the failure classification in ``GatewayRunner._run_agent``: * Context-overflow (``compression_exhausted`` flag, explicit context-length phrases, or generic 400 with a long history) → keep the existing skip, preserving the #1630/#9893 fix. * Anything else that failed → persist just the user message so the conversation survives a retry. Use specific multi-word phrases (``context length``, ``token limit``, ``prompt is too long``, etc.) to match ``run_agent.py``'s own classifier; bare ``exceed`` false-positively flagged "rate limit exceeded" as context overflow. Covered by new tests in ``tests/gateway/test_7100_transient_failure_transcript.py`` and the existing #1630 suite still passes.	2026-04-30 04:32:33 -07:00
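The false positive mentioned in this commit (bare `exceed` matching "rate limit exceeded") is straightforward to show. The sketch below uses only the three phrases quoted in the message; the tuple name and function name are hypothetical, and the real classifier in `run_agent.py` also considers other signals such as the `compression_exhausted` flag.

```python
# Phrases quoted in the commit message; the real list is longer ("etc.").
CONTEXT_OVERFLOW_PHRASES = ("context length", "token limit", "prompt is too long")


def looks_like_context_overflow(error_text: str) -> bool:
    """Phrase-based check that avoids matching generic words like 'exceed'."""
    lowered = error_text.lower()
    return any(phrase in lowered for phrase in CONTEXT_OVERFLOW_PHRASES)


def naive_check(error_text: str) -> bool:
    """The over-broad variant: any mention of 'exceed' counts as overflow."""
    return "exceed" in error_text.lower()
```

A 429 message such as "Rate limit exceeded" trips the naive check but not the phrase-based one, so the user message can be persisted for the retry.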
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00
Bartok9	fbb3775770	fix(gateway): enforce auth check in busy-session path to prevent unauthorized injection (#17775) The busy-session handler (_handle_active_session_busy_message) bypassed the authorization gate that the cold path enforces via _is_user_authorized(). In shared-thread contexts (Slack threads, Telegram forum topics, Discord threads) where thread_sessions_per_user=False (the default), all participants share one session_key. An unauthorized user posting in the same thread as an authorized user would hit the active-session branch, skip the auth check, and have their text merged into _pending_messages or injected via agent.interrupt(). This commit adds the same _is_user_authorized() check at the top of the busy handler, before any message queuing, steering, or interrupt logic. Unauthorized messages are silently dropped (return True) with a warning log — matching the cold-path behavior. Affected platforms: Slack, Telegram, Discord, any adapter with shared-session thread contexts. Closes #17775	2026-04-30 04:29:15 -07:00
Teknium	3de8e21683	feat(gateway): native send_multiple_images for Telegram, Discord, Slack, Mattermost, Email Ports PR #17888's send_multiple_images ABC to every gateway platform that has a native multi-attachment API, so images arrive as a single bundled message instead of N separate ones. Native overrides: - Telegram: send_media_group (10 photos per album, chunks over); animated GIFs peeled off and routed through send_animation (albums don't support animations) - Discord: channel.send(files=[...]) (10 attachments per message, chunks over); URL images downloaded into BytesIO so they render inline; forum channels use create_thread with files=[...] - Slack: files_upload_v2(file_uploads=[...]) (10 per call, chunks over); respects thread_ts; records thread participation - Mattermost: single post with file_ids list (5 per post — Mattermost cap, chunks over) - Email: single SMTP message with multiple MIME attachments (no chunk cap, SMTP size governs); remote URLs remain linked in body (parity with existing send_image) All platforms fall back to the base per-image loop on any failure, so a single bad image in a batch never loses the rest. Matrix, WhatsApp, and single-attachment platforms (BlueBubbles, Feishu, WeCom, WeChat, DingTalk) continue to use the base default loop — their server APIs only accept one attachment per message anyway. Tests: adds tests/gateway/test_send_multiple_images.py with 19 targeted tests covering base default loop, chunking, animation peel-off, fallback paths, and empty-batch no-ops across all five new overrides. Co-authored-by: Maxence Groine <maxence@groine.fr>	2026-04-30 04:28:08 -07:00
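The "N per message, chunks over" behavior listed for each platform amounts to splitting a batch by a per-platform cap. Here is a minimal sketch of that chunking step. The `chunked` helper and the `MAX_IMAGES_PER_MESSAGE` table are hypothetical; the cap values are the ones stated in the commit message.

```python
from typing import Dict, Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Caps as stated in the commit message; the dict itself is illustrative.
MAX_IMAGES_PER_MESSAGE: Dict[str, int] = {
    "telegram": 10,
    "discord": 10,
    "slack": 10,
    "mattermost": 5,
}


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

For example, 12 images on Mattermost would go out as posts of 5, 5, and 2 attachments, and an empty batch produces no posts at all.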
Maxence Groine	04ea895ffb	feat(gateway/signal): add support for multiple images sending Adds a new `send_multiple_images` method to the ``BasePlatformAdapter`` that implements the default "One image per message" loop and allows for platform-specific overriding. Implements such an override for the Signal adapter, batching images and trying (best-effort) to work around rate-limits for voluminous batches using a specific scheduler. Also implements batching + rate-limit handling in the `send_message` tool. New tests added for the Signal adapter, its rate-limit scheduler and the `send_message` tool	2026-04-30 04:28:08 -07:00
Teknium	411f586c67	refactor(gateway): extract _float_env helper for env-var float casts Follow-up to the try/except guards added in the previous commit. Four sibling call sites all read HERMES_AGENT_TIMEOUT / HERMES_AGENT_TIMEOUT_WARNING / HERMES_AGENT_NOTIFY_INTERVAL via the same read-env-or-fallback pattern, so factor it into _float_env(name, default) alongside the existing _auto_continue_freshness_window() helper.	2026-04-30 03:32:37 -07:00
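A read-env-or-fallback helper of the kind this refactor describes might look like the sketch below. The signature `(name, default)` comes from the commit message; the body, including how empty values are treated, is an assumption rather than the actual `_float_env` implementation.

```python
import os


def float_env(name: str, default: float) -> float:
    """Read an environment variable as a float, falling back on bad input."""
    raw = os.getenv(name)
    if raw is None or not raw.strip():
        return default
    try:
        return float(raw)
    except ValueError:
        # A typo such as "abc" should not crash the caller.
        return default
```

Centralizing the cast means a misconfigured `HERMES_AGENT_TIMEOUT` degrades to the default instead of raising `ValueError` at each call site.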
vominh1919	ca87c822ed	fix(gateway): guard yaml.safe_load and float() env var casts against crash Two defensive fixes in gateway/run.py: 1. yaml.safe_load returning None on empty config files (line 12706): GatewayConfig.from_dict(data) crashes with AttributeError when the YAML file is empty because safe_load returns None. All 6 other yaml.safe_load call sites already use `or {}` — this one was missed. Impact: gateway fails to start with empty --config file. 2. float() on env vars without ValueError guard (lines 3951, 11757, 11805, 11807): HERMES_AGENT_TIMEOUT, HERMES_AGENT_TIMEOUT_WARNING, and HERMES_AGENT_NOTIFY_INTERVAL are cast via float() directly from os.getenv(). A typo (e.g. "abc") raises ValueError and crashes the agent turn or gateway startup. Impact: single misconfigured env var crashes the entire gateway.	2026-04-30 03:32:37 -07:00
briandevans	f44f1f9615	fix(gateway): preserve session guard across in-band drain handoff When the in-band pending-message drain spawns a fresh task and transfers ownership via _session_tasks[session_key] = drain_task, the original task still unwinds through the finally block. The drain task picks up the same interrupt_event in its own _process_message_background entry, so an unconditional _release_session_guard(session_key, guard=interrupt_event) at the end of the finally matches and deletes _active_sessions[session_key] while the drain task is still pending its first await. A concurrent inbound message arriving in that handoff window passes the Level-1 guard (no entry exists) and spawns a second _process_message_background for the same session — two agents on one session_key, duplicate responses, duplicate tool calls. Fix: only call _release_session_guard when the current task still owns _session_tasks[session_key]. When ownership has been transferred to a drain task, leave _active_sessions populated; the drain task's own lifecycle releases it. This mirrors the late-arrival drain path in the same finally block, which already leaves both entries alone after handing off. Also reorder stdlib imports in the new regression test file to match the gateway test convention (stdlib before third-party). Regression test: capture _active_sessions[sk] identity at every handler entry across a 2-step in-band drain chain and assert the guard Event identity stays the same. Pre-fix, the original task's finally deletes the entry, the drain task falls through to the `or asyncio.Event()` branch, and a fresh Event is installed — identity diverges. Post-fix, the entry is preserved and the drain task reuses the original Event. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:27:08 -07:00
briandevans	663ba9a58f	fix(gateway): drain pending messages via fresh task, not recursion (#17758 ) `_process_message_background` finished a turn, found a queued follow-up, and drained it via `await self._process_message_background(pending_event, session_key)`. Each chained follow-up added a frame to the call stack instead of starting fresh. Under sustained pending-queue activity (e.g. a user sending follow-ups faster than the agent finishes turns) the C stack would exhaust at ~2000 nested frames and SIGSEGV the process. Mirror the late-arrival drain pattern that already exists in the same function: spawn a new `asyncio.create_task(...)` for the pending event and return so the current frame can unwind. The new task takes ownership via `_session_tasks[session_key]`. The late-arrival drain in `finally` could now race with the in-band drain across the `await typing_task` / `await stop_typing` window, so add a guard: if `_session_tasks[session_key]` is no longer the current task, an in-band drain already spawned a follow-up task — re-queue the late-arrival event so that task picks it up after its current event, instead of spawning a second concurrent task for the same session_key. Regression test (`test_pending_drain_no_recursion.py`) chains 12 follow-ups and asserts the recorded `_process_message_background` stack depth stays bounded at handler entry. Pre-fix: depths grow linearly `[1,2,3,…,12]`. Post-fix: all depths are `1`. `test_duplicate_reply_suppression::test_stale_response_suppressed_when_interrupted` called `_process_message_background` directly and implicitly relied on the old recursive `await` semantic — updated to wait for the spawned drain task before checking the sent list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:27:08 -07:00
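The difference between draining by recursion and draining via a fresh task can be modeled with a small asyncio sketch. Everything here (the `Drainer` class, its fields, the depth counter) is a hypothetical toy built for illustration; it only mirrors the idea of handing ownership to a new `asyncio.create_task(...)` so the current frame can unwind.

```python
import asyncio
from collections import deque
from typing import Any, Deque, Dict, List


class Drainer:
    def __init__(self) -> None:
        self.pending: Deque[Any] = deque()
        self.tasks: Dict[str, "asyncio.Task[None]"] = {}  # session_key -> owning task
        self.handled: List[Any] = []
        self.depths: List[int] = []  # nesting depth observed at each handler entry
        self._depth = 0

    async def process(self, event: Any, session_key: str) -> None:
        self._depth += 1
        self.depths.append(self._depth)
        try:
            await asyncio.sleep(0)  # stand-in for the agent turn
            self.handled.append(event)
            if self.pending:
                nxt = self.pending.popleft()
                # Hand off to a fresh task instead of awaiting ourselves recursively.
                self.tasks[session_key] = asyncio.create_task(self.process(nxt, session_key))
        finally:
            self._depth -= 1

    async def run(self, first: Any, session_key: str) -> None:
        self.tasks[session_key] = asyncio.create_task(self.process(first, session_key))
        while True:
            task = self.tasks[session_key]
            await task
            if self.tasks[session_key] is task:  # no hand-off happened, so we are done
                break
```

Because each follow-up starts in its own task after the previous handler has returned, the recorded depth stays at 1 no matter how many events are chained.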
Teknium	aa7bf329bc	feat(gateway): centralize audio routing + FLAC support + Telegram doc fallback (#17833 ) Extracted from PR #17211 (@versun) so it can land independently of the local_command TTS provider redesign. - Add should_send_media_as_audio(platform, ext, is_voice) in gateway/platforms/base.py; single source of truth for audio routing. - Add .flac to recognized audio extensions (MEDIA regex, weixin audio set, send_message audio set). - Telegram send_voice() now falls back to send_document for formats Telegram's Bot API can't play natively (.wav, .flac, ...) instead of raising; MP3/M4A still go to sendAudio, Opus/OGG still go to sendVoice. - Route _send_telegram() in send_message_tool through a narrower _TELEGRAM_SEND_AUDIO_EXTS = {.mp3, .m4a} set. - cron.scheduler._send_media_via_adapter now delegates the audio decision to should_send_media_as_audio so it matches the gateway. - Update the cron live-adapter ogg test to flag [[audio_as_voice]] so it still routes to sendVoice under the new Telegram-specific policy. - Tests: unit coverage for should_send_media_as_audio across platforms, end-to-end MEDIA routing via _process_message_background and GatewayRunner._deliver_media_from_response, TelegramAdapter.send_voice fallback for FLAC/WAV. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 01:32:31 -07:00
emozilla	718e4e2e7e	fix(plugins): register dynamically-loaded modules in sys.modules before exec Dashboard plugin API routes (web_server._mount_plugin_api_routes) and gateway event hooks (gateway.hooks.HookRegistry.discover_and_load) both loaded Python files via importlib.util.spec_from_file_location + exec_module without registering the resulting module in sys.modules. That breaks any plugin or hook handler that uses `from __future__ import annotations` together with a Pydantic BaseModel / dataclass / anything that introspects `__module__`: at first request Pydantic tries to resolve string-form type hints against the defining module's namespace, can't find it by name, and raises: PydanticUserError: TypeAdapter[...] is not fully defined; you should define ... and all referenced types, then call `.rebuild()` on the instance. This is what broke the kanban dashboard's 'triage' button — POST /api/plugins/kanban/tasks validated against CreateTaskBody (a Pydantic model in a file using `from __future__ import annotations`) and returned 500 on every click. The fix, applied symmetrically to both loaders: 1. Compute module_name once. 2. Register the module in sys.modules BEFORE exec_module. 3. On exec_module failure, pop the half-initialized stub so subsequent reloads don't pick up broken state. GETs were unaffected because they don't build a body TypeAdapter, which is why this only surfaced when users started POSTing.	2026-04-29 23:34:35 -07:00
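The three-step fix listed in the commit above (compute the name once, register before `exec_module`, pop the stub on failure) can be sketched with the standard `importlib` API. `load_module_from_path` is a hypothetical helper name for illustration, not the function used in the repository.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(module_name: str, path: Path):
    """Load a Python file as `module_name`, keeping sys.modules consistent."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path}")

    module = importlib.util.module_from_spec(spec)
    # Register BEFORE executing, so code that looks the module up by name
    # (for example to resolve string-form type hints) can find it.
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Drop the half-initialized stub so a later reload starts clean.
        sys.modules.pop(module_name, None)
        raise
    return module
```

After a successful call, `sys.modules[module_name]` is the returned module; after a failed one, the name is absent from `sys.modules` and the original exception propagates.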
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * 
ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Ari Lotter	1f1608067c	feat(gateway): unify setup flows, load platforms dynamically from registry Merge the two gateway setup paths (hermes setup gateway + hermes gateway setup) to use a single _unified_platforms() list that merges built-in _PLATFORMS with dynamically registered plugin entries from platform_registry. - Add setup_fn field to PlatformEntry for plugin setup flows - _unified_platforms() merges built-ins with registry entries by key - setup_gateway() now uses unified list instead of hardcoded _GATEWAY_PLATFORMS tuple list - gateway_setup() uses same unified list, plugin entries appear alongside built-ins with no [plugin] suffix - _platform_status() handles plugin platforms via registry check_fn - Plugin platforms with setup_fn get called directly; plugins without get a generic env-var display fallback IRC and other plugin platforms now appear automatically in the setup menu when registered via platform_registry.register(). feat(gateway): surface disabled platform plugins in setup and auto-enable on select Platform plugins under plugins/platforms/* (IRC, etc.) were gated behind plugins.enabled, so `hermes gateway setup` wouldn't list them until the user ran `hermes plugins enable <name>` first. Now the setup menu always surfaces them as "plugin disabled — select to enable", and picking one adds it to plugins.enabled before running its setup flow. Along the way, unify the two gateway setup flows so `hermes setup gateway` and `hermes gateway setup` both read from the same platform list (built-in _PLATFORMS + platform_registry entries), dispatch through a single _configure_platform() helper, and share _platform_status(). Deletes the dead bespoke wrappers in setup.py (_setup_whatsapp, _setup_weixin, _setup_email, etc.) that duplicated logic now covered by the registry path or _setup_standard_platform. Also: - PlatformEntry gains a plugin_name field so the registry knows which plugin owns each entry (required for auto-enable). 
- PluginContext.register_platform auto-stamps plugin_name from the manifest so plugins don't have to pass it explicitly. - PluginManager now scans plugins/platforms/* as its own category root, one level below the bundled plugin scan. - Fix IRC plugin discovery: rename PLUGIN.yaml → plugin.yaml (the scanner is case-sensitive) and add the missing __init__.py that _load_directory_module requires.	2026-04-29 21:56:51 -07:00
Teknium	e464cde58f	feat: final platform plugin parity — webhook delivery, platform hints, docs Closes remaining functional gaps and adds documentation. webhook.py: Cross-platform delivery now checks the plugin registry for unknown platform names instead of hardcoding 15 names in a tuple. Plugin platforms can receive webhook-routed deliveries. prompt_builder: Platform hints (system prompt LLM guidance) now fall back to the plugin registry's platform_hint field. Plugin platforms can tell the LLM 'you're on IRC, no markdown.' PlatformEntry: Added platform_hint field for LLM guidance injection. IRC adapter: Added acquire_scoped_lock/release_scoped_lock in connect/disconnect to prevent two profiles from using the same IRC identity. Added platform_hint for IRC-specific LLM guidance. Removed dead token-empty-warning extension for plugin platforms (plugin adapters handle their own env vars via check_fn). website/docs/developer-guide/adding-platform-adapters.md: - Added 'Plugin Path (Recommended)' section with full code examples, PLUGIN.yaml template, config.yaml examples, and a table showing all 18 integration points the plugin system handles automatically - Renamed built-in checklist to clarify it's for core contributors gateway/platforms/ADDING_A_PLATFORM.md: - Added Plugin Path section pointing to the reference implementation and full docs guide - Clarified built-in path is for core contributors only	2026-04-29 21:56:51 -07:00
Teknium	457128d4e8	fix: wire PII redaction + token empty warnings for plugin platforms PII redaction: build_session_context_prompt() now checks the plugin registry's pii_safe flag in addition to the hardcoded _PII_SAFE_PLATFORMS frozenset. Plugin platforms that set pii_safe=True (e.g. phone-based messaging bridges) get their user IDs redacted before LLM context. Token empty warnings: the empty-token diagnostic at config load now checks the plugin registry's required_env when a platform isn't in the hardcoded _token_env_names dict. Catches 'enabled but empty' for plugin platforms too.	2026-04-29 21:56:51 -07:00
Teknium	2e20f6ae2d	feat: complete plugin platform parity — all 12 integration points Extends the platform plugin interface from Phase 1 to cover every touchpoint where built-in platforms have hardcoded behavior. - allowed_users_env / allow_all_env: per-platform auth env vars - max_message_length: smart-chunking for send_message tool - pii_safe: session PII redaction flag - emoji: CLI/gateway display - allow_update_command: /update access control send_message tool (tools/send_message_tool.py): - Replaced hardcoded platform_map dict with Platform() call - Added _send_via_adapter() for plugin platforms — routes through live gateway adapter when available - Registry-aware max message length for smart chunking Cron delivery (cron/scheduler.py): - Replaced hardcoded 15-entry platform_map with Platform() call - Plugin platforms now work as cron delivery targets User authorization (gateway/run.py _is_user_authorized): - Registry fallback: checks PlatformEntry.allowed_users_env and allow_all_env when platform not in hardcoded maps - Plugin platforms get per-platform auth support _UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag Channel directory: includes plugin platforms in session enumeration Orphaned config warning: descriptive message when plugin platform is in config but no plugin registered it Gateway weakref: _gateway_runner_ref for cross-module adapter access hermes status: shows plugin platforms with (plugin) tag hermes gateway setup: plugin platforms appear in menu with setup hints hermes_cli/platforms.py: get_all_platforms() merges with registry, platform_label() falls back to registry for plugin names - 8 new tests (extended fields, cron resolution, platforms merge) - Updated 3 tests for new Platform() based resolution - 2829 passed, 24 pre-existing failures, zero new failures	2026-04-29 21:56:51 -07:00
Teknium	8f144fe36b	feat: pluggable platform adapter registry + IRC reference implementation Adds a platform adapter plugin interface so anyone can create new gateway platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying core gateway code. - PlatformEntry dataclass: name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, source - PlatformRegistry singleton with register/unregister/create_adapter - _create_adapter() in gateway/run.py checks registry first, falls through to existing if/elif chain for built-in platforms - Platform._missing_() accepts unknown string values, creating cached pseudo-members so Platform('irc') is Platform('irc') holds true - GatewayConfig.from_dict() now parses plugin platform names from config.yaml without rejecting them - get_connected_platforms() delegates to registry for unknown platforms - PluginContext.register_platform() for plugin authors - Mirrors the existing register_tool() / register_hook() pattern - Full async IRC adapter using stdlib asyncio (zero external deps) - Connects via TLS, handles PING/PONG, nick collision, NickServ auth - Channel messages require addressing (nick: msg), DMs always dispatch - Markdown stripping for IRC-clean output, message splitting for 512-byte line limit - Config via config.yaml extra dict or IRC_* env vars - Platform enum dynamic members (identity stability, case normalization) - PlatformRegistry (register, unregister, create, validation, factory) - GatewayConfig integration (from_dict parsing, get_connected_platforms) - IRC adapter (init, send, protocol parsing, markdown, requirements) No existing platform adapters were migrated — the if/elif chain is untouched. This is Phase 1: prove the interface with a real plugin.	2026-04-29 21:56:51 -07:00
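The `Platform._missing_()` behaviour described in the commit above (unknown strings become cached pseudo-members, so identity is stable) can be illustrated with the standard `enum` hook. This is a minimal sketch under assumptions: the two built-in members, the lowercase normalization, and the rejection of empty strings are illustrative choices, not taken from the repository.

```python
from enum import Enum


class Platform(Enum):
    # Illustrative built-ins only; the real enum has more members.
    TELEGRAM = "telegram"
    DISCORD = "discord"

    @classmethod
    def _missing_(cls, value):
        # Called by Enum only when the value lookup fails.
        if not isinstance(value, str) or not value.strip():
            return None  # Enum turns this into a ValueError
        normalized = value.strip().lower()

        # Case normalization for built-in members.
        for member in cls:
            if member.value == normalized:
                return member

        # Reuse a pseudo-member created by an earlier call.
        cached = cls._value2member_map_.get(normalized)
        if cached is not None:
            return cached

        # Create and cache a pseudo-member so later lookups return the same object.
        pseudo = object.__new__(cls)
        pseudo._name_ = normalized.upper()
        pseudo._value_ = normalized
        cls._value2member_map_[normalized] = pseudo
        return pseudo
```

With this sketch, `Platform("irc") is Platform("irc")` holds, `Platform("IRC")` resolves to the same object, and iterating over `Platform` still yields only the declared members. Note that a later commit in this log restricts which names are accepted to the bundled plugin scan and the registry.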
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). 
Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
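The commit above describes a module-level pending-state store with thread-safe register/resolve/clear and double-click atomicity. One way such a store could look is sketched below. `PendingConfirms` and its method signatures are hypothetical and are not the API of `tools/slash_confirm.py`.

```python
import threading
from typing import Callable, Dict


class PendingConfirms:
    """Hypothetical pending-confirmation store keyed by session."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, Callable[[str], None]] = {}

    def register(self, key: str, on_choice: Callable[[str], None]) -> None:
        """Record a callback to run when the user picks an option."""
        with self._lock:
            self._pending[key] = on_choice

    def resolve(self, key: str, choice: str) -> bool:
        """Run the callback at most once; return False if nothing was pending."""
        with self._lock:
            # pop() under the lock makes a second click find nothing.
            callback = self._pending.pop(key, None)
        if callback is None:
            return False
        callback(choice)  # invoked outside the lock to avoid holding it during user code
        return True

    def clear(self, key: str) -> None:
        """Drop a pending confirmation without resolving it."""
        with self._lock:
            self._pending.pop(key, None)
```

Because the entry is removed while the lock is held, two near-simultaneous button presses for the same key result in exactly one callback invocation; the second `resolve` call returns `False`.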
helix4u	7fae87bc00	fix(gateway): refresh cached agents after MCP tool changes	2026-04-29 21:56:47 -07:00
memosr	d69a0b2c29	fix(security): apply ACL checks to QQBot guild messages and guild DMs to prevent allowlist bypass	2026-04-29 21:08:28 -07:00


1360 commits