hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Teknium	d2206c69cc	fix(qqbot): add back-compat for env var rename; drop qrcode core dep Follow-up to WideLee's salvaged PR #11582. Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename: - gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL with a one-shot deprecation warning so users on the old name aren't silently broken. - cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new name; _get_home_target_chat_id falls back to the legacy name via a _LEGACY_HOME_TARGET_ENV_VARS table. - hermes_cli/status.py + hermes_cli/setup.py: honor both names when displaying or checking for missing home channels. - hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in _EXTRA_ENV_KEYS so .env sanitization still recognizes them. Scope cleanup: - Drop qrcode from core dependencies and requirements.txt (remains in messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades gracefully when qrcode is missing, printing a 'pip install qrcode' tip and falling back to URL-only display. - Restore @staticmethod on QQAdapter._detect_message_type (it doesn't use self). Revert the test change that was only needed when it was converted to an instance method. - Reset uv.lock to origin/main; the PR's stale lock also included unrelated changes (atroposlib source URL, hermes-agent version bump, fastapi additions) that don't belong. Verified E2E: - Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the legacy name; deprecation warning logs once. - Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name, no warning. - Both set: new name wins on both surfaces. Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).	2026-04-17 15:31:14 -07:00
WideLee	103beea7a6	fix(qqbot): fix test failures after package refactor - Re-export _ssrf_redirect_guard from __init__.py - Fix _parse_json @staticmethod using self._log_tag - Update test_detect_message_type to call as instance method - Fix mock.patch path for httpx.AsyncClient in adapter submodule	2026-04-17 15:31:14 -07:00
Teknium	31e7276474	fix(gateway): consolidate per-session cleanup; close SessionDB on shutdown (#11800 ) Three closely-related fixes for shutdown / lifecycle hygiene. 1. _release_running_agent_state(session_key) helper ---------------------------------------------------- Per-running-agent state lived in three dicts that drifted out of sync across cleanup sites: self._running_agents — AIAgent per session_key self._running_agents_ts — start timestamp per session_key self._busy_ack_ts — last busy-ack timestamp per session_key Inventory before this PR: 8 sites: del self._running_agents[key] — only 1 (stale-eviction) cleaned all three — 1 cleaned _running_agents + _running_agents_ts only — 6 cleaned _running_agents only Each missed entry was a (str, float) tuple per session per gateway lifetime — small, persistent, accumulates across thousands of sessions over months. Per-platform leaks compounded. This change adds a single helper that pops all three dicts in lockstep, and replaces every bare 'del self._running_agents[key]' site with it. Per-session state that PERSISTS across turns (_session_model_overrides, _voice_mode, _pending_approvals, _update_prompt_pending) is intentionally NOT touched here — those have their own lifecycles tied to user actions, not turn boundaries. 2. _running_agents_ts cleared in _stop_impl ---------------------------------------- Was being missed alongside _running_agents.clear(); now included. 3. SessionDB close() in _stop_impl --------------------------------- The SQLite WAL write lock stayed held by the old gateway connection until Python actually exited — causing 'database is locked' errors when --replace launched a new gateway against the same file. We now explicitly close both self._db and self.session_store._db inside _stop_impl, with try/except so a flaky close on one doesn't block the other. Tests ----- tests/gateway/test_session_state_cleanup.py — 10 cases covering: * helper pops all three dicts atomically * idempotent on missing/empty keys * preserves other sessions * tolerates older runners without _busy_ack_ts attribute * thread-safe under concurrent release * regression guard: scans gateway/run.py and fails if a future contributor reintroduces 'del self._running_agents[...]' outside docstrings * SessionDB close called on both holders during shutdown * shutdown tolerates missing session_store * shutdown tolerates close() raising on one db (other still closes) Broader gateway suite: 3108 passed (vs 3100 on baseline) — failure delta is +8 net passes; the 10 remaining failures are pre-existing cross-test pollution / missing optional deps (matrix needs olm, signal/telegram approval flake, dingtalk Mock wiring), all reproduce on stashed baseline.	2026-04-17 15:18:23 -07:00
Teknium	036dacf659	feat(telegram): auto-wrap markdown tables in code blocks (#11794 ) Telegram's MarkdownV2 has no table syntax — pipes get backslash-escaped and tables render as noisy unaligned text. format_message now detects GFM-style pipe tables (header row + delimiter row + optional body) and wraps them in ``` fences before the existing MarkdownV2 conversion runs. Telegram renders fenced code blocks as monospace preformatted text with columns intact. Tables already inside an existing code block are left alone. Plain prose with pipes, lone '---' horizontal rules, and non-table content are unaffected. Closes the recurring community request to stop having to ask the agent to re-render tables as code blocks manually.	2026-04-17 14:27:26 -07:00
Teknium	eb07c05646	fix(gateway): prune stale SessionStore entries to bound memory + disk (#11789 ) SessionStore._entries grew unbounded. Every unique (platform, chat_id, thread_id, user_id) tuple ever seen was kept in RAM and rewritten to sessions.json on every message. A Discord bot in 100 servers x 100 channels x ~100 rotating users accumulates on the order of 10^5 entries after a few months; each sessions.json write becomes an O(n) fsync. Nothing trimmed this — there was no TTL, no cap, no eviction path. Changes ------- * SessionStore.prune_old_entries(max_age_days) — drops entries whose updated_at is older than the cutoff. Preserves: - suspended entries (user paused them via /stop for later resume) - entries with an active background process attached Pruning is functionally identical to a natural reset-policy expiry: SQLite transcript stays, session_key -> session_id mapping dropped, returning user gets a fresh session. * GatewayConfig.session_store_max_age_days (default 90; 0 disables). Serialized in to_dict/from_dict, coerced from bad types / negatives to safe defaults. No migration needed — missing field -> 90 days. * _session_expiry_watcher calls prune_old_entries once per hour (first tick is immediate). Uses the existing watcher loop so no new background task is created. Why not more aggressive ----------------------- 90 days is long enough that legitimate long-idle users (seasonal, vacation, etc.) aren't surprised — pruning just means they get a fresh session on return, same outcome they'd get from any other reset-policy trigger. Admins can lower it via config; 0 disables. Tests ----- tests/gateway/test_session_store_prune.py — 17 cases covering: * entry age based on updated_at, not created_at * max_age_days=0 disables; negative coerces to 0 * suspended + active-process entries are skipped * _save fires iff something was removed * disk JSON reflects post-prune state * thread safety against concurrent readers * config field roundtrips + graceful fallback on bad values * watcher gate logic (first tick prunes, subsequent within 1h don't) 119 broader session/gateway tests remain green.	2026-04-17 13:48:49 -07:00
Brooklyn Nicholson	1f37ef2fd1	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-17 08:59:33 -05:00
Young Sherlock	8dcd08d8bb	Fix Weixin media uploads and refresh lockfile	2026-04-17 06:50:36 -07:00
Teknium	3f3d8a7b24	fix(discord): strip mention syntax from auto-thread names Previously a message like `<@&1490963422786093149> help` would spawn a thread literally named `<@&1490963422786093149> help`, exposing raw Discord mention markers in the thread list. Only user mentions (`<@id>`) were being stripped upstream — role mentions (`<@&id>`) and channel mentions (`<#id>`) leaked through. Fix: strip all three mention patterns in `_auto_create_thread` before building the thread name. Collapse runs of whitespace left by the removal. If the entire content was mention-only, fall back to 'Hermes' instead of an empty title. Fixes #6336. Tests: two new regression guards in test_discord_slash_commands.py covering mixed-mention content and mention-only content.	2026-04-17 06:46:52 -07:00
sgaofen	32a694ad5f	fix(discord): fall back when auto-thread creation fails	2026-04-17 06:46:52 -07:00
OwenYWT	f5dc4e905d	fix(discord): skip auto-threading reply messages	2026-04-17 06:46:52 -07:00
Matteo De Agazio	93fe4b357d	fix(discord): free-response channels skip auto-threading Free-response channels already bypassed the @mention gate so users could chat inline with the bot, but auto-threading still fired on every message — spinning off a thread per message and defeating the lightweight-chat purpose. Fix: fold `is_free_channel` into `skip_thread` so threading is skipped whenever the channel is in DISCORD_FREE_RESPONSE_CHANNELS (via env or discord.free_response_channels in config.yaml). Net change: one line in _handle_message + one regression test. Partially addresses #9399. Authored by @Hypn0sis (salvaged from PR #9650; the bundled 'smart' auto-thread mode from that PR was dropped in favor of deterministic true/false semantics).	2026-04-17 06:46:52 -07:00
Teknium	8d7b7feb0d	fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction (#11565 ) * fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction The per-session AIAgent cache was unbounded. Each cached AIAgent holds LLM clients, tool schemas, memory providers, and a conversation buffer. In a long-lived gateway serving many chats/threads, cached agents accumulated indefinitely — entries were only evicted on /new, /model, or session reset. Changes: - Cache is now an OrderedDict so we can pop least-recently-used entries. - _enforce_agent_cache_cap() pops entries beyond _AGENT_CACHE_MAX_SIZE=64 when a new agent is inserted. LRU order is refreshed via move_to_end() on cache hits. - _sweep_idle_cached_agents() evicts entries whose AIAgent has been idle longer than _AGENT_CACHE_IDLE_TTL_SECS=3600s. Runs from the existing _session_expiry_watcher so no new background task is created. - The expiry watcher now also pops the cache entry after calling _cleanup_agent_resources on a flushed session — previously the agent was shut down but its reference stayed in the cache dict. - Evicted agents have _cleanup_agent_resources() called on a daemon thread so the cache lock isn't held during slow teardown. Both tuning constants live at module scope so tests can monkeypatch them without touching class state. Tests: 7 new cases in test_agent_cache.py covering LRU eviction, move_to_end refresh, cleanup thread dispatch, idle TTL sweep, defensive handling of agents without _last_activity_ts, and plain-dict test fixture tolerance. * tweak: bump _AGENT_CACHE_MAX_SIZE 64 -> 128 * fix(gateway): never evict mid-turn agents; live spillover tests The prior commit could tear down an active agent if its session_key happened to be LRU when the cap was exceeded. AIAgent.close() kills process_registry entries for the task, tears down the terminal sandbox, closes the OpenAI client (sets self.client = None), and cascades .close() into any active child subagents — all fatal if the agent is still processing a turn. Changes: - _enforce_agent_cache_cap and _sweep_idle_cached_agents now look at GatewayRunner._running_agents and skip any entry whose AIAgent instance is present (identity via id(), so MagicMock doesn't confuse lookup in tests). _AGENT_PENDING_SENTINEL is treated as 'not active' since no real agent exists yet. - Eviction only considers the LRU-excess window (first size-cap entries). If an excess slot is held by a mid-turn agent, we skip it WITHOUT compensating by evicting a newer entry. A freshly inserted session (zero cache history) shouldn't be punished to protect a long-lived one that happens to be busy. - Cache may therefore stay transiently over cap when load spikes; a WARNING is logged so operators can see it, and the next insert re-runs the check after some turns have finished. New tests (TestAgentCacheActiveSafety + TestAgentCacheSpilloverLive): - Active LRU entry is skipped; no newer entry compensated - Mixed active/idle excess window: only idle slots go - All-active cache: no eviction, WARNING logged, all clients intact - _AGENT_PENDING_SENTINEL doesn't block other evictions - Idle-TTL sweep skips active agents - End-to-end: active agent's .client survives eviction attempt - Live fill-to-cap with real AIAgents, then spillover - Live: CAP=4 all active + 1 newcomer — cache grows to 5, no teardown - Live: 8 threads racing 160 inserts into CAP=16 — settles at 16 - Live: evicted session's next turn gets a fresh agent that works 30 tests pass (13 pre-existing + 17 new). Related gateway suites (model switch, session reset, proxy, etc.) all green. * fix(gateway): cache eviction preserves per-task state for session resume The prior commits called AIAgent.close() on cache-evicted agents, which tears down process_registry entries, terminal sandbox, and browser daemon for that task_id — permanently. Fine for session-expiry (session ended), wrong for cache eviction (session may resume). Real-world scenario: a user leaves a Telegram session open for 2+ hours, idle TTL evicts the cached AIAgent, user returns and sends a message. Conversation history is preserved via SessionStore, but their terminal sandbox (cwd, env vars, bg shells) and browser state were destroyed. Fix: split the two cleanup modes. close() Full teardown — session ended. Kills bg procs, tears down terminal sandbox + browser daemon, closes LLM client. Used by session-expiry, /new, /reset (unchanged). release_clients() Soft cleanup — session may resume. Closes LLM client only. Leaves process_registry, terminal sandbox, browser daemon intact for the resuming agent to inherit via shared task_id. Gateway cache eviction (_enforce_agent_cache_cap, _sweep_idle_cached_agents) now dispatches _release_evicted_agent_soft on the daemon thread instead of _cleanup_agent_resources. All session-expiry call sites of _cleanup_agent_resources are unchanged. Tests (TestAgentCacheIdleResume, 5 new cases): - release_clients does NOT call process_registry.kill_all - release_clients does NOT call cleanup_vm / cleanup_browser - release_clients DOES close the LLM client (agent.client is None after) - close() vs release_clients() — semantic contract pinned - Idle-evicted session's rebuild with same session_id gets same task_id Updated test_cap_triggers_cleanup_thread to assert the soft path fires and the hard path does NOT. 35 tests pass in test_agent_cache.py; 67 related tests green.	2026-04-17 06:36:34 -07:00
Teknium	f64241ed90	feat(cron+tests): extend origin fallback to email/dingtalk/qqbot + fix Weixin test mocks Cron origin fallback extension (builds on #9193's _HOME_TARGET_ENV_VARS): adds the three remaining origin-fallback-eligible platforms that have home channel env vars configured in gateway/config.py but use non-generic env var names: - email → EMAIL_HOME_ADDRESS (non-standard suffix) - dingtalk → DINGTALK_HOME_CHANNEL - qqbot → QQ_HOME_CHANNEL (non-standard prefix: QQ_ not QQBOT_) Picks up the completeness intent of @Xowiek's PR #11317 using the architecturally-correct dict-based lookup from #9193, so platforms with non-standard env var names actually resolve instead of silently missing. Extended the parametrized regression test to cover the new three. Weixin test mock alignment (builds on #10091's _send_session split): Three test sites added in Batch 1 (TestWeixinSendImageFileParameterName) and Batch 3 (TestWeixinVoiceSending) mocked only adapter._session, but #10091 switched the send paths to check self._send_session. Added the companion setter so the tests stay green with the session split in place.	2026-04-17 06:26:43 -07:00
Ubuntu	5ca52bae5b	fix(gateway/weixin): split poll/send sessions, reuse live adapter for cron & send_message - gateway/platforms/weixin.py: - Split aiohttp.ClientSession into _poll_session and _send_session - Add _LIVE_ADAPTERS registry so send_weixin_direct() reuses the connected gateway adapter instead of creating a competing session - Fixes silent message loss when gateway is running (iLink token contention) - cron/scheduler.py: - Support comma-separated deliver values (e.g. 'feishu,weixin') for multi-target delivery - Delay pconfig/enabled check until standalone fallback so live adapters work even when platform is not in gateway config - tools/send_message_tool.py: - Synthesize PlatformConfig from WEIXIN_* env vars when gateway config lacks a weixin entry - Fall back to WEIXIN_HOME_CHANNEL env var for home channel resolution - tests/gateway/test_weixin.py: - Update mocks to include _send_session	2026-04-17 06:26:43 -07:00
Teknium	c60b6dc317	test(dingtalk): cover get_connected_platforms + null platform_toolsets Follow-ups to the salvaged commits in this PR: * gateway/config.py — strip trailing whitespace from youngDoo's diff (line 315 had ~140 trailing spaces). * hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})` with `config.get("platform_toolsets") or {}`. Handles the case where the YAML key is present but explicitly null (parses as None, previously crashed with AttributeError on the next line's .get(platform)). Cherry-picked from yyq4193's #9003 with attribution. * tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms covering DingTalk via extras, via env vars, disabled, and missing creds. * tests/hermes_cli/test_tools_config.py — regression test for the null platform_toolsets edge case. * scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP. Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>	2026-04-17 06:26:18 -07:00
kagura-agent	47a0dd1024	fix(dingtalk): fire-and-forget message processing & session_webhook fallback Fixes #11463: DingTalk channel receives messages but fails to reply with 'No session_webhook available'. Two changes: 1. Fire-and-forget message processing: process() now dispatches _on_message as a background task via asyncio.create_task instead of awaiting it. This ensures the SDK ACK is returned immediately, preventing heartbeat timeouts and disconnections when message processing takes longer than the SDK's ACK deadline. 2. session_webhook extraction fallback: If ChatbotMessage.from_dict() fails to map the sessionWebhook field (possible across SDK versions), the handler now falls back to extracting it directly from the raw callback data dict using both 'sessionWebhook' and 'session_webhook' key variants. Added 3 tests covering webhook extraction, fallback behavior, and fire-and-forget ACK timing.	2026-04-17 06:26:18 -07:00
Teknium	e5b880264b	fix(discord): harden DISCORD_ALLOWED_ROLES and cover gateway layer Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall #17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of #9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.	2026-04-17 05:48:26 -07:00
Teknium	7d888ab49c	test(discord): regression guard for DISCORD_ALLOW_BOTS auth bypass Six test cases covering: - DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized - DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized - DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security) - DISCORD_ALLOW_BOTS unset → same as 'none' - Humans still checked against allowlist even with allow_bots=all - Bot bypass is Discord-specific — doesn't leak to other platforms Guards against a regression where the is_bot bypass in _is_user_authorized gets moved, removed, or accidentally extended to other platforms.	2026-04-17 05:42:04 -07:00
Teknium	d7fb435e0e	fix(discord): flat /skill command with autocomplete — fits 8KB limit trivially (#11580 ) Closes #11321, closes #10259. ## Problem The nested /skill command group (category subcommand groups + skill subcommands) serialized to ~14KB with the default 75-skill catalog, exceeding Discord's ~8000-byte per-command registration payload. The entire tree.sync() rejected with error 50035 — ALL slash commands including the 27 base commands failed to register. ## Fix Replace the nested Group layout with a single flat Command: /skill name:<autocomplete> args:<optional string> Autocomplete options are fetched dynamically by Discord when the user types — they do NOT count against the per-command registration budget. So this single command registers at ~200 bytes regardless of how many skills exist. Scales to thousands of skills with no size calculations, no splitting, no hidden skills. UX improvements: - Discord live-filters by user's typed prefix against BOTH name and description, so '/skill pdf' finds 'ocr-and-documents' via its description. More discoverable than clicking through category menus. - Unknown skill name → ephemeral error pointing user at autocomplete. - Stable alphabetical ordering across restarts. ## Why not the other proposed approaches Three prior PRs tried to fit within the 8KB limit by modifying the nested layout: - #10214 (njiangk): truncated all descriptions to 'Run <name>' and category descriptions to 'Skills'. Works but destroys slash picker UX. - #11385 (LeonSGP43): 40-char description clamp + iterative trim-largest-category fallback. Works but HIDES skills the user can no longer invoke via slash — functional regression. - #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups. Preserves all skills but pollutes the slash namespace with 20 top-level commands. All three work around the symptom. The flat autocomplete design dissolves the problem — there is no payload-size pressure to manage. ## Tests tests/gateway/test_discord_slash_commands.py — 5 new test cases replace the 3 old nested-structure tests: - flat-not-nested structure assertion - empty skills → no command registered - callback dispatches the right cmd_key by name - unknown name → ephemeral error, no dispatch - large-catalog regression guard (500 skills) — command payload stays under 500 bytes regardless E2E validated against real discord.py 2.7.1: - Command registers as discord.app_commands.Command (not Group). - Autocomplete filters by name AND description (verified across several queries including description-only matches like 'pdf' → OCR skill). - 500-skill catalog returns max 25 results per autocomplete query (Discord's hard cap), filtered correctly. - Choice labels formatted as 'name — description' clamped to 100 chars.	2026-04-17 05:19:14 -07:00
Berny Linville	6ee65b4d61	fix(weixin): preserve native markdown rendering - stop rewriting markdown tables, headings, and links before delivery - keep markdown table blocks and headings together during chunking - update Weixin tests and docs for native markdown rendering Closes #10308	2026-04-17 05:01:29 -07:00
Patrick Wang	4ed6e4c1a5	refactor(weixin): drop pilk dependency from voice fallback	2026-04-17 05:01:29 -07:00
Patrick Wang	649f38390c	fix: force Weixin voice fallback to file attachments	2026-04-17 05:01:29 -07:00
Patrick Wang	678b69ec1b	fix(weixin): use Tencent SILK encoding for voice replies	2026-04-17 05:01:29 -07:00
Teknium	53da34a4fc	fix(discord): route attachment downloads through authenticated bot session (#11568 ) Three open issues — #8242, #6587, #11345 — all trace to the same root cause: the image / audio / document download paths in `DiscordAdapter._handle_message` used plain, unauthenticated HTTP to fetch `att.url`. That broke in three independent ways: #8242 cdn.discordapp.com attachment URLs increasingly require the bot session to download; unauthenticated httpx sees 403 Forbidden, image/voice analysis fail silently. #6587 Some user environments (VPNs, corporate DNS, tunnels) resolve cdn.discordapp.com to private-looking IPs. Our is_safe_url() guard correctly blocks them as SSRF risks, but the user environment is legitimate — image analysis and voice STT die. #11345 The document download path skipped is_safe_url() entirely — raw aiohttp.ClientSession.get(att.url) with no SSRF check, inconsistent with the image/audio branches. Unified fix: use `discord.Attachment.read()` as the primary download path on all three branches. `att.read()` routes through discord.py's own authenticated HTTPClient, so: - Discord CDN auth is handled (#8242 resolved). - Our is_safe_url() gate isn't consulted for the attachment path at all — the bot session handles networking internally (#6587 resolved). - All three branches now share the same code path, eliminating the document-path SSRF gap (#11345 resolved). Falls back to the existing cache_*_from_url helpers (image/audio) or an SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable or fails — preserves defense-in-depth for any future payload-schema drift that could slip a non-CDN URL into att.url. New helpers on DiscordAdapter: - _read_attachment_bytes(att) — safe att.read() wrapper - _cache_discord_image(att, ext) — primary + URL fallback - _cache_discord_audio(att, ext) — primary + URL fallback - _cache_discord_document(att, ext) — primary + SSRF-gated aiohttp fallback Tests: - tests/gateway/test_discord_attachment_download.py — 12 new cases covering all three helpers: primary path, fallback on missing .read(), fallback on validator rejection, SSRF guard on document fallback, aiohttp fallback happy-path, and an E2E case via _handle_message confirming cache_image_from_url is never invoked when att.read() succeeds. - All 11 existing document-handling tests continue to pass via the aiohttp fallback path (their SimpleNamespace attachments have no .read(), which triggers the fallback — now SSRF-gated). Closes #8242, closes #6587, closes #11345.	2026-04-17 04:59:03 -07:00
LehaoLin	504e7eb9e5	fix(gateway): wait for reconnection before dropping WebSocket sends When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily loses its connection, send() now polls is_connected for up to 15s instead of immediately returning a non-retryable failure. If the auto-reconnect completes within the window, the message is delivered normally. On timeout, the SendResult is marked retryable=True so the base class retry mechanism can attempt re-delivery. Same treatment applied to _send_media(). Adds 4 async tests covering: - Successful send after simulated reconnection - Retryable failure on timeout - Immediate success when already connected - _send_media reconnection wait Fixes #11163	2026-04-17 04:22:40 -07:00
dieutx	995177d542	fix(gateway): honor QQ_GROUP_ALLOWED_USERS in runner auth	2026-04-17 04:22:40 -07:00
Pedro Gonzalez	590c9964e1	Fix QQ voice attachment SSRF validation	2026-04-17 04:22:40 -07:00
Teknium	aca81ac7bb	test(dingtalk): cover require_mention + allowed_users gating Adds 16 regression tests for the gating logic introduced in the salvaged commit: * TestAllowedUsersGate — empty/wildcard/case-insensitive matching, staff_id vs sender_id, env var CSV population * TestMentionPatterns — compilation, case-insensitivity, invalid regex is skipped-not-raised, JSON env var, newline fallback * TestShouldProcessMessage — DM always accepted, group gating via require_mention / is_in_at_list / wake-word pattern / free_response_chats Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks unmapped emails).	2026-04-17 04:21:49 -07:00
Teknium	eabe14af1c	test(discord): update reply_mode fixture for new to_reference() wrapping Follow-up to the reply-reference fix: `_make_discord_adapter` used to return the raw fetched `Message` as the expected reference, but the adapter now wraps it via `ref_msg.to_reference(fail_if_not_exists=False)` so Discord treats a deleted target as 'send without reply chip'. Update the fixture to return the MessageReference sentinel so the 4 chunk-reference-identity tests assert against the right object. No production behavior change; only aligns the stale test fixture.	2026-04-17 04:17:56 -07:00
Teknium	ef37aa7cce	test(discord): add regression guard for non-reference send errors Follow-up to the reply-reference fix: ensure errors unrelated to the reply reference (e.g. 50013 Missing Permissions) do NOT trigger the no-reference retry path and still surface as a failed SendResult. Keeps the wider retry condition from silently swallowing unrelated API errors. Proposed in the original issue writeup (#11342) as test case `test_non_reference_errors_still_propagate`.	2026-04-17 04:17:56 -07:00
LeonSGP43	a448e7a04d	fix(discord): drop invalid reply references	2026-04-17 04:17:56 -07:00
Asunfly	7c932c5aa4	fix(dingtalk): close websocket on disconnect	2026-04-17 04:11:30 -07:00
赵晨飞	82969615bb	test(weixin): add regression test for send_image_file parameter name Add TestWeixinSendImageFileParameterName test class with two tests: - test_send_image_file_uses_image_path_parameter: verifies the correct parameter name (image_path) is used when gateway calls send_image_file - test_send_image_file_works_without_optional_params: ensures minimal params work correctly This prevents the interface from drifting again as noted by Copilot.	2026-04-17 04:09:21 -07:00
Michel Belleau	efa6c9f715	fix(discord): default allowed_mentions to block @everyone and role pings discord.py does not apply a default AllowedMentions to the client, so any reply whose content contains @everyone/@here or a role mention would ping the whole server — including verbatim echoes of user input or LLM output that happens to contain those tokens. Set a safe default on commands.Bot: everyone=False, roles=False, users=True, replied_user=True. Operators can opt back in via four DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in config.yaml. No behavior change for normal user/reply pings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 04:08:42 -07:00
Teknium	2367c6ffd5	test: remove 169 change-detector tests across 21 files (#11472 ) First pass of test-suite reduction to address flaky CI and bloat. Removed tests that fall into these change-detector patterns: 1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests that call inspect.getsource() on production modules and grep for string literals. Break on any refactor/rename even when behavior is correct. 2. Platform enum tautologies (every gateway/test_X.py): assertions like `Platform.X.value == 'x'` duplicated across ~9 adapter test files. 3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that only verify a key exists in a dict. Data-layout tests, not behavior. 4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing _fallback): tests that do parser.parse_args([...]) then assert args.field. Tests Python's argparse, not our code. 5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch cmd_X, call plugins_command with matching action, assert mock called. Tests the if/elif chain, not behavior. 6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests, test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests that mock the external API client, call our function, and assert exact kwargs. Break on refactor even when behavior is preserved. 7. Schedule-internal "function-was-called" tests (acp/test_server scheduling tests): tests that patch own helper method, then assert it was called. Kept behavioral tests throughout: error paths (pytest.raises), security tests (path traversal, SSRF, redaction), message alternation invariants, provider API format conversion, streaming logic, memory contract, real config load/merge tests. Net reduction: 169 tests removed. 38 empty classes cleaned up. Collected before: 12,522 tests Collected after: 12,353 tests	2026-04-17 01:05:09 -07:00
Teknium	3438d274f6	fix(dingtalk): repair _extract_text for dingtalk-stream >= 0.20 SDK shape The cherry-picked SDK compat fix (previous commit) wired process() to parse CallbackMessage.data into a ChatbotMessage, but _extract_text() was still written against the pre-0.20 payload shape: * message.text changed from dict {content: ...} → TextContent object. The old code's str(text) fallback produced 'TextContent(content=...)' as the agent's input, so every received message came in mangled. * rich_text moved from message.rich_text (list) to message.rich_text_content.rich_text_list. This preserves legacy fallbacks (dict-shaped text, bare rich_text list) while handling the current SDK layout via hasattr(text, 'content'). Adds regression tests covering: * webhook domain allowlist (api., oapi., and hostile lookalikes) * _IncomingHandler.process is a coroutine function * _extract_text against TextContent object, dict, rich_text_content, legacy rich_text, and empty-message cases Also adds kevinskysunny to scripts/release.py AUTHOR_MAP (release CI blocks unmapped emails).	2026-04-17 00:52:35 -07:00
Teknium	816e3e3774	test(feishu): cover new SDK event handler registrations Extends test_build_event_handler_registers_reaction_and_card_processors to assert that register_p2_im_chat_access_event_bot_p2p_chat_entered_v1 and register_p2_im_message_recalled_v1 are called when building the event handler, matching the production registrations. Also adds Fatty911 to scripts/release.py AUTHOR_MAP for credit on the salvaged event-handler fix.	2026-04-16 22:08:11 -07:00
Teknium	24fa055763	fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373 ) * docs: fix ascii-guard border alignment errors Three docs pages had ASCII diagram boxes with off-by-one column alignment issues that failed docs-site-checks CI: - architecture.md: outer box is 71 cols but inner-box content lines and border corners were offset by 1 col, making content-line right border at col 70/72 while top/bottom border was at col 71. Inner boxes also had border corners at cols 19/36/53 but content pipes at cols 20/37/54. Rewrote the diagram with consistent 71-col width throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with 2-space gaps and 15-space trailing padding. - gateway-internals.md: same class of issue — outer box at 51 cols, inner content lines varied 52-54 cols. Rewrote with consistent 51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also restructured the bottom-half message flow so it's bare text (not half-open box cells) matching the intent of the original. - agent-loop.md line 112-114: box 2 (API thread) content lines had one extra space pushing the right border to col 46 while the top and bottom borders of that box sat at col 45. Trimmed one trailing space from each of the three content lines. All 123 docs files now pass `npm run lint:diagrams`: ✓ Errors: 0 (warnings: 6, non-fatal) Pre-existing failures on main — unrelated to any open PR. * test(setup): accept description kwarg in prompt_choice mock lambdas setup.py's `_curses_prompt_choice` gained an optional `description` parameter (used for rendering context hints alongside the prompt). `prompt_choice` forwards it via keyword arg. The two existing tests mocked `_curses_prompt_choice` with lambdas that didn't accept the new kwarg, so the forwarded call raised TypeError. Fix: add `description=None` to both mock lambda signatures so they absorb the new kwarg without changing behavior. * test(matrix): update stale audio-caching assertion test_regular_audio_has_http_url asserted that non-voice audio messages keep their HTTP URL and are NOT downloaded/cached. That was true when the caching code only triggered on `is_voice_message`. Since `bec02f37` (encrypted-media caching refactor), matrix.py caches all media locally — photos, audio, video, documents — so downstream tools can read them as real files via media_urls. This applies to regular audio too. Renamed the test to `test_regular_audio_is_cached_locally`, flipped the assertions accordingly, and documented the intentional behavior change in the docstring. Other tests in the file (voice-specific caching, message-type detection, reply-to threading) continue to pass. * test(413): allow multi-pass preflight compression run_agent.py's preflight compression runs up to 3 passes in a loop for very large sessions (each pass summarizes the middle N turns, then re-checks tokens). The loop breaks when a pass returns a message list no shorter than its input (can't compress further). test_preflight_compresses_oversized_history used a static mock return value that returned the same 2 messages regardless of input, so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break), making call_count == 2. The assert_called_once() assertion was strictly wrong under the multi-pass design. The invariant the test actually cares about is: preflight ran, and its first invocation received the full oversized history. Replaced the count assertion with those two invariants. * docs: drop '...' from gateway diagram, merge side-by-side boxes ascii-guard 2.3.0 flagged two remaining issues after the initial fix pass: 1. gateway-internals.md L33: the '...' suffix after inner box 3's right border got parsed as 'extra characters after inner-box right border'. Dropped the '...' — the surrounding prose already conveys 'and more platforms' without needing the visual hint. 2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side boxes of different heights (main thread 7 rows, API thread 5 rows). Even equalizing heights didn't help — the linter treats the left box's right border as the end of the diagram. Merged into a single 54-char-wide outer box with both threads labeled as regions inside, keeping the ▶ arrow to preserve the main→API flow direction.	2026-04-16 20:43:41 -07:00
Teknium	7af9bf3a54	fix(feishu): queue inbound events when adapter loop not ready (#5499 ) (#11372 ) Inbound Feishu messages arriving during brief windows when the adapter loop is unavailable (startup/restart transitions, network-flap reconnect) were silently dropped with a WARNING log. This matches the symptom in issue #5499 — and users have reported seeing only a subset of their messages reach the agent. Fix: queue pending events in a thread-safe list and spawn a single drainer thread that replays them once the loop becomes ready. Covers these scenarios: * Queue events instead of dropping when loop is None/closed * Single drainer handles the full queue (not thread-per-event) * Thread-safe with threading.Lock on the queue and schedule flag * Handles mid-drain bursts (new events arrive while drainer is working) * Handles RuntimeError if loop closes between check and submit * Depth cap (1000) prevents unbounded growth during extended outages * Drops queue cleanly on disconnect rather than holding forever * Safety timeout (120s) prevents infinite retention on broken adapters Based on the approach proposed in #4789 by milkoor, rewritten for thread-safety and correctness. Test plan: * 5 new unit tests (TestPendingInboundQueue) — all passing * E2E test with real asyncio loop + fake WS thread: 10-event burst before loop ready → all 10 delivered in order * E2E concurrent burst test: 20 events queued, 20 more arrive during drainer dispatch → all 40 delivered, no loss, no duplicates * All 111 existing feishu tests pass Related: #5499, #4789 Co-authored-by: milkoor <milkoor@users.noreply.github.com>	2026-04-16 20:36:59 -07:00
Brooklyn Nicholson	7f1204840d	test(tui): fix stale mocks + xdist flakes in TUI test suite All 61 TUI-related tests green across 3 consecutive xdist runs. tests/tui_gateway/test_protocol.py: - rename `get_messages` → `get_messages_as_conversation` on mock DB (method was renamed in the real backend, test was still stubbing the old name) - update tool-message shape expectation: `{role, name, context}` matches current `_history_to_messages` output, not the legacy `{role, text}` tests/hermes_cli/test_tui_resume_flow.py: - `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes setup" before `_launch_tui` was ever reached; 3 tests stubbed `_resolve_last_session` + `_launch_tui` but not the gate - factored a `main_mod` fixture that stubs `_has_any_provider_configured`, reused by all three tests tests/test_tui_gateway_server.py: - `test_config_set_personality_resets_history_and_returns_info` was flaky under xdist because the real `_write_config_key` touches `~/.hermes/config.yaml`, racing with any other worker that writes config. Stub it in the test.	2026-04-16 19:07:49 -05:00
helix4u	5d7d574779	fix(gateway): let /queue bypass active-session guard	2026-04-16 16:36:40 -07:00
Brooklyn Nicholson	3746c60439	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-16 18:25:49 -05:00
Siddharth Balyan	d38b73fa57	fix(matrix): E2EE and migration bugfixes (#10860 ) * - make buffered streaming - fix path naming to expand `~` for agent. - fix stripping of matrix ID to not remove other mentions / localports. * fix(matrix): register MembershipEventDispatcher for invite auto-join The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE events are only dispatched when MembershipEventDispatcher is registered on the client. Without it, _on_invite is dead code and the bot silently ignores all room invites. Closes #10094 Closes #10725 Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz) * fix(matrix): preserve _joined_rooms reference for CryptoStateStore connect() reassigned self._joined_rooms = set(...) after initial sync, orphaning the reference captured by _CryptoStateStore at init time. find_shared_rooms() returned [] forever, breaking Megolm session rotation on membership changes. Mutate in place with clear() + update() so the CryptoStateStore reference stays valid. Refs #8174, PR #8215 * fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race mautrix auto-registers DecryptionDispatcher when client.crypto is set. The adapter also registered _on_encrypted_event for the same event type. _on_encrypted_event had zero awaits and won the race to mark event IDs in the dedup set, causing _on_room_message to drop successfully decrypted events from DecryptionDispatcher. The retry loop masked this by re-decrypting every message ~4 seconds later. Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption; genuinely undecryptable events are logged by mautrix and retried on next key exchange. Refs #8174, PR #8215 * fix(matrix): re-verify device keys after share_keys() upload Matrix homeservers treat ed25519 identity keys as immutable per device. share_keys() can return 200 but silently ignore new keys if the device already exists with different identity keys. The bot would proceed with shared=True while peers encrypt to the old (unreachable) keys. Now re-queries the server after share_keys() and fails closed if keys don't match, with an actionable error message. Refs #8174, PR #8215 * fix(matrix): encrypt outbound attachments in E2EE rooms _upload_and_send() uploaded raw bytes and used the 'url' key for all rooms. In E2EE rooms, media must be encrypted client-side with encrypt_attachment(), the ciphertext uploaded, and the 'file' key (with key/iv/hashes) used instead of 'url'. Now detects encrypted rooms via state_store.is_encrypted() and branches to the encrypted upload path. Refs: PR #9822 (charles-brooks) * fix(matrix): add stop_typing to clear typing indicator after response The adapter set a 30-second typing timeout but never cleared it. The base class stop_typing() is a no-op, so the typing indicator lingered for up to 30 seconds after each response. Closes #6016 Refs: PR #6020 (r266-tech) * fix(matrix): cache all media types locally, not just photos/voice should_cache_locally only covered PHOTO, VOICE, and encrypted media. Unencrypted audio/video/documents in plaintext rooms were passed as MXC URLs that require authentication the agent doesn't have, resulting in 401 errors. Refs #3487, #3806 * fix(matrix): detect stale OTK conflict on startup and fail closed When crypto state is wiped but the same device ID is reused, the homeserver may still hold one-time keys signed with the previous identity key. Identity key re-upload succeeds but OTK uploads fail with "already exists" and a signature mismatch. Peers cannot establish new Olm sessions, so all new messages are undecryptable. Now proactively flushes OTKs via share_keys() during connect() and catches the "already exists" error with an actionable log message telling the operator to purge the device from the homeserver or generate a fresh device ID. Also documents the crypto store recovery procedure in the Matrix setup guide. Refs #8174 * docs(matrix): improve crypto recovery docs per review - Put easy path (fresh access token) first, manual purge second - URL-encode user ID in Synapse admin API example - Note that device deletion may invalidate the access token - Add "stop Synapse first" caveat for direct SQLite approach - Mention the fail-closed startup detection behavior - Add back-reference from upgrade section to OTK warning * refactor(matrix): cleanup from code review - Extract _extract_server_ed25519() and _reverify_keys_after_upload() to deduplicate the re-verification block (was copy-pasted in two places, three copies of ed25519 key extraction total) - Remove dead code: _pending_megolm, _retry_pending_decryptions, _MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after removing _on_encrypted_event - Remove tautological TestMediaCacheGate (tested its own predicate, not production code) - Remove dead TestMatrixMegolmEventHandling and TestMatrixRetryPendingDecryptions (tested removed methods) - Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator - Trim comment to just the "why"	2026-04-17 04:03:02 +05:30
Brooklyn Nicholson	842a122964	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-16 15:37:28 -05:00
asheriif	6c34bf3d00	fix(gateway): fix matrix read receipts	2026-04-16 13:18:12 -07:00
Brooklyn Nicholson	9c71f3a6ea	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-16 10:47:41 -05:00
jackjin1997	f5ac025714	fix(gateway): guard pending_event.channel_prompt against None in recursive _run_agent Initialize next_channel_prompt before the pending_event check and use getattr with None default, matching the existing pattern for next_source/next_message/next_message_id. Prevents AttributeError when pending_event is None (interrupt path). Cherry-picked from #10953 by @jackjin1997.	2026-04-16 07:45:27 -07:00
Teknium	77bdad5b02	fix(tests): resolve 12 CI failures + 10 errors across 6 root causes (#11040 ) Group A (3 tests): 'No LLM provider configured' RuntimeError - test_user_message_surrogates_sanitized, test_counters_initialized_in_init, test_openai_prompt_tokens_unchanged - Root cause: AIAgent.__init__ now requires base_url alongside api_key to skip resolve_provider_client() (which returns None when API keys are blanked in CI). Added base_url='http://localhost:1234/v1' to test agent construction. Group B (5 tests): Discord slash command auto-registration - test_auto_registers_missing_gateway_commands, test_auto_registered_command_, test_register_skill_group_ - Root cause: xdist workers that loaded a discord mock WITHOUT app_commands.Command/Group caused _register_slash_commands() to fail silently. Added comprehensive shared discord mock in tests/gateway/conftest.py (same pattern as existing telegram mock). Group C (5 errors): Discord reply mode 'NoneType has no DMChannel' - All TestReplyToText tests - Root cause: FakeDMChannel was not a subclass of real discord.DMChannel, so isinstance() checks in _handle_message failed when running in full suite (real discord installed). Made FakeDMChannel inherit from discord.DMChannel when available. Removed fragile monkeypatch approach. Group D (2 tests): detect_provider_for_model wrong provider - test_openrouter_slug_match (got 'ai-gateway'), test_bare_name_gets_ openrouter_slug (got 'copilot') - Root cause: ai-gateway, copilot, and kilocode are multi-vendor aggregators that list other providers' models (OpenRouter-style slugs). They were being matched in Step 1 before OpenRouter. Added all three to _AGGREGATORS set so they're skipped like nous/openrouter. Group E (1 test): model_flow_custom StopIteration - test_model_flow_custom_saves_verified_v1_base_url - Root cause: 'Display name' prompt was added after the test was written. The input iterator had 5 answers but the flow now asks 6 questions. Added 6th empty string answer. Group F (1 test): Telegram proxy env assertion - test_uses_proxy_env_for_primary_and_fallback_transports - Root cause: _resolve_proxy_url() now checks TELEGRAM_PROXY first (via resolve_proxy_url('TELEGRAM_PROXY')). Test didn't clear this env var, allowing potential leakage from other tests in xdist workers. Added TELEGRAM_PROXY to the cleanup list.	2026-04-16 06:49:36 -07:00
Teknium	3c42064efc	fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029 ) config.yaml terminal.cwd is now the single source of truth for working directory. MESSAGING_CWD and TERMINAL_CWD in .env are deprecated with a migration warning. Changes: 1. config.py: Remove MESSAGING_CWD from OPTIONAL_ENV_VARS (setup wizard no longer prompts for it). Add warn_deprecated_cwd_env_vars() that prints a migration hint when deprecated env vars are detected. 2. gateway/run.py: Replace all MESSAGING_CWD reads with TERMINAL_CWD (which is bridged from config.yaml terminal.cwd). MESSAGING_CWD is still accepted as a backward-compat fallback with deprecation warning. Config bridge skips cwd placeholder values so they don't clobber the resolved TERMINAL_CWD. 3. cli.py: Guard against lazy-import clobbering — when cli.py is imported lazily during gateway runtime (via delegate_tool), don't let load_cli_config() overwrite an already-resolved TERMINAL_CWD with os.getcwd() of the service's working directory. (#10817) 4. hermes_cli/main.py: Add 'hermes memory reset' command with --target all/memory/user and --yes flags. Profile-scoped via HERMES_HOME. Migration path for users with .env settings: Remove MESSAGING_CWD / TERMINAL_CWD from .env Add to config.yaml: terminal: cwd: /your/project/path Addresses: #10225, #4672, #10817, #7663	2026-04-16 06:48:33 -07:00
kshitijk4poor	a6142a8e08	fix: follow-up for salvaged PR #10854 - Extract duplicated activity-callback polling into shared touch_activity_if_due() helper in tools/environments/base.py - Use helper from both base.py _wait_for_process and code_execution_tool.py local polling loop (DRY) - Add test assertion that timeout output field contains the timeout message and emoji (#10807) - Add stream_consumer test for tool-boundary fallback scenario where continuation is empty but final_text differs from visible prefix (#10807)	2026-04-16 06:42:45 -07:00

1 2 3 4 5 ...

608 commits