hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-12 08:51:53 +00:00

Author	SHA1	Message	Date
Teknium	3b4c715e1c	fix(telegram): stripped-text fallbacks, re-finalize skip, and tail-only delete guard Follow-ups on top of the two salvaged GodsBoy commits, all live-validated against the real Telegram Bot API: - _edit_overflow_split finalize fallbacks degrade to _strip_mdv2() clean text instead of putting raw markdown markers on screen (salvaged from PR #43463 minus its format-first sizing — live probes show Telegram's 4096 limit counts PARSED text, so MarkdownV2 escape inflation cannot cause MESSAGE_TOO_LONG and sizing against formatted wire length only causes premature splits and fragment messages). - Skip the redundant requires-finalize edit after a got_done edit that split-and-delivered (salvaged from PR #43463): re-finalizing re-splits the full text into the adopted continuation and duplicates chunks. - _send_fallback_final only deletes the stale partial message when the fallback re-sent the COMPLETE final text. When the prefix dedup sent only the missing tail, the partial IS the head of the answer; deleting it left users with only the second half of long responses (live- reproduced: flood-control during a long stream -> head deleted, ratio 0.54 of content visible). This is the third bug behind the 'Telegram cut messages' reports and was present on main and both PRs.	2026-06-10 15:09:35 -07:00
GodsBoy	590b3c0d7e	fix(gateway): recover partial Telegram overflow streams	2026-06-10 15:09:35 -07:00
Teknium	cd9a9cd8e5	fix(gateway): Slack approval UX in threads — block-size overflow + typed-prefix instruction text (#43444 ) Two fixes for the reported Slack thread approval UX: 1. Slack Block Kit approval/confirm sends silently overflowed the 3000-char section-block cap (flat 2900-char truncation + header + reason), so long execute_code approvals failed with invalid_blocks and fell back to the plain-text prompt with no buttons. Budget the command preview against the rendered fixed parts so blocks never exceed the cap (send_exec_approval + send_slash_confirm). 2. The text fallbacks told users to reply /approve — which Slack blocks inside threads and Matrix clients reserve client-side. Add a typed_command_prefix capability flag on BasePlatformAdapter (default "/"; Slack and Matrix set "!" to match their existing bang-prefix rewrite) and use it in the shared fallback prompt builders (exec approval, update prompt, destructive slash confirm, expensive-model confirm) plus Matrix's reaction-prompt text. The slash-confirm text-intercept now also accepts bang-prefixed replies (!always, !cancel) since those keywords aren't registered commands and the adapters' rewrite doesn't touch them.	2026-06-10 02:30:01 -07:00
konsisumer	6a30cfca82	fix(gateway): stop typing before post-delivery callbacks (#37556)	2026-06-10 00:46:00 -07:00
Teknium	243cada157	fix(model): cover typed gateway /model path + async-safe pricing lookups Follow-ups on top of #26016's expensive-model guard: - gateway/slash_commands.py: typed '/model <name>' now routes through the expensive-model confirmation gate (slash-confirm buttons / text fallback) instead of bypassing the guard the pickers enforce. Cancel leaves the session override and --global config untouched. - telegram/discord/web_server: run expensive_model_warning() via asyncio.to_thread — it can hit models.dev or a /models endpoint on a cache miss, which would otherwise block the event loop. - telegram: picker callback no longer toasts 'Model switched!' when the switch callback raised (both mm: and mc: paths). - tests: new tests/gateway/test_model_command_expensive_confirm.py pins the typed-path gate (prompt, confirm-once, cancel, cheap-model no-op).	2026-06-10 00:24:06 -07:00
Robin Fernandes	af978ecb17	fix(model): require confirmation for expensive model selections Rebased onto current main and re-ported across the restructured surfaces: model flows now thread confirm_provider/base_url/api_key through hermes_cli/model_setup_flows.py, the Discord picker lives in plugins/platforms/discord/adapter.py, and the web dashboard picker applies chat-mode switches via config.set so the expensive-model confirmation can ride the response. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:24:06 -07:00
Joel Chan	e5580f43c2	fix(discord): propagate role_authorized flag so DISCORD_ALLOWED_ROLES works end-to-end DISCORD_ALLOWED_ROLES was checked by the Discord adapter (_is_allowed_user) but gateway._is_user_authorized only read DISCORD_ALLOWED_USERS, so role-authorized users were rejected with "Unauthorized user" at the gateway layer despite passing the adapter gate. - Add role_authorized: bool = False to SessionSource - Add role_authorized param to build_source (base.py) - Compute _role_authorized in on_message when user passes via role not user ID - Thread _role_authorized through _handle_message -> build_source - Check source.role_authorized early in _is_user_authorized (run.py) Fixes #33952	2026-06-10 00:18:11 -07:00
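The flag propagation described in the commit above can be sketched as follows. This is a minimal illustration under assumptions, not the repository's code: `SessionSource`, `build_source`, and `role_authorized` are named in the commit message, but the field set, the signatures, and the `allowed_users` parameter shown here are simplified and hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionSource:
    # Only the fields needed for this illustration; the real class has more.
    platform: str
    user_id: str
    role_authorized: bool = False  # defaults to False, as in the commit message


def build_source(platform: str, user_id: str, role_authorized: bool = False) -> SessionSource:
    # Thread the adapter-computed flag through to the source object.
    return SessionSource(platform=platform, user_id=user_id, role_authorized=role_authorized)


def is_user_authorized(source: SessionSource, allowed_users: set[str]) -> bool:
    # Check the role flag early, before falling back to the user-ID allowlist.
    if source.role_authorized:
        return True
    return source.user_id in allowed_users


# A user who passed the adapter gate via a role, and one who passed nothing.
by_role = build_source("discord", "111", role_authorized=True)
by_nobody = build_source("discord", "222")
```

With the flag carried on the source object, the gateway-level check no longer needs to know how the adapter decided; it only has to honour the result.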
loongzhao	ffcd9d7ac7	refactor(yuanbao): consolidate media resolution into dedicated pipeline middlewares	2026-06-09 03:17:00 -07:00
Teknium	3705625b74	feat(gateway): render terminal commands as bare fenced code blocks in chat (#42576) Terminal tool progress on markdown-capable gateways (Telegram, Slack, Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a fenced code block again, in all/new AND verbose modes — gated on the adapter's supports_code_blocks capability. Plain-text platforms keep the short truncated preview. No language tag is emitted: Slack mrkdwn renders a '```bash' fence with 'bash' as a literal first code line, so a bare '```' fence is used, which renders correctly on every platform that supports blocks. This restores the #41215 feature (removed in #41950 due to the command showing in group chats) as the default. For a personal assistant the command display is desired; the group-chat concern is a preference, not a vulnerability.	2026-06-08 21:19:05 -07:00
helix4u	b23184cad4	fix(api-server): bind request session context for tools	2026-06-08 20:52:08 -07:00
ruangraung	f4531feee8	fix(telegram): improve MarkdownV2 edit fallback and fix _strip_mdv2 bold handling When edit_message(finalize=True) fails with a MarkdownV2 parse error, the silent fallback previously sent raw content with escape sequences. Now it logs the error and strips markdown formatting via _strip_mdv2() for clean plain-text fallback. Also fixes _strip_mdv2 to handle standard markdown bold (**text**) before MarkdownV2 bold (*text*), preventing half-stripped asterisks. Refs: #41955, #41732	2026-06-08 15:53:16 -07:00
GodsBoy	421226e404	fix(gateway): stop terminal progress from posting the full command to messaging chats #41215 rendered a terminal tool call as a native ```bash fenced block on markdown platforms (Telegram, WhatsApp, Slack, and others), showing the full command with no truncation, in both all/new and verbose modes. That posted complete shell commands (heredocs, internal paths, destructive commands) into the chat before the final answer, visible to everyone in it. This restores the prior behavior: terminal progress shows the short, truncated preview line that every other tool already uses, capped at tool_preview_length. The supports_code_blocks capability flag is left in place for future use. CLI/TUI rendering is a separate path and was unaffected. Adds a regression test asserting terminal progress renders as a truncated preview, not a fenced bash block, even on a markdown-capable gateway. Fixes #41955	2026-06-08 15:53:00 -07:00
konsisumer	3714caa1b9	fix(session): follow compression continuations for transcript reads	2026-06-07 23:57:20 -07:00
Hariharan Ayappane	b8469a81e3	fix(weixin): add rate-limit circuit breaker	2026-06-07 22:10:17 -07:00
Teknium	2e62862784	fix(telegram): use get_running_loop in polling-conflict retry reschedule (#41716) The conflict-retry path called asyncio.get_event_loop() to reschedule itself when a retry's start_polling raised. On Python 3.11+ (our floor) that raises 'RuntimeError: There is no current event loop in thread MainThread' when no loop is attached to the thread, which is what happens when PTB dispatches this error callback. The retry never gets scheduled, the adapter goes silent-but-alive, and gateway --replace keeps spawning fresh instances that hit the same wall — the crash loop reported in #19471 (worse under multi-profile, where two bots hold the same conflict open). We are inside a coroutine here, so asyncio.get_running_loop() is the correct, guaranteed-valid replacement. Only get_event_loop() call in any platform adapter, so no sibling sites. Fixes #19471	2026-06-07 22:10:03 -07:00
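A minimal sketch of the property this commit relies on, not the adapter's actual code: inside a coroutine, `asyncio.get_running_loop()` returns the loop that is executing it, so a retry can be scheduled on that loop. Outside a running loop the same call raises `RuntimeError` rather than silently handing back an unrelated loop. The names `retry_polling` and `on_conflict` are hypothetical.

```python
import asyncio


async def retry_polling(attempt: int) -> str:
    # Hypothetical stand-in for the adapter's retry coroutine.
    return f"retry {attempt} scheduled"


async def on_conflict() -> str:
    # We are inside a coroutine, so a running loop is guaranteed to exist.
    loop = asyncio.get_running_loop()
    task = loop.create_task(retry_polling(1))
    return await task


def outside_loop_raises() -> bool:
    # Called from plain synchronous code: there is no running loop to return.
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return True
    return False


result = asyncio.run(on_conflict())
raised_outside = outside_loop_raises()
```

The failure mode is explicit in both directions: the call either returns the loop that is actually running or raises immediately.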
islam666	09a5548628	fix(weixin): refresh typing ticket on expiry to prevent stuck indicator (#38085) The WeChat iLink typing ticket has a 600-second TTL. When a long-running session exceeds that window, the cached ticket evicts from TypingTicketCache. Both send_typing and stop_typing silently returned early when the ticket was None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat client then showed the typing indicator indefinitely. Fix: add _ensure_typing_ticket() that transparently refreshes the ticket via getConfig when the cached one has expired or is missing. Both send_typing and stop_typing now call this method instead of silently no-oping. Fixes #38085	2026-06-07 21:50:57 -07:00
Brian D. Evans	ab0a6270c3	fix(slack): align thread_ts check with is_thread_reply invariant (Copilot #15464 ) Two findings from Copilot's review on #15464, both addressed: 1. ``event.get("thread_ts")`` truthy vs ``event_thread_ts != ts``: the new channel branch treated ANY truthy ``thread_ts`` as a real thread reply, but three lines below ``is_thread_reply`` is defined with the stricter ``event_thread_ts and event_thread_ts != ts`` invariant. If Slack ever ships a payload where ``thread_ts == ts`` on a thread root, the stricter check would treat it as a top-level message for the ``is_thread_reply`` path but as a thread reply for session keying — divergent behaviour. Aligned this branch to the same ``and event_thread_ts_raw != ts`` invariant. 2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring had the ternary logic backwards ("None != ts → reply_to_message_id IS set"). The code reads ``reply_to_message_id = thread_ts if thread_ts != ts else None`` — with ``thread_ts = None``, the condition is True so the expression evaluates to ``thread_ts`` itself (None), meaning the reply stays un-threaded. The test asserted the correct end-state; only the explanatory docstring was wrong. Rewrote the docstring to match the actual code flow, with the note that Copilot caught the reversal. 7/7 tests still pass. No behaviour change for the existing test_thread_reply_scopes_by_thread_even_when_shared case because ``event_thread_ts_raw = "1700000000.000000"`` and ``ts = "1700000000.000005"`` are distinct — the new ``!= ts`` guard is a no-op there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Brian D. Evans	133e0271e2	fix(slack): scope top-level channel messages by channel-only when reply_in_thread=false (#15421) Top-level Slack channel messages previously fell back to the message's own ``ts`` as a synthetic ``thread_ts``:

    thread_ts = event.get("thread_ts") or ts # ts fallback for channels

That value flows into ``build_source(thread_id=thread_ts)`` at line 1247. The gateway session store keys sessions by ``(platform, channel_id, thread_id)``, so every top-level channel message ended up on a unique session. Operators who set ``reply_in_thread: false`` in ``config.yaml`` expected all top-level channel messages to share one session (the whole point of that flag) — instead each one spawned a fresh conversation with no context carry-over.

### Fix

Three explicit cases in the channel branch:

| event.thread_ts | reply_in_thread | thread_ts for session keying |
|---|---|---|
| non-null (real thread reply) | either | event.thread_ts |
| null (top-level) | true (default) | ts (legacy: own-thread sessions) |
| null (top-level) | false | None (shared channel session) |

The outbound-reply gate at line 1264 (``reply_to_message_id = thread_ts if thread_ts != ts else None``) still works correctly in all three cases without further changes: ``None != ts`` is True, so shared-channel top-level messages don't get their reply threaded either — matching the operator's ``reply_in_thread=false`` intent end-to-end. Genuine thread replies still scope per-thread under both modes so multi-person threaded conversations can't collide with unrelated channel chatter.

### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``)

All drive the real ``SlackAdapter._handle_slack_message`` code path (not a re-implementation) via the standard pytest fixture pattern used by ``tests/gateway/test_slack.py``.
Messages @mention the bot so the mention gate doesn't drop them — the tests are specifically about what happens once the handler decides to emit a ``MessageEvent``. * ``TestChannelSessionScopeDefault`` (2 cases): - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts`` (legacy behaviour — regression guard) - Unset config behaves like ``reply_in_thread: true`` (pins the default) * ``TestChannelSessionScopeShared`` (3 cases): - ``reply_in_thread: false`` + top-level → ``thread_id is None`` (the #15421 bug 1 fix) - ``reply_to_message_id is None`` in the same case (no threaded outbound reply) - Genuine thread reply still scopes per-thread when shared mode is on — only TOP-LEVEL messages collapse to the channel session * ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases): - Thread replies get ``thread_id = event.thread_ts`` regardless of ``reply_in_thread`` — critical invariant for multi-thread channels; a regression here would leak per-thread context across threads Regression guard verified: reverted the else-branch to the legacy ``thread_ts = event.get("thread_ts") or ts`` one-liner; ``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``). Restored → 182 slack tests pass (175 existing + 7 new). Scope: this fixes #15421 bug 1 only. Bug 2 (sessions.json not persisting across compression) lives elsewhere in the session manager and is left for a separate diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Teknium	30c7913617	fix(api_server): report hermes version on /health and /health/detailed (#40620) Salvaged from #40479; re-verified on main, tightened, tested. Co-authored-by: tfournet <tfournet@users.noreply.github.com>	2026-06-07 18:38:54 -07:00
Teknium	cb83149dc6	fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607) Salvaged from #40421; re-verified on main, tightened, tested. Co-authored-by: maxmilian <maxmilian@users.noreply.github.com>	2026-06-07 17:49:38 -07:00
Teknium	dde9c0d19d	feat(gateway): render terminal tool calls as native bash code blocks on markdown platforms (#41215) Tool-progress now shows a terminal command in a ```bash fenced block — full command, no surrounding quotes, no label, no 40-char truncation — instead of the noisy `terminal: "cmd…"` line, on every platform that renders markdown code blocks (Telegram, Slack, Matrix, WhatsApp, Feishu, Weixin, Discord). Plain-text platforms keep the compact preview line. Gated on a new `BasePlatformAdapter.supports_code_blocks` capability (default False) rather than a hardcoded platform list, so plugin adapters (Discord lives in plugins/platforms/) opt in by setting the flag. Applies to both all/new and verbose progress modes, with a safe fallback when the command arg is missing or blank.	2026-06-07 17:29:55 -07:00
Teknium	0c48b7165d	hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt() before creating/updating a job (tools/cronjob_tools.py); the REST cron endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not content. This adds the same scan to both handlers so an exfiltration/injection prompt is rejected the same way regardless of which surface created the job. NOT a security boundary, defense-in-depth / parity only: the REST cron endpoints are authenticated (every handler runs _check_auth, and connect() refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented in-process heuristic, not a containment boundary (SECURITY.md 3.2). Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The report's load-bearing 'no auth by default' premise was already closed three weeks after it was filed by the API_SERVER_KEY-required guard (commit `1a9ef8314`); this lands the create/update prompt-validation parity the report also pointed at. Scanner imported defensively so a missing scanner cannot disable the cron REST API.	2026-06-07 10:04:57 -07:00
Dusk1e	3fa15b33dd	fix(feishu): fail closed for update prompt card actions	2026-06-07 06:21:37 -07:00
Dusk1e	410cb743bf	fix(slack): re-check gateway auth on approval and slash-confirm buttons	2026-06-07 06:21:37 -07:00
Teknium	3eeca4613d	fix(qqbot): stop 100% CPU spin when WebSocket is closed but not None (#31193, #31771) (#40574) _read_events() returned normally when self._ws was closed-but-non-None (the while-condition is false on entry). _listen_loop treats a normal return as a clean read, resets backoff to 0, and immediately retries — a tight busy-loop pinning CPU. Raising on entry routes it through the reconnect/backoff path instead. Co-authored-by: xushibo <xushibo@users.noreply.github.com> Co-authored-by: cnfi <cnfi@users.noreply.github.com>	2026-06-06 18:44:44 -07:00
kshitijk4poor	c37c6eaf29	refactor(gateway): migrate Home Assistant adapter to bundled plugin Move gateway/platforms/homeassistant.py into plugins/platforms/homeassistant/ following the same shape as the Mattermost and Discord migrations. - Adapter file is renamed via git mv (history is preserved). - register() exposes the platform via the plugin system instead of the hardcoded Platform.HOMEASSISTANT elif in gateway/run.py::build_adapter(). - _standalone_send() replaces the legacy _send_homeassistant() helper in tools/send_message_tool.py. Out-of-process cron delivery (deliver=homeassistant from a cron process not co-located with the gateway) now flows through the registry's standalone_sender_fn path instead of the hardcoded elif. - _is_connected() probes HASS_TOKEN via hermes_cli.gateway.get_env_value so existing connected-platform checks behave identically. The HASS_TOKEN / HASS_URL env-to-PlatformConfig seeding in gateway/config.py stays in core — same pattern bluebubbles, mattermost, and discord migrations followed. No setup_fn or apply_yaml_config_fn is registered because Home Assistant has no _setup_homeassistant wizard in hermes_cli/setup.py and no homeassistant: YAML block in config.yaml today; setup runs through the existing hermes_cli/tools_config.py toolset wizard. Test imports were rewritten across tests/gateway/test_homeassistant.py, tests/integration/test_ha_integration.py, and tests/tools/test_send_message_missing_platforms.py; the legacy (token, extra, chat_id, message)-shaped _send_homeassistant call site is preserved via a small SimpleNamespace shim in test_send_message_missing_platforms.py (same approach used when mattermost moved). - Focused HA suites (64 tests across the three rewritten files) pass. - Broader gateway/cron sweep produces 10 failures identical to main baseline (telegram approval/model-picker xdist isolation flakes, wecom_callback defusedxml issue, cron script_timeout fixture issue). Zero net new failures.	2026-06-06 11:46:24 -07:00
Teknium	54e7b74f7f	fix(gateway): plain text while busy interrupts by default again (#40590 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(gateway): plain text while busy interrupts by default again busy_input_mode (default 'interrupt') was advertised as the busy-behavior knob, but a second knob added in `7abd62719` — busy_text_mode, defaulting to 'queue' — short-circuited every plain TEXT message before busy_input_mode was consulted. Result: plain follow-ups silently queued instead of interrupting, even with busy_input_mode left at its 'interrupt' default (regression #38390, silent-queue #31588). Collapse to one source of truth: busy_input_mode drives text handling. 
busy_text_mode is kept only as a legacy explicit override for back-compat (existing queue setups keep working); when unset it follows busy_input_mode. All default fallbacks flipped queue->interrupt. The debounce mechanism is preserved and now keyed off the resolved mode. Fixes #38390, #31588.	2026-06-06 09:00:10 -07:00
Teknium	947e21b3d6	fix(gateway): log silent file-delivery drops (#39767) When the agent's reply references a deliverable file path that does not exist on disk, extract_local_files dropped it from native delivery with no log line — the most common reason a promised file never arrives over a messaging platform. Add an INFO log at that drop point so the gap is visible in gateway.log instead of vanishing. Also convert the two print() calls in Telegram's send_document / send_video exception handlers to logger.warning(exc_info=True). print() writes to stdout, which 'hermes logs' never captures, so outbound upload failures (oversized files, Bot API rejections) were invisible.	2026-06-05 04:50:04 -07:00
Ben Barclay	b434f8c3e0	fix(deps): promote markdown to a core dependency so rich delivery works out of the box (#32486 ) (#38649 ) `markdown` was declared only in the `matrix` optional extra, and the official Docker image installs `--extra all --extra messaging --extra anthropic --extra bedrock --extra azure-identity --extra hindsight` — notably NOT `--extra matrix` (the matrix extra is deliberately routed to lazy-install because `mautrix[encryption]`/`python-olm` can't build on Windows/macOS — see the 2026-05-12 policy comment in `[all]`). Result: `markdown` never lands in the image venv, so the Markdown->HTML conversion on the DEFAULT delivery path silently falls back to plain text. Cron/agent deliveries render raw `##`/`**`/tables in clients like Element (no `formatted_body`). The conversion is now used by BOTH `gateway/platforms/matrix.py` and `tools/send_message_tool.py`, so it is no longer matrix-specific. `markdown` is a pure-Python `py3-none-any` wheel (~108KB, no compiled extensions, no platform constraints), so none of the reasons the matrix extra was lazy-routed apply to it. Promote it to a core dependency so it ships in the wheel, the Docker image, and every install; drop the now redundant copies from the `matrix` extra and the `platform.matrix` lazy-deps group; refresh the stale "installed with the matrix extra" docstring. Verified against a real build: ran the image's exact `uv sync` command (same extras, no `--extra matrix`) in a clean container off the new lockfile -> `import markdown` succeeds (3.10.2). On `origin/main` the same command leaves markdown absent. 223 targeted tests pass (test_matrix.py + test_lazy_deps.py). Closes #32486.	2026-06-04 16:46:36 -07:00
teknium1	2982122be7	fix(gateway): deliver $HOME deliverables on root-run gateways Root-run gateways have $HOME=/root, which is on the MEDIA system-path denylist, so the gateway silently dropped agent-generated deliverables under /root (e.g. /root/work/proposal.docx) — the user got a 'here is your file' reply with nothing attached. _path_under_denied_prefix now treats the running user's own home as deliverable: the home tree itself is no longer denied, while the more-specific denied paths inside it (~/.ssh, ~/.aws, ~/.hermes/.env, auth.json, config.yaml) stay blocked because they are separate denylist entries. The exception only matches when the denied prefix IS $HOME, so a non-root gateway still can't deliver another user's home. Diagnosis, reproduction, and the failing-case analysis are from @GodsBoy (#38108 / #38106). Implemented here as the minimal denylist fix rather than a staging/copy subsystem. Co-authored-by: GodsBoy <dhuysamen@gmail.com>	2026-06-04 07:50:22 -07:00
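The home-directory exception described in the commit above can be sketched roughly as follows. This is an illustration under assumptions, not the gateway's `_path_under_denied_prefix`: the function name, its signature, and the denylist contents here are hypothetical, chosen only to show why the more specific entries inside the home tree stay blocked.

```python
from pathlib import PurePosixPath


def path_under_denied_prefix(path: str, denied: list[str], home: str) -> bool:
    """Return True if `path` falls under a denied prefix (hypothetical sketch)."""
    candidate = PurePosixPath(path)
    home_dir = PurePosixPath(home)
    for prefix in denied:
        denied_dir = PurePosixPath(prefix)
        # The exception applies only when the denied prefix IS the running
        # user's home; the home tree as a whole is then treated as deliverable.
        if denied_dir == home_dir:
            continue
        if candidate == denied_dir or denied_dir in candidate.parents:
            return True
    return False


# Assumed denylist: the home entry plus a more specific entry inside it.
DENIED = ["/root", "/root/.ssh", "/etc"]

deliverable = path_under_denied_prefix("/root/work/proposal.docx", DENIED, home="/root")
ssh_key = path_under_denied_prefix("/root/.ssh/id_rsa", DENIED, home="/root")
other_home = path_under_denied_prefix("/root/work/proposal.docx", DENIED, home="/home/alice")
```

Because `/root/.ssh` is a separate entry, skipping the `/root` entry for a root-run gateway does not unblock it, and a gateway running as another user still sees `/root` as denied.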
annguyenNous	f7dabd3019	fix(api-server): guard json.loads against corrupted SQLite data in response cache The ResponseStore.get() method calls json.loads(row[0]) without any error handling. If the SQLite responses table contains corrupted JSON data (e.g. from a crash mid-write or disk corruption), this raises an unhandled JSONDecodeError that propagates to the caller. Fix: wrap in try/except (json.JSONDecodeError, TypeError). On parse failure, log a warning, evict the corrupted entry from the cache, and return None (consistent with the function's Optional return type).	2026-06-04 06:15:29 -07:00
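The guard described in the commit above can be sketched as follows. This is a minimal illustration, assuming a single `responses` table with `id` and `data` columns; the class name `ResponseStoreSketch`, the schema, and the `put_raw` helper are hypothetical and not taken from the repository.

```python
import json
import logging
import sqlite3
from typing import Any, Optional

logger = logging.getLogger(__name__)


class ResponseStoreSketch:
    """Hypothetical stand-in for the response cache; the schema is assumed."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS responses (id TEXT PRIMARY KEY, data TEXT)"
        )

    def put_raw(self, response_id: str, raw: str) -> None:
        # Stores the string as-is so the example can simulate corruption.
        self._conn.execute(
            "INSERT OR REPLACE INTO responses (id, data) VALUES (?, ?)",
            (response_id, raw),
        )
        self._conn.commit()

    def get(self, response_id: str) -> Optional[Any]:
        row = self._conn.execute(
            "SELECT data FROM responses WHERE id = ?", (response_id,)
        ).fetchone()
        if row is None:
            return None
        try:
            return json.loads(row[0])
        except (json.JSONDecodeError, TypeError):
            # Corrupted entry: warn, evict it, and report a cache miss.
            logger.warning("Evicting corrupted cache entry %s", response_id)
            self._conn.execute("DELETE FROM responses WHERE id = ?", (response_id,))
            self._conn.commit()
            return None


store = ResponseStoreSketch()
store.put_raw("ok", json.dumps({"status": "done"}))
store.put_raw("bad", '{"status": ')  # truncated JSON, as after a crash mid-write
good = store.get("ok")
bad = store.get("bad")
remaining = store._conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0]
```

Returning `None` keeps the caller on its ordinary cache-miss path, and evicting the row means the same corrupted entry is not parsed again on the next lookup.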
teknium1	7314757876	refactor(feishu): slim meeting-invite parser; add AUTHOR_MAP entry Collapse the payload-shape normalization helpers into one _as_dict and drop unused dataclass fields (user_type/user_role, duplicate id, bot) on the meeting-invite handler. Module 274->212 LOC, behavior unchanged. Add zhaolei.vc@bytedance.com -> zhaoleibd to release.py AUTHOR_MAP.	2026-06-04 06:15:23 -07:00
zhaolei.vc	f3bbfda6d1	feat(gateway): handle Feishu meeting invitations Change-Id: I8cf5638393dd9adb1d7be5e170ce5082b41f77fa	2026-06-04 06:15:23 -07:00
alt-glitch	a1264e9967	fix(matrix): make bang-command resolution robust + fix dead skill-command branch Follow-up to the salvaged contributor commit: - Underscore→hyphen tolerance now emits a resolvable token. Previously the detect set accepted the hyphenated variant but emit returned the raw token, so '!set_home' produced '/set_home' which the dispatcher could not resolve. Now emits '/set-home'. Aliases are left as-is — the gateway dispatcher canonicalizes them itself. - Fix dead skill-command branch: skill command keys are stored slash-prefixed (e.g. '/arxiv') in get_skill_commands(), but the check compared the bare token, so '!arxiv' never normalized. Now compares the '/candidate' form, making skill aliases (e.g. !gif-search) work. - Re-run bang normalization after Matrix reply-fallback stripping so a quoted reply whose content is a bang command reaches command parity with the slash form. - Replace silent 'except Exception: pass' with logger.debug(exc_info=True). - Add AUTHOR_MAP entry for @nepenth. Tests: +5 (underscore-alias, skill-command branch, quoted-reply bang + slash parity). 162 Matrix tests pass.	2026-06-03 17:19:27 +05:30
Chris	0022e94d74	feat(matrix): support bang command aliases	2026-06-03 17:19:27 +05:30
Fearvox	4b06c98fe4	fix(gateway): close ResponseStore + dispose unowned adapter on reconnect failure Three separate code paths in the gateway's platform reconnect loop leaked file descriptors every retry, exhausting the default 2560-fd ulimit in ~12 hours of continuous failure and turning the gateway into a zombie that raises OSError: [Errno 24] on every open() (#37011). Root cause: * APIServerAdapter.__init__ opens a ResponseStore SQLite connection that holds 2 fds (db file + WAL sidecar). * APIServerAdapter.disconnect() previously only stopped the aiohttp web server — the ResponseStore connection was never closed. * The reconnect watcher in _platform_reconnect_watcher constructs a fresh adapter on every retry attempt. When the connect call fails (3 paths: non-retryable error, retryable error, exception during connect) the adapter is dropped without ever being installed on self.adapters, so nothing else calls its disconnect(). Result: the 2 ResponseStore fds stay open until GC sweeps the unreachable object, which Python's cyclic GC does not do promptly for asyncio-bound native handles. 2 fds × 1 retry × (3600s / 300s backoff cap) ≈ 12 fds/hour. 2560 fds / 12 fds/hr ≈ 12h to ulimit exhaustion. Fix: * APIServerAdapter.disconnect() now also calls self._response_store.close() (with a try/except so a SQLite close failure doesn't abort the aiohttp teardown). * New module-level helper _dispose_unused_adapter(adapter) in gateway/run.py that calls adapter.disconnect() and swallows any exception (so half-constructed adapters whose __init__ crashed don't kill the watcher loop). * _platform_reconnect_watcher calls _dispose_unused_adapter() in all three failure paths: non-retryable, retryable, and the except Exception arm. adapter = None is initialized before the try so the except arm can see the partial construction. 
Tests: * New file tests/gateway/test_platform_reconnect_fd_leak.py with 7 regression tests covering all three failure paths, the _dispose_unused_adapter helper (None + raising-disconnect cases), and the APIServerAdapter ResponseStore close behavior (success + close-exception cases). The _CountingAdapter fixture tracks disconnect() invocations and an _open_fds counter that is decremented on dispose, so the assertion is the literal observable behavior of the leak. Refs: - Closes #37011 (the original fd-leak report) - Supersedes #37018, #37110, #37238, #37260, #37394 (7 competing open PRs all addressing the same root cause from different angles; none of them rebased cleanly against current main, and none covered all three failure paths in one fix with regression tests for both the watcher and the platform-level close behavior)	2026-06-02 17:27:44 -07:00
Teknium	787936d133	feat(gateway): structured stream-event protocol + Telegram draft formatting parity (#37250 ) Introduce a typed agent→gateway delivery contract so the gateway (not the agent) decides how each streaming event is rendered per platform. Moves toward smart-agent/smart-gateway separation while reproducing today's behavior exactly in the base class. - gateway/stream_events.py: typed event vocabulary (MessageChunk/Stop, Commentary, ToolCallChunk/Finished, LongToolHint, GatewayNotice). - gateway/stream_dispatch.py: GatewayEventDispatcher routes events through the adapter; adapters can eat events they can't render (e.g. tool chrome on plain-text platforms). - gateway/platforms/base.py: render_message_event + format_tool_event default hooks reproduce the historical emoji/preview tool formatting and consumer delegation 1:1; adapters override for native rendering. - gateway/platforms/telegram.py: send_draft now applies MarkdownV2 (format_message + parse_mode) with a plain-text fallback on BadRequest, fixing the jarring raw-text→formatted shift when the draft finalizes as a real sendMessage. - gateway/config.py: default streaming transport edit → auto. Safe globally: adapters without draft support report supports_draft_streaming()==False and transparently use edit, so only Telegram DMs gain native drafts. Presentation-only contract — nothing rendered here is persisted to conversation history, preserving cache/message-flow invariants.	2026-06-02 00:33:50 -07:00
Teknium	bd8e2ec1a6	feat(dashboard): complete admin panel — MCP catalog, enable/disable toggles, hook creation, system stats (#36736) * feat(dashboard): MCP catalog + enable/disable, webhook toggle, hook create/delete, system stats Backend for the comprehensive admin pass: - MCP: GET /api/mcp/catalog (browse Nous-approved optional-mcps), POST /api/mcp/catalog/install, PUT /api/mcp/servers/{name}/enabled - Webhooks: PUT /api/webhooks/{name}/enabled; gateway rejects disabled routes with 403 (hot-reloaded, no restart) - Hooks: POST/DELETE /api/ops/hooks — create (with consent approval) + remove; list now reports accurate allowlist status + valid events - System: GET /api/system/stats — OS/arch/python/cpu + psutil memory/disk/uptime/process, stdlib fallback All gated by dashboard auth; secrets never returned. * feat(dashboard): MCP catalog UI, enable/disable toggles, hook create, system stats - McpPage: catalog section (browse Nous-approved MCPs, one-click install with env prompts) + per-server enable/disable toggle with gateway-restart note - WebhooksPage: per-subscription enable/disable toggle (muted + badge when off) - SystemPage: new Host stats section (OS/arch/python/cpu/mem/disk/uptime/load), shell-hook create modal + delete, 'Create backup' label - api.ts: client methods + types for catalog, toggles, hook CRUD, system stats * test(dashboard): cover catalog, toggles, hook CRUD, system stats, webhook toggle Adds tests for the comprehensive pass: MCP enable/disable + catalog list + catalog-install-unknown, hook create/delete with consent, system stats shape, and webhook enable/disable. 26 tests total, all green.
* docs(dashboard): document the comprehensive admin pass + fresh screenshots Updates the MCP/Webhooks/Pairing/System sections for catalog browse+install, enable/disable toggles, hook creation, and host system stats; adds the new endpoints to the API table; replaces the screenshots with live captures of the rebuilt pages (real data, no dummies) including the hook-create modal. * feat(dashboard): curator, portal status, and prompt-size/dump/migrate ops Closes the last in-scope CLI gaps from the coverage audit: - Curator: GET /api/curator (status), PUT /api/curator/paused, POST /api/curator/run (background) - Portal: GET /api/portal (Nous auth + Tool Gateway routing, read-only) - Diagnostics: POST /api/ops/prompt-size, /api/ops/dump, /api/ops/config-migrate (backgrounded, tailed via action status) Host-bound commands (secrets/proxy/lsp/acp/computer-use/desktop/completion/postinstall/uninstall/claw) remain CLI-only by design. * feat(dashboard): curator + portal + diagnostics UI, tests - SystemPage: Nous Portal status section (auth + Tool Gateway routing), Skill curator card (status + pause/resume + run now), and three new Operations buttons (prompt size, support dump, migrate config) - api.ts: client methods + CuratorStatus/PortalStatus types - tests: curator pause/resume, portal shape, system-stats shape, + auth-gate coverage for the new GET endpoints (31 tests total) * docs(dashboard): document curator, portal, and diagnostics + refresh System screenshots Updates the System section for the Nous Portal status, Skill curator controls, and the new prompt-size/dump/migrate operations; adds them to the API table; refreshes the System screenshots (now showing Portal + Curator) and adds a dedicated curator/gateway/memory capture. * feat(dashboard): session stats/export/prune + skills hub search endpoints Completes the existing tabs' backend depth (audit vs CLI): - Sessions: GET /api/sessions/stats (store stats), GET /api/sessions/{id}/export, POST /api/sessions/prune.
/stats is registered before /{session_id} so the literal path isn't captured by the parameterized route. - Skills: GET /api/skills/hub/search — parallel multi-source hub search (threaded), returns installable identifiers - (rename via PATCH and cron-edit via PUT already existed; now surfaced in UI) * feat(dashboard): complete existing tabs — sessions mgmt, skills hub browse, cron edit Audited every existing tab against its CLI command and filled the gaps: - Sessions: store stats bar, per-row rename + export (JSON download), and a prune-old-sessions control (mirrors hermes sessions rename/export/prune/stats) - Skills: new 'Browse hub' view — search the skill hub across all sources, install by identifier with a live install log, and 'Update all' (mirrors hermes skills search/install/update) - Cron: per-job Edit modal (pre-filled) calling updateCronJob (hermes cron edit) - api.ts: renameSession/getSessionStats/exportSessionUrl/pruneSessions, updateCronJob, searchSkillsHub + types Models tab was already comprehensive (provider+model picker, dynamic per-provider lists, main + all 11 aux-task assignments, reset) — verified, no change needed. * test(dashboard): cover session stats/rename/export/prune + skills hub search Adds the route-shadowing guard for /api/sessions/stats (must not be captured by /api/sessions/{session_id}), rename/export/prune, and the empty-query short-circuit for hub search. 36 tests total, all green. * docs(dashboard): document enhanced Sessions, Skills hub, and Cron edit Sessions: stats bar, rename, export, prune (+ screenshot). Skills: new Browse hub view for search/install/update (+ screenshot). Cron: edit action. API table updated with the new endpoints.	2026-06-02 00:16:11 -04:00
teknium1	fa3b06b035	refactor(telegram): generalize observed-media caching into a reusable primitive Collapse the per-type observed-media dispatch into one platform-agnostic cache_media_bytes() helper in gateway/platforms/base.py. Any adapter can now hand it raw attachment bytes + a filename/MIME hint; it classifies against the shared MIME registries, routes to the right cache_*_from_bytes helper, sandbox-translates the path, and returns a CachedMedia with a ready context_note(). Telegram's observed-group path shrinks to: size-gate, download, call the helper, annotate. Also dedupes the addressed-media type ladder into _media_message_type(). Net: contributor's Telegram-only +595 LOC becomes a +210/-32 production change, with the reusable primitive available to Discord/Slack/Signal/etc. Co-authored-by: Glucksberg <markuscontasul@gmail.com>	2026-06-01 20:18:41 -07:00
Glucksberg	f768e75ecf	fix(telegram): cache observed group media	2026-06-01 20:18:41 -07:00
Zyrixtrex	0cd5867bbb	fix(whatsapp): honor dm_policy and group_policy open at the gateway	2026-06-01 19:51:21 -07:00
Zyrixtrex	f7a3509b25	fix(gateway): honor WECOM_ALLOWED_USERS in env-only WeCom DM allowlist	2026-06-01 19:20:36 -07:00
teknium1	abe0e19c0a	refactor(bluebubbles): simplify mention-gating helpers Collapse the three mention-parsing helpers into one _compile_mention_patterns that handles list/string/None inputs, and inline the require_mention bool coercion to match the signal/dingtalk convention. Same behavior, 16 fewer lines, no per-instance state in the staticmethod.	2026-06-01 18:52:05 -07:00
Trevin Chow	05022066ea	feat(bluebubbles): support group mention gating	2026-06-01 18:52:05 -07:00
Cao Jiguang	566669013f	fix(weixin): replace aiohttp ClientTimeout with asyncio.wait_for in _api_post/_api_get Cron delivery to WeChat fails with 'Timeout context manager should be used inside a task' because _api_post and _api_get use aiohttp's ClientTimeout directly. When the cron scheduler calls send() via asyncio.run_coroutine_threadsafe(), aiohttp cannot find a running task and raises RuntimeError. _upload_media, _download_bytes, and _download_remote_media already use asyncio.wait_for() to avoid this. Apply the same pattern to _api_post and _api_get — the two remaining iLink API helpers that still use the raw ClientTimeout approach. This fixes cron delivery errors seen on the WeChat platform adapter when meyo-external cron jobs attempt to deliver output to WeChat.	2026-06-01 17:31:40 -07:00
firefly	a1f76ba7e9	fix(gateway): recover extract-stripped tool responses on all platforms (#29346) The extract pipeline (extract_media/extract_images/extract_local_files + directive strips) can reduce a non-empty tool-using response to empty text_content with no deliverable attachment. The 'if text_content' send guard then silently skips delivery: a 'response ready' log with no 'Sending response', no error, and the answer never reaches the user. - A2: snapshot the pre-extract response; when extraction yields empty text and no image/local/media attachment, deliver the recovered original from the post-extract_media body (so a spaced MEDIA path can't leak). Applies on ALL platforms (supersedes the Discord-only #33842 and the unsafe raw-fallback #29499). - A3: loud delivery invariant - a non-empty response that produces nothing deliverable logs response_delivery_dropped at ERROR; every recovery logs response_delivery_recovered. No silent drop survives. - Factor a _strip_media_directives helper for the [[...]] strips; MEDIA stripping stays owned by extract_media, whose grammar handles spaced and quoted paths. - Salvaged + de-scoped the #33842 test harness to all platforms; added unrecoverable-drop and no-leak regression tests.	2026-06-01 17:31:32 -07:00
liuhao1024	3ccf4fdc6d	fix(gateway): skip MEDIA: tags inside code blocks and blockquotes extract_media() scanned the full response text without distinguishing live delivery tags from example paths in fenced code blocks, inline code spans, and blockquotes. This caused false positives where the agent's explanation of MEDIA: syntax (or tool output containing example paths) was stripped from user-visible text and the path was added to the media delivery list. Added _mask_protected_spans() helper that replaces protected regions with equal-length whitespace before regex matching, preserving match offsets. The helper skips backtick-quoted paths in MEDIA: tags to maintain existing path extraction behavior. Fixes #35695	2026-06-01 00:00:26 -07:00
kshitijk4poor	fb1b681b3b	fix(gateway): keep JSON-embedded MEDIA: text verbatim in cleaned output Self-review of #34375 fix: the cleanup path ran media_pattern.sub('') over the JSON-masked copy of the text, which baked the masking spaces into the user-visible 'cleaned' string — a serialized tool result like {"old":"MEDIA:/x.png"} came back as {"old":" "}. Now mask only a length-equal copy of 'cleaned' to locate the real tag spans, then delete those spans from the unmasked 'cleaned'. Real tags are stripped; JSON-embedded MEDIA: text reads back verbatim. Masking 'cleaned' (not the original 'content') keeps offsets valid after the [[audio_as_voice]] / [[as_document]] directives are removed. Adds two cleaned-text regression tests.	2026-05-31 23:51:42 -07:00
liuhao1024	e8827ef704	fix(gateway): skip MEDIA: inside serialized JSON string values Serialized tool results frequently embed a prior reply's text, e.g. {"result": "MEDIA:/path/stale.png"}. The bare-path branch of MEDIA_TAG_CLEANUP_RE matched these and re-delivered stale files (#34375). Adds BasePlatformAdapter._mask_json_string_media, which blanks (offset-preserving) only MEDIA:<bare-path> tokens that sit inside a JSON value-context string (opened by : , { or [). Legitimate tags at line start, after prose, indented, MEDIA:"quoted" form, and two-line TTS output are all left untouched. Reworked from the approach in #34388 (a line-start regex anchor), which no longer applied to current main and regressed same-line/indented tags. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-05-31 23:51:42 -07:00
helix4u	b14e15c48e	fix(gateway): clean service restart notifications	2026-05-31 21:05:53 -07:00
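
Several of the MEDIA: fixes above (3ccf4fdc6d, fb1b681b3b, e8827ef704) rely on the same idea: blank out protected regions with equal-length whitespace, match against the masked copy, then apply the resulting offsets to the unmasked text. The following is a minimal sketch of that idea only, not the repository's implementation. The function names, the regexes, and the simplified `MEDIA:` grammar are all assumptions made for illustration, and the backtick-quoted-path exception mentioned in 3ccf4fdc6d is deliberately left out.

```python
import re

# Hypothetical patterns for illustration; the real grammar is not shown in the log.
FENCED_RE = re.compile(r"```.*?```", re.DOTALL)             # fenced code blocks
INLINE_RE = re.compile(r"`[^`\n]+`")                        # inline code spans
BLOCKQUOTE_RE = re.compile(r"^[ \t]*>.*$", re.MULTILINE)    # blockquote lines
MEDIA_RE = re.compile(r"MEDIA:(\S+)")                       # simplified bare-path tag


def mask_protected_spans(text: str) -> str:
    """Replace protected regions with spaces, keeping length and newlines."""
    def blank(match: re.Match) -> str:
        # Every non-newline character becomes one space, so offsets are unchanged.
        return re.sub(r"[^\n]", " ", match.group(0))

    masked = FENCED_RE.sub(blank, text)
    masked = INLINE_RE.sub(blank, masked)
    masked = BLOCKQUOTE_RE.sub(blank, masked)
    return masked


def extract_media(text: str) -> tuple[list[str], str]:
    """Return (paths, cleaned_text), ignoring tags inside protected regions."""
    masked = mask_protected_spans(text)
    paths: list[str] = []
    kept: list[str] = []
    last = 0
    # Match on the masked copy, but slice the ORIGINAL text with the same offsets,
    # so the masking spaces never leak into the user-visible output.
    for match in MEDIA_RE.finditer(masked):
        paths.append(text[match.start(1):match.end(1)])
        kept.append(text[last:match.start()])
        last = match.end()
    kept.append(text[last:])
    return paths, "".join(kept)


sample = (
    "Use `MEDIA:/tmp/example.png` to attach a file.\n"
    "MEDIA:/tmp/real.png\n"
    "> MEDIA:/tmp/quoted.png\n"
)
paths, cleaned = extract_media(sample)
```

In this sketch only the tag on its own line is extracted and stripped. The inline-code and blockquote examples stay in the cleaned text verbatim, because spans are located on the masked copy and deleted from the unmasked one.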

1036 commits