Second migration of an existing built-in platform adapter after Discord
(PR #30591) — follows the same shape established by IRC / Teams / LINE /
Google Chat / SimpleX and the playbook in
`references/platform-plugin-migration.md`. Advances the umbrella refactor
in #3823.
Matches Discord's parity bar — adapter under `plugins/platforms/mattermost/`
with the standard `__init__.py` / `adapter.py` / `plugin.yaml` shell,
`register(ctx)` entry point, **no back-compat shim** at the old import
path, and full parity for all five hooks Discord uses plus the
`apply_yaml_config_fn` hook (mattermost is the second consumer of #25443
after Discord):
* `standalone_sender_fn` — out-of-process cron delivery via Mattermost
REST API. Picks up the thread_id + media_files capabilities the
legacy `_send_mattermost` lacked (parity with Discord's `_standalone_send`).
* `setup_fn` — interactive `hermes setup gateway` wizard.
* `apply_yaml_config_fn` — translates `config.yaml` `mattermost:` keys
(`require_mention`, `free_response_channels`, `allowed_channels`) into
`MATTERMOST_*` env vars (replaces the hardcoded block in
`gateway/config.py`).
* `is_connected` — declares connection state from `MATTERMOST_TOKEN` +
`MATTERMOST_URL`.
* `check_fn` — verifies aiohttp is installed and both required env vars
are set.
* plus `allowed_users_env`, `allow_all_env`, `cron_deliver_env_var`,
`max_message_length` (4000 — Mattermost practical limit), `emoji`,
`required_env`, `install_hint`.
Files
-----
* `gateway/platforms/mattermost.py` (873 LOC) →
`plugins/platforms/mattermost/adapter.py` (git rename, R071) +
appended `register()` block, hook helpers, and `_standalone_send`
with media upload + thread_id support.
* New `plugins/platforms/mattermost/{__init__.py, plugin.yaml}` with
`requires_env` / `optional_env` declarations covering MATTERMOST_URL,
MATTERMOST_TOKEN, MATTERMOST_ALLOWED_USERS, MATTERMOST_ALLOW_ALL_USERS,
MATTERMOST_HOME_CHANNEL, MATTERMOST_REPLY_MODE,
MATTERMOST_REQUIRE_MENTION, MATTERMOST_FREE_RESPONSE_CHANNELS,
MATTERMOST_ALLOWED_CHANNELS.
* `gateway/config.py`: delete 17-LOC `mattermost_cfg` YAML→env bridge
(moved into plugin's `_apply_yaml_config`).
* `gateway/run.py::_create_adapter`: delete `Platform.MATTERMOST elif` —
replaced by the existing generic plugin-registry-first dispatch.
* `tools/send_message_tool.py`: delete `_send_mattermost` (22 LOC) +
`Platform.MATTERMOST elif` in `_send_to_platform` — the `else` branch
already routes plugin platforms through `_send_via_adapter`, which
hits the registry's `standalone_sender_fn`.
* `hermes_cli/setup.py`: delete `_setup_mattermost` (44 LOC) — replaced
by the plugin's `interactive_setup`.
* `hermes_cli/gateway.py`: delete `_PLATFORMS["mattermost"]` dict entry
(3 LOC) — plugin's `setup_fn` is dispatched via the plugin path in
`_configure_platform`.
* Consumer rewrite: 5 test files (test_mattermost.py,
test_media_download_retry.py, test_send_multiple_images.py,
test_stream_consumer.py, test_ws_auth_retry.py) get
`gateway.platforms.mattermost` → `plugins.platforms.mattermost.adapter`
with the bulk-rewrite recipe from the platform-plugin-migration playbook.
Single `mock.patch` string in test_stream_consumer.py also repointed.
* `tests/tools/test_send_message_missing_platforms.py`: thin
`(token, extra, chat_id, message)` compat shim around the plugin's
`_standalone_send(pconfig, …)` so existing test bodies continue to
work without rewriting every signature.
Validation
----------
* Plugin discovery: mattermost registers from `plugins/platforms/mattermost/`
alongside discord / teams / irc / line / google_chat / simplex.
All 9 hooks present (setup_fn, standalone_sender_fn,
apply_yaml_config_fn, is_connected, check_fn, allowed_users_env,
allow_all_env, cron_deliver_env_var, max_message_length=4000).
* Mattermost-touching tests: 62/62 pass
(`test_mattermost.py` + `test_send_message_missing_platforms.py`).
* Targeted selectors (mattermost or platform_registry or stream_consumer
or ws_auth_retry or media_download_retry or send_multiple_images or
send_message_tool or platform_connected): 433/433 pass.
* Full sweep (`scripts/run_tests.sh tests/gateway/ tests/cron/
tests/tools/test_send_message_tool.py tests/tools/test_send_message_missing_platforms.py
tests/integration/`): **6220/6220 pass in 47.8s, 0 failures**.
* Lint: ruff clean on all touched files.
* Git identity verified: kshitijk4poor.
* Rename detection: R071 (similarity dropped from a hypothetical R09x
by the ~320-line appended register block — ~36% growth over the
873-LoC base, vs Discord's 5101 LoC base which kept R091).
Closes part of #3823.
After sustained Bad Gateway / TimedOut reconnect cycles, the PTB httpx
client can enter a state where bot.send_message() returns a valid
Message (real message_id) but the message never reaches the recipient.
TelegramAdapter.send returns SendResult(success=True) and cron's
live-adapter branch marks the run delivered while the message is
silently dropped.
Add a _send_path_degraded flag. _handle_polling_network_error sets it
on reconnect storms; the existing _verify_polling_after_reconnect
heartbeat probe clears it once getMe() confirms the Bot client is
healthy. While the flag is set, send() short-circuits with
SendResult(success=False, retryable=True) so cron falls through to
the standalone delivery path (fresh HTTP session).
Closes#31165.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
Fixes#31116 — two distinct bugs in fresh-install Matrix gateway:
1. Matrix E2EE setup installed only mautrix[encryption], leaving asyncpg
/ aiosqlite / Markdown / aiohttp-socks uninstalled. The first encrypted
connect failed with 'No module named asyncpg' deep inside
MatrixAdapter.connect(). Root cause: the setup wizard hand-rolled a
pip install of one package instead of using lazy_deps.ensure(
'platform.matrix'), and check_matrix_requirements() short-circuited the
runtime installer on 'import mautrix' alone — so the other 4 packages
were never pulled in.
2. Discord auto-enabled itself on every gateway start, even when the user
never selected Discord and had no DISCORD_BOT_TOKEN. Root cause:
gateway/config.py plugin-enablement loop gated enablement on
entry.check_fn() (just 'is the SDK importable?') and ignored
entry.is_connected (the 'did the user configure credentials?' probe).
Same bug class as commit 7849a3d73 fixed for _platform_status in the
setup wizard; this is the runtime counterpart. Affects Discord, Teams,
and Google Chat.
Changes:
- hermes_cli/setup.py::_setup_matrix — install via
lazy_deps.ensure('platform.matrix') to pull the full feature group.
- gateway/platforms/matrix.py::_check_e2ee_deps — verify asyncpg +
aiosqlite + PgCryptoStore in addition to OlmMachine, so E2EE failures
surface at startup instead of at first encrypted-room connect.
- gateway/platforms/matrix.py::check_matrix_requirements — use
feature_missing('platform.matrix') as the install gate instead of a
single 'import mautrix' check, so partial installs trigger the lazy
installer correctly.
- gateway/config.py plugin-enablement loop — consult entry.is_connected
before flipping enabled=True. Explicit YAML enabled=true still wins.
Tests: 3 new in tests/gateway/test_matrix.py (asyncpg-required,
aiosqlite-required, partial-install lazy-runs), 5 new in
tests/gateway/test_platform_registry.py (is_connected=False blocks,
is_connected=True enables, is_connected=None falls back to check_fn,
raising probe doesn't enable, explicit YAML wins).
Validation: 310 tests across affected test modules pass.
response_store.db (api server) holds conversation history including tool
payloads, prompts, and results. webhook_subscriptions.json holds per-route
HMAC secrets. Under a permissive umask (e.g. 0o022, default on most
distros) both files were created mode 0o644 — readable by other local
users on shared boxes.
- gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM
sidecars to 0o600 after __init__, then trusts the inode. (Original
contributor patch chmod'd after every _commit() — wasteful on a hot
api_server path; chmod-on-create is sufficient since SQLite preserves
mode bits across writes.)
- hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp
(which itself creates the file with 0o600), chmods the temp before the
atomic rename, and re-asserts 0o600 on the destination so an existing
permissive file from before this fix gets narrowed.
Tests cover (a) creation under permissive umask leaves 0o600 and (b) an
existing 0o644 webhook_subscriptions.json gets narrowed on next save.
Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply
on Windows.
Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py
side from chmod-on-every-commit to chmod-on-create.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote
could previously prove endpoint control by sending a url_verification
payload with any attacker-controlled challenge string — the handler
reflected the challenge BEFORE running the token check.
Move the verification_token check ahead of the url_verification echo so
the challenge response is gated on a valid token. Add a regression test
covering the wrong-token case. Also fix the stale
test_connect_webhook_mode_starts_local_server fixture to set
FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret).
Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder
and its regression test; dropped the host-conditional weakening of the
#30746 secret guard (we want webhook secrets required regardless of
bind host, not only on 0.0.0.0/::).
Docs updated to call out the gating.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
Operator misconfiguration is a client/setup error, not an internal server
exception. 403 "forbidden" more accurately reflects "this route refuses
to authenticate" than 500 "internal server error" — the latter triggers
incident alerting on operator monitoring and conflates real bugs with
config drift.
Follow-up tweak to PR #29629 by @m0n3r0.
Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.
When asyncio.sleep() fires just before Task.cancel() is called, CPython
sets _must_cancel=True but cannot cancel the already-completed sleep
future, so CancelledError is delivered at the next await (handle_message)
rather than at the sleep. By that point the superseded task has already
popped the merged event from _pending_text_batches, so the superseding
task sees an empty batch and silently drops the message.
Fix: add a synchronous task-registry check between the sleep and the pop.
No await between the check and the pop means no other coroutine can
interleave, so the guard is race-free.
When WeCom returns errcode=40001 (invalid credential) or 42001 (token
expired), send() was returning a failure without evicting the bad token
from _access_tokens. All subsequent sends then kept using the same
invalid cached token until its TTL naturally expired (~7200s).
Fix: on the first token-rejection errcode, evict the cache entry and
retry once with a freshly fetched token. Non-token errcodes fail
immediately as before. If the refreshed token also fails, the error
is returned without looping further.
Adds four regression tests covering: successful retry on 40001,
successful retry on 42001, no retry on unrelated errcode, and clean
failure when the refresh does not help.
AI Card "tool progress" cards created with finalize=False were left in
streaming state on DingTalk's UI after a gateway restart because
disconnect() called _streaming_cards.clear() without first closing
them via _close_streaming_siblings.
Move the finalization loop before self._http_client.aclose() so the
HTTP client is still available when the finalize requests are sent.
Adds a regression test that asserts the HTTP client is alive during
finalization.
ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/
instead of editing 8 core files (gateway/config.py Platform enum,
gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py,
hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py,
tools/send_message_tool.py).
All routing goes through gateway/platform_registry via register_platform():
- adapter_factory, check_fn, validate_config, is_connected
- env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so
gateway status reflects env-only setups without instantiating httpx
- standalone_sender_fn handles deliver=ntfy cron jobs when cron runs
out-of-process from the gateway
- allowed_users_env / allow_all_env hook into _is_user_authorized
- cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing
- platform_hint surfaces in the system prompt
- pii_safe=True (topic names are the only identifier; no PII to redact)
Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader
so the module lives under plugin_adapter_ntfy in sys.modules and cannot
collide with sibling plugin-adapter tests on the same xdist worker. The
core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets,
etc.) are replaced with plugin-shape tests covering register() metadata,
env_enablement_fn output, and standalone_sender_fn behavior.
68 tests pass under scripts/run_tests.sh.
Closes#30045. Based on @qike-ms's PR #30141.
Telegram status callbacks (lifecycle, compression, context-pressure)
used to append a fresh bubble on every emit. Now adapter tracks
{(chat_id, status_key) -> message_id}; first call sends, subsequent
calls edit. Failed edits drop the cache entry and fall through to a
fresh send.
- gateway/platforms/telegram.py: send_or_update_status() (+34 LOC)
- gateway/run.py: route _status_callback_sync through it when the
adapter supports it; plain adapter.send() otherwise (+15 LOC)
- 5 tests covering first send / edit-in-place / edit-failure fallback
/ distinct key & chat isolation
_guess_ext_from_data: data[:5] == b"#!SILK" -> data[:6] (6-byte string)
_looks_like_silk: data[:4] == b"#!SILK" -> data[:6]
The previous slices were too short to ever match the 6-byte "#!SILK"
literal, relying entirely on the "#!SILK_V3" (9-byte) and 0x02! (2-byte)
fallback paths for SILK format detection.
Add original_name parameter to _download_and_cache, preferring the
attachment metadata filename over the CDN URL path basename. Previously
files were cached with meaningless QQ CDN hash names (e.g.
qqdownload_...oadftnv5), causing ugly filenames when sent back to users.
Aligns with qqbot-agent-sdk's AttachmentDownloader.download_document.
1. Handle op 7 (Server Reconnect): close WS to trigger reconnect loop
while preserving session for Resume
2. Handle op 9 (Invalid Session): check d value to determine if session
is resumable; clear session only when not resumable
3. Remove 4009 from session-clearing set (connection timeout is resumable)
4. Expand fatal close codes: 4001/4002/4010-4014 now stop reconnect
immediately instead of retrying uselessly
5. Add unit tests
1. Add INTERACTION intent bit (1<<26) to _send_identify, fixing approval
button clicks not being received (INTERACTION_CREATE events were never
dispatched by the gateway)
2. Include local cached path in video/file attachment descriptions so the
LLM can reference files for re-sending to users
3. Add unit tests (TestIdentifyIntents, TestProcessAttachmentsPathExposure)
First migration of an existing built-in platform adapter to the plugin
system established by IRC / Teams / LINE / Google Chat. Closes#24325;
advances the umbrella refactor in #3823.
Matches Teams' shape exactly — adapter under ``plugins/platforms/discord/``
with the standard ``__init__.py`` / ``adapter.py`` / ``plugin.yaml``
shell, ``register(ctx)`` entry point, **no back-compat shim** at the old
import path, and full parity for the four hooks Teams uses plus the
``apply_yaml_config_fn`` hook that landed in #25443 (the Discord plugin
is the first consumer of that hook):
* ``standalone_sender_fn`` — out-of-process cron delivery via REST API
* ``setup_fn`` — interactive ``hermes setup gateway`` wizard
* ``apply_yaml_config_fn`` — translate ``config.yaml`` ``discord:`` keys
into ``DISCORD_*`` env vars (replaces the hardcoded block in
``gateway/config.py``)
* ``is_connected`` — declares connection state from ``DISCORD_BOT_TOKEN``
* ``check_fn`` — lazy-installs ``discord.py`` on demand
* plus ``allowed_users_env``, ``allow_all_env``, ``cron_deliver_env_var``,
``max_message_length``, ``emoji``, ``required_env``, ``install_hint``
* ``gateway/platforms/discord.py`` (5,101 LOC) →
``plugins/platforms/discord/adapter.py`` (git rename, R090).
* New ``plugins/platforms/discord/{__init__.py, plugin.yaml}`` with
``requires_env`` / ``optional_env`` declarations.
* Append ``register(ctx)`` block + new hook implementations
(``_standalone_send``, ``interactive_setup``, ``_apply_yaml_config``,
``_clean_discord_user_ids``, ``_is_connected``, ``_build_adapter``,
plus helpers ``_DISCORD_CHANNEL_TYPE_PROBE_CACHE`` etc.) to the
adapter.
* Replace the ``Platform.DISCORD elif`` branch in
``GatewayRunner._create_adapter()`` (−9 LOC) with a generic post-creation
hook (+6 LOC) in the registry path: any plugin adapter that declares a
``gateway_runner`` attribute now gets it auto-injected. Webhook's
built-in branch is unchanged (it doesn't go through the registry path).
* Move ``_send_discord`` (190 LOC) and helpers
(``_DISCORD_CHANNEL_TYPE_PROBE_CACHE``, ``_remember_channel_is_forum``,
``_probe_is_forum_cached``, ``_derive_forum_thread_name``) from
``tools/send_message_tool.py`` into the plugin as ``_standalone_send``.
* Wire via ``standalone_sender_fn=_standalone_send`` (Teams pattern; same
gap fixed in #21804 for other plugin platforms).
* Replace the Discord ``elif`` in ``tools/send_message_tool.py``
``_send_to_platform`` with a 10-line registry-hook dispatch.
* Drop the ``DiscordAdapter`` import and the
``Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH`` ``_MAX_LENGTHS``
entry — the registry's ``max_message_length=2000`` covers it.
* Move ``_setup_discord`` and ``_clean_discord_user_ids`` (68 LOC) from
``hermes_cli/setup.py`` into the plugin as ``interactive_setup``.
* Wire via ``setup_fn=interactive_setup``. CLI helpers (``prompt``,
``print_info``, etc.) are lazy-imported so the plugin's module-load
surface stays minimal.
* Remove ``"discord": _s._setup_discord`` from
``hermes_cli/gateway.py::_builtin_setup_fn``.
* Remove the entire 32-line ``_PLATFORMS["discord"]`` static dict entry —
Discord's setup metadata is now discovered dynamically via
``_all_platforms()`` from the registry entry.
* Move the 59-line ``discord_cfg`` YAML→env bridge from
``gateway/config.py::load_gateway_config()`` into the plugin as
``_apply_yaml_config``. Covers ``require_mention``,
``thread_require_mention``, ``free_response_channels``, ``auto_thread``,
``reactions``, ``ignored_channels``, ``allowed_channels``,
``no_thread_channels``, ``allow_mentions.{everyone,roles,users,
replied_user}``, and ``reply_to_mode`` (including the YAML 1.1
``off``-as-False coercion and the ``extra.reply_to_mode`` fallback).
* Wire via ``apply_yaml_config_fn=_apply_yaml_config``.
* The hook runs BEFORE ``_apply_env_overrides`` and after the generic
shared-key loop, exactly as documented in
``website/docs/developer-guide/adding-platform-adapters.md``.
* Behavior is preserved exactly — every assignment still uses
``not os.getenv(...)`` guards so env vars take precedence over YAML.
All 78 references to the old import path are rewritten — no back-compat
shim:
* 51 ``from gateway.platforms.discord import X`` →
``from plugins.platforms.discord.adapter import X``
* 5 ``import gateway.platforms.discord as discord_platform`` →
``import plugins.platforms.discord.adapter as discord_platform``
* 1 ``from gateway.platforms import discord as discord_mod`` →
``from plugins.platforms.discord import adapter as discord_mod``
* 21 ``mock.patch("gateway.platforms.discord.X")`` strings →
``mock.patch("plugins.platforms.discord.adapter.X")``
* 1 docstring reference in ``hermes_cli/commands.py``
* 1 import in ``tools/send_message_tool.py`` (now removed entirely)
The import-safety test in ``tests/gateway/test_discord_imports.py`` is
updated to purge the new canonical module name from ``sys.modules``.
**38 files changed, +621 / −473** — net positive due to the YAML hook
implementation (89 new LOC in the plugin trading for 59 deleted in core),
but every line moved has a clear plugin home now. The git rename is
detected at R090 because the adapter gained ~340 LOC of moved-in hook
implementations (``_standalone_send`` + ``interactive_setup`` +
``_apply_yaml_config`` + helpers).
* All 568 Discord-specific tests pass across 25 ``test_discord_*.py``
files plus voice/send/text-batching/reload-skills/stream-consumer/
integration tests.
* All 147 tests in the YAML-touching subset
(``test_discord_reply_mode``, ``test_discord_free_response``,
``test_discord_allowed_channels``, ``test_discord_allowed_mentions``,
``test_discord_channel_controls``, ``test_discord_reactions``,
``test_discord_thread_persistence``, ``test_runtime_footer``) pass —
this is the strongest signal that the YAML→env hook behaves
identically to the legacy block.
* Broader gateway/cron/integration sweep (1297 tests) introduces zero
new failures vs ``main``. Pre-existing failures in
``tests/gateway/test_tts_media_routing.py`` and
``tests/e2e/test_platform_commands.py`` reproduce identically on the
unchanged ``main`` revision.
* Plugin discovery sanity check confirms Discord registers alongside the
other four platform plugins:
Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams']
These Discord-shaped tendrils in core were **deliberately not moved** —
they are generic platform-registry concerns affecting every platform,
not Discord-specific:
* ``gateway/config.py:1205`` ``DISCORD_BOT_TOKEN → config.token`` env
enablement — same shape Telegram has. The existing
``env_enablement_fn`` registry hook only seeds ``extra``, not
``.token``, so it can't replace this without an adapter refactor to
read from ``extra["bot_token"]``.
* ``gateway/run.py`` voice-mode hooks
(``self.adapters.get(Platform.DISCORD)`` for
``start_voice_mode``/``stop_voice_mode``), role-based auth,
``DISCORD_ALLOW_BOTS`` branch in ``_is_user_authorized``,
``_UPDATE_ALLOWED_PLATFORMS`` frozenset, and the per-platform
allowlist maps — generic platform-registry concerns.
* ``Platform.DISCORD`` enum literal — stable identifier used as dict
keys throughout the codebase; removing it is a separate refactor with
no real benefit.
* ``tools/discord_tool.py`` and ``tools/environments/local.py`` —
first-class agent tools and env-passthrough config, neither is the
gateway adapter.
Each of these is worth its own scoping issue when the time comes.
_handle_location_message and _handle_media_message were skipped when the
observe-unmentioned-group-messages feature landed (a9db0e2c7). Both handlers
now:
1. Check _should_observe_unmentioned_group_message on the skipped path and
call _observe_unmentioned_group_message so group chatter is stored as
shared session context even when the bot is not addressed.
2. Call _apply_telegram_group_observe_attribution on the triggered path so
the dispatched event uses the shared (user_id=None) group session instead
of the per-user session, letting the model see previously observed context.
For stickers the attribution is applied after _handle_sticker completes
(which overwrites event.text with the vision description); for all other
media types it is applied once after caption cleaning.
Four new tests cover the observe and attribution paths for both handlers.
The typing indicator loop (send_typing) ran every 8s and died on any
exception, including Discord 429 rate limits. Once a 429 killed the
loop, the indicator never restarted — and the raw exception bounce
could cascade into broader gateway instability.
Changes:
- Bump sleep interval from 8s to 12s (typing light lasts ~10s)
- On 429: extract retry_after, log a warning, sleep the backoff,
and continue the loop
- On non-rate-limit errors: log debug and return (unchanged
behaviour)
PR #29211 dropped JSONL gateway transcripts and noted that the platform's
own `message_id` field (used by Yuanbao's recall guard to redact a
message by exact platform id) was no longer preserved — falling back to
content-match. That fallback works for the common case but redacts the
wrong row when two messages share text (or fails to match when content
is post-processed).
Restore exact-id matching by giving state.db a column for it:
- New `platform_message_id TEXT` column on the messages table
(SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler
on existing DBs, no version-gated migration block needed)
- Partial index `idx_messages_platform_msg_id` on
(session_id, platform_message_id) to keep recall's point-lookup cheap
even on large sessions
- `append_message()` and `replace_messages()` accept the new value:
the gateway-facing `append_to_transcript` in `gateway/session.py`
forwards either `message["platform_message_id"]` or the legacy
`message["message_id"]` key (yuanbao's existing convention)
- `get_messages_as_conversation()` surfaces the column back on the
message dict as `message_id` so platform code reads the same shape
it used to read from JSONL
- Yuanbao `_patch_transcript`: restore branch A1 (exact id match)
ahead of A2 (content match) ahead of B (system-note). Both branches
log which one fired so operators can tell from gateway.log whether
recall hit the canonical path or had to fall back.
Tests:
- New low-level round-trip tests in `test_hermes_state.py` for both
`append_message` and `replace_messages` paths
- The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert
the new contract: branch A1 (id match) works against DB-only
transcripts, and branch A2 (content match) still recovers rows that
were observed without a platform id (e.g. agent-processed @bot
messages where run.py doesn't carry msg_id through)
PR #29211 review findings:
1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write
to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures.
2. yuanbao recall branch A1 (message_id exact match) was structurally dead
once load_transcript() became DB-only — state.db never preserves the
platform message_id. Removed the dead loop, consolidated to a single
content-match branch (renamed 'A: content match'). Branch B (system
note) unchanged. Updated the test name + docstring to reflect this.
Note: self._lock is no longer taken in append_to_transcript (was guarding
the JSONL file append). SQLite append_message handles its own concurrency
via WAL mode, so this is safe; flagging for awareness.
Yuanbao's recall feature was reading the gateway JSONL directly to look up
messages by platform message_id, which state.db does not preserve. Migrated
to use load_transcript() which returns DB messages.
Recall branch A1 (message_id match) now falls through to A2 (content match)
or B (system note) for all sessions — a documented degradation. Follow-up
issue: add platform_message_id column to state.db messages to restore
exact-id matching.
Sibling fix to PR #28918 (Discord voice notes). DingTalk's rich-text
"voice" item type is its native voice-message format, but the adapter
was routing it to MessageType.AUDIO — which gateway/run.py:7605 skips
for STT. The docs claim every voice-capable platform auto-transcribes,
so this brings DingTalk in line.
Generic audio uploads (mapped to "file" by DINGTALK_TYPE_MAPPING) are
unchanged — they were already classified as DOCUMENT, not AUDIO.
Adds tests/gateway/test_dingtalk.py::TestExtractMedia covering both the
voice path and the audio-passthrough invariant.
When discord.py is not installed at import time, DISCORD_AVAILABLE=False
and the view class definitions at module bottom are skipped.
check_discord_requirements() performs a lazy install and sets
DISCORD_AVAILABLE=True but never re-ran the class definitions, causing
NameError on the first button interaction (exec approval, slash confirm, etc.).
Extract the five ui.View subclasses into _define_discord_view_classes() and
call it both at module load (when discord.py is pre-installed) and inside
check_discord_requirements() after a successful lazy install.
Five small fixes against issues filed during the post-merge salvage audit:
* #28670: `_GATEWAY_PROVIDER_ERROR_RE` false-positives on legitimate prose.
Replace the regex with an anchored `_GATEWAY_PROVIDER_ERROR_SHAPE_RE` and
add a length-cap heuristic to `_looks_like_gateway_provider_error`:
short envelope at the start of the message → real provider error; long
prose containing 'HTTP 404' → assistant answer, leave alone.
* #28672: drop the pointless 1s asyncio.sleep on Telegram thread-not-found
retries. The same-thread retry is preserved (catches Telegram's
occasional transient flake exercised by
test_send_retries_transient_thread_not_found_before_fallback) but with
no artificial delay.
* #28674: broaden `_should_retry_without_dm_topic_reply_anchor` to also
fire when Bot API rejects `direct_messages_topic_id` for synthetic /
resumed sends that have no reply anchor. Avoids dropping post-resume
background notifications if the topic id goes stale.
* #28676: delete the dead image-document branch superseded by bd0c54d17
(which returns early on the same extension set).
* #28678: extend chat-scoped allowlist (`TELEGRAM_GROUP_ALLOWED_CHATS`)
to also cover `chat_type == 'channel'`, so operators can authorize
channel posts by chat id without falling back to per-user allowlists.
Tests:
- scripts/run_tests.sh tests/gateway/test_telegram_thread_fallback.py -q → 41/41
- scripts/run_tests.sh tests/cron/test_scheduler.py -q → 127/127
- broader test set: same 3 pre-existing test-pollution failures reproduce
on plain main.
In multi-agent shared Matrix rooms, multiple bots all participating in the
same thread could trigger infinite reply loops — each bot's reply re-engaged
the others because they were all in the bot-thread set. Discord has a
`thread_require_mention` opt-in for this; Matrix didn't.
Add `_parse_thread_require_mention(config)` (mirrors Discord's pattern).
In `_resolve_message_context`, when enabled and the message is in a
bot-participated thread (not a free-response room), require @mention
before processing.
Salvage of @justemu's 2-commit stack (#27996). Fixes#27995.
Add a configurable mention filter to the Signal adapter so the bot
only responds in groups when it is explicitly @mentioned.
Changes:
- gateway/platforms/signal.py: read require_mention from adapter
extra config or SIGNAL_REQUIRE_MENTION env var; skip group messages
that don't mention the bot account (checked in rendered text and
raw mention metadata)
- gateway/config.py: map signal.require_mention YAML key to the
SIGNAL_REQUIRE_MENTION env var (env var takes precedence)
Config example:
signal:
require_mention: true
Or via env var:
SIGNAL_REQUIRE_MENTION=true
Two coordinated changes that unblock downstream audio pipelines
(diarization, custom transcription, archival) on attachments larger
than the public Bot API's 20MB getFile ceiling.
- `stt.enabled: false` no longer drops voice/audio with a generic
"transcription disabled" note. The gateway probes the cached file's
duration (wave → mutagen → ffprobe ladder) and surfaces
`[The user sent a voice message: <abs path> (duration: M:SS)]` to
the agent so a skill or tool can pick up the raw file. The previous
placeholder is replaced rather than appended when present.
- `platforms.telegram.extra.base_url` set → adapter auto-lifts its
document size cap from 20MB to 2GB (the local telegram-bot-api
`--local` ceiling) and the "too large" reply reports the active
limit dynamically. No new config knob; presence of `base_url` is the
opt-in.
- `platforms.telegram.extra.local_mode: true` wires
`Application.builder().local_mode(True)` on the python-telegram-bot
builder. PTB then reads files from disk instead of HTTP, which is
required when telegram-bot-api runs in `--local` mode (the server
returns absolute filesystem paths, not `/file/bot...` URLs).
- gateway/run.py: rewrites the `stt.enabled: false` branch of
`_enrich_message_with_transcription`. New `_format_duration` +
`_probe_audio_duration` helpers.
- gateway/platforms/telegram.py: `_max_doc_bytes` instance attribute
derived from `extra.base_url`; `local_mode` builder wiring;
dynamic "too large" message.
- tests/gateway/test_stt_config.py: covers path-surfacing with and
without an existing user message, and placeholder replacement.
- tests/gateway/test_telegram_max_doc_bytes.py: 3 cases — default 20MB
without base_url, 2GB when set, empty-string base_url keeps default.
- website/docs/user-guide/messaging/telegram.md: new "Skipping STT"
subsection under Voice Messages and a full "Large Files (>20MB) via
Local Bot API Server" walkthrough (api_id/api_hash, docker-compose,
one-time `logOut` migration, `platforms.telegram.extra` config, the
`local_mode` disk-access requirement, the silent HTTP-fallback 404).
- website/docs/user-guide/features/voice-mode.md: documents the
`stt.enabled` knob in the config reference.
- `pytest tests/gateway/test_telegram_max_doc_bytes.py
tests/gateway/test_stt_config.py` → 9/9 passing.
- Verified end-to-end on a live deployment: gateway log shows
`Using custom Telegram base_url: http://...` and
`Using Telegram local_mode (read files from disk)` on startup;
voice messages above 20MB cache to disk and surface their path to
the agent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a user sends a message on Telegram, the incoming message is now
automatically pinned at the start of processing and unpinned when the
agent finishes its turn. This gives the user a visual indicator that
their message is being worked on, and keeps the conversation anchored.
Changes:
- telegram.py: Added pinChatMessage in on_processing_start and
unpinChatMessage in on_processing_complete. Restructured both
hooks so pin/unpin runs independently of the reactions feature
(reactions are optional; pinning is always on).
- telegram.py: Pass message_id through SessionSource so it's
available in the session context.
- session_context.py: Added HERMES_SESSION_MESSAGE_ID context var.
- run.py: Pass source.message_id through set_session_vars.
Pinning is silent (disable_notification=True) and failures are
logged at debug level without interrupting message processing.
Only the user's incoming message is pinned -- never the agent's
replies. Auto-resume events (which have no message_id) are
correctly skipped.