Five small fixes against issues filed during the post-merge salvage audit:
* #28670: `_GATEWAY_PROVIDER_ERROR_RE` false-positives on legitimate prose.
Replace the regex with an anchored `_GATEWAY_PROVIDER_ERROR_SHAPE_RE` and
add a length-cap heuristic to `_looks_like_gateway_provider_error`:
short envelope at the start of the message → real provider error; long
prose containing 'HTTP 404' → assistant answer, leave alone.
* #28672: drop the pointless 1s asyncio.sleep on Telegram thread-not-found
retries. The same-thread retry is preserved (catches Telegram's
occasional transient flake exercised by
test_send_retries_transient_thread_not_found_before_fallback) but with
no artificial delay.
* #28674: broaden `_should_retry_without_dm_topic_reply_anchor` to also
fire when Bot API rejects `direct_messages_topic_id` for synthetic /
resumed sends that have no reply anchor. Avoids dropping post-resume
background notifications if the topic id goes stale.
* #28676: delete the dead image-document branch superseded by bd0c54d17
(which returns early on the same extension set).
* #28678: extend chat-scoped allowlist (`TELEGRAM_GROUP_ALLOWED_CHATS`)
to also cover `chat_type == 'channel'`, so operators can authorize
channel posts by chat id without falling back to per-user allowlists.
Tests:
- scripts/run_tests.sh tests/gateway/test_telegram_thread_fallback.py -q → 41/41
- scripts/run_tests.sh tests/cron/test_scheduler.py -q → 127/127
- broader test set: same 3 pre-existing test-pollution failures reproduce
on plain main.
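A minimal sketch of the #28670 heuristic, assuming a hypothetical pattern and length cap (the real `_GATEWAY_PROVIDER_ERROR_SHAPE_RE` and its threshold are not shown in this message): the match is anchored at the start of the text and only short messages qualify.

```python
import re

# Hypothetical stand-ins: the actual regex and cap live in the gateway code.
_SHAPE_RE = re.compile(r"^\s*(?:Error:\s*)?HTTP\s+[45]\d{2}\b", re.IGNORECASE)
_MAX_ENVELOPE_CHARS = 200


def looks_like_provider_error(text: str) -> bool:
    """Return True only for a short message that starts with an error envelope."""
    if len(text) > _MAX_ENVELOPE_CHARS:
        # Long prose is treated as an assistant answer, even if it cites a status code.
        return False
    return _SHAPE_RE.match(text) is not None
```

Anchoring handles prose that mentions a status code mid-sentence; the cap handles long answers that happen to open with one.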
1. trajectory_compressor.py: yaml.safe_load() returns None on empty
files, crashing with TypeError on `if 'tokenizer' in data`. Fix by
adding `or {}` fallback. (HIGH — blocks startup with empty config)
2. 6 files with fcntl.flock(LOCK_UN) in finally blocks without
try/except: cron/scheduler.py, hermes_cli/auth.py,
agent/shell_hooks.py, tools/skill_usage.py,
tools/environments/file_sync.py, tools/memory_tool.py. If unlock
raises OSError, fd.close() is skipped and the lock is held forever.
The msvcrt branches already had try/except; the fcntl branches did
not. Fix by wrapping in try/except (OSError, IOError): pass.
3. agent/copilot_acp_client.py line 639: TOCTOU race — path.exists()
followed by path.read_text() with no try/except. If file is deleted
between the check and the read, FileNotFoundError propagates. Fix
by using try/except FileNotFoundError.
4. gateway/sticker_cache.py: non-atomic write via Path.write_text()
can leave truncated JSON on crash, causing JSONDecodeError on next
load. Fix by writing to tempfile + fsync + os.replace (atomic).
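Fixes 3 and 4 follow two common patterns, sketched below under assumptions: the helper names `atomic_write_json` and `read_text_or_none` are hypothetical and not taken from the patched files.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


def atomic_write_json(path: Path, data: Any) -> None:
    """Write JSON via a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def read_text_or_none(path: Path) -> Optional[str]:
    """Read without a prior exists() check, so there is no check-then-read window."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
```

The temp file is created next to the target so that `os.replace` stays on one filesystem.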
In multi-agent shared Matrix rooms, multiple bots all participating in the
same thread could trigger infinite reply loops — each bot's reply re-engaged
the others because they were all in the bot-thread set. Discord has a
`thread_require_mention` opt-in for this; Matrix didn't.
Add `_parse_thread_require_mention(config)` (mirrors Discord's pattern).
In `_resolve_message_context`, when enabled and the message is in a
bot-participated thread (not a free-response room), require @mention
before processing.
Salvage of @justemu's 2-commit stack (#27996). Fixes #27995.
Pre-mark all running agent sessions as resume_pending BEFORE the drain
wait begins. If the service manager kills the process during the drain
window, the durable marker is already written so the next gateway boot
can recover in-flight sessions. On graceful drain completion, clear the
early markers for sessions that finished successfully.
Add a configurable mention filter to the Signal adapter so the bot
only responds in groups when it is explicitly @mentioned.
Changes:
- gateway/platforms/signal.py: read require_mention from adapter
extra config or SIGNAL_REQUIRE_MENTION env var; skip group messages
that don't mention the bot account (checked in rendered text and
raw mention metadata)
- gateway/config.py: map signal.require_mention YAML key to the
SIGNAL_REQUIRE_MENTION env var (env var takes precedence)
Config example:
signal:
require_mention: true
Or via env var:
SIGNAL_REQUIRE_MENTION=true
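A rough sketch of the precedence and the group filter, assuming hypothetical helper names and a simple truthy-string set (the adapter's actual parsing may differ):

```python
import os
from typing import Mapping, Optional, Sequence

_TRUTHY = {"1", "true", "yes", "on"}


def resolve_require_mention(
    extra: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Env var wins over the YAML-derived adapter config."""
    env = os.environ if env is None else env
    raw: object = env.get("SIGNAL_REQUIRE_MENTION")
    if raw is None:
        raw = extra.get("require_mention", False)
    if isinstance(raw, bool):
        return raw
    return str(raw).strip().lower() in _TRUTHY


def should_skip_group_message(
    require_mention: bool,
    is_group: bool,
    text: str,
    mentions: Sequence[str],
    bot_account: str,
) -> bool:
    """Skip group messages that mention the bot in neither text nor metadata."""
    if not (require_mention and is_group):
        return False
    return not (bot_account in text or bot_account in mentions)
```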
Two coordinated changes that unblock downstream audio pipelines
(diarization, custom transcription, archival) on attachments larger
than the public Bot API's 20MB getFile ceiling.
- `stt.enabled: false` no longer drops voice/audio with a generic
"transcription disabled" note. The gateway probes the cached file's
duration (wave → mutagen → ffprobe ladder) and surfaces
`[The user sent a voice message: <abs path> (duration: M:SS)]` to
the agent so a skill or tool can pick up the raw file. The previous
placeholder is replaced rather than appended when present.
- `platforms.telegram.extra.base_url` set → adapter auto-lifts its
document size cap from 20MB to 2GB (the local telegram-bot-api
`--local` ceiling) and the "too large" reply reports the active
limit dynamically. No new config knob; presence of `base_url` is the
opt-in.
- `platforms.telegram.extra.local_mode: true` wires
`Application.builder().local_mode(True)` on the python-telegram-bot
builder. PTB then reads files from disk instead of HTTP, which is
required when telegram-bot-api runs in `--local` mode (the server
returns absolute filesystem paths, not `/file/bot...` URLs).
- gateway/run.py: rewrites the `stt.enabled: false` branch of
`_enrich_message_with_transcription`. New `_format_duration` +
`_probe_audio_duration` helpers.
- gateway/platforms/telegram.py: `_max_doc_bytes` instance attribute
derived from `extra.base_url`; `local_mode` builder wiring;
dynamic "too large" message.
- tests/gateway/test_stt_config.py: covers path-surfacing with and
without an existing user message, and placeholder replacement.
- tests/gateway/test_telegram_max_doc_bytes.py: 3 cases — default 20MB
without base_url, 2GB when set, empty-string base_url keeps default.
- website/docs/user-guide/messaging/telegram.md: new "Skipping STT"
subsection under Voice Messages and a full "Large Files (>20MB) via
Local Bot API Server" walkthrough (api_id/api_hash, docker-compose,
one-time `logOut` migration, `platforms.telegram.extra` config, the
`local_mode` disk-access requirement, the silent HTTP-fallback 404).
- website/docs/user-guide/features/voice-mode.md: documents the
`stt.enabled` knob in the config reference.
- `pytest tests/gateway/test_telegram_max_doc_bytes.py
tests/gateway/test_stt_config.py` → 9/9 passing.
- Verified end-to-end on a live deployment: gateway log shows
`Using custom Telegram base_url: http://...` and
`Using Telegram local_mode (read files from disk)` on startup;
voice messages above 20MB cache to disk and surface their path to
the agent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
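For illustration, the `M:SS` formatting and the `base_url`-derived size cap could look like the sketch below. The function names and the binary interpretation of "20MB" / "2GB" are assumptions, not copied from the patch.

```python
from typing import Mapping

# Assumed byte values; the patch may define the limits differently.
DEFAULT_MAX_DOC_BYTES = 20 * 1024 * 1024
LOCAL_MAX_DOC_BYTES = 2 * 1024 * 1024 * 1024


def format_duration(seconds: float) -> str:
    """Render a duration as M:SS, e.g. 125 seconds becomes '2:05'."""
    total = max(0, int(round(seconds)))
    minutes, secs = divmod(total, 60)
    return f"{minutes}:{secs:02d}"


def max_doc_bytes(extra: Mapping[str, object]) -> int:
    """A non-empty base_url opts in to the larger local-server cap."""
    base_url = str(extra.get("base_url") or "").strip()
    return LOCAL_MAX_DOC_BYTES if base_url else DEFAULT_MAX_DOC_BYTES
```

An empty-string `base_url` keeps the default, matching the third test case listed above.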
When a user sends a message on Telegram, the incoming message is now
automatically pinned at the start of processing and unpinned when the
agent finishes its turn. This gives the user a visual indicator that
their message is being worked on, and keeps the conversation anchored.
Changes:
- telegram.py: Added pinChatMessage in on_processing_start and
unpinChatMessage in on_processing_complete. Restructured both
hooks so pin/unpin runs independently of the reactions feature
(reactions are optional; pinning is always on).
- telegram.py: Pass message_id through SessionSource so it's
available in the session context.
- session_context.py: Added HERMES_SESSION_MESSAGE_ID context var.
- run.py: Pass source.message_id through set_session_vars.
Pinning is silent (disable_notification=True) and failures are
logged at debug level without interrupting message processing.
Only the user's incoming message is pinned -- never the agent's
replies. Auto-resume events (which have no message_id) are
correctly skipped.
The gmail-triage skill's Telegram inline buttons emit callback_data of the
form `gt:<verb>:<arg>`, but `_handle_callback_query` had no `gt:` branch —
taps fell through silently and the spinner sat there until Telegram timed it
out.
Add `_handle_gmail_triage_callback`, dispatched from the existing callback
router, that:
- Authorizes the caller via the same `_is_callback_user_authorized` path as
the approval / slash-confirm / clarify handlers.
- Maps each verb to a script under `~/.hermes/scripts/gmail-triage/` and runs
it async with a 60s timeout.
- Splits verbs into one-shots (send / archive / draft / spam) — append the
confirmation and strip the keyboard so the action can't fire twice — and
sticky-state changes (mute / trust / vip ± -domain) — append the
confirmation but leave the keyboard tappable so the user can stack actions
on one email.
- On failure: toast only, keyboard preserved so the user can retry.
- Logs every callback outcome to gateway.log for debugging.
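The verb split described above can be sketched as a small parser. `TriageAction` and `parse_gt_callback` are hypothetical names, and the exact spelling of the `-domain` verbs is an assumption read from the "± -domain" shorthand.

```python
from typing import NamedTuple, Optional

ONE_SHOT_VERBS = {"send", "archive", "draft", "spam"}
STICKY_VERBS = {"mute", "trust", "vip", "mute-domain", "trust-domain", "vip-domain"}


class TriageAction(NamedTuple):
    verb: str
    arg: str
    strip_keyboard: bool  # True for one-shots so the action cannot fire twice


def parse_gt_callback(data: str) -> Optional[TriageAction]:
    """Parse 'gt:<verb>:<arg>'; return None for anything that is not a known gt: verb."""
    prefix, sep, rest = data.partition(":")
    if prefix != "gt" or not sep:
        return None
    verb, _, arg = rest.partition(":")
    if verb in ONE_SHOT_VERBS:
        return TriageAction(verb, arg, True)
    if verb in STICKY_VERBS:
        return TriageAction(verb, arg, False)
    return None
```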
When a DM topic lane's message_thread_id is rejected by Telegram
(e.g. stale or deleted topic), send_typing now falls back to sending
the typing indicator without thread_id so it at least appears in the
main DM view, rather than being silently swallowed.
Also adds a test for the fallback behavior.
When context compression triggers a mid-turn session split, source.thread_id
can be None on synthetic/recovered events. _thread_metadata_for_source then
returns None, causing the Telegram adapter to send with no message_thread_id
and the response lands in the General thread instead of the active DM topic.
Fix:
- hermes_state.py: Add get_telegram_topic_binding_by_session() for reverse
lookup by session_id (enabled by the existing UNIQUE INDEX on session_id).
- gateway/run.py: After session-split detection, if source is a Telegram DM
and source.thread_id is None, recover it from the binding via the new
method so _thread_metadata_for_source produces the correct thread routing.
- tests/: Coverage for the new lookup method and the recovery flow.
When Hermes auto-titles a session in a Telegram DM topic it currently
renames the topic itself to the generated title. That works for
operator-managed lanes (extra.dm_topics) but is disruptive for
ad-hoc Threaded-Mode topics that users name by hand — every first
exchange overwrites their chosen title.
Add gateway.platforms.telegram.extra.disable_topic_auto_rename (default
False, preserving prior behaviour). When set, both
_schedule_telegram_topic_title_rename and the underlying
_rename_telegram_topic_for_session_title short-circuit before touching
the Telegram API. Internal session titles (sessions list, TUI) keep
working unchanged.
Also bridge the legacy top-level telegram.disable_topic_auto_rename key
through to gateway.platforms.telegram.extra so users on the older
config layout don't have to migrate to enable it.
- Tests cover the runtime flag, the scheduling entry-point, and string
truthiness coercion for YAML-loaded values.
- Docs updated in messaging/telegram.md with an example block.
When users send images as documents (Telegram file picker), they were
rejected with "Unsupported document type" because SUPPORTED_DOCUMENT_TYPES
only includes text/office formats. Add SUPPORTED_IMAGE_DOCUMENT_TYPES
to base.py and handle them in telegram.py before the document check.
- Add SUPPORTED_IMAGE_DOCUMENT_TYPES constant to base.py
- Add MIME reverse-lookup for image types in telegram.py
- Route image documents through cache_image_from_bytes + vision pipeline
- Handle media groups for image documents
Closes: #20128, #18620
Register Telegram bot commands across default, private, and group scopes so
the slash-command menu is available outside DMs.
Changes from review feedback:
- Add asyncio.Lock to prevent race condition in _ensure_forum_commands
- Extract MAX_COMMANDS_PER_SCOPE constant (30) to avoid magic number
- Upgrade error logging from debug->warning in forum registration
- Add tests covering lazy forum registration and concurrent safety
- Remove /start handler from this PR (separate feature)
Fixes review: needs_work (race, magic number, log levels, missing tests)
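A minimal sketch of the lock-guarded lazy registration, assuming a hypothetical `ForumCommandRegistrar` wrapper and an injected `register` coroutine. Truncating to `MAX_COMMANDS_PER_SCOPE` is also an assumption about how the constant is used.

```python
import asyncio
from typing import Awaitable, Callable, List, Set

MAX_COMMANDS_PER_SCOPE = 30

RegisterFn = Callable[[int, List[str]], Awaitable[None]]


class ForumCommandRegistrar:
    def __init__(self, register: RegisterFn) -> None:
        self._register = register
        self._lock = asyncio.Lock()
        self._registered: Set[int] = set()

    async def ensure_forum_commands(self, chat_id: int, commands: List[str]) -> None:
        if chat_id in self._registered:
            return  # fast path, no lock needed
        async with self._lock:
            if chat_id in self._registered:
                return  # another task registered while we waited for the lock
            await self._register(chat_id, commands[:MAX_COMMANDS_PER_SCOPE])
            self._registered.add(chat_id)
```

The second membership check inside the lock is what keeps concurrent callers from registering the same chat twice.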
Topic-mode DM replies were fragmenting one conversation across many
sessions: a Reply on a message in another topic delivered Telegram's
message_thread_id for *that* topic, and #3206's strip routed plain
replies to the lobby. Both pulled the user away from their current
session.
Fix: when topic mode is on, rewrite source.thread_id to the user's
most-recent binding if the inbound id is missing/General or not a known
topic. Non-topic-mode users unchanged.
send_slash_confirm() sent the raw command preview with ParseMode.MARKDOWN,
skipping the format_message() conversion applied to every other dynamic
send in the adapter. Commands with underscores, dots, brackets, or other
MarkdownV2-sensitive characters raised BadRequest: Can't parse entities;
the exception was swallowed by the outer try/except, so the confirmation
prompt silently never appeared.
Fix: wrap preview through format_message() and switch to MARKDOWN_V2,
symmetric with send_update_prompt and the callback sends fixed in
a69404052.
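To illustrate why the raw preview failed to parse, here is a plain MarkdownV2 escaper. It is a stand-in, not the adapter's `format_message()`, which presumably does a fuller conversion; the function name is hypothetical.

```python
import re

# Characters Telegram's MarkdownV2 requires to be backslash-escaped in plain text.
_MDV2_SPECIAL = re.compile(r"([_*\[\]()~`>#+\-=|{}.!\\])")


def escape_markdown_v2(text: str) -> str:
    """Prefix every MarkdownV2-sensitive character with a backslash."""
    return _MDV2_SPECIAL.sub(r"\\\1", text)
```

A command preview such as `run_tests.sh [x]` contains three kinds of sensitive characters, each of which needs escaping before it is sent with MarkdownV2 parsing.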
In Telegram "important" notifications mode (default), TelegramPlatformAdapter
sets ``disable_notification=True`` on every send unless metadata carries
``notify=True``. GatewayRunner._send_voice_reply already passes thread
metadata through to ``adapter.send_voice``, but never marks the final
auto-TTS voice reply as notify-worthy — so users with the default mode get
the final voice note delivered silently with no push notification.
Mirror the final-text path in gateway/platforms/base.py (the existing
text-response final send already adds ``metadata["notify"] = True``).
Issue #27970 Bug 2. Bug 1 (MP3 vs. native OGG voice-note) is being
addressed by existing PRs #20182 / #20878 — this PR is intentionally
scoped to the silent-delivery bug only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The text /approve and /deny paths in gateway/run.py call
resume_typing_for_chat() after resolve_gateway_approval() succeeds, but
the Telegram inline-button (ea:*) callback in _handle_callback_query did
not. Typing is paused when the approval is sent (gateway/run.py:15658),
so without a matching resume the typing indicator stayed gone for the
remainder of a long-running turn after a button click.
Symmetry-match the text path: after a successful resolve, call
self.resume_typing_for_chat(str(query_chat_id)). Guarded by count > 0
to match /approve's "if not count" early-return — if nothing was
actually resolved, the agent thread was never unblocked, so typing
should remain paused.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a sticky fallback IP (from DoH discovery) becomes unreachable,
the transport previously got stuck in an attempt_order that only
tried the dead IP. This prevented the gateway from recovering
until the service was restarted.
Changes:
- Always include primary DNS path (None) after the sticky IP in the
attempt_order so that a primary-path retry happens on sticky failure.
- Reset self._sticky_ip to None when the currently sticky IP hits
a connect timeout / connect error, allowing the next request to
retry from scratch.
Fixes silent Telegram disconnection when discovered fallback IPs
are transiently or permanently unreachable.
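The two changes can be sketched with a small state holder. `StickyState` and its method names are hypothetical; only the ordering (sticky IP first, then the primary DNS path as `None`) and the reset-on-connect-failure rule come from the description above.

```python
from typing import List, Optional


class StickyState:
    def __init__(self) -> None:
        self.sticky_ip: Optional[str] = None

    def attempt_order(self) -> List[Optional[str]]:
        """Sticky IP first (if any), always followed by the primary DNS path (None)."""
        order: List[Optional[str]] = []
        if self.sticky_ip is not None:
            order.append(self.sticky_ip)
        order.append(None)
        return order

    def record_connect_failure(self, ip: Optional[str]) -> None:
        """Forget the sticky IP when it is the one that failed to connect."""
        if ip is not None and ip == self.sticky_ip:
            self.sticky_ip = None
```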
The _is_callback_user_authorized fallback returned True when
TELEGRAM_ALLOWED_USERS was not set, allowing any Telegram user
to interact with the bot. Change to fail-closed: deny by default
unless GATEWAY_ALLOW_ALL_USERS=true is explicitly set.
Fixes #24457
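A fail-closed check along these lines, assuming a comma-separated allowlist format and a standalone function (the real `_is_callback_user_authorized` is a method whose signature is not shown here):

```python
from typing import Mapping


def is_callback_user_authorized(user_id: str, env: Mapping[str, str]) -> bool:
    """Deny by default; allow only listed users or an explicit allow-all opt-in."""
    allowed_raw = env.get("TELEGRAM_ALLOWED_USERS", "").strip()
    if allowed_raw:
        allowed = {part.strip() for part in allowed_raw.split(",") if part.strip()}
        return user_id in allowed
    # No allowlist configured: previously this returned True.
    return env.get("GATEWAY_ALLOW_ALL_USERS", "").strip().lower() == "true"
```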
TELEGRAM_ALLOWED_USERS was only checked for callback/inline-button
actions but not for inbound messages. Unauthorized users triggered an
'Unauthorized user' log warning but their messages were still processed
by the agent — a P0 security bypass (issue #23778).
Fix: add allowlist check in _should_process_message() which is called
for all message types (text, command, media, location). If the sender
is not in TELEGRAM_ALLOWED_USERS, the message is dropped immediately
with a warning log. Empty TELEGRAM_ALLOWED_USERS continues to allow
all users (existing behavior).
Fixes #23778
Background-process completion notifications (notify_on_complete) and
watch-pattern notifications were always delivered to the Telegram main
chat instead of the originating private-chat topic.
Hermes-created Telegram DM topic lanes only render a send when it carries
both message_thread_id and a reply anchor. The synthetic MessageEvent
injected on process completion had no message_id, so _reply_anchor_for_event
returned None and _thread_kwargs_for_send dropped message_thread_id
entirely — routing the notification to the main chat.
Capture the triggering message id at spawn time and thread it through to
the synthetic event so it can be reply-anchored back into the topic:
- session_context: add HERMES_SESSION_MESSAGE_ID context var
- telegram adapter: populate SessionSource.message_id on inbound messages
- terminal tool: persist watcher_message_id on the process session
- process registry: carry/persist message_id on watcher dicts + checkpoint
- gateway: set MessageEvent.message_id on injected notifications
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When edit_message_text fails with a transient error (httpx.ConnectError,
NetworkError, server disconnected, timeouts), the progress-message sender
must not permanently set can_edit = False — that would convert a single
Telegram network hiccup into separate per-tool bubbles for the rest of the run.
Changes:
- gateway/platforms/telegram.py: edit_message now returns retryable=True for
transient network errors (ConnectError, NetworkError, timeouts, server
disconnects, temporarily unavailable). Permanent failures (flood control,
message-not-found, permissions) remain retryable=False.
- gateway/run.py: send_progress_messages checks result.retryable before
setting can_edit = False. Transient failures skip the fallback-send and
continue — the next edit cycle catches up with the accumulated lines.
Permanent failures (flood, message-not-found, etc.) still disable editing.
Tests: 22 new tests in test_telegram_progress_edit_transient.py covering
transient vs permanent error classification, SendResult.retryable semantics,
and the can_edit decision logic.
Fixes #27828
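The classification and the `can_edit` decision can be sketched as follows. Matching on exception text is an assumption made for a self-contained example, and the `SendResult` fields other than `retryable` are hypothetical.

```python
from dataclasses import dataclass

# Assumed markers; the adapter may inspect exception types instead of text.
_TRANSIENT_MARKERS = (
    "connecterror",
    "networkerror",
    "timed out",
    "timeout",
    "server disconnected",
    "temporarily unavailable",
)


@dataclass
class SendResult:
    success: bool
    error: str = ""
    retryable: bool = False


def classify_edit_error(exc: Exception) -> SendResult:
    """Mark network-level hiccups retryable; everything else is permanent."""
    text = f"{type(exc).__name__}: {exc}".lower()
    retryable = any(marker in text for marker in _TRANSIENT_MARKERS)
    return SendResult(success=False, error=str(exc), retryable=retryable)


def next_can_edit(can_edit: bool, result: SendResult) -> bool:
    """Only a permanent failure disables further edits."""
    if result.success or result.retryable:
        return can_edit
    return False
```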
When a progress-message edit hits Telegram flood control (RetryAfter),
can_edit was unconditionally set to False, permanently disabling coalescing
for the rest of the run. Subsequent tool updates were posted as separate
new messages instead of updating the existing progress bubble.
Fix: only set can_edit=False for non-recoverable edit errors. On flood
control, back off by resetting _last_edit_ts so the throttle interval is
respected before the next edit attempt.
Fixes #25188
The audio-file-paths handling block at line 7334 references
audio_file_paths unconditionally, but #24879 initialized it inside the
'if event.media_urls' block, so events without media_urls hit
UnboundLocalError.
Found via test_run_agent_queued_message_does_not_treat_commentary_as_final
after PR #28478 landed.
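A reduced reproduction of this failure mode, using hypothetical function and variable names rather than the gateway code:

```python
from typing import List


def collect_buggy(media_urls: List[str]) -> List[str]:
    if media_urls:
        paths = [u for u in media_urls if u.endswith(".mp3")]
    # 'paths' is a local that was never bound when media_urls is empty.
    return list(paths)


def collect_fixed(media_urls: List[str]) -> List[str]:
    paths: List[str] = []  # initialize before the conditional block
    if media_urls:
        paths = [u for u in media_urls if u.endswith(".mp3")]
    return list(paths)
```

Calling `collect_buggy([])` raises `UnboundLocalError`, while `collect_fixed([])` returns an empty list.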
Telegram distinguishes three kinds of audio payloads:
- message.voice → Opus/OGG voice messages → STT pipeline ✓
- message.audio → audio file attachments → bypasses STT ← was broken
- message.document (audio mime) → generic file route
**Root cause** — the inbound message routing block in gateway/run.py
matched both MessageType.VOICE *and* MessageType.AUDIO into audio_paths,
which were then fed unconditionally to _enrich_message_with_transcription.
Audio file attachments (.mp3, .m4a, etc.) were therefore auto-transcribed
instead of being treated as files, making the transcribe skill unusable
from Telegram because the path it needed was never surfaced.
**Fix**
- Introduce a new audio_file_paths list populated exclusively by
MessageType.AUDIO events.
- Narrow the audio_paths selector to MessageType.VOICE (and bare
audio/ mime-type events that are not explicitly AUDIO or DOCUMENT).
- After the STT block, inject a document-style context note for each
audio_file_path, giving the agent the file path and asking what to do
with it (consistent with how plain documents are handled).
**Tests** — 5 new tests in test_telegram_audio_vs_voice.py:
- voice message still transcribed (regression guard)
- audio attachment skips STT (core fix)
- audio attachment context note format
- STT disabled still produces file note (not STT-disabled notice)
- MessageType.AUDIO != MessageType.VOICE sanity check
Fixes #24870
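The narrowed selector can be sketched as below. The `MessageType` enum here is a local stand-in and `split_audio` is a hypothetical helper; only the routing rule follows the description above.

```python
from enum import Enum
from typing import List, Sequence, Tuple


class MessageType(Enum):
    TEXT = "text"
    VOICE = "voice"
    AUDIO = "audio"
    DOCUMENT = "document"


def split_audio(
    message_type: MessageType,
    media: Sequence[Tuple[str, str]],  # (path, mime type) pairs
) -> Tuple[List[str], List[str]]:
    """Return (audio_paths for STT, audio_file_paths surfaced as files)."""
    audio_paths: List[str] = []
    audio_file_paths: List[str] = []
    for path, mime in media:
        if message_type is MessageType.AUDIO:
            audio_file_paths.append(path)  # attachment: never auto-transcribed
        elif message_type is MessageType.VOICE or (
            mime.startswith("audio/") and message_type is not MessageType.DOCUMENT
        ):
            audio_paths.append(path)  # voice note or bare audio/ event
    return audio_paths, audio_file_paths
```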
The DM topic reply fallback code in send() hardcoded should_thread=True
when telegram_dm_topic_reply_fallback metadata was present, bypassing
_should_thread_reply() and ignoring reply_to_mode config. This caused
quote bubbles on every response even with reply_to_mode: 'off'.
Fix:
- Add reply_to_mode param to _reply_to_message_id_for_send() and
_thread_kwargs_for_send() classmethods
- In send(), check self._reply_to_mode != 'off' for DM topic fallback
- Suppress reply anchor and reply_to_message_id when mode is 'off'
while preserving message_thread_id for correct topic routing
- Thread reply_to_mode through all 29 call sites
Regression coverage: 10 new tests in test_telegram_reply_mode.py
covering classmethod behavior, send() integration, and backward
compatibility.
Fixes #23994 (reply_to_mode: 'off' ignored by the Telegram DM topic
reply fallback code)
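A simplified view of the intended behaviour, written as a free function with an assumed signature (the real `_thread_kwargs_for_send` is a classmethod with more inputs): the reply anchor is suppressed when the mode is 'off', while the thread id is kept for routing.

```python
from typing import Dict, Optional


def thread_kwargs_for_send(
    thread_id: Optional[int],
    reply_anchor: Optional[int],
    reply_to_mode: str,
) -> Dict[str, int]:
    kwargs: Dict[str, int] = {}
    if thread_id is not None:
        kwargs["message_thread_id"] = thread_id  # always kept for topic routing
    if reply_anchor is not None and reply_to_mode != "off":
        kwargs["reply_to_message_id"] = reply_anchor  # produces the quote bubble
    return kwargs
```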
When Telegram clarify prompts offer long choices, mobile clients
truncate the inline button labels, making options unreadable.
Previously only the question was shown in the message body with
truncated choice text in button labels.
Fix: append the full numbered option list to the message body
so users can read complete choice text on any client. Buttons
now use short numeric labels (1, 2, ...) to avoid Telegram
truncation. The 'Other (type answer)' button is unchanged.
Long choice labels are now rendered in full (not truncated to
57 chars + '...') since they appear in the body instead of
button labels.
Closes: #27497
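The new layout can be sketched as a pure function that returns the message body and the button labels. `build_clarify_prompt` is a hypothetical name, and the blank line between question and options is an assumption about formatting.

```python
from typing import List, Sequence, Tuple

OTHER_LABEL = "Other (type answer)"


def build_clarify_prompt(question: str, choices: Sequence[str]) -> Tuple[str, List[str]]:
    """Full choice text goes in the body; buttons carry only short numeric labels."""
    lines = [question, ""]
    lines += [f"{i}. {choice}" for i, choice in enumerate(choices, start=1)]
    labels = [str(i) for i in range(1, len(choices) + 1)] + [OTHER_LABEL]
    return "\n".join(lines), labels
```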