Closes#30045. Based on @qike-ms's PR #30141.
Telegram status callbacks (lifecycle, compression, context-pressure)
used to append a fresh bubble on every emit. Now adapter tracks
{(chat_id, status_key) -> message_id}; first call sends, subsequent
calls edit. Failed edits drop the cache entry and fall through to a
fresh send.
- gateway/platforms/telegram.py: send_or_update_status() (+34 LOC)
- gateway/run.py: route _status_callback_sync through it when the
adapter supports it; plain adapter.send() otherwise (+15 LOC)
- 5 tests covering first send / edit-in-place / edit-failure fallback
/ distinct key & chat isolation
_handle_location_message and _handle_media_message were skipped when the
observe-unmentioned-group-messages feature landed (a9db0e2c7). Both handlers
now:
1. Check _should_observe_unmentioned_group_message on the skipped path and
call _observe_unmentioned_group_message so group chatter is stored as
shared session context even when the bot is not addressed.
2. Call _apply_telegram_group_observe_attribution on the triggered path so
the dispatched event uses the shared (user_id=None) group session instead
of the per-user session, letting the model see previously observed context.
For stickers the attribution is applied after _handle_sticker completes
(which overwrites event.text with the vision description); for all other
media types it is applied once after caption cleaning.
Four new tests cover the observe and attribution paths for both handlers.
Five small fixes against issues filed during the post-merge salvage audit:
* #28670: `_GATEWAY_PROVIDER_ERROR_RE` false-positives on legitimate prose.
Replace the regex with an anchored `_GATEWAY_PROVIDER_ERROR_SHAPE_RE` and
add a length-cap heuristic to `_looks_like_gateway_provider_error`:
short envelope at the start of the message → real provider error; long
prose containing 'HTTP 404' → assistant answer, leave alone.
* #28672: drop the pointless 1s asyncio.sleep on Telegram thread-not-found
retries. The same-thread retry is preserved (catches Telegram's
occasional transient flake exercised by
test_send_retries_transient_thread_not_found_before_fallback) but with
no artificial delay.
* #28674: broaden `_should_retry_without_dm_topic_reply_anchor` to also
fire when Bot API rejects `direct_messages_topic_id` for synthetic /
resumed sends that have no reply anchor. Avoids dropping post-resume
background notifications if the topic id goes stale.
* #28676: delete the dead image-document branch superseded by bd0c54d17
(which returns early on the same extension set).
* #28678: extend chat-scoped allowlist (`TELEGRAM_GROUP_ALLOWED_CHATS`)
to also cover `chat_type == 'channel'`, so operators can authorize
channel posts by chat id without falling back to per-user allowlists.
Tests:
- scripts/run_tests.sh tests/gateway/test_telegram_thread_fallback.py -q → 41/41
- scripts/run_tests.sh tests/cron/test_scheduler.py -q → 127/127
- broader test set: same 3 pre-existing test-pollution failures reproduce
on plain main.
Two coordinated changes that unblock downstream audio pipelines
(diarization, custom transcription, archival) on attachments larger
than the public Bot API's 20MB getFile ceiling.
- `stt.enabled: false` no longer drops voice/audio with a generic
"transcription disabled" note. The gateway probes the cached file's
duration (wave → mutagen → ffprobe ladder) and surfaces
`[The user sent a voice message: <abs path> (duration: M:SS)]` to
the agent so a skill or tool can pick up the raw file. The previous
placeholder is replaced rather than appended when present.
- `platforms.telegram.extra.base_url` set → adapter auto-lifts its
document size cap from 20MB to 2GB (the local telegram-bot-api
`--local` ceiling) and the "too large" reply reports the active
limit dynamically. No new config knob; presence of `base_url` is the
opt-in.
- `platforms.telegram.extra.local_mode: true` wires
`Application.builder().local_mode(True)` on the python-telegram-bot
builder. PTB then reads files from disk instead of HTTP, which is
required when telegram-bot-api runs in `--local` mode (the server
returns absolute filesystem paths, not `/file/bot...` URLs).
- gateway/run.py: rewrites the `stt.enabled: false` branch of
`_enrich_message_with_transcription`. New `_format_duration` +
`_probe_audio_duration` helpers.
- gateway/platforms/telegram.py: `_max_doc_bytes` instance attribute
derived from `extra.base_url`; `local_mode` builder wiring;
dynamic "too large" message.
- tests/gateway/test_stt_config.py: covers path-surfacing with and
without an existing user message, and placeholder replacement.
- tests/gateway/test_telegram_max_doc_bytes.py: 3 cases — default 20MB
without base_url, 2GB when set, empty-string base_url keeps default.
- website/docs/user-guide/messaging/telegram.md: new "Skipping STT"
subsection under Voice Messages and a full "Large Files (>20MB) via
Local Bot API Server" walkthrough (api_id/api_hash, docker-compose,
one-time `logOut` migration, `platforms.telegram.extra` config, the
`local_mode` disk-access requirement, the silent HTTP-fallback 404).
- website/docs/user-guide/features/voice-mode.md: documents the
`stt.enabled` knob in the config reference.
- `pytest tests/gateway/test_telegram_max_doc_bytes.py
tests/gateway/test_stt_config.py` → 9/9 passing.
- Verified end-to-end on a live deployment: gateway log shows
`Using custom Telegram base_url: http://...` and
`Using Telegram local_mode (read files from disk)` on startup;
voice messages above 20MB cache to disk and surface their path to
the agent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a user sends a message on Telegram, the incoming message is now
automatically pinned at the start of processing and unpinned when the
agent finishes its turn. This gives the user a visual indicator that
their message is being worked on, and keeps the conversation anchored.
Changes:
- telegram.py: Added pinChatMessage in on_processing_start and
unpinChatMessage in on_processing_complete. Restructured both
hooks so pin/unpin runs independently of the reactions feature
(reactions are optional; pinning is always on).
- telegram.py: Pass message_id through SessionSource so it's
available in the session context.
- session_context.py: Added HERMES_SESSION_MESSAGE_ID context var.
- run.py: Pass source.message_id through set_session_vars.
Pinning is silent (disable_notification=True) and failures are
logged at debug level without interrupting message processing.
Only the user's incoming message is pinned -- never the agent's
replies. Auto-resume events (which have no message_id) are
correctly skipped.
The gmail-triage skill's Telegram inline buttons emit callback_data of the
form `gt:<verb>:<arg>`, but `_handle_callback_query` had no `gt:` branch —
taps fell through silently and the spinner sat there until Telegram timed it
out.
Add `_handle_gmail_triage_callback`, dispatched from the existing callback
router, that:
- Authorizes the caller via the same `_is_callback_user_authorized` path as
the approval / slash-confirm / clarify handlers.
- Maps each verb to a script under `~/.hermes/scripts/gmail-triage/` and runs
it async with a 60s timeout.
- Splits verbs into one-shots (send / archive / draft / spam) — append the
confirmation and strip the keyboard so the action can't fire twice — and
sticky-state changes (mute / trust / vip ± -domain) — append the
confirmation but leave the keyboard tappable so the user can stack actions
on one email.
- On failure: toast only, keyboard preserved so the user can retry.
- Logs every callback outcome to gateway.log for debugging.
When a DM topic lane's message_thread_id is rejected by Telegram
(e.g. stale or deleted topic), send_typing now falls back to sending
the typing indicator without thread_id so it at least appears in the
main DM view, rather than being silently swallowed.
Also adds test for the fallback behavior.
When users send images as documents (Telegram file picker), they were
rejected with "Unsupported document type" because SUPPORTED_DOCUMENT_TYPES
only includes text/office formats. Add SUPPORTED_IMAGE_DOCUMENT_TYPES
to base.py and handle them in telegram.py before the document check.
- Add SUPPORTED_IMAGE_DOCUMENT_TYPES constant to base.py
- Add MIME reverse-lookup for image types in telegram.py
- Route image documents through cache_image_from_bytes + vision pipeline
- Handle media groups for image documents
Closes: #20128, #18620
Register Telegram bot commands across default, private, and group scopes so
the slash-command menu is available outside DMs.
Changes from review feedback:
- Add asyncio.Lock to prevent race condition in _ensure_forum_commands
- Extract MAX_COMMANDS_PER_SCOPE constant (30) to avoid magic number
- Upgrade error logging from debug->warning in forum registration
- Add tests covering lazy forum registration and concurrent safety
- Remove /start handler from this PR (separate feature)
Fixes review: needs_work (race, magic number, log levels, missing tests)
send_slash_confirm() sent the raw command preview with ParseMode.MARKDOWN,
skipping the format_message() conversion applied to every other dynamic
send in the adapter. Commands with underscores, dots, brackets, or other
MarkdownV2-sensitive characters raised BadRequest: Can't parse entities;
the exception was swallowed by the outer try/except, so the confirmation
prompt silently never appeared.
Fix: wrap preview through format_message() and switch to MARKDOWN_V2,
symmetric with send_update_prompt and the callback sends fixed in
a69404052.
The text /approve and /deny paths in gateway/run.py call
resume_typing_for_chat() after resolve_gateway_approval() succeeds, but
the Telegram inline-button (ea:*) callback in _handle_callback_query did
not. Typing is paused when the approval is sent (gateway/run.py:15658),
so without a matching resume the typing indicator stayed gone for the
remainder of a long-running turn after a button click.
Symmetry-match the text path: after a successful resolve, call
self.resume_typing_for_chat(str(query_chat_id)). Guarded by count > 0
to match /approve's "if not count" early-return — if nothing was
actually resolved, the agent thread was never unblocked, so typing
should remain paused.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The _is_callback_user_authorized fallback returned True when
TELEGRAM_ALLOWED_USERS was not set, allowing any Telegram user
to interact with the bot. Change to fail-closed: deny by default
unless GATEWAY_ALLOW_ALL_USERS=true is explicitly set.
Fixes#24457
TELEGRAM_ALLOWED_USERS was only checked for callback/inline-button
actions but not for inbound messages. Unauthorized users triggered an
'Unauthorized user' log warning but their messages were still processed
by the agent — a P0 security bypass (issue #23778).
Fix: add allowlist check in _should_process_message() which is called
for all message types (text, command, media, location). If the sender
is not in TELEGRAM_ALLOWED_USERS, the message is dropped immediately
with a warning log. Empty TELEGRAM_ALLOWED_USERS continues to allow
all users (existing behavior).
Fixes#23778
Background-process completion notifications (notify_on_complete) and
watch-pattern notifications were always delivered to the Telegram main
chat instead of the originating private-chat topic.
Hermes-created Telegram DM topic lanes only render a send when it carries
both message_thread_id and a reply anchor. The synthetic MessageEvent
injected on process completion had no message_id, so _reply_anchor_for_event
returned None and _thread_kwargs_for_send dropped message_thread_id
entirely — routing the notification to the main chat.
Capture the triggering message id at spawn time and thread it through to
the synthetic event so it can be reply-anchored back into the topic:
- session_context: add HERMES_SESSION_MESSAGE_ID context var
- telegram adapter: populate SessionSource.message_id on inbound messages
- terminal tool: persist watcher_message_id on the process session
- process registry: carry/persist message_id on watcher dicts + checkpoint
- gateway: set MessageEvent.message_id on injected notifications
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When edit_message_text fails with a transient error (httpx.ConnectError,
NetworkError, server disconnected, timeouts), the progress-message sender
must not permanently set can_edit = False — that would convert a single
Telegram network hiccup into separate per-tool bubbles for the rest of the run.
Changes:
- gateway/platforms/telegram.py: edit_message now returns retryable=True for
transient network errors (ConnectError, NetworkError, timeouts, server
disconnects, temporarily unavailable). Permanent failures (flood control,
message-not-found, permissions) remain retryable=False.
- gateway/run.py: send_progress_messages checks result.retryable before
setting can_edit = False. Transient failures skip the fallback-send and
continue — the next edit cycle catches up with the accumulated lines.
Permanent failures (flood, message-not-found, etc.) still disable editing.
Tests: 22 new tests in test_telegram_progress_edit_transient.py covering
transient vs permanent error classification, SendResult.retryable semantics,
and the can_edit decision logic.
Fixes#27828
The DM topic reply fallback code in send() hardcoded should_thread=True
when telegram_dm_topic_reply_fallback metadata was present, bypassing
_should_thread_reply() and ignoring reply_to_mode config. This caused
quote bubbles on every response even with reply_to_mode: 'off'.
Fix:
- Add reply_to_mode param to _reply_to_message_id_for_send() and
_thread_kwargs_for_send() classmethods
- In send(), check self._reply_to_mode != 'off' for DM topic fallback
- Suppress reply anchor and reply_to_message_id when mode is 'off'
while preserving message_thread_id for correct topic routing
- Thread reply_to_mode through all 29 call sites
Regression coverage: 10 new tests in test_telegram_reply_mode.py
covering classmethod behavior, send() integration, and backward
compatibility.
Fixes reply_to_mode: 'off' ignored by Telegram DM topic reply fallback code #23994
When Telegram clarify prompts offer long choices, mobile clients
truncate the inline button labels, making options unreadable.
Previously only the question was shown in the message body with
truncated choice text in button labels.
Fix: append the full numbered option list to the message body
so users can read complete choice text on any client. Buttons
now use short numeric labels (1, 2, ...) to avoid Telegram
truncation. The 'Other (type answer)' button is unchanged.
Long choice labels are now rendered in full (not truncated to
57 chars + '...') since they appear in the body instead of
button labels.
Closes: #27497
Telegram clears the typing state when a new message is delivered.
When the agent sends intermediate progress messages (like 'Checking:'),
the '...typing' bubble disappears immediately and doesn't return until
the next keepalive tick (up to 2s later). This makes Hermes appear
unresponsive during multi-tool operations.
Fix: call send_typing() immediately after successful message delivery
to restart the typing indicator without waiting for the next keepalive tick.
Fixes#25836
When the final streamed text is identical to the last plain-text edit,
stream_consumer._send_or_edit short-circuits and never calls
adapter.edit_message(finalize=True). For Telegram, this skips the
plain-text → MarkdownV2 conversion, leaving raw Markdown syntax visible
to the user.
Set REQUIRES_EDIT_FINALIZE = True on TelegramAdapter so the finalize
edit is always delivered, matching the existing DingTalk pattern.
Fixes#25710
The cherry-picked PR over-indented the edit_message_text block for
the mm: (model selected → switch) success path so the confirmation
edit lived inside the preceding 'except Exception as exc' branch and
only fired when the callback raised. Dedent the try/except back to
12-space indent so it runs after the callback succeeds, restoring
the original flow that removes the inline buttons and shows the
'Switched to ...' confirmation.
Add a regression test (test_model_selected_edits_message_on_success)
that asserts edit_message_text is awaited and the result text is
routed through format_message (MARKDOWN_V2 + backtick survival).
Add phuongvm to scripts/release.py AUTHOR_MAP.
Use MarkdownV2 formatting for Telegram callback follow-ups and interactive prompts where dynamic names or user text can break legacy Markdown parsing. Add regression coverage for reload-mcp, model picker, approval callbacks, and update prompts.
PR #23458 introduced _send_message_with_thread_fallback() and applied it
to all control-style sends (send_update_prompt, send_approval_request,
send_model_picker_prompt), but the slash-confirm result message in
handle_callback_query still called self._bot.send_message directly.
In supergroups with stale message_thread_id on the callback's parent
message, this raises "Message thread not found" and silently swallows
the result text. Replace with the helper so the same retry-without-
thread-id logic applies.
When the user runs /stop or a session is interrupted mid-flight, the
👀 in-progress reaction lingered on the user's message indefinitely.
Without another agent run to swap it for 👍/👎, the eyes stayed there
forever — visually misleading (looks like the agent is still working).
Fix: on ProcessingOutcome.CANCELLED, call set_message_reaction with
reaction=None to clear all reactions on the message. Documented Bot API
semantics (equivalent to Bot API 10.0's deleteMessageReaction, but works
on PTB 22.6 already without the version bump).
Test changes:
- Renamed test_on_processing_complete_cancelled_keeps_existing_reaction
→ test_on_processing_complete_cancelled_clears_reaction; updated
assertion to expect set_message_reaction(reaction=None).
- Added test_on_processing_complete_cancelled_skipped_when_disabled
(TELEGRAM_REACTIONS=false short-circuits).
- Added test_clear_reactions_handles_api_error_gracefully and
test_clear_reactions_returns_false_without_bot to cover the new
_clear_reactions helper.
The clarify tool returned 'not available in this execution context' for
every gateway-mode agent because gateway/run.py never passed
clarify_callback into the AIAgent constructor. Schema actively encouraged
calling it; users never saw the question.
Changes:
- tools/clarify_gateway.py — new event-based primitive mirroring
tools/approval.py: register/wait_for_response/resolve_gateway_clarify
with per-session FIFO, threading.Event blocking with 1s heartbeat
slices (so the inactivity watchdog keeps ticking), and
clear_session for boundary cleanup.
- gateway/platforms/base.py — abstract send_clarify with a numbered-text
fallback so every adapter (Discord, Slack, WhatsApp, Signal, Matrix,
etc.) gets a working clarify out of the box. Plus an active-session
bypass: when the agent is blocked on a text-awaiting clarify, the next
non-command message routes inline to the runner's intercept instead
of being queued + triggering an interrupt. Same shape as the /approve
deadlock fix from PR #4926.
- gateway/platforms/telegram.py — concrete send_clarify renders one
inline button per choice plus '✏️ Other (type answer)'. cl: callback
handler resolves numeric choices immediately, flips to text-capture
mode for Other, with the same authorization guards as exec/slash
approvals.
- gateway/run.py — clarify_callback wired at the cached-agent per-turn
callback assignment site (only the user-facing agent path; cron and
hygiene-compress agents have no human attached). Bridges sync→async
via run_coroutine_threadsafe, blocks with the configured timeout, and
returns a '[user did not respond within Xm]' sentinel on timeout so
the agent adapts rather than pinning the running-agent guard. Text-
intercept added to _handle_message before slash-confirm intercept
(skipping slash commands). clear_session called in the run's finally
to cancel any orphan entries.
- hermes_cli/config.py — agent.clarify_timeout default 600s.
- website/docs/user-guide/messaging/telegram.md — Interactive Prompts
section.
Tests:
- tests/tools/test_clarify_gateway.py (14 tests) — full primitive
coverage: button resolve, open-ended auto-await, Other flip, timeout
None, unknown-id idempotency, clear_session cancellation, FIFO
ordering, register/unregister notify, config default.
- tests/gateway/test_telegram_clarify_buttons.py (12 tests) — render
paths (multi-choice/open-ended/long-label/HTML-escape/not-connected),
callback dispatch (numeric resolve/Other flip/already-resolved/
unauthorized/invalid-token), and base-adapter text fallback.
Out of scope: bot-to-bot, guest mode, checklists, poll media, live
photos. Closes#24191.
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
Replace with for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining.
133 files, +626/-626 (net zero).
Re-authored against current main from PR #10388 by @wilsen0. The
original branch is 3800+ commits stale and could not be cherry-picked
without reverting unrelated work; this change carries only the perf
intent forward.
Tuning summary
==============
Text-batch ingress (gateway/platforms/telegram.py):
- HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3
- HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0
- Adaptive fast-path tiers in _flush_text_batch:
total <= 320 cp -> min(cap, 0.18)
total <= 1024 cp -> min(cap, 0.24)
else -> cap
A single short reply now reaches the agent in ~180ms instead of
600ms. Tier constants compose with the configured cap via min()
so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS
below 0.18 still wins on every tier.
- _env_float_clamped helper replaces bare float(os.getenv()).
Rejects NaN / Inf, applies optional min/max bounds. Used for
text-batch + media-batch knobs. Prevents asyncio.sleep(NaN)
crashes when an operator typos an env var.
Stream cadence (gateway/config.py + stream_consumer.py):
- StreamingConfig.edit_interval default 1.0s -> 0.8s
- StreamingConfig.buffer_threshold default 40 -> 24 chars
- DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now
a single source of truth. StreamConsumerConfig imports them
instead of duplicating the literals; the prior dual-source drift
is fixed.
Tool progress (gateway/display_config.py):
- Telegram default tool_progress 'all' -> 'new'. Inside
Telegram's ~1 edit/s flood envelope the 'all' default would
accumulate edit pressure on busy chats; 'new' shows only the
leading bubble per tool batch and feels less spammy.
- Slack tier_low override (tool_progress='off') is preserved.
Composition with native draft streaming (#23512)
================================================
The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH
the draft path (send_draft) and the edit path (edit_message), so the
tighter cadence helps native draft as much as edit-based. The
text-batch fast-path applies before the consumer starts, so it speeds
up the first-token latency on every transport. No conflict.
Stale-base avoidance
====================
Re-authored from scratch rather than cherry-picked. Dropped from the
original branch:
- Unrelated d2f043f9c 'fix(anthropic): preserve third-party thinking
continuity' commit
- boot_md.py builtin gateway hook (unrelated)
- Reverted Slack tool_progress='off' (#14663) restoration
- Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO
members deletion
- 2300+ lines of run.py base-skew noise
Tests
=====
New tests/gateway/test_telegram_text_batch_perf.py:
- 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds).
- 4 tests for the adaptive-tier composition rules.
Updated tests/gateway/test_display_config.py:
- test_platform_default_when_no_user_config: 'all' -> 'new' for
Telegram, with comment.
- test_high_tier_platforms: split into Telegram-overrides-to-new
and Discord-stays-all assertions.
Closes#10388.
Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
When edit_message_text exceeded Telegram's 4096 UTF-16 codepoint limit,
the adapter caught the BadRequest, best-effort truncated the content
with '…', and returned SendResult(success=True). The stream consumer
believed the full edit was delivered and never recovered, silently
dropping everything past the truncation boundary on long replies.
Returning failure isn't safe either — the consumer's existing fallback
path can race against the next streaming tick, producing duplicate
sends or gaps. Instead, the adapter now SPLITS the oversized payload
across the existing message + new continuation messages, so the user
always gets the full reply in correct order.
How it works:
1. Pre-flight: if utf16_len(content) already exceeds MAX_MESSAGE_LENGTH,
call the new _edit_overflow_split helper directly — saves a doomed
round-trip + a Telegram error.
2. Reactive: if Telegram still returns 'message_too_long' after the
pre-flight (e.g. parse_mode formatting inflated the payload past
the limit via MarkdownV2 escapes), the same helper handles it.
3. _edit_overflow_split:
- Splits via truncate_message(len_fn=utf16_len) — same chunking the
non-streaming send() path uses; chunks get '(1/N)' suffixes.
- Edits the original message_id with chunk 1 (with parse_mode +
plain-fallback when finalize=True, mirroring the main edit path).
- Sends each remaining chunk via self._bot.send_message threaded as
a reply to the previous chunk so the user sees them as a
contiguous block. MarkdownV2-with-plain-fallback per chunk on
finalize.
- Returns SendResult(success=True, message_id=<last_chunk_id>,
continuation_message_ids=(<chunk2_id>, <chunk3_id>, ...)) so the
stream consumer can keep editing the most recent visible message
and the gateway has full visibility into every message id.
SendResult contract extension:
Added optional continuation_message_ids: tuple = () field. When
empty (the common case), behavior is unchanged. When populated, the
caller knows the adapter delivered across multiple platform messages.
Stream consumer integration:
GatewayStreamConsumer._send_or_edit advances _message_id to the
last-continuation id when it sees continuation_message_ids on a
successful edit result, resets _last_sent_text (the new visible
message holds only the final chunk's text), and fires
on_new_message so tool-progress bubbles linearize below the new
continuation rather than the original. Mirrors the openclaw #32535
inter-tool-leak guard.
Composes with what just landed:
- PR #23455 (UTF-16 length-aware splitting in stream consumer)
prevents most overflows upstream by measuring text in UTF-16
codeunits before deciding to split. This PR is the safety net at
the adapter boundary.
- PR #23512 (native draft streaming, default for DM Telegram) routes
DM streaming through send_draft, which has its own contract
unaffected by this change. So this fix narrows in scope to the
edit-based path: groups, supergroups, forum topics, every
non-Telegram platform, and the per-response fallback after a
draft failure.
Salvage notes:
- Cherry-picked from PR #19537 by @kjames2001. Original PR returned
failure on overflow; this evolves to split-and-deliver so users
never lose content and the consumer state stays consistent.
- Dropped an unrelated model-picker hunk (line 2114-2117) that
silently killed the 'X more available — type /model <name>
directly' hint by hardcoding total=len(models). Not in scope.
- Restored the timeout-aware retryable=not is_timeout signal in
send()'s fallthrough catch block.
Closes#19537.
Adds Telegram's native streaming-draft API as a streaming transport so DM
replies render with smooth animated previews as tokens arrive, dropping
the per-edit jitter of the legacy editMessageText polling path.
Adapter contract (gateway/platforms/base.py):
- supports_draft_streaming(chat_type, metadata) -> bool. Default False.
Telegram returns True only for DMs and only when the bound python-
telegram-bot version exposes Bot.send_message_draft (PTB 22.6+).
- send_draft(chat_id, draft_id, content, metadata) -> SendResult.
Default raises NotImplementedError. Telegram delegates to PTB's
send_message_draft. Drafts have no message_id (Bot API contract);
SendResult.message_id is None on success.
Telegram adapter (gateway/platforms/telegram.py):
- supports_draft_streaming gates on chat_type='dm' AND PTB capability.
- send_draft trims to MAX_MESSAGE_LENGTH using utf16_len, threads
message_thread_id through metadata, and routes failures back as
SendResult(success=False, error=...) so the consumer can fall back.
Stream consumer (gateway/stream_consumer.py):
- StreamConsumerConfig gains transport ('auto'|'draft'|'edit'|'off')
and chat_type fields.
- run() resolves _use_draft_streaming once via a probe at the top of
the run, allocating a fresh class-wide draft_id_counter so each
response animates as its own preview (no animation collision across
consecutive responses to the same chat).
- _send_or_edit gains a pre-edit branch: when drafts are active AND
not finalizing AND no edit-path message_id is established, the
frame routes through _send_draft_frame instead of edit_message.
Drafts intentionally do NOT set _already_sent so the gateway's
final sendMessage path still fires — drafts have no message_id and
the user needs a real message in their chat history.
- _reset_segment_state bumps the draft_id when the consumer is in
draft mode so each text block after a tool boundary animates as a
fresh preview below the tool-progress bubble (avoids the inter-
tool-call leak openclaw documented in their #32535).
- Per-response fallback: any send_draft failure (transient network,
server reject, capability gap) flips _use_draft_streaming to False
for the rest of the run, gracefully returning to the edit path.
Gateway config (gateway/config.py):
- StreamingConfig.transport default flips edit -> auto. The auto path
is identical to edit on every chat type that doesn't currently
support drafts (groups, supergroups, forum topics, every non-
Telegram platform), so the default is backwards-compatible for
non-DM users.
Lifecycle model (Telegram Bot API 9.5):
1. sendMessageDraft(chat_id, draft_id, text='') opens the bubble.
2. Repeated sendMessageDraft calls with the SAME draft_id animate
the preview as text grows.
3. Drafts have no message_id and cannot be edited or deleted.
4. When the response finishes the gateway's normal sendMessage path
delivers the final answer; the draft preview clears naturally on
the client and the user sees a real message in their history.
Inspired by PR #3412 by @NivOO5. Re-authored against current main
(stream_consumer.py is now ~4x larger than at #3412's branch base, with
new _NEW_SEGMENT/_COMMENTARY/finalize/_on_new_message machinery the
original PR didn't account for) but the design call (DM-only, edit-
fallback, transport=auto|draft|edit|off) is faithful to the original
proposal, with two improvements baked in:
1. Per-response draft_id (monotonic counter, not a time hash) — no
collision risk across consecutive responses on the same chat.
2. Tool-boundary draft_id bump — prevents the inter-tool-call leak
openclaw hit during their rollout (their #32535).
Closes#21439 (duplicate feature request).
Cherry-picked from PR #10371. Two-layer defense for the spurious-thread_id
issue (#3206):
1. _build_message_event filters DM thread_ids: only preserve thread_id
for real topic messages (is_topic_message=True). Telegram puts
message_thread_id on every DM that is a reply, but reply-chain ids
route to nonexistent threads on send.
2. _send_message_with_thread_fallback helper: control sends
(send_update_prompt, send_exec_approval / send_slash_confirm,
send_model_picker) retry once without message_thread_id when
Telegram returns BadRequest 'Message thread not found'. Mirrors
the pattern PR #3390 added for the streaming send path.
Salvage notes:
- Conflict 1 (line ~4099): merged the contributor's DM is_topic_message
filter with the existing forum General-topic default from #22423,
preserving both behaviors.
- Conflict 2 (line ~1664 / 1690): kept main's delete_message (PR #23416)
alongside the new helper. Tightened the helper's exception catch
from bare 'Exception' to use the existing _is_bad_request_error +
_is_thread_not_found_error helpers (line 484-496) for consistency
with the streaming send path.
- Widened the fix to send_update_prompt (was bare self._bot.send_message,
same bug class).
Authored by rahimsais via PR #10371 (re-attributed from donrhmexe@
local commit author).
The stream consumer measured message length using Python's len() (Unicode
code points), but Telegram's actual limit is in UTF-16 code units. This
caused messages with supplementary characters (emoji, CJK, etc.) to exceed
Telegram's 4096-character limit, resulting in truncated messages with
formatting artifacts.
Changes:
- Add message_len_fn property to BasePlatformAdapter (defaults to len)
- Override in TelegramAdapter to return utf16_len
- Stream consumer uses adapter.message_len_fn for:
- safe_limit calculation
- overflow detection
- truncate_message calls
- split point calculation (via _custom_unit_to_cp)
- fallback final send chunking
Fixes truncated messages with black square artifacts on Telegram when
the model generates responses containing multi-byte Unicode characters.