hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

Author	SHA1	Message	Date
kjames2001	bf1f40996f	fix(telegram): split-and-deliver oversized edits instead of silent truncation When edit_message_text exceeded Telegram's 4096 UTF-16 codepoint limit, the adapter caught the BadRequest, best-effort truncated the content with '…', and returned SendResult(success=True). The stream consumer believed the full edit was delivered and never recovered, silently dropping everything past the truncation boundary on long replies. Returning failure isn't safe either — the consumer's existing fallback path can race against the next streaming tick, producing duplicate sends or gaps. Instead, the adapter now SPLITS the oversized payload across the existing message + new continuation messages, so the user always gets the full reply in correct order. How it works: 1. Pre-flight: if utf16_len(content) already exceeds MAX_MESSAGE_LENGTH, call the new _edit_overflow_split helper directly — saves a doomed round-trip + a Telegram error. 2. Reactive: if Telegram still returns 'message_too_long' after the pre-flight (e.g. parse_mode formatting inflated the payload past the limit via MarkdownV2 escapes), the same helper handles it. 3. _edit_overflow_split: - Splits via truncate_message(len_fn=utf16_len) — same chunking the non-streaming send() path uses; chunks get '(1/N)' suffixes. - Edits the original message_id with chunk 1 (with parse_mode + plain-fallback when finalize=True, mirroring the main edit path). - Sends each remaining chunk via self._bot.send_message threaded as a reply to the previous chunk so the user sees them as a contiguous block. MarkdownV2-with-plain-fallback per chunk on finalize. - Returns SendResult(success=True, message_id=<last_chunk_id>, continuation_message_ids=(<chunk2_id>, <chunk3_id>, ...)) so the stream consumer can keep editing the most recent visible message and the gateway has full visibility into every message id. SendResult contract extension: Added optional continuation_message_ids: tuple = () field. When empty (the common case), behavior is unchanged. When populated, the caller knows the adapter delivered across multiple platform messages. Stream consumer integration: GatewayStreamConsumer._send_or_edit advances _message_id to the last-continuation id when it sees continuation_message_ids on a successful edit result, resets _last_sent_text (the new visible message holds only the final chunk's text), and fires on_new_message so tool-progress bubbles linearize below the new continuation rather than the original. Mirrors the openclaw #32535 inter-tool-leak guard. Composes with what just landed: - PR #23455 (UTF-16 length-aware splitting in stream consumer) prevents most overflows upstream by measuring text in UTF-16 codeunits before deciding to split. This PR is the safety net at the adapter boundary. - PR #23512 (native draft streaming, default for DM Telegram) routes DM streaming through send_draft, which has its own contract unaffected by this change. So this fix narrows in scope to the edit-based path: groups, supergroups, forum topics, every non-Telegram platform, and the per-response fallback after a draft failure. Salvage notes: - Cherry-picked from PR #19537 by @kjames2001. Original PR returned failure on overflow; this evolves to split-and-deliver so users never lose content and the consumer state stays consistent. - Dropped an unrelated model-picker hunk (line 2114-2117) that silently killed the 'X more available — type /model <name> directly' hint by hardcoding total=len(models). Not in scope. - Restored the timeout-aware retryable=not is_timeout signal in send()'s fallthrough catch block. Closes #19537.	2026-05-10 22:02:56 -07:00
NivOO5	4ed293b38e	feat(telegram): native draft streaming via sendMessageDraft (Bot API 9.5+) Adds Telegram's native streaming-draft API as a streaming transport so DM replies render with smooth animated previews as tokens arrive, dropping the per-edit jitter of the legacy editMessageText polling path. Adapter contract (gateway/platforms/base.py): - supports_draft_streaming(chat_type, metadata) -> bool. Default False. Telegram returns True only for DMs and only when the bound python- telegram-bot version exposes Bot.send_message_draft (PTB 22.6+). - send_draft(chat_id, draft_id, content, metadata) -> SendResult. Default raises NotImplementedError. Telegram delegates to PTB's send_message_draft. Drafts have no message_id (Bot API contract); SendResult.message_id is None on success. Telegram adapter (gateway/platforms/telegram.py): - supports_draft_streaming gates on chat_type='dm' AND PTB capability. - send_draft trims to MAX_MESSAGE_LENGTH using utf16_len, threads message_thread_id through metadata, and routes failures back as SendResult(success=False, error=...) so the consumer can fall back. Stream consumer (gateway/stream_consumer.py): - StreamConsumerConfig gains transport ('auto'\|'draft'\|'edit'\|'off') and chat_type fields. - run() resolves _use_draft_streaming once via a probe at the top of the run, allocating a fresh class-wide draft_id_counter so each response animates as its own preview (no animation collision across consecutive responses to the same chat). - _send_or_edit gains a pre-edit branch: when drafts are active AND not finalizing AND no edit-path message_id is established, the frame routes through _send_draft_frame instead of edit_message. Drafts intentionally do NOT set _already_sent so the gateway's final sendMessage path still fires — drafts have no message_id and the user needs a real message in their chat history. - _reset_segment_state bumps the draft_id when the consumer is in draft mode so each text block after a tool boundary animates as a fresh preview below the tool-progress bubble (avoids the inter- tool-call leak openclaw documented in their #32535). - Per-response fallback: any send_draft failure (transient network, server reject, capability gap) flips _use_draft_streaming to False for the rest of the run, gracefully returning to the edit path. Gateway config (gateway/config.py): - StreamingConfig.transport default flips edit -> auto. The auto path is identical to edit on every chat type that doesn't currently support drafts (groups, supergroups, forum topics, every non- Telegram platform), so the default is backwards-compatible for non-DM users. Lifecycle model (Telegram Bot API 9.5): 1. sendMessageDraft(chat_id, draft_id, text='') opens the bubble. 2. Repeated sendMessageDraft calls with the SAME draft_id animate the preview as text grows. 3. Drafts have no message_id and cannot be edited or deleted. 4. When the response finishes the gateway's normal sendMessage path delivers the final answer; the draft preview clears naturally on the client and the user sees a real message in their history. Inspired by PR #3412 by @NivOO5. Re-authored against current main (stream_consumer.py is now ~4x larger than at #3412's branch base, with new _NEW_SEGMENT/_COMMENTARY/finalize/_on_new_message machinery the original PR didn't account for) but the design call (DM-only, edit- fallback, transport=auto\|draft\|edit\|off) is faithful to the original proposal, with two improvements baked in: 1. Per-response draft_id (monotonic counter, not a time hash) — no collision risk across consecutive responses on the same chat. 2. Tool-boundary draft_id bump — prevents the inter-tool-call leak openclaw hit during their rollout (their #32535). Closes #21439 (duplicate feature request).	2026-05-10 20:02:50 -07:00
Aubrey Freeman III	c0da5d09a6	fix: use UTF-16 length for Telegram stream consumer message splitting The stream consumer measured message length using Python's len() (Unicode code points), but Telegram's actual limit is in UTF-16 code units. This caused messages with supplementary characters (emoji, CJK, etc.) to exceed Telegram's 4096-character limit, resulting in truncated messages with formatting artifacts. Changes: - Add message_len_fn property to BasePlatformAdapter (defaults to len) - Override in TelegramAdapter to return utf16_len - Stream consumer uses adapter.message_len_fn for: - safe_limit calculation - overflow detection - truncate_message calls - split point calculation (via _custom_unit_to_cp) - fallback final send chunking Fixes truncated messages with black square artifacts on Telegram when the model generates responses containing multi-byte Unicode characters.	2026-05-10 16:21:07 -07:00
黄飞虹	e164a9c1ed	fix(stream-consumer): preserve thread routing on overflow first-send path When the first streamed message exceeds the platform length limit and gets split into chunks, _send_new_chunk was called with self._message_id (which is None on first send), dropping thread routing entirely. Fallback to self._initial_reply_to_id so overflow chunks land in the correct topic/thread. Also fix a fragile test assertion that could be silently skipped.	2026-05-10 15:20:40 -07:00
hrygo	ff14666cdc	fix(gateway): stream consumer first message drops thread context Cherry-picked from PR #13077 commits: - 5500c7d8 fix(gateway): stream consumer first message drops thread context - e84403b9 test(gateway): add regression tests for stream consumer thread routing Fixes: Streaming first message drops thread/topic context in Feishu group topics, Slack threads, Telegram forum topics. Adds initial_reply_to_id ctor arg to GatewayStreamConsumer, threaded through _send_or_edit and _send_new_chunk. Also fixes Feishu _send_raw_message fallback path (reply -> create) to use receive_id_type='thread_id' so the new message lands in the correct topic instead of the main channel. Authored by hrygo via PR #13077 (re-attributed from the bot-authored salvage commit on the original branch).	2026-05-10 15:20:40 -07:00
Teknium	6636fecd47	fix(gateway): only mark final response sent when split-overflow chunks actually land (#23420 ) The split-overflow path in _send_or_edit (gateway/stream_consumer.py) was copying the cumulative _already_sent flag into _final_response_sent on the done frame. _already_sent goes True on any successful prior edit (tool progress) or on fallback-mode promotion when an edit fails — neither proves the current chunked send delivered the final answer. When the chunked send actually fails (network error, flood control), the consumer would wrongly claim 'final delivered' and the gateway's independent fallback delivery in run.py would be suppressed. User saw only tool-progress bubbles and never got the answer. Now we track per-chunk success locally: _send_new_chunk returns the new message_id on success or returns the passed-in reply_to unchanged on failure. If at least one returned id differs, chunks_delivered = True; otherwise stays False, gateway fallback runs. Adds two regression tests: - test_split_overflow_failed_send_does_not_mark_final_sent — primes _already_sent=True, then makes every send fail; asserts _final_response_sent stays False. - test_split_overflow_partial_send_marks_final_sent — happy path, asserts _final_response_sent goes True. Note: the companion bug at the CancelledError handler (issue cited lines 417-418) was already fixed by `3b5572ded` on 2026-04-16. Closes #10748	2026-05-10 15:13:54 -07:00
teknium1	ec1fad3449	fix(gateway): align fallback delete with sibling style + add regression tests Follow-up to HuangYuChuh's #17384 cherry-pick: - Use defensive getattr+logger.debug for delete_message lookup, mirroring the sibling _try_send_fresh_final cleanup pattern at L820+. Platforms that don't implement delete_message no longer raise AttributeError; the failure path now logs at debug for diagnosability instead of silently swallowing. - Add three regression tests in tests/gateway/test_stream_consumer.py: - delete_message awaited on happy-path exit with stale id - delete_message NOT awaited when no fallback chunks reached the user - no crash on adapters that lack delete_message (spec-restricted mock)	2026-05-10 14:22:59 -07:00
HuangYuChuh	4eb8479ebd	fix(gateway): delete partial message after fallback send on flood control When Telegram flood control triggers 3+ consecutive edit failures, the stream consumer enters fallback mode and sends the complete response as a new message. This leaves the user seeing two messages: a frozen partial (with cursor) and the full duplicate. After the fallback chunks are sent successfully, delete the original partial message so the user only sees one complete response. The delete is best-effort — if it fails (e.g. flood still active, missing permissions), the full answer is still delivered. Fixes #16668	2026-05-10 14:22:59 -07:00
Kid	1321bcf5fe	fix(gateway): finalize final stream edit on done	2026-05-09 17:51:04 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Teknium	dcd7b717f8	fix(gateway): linearize tool-progress bubbles with content messages (#17280 ) After PR #7885 (`97b0cd51e`) added content-side segment breaks for natural mid-turn assistant messages, the tool-progress task in gateway/run.py was not updated to match. progress_msg_id and progress_lines persisted for the whole run, so after a tool batch produced bubble B1 followed by content bubble C1, the next tool.started kept editing the OLD bubble B1 above C1 — making the chat appear out of order on Telegram, Discord, and Slack. Add on_new_message callback to GatewayStreamConsumer, fired at the four sites where a fresh content bubble lands on the platform: - _send_or_edit first-send branch (NOT edits) - _send_commentary - _send_new_chunk (overflow split) - each successful chunk of _send_fallback_final Gateway supplies a lambda that enqueues ('__reset__',) into the progress_queue. send_progress_messages() handles the marker in both the main loop and the CancelledError drain path, clearing progress_msg_id, progress_lines, and the dedup state so the next tool.started opens a fresh bubble below the new content. Result: each tool batch appears in chronological order below the preceding content. When no content appears between tool batches, tools still group in one bubble (CLI-style compactness). Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 22:17:33 -07:00
Teknium	b16f9d438b	feat(telegram): send fresh finals for stale preview streams (port openclaw#72038) (#16261 ) Some checks are pending Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-and-push (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details Tests / test (push) Waiting to run Details Tests / e2e (push) Waiting to run Details Ports openclaw/openclaw#72038 to hermes-agent. Telegram's `editMessageText` preserves the original message timestamp, so a long-running streamed reply (reasoning models that take 60+ seconds to finish) would keep the first-token timestamp even after completion. Users can't tell how long a task actually took. When a preview message has been visible for >= 60s (configurable via `streaming.fresh_final_after_seconds`), finalize by sending a fresh message instead of editing in place, then best-effort delete the stale preview. Short previews still edit in place (the existing fast path). Implementation notes adapted from OpenClaw's TypeScript original: - `StreamConsumerConfig` gains `fresh_final_after_seconds` (default 0 = legacy edit-in-place). Gateway-level `StreamingConfig` defaults to 60. - `GatewayStreamConsumer` tracks `_message_created_ts` at first-send and checks it in `_send_or_edit` on `finalize=True`. New helpers `_should_send_fresh_final` + `_try_fresh_final`. - `BasePlatformAdapter` gains optional `delete_message(chat_id, message_id)` returning False by default. `TelegramAdapter` implements it via `_bot.delete_message`. - `gateway/run.py` only enables fresh-final for `Platform.TELEGRAM`; other platforms ignore the setting (they don't have the stale-edit timestamp problem or edit-then-read works cheaply). - Fallback to normal edit on any fresh-send failure — no user-visible regression if Telegram rate-limits a send or the message is gone. Tests: 15 new cases in tests/gateway/test_stream_consumer_fresh_final.py covering short/long previews, config plumbing, delete-support absent, send-failure fallback, __no_edit__ sentinel safety, and StreamingConfig round-trip. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-04-26 17:26:37 -07:00
Tranquil-Flow	b668c09ab2	fix(gateway): strip cursor from frozen message on empty fallback continuation (#7183 ) When _send_fallback_final() is called with nothing new to deliver (the visible partial already matches final_text), the last edit may still show the cursor character because fallback mode was entered after a failed edit. Before this fix the early-return path left _already_sent = True without attempting to strip the cursor, so the message stayed frozen with a visible ▉ permanently. Adds a best-effort edit inside the empty-continuation branch to clean the cursor off the last-sent text. Harmless when fallback mode wasn't actually armed or when the cursor isn't present. If the strip edit itself fails (flood still active), we return without crashing and without corrupting _last_sent_text. Adapted from PR #7429 onto current main — the surrounding fallback block grew the #10807 stale-prefix handling since #7429 was written, so the cursor strip lives in the new else-branch where we still return early. 3 unit tests covering: cursor stripped on empty continuation, no edit attempted when cursor is not configured, cursor-strip edit failure handled without crash. Originally proposed as PR #7429.	2026-04-19 01:51:12 -07:00
konsisumer	1d1e1277e4	fix(gateway): flush undelivered tail before segment reset to preserve streamed text (#8124 ) When a streaming edit fails mid-stream (flood control, transport error) and a tool boundary arrives before the fallback threshold is reached, the pre-boundary tail in `_accumulated` was silently discarded by `_reset_segment_state`. The user saw a frozen partial message and missing words on the other side of the tool call. Flush the undelivered tail as a continuation message before the reset, computed relative to the last successfully-delivered prefix so we don't duplicate content the user already saw.	2026-04-19 01:43:04 -07:00
pedh	4459913f40	feat(dingtalk): AI Cards streaming, emoji reactions, and media handling Cherry-picked from #10985 by pedh, adapted to current main: * Keeps main's full group-chat gating (require_mention + allowed_users + free_response_chats + mention_patterns) — PR's simpler subset dropped. * Keeps main's fire-and-forget process() dispatch + session_webhook fallback for SDK >= 0.24. * Picks up PR's REQUIRES_EDIT_FINALIZE capability flag on BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through stream_consumer. Default False so Telegram/Slack/Discord/Matrix stay on the zero-overhead fast path. * DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow (tool-progress + final response) with sibling auto-close driven by reply_to, idempotent 🤔Thinking → 🥳Done swap, $alibabacloud-dingtalk$ for media URL resolution (replaces raw HTTP that was 403-ing). * pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 + alibabacloud-dingtalk>=2.0.0 + qrcode. Closes #10991 Co-authored-by: pedh	2026-04-17 19:26:53 -07:00
Siddharth Balyan	d38b73fa57	fix(matrix): E2EE and migration bugfixes (#10860 ) * - make buffered streaming - fix path naming to expand `~` for agent. - fix stripping of matrix ID to not remove other mentions / localports. * fix(matrix): register MembershipEventDispatcher for invite auto-join The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE events are only dispatched when MembershipEventDispatcher is registered on the client. Without it, _on_invite is dead code and the bot silently ignores all room invites. Closes #10094 Closes #10725 Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz) * fix(matrix): preserve _joined_rooms reference for CryptoStateStore connect() reassigned self._joined_rooms = set(...) after initial sync, orphaning the reference captured by _CryptoStateStore at init time. find_shared_rooms() returned [] forever, breaking Megolm session rotation on membership changes. Mutate in place with clear() + update() so the CryptoStateStore reference stays valid. Refs #8174, PR #8215 * fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race mautrix auto-registers DecryptionDispatcher when client.crypto is set. The adapter also registered _on_encrypted_event for the same event type. _on_encrypted_event had zero awaits and won the race to mark event IDs in the dedup set, causing _on_room_message to drop successfully decrypted events from DecryptionDispatcher. The retry loop masked this by re-decrypting every message ~4 seconds later. Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption; genuinely undecryptable events are logged by mautrix and retried on next key exchange. Refs #8174, PR #8215 * fix(matrix): re-verify device keys after share_keys() upload Matrix homeservers treat ed25519 identity keys as immutable per device. share_keys() can return 200 but silently ignore new keys if the device already exists with different identity keys. The bot would proceed with shared=True while peers encrypt to the old (unreachable) keys. Now re-queries the server after share_keys() and fails closed if keys don't match, with an actionable error message. Refs #8174, PR #8215 * fix(matrix): encrypt outbound attachments in E2EE rooms _upload_and_send() uploaded raw bytes and used the 'url' key for all rooms. In E2EE rooms, media must be encrypted client-side with encrypt_attachment(), the ciphertext uploaded, and the 'file' key (with key/iv/hashes) used instead of 'url'. Now detects encrypted rooms via state_store.is_encrypted() and branches to the encrypted upload path. Refs: PR #9822 (charles-brooks) * fix(matrix): add stop_typing to clear typing indicator after response The adapter set a 30-second typing timeout but never cleared it. The base class stop_typing() is a no-op, so the typing indicator lingered for up to 30 seconds after each response. Closes #6016 Refs: PR #6020 (r266-tech) * fix(matrix): cache all media types locally, not just photos/voice should_cache_locally only covered PHOTO, VOICE, and encrypted media. Unencrypted audio/video/documents in plaintext rooms were passed as MXC URLs that require authentication the agent doesn't have, resulting in 401 errors. Refs #3487, #3806 * fix(matrix): detect stale OTK conflict on startup and fail closed When crypto state is wiped but the same device ID is reused, the homeserver may still hold one-time keys signed with the previous identity key. Identity key re-upload succeeds but OTK uploads fail with "already exists" and a signature mismatch. Peers cannot establish new Olm sessions, so all new messages are undecryptable. Now proactively flushes OTKs via share_keys() during connect() and catches the "already exists" error with an actionable log message telling the operator to purge the device from the homeserver or generate a fresh device ID. Also documents the crypto store recovery procedure in the Matrix setup guide. Refs #8174 * docs(matrix): improve crypto recovery docs per review - Put easy path (fresh access token) first, manual purge second - URL-encode user ID in Synapse admin API example - Note that device deletion may invalidate the access token - Add "stop Synapse first" caveat for direct SQLite approach - Mention the fail-closed startup detection behavior - Add back-reference from upgrade section to OTK warning * refactor(matrix): cleanup from code review - Extract _extract_server_ed25519() and _reverify_keys_after_upload() to deduplicate the re-verification block (was copy-pasted in two places, three copies of ed25519 key extraction total) - Remove dead code: _pending_megolm, _retry_pending_decryptions, _MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after removing _on_encrypted_event - Remove tautological TestMediaCacheGate (tested its own predicate, not production code) - Remove dead TestMatrixMegolmEventHandling and TestMatrixRetryPendingDecryptions (tested removed methods) - Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator - Trim comment to just the "why"	2026-04-17 04:03:02 +05:30
konsisumer	3e3ec35a5e	fix: surface execute_code timeout to user instead of silently dropping (#10807 ) When execute_code times out, the result JSON had status="timeout" and an error field, but the output field was empty. Many models treat empty output as "nothing happened" and produce an empty/minimal response. The gateway stream consumer then considers the response "already sent" (from pre-tool streaming) and silently drops it — leaving the user staring at silence. Three changes: 1. Include the timeout message in the output field (both local and remote paths) so the model always has visible content to relay to the user. 2. Add periodic activity callbacks to the local execution polling loop so the gateway's inactivity monitor knows execute_code is alive during long runs. 3. Fix stream_consumer._send_fallback_final to not silently drop content when the continuation appears empty but the final text differs from what was previously streamed (e.g. after a tool boundary reset).	2026-04-16 06:42:45 -07:00
Teknium	3b5572ded3	fix(stream-consumer): only confirm final delivery on successful best-effort send The cancellation handler previously promoted any partial send (already_sent=True) to final_response_sent=True unconditionally. This meant if intermediate text (e.g. 'Let me search…') was streamed and the consumer was cancelled before delivering the actual answer, the gateway's suppression check would still prevent the fallback send. Now final_response_sent is only set in the cancellation path when: - The best-effort send of accumulated content actually succeeded, OR - It was already confirmed before cancellation Companion fix for PR #11000's run.py changes — closes the cancellation-path loophole that would otherwise let partial streams suppress final delivery during queued follow-ups.	2026-04-16 05:53:18 -07:00
LehaoLin	d4eba82a37	fix(streaming): don't suppress final response when commentary message is sent Commentary messages (interim assistant status updates like "Using browser tool...") are sent via _send_commentary(), which was incorrectly setting _already_sent = True on success. This caused the final response to be suppressed when there were multiple tool calls, because the gateway checks already_sent to decide whether to skip re-sending the response. The fix: commentary messages are interim status updates, not the final response, so _already_sent should not be set when they succeed. This ensures the final response is always delivered regardless of how many commentary messages were sent during the turn. Fixes: #10454	2026-04-15 15:00:58 -07:00
Teknium	b4fcec6412	fix: prevent streaming cursor from appearing as standalone messages (#9538 ) During rapid tool-calling, the model often emits 1-2 tokens before switching to tool calls. The stream consumer would create a new message with 'X ▉' (short text + cursor), and if the follow-up edit to strip the cursor was rate-limited by the platform, the cursor remained as a permanent standalone message — reported on Telegram as 'white box' artifacts. Add a minimum-content guard in _send_or_edit: when creating a new standalone message (no existing message_id), require at least 4 visible characters alongside the cursor before sending. Shorter text accumulates into the next streaming segment instead. This prevents cursor-only 'tofu' messages across all platforms without affecting normal streaming (edits to existing messages, final sends without cursor, and messages with substantial text are all unaffected). Reported by @michalkomar on X.	2026-04-14 01:52:42 -07:00
Teknium	3de2b98503	fix(streaming): filter <think> blocks from gateway stream consumer Models like MiniMax emit inline <think>...</think> reasoning blocks in their content field. The CLI already suppresses these via a state machine in _stream_delta, but the gateway's GatewayStreamConsumer had no equivalent filtering — raw think blocks were streamed directly to Discord/Telegram/Slack. The fix adds a _filter_and_accumulate() method that mirrors the CLI's approach: a state machine tracks whether we're inside a reasoning block and silently discards the content. Includes the same block-boundary check (tag must appear at line start or after whitespace-only prefix) to avoid false positives when models mention <think> in prose. Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>, <reasoning>, <REASONING_SCRATCHPAD>. Also handles edge cases: - Tags split across streaming deltas (partial tag buffering) - Unclosed blocks (content suppressed until stream ends) - Multiple consecutive blocks - _flush_think_buffer on stream end for held-back partial tags Adds 22 unit tests + 1 integration test covering all scenarios.	2026-04-13 22:16:20 -07:00
Teknium	0cc7f79016	fix(streaming): prevent duplicate Telegram replies when stream task is cancelled (#9319 ) When the 5-second stream_task timeout in gateway/run.py expires (due to slow Telegram API calls from rate limiting after several messages), the stream consumer is cancelled via asyncio.CancelledError. The CancelledError handler did a best-effort final edit but never set final_response_sent, so the gateway fell through to the normal send path and delivered the full response again as a reply — causing a duplicate. The fix: in the CancelledError handler, set final_response_sent = True when already_sent is True (i.e., the stream consumer had already delivered content to the user). This tells the gateway's already_sent check that the response was delivered, preventing the duplicate send. Adds two tests verifying the cancellation behavior: - Cancelled with already_sent=True → final_response_sent=True (no dup) - Cancelled with already_sent=False → final_response_sent=False (normal send path proceeds) Reported by community user hume on Discord.	2026-04-13 19:22:43 -07:00
helix4u	0ffb6f2dae	fix(matrix): skip cursor-only stream placeholder messages	2026-04-13 16:31:02 -07:00
asheriif	97b0cd51ee	feat(gateway): surface natural mid-turn assistant messages in chat platforms Add display.interim_assistant_messages config (enabled by default) that forwards completed assistant commentary between tool calls to the user as separate chat messages. Models already emit useful status text like 'I'll inspect the repo first.' — this surfaces it on Telegram, Discord, and other messaging platforms instead of swallowing it. Independent from tool_progress and gateway streaming. Disabled for webhooks. Uses GatewayStreamConsumer when available, falls back to direct adapter send. Tracks response_previewed to prevent double-delivery when interim message matches the final response. Also fixes: cursor not stripped from fallback prefix in stream consumer (affected continuation calculation on no-edit platforms like Signal). Cherry-picked from PR #7885 by asheriif, default changed to enabled. Fixes #5016	2026-04-11 16:21:39 -07:00
Teknium	d7607292d9	fix(streaming): adaptive backoff + cursor strip to prevent message truncation (#7683 ) Telegram flood control during streaming caused messages to be cut off mid-response. The old behavior permanently disabled edits after a single flood-control failure, losing the remainder of the response. Changes: - Adaptive backoff: on flood-control edit failures, double the edit interval instead of immediately disabling edits. Only permanently disable after 3 consecutive failures (_MAX_FLOOD_STRIKES). - Cursor strip: when entering fallback mode, best-effort edit to remove the cursor (▉) from the last visible message so it doesn't appear stuck. - Fallback send retry: _send_fallback_final retries each chunk once on flood-control failures (3s delay) before giving up. - Default edit_interval increased from 0.3s to 1.0s. Telegram rate-limits edits at ~1/s per message; 0.3s was virtually guaranteed to trigger flood control on any non-trivial response. - _send_or_edit returns bool so the overflow split loop knows not to truncate accumulated text when an edit fails (prevents content loss). Fixes: messages cutting/stopping mid-response on Telegram, especially with streaming enabled.	2026-04-11 10:28:15 -07:00
KUSH42	5dea7e1ebc	fix(gateway): prevent duplicate messages on no-message-id platforms Platforms that don't return a message_id after the first send (Signal, GitHub webhooks) were causing GatewayStreamConsumer to re-enter the "first send" path on every tool boundary, posting one platform message per tool call (observed as 155 PR comments on a single response). Fix: treat _message_id == "__no_edit__" as a sentinel meaning "platform accepted the send but cannot be edited". When a tool boundary arrives in that state, skip the message_id/accumulated/last_sent_text reset so all continuation text is delivered once via _send_fallback_final rather than re-posted per segment. Also make prompt_toolkit imports in hermes_cli/commands.py optional so gateway and test environments that lack the package can still import resolve_command, gateway_help_lines, and COMMAND_REGISTRY.	2026-04-10 03:52:00 -07:00
dangelo352	aed9b90ae3	fix(stream_consumer): handle overflow when no message exists yet The overflow split loop required _message_id to be set, but on the first streamed message (or after a segment break) _message_id is None. Oversized text fell through to _send_or_edit → adapter.send(), which split internally — but subsequent edits hit Telegram's 'message too long' and were silently truncated with '…', cutting off the response. Add a new code path for the _message_id is None case that uses truncate_message() (same as the non-streaming path) to split with proper word/code-fence boundaries and chunk indicators. Each chunk is sent as a new message via _send_new_chunk(). Properly handles got_done (returns immediately after sending chunks instead of continuing into an infinite loop) and got_segment_break. Original cherry-picked from PR #6816 by dangelo352. Fixes silent message truncation on Telegram for long streamed responses.	2026-04-09 15:07:21 -07:00
Teknium	e26393ffc2	fix: Signal duplicate replies with streaming + per-platform tool_progress (#6348 ) Fixes #4647 — Signal replies duplicated when gateway streaming is enabled. Root cause: stream_consumer.py did not handle the case where send() returns success=True but no message_id (Signal behavior). Every stream delta produced a separate send() call (7+ messages instead of 2), plus the gateway sent another full duplicate since already_sent was never set. Changes: - stream_consumer.py: Add elif branch for success-without-message_id — enters fallback mode (sets already_sent, disables editing, sends only continuation) - signal.py send(): Extract timestamp from signal-cli RPC result as message_id so stream consumer follows normal edit→fallback path - signal.py: Add public stop_typing() delegating to _stop_typing_indicator() so base adapter's _keep_typing finally block can clean up typing tasks - gateway/run.py: Per-platform tool_progress_overrides (#6164) — lets users set e.g. signal: off while keeping telegram: all - hermes_cli/config.py: Add tool_progress_overrides to DEFAULT_CONFIG Refs: #4647, #6164	2026-04-08 17:39:45 -07:00
landy	383db35925	fix: improve streaming fallback after edit failures	2026-04-08 03:33:43 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Teknium	8dee82ea1e	fix: stream consumer creates new message after tool boundaries (#5739 ) When streaming was enabled on the gateway, the stream consumer created a single message at the start and kept editing it as tokens arrived. Tool progress messages were sent as separate messages below it. Since edits don't change message position on Telegram/Matrix/Discord, the final response ended up stuck above all tool progress messages — users had to scroll up past potentially dozens of tool call lines to read the answer. The agent already sends stream_delta_callback(None) at tool boundaries (before _execute_tool_calls). The stream consumer was ignoring this signal. Now it treats None as a segment break: finalizes the current message (removes cursor), resets _message_id, and the next text chunk creates a fresh message below the tool progress messages. Timeline before: [msg 1: 'Let me search...' → edits → 'Here is the answer'] ← top [msg 2: tool progress lines] ← bottom Timeline after: [msg 1: 'Let me search...'] ← top [msg 2: tool progress lines] [msg 3: 'Here is the answer'] ← bottom (visible) Reported by SkyLinx on Discord.	2026-04-06 23:00:14 -07:00
Teknium	c8220e69a1	fix: strip MEDIA: directives from streamed gateway messages (#5152 ) When streaming is enabled, the GatewayStreamConsumer sends raw text chunks directly to the platform without post-processing. This causes MEDIA:/path/to/file tags and [[audio_as_voice]] directives to appear as visible text in the user's chat instead of being stripped. The non-streaming path already handles this correctly via extract_media() in base.py, but the streaming path was missing equivalent cleanup. Add _clean_for_display() to GatewayStreamConsumer that strips MEDIA: tags and internal markers before any text reaches the platform. The actual media file delivery is unaffected — _deliver_media_from_response() in gateway/run.py still extracts files from the agent's final_response (separate from the stream consumer's display text). Reported by Ao [FotM] on Discord.	2026-04-04 19:05:27 -07:00
kshitijk4poor	28380e7aed	fix(gateway): STT config resolution, stream consumer flood control fallback Three targeted fixes from user-reported issues: 1. STT config resolution (transcription_tools.py): _has_openai_audio_backend() and _resolve_openai_audio_client_config() now check stt.openai.api_key/base_url in config.yaml FIRST, before falling back to env vars. Fixes voice transcription breaking when using a custom OpenAI-compatible endpoint via config.yaml. 2. Stream consumer flood control fallback (stream_consumer.py): When an edit fails mid-stream (e.g., Telegram flood control returns failure for waits >5s), reset _already_sent to False so the normal final send path delivers the complete response. Previously, a truncated partial was left as the final message. 3. Telegram edit_message comment alignment (telegram.py): Clarify that long flood waits return failure so streaming can fall back to a normal final send.	2026-04-03 00:50:17 -07:00
Teknium	2fa33dde81	fix: handle message length overflow in streaming mode (#1783 ) Stream consumer now splits messages that exceed the platform's MAX_MESSAGE_LENGTH. When accumulated text grows past the safe limit, the current message is finalized and a new message is started for the overflow — same as how normal sends chunk long responses. Split point prefers line boundaries (rfind newline) for clean breaks. Works for all platforms (Telegram 4096, Discord 2000, etc.) by reading the adapter's MAX_MESSAGE_LENGTH at runtime. Also added a safety net in the Telegram adapter: if edit_message_text still hits MESSAGE_TOO_LONG (e.g. markdown formatting expansion), it truncates and returns success so the stream consumer doesn't die. Co-authored-by: Test <test@test.com>	2026-03-17 11:00:52 -07:00
Teknium	7ac9088d5c	fix: Telegram streaming — config bridge, not-modified, flood control (#1782 ) * fix: NameError in OpenCode provider setup (prompt_text -> prompt) The OpenCode Zen and OpenCode Go setup sections used prompt_text() which is undefined. All other providers correctly use the local prompt() function defined in setup.py. Fixes crash during 'hermes setup' when selecting either OpenCode provider. * fix: Telegram streaming — config bridge, not-modified, flood control Three fixes for gateway streaming: 1. Bridge streaming config from config.yaml into gateway runtime. load_gateway_config() now reads the 'streaming' key from config.yaml (same pattern as session_reset, stt, etc.), matching the docs. Previously only gateway.json was read. 2. Handle 'Message is not modified' in Telegram edit_message(). This Telegram API error fires when editing with identical content — a no-op, not a real failure. Previously it returned success=False which made the stream consumer disable streaming entirely. 3. Handle RetryAfter / flood control in Telegram edit_message(). Fast providers can hit Telegram rate limits during streaming. Now waits the requested retry_after duration and retries once, instead of treating it as a fatal edit failure. Also fixed double-edit on stream finish: the consumer now tracks last-sent text and skips redundant edits, preventing the not-modified error at the source. * refactor: make config.yaml the primary gateway config source Eliminates the per-key bridge pattern in load_gateway_config(). Previously gateway.json was the primary source and each config.yaml key needed an individual bridge — easy to forget (streaming was missing, causing garl4546's bug). Now config.yaml is read first and its keys are mapped directly into the GatewayConfig.from_dict() schema. gateway.json is kept as a legacy fallback layer (loaded first, then overwritten by config.yaml keys). If gateway.json exists, a log message suggests migrating. Also: - Removed dead save_gateway_config() (never called anywhere) - Updated CLI help text and send_message error to reference config.yaml instead of gateway.json --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:51:54 -07:00
teknium1	25a1f1867f	fix(gateway): prevent message flooding on adapters without edit support When the stream consumer's first edit_message() call fails (Signal, Email, HomeAssistant don't support editing), it now disables editing for the rest of the stream instead of falling back to sending a new message every 0.3 seconds. The final response is delivered by the normal send path since already_sent stays false. Without this fix, enabling gateway streaming on Signal/Email/HA would flood the chat with dozens of partial messages.	2026-03-16 12:41:28 -07:00
teknium1	5479bb0e0c	feat(gateway): streaming token delivery — StreamingConfig, GatewayStreamConsumer, already_sent Stage 3 of streaming support. Gateway now streams tokens to messaging platforms: - StreamingConfig dataclass (enabled, transport, edit_interval, buffer_threshold, cursor) on GatewayConfig with from_dict/to_dict serialization - GatewayStreamConsumer: async queue-based consumer that progressively edits a single message on the target platform (edit transport) - on_delta() → queue → run() async task → send_or_edit() with rate limiting - already_sent propagation: when streaming delivered the response, handler returns None so base adapter skips duplicate send() - stream_delta_callback wired into AIAgent constructor in _run_agent - Consumer lifecycle: started as asyncio task, awaited with timeout in finally Config (config.yaml): streaming: enabled: true transport: edit # progressive editMessageText edit_interval: 0.3 # seconds between edits buffer_threshold: 40 # chars before forcing flush cursor: ' ▉' Credit: jobless0x (#774, #1312), OutThisLife (#798), clicksingh (#697).	2026-03-16 05:52:42 -07:00

38 commits