hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Tranquil-Flow	b668c09ab2	fix(gateway): strip cursor from frozen message on empty fallback continuation (#7183). When _send_fallback_final() is called with nothing new to deliver (the visible partial already matches final_text), the last edit may still show the cursor character because fallback mode was entered after a failed edit. Before this fix, the early-return path left _already_sent = True without attempting to strip the cursor, so the message stayed frozen with a visible ▉ permanently. Adds a best-effort edit inside the empty-continuation branch to clean the cursor off the last-sent text. Harmless when fallback mode wasn't actually armed or when the cursor isn't present. If the strip edit itself fails (flood still active), we return without crashing and without corrupting _last_sent_text. Adapted from PR #7429 onto current main: the surrounding fallback block grew the #10807 stale-prefix handling since #7429 was written, so the cursor strip lives in the new else-branch where we still return early. 3 unit tests covering: cursor stripped on empty continuation, no edit attempted when cursor is not configured, cursor-strip edit failure handled without crash. Originally proposed as PR #7429.	2026-04-19 01:51:12 -07:00
konsisumer	1d1e1277e4	fix(gateway): flush undelivered tail before segment reset to preserve streamed text (#8124). When a streaming edit fails mid-stream (flood control, transport error) and a tool boundary arrives before the fallback threshold is reached, the pre-boundary tail in `_accumulated` was silently discarded by `_reset_segment_state`. The user saw a frozen partial message and missing words on the other side of the tool call. Flush the undelivered tail as a continuation message before the reset, computed relative to the last successfully delivered prefix so we don't duplicate content the user already saw.	2026-04-19 01:43:04 -07:00
pedh	4459913f40	feat(dingtalk): AI Cards streaming, emoji reactions, and media handling. Cherry-picked from #10985 by pedh, adapted to current main: * Keeps main's full group-chat gating (require_mention + allowed_users + free_response_chats + mention_patterns); the PR's simpler subset is dropped. * Keeps main's fire-and-forget process() dispatch + session_webhook fallback for SDK >= 0.24. * Picks up the PR's REQUIRES_EDIT_FINALIZE capability flag on BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through stream_consumer. Default False so Telegram/Slack/Discord/Matrix stay on the zero-overhead fast path. * DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow (tool-progress + final response) with sibling auto-close driven by reply_to, idempotent 🤔Thinking → 🥳Done swap, alibabacloud-dingtalk for media URL resolution (replaces raw HTTP that was 403-ing). * pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 + alibabacloud-dingtalk>=2.0.0 + qrcode. Closes #10991. Co-authored-by: pedh	2026-04-17 19:26:53 -07:00
Siddharth Balyan	d38b73fa57	fix(matrix): E2EE and migration bugfixes (#10860 ) * - make buffered streaming - fix path naming to expand `~` for agent. - fix stripping of matrix ID to not remove other mentions / localports. * fix(matrix): register MembershipEventDispatcher for invite auto-join The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE events are only dispatched when MembershipEventDispatcher is registered on the client. Without it, _on_invite is dead code and the bot silently ignores all room invites. Closes #10094 Closes #10725 Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz) * fix(matrix): preserve _joined_rooms reference for CryptoStateStore connect() reassigned self._joined_rooms = set(...) after initial sync, orphaning the reference captured by _CryptoStateStore at init time. find_shared_rooms() returned [] forever, breaking Megolm session rotation on membership changes. Mutate in place with clear() + update() so the CryptoStateStore reference stays valid. Refs #8174, PR #8215 * fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race mautrix auto-registers DecryptionDispatcher when client.crypto is set. The adapter also registered _on_encrypted_event for the same event type. _on_encrypted_event had zero awaits and won the race to mark event IDs in the dedup set, causing _on_room_message to drop successfully decrypted events from DecryptionDispatcher. The retry loop masked this by re-decrypting every message ~4 seconds later. Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption; genuinely undecryptable events are logged by mautrix and retried on next key exchange. Refs #8174, PR #8215 * fix(matrix): re-verify device keys after share_keys() upload Matrix homeservers treat ed25519 identity keys as immutable per device. share_keys() can return 200 but silently ignore new keys if the device already exists with different identity keys. 
The bot would proceed with shared=True while peers encrypt to the old (unreachable) keys. Now re-queries the server after share_keys() and fails closed if keys don't match, with an actionable error message. Refs #8174, PR #8215 * fix(matrix): encrypt outbound attachments in E2EE rooms _upload_and_send() uploaded raw bytes and used the 'url' key for all rooms. In E2EE rooms, media must be encrypted client-side with encrypt_attachment(), the ciphertext uploaded, and the 'file' key (with key/iv/hashes) used instead of 'url'. Now detects encrypted rooms via state_store.is_encrypted() and branches to the encrypted upload path. Refs: PR #9822 (charles-brooks) * fix(matrix): add stop_typing to clear typing indicator after response The adapter set a 30-second typing timeout but never cleared it. The base class stop_typing() is a no-op, so the typing indicator lingered for up to 30 seconds after each response. Closes #6016 Refs: PR #6020 (r266-tech) * fix(matrix): cache all media types locally, not just photos/voice should_cache_locally only covered PHOTO, VOICE, and encrypted media. Unencrypted audio/video/documents in plaintext rooms were passed as MXC URLs that require authentication the agent doesn't have, resulting in 401 errors. Refs #3487, #3806 * fix(matrix): detect stale OTK conflict on startup and fail closed When crypto state is wiped but the same device ID is reused, the homeserver may still hold one-time keys signed with the previous identity key. Identity key re-upload succeeds but OTK uploads fail with "already exists" and a signature mismatch. Peers cannot establish new Olm sessions, so all new messages are undecryptable. Now proactively flushes OTKs via share_keys() during connect() and catches the "already exists" error with an actionable log message telling the operator to purge the device from the homeserver or generate a fresh device ID. Also documents the crypto store recovery procedure in the Matrix setup guide. 
Refs #8174 * docs(matrix): improve crypto recovery docs per review - Put easy path (fresh access token) first, manual purge second - URL-encode user ID in Synapse admin API example - Note that device deletion may invalidate the access token - Add "stop Synapse first" caveat for direct SQLite approach - Mention the fail-closed startup detection behavior - Add back-reference from upgrade section to OTK warning * refactor(matrix): cleanup from code review - Extract _extract_server_ed25519() and _reverify_keys_after_upload() to deduplicate the re-verification block (was copy-pasted in two places, three copies of ed25519 key extraction total) - Remove dead code: _pending_megolm, _retry_pending_decryptions, _MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after removing _on_encrypted_event - Remove tautological TestMediaCacheGate (tested its own predicate, not production code) - Remove dead TestMatrixMegolmEventHandling and TestMatrixRetryPendingDecryptions (tested removed methods) - Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator - Trim comment to just the "why"	2026-04-17 04:03:02 +05:30
konsisumer	3e3ec35a5e	fix: surface execute_code timeout to user instead of silently dropping (#10807). When execute_code times out, the result JSON had status="timeout" and an error field, but the output field was empty. Many models treat empty output as "nothing happened" and produce an empty/minimal response. The gateway stream consumer then considers the response "already sent" (from pre-tool streaming) and silently drops it, leaving the user staring at silence. Three changes: 1. Include the timeout message in the output field (both local and remote paths) so the model always has visible content to relay to the user. 2. Add periodic activity callbacks to the local execution polling loop so the gateway's inactivity monitor knows execute_code is alive during long runs. 3. Fix stream_consumer._send_fallback_final to not silently drop content when the continuation appears empty but the final text differs from what was previously streamed (e.g. after a tool boundary reset).	2026-04-16 06:42:45 -07:00
Teknium	3b5572ded3	fix(stream-consumer): only confirm final delivery on successful best-effort send The cancellation handler previously promoted any partial send (already_sent=True) to final_response_sent=True unconditionally. This meant if intermediate text (e.g. 'Let me search…') was streamed and the consumer was cancelled before delivering the actual answer, the gateway's suppression check would still prevent the fallback send. Now final_response_sent is only set in the cancellation path when: - The best-effort send of accumulated content actually succeeded, OR - It was already confirmed before cancellation Companion fix for PR #11000's run.py changes — closes the cancellation-path loophole that would otherwise let partial streams suppress final delivery during queued follow-ups.	2026-04-16 05:53:18 -07:00
LehaoLin	d4eba82a37	fix(streaming): don't suppress final response when commentary message is sent Commentary messages (interim assistant status updates like "Using browser tool...") are sent via _send_commentary(), which was incorrectly setting _already_sent = True on success. This caused the final response to be suppressed when there were multiple tool calls, because the gateway checks already_sent to decide whether to skip re-sending the response. The fix: commentary messages are interim status updates, not the final response, so _already_sent should not be set when they succeed. This ensures the final response is always delivered regardless of how many commentary messages were sent during the turn. Fixes: #10454	2026-04-15 15:00:58 -07:00
Teknium	b4fcec6412	fix: prevent streaming cursor from appearing as standalone messages (#9538). During rapid tool-calling, the model often emits 1-2 tokens before switching to tool calls. The stream consumer would create a new message with 'X ▉' (short text + cursor), and if the follow-up edit to strip the cursor was rate-limited by the platform, the cursor remained as a permanent standalone message, reported on Telegram as 'white box' artifacts. Add a minimum-content guard in _send_or_edit: when creating a new standalone message (no existing message_id), require at least 4 visible characters alongside the cursor before sending. Shorter text accumulates into the next streaming segment instead. This prevents cursor-only 'tofu' messages across all platforms without affecting normal streaming (edits to existing messages, final sends without cursor, and messages with substantial text are all unaffected). Reported by @michalkomar on X.	2026-04-14 01:52:42 -07:00
Teknium	3de2b98503	fix(streaming): filter <think> blocks from gateway stream consumer Models like MiniMax emit inline <think>...</think> reasoning blocks in their content field. The CLI already suppresses these via a state machine in _stream_delta, but the gateway's GatewayStreamConsumer had no equivalent filtering — raw think blocks were streamed directly to Discord/Telegram/Slack. The fix adds a _filter_and_accumulate() method that mirrors the CLI's approach: a state machine tracks whether we're inside a reasoning block and silently discards the content. Includes the same block-boundary check (tag must appear at line start or after whitespace-only prefix) to avoid false positives when models mention <think> in prose. Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>, <reasoning>, <REASONING_SCRATCHPAD>. Also handles edge cases: - Tags split across streaming deltas (partial tag buffering) - Unclosed blocks (content suppressed until stream ends) - Multiple consecutive blocks - _flush_think_buffer on stream end for held-back partial tags Adds 22 unit tests + 1 integration test covering all scenarios.	2026-04-13 22:16:20 -07:00
Teknium	0cc7f79016	fix(streaming): prevent duplicate Telegram replies when stream task is cancelled (#9319). When the 5-second stream_task timeout in gateway/run.py expires (due to slow Telegram API calls from rate limiting after several messages), the stream consumer is cancelled via asyncio.CancelledError. The CancelledError handler did a best-effort final edit but never set final_response_sent, so the gateway fell through to the normal send path and delivered the full response again as a reply, causing a duplicate. The fix: in the CancelledError handler, set final_response_sent = True when already_sent is True (i.e., the stream consumer had already delivered content to the user). This tells the gateway's already_sent check that the response was delivered, preventing the duplicate send. Adds two tests verifying the cancellation behavior: - Cancelled with already_sent=True → final_response_sent=True (no dup) - Cancelled with already_sent=False → final_response_sent=False (normal send path proceeds) Reported by community user hume on Discord.	2026-04-13 19:22:43 -07:00
helix4u	0ffb6f2dae	fix(matrix): skip cursor-only stream placeholder messages	2026-04-13 16:31:02 -07:00
asheriif	97b0cd51ee	feat(gateway): surface natural mid-turn assistant messages in chat platforms Add display.interim_assistant_messages config (enabled by default) that forwards completed assistant commentary between tool calls to the user as separate chat messages. Models already emit useful status text like 'I'll inspect the repo first.' — this surfaces it on Telegram, Discord, and other messaging platforms instead of swallowing it. Independent from tool_progress and gateway streaming. Disabled for webhooks. Uses GatewayStreamConsumer when available, falls back to direct adapter send. Tracks response_previewed to prevent double-delivery when interim message matches the final response. Also fixes: cursor not stripped from fallback prefix in stream consumer (affected continuation calculation on no-edit platforms like Signal). Cherry-picked from PR #7885 by asheriif, default changed to enabled. Fixes #5016	2026-04-11 16:21:39 -07:00
Teknium	d7607292d9	fix(streaming): adaptive backoff + cursor strip to prevent message truncation (#7683). Telegram flood control during streaming caused messages to be cut off mid-response. The old behavior permanently disabled edits after a single flood-control failure, losing the remainder of the response. Changes: - Adaptive backoff: on flood-control edit failures, double the edit interval instead of immediately disabling edits. Only permanently disable after 3 consecutive failures (_MAX_FLOOD_STRIKES). - Cursor strip: when entering fallback mode, best-effort edit to remove the cursor (▉) from the last visible message so it doesn't appear stuck. - Fallback send retry: _send_fallback_final retries each chunk once on flood-control failures (3s delay) before giving up. - Default edit_interval increased from 0.3s to 1.0s. Telegram rate-limits edits at ~1/s per message; 0.3s was virtually guaranteed to trigger flood control on any non-trivial response. - _send_or_edit returns bool so the overflow split loop knows not to truncate accumulated text when an edit fails (prevents content loss). Fixes: messages cutting/stopping mid-response on Telegram, especially with streaming enabled.	2026-04-11 10:28:15 -07:00
KUSH42	5dea7e1ebc	fix(gateway): prevent duplicate messages on no-message-id platforms Platforms that don't return a message_id after the first send (Signal, GitHub webhooks) were causing GatewayStreamConsumer to re-enter the "first send" path on every tool boundary, posting one platform message per tool call (observed as 155 PR comments on a single response). Fix: treat _message_id == "__no_edit__" as a sentinel meaning "platform accepted the send but cannot be edited". When a tool boundary arrives in that state, skip the message_id/accumulated/last_sent_text reset so all continuation text is delivered once via _send_fallback_final rather than re-posted per segment. Also make prompt_toolkit imports in hermes_cli/commands.py optional so gateway and test environments that lack the package can still import resolve_command, gateway_help_lines, and COMMAND_REGISTRY.	2026-04-10 03:52:00 -07:00
dangelo352	aed9b90ae3	fix(stream_consumer): handle overflow when no message exists yet The overflow split loop required _message_id to be set, but on the first streamed message (or after a segment break) _message_id is None. Oversized text fell through to _send_or_edit → adapter.send(), which split internally — but subsequent edits hit Telegram's 'message too long' and were silently truncated with '…', cutting off the response. Add a new code path for the _message_id is None case that uses truncate_message() (same as the non-streaming path) to split with proper word/code-fence boundaries and chunk indicators. Each chunk is sent as a new message via _send_new_chunk(). Properly handles got_done (returns immediately after sending chunks instead of continuing into an infinite loop) and got_segment_break. Original cherry-picked from PR #6816 by dangelo352. Fixes silent message truncation on Telegram for long streamed responses.	2026-04-09 15:07:21 -07:00
Teknium	e26393ffc2	fix: Signal duplicate replies with streaming + per-platform tool_progress (#6348 ) Fixes #4647 — Signal replies duplicated when gateway streaming is enabled. Root cause: stream_consumer.py did not handle the case where send() returns success=True but no message_id (Signal behavior). Every stream delta produced a separate send() call (7+ messages instead of 2), plus the gateway sent another full duplicate since already_sent was never set. Changes: - stream_consumer.py: Add elif branch for success-without-message_id — enters fallback mode (sets already_sent, disables editing, sends only continuation) - signal.py send(): Extract timestamp from signal-cli RPC result as message_id so stream consumer follows normal edit→fallback path - signal.py: Add public stop_typing() delegating to _stop_typing_indicator() so base adapter's _keep_typing finally block can clean up typing tasks - gateway/run.py: Per-platform tool_progress_overrides (#6164) — lets users set e.g. signal: off while keeping telegram: all - hermes_cli/config.py: Add tool_progress_overrides to DEFAULT_CONFIG Refs: #4647, #6164	2026-04-08 17:39:45 -07:00
landy	383db35925	fix: improve streaming fallback after edit failures	2026-04-08 03:33:43 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of 
x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Teknium	8dee82ea1e	fix: stream consumer creates new message after tool boundaries (#5739 ) When streaming was enabled on the gateway, the stream consumer created a single message at the start and kept editing it as tokens arrived. Tool progress messages were sent as separate messages below it. Since edits don't change message position on Telegram/Matrix/Discord, the final response ended up stuck above all tool progress messages — users had to scroll up past potentially dozens of tool call lines to read the answer. The agent already sends stream_delta_callback(None) at tool boundaries (before _execute_tool_calls). The stream consumer was ignoring this signal. Now it treats None as a segment break: finalizes the current message (removes cursor), resets _message_id, and the next text chunk creates a fresh message below the tool progress messages. Timeline before: [msg 1: 'Let me search...' → edits → 'Here is the answer'] ← top [msg 2: tool progress lines] ← bottom Timeline after: [msg 1: 'Let me search...'] ← top [msg 2: tool progress lines] [msg 3: 'Here is the answer'] ← bottom (visible) Reported by SkyLinx on Discord.	2026-04-06 23:00:14 -07:00
Teknium	c8220e69a1	fix: strip MEDIA: directives from streamed gateway messages (#5152). When streaming is enabled, the GatewayStreamConsumer sends raw text chunks directly to the platform without post-processing. This causes MEDIA:/path/to/file tags and [[audio_as_voice]] directives to appear as visible text in the user's chat instead of being stripped. The non-streaming path already handles this correctly via extract_media() in base.py, but the streaming path was missing equivalent cleanup. Add _clean_for_display() to GatewayStreamConsumer that strips MEDIA: tags and internal markers before any text reaches the platform. The actual media file delivery is unaffected: _deliver_media_from_response() in gateway/run.py still extracts files from the agent's final_response (separate from the stream consumer's display text). Reported by Ao [FotM] on Discord.	2026-04-04 19:05:27 -07:00
kshitijk4poor	28380e7aed	fix(gateway): STT config resolution, stream consumer flood control fallback Three targeted fixes from user-reported issues: 1. STT config resolution (transcription_tools.py): _has_openai_audio_backend() and _resolve_openai_audio_client_config() now check stt.openai.api_key/base_url in config.yaml FIRST, before falling back to env vars. Fixes voice transcription breaking when using a custom OpenAI-compatible endpoint via config.yaml. 2. Stream consumer flood control fallback (stream_consumer.py): When an edit fails mid-stream (e.g., Telegram flood control returns failure for waits >5s), reset _already_sent to False so the normal final send path delivers the complete response. Previously, a truncated partial was left as the final message. 3. Telegram edit_message comment alignment (telegram.py): Clarify that long flood waits return failure so streaming can fall back to a normal final send.	2026-04-03 00:50:17 -07:00
Teknium	2fa33dde81	fix: handle message length overflow in streaming mode (#1783). Stream consumer now splits messages that exceed the platform's MAX_MESSAGE_LENGTH. When accumulated text grows past the safe limit, the current message is finalized and a new message is started for the overflow, the same way normal sends chunk long responses. Split point prefers line boundaries (rfind newline) for clean breaks. Works for all platforms (Telegram 4096, Discord 2000, etc.) by reading the adapter's MAX_MESSAGE_LENGTH at runtime. Also added a safety net in the Telegram adapter: if edit_message_text still hits MESSAGE_TOO_LONG (e.g. markdown formatting expansion), it truncates and returns success so the stream consumer doesn't die. Co-authored-by: Test <test@test.com>	2026-03-17 11:00:52 -07:00
Teknium	7ac9088d5c	fix: Telegram streaming — config bridge, not-modified, flood control (#1782 ) * fix: NameError in OpenCode provider setup (prompt_text -> prompt) The OpenCode Zen and OpenCode Go setup sections used prompt_text() which is undefined. All other providers correctly use the local prompt() function defined in setup.py. Fixes crash during 'hermes setup' when selecting either OpenCode provider. * fix: Telegram streaming — config bridge, not-modified, flood control Three fixes for gateway streaming: 1. Bridge streaming config from config.yaml into gateway runtime. load_gateway_config() now reads the 'streaming' key from config.yaml (same pattern as session_reset, stt, etc.), matching the docs. Previously only gateway.json was read. 2. Handle 'Message is not modified' in Telegram edit_message(). This Telegram API error fires when editing with identical content — a no-op, not a real failure. Previously it returned success=False which made the stream consumer disable streaming entirely. 3. Handle RetryAfter / flood control in Telegram edit_message(). Fast providers can hit Telegram rate limits during streaming. Now waits the requested retry_after duration and retries once, instead of treating it as a fatal edit failure. Also fixed double-edit on stream finish: the consumer now tracks last-sent text and skips redundant edits, preventing the not-modified error at the source. * refactor: make config.yaml the primary gateway config source Eliminates the per-key bridge pattern in load_gateway_config(). Previously gateway.json was the primary source and each config.yaml key needed an individual bridge — easy to forget (streaming was missing, causing garl4546's bug). Now config.yaml is read first and its keys are mapped directly into the GatewayConfig.from_dict() schema. gateway.json is kept as a legacy fallback layer (loaded first, then overwritten by config.yaml keys). If gateway.json exists, a log message suggests migrating. 
Also: - Removed dead save_gateway_config() (never called anywhere) - Updated CLI help text and send_message error to reference config.yaml instead of gateway.json --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:51:54 -07:00
teknium1	25a1f1867f	fix(gateway): prevent message flooding on adapters without edit support When the stream consumer's first edit_message() call fails (Signal, Email, HomeAssistant don't support editing), it now disables editing for the rest of the stream instead of falling back to sending a new message every 0.3 seconds. The final response is delivered by the normal send path since already_sent stays false. Without this fix, enabling gateway streaming on Signal/Email/HA would flood the chat with dozens of partial messages.	2026-03-16 12:41:28 -07:00
teknium1	5479bb0e0c	feat(gateway): streaming token delivery — StreamingConfig, GatewayStreamConsumer, already_sent Stage 3 of streaming support. Gateway now streams tokens to messaging platforms: - StreamingConfig dataclass (enabled, transport, edit_interval, buffer_threshold, cursor) on GatewayConfig with from_dict/to_dict serialization - GatewayStreamConsumer: async queue-based consumer that progressively edits a single message on the target platform (edit transport) - on_delta() → queue → run() async task → send_or_edit() with rate limiting - already_sent propagation: when streaming delivered the response, handler returns None so base adapter skips duplicate send() - stream_delta_callback wired into AIAgent constructor in _run_agent - Consumer lifecycle: started as asyncio task, awaited with timeout in finally Config (config.yaml): streaming: enabled: true transport: edit # progressive editMessageText edit_interval: 0.3 # seconds between edits buffer_threshold: 40 # chars before forcing flush cursor: ' ▉' Credit: jobless0x (#774, #1312), OutThisLife (#798), clicksingh (#697).	2026-03-16 05:52:42 -07:00
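
The StreamingConfig described in commit 5479bb0e0c can be pictured as a small dataclass. The following is a minimal sketch, not the repository's code: the field names and the 0.3 / 40 / ' ▉' values come from the commit message (commit d7607292d9 later raises the default edit_interval to 1.0), while the `enabled=False` default and the way `from_dict` ignores unknown keys are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class StreamingConfig:
    """Illustrative stand-in for the gateway's streaming settings."""

    enabled: bool = False          # assumed default; the commit does not state it
    transport: str = "edit"        # progressive message edits
    edit_interval: float = 0.3     # seconds between edits
    buffer_threshold: int = 40     # characters buffered before forcing a flush
    cursor: str = " ▉"             # appended while the message is still streaming

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "StreamingConfig":
        # Assumption: unknown keys are ignored so other config versions still load.
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        return cls(**known)

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


# The "streaming" mapping as it would look once config.yaml has been parsed.
parsed = {
    "enabled": True,
    "transport": "edit",
    "edit_interval": 0.3,
    "buffer_threshold": 40,
    "cursor": " ▉",
}
config = StreamingConfig.from_dict(parsed)
```

Round-tripping through `to_dict()` returns the same mapping, which is the property the commit's "from_dict/to_dict serialization" implies.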

25 commits
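
The adaptive backoff described in commit d7607292d9 (double the edit interval on each flood-control failure, disable edits only after three consecutive failures) can be sketched as follows. This is a hedged illustration rather than the actual stream consumer: `EditBackoff` and its method names are hypothetical, only `_MAX_FLOOD_STRIKES` and the 1.0 s default are taken from the commit message, and resetting the strike counter on a successful edit is an assumption drawn from the word "consecutive".

```python
from dataclasses import dataclass

_MAX_FLOOD_STRIKES = 3  # name and value taken from the commit message


@dataclass
class EditBackoff:
    """Hypothetical helper tracking edit pacing for one streamed message."""

    edit_interval: float = 1.0   # default interval named in the commit
    flood_strikes: int = 0
    edits_disabled: bool = False

    def record_success(self) -> None:
        # Assumption: a successful edit breaks the run of consecutive failures.
        self.flood_strikes = 0

    def record_flood_failure(self) -> None:
        self.flood_strikes += 1
        if self.flood_strikes >= _MAX_FLOOD_STRIKES:
            # Third consecutive failure: stop editing and fall back to plain sends.
            self.edits_disabled = True
        else:
            # Earlier failures only slow the stream down.
            self.edit_interval *= 2


backoff = EditBackoff()
backoff.record_flood_failure()   # interval 1.0 -> 2.0
backoff.record_flood_failure()   # interval 2.0 -> 4.0
backoff.record_flood_failure()   # third strike: edits disabled
```

Slowing down before giving up keeps the remainder of a long response deliverable through edits, which is the truncation problem the commit set out to fix.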