diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md
index 9a673bc7f8..97cb9f0b59 100644
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@@ -832,6 +832,7 @@ display:
   resume_display: full  # full (show previous messages on resume) | minimal (one-liner only)
   bell_on_complete: false  # Play terminal bell when agent finishes (great for long tasks)
   show_reasoning: false  # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
+  streaming: false  # Stream tokens to terminal as they arrive (real-time output)
   background_process_notifications: all  # all | result | error | off (gateway only)
 ```
 
@@ -884,6 +885,36 @@ voice:
 
 Use `/voice on` in the CLI to enable microphone mode, `record_key` to start/stop recording, and `/voice tts` to toggle spoken replies. See [Voice Mode](/docs/user-guide/features/voice-mode) for end-to-end setup and platform-specific behavior.
 
+## Streaming
+
+Stream tokens to the terminal or messaging platforms as they arrive, instead of waiting for the full response.
+
+### CLI Streaming
+
+```yaml
+display:
+  streaming: true  # Stream tokens to terminal in real-time
+  show_reasoning: true  # Also stream reasoning/thinking tokens (optional)
+```
+
+When enabled, responses appear token-by-token inside a streaming box. Tool calls are still captured silently. If the provider doesn't support streaming, it falls back to the normal display automatically.
+
+### Gateway Streaming (Telegram, Discord, Slack)
+
+```yaml
+streaming:
+  enabled: true  # Enable progressive message editing
+  edit_interval: 0.3  # Seconds between message edits
+  buffer_threshold: 40  # Characters before forcing an edit flush
+  cursor: " ▉"  # Cursor shown during streaming
+```
+
+When enabled, the bot sends a message on the first token, then progressively edits it as more tokens arrive. Platforms that don't support message editing (Signal, Email) gracefully skip streaming and deliver the final response normally.
+
+:::note
+Streaming is disabled by default. Enable it in `~/.hermes/config.yaml` to try the streaming UX.
+:::
+
 ## Group Chat Session Isolation
 
 Control whether shared chats keep one conversation per room or one conversation per participant:
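
Review note on the Gateway Streaming text added in this diff: the docs describe sending a message on the first token and then editing it progressively, governed by `edit_interval`, `buffer_threshold`, and `cursor`. How those settings interact is not spelled out, so the following is only a minimal sketch of one plausible reading, assuming an edit is flushed when either the interval has elapsed or the buffered character count reaches the threshold. The `StreamingEditor` class, its methods, and the `edits` list (standing in for platform send/edit calls) are hypothetical names for illustration and are not taken from the project's code.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StreamingEditor:
    """Hypothetical helper: accumulates tokens and decides when to edit the message."""

    edit_interval: float = 0.3   # seconds between message edits (documented setting)
    buffer_threshold: int = 40   # characters before forcing an edit flush (documented setting)
    cursor: str = " ▉"           # cursor shown during streaming (documented setting)
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    edits: List[str] = field(default_factory=list)  # stand-in for send/edit API calls
    _text: str = ""
    _pending: int = 0
    _last_edit: float = 0.0

    def feed(self, token: str) -> None:
        """Add a token; emit an edit on the first token or when a limit is hit."""
        self._text += token
        self._pending += len(token)
        now = self.clock()
        is_first = not self.edits
        interval_due = now - self._last_edit >= self.edit_interval
        buffer_full = self._pending >= self.buffer_threshold
        # Assumption: either condition alone is enough to trigger an edit.
        if is_first or interval_due or buffer_full:
            self._emit(self._text + self.cursor, now)

    def finish(self) -> None:
        """Emit the final text without the cursor."""
        self._emit(self._text, self.clock())

    def _emit(self, body: str, now: float) -> None:
        self.edits.append(body)
        self._pending = 0
        self._last_edit = now
```

If the real gateway treats `buffer_threshold` differently (for example as a minimum size before any edit rather than a forced flush), the condition in `feed` would change accordingly; it may be worth stating the exact rule in the docs.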