docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497)

Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026, roughly 1,080 PRs). The audit found ~50 user-visible features that had landed in code with no docs footprint, plus a handful of stale pages. This PR closes every gap the scan turned up.

New pages
- user-guide/features/deliverable-mode.md — extension list, agent triggers, kanban_complete artifacts pattern, [[as_document]] override (PR #27813).
- developer-guide/web-search-provider-plugin.md — authoring guide modeled on image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448).

Providers / auth
- Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the display label shows up; provider id stays `alibaba` (PR #24835).
- Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs #28116 / #28118 / #28119).
- Document Nous JWT minting from refresh token + invalid-refresh quarantine + cross-profile shared token store (PRs #27663 / #19712).
- Add `## Microsoft Entra ID authentication (keyless)` section to azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic routing details (PR #28101 / #9df9816da).
- Custom providers `api_mode` is now prompted-and-persisted, not just URL autodetected (PR #25068).
- Delegation honours `api_mode` + auto-detects anthropic_messages base URLs (PR #26824).
- `x_search` auto-enables when xAI credentials are present (PR #27376).
- Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR #26534).
- NVIDIA NIM billing-origin header is set automatically (PR #26585).

Windows / installer
- `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus the BOM-strip / git-retry hardening (PR #28169).
- Document Hermes Desktop thin installer + first-launch bootstrap (PR #27822).
- Document `dep_ensure` Windows bootstrap (PR #27845).
- Document install-method auto-detection (pip / git / homebrew / nixos) and the matching update command (PR #27843).

Gateway / messaging
- `/platform list|pause|resume` full description + circuit-breaker semantics (PR #26600).
- Slack / Matrix / Mattermost get parallel `allowed_channels` / `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk (PR #21251).
- Discord `allow_any_attachment` + `max_attachment_bytes` (config and env vars) (PR #27245).
- Discord clarify-choice button rendering (PR #25485).
- Telegram `guest_mode` @mention bypass for allowlisted groups (PR #22759).
- Telegram `notifications` mode (`important` vs `all`) (PR #22793).
- `[[as_document]]` skill / response directive for forcing document-style media delivery (PR #21210).

CLI / TUI
- `/new [name]` argument (PR #19637).
- `/subgoal` user-supplied criteria appended to `/goal` (PR #25449).
- `/exit --delete` flag confirmation prompts for destructive slash commands (PR #22687).
- Status-bar additions: ▶ N background indicator (PR #27175), context compression count (PR #21218), YOLO mode banner+statusbar warning (PR #26238).
- `display.timestamps` + `docker_extra_args` config keys (PR #23599).
- TUI collapsible startup banner sections (PR #20625).
- `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847).

i18n
- Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja, de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches `agent/i18n.py:SUPPORTED_LANGUAGES`.

Tools / features
- `vision_analyze` native-pixel passthrough for vision-capable callers, with auxiliary text-describer fallback (PR #22955).
- `session_search` rewrite to the single-shape tool (discovery / scroll / browse modes) (PRs #27590 / #27840).
- Clarify MCP transport scope: client supports stdio + SSE; embedded `hermes mcp serve` is stdio-only (PR #21227).
- Web search backends table: add Brave Search (free tier) and DDGS rows (PR #21337).
- ACP session-scoped edit auto-approval modes (PR #27862).
- Curator rename map in the user-visible per-run summary (PR #22910).
- Prompt caching feature page reference in features/overview.md — Claude cross-session 1-hour prefix cache on native Anthropic / OpenRouter / Nous Portal (PR #23828).
- Cron per-job profile parameter (PR #28124).
- `--no-skills` flag for `hermes profile create` (PR #20986).

Build
- Verified with `npm run build` in `website/`; both `en` and `zh-Hans` locales compile. Remaining broken-link/anchor warnings are pre-existing (`rl-training.md` from learning-path / overview; the zh-Hans translation lag the docs skill already calls out).
2026-07-15 14:22:43 +00:00 · 2026-05-18 23:55:25 -07:00 · 2026-05-18 23:55:25 -07:00 · eacce70a35
commit eacce70a35
parent 1335ce996d
37 changed files with 901 additions and 26 deletions
--- a/website/docs/user-guide/cli.md
+++ b/website/docs/user-guide/cli.md
@ -68,9 +68,12 @@ A persistent status bar sits above the input area, updating in real time:
 | Token count | Context tokens used / max context window |
 | Context bar | Visual fill indicator with color-coded thresholds |
 | Cost | Estimated session cost (or `n/a` for unknown/zero-priced models) |
+| 🗜️ N | **Context compression count** — how many times the running session has been auto-compressed. Appears once the first compression fires. |
+| ▶ N | **Active background tasks** — how many `/background` prompts are still running in the current session. Appears whenever at least one task is in flight. |
 | Duration | Elapsed session time |
+| ⚠ YOLO | **YOLO mode warning** — shown whenever `HERMES_YOLO_MODE` is on (either `hermes --yolo` at launch or `/yolo` toggled mid-session). Mirrors the banner-line warning so you can't forget you're in auto-approve mode. |

-The bar adapts to terminal width — full layout at ≥ 76 columns, compact at 52–75, minimal (model + duration only) below 52.
+The bar adapts to terminal width — full layout at ≥ 76 columns, compact at 52–75, minimal (model + duration, plus the YOLO badge when active) below 52.

 **Context color coding:**

@ -125,6 +128,8 @@ Common examples:
 | `/voice tts` | Toggle spoken playback for Hermes replies |
 | `/reasoning high` | Increase reasoning effort |
 | `/title My Session` | Name the current session |
+| `/status` | Show session info — model/profile/tokens/duration — followed by a local **Session recap** block (recent turn counts, top tools used, files touched, latest user prompt + assistant reply). Pure local compute; no LLM call. |
+| `/sessions` | Open an interactive session picker right inside the classic CLI (same surface the TUI uses). Type to filter, arrow keys to navigate, Enter to resume. |

 For the full built-in CLI and messaging lists, see [Slash Commands Reference](../reference/slash-commands.md).
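
The width thresholds described for the status bar in `cli.md` (full at 76 columns or more, compact at 52 to 75, minimal below 52) can be sketched as a small tier lookup. This is only an illustration: `status_bar_layout` and `minimal_fields` are hypothetical names, and the field labels are assumptions rather than identifiers from the Hermes source.

```python
def status_bar_layout(columns: int) -> str:
    """Pick a layout tier from the terminal width (hypothetical helper)."""
    if columns >= 76:
        return "full"
    if columns >= 52:
        return "compact"
    return "minimal"


def minimal_fields(yolo_active: bool) -> list:
    """Fields kept in the minimal layout: model + duration, plus YOLO if on."""
    fields = ["model", "duration"]
    if yolo_active:
        fields.append("yolo")  # assumed label for the YOLO badge
    return fields
```

Keeping the YOLO badge in the narrowest tier matches the stated intent that the auto-approve warning should never disappear from view.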

--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@ -140,6 +140,9 @@ terminal:
  docker_volumes:                  # Host directory mounts
    - "/home/user/projects:/workspace/projects"
    - "/home/user/data:/data:ro"   # :ro for read-only
+  docker_extra_args:               # Extra flags appended verbatim to `docker run`
+    - "--gpus=all"
+    - "--network=host"

  # Resource limits
  container_cpu: 1                 # CPU cores (0 = unlimited)
@ -148,6 +151,8 @@ terminal:
  container_persistent: true       # Persist /workspace and /root across sessions
 ```

+**`terminal.docker_extra_args`** (also overridable via `TERMINAL_DOCKER_EXTRA_ARGS='["--gpus=all"]'`) lets you pass arbitrary `docker run` flags that Hermes doesn't surface as first-class keys — `--gpus`, `--network`, `--add-host`, alternative `--security-opt` overrides, etc. Each entry must be a string; the list is appended last to the assembled `docker run` invocation so it can override Hermes' defaults if needed. Use sparingly — flags that conflict with the sandbox hardening (capability drops, `--user`, the workspace bind mount) will silently weaken isolation.
+
 **Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed.

 **Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it).
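
The `terminal.docker_extra_args` paragraph says each entry must be a string and that the list lands after Hermes' own flags so it can override them. A minimal sketch of that behaviour, assuming the env var carries a JSON list as in the example value; `parse_extra_args` and `build_docker_run` are hypothetical helpers, and the exact argv layout Hermes assembles is not shown in this diff.

```python
import json


def parse_extra_args(raw: str) -> list:
    """Parse a JSON list such as '["--gpus=all"]' and validate every entry."""
    value = json.loads(raw) if raw.strip() else []
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("docker_extra_args must be a list of strings")
    return value


def build_docker_run(default_flags, extra_args, image, command):
    """Assemble an argv with extra flags after the defaults (assumed layout)."""
    # Flags must precede the image name, so "last" here means last among flags.
    return ["docker", "run", *default_flags, *extra_args, image, *command]
```

Because a later flag can take precedence over an earlier one, placing the user list after the defaults is what makes the override (and the isolation risk the paragraph warns about) possible.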
@ -762,6 +767,16 @@ credential_pool_strategies:

 Options: `fill_first` (default), `round_robin`, `least_used`, `random`. See [Credential Pools](/docs/user-guide/features/credential-pools) for full documentation.

+## Prompt caching
+
+Hermes turns on cross-session prompt caching automatically when the active provider supports it — no user config needed.
+
+For Claude on **native Anthropic**, **OpenRouter**, and **Nous Portal**, Hermes attaches `cache_control` breakpoints with the 1-hour TTL (`ttl: "1h"`) on the system prompt and skill blocks. The first send within a fresh hour pays full input rates; subsequent sends across any session within the same hour pull from the cache at the discounted cached-read rate. This means the system prompt, loaded skill content, and the early portion of any long-context include get reused across `hermes` sessions and across forked subagents for the first hour.
+
+The Qwen Cloud (Alibaba DashScope) upstream caps cache TTL at 5 minutes, so Hermes uses the 5-minute breakpoint TTL there instead. Other Claude-via-third-party paths (AWS Bedrock, Azure Foundry) fall back to the provider's own caching defaults. xAI Grok uses a separate session-pinned conversation-id mechanism — see [xAI prompt caching](/docs/integrations/providers#xai-grok--responses-api--prompt-caching).
+
+No knob exists to disable this — caching is always-on and saves money even on single-turn conversations because the system prompt alone is a meaningful fraction of the input token count.
+
 ## Auxiliary Models

 Hermes uses "auxiliary" models for side tasks like image analysis, web page summarization, browser screenshot analysis, session-title generation, and context compression. By default (`auxiliary.*.provider: "auto"`), Hermes routes every auxiliary task to your **main chat model** — the same provider/model you picked in `hermes model`. You don't need to configure anything to get started, but be aware that on expensive reasoning models (Opus, MiniMax M2.7, etc.) auxiliary tasks add meaningful cost. If you want cheap-and-fast side tasks regardless of your main model, set `auxiliary.<task>.provider` and `auxiliary.<task>.model` explicitly (for example, Gemini Flash on OpenRouter for vision and web extraction).
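
For the prompt-caching section added above, the per-provider TTL choice might look roughly like the sketch below. The `cache_control` shape (`{"type": "ephemeral", "ttl": "1h"}`) follows Anthropic's Messages API; the provider ids other than `alibaba`, and both function names, are assumptions made for illustration.

```python
# Assumed provider ids; only "alibaba" is confirmed by the commit message.
ONE_HOUR_PROVIDERS = {"anthropic", "openrouter", "nous"}


def cache_ttl_for(provider: str):
    """Return the breakpoint TTL for a provider, or None for provider defaults."""
    if provider in ONE_HOUR_PROVIDERS:
        return "1h"
    if provider == "alibaba":  # DashScope upstream caps the TTL at 5 minutes
        return "5m"
    return None


def system_block(text: str, provider: str) -> dict:
    """Build a system-prompt content block, adding a cache breakpoint if any."""
    block = {"type": "text", "text": text}
    ttl = cache_ttl_for(provider)
    if ttl is not None:
        block["cache_control"] = {"type": "ephemeral", "ttl": ttl}
    return block
```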
@ -1168,12 +1183,13 @@ display:
  show_reasoning: false   # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
  streaming: false        # Stream tokens to terminal as they arrive (real-time output)
  show_cost: false        # Show estimated $ cost in the CLI status bar
+  timestamps: false       # When true, prefixes user and assistant labels with [HH:MM] timestamps in the CLI / TUI transcript
  tool_preview_length: 0  # Max chars for tool call previews (0 = no limit, show full paths/commands)
  runtime_footer:         # Gateway: append a runtime-context footer to final replies
    enabled: false
    fields: ["model", "context_pct", "cwd"]
  file_mutation_verifier: true    # Append an advisory footer when write_file/patch calls failed this turn
-  language: en            # UI language for static messages (approval prompts, some gateway replies). en | zh | ja | de | es | fr | tr | uk
+  language: en            # UI language for static messages (approval prompts, some gateway replies). en | zh | zh-hant | ja | de | es | fr | tr | uk | af | ko | it | ga | pt | ru | hu
 ```

 ### File-mutation verifier
--- a/website/docs/user-guide/docker.md
+++ b/website/docs/user-guide/docker.md
@ -196,6 +196,10 @@ docker run -it --rm \

 Direct `-e` flags override values from `.env`. This is useful for CI/CD or secrets-manager integrations where you don't want keys on disk.

+:::note Looking for Docker as the **terminal backend**?
+This page covers running Hermes itself inside Docker. If you want Hermes to execute the agent's `terminal` / `execute_code` calls inside a Docker sandbox container (one persistent container per Hermes process), that's a separate config block — `terminal.backend: docker` plus `terminal.docker_image`, `terminal.docker_volumes`, `terminal.docker_forward_env`, `terminal.docker_run_as_host_user`, and `terminal.docker_extra_args`. See [Configuration → Docker Backend](configuration.md#docker-backend) for the full set.
+:::
+
 ## Docker Compose example

 For persistent deployment with both the gateway and dashboard, a `docker-compose.yaml` is convenient:
--- a/website/docs/user-guide/features/acp.md
+++ b/website/docs/user-guide/features/acp.md
@ -218,6 +218,21 @@ Dangerous terminal commands can be routed back to the editor as approval prompts

 On timeout or error, the approval bridge denies the request.

+### Session-scoped edit auto-approval
+
+ACP exposes a third tier between *allow once* and *allow always*: **Allow for session**. Picking it from the editor's permission prompt records the approval inside the current ACP session only — every subsequent matching command in that session goes through without prompting, but a new ACP session (or restarting the editor) resets the slate and re-prompts the first time.
+
+| Option | Editor label | Scope | Persisted across restarts |
+|---|---|---|---|
+| `allow_once` | Allow once | This one tool call | No |
+| `allow_session` | Allow for session | All matching calls in this ACP session | No — cleared when the session ends |
+| `allow_always` | Allow always | All future sessions | Yes (written to the Hermes permanent allowlist) |
+| `deny` | Deny | This one tool call | No |
+
+`allow_session` is the right default for an editor workflow where you trust an agent for the duration of a task but don't want to grant a long-lived allowlist entry. The safety trade-off is straightforward: the broader the scope, the less the editor will interrupt you, and the more damage a misbehaving agent (or prompt injection) can do before you notice. Start with `allow_once` for unfamiliar commands; promote to `allow_session` once you've seen the agent run the same pattern correctly a few times; reserve `allow_always` for truly idempotent commands you trust forever (e.g. `git status`).
+
+The ACP bridge maps these options onto Hermes' internal approval semantics — `allow_always` writes a permanent allowlist entry the same way the CLI does, while `allow_session` only affects the in-process approval cache for the current ACP session.
+
 ## Troubleshooting

 ### ACP agent does not appear in the editor
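
The four permission options in the ACP table can be modelled as two allowlists with different lifetimes. This is a sketch under stated assumptions: `ApprovalCache` is a hypothetical class, commands are matched by exact string, and the real bridge's matching rules are not described in this diff.

```python
class ApprovalCache:
    """Toy model of once / session / always approval scopes."""

    def __init__(self, permanent=None):
        self.permanent = set(permanent or ())  # survives restarts
        self.session = set()                   # cleared when the session ends

    def is_approved(self, pattern: str) -> bool:
        return pattern in self.permanent or pattern in self.session

    def record(self, pattern: str, option: str) -> bool:
        """Apply the editor's choice; return whether this call may proceed."""
        if option == "allow_always":
            self.permanent.add(pattern)
            return True
        if option == "allow_session":
            self.session.add(pattern)
            return True
        if option == "allow_once":
            return True  # nothing is remembered
        if option == "deny":
            return False
        raise ValueError(f"unknown option: {option}")

    def end_session(self) -> None:
        self.session.clear()
```

Only `allow_always` touches state that outlives the session, which is why the table marks it as the single persisted option.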
--- a/website/docs/user-guide/features/cron.md
+++ b/website/docs/user-guide/features/cron.md
@ -121,6 +121,35 @@ When `workdir` is set:
 Jobs with a `workdir` run sequentially on the scheduler tick, not in the parallel pool. This is deliberate — `TERMINAL_CWD` is process-global, so two workdir jobs running at the same time would corrupt each other's cwd. Workdir-less jobs still run in parallel as before.
 :::

+## Running cron jobs in a specific profile
+
+By default a cron job inherits whichever Hermes profile owned the gateway / CLI that created it. Pass `--profile <name>` (CLI) or `profile=` (cronjob tool) to re-target the job at a different profile — the scheduler resolves that profile's `HERMES_HOME`, temporarily switches into it for the duration of the run, loads its `.env` + `config.yaml`, and executes the job there:
+
+```bash
+# Pin a job to the `night-ops` profile regardless of where it was scheduled
+hermes cron create "every 1d at 03:00" \
+  "Tail the security log and flag anomalies" \
+  --profile night-ops
+```
+
+```python
+# From a chat, via the cronjob tool
+cronjob(
+    action="create",
+    schedule="every 1d at 03:00",
+    prompt="Tail the security log and flag anomalies",
+    profile="night-ops",
+)
+```
+
+Use `--profile default` to explicitly pin to the root Hermes profile. The named profile must already exist; the scheduler refuses to create profiles on the fly. To clear a profile pin during `cron edit`, pass an empty string (`--profile ""` or `profile=""`) — the job reverts to running in whatever profile the scheduler itself is in.
+
+If the pinned profile is later deleted, the scheduler logs a warning and falls back to running the job in its current profile rather than crashing — so a stale `profile` reference never wedges a job.
+
+:::note Serialization
+Jobs with a `profile` set also run sequentially, for the same reason as `workdir`-pinned jobs: switching `HERMES_HOME` is a process-global mutation, so two profile-pinned jobs running in parallel would race each other. Unpinned jobs still run in the normal parallel pool.
+:::
+
 ## Editing jobs

 You do not need to delete and recreate jobs just to change them.
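
The serialization notes for `workdir` and `profile` pins amount to splitting due jobs into two groups before each tick. A minimal sketch, assuming jobs are plain dicts with optional `workdir` and `profile` keys; `partition_jobs` is a hypothetical helper, not the scheduler's actual function.

```python
def partition_jobs(jobs):
    """Split jobs into (serial, parallel) based on process-global pins."""
    serial, parallel = [], []
    for job in jobs:
        # A non-empty workdir or profile mutates process-global state,
        # so such jobs must not overlap with each other.
        if job.get("workdir") or job.get("profile"):
            serial.append(job)
        else:
            parallel.append(job)
    return serial, parallel
```

An empty `profile` string is falsy here, which lines up with the documented way of clearing a pin during `cron edit`.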
--- a/website/docs/user-guide/features/curator.md
+++ b/website/docs/user-guide/features/curator.md
@ -217,6 +217,10 @@ Every curator run writes a timestamped directory under `~/.hermes/logs/curator/`

 `REPORT.md` is a quick way to see what a given run did — which skills transitioned, what the LLM reviewer said, which skills it patched. Good for auditing without having to grep `agent.log`.

+### Rename map in the summary
+
+If a run consolidated multiple skills under an umbrella (or merged near-duplicates), the user-visible summary printed at the end of the run includes an explicit rename map showing every `old-name → new-name` pair the curator applied. This is in addition to per-skill transition lines, so when a wave of renames lands you can spot them at a glance without diffing the JSON report. The hint also surfaces under `hermes curator pin` so you can pin the umbrella name immediately if you want to lock the new label in.
+
 ## Restoring an archived skill

 If the curator archived something you still want:
--- a/website/docs/user-guide/features/delegation.md
+++ b/website/docs/user-guide/features/delegation.md
@ -268,6 +268,7 @@ delegation:
  # orchestrator_enabled: true              # Disable to force all children to leaf role.
  model: "google/gemini-3-flash-preview"             # Optional provider/model override
  provider: "openrouter"                             # Optional built-in provider
+  api_mode: anthropic_messages                       # optional; auto-detected from base_url for anthropic_messages endpoints

 # Or use a direct custom endpoint instead of provider:
 delegation:
@ -277,6 +278,8 @@ delegation:
  # api_mode: "anthropic_messages"  # Optional. Wire protocol override for base_url ("chat_completions", "codex_responses", or "anthropic_messages"). Empty = auto-detect from URL (e.g. /anthropic suffix). Set explicitly for endpoints the heuristic can't classify (Azure AI Foundry, MiniMax, Zhipu GLM, LiteLLM proxies, …).
 ```

+When `base_url` points at an Anthropic-compatible endpoint — for example a path ending in `/anthropic`, an Azure Foundry Claude route, or a MiniMax `/anthropic` proxy — `api_mode` is auto-detected as `anthropic_messages` so the subagent uses the right wire format without you setting anything. Set `api_mode` explicitly when the auto-detection guess is wrong (rare).
+
 :::tip
 The agent handles delegation automatically based on the task complexity. You don't need to explicitly ask it to delegate — it will do so when it makes sense.
 :::
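
The `api_mode` auto-detection described for delegation can be approximated by a URL-suffix check. The sketch below covers only the `/anthropic` path case named in the text; the fallback to `chat_completions` and the name `guess_api_mode` are assumptions, and the real heuristic (for example for Azure Foundry routes) is not shown in this diff.

```python
from urllib.parse import urlparse


def guess_api_mode(base_url: str, explicit: str = "") -> str:
    """Return the wire protocol for base_url; an explicit value always wins."""
    if explicit:
        return explicit
    path = urlparse(base_url).path.rstrip("/")
    if path.endswith("/anthropic"):
        return "anthropic_messages"
    return "chat_completions"  # assumed default when nothing matches
```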
--- a/website/docs/user-guide/features/goals.md
+++ b/website/docs/user-guide/features/goals.md
@ -47,6 +47,21 @@ What you'll see:

 Works identically on the CLI and every gateway platform (Telegram, Discord, Slack, Matrix, Signal, WhatsApp, SMS, iMessage, Webhook, API server, and the web dashboard).

+## Adding criteria mid-goal: `/subgoal`
+
+While a goal is active you can append extra acceptance criteria with `/subgoal <text>` without resetting the loop. Each call adds one numbered item to the goal's subgoal list; the **continuation prompt** the agent sees on the next turn includes the original goal plus an "Additional criteria the user added mid-loop" block, and the **judge prompt** is rewritten so the verdict must consider every subgoal — the goal isn't marked done until the original objective **and** every subgoal are met.
+
+| Command | What it does |
+|---|---|
+| `/subgoal <text>` | Append a new criterion to the active goal. Requires an active `/goal`. |
+| `/subgoal` (no args) | Show the current numbered subgoal list. |
+| `/subgoal remove <N>` | Remove the Nth subgoal (1-based). |
+| `/subgoal clear` | Drop every subgoal but keep the original goal intact. |
+
+Subgoals are persisted alongside the goal in `SessionDB.state_meta`, so they survive `/resume`. Setting a new `/goal <text>` replaces the goal and clears the subgoal list; `/goal clear` does the same.
+
+Use this when you start a loop ("fix the failing tests") and notice partway through that you also want the agent to "add a regression test for the bug you just patched". Running `/subgoal add a regression test` tightens the success criteria without breaking the running loop.
+
 ## Behavior details

 ### The judge
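
The `/subgoal` commands map naturally onto a numbered list attached to the goal. A minimal sketch, assuming an in-memory `Goal` class (hypothetical; the docs say the real data lives in `SessionDB.state_meta`) and an assumed prompt layout around the quoted "Additional criteria the user added mid-loop" heading.

```python
class Goal:
    """Toy model of a goal with user-added subgoals."""

    def __init__(self, text: str):
        self.text = text
        self.subgoals = []

    def add(self, text: str) -> None:
        self.subgoals.append(text)

    def remove(self, n: int) -> str:
        """Remove the Nth subgoal, counting from 1."""
        if not 1 <= n <= len(self.subgoals):
            raise IndexError(f"no subgoal {n}")
        return self.subgoals.pop(n - 1)

    def clear(self) -> None:
        self.subgoals.clear()  # the original goal text is untouched

    def continuation_prompt(self) -> str:
        if not self.subgoals:
            return self.text
        items = "\n".join(f"{i}. {s}" for i, s in enumerate(self.subgoals, 1))
        return f"{self.text}\n\nAdditional criteria the user added mid-loop:\n{items}"
```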
--- a/website/docs/user-guide/features/mcp.md
+++ b/website/docs/user-guide/features/mcp.md
@ -127,6 +127,30 @@ mcp_servers:
      Authorization: "Bearer ***"
 ```

+## Built-in presets
+
+For well-known MCP servers, `hermes mcp add` accepts a `--preset` flag that fills in the transport details so you don't have to look up the command and args. The preset only supplies defaults — anything else (env vars, headers, filtering) you pass on the same command line still wins.
+
+| Preset | What it wires up |
+|---|---|
+| `codex` | The Codex CLI's MCP server (`codex mcp-server` over stdio). Requires the `codex` CLI on PATH. |
+
+```bash
+# Add Codex CLI as an MCP server in one line
+hermes mcp add codex --preset codex
+```
+
+That writes the equivalent of:
+
+```yaml
+mcp_servers:
+  codex:
+    command: "codex"
+    args: ["mcp-server"]
+```
+
+You can pick any local name (`hermes mcp add my-codex --preset codex` is fine); the preset only provides the `command`/`args` defaults.
+
 ## How Hermes registers MCP tools

 Hermes prefixes MCP tools so they do not collide with built-in names:
@ -554,7 +578,7 @@ The gateway does NOT need to be running for read operations (listing conversatio

 ### Current limits

-- Stdio transport only (no HTTP MCP transport yet)
+- The embedded `hermes mcp serve` exposes a **stdio-only** MCP server today. If you need an HTTP MCP server, run a separate adapter — or, much more commonly, use the MCP **client** side of Hermes, which already speaks both stdio and HTTP (`url` + `headers` in `mcp_servers.yaml` / `config.yaml`; see [HTTP servers](#http-servers) above).
 - Event polling at ~200ms intervals via mtime-optimized DB polling (skips work when files are unchanged)
 - No `claude/channel` push notification protocol yet
 - Text-only sends (no media/attachment sending through `messages_send`)
--- a/website/docs/user-guide/features/overview.md
+++ b/website/docs/user-guide/features/overview.md
@ -39,6 +39,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
 - **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
 - **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
 - **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
+- **[Prompt caching](../configuration#prompt-caching)** — Built-in cross-session 1-hour prefix cache for Claude on native Anthropic, OpenRouter, and Nous Portal. Always-on; no configuration required.
 - **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, Supermemory) for cross-session user modeling and personalization beyond the built-in memory system.
 - **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
 - **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.
--- a/website/docs/user-guide/features/skills.md
+++ b/website/docs/user-guide/features/skills.md
@ -107,6 +107,35 @@ platforms: [macos, linux]     # macOS and Linux

 When set, the skill is automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms. If omitted, the skill loads on all platforms.

+## Skill output and media delivery
+
+When a skill response (or any agent response) includes a bare absolute path to a media file — for example `/home/user/screenshots/diagram.png` — the gateway auto-detects it, strips it from the visible text, and delivers the file natively to the user's chat (Telegram photo, Discord attachment, etc.) instead of leaving the raw path in the message.
+
+For audio specifically, the `[[audio_as_voice]]` directive promotes audio files to native voice-message bubbles on platforms that support them (Telegram, WhatsApp).
+
+### Forcing document-style delivery: `[[as_document]]`
+
+Sometimes you want the **opposite** of inline preview: you want the file delivered as a downloadable attachment, not a re-compressed image bubble. The classic example is a high-resolution screenshot or chart — Telegram's `sendPhoto` recompresses it to ~200 KB at 1280 px, destroying readability. A 1-2 MB PNG sent via `sendDocument` keeps the original bytes intact.
+
+If a response (or any text inside it — typically the last line) contains the literal directive `[[as_document]]`, every media path extracted from that response is delivered as a document/file attachment rather than an image bubble:
+
+```
+Here is your rendered chart:
+
+/home/user/.hermes/cache/chart-q4-2025.png
+
+[[as_document]]
+```
+
+The directive is stripped before delivery, so users never see it. Granularity is intentionally all-or-nothing per response: emit `[[as_document]]` once and every image path in the same response is delivered as a document. This mirrors the scope of `[[audio_as_voice]]`.
+
+Use it from a skill when:
+
+- You produce screenshots or charts the user needs as files (for editing in another tool, archiving, sharing intact).
+- The default lossy preview would obscure detail (small text, pixel-accurate diagrams, color-sensitive renders).
+
+Platforms without a separate document path (e.g. SMS) fall back to whatever attachment mechanism they have.
+
 ### Conditional Activation (Fallback Skills)

 Skills can automatically show or hide themselves based on which tools are available in the current session. This is most useful for **fallback skills** — free or local alternatives that should only appear when a premium tool is unavailable.
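
The `[[as_document]]` behaviour described in the skills hunk (strip the directive, pull out bare media paths, apply one delivery mode to the whole response) can be sketched as below. The extension list, the one-path-per-line rule, and the name `split_response` are assumptions for illustration; the gateway's real detection logic is not part of this diff.

```python
import re

DIRECTIVE = "[[as_document]]"
# Assumed: a media path is an absolute path alone on its line.
MEDIA_PATH = re.compile(r"^/\S+\.(?:png|jpe?g|gif|webp|pdf)$", re.IGNORECASE)


def split_response(text: str):
    """Return (visible_text, media_paths, as_document) for one response."""
    as_document = DIRECTIVE in text
    text = text.replace(DIRECTIVE, "")  # users never see the directive
    paths, kept = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if MEDIA_PATH.match(stripped):
            paths.append(stripped)
        else:
            kept.append(line)
    visible = re.sub(r"\n{3,}", "\n\n", "\n".join(kept)).strip()
    return visible, paths, as_document
```

The single boolean applies to every extracted path, reflecting the all-or-nothing granularity the docs describe.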
--- a/website/docs/user-guide/features/vision.md
+++ b/website/docs/user-guide/features/vision.md
@ -202,3 +202,9 @@ When a user attaches an image — from the CLI clipboard, the gateway (Telegram/
 You don't configure this — Hermes looks up your current model's capability in the provider metadata and picks the right path automatically. The practical effect: you can switch between vision and non-vision models mid-session and image handling "just works" without changing your workflow. Text-only models get coherent context about the image rather than a broken multimodal payload they'd have to reject.

 Which auxiliary model handles the text-description path is configurable under `auxiliary.vision` — see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models).
+
+### `vision_analyze` has the same dual behavior
+
+The `vision_analyze` tool itself follows the same routing. When the active main model is vision-capable **and** its provider supports image content inside tool results (currently the Anthropic, OpenAI, Azure-OpenAI, and Gemini 3.x stacks), `vision_analyze` short-circuits the auxiliary describer and returns the raw image pixels as a multimodal tool-result envelope. The main model sees the image natively on its next turn — no aux call, no text-summary information loss, no extra latency.
+
+For text-only main models (or providers whose tool-result channel doesn't carry images), `vision_analyze` falls back to the legacy path: it asks the configured auxiliary vision model to describe the image and returns the description as plain text. Either way the calling tool signature is the same — the tool decides which path to take at runtime based on the active model.
--- a/website/docs/user-guide/features/web-search.md
+++ b/website/docs/user-guide/features/web-search.md
@ -20,10 +20,14 @@ Both are configured through a single backend selection. Providers are chosen via
 |----------|---------|--------|---------|-------|-----------|
 | **Firecrawl** (default) | `FIRECRAWL_API_KEY` | ✔ | ✔ | ✔ | 500 credits/mo |
 | **SearXNG** | `SEARXNG_URL` | ✔ | — | — | ✔ Free (self-hosted) |
+| **Brave Search (free tier)** | `BRAVE_SEARCH_API_KEY` | ✔ | — | — | 2 000 queries/mo |
+| **DDGS (DuckDuckGo)** | — (no key) | ✔ | — | — | ✔ Free |
 | **Tavily** | `TAVILY_API_KEY` | ✔ | ✔ | ✔ | 1 000 searches/mo |
 | **Exa** | `EXA_API_KEY` | ✔ | ✔ | — | 1 000 searches/mo |
 | **Parallel** | `PARALLEL_API_KEY` | ✔ | ✔ | — | Paid |

+Brave Search and DDGS are **search-only** — pair either with Firecrawl/Tavily/Exa/Parallel when you also need `web_extract`. DDGS uses the [`ddgs` Python package](https://pypi.org/project/ddgs/) under the hood; if it isn't already installed, run `pip install ddgs` (or let Hermes lazy-install it on first use).
+
 **Per-capability split:** you can use different providers for search and extract independently — for example SearXNG (free) for search and Firecrawl for extract. See [Per-capability configuration](#per-capability-configuration) below.

 :::tip Nous Subscribers
@ -278,7 +282,7 @@ Set one provider for all web capabilities:
 ```yaml
 # ~/.hermes/config.yaml
 web:
-  backend: "searxng"   # firecrawl | searxng | tavily | exa | parallel
+  backend: "searxng"   # firecrawl | searxng | brave-free | ddgs | tavily | exa | parallel
 ```

 ### Per-capability configuration
--- a/website/docs/user-guide/features/x-search.md
+++ b/website/docs/user-guide/features/x-search.md
@ -26,7 +26,7 @@ The tool's `check_fn` runs the xAI credential resolver every time the model's to

 ## Enabling the tool

-Off by default. Enable in `hermes tools`:
+Auto-enables when xAI credentials (OAuth token or `XAI_API_KEY`) are present. Disable explicitly via `hermes tools` → Search → x_search if you don't want this.

 ```bash
 hermes tools
--- a/website/docs/user-guide/messaging/discord.md
+++ b/website/docs/user-guide/messaging/discord.md
@ -634,6 +634,24 @@ When the flag is on, any uploaded file is downloaded, cached under `~/.hermes/ca

 Known-text formats already in the allowlist (`.txt`, `.md`, `.log`) continue to have their contents auto-injected up to 100 KiB; that behavior is unchanged when the flag is on.

+Equivalent env vars: `DISCORD_ALLOW_ANY_ATTACHMENT=true` and `DISCORD_MAX_ATTACHMENT_BYTES=33554432` (or `0` for no cap).
+
+:::warning Memory cost of unlimited
+Disabling the size cap (`max_attachment_bytes: 0`) means a user can drop a multi-GB file on the bot and the gateway will dutifully buffer it through memory while caching to disk. Only set this in trusted single-user installs. For shared bots, keep the default 32 MiB or raise it conservatively.
+:::
+
+## Interactive Prompts (clarify)
+
+When the agent calls the `clarify` tool — to ask which approach you prefer, get post-task feedback, or check before a non-trivial decision — Discord renders the question with **one button per choice**:
+
+> Which framework should I use for the dashboard?
+>
+> [1. Next.js] [2. Remix] [3. Astro] [Other (type answer)]
+
+Click a numbered button to answer, or click **Other** to type a free-form response (the next message you send in that channel becomes the answer). Open-ended `clarify` calls (no preset choices) skip the buttons and just capture your next message.
+
+The buttons disable themselves once a choice is made so duplicate clicks don't double-resolve the prompt. Configure the response timeout via `agent.clarify_timeout` in `~/.hermes/config.yaml` (default `600` seconds). If you don't respond within the timeout, the agent unblocks with a sentinel message and adapts rather than hanging.
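+
+As a minimal sketch of that setting, the fragment below raises the timeout to 15 minutes. The key path follows the `agent.clarify_timeout` name given above; the `900` value is only an illustration.
+
+```yaml
+# ~/.hermes/config.yaml
+agent:
+  clarify_timeout: 900   # seconds; the documented default is 600
+```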
+
 ## Home Channel

 You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:
--- a/website/docs/user-guide/messaging/index.md
+++ b/website/docs/user-guide/messaging/index.md
@ -443,6 +443,84 @@ Each platform has its own toolset:
 | API Server | `hermes-api-server` | Full tools (drops `clarify`, `send_message`, `text_to_speech` — programmatic access doesn't have an interactive user) |
 | Webhooks | `hermes-webhook` | Full tools including terminal |

+## Operating a multi-platform gateway
+
+A gateway typically runs several adapters at once (Telegram + Discord + Slack, etc.). The sections below cover day-2 operations that span all platforms.
+
+### `/platform` command
+
+Once the gateway is running, use the `/platform` slash command from any connected CLI session or chat to inspect and steer individual adapters without restarting the whole gateway:
+
+```
+/platform list                  # show all adapters and their state
+/platform pause <name>          # stop dispatching new messages to one adapter
+/platform resume <name>         # re-enable a paused adapter
+```
+
+`/platform list` shows whether each adapter is `running`, `paused` (manually), or `paused-by-breaker` (see below). Pausing keeps the adapter loaded and its background loops alive: incoming messages are discarded, but the connection itself stays open, so resuming is instant.
+
+See also the broader status summary command [`/platforms`](../../reference/slash-commands.md#info).
+
+### Automatic circuit breaker
+
+Each adapter is wrapped in a circuit breaker. Repeated retryable failures (network blips, rate-limit replies, 5xx upstream responses, websocket disconnects) cause the breaker to trip — the adapter is auto-paused, an operator notification is sent to the home channel of another live platform when one is configured, and a structured log line is emitted.
+
+The breaker does **not** auto-resume — it stays open until you run `/platform resume <name>` manually. This is intentional: if a platform is in a sustained outage, you don't want the gateway thrashing reconnects.
+
+### Where to look when a platform is paused
+
+When an adapter is paused, check:
+
+1. **Gateway log** (`~/.hermes/logs/gateway.log` or the systemd / launchd unit log). Search for the platform name and `circuit breaker`, `paused`, or `disabled`. The trip event includes the failure count and the last error.
+2. **`/platform list`** output — shows the current state and last reason.
+3. **The provider's status page** (Telegram bot API status, Discord status, etc.). If the breaker tripped because the platform itself was unhealthy, don't try to resume until it's back.
+
+Once upstream is healthy, `/platform resume <name>` clears the breaker and re-arms the adapter.
+
+### Restart notifications
+
+When the gateway restarts (or is shut down with in-flight sessions), it can send a one-shot "the agent is back" / "the agent was interrupted" message to each platform's home channel. This is controlled per-platform by the `gateway_restart_notification` flag in `gateway-config.yaml`, which defaults to `true`:
+
+```yaml
+gateway:
+  platforms:
+    telegram:
+      home_chat_id: "123456789"
+      gateway_restart_notification: false   # opt out for this platform
+    discord:
+      home_chat_id: "987654321"
+      # gateway_restart_notification omitted → defaults to true
+```
+
+Disable it on noisy or low-priority platforms while leaving it on for your primary chat. The notification is sent once per restart, regardless of how many sessions were in flight.
+
+### Session resume across gateway restarts
+
+When the gateway shuts down with an in-flight tool call or generation, the affected sessions are flagged as `restart_interrupted`. On the next startup, the gateway schedules an auto-resume for each one — the user gets a short heads-up in the chat ("Send any message after restart and I'll try to resume where you left off.") and the session picks up from the last committed turn when they reply.
+
+This behavior is on by default and is logged at gateway start:
+
+```
+Scheduled auto-resume for N restart-interrupted session(s)
+```
+
+No configuration is required. If you don't want the heads-up, set `gateway_restart_notification: false` on the platform.
+
+### Progress bubble cleanup (opt-in)
+
+Tool-progress messages, the "still working…" heartbeat, and status-callback bubbles can be auto-deleted after the final response lands. Enable per-platform via `display.platforms.<platform>.cleanup_progress`:
+
+```yaml
+display:
+  platforms:
+    telegram:
+      cleanup_progress: true
+    discord:
+      cleanup_progress: true
+```
+
+Defaults to `false`. Only platforms whose adapter implements `delete_message` honor the setting (currently Telegram and Discord). Failed runs **skip** cleanup so the bubbles remain as breadcrumbs.
+
 ## Next Steps

 - [Telegram Setup](telegram.md)
--- a/website/docs/user-guide/messaging/matrix.md
+++ b/website/docs/user-guide/messaging/matrix.md
@ -345,6 +345,34 @@ Add this to your `~/.hermes/.env`:
 MATRIX_HOME_ROOM=!abc123def456:matrix.example.org
 ```

+## Room allowlist (`allowed_rooms`)
+
+Restrict the bot to a fixed set of Matrix rooms. When set, the bot **only** responds in rooms whose ID appears in the list — messages from any other room are silently ignored, even if the bot is mentioned.
+
+**DMs (direct chat rooms) are exempt** from this filter, so authorized users can always reach the bot one-on-one.
+
+```yaml
+matrix:
+  allowed_rooms:
+    - "!abc123def456:matrix.example.org"
+    - "!opsroom789:matrix.example.org"
+```
+
+Or via env var (comma-separated):
+
+```bash
+MATRIX_ALLOWED_ROOMS="!abc123def456:matrix.example.org,!opsroom789:matrix.example.org"
+```
+
+Behavior:
+
+- Empty / unset → no restriction (default).
+- Non-empty → room ID must be on the list. The check runs **before** any other gating (mention requirement, sender allowlist, etc.).
+- Use the room's **internal ID** (`!abc...:server`), not its alias (`#room:server`). You can find a room's internal ID in Element via Room → Settings → Advanced.
+
+See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
+
+
 :::tip
 To find a Room ID: in Element, go to the room → **Settings** → **Advanced** → the **Internal room ID** is shown there (starts with `!`).
 :::
--- a/website/docs/user-guide/messaging/mattermost.md
+++ b/website/docs/user-guide/messaging/mattermost.md
@ -225,6 +225,33 @@ To find a channel ID in Mattermost: open the channel, click the channel name hea

 When the bot is `@mentioned`, the mention is automatically stripped from the message before processing.

+## Channel allowlist (`allowed_channels`)
+
+Restrict the bot to a fixed set of Mattermost channels. When set, the bot **only** responds in channels whose ID appears in the list — messages from any other channel are silently ignored, even if the bot is `@mentioned`.
+
+**DMs are exempt** from this filter, so authorized users can always reach the bot in a direct message.
+
+```yaml
+mattermost:
+  allowed_channels:
+    - "abc123def456ghi789jkl012mno"   # #ops
+    - "xyz987uvw654rst321opq098nml"   # #incident-response
+```
+
+Or via env var (comma-separated):
+
+```bash
+MATTERMOST_ALLOWED_CHANNELS="abc123def456ghi789jkl012mno,xyz987uvw654rst321opq098nml"
+```
+
+Behavior:
+
+- Empty / unset → no restriction (fully backward compatible).
+- Non-empty → channel ID must be on the list, or the message is dropped before any other gating (mention requirement, `MATTERMOST_FREE_RESPONSE_CHANNELS`, etc.) runs.
+- Find a channel ID via the Mattermost UI → channel header → "View Info", or read it from the channel URL.
+
+See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
+
 ## Troubleshooting

 ### Bot is not responding to messages
--- a/website/docs/user-guide/messaging/slack.md
+++ b/website/docs/user-guide/messaging/slack.md
@ -389,6 +389,33 @@ Set this to `true` in busy workspaces where Slack's default "the bot remembers t
 Slack supports both patterns: `@mention` required to start a conversation by default, but you can opt specific channels out via `SLACK_FREE_RESPONSE_CHANNELS` (comma-separated channel IDs) or `slack.free_response_channels` in `config.yaml`. Once the bot has an active session in a thread, subsequent thread replies do not require a mention. In DMs the bot always responds without needing a mention.
 :::

+### Channel allowlist (`allowed_channels`)
+
+Restrict the bot to a fixed set of Slack channels — useful when the bot is invited to many channels but should only respond in a few. When set, messages from channels NOT in this list are **silently ignored**, even if the bot is `@mentioned`.
+
+**DMs are exempt** from this filter, so authorized users can always reach the bot in a direct message.
+
+```yaml
+slack:
+  allowed_channels:
+    - "C0123456789"   # #ops
+    - "C0987654321"   # #incident-response
+```
+
+Or via env var (comma-separated):
+
+```bash
+SLACK_ALLOWED_CHANNELS="C0123456789,C0987654321"
+```
+
+Behavior:
+
+- Empty / unset → no restriction (fully backward compatible).
+- Non-empty → channel ID must be on the list, or the message is dropped before any other gating (mention requirement, `free_response_channels`, etc.) runs.
+- Slack channel IDs typically start with `C` (public), `G` (private), or `D` (DM). Because DMs are exempt from the allowlist, only `C` and `G` IDs are useful here. Look them up via the Slack UI's "Open channel details" → "About" panel, or via the API.
+
+See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
+
 ### Unauthorized User Handling

 ```yaml
--- a/website/docs/user-guide/messaging/telegram.md
+++ b/website/docs/user-guide/messaging/telegram.md
@ -944,6 +944,34 @@ TELEGRAM_GROUP_ALLOWED_USERS="-1001234567890"
 TELEGRAM_GROUP_ALLOWED_CHATS="-1001234567890"
 ```

+### Guest @mention bypass (`guest_mode`)
+
+In a typical setup, `group_allowed_chats` is a hard gate: messages from groups outside the list are silently dropped, even if a member explicitly @mentions the bot. That's the right default for support / team bots.
+
+For more casual setups — friend group chats where you want the bot **mostly silent** but **occasionally available on explicit ping** — enable `guest_mode`:
+
+```yaml
+gateway:
+  platforms:
+    telegram:
+      extra:
+        group_allowed_chats:
+          - "-1001234567890"   # your main allowlisted group
+        guest_mode: true       # non-allowlisted groups: allow on @mention only
+```
+
+Env equivalent:
+
+```bash
+TELEGRAM_GUEST_MODE=true
+```
+
+Default: `false`.
+
+With `guest_mode: true`, a message from a non-allowlisted group is processed **only** if it explicitly @mentions the bot. The mention is required every turn — there's no session stickiness for guest interactions, so the bot never auto-engages in a friend group thread it isn't pinged into.
+
+DMs and allowlisted groups behave exactly as before.
+
 ## Slash Command Access Control

 By default, every allowed user can run every slash command. To split your allowlist into **admins** (full slash command access) and **regular users** (only commands you explicitly enable), add `allow_admin_from` and `user_allowed_commands` to the platform's `extra` block:
@ -1153,6 +1181,32 @@ Tap a button to answer, or tap **Other** to type a free-form response (the next

 Configure the response timeout via `agent.clarify_timeout` in `~/.hermes/config.yaml` (default `600` seconds). If you don't respond within the timeout, the agent unblocks with a sentinel message and adapts rather than hanging.

+## Push notification volume
+
+Telegram fires a push notification on every message the bot sends. For long agent turns that emit tool-progress bubbles, streaming updates, and status callbacks, this gets noisy fast. The Telegram adapter has two notification modes:
+
+| Mode | Behavior |
+|------|----------|
+| `important` (default) | Only **final responses**, **approval prompts**, and **slash-command confirmations** ring. Tool progress, streaming chunks, and status messages are delivered with `disable_notification=true`. |
+| `all` | Every outgoing message fires a push notification. Legacy behavior; opt in if you genuinely want to hear about every tool call. |
+
+Configure in `~/.hermes/config.yaml`:
+
+```yaml
+display:
+  platforms:
+    telegram:
+      notifications: important   # or "all"
+```
+
+Env override (handy for quick A/B testing):
+
+```bash
+HERMES_TELEGRAM_NOTIFICATIONS=all
+```
+
+Unknown values log a warning and fall back to `important`.
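+
+This setting lives under the same `display.platforms.telegram` block as the opt-in `cleanup_progress` flag described in the messaging overview, so a quiet Telegram setup can combine the two. A sketch, assuming both keys coexist in one block as their documented paths suggest:
+
+```yaml
+# ~/.hermes/config.yaml
+display:
+  platforms:
+    telegram:
+      notifications: important   # only final responses, approval prompts, and slash-command confirmations ring
+      cleanup_progress: true     # delete progress bubbles once the final response lands
+```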
+
 ## Security

 :::warning
--- a/website/docs/user-guide/security.md
+++ b/website/docs/user-guide/security.md
@ -73,6 +73,8 @@ When YOLO is active, Hermes shows two persistent visual reminders so it's hard t
 YOLO mode disables **all** dangerous command safety checks for the session — **except** the hardline blocklist (see below). Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments).
 :::

+For destructive session slash commands (`/clear`, `/new` / `/reset`, `/undo`, `/exit --delete`), the CLI also prompts for confirmation before running them. See [Slash Commands — Confirmation prompts for destructive commands](../reference/slash-commands.md#confirmation-prompts-for-destructive-commands).
+
 ### Hardline Blocklist (Always-On Floor)

 Some commands are so catastrophic — irreversible filesystem wipes, fork bombs, direct block-device writes — that Hermes refuses to run them **regardless** of:
@ -605,3 +607,58 @@ TERMINAL_SSH_KEY=~/.ssh/hermes_agent_key
 ```

 The SSH connection details live in `.env` (not `config.yaml`) so they aren't checked in or shared along with profile exports. This keeps the gateway's messaging connections separate from the agent's command execution.
+
+## Supply-chain advisory checking
+
+Hermes ships with a built-in advisory scanner that flags Python packages in the active venv that match a curated catalog of known-compromised versions (supply-chain worms like the May 2026 `mistralai 2.4.6` poisoning). Implementation lives in `hermes_cli/security_advisories.py`.
+
+How it runs:
+
+- **CLI startup banner.** A one-line warning is printed if any advisory matches, with a pointer to `hermes doctor` for the full remediation.
+- **`hermes doctor`.** Surfaces every active advisory with version specifics and remediation instructions in 2 to 4 steps.
+- **Gateway startup.** Logged to `gateway.log`; the first interactive message gets a short operator banner.
+
+Each advisory carries a stable id. Once you have read and acted on it you can dismiss it for good:
+
+```bash
+hermes doctor --ack <advisory-id>
+```
+
+The ack is persisted to `config.security.acked_advisories` and survives restart. Old advisories are intentionally **not** removed from the catalog — leaving them in place keeps fresh installs warned about historically poisoned versions that might still be cached in a private mirror.
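+
+For orientation, a persisted ack might look like the fragment below. This is a sketch under assumptions: the text above only names the `security.acked_advisories` key, so the file location and the list shape are assumed, and `<advisory-id>` is a placeholder rather than a real id.
+
+```yaml
+# ~/.hermes/config.yaml (assumed location and shape)
+security:
+  acked_advisories:
+    - "<advisory-id>"   # written by `hermes doctor --ack <advisory-id>`
+```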
+
+The check itself is stdlib-only and performs one `importlib.metadata.version()` lookup per advisory, so it's cheap and safe to run on every startup.
+
+### Lazy install of optional dependencies
+
+Many features (Mistral TTS, ElevenLabs, Honcho memory, Bedrock, Slack, Matrix, …) depend on Python packages that not every user needs. Hermes installs these **lazily** on first use rather than eagerly under `hermes-agent[all]`. The implementation lives in `tools/lazy_deps.py`.
+
+The problems this solves:
+
+- **Fragility.** When one extra's transitive dependency becomes unavailable on PyPI (quarantined for malware, yanked, broken upload), the entire `[all]` resolve would fail and fresh installs would silently fall back to a stripped tier — losing 10+ unrelated extras at once. Lazy install isolates each backend so one poisoned dep can't break unrelated features.
+- **Bloat.** A user who only ever talks to one provider no longer pulls hundreds of packages they will never import.
+
+How it works:
+
+1. A backend module calls `ensure("feature.name")` at the top of its first-import path.
+2. If the deps are missing, `ensure` checks `security.allow_lazy_installs` in `config.yaml` (default `true`) and runs a venv-scoped `pip install` for the allowlisted specs.
+3. If the install fails or the user has disabled lazy installs, the call raises `FeatureUnavailable` with the actual pip stderr and a pointer to `hermes tools`.
+
+Security guarantees enforced by `tools/lazy_deps.py`:
+
+| Guarantee | What it means |
+|---|---|
+| Venv-scoped only | Installs target `sys.executable` in the active venv — never the system Python |
+| PyPI by name only | Specs accept `"package>=1.0,<2"` syntax. No `--index-url`, `git+https://`, or `file:` paths, so a malicious `config.yaml` cannot redirect the install |
+| Allowlist | Only specs that appear in the in-tree `LAZY_DEPS` map can be installed via this path. A typo in a feature name does NOT get install-anything semantics |
+| Opt-out | Set `security.allow_lazy_installs: false` to disable runtime installs entirely. Useful for restricted networks or strict security postures |
+| No silent retries | Failures surface as `FeatureUnavailable` — no caching of bad state, no retry storms |
+
+To disable runtime installs:
+
+```yaml
+# ~/.hermes/config.yaml
+security:
+  allow_lazy_installs: false
+```
+
+When disabled, backends that need optional deps will tell the user to run the install manually (`pip install …`) or pick a different backend via `hermes tools`.
--- a/website/docs/user-guide/sessions.md
+++ b/website/docs/user-guide/sessions.md
@ -60,6 +60,9 @@ into chat.
 Use `/compress` when a session gets long, `/new` for a fresh thread, and
 `hermes sessions prune` only when you want to delete old ended sessions from
 storage. Compression reduces the active context; it is not a privacy delete.
+Pass a name to `/new` (e.g. `/new payments-refactor`) to set the new session's
+initial title up front — useful for finding it later with `/resume <name>` or
+in the `/sessions` picker.
 :::

 ### Session Sources
@ -412,9 +415,9 @@ session_search()

 Returns recent sessions chronologically (titles, previews, timestamps). Useful when the user asks "what was I working on" without naming a topic.

-### FTS5 Query Syntax
+### FTS5 query syntax

-The search supports standard FTS5 query syntax:
+The keyword mode supports standard FTS5 query syntax:

 - Simple keywords: `docker deployment` (FTS5 defaults to AND)
 - Phrases: `"exact phrase"`
@ -432,6 +435,8 @@ The agent is prompted to use session search automatically:

 > *"When the user references something from a past conversation or you suspect relevant prior context exists, use session_search to recall it before asking them to repeat themselves."*

+Typical triggers: "we did this before", "remember when", "last time", "as I mentioned", or any reference to a project/person/concept that isn't in the current window.
+
 ## Per-Platform Session Tracking

 ### Gateway Sessions
--- a/website/docs/user-guide/tui.md
+++ b/website/docs/user-guide/tui.md
@ -50,6 +50,19 @@ The classic CLI remains available as the default. Anything documented in [CLI In

 Same [skins](features/skins.md) and [personalities](features/personality.md) apply. Switch mid-session with `/skin ares`, `/personality pirate`, and the UI repaints live. See [Skins & Themes](features/skins.md) for the full list of customizable keys and which ones apply to classic vs TUI — the TUI honors the banner palette, UI colors, prompt glyph/color, session display, completion menu, selection bg, `tool_prefix`, and `help_header`.

+### Collapsible banner sections
+
+The TUI startup banner groups runtime info into four collapsible sections, each rendered with a `▸` / `▾` chevron next to the section title:
+
+| Section | Default state |
+|---------|---------------|
+| Tools | Open |
+| Skills | Collapsed |
+| System Prompt | Collapsed |
+| MCP Servers | Collapsed |
+
+Click anywhere on a section header (or its chevron) to toggle it. The Tools list opens by default because it's the most-checked section at session start; Skills, System Prompt, and MCP Servers collapse by default so the banner stays compact even when you've installed dozens of skills or wired up many MCP servers. State is local to the banner instance, so the next launch resets to the defaults.
+
 ## Requirements

 - **Node.js** ≥ 20 — the TUI runs as a subprocess launched from the Python CLI. `hermes doctor` verifies this.
@ -158,6 +171,9 @@ The status line also shows:

 - **Working directory with git branch** — `~/projects/hermes-agent (docs/two-week-gap-sweep)`. The branch suffix updates when you `git checkout` in a side terminal (mtime-cached) so the TUI reflects your actual active branch, not whatever it was at launch.
 - **Per-prompt elapsed time** — `⏱ 12s/3m 45s` while the turn is running (live), frozen to `⏲ 32s / 3m 45s` after the turn completes. First number is time since last user message; second is total session duration. Resets on every new prompt.
+- **`🗜️ N`** — number of times the running session has been auto-compressed. Appears once the first compression fires.
+- **`▶ N`** — number of `/background` tasks currently running in this session. Appears whenever at least one task is in flight.
+- **`⚠ YOLO`** — visible warning whenever YOLO mode is on (`hermes --yolo`, `/yolo`, or `HERMES_YOLO_MODE=1`). The same badge also appears in the startup banner so you cannot launch an auto-approving session without noticing.

 ## Configuration

@ -215,6 +231,25 @@ Sessions are shared between the TUI and the classic CLI — both write to the sa

 See [Sessions](sessions.md) for lifecycle, search, compression, and export.

+## Attaching to a running gateway
+
+By default the TUI spawns its own in-process gateway, so each TUI instance is self-contained. If you already have a long-lived gateway running (e.g. `hermes gateway run` in tmux, or the systemd / launchd service), you can point the TUI at that gateway instead — the TUI then becomes a thin client and shares state with every other surface (messaging platforms, web dashboard, other TUI sessions) that's attached to the same gateway.
+
+Set the websocket URL via env before launching:
+
+```bash
+export HERMES_TUI_GATEWAY_URL="ws://localhost:8765/api/ws?token=<auth-token>"
+hermes --tui
+```
+
+The token comes from the gateway's API auth configuration (see [API Server](features/api-server.md)). When the env var is set, the TUI:
+
+- Skips spawning a local gateway entirely — no duplicate platform adapters, no port conflicts.
+- Routes every action (slash commands, image attach, browser progress, voice events, …) over the websocket to the shared gateway.
+- Reconnects automatically if the gateway URL rotates (new token) between requests.
+
+This is the same channel the web dashboard's embedded TUI uses (see [Web Dashboard](features/web-dashboard.md#chat)) — one gateway, many clients.
+
 ## Reverting to the classic CLI

 Launching `hermes` (without `--tui`) stays on the classic CLI. To make a machine prefer the TUI, set `HERMES_TUI=1` in your shell profile. To go back, unset it.
--- a/website/docs/user-guide/windows-native.md
+++ b/website/docs/user-guide/windows-native.md
@ -38,11 +38,35 @@ No admin rights required. The installer goes to `%LOCALAPPDATA%\hermes\` and add
 | Parameter | Default | Purpose |
 |---|---|---|
 | `-Branch` | `main` | Clone a specific branch (useful for testing PRs) |
+| `-Commit` | unset | Pin install to a specific commit SHA (overrides `-Branch`) |
+| `-Tag` | unset | Pin install to a specific git tag (e.g. `v0.14.0`) |
 | `-NoVenv` | off | Skip venv creation (advanced — you manage Python yourself) |
 | `-SkipSetup` | off | Skip the post-install `hermes setup` wizard |
 | `-HermesHome` | `%LOCALAPPDATA%\hermes` | Override data directory |
 | `-InstallDir` | `%LOCALAPPDATA%\hermes\hermes-agent` | Override code location |

+The installer auto-retries flaky git fetches and strips the BOM from any downloaded `install.ps1` payload, so a UTF-8 BOM picked up during HTTP transit no longer breaks the `[scriptblock]::Create((irm ...))` form.
+
+### Desktop installer (alternative)
+
+A thin GUI installer is also available — useful if you'd rather double-click an `.exe` than open PowerShell. Download Hermes Desktop, run the installer, and on first launch the GUI calls `install.ps1` under the hood to provision Python (via `uv`), Node, PortableGit, and the rest of the dependency bootstrap described below. After the first run, the desktop app and the PowerShell-installed `hermes` CLI share the same `%LOCALAPPDATA%\hermes\hermes-agent` install and `%USERPROFILE%\.hermes` data directory — switch between the GUI and the CLI freely.
+
+Use the desktop installer when you want a familiar Windows install experience or you're handing Hermes to a non-developer; use the PowerShell one-liner when you're already in a terminal.
+
+### Dependency bootstrap (`dep_ensure`)
+
+On first launch (and on demand when a missing tool is detected), Hermes runs a small Python bootstrapper — `hermes_cli/dep_ensure.py` — that checks for and lazily installs the non-Python dependencies it needs. On Windows, the relevant ones are:
+
+| Dependency | Why Hermes needs it |
+|---|---|
+| **PortableGit** | Provides `bash.exe` for the terminal tool and `git` for in-session clones. Provisioned at install time, not by `dep_ensure`. |
+| **Node.js 22** | Required for the browser tool (`agent-browser`), the TUI's web bridge, and the WhatsApp bridge. |
+| **ffmpeg** | Audio format conversion for TTS / voice messages. |
+| **ripgrep** | Fast file search — falls back to `grep` if unavailable. |
+| **npm packages** | `agent-browser`, Playwright Chromium, and any per-toolset Node deps are installed once at first browser-tool use. |
+
+Each dep has a `shutil.which(...)`-style check; if a binary is missing and the run is interactive, `dep_ensure` offers to install it (deferring to `scripts\install.ps1 -ensure <dep>` for the actual install logic). Non-interactive runs (gateway, cron, headless desktop launches) skip the prompt and surface a clear `this feature needs <dep>` error instead.
+
 ## What the installer actually does

 Top-to-bottom, in order: