docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497)

Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026, roughly 1,080 PRs). The audit found ~50 user-visible features that had landed in code with no docs footprint, plus a handful of stale pages. This PR closes every gap the scan turned up. New pages - user-guide/features/deliverable-mode.md — extension list, agent triggers, kanban_complete artifacts pattern, [[as_document]] override (PR #27813). - developer-guide/web-search-provider-plugin.md — authoring guide modeled on image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448). Providers / auth - Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the display label shows up; provider id stays `alibaba` (PR #24835). - Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs #28116 / #28118 / #28119). - Document Nous JWT minting from refresh token + invalid-refresh quarantine + cross-profile shared token store (PRs #27663 / #19712). - Add `## Microsoft Entra ID authentication (keyless)` section to azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic routing details (PR #28101 / #9df9816da). - Custom providers `api_mode` is now prompted-and-persisted, not just URL autodetected (PR #25068). - Delegation honours `api_mode` + auto-detects anthropic_messages base URLs (PR #26824). - `x_search` auto-enables when xAI credentials are present (PR #27376). - Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR #26534). - NVIDIA NIM billing-origin header is set automatically (PR #26585). Windows / installer - `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus the BOM-strip / git-retry hardening (PR #28169). - Document Hermes Desktop thin installer + first-launch bootstrap (PR #27822). - Document `dep_ensure` Windows bootstrap (PR #27845). - Document install-method auto-detection (pip / git / homebrew / nixos) and the matching update command (PR #27843). Gateway / messaging - `/platform list|pause|resume` full description + circuit-breaker semantics (PR #26600). - Slack / Matrix / Mattermost get parallel `allowed_channels` / `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk (PR #21251). - Discord `allow_any_attachment` + `max_attachment_bytes` (config and env vars) (PR #27245). - Discord clarify-choice button rendering (PR #25485). - Telegram `guest_mode` @mention bypass for allowlisted groups (PR #22759). - Telegram `notifications` mode (`important` vs `all`) (PR #22793). - `[[as_document]]` skill / response directive for forcing document-style media delivery (PR #21210). CLI / TUI - `/new [name]` argument (PR #19637). - `/subgoal` user-supplied criteria appended to `/goal` (PR #25449). - `/exit --delete` flag confirmation prompts for destructive slash commands (PR #22687). - Status-bar additions: ▶ N background indicator (PR #27175), context compression count (PR #21218), YOLO mode banner+statusbar warning (PR #26238). - `display.timestamps` + `docker_extra_args` config keys (PR #23599). - TUI collapsible startup banner sections (PR #20625). - `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847). i18n - Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja, de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches `agent/i18n.py:SUPPORTED_LANGUAGES`. Tools / features - `vision_analyze` native-pixel passthrough for vision-capable callers, with auxiliary text-describer fallback (PR #22955). - `session_search` rewrite to the single-shape tool (discovery / scroll / browse modes) (PRs #27590 / #27840). - Clarify MCP transport scope: client supports stdio + SSE; embedded `hermes mcp serve` is stdio-only (PR #21227). - Web search backends table: add Brave Search (free tier) and DDGS rows (PR #21337). - ACP session-scoped edit auto-approval modes (PR #27862). - Curator rename map in the user-visible per-run summary (PR #22910). - Prompt caching feature page reference in features/overview.md — Claude cross-session 1-hour prefix cache on native Anthropic / OpenRouter / Nous Portal (PR #23828). - Cron per-job profile parameter (PR #28124). - `--no-skills` flag for `hermes profile create` (PR #20986). Build - Verified with `npm run build` in `website/`; both `en` and `zh-Hans` locales compile. Remaining broken-link/anchor warnings are pre-existing (`rl-training.md` from learning-path / overview; the zh-Hans translation lag the docs skill already calls out).
2026-07-17 14:42:06 +00:00 · 2026-05-18 23:55:25 -07:00 · 2026-05-18 23:55:25 -07:00 · eacce70a35
commit eacce70a35
parent 1335ce996d
37 changed files with 901 additions and 26 deletions
--- a/website/docs/integrations/providers.md
+++ b/website/docs/integrations/providers.md
@ -29,8 +29,10 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
 | **GMI Cloud** | `GMI_API_KEY` in `~/.hermes/.env` (provider: `gmi`; aliases: `gmi-cloud`, `gmicloud`) |
 | **MiniMax** | `MINIMAX_API_KEY` in `~/.hermes/.env` (provider: `minimax`) |
 | **MiniMax China** | `MINIMAX_CN_API_KEY` in `~/.hermes/.env` (provider: `minimax-cn`) |
-| **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`) |
-| **Alibaba Coding Plan** | `DASHSCOPE_API_KEY` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint |
+| **xAI (Grok) — Responses API** | `XAI_API_KEY` in `~/.hermes/.env` (provider: `xai`) |
+| **xAI Grok OAuth (SuperGrok)** | `hermes model` → "xAI Grok OAuth (SuperGrok Subscription)" — browser login, no API key. See [guide](../guides/xai-grok-oauth.md) |
+| **Qwen Cloud (Alibaba DashScope)** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`) |
+| **Alibaba Cloud (Coding Plan)** | `DASHSCOPE_API_KEY` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint |
 | **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) |
 | **Xiaomi MiMo** | `XIAOMI_API_KEY` in `~/.hermes/.env` (provider: `xiaomi`, aliases: `mimo`, `xiaomi-mimo`) |
 | **Tencent TokenHub** | `TOKENHUB_API_KEY` in `~/.hermes/.env` (provider: `tencent-tokenhub`, aliases: `tencent`, `tokenhub`, `tencentmaas`) |
@ -137,6 +139,8 @@ with the Generative Language API enabled.

 :::info Codex Note
 The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Hermes stores the resulting credentials in its own auth store under `~/.hermes/auth.json` and can import existing Codex CLI credentials from `~/.codex/auth.json` when present. No Codex CLI installation is required.
+
+If a token refresh fails with a terminal error (HTTP 4xx, `invalid_grant`, revoked grant, etc.), Hermes marks the refresh token as dead and stops replaying it so you don't see a flood of identical auth failures. The next request surfaces a typed re-auth message instead. Run `hermes auth add codex-oauth` (or `hermes model` → OpenAI Codex) to start a fresh device-code login; the quarantine clears on the next successful exchange.
 :::

 :::warning
@ -158,6 +162,18 @@ Hermes has **two** model commands that serve different purposes:

 If you're trying to switch to a provider you haven't set up yet (e.g. you only have OpenRouter configured and want to use Anthropic), you need `hermes model`, not `/model`. Exit your session first (`Ctrl+C` or `/quit`), run `hermes model`, complete the provider setup, then start a new session.

+### Nous Portal
+
+Subscription-based access to Hermes-4 models (`Hermes-4-70B`, `Hermes-4.3-36B`, `Hermes-4-405B`) via Nous Research's portal. Run `hermes model`, pick **Nous Portal**, sign in through the browser — Hermes stores a long-lived refresh token at `~/.hermes/auth.json`.
+
+The refresh token is also shared across profiles via a shared token store, so logging in on one profile carries over to the others.
+
+#### Token handling
+
+Hermes mints a short-lived JWT from your stored Nous refresh token on each inference call rather than reusing a long-lived API key. The token lifecycle is fully automatic — refresh, mint, retry on transient 401 — and you never see it.
+
+If the portal invalidates the refresh token (password change, manual revoke, session expiry), the invalid refresh token is quarantined locally so Hermes stops replaying it and you don't see a stream of identical 401s. The next call surfaces a clear "re-authentication required" message. Run `hermes auth add nous` to log in again; the quarantine clears on the next successful login.
+
 ### Anthropic (Native)

 Use Claude models directly through the Anthropic API — no OpenRouter proxy needed. Supports three auth methods:
@ -292,7 +308,7 @@ hermes chat --provider minimax --model MiniMax-M2.7
 hermes chat --provider minimax-cn --model MiniMax-M2.7
 # Requires: MINIMAX_CN_API_KEY in ~/.hermes/.env

-# Alibaba Cloud / DashScope (Qwen models)
+# Qwen Cloud / DashScope (Qwen models)
 hermes chat --provider alibaba --model qwen3.5-plus
 # Requires: DASHSCOPE_API_KEY in ~/.hermes/.env

@ -440,11 +456,11 @@ model:

 Set `HERMES_QWEN_BASE_URL` only if the portal endpoint relocates (default: `https://portal.qwen.ai/v1`).

-:::tip Qwen OAuth vs DashScope (Alibaba)
-`qwen-oauth` uses the consumer-facing Qwen Portal with OAuth login — ideal for individual users. The `alibaba` provider uses DashScope's enterprise API with a `DASHSCOPE_API_KEY` — ideal for programmatic / production workloads. Both route to Qwen-family models but live at different endpoints.
+:::tip Qwen OAuth vs Qwen Cloud (Alibaba DashScope)
+`qwen-oauth` uses the consumer-facing Qwen Portal with OAuth login — ideal for individual users. The `alibaba` provider uses Qwen Cloud (Alibaba DashScope) with a `DASHSCOPE_API_KEY` — ideal for programmatic / production workloads. Both route to Qwen-family models but live at different endpoints.
 :::

-### Alibaba Coding Plan
+### Alibaba Cloud (Coding Plan)

 If you're subscribed to Alibaba's **Coding Plan** (a pricing SKU separate from standard DashScope API access), Hermes exposes it as its own first-class provider: `alibaba-coding-plan`. Endpoint: `https://coding-intl.dashscope.aliyuncs.com/v1`. It's OpenAI-compatible like the regular `alibaba` provider but with a different base URL and billing surface.

@ -512,6 +528,8 @@ model:
 For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://localhost:8000/v1`. NIM exposes the same OpenAI-compatible chat completions API as build.nvidia.com, so switching between cloud and local is a one-line env-var change.
 :::

+Hermes automatically attaches the NIM billing-origin header on every request to `build.nvidia.com` — no configuration needed. This routes consumption against the correct origin in NVIDIA's billing dashboard.
+
 ### GMI Cloud

 Open and reasoning models via [GMI Cloud](https://www.gmicloud.ai/) — OpenAI-compatible API, API key authentication.
@ -1203,13 +1221,15 @@ custom_providers:
  - name: work
    base_url: https://gpu-server.internal.corp/v1
    key_env: CORP_API_KEY
-    api_mode: chat_completions   # optional, auto-detected from URL
+    api_mode: chat_completions   # set explicitly by `hermes model` → Custom Endpoint wizard; auto-detection still happens as a fallback
  - name: anthropic-proxy
    base_url: https://proxy.example.com/anthropic
    key_env: ANTHROPIC_PROXY_KEY
    api_mode: anthropic_messages  # for Anthropic-compatible proxies
 ```

+The `hermes model` → Custom Endpoint wizard now prompts for `api_mode` explicitly and persists your answer to `config.yaml`. URL-based auto-detection (e.g. `/anthropic` paths → `anthropic_messages`) still happens as a fallback when the field is left blank.
+
 Switch between them mid-session with the triple syntax:

 ```