diff --git a/website/docs/developer-guide/acp-internals.md b/website/docs/developer-guide/acp-internals.md
index 968b2b906ad..2ef552e266c 100644
--- a/website/docs/developer-guide/acp-internals.md
+++ b/website/docs/developer-guide/acp-internals.md
@@ -76,9 +76,8 @@ The manager is thread-safe and supports:
 Bridged callbacks:
 
 - `tool_progress_callback`
-- `thinking_callback`
+- `thinking_callback` (currently set to `None` in the ACP bridge — reasoning is forwarded through `step_callback` instead)
 - `step_callback`
-- `message_callback`
 
 Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses:
 
diff --git a/website/docs/developer-guide/agent-loop.md b/website/docs/developer-guide/agent-loop.md
index 4ca66b56283..cf9cb1c1efd 100644
--- a/website/docs/developer-guide/agent-loop.md
+++ b/website/docs/developer-guide/agent-loop.md
@@ -6,7 +6,7 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb
 
 # Agent Loop Internals
 
-The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 13,700 lines that handle everything from prompt assembly to tool dispatch to provider failover.
+The core orchestration engine is the `AIAgent` class in `run_agent.py`, a large file (15k+ lines) that handles everything from prompt assembly to tool dispatch to provider failover.
 
 ## Core Responsibilities
 
@@ -222,7 +222,7 @@ After each turn:
 
 | File | Purpose |
 |------|---------|
-| `run_agent.py` | AIAgent class — the complete agent loop (~13,700 lines) |
+| `run_agent.py` | AIAgent class — the complete agent loop |
 | `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality |
 | `agent/context_engine.py` | ContextEngine ABC — pluggable context management |
 | `agent/context_compressor.py` | Default engine — lossy summarization algorithm |
diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md
index c8901934199..af2b0a2fd4b 100644
--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@@ -32,8 +32,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
 │  ┌──────┴───────┐  ┌──────┴───────┐  ┌──────┴───────┐               │
 │  │ Compression  │  │ 3 API Modes  │  │ Tool Registry│               │
 │  │ & Caching    │  │ chat_compl.  │  │ (registry.py)│               │
-│  │              │  │ codex_resp.  │  │ 61 tools     │               │
-│  │              │  │ anthropic    │  │ 52 toolsets  │               │
+│  │              │  │ codex_resp.  │  │ 70+ tools    │               │
+│  │              │  │ anthropic    │  │ 28 toolsets  │               │
 │  └──────────────┘  └──────────────┘  └──────────────┘               │
 └─────────┴─────────────────┴─────────────────┴───────────────────────┘
            │                                    │
@@ -52,8 +52,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
 
 ```text
 hermes-agent/
-├── run_agent.py              # AIAgent — core conversation loop (~13,700 lines)
-├── cli.py                    # HermesCLI — interactive terminal UI (~11,500 lines)
+├── run_agent.py              # AIAgent — core conversation loop (large file)
+├── cli.py                    # HermesCLI — interactive terminal UI (large file)
 ├── model_tools.py            # Tool discovery, schema collection, dispatch
 ├── toolsets.py               # Tool groupings and platform presets
 ├── hermes_state.py           # SQLite session/state database with FTS5
@@ -76,14 +76,14 @@ hermes-agent/
 │   └── trajectory.py         # Trajectory saving helpers
 │
 ├── hermes_cli/               # CLI subcommands and setup
-│   ├── main.py               # Entry point — all `hermes` subcommands (~10,400 lines)
+│   ├── main.py               # Entry point — all `hermes` subcommands (large file)
 │   ├── config.py             # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
 │   ├── commands.py           # COMMAND_REGISTRY — central slash command definitions
 │   ├── auth.py               # PROVIDER_REGISTRY, credential resolution
 │   ├── runtime_provider.py   # Provider → api_mode + credentials
 │   ├── models.py             # Model catalog, provider model lists
 │   ├── model_switch.py       # /model command logic (CLI + gateway shared)
-│   ├── setup.py              # Interactive setup wizard (~3,500 lines)
+│   ├── setup.py              # Interactive setup wizard (large file)
 │   ├── skin_engine.py        # CLI theming engine
 │   ├── skills_config.py      # hermes skills — enable/disable per platform
 │   ├── skills_hub.py         # /skills slash command
@@ -102,14 +102,14 @@ hermes-agent/
 │   ├── browser_tool.py       # 10 browser automation tools
 │   ├── code_execution_tool.py # execute_code sandbox
 │   ├── delegate_tool.py      # Subagent delegation
-│   ├── mcp_tool.py           # MCP client (~3,100 lines)
+│   ├── mcp_tool.py           # MCP client (large file)
 │   ├── credential_files.py   # File-based credential passthrough
 │   ├── env_passthrough.py    # Env var passthrough for sandboxes
 │   ├── ansi_strip.py         # ANSI escape stripping
 │   └── environments/         # Terminal backends (local, docker, ssh, modal, daytona, singularity)
 │
 ├── gateway/                  # Messaging platform gateway
-│   ├── run.py                # GatewayRunner — message dispatch (~12,200 lines)
+│   ├── run.py                # GatewayRunner — message dispatch (large file)
 │   ├── session.py            # SessionStore — conversation persistence
 │   ├── delivery.py           # Outbound message delivery
 │   ├── pairing.py            # DM pairing authorization
@@ -213,7 +213,7 @@ A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
 
 ### Tool System
 
-Central tool registry (`tools/registry.py`) with 61 registered tools across 52 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).
+Central tool registry (`tools/registry.py`) with 70+ registered tools across ~28 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).
 
 → [Tools Runtime](./tools-runtime.md)
 
diff --git a/website/docs/developer-guide/browser-supervisor.md b/website/docs/developer-guide/browser-supervisor.md
index d0aa34dbb2b..ba26d579bbb 100644
--- a/website/docs/developer-guide/browser-supervisor.md
+++ b/website/docs/developer-guide/browser-supervisor.md
@@ -217,7 +217,6 @@ Issue planned against `jo-inc/camofox-browser` adding:
 Unit tests use an asyncio mock CDP server that speaks enough of the protocol
 to exercise all state transitions: attach, enable, navigate, dialog fire,
 dialog dismiss, frame attach/detach, child target attach, session teardown.
-Real-backend E2E (Browserbase + local Chrome) is manual; probe scripts from
-the 2026-04-23 investigation kept in-repo under
-`scripts/browser_supervisor_e2e.py` so anyone can re-verify on new backend
-versions.
+Real-backend E2E (Browserbase + local Chrome) is manual: attach to a live
+Chrome with `/browser connect`, then run the dialog/frame test cases
+described above.
diff --git a/website/docs/developer-guide/contributing.md b/website/docs/developer-guide/contributing.md
index 9b2cc9b3037..6e00e367330 100644
--- a/website/docs/developer-guide/contributing.md
+++ b/website/docs/developer-guide/contributing.md
@@ -50,6 +50,8 @@ export VIRTUAL_ENV="$(pwd)/venv"
 
 # Install with all extras (messaging, cron, CLI menus, dev tools)
 uv pip install -e ".[all,dev]"
+# tinker-atropos is a git submodule — needs `git submodule update --init` first
+# if you didn't clone with `--recurse-submodules`
 uv pip install -e "./tinker-atropos"
 
 # Optional: browser tools
diff --git a/website/docs/developer-guide/environments.md b/website/docs/developer-guide/environments.md
index 3409f304736..0a5aa00ffff 100644
--- a/website/docs/developer-guide/environments.md
+++ b/website/docs/developer-guide/environments.md
@@ -172,7 +172,7 @@ parser = get_parser("hermes")  # or "mistral", "llama3_json", "qwen", "deepseek_
 content, tool_calls = parser.parse(raw_model_output)
 ```
 
-Available parsers: `hermes`, `mistral`, `llama3_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1`, `kimi_k2`, `longcat`, `glm45`, `glm47`.
+Available parsers: `hermes`, `mistral`, `llama3_json`, `llama4_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1` (alias `deepseek_v31`), `kimi_k2`, `longcat`, `glm45`, `glm47`.
 
 In Phase 1 (OpenAI server type), parsers are not needed — the server handles tool call parsing natively.
 
diff --git a/website/docs/developer-guide/gateway-internals.md b/website/docs/developer-guide/gateway-internals.md
index e10fe6821f0..d0521d4816d 100644
--- a/website/docs/developer-guide/gateway-internals.md
+++ b/website/docs/developer-guide/gateway-internals.md
@@ -6,13 +6,13 @@ description: "How the messaging gateway boots, authorizes users, routes sessions
 
 # Gateway Internals
 
-The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.
+The messaging gateway is the long-running process that connects Hermes to 20+ external messaging platforms through a unified architecture.
 
 ## Key Files
 
 | File | Purpose |
 |------|---------|
-| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~12,000 lines) |
+| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (large file; check git for current LOC) |
 | `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
 | `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
 | `gateway/pairing.py` | DM pairing flow for user authorization |
@@ -162,7 +162,10 @@ gateway/platforms/
 ├── wecom.py             # WeCom (WeChat Work) callback
 ├── weixin.py            # Weixin (personal WeChat) via iLink Bot API
 ├── bluebubbles.py       # Apple iMessage via BlueBubbles macOS server
-├── qqbot.py             # QQ Bot (Tencent QQ) via Official API v2
+├── qqbot/               # QQ Bot (Tencent QQ) via Official API v2 (sub-package: adapter.py, crypto.py, keyboards.py, …)
+├── yuanbao.py           # Yuanbao (Tencent) DM/group adapter
+├── feishu_comment.py    # Feishu document/drive comment-reply handler
+├── msgraph_webhook.py   # Microsoft Graph change-notification webhook (Teams, Outlook, etc.)
 ├── webhook.py           # Inbound/outbound webhook adapter
 ├── api_server.py        # REST API server adapter
 └── homeassistant.py     # Home Assistant conversation integration
@@ -205,7 +208,7 @@ Gateway hooks are Python modules that respond to lifecycle events:
 | `agent:end` | Agent finishes and returns response |
 | `command:*` | Any slash command is executed |
 
-Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.
+Hooks are discovered from `gateway/builtin_hooks/` and `~/.hermes/hooks/` (user-installed). `gateway/builtin_hooks/` is an extension point that is currently empty in the shipped distribution (`_register_builtin_hooks()` is a no-op stub). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.
 
 ## Memory Provider Integration
 
diff --git a/website/docs/developer-guide/provider-runtime.md b/website/docs/developer-guide/provider-runtime.md
index 492a213e1f6..830382479ff 100644
--- a/website/docs/developer-guide/provider-runtime.md
+++ b/website/docs/developer-guide/provider-runtime.md
@@ -40,7 +40,7 @@ That ordering matters because Hermes treats the saved model/provider choice as t
 
 ## Providers
 
-Current provider families include:
+Current provider families include (see `plugins/model-providers/` for the complete bundled set):
 
 - AI Gateway (Vercel)
 - OpenRouter
@@ -48,16 +48,27 @@ Current provider families include:
 - OpenAI Codex
 - Copilot / Copilot ACP
 - Anthropic (native)
-- Google / Gemini
-- Alibaba / DashScope
+- Google / Gemini (`gemini`, `google-gemini-cli`)
+- Alibaba / DashScope (`alibaba`, `alibaba-coding-plan`)
 - DeepSeek
 - Z.AI
-- Kimi / Moonshot
-- MiniMax
-- MiniMax China
+- Kimi / Moonshot (`kimi-coding`, `kimi-coding-cn`)
+- MiniMax (`minimax`, `minimax-cn`, `minimax-oauth`)
 - Kilo Code
 - Hugging Face
 - OpenCode Zen / OpenCode Go
+- AWS Bedrock
+- Azure Foundry
+- NVIDIA NIM
+- xAI (Grok)
+- Arcee
+- GMI Cloud
+- StepFun
+- Qwen OAuth
+- Xiaomi
+- Ollama Cloud
+- LM Studio
+- Tencent TokenHub
 - Custom (`provider: custom`) — first-class provider for any OpenAI-compatible endpoint
 - Named custom providers (`custom_providers` list in config.yaml)
 
@@ -154,7 +165,7 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that
 
 ## Fallback models
 
-Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
+Hermes supports a configured fallback provider chain: a list of `(provider, model)` entries tried in order when the primary model encounters errors. The legacy single-pair `fallback_model` dict is still accepted for backward compatibility (and migrated on first write).
 
 ### How it works internally
 
diff --git a/website/docs/guides/automation-templates.md b/website/docs/guides/automation-templates.md
index a4f47e0bda8..2a6a125aa97 100644
--- a/website/docs/guides/automation-templates.md
+++ b/website/docs/guides/automation-templates.md
@@ -74,7 +74,7 @@ Review for:
 - Missing tests for new behavior
 
 Post a concise review. If the PR is a trivial docs/typo change, say so briefly." \
-  --skills "github-code-review" \
+  --skill github-code-review \
   --deliver github_comment
 ```
 
@@ -296,7 +296,7 @@ Focus on:
 
 Skip routine dependency bumps and CI fixes. If nothing notable, respond with [SILENT].
 If there are findings, organize by repo with brief analysis of each item." \
-  --skills "competitive-pr-scout" \
+  --skill competitive-pr-scout \
   --name "Competitor scout" \
   --deliver telegram
 ```
@@ -335,7 +335,7 @@ Daily arXiv scan that saves summaries to your note-taking system.
 ```bash
 hermes cron create "0 8 * * *" \
   "Search arXiv for the 3 most interesting papers on 'language model reasoning' OR 'tool-use agents' from the past day. For each paper, create an Obsidian note with the title, authors, abstract summary, key contribution, and potential relevance to Hermes Agent development." \
-  --skills "arxiv,obsidian" \
+  --skill arxiv --skill obsidian \
   --name "Paper digest" \
   --deliver local
 ```
@@ -430,7 +430,7 @@ If action is 'closed' and pull_request.merged is true:
 5. Reference the original PR in the new PR description
 
 If action is not 'closed' or not merged, respond with [SILENT]." \
-  --skills "github-pr-workflow" \
+  --skill github-pr-workflow \
   --deliver log
 ```
 
@@ -514,7 +514,7 @@ hermes cron create "0 3 * * 0" \
 
 Write a security report with findings categorized by severity (Critical, High, Medium, Low).
 If nothing found, report a clean bill of health." \
-  --skills "codebase-security-audit" \
+  --skill codebase-security-audit \
   --name "Weekly security audit" \
   --deliver telegram
 ```
diff --git a/website/docs/guides/cron-script-only.md b/website/docs/guides/cron-script-only.md
index 06fa2880067..5863412f565 100644
--- a/website/docs/guides/cron-script-only.md
+++ b/website/docs/guides/cron-script-only.md
@@ -231,16 +231,15 @@ Silent when both filesystems are under 90%; fires exactly one line per over-thre
 
 | Approach | What runs | When to use |
 |----------|-----------|-------------|
-| `hermes send` (one-shot) | Any shell command piping into it | Ad-hoc delivery or as the action of an external scheduler (systemd, launchd) |
 | `cronjob --no-agent` (this page) | Your script on Hermes' schedule | Recurring watchdogs / alerts / metrics that don't need reasoning |
 | `cronjob` (default, LLM) | Agent with optional pre-check script | When the message content requires reasoning over data |
-| OS cron + `hermes send` | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) |
+| OS cron + `curl` to a [webhook subscription](/docs/user-guide/features/webhooks) | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) |
 
-For critical system-health watchdogs that must fire *even when the gateway is down*, keep using OS-level cron + a plain `curl` or `hermes send` call — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external.
+For critical system-health watchdogs that must fire *even when the gateway is down*, use OS-level cron with a plain `curl` to a Hermes webhook subscription (or any external alerting endpoint) — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external.
 
 ## Related
 
 - [Automate Anything with Cron](/docs/guides/automate-with-cron) — LLM-driven cron patterns.
 - [Scheduled Tasks (Cron) reference](/docs/user-guide/features/cron) — full schedule syntax, lifecycle, delivery routing.
-- [Pipe Script Output with `hermes send`](/docs/guides/pipe-script-output) — the one-shot counterpart for ad-hoc scripts.
+- [Webhook Subscriptions](/docs/user-guide/features/webhooks) — fire-and-forget HTTP entry points for external schedulers.
 - [Gateway Internals](/docs/developer-guide/gateway-internals) — delivery-router internals.
diff --git a/website/docs/guides/cron-troubleshooting.md b/website/docs/guides/cron-troubleshooting.md
index d85a1530909..0db25044bca 100644
--- a/website/docs/guides/cron-troubleshooting.md
+++ b/website/docs/guides/cron-troubleshooting.md
@@ -38,7 +38,7 @@ If the job fires once and then disappears from the list, it's a one-shot schedul
 
 Cron jobs are fired by the gateway's background ticker thread, which ticks every 60 seconds. A regular CLI chat session does **not** automatically fire cron jobs.
 
-If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` or `hermes serve`). For one-off debugging, you can manually trigger a tick with `hermes cron tick`.
+If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` for foreground, or `hermes gateway start` for the installed service). For one-off debugging, you can manually trigger a tick with `hermes cron tick`.
 
 ### Check 4: Check the system clock and timezone
 
diff --git a/website/docs/guides/local-ollama-setup.md b/website/docs/guides/local-ollama-setup.md
index ae0cc445a82..9e2fab5e5de 100644
--- a/website/docs/guides/local-ollama-setup.md
+++ b/website/docs/guides/local-ollama-setup.md
@@ -31,11 +31,11 @@ By the end, you'll have:
 | **GPU** | Not required | NVIDIA GPU with 8+ GB VRAM speeds things up significantly |
 
 :::tip CPU-only works, but expect slower responses
-Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec) — each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, increase the API timeout in config:
+Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec); each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, raise the API timeout with the `HERMES_API_TIMEOUT` environment variable (it is not a `config.yaml` key):
 
-```yaml
-agent:
-  api_timeout: 1800   # 30 minutes — generous for slow local models
+```bash
+# ~/.hermes/.env
+HERMES_API_TIMEOUT=1800   # 30 minutes — generous for slow local models
 ```
 :::
 
diff --git a/website/docs/guides/minimax-oauth.md b/website/docs/guides/minimax-oauth.md
index 2bc1ef3683c..2914c4c1979 100644
--- a/website/docs/guides/minimax-oauth.md
+++ b/website/docs/guides/minimax-oauth.md
@@ -56,10 +56,12 @@ hermes auth add minimax-oauth
 
 ### China region
 
-If your account is on the China platform (`minimaxi.com`), pass `--region cn`:
+If your account is on the China platform (`minimaxi.com`), use the China-region OAuth provider id `minimax-cn` instead of `minimax-oauth`, or skip OAuth and configure `MINIMAX_CN_API_KEY` / `MINIMAX_CN_BASE_URL` directly. The `--region cn` flag described in older docs is **not** wired through the CLI's argument parser:
 
 ```bash
-hermes auth add minimax-oauth --region cn
+hermes auth add minimax-cn --type oauth   # if OAuth is supported on your CN account
+# or simpler:
+echo 'MINIMAX_CN_API_KEY=your-key' >> ~/.hermes/.env
 ```
 
 ### Remote / headless sessions
@@ -128,12 +130,12 @@ model:
   base_url: https://api.minimax.io/anthropic
 ```
 
-### `--region` flag
+### Region endpoints
 
-| Value | Portal | Inference endpoint |
-|-------|--------|-------------------|
-| `global` (default) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` |
-| `cn` | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` |
+| Provider id | Portal | Inference endpoint |
+|-------------|--------|-------------------|
+| `minimax-oauth` (global) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` |
+| `minimax-cn` (China) | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` |
 
 ### Provider aliases
 
diff --git a/website/docs/guides/operate-teams-meeting-pipeline.md b/website/docs/guides/operate-teams-meeting-pipeline.md
index 1e32e74c1a7..78c25e6d0ab 100644
--- a/website/docs/guides/operate-teams-meeting-pipeline.md
+++ b/website/docs/guides/operate-teams-meeting-pipeline.md
@@ -54,21 +54,32 @@ You MUST run `maintain-subscriptions` on a schedule. Pick one of these three opt
 
 #### Option 1: Hermes cron (recommended if you already run the Hermes gateway)
 
-Hermes ships a built-in cron scheduler. Add a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):
+Hermes ships a built-in cron scheduler. The `--no-agent` mode runs a script as the job (rather than using an LLM), and `--script` must point at a file under `~/.hermes/scripts/`. First create the script:
 
 ```bash
-hermes cron add \
+mkdir -p ~/.hermes/scripts
+cat > ~/.hermes/scripts/maintain-teams-subscriptions.sh <<'EOF'
+#!/usr/bin/env bash
+exec hermes teams-pipeline maintain-subscriptions
+EOF
+chmod +x ~/.hermes/scripts/maintain-teams-subscriptions.sh
+```
+
+Then register a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):
+
+```bash
+hermes cron create "0 */12 * * *" \
   --name "teams-pipeline-maintain-subscriptions" \
-  --schedule "0 */12 * * *" \
-  --script-only \
-  --command "hermes teams-pipeline maintain-subscriptions"
+  --no-agent \
+  --script maintain-teams-subscriptions.sh \
+  --deliver local
 ```
 
 Verify it was registered and inspect the next run time:
 
 ```bash
 hermes cron list
-hermes cron show teams-pipeline-maintain-subscriptions
+hermes cron status        # scheduler status
 ```
 
 #### Option 2: systemd timer (recommended for Linux production deployments)
diff --git a/website/docs/guides/python-library.md b/website/docs/guides/python-library.md
index 3e857f7dd11..3bb08645ac9 100644
--- a/website/docs/guides/python-library.md
+++ b/website/docs/guides/python-library.md
@@ -81,7 +81,8 @@ print(f"Messages exchanged: {len(result['messages'])}")
 The returned dictionary contains:
 - **`final_response`** — The agent's final text reply
 - **`messages`** — The complete message history (system, user, assistant, tool calls)
-- **`task_id`** — The task identifier used for VM isolation
+
+(The `task_id` you pass in is stored on the agent instance for VM isolation but isn't echoed back in the return dict.)
 
 You can also pass a custom system message that overrides the ephemeral system prompt for that call:
 
diff --git a/website/docs/guides/use-mcp-with-hermes.md b/website/docs/guides/use-mcp-with-hermes.md
index 6d86eea1eef..5fa43bbcde5 100644
--- a/website/docs/guides/use-mcp-with-hermes.md
+++ b/website/docs/guides/use-mcp-with-hermes.md
@@ -143,7 +143,7 @@ Use `chrome-devtools-mcp`.
 If your Windows Chrome already has live remote debugging enabled from `chrome://inspect/#remote-debugging`, add it like this from WSL:
 
 ```bash
-hermes mcp add chrome-devtools-win --command cmd.exe --args /c "npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics"
+hermes mcp add chrome-devtools-win --command cmd.exe --args /c npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics
 ```
 
 After saving the server:
diff --git a/website/docs/index.md b/website/docs/index.md
index 86abf444037..bab06f634d5 100644
--- a/website/docs/index.md
+++ b/website/docs/index.md
@@ -47,7 +47,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
 | 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
 | ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
 | 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, WhatsApp, Teams, or more |
-| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 68 built-in tools and how to configure them |
+| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 70+ built-in tools and how to configure them |
 | 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
 | 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
 | 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
@@ -65,7 +65,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
 
 - **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling
 - **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing
-- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, BlueBubbles, Home Assistant, Microsoft Teams — 15+ platforms from one gateway
+- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ Bot, Yuanbao, BlueBubbles, Home Assistant, Microsoft Teams, Google Chat, and more — 20+ platforms from one gateway
 - **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint
 - **Scheduled automations** — Built-in cron with delivery to any platform
 - **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls
diff --git a/website/docs/integrations/providers.md b/website/docs/integrations/providers.md
index df8701778da..93e4ba630d3 100644
--- a/website/docs/integrations/providers.md
+++ b/website/docs/integrations/providers.md
@@ -378,8 +378,8 @@ bedrock:
   # profile: "myprofile"       # or set AWS_PROFILE
   # discovery: true            # auto-discover region from IAM
   # guardrail:                 # optional Bedrock Guardrails
-  #   id: "your-guardrail-id"
-  #   version: "DRAFT"
+  #   guardrail_identifier: "your-guardrail-id"
+  #   guardrail_version: "DRAFT"
 ```
 
 Authentication uses the standard boto3 chain: explicit `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY`, `AWS_PROFILE` from `~/.aws/credentials`, IAM role on EC2/ECS/Lambda, IMDS, or SSO. No env var is required if you're already authenticated with the AWS CLI.
@@ -484,7 +484,7 @@ For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://loca
 
 ### GMI Cloud
 
-Open and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication.
+Open and reasoning models via [GMI Cloud](https://www.gmicloud.ai/) — OpenAI-compatible API, API key authentication.
 
 ```bash
 # GMI Cloud
@@ -499,7 +499,7 @@ model:
   default: "deepseek-ai/DeepSeek-R1"
 ```
 
-The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`).
+The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi-serving.com/v1`).
 
 ### StepFun
 
@@ -1393,24 +1393,34 @@ Notes:
 - See OpenRouter's [Pareto Router docs](https://openrouter.ai/docs/guides/routing/routers/pareto-router) for the full router behavior.
 - To use the Pareto Code router for a specific **auxiliary task** (compression, vision, etc.) instead of the main agent, set `extra_body.plugins` under that task — see [Auxiliary Models → OpenRouter routing & Pareto Code for auxiliary tasks](/docs/user-guide/configuration#openrouter-routing--pareto-code-for-auxiliary-tasks).
 
-## Fallback Model
+## Fallback Providers
 
-Configure a backup provider:model that Hermes switches to automatically when your primary model fails (rate limits, server errors, auth failures):
+Configure a chain of backup providers Hermes tries in order when the primary model fails (rate limits, server errors, auth failures). The canonical format is a top-level `fallback_providers:` list:
+
+```yaml
+fallback_providers:
+  - provider: openrouter
+    model: anthropic/claude-sonnet-4
+  - provider: anthropic
+    model: claude-sonnet-4
+    # base_url: http://localhost:8000/v1    # optional, for custom endpoints
+    # api_mode: chat_completions           # optional override
+```
+
+The legacy single-pair `fallback_model:` dict is still accepted for back-compat:
 
 ```yaml
 fallback_model:
-  provider: openrouter                    # required
-  model: anthropic/claude-sonnet-4        # required
-  # base_url: http://localhost:8000/v1    # optional, for custom endpoints
-  # key_env: MY_CUSTOM_KEY               # optional, env var name for custom endpoint API key
+  provider: openrouter
+  model: anthropic/claude-sonnet-4
 ```
 
-When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
+When activated, the fallback swaps the model and provider mid-session without losing your conversation. The chain is tried entry-by-entry; activation is one-shot per session.
 
-Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `tencent-tokenhub`, `custom`.
+Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `azure-foundry`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `lmstudio`, `alibaba`, `alibaba-coding-plan`, `tencent-tokenhub`, `custom`.
 
 :::tip
-Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
+Fallback is configured through `config.yaml` or interactively via `hermes fallback`. For full details on when it triggers, how the chain advances, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
 :::
 
 ---
diff --git a/website/docs/user-guide/messaging/google_chat.md b/website/docs/user-guide/messaging/google_chat.md
index 6fda2b179a8..8cf2d01d7a3 100644
--- a/website/docs/user-guide/messaging/google_chat.md
+++ b/website/docs/user-guide/messaging/google_chat.md
@@ -164,10 +164,10 @@ GOOGLE_CHAT_MAX_BYTES=16777216                  # 16 MiB — cap on in-flight me
 The project ID also falls back to `GOOGLE_CLOUD_PROJECT`, and the SA path falls
 back to `GOOGLE_APPLICATION_CREDENTIALS` — use whichever convention you prefer.
 
-Install Hermes with the optional dependencies:
+No Hermes extra is currently published for the Google Chat adapter, so install its dependencies directly:
 
 ```bash
-pip install 'hermes-agent[google_chat]'
+pip install google-cloud-pubsub google-api-python-client google-auth google-auth-oauthlib
 ```
 
 Start the gateway:
diff --git a/website/docs/user-guide/messaging/index.md b/website/docs/user-guide/messaging/index.md
index 24970ac235d..b6ed2796c10 100644
--- a/website/docs/user-guide/messaging/index.md
+++ b/website/docs/user-guide/messaging/index.md
@@ -386,7 +386,7 @@ Each platform has its own toolset:
 | Discord | `hermes-discord` | Full tools including terminal |
 | WhatsApp | `hermes-whatsapp` | Full tools including terminal |
 | Slack | `hermes-slack` | Full tools including terminal |
-| Google Chat | `hermes-google-chat` | Full tools including terminal |
+| Google Chat | `hermes-google_chat` | Full tools including terminal |
 | Signal | `hermes-signal` | Full tools including terminal |
 | SMS | `hermes-sms` | Full tools including terminal |
 | Email | `hermes-email` | Full tools including terminal |
@@ -402,7 +402,7 @@ Each platform has its own toolset:
 | QQBot | `hermes-qqbot` | Full tools including terminal |
 | Yuanbao | `hermes-yuanbao` | Full tools including terminal |
 | Microsoft Teams | `hermes-teams` | Full tools including terminal |
-| API Server | `hermes` (default) | Full tools including terminal |
+| API Server | `hermes-api-server` | Full tools except `clarify`, `send_message`, and `text_to_speech` (programmatic access has no interactive user) |
 | Webhooks | `hermes-webhook` | Full tools including terminal |
 
 ## Next Steps
diff --git a/website/docs/user-guide/messaging/open-webui.md b/website/docs/user-guide/messaging/open-webui.md
index 175276eb084..e75517e79b3 100644
--- a/website/docs/user-guide/messaging/open-webui.md
+++ b/website/docs/user-guide/messaging/open-webui.md
@@ -275,16 +275,22 @@ To run separate Hermes instances per user — each with their own config, memory
 
 ### 1. Create profiles and configure API servers
 
+The `API_SERVER_*` settings are environment variables, not YAML config keys, so write them to each profile's `.env`. Pick ports that do not collide with the defaults used by other platform adapters (`8644` is the webhook adapter, `8645` is wecom-callback, `8646` is msgraph-webhook), for example `8650` and up:
+
 ```bash
 hermes profile create alice
-hermes -p alice config set API_SERVER_ENABLED true
-hermes -p alice config set API_SERVER_PORT 8643
-hermes -p alice config set API_SERVER_KEY alice-secret
+cat >> ~/.hermes/profiles/alice/.env <<EOF
+API_SERVER_ENABLED=true
+API_SERVER_PORT=8650
+API_SERVER_KEY=alice-secret
+EOF
 
 hermes profile create bob
-hermes -p bob config set API_SERVER_ENABLED true
-hermes -p bob config set API_SERVER_PORT 8644
-hermes -p bob config set API_SERVER_KEY bob-secret
+cat >> ~/.hermes/profiles/bob/.env <<EOF
+API_SERVER_ENABLED=true
+API_SERVER_PORT=8651
+API_SERVER_KEY=bob-secret
+EOF
 ```
 
 ### 2. Start each gateway
@@ -300,8 +306,8 @@ In **Admin Settings** → **Connections** → **OpenAI API** → **Manage**, add
 
 | Connection | URL | API Key |
 |-----------|-----|---------|
-| Alice | `http://host.docker.internal:8643/v1` | `alice-secret` |
-| Bob | `http://host.docker.internal:8644/v1` | `bob-secret` |
+| Alice | `http://host.docker.internal:8650/v1` | `alice-secret` |
+| Bob | `http://host.docker.internal:8651/v1` | `bob-secret` |
 
 The model dropdown will show `alice` and `bob` as distinct models. You can assign models to Open WebUI users via the admin panel, giving each user their own isolated Hermes agent.
 
diff --git a/website/docs/user-guide/messaging/qqbot.md b/website/docs/user-guide/messaging/qqbot.md
index 46cef53b0f9..e5050b304fc 100644
--- a/website/docs/user-guide/messaging/qqbot.md
+++ b/website/docs/user-guide/messaging/qqbot.md
@@ -55,7 +55,7 @@ QQ_CLIENT_SECRET=your-app-secret
 | `QQ_ALLOW_ALL_USERS` | Set to `true` to allow all DMs | `false` |
 | `QQ_PORTAL_HOST` | Override the QQ portal host (set to `sandbox.q.qq.com` for sandbox routing) | `q.qq.com` |
 | `QQ_STT_API_KEY` | API key for voice-to-text provider | — |
-| `QQ_STT_BASE_URL` | Base URL for STT provider | `https://open.bigmodel.cn/api/coding/paas/v4` |
+| `QQ_STT_BASE_URL` | (Not read directly — set `platforms.qqbot.extra.stt.baseUrl` in `config.yaml` instead) | n/a |
 | `QQ_STT_MODEL` | STT model name | `glm-asr` |
 
 ## Advanced Configuration
@@ -64,7 +64,7 @@ For fine-grained control, add platform settings to `~/.hermes/config.yaml`:
 
 ```yaml
 platforms:
-  qq:
+  qqbot:
     enabled: true
     extra:
       app_id: "your-app-id"
diff --git a/website/docs/user-guide/messaging/sms.md b/website/docs/user-guide/messaging/sms.md
index c5b28cd6fd9..99b339020e5 100644
--- a/website/docs/user-guide/messaging/sms.md
+++ b/website/docs/user-guide/messaging/sms.md
@@ -108,7 +108,7 @@ hermes gateway
 You should see:
 
 ```
-[sms] Twilio webhook server listening on 0.0.0.0:8080, from: +1555***4567
+[sms] Twilio webhook server listening on 127.0.0.1:8080, from: +1555***4567
 ```
 
 If you see `Refusing to start: SMS_WEBHOOK_URL is required`, set `SMS_WEBHOOK_URL` to the public URL configured in your Twilio Console (see Step 3).
diff --git a/website/docs/user-guide/messaging/teams-meetings.md b/website/docs/user-guide/messaging/teams-meetings.md
index 825b2da5b14..eabc585ef1c 100644
--- a/website/docs/user-guide/messaging/teams-meetings.md
+++ b/website/docs/user-guide/messaging/teams-meetings.md
@@ -25,7 +25,7 @@ The pipeline:
 4. stores durable job state and sink records locally
 5. can write summaries to Notion, Linear, and Microsoft Teams
 
-Operator actions stay in the CLI:
+Operator actions stay in the CLI. The `teams-pipeline` subcommand is registered by the `teams_pipeline` plugin; enable it with `hermes plugins enable teams_pipeline` or by setting `plugins.enabled: [teams_pipeline]` in `config.yaml`:
 
 ```bash
 hermes teams-pipeline validate