docs: round 2 audit — messaging, developer-guide, guides, integrations (#22858)

Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/, guides/, and integrations/ against the live registries and gateway code.

messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)'); Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such extra); list the actual deps (google-cloud-pubsub, google-api-python-client, google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is silently ignored by the adapter); QQ_STT_BASE_URL is not read directly — baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to per-profile .env, not 'hermes config set' (same pattern fixed in api-server.md last round). Also bumped example ports to 8650+ to dodge the default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646) collision.

developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py, gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway platform tree updated (qqbot is a sub-package, not qqbot.py; added yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/ (always active)' was wrong — it's an empty extension point and _register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry, NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama Cloud, LM Studio, Tencent TokenHub. Fallback section described only the legacy single-pair model — corrected to the canonical list-form fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31 alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that 'git submodule update --init' is required if cloning without --recurse-submodules.

guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is positional (not --schedule), the script-only flag is --no-agent (not --script-only), and there's no --command flag. Replaced with a real example that creates the script under ~/.hermes/scripts/ and uses the actual flags. Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work — the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently fails because --region isn't registered on the auth-add argparse spec. Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-table mention with a webhook-subscription pointer; also fixed the dead link to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in one quoted string, so cmd.exe gets a single arg instead of the multi-token command line it needs. Removed the surrounding quotes — argparse nargs='*' collects each token correctly.

integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist); actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 -> api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai) refreshed. Fallback section rewritten to lead with the canonical fallback_providers list form (was leading with the legacy fallback_model single dict); supported-providers list extended to include azure-foundry, alibaba-coding-plan, lmstudio.

index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with integrations/index.md ('19+') and undercounted — bumped to 20+ and added Weixin/QQ Bot/Yuanbao/Google Chat to the list.

Validation: 'npm run build' clean (exit 0); broken-link count unchanged at 155 (same as round-1 post-skill-regen baseline).

24 files, +132/-89.
2026-05-18 04:41:56 +00:00 · 2026-05-09 15:00:24 -07:00 · 2026-05-09 15:00:24 -07:00 · fef1a41248
commit fef1a41248
parent 0bcc327cab
24 changed files with 132 additions and 89 deletions
--- a/website/docs/developer-guide/acp-internals.md
+++ b/website/docs/developer-guide/acp-internals.md
@@ -76,9 +76,8 @@ The manager is thread-safe and supports:
 Bridged callbacks:

 - `tool_progress_callback`
-- `thinking_callback`
+- `thinking_callback` (currently set to `None` in the ACP bridge — reasoning is forwarded through `step_callback` instead)
 - `step_callback`
-- `message_callback`

 Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses:

--- a/website/docs/developer-guide/agent-loop.md
+++ b/website/docs/developer-guide/agent-loop.md
@@ -6,7 +6,7 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb

 # Agent Loop Internals

-The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 13,700 lines that handle everything from prompt assembly to tool dispatch to provider failover.
+The core orchestration engine is `run_agent.py`'s `AIAgent` class — a large file (15k+ lines) that handles everything from prompt assembly to tool dispatch to provider failover.

 ## Core Responsibilities

@@ -222,7 +222,7 @@ After each turn:

 | File | Purpose |
 |------|---------|
-| `run_agent.py` | AIAgent class — the complete agent loop (~13,700 lines) |
+| `run_agent.py` | AIAgent class — the complete agent loop |
 | `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality |
 | `agent/context_engine.py` | ContextEngine ABC — pluggable context management |
 | `agent/context_compressor.py` | Default engine — lossy summarization algorithm |
--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@@ -32,8 +32,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
 │  ┌──────┴───────┐  ┌──────┴───────┐  ┌──────┴───────┐               │
 │  │ Compression  │  │ 3 API Modes  │  │ Tool Registry│               │
 │  │ & Caching    │  │ chat_compl.  │  │ (registry.py)│               │
-│  │              │  │ codex_resp.  │  │ 61 tools     │               │
-│  │              │  │ anthropic    │  │ 52 toolsets  │               │
+│  │              │  │ codex_resp.  │  │ 70+ tools    │               │
+│  │              │  │ anthropic    │  │ 28 toolsets  │               │
 │  └──────────────┘  └──────────────┘  └──────────────┘               │
 └─────────┴─────────────────┴─────────────────┴───────────────────────┘
           │                                    │
@@ -52,8 +52,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours

 ```text
 hermes-agent/
-├── run_agent.py              # AIAgent — core conversation loop (~13,700 lines)
-├── cli.py                    # HermesCLI — interactive terminal UI (~11,500 lines)
+├── run_agent.py              # AIAgent — core conversation loop (large file)
+├── cli.py                    # HermesCLI — interactive terminal UI (large file)
 ├── model_tools.py            # Tool discovery, schema collection, dispatch
 ├── toolsets.py               # Tool groupings and platform presets
 ├── hermes_state.py           # SQLite session/state database with FTS5
@@ -76,14 +76,14 @@ hermes-agent/
 │   └── trajectory.py         # Trajectory saving helpers
 │
 ├── hermes_cli/               # CLI subcommands and setup
-│   ├── main.py               # Entry point — all `hermes` subcommands (~10,400 lines)
+│   ├── main.py               # Entry point — all `hermes` subcommands (large file)
 │   ├── config.py             # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
 │   ├── commands.py           # COMMAND_REGISTRY — central slash command definitions
 │   ├── auth.py               # PROVIDER_REGISTRY, credential resolution
 │   ├── runtime_provider.py   # Provider → api_mode + credentials
 │   ├── models.py             # Model catalog, provider model lists
 │   ├── model_switch.py       # /model command logic (CLI + gateway shared)
-│   ├── setup.py              # Interactive setup wizard (~3,500 lines)
+│   ├── setup.py              # Interactive setup wizard (large file)
 │   ├── skin_engine.py        # CLI theming engine
 │   ├── skills_config.py      # hermes skills — enable/disable per platform
 │   ├── skills_hub.py         # /skills slash command
@@ -102,14 +102,14 @@ hermes-agent/
 │   ├── browser_tool.py       # 10 browser automation tools
 │   ├── code_execution_tool.py # execute_code sandbox
 │   ├── delegate_tool.py      # Subagent delegation
-│   ├── mcp_tool.py           # MCP client (~3,100 lines)
+│   ├── mcp_tool.py           # MCP client (large file)
 │   ├── credential_files.py   # File-based credential passthrough
 │   ├── env_passthrough.py    # Env var passthrough for sandboxes
 │   ├── ansi_strip.py         # ANSI escape stripping
 │   └── environments/         # Terminal backends (local, docker, ssh, modal, daytona, singularity)
 │
 ├── gateway/                  # Messaging platform gateway
-│   ├── run.py                # GatewayRunner — message dispatch (~12,200 lines)
+│   ├── run.py                # GatewayRunner — message dispatch (large file)
 │   ├── session.py            # SessionStore — conversation persistence
 │   ├── delivery.py           # Outbound message delivery
 │   ├── pairing.py            # DM pairing authorization
@@ -213,7 +213,7 @@ A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls.

 ### Tool System

-Central tool registry (`tools/registry.py`) with 61 registered tools across 52 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).
+Central tool registry (`tools/registry.py`) with 70+ registered tools across ~28 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).

 → [Tools Runtime](./tools-runtime.md)

--- a/website/docs/developer-guide/browser-supervisor.md
+++ b/website/docs/developer-guide/browser-supervisor.md
@@ -217,7 +217,6 @@ Issue planned against `jo-inc/camofox-browser` adding:
 Unit tests use an asyncio mock CDP server that speaks enough of the protocol
 to exercise all state transitions: attach, enable, navigate, dialog fire,
 dialog dismiss, frame attach/detach, child target attach, session teardown.
-Real-backend E2E (Browserbase + local Chrome) is manual; probe scripts from
-the 2026-04-23 investigation kept in-repo under
-`scripts/browser_supervisor_e2e.py` so anyone can re-verify on new backend
-versions.
+Real-backend E2E (Browserbase + local Chrome) is manual — exercise via
+`/browser connect` to a live Chrome and run the dialog/frame test cases
+described above.
--- a/website/docs/developer-guide/contributing.md
+++ b/website/docs/developer-guide/contributing.md
@@ -50,6 +50,8 @@ export VIRTUAL_ENV="$(pwd)/venv"

 # Install with all extras (messaging, cron, CLI menus, dev tools)
 uv pip install -e ".[all,dev]"
+# tinker-atropos is a git submodule — needs `git submodule update --init` first
+# if you didn't clone with `--recurse-submodules`
 uv pip install -e "./tinker-atropos"

 # Optional: browser tools
--- a/website/docs/developer-guide/environments.md
+++ b/website/docs/developer-guide/environments.md
@@ -172,7 +172,7 @@ parser = get_parser("hermes")  # or "mistral", "llama3_json", "qwen", "deepseek_
 content, tool_calls = parser.parse(raw_model_output)
 ```

-Available parsers: `hermes`, `mistral`, `llama3_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1`, `kimi_k2`, `longcat`, `glm45`, `glm47`.
+Available parsers: `hermes`, `mistral`, `llama3_json`, `llama4_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1` (alias `deepseek_v31`), `kimi_k2`, `longcat`, `glm45`, `glm47`.

 In Phase 1 (OpenAI server type), parsers are not needed — the server handles tool call parsing natively.

--- a/website/docs/developer-guide/gateway-internals.md
+++ b/website/docs/developer-guide/gateway-internals.md
@@ -6,13 +6,13 @@ description: "How the messaging gateway boots, authorizes users, routes sessions

 # Gateway Internals

-The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.
+The messaging gateway is the long-running process that connects Hermes to 20+ external messaging platforms through a unified architecture.

 ## Key Files

 | File | Purpose |
 |------|---------|
-| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~12,000 lines) |
+| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (large file; check git for current LOC) |
 | `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
 | `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
 | `gateway/pairing.py` | DM pairing flow for user authorization |
@@ -162,7 +162,10 @@ gateway/platforms/
 ├── wecom.py             # WeCom (WeChat Work) callback
 ├── weixin.py            # Weixin (personal WeChat) via iLink Bot API
 ├── bluebubbles.py       # Apple iMessage via BlueBubbles macOS server
-├── qqbot.py             # QQ Bot (Tencent QQ) via Official API v2
+├── qqbot/               # QQ Bot (Tencent QQ) via Official API v2 (sub-package: adapter.py, crypto.py, keyboards.py, …)
+├── yuanbao.py           # Yuanbao (Tencent) DM/group adapter
+├── feishu_comment.py    # Feishu document/drive comment-reply handler
+├── msgraph_webhook.py   # Microsoft Graph change-notification webhook (Teams, Outlook, etc.)
 ├── webhook.py           # Inbound/outbound webhook adapter
 ├── api_server.py        # REST API server adapter
 └── homeassistant.py     # Home Assistant conversation integration
@@ -205,7 +208,7 @@ Gateway hooks are Python modules that respond to lifecycle events:
 | `agent:end` | Agent finishes and returns response |
 | `command:*` | Any slash command is executed |

-Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.
+Hooks are discovered from `gateway/builtin_hooks/` (an extension point — currently empty in the shipped distribution; `_register_builtin_hooks()` is a no-op stub) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.

 ## Memory Provider Integration

--- a/website/docs/developer-guide/provider-runtime.md
+++ b/website/docs/developer-guide/provider-runtime.md
@@ -40,7 +40,7 @@ That ordering matters because Hermes treats the saved model/provider choice as t

 ## Providers

-Current provider families include:
+Current provider families include (see `plugins/model-providers/` for the complete bundled set):

 - AI Gateway (Vercel)
 - OpenRouter
@@ -48,16 +48,27 @@ Current provider families include:
 - OpenAI Codex
 - Copilot / Copilot ACP
 - Anthropic (native)
-- Google / Gemini
-- Alibaba / DashScope
+- Google / Gemini (`gemini`, `google-gemini-cli`)
+- Alibaba / DashScope (`alibaba`, `alibaba-coding-plan`)
 - DeepSeek
 - Z.AI
-- Kimi / Moonshot
-- MiniMax
-- MiniMax China
+- Kimi / Moonshot (`kimi-coding`, `kimi-coding-cn`)
+- MiniMax (`minimax`, `minimax-cn`, `minimax-oauth`)
 - Kilo Code
 - Hugging Face
 - OpenCode Zen / OpenCode Go
+- AWS Bedrock
+- Azure Foundry
+- NVIDIA NIM
+- xAI (Grok)
+- Arcee
+- GMI Cloud
+- StepFun
+- Qwen OAuth
+- Xiaomi
+- Ollama Cloud
+- LM Studio
+- Tencent TokenHub
 - Custom (`provider: custom`) — first-class provider for any OpenAI-compatible endpoint
 - Named custom providers (`custom_providers` list in config.yaml)

@@ -154,7 +165,7 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that

 ## Fallback models

-Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
+Hermes supports a configured fallback provider chain — a list of `(provider, model)` entries tried in order when the primary model encounters errors. The legacy single-pair `fallback_model` dict is still accepted for back-compat (and migrated on first write).

 ### How it works internally

--- a/website/docs/guides/automation-templates.md
+++ b/website/docs/guides/automation-templates.md
@@ -74,7 +74,7 @@ Review for:
 - Missing tests for new behavior

 Post a concise review. If the PR is a trivial docs/typo change, say so briefly." \
-  --skills "github-code-review" \
+  --skill github-code-review \
  --deliver github_comment
 ```

@@ -296,7 +296,7 @@ Focus on:

 Skip routine dependency bumps and CI fixes. If nothing notable, respond with [SILENT].
 If there are findings, organize by repo with brief analysis of each item." \
-  --skills "competitive-pr-scout" \
+  --skill competitive-pr-scout \
  --name "Competitor scout" \
  --deliver telegram
 ```
@@ -335,7 +335,7 @@ Daily arXiv scan that saves summaries to your note-taking system.
 ```bash
 hermes cron create "0 8 * * *" \
  "Search arXiv for the 3 most interesting papers on 'language model reasoning' OR 'tool-use agents' from the past day. For each paper, create an Obsidian note with the title, authors, abstract summary, key contribution, and potential relevance to Hermes Agent development." \
-  --skills "arxiv,obsidian" \
+  --skill arxiv --skill obsidian \
  --name "Paper digest" \
  --deliver local
 ```
@@ -430,7 +430,7 @@ If action is 'closed' and pull_request.merged is true:
 5. Reference the original PR in the new PR description

 If action is not 'closed' or not merged, respond with [SILENT]." \
-  --skills "github-pr-workflow" \
+  --skill github-pr-workflow \
  --deliver log
 ```

@@ -514,7 +514,7 @@ hermes cron create "0 3 * * 0" \

 Write a security report with findings categorized by severity (Critical, High, Medium, Low).
 If nothing found, report a clean bill of health." \
-  --skills "codebase-security-audit" \
+  --skill codebase-security-audit \
  --name "Weekly security audit" \
  --deliver telegram
 ```
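
Note on the `--skills` to `--skill` change in the hunks above: a singular, repeatable flag is what `argparse` produces with `action="append"`. The sketch below mirrors the flag shapes used in the examples (positional schedule and prompt, repeatable `--skill`). The parser itself is hypothetical and stands in for the real `hermes cron create` spec, which this diff does not show.

```python
import argparse

# Hypothetical parser mirroring the flag shapes used above;
# this is not Hermes's actual argparse spec.
parser = argparse.ArgumentParser(prog="cron-create-sketch")
parser.add_argument("schedule")   # positional cron expression
parser.add_argument("prompt")     # positional prompt text
parser.add_argument("--skill", action="append", default=[])  # repeatable
parser.add_argument("--name")
parser.add_argument("--deliver")

ns = parser.parse_args([
    "0 8 * * *", "Summarize new arXiv papers",
    "--skill", "arxiv", "--skill", "obsidian",
    "--name", "Paper digest", "--deliver", "local",
])
print(ns.skill)  # ['arxiv', 'obsidian']
```

With this shape, each `--skill` occurrence adds one entry, so a comma-joined value such as `"arxiv,obsidian"` would arrive as a single skill name rather than two.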
--- a/website/docs/guides/cron-script-only.md
+++ b/website/docs/guides/cron-script-only.md
@@ -231,16 +231,15 @@ Silent when both filesystems are under 90%; fires exactly one line per over-thre

 | Approach | What runs | When to use |
 |----------|-----------|-------------|
-| `hermes send` (one-shot) | Any shell command piping into it | Ad-hoc delivery or as the action of an external scheduler (systemd, launchd) |
 | `cronjob --no-agent` (this page) | Your script on Hermes' schedule | Recurring watchdogs / alerts / metrics that don't need reasoning |
 | `cronjob` (default, LLM) | Agent with optional pre-check script | When the message content requires reasoning over data |
-| OS cron + `hermes send` | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) |
+| OS cron + `curl` to a [webhook subscription](/docs/user-guide/features/webhooks) | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) |

-For critical system-health watchdogs that must fire *even when the gateway is down*, keep using OS-level cron + a plain `curl` or `hermes send` call — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external.
+For critical system-health watchdogs that must fire *even when the gateway is down*, use OS-level cron with a plain `curl` to a Hermes webhook subscription (or any external alerting endpoint) — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external.

 ## Related

 - [Automate Anything with Cron](/docs/guides/automate-with-cron) — LLM-driven cron patterns.
 - [Scheduled Tasks (Cron) reference](/docs/user-guide/features/cron) — full schedule syntax, lifecycle, delivery routing.
-- [Pipe Script Output with `hermes send`](/docs/guides/pipe-script-output) — the one-shot counterpart for ad-hoc scripts.
+- [Webhook Subscriptions](/docs/user-guide/features/webhooks) — fire-and-forget HTTP entry points for external schedulers.
 - [Gateway Internals](/docs/developer-guide/gateway-internals) — delivery-router internals.
--- a/website/docs/guides/cron-troubleshooting.md
+++ b/website/docs/guides/cron-troubleshooting.md
@@ -38,7 +38,7 @@ If the job fires once and then disappears from the list, it's a one-shot schedul

 Cron jobs are fired by the gateway's background ticker thread, which ticks every 60 seconds. A regular CLI chat session does **not** automatically fire cron jobs.

-If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` or `hermes serve`). For one-off debugging, you can manually trigger a tick with `hermes cron tick`.
+If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` for foreground, or `hermes gateway start` for the installed service). For one-off debugging, you can manually trigger a tick with `hermes cron tick`.

 ### Check 4: Check the system clock and timezone

--- a/website/docs/guides/local-ollama-setup.md
+++ b/website/docs/guides/local-ollama-setup.md
@@ -31,11 +31,11 @@ By the end, you'll have:
 | **GPU** | Not required | NVIDIA GPU with 8+ GB VRAM speeds things up significantly |

 :::tip CPU-only works, but expect slower responses
-Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec) — each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, increase the API timeout in config:
+Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec) — each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, widen the API timeout via the env var (it's not a `config.yaml` key):

-```yaml
-agent:
-  api_timeout: 1800   # 30 minutes — generous for slow local models
+```bash
+# ~/.hermes/.env
+HERMES_API_TIMEOUT=1800   # 30 minutes — generous for slow local models
 ```
 :::

--- a/website/docs/guides/minimax-oauth.md
+++ b/website/docs/guides/minimax-oauth.md
@@ -56,10 +56,12 @@ hermes auth add minimax-oauth

 ### China region

-If your account is on the China platform (`minimaxi.com`), pass `--region cn`:
+If your account is on the China platform (`minimaxi.com`), use the China-region OAuth provider id `minimax-cn` instead, or skip OAuth and configure `MINIMAX_CN_API_KEY` / `MINIMAX_CN_BASE_URL` directly. The `--region cn` flag described in older docs is **not** wired through the CLI's argument parser; use the `minimax-cn` provider instead:

 ```bash
-hermes auth add minimax-oauth --region cn
+hermes auth add minimax-cn --type oauth   # if OAuth is supported on your CN account
+# or simpler:
+echo 'MINIMAX_CN_API_KEY=your-key' >> ~/.hermes/.env
 ```

 ### Remote / headless sessions
@@ -128,12 +130,12 @@ model:
  base_url: https://api.minimax.io/anthropic
 ```

-### `--region` flag
+### Region endpoints

-| Value | Portal | Inference endpoint |
-|-------|--------|-------------------|
-| `global` (default) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` |
-| `cn` | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` |
+| Provider id | Portal | Inference endpoint |
+|-------------|--------|-------------------|
+| `minimax-oauth` (global) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` |
+| `minimax-cn` (China) | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` |

 ### Provider aliases

--- a/website/docs/guides/operate-teams-meeting-pipeline.md
+++ b/website/docs/guides/operate-teams-meeting-pipeline.md
@@ -54,21 +54,32 @@ You MUST run `maintain-subscriptions` on a schedule. Pick one of these three opt

 #### Option 1: Hermes cron (recommended if you already run the Hermes gateway)

-Hermes ships a built-in cron scheduler. Add a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):
+Hermes ships a built-in cron scheduler. The `--no-agent` mode runs a script as the job (rather than using an LLM), and `--script` must point at a file under `~/.hermes/scripts/`. First create the script:

 ```bash
-hermes cron add \
+mkdir -p ~/.hermes/scripts
+cat > ~/.hermes/scripts/maintain-teams-subscriptions.sh <<'EOF'
+#!/usr/bin/env bash
+exec hermes teams-pipeline maintain-subscriptions
+EOF
+chmod +x ~/.hermes/scripts/maintain-teams-subscriptions.sh
+```
+
+Then register a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):
+
+```bash
+hermes cron create "0 */12 * * *" \
  --name "teams-pipeline-maintain-subscriptions" \
-  --schedule "0 */12 * * *" \
-  --script-only \
-  --command "hermes teams-pipeline maintain-subscriptions"
+  --no-agent \
+  --script maintain-teams-subscriptions.sh \
+  --deliver local
 ```

 Verify it was registered and inspect the next run time:

 ```bash
 hermes cron list
-hermes cron show teams-pipeline-maintain-subscriptions
+hermes cron status        # scheduler status
 ```

 #### Option 2: systemd timer (recommended for Linux production deployments)
--- a/website/docs/guides/python-library.md
+++ b/website/docs/guides/python-library.md
@@ -81,7 +81,8 @@ print(f"Messages exchanged: {len(result['messages'])}")
 The returned dictionary contains:
 - **`final_response`** — The agent's final text reply
 - **`messages`** — The complete message history (system, user, assistant, tool calls)
-- **`task_id`** — The task identifier used for VM isolation
+
+(The `task_id` you pass in is stored on the agent instance for VM isolation but isn't echoed back in the return dict.)

 You can also pass a custom system message that overrides the ephemeral system prompt for that call:

--- a/website/docs/guides/use-mcp-with-hermes.md
+++ b/website/docs/guides/use-mcp-with-hermes.md
@@ -143,7 +143,7 @@ Use `chrome-devtools-mcp`.
 If your Windows Chrome already has live remote debugging enabled from `chrome://inspect/#remote-debugging`, add it like this from WSL:

 ```bash
-hermes mcp add chrome-devtools-win --command cmd.exe --args /c "npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics"
+hermes mcp add chrome-devtools-win --command cmd.exe --args /c npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics
 ```

 After saving the server:
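
Note on the quoting change in the hunk above: the difference between the old and new command lines is how a POSIX shell splits them into tokens before the CLI sees anything. The sketch below uses Python's `shlex.split` as a stand-in for that shell tokenization. It shows only the token counts; how the CLI then consumes those tokens depends on its argparse spec, which is not part of this diff.

```python
import shlex

# Old form: the npx command line is wrapped in double quotes.
quoted = shlex.split(
    '--args /c "npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics"'
)

# New form: no surrounding quotes, so every word is its own token.
unquoted = shlex.split(
    "--args /c npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics"
)

print(len(quoted))    # 3
print(len(unquoted))  # 7
print(quoted[2])      # the whole npx command line as one string
```

In the quoted form everything after `/c` reaches the program as one string; in the unquoted form `npx`, `-y`, the package name, and each flag arrive as separate tokens.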
--- a/website/docs/index.md
+++ b/website/docs/index.md
@@ -47,7 +47,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
 | 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
 | ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
 | 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, WhatsApp, Teams, or more |
-| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 68 built-in tools and how to configure them |
+| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 70+ built-in tools and how to configure them |
 | 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
 | 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
 | 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
@@ -65,7 +65,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl

 - **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling
 - **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing
-- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, BlueBubbles, Home Assistant, Microsoft Teams — 15+ platforms from one gateway
+- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ Bot, Yuanbao, BlueBubbles, Home Assistant, Microsoft Teams, Google Chat, and more — 20+ platforms from one gateway
 - **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint
 - **Scheduled automations** — Built-in cron with delivery to any platform
 - **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls
--- a/website/docs/integrations/providers.md
+++ b/website/docs/integrations/providers.md
@@ -378,8 +378,8 @@ bedrock:
  # profile: "myprofile"       # or set AWS_PROFILE
  # discovery: true            # auto-discover region from IAM
  # guardrail:                 # optional Bedrock Guardrails
-  #   id: "your-guardrail-id"
-  #   version: "DRAFT"
+  #   guardrail_identifier: "your-guardrail-id"
+  #   guardrail_version: "DRAFT"
 ```

 Authentication uses the standard boto3 chain: explicit `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY`, `AWS_PROFILE` from `~/.aws/credentials`, IAM role on EC2/ECS/Lambda, IMDS, or SSO. No env var is required if you're already authenticated with the AWS CLI.
@@ -484,7 +484,7 @@ For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://loca

 ### GMI Cloud

-Open and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication.
+Open and reasoning models via [GMI Cloud](https://www.gmicloud.ai/) — OpenAI-compatible API, API key authentication.

 ```bash
 # GMI Cloud
@@ -499,7 +499,7 @@ model:
  default: "deepseek-ai/DeepSeek-R1"
 ```

-The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`).
+The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi-serving.com/v1`).

 ### StepFun

@@ -1393,24 +1393,34 @@ Notes:
 - See OpenRouter's [Pareto Router docs](https://openrouter.ai/docs/guides/routing/routers/pareto-router) for the full router behavior.
 - To use the Pareto Code router for a specific **auxiliary task** (compression, vision, etc.) instead of the main agent, set `extra_body.plugins` under that task — see [Auxiliary Models → OpenRouter routing & Pareto Code for auxiliary tasks](/docs/user-guide/configuration#openrouter-routing--pareto-code-for-auxiliary-tasks).

-## Fallback Model
+## Fallback Providers

-Configure a backup provider:model that Hermes switches to automatically when your primary model fails (rate limits, server errors, auth failures):
+Configure a chain of backup providers Hermes tries in order when the primary model fails (rate limits, server errors, auth failures). The canonical format is a top-level `fallback_providers:` list:
+
+```yaml
+fallback_providers:
+  - provider: openrouter
+    model: anthropic/claude-sonnet-4
+  - provider: anthropic
+    model: claude-sonnet-4
+    # base_url: http://localhost:8000/v1    # optional, for custom endpoints
+    # api_mode: chat_completions           # optional override
+```
+
+The legacy single-pair `fallback_model:` dict is still accepted for back-compat:

 ```yaml
 fallback_model:
-  provider: openrouter                    # required
-  model: anthropic/claude-sonnet-4        # required
-  # base_url: http://localhost:8000/v1    # optional, for custom endpoints
-  # key_env: MY_CUSTOM_KEY               # optional, env var name for custom endpoint API key
+  provider: openrouter
+  model: anthropic/claude-sonnet-4
 ```

-When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
+When activated, the fallback swaps the model and provider mid-session without losing your conversation. Hermes tries the chain entry by entry, in the order listed, and fallback activates at most once per session.

-Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `tencent-tokenhub`, `custom`.
+Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `azure-foundry`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `lmstudio`, `alibaba`, `alibaba-coding-plan`, `tencent-tokenhub`, `custom`.

 :::tip
-Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
+Fallback is configured through `config.yaml`, or interactively via `hermes fallback`. For full details on when it triggers, how the chain advances, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
 :::

 ---
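
As a rough illustration of the chain semantics described in the Fallback Providers hunk above, the following Python sketch walks a `fallback_providers`-shaped list in order and stops at the first entry that succeeds. `pick_fallback` and `try_provider` are hypothetical names chosen for this example, not Hermes functions, and the real activation logic may differ.

```python
from typing import Callable, Optional

# Same shape as the `fallback_providers:` YAML list, already parsed into dicts.
FALLBACK_PROVIDERS = [
    {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
    {"provider": "anthropic", "model": "claude-sonnet-4"},
]


def pick_fallback(chain: list[dict], try_provider: Callable[[dict], bool]) -> Optional[dict]:
    """Return the first entry for which try_provider succeeds, else None.

    Hypothetical helper: `try_provider` stands in for whatever request
    attempt or health check a real gateway would perform.
    """
    for entry in chain:
        # Skip malformed entries rather than failing the whole chain.
        if not entry.get("provider") or not entry.get("model"):
            continue
        if try_provider(entry):
            return entry
    return None


# Usage: pretend openrouter is rate-limited, so the second entry is chosen.
chosen = pick_fallback(FALLBACK_PROVIDERS, lambda e: e["provider"] != "openrouter")
print(chosen)
```

Returning `None` when every entry fails leaves the caller free to surface the original error instead of masking it.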
--- a/website/docs/user-guide/messaging/google_chat.md
+++ b/website/docs/user-guide/messaging/google_chat.md
@@ -164,10 +164,10 @@ GOOGLE_CHAT_MAX_BYTES=16777216                  # 16 MiB — cap on in-flight me
 The project ID also falls back to `GOOGLE_CLOUD_PROJECT`, and the SA path falls
 back to `GOOGLE_APPLICATION_CREDENTIALS` — use whichever convention you prefer.

-Install Hermes with the optional dependencies:
+No Hermes extra is currently published for Google Chat, so install the adapter's dependencies directly:

 ```bash
-pip install 'hermes-agent[google_chat]'
+pip install google-cloud-pubsub google-api-python-client google-auth google-auth-oauthlib
 ```

 Start the gateway:
--- a/website/docs/user-guide/messaging/index.md
+++ b/website/docs/user-guide/messaging/index.md
@@ -386,7 +386,7 @@ Each platform has its own toolset:
 | Discord | `hermes-discord` | Full tools including terminal |
 | WhatsApp | `hermes-whatsapp` | Full tools including terminal |
 | Slack | `hermes-slack` | Full tools including terminal |
-| Google Chat | `hermes-google-chat` | Full tools including terminal |
+| Google Chat | `hermes-google_chat` | Full tools including terminal |
 | Signal | `hermes-signal` | Full tools including terminal |
 | SMS | `hermes-sms` | Full tools including terminal |
 | Email | `hermes-email` | Full tools including terminal |
@@ -402,7 +402,7 @@ Each platform has its own toolset:
 | QQBot | `hermes-qqbot` | Full tools including terminal |
 | Yuanbao | `hermes-yuanbao` | Full tools including terminal |
 | Microsoft Teams | `hermes-teams` | Full tools including terminal |
-| API Server | `hermes` (default) | Full tools including terminal |
+| API Server | `hermes-api-server` | Full tools (drops `clarify`, `send_message`, `text_to_speech` — programmatic access doesn't have an interactive user) |
 | Webhooks | `hermes-webhook` | Full tools including terminal |

 ## Next Steps
--- a/website/docs/user-guide/messaging/open-webui.md
+++ b/website/docs/user-guide/messaging/open-webui.md
@@ -275,16 +275,22 @@ To run separate Hermes instances per user — each with their own config, memory

 ### 1. Create profiles and configure API servers

+`API_SERVER_*` are environment variables, not YAML config keys, so write them to each profile's `.env`. Pick ports that do not collide with the default platform adapters (`8644` is the webhook adapter, `8645` is wecom-callback, `8646` is msgraph-webhook), for example `8650` and up:
+
 ```bash
 hermes profile create alice
-hermes -p alice config set API_SERVER_ENABLED true
-hermes -p alice config set API_SERVER_PORT 8643
-hermes -p alice config set API_SERVER_KEY alice-secret
+cat >> ~/.hermes/profiles/alice/.env <<'EOF'
+API_SERVER_ENABLED=true
+API_SERVER_PORT=8650
+API_SERVER_KEY=alice-secret
+EOF

 hermes profile create bob
-hermes -p bob config set API_SERVER_ENABLED true
-hermes -p bob config set API_SERVER_PORT 8644
-hermes -p bob config set API_SERVER_KEY bob-secret
+cat >> ~/.hermes/profiles/bob/.env <<'EOF'
+API_SERVER_ENABLED=true
+API_SERVER_PORT=8651
+API_SERVER_KEY=bob-secret
+EOF
 ```

 ### 2. Start each gateway
@@ -300,8 +306,8 @@ In **Admin Settings** → **Connections** → **OpenAI API** → **Manage**, add

 | Connection | URL | API Key |
 |-----------|-----|---------|
-| Alice | `http://host.docker.internal:8643/v1` | `alice-secret` |
-| Bob | `http://host.docker.internal:8644/v1` | `bob-secret` |
+| Alice | `http://host.docker.internal:8650/v1` | `alice-secret` |
+| Bob | `http://host.docker.internal:8651/v1` | `bob-secret` |

 The model dropdown will show `alice` and `bob` as distinct models. You can assign models to Open WebUI users via the admin panel, giving each user their own isolated Hermes agent.
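
For setups with more than two profiles, the port choice and `.env` contents from step 1 can be generated rather than typed by hand. The Python sketch below is only an illustration under stated assumptions: `allocate_ports` and `render_env` are hypothetical helpers, and the reserved set contains only the three default ports named in this patch, so other adapters in a given deployment may occupy additional ports.

```python
# Ports the patch text lists as taken by default adapters
# (webhook, wecom-callback, msgraph-webhook).
RESERVED_PORTS = {8644, 8645, 8646}


def allocate_ports(profiles, start=8650):
    """Assign one port per profile, counting up from `start` and skipping reserved ports."""
    ports = {}
    port = start
    for name in profiles:
        while port in RESERVED_PORTS:
            port += 1
        ports[name] = port
        port += 1
    return ports


def render_env(port, key):
    """Render the three API_SERVER_* lines for one profile's .env file."""
    return (
        "API_SERVER_ENABLED=true\n"
        f"API_SERVER_PORT={port}\n"
        f"API_SERVER_KEY={key}\n"
    )


# Usage: print the .env fragment for the `alice` profile.
ports = allocate_ports(["alice", "bob"])
print(render_env(ports["alice"], "alice-secret"))
```

The output is plain text on purpose, so it can be reviewed before being appended to a profile's `.env`.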

--- a/website/docs/user-guide/messaging/qqbot.md
+++ b/website/docs/user-guide/messaging/qqbot.md
@@ -55,7 +55,7 @@ QQ_CLIENT_SECRET=your-app-secret
 | `QQ_ALLOW_ALL_USERS` | Set to `true` to allow all DMs | `false` |
 | `QQ_PORTAL_HOST` | Override the QQ portal host (set to `sandbox.q.qq.com` for sandbox routing) | `q.qq.com` |
 | `QQ_STT_API_KEY` | API key for voice-to-text provider | — |
-| `QQ_STT_BASE_URL` | Base URL for STT provider | `https://open.bigmodel.cn/api/coding/paas/v4` |
+| `QQ_STT_BASE_URL` | Not read by the adapter; set `platforms.qqbot.extra.stt.baseUrl` in `config.yaml` instead | n/a |
 | `QQ_STT_MODEL` | STT model name | `glm-asr` |

 ## Advanced Configuration
@@ -64,7 +64,7 @@ For fine-grained control, add platform settings to `~/.hermes/config.yaml`:

 ```yaml
 platforms:
-  qq:
+  qqbot:
    enabled: true
    extra:
      app_id: "your-app-id"
--- a/website/docs/user-guide/messaging/sms.md
+++ b/website/docs/user-guide/messaging/sms.md
@@ -108,7 +108,7 @@ hermes gateway
 You should see:

 ```
-[sms] Twilio webhook server listening on 0.0.0.0:8080, from: +1555***4567
+[sms] Twilio webhook server listening on 127.0.0.1:8080, from: +1555***4567
 ```

 If you see `Refusing to start: SMS_WEBHOOK_URL is required`, set `SMS_WEBHOOK_URL` to the public URL configured in your Twilio Console (see Step 3).
--- a/website/docs/user-guide/messaging/teams-meetings.md
+++ b/website/docs/user-guide/messaging/teams-meetings.md
@@ -25,7 +25,7 @@ The pipeline:
 4. stores durable job state and sink records locally
 5. can write summaries to Notion, Linear, and Microsoft Teams

-Operator actions stay in the CLI:
+Operator actions stay in the CLI. The `teams-pipeline` subcommand is registered by the `teams_pipeline` plugin, so enable it first via `hermes plugins enable teams_pipeline` or by setting `plugins.enabled: [teams_pipeline]` in `config.yaml`:

 ```bash
 hermes teams-pipeline validate