docs: comprehensive docs audit — fix 12 stale/missing items across 10 pages (#3618)

Fixes found by auditing docs against recent PRs/commits: Critical (misleading): - hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (#3542). Add correct callback signatures. - security.md: Update tirith verdict behavior — block verdicts now go through approval flow instead of hard-blocking (#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (#3593). New feature docs: - configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from #3551/#3528. - configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from #3597. - api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from #3572/#3573/#3576/#3580/#3530. Stale/incomplete: - configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484). - environment-variables.md: Specify actual DashScope default URL. - cli-commands.md: Add alibaba to --provider list. - fallback-providers.md: Add Alibaba/DashScope to provider table. - email.md: Document noreply/automated sender filtering (#3606). - toolsets-reference.md: Add 4 missing platform toolsets — matrix, mattermost, dingtalk, api-server (#3583). - skills.md: List default GitHub taps including garrytan/gstack (#3605).
2026-04-25 00:51:20 +00:00 · 2026-03-28 15:26:35 -07:00 · 2026-03-28 15:26:35 -07:00 · 02fb7c4aaf
commit 02fb7c4aaf
parent 1e924e99b9
10 changed files with 62 additions and 12 deletions
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@ -216,7 +216,7 @@ hermes chat --provider minimax-cn --model MiniMax-M2.7
 # Requires: MINIMAX_CN_API_KEY in ~/.hermes/.env

 # Alibaba Cloud / DashScope (Qwen models)
-hermes chat --provider alibaba --model qwen-plus
+hermes chat --provider alibaba --model qwen3.5-plus
 # Requires: DASHSCOPE_API_KEY in ~/.hermes/.env
 ```

@ -1026,6 +1026,7 @@ auxiliary:
    model: ""                  # e.g. "google/gemini-2.5-flash"
    base_url: ""
    api_key: ""
+    timeout: 30                # seconds

  # Dangerous command approval classifier
  approval:
@ -1033,8 +1034,17 @@ auxiliary:
    model: ""
    base_url: ""
    api_key: ""
+    timeout: 30                # seconds
+
+  # Context compression timeout (separate from compression.* config)
+  compression:
+    timeout: 120               # seconds — compression summarizes long conversations, needs more time
 ```

+:::tip
+Each auxiliary task has a configurable `timeout` (in seconds). Defaults: vision 30s, web_extract 30s, approval 30s, compression 120s. Increase these if you use slow local models for auxiliary tasks.
+:::
+
 :::info
 Context compression has its own top-level `compression:` block with `summary_provider`, `summary_model`, and `summary_base_url` — see [Context Compression](#context-compression) above. The fallback model uses a `fallback_model:` block — see [Fallback Model](#fallback-model) above. All three follow the same provider/model/base_url pattern.
 :::
@ -1166,6 +1176,24 @@ You can also change the reasoning effort at runtime with the `/reasoning` comman
 /reasoning hide      # Hide model thinking
 ```

+## Tool-Use Enforcement
+
+Some models (especially GPT-family) occasionally describe intended actions as text instead of making tool calls. Tool-use enforcement injects guidance that steers the model back to actually calling tools.
+
+```yaml
+agent:
+  tool_use_enforcement: "auto"   # "auto" | true | false | ["model-substring", ...]
+```
+
+| Value | Behavior |
+|-------|----------|
+| `"auto"` (default) | Enabled for GPT models (`gpt-`, `openai/gpt-`) and disabled for all others. |
+| `true` | Always enabled for all models. |
+| `false` | Always disabled. |
+| `["gpt-", "o1-", "custom-model"]` | Enabled only for models whose name contains one of the listed substrings. |
+
+When enabled, the system prompt includes guidance reminding the model to make actual tool calls rather than describing what it would do. This is transparent to the user and has no effect on models that already use tools reliably.
+
 ## TTS Configuration

 ```yaml