docs(config): document session_search auxiliary controls

2026-04-25 00:51:20 +00:00 · 2026-04-19 13:11:22 -06:00 · 2026-04-19 13:11:22 -06:00 · afba54364e
commit afba54364e
parent 6ab78401c9
3 changed files with 64 additions and 0 deletions
--- a/website/docs/user-guide/features/fallback-providers.md
+++ b/website/docs/user-guide/features/fallback-providers.md
@@ -215,6 +215,9 @@ auxiliary:
  session_search:
    provider: "auto"
    model: ""
+    timeout: 30
+    max_concurrency: 3
+    extra_body: {}

  skills_hub:
    provider: "auto"
@@ -248,6 +251,25 @@ fallback_model:
  # base_url: http://localhost:8000/v1               # Optional custom endpoint
 ```

+For `auxiliary.session_search`, Hermes also supports:
+
+- `max_concurrency` to limit how many session summaries run at once
+- `extra_body` to pass provider-specific, OpenAI-compatible request fields through on the summarization calls
+
+Example:
+
+```yaml
+auxiliary:
+  session_search:
+    provider: main
+    model: glm-4.5-air
+    max_concurrency: 2
+    extra_body:
+      enable_thinking: false
+```
+
+If your provider does not support a native OpenAI-compatible reasoning-control field, `extra_body` cannot control reasoning for you. `max_concurrency` is still useful in that case, because it reduces the request bursts that trigger HTTP 429 (rate limit) responses.
+
 All three — auxiliary, compression, fallback — work the same way: set `provider` to pick who handles the request, `model` to pick which model, and `base_url` to point at a custom endpoint (overrides provider).

 ### Provider Options for Auxiliary Tasks