fix(context): align guidance with 64k minimum

2026-07-20 15:33:54 +00:00 · 2026-05-24 20:31:27 -06:00 · 2026-05-24 20:31:27 -06:00 · 3b839f4369
commit 3b839f4369
parent 1d5deac346
6 changed files with 41 additions and 35 deletions
--- a/website/docs/reference/faq.md
+++ b/website/docs/reference/faq.md
@@ -82,7 +82,7 @@ hermes model
 # API base URL: http://localhost:11434/v1
 # API key: ollama
 # Model name: qwen3.5:27b
-# Context length: 32768   ← set this to match your server's actual context window
+# Context length: 64000   ← Hermes minimum; set this to match your server's actual context window
 ```

 Or configure it directly in `config.yaml`:
@@ -99,7 +99,7 @@ Hermes persists the endpoint, provider, and base URL in `config.yaml` so it surv
 This works with Ollama, vLLM, llama.cpp server, SGLang, LocalAI, and others. See the [Configuration guide](../user-guide/configuration.md) for details.

 :::tip Ollama users
-If you set a custom `num_ctx` in Ollama (e.g., `ollama run --num_ctx 16384`), make sure to set the matching context length in Hermes — Ollama's `/api/show` reports the model's *maximum* context, not the effective `num_ctx` you configured.
+If you set a custom `num_ctx` in Ollama (e.g., `ollama run --num_ctx 64000`), make sure to set the matching context length in Hermes — Ollama's `/api/show` reports the model's *maximum* context, not the effective `num_ctx` you configured.
 :::

 :::tip Timeouts with local models
@@ -340,7 +340,7 @@ custom_providers:
    base_url: "http://localhost:11434/v1"
    models:
      qwen3.5:27b:
-        context_length: 32768
+        context_length: 64000
 ```

 See [Context Length Detection](../integrations/providers.md#context-length-detection) for how auto-detection works and all override options.