mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-04 07:31:58 +00:00
fix(context): align guidance with 64k minimum
parent 1d5deac346
commit 3b839f4369
6 changed files with 41 additions and 35 deletions
@@ -82,7 +82,7 @@ hermes model
 # API base URL: http://localhost:11434/v1
 # API key: ollama
 # Model name: qwen3.5:27b
-# Context length: 32768 ← set this to match your server's actual context window
+# Context length: 64000 ← Hermes minimum; set this to match your server's actual context window
 ```
 
 Or configure it directly in `config.yaml`:
@@ -99,7 +99,7 @@ Hermes persists the endpoint, provider, and base URL in `config.yaml` so it surv
 This works with Ollama, vLLM, llama.cpp server, SGLang, LocalAI, and others. See the [Configuration guide](../user-guide/configuration.md) for details.
 
 :::tip Ollama users
-If you set a custom `num_ctx` in Ollama (e.g., `ollama run --num_ctx 16384`), make sure to set the matching context length in Hermes — Ollama's `/api/show` reports the model's *maximum* context, not the effective `num_ctx` you configured.
+If you set a custom `num_ctx` in Ollama (e.g., `ollama run --num_ctx 64000`), make sure to set the matching context length in Hermes — Ollama's `/api/show` reports the model's *maximum* context, not the effective `num_ctx` you configured.
 :::
 
 :::tip Timeouts with local models
@@ -340,7 +340,7 @@ custom_providers:
     base_url: "http://localhost:11434/v1"
     models:
       qwen3.5:27b:
-        context_length: 32768
+        context_length: 64000
 ```
 
 See [Context Length Detection](../integrations/providers.md#context-length-detection) for how auto-detection works and all override options.
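
Reviewer note: the rule these hunks document (keep the Hermes context length at or above the 64k minimum, and keep it equal to the server's effective `num_ctx`) can be sketched as a small check. This is a minimal illustration, not code from the Hermes repository. The helper name `check_context_length` and the constant `HERMES_MIN_CONTEXT` are hypothetical, and the 64000 value is taken from the updated docs above.

```python
# Hypothetical constant: 64000 is the minimum named in the updated docs.
HERMES_MIN_CONTEXT = 64_000


def check_context_length(configured, server_num_ctx=None, minimum=HERMES_MIN_CONTEXT):
    """Return a list of problems; an empty list means the settings are consistent.

    configured     -- context_length set in Hermes (e.g. in config.yaml)
    server_num_ctx -- effective num_ctx on the local server, if known
    minimum        -- smallest context length the docs recommend
    """
    problems = []
    if configured < minimum:
        problems.append(
            f"configured context_length {configured} is below the minimum {minimum}"
        )
    if server_num_ctx is not None and configured != server_num_ctx:
        problems.append(
            f"configured context_length {configured} does not match "
            f"server num_ctx {server_num_ctx}"
        )
    return problems


if __name__ == "__main__":
    # Old guidance: too small, and mismatched with the server setting.
    for problem in check_context_length(32768, server_num_ctx=16384):
        print("warning:", problem)
    # New guidance: both values agree at the minimum, so nothing is reported.
    print(check_context_length(64000, server_num_ctx=64000))
```

The server value is passed in explicitly because, as the Ollama tip in this diff notes, `/api/show` reports the model's maximum context, not the `num_ctx` actually in effect.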