From a9602d27e7c7a4706b8efc8c20a3cc93be7117fc Mon Sep 17 00:00:00 2001
From: Railway9784
Date: Fri, 19 Jun 2026 23:11:00 -0700
Subject: [PATCH] docs(skill): document context_length auto-detection resolution chain

When model.context_length is set in config.yaml, it blocks auto-detection
from the server's /v1/models endpoint. The skill incorrectly implied a
hard fallback to 131072. Add the resolution chain and the fix command
(hermes config set model.context_length "") to both the config table and
a new troubleshooting section.
---
 .../autonomous-ai-agents-hermes-agent.md | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md
index 089ea173923..8a29c919716 100644
--- a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md
+++ b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md
@@ -377,7 +377,7 @@ Edit with `hermes config edit` or `hermes config set section.key value`.
 | Section | Key options |
 |---------|-------------|
-| `model` | `default`, `provider`, `base_url`, `api_key`, `context_length` |
+| `model` | `default`, `provider`, `base_url`, `api_key`, `context_length` (explicit override; clear to `""` for auto-detect from server `/v1/models`) |
 | `agent` | `max_turns` (90), `tool_use_enforcement` |
 | `terminal` | `backend` (local/docker/ssh/modal), `cwd`, `timeout` (180) |
 | `compression` | `enabled`, `threshold` (0.50), `target_ratio` (0.20) |
@@ -875,6 +875,22 @@ hermes config set auxiliary.vision.model
 ```
 
 ---
+### Context window shows wrong size
+
+If Hermes reports a smaller context window than your local model supports
+(e.g., 128k when llama-server has `-c 262144`):
+
+**Check if `model.context_length` is explicitly set.** Hermes uses a
+multi-source resolution chain (highest priority first):
+
+1. `model.context_length` in config.yaml — **blocks auto-detection if set**
+2. Custom provider per-model setting
+3. Persistent cache (survives restarts)
+4. `/v1/models` endpoint from your server — auto-detected when nothing
+   above overrides it
+
+**Fix:** Clear the override so auto-detection falls through:
+
 ## Where to Find Things
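
Review note: the resolution chain this patch documents reads as a first-match-wins lookup over four sources. The sketch below is only an illustration of that ordering, not Hermes code: the function name, its parameters, and the choice to represent a cleared or missing value as `None` are all assumptions made for the example.

```python
from typing import Optional


def resolve_context_length(
    config_value: Optional[int],    # model.context_length in config.yaml
    provider_value: Optional[int],  # custom provider per-model setting
    cached_value: Optional[int],    # persistent cache
    server_value: Optional[int],    # value reported by the server's /v1/models
) -> Optional[int]:
    """Hypothetical helper: return the first source that is set, highest priority first."""
    for candidate in (config_value, provider_value, cached_value, server_value):
        if candidate is not None:  # assumption: None stands for "cleared / not set"
            return candidate
    return None


# An explicit 131072 in the config shadows the 262144 the server would report.
stale = resolve_context_length(131072, None, None, 262144)

# With the override cleared, the lookup falls through to the server value.
cleared = resolve_context_length(None, None, None, 262144)
```

Under this reading, the command given in the commit message, `hermes config set model.context_length ""`, corresponds to passing `None` as the first argument. The same ordering suggests that a per-model provider setting or a cached value, if present, would still take precedence over the server's report.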