From 55fac8a38682e79cb2b4cc06e88a3faa2a016d8f Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Sat, 11 Apr 2026 11:13:48 -0700
Subject: [PATCH] docs: add warning about summary model context length
 requirement (#7879)

The summary model used for context compaction must have a context window
at least as large as the main agent model. If it's smaller, the
summarization API call fails and middle turns are dropped without a
summary, silently losing conversation context.

Promoted the existing note in configuration.md to a visible warning
admonition, and added a matching warning in the developer guide's
context compression page.
---
 .../docs/developer-guide/context-compression-and-caching.md | 4 ++++
 website/docs/user-guide/configuration.md                    | 4 +++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/website/docs/developer-guide/context-compression-and-caching.md b/website/docs/developer-guide/context-compression-and-caching.md
index 98dc0a6e2a..d17f45b95b 100644
--- a/website/docs/developer-guide/context-compression-and-caching.md
+++ b/website/docs/developer-guide/context-compression-and-caching.md
@@ -143,6 +143,10 @@ to find the parent assistant message, keeping groups intact.
 
 ### Phase 3: Generate Structured Summary
 
+:::warning Summary model context length
+The summary model must have a context window **at least as large** as the main agent model's. The entire middle section is sent to the summary model in a single `call_llm(task="compression")` call. If the summary model's context is smaller, the API returns a context-length error — `_generate_summary()` catches it, logs a warning, and returns `None`. The compressor then drops the middle turns **without a summary**, silently losing conversation context. This is the most common cause of degraded compaction quality.
+:::
+
 The middle turns are summarized using the auxiliary LLM with a structured
 template:
 
diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md
index a8cb23f99a..7b735bbdee 100644
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@@ -480,7 +480,9 @@ Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth.
 | `nous` / `openrouter` / etc. | not set | Force that provider, use its auth |
 | any | set | Use the custom endpoint directly (provider ignored) |
 
-The `summary_model` must support a context length at least as large as your main model's, since it receives the full middle section of the conversation for compression.
+:::warning Summary model context length requirement
+The `summary_model` **must** have a context window at least as large as your main agent model's. The compressor sends the full middle section of the conversation to the summary model — if that model's context window is smaller than the main model's, the summarization call will fail with a context length error. When this happens, the middle turns are **dropped without a summary**, losing conversation context silently. If you override `summary_model`, verify its context length meets or exceeds your main model's.
+:::
 
 ## Context Engine
 