From 860cf28dabbaf93459a778a835edbc3663e381c5 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Fri, 29 May 2026 19:59:04 -0700
Subject: [PATCH] docs: clarify compression threshold is derived from the main model's context window (#35099)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The compression threshold is threshold × context_length where context_length
is the MAIN agent model's window, not the auxiliary/summary model's. On a
262,144-token model at the default 0.50 the threshold is 131,072 — close to a
common 128K figure by coincidence of the percentage, which has led to
confusion that the auxiliary model's context limit is the trigger. Add a note
preempting that misreading and pointing to the separate
summary-model-context constraint.
---
 .../context-compression-and-caching.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/website/docs/developer-guide/context-compression-and-caching.md b/website/docs/developer-guide/context-compression-and-caching.md
index 4b511756181..55641b16f27 100644
--- a/website/docs/developer-guide/context-compression-and-caching.md
+++ b/website/docs/developer-guide/context-compression-and-caching.md
@@ -111,6 +111,17 @@ tail_token_budget = 100,000 × 0.20 = 20,000
 max_summary_tokens = min(200,000 × 0.05, 12,000) = 10,000
 ```
 
+:::note Threshold is derived from the MAIN model's context window
+`threshold_tokens` is always `threshold × context_length`, where `context_length`
+is the **main agent model's** context window — never the auxiliary/summary
+model's. On a 262,144-token model at the default `0.50`, the threshold is
+`262,144 × 0.50 = 131,072`. That number being close to a common "128K context"
+is a coincidence of the percentage, not a sign that the auxiliary model's window
+is the trigger. The auxiliary model's context window is a separate concern — see
+the "Summary model context length" warning below for how it affects whether a
+summary can be produced, not when compression fires.
+:::
+
 ## Compression Algorithm
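
The arithmetic in the note can be checked with a short sketch. This is an illustration only, not the project's implementation: the function name `compression_threshold_tokens` and its parameter names are hypothetical, and the only facts taken from the patch are the formula `threshold × context_length`, the default `0.50`, and the 262,144-token example.

```python
def compression_threshold_tokens(main_context_length: int, threshold: float = 0.50) -> int:
    """Return the token count at which compression would fire.

    Hypothetical helper: only the MAIN model's context window is an input.
    The auxiliary/summary model's window is deliberately not a parameter.
    """
    if main_context_length <= 0:
        raise ValueError("main_context_length must be positive")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    return int(main_context_length * threshold)


# Example from the note: a 262,144-token main model at the default 0.50.
threshold_tokens = compression_threshold_tokens(262_144)
print(threshold_tokens)  # 131072

# 131,072 is near, but not equal to, a "128K" (128,000-token) window.
print(threshold_tokens - 128_000)  # 3072

# The 200,000-token example used elsewhere in the same doc section.
print(compression_threshold_tokens(200_000))  # 100000
```

Because the auxiliary model's window never appears in the signature, changing the summary model cannot move the trigger point in this sketch; it would only matter for whether a summary fits, which is the separate constraint the note points to.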