mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-09 08:21:50 +00:00
docs: clarify compression threshold is derived from the main model's context window (#35099)
The compression threshold is threshold × context_length where context_length is the MAIN agent model's window, not the auxiliary/summary model's. On a 262,144-token model at the default 0.50 the threshold is 131,072 — close to a common 128K figure by coincidence of the percentage, which has led to confusion that the auxiliary model's context limit is the trigger. Add a note preempting that misreading and pointing to the separate summary-model-context constraint.
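The arithmetic in this message can be checked with a minimal sketch. The helper name `compression_threshold_tokens` and its parameters are hypothetical and are not taken from the hermes-agent code; the only thing assumed from the text above is the formula `threshold × context_length`, with `context_length` being the main model's window.

```python
def compression_threshold_tokens(main_context_length: int, threshold: float = 0.50) -> int:
    """Return the token count at which compression would fire.

    Hypothetical helper for illustration only. The auxiliary/summary model's
    context window is deliberately not a parameter: per the commit message,
    it plays no role in when compression triggers.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in the range (0, 1]")
    return int(main_context_length * threshold)


# Main model with a 262,144-token window at the default 0.50 threshold.
print(compression_threshold_tokens(262_144))  # 131072

# A 200,000-token main model at the same default.
print(compression_threshold_tokens(200_000))  # 100000
```

Because the result depends only on the main model's window and the configured fraction, swapping the auxiliary model for one with a larger or smaller window would leave both printed values unchanged.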
parent
fb0ab27649
commit
860cf28dab
1 changed file with 11 additions and 0 deletions
@@ -111,6 +111,17 @@ tail_token_budget = 100,000 × 0.20 = 20,000
max_summary_tokens = min(200,000 × 0.05, 12,000) = 10,000
```
:::note Threshold is derived from the MAIN model's context window
`threshold_tokens` is always `threshold × context_length`, where `context_length`
is the **main agent model's** context window — never the auxiliary/summary
model's. On a 262,144-token model at the default `0.50`, the threshold is
`262,144 × 0.50 = 131,072`. That number being close to a common "128K context"
is a coincidence of the percentage, not a sign that the auxiliary model's window
is the trigger. The auxiliary model's context window is a separate concern — see
the "Summary model context length" warning below for how it affects whether a
summary can be produced, not when compression fires.
:::
## Compression Algorithm