mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-27 11:22:03 +00:00
fix(agent): trigger preflight compression on few-but-huge sessions (#27405)
The preflight-compression gate only ran the (expensive) token estimate when the message COUNT exceeded protect_first_n + protect_last_n + 1. A session with a handful of very large messages never tripped the count condition, so compression was never attempted and the turn eventually hit a hard context-overflow error. Add _should_run_preflight_estimate() with OR semantics: run the estimate when either the message count exceeds the protected ranges (the historical gate) OR a cheap char-based estimate already crosses the configured threshold. The downstream estimate_request_tokens_rough() stays authoritative — this is only a hint that decides whether to pay for the full estimate. Salvaged from #27435 by @texhy (authorship preserved). Re-applied on current main: the preflight gate moved from conversation_loop.py to turn_context.py since the PR was opened, so the helper + gate are placed there; the test imports the real MINIMUM_CONTEXT_LENGTH instead of a hardcoded literal. Closes #27405.
This commit is contained in:
parent
b13e2fd694
commit
aacc6bb0a8
3 changed files with 150 additions and 5 deletions
|
|
@ -64,6 +64,7 @@ AUTHOR_MAP = {
|
|||
"rayjun0412@gmail.com": "rayjun", # cron model.default salvage co-author (#43952)
|
||||
"96944678+sweetcornna@users.noreply.github.com": "sweetcornna", # cron ticker-liveness salvage co-author (#33849)
|
||||
"izumi0uu@gmail.com": "izumi0uu", # PR #49544 salvage (native rich reply echo; #49534)
|
||||
"dev@pixlmedia.no": "texhy", # PR #27435 salvage (few-but-huge preflight compression gate; #27405)
|
||||
"w31rdm4ch1n3z@protonmail.com": "w31rdm4ch1nZ",
|
||||
"xtpeeps@gmail.com": "x7peeps",
|
||||
"ahmad@madsgency.com": "ahmadashfq",
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue