fix(agent): trigger preflight compression on few-but-huge sessions (#27405)

The preflight-compression gate only ran the (expensive) token estimate when the message COUNT exceeded protect_first_n + protect_last_n + 1. A session with a handful of very large messages never tripped the count condition, so compression was never attempted and the turn eventually hit a hard context-overflow error.

Add _should_run_preflight_estimate() with OR semantics: run the estimate when either the message count exceeds the protected ranges (the historical gate) OR a cheap char-based estimate already crosses the configured threshold. The downstream estimate_request_tokens_rough() stays authoritative; this is only a hint that decides whether to pay for the full estimate.

Salvaged from #27435 by @texhy (authorship preserved). Re-applied on current main: the preflight gate moved from conversation_loop.py to turn_context.py since the PR was opened, so the helper and gate are placed there; the test imports the real MINIMUM_CONTEXT_LENGTH instead of a hardcoded literal.

Closes #27405.
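
For readers without the turn_context.py hunk at hand, the OR gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the helper from this commit: the function names, the message shape (dicts with a "content" key), and the 4-characters-per-token ratio are all hypothetical.

```python
from typing import Any, Dict, List

# Assumed heuristic for the cheap estimate; the ratio used by the real
# helper is not shown in this commit message.
ASSUMED_CHARS_PER_TOKEN = 4


def cheap_char_estimate(messages: List[Dict[str, Any]]) -> int:
    """Rough token count from character length only (hypothetical helper)."""
    total_chars = sum(len(str(m.get("content") or "")) for m in messages)
    return total_chars // ASSUMED_CHARS_PER_TOKEN


def should_run_preflight_estimate_sketch(
    messages: List[Dict[str, Any]],
    protect_first_n: int,
    protect_last_n: int,
    threshold_tokens: int,
) -> bool:
    """Decide whether the expensive token estimate is worth running.

    Illustrative stand-in for _should_run_preflight_estimate(); it only
    gates the full estimate and never decides compression by itself.
    """
    # Historical gate: more messages than the protected head and tail cover.
    over_count = len(messages) > protect_first_n + protect_last_n + 1
    # New gate: few messages, but already large by a cheap char-based guess.
    over_size = cheap_char_estimate(messages) >= threshold_tokens
    return over_count or over_size
```

With protect_first_n=3 and protect_last_n=4, three messages of 400,000 characters each fail the count check (3 is not greater than 8) but pass the size check against a 100,000-token threshold, so the full estimate would still run.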
2026-06-27 11:22:03 +00:00 · 2026-06-25 01:10:13 +05:30 · 2026-06-25 01:10:13 +05:30 · aacc6bb0a8
commit aacc6bb0a8
parent b13e2fd694
3 changed files with 150 additions and 5 deletions
--- a/scripts/release.py
+++ b/scripts/release.py
@@ -64,6 +64,7 @@ AUTHOR_MAP = {
    "rayjun0412@gmail.com": "rayjun",  # cron model.default salvage co-author (#43952)
    "96944678+sweetcornna@users.noreply.github.com": "sweetcornna",  # cron ticker-liveness salvage co-author (#33849)
    "izumi0uu@gmail.com": "izumi0uu",  # PR #49544 salvage (native rich reply echo; #49534)
+    "dev@pixlmedia.no": "texhy",  # PR #27435 salvage (few-but-huge preflight compression gate; #27405)
    "w31rdm4ch1n3z@protonmail.com": "w31rdm4ch1nZ",
    "xtpeeps@gmail.com": "x7peeps",
    "ahmad@madsgency.com": "ahmadashfq",