Rebased onto god-file Phase 1 refactor — preflight compression has moved
from agent/conversation_loop.py to agent/turn_context.py (no semantic
change in the refactor itself; the bug below was carried over verbatim).
The preflight compression loop in ``turn_context.py`` uses
``len(messages) >= _orig_len`` to decide whether a compression pass has
made progress. That conflates two different conditions: a true no-op
(transcript materially unchanged) and effective token compression that
summarises message contents but keeps the same number of rows. The
second case is misread as "Cannot compress further" — the session then
surfaces ``Context length exceeded`` and auto-resets even when the
post-compression estimate is far below the model context window.
Observed example from #39548: a Telegram session on GPT-5.5 with a 1M
context dropped from ~288k → ~183k tokens (a 36% reduction) while
preserving 220 messages. The loop treats that as exhaustion and the
gateway auto-resets the session.
Fix
---
Add ``_compression_made_progress(orig_len, new_len, orig_tokens, new_tokens)``
and call it after the post-pass ``estimate_request_tokens_rough`` (which
is moved up to run *before* the progress check instead of after it).
Either a row-count reduction OR a token-count reduction now counts as
progress; only when neither moves do we break out as "stuck".
Fixes#39548