fix(gateway): only mark final response sent when split-overflow chunks actually land (#23420)

The split-overflow path in _send_or_edit (gateway/stream_consumer.py) was
copying the cumulative _already_sent flag into _final_response_sent on the
done frame. _already_sent goes True on any successful prior edit (tool
progress) or on fallback-mode promotion when an edit fails — neither
proves the *current* chunked send delivered the final answer.

When the chunked send actually fails (network error, flood control), the
consumer would wrongly claim 'final delivered' and the gateway's
independent fallback delivery in run.py would be suppressed. User saw
only tool-progress bubbles and never got the answer.

Now we track per-chunk success locally: _send_new_chunk returns the new
message_id on success or returns the passed-in reply_to unchanged on
failure. If at least one returned id differs, chunks_delivered = True;
otherwise stays False, gateway fallback runs.

Adds two regression tests:
- test_split_overflow_failed_send_does_not_mark_final_sent — primes
  _already_sent=True, then makes every send fail; asserts
  _final_response_sent stays False.
- test_split_overflow_partial_send_marks_final_sent — happy path,
  asserts _final_response_sent goes True.

Note: the companion bug at the CancelledError handler (issue cited
lines 417-418) was already fixed by 3b5572ded on 2026-04-16.

Closes #10748
This commit is contained in:
Teknium 2026-05-10 15:13:54 -07:00 committed by GitHub
parent b38b100105
commit 6636fecd47
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 82 additions and 2 deletions

View file

@ -362,13 +362,21 @@ class GatewayStreamConsumer:
chunks = self.adapter.truncate_message(
self._accumulated, _safe_limit
)
chunks_delivered = False
reply_to = self._message_id
for chunk in chunks:
await self._send_new_chunk(chunk, self._message_id)
new_id = await self._send_new_chunk(chunk, reply_to)
if new_id is not None and new_id != reply_to:
chunks_delivered = True
self._accumulated = ""
self._last_sent_text = ""
self._last_edit_time = time.monotonic()
if got_done:
self._final_response_sent = self._already_sent
# Only claim final delivery if THESE chunks actually
# landed. ``_already_sent`` may be True from prior
# tool-progress edits or fallback-mode promotion (#10748)
# — that doesn't mean the final answer reached the user.
self._final_response_sent = chunks_delivered
return
if got_segment_break:
self._message_id = None