fix(agent): guard Anthropic interrupt, cap vision data-URL size

Two independent agent-loop hardening fixes:

- anthropic: when the streaming loop breaks on _interrupt_requested, return
  None instead of calling stream.get_final_message() on the partially-drained
  stream. The SDK may hang draining remaining events or return a Message with
  incomplete tool_use blocks. The outer poll loop raises InterruptedError, so
  the return value is discarded anyway.

- vision: add a 20 MB cap on base64 data-URL payloads before
  base64.b64decode() in _materialize_data_url_for_vision. A 100MB+ payload
  creates ~275MB of memory pressure; gateway users sharing the process can
  trivially OOM it. Oversized payloads return ("", None).

The third change from the original PR (streaming tool-name += to assignment
dedup) was already landed independently on main.

Co-authored-by: aaronlab <1115117931@qq.com>
2026-06-30 11:52:04 +00:00 · 2026-06-28 15:25:36 -07:00 · 2026-06-28 15:25:36 -07:00 · ec148f5d31
commit ec148f5d31
parent 490f215a19
2 changed files with 19 additions and 1 deletions
--- a/agent/chat_completion_helpers.py
+++ b/agent/chat_completion_helpers.py
@@ -2319,7 +2319,15 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
                                _fire_first_delta()
                                agent._fire_reasoning_delta(thinking_text)

-            # Return the native Anthropic Message for downstream processing
+            # Return the native Anthropic Message for downstream processing.
+            # If the stream was interrupted (the event loop broke out above on
+            # agent._interrupt_requested), do NOT call get_final_message() — on
+            # a partially-consumed stream the SDK may hang draining remaining
+            # events or return a Message with incomplete tool_use blocks (partial
+            # JSON in `input`). The outer poll loop raises InterruptedError, so
+            # this return value is discarded anyway.
+            if agent._interrupt_requested:
+                return None
            return stream.get_final_message()

    def _call():
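
The guard in the hunk above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `FakeStream`, `consume`, and the `threading.Event` flag are hypothetical stand-ins for the SDK stream, the streaming loop, and `agent._interrupt_requested`. It only shows the shape of the pattern: once the loop has broken out on an interrupt, the finalizer is never called.

```python
import threading


class FakeStream:
    """Hypothetical stand-in for a streaming response; not the Anthropic SDK API."""

    def __init__(self, events):
        self._events = list(events)
        self.final_called = False

    def __iter__(self):
        return iter(self._events)

    def get_final_message(self):
        # In a real SDK this may need to drain the rest of the stream.
        self.final_called = True
        return {"content": "".join(self._events)}


def consume(stream, interrupt, on_delta=None):
    """Iterate the stream; skip finalization if an interrupt was requested."""
    for event in stream:
        if interrupt.is_set():
            break  # stop consuming as soon as the interrupt is observed
        if on_delta is not None:
            on_delta(event)

    # Guard: never finalize a partially consumed stream.
    if interrupt.is_set():
        return None
    return stream.get_final_message()
```

In this sketch the caller has to treat `None` as "interrupted"; in the commit, the outer poll loop raises InterruptedError before the value is ever inspected.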
--- a/run_agent.py
+++ b/run_agent.py
@@ -4429,9 +4429,19 @@ class AIAgent:
                return True
        return False

+    # 20 MB base64 ≈ 15 MB decoded image — generous but prevents OOM from an
+    # oversized data: URL (a 100 MB+ payload creates ~275 MB of memory pressure,
+    # and gateway users sharing the same process can trivially OOM it).
+    _MAX_DATA_URL_BASE64_BYTES = 20 * 1024 * 1024
+
    @staticmethod
    def _materialize_data_url_for_vision(image_url: str) -> tuple[str, Optional[Path]]:
        header, _, data = str(image_url or "").partition(",")
+        if len(data) > AIAgent._MAX_DATA_URL_BASE64_BYTES:
+            logger.warning(
+                "data-URL payload too large (%d bytes), skipping", len(data)
+            )
+            return "", None
        mime = "image/jpeg"
        if header.startswith("data:"):
            mime_part = header[len("data:"):].split(";", 1)[0].strip()
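
The size arithmetic behind the cap can be checked with a short standalone sketch. Base64 carries 3 bytes in every 4 characters, so a 20 MiB payload decodes to at most 15 MiB. The helper names `decoded_size_upper_bound` and `decode_data_url` are hypothetical and are not part of `run_agent.py`. Unlike the method in the diff, which returns `("", None)`, this sketch simply returns `None` for rejected input.

```python
import base64
import binascii
from typing import Optional

# Same figure as the cap in the diff; the name here is illustrative only.
MAX_B64_CHARS = 20 * 1024 * 1024


def decoded_size_upper_bound(b64_len: int) -> int:
    """Upper bound on decoded bytes: every 4 base64 characters carry 3 bytes."""
    return (b64_len * 3) // 4


def decode_data_url(url: str, limit: int = MAX_B64_CHARS) -> Optional[bytes]:
    """Decode a data: URL payload, rejecting oversized input before decoding."""
    header, sep, data = url.partition(",")
    if not sep or not header.startswith("data:"):
        return None
    # Check the encoded length first so the large decoded buffer is never allocated.
    if len(data) > limit:
        return None
    try:
        return base64.b64decode(data)
    except (binascii.Error, ValueError):
        return None
```

Checking `len(data)` before decoding is the point of the design: the rejection costs nothing beyond the string that is already in memory.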