fix(vision): auto-resize oversized images, increase default timeout, fix vision capability detection

Cherry-picked from PR #7749 by kshitijk4poor with modifications:

- Raise hard image limit from 5 MB to 20 MB (matches most restrictive provider)
- Send images at full resolution first; only auto-resize to 5 MB on API failure
- Add _is_image_size_error() helper to detect size-related API rejections
- Auto-resize uses Pillow (soft dep) with progressive downscale + JPEG quality reduction
- Fix get_model_capabilities() to check modalities.input for vision support
- Increase default vision timeout from 30s to 120s (matches hardcoded fallback intent)
- Applied retry-with-resize to both vision_analyze_tool and browser_vision

Closes #7740
This commit is contained in:
kshitijk4poor 2026-04-11 11:07:18 -07:00 committed by Teknium
parent 06e1d9cdd4
commit 50bb4fe010
6 changed files with 399 additions and 25 deletions

View file

@ -381,7 +381,7 @@ DEFAULT_CONFIG = {
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
"base_url": "", # direct OpenAI-compatible endpoint (takes precedence over provider)
"api_key": "", # API key for base_url (falls back to OPENAI_API_KEY)
"timeout": 30, # seconds — LLM API call timeout; increase for slow local vision models
"timeout": 120, # seconds — LLM API call timeout; vision payloads need generous timeout
"download_timeout": 30, # seconds — image HTTP download timeout; increase for slow connections
},
"web_extract": {