fix(agent): comprehensive DeepSeek V4 support — context windows, thinking mode, reasoning replay

Unifies approaches from PRs #14952, #14958, #15325, #15228, #15354 into a single cohesive implementation:

- Add 1M context window entries for V4 models (deepseek-v4-pro, deepseek-v4-flash, deepseek-chat, deepseek-reasoner)
- Plumb thinking.type toggle and reasoning_effort mapping for native DeepSeek API (only "high" and "max" are valid)
- Strip incompatible sampling params when thinking is enabled
- Inject reasoning_content="" on all assistant messages for DeepSeek replay (scoped to api.deepseek.com and OpenRouter)
- Fix _extract_reasoning isinstance checks for empty strings
- Preserve empty-string reasoning_content in normalize_response
- Add _copy_reasoning_content_for_api call in _handle_max_iterations

Fixes #15353. Supersedes #14952, #14958, #15325, #15228, #15354.
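
The request-shaping items in the commit message (thinking toggle, effort mapping, sampling-param stripping, empty reasoning_content injection) can be illustrated with a minimal sketch. This is not the code from this commit: the function names, the "enabled"/"disabled" values for thinking.type, the list of stripped parameters, and the rule that any effort other than "max" falls back to "high" are all assumptions made for illustration.

```python
from copy import deepcopy

# Assumption: these sampling params are the ones treated as incompatible
# with thinking mode. The commit message does not list them.
_THINKING_INCOMPATIBLE = ("temperature", "top_p", "presence_penalty", "frequency_penalty")


def shape_deepseek_request(payload, thinking_enabled, effort=None):
    """Hypothetical helper: return a copy of payload adjusted for thinking mode."""
    out = deepcopy(payload)
    # Assumption: thinking.type takes the values "enabled" / "disabled".
    out["thinking"] = {"type": "enabled" if thinking_enabled else "disabled"}
    if thinking_enabled:
        # Only "high" and "max" are valid per the commit message; mapping
        # everything else to "high" is an assumed policy.
        out["reasoning_effort"] = "max" if effort == "max" else "high"
        for key in _THINKING_INCOMPATIBLE:
            out.pop(key, None)
    return out


def ensure_reasoning_content(messages):
    """Hypothetical helper: give every assistant message a reasoning_content key."""
    fixed = []
    for msg in messages:
        msg = dict(msg)
        if msg.get("role") == "assistant" and "reasoning_content" not in msg:
            msg["reasoning_content"] = ""  # empty string, not None
        fixed.append(msg)
    return fixed
```

Both helpers return new objects instead of mutating their input, so the stored conversation history stays untouched and only the outgoing API payload carries the injected fields.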
2026-05-12 03:42:08 +00:00 · 2026-04-25 10:16:34 +10:00 · 2026-04-25 10:16:34 +10:00 · 1d38b0f888
commit 1d38b0f888
parent 00c3d848d8
4 changed files with 336 additions and 10 deletions
--- a/agent/model_metadata.py
+++ b/agent/model_metadata.py
@@ -162,8 +162,12 @@ DEFAULT_CONTEXT_LENGTHS = {
    "gemma-4-31b": 256000,
    "gemma-3": 131072,
    "gemma": 8192,  # fallback for older gemma models
-    # DeepSeek
-    "deepseek": 128000,
+    # DeepSeek — V4 family supports 1M context (api.deepseek.com docs)
+    "deepseek-v4-pro": 1000000,
+    "deepseek-v4-flash": 1000000,
+    "deepseek-chat": 1000000,
+    "deepseek-reasoner": 1000000,
+    "deepseek": 128000,  # fallback for older/unrecognised DeepSeek models
    # Meta
    "llama": 131072,
    # Qwen — specific model families before the catch-all.
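
Why the specific V4 keys sit above the generic "deepseek" fallback can be shown with a small sketch. It assumes the table is scanned in insertion order and the first key contained in the model name wins; the actual lookup in agent/model_metadata.py is not part of this hunk, so lookup_context_length and its matching rule are hypothetical.

```python
# Entries copied from the hunk above; order matters under the assumed rule.
CONTEXT_LENGTHS = {
    "deepseek-v4-pro": 1000000,
    "deepseek-v4-flash": 1000000,
    "deepseek-chat": 1000000,
    "deepseek-reasoner": 1000000,
    "deepseek": 128000,  # fallback for older/unrecognised DeepSeek models
}


def lookup_context_length(model, table=CONTEXT_LENGTHS, default=None):
    """Hypothetical lookup: first key that is a substring of the model name wins."""
    name = model.lower()
    for key, length in table.items():  # dicts preserve insertion order
        if key in name:
            return length
    return default


print(lookup_context_length("deepseek/deepseek-v4-flash"))  # matches the V4 entry
print(lookup_context_length("deepseek-coder"))              # falls through to "deepseek"
```

Under this assumed rule, placing "deepseek" first would shadow every V4 entry and report 128000 for all DeepSeek models.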