fix(deepseek): wire thinking-mode via DeepSeekProfile, not legacy fallback

The cherry-picked PR #15251 from @tw2818 correctly identified the DeepSeek 400 root cause but placed the fix in the legacy fallback path of `build_kwargs`, which DeepSeek never reaches: DeepSeek has a registered ProviderProfile and goes through `_build_kwargs_from_profile` instead. The legacy-path block was therefore dead code.

This commit pivots the fix to where it actually fires:

- New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py` overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire format (mirrors `KimiProfile`):

      {"reasoning_effort": "<low|medium|high|max>",
       "extra_body": {"thinking": {"type": "enabled" | "disabled"}}}

- Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit thinking control. `deepseek-chat` (V3) is untouched (current behavior).
- Effort mapping: low/medium/high passthrough, xhigh/max → max, unset → omitted (DeepSeek server applies its own default).
- Revert the legacy-path additions from PR #15251. They were dead code, and the `_copy_reasoning_content_for_api` strip block specifically would have nullified the existing reasoning_content padding machinery (`_needs_deepseek_tool_reasoning` → space-pad on replay) that the active provider already relies on for replay correctness.
- Unit tests pin the wire-shape contract and the model gating rules (26 tests, all passing). Existing transport + provider profile suites (321 tests) continue to pass.
- AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit.

Closes #15700, #17212, #17825.

Co-authored-by: tw2818 <twebefy@gmail.com>
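
The wire format, model gating, and effort mapping described in the commit message can be illustrated with a minimal standalone sketch. This is not the actual `DeepSeekProfile` code: the function names `supports_thinking_control` and `build_thinking_extras`, their parameters, and the model name `deepseek-v4-pro` used in the usage note are hypothetical. Whether `reasoning_effort` is still sent when thinking is disabled is also an assumption here; the commit message does not say.

```python
from typing import Any, Dict, Optional

# Effort mapping as stated in the commit message:
# low/medium/high pass through, xhigh/max collapse to "max".
_EFFORT_MAP = {
    "low": "low",
    "medium": "medium",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def supports_thinking_control(model: str) -> bool:
    """Hypothetical gate: only deepseek-v4-* and deepseek-reasoner qualify."""
    name = model.strip().lower()
    return name.startswith("deepseek-v4-") or name == "deepseek-reasoner"


def build_thinking_extras(
    model: str,
    thinking_enabled: bool,
    effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for what build_api_kwargs_extras might return."""
    if not supports_thinking_control(model):
        # deepseek-chat (V3) and anything else: emit nothing.
        return {}

    extras: Dict[str, Any] = {
        "extra_body": {
            "thinking": {"type": "enabled" if thinking_enabled else "disabled"}
        }
    }

    # Unset or unrecognized effort is omitted so the server default applies.
    # Assumption: effort is emitted regardless of the thinking toggle.
    mapped = _EFFORT_MAP.get((effort or "").strip().lower())
    if mapped is not None:
        extras["reasoning_effort"] = mapped
    return extras
```

Under these assumptions, `build_thinking_extras("deepseek-v4-pro", True, "xhigh")` yields a `reasoning_effort` of `"max"` plus the `thinking` toggle set to `"enabled"`, while `build_thinking_extras("deepseek-chat", True, "high")` yields an empty dict, leaving V3 requests unchanged.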
2026-05-18 04:41:56 +00:00 · 2026-05-15 16:39:18 -07:00 · 2026-05-15 16:39:18 -07:00 · cd9470f416
commit cd9470f416
parent 068c24f8a4
5 changed files with 266 additions and 29 deletions
--- a/run_agent.py
+++ b/run_agent.py
@@ -9798,7 +9798,6 @@ class AIAgent:
        )
        _is_tokenhub = base_url_host_matches(self._base_url_lower, "tokenhub.tencentmaas.com")
        _is_lmstudio = (self.provider or "").strip().lower() == "lmstudio"
-        _is_deepseek = base_url_host_matches(self.base_url, "api.deepseek.com")

        # Temperature: _fixed_temperature_for_model may return OMIT_TEMPERATURE
        # sentinel (temperature omitted entirely), a numeric override, or None.
@@ -9910,7 +9909,6 @@ class AIAgent:
            is_kimi=_is_kimi,
            is_tokenhub=_is_tokenhub,
            is_lmstudio=_is_lmstudio,
-            is_deepseek=_is_deepseek,
            is_custom_provider=self.provider == "custom",
            ollama_num_ctx=self._ollama_num_ctx,
            provider_preferences=_prefs or None,
@@ -10370,11 +10368,6 @@ class AIAgent:
        # context compaction).  Don't pass null to the API.
        api_msg.pop("reasoning_content", None)

-        # DeepSeek: strip reasoning_content on all assistant messages so the API
-        # doesn't return 400 when the model was invoked with thinking enabled.
-        if base_url_host_matches(self.base_url, "api.deepseek.com"):
-            api_msg.pop("reasoning_content", None)
-
    @staticmethod
    def _sanitize_tool_calls_for_strict_api(api_msg: dict) -> dict:
        """Strip Codex Responses API fields from tool_calls for strict providers.