feat(deepseek): add thinking.type + reasoning_effort mapping for DeepSeek API

DeepSeek's thinking mode requires both:
- extra_body.thinking.type: "enabled" to activate thinking mode
- top-level reasoning_effort: "max" or "high" to control depth

Previously, the ChatCompletionsTransport only handled Kimi's thinking
mode — DeepSeek was left unmapped, so reasoning_effort config was
silently dropped.

This patch:
1. Adds is_deepseek: bool to the Params dataclass, detected by
   base_url matching api.deepseek.com
2. Maps Hermes effort levels (xhigh/max → "max", low/medium/high →
   themselves) to the top-level reasoning_effort parameter
3. Sets extra_body.thinking.type alongside the effort
4. Strips reasoning_content from assistant messages sent back to
   DeepSeek, preventing 400 errors when thinking was enabled
This commit is contained in:
twebefy 2026-04-25 00:46:10 +08:00 committed by Teknium
parent 31ba2b0cbc
commit 068c24f8a4
2 changed files with 27 additions and 0 deletions

View file

@ -9798,6 +9798,7 @@ class AIAgent:
)
_is_tokenhub = base_url_host_matches(self._base_url_lower, "tokenhub.tencentmaas.com")
_is_lmstudio = (self.provider or "").strip().lower() == "lmstudio"
_is_deepseek = base_url_host_matches(self.base_url, "api.deepseek.com")
# Temperature: _fixed_temperature_for_model may return OMIT_TEMPERATURE
# sentinel (temperature omitted entirely), a numeric override, or None.
@ -9909,6 +9910,7 @@ class AIAgent:
is_kimi=_is_kimi,
is_tokenhub=_is_tokenhub,
is_lmstudio=_is_lmstudio,
is_deepseek=_is_deepseek,
is_custom_provider=self.provider == "custom",
ollama_num_ctx=self._ollama_num_ctx,
provider_preferences=_prefs or None,
@ -10368,6 +10370,11 @@ class AIAgent:
# context compaction). Don't pass null to the API.
api_msg.pop("reasoning_content", None)
# DeepSeek: strip reasoning_content on all assistant messages so the API
# doesn't return 400 when the model was invoked with thinking enabled.
if base_url_host_matches(self.base_url, "api.deepseek.com"):
api_msg.pop("reasoning_content", None)
@staticmethod
def _sanitize_tool_calls_for_strict_api(api_msg: dict) -> dict:
"""Strip Codex Responses API fields from tool_calls for strict providers.