fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models (#43436)

* fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models Reasoning-mandatory Anthropic models (Claude 4.6+/fable/mythos-class) over OpenRouter ignore reasoning.effort and use adaptive thinking. #42991 correctly stopped Hermes from sending a reasoning field to them (it 400s), but put nothing in its place — leaving agent.reasoning_effort a silent no-op on the OpenRouter path: the model always ran at its adaptive default (high) regardless of config. OpenRouter honors the requested effort on the top-level verbosity field instead (maps to Anthropic output_config.effort). Route the existing reasoning_config[effort] there for these models while still never emitting a reasoning field, preserving the #42991 fix. No new config arg — the value the user already sets via agent.reasoning_effort now flows to verbosity. - low/medium/high/xhigh/max pass through verbatim (OpenRouter accepts the extended scale for Claude; verified live HTTP 200 + monotonic token spend). - effort unset/none/disabled omits verbosity so the model keeps its default. - native Anthropic transport already correct; unchanged. Fixes #43432 * test(openrouter): cover real effort range (add minimal, frame max as passthrough) Adversarial review noted the verbosity tests looped over 'max' — a value parse_reasoning_effort can never produce — while omitting 'minimal', which it can. Align the routing test with the real config range (VALID_REASONING_EFFORTS = minimal/low/medium/high/xhigh) and keep a separate value-agnostic passthrough test that documents why xhigh/max must survive verbatim (TypedDict, no runtime literal validation; OpenRouter accepts the extended scale for Claude). * docs: explain reasoning_effort -> verbosity routing for adaptive Anthropic models Document that reasoning_effort transparently maps to OpenRouter's verbosity field for adaptive-thinking Anthropic models (Claude 4.6+/Fable/Mythos), where reasoning.effort is ignored. Note xhigh is the configurable ceiling (max is wire- only). Add verbosity as a top-level-kwarg example in the provider-plugin guide.
2026-06-17 09:41:58 +00:00 · 2026-06-10 15:03:01 +05:30 · 2026-06-10 15:03:01 +05:30 · 183d86b3e0
commit 183d86b3e0
parent cd9a9cd8e5
4 changed files with 152 additions and 5 deletions
--- a/plugins/model-providers/openrouter/init.py
+++ b/plugins/model-providers/openrouter/init.py
@ -116,6 +116,8 @@ class OpenRouterProfile(ProviderProfile):
        the same backend server across turns.
        """
        extra_body: dict[str, Any] = {}
+        top_level: dict[str, Any] = {}
+        extra_headers: dict[str, Any] = {}
        if supports_reasoning:
            # Reasoning-mandatory Anthropic models (Claude 4.6+ / fable /
            # future named models) use *adaptive* thinking: the model decides
@ -132,18 +134,36 @@ class OpenRouterProfile(ProviderProfile):
            # The only reliable behavior is to omit ``reasoning`` and let the
            # model default to adaptive. See hermes-agent#42991 (disable case)
            # and the tool-replay follow-up.
+            #
+            # ``reasoning.effort`` being ignored does NOT mean these models have
+            # no effort lever — OpenRouter honors the requested effort on the
+            # top-level ``verbosity`` field instead (it maps to Anthropic's
+            # ``output_config.effort``; ``reasoning.effort`` is accepted but
+            # ignored — confirmed by OpenRouter's Claude migration docs and a
+            # live token-spend probe in hermes-agent#43432). Route the existing
+            # ``reasoning_config["effort"]`` (sourced from
+            # ``agent.reasoning_effort``) onto ``verbosity`` so the knob the user
+            # already sets keeps working for these models. We still send NO
+            # ``reasoning`` field, preserving the #42991 400 fix.
            if _anthropic_reasoning_is_mandatory(model):
-                pass  # omit reasoning entirely → adaptive default
+                cfg = reasoning_config or {}
+                effort = cfg.get("effort")
+                # Only emit when effort is actually requested and reasoning
+                # isn't explicitly disabled. Otherwise omit ``verbosity`` so the
+                # model keeps its own adaptive default (``high``).
+                if cfg.get("enabled", True) is not False and effort and effort != "none":
+                    top_level["verbosity"] = effort
            elif reasoning_config is not None:
                extra_body["reasoning"] = dict(reasoning_config)
            else:
                extra_body["reasoning"] = {"enabled": True, "effort": "medium"}

-        extra_headers: dict[str, Any] = {}
        if session_id and model and model.startswith(("x-ai/grok-", "xai/grok-")):
            extra_headers["x-grok-conv-id"] = session_id
+        if extra_headers:
+            top_level["extra_headers"] = extra_headers

-        return extra_body, {"extra_headers": extra_headers} if extra_headers else {}
+        return extra_body, top_level


 openrouter = OpenRouterProfile(