From 0bcc327cab9dc9b60d80e6e0e5239149d7a83207 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Sat, 9 May 2026 14:51:20 -0700
Subject: [PATCH] docs(openrouter): document auxiliary.<task>.extra_body for OR routing and Pareto (#22844)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The plumbing for setting OpenRouter provider preferences and the Pareto
Code router on auxiliary tasks already exists — auxiliary.<task>.extra_body
is forwarded verbatim by call_llm() / async_call_llm(). It just wasn't
documented, so users who wanted (e.g.) Pareto Code routing for compression
but the strongest coder for the main agent had no way to discover the
escape hatch.

- hermes_cli/config.py: expand the auxiliary section header with a YAML
  example showing provider routing plus plugins under extra_body, and an
  explicit note that main-agent provider_routing /
  openrouter.min_coding_score do NOT propagate to aux calls (each task is
  independent by design)
- website/docs/user-guide/configuration.md: new 'OpenRouter routing and
  Pareto Code for auxiliary tasks' subsection with worked example
- website/docs/integrations/providers.md: cross-link from the Pareto Code
  Router section to the aux-side doc

E2E verified that auxiliary.<task>.extra_body reaches the OpenRouter API
with the configured provider routing and plugins blocks intact.
---
 hermes_cli/config.py                     | 20 ++++++++++++++++++++
 website/docs/integrations/providers.md   |  1 +
 website/docs/user-guide/configuration.md | 22 ++++++++++++++++++++++
 3 files changed, 43 insertions(+)

diff --git a/hermes_cli/config.py b/hermes_cli/config.py
index a2e0ed3c739..3740fc2223e 100644
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@@ -731,6 +731,26 @@ DEFAULT_CONFIG = {
     # Empty model = use provider's default auxiliary model.
     # All tasks fall back to openrouter:google/gemini-3-flash-preview if
     # the configured provider is unavailable.
+    #
+    # extra_body: forwarded verbatim as request body fields on every aux call
+    # for that task. Use this to set provider-specific knobs (independent of
+    # main-agent settings). On OpenRouter you can set provider routing prefs
+    # and the Pareto Code coding-score floor here. Example:
+    #
+    #   auxiliary:
+    #     compression:
+    #       provider: openrouter
+    #       model: openrouter/pareto-code
+    #       extra_body:
+    #         provider:                  # OpenRouter provider routing
+    #           order: [anthropic, google]
+    #           sort: throughput         # or price | latency
+    #         plugins:                   # OpenRouter Pareto Code router
+    #           - id: pareto-router
+    #             min_coding_score: 0.5
+    #
+    # Each aux task is independent — main-agent provider_routing and
+    # openrouter.min_coding_score do NOT propagate to aux calls by design.
     "auxiliary": {
         "vision": {
             "provider": "auto",    # auto | openrouter | nous | codex | custom
diff --git a/website/docs/integrations/providers.md b/website/docs/integrations/providers.md
index ee7ec6780a6..df8701778da 100644
--- a/website/docs/integrations/providers.md
+++ b/website/docs/integrations/providers.md
@@ -1391,6 +1391,7 @@ Notes:
 - Set to empty string (or remove the line) to let OpenRouter pick the strongest available coder — its documented behavior when the plugins block is omitted.
 - Selection is deterministic per score on a given day, but the actual model chosen can shift as the Pareto frontier moves (new models, benchmark updates).
 - See OpenRouter's [Pareto Router docs](https://openrouter.ai/docs/guides/routing/routers/pareto-router) for the full router behavior.
+- To use the Pareto Code router for a specific **auxiliary task** (compression, vision, etc.) instead of the main agent, set `extra_body.plugins` under that task — see [Auxiliary Models → OpenRouter routing & Pareto Code for auxiliary tasks](/docs/user-guide/configuration#openrouter-routing--pareto-code-for-auxiliary-tasks).
 
 ## Fallback Model
 
diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md
index 78609970348..ed94dfb0ed7 100644
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@@ -931,6 +931,28 @@ Use `extra_body` only when your provider documents OpenAI-compatible request-bod
 `extra_body` is only effective when your provider actually supports the field you send. If the provider does not expose a native OpenAI-compatible reasoning-off flag, Hermes cannot synthesize one on its behalf.
 :::
 
+### OpenRouter routing & Pareto Code for auxiliary tasks
+
+When an auxiliary task resolves to OpenRouter (either explicitly or via `provider: "main"` while your main agent is on OpenRouter), the main agent's `provider_routing` and `openrouter.min_coding_score` settings **do not propagate** — by design, each auxiliary task is independent. To set OpenRouter provider preferences or use the [Pareto Code router](/docs/integrations/providers#openrouter-pareto-code-router) for a specific aux task, set them per-task via `extra_body`:
+
+```yaml
+auxiliary:
+  compression:
+    provider: openrouter
+    model: openrouter/pareto-code   # use the Pareto Code router for this task
+    extra_body:
+      provider:                     # OpenRouter provider routing prefs
+        order: [anthropic, google]  # try these providers in order
+        sort: throughput            # or "price" | "latency"
+        # only: [anthropic]         # restrict to a specific provider
+        # ignore: [deepinfra]       # exclude specific providers
+      plugins:                      # OpenRouter Pareto Code router knob
+        - id: pareto-router
+          min_coding_score: 0.5     # 0.0–1.0; higher = stronger coders
+```
+
+The shape mirrors what OpenRouter accepts in the chat completions request body. Hermes forwards the entire `extra_body` verbatim, so any other OpenRouter request-body field documented at [openrouter.ai/docs](https://openrouter.ai/docs) works the same way.
+
 ### Changing the Vision Model
 
 To use GPT-4o instead of Gemini Flash for image analysis: