Xiaomi MiMo's /v1/models endpoint returns 401 even with a valid API key,
causing hermes doctor to falsely report 'invalid API key'.
Add a `supports_health_check` field to ProviderProfile (default True).
Providers whose /models endpoint doesn't support auth verification can
set it to False. The doctor's dynamic provider discovery now reads this
field instead of hardcoding True.
The xiaomi provider plugin sets supports_health_check=False.
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
Pass session_id through to provider profile build_api_kwargs_extras so
the OpenRouter profile can attach an xAI cache-affinity header
(x-grok-conv-id: <session-id>) for x-ai/grok-* models. xAI prompt
cache requires server affinity via this header — without it the cache
is poisoned and Grok prompt-cache hit rates drop dramatically on
multi-turn sessions.
Carve-out of #22708 by Ninso112. The original PR bundled a /diff
slash command, a zsh completion fix (already on main via #22802),
and holographic memory null-guards. This salvage keeps just the
Grok header work — small, targeted, and well-tested. Other
contributors and changes preserved for separate review.
Closes#22705.
The previous revision of this PR added six GMI-specific branches
(`elif base_url_host_matches(..., 'api.gmi-serving.com')`) across
run_agent.py and agent/auxiliary_client.py, plus a _HERMES_UA_HEADERS
constant in auxiliary_client.py.
ProviderProfile already has a `default_headers: dict[str, str]` field
commented as 'Client-level quirks (set once at client construction)'.
Other plugins (ai-gateway, kimi-coding) already use it. Two of the four
auxiliary_client sites we previously patched already had a generic
`else: profile.default_headers` fallback that picked it up (so did
both run_agent sites).
This revision:
* Sets `default_headers={'User-Agent': 'HermesAgent/<ver>'}` on the
GMI profile in plugins/model-providers/gmi/__init__.py.
* Reverts all six GMI-specific branches in run_agent.py and
auxiliary_client.py.
* Adds the generic profile-fallback `else` block to the two
auxiliary_client sites (`_to_async_client`, `resolve_provider_client`)
that didn't have it yet. This benefits every provider whose profile
declares default_headers, not just GMI — e.g. Vercel AI Gateway's
HTTP-Referer/X-Title now flow through the async client path too.
* Replaces the GMI-specific URL-branch tests with a profile-level
assertion and keeps the run_agent integration test (with
`provider='gmi'` so the fallback picks up the profile).
Net diff vs main: +82/-0 across 5 files, touching only the GMI plugin,
two generic fallback blocks in auxiliary_client.py, AUTHOR_MAP, and
tests. No core files change.
Based on #20907 by @isaachuangGMICLOUD.
Every provider profile is now a self-contained plugin under
plugins/model-providers/<name>/, mirroring the plugins/platforms/
pattern established for IRC and Teams. The ProviderProfile ABC
stays in providers/; the per-provider profile data moves out.
- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins
then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins
in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with
out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory
plugins (providers/ discovery owns them); standalone plugins with
register_provider+ProviderProfile in their __init__.py auto-coerce to
this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general
PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering
bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md,
provider-runtime.md, providers/README.md, plugins/model-providers/README.md
No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py /
model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py
all still consume providers via get_provider_profile() / list_providers() —
they just now see plugin-discovered entries instead of pkgutil-iterated ones.
Third parties can now drop a single directory into
~/.hermes/plugins/model-providers/<name>/ to add or override an inference
provider without touching the repo.