feat(providers): make all 33 providers pluggable under plugins/model-providers/

Every provider profile is now a self-contained plugin under plugins/model-providers/<name>/, mirroring the plugins/platforms/ pattern established for IRC and Teams. The ProviderProfile ABC stays in providers/; the per-provider profile data moves out.

- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory plugins (providers/ discovery owns them); standalone plugins with register_provider+ProviderProfile in their __init__.py auto-coerce to this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md, provider-runtime.md, providers/README.md, plugins/model-providers/README.md

No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py / model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py all still consume providers via get_provider_profile() / list_providers(); they just now see plugin-discovered entries instead of pkgutil-iterated ones.

Third parties can now drop a single directory into ~/.hermes/plugins/model-providers/<name>/ to add or override an inference provider without touching the repo.
2026-05-14 04:02:26 +00:00 · 2026-05-05 13:36:08 -07:00 · 2026-05-05 13:36:08 -07:00 · 9022804d78
commit 9022804d78
parent 20a4f79ed1
63 changed files with 585 additions and 309 deletions
--- a/providers/README.md
+++ b/providers/README.md
@@ -1,307 +1,78 @@
 # providers/

-Single source of truth for every inference provider Hermes knows about.
+Registry and ABC for every inference provider Hermes knows about.

-Each provider is declared once here as a `ProviderProfile`. Every other layer —
+Each provider is declared once as a `ProviderProfile`. Every other layer —
 auth resolution, transport kwargs, model listing, runtime routing — reads from
 these profiles instead of maintaining its own parallel data.

 ---

-## Directory layout
+## Layout

 ```
 providers/
-├── base.py           ProviderProfile dataclass + OMIT_TEMPERATURE sentinel
-├── __init__.py       Registry: register_provider(), get_provider_profile()
-├── README.md         This file
-│
-├── # Simple providers — just identity + auth + endpoint
-├── alibaba.py        Alibaba Cloud DashScope
-├── arcee.py          Arcee AI
-├── bedrock.py        AWS Bedrock  (api_mode=bedrock_converse)
-├── deepseek.py       DeepSeek
-├── huggingface.py    Hugging Face Inference API
-├── kilocode.py       Kilo Code
-├── minimax.py        MiniMax (international + CN)
-├── nvidia.py         NVIDIA NIM  (default_max_tokens=16384)
-├── ollama_cloud.py   Ollama Cloud
-├── stepfun.py        StepFun
-├── xiaomi.py         Xiaomi MiMo
-├── xai.py            xAI Grok  (api_mode=codex_responses)
-├── zai.py            Z.AI / GLM
-│
-├── # Medium — one or two quirks
-├── anthropic.py      Native Anthropic  (x-api-key header, api_mode=anthropic_messages)
-├── copilot.py        GitHub Copilot  (auth_type=copilot, reasoning per model)
-├── copilot_acp.py    Copilot ACP subprocess  (api_mode=copilot_acp)
-├── custom.py         Custom/Ollama local  (think=false, num_ctx)
-├── gemini.py         Google Gemini AI Studio + Cloud Code OAuth
-├── kimi.py           Kimi Coding  (OMIT_TEMPERATURE, thinking, dual endpoint)
-├── openai_codex.py   OpenAI Codex OAuth  (api_mode=codex_responses)
-├── opencode.py       OpenCode Zen + Go  (per-model api_mode routing)
-│
-├── # Complex — subclasses with multiple overrides
-├── nous.py           Nous Portal  (tags, attribution, reasoning omit-when-disabled)
-├── openrouter.py     OpenRouter  (provider preferences, public model fetch)
-├── qwen.py           Qwen OAuth  (message normalization, cache_control, vl_hires)
-└── vercel.py         Vercel AI Gateway  (attribution headers, reasoning passthrough)
+├── base.py         ProviderProfile dataclass + OMIT_TEMPERATURE sentinel
+├── __init__.py     Registry: register_provider(), get_provider_profile(), list_providers()
+└── README.md       This file
 ```

+The **profiles themselves** live as plugins under
+`plugins/model-providers/<name>/` (bundled in this repo) and
+`$HERMES_HOME/plugins/model-providers/<name>/` (per-user overrides). The
+registry in `providers/__init__.py` lazily discovers them the first time any
+consumer calls `get_provider_profile()` or `list_providers()`. See
+`plugins/model-providers/README.md` for the plugin contract and examples.
+
 ---

-## ProviderProfile fields
+## How it wires in

-```python
-@dataclass
-class ProviderProfile:
-    # Identity
-    name: str                    # canonical ID — auto-registered as PROVIDER_REGISTRY key for new api-key providers
-    api_mode: str                # "chat_completions" | "anthropic_messages" |
-                                 # "codex_responses" | "bedrock_converse" | "copilot_acp"
-    aliases: tuple               # alternate names resolved by get_provider_profile()
+The registry is populated on first access. After that, every downstream
+layer reads from it:

-    # Auth & endpoints
-    env_vars: tuple              # env var names holding the API key, in priority order
-    base_url: str                # default inference endpoint
-    models_url: str              # explicit models endpoint; falls back to {base_url}/models
-                                 # set when the models catalog lives at a different URL
-                                 # (e.g. OpenRouter: public /api/v1/models vs /api/v1 inference)
-    auth_type: str               # "api_key" | "oauth_device_code" | "oauth_external" |
-                                 # "copilot" | "aws" | "external_process"
-
-    # Client-level quirks
-    default_headers: dict        # extra HTTP headers sent on every request
-
-    # Request-level quirks
-    fixed_temperature: Any       # None = use caller's default; OMIT_TEMPERATURE = don't send
-    default_max_tokens: int|None # inject max_tokens when caller omits it
-    default_aux_model: str       # cheap model for auxiliary tasks (compression, vision, etc.)
-                                 # empty string = use main model (default)
-```
+- `hermes_cli/auth.py` extends `PROVIDER_REGISTRY` with every api-key
+  profile it sees (skipping `copilot`, `kimi-coding`, `kimi-coding-cn`,
+  `zai`, `openrouter`, `custom` — those need bespoke token resolution).
+- `hermes_cli/models.py` extends `CANONICAL_PROVIDERS` and calls
+  `profile.fetch_models()` inside `provider_model_ids()`.
+- `hermes_cli/doctor.py` adds a `/models` health check for each
+  `auth_type="api_key"` profile.
+- `hermes_cli/config.py` injects every `env_var` into
+  `OPTIONAL_ENV_VARS` so the setup wizard knows about it.
+- `hermes_cli/runtime_provider.py` reads `profile.api_mode` as a fallback
+  when URL detection finds nothing.
+- `agent/model_metadata.py` maps hostname → provider via
+  `profile.get_hostname()`.
+- `agent/auxiliary_client.py` reads `profile.default_aux_model` first
+  before falling back to the legacy hardcoded dict.
+- `agent/transports/chat_completions.py::_build_kwargs_from_profile()`
+  invokes `profile.prepare_messages()`, `profile.build_extra_body()`,
+  and `profile.build_api_kwargs_extras()` on every call.
+- `run_agent.py` passes `provider_profile=<ProviderProfile>` so the
+  transport takes the profile path instead of the legacy flag path.

 ---

-## Hooks (override in a subclass)
+## Adding a provider

-| Method | When to override |
-|--------|-----------------|
-| `prepare_messages(messages)` | Provider needs message pre-processing (Qwen: string → list-of-parts, cache_control) |
-| `build_extra_body(*, session_id, **ctx)` | Provider-specific `extra_body` fields (Nous: tags, OpenRouter: provider preferences) |
-| `build_api_kwargs_extras(*, reasoning_config, **ctx)` | Returns `(extra_body_additions, top_level_kwargs)` — use when some fields go to `extra_body` and some go top-level (Kimi: `reasoning_effort` top-level; OpenRouter: `reasoning` in extra_body) |
-| `fetch_models(*, api_key, timeout)` | Custom model listing (Anthropic: x-api-key header; OpenRouter: public endpoint, no auth; Bedrock/copilot-acp: return None) |
-
-All hooks have safe defaults — only override what differs from the base.
+See `plugins/model-providers/README.md` — drop a new directory there (or
+under `$HERMES_HOME/plugins/model-providers/` for a private plugin).

 ---

-## How to add a new provider
+## Hooks you can override on `ProviderProfile`

-### 1. Simple (standard OpenAI-compatible endpoint)
-
-```python
-# providers/myprovider.py
-from providers import register_provider
-from providers.base import ProviderProfile
-
-myprovider = ProviderProfile(
-    name="myprovider",           # must match id in hermes_cli/auth.py PROVIDER_REGISTRY
-    aliases=("my-provider", "myp"),
-    api_mode="chat_completions",
-    env_vars=("MYPROVIDER_API_KEY",),
-    base_url="https://api.myprovider.com/v1",
-    auth_type="api_key",
-)
-
-register_provider(myprovider)
-```
-
-The default `fetch_models()` will call `GET https://api.myprovider.com/v1/models`
-with Bearer auth automatically. No override needed for standard `/v1/models`.
-
-### 2. With quirks (subclass)
-
-```python
-# providers/myprovider.py
-from typing import Any
-from providers import register_provider
-from providers.base import ProviderProfile
-
-
-class MyProviderProfile(ProviderProfile):
-    """My provider — custom reasoning header."""
-
-    def build_api_kwargs_extras(
-        self,
-        *,
-        reasoning_config: dict | None = None,
-        **ctx: Any,
-    ) -> tuple[dict[str, Any], dict[str, Any]]:
-        extra_body: dict[str, Any] = {}
-        if reasoning_config:
-            extra_body["my_reasoning"] = reasoning_config.get("effort", "medium")
-        return extra_body, {}
-
-    def fetch_models(
-        self,
-        *,
-        api_key: str | None = None,
-        timeout: float = 8.0,
-    ) -> list[str] | None:
-        # Override only if your endpoint differs from standard /v1/models
-        return super().fetch_models(api_key=api_key, timeout=timeout)
-
-
-myprovider = MyProviderProfile(
-    name="myprovider",
-    aliases=("myp",),
-    env_vars=("MYPROVIDER_API_KEY",),
-    base_url="https://api.myprovider.com/v1",
-)
-
-register_provider(myprovider)
-```
-
-### 3. Wire it up
-
-After creating the file, add `name` to the `_PROFILE_ACTIVE_PROVIDERS` set in
-`run_agent.py` once you've verified parity against the legacy flag path. Start
-with a simple provider (no message prep, no reasoning quirks) and work up.
+| Hook | Purpose |
+|------|---------|
+| `get_hostname()` | URL-based detection — default derives from `base_url`. |
+| `prepare_messages(msgs)` | Provider-specific message preprocessing (Qwen normalises to list-of-parts, injects `cache_control`). |
+| `build_extra_body(**ctx)` | Provider-specific `extra_body` (OpenRouter provider prefs, Gemini `thinking_config`). |
+| `build_api_kwargs_extras(**ctx)` | `(extra_body_additions, top_level_kwargs)` — Kimi puts reasoning_effort top-level, Qwen splits `enable_thinking`/`thinking_budget`. |
+| `fetch_models(*, api_key)` | Live catalog fetch — default hits `{models_url or base_url}/models` with Bearer auth. Override for no-REST providers (Bedrock), OAuth catalogs (Anthropic), or public catalogs (OpenRouter). |

 ---

-## fetch_models contract
+## Configuration fields

-```python
-def fetch_models(
-    self,
-    *,
-    api_key: str | None = None,
-    timeout: float = 8.0,
-) -> list[str] | None:
-    ...
-```
-
-- Returns `list[str]`: model IDs from the provider's live endpoint.
-- Returns `None`: provider doesn't support REST model listing (Bedrock, copilot-acp),
-  or the request failed. Callers **must** fall back to `_PROVIDER_MODELS` on `None`.
-- Never raises; swallow exceptions and return `None`.
-- Default implementation: `GET {base_url}/models` with Bearer auth. Works for any
-  standard OpenAI-compatible provider.
-
-**Override when:**
-- Auth header is not `Bearer` (Anthropic: `x-api-key`)
-- Endpoint path differs from `/models` AND you can't just set `models_url` (OpenRouter: public endpoint, pass `api_key=None` explicitly)
-- Response format differs (extra wrapping, non-standard `id` field)
-- Provider has no REST endpoint (Bedrock, copilot-acp → return `None`)
-- Filtering needed post-fetch (only tool-capable models, etc.)
-
-Use `models_url` instead of overriding when the only difference is the URL:
-
-```python
-# No subclass needed — just set models_url
-myprovider = ProviderProfile(
-    name="myprovider",
-    base_url="https://api.myprovider.com/v1",
-    models_url="https://catalog.myprovider.com/models",  # different host
-)
-```
-
----
-
-## Debugging
-
-### Check if a provider resolves
-
-```python
-from providers import get_provider_profile
-
-p = get_provider_profile("myprovider")
-print(p)           # ProviderProfile(name='myprovider', ...)
-print(p.base_url)
-print(p.api_mode)
-```
-
-### Check all registered providers
-
-```python
-from providers import _REGISTRY
-print(list(_REGISTRY.keys()))
-```
-
-### Test live model fetch
-
-```python
-import os
-from providers import get_provider_profile
-
-p = get_provider_profile("myprovider")
-key = os.getenv("MYPROVIDER_API_KEY")
-models = p.fetch_models(api_key=key, timeout=5.0)
-print(models)      # list of model IDs, or None on failure
-```
-
-### Test alias resolution
-
-```python
-from providers import get_provider_profile
-
-# All of these should return the same profile
-assert get_provider_profile("openrouter").name == "openrouter"
-assert get_provider_profile("or").name == "openrouter"
-```
-
-### Run the provider test suite
-
-```bash
-# From the repo root
-source venv/bin/activate
-python -m pytest tests/providers/ -v
-```
-
-### Check ruff + ty compliance
-
-```bash
-source venv/bin/activate
-ruff format providers/*.py
-ruff check providers/*.py --select UP,E,F,I,W
-ty check providers/*.py
-```
-
----
-
-## Common mistakes
-
-**Wrong `name`** — must be the same string that appears as the key in
-`hermes_cli/auth.py` `PROVIDER_REGISTRY`. New api-key providers auto-register
-into `PROVIDER_REGISTRY` from the profile, so the name IS the key. For providers
-with a pre-existing `PROVIDER_REGISTRY` entry, use the exact `id` field value.
-
-**Wrong `env_vars`** — separate API-key vars from base-URL override vars in the
-tuple. Env vars that end with `_BASE_URL` or `_URL` are treated as URL overrides;
-everything else is treated as an API key. Getting this wrong causes the doctor
-health check to send a URL string as a Bearer token.
-
-**Wrong `base_url`** — several providers have non-obvious paths:
-`stepfun: /step_plan/v1`, `opencode-go: /zen/go/v1`. The profile's `base_url`
-is also used as the `inference_base_url` when auto-registering into `PROVIDER_REGISTRY`
-for new providers, so it must be correct for auth resolution to work.
-
-**Skipping `api_mode`** — defaults to `chat_completions`. Providers that use
-`anthropic_messages`, `codex_responses`, `bedrock_converse`, or `copilot_acp`
-must set it explicitly.
-
-**Forgetting `register_provider()`** — auto-discovery runs `pkgutil.iter_modules`
-over the package and imports each module, but only if `register_provider()` is
-called at module level. Without it the profile is never in `_REGISTRY`.
-
-**`fetch_models` returning the wrong shape** — must return `list[str]` (plain
-model IDs), not `list[tuple]` or `list[dict]`. Callers expect plain strings.
-
-**Wrong `build_api_kwargs_extras` return shape** — must return a 2-tuple
-`(extra_body_dict, top_level_dict)`. Returning a single dict causes a
-`ValueError: not enough values to unpack` in the transport.
-
-**`build_api_kwargs_extras` wrong tuple** — must return `(extra_body_dict,
-top_level_dict)`. Returning a flat dict or swapping the order silently sends
-fields to the wrong place.
+Full reference in `providers/base.py` dataclass definition.
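
Note on the new hooks table: `build_api_kwargs_extras` returns a 2-tuple of `(extra_body_additions, top_level_kwargs)`. The sketch below shows one way a caller might merge that tuple with the other hooks. It is a rough illustration under stated assumptions, not the transport's actual `_build_kwargs_from_profile()`: the `ProviderProfile` stand-in carries only the hook names from the table, and `TopLevelReasoningProfile` and `build_kwargs()` are hypothetical names.

```python
from typing import Any


class ProviderProfile:
    """Stand-in with hook names from the table; not the real base class."""

    def prepare_messages(self, messages: list[dict]) -> list[dict]:
        return messages

    def build_extra_body(self, **ctx: Any) -> dict[str, Any]:
        return {}

    def build_api_kwargs_extras(
        self, **ctx: Any
    ) -> tuple[dict[str, Any], dict[str, Any]]:
        return {}, {}


class TopLevelReasoningProfile(ProviderProfile):
    """Hypothetical provider that wants reasoning_effort as a top-level kwarg."""

    def build_api_kwargs_extras(self, *, reasoning_config=None, **ctx):
        top_level: dict[str, Any] = {}
        if reasoning_config:
            top_level["reasoning_effort"] = reasoning_config.get("effort", "medium")
        # First element feeds extra_body, second becomes top-level kwargs.
        return {}, top_level


def build_kwargs(profile, model, messages, **ctx):
    """Hypothetical caller showing where each half of the 2-tuple ends up."""
    extra_body = dict(profile.build_extra_body(**ctx))
    body_additions, top_level = profile.build_api_kwargs_extras(**ctx)
    extra_body.update(body_additions)
    kwargs = {"model": model, "messages": profile.prepare_messages(messages)}
    kwargs.update(top_level)
    if extra_body:
        kwargs["extra_body"] = extra_body
    return kwargs
```

Because the caller unpacks two values, a hook that returns a single dict or swaps the order would either fail at unpacking or send fields to the wrong place, which is the failure mode the removed "Common mistakes" section warned about.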