feat(providers): make all 33 providers pluggable under plugins/model-providers/

Every provider profile is now a self-contained plugin under
plugins/model-providers/<name>/, mirroring the plugins/platforms/
pattern established for IRC and Teams. The ProviderProfile ABC
stays in providers/; the per-provider profile data moves out.

- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins
  then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins
  in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with
  out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory
  plugins (providers/ discovery owns them); standalone plugins whose
  __init__.py contains register_provider + ProviderProfile are
  auto-coerced to this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general
  PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering
  bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md,
  provider-runtime.md, providers/README.md, plugins/model-providers/README.md

No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py /
model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py
all still consume providers via get_provider_profile() / list_providers() —
they just now see plugin-discovered entries instead of pkgutil-iterated ones.

Third parties can now drop a single directory into
~/.hermes/plugins/model-providers/<name>/ to add or override an inference
provider without touching the repo.
Teknium 2026-05-05 13:36:08 -07:00
parent 20a4f79ed1
commit 9022804d78
63 changed files with 585 additions and 309 deletions

@@ -0,0 +1,70 @@
# Model Provider Plugins

Each subdirectory is a self-contained provider profile plugin. The
directory layout mirrors `plugins/platforms/`:

```
plugins/model-providers/
├── openrouter/
│   ├── __init__.py   # registers the ProviderProfile
│   └── plugin.yaml   # manifest: name, kind, version, description
├── anthropic/
│   ├── __init__.py
│   └── plugin.yaml
└── ...
```
## How discovery works

`providers/__init__.py._discover_providers()` scans this directory (and
`$HERMES_HOME/plugins/model-providers/`) the first time anything calls
`get_provider_profile()` or `list_providers()`. Each `__init__.py` is
imported and expected to call `providers.register_provider(profile)`.

User plugins at `$HERMES_HOME/plugins/model-providers/<name>/` override
bundled plugins of the same name — last-writer-wins in
`register_provider()`. Drop a plugin directory there to replace a built-in.
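
The override semantics are easiest to see as code. Below is a minimal sketch
of the registry, assuming a module-level dict (illustrative only — the real
`_discover_providers()` in `providers/__init__.py` also handles the legacy
`providers/<name>.py` layout and the PluginManager coercion heuristics):

```python
# Illustrative sketch — not the actual providers/__init__.py implementation.
import importlib.util
import os
from pathlib import Path
from typing import Any

_REGISTRY: dict[str, Any] = {}

def register_provider(profile: Any) -> None:
    # Last-writer-wins: re-registering a name replaces the earlier profile.
    _REGISTRY[profile.name] = profile

def _discover_providers() -> None:
    hermes_home = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
    bundled = Path(__file__).resolve().parent.parent / "plugins" / "model-providers"
    user = hermes_home / "plugins" / "model-providers"
    for root in (bundled, user):  # user dir is scanned last, so its entries win
        if not root.is_dir():
            continue
        for init_py in sorted(root.glob("*/__init__.py")):
            spec = importlib.util.spec_from_file_location(
                f"hermes_provider_plugin_{init_py.parent.name}", init_py
            )
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # plugin calls register_provider()
```
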
## Adding a new provider

1. Create `plugins/model-providers/<your_provider>/__init__.py`:

```python
from providers import register_provider
from providers.base import ProviderProfile

my_provider = ProviderProfile(
    name="your-provider",
    aliases=("alias1", "alias2"),
    display_name="Your Provider",
    description="One-line description shown in the setup picker",
    signup_url="https://your-provider.example.com/keys",
    env_vars=("YOUR_PROVIDER_API_KEY", "YOUR_PROVIDER_BASE_URL"),
    base_url="https://api.your-provider.example.com/v1",
    default_aux_model="your-cheap-model",
)
register_provider(my_provider)
```
2. Create `plugins/model-providers/<your_provider>/plugin.yaml`:

```yaml
name: your-provider-profile
kind: model-provider
version: 1.0.0
description: Short sentence about the provider
author: Your Name
```

Nothing else needs to change. `auth.py`, `config.py`, `models.py`,
`doctor.py`, `model_metadata.py`, `runtime_provider.py`, and the
chat_completions transport all auto-wire from the registry.
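
For example, once the directory is in place, the new profile is reachable
through the same registry calls every consumer uses (a usage sketch; output
depends on what is installed):

```python
from providers import get_provider_profile, list_providers

profile = get_provider_profile("your-provider")  # first call triggers discovery
print(profile.base_url)       # https://api.your-provider.example.com/v1
print(profile.display_name)   # Your Provider

assert "your-provider" in list_providers()
# Aliases declared on the profile should resolve through the same lookup,
# e.g. get_provider_profile("alias1") — assuming alias resolution happens here.
```
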
## Non-trivial profiles

Override the `ProviderProfile` hooks in a subclass for per-provider
quirks — see `plugins/model-providers/openrouter/__init__.py` for
`build_extra_body` and `build_api_kwargs_extras` examples, and
`plugins/model-providers/gemini/__init__.py` for `thinking_config`
translation.
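
As a starting point, here is a hedged minimal sketch of such a subclass (the
hook signature matches the bundled profiles; the `vendor_options` payload is a
made-up example, not a real API field):

```python
from typing import Any

from providers import register_provider
from providers.base import ProviderProfile

class ExampleQuirkProfile(ProviderProfile):
    """Hypothetical provider that wants a vendor flag on every request."""

    def build_extra_body(
        self, *, session_id: str | None = None, **context: Any
    ) -> dict[str, Any]:
        # Made-up vendor knob — substitute whatever your backend expects.
        return {"vendor_options": {"session_tagging": session_id is not None}}

register_provider(ExampleQuirkProfile(
    name="example-quirk",
    env_vars=("EXAMPLE_QUIRK_API_KEY",),
    base_url="https://api.example-quirk.invalid/v1",
))
```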

@@ -0,0 +1,43 @@
"""Vercel AI Gateway provider profile.
AI Gateway routes to multiple backends. Hermes sends attribution
headers and full reasoning config passthrough.
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class VercelAIGatewayProfile(ProviderProfile):
"""Vercel AI Gateway — attribution headers + reasoning passthrough."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = True,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if supports_reasoning and reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
elif supports_reasoning:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
vercel = VercelAIGatewayProfile(
name="ai-gateway",
aliases=("vercel", "vercel-ai-gateway", "ai_gateway", "aigateway"),
env_vars=("AI_GATEWAY_API_KEY",),
base_url="https://ai-gateway.vercel.sh/v1",
default_headers={
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-Title": "Hermes Agent",
},
default_aux_model="google/gemini-3-flash",
)
register_provider(vercel)

@@ -0,0 +1,5 @@
name: ai-gateway-provider
kind: model-provider
version: 1.0.0
description: Vercel AI Gateway
author: Nous Research

@@ -0,0 +1,21 @@
"""Alibaba Cloud Coding Plan provider profile.
Separate from the standard `alibaba` profile because it hits a different
endpoint (coding-intl.dashscope.aliyuncs.com) with a dedicated API key tier.
"""
from providers import register_provider
from providers.base import ProviderProfile
alibaba_coding_plan = ProviderProfile(
name="alibaba-coding-plan",
aliases=("alibaba_coding", "alibaba-coding", "dashscope-coding"),
display_name="Alibaba Cloud (Coding Plan)",
description="Alibaba Cloud Coding Plan — dedicated coding tier",
signup_url="https://help.aliyun.com/zh/model-studio/",
env_vars=("ALIBABA_CODING_PLAN_API_KEY", "DASHSCOPE_API_KEY", "ALIBABA_CODING_PLAN_BASE_URL"),
base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
auth_type="api_key",
)
register_provider(alibaba_coding_plan)

@@ -0,0 +1,5 @@
name: alibaba-coding-plan-provider
kind: model-provider
version: 1.0.0
description: Alibaba Cloud Coding Plan
author: Nous Research

@@ -0,0 +1,13 @@
"""Alibaba Cloud DashScope provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
alibaba = ProviderProfile(
name="alibaba",
aliases=("dashscope", "alibaba-cloud", "qwen-dashscope"),
env_vars=("DASHSCOPE_API_KEY",),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
register_provider(alibaba)

@@ -0,0 +1,5 @@
name: alibaba-provider
kind: model-provider
version: 1.0.0
description: Alibaba DashScope (international)
author: Nous Research

@@ -0,0 +1,52 @@
"""Native Anthropic provider profile."""
import json
import logging
import urllib.request
from providers import register_provider
from providers.base import ProviderProfile
logger = logging.getLogger(__name__)
class AnthropicProfile(ProviderProfile):
"""Native Anthropic — uses x-api-key header, not Bearer."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Anthropic uses x-api-key header and anthropic-version."""
if not api_key:
return None
try:
req = urllib.request.Request("https://api.anthropic.com/v1/models")
req.add_header("x-api-key", api_key)
req.add_header("anthropic-version", "2023-06-01")
req.add_header("Accept", "application/json")
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
return [
m["id"]
for m in data.get("data", [])
if isinstance(m, dict) and "id" in m
]
except Exception as exc:
logger.debug("fetch_models(anthropic): %s", exc)
return None
anthropic = AnthropicProfile(
name="anthropic",
aliases=("claude", "claude-oauth", "claude-code"),
api_mode="anthropic_messages",
env_vars=("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
base_url="https://api.anthropic.com",
auth_type="api_key",
default_aux_model="claude-haiku-4-5-20251001",
)
register_provider(anthropic)

@@ -0,0 +1,5 @@
name: anthropic-provider
kind: model-provider
version: 1.0.0
description: Anthropic (Claude)
author: Nous Research

@@ -0,0 +1,13 @@
"""Arcee AI provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
arcee = ProviderProfile(
name="arcee",
aliases=("arcee-ai", "arceeai"),
env_vars=("ARCEEAI_API_KEY",),
base_url="https://api.arcee.ai/api/v1",
)
register_provider(arcee)

@@ -0,0 +1,5 @@
name: arcee-provider
kind: model-provider
version: 1.0.0
description: Arcee AI
author: Nous Research

@@ -0,0 +1,21 @@
"""Azure AI Foundry provider profile.
Azure Foundry exposes an OpenAI-compatible endpoint; users supply their own
base URL at setup since endpoints are per-resource.
"""
from providers import register_provider
from providers.base import ProviderProfile
azure_foundry = ProviderProfile(
name="azure-foundry",
aliases=("azure", "azure-ai-foundry", "azure-ai"),
display_name="Azure Foundry",
description="Azure AI Foundry — OpenAI-compatible endpoint (user-supplied base URL)",
signup_url="https://ai.azure.com/",
env_vars=("AZURE_FOUNDRY_API_KEY", "AZURE_FOUNDRY_BASE_URL"),
base_url="", # per-resource; user provides at setup
auth_type="api_key",
)
register_provider(azure_foundry)

@@ -0,0 +1,5 @@
name: azure-foundry-provider
kind: model-provider
version: 1.0.0
description: Azure AI Foundry
author: Nous Research

@@ -0,0 +1,29 @@
"""AWS Bedrock provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
class BedrockProfile(ProviderProfile):
"""AWS Bedrock — no REST /v1/models endpoint; uses AWS SDK."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Bedrock model listing requires AWS SDK, not a REST call."""
return None
bedrock = BedrockProfile(
name="bedrock",
aliases=("aws", "aws-bedrock", "amazon-bedrock", "amazon"),
api_mode="bedrock_converse",
env_vars=(), # AWS SDK credentials — not env vars
base_url="https://bedrock-runtime.us-east-1.amazonaws.com",
auth_type="aws_sdk",
)
register_provider(bedrock)

@@ -0,0 +1,5 @@
name: bedrock-provider
kind: model-provider
version: 1.0.0
description: AWS Bedrock
author: Nous Research

@@ -0,0 +1,34 @@
"""GitHub Copilot ACP provider profile.
copilot-acp uses an external ACP subprocess NOT the standard
transport. api_mode="copilot_acp" is handled separately in run_agent.py.
The profile captures auth + endpoint metadata for registry migration.
"""
from providers import register_provider
from providers.base import ProviderProfile
class CopilotACPProfile(ProviderProfile):
"""GitHub Copilot ACP — external process, no REST models endpoint."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Model listing is handled by the ACP subprocess."""
return None
copilot_acp = CopilotACPProfile(
name="copilot-acp",
aliases=("github-copilot-acp", "copilot-acp-agent"),
api_mode="chat_completions", # ACP subprocess uses chat_completions routing
env_vars=(), # Managed by ACP subprocess
base_url="acp://copilot", # ACP internal scheme
auth_type="external_process",
)
register_provider(copilot_acp)

@@ -0,0 +1,5 @@
name: copilot-acp-provider
kind: model-provider
version: 1.0.0
description: GitHub Copilot via ACP subprocess
author: Nous Research

@@ -0,0 +1,58 @@
"""Copilot / GitHub Models provider profile.
Copilot uses per-model api_mode routing:
- GPT-5+ / Codex models codex_responses
- Claude models anthropic_messages
- Everything else chat_completions (this profile covers that subset)
Key quirks for the chat_completions subset:
- Editor attribution headers (via copilot_default_headers())
- GitHub Models reasoning extra_body (model-catalog gated)
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class CopilotProfile(ProviderProfile):
"""GitHub Copilot / GitHub Models — editor headers + reasoning."""
def build_api_kwargs_extras(
self,
*,
model: str | None = None,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**ctx,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if supports_reasoning and model:
try:
from hermes_cli.models import github_model_reasoning_efforts
supported_efforts = github_model_reasoning_efforts(model)
if supported_efforts and reasoning_config:
effort = reasoning_config.get("effort", "medium")
# Normalize non-standard effort levels to the nearest supported
if effort == "xhigh":
effort = "high"
if effort in supported_efforts:
extra_body["reasoning"] = {"effort": effort}
elif supported_efforts:
extra_body["reasoning"] = {"effort": "medium"}
except Exception:
pass
return extra_body, {}
copilot = CopilotProfile(
name="copilot",
aliases=("github-copilot", "github-models", "github-model", "github"),
env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"),
base_url="https://api.githubcopilot.com",
auth_type="copilot",
)
register_provider(copilot)

@@ -0,0 +1,5 @@
name: copilot-provider
kind: model-provider
version: 1.0.0
description: GitHub Copilot
author: Nous Research

@@ -0,0 +1,68 @@
"""Custom / Ollama (local) provider profile.
Covers any endpoint registered as provider="custom", including local
Ollama instances. Key quirks:
- ollama_num_ctx extra_body.options.num_ctx (local context window)
- reasoning_config disabled extra_body.think = False
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class CustomProfile(ProviderProfile):
"""Custom/Ollama local provider — think=false and num_ctx support."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
ollama_num_ctx: int | None = None,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
# Ollama context window
if ollama_num_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_num_ctx
extra_body["options"] = options
# Disable thinking when reasoning is turned off
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
return extra_body, {}
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Custom/Ollama: base_url is user-configured; fetch if set."""
if not self.base_url:
return None
return super().fetch_models(api_key=api_key, timeout=timeout)
custom = CustomProfile(
name="custom",
aliases=(
"ollama",
"local",
"vllm",
"llamacpp",
"llama.cpp",
"llama-cpp",
),
env_vars=(), # No fixed key — custom endpoint
base_url="", # User-configured
)
register_provider(custom)

@@ -0,0 +1,5 @@
name: custom-provider
kind: model-provider
version: 1.0.0
description: Custom / Ollama / local OpenAI-compatible endpoint
author: Nous Research

@@ -0,0 +1,20 @@
"""DeepSeek provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
deepseek = ProviderProfile(
name="deepseek",
aliases=("deepseek-chat",),
env_vars=("DEEPSEEK_API_KEY",),
display_name="DeepSeek",
description="DeepSeek — native DeepSeek API",
signup_url="https://platform.deepseek.com/",
fallback_models=(
"deepseek-chat",
"deepseek-reasoner",
),
base_url="https://api.deepseek.com/v1",
)
register_provider(deepseek)

@@ -0,0 +1,5 @@
name: deepseek-provider
kind: model-provider
version: 1.0.0
description: DeepSeek
author: Nous Research

@@ -0,0 +1,72 @@
"""Google Gemini provider profiles.
gemini: Google AI Studio (API key) uses GeminiNativeClient
google-gemini-cli: Google Cloud Code Assist (OAuth) uses GeminiCloudCodeClient
Both report api_mode="chat_completions" but use custom native clients
that bypass the standard OpenAI transport. The profile captures auth
and endpoint metadata for auth.py / runtime_provider.py migration, and
carries the thinking_config translation hook so the transport's profile
path produces the same extra_body shape the legacy flag path did.
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class GeminiProfile(ProviderProfile):
"""Gemini — translate reasoning_config to thinking_config in extra_body."""
def build_extra_body(
self, *, session_id: str | None = None, **context: Any
) -> dict[str, Any]:
"""Emit extra_body.thinking_config (native) or extra_body.extra_body.google.thinking_config
(OpenAI-compat /openai subpath), mirroring the legacy path's behavior.
"""
from agent.transports.chat_completions import (
_build_gemini_thinking_config,
_is_gemini_openai_compat_base_url,
_snake_case_gemini_thinking_config,
)
model = context.get("model") or ""
reasoning_config = context.get("reasoning_config")
base_url = context.get("base_url") or self.base_url
raw_thinking_config = _build_gemini_thinking_config(model, reasoning_config)
if not raw_thinking_config:
return {}
body: dict[str, Any] = {}
if self.name == "gemini" and _is_gemini_openai_compat_base_url(base_url):
thinking_config = _snake_case_gemini_thinking_config(raw_thinking_config)
if thinking_config:
body["extra_body"] = {"google": {"thinking_config": thinking_config}}
else:
body["thinking_config"] = raw_thinking_config
return body
gemini = GeminiProfile(
name="gemini",
aliases=("google", "google-gemini", "google-ai-studio"),
api_mode="chat_completions",
env_vars=("GOOGLE_API_KEY", "GEMINI_API_KEY"),
base_url="https://generativelanguage.googleapis.com/v1beta",
auth_type="api_key",
default_aux_model="gemini-3-flash-preview",
)
google_gemini_cli = GeminiProfile(
name="google-gemini-cli",
aliases=("gemini-cli", "gemini-oauth"),
api_mode="chat_completions",
env_vars=(), # OAuth — no API key
base_url="cloudcode-pa://google", # Cloud Code Assist internal scheme
auth_type="oauth_external",
)
register_provider(gemini)
register_provider(google_gemini_cli)

@@ -0,0 +1,5 @@
name: gemini-provider
kind: model-provider
version: 1.0.0
description: Google Gemini (API key + Cloud Code OAuth)
author: Nous Research

@@ -0,0 +1,26 @@
"""GMI Cloud provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
gmi = ProviderProfile(
name="gmi",
aliases=("gmi-cloud", "gmicloud"),
display_name="GMI Cloud",
description="GMI Cloud — multi-model direct API (slash-form model IDs)",
signup_url="https://www.gmicloud.ai/",
env_vars=("GMI_API_KEY", "GMI_BASE_URL"),
base_url="https://api.gmi-serving.com/v1",
auth_type="api_key",
default_aux_model="google/gemini-3.1-flash-lite-preview",
fallback_models=(
"zai-org/GLM-5.1-FP8",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"google/gemini-3.1-flash-lite-preview",
"anthropic/claude-sonnet-4.6",
"openai/gpt-5.4",
),
)
register_provider(gmi)

@@ -0,0 +1,5 @@
name: gmi-provider
kind: model-provider
version: 1.0.0
description: GMI Cloud
author: Nous Research

@@ -0,0 +1,20 @@
"""Hugging Face provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
huggingface = ProviderProfile(
name="huggingface",
aliases=("hf", "hugging-face", "huggingface-hub"),
env_vars=("HF_TOKEN",),
display_name="HuggingFace",
description="HuggingFace Inference API",
signup_url="https://huggingface.co/settings/tokens",
fallback_models=(
"Qwen/Qwen3.5-72B-Instruct",
"deepseek-ai/DeepSeek-V3.2",
),
base_url="https://router.huggingface.co/v1",
)
register_provider(huggingface)

@@ -0,0 +1,5 @@
name: huggingface-provider
kind: model-provider
version: 1.0.0
description: HuggingFace Inference Providers
author: Nous Research

@@ -0,0 +1,14 @@
"""Kilo Code provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
kilocode = ProviderProfile(
name="kilocode",
aliases=("kilo-code", "kilo", "kilo-gateway"),
env_vars=("KILOCODE_API_KEY",),
base_url="https://api.kilo.ai/api/gateway",
default_aux_model="google/gemini-3-flash-preview",
)
register_provider(kilocode)

@@ -0,0 +1,5 @@
name: kilocode-provider
kind: model-provider
version: 1.0.0
description: Kilo Code
author: Nous Research

@@ -0,0 +1,71 @@
"""Kimi / Moonshot provider profiles.
Kimi has dual endpoints:
- sk-kimi-* keys api.kimi.com/coding (Anthropic Messages API)
- legacy keys api.moonshot.ai/v1 (OpenAI chat completions)
This module covers the chat_completions path (/v1 endpoint).
"""
from typing import Any
from providers import register_provider
from providers.base import OMIT_TEMPERATURE, ProviderProfile
class KimiProfile(ProviderProfile):
"""Kimi/Moonshot — temperature omitted, thinking + reasoning_effort."""
def build_api_kwargs_extras(
self, *, reasoning_config: dict | None = None, **context
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Kimi uses extra_body.thinking + top-level reasoning_effort."""
extra_body = {}
top_level = {}
if not reasoning_config or not isinstance(reasoning_config, dict):
# No config → thinking enabled, default effort
extra_body["thinking"] = {"type": "enabled"}
top_level["reasoning_effort"] = "medium"
return extra_body, top_level
enabled = reasoning_config.get("enabled", True)
if enabled is False:
extra_body["thinking"] = {"type": "disabled"}
return extra_body, top_level
# Enabled
extra_body["thinking"] = {"type": "enabled"}
effort = (reasoning_config.get("effort") or "").strip().lower()
if effort in ("low", "medium", "high"):
top_level["reasoning_effort"] = effort
else:
top_level["reasoning_effort"] = "medium"
return extra_body, top_level
kimi = KimiProfile(
name="kimi-coding",
aliases=("kimi", "moonshot", "kimi-for-coding"),
env_vars=("KIMI_API_KEY", "KIMI_CODING_API_KEY"),
base_url="https://api.moonshot.ai/v1",
fixed_temperature=OMIT_TEMPERATURE,
default_max_tokens=32000,
default_headers={"User-Agent": "hermes-agent/1.0"},
default_aux_model="kimi-k2-turbo-preview",
)
kimi_cn = KimiProfile(
name="kimi-coding-cn",
aliases=("kimi-cn", "moonshot-cn"),
env_vars=("KIMI_CN_API_KEY",),
base_url="https://api.moonshot.cn/v1",
fixed_temperature=OMIT_TEMPERATURE,
default_max_tokens=32000,
default_headers={"User-Agent": "hermes-agent/1.0"},
default_aux_model="kimi-k2-turbo-preview",
)
register_provider(kimi)
register_provider(kimi_cn)

@@ -0,0 +1,5 @@
name: kimi-coding-provider
kind: model-provider
version: 1.0.0
description: Moonshot Kimi Coding (global + China)
author: Nous Research

@@ -0,0 +1,45 @@
"""MiniMax provider profiles (international + China).
Both use anthropic_messages api_mode their inference_base_url
ends with /anthropic which triggers auto-detection to anthropic_messages.
"""
from providers import register_provider
from providers.base import ProviderProfile
minimax = ProviderProfile(
name="minimax",
aliases=("mini-max",),
api_mode="anthropic_messages",
env_vars=("MINIMAX_API_KEY",),
base_url="https://api.minimax.io/anthropic",
auth_type="api_key",
default_aux_model="MiniMax-M2.7",
)
minimax_cn = ProviderProfile(
name="minimax-cn",
aliases=("minimax-china", "minimax_cn"),
api_mode="anthropic_messages",
env_vars=("MINIMAX_CN_API_KEY",),
base_url="https://api.minimaxi.com/anthropic",
auth_type="api_key",
default_aux_model="MiniMax-M2.7",
)
minimax_oauth = ProviderProfile(
name="minimax-oauth",
aliases=("minimax_oauth", "minimax-oauth-io"),
api_mode="anthropic_messages",
display_name="MiniMax (OAuth)",
description="MiniMax via OAuth browser flow — no API key required",
signup_url="https://api.minimax.io/",
env_vars=(), # OAuth — tokens in auth.json, not env
base_url="https://api.minimax.io/anthropic",
auth_type="oauth_external",
default_aux_model="MiniMax-M2.7-highspeed",
)
register_provider(minimax)
register_provider(minimax_cn)
register_provider(minimax_oauth)

@@ -0,0 +1,5 @@
name: minimax-provider
kind: model-provider
version: 1.0.0
description: MiniMax M-series (global + China + OAuth)
author: Nous Research

@@ -0,0 +1,53 @@
"""Nous Portal provider profile."""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class NousProfile(ProviderProfile):
"""Nous Portal — product tags, reasoning with Nous-specific omission."""
def build_extra_body(
self, *, session_id: str | None = None, **context
) -> dict[str, Any]:
return {"tags": ["product=hermes-agent"]}
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**context,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Nous: passes full reasoning_config, but OMITS when disabled."""
extra_body = {}
if supports_reasoning:
if reasoning_config is not None:
rc = dict(reasoning_config)
if rc.get("enabled") is False:
pass # Nous omits reasoning when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
nous = NousProfile(
name="nous",
aliases=("nous-portal", "nousresearch"),
env_vars=("NOUS_API_KEY",),
display_name="Nous Research",
description="Nous Research — Hermes model family",
signup_url="https://nousresearch.com/",
fallback_models=(
"hermes-3-405b",
"hermes-3-70b",
),
base_url="https://inference.nousresearch.com/v1",
auth_type="oauth_device_code",
)
register_provider(nous)

@@ -0,0 +1,5 @@
name: nous-provider
kind: model-provider
version: 1.0.0
description: Nous Research Portal
author: Nous Research

@@ -0,0 +1,21 @@
"""NVIDIA NIM provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
nvidia = ProviderProfile(
name="nvidia",
aliases=("nvidia-nim",),
env_vars=("NVIDIA_API_KEY",),
display_name="NVIDIA NIM",
description="NVIDIA NIM — accelerated inference",
signup_url="https://build.nvidia.com/",
fallback_models=(
"nvidia/llama-3.1-nemotron-70b-instruct",
"nvidia/llama-3.3-70b-instruct",
),
base_url="https://integrate.api.nvidia.com/v1",
default_max_tokens=16384,
)
register_provider(nvidia)

@@ -0,0 +1,5 @@
name: nvidia-provider
kind: model-provider
version: 1.0.0
description: NVIDIA NIM
author: Nous Research

@@ -0,0 +1,14 @@
"""Ollama Cloud provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
ollama_cloud = ProviderProfile(
name="ollama-cloud",
aliases=("ollama_cloud",),
default_aux_model="nemotron-3-nano:30b",
env_vars=("OLLAMA_API_KEY",),
base_url="https://ollama.com/v1",
)
register_provider(ollama_cloud)

@@ -0,0 +1,5 @@
name: ollama-cloud-provider
kind: model-provider
version: 1.0.0
description: Ollama Cloud
author: Nous Research

@@ -0,0 +1,15 @@
"""OpenAI Codex (Responses API) provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
openai_codex = ProviderProfile(
name="openai-codex",
aliases=("codex", "openai_codex"),
api_mode="codex_responses",
env_vars=(), # OAuth external — no API key
base_url="https://chatgpt.com/backend-api/codex",
auth_type="oauth_external",
)
register_provider(openai_codex)

@@ -0,0 +1,5 @@
name: openai-codex-provider
kind: model-provider
version: 1.0.0
description: OpenAI Codex (Responses API)
author: Nous Research

@@ -0,0 +1,30 @@
"""OpenCode provider profiles (Zen + Go).
Both use per-model api_mode routing:
- OpenCode Zen: Claude anthropic_messages, GPT-5/Codex codex_responses,
everything else chat_completions (this profile)
- OpenCode Go: MiniMax anthropic_messages, GLM/Kimi chat_completions
(this profile)
"""
from providers import register_provider
from providers.base import ProviderProfile
opencode_zen = ProviderProfile(
name="opencode-zen",
aliases=("opencode", "opencode_zen", "zen"),
env_vars=("OPENCODE_ZEN_API_KEY",),
base_url="https://opencode.ai/zen/v1",
default_aux_model="gemini-3-flash",
)
opencode_go = ProviderProfile(
name="opencode-go",
aliases=("opencode_go", "go", "opencode-go-sub"),
env_vars=("OPENCODE_GO_API_KEY",),
base_url="https://opencode.ai/zen/go/v1",
default_aux_model="glm-5",
)
register_provider(opencode_zen)
register_provider(opencode_go)

@@ -0,0 +1,5 @@
name: opencode-zen-provider
kind: model-provider
version: 1.0.0
description: OpenCode (Zen + Go)
author: Nous Research

@@ -0,0 +1,86 @@
"""OpenRouter provider profile."""
import logging
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
logger = logging.getLogger(__name__)
_CACHE: list[str] | None = None
class OpenRouterProfile(ProviderProfile):
"""OpenRouter aggregator — provider preferences, reasoning config passthrough."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Fetch from public OpenRouter catalog — no auth required.
Note: Tool-call capability filtering is applied by hermes_cli/models.py
via fetch_openrouter_models() _openrouter_model_supports_tools(), not
here. The picker early-returns via the dedicated openrouter path before
reaching this method, so filtering here would be unreachable.
"""
global _CACHE # noqa: PLW0603
if _CACHE is not None:
return _CACHE
try:
result = super().fetch_models(api_key=None, timeout=timeout)
if result is not None:
_CACHE = result
return result
except Exception as exc:
logger.debug("fetch_models(openrouter): %s", exc)
return None
def build_extra_body(
self, *, session_id: str | None = None, **context: Any
) -> dict[str, Any]:
body: dict[str, Any] = {}
prefs = context.get("provider_preferences")
if prefs:
body["provider"] = prefs
return body
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**context: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""OpenRouter passes the full reasoning_config dict as extra_body.reasoning."""
extra_body: dict[str, Any] = {}
if supports_reasoning:
if reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
openrouter = OpenRouterProfile(
name="openrouter",
aliases=("or",),
env_vars=("OPENROUTER_API_KEY",),
display_name="OpenRouter",
description="OpenRouter — unified API for 200+ models",
signup_url="https://openrouter.ai/keys",
base_url="https://openrouter.ai/api/v1",
models_url="https://openrouter.ai/api/v1/models",
fallback_models=(
"anthropic/claude-sonnet-4.6",
"openai/gpt-5.4",
"deepseek/deepseek-chat",
"google/gemini-3-flash-preview",
"qwen/qwen3-plus",
),
)
register_provider(openrouter)

@@ -0,0 +1,5 @@
name: openrouter-provider
kind: model-provider
version: 1.0.0
description: OpenRouter aggregator
author: Nous Research

@@ -0,0 +1,82 @@
"""Qwen Portal provider profile."""
import copy
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class QwenProfile(ProviderProfile):
"""Qwen Portal — message normalization, vl_high_resolution, metadata top-level."""
def prepare_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Normalize content to list-of-dicts format.
Inject cache_control on system message.
Matches the behavior of run_agent.py:_qwen_prepare_chat_messages().
"""
prepared = copy.deepcopy(messages)
if not prepared:
return prepared
for msg in prepared:
if not isinstance(msg, dict):
continue
content = msg.get("content")
if isinstance(content, str):
msg["content"] = [{"type": "text", "text": content}]
elif isinstance(content, list):
normalized_parts = []
for part in content:
if isinstance(part, str):
normalized_parts.append({"type": "text", "text": part})
elif isinstance(part, dict):
normalized_parts.append(part)
if normalized_parts:
msg["content"] = normalized_parts
# Inject cache_control on the last part of the system message.
for msg in prepared:
if isinstance(msg, dict) and msg.get("role") == "system":
content = msg.get("content")
if (
isinstance(content, list)
and content
and isinstance(content[-1], dict)
):
content[-1]["cache_control"] = {"type": "ephemeral"}
break
return prepared
def build_extra_body(
self, *, session_id: str | None = None, **context
) -> dict[str, Any]:
return {"vl_high_resolution_images": True}
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
qwen_session_metadata: dict | None = None,
**context,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Qwen metadata goes to top-level api_kwargs, not extra_body."""
top_level = {}
if qwen_session_metadata:
top_level["metadata"] = qwen_session_metadata
return {}, top_level
qwen = QwenProfile(
name="qwen-oauth",
aliases=("qwen", "qwen-portal", "qwen-cli"),
env_vars=("QWEN_API_KEY",),
base_url="https://portal.qwen.ai/v1",
auth_type="oauth_external",
default_max_tokens=65536,
)
register_provider(qwen)

@@ -0,0 +1,5 @@
name: qwen-oauth-provider
kind: model-provider
version: 1.0.0
description: Qwen Portal (OAuth)
author: Nous Research

@@ -0,0 +1,14 @@
"""StepFun provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
stepfun = ProviderProfile(
name="stepfun",
aliases=("step", "stepfun-coding-plan"),
default_aux_model="step-3.5-flash",
env_vars=("STEPFUN_API_KEY",),
base_url="https://api.stepfun.ai/step_plan/v1",
)
register_provider(stepfun)

@@ -0,0 +1,5 @@
name: stepfun-provider
kind: model-provider
version: 1.0.0
description: StepFun Step Plan
author: Nous Research

@@ -0,0 +1,15 @@
"""xAI (Grok) provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
xai = ProviderProfile(
name="xai",
aliases=("grok", "x-ai", "x.ai"),
api_mode="codex_responses",
env_vars=("XAI_API_KEY",),
base_url="https://api.x.ai/v1",
auth_type="api_key",
)
register_provider(xai)

@@ -0,0 +1,5 @@
name: xai-provider
kind: model-provider
version: 1.0.0
description: xAI Grok (Responses API)
author: Nous Research

@@ -0,0 +1,13 @@
"""Xiaomi MiMo provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
xiaomi = ProviderProfile(
name="xiaomi",
aliases=("mimo", "xiaomi-mimo"),
env_vars=("XIAOMI_API_KEY",),
base_url="https://api.xiaomimimo.com/v1",
)
register_provider(xiaomi)

@@ -0,0 +1,5 @@
name: xiaomi-provider
kind: model-provider
version: 1.0.0
description: Xiaomi MiMo
author: Nous Research

@@ -0,0 +1,21 @@
"""ZAI / GLM provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
zai = ProviderProfile(
name="zai",
aliases=("glm", "z-ai", "z.ai", "zhipu"),
env_vars=("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
display_name="Z.AI (GLM)",
description="Z.AI / GLM — Zhipu AI models",
signup_url="https://z.ai/",
fallback_models=(
"glm-5",
"glm-4-9b",
),
base_url="https://api.z.ai/api/paas/v4",
default_aux_model="glm-4.5-flash",
)
register_provider(zai)

@@ -0,0 +1,5 @@
name: zai-provider
kind: model-provider
version: 1.0.0
description: Z.AI / GLM
author: Nous Research