feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path

Introduces providers/ as the single source of truth for every inference
provider. All 29 providers are declared with data cross-checked against
auth.py, runtime_provider.py, and auxiliary_client.py.
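
A minimal sketch of the ProviderProfile pattern described above (class and field names are illustrative; the actual code in providers/ may differ):

```python
# Hypothetical sketch of the ProviderProfile pattern; names are
# illustrative, not the actual providers/ code.
from abc import ABC, abstractmethod
from dataclasses import dataclass


class ProviderProfile(ABC):
    """Single source of truth for one inference provider."""

    name: str
    env_key: str           # API-key environment variable
    default_base_url: str  # overridable via a <NAME>_BASE_URL env var

    @abstractmethod
    def fetch_models(self) -> list[str]:
        """Return the model ids this provider currently serves."""


@dataclass
class GMIProfile(ProviderProfile):
    name: str = "gmi"
    env_key: str = "GMI_API_KEY"
    default_base_url: str = "https://api.gmi.ai/v1"

    def fetch_models(self) -> list[str]:
        # A real implementation would query the provider's /models endpoint.
        return ["deepseek-ai/DeepSeek-R1"]
```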

Rebased onto main (30307a980). Incorporates post-salvage fixes from
56724147e (gmi aux model google/gemini-3.1-flash-lite-preview, already set in providers/gmi.py).
kshitijk4poor 2026-04-28 08:56:12 +05:30
parent 30307a9802
commit 84d1673e2f
60 changed files with 3939 additions and 1034 deletions

@@ -423,6 +423,44 @@ model:
For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://localhost:8000/v1`. NIM exposes the same OpenAI-compatible chat completions API as build.nvidia.com, so switching between cloud and local is a one-line env-var change.
:::
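For example, switching an existing setup to a local NIM endpoint might look like this (the model id is illustrative):

```bash
# On-prem NIM: same OpenAI-compatible API, different base URL
export NVIDIA_BASE_URL=http://localhost:8000/v1
# then run, e.g.:
#   hermes chat --provider nvidia --model meta/llama-3.1-8b-instruct
```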
### GMI Cloud
Open-weight and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication.
```bash
# GMI Cloud
hermes chat --provider gmi --model deepseek-ai/DeepSeek-R1
# Requires: GMI_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
provider: "gmi"
default: "deepseek-ai/DeepSeek-R1"
```
The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`).
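Assuming the same env-var conventions as the other providers, an override might look like this (the gateway URL is illustrative):

```bash
# Route GMI traffic through a different OpenAI-compatible endpoint
export GMI_BASE_URL=https://my-gateway.example.com/v1
# then run, e.g.:
#   hermes chat --provider gmi --model deepseek-ai/DeepSeek-R1
```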
### StepFun
Step-series models via [StepFun](https://platform.stepfun.com) — OpenAI-compatible API, API key authentication.
```bash
# StepFun
hermes chat --provider stepfun --model step-3-mini
# Requires: STEPFUN_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
provider: "stepfun"
default: "step-3-mini"
```
The base URL can be overridden with `STEPFUN_BASE_URL` (default: `https://api.stepfun.com/v1`).
### Hugging Face Inference Providers
[Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers) routes to 20+ open models through a unified OpenAI-compatible endpoint (`router.huggingface.co/v1`). Requests are routed to the fastest available backend (Groq, Together, SambaNova, etc.), with automatic failover.
@@ -1178,7 +1216,7 @@ fallback_model:
When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
-Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `custom`.
+Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `deepseek`, `nvidia`, `gmi`, `stepfun`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `alibaba`, `custom`.
:::tip
Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).