mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-05-02 02:01:47 +00:00
feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path
feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path. Introduces providers/ as the single source of truth for every inference provider. All 29 providers are declared with correct data, cross-checked against auth.py, runtime_provider.py, and auxiliary_client.py. Rebased onto main (30307a980). Incorporates post-salvage fixes from 56724147e (gmi aux model google/gemini-3.1-flash-lite-preview, already set in providers/gmi.py).
This commit is contained in:
parent: 30307a9802
commit: 84d1673e2f
60 changed files with 3939 additions and 1034 deletions
@@ -93,6 +93,42 @@ This path includes everything from Path A plus:
11. `run_agent.py`
12. `pyproject.toml` if a provider SDK is required

## Fast path: Simple API-key providers

If your provider is just an OpenAI-compatible endpoint that authenticates with a single API key, you do not need to touch `auth.py`, `runtime_provider.py`, `main.py`, or any of the other files in the full checklist below.

All you need is:

1. A file in `providers/` (e.g. `providers/myprovider.py`) that calls `register_provider()` with the provider config.
2. That's it. `auth.py` auto-registers every file in `providers/` at startup via a module-level import sweep.
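The import sweep can be sketched in a few lines. The sketch below is self-contained and illustrative only: the registry, the `provider_api` module name, and the package layout are stand-ins, not the real `auth.py` code.

```python
# Sketch of a module-level import sweep: importing every module in a
# package so each one can register itself. Names here are hypothetical.
import importlib
import pkgutil
import sys
import tempfile
import types
from pathlib import Path

PROVIDER_REGISTRY = {}  # stand-in for the registry auth.py populates

def register_provider(provider_id, **config):
    """Stand-in for the helper each providers/*.py calls."""
    PROVIDER_REGISTRY[provider_id] = config

# Expose the helper under a known module name so generated files can import it.
helper = types.ModuleType("provider_api")
helper.register_provider = register_provider
sys.modules["provider_api"] = helper

# Fake a providers/ package on disk so the sweep has something to import.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_providers"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "gmi.py").write_text(
    "from provider_api import register_provider\n"
    "register_provider('gmi', base_url='https://api.gmi.ai/v1')\n"
)
(pkg / "stepfun.py").write_text(
    "from provider_api import register_provider\n"
    "register_provider('stepfun', base_url='https://api.stepfun.com/v1')\n"
)

# The sweep: import every module found in the package, once, at startup.
sys.path.insert(0, str(root))
package = importlib.import_module("demo_providers")
for info in pkgutil.iter_modules(package.__path__):
    importlib.import_module(f"demo_providers.{info.name}")

print(sorted(PROVIDER_REGISTRY))  # → ['gmi', 'stepfun']
```

Importing a module runs its top-level code, so each provider file registers itself as a side effect of the sweep — no central list to maintain.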

When you add a `providers/*.py` file and call `register_provider()`, the following wire up automatically:

1. `PROVIDER_REGISTRY` entry in `auth.py` (credential resolution, env-var lookup)
2. `api_mode` set to `chat_completions`
3. `base_url` sourced from the config or the declared env var
4. `env_vars` checked in priority order for the API key
5. `fallback_models` list registered for the provider
6. `--provider` CLI flag accepts the provider id
7. `hermes model` menu includes the provider
8. `hermes setup` wizard delegates to `main.py` automatically
9. `provider:model` alias syntax works
10. Runtime resolver returns the correct `base_url` and `api_key`
11. `HERMES_INFERENCE_PROVIDER` env-var override accepts the provider id
12. Fallback model activation can switch into the provider cleanly

Use `providers/nvidia.py` or `providers/gmi.py` as a template.
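A fast-path provider file might look like the sketch below. The keyword names mirror the fields listed above, but the real `register_provider()` signature and import path should be checked against `providers/gmi.py`; the local stub exists only so the sketch runs standalone, and the provider id, endpoint, and model id are hypothetical.

```python
# Hypothetical providers/myprovider.py sketch. The real module imports
# register_provider from the providers package; this stub stands in for it.
def register_provider(provider_id, **config):
    # Stand-in: the real helper records the profile in PROVIDER_REGISTRY.
    return {provider_id: config}

PROFILE = register_provider(
    "myprovider",                                   # canonical provider id
    api_mode="chat_completions",                    # fast-path default
    base_url="https://api.myprovider.example/v1",   # hypothetical endpoint
    env_vars=["MYPROVIDER_API_KEY"],                # checked in priority order
    fallback_models=["myprovider/flagship-model"],  # hypothetical model id
)
```

One file, one `register_provider()` call — everything in the twelve-point list above follows from the startup sweep.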
## Full path: OAuth and complex providers

Use the full checklist below when your provider needs any of the following:

- OAuth or token refresh (Nous Portal, Codex, Google Gemini, Qwen Portal, Copilot)
- A non-OpenAI API shape that requires a new adapter (Anthropic Messages, Codex Responses)
- Custom endpoint detection or multi-region probing (z.ai, Kimi)
- A curated static model catalog or live `/models` fetch
- Provider-specific `hermes model` menu entries with bespoke auth flows

## Step 1: Pick one canonical provider id

Choose a single provider id and use it everywhere.
@@ -20,6 +20,9 @@ Primary implementation:
- `hermes_cli/auth.py` — provider registry, `resolve_provider()`
- `hermes_cli/model_switch.py` — shared `/model` switch pipeline (CLI + gateway)
- `agent/auxiliary_client.py` — auxiliary model routing
- `providers/` — declarative source for `api_mode`, `base_url`, `env_vars`, `fallback_models` (auto-registered into `auth.py` `PROVIDER_REGISTRY` at startup)

`get_provider_profile()` in `providers/` returns a typed dict for a given provider id. `runtime_provider.py` calls this at resolution time to get the canonical `base_url`, `env_vars` priority list, `api_mode`, and `fallback_models` without needing to duplicate that data in multiple files. Adding a new `providers/*.py` file that calls `register_provider()` is enough for `runtime_provider.py` to pick it up — no branch needed in the resolver itself.

If you are trying to add a new first-class inference provider, read [Adding Providers](./adding-providers.md) alongside this page.
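The shape of the profile returned by `get_provider_profile()` can be sketched as a `TypedDict`. The field names follow the fields named above; the concrete type in the codebase may carry more fields, and the example values are ones documented elsewhere on this page.

```python
# Sketch of the typed dict a profile lookup might return; illustrative only.
from typing import List, TypedDict

class ProviderProfile(TypedDict):
    api_mode: str              # e.g. "chat_completions"
    base_url: str
    env_vars: List[str]        # checked in priority order for the API key
    fallback_models: List[str]

# Example populated with values documented on this page.
gmi_profile: ProviderProfile = {
    "api_mode": "chat_completions",
    "base_url": "https://api.gmi.ai/v1",
    "env_vars": ["GMI_API_KEY"],
    "fallback_models": ["deepseek-ai/DeepSeek-R1"],
}
```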
@@ -423,6 +423,44 @@ model:
For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://localhost:8000/v1`. NIM exposes the same OpenAI-compatible chat completions API as build.nvidia.com, so switching between cloud and local is a one-line env-var change.
:::

### GMI Cloud

Open and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication.
```bash
# GMI Cloud
hermes chat --provider gmi --model deepseek-ai/DeepSeek-R1
# Requires: GMI_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
  provider: "gmi"
  default: "deepseek-ai/DeepSeek-R1"
```
The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`).
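The override pattern is the same for every `*_BASE_URL` variable: use the env var when set, otherwise the documented default. A minimal sketch (the helper name is hypothetical; the values are the ones documented here):

```python
import os

def resolve_base_url(env_var: str, default: str) -> str:
    """Use the env-var override when set and non-empty, else the default."""
    return os.environ.get(env_var) or default

os.environ.pop("GMI_BASE_URL", None)   # no override set
print(resolve_base_url("GMI_BASE_URL", "https://api.gmi.ai/v1"))
# → https://api.gmi.ai/v1

os.environ["GMI_BASE_URL"] = "http://localhost:8000/v1"
print(resolve_base_url("GMI_BASE_URL", "https://api.gmi.ai/v1"))
# → http://localhost:8000/v1
```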
### StepFun
Step-series models via [StepFun](https://platform.stepfun.com) — OpenAI-compatible API, API key authentication.
```bash
# StepFun
hermes chat --provider stepfun --model step-3-mini
# Requires: STEPFUN_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
  provider: "stepfun"
  default: "step-3-mini"
```
The base URL can be overridden with `STEPFUN_BASE_URL` (default: `https://api.stepfun.com/v1`).
### Hugging Face Inference Providers
[Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers) routes to 20+ open models through a unified OpenAI-compatible endpoint (`router.huggingface.co/v1`). Requests are routed to the fastest available backend (Groq, Together, SambaNova, etc.) with automatic failover.
@@ -1178,7 +1216,7 @@ fallback_model:
When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.

Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `deepseek`, `nvidia`, `gmi`, `stepfun`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `alibaba`, `custom`.
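The "at most once per session" rule amounts to a guard flag on the session; a minimal sketch (the `Session` and `activate_fallback` names, and the model ids, are illustrative, not the real implementation):

```python
# Sketch of an at-most-once fallback guard; names are hypothetical.
from dataclasses import dataclass

@dataclass
class Session:
    provider: str
    model: str
    fallback_used: bool = False   # set after the first activation

def activate_fallback(session: Session, provider: str, model: str) -> bool:
    """Swap provider/model mid-session; refuse any second activation."""
    if session.fallback_used:
        return False
    session.provider, session.model = provider, model
    session.fallback_used = True
    return True

s = Session(provider="openrouter", model="some-model")
print(activate_fallback(s, "gmi", "deepseek-ai/DeepSeek-R1"))  # → True
print(activate_fallback(s, "nvidia", "another-model"))          # → False
```

The conversation state lives on the session object, so swapping `provider` and `model` in place is what preserves the conversation across the switch.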
:::tip
Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
@@ -65,6 +65,10 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
| `NVIDIA_API_KEY` | NVIDIA NIM API key — Nemotron and open models ([build.nvidia.com](https://build.nvidia.com)) |
| `NVIDIA_BASE_URL` | Override NVIDIA base URL (default: `https://integrate.api.nvidia.com/v1`; set to `http://localhost:8000/v1` for a local NIM endpoint) |
| `GMI_API_KEY` | GMI Cloud API key — open and reasoning models ([inference.gmi.ai](https://inference.gmi.ai)) |
| `GMI_BASE_URL` | Override GMI Cloud base URL (default: `https://api.gmi.ai/v1`) |
| `STEPFUN_API_KEY` | StepFun API key — Step-series models ([platform.stepfun.com](https://platform.stepfun.com)) |
| `STEPFUN_BASE_URL` | Override StepFun base URL (default: `https://api.stepfun.com/v1`) |
| `OLLAMA_API_KEY` | Ollama Cloud API key — managed Ollama catalog without local GPU ([ollama.com/settings/keys](https://ollama.com/settings/keys)) |
| `OLLAMA_BASE_URL` | Override Ollama Cloud base URL (default: `https://ollama.com/v1`) |
| `XAI_API_KEY` | xAI (Grok) API key for chat + TTS ([console.x.ai](https://console.x.ai/)) |
@@ -91,7 +95,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| Variable | Description |
|----------|-------------|
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
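The min-TTL rule reduces to a single comparison: re-mint the agent key once its remaining lifetime drops below the threshold. A sketch, assuming a hypothetical `remaining_ttl_seconds` input (the helper name is illustrative):

```python
# Sketch of the HERMES_NOUS_MIN_KEY_TTL_SECONDS rule; illustrative names.
DEFAULT_MIN_KEY_TTL = 1800  # seconds (30 min), the documented default

def needs_remint(remaining_ttl_seconds: float,
                 min_ttl: int = DEFAULT_MIN_KEY_TTL) -> bool:
    """Re-mint once less than min_ttl seconds of key lifetime remain."""
    return remaining_ttl_seconds < min_ttl

print(needs_remint(3600))  # fresh key, an hour left → False
print(needs_remint(600))   # ten minutes left → True
```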
@@ -48,6 +48,8 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
| DeepSeek | `deepseek` | `DEEPSEEK_API_KEY` |
| NVIDIA NIM | `nvidia` | `NVIDIA_API_KEY` (optional: `NVIDIA_BASE_URL`) |
| GMI Cloud | `gmi` | `GMI_API_KEY` (optional: `GMI_BASE_URL`) |
| StepFun | `stepfun` | `STEPFUN_API_KEY` (optional: `STEPFUN_BASE_URL`) |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` |
| Google Gemini (OAuth) | `google-gemini-cli` | `hermes model` (Google OAuth; optional: `HERMES_GEMINI_PROJECT_ID`) |
| Google AI Studio | `gemini` | `GOOGLE_API_KEY` (alias: `GEMINI_API_KEY`) |