docs(providers): Together/Groq/Perplexity cookbook via custom_providers

Three worked recipes for OpenAI-compatible cloud providers, plus the
Copilot HTTP 401 auto-recovery info block and the GMI Cloud row in the
compatible providers table. All three additions were on the original
docs/custom-providers-cookbook branch, but its merge base trailed main by
1186 commits, making a rebase impractical (84k+ lines of conflicts).

Replays just the providers.md additions onto current main.
Jason Perlow 2026-04-25 05:53:24 -04:00 committed by Teknium
parent af312ccc97
commit acca3ec3af


@@ -1190,6 +1190,113 @@ You can also select named custom providers from the interactive `hermes model` menu
---
### Cookbook: Together AI, Groq, Perplexity
The cloud providers listed in [Other Compatible Providers](#other-compatible-providers) all speak OpenAI's REST dialect, so they wire up the same way under `custom_providers:`. Three worked recipes follow. Each drops into `~/.hermes/config.yaml` and the matching API key goes in `~/.hermes/.env`.
#### Together AI
Hosts open-weight models (Llama, MiniMax, Gemma, DeepSeek, Qwen) at prices significantly below first-party APIs. Good default for multi-model fleets.
```yaml
# ~/.hermes/config.yaml
custom_providers:
  - name: together
    base_url: https://api.together.xyz/v1
    key_env: TOGETHER_API_KEY
    # api_mode: chat_completions  # default — no need to set
model:
  default: MiniMaxAI/MiniMax-M2.7  # or any model from together.ai/models
  provider: custom:together
```
```bash
# ~/.hermes/.env
TOGETHER_API_KEY=your-together-key
```
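Hermes reads the key from `~/.hermes/.env` itself. If you also want ad-hoc scripts to reuse the same file, a minimal dotenv reader is easy to sketch (`load_dotenv` here is a hypothetical helper for illustration, not part of the hermes codebase):

```python
import os
from pathlib import Path

def load_dotenv(path: Path) -> None:
    """Load simple KEY=value lines into the environment (sketch).

    Skips blanks, comments, and anything without an '='.
    Variables already set in the environment are left alone.
    """
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

env_file = Path.home() / ".hermes" / ".env"
if env_file.exists():
    load_dotenv(env_file)
```

This deliberately ignores quoting and `export` prefixes; for anything fancier, use a real dotenv library.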
Switch models mid-session:
```
/model custom:together:meta-llama/Llama-3.3-70B-Instruct-Turbo
/model custom:together:google/gemma-4-31b-it
/model custom:together:deepseek-ai/DeepSeek-V3
```
Together's `/v1/models` endpoint works, so `hermes model` can auto-discover available models.
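The `custom:<name>:<model>` spec above splits cleanly on the first two colons, since model IDs like `meta-llama/Llama-3.3-70B-Instruct-Turbo` contain slashes but no colons. A small illustration of the addressing scheme (`parse_model_spec` is a hypothetical helper, not hermes source):

```python
def parse_model_spec(spec: str) -> tuple[str, str]:
    """Split 'custom:<provider>:<model>' into (provider, model).

    Splitting on the first two colons only keeps slash-containing
    model IDs (e.g. 'deepseek-ai/DeepSeek-V3') intact.
    """
    scheme, provider, model = spec.split(":", 2)
    if scheme != "custom":
        raise ValueError(f"expected a 'custom:...' spec, got {spec!r}")
    return provider, model

print(parse_model_spec("custom:together:deepseek-ai/DeepSeek-V3"))
# → ('together', 'deepseek-ai/DeepSeek-V3')
```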
#### Groq
Ultra-fast inference (~500 tok/s on Llama-3.3-70B). Small catalog but strong for latency-sensitive interactive use.
```yaml
# ~/.hermes/config.yaml
custom_providers:
  - name: groq
    base_url: https://api.groq.com/openai/v1
    key_env: GROQ_API_KEY
model:
  default: llama-3.3-70b-versatile
  provider: custom:groq
```
```bash
# ~/.hermes/.env
GROQ_API_KEY=your-groq-key
```
#### Perplexity
Useful when you want a model that does live web search and citation automatically. Strict about which models are available — check [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api) for the current list.
```yaml
# ~/.hermes/config.yaml
custom_providers:
  - name: perplexity
    base_url: https://api.perplexity.ai
    key_env: PERPLEXITY_API_KEY
model:
  default: sonar
  provider: custom:perplexity
```
```bash
# ~/.hermes/.env
PERPLEXITY_API_KEY=your-perplexity-key
```
#### Multiple providers in one config
The three recipes compose — use all of them together and switch per turn with `/model custom:<name>:<model>`:
```yaml
custom_providers:
  - name: together
    base_url: https://api.together.xyz/v1
    key_env: TOGETHER_API_KEY
  - name: groq
    base_url: https://api.groq.com/openai/v1
    key_env: GROQ_API_KEY
  - name: perplexity
    base_url: https://api.perplexity.ai
    key_env: PERPLEXITY_API_KEY
model:
  default: MiniMaxAI/MiniMax-M2.7
  provider: custom:together  # boot to Together; switch freely after
```
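With several providers in one file, it's worth sanity-checking that each entry carries the three required fields, that names don't collide, and that every referenced key is actually exported. A sketch of that check over the parsed config (shown on a plain dict rather than pulling in a YAML parser; `check_providers` is a hypothetical helper, not part of hermes):

```python
import os

# The parsed equivalent of the custom_providers: block above.
config = {
    "custom_providers": [
        {"name": "together", "base_url": "https://api.together.xyz/v1", "key_env": "TOGETHER_API_KEY"},
        {"name": "groq", "base_url": "https://api.groq.com/openai/v1", "key_env": "GROQ_API_KEY"},
        {"name": "perplexity", "base_url": "https://api.perplexity.ai", "key_env": "PERPLEXITY_API_KEY"},
    ],
}

def check_providers(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the block looks sane."""
    problems, seen = [], set()
    for p in cfg.get("custom_providers", []):
        name = p.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"duplicate provider name: {name}")
        seen.add(name)
        for field in ("name", "base_url", "key_env"):
            if not p.get(field):
                problems.append(f"{name}: missing {field}")
        key_env = p.get("key_env")
        if key_env and not os.environ.get(key_env):
            problems.append(f"{name}: {key_env} not set in environment")
    return problems

for problem in check_providers(config):
    print(problem)
```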
:::tip Troubleshooting
- `hermes doctor` should print no `Unknown provider` warnings for any of these names after the CLI validator fixes in #15083.
- If a provider's `/v1/models` endpoint is unreachable (Perplexity is the common one), `hermes model` will persist the model with a warning rather than hard-reject — see #15136.
- To skip `custom_providers:` entirely and use bare `provider: custom` with `CUSTOM_BASE_URL` env var, see #15103.
:::
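For a single endpoint, the bare-provider form from the last tip skips naming entirely. Roughly (a sketch; #15103 has the authoritative shape, including how the API key is supplied):

```yaml
# ~/.hermes/config.yaml — no custom_providers: block
model:
  provider: custom
```
```bash
# ~/.hermes/.env
CUSTOM_BASE_URL=https://api.groq.com/openai/v1
```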
---
### Choosing the Right Setup
| Use Case | Recommended |