mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
234 lines
11 KiB
Markdown
---
sidebar_position: 3
---

# Configuring Models

Hermes uses two kinds of model slots:

- **Main model** — what the agent thinks with. Every user message, every tool-call loop, every streamed response goes through this model.
- **Auxiliary models** — smaller side jobs the agent offloads: context compression, vision (image analysis), web-page summarization, session search, approval scoring, MCP tool routing, session-title generation, and skill search. Each has its own slot and can be overridden independently.

This page covers configuring both from the dashboard. If you prefer config files or the CLI, jump to [Alternative methods](#alternative-methods) at the bottom.

## The Models page

Open the dashboard and click **Models** in the sidebar. You get two sections:

1. **Model Settings** — the top panel, where you assign models to slots.
2. **Usage analytics** — ranked cards showing every model that ran a session in the selected period, with token counts, cost, and capability badges.

The top card is the **Model Settings** panel. The main row always shows the model the agent will use for new sessions. Click **Change** to open the picker.

## Setting the main model

Click **Change** on the Main model row.

The picker has two columns:

- **Left** — authenticated providers. Only providers you've set up (API key set, OAuth'd, or defined as a custom endpoint) show up here. If a provider is missing, head to **Keys** and add its credential.
- **Right** — the curated model list for the selected provider. These are the agentic models Hermes recommends for that provider, not the raw `/models` dump (which on OpenRouter lists 400+ models, including TTS, image generators, and rerankers).

Type in the filter box to narrow by provider name, slug, or model ID.

Pick a model, hit **Switch**, and Hermes writes it to `~/.hermes/config.yaml` under the `model` section. **This applies to new sessions only** — any chat tab you already have open keeps running whatever model it started with. To hot-swap the current chat, use the `/model` slash command inside it.

## Setting auxiliary models

Click **Show auxiliary** to reveal the eight task slots.

Every auxiliary task defaults to `auto` — meaning Hermes uses your main model for that job too. Override a specific task when you want a cheaper or faster model for a side-job.

### Common override patterns

| Task | When to override |
|---|---|
| **Title Gen** | Almost always. A $0.10/M flash model writes session titles as well as Opus. Default config sets this to `google/gemini-3-flash-preview` on OpenRouter. |
| **Vision** | When your main model is a coding model without vision (e.g. Kimi, DeepSeek). Point it at `google/gemini-2.5-flash` or `gpt-4o-mini`. |
| **Compression** | When you're burning reasoning tokens on Opus/M2.7 just to summarize context. A fast chat model does the job at 1/50th the cost. |
| **Session Search** | When recall queries fan out (the default `max_concurrency` is 3). A cheap model keeps the bill predictable. |
| **Approval** | With `approvals.mode` set to `smart`, a fast, cheap model (haiku, flash, gpt-5-mini) decides whether to auto-approve low-risk commands. An expensive model here is wasted. |
| **Web Extract** | When you use `web_extract` heavily. Same logic as compression: summarization doesn't need reasoning. |
| **Skills Hub** | `hermes skills search` uses this. Usually fine at `auto`. |
| **MCP** | MCP tool routing. Usually fine at `auto`. |

### Per-task override

Click **Change** on any auxiliary row. The same picker opens and behaves the same way: pick a provider and model, then hit **Switch**. The row updates to show `provider · model` instead of `auto (use main model)`.

### Reset all to auto

If you've over-tuned and want to start over, click **Reset all to auto** at the top of the auxiliary section. Every slot goes back to using your main model.

## The "Use as" shortcut

Every model card on the page has a **Use as** dropdown. This is the fast path — pick a model you see in your analytics, click **Use as**, and assign it to the main slot or any specific auxiliary task in one click.

The dropdown has:

- **Main model** — same as clicking Change on the main row.
- **All auxiliary tasks** — assigns this model to all 8 aux slots at once. Useful when you just want every side-job on a cheap flash model.
- **Individual task options** — Vision, Web Extract, Compression, etc. The currently-assigned model for each task is marked `current`.

Cards are badged with `main` or `aux · <task>` when they're currently assigned to something — so you can see at a glance which of your historical models are wired in where.

## What gets written to `config.yaml`

When you save via the dashboard, Hermes writes to `~/.hermes/config.yaml`:

**Main model:**
```yaml
model:
  provider: openrouter
  default: anthropic/claude-opus-4.7
  base_url: ''          # cleared on provider switch
  api_mode: chat_completions
```

**Auxiliary override (example — vision on gemini-flash):**
```yaml
auxiliary:
  vision:
    provider: openrouter
    model: google/gemini-2.5-flash
    base_url: ''
    api_key: ''
    timeout: 120
    extra_body: {}
    download_timeout: 30
```

**Auxiliary on auto (default):**
```yaml
auxiliary:
  compression:
    provider: auto
    model: ''
    base_url: ''
    # ... other fields unchanged
```

`provider: auto` with `model: ''` tells Hermes to use the main model for that task.
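
As a rough way to see which auxiliary tasks are still on `auto`, the sketch below scans a config file with `awk`. It assumes the two-space-indented `auxiliary:` layout shown above and is a naive line matcher, not a YAML parser. The function name `aux_providers` is made up for this example.

```bash
# Hypothetical helper: print "<task> <provider>" for each entry under `auxiliary:`.
# Assumes the 2-space-indented layout shown above; this is not a real YAML parser.
aux_providers() {
  awk '
    # A line starting in column 0 is a top-level key; track whether it is "auxiliary:".
    /^[^ #]/ { in_aux = ($1 == "auxiliary:"); next }
    # A 2-space-indented bare key inside the section names a task.
    in_aux && /^  [A-Za-z_]+:[ ]*$/ { task = $1; sub(/:$/, "", task); next }
    # A 4-space-indented provider line belongs to the most recent task.
    in_aux && /^    provider:/ { print task, $2 }
  ' "${1:-$HOME/.hermes/config.yaml}"
}
```

Running `aux_providers` with no argument reads `~/.hermes/config.yaml`; pass another path to inspect a different file. Any task printed with `auto` is still using the main model.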

## When does it take effect?

- **CLI** (`hermes chat`): next `hermes chat` invocation.
- **Gateway** (Telegram, Discord, Slack, etc.): next *new* session. Existing sessions keep their model. Restart the gateway (`hermes gateway restart`) if you want to force all sessions to pick up the change.
- **Dashboard chat tab** (`/chat`): next new PTY. The currently-open chat keeps its model — use `/model` inside it to hot-swap.

Dashboard changes never invalidate prompt caches on running sessions. That's deliberate: swapping the main model inside a session requires a cache reset (the system prompt contains model-specific content), and we reserve that for the explicit `/model` slash command inside chat.

## Troubleshooting

### "No authenticated providers" in the picker

Hermes lists a provider only if it has a working credential. Check **Keys** in the sidebar — you should see one of: an API key, a successful OAuth, or a custom endpoint URL. If the provider you want isn't there, run `hermes setup` to wire it up, or go to **Keys** and add the env var.

### Main model didn't change in my running chat

Expected. The dashboard writes `config.yaml`, which new sessions read. The currently-open chat is a live agent process — it keeps whatever model it was spawned with. Use `/model <name>` inside the chat to hot-swap that specific session.

### Auxiliary override "didn't take effect"

Three things to check:

1. **Did you start a new session?** Existing chats don't re-read config.
2. **Is `provider` set to something other than `auto`?** If the field shows `auto`, the task is still using your main model. Click **Change** and pick a real provider.
3. **Is the provider authenticated?** If you assigned `minimax` to a task but don't have a MiniMax API key, that task falls back to the OpenRouter default and logs a warning in `agent.log`.

### I picked a model but Hermes switched providers on me

On OpenRouter (or any aggregator), bare model names resolve *within* the aggregator first. So `claude-sonnet-4` on OpenRouter becomes `anthropic/claude-sonnet-4.6`, staying on your OpenRouter auth. With native Anthropic auth, the same name resolves to `claude-sonnet-4-6` and stays on Anthropic. If you see an unexpected provider switch, check that your current provider is what you expect. The picker always shows the current main model at the top of the dialog.

## Alternative methods

### CLI slash command

Inside any `hermes chat` session:

```
/model gpt-5.4 --provider openrouter             # session-only
/model gpt-5.4 --provider openrouter --global    # also persists to config.yaml
```

`--global` does the same thing the dashboard's **Change** button does, plus it switches the running session in-place.

### Custom aliases

Define your own short names for models you reach for often, then use `/model <alias>` in the CLI or any messaging platform:

```yaml
# ~/.hermes/config.yaml
model_aliases:
  fav:
    model: claude-sonnet-4.6
    provider: anthropic
  grok:
    model: grok-4
    provider: x-ai
```

Or from the shell (short form, `provider/model`):

```bash
hermes config set model.aliases.fav anthropic/claude-opus-4.6
hermes config set model.aliases.grok x-ai/grok-4
```

Then `/model fav` or `/model grok` in chat. User aliases shadow built-in short names (`sonnet`, `kimi`, `opus`, etc.). See [Custom model aliases](/docs/reference/slash-commands#custom-model-aliases) for the full reference.
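
The short `provider/model` form carries the same two fields as a YAML alias entry. As an illustration only, the sketch below splits such a string at the first slash with shell parameter expansion. How Hermes itself parses the value is not documented here, so the first-slash rule is an assumption, and `split_alias` is a hypothetical name.

```bash
# Hypothetical helper: split "provider/model" at the FIRST slash.
# Assumption: everything after the first slash belongs to the model ID.
split_alias() {
  spec=$1
  provider=${spec%%/*}   # text before the first "/"
  model=${spec#*/}       # text after the first "/"
  printf 'provider: %s\nmodel: %s\n' "$provider" "$model"
}

split_alias x-ai/grok-4
```

The call prints the `provider` and `model` fields in the same shape as the long-form YAML entry for `grok` above.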

### `hermes model` subcommand

```bash
hermes model              # Interactive provider + model picker (the canonical way to switch defaults)
```

`hermes model` walks you through picking a provider, authenticating (OAuth flows open a browser; API-key providers prompt for the key), and then choosing a specific model from that provider's curated catalog. The choice is written to `model.provider` and `model.default` in `~/.hermes/config.yaml`.

To list providers/models without launching the picker, use the dashboard or the REST endpoints below. To inspect what the CLI will actually use right now: `hermes config get model` and `hermes status`.

### Direct config edit

Edit `~/.hermes/config.yaml` and restart whatever reads it. See the [Configuration reference](./configuration.md) for the full schema.

### REST API

The dashboard uses three endpoints. Useful for scripting:

```bash
# List authenticated providers + curated model lists
curl -H "X-Hermes-Session-Token: $TOKEN" http://localhost:PORT/api/model/options

# Read current main + auxiliary assignments
curl -H "X-Hermes-Session-Token: $TOKEN" http://localhost:PORT/api/model/auxiliary

# Set the main model
curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
  -d '{"scope":"main","provider":"openrouter","model":"anthropic/claude-opus-4.7"}' \
  http://localhost:PORT/api/model/set

# Override a single auxiliary task
curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
  -d '{"scope":"auxiliary","task":"vision","provider":"openrouter","model":"google/gemini-2.5-flash"}' \
  http://localhost:PORT/api/model/set

# Assign one model to every auxiliary task
curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
  -d '{"scope":"auxiliary","task":"","provider":"openrouter","model":"google/gemini-2.5-flash"}' \
  http://localhost:PORT/api/model/set

# Reset all auxiliary tasks to auto
curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
  -d '{"scope":"auxiliary","task":"__reset__","provider":"","model":""}' \
  http://localhost:PORT/api/model/set
```

The session token is injected into the dashboard HTML at startup and rotates on every server restart. Grab it from the browser devtools (`window.__HERMES_SESSION_TOKEN__`) if you're scripting against a running dashboard.
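
For scripts that make several of these calls, the request body can be built by a small helper. This is a sketch under assumptions: `model_set_payload` and `hermes_model_set` are names invented here, `HERMES_URL` and `TOKEN` are environment variables you would set yourself, and the JSON shape simply mirrors the `curl` examples above. Values are interpolated without JSON escaping, so it only suits plain provider and model IDs.

```bash
# Hypothetical helper: build the JSON body for POST /api/model/set.
# Args: scope task provider model (task is ignored when scope is "main").
model_set_payload() {
  if [ "$1" = "main" ]; then
    printf '{"scope":"main","provider":"%s","model":"%s"}' "$3" "$4"
  else
    printf '{"scope":"%s","task":"%s","provider":"%s","model":"%s"}' "$1" "$2" "$3" "$4"
  fi
}

# Hypothetical wrapper: send the payload to a running dashboard.
# Assumes HERMES_URL (e.g. http://localhost:PORT) and TOKEN are already set.
hermes_model_set() {
  curl -sS -X POST \
    -H "Content-Type: application/json" \
    -H "X-Hermes-Session-Token: $TOKEN" \
    -d "$(model_set_payload "$@")" \
    "$HERMES_URL/api/model/set"
}
```

With those defined, `hermes_model_set auxiliary vision openrouter google/gemini-2.5-flash` would correspond to the single-task override above, and `hermes_model_set auxiliary __reset__ "" ""` to the reset call.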