From 192501528f8700c3e04f1c3696b421db22c3784e Mon Sep 17 00:00:00 2001
From: teknium1 <teknium1@gmail.com>
Date: Sun, 8 Mar 2026 18:09:18 -0700
Subject: [PATCH] docs: add Auxiliary Model Configuration section to AGENTS.md

Clear how-to documentation for changing the vision model, web extraction
model, and compression model. Includes config.yaml examples, env var
alternatives, provider options table, and multimodal safety notes.
---
 AGENTS.md | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index da54a8c41..906181cf2 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -689,6 +689,69 @@ Key files:
 
 ---
 
+## Auxiliary Model Configuration
+
+Hermes uses lightweight "auxiliary" models for side tasks that run alongside the main conversation model:
+
+| Task | Tool(s) | Default Model |
+|------|---------|---------------|
+| **Vision analysis** | `vision_analyze`, `browser_vision` | `google/gemini-3-flash-preview` (via OpenRouter) |
+| **Web extraction** | `web_extract`, browser snapshot summarization | `google/gemini-3-flash-preview` (via OpenRouter) |
+| **Context compression** | Auto-compression when approaching the context limit | `google/gemini-3-flash-preview` (via OpenRouter) |
+
+By default, each task auto-detects the best available provider, trying them in this order: OpenRouter → Nous Portal → custom endpoint → Codex → API-key providers. The last three are tried for text tasks only; vision stops after Nous Portal.
+
+### Changing the Vision Model
+
+To use a different model for image analysis (e.g., GPT-4o instead of Gemini Flash), add to `~/.hermes/config.yaml`:
+
+```yaml
+auxiliary:
+  vision:
+    provider: "openrouter"        # or "nous", "main", "auto"
+    model: "openai/gpt-4o"        # any model slug your provider supports
+```
+
+Or set environment variables (in `~/.hermes/.env` or shell):
+
+```bash
+AUXILIARY_VISION_MODEL=openai/gpt-4o
+# Optionally force a specific provider:
+AUXILIARY_VISION_PROVIDER=openrouter
+```
+
+### Changing the Web Extraction Model
+
+```yaml
+auxiliary:
+  web_extract:
+    provider: "auto"
+    model: "google/gemini-2.5-flash"
+```
+
+### Changing the Compression Model
+
+```yaml
+compression:
+  summary_model: "google/gemini-2.5-flash"
+  summary_provider: "auto"          # "auto", "openrouter", "nous", "main"
+```
+
+### Provider Options
+
+| Provider | Description |
+|----------|-------------|
+| `"auto"` | Best available (default). For vision, tries only OpenRouter and Nous Portal. |
+| `"openrouter"` | Force OpenRouter (requires `OPENROUTER_API_KEY`). |
+| `"nous"` | Force Nous Portal (requires `hermes login`). |
+| `"main"` | Use the same provider as your main chat model. Skips OpenRouter/Nous. Useful for local models. |
+
+**Important:** Vision tasks require a multimodal-capable model. In `auto` mode, only OpenRouter and Nous Portal are tried (they route to Gemini, which supports images). Setting `provider: "main"` for vision will work only if your main endpoint supports multimodal input.
+
+**Key files:** `agent/auxiliary_client.py` (resolution chain), `tools/vision_tools.py`, `tools/browser_tool.py`, `tools/web_tools.py`
+
+---
+
 ## Known Pitfalls
 
 ### DO NOT use `simple_term_menu` for interactive menus