mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/,
guides/, and integrations/ against the live registries and gateway code.
messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)');
Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such
extra); list the actual deps (google-cloud-pubsub, google-api-python-client,
google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is
silently ignored by the adapter); QQ_STT_BASE_URL is not read directly —
baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline
plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default
SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to
per-profile .env, not 'hermes config set' (same pattern fixed in
api-server.md last round). Also bumped example ports to 8650+ to dodge the
default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646)
collision.
developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for
run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py,
gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway
platform tree updated (qqbot is a sub-package, not qqbot.py; added
yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/
(always active)' was wrong — it's an empty extension point and
_register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-
callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry,
NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama
Cloud, LM Studio, Tencent TokenHub. Fallback section described only the
legacy single-pair model — corrected to the canonical list-form
fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31
alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py
which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that
'git submodule update --init' is required if cloning without
--recurse-submodules.
guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is
positional (not --schedule), the script-only flag is --no-agent (not
--script-only), and there's no --command flag. Replaced with a real example
that creates the script under ~/.hermes/scripts/ and uses the actual flags.
Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work —
the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST
rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently
fails because --region isn't registered on the auth-add argparse spec.
Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for
China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-
table mention with a webhook-subscription pointer; also fixed the dead link
to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed
at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right
knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response
and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in
one quoted string, so cmd.exe gets a single arg instead of the multi-token
command line it needs. Removed the surrounding quotes — argparse nargs='*'
collects each token correctly.
integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist);
actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG
and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 ->
api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai)
refreshed. Fallback section rewritten to lead with the canonical
fallback_providers list form (was leading with the legacy fallback_model
single dict); supported-providers list extended to include azure-foundry,
alibaba-coding-plan, lmstudio.
index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with
integrations/index.md ('19+') and undercounted — bumped to 20+ and added
Weixin/QQ Bot/Yuanbao/Google Chat to the list.
Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 (same as round-1 post-skill-regen baseline). 24 files, +132/-89.
334 lines
14 KiB
Markdown
334 lines
14 KiB
Markdown
---
|
||
sidebar_position: 8
|
||
title: "Open WebUI"
|
||
description: "Connect Open WebUI to Hermes Agent via the OpenAI-compatible API server"
|
||
---
|
||
|
||
# Open WebUI Integration
|
||
|
||
[Open WebUI](https://github.com/open-webui/open-webui) (126k★) is the most popular self-hosted chat interface for AI. With Hermes Agent's built-in API server, you can use Open WebUI as a polished web frontend for your agent — complete with conversation management, user accounts, and a modern chat interface.
|
||
|
||
## Architecture
|
||
|
||
```mermaid
|
||
flowchart LR
|
||
A["Open WebUI<br/>browser UI<br/>port 3000"]
|
||
B["hermes-agent<br/>gateway API server<br/>port 8642"]
|
||
A -->|POST /v1/chat/completions| B
|
||
B -->|SSE streaming response| A
|
||
```
|
||
|
||
Open WebUI connects to Hermes Agent's API server just like it would connect to OpenAI. Hermes handles the requests with its full toolset — terminal, file operations, web search, memory, skills — and returns the final response.
|
||
|
||
:::important Runtime location
|
||
The API server is a **Hermes agent runtime**, not a pure LLM proxy. For each request, Hermes creates a server-side `AIAgent` on the API-server host. Tool calls run where that API server is running.
|
||
|
||
For example, if a laptop points Open WebUI or another OpenAI-compatible client at a Hermes API server on a remote machine, `pwd`, file tools, browser tools, local MCP tools, and other workspace tools run on the remote API-server host, not on the laptop.
|
||
:::
|
||
|
||
Open WebUI talks to Hermes server-to-server, so you do not need `API_SERVER_CORS_ORIGINS` for this integration.
|
||
|
||
## Quick Setup
|
||
|
||
### One-command local bootstrap (macOS/Linux, no Docker)
|
||
|
||
If you want Hermes + Open WebUI wired together locally with a reusable launcher, run:
|
||
|
||
```bash
|
||
cd ~/.hermes/hermes-agent
|
||
bash scripts/setup_open_webui.sh
|
||
```
|
||
|
||
What the script does:
|
||
|
||
- ensures `~/.hermes/.env` contains `API_SERVER_ENABLED`, `API_SERVER_HOST`, `API_SERVER_KEY`, `API_SERVER_PORT`, and `API_SERVER_MODEL_NAME`
|
||
- restarts the Hermes gateway so the API server comes up
|
||
- installs Open WebUI into `~/.local/open-webui-venv`
|
||
- writes a launcher at `~/.local/bin/start-open-webui-hermes.sh`
|
||
- on macOS, installs a `launchd` user service; on Linux with `systemd --user`, installs a user service there
|
||
|
||
Defaults:
|
||
|
||
- Hermes API: `http://127.0.0.1:8642/v1`
|
||
- Open WebUI: `http://127.0.0.1:8080`
|
||
- model name advertised to Open WebUI: `Hermes Agent`
|
||
|
||
Useful overrides:
|
||
|
||
```bash
|
||
OPEN_WEBUI_NAME='My Hermes UI' \
|
||
OPEN_WEBUI_ENABLE_SIGNUP=true \
|
||
HERMES_API_MODEL_NAME='My Hermes Agent' \
|
||
bash scripts/setup_open_webui.sh
|
||
```
|
||
|
||
On Linux, automatic background service setup requires a working `systemd --user` session. If you are on a headless SSH box and want to skip service installation, run:
|
||
|
||
```bash
|
||
OPEN_WEBUI_ENABLE_SERVICE=false bash scripts/setup_open_webui.sh
|
||
```
|
||
|
||
### 1. Enable the API server
|
||
|
||
```bash
|
||
hermes config set API_SERVER_ENABLED true
|
||
hermes config set API_SERVER_KEY your-secret-key
|
||
```
|
||
|
||
`hermes config set` auto-routes the flag to `config.yaml` and the secret to `~/.hermes/.env`. If the gateway is already running, restart it so the change takes effect:
|
||
|
||
```bash
|
||
hermes gateway stop && hermes gateway
|
||
```
|
||
|
||
### 2. Start Hermes Agent gateway
|
||
|
||
```bash
|
||
hermes gateway
|
||
```
|
||
|
||
You should see:
|
||
|
||
```
|
||
[API Server] API server listening on http://127.0.0.1:8642
|
||
```
|
||
|
||
### 3. Verify the API server is reachable
|
||
|
||
```bash
|
||
curl -s http://127.0.0.1:8642/health
|
||
# {"status": "ok", ...}
|
||
|
||
curl -s -H "Authorization: Bearer your-secret-key" http://127.0.0.1:8642/v1/models
|
||
# {"object":"list","data":[{"id":"hermes-agent", ...}]}
|
||
```
|
||
|
||
If `/health` fails, the gateway didn't pick up `API_SERVER_ENABLED=true` — restart it. If `/v1/models` returns `401`, your `Authorization` header doesn't match `API_SERVER_KEY`.
|
||
|
||
### 4. Start Open WebUI
|
||
|
||
```bash
|
||
docker run -d -p 3000:8080 \
|
||
-e OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1 \
|
||
-e OPENAI_API_KEY=your-secret-key \
|
||
-e ENABLE_OLLAMA_API=false \
|
||
--add-host=host.docker.internal:host-gateway \
|
||
-v open-webui:/app/backend/data \
|
||
--name open-webui \
|
||
--restart always \
|
||
ghcr.io/open-webui/open-webui:main
|
||
```
|
||
|
||
`ENABLE_OLLAMA_API=false` suppresses the default Ollama backend, which would otherwise show up empty and clutter the model picker. Omit it if you actually have Ollama running alongside.
|
||
|
||
First launch takes 15–30 seconds: Open WebUI downloads sentence-transformer embedding models (~150MB) the first time it starts. Wait for `docker logs open-webui` to settle before opening the UI.
|
||
|
||
### 5. Open the UI
|
||
|
||
Go to **http://localhost:3000**. Create your admin account (the first user becomes admin). You should see your agent in the model dropdown (named after your profile, or **hermes-agent** for the default profile). Start chatting!
|
||
|
||
## Docker Compose Setup
|
||
|
||
For a more permanent setup, create a `docker-compose.yml`:
|
||
|
||
```yaml
|
||
services:
|
||
open-webui:
|
||
image: ghcr.io/open-webui/open-webui:main
|
||
ports:
|
||
- "3000:8080"
|
||
volumes:
|
||
- open-webui:/app/backend/data
|
||
environment:
|
||
- OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1
|
||
- OPENAI_API_KEY=your-secret-key
|
||
- ENABLE_OLLAMA_API=false
|
||
extra_hosts:
|
||
- "host.docker.internal:host-gateway"
|
||
restart: always
|
||
|
||
volumes:
|
||
open-webui:
|
||
```
|
||
|
||
Then:
|
||
|
||
```bash
|
||
docker compose up -d
|
||
```
|
||
|
||
## Configuring via the Admin UI
|
||
|
||
If you prefer to configure the connection through the UI instead of environment variables:
|
||
|
||
1. Log in to Open WebUI at **http://localhost:3000**
|
||
2. Click your **profile avatar** → **Admin Settings**
|
||
3. Go to **Connections**
|
||
4. Under **OpenAI API**, click the **wrench icon** (Manage)
|
||
5. Click **+ Add New Connection**
|
||
6. Enter:
|
||
- **URL**: `http://host.docker.internal:8642/v1`
|
||
- **API Key**: the exact same value as `API_SERVER_KEY` in Hermes
|
||
7. Click the **checkmark** to verify the connection
|
||
8. **Save**
|
||
|
||
Your agent model should now appear in the model dropdown (named after your profile, or **hermes-agent** for the default profile).
|
||
|
||
:::warning
|
||
Environment variables only take effect on Open WebUI's **first launch**. After that, connection settings are stored in its internal database. To change them later, use the Admin UI or delete the Docker volume and start fresh.
|
||
:::
|
||
|
||
## API Type: Chat Completions vs Responses
|
||
|
||
Open WebUI supports two API modes when connecting to a backend:
|
||
|
||
| Mode | Format | When to use |
|
||
|------|--------|-------------|
|
||
| **Chat Completions** (default) | `/v1/chat/completions` | Recommended. Works out of the box. |
|
||
| **Responses** (experimental) | `/v1/responses` | For server-side conversation state via `previous_response_id`. |
|
||
|
||
### Using Chat Completions (recommended)
|
||
|
||
This is the default and requires no extra configuration. Open WebUI sends standard OpenAI-format requests and Hermes Agent responds accordingly. Each request includes the full conversation history.
|
||
|
||
### Using Responses API
|
||
|
||
To use the Responses API mode:
|
||
|
||
1. Go to **Admin Settings** → **Connections** → **OpenAI** → **Manage**
|
||
2. Edit your hermes-agent connection
|
||
3. Change **API Type** from "Chat Completions" to **"Responses (Experimental)"**
|
||
4. Save
|
||
|
||
With the Responses API, Open WebUI sends requests in the Responses format (`input` array + `instructions`), and Hermes Agent can preserve full tool call history across turns via `previous_response_id`. When `stream: true`, Hermes also streams spec-native `function_call` and `function_call_output` items, which enables custom structured tool-call UI in clients that render Responses events.
|
||
|
||
:::note
|
||
Open WebUI currently manages conversation history client-side even in Responses mode — it sends the full message history in each request rather than using `previous_response_id`. The main advantage of Responses mode today is the structured event stream: text deltas, `function_call`, and `function_call_output` items arrive as OpenAI Responses SSE events instead of Chat Completions chunks.
|
||
:::
|
||
|
||
## How It Works
|
||
|
||
When you send a message in Open WebUI:
|
||
|
||
1. Open WebUI sends a `POST /v1/chat/completions` request with your message and conversation history
|
||
2. Hermes Agent creates a server-side `AIAgent` instance using the API server's profile, model/provider config, memory, skills, and configured API-server toolsets
|
||
3. The agent processes your request — it may call tools (terminal, file operations, web search, etc.) on the API-server host
|
||
4. As tools execute, **inline progress messages stream to the UI** so you can see what the agent is doing (e.g. `` `💻 ls -la` ``, `` `🔍 Python 3.12 release` ``)
|
||
5. The agent's final text response streams back to Open WebUI
|
||
6. Open WebUI displays the response in its chat interface
|
||
|
||
Your agent has access to the same tools and capabilities as that API-server Hermes instance. If the API server is remote, those tools are remote too.
|
||
|
||
If you need tools to run against your **local** workspace today, run Hermes locally and point it at a pure LLM provider or pure OpenAI-compatible model proxy (for example vLLM, LiteLLM, Ollama, llama.cpp, OpenAI, OpenRouter, etc.). A future split-runtime mode for "remote brain, local hands" is being tracked in [#18715](https://github.com/NousResearch/hermes-agent/issues/18715); it is not the behavior of the current API server.
|
||
|
||
:::tip Tool Progress
|
||
With streaming enabled (the default), you'll see brief inline indicators as tools run — the tool emoji and its key argument. These appear in the response stream before the agent's final answer, giving you visibility into what's happening behind the scenes.
|
||
:::
|
||
|
||
## Configuration Reference
|
||
|
||
### Hermes Agent (API server)
|
||
|
||
| Variable | Default | Description |
|
||
|----------|---------|-------------|
|
||
| `API_SERVER_ENABLED` | `false` | Enable the API server |
|
||
| `API_SERVER_PORT` | `8642` | HTTP server port |
|
||
| `API_SERVER_HOST` | `127.0.0.1` | Bind address |
|
||
| `API_SERVER_KEY` | _(required)_ | Bearer token for auth. Match `OPENAI_API_KEY`. |
|
||
|
||
### Open WebUI
|
||
|
||
| Variable | Description |
|
||
|----------|-------------|
|
||
| `OPENAI_API_BASE_URL` | Hermes Agent's API URL (include `/v1`) |
|
||
| `OPENAI_API_KEY` | Must be non-empty. Match your `API_SERVER_KEY`. |
|
||
|
||
## Troubleshooting
|
||
|
||
### No models appear in the dropdown
|
||
|
||
- **Check the URL has `/v1` suffix**: `http://host.docker.internal:8642/v1` (not just `:8642`)
|
||
- **Verify the gateway is running**: `curl http://localhost:8642/health` should return `{"status": "ok"}`
|
||
- **Check model listing**: `curl -H "Authorization: Bearer your-secret-key" http://localhost:8642/v1/models` should return a list with `hermes-agent`
|
||
- **Docker networking**: From inside Docker, `localhost` means the container, not your host. Use `host.docker.internal` or `--network=host`.
|
||
- **Empty Ollama backend shadowing the picker**: If you omitted `ENABLE_OLLAMA_API=false`, Open WebUI shows an empty Ollama section above your Hermes models. Restart the container with `-e ENABLE_OLLAMA_API=false` or disable Ollama in **Admin Settings → Connections**.
|
||
|
||
### Connection test passes but no models load
|
||
|
||
This is almost always the missing `/v1` suffix. Open WebUI's connection test is a basic connectivity check — it doesn't verify model listing works.
|
||
|
||
### Response takes a long time
|
||
|
||
Hermes Agent may be executing multiple tool calls (reading files, running commands, searching the web) before producing its final response. This is normal for complex queries. The response appears all at once when the agent finishes.
|
||
|
||
### "Invalid API key" errors
|
||
|
||
Make sure your `OPENAI_API_KEY` in Open WebUI matches the `API_SERVER_KEY` in Hermes Agent.
|
||
|
||
:::warning
|
||
Open WebUI persists OpenAI-compatible connection settings in its own database after first launch. If you accidentally saved a wrong key in the Admin UI, fixing the environment variables alone is not enough — update or delete the saved connection in **Admin Settings → Connections**, or reset the Open WebUI data directory / database.
|
||
:::
|
||
|
||
## Multi-User Setup with Profiles
|
||
|
||
To run separate Hermes instances per user — each with their own config, memory, and skills — use [profiles](/docs/user-guide/profiles). Each profile runs its own API server on a different port and automatically advertises the profile name as the model in Open WebUI.
|
||
|
||
### 1. Create profiles and configure API servers
|
||
|
||
`API_SERVER_*` are env vars, not YAML config keys, so write them to each profile's `.env`. Pick ports outside the default-platform range (`8644` is the webhook adapter, `8645` is wecom-callback, `8646` is msgraph-webhook), e.g. `8650+`:
|
||
|
||
```bash
|
||
hermes profile create alice
|
||
cat >> ~/.hermes/profiles/alice/.env <<EOF
|
||
API_SERVER_ENABLED=true
|
||
API_SERVER_PORT=8650
|
||
API_SERVER_KEY=alice-secret
|
||
EOF
|
||
|
||
hermes profile create bob
|
||
cat >> ~/.hermes/profiles/bob/.env <<EOF
|
||
API_SERVER_ENABLED=true
|
||
API_SERVER_PORT=8651
|
||
API_SERVER_KEY=bob-secret
|
||
EOF
|
||
```
|
||
|
||
### 2. Start each gateway
|
||
|
||
```bash
|
||
hermes -p alice gateway &
|
||
hermes -p bob gateway &
|
||
```
|
||
|
||
### 3. Add connections in Open WebUI
|
||
|
||
In **Admin Settings** → **Connections** → **OpenAI API** → **Manage**, add one connection per profile:
|
||
|
||
| Connection | URL | API Key |
|
||
|-----------|-----|---------|
|
||
| Alice | `http://host.docker.internal:8650/v1` | `alice-secret` |
|
||
| Bob | `http://host.docker.internal:8651/v1` | `bob-secret` |
|
||
|
||
The model dropdown will show `alice` and `bob` as distinct models. You can assign models to Open WebUI users via the admin panel, giving each user their own isolated Hermes agent.
|
||
|
||
:::tip Custom Model Names
|
||
The model name defaults to the profile name. To override it, set `API_SERVER_MODEL_NAME` in the profile's `.env`:
|
||
```bash
|
||
hermes -p alice config set API_SERVER_MODEL_NAME "Alice's Agent"
|
||
```
|
||
:::
|
||
|
||
## Linux Docker (no Docker Desktop)
|
||
|
||
On Linux without Docker Desktop, `host.docker.internal` doesn't resolve by default. Options:
|
||
|
||
```bash
|
||
# Option 1: Add host mapping
|
||
docker run --add-host=host.docker.internal:host-gateway ...
|
||
|
||
# Option 2: Use host networking
|
||
docker run --network=host -e OPENAI_API_BASE_URL=http://localhost:8642/v1 ...
|
||
|
||
# Option 3: Use Docker bridge IP
|
||
docker run -e OPENAI_API_BASE_URL=http://172.17.0.1:8642/v1 ...
|
||
```
|