docs: add proxy mode documentation

- Matrix docs: full Proxy Mode section with architecture diagram,
  step-by-step setup (host + Docker), docker-compose.yml/Dockerfile
  examples, configuration reference, and limitations notes
- API Server docs: add Proxy Mode section explaining the api_server
  serves as the backend for gateway proxy mode
- Environment variables reference: add GATEWAY_PROXY_URL and
  GATEWAY_PROXY_KEY entries
Teknium 2026-04-14 10:46:45 -07:00 committed by Teknium
parent 90c98345c9
commit 8bb5973950
3 changed files with 143 additions and 0 deletions


@@ -301,6 +301,8 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
| `API_SERVER_PORT` | Port for the API server (default: `8642`) |
| `API_SERVER_HOST` | Host/bind address for the API server (default: `127.0.0.1`). Use `0.0.0.0` for network access — requires `API_SERVER_KEY` and a narrow `API_SERVER_CORS_ORIGINS` allowlist. |
| `API_SERVER_MODEL_NAME` | Model name advertised on `/v1/models`. Defaults to the profile name (or `hermes-agent` for the default profile). Useful for multi-user setups where frontends like Open WebUI need distinct model names per connection. |
| `GATEWAY_PROXY_URL` | URL of a remote Hermes API server to forward messages to ([proxy mode](/docs/user-guide/messaging/matrix#proxy-mode-e2ee-on-macos)). When set, the gateway handles platform I/O only — all agent work is delegated to the remote server. Also configurable via `gateway.proxy_url` in `config.yaml`. |
| `GATEWAY_PROXY_KEY` | Bearer token for authenticating with the remote API server in proxy mode. Must match `API_SERVER_KEY` on the remote host. |
| `MESSAGING_CWD` | Working directory for terminal commands in messaging mode (default: `~`) |
| `GATEWAY_ALLOWED_USERS` | Comma-separated user IDs allowed across all platforms |
| `GATEWAY_ALLOW_ALL_USERS` | Allow all users without allowlists (`true`/`false`, default: `false`) |
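Taken together, a minimal proxy-mode `.env` on the forwarding gateway might look like this (the URL and key below are illustrative placeholders for your own deployment):

```bash
# ~/.hermes/.env on the thin gateway (proxy side) — illustrative values
GATEWAY_PROXY_URL=http://192.168.1.100:8642
GATEWAY_PROXY_KEY=your-secret-key-here  # must match API_SERVER_KEY on the remote host
```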


@@ -278,3 +278,9 @@ In Open WebUI, add each as a separate connection. The model dropdown shows `alic
- **Response storage** — stored responses (for `previous_response_id`) are persisted in SQLite and survive gateway restarts. Max 100 stored responses (LRU eviction).
- **No file upload** — vision/document analysis via uploaded files is not yet supported through the API.
- **Model field is cosmetic** — the `model` field in requests is accepted but the actual LLM model used is configured server-side in config.yaml.
## Proxy Mode
The API server also serves as the backend for **gateway proxy mode**. When another Hermes gateway instance is configured with `GATEWAY_PROXY_URL` pointing at this API server, it forwards all messages here instead of running its own agent. This enables split deployments — for example, a Docker container handling Matrix E2EE that relays to a host-side agent.
See [Matrix Proxy Mode](/docs/user-guide/messaging/matrix#proxy-mode-e2ee-on-macos) for the full setup guide.
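As a sketch, the request a proxying gateway sends is an ordinary OpenAI-style chat completion call against this server; the address and token below are placeholders, not real endpoints:

```shell
# Hypothetical host address and bearer token — substitute your own.
curl -sS http://192.168.1.100:8642/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key-here" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}], "stream": true}'
```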


@@ -439,6 +439,141 @@ security breach). A new access token gets a new device ID with no stale key
history, so other clients trust it immediately.
:::
## Proxy Mode (E2EE on macOS)
Matrix E2EE requires `libolm`, which doesn't compile on macOS ARM64 (Apple Silicon). The `hermes-agent[matrix]` extra is gated to Linux only. If you're on macOS, proxy mode lets you run E2EE in a Docker container on a Linux VM while the actual agent runs natively on macOS with full access to your local files, memory, and skills.
### How It Works
```
macOS (Host):
└─ hermes gateway
├─ api_server adapter ← listens on 0.0.0.0:8642
├─ AIAgent ← single source of truth
├─ Sessions, memory, skills
└─ Local file access (Obsidian, projects, etc.)
Linux VM (Docker):
└─ hermes gateway (proxy mode)
├─ Matrix adapter ← E2EE decryption/encryption
└─ HTTP forward → macOS:8642/v1/chat/completions
(no LLM API keys, no agent, no inference)
```
The Docker container only handles Matrix protocol + E2EE. When a message arrives, it decrypts it and forwards the text to the host via a standard HTTP request. The host runs the agent, calls tools, generates a response, and streams it back. The container encrypts and sends the response to Matrix. All sessions are unified — CLI, Matrix, Telegram, and any other platform share the same memory and conversation history.
### Step 1: Configure the Host (macOS)
Enable the API server so the host accepts incoming requests from the Docker container.
Add to `~/.hermes/.env`:
```bash
API_SERVER_ENABLED=true
API_SERVER_KEY=your-secret-key-here
API_SERVER_HOST=0.0.0.0
```
- `API_SERVER_HOST=0.0.0.0` binds to all interfaces so the Docker container can reach it.
- `API_SERVER_KEY` is required for non-loopback binding. Pick a strong random string.
- The API server runs on port 8642 by default (change with `API_SERVER_PORT` if needed).
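One simple way to generate a strong key (assuming Python 3 is installed on the host) is a 32-byte random hex string:

```shell
# Prints a 64-character hex token suitable for API_SERVER_KEY
python3 -c 'import secrets; print(secrets.token_hex(32))'
```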
Start the gateway:
```bash
hermes gateway
```
You should see the API server start alongside any other platforms you have configured. Verify it's reachable from the VM:
```bash
# From the Linux VM
curl http://<mac-ip>:8642/health
```
### Step 2: Configure the Docker Container (Linux VM)
The container needs Matrix credentials and the proxy URL. It does NOT need LLM API keys.
**`docker-compose.yml`:**
```yaml
services:
hermes-matrix:
build: .
environment:
# Matrix credentials
MATRIX_HOMESERVER: "https://matrix.example.org"
MATRIX_ACCESS_TOKEN: "syt_..."
MATRIX_ALLOWED_USERS: "@you:matrix.example.org"
MATRIX_ENCRYPTION: "true"
MATRIX_DEVICE_ID: "HERMES_BOT"
# Proxy mode — forward to host agent
GATEWAY_PROXY_URL: "http://192.168.1.100:8642"
GATEWAY_PROXY_KEY: "your-secret-key-here"
volumes:
- ./matrix-store:/root/.hermes/platforms/matrix/store
```
**`Dockerfile`:**
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y libolm-dev && rm -rf /var/lib/apt/lists/*
RUN pip install 'hermes-agent[matrix]'
CMD ["hermes", "gateway"]
```
That's the entire container. No API keys for OpenRouter, Anthropic, or any inference provider.
### Step 3: Start Both
1. Start the host gateway first:
```bash
hermes gateway
```
2. Start the Docker container:
```bash
docker compose up -d
```
3. Send a message in an encrypted Matrix room. The container decrypts it, forwards it to the host, and streams the response back.
### Configuration Reference
Proxy mode is configured on the **container side** (the thin gateway):
| Setting | Description |
|---------|-------------|
| `GATEWAY_PROXY_URL` | URL of the remote Hermes API server (e.g., `http://192.168.1.100:8642`) |
| `GATEWAY_PROXY_KEY` | Bearer token for authentication (must match `API_SERVER_KEY` on the host) |
| `gateway.proxy_url` | Same as `GATEWAY_PROXY_URL` but in `config.yaml` |
The host side needs:
| Setting | Description |
|---------|-------------|
| `API_SERVER_ENABLED` | Set to `true` |
| `API_SERVER_KEY` | Bearer token (shared with the container) |
| `API_SERVER_HOST` | Set to `0.0.0.0` for network access |
| `API_SERVER_PORT` | Port number (default: `8642`) |
### Works for Any Platform
Proxy mode is not limited to Matrix. Any platform adapter can use it — set `GATEWAY_PROXY_URL` on any gateway instance and it will forward to the remote agent instead of running one locally. This is useful for any deployment where the platform adapter needs to run in a different environment from the agent (network isolation, E2EE requirements, resource constraints).
:::tip
Session continuity is maintained via the `X-Hermes-Session-Id` header. The host's API server tracks sessions by this ID, so conversations persist across messages just like they would with a local agent.
:::
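For illustration, two direct API calls sharing one conversation could look like this; the session ID, address, and token are hypothetical placeholders:

```shell
SESSION="matrix-room-abc123"  # hypothetical session identifier
curl -sS http://192.168.1.100:8642/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key-here" \
  -H "X-Hermes-Session-Id: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "remember the number 7"}]}'
# A follow-up request with the same X-Hermes-Session-Id continues the same conversation.
```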
:::note
**Limitations (v1):** Tool progress messages from the remote agent are not relayed back — the user sees the streamed final response only, not individual tool calls. Dangerous command approval prompts are handled on the host side, not relayed to the Matrix user. These can be addressed in future updates.
:::
### Sync issues / bot falls behind
**Cause**: Long-running tool executions can delay the sync loop, or the homeserver is slow.