docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497)

Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026,
roughly 1,080 PRs). The audit found ~50 user-visible features that had landed
in code with no docs footprint, plus a handful of stale pages. This PR closes
every gap the scan turned up.

New pages
- user-guide/features/deliverable-mode.md — extension list, agent triggers,
  kanban_complete artifacts pattern, [[as_document]] override (PR #27813).
- developer-guide/web-search-provider-plugin.md — authoring guide modeled on
  image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448).

Providers / auth
- Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the
  display label shows up; provider id stays `alibaba` (PR #24835).
- Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs
  #28116 / #28118 / #28119).
- Document Nous JWT minting from refresh token + invalid-refresh quarantine
  + cross-profile shared token store (PRs #27663 / #19712).
- Add `## Microsoft Entra ID authentication (keyless)` section to
  azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic
  routing details (PR #28101 / #9df9816da).
- Custom providers `api_mode` is now prompted-and-persisted, not just URL
  autodetected (PR #25068).
- Delegation honours `api_mode` + auto-detects anthropic_messages base URLs
  (PR #26824).
- `x_search` auto-enables when xAI credentials are present (PR #27376).
- Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR
  #26534).
- NVIDIA NIM billing-origin header is set automatically (PR #26585).

Windows / installer
- `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus
  the BOM-strip / git-retry hardening (PR #28169).
- Document Hermes Desktop thin installer + first-launch bootstrap (PR
  #27822).
- Document `dep_ensure` Windows bootstrap (PR #27845).
- Document install-method auto-detection (pip / git / homebrew / nixos) and
  the matching update command (PR #27843).

Gateway / messaging
- `/platform list|pause|resume` full description + circuit-breaker
  semantics (PR #26600).
- Slack / Matrix / Mattermost get parallel `allowed_channels` /
  `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk
  (PR #21251).
- Discord `allow_any_attachment` + `max_attachment_bytes` (config and env
  vars) (PR #27245).
- Discord clarify-choice button rendering (PR #25485).
- Telegram `guest_mode` @mention bypass for allowlisted groups (PR
  #22759).
- Telegram `notifications` mode (`important` vs `all`) (PR #22793).
- `[[as_document]]` skill / response directive for forcing
  document-style media delivery (PR #21210).

CLI / TUI
- `/new [name]` argument (PR #19637).
- `/subgoal` user-supplied criteria appended to `/goal` (PR #25449).
- `/exit --delete` flag confirmation prompts for destructive slash
  commands (PR #22687).
- Status-bar additions: ▶ N background indicator (PR #27175), context
  compression count (PR #21218), YOLO mode banner+statusbar warning (PR
  #26238).
- `display.timestamps` + `docker_extra_args` config keys (PR #23599).
- TUI collapsible startup banner sections (PR #20625).
- `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847).

i18n
- Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja,
  de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches
  `agent/i18n.py:SUPPORTED_LANGUAGES`.

Tools / features
- `vision_analyze` native-pixel passthrough for vision-capable callers,
  with auxiliary text-describer fallback (PR #22955).
- `session_search` rewrite to the single-shape tool (discovery / scroll /
  browse modes) (PRs #27590 / #27840).
- Clarify MCP transport scope: client supports stdio + SSE; embedded
  `hermes mcp serve` is stdio-only (PR #21227).
- Web search backends table: add Brave Search (free tier) and DDGS rows
  (PR #21337).
- ACP session-scoped edit auto-approval modes (PR #27862).
- Curator rename map in the user-visible per-run summary (PR #22910).
- Prompt caching feature page reference in features/overview.md — Claude
  cross-session 1-hour prefix cache on native Anthropic / OpenRouter /
  Nous Portal (PR #23828).
- Cron per-job profile parameter (PR #28124).
- `--no-skills` flag for `hermes profile create` (PR #20986).

Build
- Verified with `npm run build` in `website/`; both `en` and `zh-Hans`
  locales compile. Remaining broken-link/anchor warnings are pre-existing
  (`rl-training.md` from learning-path / overview; the
  zh-Hans translation lag the docs skill already calls out).
This commit is contained in:
Teknium 2026-05-18 23:55:25 -07:00 committed by GitHub
parent 1335ce996d
commit eacce70a35
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
37 changed files with 901 additions and 26 deletions

View file

@ -272,6 +272,10 @@ Put the most common workflow first. Edge cases and advanced usage go at the bott
For XML/JSON parsing or complex logic, include helper scripts in `scripts/` — don't expect the LLM to write parsers inline every time.
### Deliver media as documents (`[[as_document]]`)
If your skill produces a high-resolution screenshot, chart, or any image where lossy preview compression would hurt — emit the literal directive `[[as_document]]` somewhere in the response (commonly the last line). The gateway strips the directive and delivers every extracted media path in that response as a downloadable file attachment instead of an inline image bubble. See [Skill output and media delivery](../user-guide/features/skills.md#skill-output-and-media-delivery) for the full semantics.
#### Referencing bundled scripts from SKILL.md
When a skill is loaded, the activation message exposes the absolute skill directory as `[Skill directory: /abs/path]` and also substitutes two template tokens anywhere in the SKILL.md body:

View file

@ -0,0 +1,264 @@
---
sidebar_position: 12
title: "Web Search Provider Plugins"
description: "How to build a web-search/extract/crawl backend plugin for Hermes Agent"
---
# Building a Web Search Provider Plugin
Web-search provider plugins register a backend that services `web_search`, `web_extract`, and (optionally) deep-crawl tool calls. Built-in providers — Firecrawl, SearXNG, Tavily, Exa, Parallel, Brave Search (free tier), and DDGS — all ship as plugins under `plugins/web/<name>/`. You can add a new one, or override a bundled one, by dropping a directory next to them.
:::tip
Web search is one of several **backend plugins** Hermes supports. The others (with their own ABCs) are [Image Generation Provider Plugins](/docs/developer-guide/image-gen-provider-plugin), [Video Generation Provider Plugins](/docs/developer-guide/video-gen-provider-plugin), [Memory Provider Plugins](/docs/developer-guide/memory-provider-plugin), [Context Engine Plugins](/docs/developer-guide/context-engine-plugin), and [Model Provider Plugins](/docs/developer-guide/model-provider-plugin). General tool/hook/CLI plugins live in [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin).
:::
## How discovery works
Hermes scans for web-search backends in three places:
1. **Bundled**`<repo>/plugins/web/<name>/` (auto-loaded with `kind: backend`, always available)
2. **User**`~/.hermes/plugins/web/<name>/` (opt-in via `plugins.enabled` or `hermes plugins enable <name>`)
3. **Pip** — packages declaring a `hermes_agent.plugins` entry point
Each plugin's `register(ctx)` function calls `ctx.register_web_search_provider(...)` — that puts the instance into the registry in `agent/web_search_registry.py`. The active provider for each capability is picked by config:
| Capability | Config key | Falls back to |
|---|---|---|
| `web_search` | `web.search_backend` | `web.backend` |
| `web_extract` | `web.extract_backend` | `web.backend` |
| Deep crawl modes inside `web_extract` | `web.extract_backend` | `web.backend` |
When neither key is set, Hermes auto-detects the backend from whichever API key/URL is present in the environment. `hermes tools` walks users through selection.
## Directory structure
```
plugins/web/my-backend/
├── __init__.py # register() entry point
├── provider.py # WebSearchProvider subclass
└── plugin.yaml # Manifest with kind: backend and provides_web_providers
```
`brave_free/` and `ddgs/` are the smallest in-tree references — `brave_free` for an API-key-gated search-only provider, `ddgs` for a no-key provider that lazy-installs its SDK.
## The WebSearchProvider ABC
Subclass `agent.web_search_provider.WebSearchProvider`. The only required members are `name`, `is_available()`, and whichever of `search()` / `extract()` / `crawl()` you implement.
```python
# plugins/web/my-backend/provider.py
from __future__ import annotations
import os
from typing import Any, Dict, List
from agent.web_search_provider import WebSearchProvider
class MyBackendWebSearchProvider(WebSearchProvider):
"""Minimal search-only provider against the My Backend HTTP API."""
@property
def name(self) -> str:
# Stable id used in web.search_backend / web.extract_backend / web.backend
# config keys. Lowercase, no spaces; hyphens permitted.
return "my-backend"
@property
def display_name(self) -> str:
# Human label shown in `hermes tools`. Defaults to `name`.
return "My Backend"
def is_available(self) -> bool:
# Cheap check — env var present, optional dep importable, etc.
# MUST NOT make network calls (runs on every `hermes tools` paint).
return bool(os.getenv("MY_BACKEND_API_KEY", "").strip())
def supports_search(self) -> bool:
return True
def supports_extract(self) -> bool:
return False
def supports_crawl(self) -> bool:
return False
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
import httpx
api_key = os.environ["MY_BACKEND_API_KEY"]
try:
resp = httpx.get(
"https://api.example.com/search",
params={"q": query, "count": max(1, min(int(limit), 20))},
headers={"Authorization": f"Bearer {api_key}"},
timeout=15,
)
resp.raise_for_status()
data = resp.json()
except httpx.HTTPError as exc:
return {"success": False, "error": str(exc)}
# Response shape is fixed — see "Response shape" below.
return {
"success": True,
"data": {
"web": [
{
"title": item.get("title", ""),
"url": item.get("url", ""),
"description": item.get("snippet", ""),
"position": idx + 1,
}
for idx, item in enumerate(data.get("results", []))
],
},
}
```
```python
# plugins/web/my-backend/__init__.py
from plugins.web.my_backend.provider import MyBackendWebSearchProvider
def register(ctx) -> None:
"""Plugin entry point — called once at load time."""
ctx.register_web_search_provider(MyBackendWebSearchProvider())
```
## plugin.yaml
```yaml
name: web-my-backend
version: 1.0.0
description: "My Backend web search — Bearer-auth REST API"
author: Your Name
kind: backend
provides_web_providers:
- my-backend
requires_env:
- MY_BACKEND_API_KEY
```
| Key | Purpose |
|---|---|
| `kind: backend` | Routes the plugin through the backend-loading path |
| `provides_web_providers` | List of provider `name`s this plugin registers — used by the loader to advertise the plugin in `hermes tools` even before `register()` runs |
| `requires_env` | Interactive credential prompt during `hermes plugins install` (see [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin#gate-on-environment-variables) for the rich format) |
## ABC reference
Full contract in `agent/web_search_provider.py`. Methods you may override:
| Member | Required | Default | Purpose |
|---|---|---|---|
| `name` | ✅ | — | Stable id used in `web.*_backend` config |
| `display_name` | — | `name` | Label shown in `hermes tools` |
| `is_available()` | ✅ | — | Cheap availability gate — env vars, optional deps |
| `supports_search()` | — | `True` | Capability flag for `web_search` routing |
| `supports_extract()` | — | `False` | Capability flag for `web_extract` routing |
| `supports_crawl()` | — | `False` | Capability flag for deep-crawl modes |
| `search(query, limit)` | conditional | raises | Required when `supports_search()` returns `True` |
| `extract(urls, **kwargs)` | conditional | raises | Required when `supports_extract()` returns `True` |
| `crawl(url, **kwargs)` | conditional | raises | Required when `supports_crawl()` returns `True` |
Providers can advertise multiple capabilities from a single class — Firecrawl, Tavily, Exa, and Parallel all implement all three of search/extract/crawl. Brave Search and DDGS are search-only; SearXNG is search-only with a documented "pair me with an extract provider" workflow.
## Response shape
The tool wrapper expects a fixed envelope so it doesn't have to translate between backends.
**Search success:**
```python
{
"success": True,
"data": {
"web": [
{"title": str, "url": str, "description": str, "position": int},
...
],
},
}
```
**Extract success:**
```python
{
"success": True,
"data": [
{
"url": str,
"title": str,
"content": str,
"raw_content": str,
"metadata": dict, # optional
"error": str, # optional, only on per-URL failure
},
...
],
}
```
**Either capability, on failure:**
```python
{"success": False, "error": "human-readable message"}
```
Both `search()` and `extract()` may be `async def` — the dispatcher detects coroutine functions via `inspect.iscoroutinefunction` and awaits accordingly. Sync implementations that do blocking I/O (HTTP, SDK calls) are fine for small backends; the dispatcher handles threading.
## Capability flags
Hermes routes calls to the right provider based on the `supports_*` flags. A common multi-provider setup:
```yaml
# ~/.hermes/config.yaml
web:
search_backend: "brave-free" # search-only, fast, free 2k/mo
extract_backend: "firecrawl" # extract + crawl, paid quota
```
When `web.search_backend` or `web.extract_backend` aren't set, both fall through to `web.backend`. When that's also unset, Hermes picks the first available provider that supports the requested capability based on env-var presence.
If your provider only supports one capability, leave the other flags at their default (`False`) and the registry will skip it for that tool — users won't see misleading "provider X failed" errors when they're using X only for search and asking the agent to extract.
## How Hermes wires it into the tools
The `web_search` and `web_extract` tools live in `tools/web_tools.py`. At call time they:
1. Read the relevant config key (`web.search_backend` for `web_search`, `web.extract_backend` for `web_extract`)
2. Ask the registry for the provider with that `name`
3. Check `is_available()` and the matching `supports_*()` flag
4. Dispatch to `search()` / `extract()` / `crawl()`, awaiting if the method is a coroutine
5. JSON-serialize the response envelope and hand it back to the LLM
Errors surface as the tool result; the LLM decides how to explain them. If no provider is registered (or every available one fails the capability gate), the tool returns a helpful error pointing at `hermes tools`.
## Lazy-installing optional dependencies
If your provider wraps a third-party SDK (like DDGS does with the `ddgs` package), don't `import` it at module top level. Use `tools.lazy_deps.ensure(...)` inside `is_available()` or `search()` — Hermes will install the package on first use, gated by `security.allow_lazy_installs`. See [Build a Hermes Plugin → Lazy-install](/docs/guides/build-a-hermes-plugin#lazy-install-optional-python-dependencies) for the security model.
## Reference implementations
- **`plugins/web/brave_free/`** — small, API-key-gated, search-only HTTP provider. Good starting template.
- **`plugins/web/ddgs/`** — no-key provider that lazy-installs its SDK. Useful pattern for backends that wrap a Python package.
- **`plugins/web/firecrawl/`** — full multi-capability provider (search + extract + crawl) with multiple format modes.
- **`plugins/web/searxng/`** — self-hosted, URL-configured backend with no auth.
## Distribute via pip
```toml
# pyproject.toml
[project.entry-points."hermes_agent.plugins"]
my-backend-web = "my_backend_web_package"
```
`my_backend_web_package` must expose a top-level `register` function. See [Distribute via pip](/docs/guides/build-a-hermes-plugin#distribute-via-pip) in the general plugin guide for the full setup.
## Related pages
- [Web Search](/docs/user-guide/features/web-search) — user-facing feature documentation and per-backend configuration
- [Plugins overview](/docs/user-guide/features/plugins) — all plugin types at a glance
- [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin) — general tools/hooks/slash commands guide

View file

@ -42,6 +42,8 @@ The installer also sets `HERMES_GIT_BASH_PATH` to the located `bash.exe` so Herm
If you prefer WSL2, the Linux installer above works inside it; both native and WSL installs can coexist without conflict (native data lives under `%LOCALAPPDATA%\hermes`, WSL data lives under `~/.hermes`).
**Desktop installer (alternative):** A thin GUI installer is also available — download Hermes Desktop, run the `.exe`, and on first launch it calls `install.ps1` under the hood to provision Python (via `uv`), Node, PortableGit, and the rest of the dependencies. The desktop app and the PowerShell-installed CLI share the same install and data directories, so you can use either or both. See the [Windows (Native) guide](../user-guide/windows-native#desktop-installer-alternative) for details.
### Android / Termux
Hermes now ships a Termux-aware installer path too:
@ -183,3 +185,7 @@ The same pattern works on Arch (the installer uses pacman with the same sudo-det
| Missing config after update | Run `hermes config check` then `hermes config migrate` |
For more diagnostics, run `hermes doctor` — it will tell you exactly what's missing and how to fix it.
## Install method auto-detection
Hermes auto-detects whether it was installed via `pip`, the git installer, Homebrew, or NixOS, and `hermes update` prints the matching update command for that path. There's no env var to set — the detection is based on the install layout (Python site-packages, `~/.hermes/hermes-agent/`, Homebrew prefix, or Nix store path). `hermes doctor` also surfaces the detected method under its environment summary.

View file

@ -169,8 +169,9 @@ hermes # uses your az login token
**Service principal in CI:**
- Set `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET` in the runner env.
**Sovereign clouds (Government, China):**
- Export `AZURE_AUTHORITY_HOST` (e.g. `https://login.microsoftonline.us` for Azure Government, `https://login.partner.microsoftonline.cn` for Azure China). `azure-identity` reads it directly.
#### Sovereign clouds (Government, China)
Export `AZURE_AUTHORITY_HOST` (e.g. `https://login.microsoftonline.us` for Azure Government, `https://login.partner.microsoftonline.cn` for Azure China). `azure-identity` reads it directly.
### Health checks
@ -244,6 +245,8 @@ Important behaviour:
- **`/v1` is stripped from the base URL.** The Anthropic SDK appends `/v1/messages` to every request URL — Hermes removes any trailing `/v1` before handing the URL to the SDK to avoid double-`/v1` paths.
- **`api-version` is sent via `default_query`, not appended to the URL.** Azure Anthropic requires an `api-version` query string. Baking it into the base URL produces malformed paths like `/anthropic?api-version=.../v1/messages` and returns 404. Hermes passes `api-version=2025-04-15` via the Anthropic SDK's `default_query` instead.
- **Bearer auth is used instead of `x-api-key`.** Azure's Anthropic-compatible route requires `Authorization: Bearer <key>` rather than Anthropic's native `x-api-key` header. Hermes detects `azure.com` in the base URL and routes the API key through the SDK's `auth_token` field so the right header reaches the upstream.
- **1M context window beta header is kept.** Azure still gates the 1M-token Claude context (Opus 4.6/4.7, Sonnet 4.6) behind the `anthropic-beta: context-1m-2025-08-07` header. Hermes keeps that beta header on Azure paths (it's stripped from native Anthropic OAuth requests because some subscriptions reject it, but Azure requires it).
- **OAuth token refresh is disabled.** Azure deployments use static API keys. The `~/.claude/.credentials.json` OAuth token refresh loop that applies to Anthropic Console is explicitly skipped for Azure endpoints to prevent the Claude Code OAuth token from overwriting your Azure key mid-session.
## Alternative: `provider: anthropic` + Azure base URL

View file

@ -452,6 +452,37 @@ requires_env:
Both formats can be mixed in the same list. Already-set variables are skipped silently.
### Lazy-install optional Python dependencies
If your plugin wraps an SDK that not every user will have installed (a vendor SDK, a heavy ML lib, a platform-specific package), don't `import` it at the top of the module. Use the `tools.lazy_deps.ensure(...)` helper inside the tool handler — Hermes will install the package on first use, gated by the user's `security.allow_lazy_installs` config.
```python
# tools.py
from tools.lazy_deps import ensure, FeatureUnavailable
def my_tool_handler(args, **kwargs):
try:
ensure("my-plugin.my-backend") # key must be in LAZY_DEPS
except FeatureUnavailable as exc:
return {"error": str(exc)}
import my_backend_sdk # safe now
...
```
Two rules from the security model in `tools/lazy_deps.py`:
| Rule | Why |
|---|---|
| Your feature key must appear in the in-tree `LAZY_DEPS` allowlist | Prevents a malicious config from coaxing Hermes into installing arbitrary packages — only specs Hermes itself ships are eligible |
| Specs are PyPI-by-name only | No `--index-url`, `git+https://`, or file: paths. Pin versions with PEP 440 (`"my-sdk>=1.2,<2"`) inside the allowlist entry |
For third-party plugins distributed via pip, declare the optional deps as `[project.optional-dependencies]` extras in your own `pyproject.toml` and tell users to `pip install your-plugin[backend]` — that path doesn't go through `lazy_deps`. The lazy-install dance is most useful for **bundled** plugins where shipping a hard dependency on every install would bloat the base Hermes footprint.
When `security.allow_lazy_installs: false` is set globally, `ensure()` raises `FeatureUnavailable` immediately with a remediation hint — your plugin should catch it and degrade gracefully (return an error result, not crash the tool loop).
### Conditional tool availability
For tools that depend on optional libraries:

View file

@ -180,7 +180,9 @@ Both models support up to 200,000 tokens of context.
Hermes refreshes the token on every session start if it is within 60 seconds of expiry. If the access token is already expired (for example, after a long offline period), the refresh happens automatically on the next request. If refresh fails with `refresh_token_reused` or `invalid_grant`, Hermes marks the session as requiring re-login.
**Fix:** run `hermes auth add minimax-oauth` again to start a fresh login.
When the refresh failure is terminal (HTTP 4xx, `invalid_grant`, revoked grant, etc.), Hermes marks the refresh token as dead and quarantines it locally so it doesn't keep replaying the doomed exchange. The agent surfaces a single "re-authentication required" message and stays out of the way until you log in again.
**Fix:** run `hermes auth add minimax-oauth` again to start a fresh login. The quarantine clears on the next successful exchange.
### Authorization timed out

View file

@ -164,8 +164,8 @@ If OAuth tokens are already stored, the picker confirms it and skips the credent
The `video_gen` toolset is disabled by default. Enable it in `hermes tools``🎬 Video Generation` (press space) before the agent can call `video_generate`. Otherwise the agent may fall back to the bundled ComfyUI skill, which is also tagged for video generation.
:::
:::note X search is off by default
The `x_search` toolset is disabled by default. Enable it in `hermes tools``🐦 X (Twitter) Search` (press space) before the agent can call `x_search`. The tool routes through xAI's built-in `x_search` Responses API — it works with **either** your SuperGrok OAuth login or a paid `XAI_API_KEY`, and prefers OAuth when both are configured (uses your subscription quota instead of API spend). The tool schema is hidden from the model when no xAI credentials are configured, regardless of whether the toolset is enabled.
:::note X search auto-enables when xAI credentials are present
The `x_search` toolset auto-enables whenever xAI credentials (a SuperGrok OAuth token or `XAI_API_KEY`) are configured. Disable explicitly via `hermes tools``🐦 X (Twitter) Search` (press space) if you don't want this. The tool routes through xAI's built-in `x_search` Responses API — it works with **either** your SuperGrok OAuth login or a paid `XAI_API_KEY`, and prefers OAuth when both are configured (uses your subscription quota instead of API spend). The tool schema is hidden from the model when no xAI credentials are configured, regardless of whether the toolset is enabled.
:::
### Models
@ -196,7 +196,9 @@ The chat catalog is derived live from the on-disk `models.dev` cache; new xAI re
Hermes refreshes the token before each session and again reactively on a 401. If refresh fails with `invalid_grant` (the refresh token was revoked, or the account was rotated), Hermes surfaces a typed re-auth message instead of crashing.
**Fix:** run `hermes auth add xai-oauth` again to start a fresh login.
When the refresh failure is terminal (HTTP 4xx, `invalid_grant`, revoked grant, etc.), Hermes marks the refresh token as dead and quarantines it locally — subsequent calls skip the doomed refresh attempt instead of replaying the same 401 over and over. The agent surfaces a single "re-authentication required" message and stays out of the way until you log in again.
**Fix:** run `hermes auth add xai-oauth` again to start a fresh login. The quarantine clears on the next successful exchange.
### Authorization timed out

View file

@ -29,8 +29,10 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
| **GMI Cloud** | `GMI_API_KEY` in `~/.hermes/.env` (provider: `gmi`; aliases: `gmi-cloud`, `gmicloud`) |
| **MiniMax** | `MINIMAX_API_KEY` in `~/.hermes/.env` (provider: `minimax`) |
| **MiniMax China** | `MINIMAX_CN_API_KEY` in `~/.hermes/.env` (provider: `minimax-cn`) |
| **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`) |
| **Alibaba Coding Plan** | `DASHSCOPE_API_KEY` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint |
| **xAI (Grok) — Responses API** | `XAI_API_KEY` in `~/.hermes/.env` (provider: `xai`) |
| **xAI Grok OAuth (SuperGrok)** | `hermes model` → "xAI Grok OAuth (SuperGrok Subscription)" — browser login, no API key. See [guide](../guides/xai-grok-oauth.md) |
| **Qwen Cloud (Alibaba DashScope)** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`) |
| **Alibaba Cloud (Coding Plan)** | `DASHSCOPE_API_KEY` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint |
| **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) |
| **Xiaomi MiMo** | `XIAOMI_API_KEY` in `~/.hermes/.env` (provider: `xiaomi`, aliases: `mimo`, `xiaomi-mimo`) |
| **Tencent TokenHub** | `TOKENHUB_API_KEY` in `~/.hermes/.env` (provider: `tencent-tokenhub`, aliases: `tencent`, `tokenhub`, `tencentmaas`) |
@ -137,6 +139,8 @@ with the Generative Language API enabled.
:::info Codex Note
The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Hermes stores the resulting credentials in its own auth store under `~/.hermes/auth.json` and can import existing Codex CLI credentials from `~/.codex/auth.json` when present. No Codex CLI installation is required.
If a token refresh fails with a terminal error (HTTP 4xx, `invalid_grant`, revoked grant, etc.), Hermes marks the refresh token as dead and stops replaying it so you don't see a flood of identical auth failures. The next request surfaces a typed re-auth message instead. Run `hermes auth add codex-oauth` (or `hermes model` → OpenAI Codex) to start a fresh device-code login; the quarantine clears on the next successful exchange.
:::
:::warning
@ -158,6 +162,18 @@ Hermes has **two** model commands that serve different purposes:
If you're trying to switch to a provider you haven't set up yet (e.g. you only have OpenRouter configured and want to use Anthropic), you need `hermes model`, not `/model`. Exit your session first (`Ctrl+C` or `/quit`), run `hermes model`, complete the provider setup, then start a new session.
### Nous Portal
Subscription-based access to Hermes-4 models (`Hermes-4-70B`, `Hermes-4.3-36B`, `Hermes-4-405B`) via Nous Research's portal. Run `hermes model`, pick **Nous Portal**, sign in through the browser — Hermes stores a long-lived refresh token at `~/.hermes/auth.json`.
The refresh token is also shared across profiles via a shared token store, so logging in on one profile carries over to the others.
#### Token handling
Hermes mints a short-lived JWT from your stored Nous refresh token on each inference call rather than reusing a long-lived API key. The token lifecycle is fully automatic — refresh, mint, retry on transient 401 — and you never see it.
If the portal invalidates the refresh token (password change, manual revoke, session expiry), the invalid refresh token is quarantined locally so Hermes stops replaying it and you don't see a stream of identical 401s. The next call surfaces a clear "re-authentication required" message. Run `hermes auth add nous` to log in again; the quarantine clears on the next successful login.
### Anthropic (Native)
Use Claude models directly through the Anthropic API — no OpenRouter proxy needed. Supports three auth methods:
@ -292,7 +308,7 @@ hermes chat --provider minimax --model MiniMax-M2.7
hermes chat --provider minimax-cn --model MiniMax-M2.7
# Requires: MINIMAX_CN_API_KEY in ~/.hermes/.env
# Alibaba Cloud / DashScope (Qwen models)
# Qwen Cloud / DashScope (Qwen models)
hermes chat --provider alibaba --model qwen3.5-plus
# Requires: DASHSCOPE_API_KEY in ~/.hermes/.env
@ -440,11 +456,11 @@ model:
Set `HERMES_QWEN_BASE_URL` only if the portal endpoint relocates (default: `https://portal.qwen.ai/v1`).
:::tip Qwen OAuth vs DashScope (Alibaba)
`qwen-oauth` uses the consumer-facing Qwen Portal with OAuth login — ideal for individual users. The `alibaba` provider uses DashScope's enterprise API with a `DASHSCOPE_API_KEY` — ideal for programmatic / production workloads. Both route to Qwen-family models but live at different endpoints.
:::tip Qwen OAuth vs Qwen Cloud (Alibaba DashScope)
`qwen-oauth` uses the consumer-facing Qwen Portal with OAuth login — ideal for individual users. The `alibaba` provider uses Qwen Cloud (Alibaba DashScope) with a `DASHSCOPE_API_KEY` — ideal for programmatic / production workloads. Both route to Qwen-family models but live at different endpoints.
:::
### Alibaba Coding Plan
### Alibaba Cloud (Coding Plan)
If you're subscribed to Alibaba's **Coding Plan** (a pricing SKU separate from standard DashScope API access), Hermes exposes it as its own first-class provider: `alibaba-coding-plan`. Endpoint: `https://coding-intl.dashscope.aliyuncs.com/v1`. It's OpenAI-compatible like the regular `alibaba` provider but with a different base URL and billing surface.
@ -512,6 +528,8 @@ model:
For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://localhost:8000/v1`. NIM exposes the same OpenAI-compatible chat completions API as build.nvidia.com, so switching between cloud and local is a one-line env-var change.
:::
Hermes automatically attaches the NIM billing-origin header on every request to `build.nvidia.com` — no configuration needed. This routes consumption against the correct origin in NVIDIA's billing dashboard.
### GMI Cloud
Open and reasoning models via [GMI Cloud](https://www.gmicloud.ai/) — OpenAI-compatible API, API key authentication.
@ -1203,13 +1221,15 @@ custom_providers:
- name: work
base_url: https://gpu-server.internal.corp/v1
key_env: CORP_API_KEY
api_mode: chat_completions # optional, auto-detected from URL
api_mode: chat_completions # set explicitly by `hermes model` → Custom Endpoint wizard; auto-detection still happens as a fallback
- name: anthropic-proxy
base_url: https://proxy.example.com/anthropic
key_env: ANTHROPIC_PROXY_KEY
api_mode: anthropic_messages # for Anthropic-compatible proxies
```
The `hermes model` → Custom Endpoint wizard now prompts for `api_mode` explicitly and persists your answer to `config.yaml`. URL-based auto-detection (e.g. `/anthropic` paths → `anthropic_messages`) still happens as a fallback when the field is left blank.
Switch between them mid-session with the triple syntax:
```

View file

@ -212,6 +212,7 @@ Subcommands:
| `stop` | Stop the service (or foreground process). |
| `restart` | Restart the service. |
| `status` | Show service status. |
| `list` | List **all profiles** and whether each profile's gateway is currently running (with PID where available). Handy when you run multiple profiles side-by-side and want a single overview. |
| `install` | Install as a systemd (Linux) or launchd (macOS) background service. |
| `uninstall` | Remove the installed service. |
| `setup` | Interactive messaging-platform setup. |

View file

@ -70,7 +70,7 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
| `HERMES_GEMINI_PROJECT_ID` | GCP project ID for paid Gemini tiers (free tier auto-provisions) |
| `ANTHROPIC_API_KEY` | Anthropic Console API key ([console.anthropic.com](https://console.anthropic.com/)) |
| `ANTHROPIC_TOKEN` | Manual or legacy Anthropic OAuth/setup-token override |
| `DASHSCOPE_API_KEY` | Alibaba Cloud DashScope API key for Qwen models ([modelstudio.console.alibabacloud.com](https://modelstudio.console.alibabacloud.com/)) |
| `DASHSCOPE_API_KEY` | Qwen Cloud (Alibaba DashScope) API key for Qwen models ([modelstudio.console.alibabacloud.com](https://modelstudio.console.alibabacloud.com/)) |
| `DASHSCOPE_BASE_URL` | Custom DashScope base URL (default: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`; use `https://dashscope.aliyuncs.com/compatible-mode/v1` for mainland-China region) |
| `DEEPSEEK_API_KEY` | DeepSeek API key for direct DeepSeek access ([platform.deepseek.com](https://platform.deepseek.com/api_keys)) |
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
@ -579,6 +579,7 @@ Advanced per-platform knobs for throttling the outbound message batcher. Most us
|----------|-------------|
| `SESSION_IDLE_MINUTES` | Reset sessions after N minutes of inactivity (default: 1440) |
| `SESSION_RESET_HOUR` | Daily reset hour in 24h format (default: 4 = 4am) |
| `HERMES_SESSION_ID` | **Exported automatically into every tool subprocess** Hermes spawns (`terminal`, `execute_code`, persistent shell, Docker/Singularity backends, delegated subagent runs). Set by the agent to the current session ID; user scripts called from tools can read it to correlate their output, telemetry, or side effects with the originating Hermes session. **You should not set this manually** — overriding it from a parent shell only takes effect outside an agent run, and is overwritten the moment the agent starts a session. |
## Context Compression (config.yaml only)

View file

@ -84,6 +84,7 @@ Creates a new profile.
| `--clone-from <profile>` | Clone from a specific profile instead of the current one. Used with `--clone` or `--clone-all`. |
| `--no-alias` | Skip wrapper script creation. |
| `--description "<text>"` | One- or two-sentence description of what this profile is good at. Used by the kanban orchestrator to route tasks based on role instead of profile name alone. Skip and add later via `hermes profile describe`. Persisted in `<profile_dir>/profile.yaml`. |
| `--no-skills` | Create an **empty** profile with zero bundled skills enabled. Writes a `.no-skills` marker into the profile so future `hermes update` runs won't re-seed the bundled set, and refuses to combine with `--clone` / `--clone-all` (which would copy skills in anyway). Useful for narrow orchestrator profiles or sandbox profiles that should not inherit the full skill catalog. |
Creating a profile does **not** make that profile directory the default project/workspace directory for terminal commands. If you want a profile to start in a specific project, set `terminal.cwd` in that profile's `config.yaml`.

View file

@ -13,6 +13,21 @@ Hermes has two slash-command surfaces, both driven by a central `COMMAND_REGISTR
Installed skills are also exposed as dynamic slash commands on both surfaces. That includes bundled skills like `/plan`, which opens plan mode and saves markdown plans under `.hermes/plans/` relative to the active workspace/backend working directory.
## Permissions and admin/user split
Every messaging platform that supports a per-user allowlist (Telegram, Discord, Slack, Matrix, Mattermost, Signal, …) also supports a two-tier slash command split: **admins** get every registered command, **regular users** only get the names you list in `user_allowed_commands` (plus the always-allowed floor `/help` and `/whoami`). Configure `allow_admin_from` and `user_allowed_commands` (and the per-group equivalents `group_allow_admin_from` / `group_user_allowed_commands`) inside the platform's `extra:` block in `~/.hermes/gateway-config.yaml`.
See the per-platform docs for examples — the structure is identical across platforms:
- [Telegram](../user-guide/messaging/telegram.md#slash-command-access-control)
- [Discord](../user-guide/messaging/discord.md)
- [Slack](../user-guide/messaging/slack.md)
- [Matrix](../user-guide/messaging/matrix.md)
- [Mattermost](../user-guide/messaging/mattermost.md)
- [Signal](../user-guide/messaging/signal.md)
If `allow_admin_from` is unset for a scope, that scope stays in unrestricted backward-compat mode — every allowed user can run every command.
## Interactive CLI slash commands
Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-insensitive.
@ -21,7 +36,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| Command | Description |
|---------|-------------|
| `/new` (alias: `/reset`) | Start a new session (fresh session ID + history) |
| `/new [name]` (alias: `/reset`) | Start a new session (fresh session ID + history). Optional `[name]` sets the initial session title — e.g. `/new my-experiment` opens a fresh session already titled `my-experiment` so it's easy to find later with `/resume` or `/sessions`. |
| `/clear` | Clear screen and start a new session |
| `/history` | Show conversation history |
| `/save` | Save the current conversation |
@ -35,10 +50,11 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). |
| `/steer <prompt>` | Inject a mid-run note that arrives at the agent **after the next tool call** — no interrupt, no new user turn. The text is appended to the last tool result's content once the current tool completes, giving the agent new context without breaking the current tool-calling loop. Use this to nudge direction mid-task (e.g. "focus on the auth module" while the agent is running tests). |
| `/goal <text>` | Set a standing goal Hermes works toward across turns — our take on the Ralph loop. After each turn an auxiliary judge model decides whether the goal is done; if not, Hermes auto-continues. Subcommands: `/goal status`, `/goal pause`, `/goal resume`, `/goal clear`. Budget defaults to 20 turns (`goals.max_turns`); any real user message preempts the continuation loop, and state survives `/resume`. See [Persistent Goals](/docs/user-guide/features/goals) for the full walkthrough. |
| `/subgoal <text>` | Append a user-supplied criterion to the active goal mid-loop. The continuation prompt surfaces all subgoals to the agent verbatim, and the judge factors them into its DONE/CONTINUE verdict — so the goal isn't marked done until the original goal **and** every subgoal are met. Subcommands: `/subgoal` (list), `/subgoal remove <N>`, `/subgoal clear`. Requires an active `/goal`. |
| `/resume [name]` | Resume a previously-named session |
| `/sessions` | Browse and resume previous sessions in an interactive picker |
| `/redraw` | Force a full UI repaint (recovers from terminal drift after tmux resize, mouse selection artifacts, etc.) |
| `/status` | Show session info |
| `/status` | Show session info — model, provider, profile, session ID, working directory, title, created/updated timestamps, token totals, agent-running state — followed by a local **Session recap** block (recent user/assistant turn counts, tool result count, top tools used, last few files touched, the latest user prompt, and the latest assistant reply). The recap is computed locally from the in-memory conversation; no LLM call, no prompt-cache impact. |
| `/agents` (alias: `/tasks`) | Show active agents and running tasks across the current session. |
| `/background <prompt>` (alias: `/bg`, `/btw`) | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
| `/branch [name]` (alias: `/fork`) | Branch the current session (explore a different path) |
@ -86,7 +102,8 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/help` | Show this help message |
| `/usage` | Show token usage, cost breakdown, session duration, and — when available from the active provider — an **Account limits** section with remaining quota / credits / plan usage pulled live from the provider's API. |
| `/insights` | Show usage insights and analytics (last 30 days) |
| `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status |
| `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status (CLI-only summary view). |
| `/platform <list\|pause\|resume> [name]` | Operate a running gateway platform. `/platform list` lists every adapter and its state (running, paused-by-breaker, manually-paused); `/platform pause <name>` stops dispatching new messages to that adapter without unloading it; `/platform resume <name>` re-enables it. The gateway also auto-pauses an adapter when its circuit breaker trips on repeated retryable failures (network / rate-limit / 5xx) — use `/platform resume <name>` to clear the breaker once the upstream is healthy. Available wherever the gateway is reachable (CLI session, Telegram, Discord, …). |
| `/paste` | Attach a clipboard image |
| `/copy [number]` | Copy the last assistant response to clipboard (or the Nth-from-last with a number). CLI-only. |
| `/image <path>` | Attach a local image file for your next prompt. |
@ -183,7 +200,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
|---------|-------------|
| `/new` | Start a new conversation. |
| `/reset` | Reset conversation history. |
| `/status` | Show session info. |
| `/status` | Show session info, followed by a local **Session recap** block (recent turn counts, top tools used, files touched, latest prompt + reply). |
| `/stop` | Kill all running background processes and interrupt the running agent. |
| `/model [provider:model]` | Show or change the model. Supports provider switches (`/model zai:glm-5`), custom endpoints (`/model custom:model`), named custom providers (`/model custom:local:qwen`), auto-detect (`/model custom`), and user-defined aliases (`/model fav`, `/model grok` — see [Custom model aliases](#custom-model-aliases)). Use `--global` to persist the change to config.yaml. **Note:** `/model` can only switch between already-configured providers. To add a new provider or set up API keys, use `hermes model` from your terminal (outside the chat session). |
| `/codex-runtime [auto\|codex_app_server\|on\|off]` | Toggle the optional [Codex app-server runtime](../user-guide/features/codex-app-server-runtime). Persists to `model.openai_runtime` in config.yaml and evicts the cached agent so the next message picks up the new runtime. Effective on next session. |
@ -226,3 +243,18 @@ The messaging gateway supports the following built-in commands inside Telegram,
- `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, `/topic`, and `/commands` are **messaging-only** commands.
- `/status`, `/background`, `/queue`, `/steer`, `/voice`, `/reload-mcp`, `/reload-skills`, `/rollback`, `/debug`, `/fast`, `/footer`, `/curator`, `/kanban`, `/sessions`, and `/yolo` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.
## Confirmation prompts for destructive commands
The CLI prompts before running slash commands that throw away unsaved session state. The current destructive set is:
| Command | What it destroys |
|---------|------------------|
| `/clear` | Clears the screen and starts a fresh session — current session ID and in-memory history are gone. |
| `/new` / `/reset` | Starts a fresh session (new session ID + empty history). |
| `/undo` | Removes the last user/assistant exchange from history. |
| `/exit --delete` / `/quit --delete` | Exits **and** permanently deletes the current session's SQLite history and on-disk transcripts. |
For each of these the CLI opens a three-choice modal: **Approve Once** (proceed this time), **Always Approve** (proceed and persist `approvals.destructive_slash_confirm: false` so future destructive commands run without prompting), or **Cancel**.
Set `approvals.destructive_slash_confirm: false` in `~/.hermes/config.yaml` to disable the prompts globally; set it back to `true` to re-enable. See [Security — Destructive slash command confirmation](../user-guide/security.md#dangerous-command-approval) for context.

View file

@ -181,7 +181,7 @@ Registered when the agent is either (a) spawned by the kanban dispatcher (`HERME
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `vision_analyze` | Analyze images using AI vision. Provides a comprehensive description and answers a specific question about the image content. | — |
| `vision_analyze` | Analyze images using AI vision. On vision-capable main models, returns the raw image pixels as a multimodal tool result so the model sees them natively on its next turn. On text-only main models, falls back to an auxiliary vision model that describes the image and returns the description as text. Tool signature is identical either way. | — |
## `video` toolset

View file

@ -68,9 +68,12 @@ A persistent status bar sits above the input area, updating in real time:
| Token count | Context tokens used / max context window |
| Context bar | Visual fill indicator with color-coded thresholds |
| Cost | Estimated session cost (or `n/a` for unknown/zero-priced models) |
| 🗜️ N | **Context compression count** — how many times the running session has been auto-compressed. Appears once the first compression fires. |
| ▶ N | **Active background tasks** — how many `/background` prompts are still running in the current session. Appears whenever at least one task is in flight. |
| Duration | Elapsed session time |
| ⚠ YOLO | **YOLO mode warning** — shown whenever `HERMES_YOLO_MODE` is on (either `hermes --yolo` at launch or `/yolo` toggled mid-session). Mirrors the banner-line warning so you can't forget you're in auto-approve mode. |
The bar adapts to terminal width — full layout at ≥ 76 columns, compact at 5275, minimal (model + duration only) below 52.
The bar adapts to terminal width — full layout at ≥ 76 columns, compact at 5275, minimal (model + duration, plus the YOLO badge when active) below 52.
**Context color coding:**
@ -125,6 +128,8 @@ Common examples:
| `/voice tts` | Toggle spoken playback for Hermes replies |
| `/reasoning high` | Increase reasoning effort |
| `/title My Session` | Name the current session |
| `/status` | Show session info — model/profile/tokens/duration — followed by a local **Session recap** block (recent turn counts, top tools used, files touched, latest user prompt + assistant reply). Pure local compute; no LLM call. |
| `/sessions` | Open an interactive session picker right inside the classic CLI (same surface the TUI uses). Type to filter, arrow keys to navigate, Enter to resume. |
For the full built-in CLI and messaging lists, see [Slash Commands Reference](../reference/slash-commands.md).

View file

@ -140,6 +140,9 @@ terminal:
docker_volumes: # Host directory mounts
- "/home/user/projects:/workspace/projects"
- "/home/user/data:/data:ro" # :ro for read-only
docker_extra_args: # Extra flags appended verbatim to `docker run`
- "--gpus=all"
- "--network=host"
# Resource limits
container_cpu: 1 # CPU cores (0 = unlimited)
@ -148,6 +151,8 @@ terminal:
container_persistent: true # Persist /workspace and /root across sessions
```
**`terminal.docker_extra_args`** (also overridable via `TERMINAL_DOCKER_EXTRA_ARGS='["--gpus=all"]'`) lets you pass arbitrary `docker run` flags that Hermes doesn't surface as first-class keys — `--gpus`, `--network`, `--add-host`, alternative `--security-opt` overrides, etc. Each entry must be a string; the list is appended last to the assembled `docker run` invocation so it can override Hermes' defaults if needed. Use sparingly — flags that conflict with the sandbox hardening (capability drops, `--user`, the workspace bind mount) will silently weaken isolation.
**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed.
**Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it).
@ -762,6 +767,16 @@ credential_pool_strategies:
Options: `fill_first` (default), `round_robin`, `least_used`, `random`. See [Credential Pools](/docs/user-guide/features/credential-pools) for full documentation.
## Prompt caching
Hermes turns on cross-session prompt caching automatically when the active provider supports it — no user config needed.
For Claude on **native Anthropic**, **OpenRouter**, and **Nous Portal**, Hermes attaches `cache_control` breakpoints with the 1-hour TTL (`ttl: "1h"`) on the system prompt and skill blocks. The first send within a fresh hour pays full input rates; subsequent sends across any session within the same hour pull from the cache at the discounted cached-read rate. This means the system prompt, loaded skill content, and the early portion of any long-context include get reused across `hermes` sessions and across forked subagents for the first hour.
The Qwen Cloud (Alibaba DashScope) upstream caps cache TTL at 5 minutes, so Hermes uses the 5-minute breakpoint TTL there instead. Other Claude-via-third-party paths (AWS Bedrock, Azure Foundry) fall back to the provider's own caching defaults. xAI Grok uses a separate session-pinned conversation-id mechanism — see [xAI prompt caching](/docs/integrations/providers#xai-grok--responses-api--prompt-caching).
No knob exists to disable this — caching is always-on and saves money even on single-turn conversations because the system prompt alone is a meaningful fraction of the input token count.
## Auxiliary Models
Hermes uses "auxiliary" models for side tasks like image analysis, web page summarization, browser screenshot analysis, session-title generation, and context compression. By default (`auxiliary.*.provider: "auto"`), Hermes routes every auxiliary task to your **main chat model** — the same provider/model you picked in `hermes model`. You don't need to configure anything to get started, but be aware that on expensive reasoning models (Opus, MiniMax M2.7, etc.) auxiliary tasks add meaningful cost. If you want cheap-and-fast side tasks regardless of your main model, set `auxiliary.<task>.provider` and `auxiliary.<task>.model` explicitly (for example, Gemini Flash on OpenRouter for vision and web extraction).
@ -1168,12 +1183,13 @@ display:
show_reasoning: false # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
streaming: false # Stream tokens to terminal as they arrive (real-time output)
show_cost: false # Show estimated $ cost in the CLI status bar
timestamps: false # When true, prefixes user and assistant labels with [HH:MM] timestamps in the CLI / TUI transcript
tool_preview_length: 0 # Max chars for tool call previews (0 = no limit, show full paths/commands)
runtime_footer: # Gateway: append a runtime-context footer to final replies
enabled: false
fields: ["model", "context_pct", "cwd"]
file_mutation_verifier: true # Append an advisory footer when write_file/patch calls failed this turn
language: en # UI language for static messages (approval prompts, some gateway replies). en | zh | ja | de | es | fr | tr | uk
language: en # UI language for static messages (approval prompts, some gateway replies). en | zh | zh-hant | ja | de | es | fr | tr | uk | af | ko | it | ga | pt | ru | hu
```
### File-mutation verifier

View file

@ -196,6 +196,10 @@ docker run -it --rm \
Direct `-e` flags override values from `.env`. This is useful for CI/CD or secrets-manager integrations where you don't want keys on disk.
:::note Looking for Docker as the **terminal backend**?
This page covers running Hermes itself inside Docker. If you want Hermes to execute the agent's `terminal` / `execute_code` calls inside a Docker sandbox container (one persistent container per Hermes process), that's a separate config block — `terminal.backend: docker` plus `terminal.docker_image`, `terminal.docker_volumes`, `terminal.docker_forward_env`, `terminal.docker_run_as_host_user`, and `terminal.docker_extra_args`. See [Configuration → Docker Backend](configuration.md#docker-backend) for the full set.
:::
## Docker Compose example
For persistent deployment with both the gateway and dashboard, a `docker-compose.yaml` is convenient:

View file

@ -218,6 +218,21 @@ Dangerous terminal commands can be routed back to the editor as approval prompts
On timeout or error, the approval bridge denies the request.
### Session-scoped edit auto-approval
ACP exposes a third tier between *allow once* and *allow always*: **Allow for session**. Picking it from the editor's permission prompt records the approval inside the current ACP session only — every subsequent matching command in that session goes through without prompting, but a new ACP session (or restarting the editor) resets the slate and re-prompts the first time.
| Option | Editor label | Scope | Persisted across restarts |
|---|---|---|---|
| `allow_once` | Allow once | This one tool call | No |
| `allow_session` | Allow for session | All matching calls in this ACP session | No — cleared when the session ends |
| `allow_always` | Allow always | All future sessions | Yes (written to the Hermes permanent allowlist) |
| `deny` | Deny | This one tool call | No |
`allow_session` is the right default for an editor workflow where you trust an agent for the duration of a task but don't want to grant a long-lived allowlist entry. The safety trade-off is straightforward: the broader the scope, the less the editor will interrupt you, and the more damage a misbehaving agent (or prompt injection) can do before you notice. Start with `allow_once` for unfamiliar commands; promote to `allow_session` once you've seen the agent run the same pattern correctly a few times; reserve `allow_always` for truly idempotent commands you trust forever (e.g. `git status`).
The ACP bridge maps these options onto Hermes' internal approval semantics — `allow_always` writes a permanent allowlist entry the same way the CLI does, while `allow_session` only affects the in-process approval cache for the current ACP session.
## Troubleshooting
### ACP agent does not appear in the editor

View file

@ -121,6 +121,35 @@ When `workdir` is set:
Jobs with a `workdir` run sequentially on the scheduler tick, not in the parallel pool. This is deliberate — `TERMINAL_CWD` is process-global, so two workdir jobs running at the same time would corrupt each other's cwd. Workdir-less jobs still run in parallel as before.
:::
## Running cron jobs in a specific profile
By default a cron job inherits whichever Hermes profile owned the gateway / CLI that created it. Pass `--profile <name>` (CLI) or `profile=` (cronjob tool) to re-target the job at a different profile — the scheduler resolves that profile's `HERMES_HOME`, temporarily switches into it for the duration of the run, loads its `.env` + `config.yaml`, and executes the job there:
```bash
# Pin a job to the `night-ops` profile regardless of where it was scheduled
hermes cron create "every 1d at 03:00" \
"Tail the security log and flag anomalies" \
--profile night-ops
```
```python
# From a chat, via the cronjob tool
cronjob(
action="create",
schedule="every 1d at 03:00",
prompt="Tail the security log and flag anomalies",
profile="night-ops",
)
```
Use `--profile default` to explicitly pin to the root Hermes profile. The named profile must already exist; the scheduler refuses to create profiles on the fly. To clear a profile pin during `cron edit`, pass an empty string (`--profile ""` or `profile=""`) — the job reverts to running in whatever profile the scheduler itself is in.
If the pinned profile is later deleted, the scheduler logs a warning and falls back to running the job in its current profile rather than crashing — so a stale `profile` reference never wedges a job.
:::note Serialization
Jobs with a `profile` set also run sequentially, for the same reason as `workdir`-pinned jobs: switching `HERMES_HOME` is a process-global mutation, so two profile-pinned jobs running in parallel would race each other. Unpinned jobs still run in the normal parallel pool.
:::
## Editing jobs
You do not need to delete and recreate jobs just to change them.

View file

@ -217,6 +217,10 @@ Every curator run writes a timestamped directory under `~/.hermes/logs/curator/`
`REPORT.md` is a quick way to see what a given run did — which skills transitioned, what the LLM reviewer said, which skills it patched. Good for auditing without having to grep `agent.log`.
### Rename map in the summary
If a run consolidated multiple skills under an umbrella (or merged near-duplicates), the user-visible summary printed at the end of the run includes an explicit rename map showing every `old-name → new-name` pair the curator applied. This is in addition to per-skill transition lines, so when a wave of renames lands you can spot them at a glance without diffing the JSON report. The hint also surfaces under `hermes curator pin` so you can pin the umbrella name immediately if you want to lock the new label in.
## Restoring an archived skill
If the curator archived something you still want:

View file

@ -268,6 +268,7 @@ delegation:
# orchestrator_enabled: true # Disable to force all children to leaf role.
model: "google/gemini-3-flash-preview" # Optional provider/model override
provider: "openrouter" # Optional built-in provider
api_mode: anthropic_messages # optional; auto-detected from base_url for anthropic_messages endpoints
# Or use a direct custom endpoint instead of provider:
delegation:
@ -277,6 +278,8 @@ delegation:
# api_mode: "anthropic_messages" # Optional. Wire protocol override for base_url ("chat_completions", "codex_responses", or "anthropic_messages"). Empty = auto-detect from URL (e.g. /anthropic suffix). Set explicitly for endpoints the heuristic can't classify (Azure AI Foundry, MiniMax, Zhipu GLM, LiteLLM proxies, …).
```
When `base_url` points at an Anthropic-compatible endpoint — for example a path ending in `/anthropic`, an Azure Foundry Claude route, or a MiniMax `/anthropic` proxy — `api_mode` is auto-detected as `anthropic_messages` so the subagent uses the right wire format without you setting anything. Set `api_mode` explicitly when the auto-detection guess is wrong (rare).
:::tip
The agent handles delegation automatically based on the task complexity. You don't need to explicitly ask it to delegate — it will do so when it makes sense.
:::

View file

@ -47,6 +47,21 @@ What you'll see:
Works identically on the CLI and every gateway platform (Telegram, Discord, Slack, Matrix, Signal, WhatsApp, SMS, iMessage, Webhook, API server, and the web dashboard).
## Adding criteria mid-goal: `/subgoal`
While a goal is active you can append extra acceptance criteria with `/subgoal <text>` without resetting the loop. Each call adds one numbered item to the goal's subgoal list; the **continuation prompt** the agent sees on the next turn includes the original goal plus an "Additional criteria the user added mid-loop" block, and the **judge prompt** is rewritten so the verdict must consider every subgoal — the goal isn't marked done until the original objective **and** every subgoal are met.
| Command | What it does |
|---|---|
| `/subgoal <text>` | Append a new criterion to the active goal. Requires an active `/goal`. |
| `/subgoal` (no args) | Show the current numbered subgoal list. |
| `/subgoal remove <N>` | Remove the Nth subgoal (1-based). |
| `/subgoal clear` | Drop every subgoal but keep the original goal intact. |
Subgoals are persisted alongside the goal in `SessionDB.state_meta`, so they survive `/resume`. Setting a new `/goal <text>` replaces the goal and clears the subgoal list; `/goal clear` does the same.
Use this when you start a loop ("fix the failing tests") and notice partway through that you also want it to "and add a regression test for the bug you just patched" — `/subgoal add a regression test` tightens the success criteria without breaking the running loop.
## Behavior details
### The judge

View file

@ -127,6 +127,30 @@ mcp_servers:
Authorization: "Bearer ***"
```
## Built-in presets
For well-known MCP servers, `hermes mcp add` accepts a `--preset` flag that fills in the transport details so you don't have to look up the command and args. The preset only supplies defaults — anything else (env vars, headers, filtering) you pass on the same command line still wins.
| Preset | What it wires up |
|---|---|
| `codex` | The Codex CLI's MCP server (`codex mcp-server` over stdio). Requires the `codex` CLI on PATH. |
```bash
# Add Codex CLI as an MCP server in one line
hermes mcp add codex --preset codex
```
That writes the equivalent of:
```yaml
mcp_servers:
codex:
command: "codex"
args: ["mcp-server"]
```
You can pick any local name (`hermes mcp add my-codex --preset codex` is fine); the preset only provides the `command`/`args` defaults.
## How Hermes registers MCP tools
Hermes prefixes MCP tools so they do not collide with built-in names:
@ -554,7 +578,7 @@ The gateway does NOT need to be running for read operations (listing conversatio
### Current limits
- Stdio transport only (no HTTP MCP transport yet)
- The embedded `hermes mcp serve` exposes a **stdio-only** MCP server today. If you need an HTTP MCP server, run a separate adapter — or, much more commonly, use the MCP **client** side of Hermes, which already speaks both stdio and HTTP (`url` + `headers` in `mcp_servers.yaml` / `config.yaml`; see [HTTP servers](#http-servers) above).
- Event polling at ~200ms intervals via mtime-optimized DB polling (skips work when files are unchanged)
- No `claude/channel` push notification protocol yet
- Text-only sends (no media/attachment sending through `messages_send`)

View file

@ -39,6 +39,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
- **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
- **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
- **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
- **[Prompt caching](../configuration#prompt-caching)** — Built-in cross-session 1-hour prefix cache for Claude on native Anthropic, OpenRouter, and Nous Portal. Always-on; no configuration required.
- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, Supermemory) for cross-session user modeling and personalization beyond the built-in memory system.
- **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
- **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.

View file

@ -107,6 +107,35 @@ platforms: [macos, linux] # macOS and Linux
When set, the skill is automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms. If omitted, the skill loads on all platforms.
## Skill output and media delivery
When a skill response (or any agent response) includes a bare absolute path to a media file — for example `/home/user/screenshots/diagram.png` — the gateway auto-detects it, strips it from the visible text, and delivers the file natively to the user's chat (Telegram photo, Discord attachment, etc.) instead of leaving the raw path in the message.
For audio specifically, the `[[audio_as_voice]]` directive promotes audio files to native voice-message bubbles on platforms that support them (Telegram, WhatsApp).
### Forcing document-style delivery: `[[as_document]]`
Sometimes you want the **opposite** of inline preview: you want the file delivered as a downloadable attachment, not a re-compressed image bubble. The classic example is a high-resolution screenshot or chart — Telegram's `sendPhoto` recompresses it to ~200 KB at 1280 px, destroying readability. A 1-2 MB PNG sent via `sendDocument` keeps the original bytes intact.
If a response (or any text inside it — typically the last line) contains the literal directive `[[as_document]]`, every media path extracted from that response is delivered as a document/file attachment rather than an image bubble:
```
Here is your rendered chart:
/home/user/.hermes/cache/chart-q4-2025.png
[[as_document]]
```
The directive is stripped before delivery, so users never see it. Granularity is intentionally all-or-nothing per response: emit `[[as_document]]` once and every image path in the same response is delivered as a document. This mirrors the scope of `[[audio_as_voice]]`.
Use it from a skill when:
- You produce screenshots or charts the user needs as files (for editing in another tool, archiving, sharing intact).
- The default lossy preview would obscure detail (small text, pixel-accurate diagrams, color-sensitive renders).
Platforms without a separate document path (e.g. SMS) fall back to whatever attachment mechanism they have.
### Conditional Activation (Fallback Skills)
Skills can automatically show or hide themselves based on which tools are available in the current session. This is most useful for **fallback skills** — free or local alternatives that should only appear when a premium tool is unavailable.

View file

@ -202,3 +202,9 @@ When a user attaches an image — from the CLI clipboard, the gateway (Telegram/
You don't configure this — Hermes looks up your current model's capability in the provider metadata and picks the right path automatically. The practical effect: you can switch between vision and non-vision models mid-session and image handling "just works" without changing your workflow. Text-only models get coherent context about the image rather than a broken multimodal payload they'd have to reject.
Which auxiliary model handles the text-description path is configurable under `auxiliary.vision` — see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models).
### `vision_analyze` has the same dual behavior
The `vision_analyze` tool itself follows the same routing. When the active main model is vision-capable **and** its provider supports image content inside tool results (currently the Anthropic, OpenAI, Azure-OpenAI, and Gemini 3.x stacks), `vision_analyze` short-circuits the auxiliary describer and returns the raw image pixels as a multimodal tool-result envelope. The main model sees the image natively on its next turn — no aux call, no text-summary information loss, no extra latency.
For text-only main models (or providers whose tool-result channel doesn't carry images), `vision_analyze` falls back to the legacy path: it asks the configured auxiliary vision model to describe the image and returns the description as plain text. Either way the calling tool signature is the same — the tool decides which path to take at runtime based on the active model.

View file

@ -20,10 +20,14 @@ Both are configured through a single backend selection. Providers are chosen via
|----------|---------|--------|---------|-------|-----------|
| **Firecrawl** (default) | `FIRECRAWL_API_KEY` | ✔ | ✔ | ✔ | 500 credits/mo |
| **SearXNG** | `SEARXNG_URL` | ✔ | — | — | ✔ Free (self-hosted) |
| **Brave Search (free tier)** | `BRAVE_SEARCH_API_KEY` | ✔ | — | — | 2 000 queries/mo |
| **DDGS (DuckDuckGo)** | — (no key) | ✔ | — | — | ✔ Free |
| **Tavily** | `TAVILY_API_KEY` | ✔ | ✔ | ✔ | 1 000 searches/mo |
| **Exa** | `EXA_API_KEY` | ✔ | ✔ | — | 1 000 searches/mo |
| **Parallel** | `PARALLEL_API_KEY` | ✔ | ✔ | — | Paid |
Brave Search and DDGS are **search-only** — pair either with Firecrawl/Tavily/Exa/Parallel when you also need `web_extract`. DDGS uses the [`ddgs` Python package](https://pypi.org/project/ddgs/) under the hood; if it isn't already installed, run `pip install ddgs` (or let Hermes lazy-install it on first use).
**Per-capability split:** you can use different providers for search and extract independently — for example SearXNG (free) for search and Firecrawl for extract. See [Per-capability configuration](#per-capability-configuration) below.
:::tip Nous Subscribers
@ -278,7 +282,7 @@ Set one provider for all web capabilities:
```yaml
# ~/.hermes/config.yaml
web:
backend: "searxng" # firecrawl | searxng | tavily | exa | parallel
backend: "searxng" # firecrawl | searxng | brave-free | ddgs | tavily | exa | parallel
```
### Per-capability configuration

View file

@ -26,7 +26,7 @@ The tool's `check_fn` runs the xAI credential resolver every time the model's to
## Enabling the tool
Off by default. Enable in `hermes tools`:
Auto-enables when xAI credentials (OAuth token or `XAI_API_KEY`) are present. Disable explicitly via `hermes tools` → Search → x_search if you don't want this.
```bash
hermes tools

View file

@ -634,6 +634,24 @@ When the flag is on, any uploaded file is downloaded, cached under `~/.hermes/ca
Known-text formats already in the allowlist (`.txt`, `.md`, `.log`) continue to have their contents auto-injected up to 100 KiB; that behavior is unchanged when the flag is on.
Equivalent env vars: `DISCORD_ALLOW_ANY_ATTACHMENT=true` and `DISCORD_MAX_ATTACHMENT_BYTES=33554432` (or `0` for no cap).
:::warning Memory cost of unlimited
Disabling the size cap (`max_attachment_bytes: 0`) means a user can drop a multi-GB file on the bot and the gateway will dutifully buffer it through memory while caching to disk. Only set this in trusted single-user installs. For shared bots, keep the default 32 MiB or raise it conservatively.
:::
## Interactive Prompts (clarify)
When the agent calls the `clarify` tool — to ask which approach you prefer, get post-task feedback, or check before a non-trivial decision — Discord renders the question with **one button per choice**:
> Which framework should I use for the dashboard?
>
> [1. Next.js] [2. Remix] [3. Astro] [Other (type answer)]
Click a numbered button to answer, or click **Other** to type a free-form response (the next message you send in that channel becomes the answer). Open-ended `clarify` calls (no preset choices) skip the buttons and just capture your next message.
The buttons disable themselves once a choice is made so duplicate clicks don't double-resolve the prompt. Configure the response timeout via `agent.clarify_timeout` in `~/.hermes/config.yaml` (default `600` seconds). If you don't respond within the timeout, the agent unblocks with a sentinel message and adapts rather than hanging.
## Home Channel
You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:

View file

@ -443,6 +443,84 @@ Each platform has its own toolset:
| API Server | `hermes-api-server` | Full tools (drops `clarify`, `send_message`, `text_to_speech` — programmatic access doesn't have an interactive user) |
| Webhooks | `hermes-webhook` | Full tools including terminal |
## Operating a multi-platform gateway
A gateway typically runs several adapters at once (Telegram + Discord + Slack, etc.). The sections below cover day-2 operations that span all platforms.
### `/platform` command
Once the gateway is running, use the `/platform` slash command from any connected CLI session or chat to inspect and steer individual adapters without restarting the whole gateway:
```
/platform list # show all adapters and their state
/platform pause <name> # stop dispatching new messages to one adapter
/platform resume <name> # re-enable a paused adapter
```
`/platform list` shows whether each adapter is `running`, `paused` (manually), or `paused-by-breaker` (see below). Pausing keeps the adapter loaded and its background loops alive — incoming messages are dropped on the floor, but the connection itself stays open so resume is instant.
See also the broader status summary command [`/platforms`](../../reference/slash-commands.md#info).
### Automatic circuit breaker
Each adapter is wrapped in a circuit breaker. Repeated retryable failures (network blips, rate-limit replies, 5xx upstream responses, websocket disconnects) cause the breaker to trip — the adapter is auto-paused, an operator notification is sent to the home channel of another live platform when one is configured, and a structured log line is emitted.
The breaker does **not** auto-resume — it stays open until you run `/platform resume <name>` manually. This is intentional: if a platform is in a sustained outage, you don't want the gateway thrashing reconnects.
### Where to look when a platform is paused
When an adapter is paused, check:
1. **Gateway log** (`~/.hermes/logs/gateway.log` or the systemd / launchd unit log). Search for the platform name and `circuit breaker`, `paused`, or `disabled`. The trip event includes the failure count and the last error.
2. **`/platform list`** output — shows the current state and last reason.
3. **The provider's status page** (Telegram bot API status, Discord status, etc.). The breaker tripped because the platform was unhealthy; don't try to resume until it's back.
Once upstream is healthy, `/platform resume <name>` clears the breaker and re-arms the adapter.
### Restart notifications
When the gateway restarts (or is shut down with in-flight sessions), it can send a one-shot "the agent is back" / "the agent was interrupted" message to each platform's home channel. This is controlled per-platform by the `gateway_restart_notification` flag in `gateway-config.yaml`, which defaults to `true`:
```yaml
gateway:
platforms:
telegram:
home_chat_id: "123456789"
gateway_restart_notification: false # opt out for this platform
discord:
home_chat_id: "987654321"
# gateway_restart_notification omitted → defaults to true
```
Disable it on noisy or low-priority platforms while leaving it on for your primary chat. The notification is sent once per restart, regardless of how many sessions were in flight.
### Session resume across gateway restarts
When the gateway shuts down with an in-flight tool call or generation, the affected sessions are flagged as `restart_interrupted`. On the next startup, the gateway schedules an auto-resume for each one — the user gets a short heads-up in the chat ("Send any message after restart and I'll try to resume where you left off.") and the session picks up from the last committed turn when they reply.
This behaviour is on by default and is logged at gateway start:
```
Scheduled auto-resume for N restart-interrupted session(s)
```
No configuration is required. If you don't want the heads-up, set `gateway_restart_notification: false` on the platform.
### Progress bubble cleanup (opt-in)
Tool-progress messages, the "still working…" heartbeat, and status-callback bubbles can be auto-deleted after the final response lands. Enable per-platform via `display.platforms.<platform>.cleanup_progress`:
```yaml
display:
platforms:
telegram:
cleanup_progress: true
discord:
cleanup_progress: true
```
Defaults to `false`. Only platforms whose adapter implements `delete_message` honor the setting (currently Telegram and Discord). Failed runs **skip** cleanup so the bubbles remain as breadcrumbs.
## Next Steps
- [Telegram Setup](telegram.md)

View file

@ -345,6 +345,34 @@ Add this to your `~/.hermes/.env`:
MATRIX_HOME_ROOM=!abc123def456:matrix.example.org
```
## Room allowlist (`allowed_rooms`)
Restrict the bot to a fixed set of Matrix rooms. When set, the bot **only** responds in rooms whose ID appears in the list — messages from any other room are silently ignored, even if the bot is mentioned.
**DMs (direct chat rooms) are exempt** from this filter, so authorized users can always reach the bot one-on-one.
```yaml
matrix:
allowed_rooms:
- "!abc123def456:matrix.example.org"
- "!opsroom789:matrix.example.org"
```
Or via env var (comma-separated):
```bash
MATRIX_ALLOWED_ROOMS="!abc123def456:matrix.example.org,!opsroom789:matrix.example.org"
```
Behavior:
- Empty / unset → no restriction (default).
- Non-empty → room ID must be on the list. The check runs **before** any other gating (mention requirement, sender allowlist, etc.).
- Use the room's **internal ID** (`!abc...:server`), not its alias (`#room:server`). You can find a room's internal ID in Element via Room → Settings → Advanced.
See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
:::tip
To find a Room ID: in Element, go to the room → **Settings****Advanced** → the **Internal room ID** is shown there (starts with `!`).
:::

View file

@ -225,6 +225,33 @@ To find a channel ID in Mattermost: open the channel, click the channel name hea
When the bot is `@mentioned`, the mention is automatically stripped from the message before processing.
## Channel allowlist (`allowed_channels`)
Restrict the bot to a fixed set of Mattermost channels. When set, the bot **only** responds in channels whose ID appears in the list — messages from any other channel are silently ignored, even if the bot is `@mentioned`.
**DMs are exempt** from this filter, so authorized users can always reach the bot in a direct message.
```yaml
mattermost:
allowed_channels:
- "abc123def456ghi789jkl012mno" # #ops
- "xyz987uvw654rst321opq098nml" # #incident-response
```
Or via env var (comma-separated):
```bash
MATTERMOST_ALLOWED_CHANNELS="abc123def456ghi789jkl012mno,xyz987uvw654rst321opq098nml"
```
Behavior:
- Empty / unset → no restriction (fully backward compatible).
- Non-empty → channel ID must be on the list, or the message is dropped before any other gating (mention requirement, `MATTERMOST_FREE_RESPONSE_CHANNELS`, etc.) runs.
- Find a channel ID via the Mattermost UI → channel header → "View Info", or read it from the channel URL.
See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
## Troubleshooting
### Bot is not responding to messages

View file

@ -389,6 +389,33 @@ Set this to `true` in busy workspaces where Slack's default "the bot remembers t
Slack supports both patterns: `@mention` required to start a conversation by default, but you can opt specific channels out via `SLACK_FREE_RESPONSE_CHANNELS` (comma-separated channel IDs) or `slack.free_response_channels` in `config.yaml`. Once the bot has an active session in a thread, subsequent thread replies do not require a mention. In DMs the bot always responds without needing a mention.
:::
### Channel allowlist (`allowed_channels`)
Restrict the bot to a fixed set of Slack channels — useful when the bot is invited to many channels but should only respond in a few. When set, messages from channels NOT in this list are **silently ignored**, even if the bot is `@mentioned`.
**DMs are exempt** from this filter, so authorized users can always reach the bot in a direct message.
```yaml
slack:
allowed_channels:
- "C0123456789" # #ops
- "C0987654321" # #incident-response
```
Or via env var (comma-separated):
```bash
SLACK_ALLOWED_CHANNELS="C0123456789,C0987654321"
```
Behavior:
- Empty / unset → no restriction (fully backward compatible).
- Non-empty → channel ID must be on the list, or the message is dropped before any other gating (mention requirement, `free_response_channels`, etc.) runs.
- Slack channel IDs start with `C` (public), `G` (private), or `D` (DM). Look them up via the Slack UI's "Open channel details" → "About" panel, or via the API.
See also: [admin/user slash command split](../../reference/slash-commands.md#permissions-and-adminuser-split).
### Unauthorized User Handling
```yaml

View file

@ -944,6 +944,34 @@ TELEGRAM_GROUP_ALLOWED_USERS="-1001234567890"
TELEGRAM_GROUP_ALLOWED_CHATS="-1001234567890"
```
### Guest @mention bypass (`guest_mode`)
In a typical setup, `group_allowed_chats` is a hard gate: messages from groups outside the list are silently dropped, even if a member explicitly @mentions the bot. That's the right default for support / team bots.
For more casual setups — friend group chats where you want the bot **mostly silent** but **occasionally available on explicit ping** — enable `guest_mode`:
```yaml
gateway:
platforms:
telegram:
extra:
group_allowed_chats:
- "-1001234567890" # your main allowlisted group
guest_mode: true # non-allowlisted groups: allow on @mention only
```
Env equivalent:
```bash
TELEGRAM_GUEST_MODE=true
```
Default: `false`.
With `guest_mode: true`, a message from a non-allowlisted group is processed **only** if it explicitly @mentions the bot. The mention is required every turn — there's no session stickiness for guest interactions, so the bot never auto-engages in a friend group thread it isn't pinged into.
DMs and allowlisted groups behave exactly as before.
## Slash Command Access Control
By default, every allowed user can run every slash command. To split your allowlist into **admins** (full slash command access) and **regular users** (only commands you explicitly enable), add `allow_admin_from` and `user_allowed_commands` to the platform's `extra` block:
@ -1153,6 +1181,32 @@ Tap a button to answer, or tap **Other** to type a free-form response (the next
Configure the response timeout via `agent.clarify_timeout` in `~/.hermes/config.yaml` (default `600` seconds). If you don't respond within the timeout, the agent unblocks with a sentinel message and adapts rather than hanging.
## Push notification volume
Telegram fires a push notification on every message the bot sends. For long agent turns that emit tool-progress bubbles, streaming updates, and status callbacks, this gets noisy fast. The Telegram adapter has two notification modes:
| Mode | Behavior |
|------|----------|
| `important` (default) | Only **final responses**, **approval prompts**, and **slash-command confirmations** ring. Tool progress, streaming chunks, and status messages are delivered with `disable_notification=true`. |
| `all` | Every outgoing message fires a push notification. Legacy behavior; opt in if you genuinely want to hear about every tool call. |
Configure in `~/.hermes/config.yaml`:
```yaml
display:
platforms:
telegram:
notifications: important # or "all"
```
Env override (handy for quick A/B testing):
```bash
HERMES_TELEGRAM_NOTIFICATIONS=all
```
Unknown values log a warning and fall back to `important`.
## Security
:::warning

View file

@ -73,6 +73,8 @@ When YOLO is active, Hermes shows two persistent visual reminders so it's hard t
YOLO mode disables **all** dangerous command safety checks for the session — **except** the hardline blocklist (see below). Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments).
:::
For destructive session slash commands (`/clear`, `/new` / `/reset`, `/undo`, `/exit --delete`), the CLI also prompts for confirmation before running them. See [Slash Commands — Confirmation prompts for destructive commands](../reference/slash-commands.md#confirmation-prompts-for-destructive-commands).
### Hardline Blocklist (Always-On Floor)
Some commands are so catastrophic — irreversible filesystem wipes, fork bombs, direct block-device writes — that Hermes refuses to run them **regardless** of:
@ -605,3 +607,58 @@ TERMINAL_SSH_KEY=~/.ssh/hermes_agent_key
```
The SSH connection details live in `.env` (not `config.yaml`) so they aren't checked in or shared along with profile exports. This keeps the gateway's messaging connections separate from the agent's command execution.
## Supply-chain advisory checking
Hermes ships with a built-in advisory scanner that flags Python packages in the active venv that match a curated catalog of known-compromised versions (supply-chain worms like the May 2026 `mistralai 2.4.6` poisoning). Implementation lives in `hermes_cli/security_advisories.py`.
How it runs:
- **CLI startup banner.** A one-line warning is printed if any advisory matches, with a pointer to `hermes doctor` for the full remediation.
- **`hermes doctor`.** Surfaces every active advisory with version specifics and 2-4 step remediation instructions.
- **Gateway startup.** Logged to `gateway.log`; the first interactive message gets a short operator banner.
Each advisory carries a stable id. Once you have read and acted on it you can dismiss it for good:
```bash
hermes doctor --ack <advisory-id>
```
The ack is persisted to `config.security.acked_advisories` and survives restart. Old advisories are intentionally **not** removed from the catalog — leaving them in place keeps fresh installs warned about historically poisoned versions that might still be cached in a private mirror.
The check itself is stdlib-only and runs from one `importlib.metadata.version()` lookup per advisory, so it's safe to run on every startup.
### Lazy install of optional dependencies
Many features (Mistral TTS, ElevenLabs, Honcho memory, Bedrock, Slack, Matrix, …) depend on Python packages that not every user needs. Hermes installs these **lazily** on first use rather than eagerly under `hermes-agent[all]`. The implementation lives in `tools/lazy_deps.py`.
The trade-off this fixes:
- **Fragility.** When one extra's transitive dependency becomes unavailable on PyPI (quarantined for malware, yanked, broken upload), the entire `[all]` resolve would fail and fresh installs would silently fall back to a stripped tier — losing 10+ unrelated extras at once. Lazy install isolates each backend so one poisoned dep can't break unrelated features.
- **Bloat.** A user who only ever talks to one provider no longer pulls hundreds of packages they will never import.
How it works:
1. A backend module calls `ensure("feature.name")` at the top of its first-import path.
2. If the deps are missing, `ensure` checks `security.allow_lazy_installs` in `config.yaml` (default `true`) and runs a venv-scoped `pip install` for the allowlisted specs.
3. If the install fails or the user has disabled lazy installs, the call raises `FeatureUnavailable` with the actual pip stderr and a pointer at `hermes tools`.
Security guarantees enforced by `tools/lazy_deps.py`:
| Guarantee | What it means |
|---|---|
| Venv-scoped only | Installs target `sys.executable` in the active venv — never the system Python |
| PyPI by name only | Specs accept `"package>=1.0,<2"` syntax. No `--index-url`, `git+https://`, or file: paths — a malicious `config.yaml` cannot redirect the install |
| Allowlist | Only specs that appear in the in-tree `LAZY_DEPS` map can be installed via this path. A typo in a feature name does NOT get install-anything semantics |
| Opt-out | Set `security.allow_lazy_installs: false` to disable runtime installs entirely. Useful for restricted networks or strict security postures |
| No silent retries | Failures surface as `FeatureUnavailable` — no caching of bad state, no retry storms |
To disable runtime installs:
```yaml
# ~/.hermes/config.yaml
security:
allow_lazy_installs: false
```
When disabled, backends that need optional deps will tell the user to run the install manually (`pip install …`) or pick a different backend via `hermes tools`.

View file

@ -60,6 +60,9 @@ into chat.
Use `/compress` when a session gets long, `/new` for a fresh thread, and
`hermes sessions prune` only when you want to delete old ended sessions from
storage. Compression reduces the active context; it is not a privacy delete.
Pass a name to `/new` (e.g. `/new payments-refactor`) to set the new session's
initial title up front — useful for finding it later with `/resume <name>` or
in the `/sessions` picker.
:::
### Session Sources
@ -412,9 +415,9 @@ session_search()
Returns recent sessions chronologically (titles, previews, timestamps). Useful when the user asks "what was I working on" without naming a topic.
### FTS5 Query Syntax
### FTS5 query syntax
The search supports standard FTS5 query syntax:
The keyword mode supports standard FTS5 query syntax:
- Simple keywords: `docker deployment` (FTS5 defaults to AND)
- Phrases: `"exact phrase"`
@ -432,6 +435,8 @@ The agent is prompted to use session search automatically:
> *"When the user references something from a past conversation or you suspect relevant prior context exists, use session_search to recall it before asking them to repeat themselves."*
Typical triggers: "we did this before", "remember when", "last time", "as I mentioned", or any reference to a project/person/concept that isn't in the current window.
## Per-Platform Session Tracking
### Gateway Sessions

View file

@ -50,6 +50,19 @@ The classic CLI remains available as the default. Anything documented in [CLI In
Same [skins](features/skins.md) and [personalities](features/personality.md) apply. Switch mid-session with `/skin ares`, `/personality pirate`, and the UI repaints live. See [Skins & Themes](features/skins.md) for the full list of customizable keys and which ones apply to classic vs TUI — the TUI honors the banner palette, UI colors, prompt glyph/color, session display, completion menu, selection bg, `tool_prefix`, and `help_header`.
### Collapsible banner sections
The TUI startup banner groups runtime info into four collapsible sections, each rendered with a `▸` / `▾` chevron next to the section title:
| Section | Default state |
|---------|---------------|
| Tools | Open |
| Skills | Collapsed |
| System Prompt | Collapsed |
| MCP Servers | Collapsed |
Click anywhere on a section header (or its chevron) to toggle it. The Tools list opens by default because it's the most-checked section at session start; Skills, System Prompt, and MCP Servers collapse by default so the banner stays compact even when you've installed dozens of skills or wired up many MCP servers. State is local to the banner instance, so the next launch resets to the defaults.
## Requirements
- **Node.js** ≥ 20 — the TUI runs as a subprocess launched from the Python CLI. `hermes doctor` verifies this.
@ -158,6 +171,9 @@ The status line also shows:
- **Working directory with git branch**`~/projects/hermes-agent (docs/two-week-gap-sweep)`. The branch suffix updates when you `git checkout` in a side terminal (mtime-cached) so the TUI reflects your actual active branch, not whatever it was at launch.
- **Per-prompt elapsed time**`⏱ 12s/3m 45s` while the turn is running (live), frozen to `⏲ 32s / 3m 45s` after the turn completes. First number is time since last user message; second is total session duration. Resets on every new prompt.
- **`🗜️ N`** — number of times the running session has been auto-compressed. Appears once the first compression fires.
- **`▶ N`** — number of `/background` tasks currently running in this session. Appears whenever at least one task is in flight.
- **`⚠ YOLO`** — visible warning whenever YOLO mode is on (`hermes --yolo`, `/yolo`, or `HERMES_YOLO_MODE=1`). The same badge also appears in the startup banner so you cannot launch an auto-approving session without noticing.
## Configuration
@ -215,6 +231,25 @@ Sessions are shared between the TUI and the classic CLI — both write to the sa
See [Sessions](sessions.md) for lifecycle, search, compression, and export.
## Attaching to a running gateway
By default the TUI spawns its own in-process gateway, so each TUI instance is self-contained. If you already have a long-lived gateway running (e.g. `hermes gateway run` in tmux, or the systemd / launchd service), you can point the TUI at that gateway instead — the TUI then becomes a thin client and shares state with every other surface (messaging platforms, web dashboard, other TUI sessions) that's attached to the same gateway.
Set the websocket URL via env before launching:
```bash
export HERMES_TUI_GATEWAY_URL="ws://localhost:8765/api/ws?token=<auth-token>"
hermes --tui
```
The token comes from the gateway's API auth configuration (see [API Server](features/api-server.md)). When the env var is set, the TUI:
- Skips spawning a local gateway entirely — no duplicate platform adapters, no port conflicts.
- Routes every action (slash commands, image attach, browser progress, voice events, …) over the websocket to the shared gateway.
- Reconnects automatically if the gateway URL rotates (new token) between requests.
This is the same channel the web dashboard's embedded TUI uses (see [Web Dashboard](features/web-dashboard.md#chat)) — one gateway, many clients.
## Reverting to the classic CLI
Launching `hermes` (without `--tui`) stays on the classic CLI. To make a machine prefer the TUI, set `HERMES_TUI=1` in your shell profile. To go back, unset it.

View file

@ -38,11 +38,35 @@ No admin rights required. The installer goes to `%LOCALAPPDATA%\hermes\` and add
| Parameter | Default | Purpose |
|---|---|---|
| `-Branch` | `main` | Clone a specific branch (useful for testing PRs) |
| `-Commit` | unset | Pin install to a specific commit SHA (overrides `-Branch`) |
| `-Tag` | unset | Pin install to a specific git tag (e.g. `v0.14.0`) |
| `-NoVenv` | off | Skip venv creation (advanced — you manage Python yourself) |
| `-SkipSetup` | off | Skip the post-install `hermes setup` wizard |
| `-HermesHome` | `%LOCALAPPDATA%\hermes` | Override data directory |
| `-InstallDir` | `%LOCALAPPDATA%\hermes\hermes-agent` | Override code location |
The installer auto-retries flaky git fetches and strips BOM from any downloaded `install.ps1` payload, so a UTF-8 BOM picked up during HTTP transit no longer breaks the `[scriptblock]::Create((irm ...))` form.
### Desktop installer (alternative)
A thin GUI installer is also available — useful if you'd rather double-click an `.exe` than open PowerShell. Download Hermes Desktop, run the installer, and on first launch the GUI calls `install.ps1` under the hood to provision Python (via `uv`), Node, PortableGit, and the rest of the dependency bootstrap described below. After the first run, the desktop app and the PowerShell-installed `hermes` CLI share the same `%LOCALAPPDATA%\hermes\hermes-agent` install and `%USERPROFILE%\.hermes` data directory — switch between the GUI and the CLI freely.
Use the desktop installer when you want a familiar Windows install experience or you're handing Hermes to a non-developer; use the PowerShell one-liner when you're already in a terminal.
### Dependency bootstrap (`dep_ensure`)
On first launch (and on demand when a missing tool is detected), Hermes runs a small Python bootstrapper — `hermes_cli/dep_ensure.py` — that checks for and lazily installs the non-Python dependencies it needs. On Windows, the relevant ones are:
| Dependency | Why Hermes needs it |
|---|---|
| **PortableGit** | Provides `bash.exe` for the terminal tool and `git` for in-session clones. Provisioned at install time, not by `dep_ensure`. |
| **Node.js 22** | Required for the browser tool (`agent-browser`), the TUI's web bridge, and the WhatsApp bridge. |
| **ffmpeg** | Audio format conversion for TTS / voice messages. |
| **ripgrep** | Fast file search — falls back to `grep` if unavailable. |
| **npm packages** | `agent-browser`, Playwright Chromium, and any per-toolset Node deps are installed once at first browser-tool use. |
Each dep has a `shutil.which(...)`-style check; if a binary is missing and the run is interactive, `dep_ensure` offers to install it (deferring to `scripts\install.ps1 -ensure <dep>` for the actual install logic). Non-interactive runs (gateway, cron, headless desktop launches) skip the prompt and surface a clear `this feature needs <dep>` error instead.
## What the installer actually does
Top-to-bottom, in order: