docs: expand Docusaurus coverage across CLI, tools, skills, and skins (#1232)

- add code-derived reference pages for slash commands, tools, toolsets,
  bundled skills, and official optional skills
- document the skin system and link visual theming separately from
  conversational personality
- refresh quickstart, configuration, environment variable, and messaging
  docs to match current provider, gateway, and browser behavior
- fix stale command, session, and Home Assistant configuration guidance
This commit is contained in:
Teknium 2026-03-13 21:34:41 -07:00 committed by GitHub
parent 2bf6b7ad1a
commit 984f00e0b0
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
26 changed files with 1228 additions and 397 deletions

View file

@ -75,61 +75,33 @@ When resuming a previous session (`hermes -c` or `hermes --resume <id>`), a "Pre
|-----|--------|
| `Enter` | Send message |
| `Alt+Enter` or `Ctrl+J` | New line (multi-line input) |
| `Alt+V` | Paste an image from the clipboard when supported by the terminal |
| `Ctrl+V` | Paste text and opportunistically attach clipboard images |
| `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) |
| `Ctrl+D` | Exit |
| `Tab` | Autocomplete slash commands |
## Slash Commands
Type `/` to see an autocomplete dropdown of all available commands.
Type `/` to see the autocomplete dropdown. Hermes supports a large set of CLI slash commands, dynamic skill commands, and user-defined quick commands.
### Navigation & Control
Common examples:
| Command | Description |
|---------|-------------|
| `/help` | Show available commands |
| `/quit` | Exit the CLI (also: `/exit`, `/q`) |
| `/clear` | Clear screen and reset conversation |
| `/new` | Start a new conversation |
| `/reset` | Reset conversation only (keep screen) |
| `/help` | Show command help |
| `/model` | Show or change the current model |
| `/tools` | List currently available tools |
| `/skills browse` | Browse the skills hub and official optional skills |
| `/background <prompt>` | Run a prompt in a separate background session |
| `/skin` | Show or switch the active CLI skin |
| `/reasoning high` | Increase reasoning effort |
| `/title My Session` | Name the current session |
### Tools & Configuration
| Command | Description |
|---------|-------------|
| `/tools` | List all available tools grouped by toolset |
| `/toolsets` | List available toolsets with descriptions |
| `/model [provider:model]` | Show or change the current model (supports `provider:model` syntax) |
| `/provider` | Show available providers with auth status |
| `/config` | Show current configuration |
| `/prompt [text]` | View/set/clear custom system prompt |
| `/personality [name]` | Set a predefined personality |
| `/reasoning [arg]` | Manage reasoning effort (`none`/`low`/`medium`/`high`/`xhigh`) and display (`show`/`hide`) |
### Conversation Management
| Command | Description |
|---------|-------------|
| `/history` | Show conversation history |
| `/retry` | Retry the last message |
| `/undo` | Remove the last user/assistant exchange |
| `/save` | Save the current conversation |
| `/compress` | Manually compress conversation context |
| `/usage` | Show token usage for this session |
| `/insights [--days N]` | Show usage insights and analytics (last 30 days) |
### Skills & Scheduling
| Command | Description |
|---------|-------------|
| `/cron` | Manage scheduled tasks |
| `/skills` | Browse, search, install, inspect, or manage skills |
| `/platforms` | Show gateway/messaging platform status |
| `/verbose` | Cycle tool progress display: off → new → all → verbose |
| `/<skill-name>` | Invoke any installed skill (e.g., `/axolotl`, `/gif-search`) |
For the full built-in CLI and messaging lists, see [Slash Commands Reference](../reference/slash-commands.md).
:::tip
Commands are case-insensitive — `/HELP` works the same as `/help`. Most commands work mid-conversation.
Commands are case-insensitive — `/HELP` works the same as `/help`. Installed skills also become slash commands automatically.
:::
## Quick Commands
@ -261,16 +233,16 @@ Resuming restores the full conversation history from SQLite. The agent sees all
Use `/title My Session Name` inside a chat to name the current session, or `hermes sessions rename <id> <title>` from the command line. Use `hermes sessions list` to browse past sessions.
### Session Logging
### Session Storage
Sessions are automatically logged to `~/.hermes/sessions/`:
CLI sessions are stored in Hermes's SQLite state database under `~/.hermes/state.db`. The database keeps:
```
sessions/
├── session_20260201_143052_a1b2c3.json
├── session_20260201_150217_d4e5f6.json
└── ...
```
- session metadata (ID, title, timestamps, token counters)
- message history
- lineage across compressed/resumed sessions
- full-text search indexes used by `session_search`
Some messaging adapters also keep per-platform transcript files alongside the database, but the CLI itself resumes from the SQLite session store.
### Context Compression
@ -280,7 +252,7 @@ Long conversations are automatically summarized when approaching context limits:
# In ~/.hermes/config.yaml
compression:
enabled: true
threshold: 0.85 # Compress at 85% of context limit
threshold: 0.50 # Compress at 50% of context limit by default
summary_model: "google/gemini-3-flash-preview" # Model used for summarization
```

View file

@ -72,7 +72,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
| **Custom Endpoint** | `OPENAI_BASE_URL` + `OPENAI_API_KEY` in `~/.hermes/.env` |
:::info Codex Note
The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Credentials are stored at `~/.codex/auth.json` and auto-refresh. No Codex CLI installation required.
The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Hermes stores the resulting credentials in its own auth store under `~/.hermes/auth.json` and can import existing Codex CLI credentials from `~/.codex/auth.json` when present. No Codex CLI installation is required.
:::
:::warning
@ -493,7 +493,7 @@ node_modules/
```yaml
compression:
enabled: true
threshold: 0.85 # Compress at 85% of context limit
threshold: 0.50 # Compress at 50% of context limit by default
summary_model: "google/gemini-3-flash-preview" # Model for summarization
# summary_provider: "auto" # "auto", "openrouter", "nous", "main"
```
@ -666,12 +666,13 @@ tts:
```yaml
display:
tool_progress: all # off | new | all | verbose
personality: "kawaii" # Default personality for the CLI
compact: false # Compact output mode (less whitespace)
resume_display: full # full (show previous messages on resume) | minimal (one-liner only)
bell_on_complete: false # Play terminal bell when agent finishes (great for long tasks)
show_reasoning: false # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
tool_progress: all # off | new | all | verbose
skin: default # Built-in or custom CLI skin (see user-guide/features/skins)
personality: "kawaii" # Legacy cosmetic field still surfaced in some summaries
compact: false # Compact output mode (less whitespace)
resume_display: full # full (show previous messages on resume) | minimal (one-liner only)
bell_on_complete: false # Play terminal bell when agent finishes (great for long tasks)
show_reasoning: false # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
```
| Mode | What you see |
@ -714,8 +715,9 @@ Usage: type `/status`, `/disk`, `/update`, or `/gpu` in the CLI or any messaging
- **30-second timeout** — long-running commands are killed with an error message
- **Priority** — quick commands are checked before skill commands, so you can override skill names
- **Autocomplete** — quick commands are resolved at dispatch time and are not shown in the built-in slash-command autocomplete tables
- **Type** — only `exec` is supported (runs a shell command); other types show an error
- **Works everywhere** — CLI, Telegram, Discord, Slack, WhatsApp, Signal
- **Works everywhere** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Email, Home Assistant
## Human Delay

View file

@ -94,7 +94,7 @@ Entries can optionally include:
Each prompt gets a randomly sampled set of toolsets from a **distribution**. This ensures training data covers diverse tool combinations. Use `--list_distributions` to see all available distributions.
Distributions define probability weights for each toolset combination. For example, a "default" distribution might assign high probability to `["terminal", "file", "web"]` and lower probability to web-only or file-only combinations.
In the current implementation, distributions assign a probability to **each individual toolset**. The sampler flips each toolset independently, then guarantees that at least one toolset is enabled. This is different from a hand-authored table of prebuilt combinations.
## Output Format

View file

@ -7,11 +7,16 @@ sidebar_position: 5
# Browser Automation
Hermes Agent includes a full browser automation toolset powered by [Browserbase](https://browserbase.com), enabling the agent to navigate websites, interact with page elements, fill forms, and extract information — all running in cloud-hosted browsers with built-in anti-bot stealth features.
Hermes Agent includes a full browser automation toolset that can run in two modes:
- **Browserbase cloud mode** via [Browserbase](https://browserbase.com) for managed cloud browsers and anti-bot tooling
- **Local browser mode** via the `agent-browser` CLI and a local Chromium installation
In both modes, the agent can navigate websites, interact with page elements, fill forms, and extract information.
## Overview
The browser tools use the `agent-browser` CLI with Browserbase cloud execution. Pages are represented as **accessibility trees** (text-based snapshots), making them ideal for LLM agents. Interactive elements get ref IDs (like `@e1`, `@e2`) that the agent uses for clicking and typing.
The browser tools use the `agent-browser` CLI. In Browserbase mode, `agent-browser` connects to Browserbase cloud sessions. In local mode, it drives a local Chromium installation. Pages are represented as **accessibility trees** (text-based snapshots), making them ideal for LLM agents. Interactive elements get ref IDs (like `@e1`, `@e2`) that the agent uses for clicking and typing.
Key capabilities:
@ -23,16 +28,22 @@ Key capabilities:
## Setup
### Required Environment Variables
### Browserbase cloud mode
To use Browserbase-managed cloud browsers, add:
```bash
# Add to ~/.hermes/.env
BROWSERBASE_API_KEY=your-api-key-here
BROWSERBASE_API_KEY=***
BROWSERBASE_PROJECT_ID=your-project-id-here
```
Get your credentials at [browserbase.com](https://browserbase.com).
### Local browser mode
If you do **not** set Browserbase credentials, Hermes can still use the browser tools through a local Chromium install driven by `agent-browser`.
### Optional Environment Variables
```bash

View file

@ -298,7 +298,7 @@ hermes honcho peer --user NAME # Set user peer name
hermes honcho peer --ai NAME # Set AI peer name
hermes honcho peer --reasoning LEVEL # Set dialectic reasoning level
hermes honcho mode # Show current memory mode
hermes honcho mode [hybrid|honcho] # Set memory mode
hermes honcho mode [hybrid|honcho|local] # Set memory mode
hermes honcho tokens # Show token budget settings
hermes honcho tokens --context N # Set context token cap
hermes honcho tokens --dialectic N # Set dialectic char cap

View file

@ -161,6 +161,8 @@ Tools appear alongside built-in tools — the agent calls them like any other to
:::info
In addition to the server's own tools, each MCP server also gets 4 utility tools auto-registered: `list_resources`, `read_resource`, `list_prompts`, and `get_prompt`. These allow the agent to discover and use MCP resources and prompts exposed by the server.
Each configured server also creates a **runtime toolset** named `mcp-<server>`. This means you can filter or reason about MCP servers at the toolset level in the same way you do with built-in toolsets.
:::
### Reconnection

View file

@ -216,13 +216,17 @@ The system prompt is assembled in layers (from `agent/prompt_builder.py` and `ru
**SOUL.md vs agent.system_prompt**: SOUL.md is part of the "Project Context" section and coexists with the default identity. The `agent.system_prompt` (set via `/personality` or config) is an ephemeral overlay. Both can be active simultaneously — SOUL.md for tone/personality, system_prompt for additional instructions.
:::
## Display Personality (CLI Banner)
## CLI Appearance vs Conversational Personality
The `display.personality` config option controls the CLI's **visual** personality (banner art, spinner messages), independent of the agent's conversational personality:
Conversational personality and CLI appearance are separate:
- `agent.system_prompt`, `/personality`, and `SOUL.md` affect how Hermes **speaks**.
- `display.skin` and `/skin` affect how Hermes **looks in the terminal**.
```yaml
display:
personality: kawaii # Affects CLI banner and spinner art
skin: default
# personality: kawaii # legacy cosmetic setting still shown in some summaries
```
This is purely cosmetic and doesn't affect the agent's responses — only the ASCII art and loading messages shown in the terminal.
For the full theming system — built-in skins, custom YAML skins, spinner branding, and `/skin` — see [Skins & Themes](./skins.md).

View file

@ -10,6 +10,11 @@ Skills are on-demand knowledge documents the agent can load when needed. They fo
All skills live in **`~/.hermes/skills/`** — a single directory that serves as the source of truth. On fresh install, bundled skills are copied from the repo. Hub-installed and agent-created skills also go here. The agent can modify or delete any skill.
See also:
- [Bundled Skills Catalog](/docs/reference/skills-catalog)
- [Official Optional Skills Catalog](/docs/reference/optional-skills-catalog)
## Using Skills
Every installed skill is automatically available as a slash command:
@ -139,6 +144,7 @@ When a missing value is encountered, Hermes asks for it securely only when the s
│ │ ├── SKILL.md # Main instructions (required)
│ │ ├── references/ # Additional docs
│ │ ├── templates/ # Output formats
│ │ ├── scripts/ # Helper scripts callable from the skill
│ │ └── assets/ # Supplementary files
│ └── vllm/
│ └── SKILL.md
@ -199,6 +205,8 @@ hermes skills tap add myorg/skills-repo # Add a custom source
All hub-installed skills go through a **security scanner** that checks for data exfiltration, prompt injection, destructive commands, and other threats.
Official optional skills use identifiers like `official/security/1password` and `official/migration/openclaw-migration`.
### Trust Levels
| Level | Source | Policy |

View file

@ -0,0 +1,81 @@
---
sidebar_position: 10
title: "Skins & Themes"
description: "Customize the Hermes CLI with built-in and user-defined skins"
---
# Skins & Themes
Skins control the **visual presentation** of the Hermes CLI: banner colors, spinner faces and verbs, response-box labels, branding text, and the tool activity prefix.
Conversational style and visual style are separate concepts:
- **Personality** changes the agent's tone and wording.
- **Skin** changes the CLI's appearance.
## Change skins
```bash
/skin # show the current skin and list available skins
/skin ares # switch to a built-in skin
/skin mytheme # switch to a custom skin from ~/.hermes/skins/mytheme.yaml
```
Or set the default skin in `~/.hermes/config.yaml`:
```yaml
display:
skin: default
```
## Built-in skins
| Skin | Description | Agent branding |
|------|-------------|----------------|
| `default` | Classic Hermes — gold and kawaii | `Hermes Agent` |
| `ares` | War-god theme — crimson and bronze | `Ares Agent` |
| `mono` | Monochrome — clean grayscale | `Hermes Agent` |
| `slate` | Cool blue — developer-focused | `Hermes Agent` |
| `poseidon` | Ocean-god theme — deep blue and seafoam | `Poseidon Agent` |
| `sisyphus` | Sisyphean theme — austere grayscale with persistence | `Sisyphus Agent` |
| `charizard` | Volcanic theme — burnt orange and ember | `Charizard Agent` |
## What a skin can customize
| Area | Keys |
|------|------|
| Banner + response colors | `colors.banner_*`, `colors.response_border` |
| Spinner animation | `spinner.waiting_faces`, `spinner.thinking_faces`, `spinner.thinking_verbs`, `spinner.wings` |
| Branding text | `branding.agent_name`, `branding.welcome`, `branding.response_label`, `branding.prompt_symbol` |
| Tool activity prefix | `tool_prefix` |
## Custom skins
Create YAML files under `~/.hermes/skins/`. User skins inherit missing values from the built-in `default` skin.
```yaml
name: cyberpunk
description: Neon terminal theme
colors:
banner_border: "#FF00FF"
banner_title: "#00FFFF"
banner_accent: "#FF1493"
spinner:
thinking_verbs: ["jacking in", "decrypting", "uploading"]
wings:
- ["⟨⚡", "⚡⟩"]
branding:
agent_name: "Cyber Agent"
response_label: " ⚡ Cyber "
tool_prefix: "▏"
```
## Operational notes
- Built-in skins load from `hermes_cli/skin_engine.py`.
- Unknown skins automatically fall back to `default`.
- `/skin` updates the active CLI theme immediately for the current session.

View file

@ -10,25 +10,22 @@ Tools are functions that extend the agent's capabilities. They're organized into
## Available Tools
| Category | Tools | Description |
|----------|-------|-------------|
| **Web** | `web_search`, `web_extract` | Search the web, extract page content |
| **Terminal** | `terminal`, `process` | Execute commands (local/docker/singularity/modal/daytona/ssh backends), manage background processes |
| **File** | `read_file`, `write_file`, `patch`, `search_files` | Read, write, edit, and search files |
| **Browser** | `browser_navigate`, `browser_click`, `browser_type`, `browser_console`, etc. | Full browser automation via Browserbase |
| **Vision** | `vision_analyze` | Image analysis via multimodal models |
| **Image Gen** | `image_generate` | Generate images (FLUX via FAL) |
| **TTS** | `text_to_speech` | Text-to-speech (Edge TTS / ElevenLabs / OpenAI) |
| **Reasoning** | `mixture_of_agents` | Multi-model reasoning |
| **Skills** | `skills_list`, `skill_view`, `skill_manage` | Find, view, create, and manage skills |
| **Todo** | `todo` | Read/write task list for multi-step planning |
| **Memory** | `memory` | Persistent notes + user profile across sessions |
| **Session Search** | `session_search` | Search + summarize past conversations (FTS5) |
| **Cronjob** | `schedule_cronjob`, `list_cronjobs`, `remove_cronjob` | Scheduled task management |
| **Code Execution** | `execute_code` | Run Python scripts that call tools via RPC sandbox |
| **Delegation** | `delegate_task` | Spawn subagents with isolated context |
| **Clarify** | `clarify` | Ask the user multiple-choice or open-ended questions |
| **MCP** | Auto-discovered | External tools from MCP servers |
Hermes ships with a broad built-in tool registry covering web search, browser automation, terminal execution, file editing, memory, delegation, RL training, messaging delivery, Home Assistant, Honcho memory, and more.
High-level categories:
| Category | Examples | Description |
|----------|----------|-------------|
| **Web** | `web_search`, `web_extract` | Search the web and extract page content. |
| **Terminal & Files** | `terminal`, `process`, `read_file`, `patch` | Execute commands and manipulate files. |
| **Browser** | `browser_navigate`, `browser_snapshot`, `browser_vision` | Interactive browser automation with text and vision support. |
| **Media** | `vision_analyze`, `image_generate`, `text_to_speech` | Multimodal analysis and generation. |
| **Agent orchestration** | `todo`, `clarify`, `execute_code`, `delegate_task` | Planning, clarification, code execution, and subagent delegation. |
| **Memory & recall** | `memory`, `session_search`, `honcho_*` | Persistent memory, session search, and Honcho cross-session context. |
| **Automation & delivery** | `schedule_cronjob`, `send_message` | Scheduled tasks and outbound messaging delivery. |
| **Integrations** | `ha_*`, MCP server tools, `rl_*` | Home Assistant, MCP, RL training, and other integrations. |
For the authoritative code-derived registry, see [Built-in Tools Reference](/docs/reference/tools-reference) and [Toolsets Reference](/docs/reference/toolsets-reference).
## Using Toolsets
@ -43,7 +40,9 @@ hermes tools
hermes tools
```
**Available toolsets:** `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, and more.
Common toolsets include `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `honcho`, `homeassistant`, and `rl`.
See [Toolsets Reference](/docs/reference/toolsets-reference) for the full set, including platform presets such as `hermes-cli`, `hermes-telegram`, and dynamic MCP toolsets like `mcp-<server>`.
## Terminal Backends
@ -56,6 +55,7 @@ The terminal tool can execute commands in different environments:
| `ssh` | Remote server | Sandboxing, keep agent away from its own code |
| `singularity` | HPC containers | Cluster computing, rootless |
| `modal` | Cloud execution | Serverless, scale |
| `daytona` | Cloud sandbox workspace | Persistent remote dev environments |
### Configuration

View file

@ -202,7 +202,7 @@ Replace the ID with the actual channel ID (right-click → Copy Channel ID with
## Bot Behavior
- **Server channels**: The bot responds to all messages from allowed users in channels it can access. It does **not** require a mention or prefix — any message from an allowed user is treated as a prompt.
- **Server channels**: By default the bot requires an `@mention` before it responds in server channels. You can disable that globally with `DISCORD_REQUIRE_MENTION=false` or allow specific channels to be mention-free via `DISCORD_FREE_RESPONSE_CHANNELS`.
- **Direct messages**: DMs always work, even without the Message Content Intent enabled (Discord exempts DMs from this requirement). However, you should still enable the intent for server channel support.
- **Conversations**: Each channel or DM maintains its own conversation context.

View file

@ -130,33 +130,22 @@ The Home Assistant gateway adapter connects via WebSocket and subscribes to `sta
By default, **no events are forwarded**. You must configure at least one of `watch_domains`, `watch_entities`, or `watch_all` to receive events. Without filters, a warning is logged at startup and all state changes are silently dropped.
:::
Configure which events the agent sees in `~/.hermes/config.yaml` under the Home Assistant platform's `extra` section:
Configure which events the agent sees in `~/.hermes/gateway.json` under the Home Assistant platform's `extra` section:
```yaml
# ~/.hermes/config.yaml
messaging:
platforms:
homeassistant:
extra:
# Watch specific domains (recommended)
watch_domains:
- climate
- binary_sensor
- alarm_control_panel
- light
# Watch specific entities (in addition to domains)
watch_entities:
- sensor.front_door_battery
# Ignore noisy entities
ignore_entities:
- sensor.uptime
- sensor.cpu_usage
- sensor.memory_usage
# Per-entity cooldown (seconds)
cooldown_seconds: 30
```json
{
"platforms": {
"homeassistant": {
"enabled": true,
"extra": {
"watch_domains": ["climate", "binary_sensor", "alarm_control_panel", "light"],
"watch_entities": ["sensor.front_door_battery"],
"ignore_entities": ["sensor.uptime", "sensor.cpu_usage", "sensor.memory_usage"],
"cooldown_seconds": 30
}
}
}
}
```
| Setting | Default | Description |

View file

@ -62,7 +62,7 @@ hermes gateway status # Check service status
| Command | Description |
|---------|-------------|
| `/new` or `/reset` | Start fresh conversation |
| `/new` or `/reset` | Start a fresh conversation |
| `/model [provider:model]` | Show or change the model (supports `provider:model` syntax) |
| `/provider` | Show available providers with auth status |
| `/personality [name]` | Set a personality |
@ -72,8 +72,13 @@ hermes gateway status # Check service status
| `/stop` | Stop the running agent |
| `/sethome` | Set this chat as the home channel |
| `/compress` | Manually compress conversation context |
| `/title [name]` | Set or show the session title |
| `/resume [name]` | Resume a previously named session |
| `/usage` | Show token usage for this session |
| `/insights [days]` | Show usage insights and analytics |
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display |
| `/rollback [number]` | List or restore filesystem checkpoints |
| `/background <prompt>` | Run a prompt in a separate background session |
| `/reload-mcp` | Reload MCP servers from config |
| `/update` | Update Hermes Agent to the latest version |
| `/help` | Show available commands |
@ -92,7 +97,7 @@ Sessions reset based on configurable policies:
| Policy | Default | Description |
|--------|---------|-------------|
| Daily | 4:00 AM | Reset at a specific hour each day |
| Idle | 120 min | Reset after N minutes of inactivity |
| Idle | 1440 min | Reset after N minutes of inactivity |
| Both | (combined) | Whichever triggers first |
Configure per-platform overrides in `~/.hermes/gateway.json`:
@ -204,7 +209,7 @@ Each platform has its own toolset:
| Slack | `hermes-slack` | Full tools including terminal |
| Signal | `hermes-signal` | Full tools including terminal |
| Email | `hermes-email` | Full tools including terminal |
| Home Assistant | `hermes-gateway` | Full tools + HA device control (ha_list_entities, ha_get_state, ha_call_service, ha_list_services) |
| Home Assistant | `hermes-homeassistant` | Full tools + HA device control (ha_list_entities, ha_get_state, ha_call_service, ha_list_services) |
## Next Steps

View file

@ -192,8 +192,8 @@ The adapter monitors the SSE connection and automatically reconnects if:
| **Messages not received** | Check that `SIGNAL_ALLOWED_USERS` includes the sender's number in E.164 format (with `+` prefix) |
| **"signal-cli not found on PATH"** | Install signal-cli and ensure it's in your PATH, or use Docker |
| **Connection keeps dropping** | Check signal-cli logs for errors. Ensure Java 17+ is installed. |
| **Group messages ignored** | `SIGNAL_GROUP_POLICY` defaults to `disabled`. Set to `allowlist` or `open`. |
| **Bot responds to everyone** | Set `SIGNAL_DM_POLICY=pairing` or `allowlist` and configure `SIGNAL_ALLOWED_USERS` |
| **Group messages ignored** | Configure `SIGNAL_GROUP_ALLOWED_USERS` with specific group IDs, or `*` to allow all groups. |
| **Bot responds to no one** | Configure `SIGNAL_ALLOWED_USERS`, use DM pairing, or explicitly allow all users through gateway policy if you want broader access. |
| **Duplicate messages** | Ensure only one signal-cli instance is listening on your phone number |
---
@ -205,8 +205,8 @@ The adapter monitors the SSE connection and automatically reconnects if:
:::
- Phone numbers are redacted in all log output
- Use `SIGNAL_DM_POLICY=pairing` (default) for safe onboarding of new users
- Keep groups disabled unless you specifically need group support
- Use DM pairing or explicit allowlists for safe onboarding of new users
- Keep groups disabled unless you specifically need group support, or allowlist only the groups you trust
- Signal's end-to-end encryption protects message content in transit
- The signal-cli session data in `~/.local/share/signal-cli/` contains account credentials — protect it like a password

View file

@ -20,7 +20,7 @@ the steps below.
| Component | Value |
|-----------|-------|
| **Library** | `@slack/bolt` (Socket Mode) |
| **Library** | `slack-bolt` / `slack_sdk` for Python (Socket Mode) |
| **Connection** | WebSocket — no public URL required |
| **Auth tokens needed** | Bot Token (`xoxb-`) + App-Level Token (`xapp-`) |
| **User identification** | Slack Member IDs (e.g., `U01ABC2DEF3`) |

View file

@ -6,13 +6,10 @@ description: "Set up Hermes Agent as a WhatsApp bot via the built-in Baileys bri
# WhatsApp Setup
Hermes connects to WhatsApp through a built-in bridge using [whatsapp-web.js](https://github.com/pedroslopez/whatsapp-web.js)
(Baileys-based). This works by emulating a WhatsApp Web session — **not** through the official
WhatsApp Business API. No Meta developer account or Business verification is required.
Hermes connects to WhatsApp through a built-in bridge based on **Baileys**. This works by emulating a WhatsApp Web session — **not** through the official WhatsApp Business API. No Meta developer account or Business verification is required.
:::warning Unofficial API — Ban Risk
WhatsApp does **not** officially support third-party bots outside the Business API. Using
whatsapp-web.js carries a small risk of account restrictions. To minimize risk:
WhatsApp does **not** officially support third-party bots outside the Business API. Using a third-party bridge carries a small risk of account restrictions. To minimize risk:
- **Use a dedicated phone number** for the bot (not your personal number)
- **Don't send bulk/spam messages** — keep usage conversational
- **Don't automate outbound messaging** to people who haven't messaged first
@ -20,7 +17,7 @@ whatsapp-web.js carries a small risk of account restrictions. To minimize risk:
:::warning WhatsApp Web Protocol Updates
WhatsApp periodically updates their Web protocol, which can temporarily break compatibility
with whatsapp-web.js. When this happens, Hermes will update the bridge dependency. If the
with third-party bridges. When this happens, Hermes will update the bridge dependency. If the
bot stops working after a WhatsApp update, pull the latest Hermes version and re-pair.
:::
@ -38,21 +35,7 @@ bot stops working after a WhatsApp update, pull the latest Hermes version and re
- **Node.js v18+** and **npm** — the WhatsApp bridge runs as a Node.js process
- **A phone with WhatsApp** installed (for scanning the QR code)
**On Linux headless servers**, you also need Chromium/Puppeteer dependencies:
```bash
# Debian / Ubuntu
sudo apt-get install -y \
libnss3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
libxkbcommon0 libxcomposite1 libxdamage1 libxrandr2 libgbm1 \
libpango-1.0-0 libcairo2 libasound2 libxshmfence1
# Fedora / RHEL
sudo dnf install -y \
nss atk at-spi2-atk cups-libs libdrm libxkbcommon \
libXcomposite libXdamage libXrandr mesa-libgbm \
pango cairo alsa-lib
```
Unlike older browser-driven bridges, the current Baileys-based bridge does **not** require a local Chromium or Puppeteer dependency stack.
---
@ -112,9 +95,6 @@ Add the following to your `~/.hermes/.env` file:
WHATSAPP_ENABLED=true
WHATSAPP_MODE=bot # "bot" or "self-chat"
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers (with country code, no +)
# Optional
WHATSAPP_HOME_CONTACT=15551234567 # Default contact for proactive/scheduled messages
```
Then start the gateway:
@ -130,12 +110,11 @@ The gateway starts the WhatsApp bridge automatically using the saved session.
## Session Persistence
The whatsapp-web.js `LocalAuth` strategy saves your session to the `.wwebjs_auth` folder inside
your Hermes data directory (`~/.hermes/`). This means:
The Baileys bridge saves its session under `~/.hermes/whatsapp/session`. This means:
- **Sessions survive restarts** — you don't need to re-scan the QR code every time
- The session data includes encryption keys and device credentials
- **Do not share or commit the `.wwebjs_auth` folder** — it grants full access to the WhatsApp account
- **Do not share or commit this session directory** — it grants full access to the WhatsApp account
---
@ -170,9 +149,9 @@ Hermes supports voice on WhatsApp:
|---------|----------|
| **QR code not scanning** | Ensure terminal is wide enough (60+ columns). Try a different terminal. Make sure you're scanning from the correct WhatsApp account (bot number, not personal). |
| **QR code expires** | QR codes refresh every ~20 seconds. If it times out, restart `hermes whatsapp`. |
| **Session not persisting** | Check that `~/.hermes/.wwebjs_auth/` exists and is writable. On Docker, mount this as a volume. |
| **Logged out unexpectedly** | WhatsApp unlinks devices after ~14 days of phone inactivity. Keep the phone on and connected to WiFi. Re-pair with `hermes whatsapp`. |
| **"Execution context was destroyed"** | Chromium crashed. Install the Puppeteer dependencies listed in Prerequisites. On low-RAM servers, add swap space. |
| **Session not persisting** | Check that `~/.hermes/whatsapp/session` exists and is writable. If containerized, mount it as a persistent volume. |
| **Logged out unexpectedly** | WhatsApp unlinks devices after long inactivity. Keep the phone on and connected to the network, then re-pair with `hermes whatsapp` if needed. |
| **Bridge crashes or reconnect loops** | Restart the gateway, update Hermes, and re-pair if the session was invalidated by a WhatsApp protocol change. |
| **Bot stops working after WhatsApp update** | Update Hermes to get the latest bridge version, then re-pair. |
| **Messages not being received** | Verify `WHATSAPP_ALLOWED_USERS` includes the sender's number (with country code, no `+` or spaces). |
@ -186,8 +165,8 @@ of authorized users. Without this setting, the gateway will **deny all incoming
safety measure.
:::
- The `.wwebjs_auth` folder contains full session credentials — protect it like a password
- Set file permissions: `chmod 700 ~/.hermes/.wwebjs_auth`
- The `~/.hermes/whatsapp/session` directory contains full session credentials — protect it like a password
- Set file permissions: `chmod 700 ~/.hermes/whatsapp/session`
- Use a **dedicated phone number** for the bot to isolate risk from your personal account
- If you suspect compromise, unlink the device from WhatsApp → Settings → Linked Devices
- Phone numbers in logs are partially redacted, but review your log retention policy

View file

@ -271,7 +271,7 @@ Total messages: 3847
Database size: 12.4 MB
```
For deeper analytics — token usage, cost estimates, tool breakdown, and activity patterns — use [`hermes insights`](/docs/reference/cli-commands#insights).
For deeper analytics — token usage, cost estimates, tool breakdown, and activity patterns — use [`hermes insights`](/docs/reference/cli-commands#hermes-insights).
## Session Search Tool