hermes-agent/website/docs/user-guide/messaging/whatsapp.md
Teknium c58e16757a
docs: fix 40+ discrepancies between documentation and codebase (#5818)
Comprehensive audit of all ~100 doc pages against the actual code, fixing:

Reference docs:
- HERMES_API_TIMEOUT default 900 -> 1800 (env-vars)
- TERMINAL_DOCKER_IMAGE default python:3.11 -> nikolaik/python-nodejs (env-vars)
- compression.summary_model default shown as gemini -> actually empty string (env-vars)
- Add missing GOOGLE_API_KEY, GEMINI_API_KEY, GEMINI_BASE_URL env vars (env-vars)
- Add missing /branch (/fork) slash command (slash-commands)
- Fix hermes-cli tool count 39 -> 38 (toolsets-reference)
- Fix hermes-api-server drop list to include text_to_speech (toolsets-reference)
- Fix total tool count 47 -> 48, standalone 14 -> 15 (tools-reference)

User guide:
- web_extract.timeout default 30 -> 360 (configuration)
- Remove display.theme_mode (not implemented in code) (configuration)
- Remove display.background_process_notifications (not in defaults) (configuration)
- Browser inactivity timeout 300/5min -> 120/2min (browser)
- Screenshot path browser_screenshots -> cache/screenshots (browser)
- batch_runner default model claude-sonnet-4-20250514 -> claude-sonnet-4.6
- Add minimax to TTS provider list (voice-mode)
- Remove credential_pool_strategies from auth.json example (credential-pools)
- Fix Slack token path platforms/slack/ -> root ~/.hermes/ (slack)
- Fix Matrix store path for new installs (matrix)
- Fix WhatsApp session path for new installs (whatsapp)
- Fix HomeAssistant config from gateway.json to config.yaml (homeassistant)
- Fix WeCom gateway start command (wecom)

Developer guide:
- Fix tool/toolset counts in architecture overview
- Update line counts: main.py ~5500, setup.py ~3100, run.py ~7500, mcp_tool ~2200
- Replace nonexistent agent/memory_store.py with memory_manager.py + memory_provider.py
- Update _discover_tools() list: remove honcho_tools, add skill_manager_tool
- Add session_search and delegate_task to intercepted tools list (agent-loop)
- Fix budget warning: two-tier system (70% caution, 90% warning) (agent-loop)
- Fix gateway auth order (per-platform first, global last) (gateway-internals)
- Fix email_adapter.py -> email.py, add webhook.py + api_server.py (gateway-internals)
- Add 7 missing providers to provider-runtime list

Other:
- Add Docker --cap-add entries to security doc
- Fix Python version 3.10+ -> 3.11+ (contributing)
- Fix AGENTS.md discovery claim (not hierarchical walk) (tips)
- Fix cron 'add' -> canonical 'create' (cron-internals)
- Add pre_api_request/post_api_request hooks to plugin guide
- Add Google/Gemini provider to providers page
- Clarify OPENAI_BASE_URL deprecation (providers)
2026-04-07 10:17:44 -07:00

213 lines
9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
sidebar_position: 5
title: "WhatsApp"
description: "Set up Hermes Agent as a WhatsApp bot via the built-in Baileys bridge"
---
# WhatsApp Setup
Hermes connects to WhatsApp through a built-in bridge based on **Baileys**. This works by emulating a WhatsApp Web session — **not** through the official WhatsApp Business API. No Meta developer account or Business verification is required.
:::warning Unofficial API — Ban Risk
WhatsApp does **not** officially support third-party bots outside the Business API. Using a third-party bridge carries a small risk of account restrictions. To minimize risk:
- **Use a dedicated phone number** for the bot (not your personal number)
- **Don't send bulk/spam messages** — keep usage conversational
- **Don't automate outbound messaging** to people who haven't messaged first
:::
:::warning WhatsApp Web Protocol Updates
WhatsApp periodically updates their Web protocol, which can temporarily break compatibility
with third-party bridges. When this happens, Hermes will update the bridge dependency. If the
bot stops working after a WhatsApp update, pull the latest Hermes version and re-pair.
:::
## Two Modes
| Mode | How it works | Best for |
|------|-------------|----------|
| **Separate bot number** (recommended) | Dedicate a phone number to the bot. People message that number directly. | Clean UX, multiple users, lower ban risk |
| **Personal self-chat** | Use your own WhatsApp. You message yourself to talk to the agent. | Quick setup, single user, testing |
---
## Prerequisites
- **Node.js v18+** and **npm** — the WhatsApp bridge runs as a Node.js process
- **A phone with WhatsApp** installed (for scanning the QR code)
Unlike older browser-driven bridges, the current Baileys-based bridge does **not** require a local Chromium or Puppeteer dependency stack.
---
## Step 1: Run the Setup Wizard
```bash
hermes whatsapp
```
The wizard will:
1. Ask which mode you want (**bot** or **self-chat**)
2. Install bridge dependencies if needed
3. Display a **QR code** in your terminal
4. Wait for you to scan it
**To scan the QR code:**
1. Open WhatsApp on your phone
2. Go to **Settings → Linked Devices**
3. Tap **Link a Device**
4. Point your camera at the terminal QR code
Once paired, the wizard confirms the connection and exits. Your session is saved automatically.
:::tip
If the QR code looks garbled, make sure your terminal is at least 60 columns wide and supports
Unicode. You can also try a different terminal emulator.
:::
---
## Step 2: Getting a Second Phone Number (Bot Mode)
For bot mode, you need a phone number that isn't already registered with WhatsApp. Three options:
| Option | Cost | Notes |
|--------|------|-------|
| **Google Voice** | Free | US only. Get a number at [voice.google.com](https://voice.google.com). Verify WhatsApp via SMS through the Google Voice app. |
| **Prepaid SIM** | $515 one-time | Any carrier. Activate, verify WhatsApp, then the SIM can sit in a drawer. Number must stay active (make a call every 90 days). |
| **VoIP services** | Free$5/month | TextNow, TextFree, or similar. Some VoIP numbers are blocked by WhatsApp — try a few if the first doesn't work. |
After getting the number:
1. Install WhatsApp on a phone (or use WhatsApp Business app with dual-SIM)
2. Register the new number with WhatsApp
3. Run `hermes whatsapp` and scan the QR code from that WhatsApp account
---
## Step 3: Configure Hermes
Add the following to your `~/.hermes/.env` file:
```bash
# Required
WHATSAPP_ENABLED=true
WHATSAPP_MODE=bot # "bot" or "self-chat"
# Access control — pick ONE of these options:
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers (with country code, no +)
# WHATSAPP_ALLOWED_USERS=* # OR use * to allow everyone
# WHATSAPP_ALLOW_ALL_USERS=true # OR set this flag instead (same effect as *)
```
:::tip Allow-all shorthand
Setting `WHATSAPP_ALLOWED_USERS=*` allows **all** senders (equivalent to `WHATSAPP_ALLOW_ALL_USERS=true`).
This is consistent with [Signal group allowlists](/docs/reference/environment-variables).
To use the pairing flow instead, remove both variables and rely on the
[DM pairing system](/docs/user-guide/security#dm-pairing-system).
:::
Optional behavior settings in `~/.hermes/config.yaml`:
```yaml
unauthorized_dm_behavior: pair
whatsapp:
unauthorized_dm_behavior: ignore
```
- `unauthorized_dm_behavior: pair` is the global default. Unknown DM senders get a pairing code.
- `whatsapp.unauthorized_dm_behavior: ignore` makes WhatsApp stay silent for unauthorized DMs, which is usually the better choice for a private number.
Then start the gateway:
```bash
hermes gateway # Foreground
hermes gateway install # Install as a user service
sudo hermes gateway install --system # Linux only: boot-time system service
```
The gateway starts the WhatsApp bridge automatically using the saved session.
---
## Session Persistence
The Baileys bridge saves its session under `~/.hermes/platforms/whatsapp/session`. This means:
- **Sessions survive restarts** — you don't need to re-scan the QR code every time
- The session data includes encryption keys and device credentials
- **Do not share or commit this session directory** — it grants full access to the WhatsApp account
---
## Re-pairing
If the session breaks (phone reset, WhatsApp update, manually unlinked), you'll see connection
errors in the gateway logs. To fix it:
```bash
hermes whatsapp
```
This generates a fresh QR code. Scan it again and the session is re-established. The gateway
handles **temporary** disconnections (network blips, phone going offline briefly) automatically
with reconnection logic.
---
## Voice Messages
Hermes supports voice on WhatsApp:
- **Incoming:** Voice messages (`.ogg` opus) are automatically transcribed using the configured STT provider: local `faster-whisper`, Groq Whisper (`GROQ_API_KEY`), or OpenAI Whisper (`VOICE_TOOLS_OPENAI_KEY`)
- **Outgoing:** TTS responses are sent as MP3 audio file attachments
- Agent responses are prefixed with "⚕ **Hermes Agent**" by default. You can customize or disable this in `config.yaml`:
```yaml
# ~/.hermes/config.yaml
whatsapp:
reply_prefix: "" # Empty string disables the header
# reply_prefix: "🤖 *My Bot*\n──────\n" # Custom prefix (supports \n for newlines)
```
---
## Troubleshooting
| Problem | Solution |
|---------|----------|
| **QR code not scanning** | Ensure terminal is wide enough (60+ columns). Try a different terminal. Make sure you're scanning from the correct WhatsApp account (bot number, not personal). |
| **QR code expires** | QR codes refresh every ~20 seconds. If it times out, restart `hermes whatsapp`. |
| **Session not persisting** | Check that `~/.hermes/platforms/whatsapp/session` exists and is writable. If containerized, mount it as a persistent volume. |
| **Logged out unexpectedly** | WhatsApp unlinks devices after long inactivity. Keep the phone on and connected to the network, then re-pair with `hermes whatsapp` if needed. |
| **Bridge crashes or reconnect loops** | Restart the gateway, update Hermes, and re-pair if the session was invalidated by a WhatsApp protocol change. |
| **Bot stops working after WhatsApp update** | Update Hermes to get the latest bridge version, then re-pair. |
| **macOS: "Node.js not installed" but node works in terminal** | launchd services don't inherit your shell PATH. Run `hermes gateway install` to re-snapshot your current PATH into the plist, then `hermes gateway start`. See the [Gateway Service docs](./index.md#macos-launchd) for details. |
| **Messages not being received** | Verify `WHATSAPP_ALLOWED_USERS` includes the sender's number (with country code, no `+` or spaces), or set it to `*` to allow everyone. Set `WHATSAPP_DEBUG=true` in `.env` and restart the gateway to see raw message events in `bridge.log`. |
| **Bot replies to strangers with a pairing code** | Set `whatsapp.unauthorized_dm_behavior: ignore` in `~/.hermes/config.yaml` if you want unauthorized DMs to be silently ignored instead. |
---
## Security
:::warning
**Configure access control** before going live. Set `WHATSAPP_ALLOWED_USERS` with specific
phone numbers (including country code, without the `+`), use `*` to allow everyone, or set
`WHATSAPP_ALLOW_ALL_USERS=true`. Without any of these, the gateway **denies all incoming
messages** as a safety measure.
:::
By default, unauthorized DMs still receive a pairing code reply. If you want a private WhatsApp number to stay completely silent to strangers, set:
```yaml
whatsapp:
unauthorized_dm_behavior: ignore
```
- The `~/.hermes/platforms/whatsapp/session` directory contains full session credentials — protect it like a password
- Set file permissions: `chmod 700 ~/.hermes/platforms/whatsapp/session`
- Use a **dedicated phone number** for the bot to isolate risk from your personal account
- If you suspect compromise, unlink the device from WhatsApp → Settings → Linked Devices
- Phone numbers in logs are partially redacted, but review your log retention policy