mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-28 01:21:43 +00:00
docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087)
Reference docs fixes:
- cli-commands.md: remove non-existent --provider alibaba, add hermes profile/completion/plugins/mcp to top-level table, add --profile/-p global flag, add --source chat option
- slash-commands.md: add /yolo and /commands, fix /q alias conflict (resolves to /queue not /quit), add missing aliases (/bg, /set-home, /reload_mcp, /gateway)
- toolsets-reference.md: fix hermes-api-server (not same as hermes-cli, omits clarify/send_message/text_to_speech)
- profile-commands.md: fix show name required not optional, --clone-from not --from, add --remove/--name to alias, fix alias path, fix export/import arg types, remove non-existent fish completion
- tools-reference.md: add EXA_API_KEY to web tools requires_env
- mcp-config-reference.md: add auth key for OAuth, tool name sanitization
- environment-variables.md: add EXA_API_KEY, update provider values
- plugins.md: remove non-existent ctx.register_command(), add ctx.inject_message()

Feature docs additions:
- security.md: add /yolo mode, approval modes (manual/smart/off), configurable timeout, expanded dangerous patterns table
- cron.md: add wrap_response config, [SILENT] suppression
- mcp.md: add dynamic tool discovery, MCP sampling support
- cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length
- docker.md: add skills/credential file mounting

Messaging platform docs:
- telegram.md: add webhook mode, DoH fallback IPs
- slack.md: add multi-workspace OAuth support
- discord.md: add DISCORD_IGNORE_NO_MENTION
- matrix.md: add MSC3245 native voice messages
- feishu.md: expand from 129 to 365 lines (encrypt key, verification token, group policy, card actions, media, rate limiting, markdown, troubleshooting)
- wecom.md: expand from 86 to 264 lines (per-group allowlists, media, AES decryption, stream replies, reconnection, troubleshooting)

Configuration docs:
- quickstart.md: add DeepSeek, Copilot, Copilot ACP providers
- configuration.md: add DeepSeek provider, Exa web backend, terminal env_passthrough/images, browser.command_timeout, compression params, discord config, security/tirith config, timezone, auxiliary models

21 files changed, ~1000 lines added
parent 3c8f910973
commit 7e0c2c3ce3
21 changed files with 1004 additions and 83 deletions
|
|
@ -19,6 +19,7 @@ Before setup, here's the part most people want to know: how Hermes behaves once
|
|||
| **Free-response channels** | You can make specific channels mention-free with `DISCORD_FREE_RESPONSE_CHANNELS`, or disable mentions globally with `DISCORD_REQUIRE_MENTION=false`. |
|
||||
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. Threads stay isolated from the parent channel for session history. |
|
||||
| **Shared channels with multiple users** | By default, Hermes isolates session history per user inside the channel for safety and clarity. Two people talking in the same channel do not share one transcript unless you explicitly disable that. |
|
||||
| **Messages mentioning other users** | When `DISCORD_IGNORE_NO_MENTION` is `true` (the default), Hermes stays silent if a message @mentions other users but does **not** mention the bot. This prevents the bot from jumping into conversations directed at other people. Set to `false` if you want the bot to respond to all messages regardless of who is mentioned. This only applies in server channels, not DMs. |
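
The `DISCORD_IGNORE_NO_MENTION` rule can be read as a small predicate. The following is a minimal sketch of that rule only, not the adapter's actual code; the function name and parameters are hypothetical, and the other gates (`DISCORD_REQUIRE_MENTION`, free-response channels) are deliberately left out.

```python
def passes_ignore_no_mention(mentioned_ids, bot_id, is_dm, ignore_no_mention=True):
    """Return False when a server message @mentions other users but not the bot.

    Hypothetical helper: models only the DISCORD_IGNORE_NO_MENTION gate.
    """
    if is_dm:
        return True  # the rule applies in server channels only, not DMs
    if not ignore_no_mention:
        return True  # DISCORD_IGNORE_NO_MENTION=false disables the gate
    mentions_bot = bot_id in mentioned_ids
    mentions_others = any(uid != bot_id for uid in mentioned_ids)
    return not (mentions_others and not mentions_bot)
```

A message that mentions nobody passes this gate; whether Hermes then replies still depends on the mention requirements described in the other rows.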
|
||||
|
||||
:::tip
|
||||
If you want a normal bot-help channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS`.
|
||||
|
|
@ -253,6 +254,9 @@ DISCORD_ALLOWED_USERS=284102345871466496
|
|||
|
||||
# Optional: channels where bot responds without @mention (comma-separated channel IDs)
|
||||
# DISCORD_FREE_RESPONSE_CHANNELS=1234567890,9876543210
|
||||
|
||||
# Optional: ignore messages that @mention other users but NOT the bot (default: true)
|
||||
# DISCORD_IGNORE_NO_MENTION=true
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
|
|
|||
|
|
@ -18,7 +18,7 @@ The integration supports both connection modes:
|
|||
| Context | Behavior |
|
||||
|---------|----------|
|
||||
| Direct messages | Hermes responds to every message. |
|
||||
| Group chats | Hermes responds when the bot is addressed in the chat. |
|
||||
| Group chats | Hermes responds only when the bot is @mentioned in the chat. |
|
||||
| Shared group chats | By default, session history is isolated per user inside a shared chat. |
|
||||
|
||||
This shared-chat behavior is controlled by `config.yaml`:
|
||||
|
|
@ -46,12 +46,16 @@ Keep the App Secret private. Anyone with it can impersonate your app.
|
|||
|
||||
### Recommended: WebSocket mode
|
||||
|
||||
Use WebSocket mode when Hermes runs on your laptop, workstation, or a private server. No public URL is required.
|
||||
Use WebSocket mode when Hermes runs on your laptop, workstation, or a private server. No public URL is required. The official Lark SDK opens and maintains a persistent outbound WebSocket connection with automatic reconnection.
|
||||
|
||||
```bash
|
||||
FEISHU_CONNECTION_MODE=websocket
|
||||
```
|
||||
|
||||
**Requirements:** The `websockets` Python package must be installed. The SDK handles connection lifecycle, heartbeats, and auto-reconnection internally.
|
||||
|
||||
**How it works:** The adapter runs the Lark SDK's WebSocket client in a background executor thread. Inbound events (messages, reactions, card actions) are dispatched to the main asyncio loop. On disconnect, the SDK will attempt to reconnect automatically.
|
||||
|
||||
### Optional: Webhook mode
|
||||
|
||||
Use webhook mode only when you already run Hermes behind a reachable HTTP endpoint.
|
||||
|
|
@ -60,12 +64,24 @@ Use webhook mode only when you already run Hermes behind a reachable HTTP endpoi
|
|||
FEISHU_CONNECTION_MODE=webhook
|
||||
```
|
||||
|
||||
In webhook mode, Hermes serves a Feishu endpoint at:
|
||||
In webhook mode, Hermes starts an HTTP server (via `aiohttp`) and serves a Feishu endpoint at:
|
||||
|
||||
```text
|
||||
/feishu/webhook
|
||||
```
|
||||
|
||||
**Requirements:** The `aiohttp` Python package must be installed.
|
||||
|
||||
You can customize the webhook server bind address and path:
|
||||
|
||||
```bash
|
||||
FEISHU_WEBHOOK_HOST=127.0.0.1 # default: 127.0.0.1
|
||||
FEISHU_WEBHOOK_PORT=8765 # default: 8765
|
||||
FEISHU_WEBHOOK_PATH=/feishu/webhook # default: /feishu/webhook
|
||||
```
|
||||
|
||||
When Feishu sends a URL verification challenge (`type: url_verification`), the webhook responds automatically so you can complete the subscription setup in the Feishu developer console.
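
As an illustration of the challenge handshake, here is a minimal sketch that assumes the challenge body carries `type` and `challenge` fields and that the expected reply echoes the challenge back as JSON. The function name is hypothetical and is not taken from the adapter.

```python
import json


def handle_url_verification(raw_body: bytes):
    """Return the reply payload for a URL verification request, else None.

    Hypothetical helper: assumes the reply is {"challenge": <value>}.
    """
    payload = json.loads(raw_body)
    if payload.get("type") != "url_verification":
        return None  # a regular event; handle it elsewhere
    return {"challenge": payload.get("challenge", "")}
```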
|
||||
|
||||
## Step 3: Configure Hermes
|
||||
|
||||
### Option A: Interactive Setup
|
||||
|
|
@ -116,13 +132,233 @@ FEISHU_HOME_CHANNEL=oc_xxx
|
|||
|
||||
## Security
|
||||
|
||||
For production use, set an allowlist:
|
||||
### User Allowlist
|
||||
|
||||
For production use, set an allowlist of Feishu Open IDs:
|
||||
|
||||
```bash
|
||||
FEISHU_ALLOWED_USERS=ou_xxx,ou_yyy
|
||||
```
|
||||
|
||||
If you leave the allowlist empty, anyone who can reach the bot may be able to use it.
|
||||
If you leave the allowlist empty, anyone who can reach the bot may be able to use it. In group chats, the allowlist is checked against the sender's open_id before the message is processed.
|
||||
|
||||
### Webhook Encryption Key
|
||||
|
||||
When running in webhook mode, set an encryption key to enable signature verification of inbound webhook payloads:
|
||||
|
||||
```bash
|
||||
FEISHU_ENCRYPT_KEY=your-encrypt-key
|
||||
```
|
||||
|
||||
This key is found in the **Event Subscriptions** section of your Feishu app configuration. When set, the adapter verifies every webhook request using the signature algorithm:
|
||||
|
||||
```
|
||||
SHA256(timestamp + nonce + encrypt_key + body)
|
||||
```
|
||||
|
||||
The computed hash is compared against the `x-lark-signature` header using timing-safe comparison. Requests with invalid or missing signatures are rejected with HTTP 401.
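
The check described above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions rather than the adapter's code: the function names are hypothetical, and the timestamp and nonce header names (`x-lark-request-timestamp`, `x-lark-request-nonce`) are assumed, since only `x-lark-signature` is named here.

```python
import hashlib
import hmac


def compute_signature(timestamp: str, nonce: str, encrypt_key: str, body: bytes) -> str:
    """SHA256(timestamp + nonce + encrypt_key + body) as a hex digest."""
    prefix = (timestamp + nonce + encrypt_key).encode("utf-8")
    return hashlib.sha256(prefix + body).hexdigest()


def is_valid_signature(headers: dict, encrypt_key: str, body: bytes) -> bool:
    """Compare the computed digest with the header value in constant time."""
    expected = compute_signature(
        headers.get("x-lark-request-timestamp", ""),  # assumed header name
        headers.get("x-lark-request-nonce", ""),      # assumed header name
        encrypt_key,
        body,
    )
    provided = headers.get("x-lark-signature", "")
    # A missing header yields an empty string, which never matches a digest.
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters matched.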
|
||||
|
||||
:::tip
|
||||
In WebSocket mode, signature verification is handled by the SDK itself, so `FEISHU_ENCRYPT_KEY` is optional. In webhook mode, it is strongly recommended for production.
|
||||
:::
|
||||
|
||||
### Verification Token
|
||||
|
||||
An additional layer of authentication that checks the `token` field inside webhook payloads:
|
||||
|
||||
```bash
|
||||
FEISHU_VERIFICATION_TOKEN=your-verification-token
|
||||
```
|
||||
|
||||
This token is also found in the **Event Subscriptions** section of your Feishu app. When set, every inbound webhook payload must contain a matching `token` in its `header` object. Mismatched tokens are rejected with HTTP 401.
|
||||
|
||||
Both `FEISHU_ENCRYPT_KEY` and `FEISHU_VERIFICATION_TOKEN` can be used together for defense in depth.
|
||||
|
||||
## Group Message Policy
|
||||
|
||||
The `FEISHU_GROUP_POLICY` environment variable controls whether and how Hermes responds in group chats:
|
||||
|
||||
```bash
|
||||
FEISHU_GROUP_POLICY=allowlist # default
|
||||
```
|
||||
|
||||
| Value | Behavior |
|
||||
|-------|----------|
|
||||
| `open` | Hermes responds to @mentions from any user in any group. |
|
||||
| `allowlist` | Hermes only responds to @mentions from users listed in `FEISHU_ALLOWED_USERS`. |
|
||||
| `disabled` | Hermes ignores all group messages entirely. |
|
||||
|
||||
In all modes, the bot must be explicitly @mentioned (or @all) in the group before the message is processed. Direct messages bypass this gate.
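
The policy table and the @mention gate combine roughly as follows. This is a minimal sketch with a hypothetical function name; failing closed on an unknown policy value is an assumption, not documented behavior.

```python
def should_process_group_message(policy, sender_open_id, allowed_users, bot_mentioned):
    """Decide whether a group message should reach the agent.

    Hypothetical helper mirroring the FEISHU_GROUP_POLICY table.
    """
    if policy == "disabled":
        return False  # all group messages are ignored
    if not bot_mentioned:
        return False  # the @mention gate applies in every mode
    if policy == "open":
        return True
    if policy == "allowlist":
        return sender_open_id in allowed_users
    return False  # unknown value: fail closed (assumption)
```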
|
||||
|
||||
### Bot Identity for @Mention Gating
|
||||
|
||||
For precise @mention detection in groups, the adapter needs to know the bot's identity. It can be provided explicitly:
|
||||
|
||||
```bash
|
||||
FEISHU_BOT_OPEN_ID=ou_xxx
|
||||
FEISHU_BOT_USER_ID=xxx
|
||||
FEISHU_BOT_NAME=MyBot
|
||||
```
|
||||
|
||||
If none of these are set, the adapter will attempt to auto-discover the bot name via the Application Info API on startup. For this to work, grant the `admin:app.info:readonly` or `application:application:self_manage` permission scope.
|
||||
|
||||
## Interactive Card Actions
|
||||
|
||||
When users click buttons or interact with interactive cards sent by the bot, the adapter routes these as synthetic `/card` command events:
|
||||
|
||||
- Button clicks become: `/card button {"key": "value", ...}`
|
||||
- The action's `value` payload from the card definition is included as JSON.
|
||||
- Card actions are deduplicated with a 15-minute window to prevent double processing.
|
||||
|
||||
Card action events are dispatched with `MessageType.COMMAND`, so they flow through the normal command processing pipeline.
|
||||
|
||||
To use this feature, enable the **Interactive Card** event in your Feishu app's event subscriptions (`card.action.trigger`).
|
||||
|
||||
## Media Support
|
||||
|
||||
### Inbound (receiving)
|
||||
|
||||
The adapter receives and caches the following media types from users:
|
||||
|
||||
| Type | Extensions | How it's processed |
|
||||
|------|-----------|-------------------|
|
||||
| **Images** | .jpg, .jpeg, .png, .gif, .webp, .bmp | Downloaded via Feishu API and cached locally |
|
||||
| **Audio** | .ogg, .mp3, .wav, .m4a, .aac, .flac, .opus, .webm | Downloaded and cached; small text files are auto-extracted |
|
||||
| **Video** | .mp4, .mov, .avi, .mkv, .webm, .m4v, .3gp | Downloaded and cached as documents |
|
||||
| **Files** | .pdf, .doc, .docx, .xls, .xlsx, .ppt, .pptx, and more | Downloaded and cached as documents |
|
||||
|
||||
Media from rich-text (post) messages, including inline images and file attachments, is also extracted and cached.
|
||||
|
||||
For small text-based documents (.txt, .md), the file content is automatically injected into the message text so the agent can read it directly without needing tools.
|
||||
|
||||
### Outbound (sending)
|
||||
|
||||
| Method | What it sends |
|
||||
|--------|--------------|
|
||||
| `send` | Text or rich post messages (auto-detected based on markdown content) |
|
||||
| `send_image` / `send_image_file` | Uploads image to Feishu, then sends as native image bubble (with optional caption) |
|
||||
| `send_document` | Uploads file to Feishu API, then sends as file attachment |
|
||||
| `send_voice` | Uploads audio file as a Feishu file attachment |
|
||||
| `send_video` | Uploads video and sends as native media message |
|
||||
| `send_animation` | GIFs are downgraded to file attachments (Feishu has no native GIF bubble) |
|
||||
|
||||
File upload routing is automatic based on extension:
|
||||
|
||||
- `.ogg`, `.opus` → uploaded as `opus` audio
|
||||
- `.mp4`, `.mov`, `.avi`, `.m4v` → uploaded as `mp4` media
|
||||
- `.pdf`, `.doc(x)`, `.xls(x)`, `.ppt(x)` → uploaded with their document type
|
||||
- Everything else → uploaded as a generic stream file
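
The routing rules above amount to a lookup on the file extension. A minimal sketch follows; the function name is hypothetical, and the document type strings (`pdf`, `doc`, `xls`, `ppt`, with the `x` variants mapped to the same value) are an assumption about what "their document type" means.

```python
from pathlib import Path

# Assumed mapping from extension to document upload type.
_DOC_TYPES = {
    ".pdf": "pdf",
    ".doc": "doc", ".docx": "doc",
    ".xls": "xls", ".xlsx": "xls",
    ".ppt": "ppt", ".pptx": "ppt",
}


def upload_file_type(filename: str) -> str:
    """Pick the upload type for a file based on its extension."""
    ext = Path(filename).suffix.lower()
    if ext in {".ogg", ".opus"}:
        return "opus"
    if ext in {".mp4", ".mov", ".avi", ".m4v"}:
        return "mp4"
    return _DOC_TYPES.get(ext, "stream")  # everything else: generic stream
```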
|
||||
|
||||
## Markdown Rendering and Post Fallback
|
||||
|
||||
When outbound text contains markdown formatting (headings, bold, lists, code blocks, links, etc.), the adapter automatically sends it as a Feishu **post** message with an embedded `md` tag rather than as plain text. This enables rich rendering in the Feishu client.
|
||||
|
||||
If the Feishu API rejects the post payload (e.g., due to unsupported markdown constructs), the adapter automatically falls back to sending as plain text with markdown stripped. This two-stage fallback ensures messages are always delivered.
|
||||
|
||||
Plain text messages (no markdown detected) are sent as the simple `text` message type.
|
||||
|
||||
## ACK Emoji Reactions
|
||||
|
||||
When the adapter receives an inbound message, it immediately adds an ✅ (OK) emoji reaction to signal that the message was received and is being processed. This provides visual feedback before the agent completes its response.
|
||||
|
||||
The reaction is persistent — it remains on the message after the response is sent, serving as a receipt marker.
|
||||
|
||||
User reactions on bot messages are also tracked. If a user adds or removes an emoji reaction on a message sent by the bot, it is routed as a synthetic text event (`reaction:added:EMOJI_TYPE` or `reaction:removed:EMOJI_TYPE`) so the agent can respond to feedback.
|
||||
|
||||
## Burst Protection and Batching
|
||||
|
||||
The adapter includes debouncing for rapid message bursts to avoid overwhelming the agent:
|
||||
|
||||
### Text Batching
|
||||
|
||||
When a user sends multiple text messages in quick succession, they are merged into a single event before being dispatched:
|
||||
|
||||
| Setting | Env Var | Default |
|
||||
|---------|---------|---------|
|
||||
| Quiet period | `HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS` | 0.6s |
|
||||
| Max messages per batch | `HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES` | 8 |
|
||||
| Max characters per batch | `HERMES_FEISHU_TEXT_BATCH_MAX_CHARS` | 4000 |
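
To make the interaction of the three settings concrete, here is a minimal, timer-free sketch of a debouncing batcher. The class and method names are hypothetical, timestamps are passed in explicitly, and joining messages with newlines is an assumption; the real adapter may merge and schedule differently.

```python
class TextBatcher:
    """Merge rapid text messages until a quiet period or a cap is reached."""

    def __init__(self, delay=0.6, max_messages=8, max_chars=4000):
        self.delay = delay
        self.max_messages = max_messages
        self.max_chars = max_chars
        self._parts = []
        self._last_at = 0.0

    def add(self, text, now):
        """Buffer a message; return the merged batch if a cap is hit, else None."""
        self._parts.append(text)
        self._last_at = now
        total_chars = sum(len(p) for p in self._parts)
        if len(self._parts) >= self.max_messages or total_chars >= self.max_chars:
            return self._flush()
        return None

    def poll(self, now):
        """Return the merged batch once the quiet period has elapsed, else None."""
        if self._parts and now - self._last_at >= self.delay:
            return self._flush()
        return None

    def _flush(self):
        merged = "\n".join(self._parts)  # join style is an assumption
        self._parts = []
        return merged
```

Each new message resets the quiet period, so a batch is dispatched only after the sender pauses or one of the caps forces an early flush.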
|
||||
|
||||
### Media Batching
|
||||
|
||||
Multiple media attachments sent in quick succession (e.g., dragging several images) are merged into a single event:
|
||||
|
||||
| Setting | Env Var | Default |
|
||||
|---------|---------|---------|
|
||||
| Quiet period | `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | 0.8s |
|
||||
|
||||
### Per-Chat Serialization
|
||||
|
||||
Messages within the same chat are processed serially (one at a time) to maintain conversation coherence. Each chat has its own lock, so messages in different chats are processed concurrently.
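
One common way to get this behavior is a lock per chat ID. The sketch below illustrates the pattern with `asyncio`; the class name is hypothetical and the adapter's own implementation may differ.

```python
import asyncio
from collections import defaultdict


class ChatSerializer:
    """Run handlers one at a time per chat, concurrently across chats."""

    def __init__(self):
        # One lock per chat ID, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    async def run(self, chat_id, handler, *args):
        async with self._locks[chat_id]:
            return await handler(*args)
```

Two calls to `run` with the same `chat_id` execute back to back, while a call for a different chat does not wait for either of them.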
|
||||
|
||||
## Rate Limiting (Webhook Mode)
|
||||
|
||||
In webhook mode, the adapter rate-limits inbound requests per client IP (keyed together with app ID and path) to protect against abuse:
|
||||
|
||||
- **Window:** 60-second sliding window
|
||||
- **Limit:** 120 requests per window per (app_id, path, IP) triple
|
||||
- **Tracking cap:** Up to 4096 unique keys tracked (prevents unbounded memory growth)
|
||||
|
||||
Requests that exceed the limit receive HTTP 429 (Too Many Requests).
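
A sliding-window limiter with a bounded key table can be sketched as follows. The class name is hypothetical, timestamps are passed in explicitly, and evicting the least recently used key when the cap is reached is an assumption; the text above only says the number of tracked keys is capped.

```python
from collections import OrderedDict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per key."""

    def __init__(self, limit=120, window=60.0, max_keys=4096):
        self.limit = limit
        self.window = window
        self.max_keys = max_keys
        self._hits = OrderedDict()  # key -> deque of request timestamps

    def allow(self, key, now):
        hits = self._hits.get(key)
        if hits is None:
            if len(self._hits) >= self.max_keys:
                self._hits.popitem(last=False)  # evict least recently used key (assumption)
            hits = self._hits[key] = deque()
        else:
            self._hits.move_to_end(key)
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would answer with HTTP 429
        hits.append(now)
        return True
```

A key here would be the `(app_id, path, ip)` triple, for example `limiter.allow(("cli_xxx", "/feishu/webhook", "203.0.113.7"), time.monotonic())`, where the values are placeholders.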
|
||||
|
||||
### Webhook Anomaly Tracking
|
||||
|
||||
The adapter tracks consecutive error responses per IP address. After 25 consecutive errors from the same IP within a 6-hour window, a warning is logged. This helps detect misconfigured clients or probing attempts.
|
||||
|
||||
Additional webhook protections:
|
||||
- **Body size limit:** 1 MB maximum
|
||||
- **Body read timeout:** 30 seconds
|
||||
- **Content-Type enforcement:** Only `application/json` is accepted
|
||||
|
||||
## Deduplication
|
||||
|
||||
Inbound messages are deduplicated using message IDs with a 24-hour TTL. The dedup state is persisted across restarts to `~/.hermes/feishu_seen_message_ids.json`.
|
||||
|
||||
| Setting | Env Var | Default |
|
||||
|---------|---------|---------|
|
||||
| Cache size | `HERMES_FEISHU_DEDUP_CACHE_SIZE` | 2048 entries |
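
For illustration, a TTL-bounded, size-capped dedup cache persisted as JSON might look like the sketch below. The class name and the on-disk layout (a JSON object mapping message ID to first-seen timestamp) are assumptions; the actual format of `feishu_seen_message_ids.json` is not described here.

```python
import json
import time
from collections import OrderedDict
from pathlib import Path


class SeenMessageIds:
    """Remember message IDs for `ttl` seconds, keeping at most `max_size`."""

    def __init__(self, path, max_size=2048, ttl=24 * 3600):
        self.path = Path(path)
        self.max_size = max_size
        self.ttl = ttl
        self._seen = OrderedDict()
        if self.path.exists():
            self._seen.update(json.loads(self.path.read_text()))

    def seen_before(self, message_id, now=None):
        """Return True for a duplicate; otherwise record the ID and return False."""
        now = time.time() if now is None else now
        expired = [mid for mid, ts in self._seen.items() if now - ts >= self.ttl]
        for mid in expired:
            del self._seen[mid]
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        while len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # drop the oldest entry
        self.path.write_text(json.dumps(self._seen))
        return False
```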
|
||||
|
||||
## All Environment Variables
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|----------|----------|---------|-------------|
|
||||
| `FEISHU_APP_ID` | ✅ | — | Feishu/Lark App ID |
|
||||
| `FEISHU_APP_SECRET` | ✅ | — | Feishu/Lark App Secret |
|
||||
| `FEISHU_DOMAIN` | — | `feishu` | `feishu` (China) or `lark` (international) |
|
||||
| `FEISHU_CONNECTION_MODE` | — | `websocket` | `websocket` or `webhook` |
|
||||
| `FEISHU_ALLOWED_USERS` | — | _(empty)_ | Comma-separated open_id list for user allowlist |
|
||||
| `FEISHU_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
|
||||
| `FEISHU_ENCRYPT_KEY` | — | _(empty)_ | Encrypt key for webhook signature verification |
|
||||
| `FEISHU_VERIFICATION_TOKEN` | — | _(empty)_ | Verification token for webhook payload auth |
|
||||
| `FEISHU_GROUP_POLICY` | — | `allowlist` | Group message policy: `open`, `allowlist`, `disabled` |
|
||||
| `FEISHU_BOT_OPEN_ID` | — | _(empty)_ | Bot's open_id (for @mention detection) |
|
||||
| `FEISHU_BOT_USER_ID` | — | _(empty)_ | Bot's user_id (for @mention detection) |
|
||||
| `FEISHU_BOT_NAME` | — | _(empty)_ | Bot's display name (for @mention detection) |
|
||||
| `FEISHU_WEBHOOK_HOST` | — | `127.0.0.1` | Webhook server bind address |
|
||||
| `FEISHU_WEBHOOK_PORT` | — | `8765` | Webhook server port |
|
||||
| `FEISHU_WEBHOOK_PATH` | — | `/feishu/webhook` | Webhook endpoint path |
|
||||
| `HERMES_FEISHU_DEDUP_CACHE_SIZE` | — | `2048` | Max deduplicated message IDs to track |
|
||||
| `HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS` | — | `0.6` | Text burst debounce quiet period |
|
||||
| `HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES` | — | `8` | Max messages merged per text batch |
|
||||
| `HERMES_FEISHU_TEXT_BATCH_MAX_CHARS` | — | `4000` | Max characters merged per text batch |
|
||||
| `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | — | `0.8` | Media burst debounce quiet period |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Fix |
|
||||
|---------|-----|
|
||||
| `lark-oapi not installed` | Install the SDK: `pip install lark-oapi` |
|
||||
| `websockets not installed; websocket mode unavailable` | Install websockets: `pip install websockets` |
|
||||
| `aiohttp not installed; webhook mode unavailable` | Install aiohttp: `pip install aiohttp` |
|
||||
| `FEISHU_APP_ID or FEISHU_APP_SECRET not set` | Set both env vars or configure via `hermes gateway setup` |
|
||||
| `Another local Hermes gateway is already using this Feishu app_id` | Only one Hermes instance can use the same app_id at a time. Stop the other gateway first. |
|
||||
| Bot doesn't respond in groups | Ensure the bot is @mentioned, check `FEISHU_GROUP_POLICY`, and verify the sender is in `FEISHU_ALLOWED_USERS` if policy is `allowlist` |
|
||||
| `Webhook rejected: invalid verification token` | Ensure `FEISHU_VERIFICATION_TOKEN` matches the token in your Feishu app's Event Subscriptions config |
|
||||
| `Webhook rejected: invalid signature` | Ensure `FEISHU_ENCRYPT_KEY` matches the encrypt key in your Feishu app config |
|
||||
| Post messages show as plain text | The Feishu API rejected the post payload; this is normal fallback behavior. Check logs for details. |
|
||||
| Images/files not received by bot | Grant `im:message` and `im:resource` permission scopes to your Feishu app |
|
||||
| Bot identity not auto-detected | Grant `admin:app.info:readonly` scope, or set `FEISHU_BOT_OPEN_ID` / `FEISHU_BOT_NAME` manually |
|
||||
| `Webhook rate limit exceeded` | More than 120 requests/minute from the same IP. This is usually a misconfiguration or loop. |
|
||||
|
||||
## Toolset
|
||||
|
||||
|
|
|
|||
|
|
@ -352,3 +352,4 @@ For more information on securing your Hermes Agent deployment, see the [Security
|
|||
- **Federation**: If you're on a federated homeserver, the bot can communicate with users from other servers — just add their full `@user:server` IDs to `MATRIX_ALLOWED_USERS`.
|
||||
- **Auto-join**: The bot automatically accepts room invites and joins. It starts responding immediately after joining.
|
||||
- **Media support**: Hermes can send and receive images, audio, video, and file attachments. Media is uploaded to your homeserver using the Matrix content repository API.
|
||||
- **Native voice messages (MSC3245)**: The Matrix adapter automatically tags outgoing voice messages with the `org.matrix.msc3245.voice` flag. This means TTS responses and voice audio are rendered as **native voice bubbles** in Element and other clients that support MSC3245, rather than as generic audio file attachments. Incoming voice messages with the MSC3245 flag are also correctly identified and routed to speech-to-text transcription. No configuration is needed — this works automatically.
|
||||
|
|
|
|||
|
|
@ -237,6 +237,60 @@ Make sure the bot has been **invited to the channel** (`/invite @Hermes Agent`).
|
|||
|
||||
---
|
||||
|
||||
## Multi-Workspace Support
|
||||
|
||||
Hermes can connect to **multiple Slack workspaces** simultaneously using a single gateway instance. Each workspace is authenticated independently with its own bot user ID.
|
||||
|
||||
### Configuration
|
||||
|
||||
Provide multiple bot tokens as a **comma-separated list** in `SLACK_BOT_TOKEN`:
|
||||
|
||||
```bash
|
||||
# Multiple bot tokens — one per workspace
|
||||
SLACK_BOT_TOKEN=xoxb-workspace1-token,xoxb-workspace2-token,xoxb-workspace3-token
|
||||
|
||||
# A single app-level token is still used for Socket Mode
|
||||
SLACK_APP_TOKEN=xapp-your-app-token
|
||||
```
|
||||
|
||||
Or in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
slack:
|
||||
token: "xoxb-workspace1-token,xoxb-workspace2-token"
|
||||
```
|
||||
|
||||
### OAuth Token File
|
||||
|
||||
In addition to tokens in the environment or config, Hermes also loads tokens from an **OAuth token file** at:
|
||||
|
||||
```
|
||||
~/.hermes/platforms/slack/slack_tokens.json
|
||||
```
|
||||
|
||||
This file is a JSON object mapping team IDs to token entries:
|
||||
|
||||
```json
|
||||
{
|
||||
"T01ABC2DEF3": {
|
||||
"token": "xoxb-workspace-token-here",
|
||||
"team_name": "My Workspace"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Tokens from this file are merged with any tokens specified via `SLACK_BOT_TOKEN`. Duplicate tokens are automatically deduplicated.
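
The merge can be pictured as a small function. This is a minimal sketch with a hypothetical name; it assumes environment tokens come first so that the first token in `SLACK_BOT_TOKEN` stays the primary one, which may not match the gateway's exact ordering.

```python
import json
from pathlib import Path


def collect_bot_tokens(env_value, token_file):
    """Merge comma-separated env tokens with tokens from the OAuth file."""
    tokens = [t.strip() for t in (env_value or "").split(",") if t.strip()]
    path = Path(token_file)
    if path.exists():
        for entry in json.loads(path.read_text()).values():
            token = entry.get("token")
            if token:
                tokens.append(token)
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(tokens))
```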
|
||||
|
||||
### How it works
|
||||
|
||||
- The **first token** in the list is the primary token, used for the Socket Mode connection (AsyncApp).
|
||||
- Each token is authenticated via `auth.test` on startup. The gateway maps each `team_id` to its own `WebClient` and `bot_user_id`.
|
||||
- When a message arrives, Hermes uses the correct workspace-specific client to respond.
|
||||
- The primary `bot_user_id` (from the first token) is used for backward compatibility with features that expect a single bot identity.
|
||||
|
||||
---
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Hermes supports voice on Slack:
|
||||
|
|
|
|||
|
|
@ -258,6 +258,73 @@ Topics created outside of the config (e.g., by manually calling the Telegram API
|
|||
- **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing.
|
||||
- **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies.
|
||||
|
||||
## Webhook Mode
|
||||
|
||||
By default, the Telegram adapter connects via **long polling** — the gateway makes outbound connections to Telegram's servers. This works everywhere but keeps a persistent connection open.
|
||||
|
||||
**Webhook mode** is an alternative where Telegram pushes updates to your server over HTTPS. This is ideal for **serverless and cloud deployments** (Fly.io, Railway, etc.) where inbound HTTP can wake a suspended machine.
|
||||
|
||||
### Configuration
|
||||
|
||||
Set the `TELEGRAM_WEBHOOK_URL` environment variable to enable webhook mode:
|
||||
|
||||
```bash
|
||||
# Required — your public HTTPS endpoint
|
||||
TELEGRAM_WEBHOOK_URL=https://app.fly.dev/telegram
|
||||
|
||||
# Optional — local listen port (default: 8443)
|
||||
TELEGRAM_WEBHOOK_PORT=8443
|
||||
|
||||
# Optional — secret token for update verification (auto-generated if not set)
|
||||
TELEGRAM_WEBHOOK_SECRET=my-secret-token
|
||||
```
|
||||
|
||||
Or in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
telegram:
|
||||
webhook_mode: true
|
||||
```
|
||||
|
||||
When `TELEGRAM_WEBHOOK_URL` is set, the gateway starts an HTTP server listening on `0.0.0.0:<port>` and registers the webhook URL with Telegram. The URL path is extracted from the webhook URL (defaults to `/telegram`).
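
Extracting the path from the webhook URL can be sketched with the standard library. The function name is hypothetical, and treating a bare `/` the same as a missing path is an assumption.

```python
from urllib.parse import urlparse


def webhook_path(webhook_url: str) -> str:
    """Return the URL path to serve, falling back to /telegram."""
    path = urlparse(webhook_url).path
    return path if path and path != "/" else "/telegram"
```

For `https://app.fly.dev/telegram` this yields `/telegram`, so the reverse proxy or platform router must forward that path to the local listen port.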
|
||||
|
||||
:::warning
|
||||
Telegram requires a **valid TLS certificate** on the webhook endpoint. Self-signed certificates will be rejected. Use a reverse proxy (nginx, Caddy) or a platform that provides TLS termination (Fly.io, Railway, Cloudflare Tunnel).
|
||||
:::
|
||||
|
||||
## DNS-over-HTTPS Fallback IPs
|
||||
|
||||
In some restricted networks, `api.telegram.org` may resolve to an IP that is unreachable. The Telegram adapter includes a **fallback IP** mechanism that transparently retries connections against alternative IPs while preserving the correct TLS hostname and SNI.
|
||||
|
||||
### How it works
|
||||
|
||||
1. If `TELEGRAM_FALLBACK_IPS` is set, those IPs are used directly.
|
||||
2. Otherwise, the adapter automatically queries **Google DNS** and **Cloudflare DNS** via DNS-over-HTTPS (DoH) to discover alternative IPs for `api.telegram.org`.
|
||||
3. IPs returned by DoH that differ from the system DNS result are used as fallbacks.
|
||||
4. If DoH is also blocked, a hardcoded seed IP (`149.154.167.220`) is used as a last resort.
|
||||
5. Once a fallback IP succeeds, it becomes "sticky" — subsequent requests use it directly without retrying the primary path first.
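
Steps 1 to 4 can be summarized as a selection function. The sketch below uses hypothetical names and takes the DoH and system DNS results as plain lists instead of performing network lookups; it also falls back to the seed IP whenever DoH yields no new address, which simplifies step 4.

```python
SEED_IP = "149.154.167.220"  # hardcoded last-resort address from step 4


def choose_fallback_ips(configured, doh_ips, system_ips):
    """Pick fallback IPs for api.telegram.org following the documented order."""
    if configured:
        return list(configured)  # step 1: explicit TELEGRAM_FALLBACK_IPS wins
    system = set(system_ips)
    # Steps 2-3: keep DoH answers that differ from the system DNS result.
    candidates = [ip for ip in dict.fromkeys(doh_ips) if ip not in system]
    return candidates or [SEED_IP]  # step 4: seed IP as last resort
```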
|
||||
|
||||
### Configuration
|
||||
|
||||
```bash
|
||||
# Explicit fallback IPs (comma-separated)
|
||||
TELEGRAM_FALLBACK_IPS=149.154.167.220,149.154.167.221
|
||||
```
|
||||
|
||||
Or in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
telegram:
|
||||
extra:
|
||||
fallback_ips:
|
||||
- "149.154.167.220"
|
||||
```
|
||||
|
||||
:::tip
|
||||
You usually don't need to configure this manually. The auto-discovery via DoH handles most restricted-network scenarios. The `TELEGRAM_FALLBACK_IPS` env var is only needed if DoH is also blocked on your network.
|
||||
:::
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|
|
|
|||
|
|
@ -13,6 +13,7 @@ Connect Hermes to [WeCom](https://work.weixin.qq.com/) (企业微信), Tencent's
|
|||
- A WeCom organization account
|
||||
- An AI Bot created in the WeCom Admin Console
|
||||
- The Bot ID and Secret from the bot's credentials page
|
||||
- Python packages: `aiohttp` and `httpx`
|
||||
|
||||
## Setup
|
||||
|
||||
|
|
@ -56,10 +57,12 @@ hermes gateway start
|
|||
|
||||
- **WebSocket transport** — persistent connection, no public endpoint needed
|
||||
- **DM and group messaging** — configurable access policies
|
||||
- **Per-group sender allowlists** — fine-grained control over who can interact in each group
|
||||
- **Media support** — images, files, voice, video upload and download
|
||||
- **AES-encrypted media** — automatic decryption for inbound attachments
|
||||
- **Quote context** — preserves reply threading
|
||||
- **Markdown rendering** — rich text responses
|
||||
- **Reply-mode streaming** — correlates responses to inbound message context
|
||||
- **Auto-reconnect** — exponential backoff on connection drops
|
||||
|
||||
## Configuration Options
|
||||
|
|
@ -75,12 +78,187 @@ Set these in `config.yaml` under `platforms.wecom.extra`:
|
|||
| `group_policy` | `open` | Group access: `open`, `allowlist`, `disabled` |
|
||||
| `allow_from` | `[]` | User IDs allowed for DMs (when dm_policy=allowlist) |
|
||||
| `group_allow_from` | `[]` | Group IDs allowed (when group_policy=allowlist) |
|
||||
| `groups` | `{}` | Per-group configuration (see below) |
|
||||
|
||||
## Access Policies
|
||||
|
||||
### DM Policy
|
||||
|
||||
Controls who can send direct messages to the bot:
|
||||
|
||||
| Value | Behavior |
|
||||
|-------|----------|
|
||||
| `open` | Anyone can DM the bot (default) |
|
||||
| `allowlist` | Only user IDs in `allow_from` can DM |
|
||||
| `disabled` | All DMs are ignored |
|
||||
| `pairing` | Pairing mode (for initial setup) |
|
||||
|
||||
```bash
|
||||
WECOM_DM_POLICY=allowlist
|
||||
```
|
||||
|
||||
### Group Policy
|
||||
|
||||
Controls which groups the bot responds in:
|
||||
|
||||
| Value | Behavior |
|
||||
|-------|----------|
|
||||
| `open` | Bot responds in all groups (default) |
|
||||
| `allowlist` | Bot only responds in group IDs listed in `group_allow_from` |
|
||||
| `disabled` | All group messages are ignored |
|
||||
|
||||
```bash
|
||||
WECOM_GROUP_POLICY=allowlist
|
||||
```
|
||||
|
||||
### Per-Group Sender Allowlists
|
||||
|
||||
For fine-grained control, you can restrict which users are allowed to interact with the bot within specific groups. This is configured in `config.yaml`:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
wecom:
|
||||
enabled: true
|
||||
extra:
|
||||
bot_id: "your-bot-id"
|
||||
secret: "your-secret"
|
||||
group_policy: "allowlist"
|
||||
group_allow_from:
|
||||
- "group_id_1"
|
||||
- "group_id_2"
|
||||
groups:
|
||||
group_id_1:
|
||||
allow_from:
|
||||
- "user_alice"
|
||||
- "user_bob"
|
||||
group_id_2:
|
||||
allow_from:
|
||||
- "user_charlie"
|
||||
"*":
|
||||
allow_from:
|
||||
- "user_admin"
|
||||
```
|
||||
|
||||
**How it works:**
|
||||
|
||||
1. The `group_policy` and `group_allow_from` controls determine whether a group is allowed at all.
|
||||
2. If a group passes the top-level check, the `groups.<group_id>.allow_from` list (if present) further restricts which senders within that group can interact with the bot.
|
||||
3. A wildcard `"*"` group entry serves as a default for groups not explicitly listed.
|
||||
4. Allowlist entries support the `*` wildcard to allow all users, and entries are case-insensitive.
|
||||
5. Entries can optionally use the `wecom:user:` or `wecom:group:` prefix format — the prefix is stripped automatically.
|
||||
|
||||
If no `allow_from` is configured for a group, all users in that group are allowed (assuming the group itself passes the top-level policy check).
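
The sender check for a group that has already passed the top-level policy can be sketched as below. The function names are hypothetical, and normalizing group keys the same way as allowlist entries (lowercasing, prefix stripping) is an assumption.

```python
def _normalize(entry):
    """Lowercase an ID and strip an optional wecom:user: / wecom:group: prefix."""
    value = str(entry).strip().lower()
    for prefix in ("wecom:user:", "wecom:group:"):
        if value.startswith(prefix):
            return value[len(prefix):]
    return value


def sender_allowed(groups, group_id, sender_id):
    """Apply groups.<group_id>.allow_from, with "*" as the default group entry."""
    normalized = {_normalize(key): cfg for key, cfg in groups.items()}
    cfg = normalized.get(_normalize(group_id), normalized.get("*"))
    allow_from = (cfg or {}).get("allow_from")
    if not allow_from:
        return True  # no per-group list: everyone in the group is allowed
    allowed = {_normalize(entry) for entry in allow_from}
    return "*" in allowed or _normalize(sender_id) in allowed
```

With the configuration shown earlier, `user_alice` is accepted in `group_id_1`, while in any unlisted group only `user_admin` is accepted through the `"*"` entry.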

## Media Support

### Inbound (receiving)

The adapter receives media attachments from users and caches them locally for agent processing:

| Type | How it's handled |
|------|-----------------|
| **Images** | Downloaded and cached locally. Supports both URL-based and base64-encoded images. |
| **Files** | Downloaded and cached. Filename is preserved from the original message. |
| **Voice** | Voice message text transcription is extracted if available. |
| **Mixed messages** | WeCom mixed-type messages (text + images) are parsed and all components extracted. |

**Quoted messages:** Media from quoted (replied-to) messages is also extracted, so the agent has context about what the user is replying to.

### AES-Encrypted Media Decryption

WeCom encrypts some inbound media attachments with AES-256-CBC. The adapter handles this automatically:

- When an inbound media item includes an `aeskey` field, the adapter downloads the encrypted bytes and decrypts them using AES-256-CBC with PKCS#7 padding.
- The AES key is the base64-decoded value of the `aeskey` field (must be exactly 32 bytes).
- The IV is derived from the first 16 bytes of the key.
- This requires the `cryptography` Python package (`pip install cryptography`).

No configuration is needed; decryption happens transparently when encrypted media is received.
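
For readers who want to see the scheme concretely, here is a minimal sketch of the steps listed above. It is not the adapter's implementation: the function names are hypothetical, the unpadding assumes a 16-byte PKCS#7 block size, and `cryptography` is imported lazily so the key-handling helpers work without it.

```python
import base64


def derive_key_and_iv(aeskey_b64: str) -> tuple[bytes, bytes]:
    """Decode the `aeskey` field and derive the IV from the key's first 16 bytes."""
    key = base64.b64decode(aeskey_b64)
    if len(key) != 32:
        raise ValueError(f"AES-256 key must be 32 bytes, got {len(key)}")
    return key, key[:16]


def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Remove PKCS#7 padding (block_size is an assumption of this sketch)."""
    if not data or len(data) % block_size:
        raise ValueError("ciphertext length is not a multiple of the block size")
    pad = data[-1]
    if pad < 1 or pad > block_size or data[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-pad]


def decrypt_media(encrypted: bytes, aeskey_b64: str) -> bytes:
    """Decrypt downloaded media bytes with AES-256-CBC."""
    # Lazy import: only needed when encrypted media actually arrives.
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key, iv = derive_key_and_iv(aeskey_b64)
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    return pkcs7_unpad(decryptor.update(encrypted) + decryptor.finalize())
```
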

### Outbound (sending)

| Method | What it sends | Size limit |
|--------|--------------|------------|
| `send` | Markdown text messages | 4000 chars |
| `send_image` / `send_image_file` | Native image messages | 10 MB |
| `send_document` | File attachments | 20 MB |
| `send_voice` | Voice messages (AMR format only for native voice) | 2 MB |
| `send_video` | Video messages | 10 MB |

**Chunked upload:** Files are uploaded in 512 KB chunks through a three-step protocol (init → chunks → finish). The adapter handles this automatically.
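
As a rough illustration of the chunking step, the sketch below splits a payload into 512 KB pieces and lists the three protocol phases in order. The helper names are hypothetical, 512 KB is assumed to mean 512 × 1024 bytes, and the actual frames exchanged with WeCom are not shown.

```python
from typing import Iterator

CHUNK_SIZE = 512 * 1024  # assumption: 512 KB means 512 * 1024 bytes


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[tuple[int, bytes]]:
    """Yield (index, chunk) pairs covering the whole payload in order."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, data[offset:offset + chunk_size]


def plan_upload(data: bytes) -> list[str]:
    """Describe the init -> chunks -> finish sequence for a payload."""
    steps = ["init"]
    steps += [f"chunk {index} ({len(chunk)} bytes)" for index, chunk in iter_chunks(data)]
    steps.append("finish")
    return steps
```
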

**Automatic downgrade:** When media exceeds the native type's size limit but is under the absolute 20 MB file limit, it is automatically sent as a generic file attachment instead:

- Images > 10 MB → sent as file
- Videos > 10 MB → sent as file
- Voice > 2 MB → sent as file
- Non-AMR audio → sent as file (WeCom only supports AMR for native voice)

Files exceeding the absolute 20 MB limit are rejected with an informational message sent to the chat.
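
The downgrade rules above amount to a small decision function. The sketch below is illustrative only: `choose_send_type` and the `"reject"` return value are hypothetical names, and 1 MB is assumed to be 1024 × 1024 bytes.

```python
MB = 1024 * 1024  # assumption: binary megabytes
ABSOLUTE_LIMIT = 20 * MB
NATIVE_LIMITS = {"image": 10 * MB, "video": 10 * MB, "voice": 2 * MB, "file": 20 * MB}


def choose_send_type(kind: str, size: int, is_amr: bool = True) -> str:
    """Return the message type to send, "file" on downgrade, or "reject"."""
    if size > ABSOLUTE_LIMIT:
        return "reject"  # over the absolute limit: nothing is uploaded
    if kind == "voice" and not is_amr:
        return "file"  # only AMR audio can be sent as native voice
    if size > NATIVE_LIMITS.get(kind, ABSOLUTE_LIMIT):
        return "file"  # too large for the native type, still under 20 MB
    return kind
```

For example, `choose_send_type("image", 12 * MB)` returns `"file"`, and `choose_send_type("video", 25 * MB)` returns `"reject"`.
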

## Reply-Mode Stream Responses

When the bot receives a message via the WeCom callback, the adapter remembers the inbound request ID. If a response is sent while the request context is still active, the adapter uses WeCom's reply-mode (`aibot_respond_msg`) with streaming to correlate the response directly to the inbound message. This provides a more natural conversation experience in the WeCom client.

If the inbound request context has expired or is unavailable, the adapter falls back to proactive message sending via `aibot_send_msg`.

Reply-mode also works for media: uploaded media can be sent as a reply to the originating message.

## Connection and Reconnection

The adapter maintains a persistent WebSocket connection to WeCom's gateway at `wss://openws.work.weixin.qq.com`.

### Connection Lifecycle

1. **Connect:** Opens a WebSocket connection and sends an `aibot_subscribe` authentication frame with the `bot_id` and `secret`.
2. **Heartbeat:** Sends application-level ping frames every 30 seconds to keep the connection alive.
3. **Listen:** Continuously reads inbound frames and dispatches message callbacks.

### Reconnection Behavior

On connection loss, the adapter reconnects using a stepped, roughly exponential backoff schedule that caps at 60 seconds:

| Attempt | Delay |
|---------|-------|
| 1st retry | 2 seconds |
| 2nd retry | 5 seconds |
| 3rd retry | 10 seconds |
| 4th retry | 30 seconds |
| 5th+ retry | 60 seconds |

After each successful reconnection, the backoff counter resets to zero. All pending request futures are failed on disconnect so callers don't hang indefinitely.
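
The retry schedule in the table can be expressed as a lookup with a cap. This is a sketch under the assumption that attempts are counted from 1 and reset after a successful reconnect; the names `RECONNECT_DELAYS` and `reconnect_delay` are hypothetical.

```python
RECONNECT_DELAYS = (2, 5, 10, 30, 60)  # seconds, taken from the table above


def reconnect_delay(attempt: int) -> int:
    """Delay in seconds before the given retry (1-based); caps at the last value."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return RECONNECT_DELAYS[min(attempt, len(RECONNECT_DELAYS)) - 1]
```
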

### Deduplication

Inbound messages are deduplicated using message IDs with a 5-minute window and a maximum cache of 1000 entries. This prevents double-processing of messages during reconnection or network hiccups.
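
One way to picture such a cache is an insertion-ordered map with both a time window and a size bound. The class below is a hypothetical sketch, not the adapter's code; the injectable `clock` parameter exists only to make the behavior easy to test.

```python
import time
from collections import OrderedDict
from typing import Callable


class MessageDeduplicator:
    """Remembers recently seen message IDs for a bounded time and count."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,  # 5-minute window
        max_entries: int = 1000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._seen = OrderedDict()  # message_id -> first-seen timestamp

    def is_duplicate(self, message_id: str) -> bool:
        """Return True if the ID was seen inside the window; otherwise record it."""
        now = self._clock()
        # Entries are stored in arrival order, so expired ones sit at the front.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._ttl:
                break
            self._seen.popitem(last=False)
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        # Enforce the size bound by evicting the oldest entries.
        while len(self._seen) > self._max:
            self._seen.popitem(last=False)
        return False
```
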

## All Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `WECOM_BOT_ID` | ✅ | — | WeCom AI Bot ID |
| `WECOM_SECRET` | ✅ | — | WeCom AI Bot Secret |
| `WECOM_ALLOWED_USERS` | — | _(empty)_ | Comma-separated user IDs for the gateway-level allowlist |
| `WECOM_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
| `WECOM_WEBSOCKET_URL` | — | `wss://openws.work.weixin.qq.com` | WebSocket gateway URL |
| `WECOM_DM_POLICY` | — | `open` | DM access policy |
| `WECOM_GROUP_POLICY` | — | `open` | Group access policy |

## Troubleshooting

| Problem | Fix |
|---------|-----|
| `WECOM_BOT_ID and WECOM_SECRET are required` | Set both env vars or configure in setup wizard |
| `WeCom startup failed: aiohttp not installed` | Install aiohttp: `pip install aiohttp` |
| `WeCom startup failed: httpx not installed` | Install httpx: `pip install httpx` |
| `invalid secret (errcode=40013)` | Verify the secret matches your bot's credentials |
| `Timed out waiting for subscribe acknowledgement` | Check network connectivity to `openws.work.weixin.qq.com` |
| Bot doesn't respond in groups | Check `group_policy` setting and ensure the group ID is in `group_allow_from` |
| Bot ignores certain users in a group | Check per-group `allow_from` lists in the `groups` config section |
| Media decryption fails | Install `cryptography`: `pip install cryptography` |
| `cryptography is required for WeCom media decryption` | The inbound media is AES-encrypted. Install: `pip install cryptography` |
| Voice messages sent as files | WeCom only supports AMR format for native voice. Other formats are auto-downgraded to file. |
| `File too large` error | WeCom has a 20 MB absolute limit on all file uploads. Compress or split the file. |
| Images sent as files | Images > 10 MB exceed the native image limit and are auto-downgraded to file attachments. |
| `Timeout sending message to WeCom` | The WebSocket may have disconnected. Check logs for reconnection messages. |
| `WeCom websocket closed during authentication` | Network issue or incorrect credentials. Verify `bot_id` and `secret`. |