mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-01 01:51:44 +00:00
QR-login connects an iLink bot identity (...@im.bot), not a scriptable personal WeChat account. iLink typically does not deliver ordinary WeChat group events to these bots, so WEIXIN_GROUP_POLICY / WEIXIN_GROUP_ALLOWED_USERS often have no effect regardless of value. - Setup wizard: print iLink-bot caveat before the group-policy prompt; relabel the allowlist input as 'group chat IDs (not member user IDs)'; note that 'open' / 'allowlist' only take effect if iLink delivers group events. - Adapter: log a WARNING at connect() when WEIXIN_GROUP_POLICY is non-disabled so the limitation is surfaced in gateway logs, not just docs. - Docs: add a top-of-page warning callout to weixin.md explaining the iLink bot identity, narrow the 'DM and group messaging' feature line to DM-only with a group caveat, tighten the Group Policy section and troubleshooting row, and clarify WEIXIN_GROUP_ALLOWED_USERS as group IDs (not user IDs) in weixin.md and environment-variables.md. Closes #17094
312 lines
16 KiB
Markdown
312 lines
16 KiB
Markdown
---
|
||
sidebar_position: 15
|
||
title: "Weixin (WeChat)"
|
||
description: "Connect Hermes Agent to personal WeChat accounts via the iLink Bot API"
|
||
---
|
||
|
||
# Weixin (WeChat)
|
||
|
||
Connect Hermes to [WeChat](https://weixin.qq.com/) (微信), Tencent's personal messaging platform. The adapter uses Tencent's **iLink Bot API** for personal WeChat accounts — this is distinct from WeCom (Enterprise WeChat). Messages are delivered via long-polling, so no public endpoint or webhook is required.
|
||
|
||
:::info
|
||
This adapter is for **personal WeChat accounts** (微信). If you need enterprise/corporate WeChat, see the [WeCom adapter](./wecom.md) instead.
|
||
:::
|
||
|
||
:::warning iLink bot identity — ordinary WeChat groups may not work
|
||
QR login connects Hermes to an **iLink bot identity** (e.g. `a5ace6fd482e@im.bot`), **not** a fully scriptable ordinary personal WeChat account. Consequences:
|
||
|
||
- The iLink bot identity generally **cannot be invited into ordinary WeChat groups** the way a normal contact can.
|
||
- iLink typically **does not deliver ordinary WeChat group events** (including `@`-mentions of the personal account used for QR login) to the gateway for most bot-type accounts.
|
||
- `@`-mentioning the personal WeChat account used to scan the QR code is **not** the same as `@`-mentioning the iLink bot — the bot is a separate identity.
|
||
- The `WEIXIN_GROUP_POLICY` / `WEIXIN_GROUP_ALLOWED_USERS` settings below only take effect when iLink actually returns group events for your account type. If it doesn't, group messages will never reach Hermes regardless of policy.
|
||
|
||
In practice, most deployments only get DMs to the iLink bot working reliably. If group delivery doesn't work after configuration, the limitation is on the iLink side, not in Hermes. The gateway logs a `WARNING` at startup whenever `WEIXIN_GROUP_POLICY` is set to anything other than `disabled`.
|
||
:::
|
||
|
||
## Prerequisites
|
||
|
||
- A personal WeChat account
|
||
- Python packages: `aiohttp` and `cryptography`
|
||
- Terminal QR rendering is included when Hermes is installed with the `messaging` extra
|
||
|
||
Install the required dependencies:
|
||
|
||
```bash
|
||
pip install aiohttp cryptography
|
||
# Optional: for terminal QR code display
|
||
pip install hermes-agent[messaging]
|
||
```
|
||
|
||
## Setup
|
||
|
||
### 1. Run the Setup Wizard
|
||
|
||
The easiest way to connect your WeChat account is through the interactive setup:
|
||
|
||
```bash
|
||
hermes gateway setup
|
||
```
|
||
|
||
Select **Weixin** when prompted. The wizard will:
|
||
|
||
1. Request a QR code from the iLink Bot API
|
||
2. Display the QR code in your terminal (or provide a URL)
|
||
3. Wait for you to scan the QR code with the WeChat mobile app
|
||
4. Prompt you to confirm the login on your phone
|
||
5. Save the account credentials automatically to `~/.hermes/weixin/accounts/`
|
||
|
||
Once confirmed, you'll see a message like:
|
||
|
||
```
|
||
微信连接成功,account_id=your-account-id
|
||
```
|
||
|
||
The wizard stores the `account_id`, `token`, and `base_url` so you don't need to configure them manually.
|
||
|
||
### 2. Configure Environment Variables
|
||
|
||
After initial QR login, set at minimum the account ID in `~/.hermes/.env`:
|
||
|
||
```bash
|
||
WEIXIN_ACCOUNT_ID=your-account-id
|
||
|
||
# Optional: override the token (normally auto-saved from QR login)
|
||
# WEIXIN_TOKEN=your-bot-token
|
||
|
||
# Optional: restrict access
|
||
WEIXIN_DM_POLICY=open
|
||
WEIXIN_ALLOWED_USERS=user_id_1,user_id_2
|
||
|
||
# Optional: restore legacy multiline splitting behavior
|
||
# WEIXIN_SPLIT_MULTILINE_MESSAGES=true
|
||
|
||
# Optional: home channel for cron/notifications
|
||
WEIXIN_HOME_CHANNEL=chat_id
|
||
WEIXIN_HOME_CHANNEL_NAME=Home
|
||
```
|
||
|
||
### 3. Start the Gateway
|
||
|
||
```bash
|
||
hermes gateway
|
||
```
|
||
|
||
The adapter will restore saved credentials, connect to the iLink API, and begin long-polling for messages.
|
||
|
||
## Features
|
||
|
||
- **Long-poll transport** — no public endpoint, webhook, or WebSocket needed
|
||
- **QR code login** — scan-to-connect setup via `hermes gateway setup`
|
||
- **DM messaging** — configurable access policies; group messaging depends on iLink actually delivering group events for the connected identity (often not the case for iLink bot accounts — see the warning above)
|
||
- **Media support** — images, video, files, and voice messages
|
||
- **AES-128-ECB encrypted CDN** — automatic encryption/decryption for all media transfers
|
||
- **Context token persistence** — disk-backed reply continuity across restarts
|
||
- **Markdown formatting** — preserves Markdown, including headers, tables, and code blocks, so WeChat clients that support Markdown can render it natively
|
||
- **Smart message chunking** — messages stay as a single bubble when under the limit; only oversized payloads split at logical boundaries
|
||
- **Typing indicators** — shows "typing…" status in the WeChat client while the agent processes
|
||
- **SSRF protection** — outbound media URLs are validated before download
|
||
- **Message deduplication** — 5-minute sliding window prevents double-processing
|
||
- **Automatic retry with backoff** — recovers from transient API errors
|
||
|
||
## Configuration Options
|
||
|
||
Set these in `config.yaml` under `platforms.weixin.extra`:
|
||
|
||
| Key | Default | Description |
|
||
|-----|---------|-------------|
|
||
| `account_id` | — | iLink Bot account ID (required) |
|
||
| `token` | — | iLink Bot token (required, auto-saved from QR login) |
|
||
| `base_url` | `https://ilinkai.weixin.qq.com` | iLink API base URL |
|
||
| `cdn_base_url` | `https://novac2c.cdn.weixin.qq.com/c2c` | CDN base URL for media transfer |
|
||
| `dm_policy` | `open` | DM access: `open`, `allowlist`, `disabled`, `pairing` |
|
||
| `group_policy` | `disabled` | Group access: `open`, `allowlist`, `disabled` |
|
||
| `allow_from` | `[]` | User IDs allowed for DMs (when dm_policy=allowlist) |
|
||
| `group_allow_from` | `[]` | Group IDs allowed (when group_policy=allowlist) |
|
||
| `split_multiline_messages` | `false` | When `true`, split multi-line replies into multiple chat messages (legacy behavior). When `false`, keep multi-line replies as one message unless they exceed the length limit. |
|
||
|
||
## Access Policies
|
||
|
||
### DM Policy
|
||
|
||
Controls who can send direct messages to the bot:
|
||
|
||
| Value | Behavior |
|
||
|-------|----------|
|
||
| `open` | Anyone can DM the bot (default) |
|
||
| `allowlist` | Only user IDs in `allow_from` can DM |
|
||
| `disabled` | All DMs are ignored |
|
||
| `pairing` | Pairing mode (for initial setup) |
|
||
|
||
```bash
|
||
WEIXIN_DM_POLICY=allowlist
|
||
WEIXIN_ALLOWED_USERS=user_id_1,user_id_2
|
||
```
|
||
|
||
### Group Policy
|
||
|
||
Controls which groups the bot responds in **when iLink delivers group events for the connected identity**. For QR-login iLink bot identities (e.g. `...@im.bot`), group events are typically not delivered at all, so this policy may have no effect — see the iLink bot limitation warning at the top of the page.
|
||
|
||
| Value | Behavior |
|
||
|-------|----------|
|
||
| `open` | Bot responds in all groups (if events are delivered) |
|
||
| `allowlist` | Bot only responds in group IDs listed in `group_allow_from` (if events are delivered) |
|
||
| `disabled` | All group messages are ignored (default) |
|
||
|
||
```bash
|
||
WEIXIN_GROUP_POLICY=allowlist
|
||
# NOTE: this is a comma-separated list of group chat IDs, NOT member user IDs,
|
||
# despite the variable name containing "USERS". Keep this in mind when configuring.
|
||
WEIXIN_GROUP_ALLOWED_USERS=group_id_1,group_id_2
|
||
```
|
||
|
||
:::note
|
||
The default group policy is `disabled` for Weixin (unlike WeCom where it defaults to `open`). This is intentional — personal WeChat accounts may be in many groups, and iLink bot identities typically can't receive ordinary WeChat group messages at all. The gateway logs a `WARNING` at startup if you set `WEIXIN_GROUP_POLICY` to anything other than `disabled`.
|
||
:::
|
||
|
||
## Media Support
|
||
|
||
### Inbound (receiving)
|
||
|
||
The adapter receives media attachments from users, downloads them from the WeChat CDN, decrypts them, and caches them locally for agent processing:
|
||
|
||
| Type | How it's handled |
|
||
|------|-----------------|
|
||
| **Images** | Downloaded, AES-decrypted, and cached as JPEG. |
|
||
| **Video** | Downloaded, AES-decrypted, and cached as MP4. |
|
||
| **Files** | Downloaded, AES-decrypted, and cached. Original filename is preserved. |
|
||
| **Voice** | If a text transcription is available, it's extracted as text. Otherwise the audio (SILK format) is downloaded and cached. |
|
||
|
||
**Quoted messages:** Media from quoted (replied-to) messages is also extracted, so the agent has context about what the user is replying to.
|
||
|
||
### AES-128-ECB Encrypted CDN
|
||
|
||
WeChat media files are transferred through an encrypted CDN. The adapter handles this transparently:
|
||
|
||
- **Inbound:** Encrypted media is downloaded from the CDN using `encrypted_query_param` URLs, then decrypted with AES-128-ECB using the per-file key provided in the message payload.
|
||
- **Outbound:** Files are encrypted locally with a random AES-128-ECB key, uploaded to the CDN, and the encrypted reference is included in the outbound message.
|
||
- The AES key is 16 bytes (128-bit). Keys may arrive as raw base64 or hex-encoded — the adapter handles both formats.
|
||
- This requires the `cryptography` Python package.
|
||
|
||
No configuration is needed — encryption and decryption happen automatically.
|
||
|
||
### Outbound (sending)
|
||
|
||
| Method | What it sends |
|
||
|--------|--------------|
|
||
| `send` | Text messages with Markdown formatting |
|
||
| `send_image` / `send_image_file` | Native image messages (via CDN upload) |
|
||
| `send_document` | File attachments (via CDN upload) |
|
||
| `send_video` | Video messages (via CDN upload) |
|
||
|
||
All outbound media goes through the encrypted CDN upload flow:
|
||
|
||
1. Generate a random AES-128 key
|
||
2. Encrypt the file with AES-128-ECB + PKCS#7 padding
|
||
3. Request an upload URL from the iLink API (`getuploadurl`)
|
||
4. Upload the ciphertext to the CDN
|
||
5. Send the message with the encrypted media reference
|
||
|
||
## Context Token Persistence
|
||
|
||
The iLink Bot API requires a `context_token` to be echoed back with each outbound message for a given peer. The adapter maintains a disk-backed context token store:
|
||
|
||
- Tokens are saved per account+peer to `~/.hermes/weixin/accounts/<account_id>.context-tokens.json`
|
||
- On startup, previously saved tokens are restored
|
||
- Every inbound message updates the stored token for that sender
|
||
- Outbound messages automatically include the latest context token
|
||
|
||
This ensures reply continuity even after gateway restarts.
|
||
|
||
## Markdown Formatting
|
||
|
||
WeChat clients connected through the iLink Bot API can render Markdown directly, so the adapter preserves Markdown instead of rewriting it:
|
||
|
||
- **Headers** stay as Markdown headings (`#`, `##`, ...)
|
||
- **Tables** stay as Markdown tables
|
||
- **Code fences** stay as fenced code blocks
|
||
- **Excessive blank lines** are collapsed to double newlines outside fenced code blocks
|
||
|
||
## Message Chunking
|
||
|
||
Messages are delivered as a single chat message whenever they fit within the platform limit. Only oversized payloads are split for delivery:
|
||
|
||
- Maximum message length: **4000 characters**
|
||
- Messages under the limit stay intact even when they contain multiple paragraphs or line breaks
|
||
- Oversized messages split at logical boundaries (paragraphs, blank lines, code fences)
|
||
- Code fences are kept intact whenever possible (never split mid-block unless the fence itself exceeds the limit)
|
||
- Oversized individual blocks fall back to the base adapter's truncation logic
|
||
- A 0.3 s inter-chunk delay prevents WeChat rate-limit drops when multiple chunks are sent
|
||
|
||
## Typing Indicators
|
||
|
||
The adapter shows typing status in the WeChat client:
|
||
|
||
1. When a message arrives, the adapter fetches a `typing_ticket` via the `getconfig` API
|
||
2. Typing tickets are cached for 10 minutes per user
|
||
3. `send_typing` sends a typing-start signal; `stop_typing` sends a typing-stop signal
|
||
4. The gateway automatically triggers typing indicators while the agent processes a message
|
||
|
||
## Long-Poll Connection
|
||
|
||
The adapter uses HTTP long-polling (not WebSocket) to receive messages:
|
||
|
||
### How It Works
|
||
|
||
1. **Connect:** Validates credentials and starts the poll loop
|
||
2. **Poll:** Calls `getupdates` with a 35-second timeout; the server holds the request until messages arrive or the timeout expires
|
||
3. **Dispatch:** Inbound messages are dispatched concurrently via `asyncio.create_task`
|
||
4. **Sync buffer:** A persistent sync cursor (`get_updates_buf`) is saved to disk so the adapter resumes from the correct position after restarts
|
||
|
||
### Retry Behavior
|
||
|
||
On API errors, the adapter uses a simple retry strategy:
|
||
|
||
| Condition | Behavior |
|
||
|-----------|----------|
|
||
| Transient error (1st–2nd) | Retry after 2 seconds |
|
||
| Repeated errors (3+) | Back off for 30 seconds, then reset counter |
|
||
| Session expired (`errcode=-14`) | Pause for 10 minutes (re-login may be needed) |
|
||
| Timeout | Immediately re-poll (normal long-poll behavior) |
|
||
|
||
### Deduplication
|
||
|
||
Inbound messages are deduplicated using message IDs with a 5-minute window. This prevents double-processing during network hiccups or overlapping poll responses.
|
||
|
||
### Token Lock
|
||
|
||
Only one Weixin gateway instance can use a given token at a time. The adapter acquires a scoped lock on startup and releases it on shutdown. If another gateway is already using the same token, startup fails with an informative error message.
|
||
|
||
## All Environment Variables
|
||
|
||
| Variable | Required | Default | Description |
|
||
|----------|----------|---------|-------------|
|
||
| `WEIXIN_ACCOUNT_ID` | ✅ | — | iLink Bot account ID (from QR login) |
|
||
| `WEIXIN_TOKEN` | ✅ | — | iLink Bot token (auto-saved from QR login) |
|
||
| `WEIXIN_BASE_URL` | — | `https://ilinkai.weixin.qq.com` | iLink API base URL |
|
||
| `WEIXIN_CDN_BASE_URL` | — | `https://novac2c.cdn.weixin.qq.com/c2c` | CDN base URL for media transfer |
|
||
| `WEIXIN_DM_POLICY` | — | `open` | DM access policy: `open`, `allowlist`, `disabled`, `pairing` |
|
||
| `WEIXIN_GROUP_POLICY` | — | `disabled` | Group access policy: `open`, `allowlist`, `disabled` |
|
||
| `WEIXIN_ALLOWED_USERS` | — | _(empty)_ | Comma-separated user IDs for DM allowlist |
|
||
| `WEIXIN_GROUP_ALLOWED_USERS` | — | _(empty)_ | Comma-separated **group chat IDs** (not member user IDs) for group allowlist. The variable name is legacy — it expects group IDs, not user IDs. |
|
||
| `WEIXIN_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
|
||
| `WEIXIN_HOME_CHANNEL_NAME` | — | `Home` | Display name for the home channel |
|
||
| `WEIXIN_ALLOW_ALL_USERS` | — | — | Gateway-level flag to allow all users (used by setup wizard) |
|
||
|
||
## Troubleshooting
|
||
|
||
| Problem | Fix |
|
||
|---------|-----|
|
||
| `Weixin startup failed: aiohttp and cryptography are required` | Install both: `pip install aiohttp cryptography` |
|
||
| `Weixin startup failed: WEIXIN_TOKEN is required` | Run `hermes gateway setup` to complete QR login, or set `WEIXIN_TOKEN` manually |
|
||
| `Weixin startup failed: WEIXIN_ACCOUNT_ID is required` | Set `WEIXIN_ACCOUNT_ID` in your `.env` or run `hermes gateway setup` |
|
||
| `Another local Hermes gateway is already using this Weixin token` | Stop the other gateway instance first — only one poller per token is allowed |
|
||
| Session expired (`errcode=-14`) | Your login session has expired. Re-run `hermes gateway setup` to scan a new QR code |
|
||
| QR code expired during setup | The QR auto-refreshes up to 3 times. If it keeps expiring, check your network connection |
|
||
| Bot doesn't respond to DMs | Check `WEIXIN_DM_POLICY` — if set to `allowlist`, the sender must be in `WEIXIN_ALLOWED_USERS` |
|
||
| Bot ignores group messages | Group policy defaults to `disabled`. Set `WEIXIN_GROUP_POLICY=open` or `allowlist` — but note that QR-login iLink bot identities (`...@im.bot`) typically cannot receive ordinary WeChat group messages at all. If the gateway logs show no raw inbound events for group messages, the limitation is on the iLink side, not in Hermes. |
|
||
| Media download/upload fails | Ensure `cryptography` is installed. Check network access to `novac2c.cdn.weixin.qq.com` |
|
||
| `Blocked unsafe URL (SSRF protection)` | The outbound media URL points to a private/internal address. Only public URLs are allowed |
|
||
| Voice messages show as text | If WeChat provides a transcription, the adapter uses the text. This is expected behavior |
|
||
| Messages appear duplicated | The adapter deduplicates by message ID. If you see duplicates, check if multiple gateway instances are running |
|
||
| `iLink POST ... HTTP 4xx/5xx` | API error from the iLink service. Check your token validity and network connectivity |
|
||
| Terminal QR code doesn't render | Reinstall with the messaging extra: `pip install hermes-agent[messaging]`. Alternatively, open the URL printed above the QR |
|