hermes-agent/website/docs/user-guide/messaging/weixin.md
Teknium f317325279
docs(weixin): clarify iLink bot identity limits and warn on group policy (#17433)
QR-login connects an iLink bot identity (...@im.bot), not a scriptable
personal WeChat account. iLink typically does not deliver ordinary WeChat
group events to these bots, so WEIXIN_GROUP_POLICY / WEIXIN_GROUP_ALLOWED_USERS
often have no effect regardless of value.

- Setup wizard: print iLink-bot caveat before the group-policy prompt; relabel
  the allowlist input as 'group chat IDs (not member user IDs)'; note that
  'open' / 'allowlist' only take effect if iLink delivers group events.
- Adapter: log a WARNING at connect() when WEIXIN_GROUP_POLICY is non-disabled
  so the limitation is surfaced in gateway logs, not just docs.
- Docs: add a top-of-page warning callout to weixin.md explaining the iLink
  bot identity, narrow the 'DM and group messaging' feature line to DM-only
  with a group caveat, tighten the Group Policy section and troubleshooting
  row, and clarify WEIXIN_GROUP_ALLOWED_USERS as group IDs (not user IDs)
  in weixin.md and environment-variables.md.

Closes #17094
2026-04-29 06:26:10 -07:00

312 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
sidebar_position: 15
title: "Weixin (WeChat)"
description: "Connect Hermes Agent to personal WeChat accounts via the iLink Bot API"
---
# Weixin (WeChat)
Connect Hermes to [WeChat](https://weixin.qq.com/) (微信), Tencent's personal messaging platform. The adapter uses Tencent's **iLink Bot API** for personal WeChat accounts — this is distinct from WeCom (Enterprise WeChat). Messages are delivered via long-polling, so no public endpoint or webhook is required.
:::info
This adapter is for **personal WeChat accounts** (微信). If you need enterprise/corporate WeChat, see the [WeCom adapter](./wecom.md) instead.
:::
:::warning iLink bot identity — ordinary WeChat groups may not work
QR login connects Hermes to an **iLink bot identity** (e.g. `a5ace6fd482e@im.bot`), **not** a fully scriptable ordinary personal WeChat account. Consequences:
- The iLink bot identity generally **cannot be invited into ordinary WeChat groups** the way a normal contact can.
- iLink typically **does not deliver ordinary WeChat group events** (including `@`-mentions of the personal account used for QR login) to the gateway for most bot-type accounts.
- `@`-mentioning the personal WeChat account used to scan the QR code is **not** the same as `@`-mentioning the iLink bot — the bot is a separate identity.
- The `WEIXIN_GROUP_POLICY` / `WEIXIN_GROUP_ALLOWED_USERS` settings below only take effect when iLink actually returns group events for your account type. If it doesn't, group messages will never reach Hermes regardless of policy.
In practice, most deployments only get DMs to the iLink bot working reliably. If group delivery doesn't work after configuration, the limitation is on the iLink side, not in Hermes. The gateway logs a `WARNING` at startup whenever `WEIXIN_GROUP_POLICY` is set to anything other than `disabled`.
:::
## Prerequisites
- A personal WeChat account
- Python packages: `aiohttp` and `cryptography`
- Terminal QR rendering is included when Hermes is installed with the `messaging` extra
Install the required dependencies:
```bash
pip install aiohttp cryptography
# Optional: for terminal QR code display
pip install hermes-agent[messaging]
```
## Setup
### 1. Run the Setup Wizard
The easiest way to connect your WeChat account is through the interactive setup:
```bash
hermes gateway setup
```
Select **Weixin** when prompted. The wizard will:
1. Request a QR code from the iLink Bot API
2. Display the QR code in your terminal (or provide a URL)
3. Wait for you to scan the QR code with the WeChat mobile app
4. Prompt you to confirm the login on your phone
5. Save the account credentials automatically to `~/.hermes/weixin/accounts/`
Once confirmed, you'll see a message like:
```
微信连接成功account_id=your-account-id
```
The wizard stores the `account_id`, `token`, and `base_url` so you don't need to configure them manually.
### 2. Configure Environment Variables
After initial QR login, set at minimum the account ID in `~/.hermes/.env`:
```bash
WEIXIN_ACCOUNT_ID=your-account-id
# Optional: override the token (normally auto-saved from QR login)
# WEIXIN_TOKEN=your-bot-token
# Optional: restrict access
WEIXIN_DM_POLICY=open
WEIXIN_ALLOWED_USERS=user_id_1,user_id_2
# Optional: restore legacy multiline splitting behavior
# WEIXIN_SPLIT_MULTILINE_MESSAGES=true
# Optional: home channel for cron/notifications
WEIXIN_HOME_CHANNEL=chat_id
WEIXIN_HOME_CHANNEL_NAME=Home
```
### 3. Start the Gateway
```bash
hermes gateway
```
The adapter will restore saved credentials, connect to the iLink API, and begin long-polling for messages.
## Features
- **Long-poll transport** — no public endpoint, webhook, or WebSocket needed
- **QR code login** — scan-to-connect setup via `hermes gateway setup`
- **DM messaging** — configurable access policies; group messaging depends on iLink actually delivering group events for the connected identity (often not the case for iLink bot accounts — see the warning above)
- **Media support** — images, video, files, and voice messages
- **AES-128-ECB encrypted CDN** — automatic encryption/decryption for all media transfers
- **Context token persistence** — disk-backed reply continuity across restarts
- **Markdown formatting** — preserves Markdown, including headers, tables, and code blocks, so WeChat clients that support Markdown can render it natively
- **Smart message chunking** — messages stay as a single bubble when under the limit; only oversized payloads split at logical boundaries
- **Typing indicators** — shows "typing…" status in the WeChat client while the agent processes
- **SSRF protection** — outbound media URLs are validated before download
- **Message deduplication** — 5-minute sliding window prevents double-processing
- **Automatic retry with backoff** — recovers from transient API errors
## Configuration Options
Set these in `config.yaml` under `platforms.weixin.extra`:
| Key | Default | Description |
|-----|---------|-------------|
| `account_id` | — | iLink Bot account ID (required) |
| `token` | — | iLink Bot token (required, auto-saved from QR login) |
| `base_url` | `https://ilinkai.weixin.qq.com` | iLink API base URL |
| `cdn_base_url` | `https://novac2c.cdn.weixin.qq.com/c2c` | CDN base URL for media transfer |
| `dm_policy` | `open` | DM access: `open`, `allowlist`, `disabled`, `pairing` |
| `group_policy` | `disabled` | Group access: `open`, `allowlist`, `disabled` |
| `allow_from` | `[]` | User IDs allowed for DMs (when dm_policy=allowlist) |
| `group_allow_from` | `[]` | Group IDs allowed (when group_policy=allowlist) |
| `split_multiline_messages` | `false` | When `true`, split multi-line replies into multiple chat messages (legacy behavior). When `false`, keep multi-line replies as one message unless they exceed the length limit. |
## Access Policies
### DM Policy
Controls who can send direct messages to the bot:
| Value | Behavior |
|-------|----------|
| `open` | Anyone can DM the bot (default) |
| `allowlist` | Only user IDs in `allow_from` can DM |
| `disabled` | All DMs are ignored |
| `pairing` | Pairing mode (for initial setup) |
```bash
WEIXIN_DM_POLICY=allowlist
WEIXIN_ALLOWED_USERS=user_id_1,user_id_2
```
### Group Policy
Controls which groups the bot responds in **when iLink delivers group events for the connected identity**. For QR-login iLink bot identities (e.g. `...@im.bot`), group events are typically not delivered at all, so this policy may have no effect — see the iLink bot limitation warning at the top of the page.
| Value | Behavior |
|-------|----------|
| `open` | Bot responds in all groups (if events are delivered) |
| `allowlist` | Bot only responds in group IDs listed in `group_allow_from` (if events are delivered) |
| `disabled` | All group messages are ignored (default) |
```bash
WEIXIN_GROUP_POLICY=allowlist
# NOTE: this is a comma-separated list of group chat IDs, NOT member user IDs,
# despite the variable name containing "USERS". Keep this in mind when configuring.
WEIXIN_GROUP_ALLOWED_USERS=group_id_1,group_id_2
```
:::note
The default group policy is `disabled` for Weixin (unlike WeCom where it defaults to `open`). This is intentional — personal WeChat accounts may be in many groups, and iLink bot identities typically can't receive ordinary WeChat group messages at all. The gateway logs a `WARNING` at startup if you set `WEIXIN_GROUP_POLICY` to anything other than `disabled`.
:::
## Media Support
### Inbound (receiving)
The adapter receives media attachments from users, downloads them from the WeChat CDN, decrypts them, and caches them locally for agent processing:
| Type | How it's handled |
|------|-----------------|
| **Images** | Downloaded, AES-decrypted, and cached as JPEG. |
| **Video** | Downloaded, AES-decrypted, and cached as MP4. |
| **Files** | Downloaded, AES-decrypted, and cached. Original filename is preserved. |
| **Voice** | If a text transcription is available, it's extracted as text. Otherwise the audio (SILK format) is downloaded and cached. |
**Quoted messages:** Media from quoted (replied-to) messages is also extracted, so the agent has context about what the user is replying to.
### AES-128-ECB Encrypted CDN
WeChat media files are transferred through an encrypted CDN. The adapter handles this transparently:
- **Inbound:** Encrypted media is downloaded from the CDN using `encrypted_query_param` URLs, then decrypted with AES-128-ECB using the per-file key provided in the message payload.
- **Outbound:** Files are encrypted locally with a random AES-128-ECB key, uploaded to the CDN, and the encrypted reference is included in the outbound message.
- The AES key is 16 bytes (128-bit). Keys may arrive as raw base64 or hex-encoded — the adapter handles both formats.
- This requires the `cryptography` Python package.
No configuration is needed — encryption and decryption happen automatically.
### Outbound (sending)
| Method | What it sends |
|--------|--------------|
| `send` | Text messages with Markdown formatting |
| `send_image` / `send_image_file` | Native image messages (via CDN upload) |
| `send_document` | File attachments (via CDN upload) |
| `send_video` | Video messages (via CDN upload) |
All outbound media goes through the encrypted CDN upload flow:
1. Generate a random AES-128 key
2. Encrypt the file with AES-128-ECB + PKCS#7 padding
3. Request an upload URL from the iLink API (`getuploadurl`)
4. Upload the ciphertext to the CDN
5. Send the message with the encrypted media reference
## Context Token Persistence
The iLink Bot API requires a `context_token` to be echoed back with each outbound message for a given peer. The adapter maintains a disk-backed context token store:
- Tokens are saved per account+peer to `~/.hermes/weixin/accounts/<account_id>.context-tokens.json`
- On startup, previously saved tokens are restored
- Every inbound message updates the stored token for that sender
- Outbound messages automatically include the latest context token
This ensures reply continuity even after gateway restarts.
## Markdown Formatting
WeChat clients connected through the iLink Bot API can render Markdown directly, so the adapter preserves Markdown instead of rewriting it:
- **Headers** stay as Markdown headings (`#`, `##`, ...)
- **Tables** stay as Markdown tables
- **Code fences** stay as fenced code blocks
- **Excessive blank lines** are collapsed to double newlines outside fenced code blocks
## Message Chunking
Messages are delivered as a single chat message whenever they fit within the platform limit. Only oversized payloads are split for delivery:
- Maximum message length: **4000 characters**
- Messages under the limit stay intact even when they contain multiple paragraphs or line breaks
- Oversized messages split at logical boundaries (paragraphs, blank lines, code fences)
- Code fences are kept intact whenever possible (never split mid-block unless the fence itself exceeds the limit)
- Oversized individual blocks fall back to the base adapter's truncation logic
- A 0.3 s inter-chunk delay prevents WeChat rate-limit drops when multiple chunks are sent
## Typing Indicators
The adapter shows typing status in the WeChat client:
1. When a message arrives, the adapter fetches a `typing_ticket` via the `getconfig` API
2. Typing tickets are cached for 10 minutes per user
3. `send_typing` sends a typing-start signal; `stop_typing` sends a typing-stop signal
4. The gateway automatically triggers typing indicators while the agent processes a message
## Long-Poll Connection
The adapter uses HTTP long-polling (not WebSocket) to receive messages:
### How It Works
1. **Connect:** Validates credentials and starts the poll loop
2. **Poll:** Calls `getupdates` with a 35-second timeout; the server holds the request until messages arrive or the timeout expires
3. **Dispatch:** Inbound messages are dispatched concurrently via `asyncio.create_task`
4. **Sync buffer:** A persistent sync cursor (`get_updates_buf`) is saved to disk so the adapter resumes from the correct position after restarts
### Retry Behavior
On API errors, the adapter uses a simple retry strategy:
| Condition | Behavior |
|-----------|----------|
| Transient error (1st2nd) | Retry after 2 seconds |
| Repeated errors (3+) | Back off for 30 seconds, then reset counter |
| Session expired (`errcode=-14`) | Pause for 10 minutes (re-login may be needed) |
| Timeout | Immediately re-poll (normal long-poll behavior) |
### Deduplication
Inbound messages are deduplicated using message IDs with a 5-minute window. This prevents double-processing during network hiccups or overlapping poll responses.
### Token Lock
Only one Weixin gateway instance can use a given token at a time. The adapter acquires a scoped lock on startup and releases it on shutdown. If another gateway is already using the same token, startup fails with an informative error message.
## All Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `WEIXIN_ACCOUNT_ID` | ✅ | — | iLink Bot account ID (from QR login) |
| `WEIXIN_TOKEN` | ✅ | — | iLink Bot token (auto-saved from QR login) |
| `WEIXIN_BASE_URL` | — | `https://ilinkai.weixin.qq.com` | iLink API base URL |
| `WEIXIN_CDN_BASE_URL` | — | `https://novac2c.cdn.weixin.qq.com/c2c` | CDN base URL for media transfer |
| `WEIXIN_DM_POLICY` | — | `open` | DM access policy: `open`, `allowlist`, `disabled`, `pairing` |
| `WEIXIN_GROUP_POLICY` | — | `disabled` | Group access policy: `open`, `allowlist`, `disabled` |
| `WEIXIN_ALLOWED_USERS` | — | _(empty)_ | Comma-separated user IDs for DM allowlist |
| `WEIXIN_GROUP_ALLOWED_USERS` | — | _(empty)_ | Comma-separated **group chat IDs** (not member user IDs) for group allowlist. The variable name is legacy — it expects group IDs, not user IDs. |
| `WEIXIN_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
| `WEIXIN_HOME_CHANNEL_NAME` | — | `Home` | Display name for the home channel |
| `WEIXIN_ALLOW_ALL_USERS` | — | — | Gateway-level flag to allow all users (used by setup wizard) |
## Troubleshooting
| Problem | Fix |
|---------|-----|
| `Weixin startup failed: aiohttp and cryptography are required` | Install both: `pip install aiohttp cryptography` |
| `Weixin startup failed: WEIXIN_TOKEN is required` | Run `hermes gateway setup` to complete QR login, or set `WEIXIN_TOKEN` manually |
| `Weixin startup failed: WEIXIN_ACCOUNT_ID is required` | Set `WEIXIN_ACCOUNT_ID` in your `.env` or run `hermes gateway setup` |
| `Another local Hermes gateway is already using this Weixin token` | Stop the other gateway instance first — only one poller per token is allowed |
| Session expired (`errcode=-14`) | Your login session has expired. Re-run `hermes gateway setup` to scan a new QR code |
| QR code expired during setup | The QR auto-refreshes up to 3 times. If it keeps expiring, check your network connection |
| Bot doesn't respond to DMs | Check `WEIXIN_DM_POLICY` — if set to `allowlist`, the sender must be in `WEIXIN_ALLOWED_USERS` |
| Bot ignores group messages | Group policy defaults to `disabled`. Set `WEIXIN_GROUP_POLICY=open` or `allowlist` — but note that QR-login iLink bot identities (`...@im.bot`) typically cannot receive ordinary WeChat group messages at all. If the gateway logs show no raw inbound events for group messages, the limitation is on the iLink side, not in Hermes. |
| Media download/upload fails | Ensure `cryptography` is installed. Check network access to `novac2c.cdn.weixin.qq.com` |
| `Blocked unsafe URL (SSRF protection)` | The outbound media URL points to a private/internal address. Only public URLs are allowed |
| Voice messages show as text | If WeChat provides a transcription, the adapter uses the text. This is expected behavior |
| Messages appear duplicated | The adapter deduplicates by message ID. If you see duplicates, check if multiple gateway instances are running |
| `iLink POST ... HTTP 4xx/5xx` | API error from the iLink service. Check your token validity and network connectivity |
| Terminal QR code doesn't render | Reinstall with the messaging extra: `pip install hermes-agent[messaging]`. Alternatively, open the URL printed above the QR |