docs: comprehensive docs audit — cover 13 features from last week's PRs (#5815)

Cover documentation gaps found by auditing all 50+ merged PRs from the past week:

tools-reference.md:
- Fix stale tool count (47→46, 11→10 browser tools) after browser_close removal
- Document notify_on_complete parameter in terminal tool description

telegram.md:
- Add Interactive Model Picker section (inline keyboard, provider/model drill-down)

discord.md:
- Add Interactive Model Picker section (Select dropdowns, 120s timeout)
- Add Native Slash Commands for Skills section (auto-registration at startup)

signal.md:
- Expand Attachments section with outgoing media delivery (send_image_file,
  send_voice, send_video, send_document via MEDIA: tags)

webhooks.md:
- Document {__raw__} special template token for full payload access
- Document Forum Topic Delivery via message_thread_id in deliver_extra

slack.md:
- Fix stale/misleading thread reply docs — thread replies no longer require
  @mention when bot has active session (3 locations updated)

security.md:
- Add cross-session isolation (layer 6) and input sanitization (layer 7)
  to security layers overview

feishu.md:
- Add WebSocket Tuning section (ws_reconnect_interval, ws_ping_interval)
- Add Per-Group Access Control section (group_rules with 5 policy types)

credential-pools.md:
- Add Delegation & Subagent Sharing section

delegation.md:
- Update key properties to mention credential pool inheritance

providers.md:
- Add Z.AI Endpoint Auto-Detection note
- Add xAI (Grok) Prompt Caching section

skills-catalog.md:
- Add p5js to creative skills category
Teknium 2026-04-07 10:21:03 -07:00 committed by GitHub
parent c58e16757a
commit afe6c63c52
GPG key ID: B5690EEEBB952194
12 changed files with 158 additions and 11 deletions


@@ -383,6 +383,26 @@ display:
tool_progress_command: true
```
## Interactive Model Picker
Send `/model` with no arguments in a Discord channel to open a dropdown-based model picker:
1. **Provider selection** — a Select dropdown showing available providers (up to 25).
2. **Model selection** — a second dropdown with models for the chosen provider (up to 25).
The picker times out after 120 seconds. Only authorized users (those in `DISCORD_ALLOWED_USERS`) can interact with it. If you know the model name, type `/model <name>` directly.
## Native Slash Commands for Skills
Hermes automatically registers installed skills as **native Discord Application Commands**. This means skills appear in Discord's autocomplete `/` menu alongside built-in commands.
- Each skill becomes a Discord slash command (e.g., `/code-review`, `/ascii-art`)
- Skills accept an optional `args` string parameter
- Discord has a limit of 100 application commands per bot — if you have more skills than available slots, extra skills are skipped with a warning in the logs
- Skills are registered during bot startup alongside built-in commands like `/model`, `/reset`, and `/background`
No extra configuration is needed — any skill installed via `hermes skills install` is automatically registered as a Discord slash command on the next gateway restart.
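The slot limit described above can be pictured with a short sketch. This is not Hermes's registration code: `plan_skill_commands` is a hypothetical helper, and the assumption that built-in commands claim their slots before skills is mine, not something the docs state.

```python
import logging

logger = logging.getLogger("discord_skill_sketch")

DISCORD_COMMAND_LIMIT = 100  # per-bot application command limit stated above


def plan_skill_commands(builtin_commands, skill_names, limit=DISCORD_COMMAND_LIMIT):
    """Split skills into (registered, skipped) given the slots left after built-ins.

    Assumption: built-in commands are registered first and skills fill what remains.
    """
    available = max(limit - len(builtin_commands), 0)
    registered = list(skill_names[:available])
    skipped = list(skill_names[available:])
    if skipped:
        # Mirrors the documented behavior: extra skills are skipped with a log warning.
        logger.warning(
            "Skipping %d skill(s): Discord command limit of %d reached",
            len(skipped),
            limit,
        )
    return registered, skipped
```

With three built-ins and 99 installed skills, this sketch would register the first 97 skills and skip the last two.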
## Home Channel
You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:


@@ -310,6 +310,58 @@ Additional webhook protections:
- **Body read timeout:** 30 seconds
- **Content-Type enforcement:** Only `application/json` is accepted
## WebSocket Tuning
When using `websocket` mode, you can customize reconnect and ping behavior:
```yaml
platforms:
  feishu:
    extra:
      ws_reconnect_interval: 120   # Seconds between reconnect attempts (default: 120)
      ws_ping_interval: 30         # Seconds between WebSocket pings (optional; SDK default if unset)
```
| Setting | Config key | Default | Description |
|---------|-----------|---------|-------------|
| Reconnect interval | `ws_reconnect_interval` | 120s | How long to wait between reconnection attempts |
| Ping interval | `ws_ping_interval` | _(SDK default)_ | Frequency of WebSocket keepalive pings |
## Per-Group Access Control
Beyond the global `FEISHU_GROUP_POLICY`, you can set fine-grained rules per group chat using `group_rules` in config.yaml:
```yaml
platforms:
  feishu:
    extra:
      default_group_policy: "open"     # Default for groups not in group_rules
      admins:                          # Users who can manage bot settings
        - "ou_admin_open_id"
      group_rules:
        "oc_group_chat_id_1":
          policy: "allowlist"          # open | allowlist | blacklist | admin_only | disabled
          allowlist:
            - "ou_user_open_id_1"
            - "ou_user_open_id_2"
        "oc_group_chat_id_2":
          policy: "admin_only"
        "oc_group_chat_id_3":
          policy: "blacklist"
          blacklist:
            - "ou_blocked_user"
```
| Policy | Description |
|--------|-------------|
| `open` | Anyone in the group can use the bot |
| `allowlist` | Only users in the group's `allowlist` can use the bot |
| `blacklist` | Everyone except users in the group's `blacklist` can use the bot |
| `admin_only` | Only users in the global `admins` list can use the bot in this group |
| `disabled` | Bot ignores all messages in this group |
Groups not listed in `group_rules` fall back to `default_group_policy` (defaults to the value of `FEISHU_GROUP_POLICY`).
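To make the policy table concrete, here is a minimal sketch of how such a check could be evaluated. It is an illustration, not the adapter's code: `is_user_allowed` and its parameters are hypothetical, `extra` stands for the parsed `platforms.feishu.extra` mapping, and treating an unrecognized policy as a denial is an assumption.

```python
def is_user_allowed(chat_id, user_id, extra, env_group_policy="open"):
    """Decide whether user_id may use the bot in chat_id, following the table above.

    env_group_policy stands in for the value of FEISHU_GROUP_POLICY (hypothetical
    parameter); it is used only when default_group_policy is not set.
    """
    default_policy = extra.get("default_group_policy", env_group_policy)
    rule = extra.get("group_rules", {}).get(chat_id, {})
    policy = rule.get("policy", default_policy)

    if policy == "open":
        return True
    if policy == "allowlist":
        return user_id in rule.get("allowlist", [])
    if policy == "blacklist":
        return user_id not in rule.get("blacklist", [])
    if policy == "admin_only":
        return user_id in extra.get("admins", [])
    # "disabled" ignores everyone; denying unknown policies is an assumption.
    return False
```

Called with the example configuration above, `is_user_allowed("oc_group_chat_id_1", "ou_user_open_id_1", extra)` would return `True`, while a non-admin in `oc_group_chat_id_2` would be refused.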
## Deduplication
Inbound messages are deduplicated using message IDs with a 24-hour TTL. The dedup state is persisted across restarts to `~/.hermes/feishu_seen_message_ids.json`.
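A TTL-based dedup store of this kind might look roughly like the sketch below. The class name, the on-disk layout (a JSON object mapping message ID to first-seen timestamp), and the choice to write on every call are all assumptions made for illustration; the actual format of `feishu_seen_message_ids.json` is not documented here.

```python
import json
import time
from pathlib import Path

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as described above


class SeenMessageStore:
    """Hypothetical dedup store: message ID -> first-seen timestamp, persisted as JSON."""

    def __init__(self, path, ttl=TTL_SECONDS):
        self.path = Path(path)
        self.ttl = ttl
        self.seen = {}
        if self.path.exists():
            # Reload earlier state so duplicates are still caught after a restart.
            self.seen = json.loads(self.path.read_text())

    def is_duplicate(self, message_id, now=None):
        now = time.time() if now is None else now
        # Drop entries whose TTL has expired.
        self.seen = {mid: ts for mid, ts in self.seen.items() if now - ts < self.ttl}
        duplicate = message_id in self.seen
        if not duplicate:
            self.seen[message_id] = now
        self.path.write_text(json.dumps(self.seen))
        return duplicate
```

The first call for a given ID returns `False` and records it; later calls within 24 hours return `True`, including from a fresh instance pointed at the same file.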
@@ -343,6 +395,8 @@ Inbound messages are deduplicated using message IDs with a 24-hour TTL. The dedu
| `HERMES_FEISHU_TEXT_BATCH_MAX_CHARS` | — | `4000` | Max characters merged per text batch |
| `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | — | `0.8` | Media burst debounce quiet period |
WebSocket and per-group ACL settings are configured via `config.yaml` under `platforms.feishu.extra` (see [WebSocket Tuning](#websocket-tuning) and [Per-Group Access Control](#per-group-access-control) above).
## Troubleshooting
| Problem | Fix |


@@ -147,13 +147,26 @@ Group access is controlled by the `SIGNAL_GROUP_ALLOWED_USERS` env var:
### Attachments
The adapter supports sending and receiving:
The adapter supports sending and receiving media in both directions.
**Incoming** (user → agent):
- **Images** — PNG, JPEG, GIF, WebP (auto-detected via magic bytes)
- **Audio** — MP3, OGG, WAV, M4A (voice messages transcribed if Whisper is configured)
- **Documents** — PDF, ZIP, and other file types
Attachment size limit: **100 MB**.
**Outgoing** (agent → user):
The agent can send media files via `MEDIA:` tags in responses. The following delivery methods are supported:
- **Images**: `send_image_file` sends PNG, JPEG, GIF, WebP as native Signal attachments
- **Voice**: `send_voice` sends audio files (OGG, MP3, WAV, M4A, AAC) as attachments
- **Video**: `send_video` sends MP4 video files
- **Documents**: `send_document` sends any file type (PDF, ZIP, etc.)
All outgoing media goes through Signal's standard attachment API. Unlike some platforms, Signal does not distinguish between voice messages and file attachments at the protocol level.
Attachment size limit: **100 MB** (both directions).
### Typing Indicators


@@ -210,11 +210,10 @@ Understanding how Hermes behaves in different contexts:
|---------|----------|
| **DMs** | Bot responds to every message — no @mention needed |
| **Channels** | Bot **only responds when @mentioned** (e.g., `@Hermes Agent what time is it?`). In channels, Hermes replies in a thread attached to that message. |
| **Threads** | If you @mention Hermes inside an existing thread, it replies in that same thread. |
| **Threads** | If you @mention Hermes inside an existing thread, it replies in that same thread. Once the bot has an active session in a thread, **subsequent replies in that thread do not require @mention** — the bot follows the conversation naturally. |
:::tip
In channels, always @mention the bot. Simply typing a message without mentioning it will be ignored.
This is intentional — it prevents the bot from responding to every message in busy channels.
In channels, always @mention the bot to start a conversation. Once the bot is active in a thread, you can reply in that thread without mentioning it. Outside of threads, messages without @mention are ignored to prevent noise in busy channels.
:::
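The response rules above reduce to a small decision function. The sketch below is illustrative only: `should_respond` and its parameters are hypothetical names, and representing "active session in a thread" as a set of thread timestamps is an assumption about how that state might be tracked.

```python
def should_respond(is_dm, is_mentioned, thread_ts=None, active_thread_sessions=frozenset()):
    """Return True if the bot should answer a Slack message under the rules above.

    active_thread_sessions: thread timestamps where the bot already has a session
    (assumed representation).
    """
    if is_dm:
        return True  # DMs never need a mention
    if is_mentioned:
        return True  # an @mention always starts or continues a conversation
    # No mention: respond only inside a thread with an active session.
    return thread_ts is not None and thread_ts in active_thread_sessions
```

An unmentioned top-level channel message yields `False`, while an unmentioned reply in a thread the bot is already active in yields `True`.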
---
@@ -283,7 +282,7 @@ slack:
```
:::info
Unlike Discord and Telegram, Slack does not have a `free_response_channels` equivalent. The Slack adapter always requires `@mention` in channels — this is hardcoded behavior. In DMs, the bot always responds without needing a mention.
Unlike Discord and Telegram, Slack does not have a `free_response_channels` equivalent. The Slack adapter requires `@mention` to start a conversation in channels. However, once the bot has an active session in a thread, subsequent thread replies do not require a mention. In DMs, the bot always responds without needing a mention.
:::
### Unauthorized User Handling


@@ -383,6 +383,19 @@ To find a topic's `thread_id`, open the topic in Telegram Web or Desktop and loo
- **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing.
- **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies.
## Interactive Model Picker
When you send `/model` with no arguments in a Telegram chat, Hermes shows an interactive inline keyboard for switching models:
1. **Provider selection** — buttons showing each available provider with model counts (e.g., "OpenAI (15)", "✓ Anthropic (12)" for the current provider).
2. **Model selection** — paginated model list with **Prev**/**Next** navigation, a **Back** button to return to providers, and **Cancel**.
The current model and provider are displayed at the top. All navigation happens by editing the same message in-place (no chat clutter).
:::tip
If you know the exact model name, type `/model <name>` directly to skip the picker. You can also type `/model <name> --global` to persist the change across sessions.
:::
## Webhook Mode
By default, the Telegram adapter connects via **long polling** — the gateway makes outbound connections to Telegram's servers. This works everywhere but keeps a persistent connection open.


@@ -112,13 +112,38 @@ Prompts use dot-notation to access nested fields in the webhook payload:
- `{pull_request.title}` resolves to `payload["pull_request"]["title"]`
- `{repository.full_name}` resolves to `payload["repository"]["full_name"]`
- `{__raw__}` — special token that dumps the **entire payload** as indented JSON (truncated at 4000 characters). Useful for monitoring alerts or generic webhooks where the agent needs the full context.
- Missing keys are left as the literal `{key}` string (no error)
- Nested dicts and lists are JSON-serialized and truncated at 2000 characters
You can mix `{__raw__}` with regular template variables:
```yaml
prompt: "PR #{pull_request.number} by {pull_request.user.login}: {__raw__}"
```
If no `prompt` template is configured for a route, the entire payload is dumped as indented JSON (truncated at 4000 characters).
The same dot-notation templates work in `deliver_extra` values.
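The template rules listed above can be sketched in a few lines of Python. This is an approximation for illustration, not the gateway's implementation: `render_prompt`, the token regex, and the exact truncation mechanics (a plain slice) are assumptions; only the behaviors themselves (dot-notation lookup, `{__raw__}`, literal fallback for missing keys, the 4000 and 2000 character limits) come from the text.

```python
import json
import re

RAW_LIMIT = 4000     # {__raw__} truncation stated above
NESTED_LIMIT = 2000  # nested dict/list truncation stated above
_TOKEN = re.compile(r"\{([A-Za-z0-9_.]+)\}")  # assumed token syntax


def render_prompt(template, payload):
    """Substitute {dot.notation} tokens in template with values from payload."""

    def resolve(match):
        key = match.group(1)
        if key == "__raw__":
            # Entire payload as indented JSON, truncated.
            return json.dumps(payload, indent=2)[:RAW_LIMIT]
        value = payload
        for part in key.split("."):
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                return match.group(0)  # missing key: keep the literal {key}
        if isinstance(value, (dict, list)):
            return json.dumps(value)[:NESTED_LIMIT]
        return str(value)

    return _TOKEN.sub(resolve, template)
```

For a payload containing `pull_request.number` and `pull_request.user.login`, the mixed template shown earlier would expand both fields and then append the JSON dump of the whole payload.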
### Forum Topic Delivery
When delivering webhook responses to Telegram, you can target a specific forum topic by including `message_thread_id` (or `thread_id`) in `deliver_extra`:
```yaml
webhooks:
  routes:
    alerts:
      events: ["alert"]
      prompt: "Alert: {__raw__}"
      deliver: "telegram"
      deliver_extra:
        chat_id: "-1001234567890"
        message_thread_id: "42"
```
If `chat_id` is not provided in `deliver_extra`, the delivery falls back to the home channel configured for the target platform.
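The target resolution described above might be sketched as follows. `resolve_delivery_target` and `home_channel` are hypothetical names, and giving `message_thread_id` precedence over `thread_id` when both are present is an assumption; the text only says either key is accepted.

```python
def resolve_delivery_target(deliver_extra, home_channel):
    """Return (chat_id, thread_id) for a webhook delivery.

    home_channel: the platform's configured home channel (hypothetical parameter).
    """
    extra = deliver_extra or {}
    # Fall back to the home channel when no chat_id is given.
    chat_id = extra.get("chat_id") or home_channel
    # Either key may name the forum topic; precedence here is an assumption.
    thread_id = extra.get("message_thread_id") or extra.get("thread_id")
    return chat_id, thread_id
```

With the route above, this would yield `("-1001234567890", "42")`; with an empty `deliver_extra`, it would return the home channel and no topic.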
---
## GitHub PR Review (Step by Step) {#github-pr-review}