docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087)

Reference docs fixes:
- cli-commands.md: remove non-existent --provider alibaba, add hermes
  profile/completion/plugins/mcp to top-level table, add --profile/-p
  global flag, add --source chat option
- slash-commands.md: add /yolo and /commands, fix /q alias conflict
  (resolves to /queue not /quit), add missing aliases (/bg, /set-home,
  /reload_mcp, /gateway)
- toolsets-reference.md: fix hermes-api-server (not same as hermes-cli,
  omits clarify/send_message/text_to_speech)
- profile-commands.md: fix show name required not optional, --clone-from
  not --from, add --remove/--name to alias, fix alias path, fix export/
  import arg types, remove non-existent fish completion
- tools-reference.md: add EXA_API_KEY to web tools requires_env
- mcp-config-reference.md: add auth key for OAuth, tool name sanitization
- environment-variables.md: add EXA_API_KEY, update provider values
- plugins.md: remove non-existent ctx.register_command(), add
  ctx.inject_message()

Feature docs additions:
- security.md: add /yolo mode, approval modes (manual/smart/off),
  configurable timeout, expanded dangerous patterns table
- cron.md: add wrap_response config, [SILENT] suppression
- mcp.md: add dynamic tool discovery, MCP sampling support
- cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length
- docker.md: add skills/credential file mounting

Messaging platform docs:
- telegram.md: add webhook mode, DoH fallback IPs
- slack.md: add multi-workspace OAuth support
- discord.md: add DISCORD_IGNORE_NO_MENTION
- matrix.md: add MSC3245 native voice messages
- feishu.md: expand from 129 to 365 lines (encrypt key, verification
  token, group policy, card actions, media, rate limiting, markdown,
  troubleshooting)
- wecom.md: expand from 86 to 264 lines (per-group allowlists, media,
  AES decryption, stream replies, reconnection, troubleshooting)

Configuration docs:
- quickstart.md: add DeepSeek, Copilot, Copilot ACP providers
- configuration.md: add DeepSeek provider, Exa web backend, terminal
  env_passthrough/images, browser.command_timeout, compression params,
  discord config, security/tirith config, timezone, auxiliary models

21 files changed, ~1000 lines added
This commit is contained in:
Teknium 2026-03-30 17:15:21 -07:00 committed by GitHub
parent 3c8f910973
commit 7e0c2c3ce3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
21 changed files with 1004 additions and 83 deletions

View file

@ -193,6 +193,40 @@ When scheduling jobs, you specify where the output goes:
The agent's final response is automatically delivered. You do not need to call `send_message` in the cron prompt.
### Response wrapping
By default, delivered cron output is wrapped with a header and footer so the recipient knows it came from a scheduled task:
```
Cronjob Response: Morning feeds
-------------
<agent output here>
Note: The agent cannot see this message, and therefore cannot respond to it.
```
To deliver the raw agent output without the wrapper, set `cron.wrap_response` to `false`:
```yaml
# ~/.hermes/config.yaml
cron:
wrap_response: false
```
### Silent suppression
If the agent's final response starts with `[SILENT]`, delivery is suppressed entirely. The output is still saved locally for audit (in `~/.hermes/cron/output/`), but no message is sent to the delivery target.
This is useful for monitoring jobs that should only report when something is wrong:
```text
Check if nginx is running. If everything is healthy, respond with only [SILENT].
Otherwise, report the issue.
```
Failed jobs always deliver regardless of the `[SILENT]` marker — only successful runs can be silenced.
## Schedule formats
The agent's final response is automatically delivered — you do **not** need to include `send_message` in the cron prompt for that same destination. If a cron run calls `send_message` to the exact target the scheduler will already deliver to, Hermes skips that duplicate send and tells the model to put the user-facing content in the final response instead. Use `send_message` only for additional or different targets.

View file

@ -277,6 +277,14 @@ That keeps the tool list clean.
Hermes discovers MCP servers at startup and registers their tools into the normal tool registry.
### Dynamic Tool Discovery
MCP servers can notify Hermes when their available tools change at runtime by sending a `notifications/tools/list_changed` notification. When Hermes receives this notification, it automatically re-fetches the server's tool list and updates the registry — no manual `/reload-mcp` required.
This is useful for MCP servers whose capabilities change dynamically (e.g. a server that adds tools when a new database schema is loaded, or removes tools when a service goes offline).
The refresh is lock-protected so rapid-fire notifications from the same server don't cause overlapping refreshes. Prompt and resource change notifications (`prompts/list_changed`, `resources/list_changed`) are received but not yet acted on.
### Reloading
If you change MCP config, use:
@ -285,7 +293,7 @@ If you change MCP config, use:
/reload-mcp
```
This reloads MCP servers from config and refreshes the available tool list.
This reloads MCP servers from config and refreshes the available tool list. For runtime tool changes pushed by the server itself, see [Dynamic Tool Discovery](#dynamic-tool-discovery) above.
### Toolsets
@ -403,6 +411,39 @@ Because Hermes now only registers those wrappers when both are true:
This is intentional and keeps the tool list honest.
## MCP Sampling Support
MCP servers can request LLM inference from Hermes via the `sampling/createMessage` protocol. This allows an MCP server to ask Hermes to generate text on its behalf — useful for servers that need LLM capabilities but don't have their own model access.
Sampling is **enabled by default** for all MCP servers (when the MCP SDK supports it). Configure it per-server under the `sampling` key:
```yaml
mcp_servers:
my_server:
command: "my-mcp-server"
sampling:
enabled: true # Enable sampling (default: true)
model: "openai/gpt-4o" # Override model for sampling requests (optional)
max_tokens_cap: 4096 # Max tokens per sampling response (default: 4096)
timeout: 30 # Timeout in seconds per request (default: 30)
max_rpm: 10 # Rate limit: max requests per minute (default: 10)
max_tool_rounds: 5 # Max tool-use rounds in sampling loops (default: 5)
allowed_models: [] # Allowlist of model names the server may request (empty = any)
log_level: "info" # Audit log level: debug, info, or warning (default: info)
```
The sampling handler includes a sliding-window rate limiter, per-request timeouts, and tool-loop depth limits to prevent runaway usage. Metrics (request count, errors, tokens used) are tracked per server instance.
To disable sampling for a specific server:
```yaml
mcp_servers:
untrusted_server:
url: "https://mcp.example.com"
sampling:
enabled: false
```
## Running Hermes as an MCP server
In addition to connecting **to** MCP servers, Hermes can also **be** an MCP server. This lets other MCP-capable agents (Claude Code, Cursor, Codex, or any MCP client) use Hermes's messaging capabilities — list conversations, read message history, and send messages across all your connected platforms.

View file

@ -4,7 +4,7 @@ sidebar_position: 20
# Plugins
Hermes has a plugin system for adding custom tools, hooks, slash commands, and integrations without modifying core code.
Hermes has a plugin system for adding custom tools, hooks, and integrations without modifying core code.
**→ [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin)** — step-by-step guide with a complete working example.
@ -30,7 +30,7 @@ Project-local plugins under `./.hermes/plugins/` are disabled by default. Enable
|-----------|-----|
| Add tools | `ctx.register_tool(name, schema, handler)` |
| Add hooks | `ctx.register_hook("post_tool_call", callback)` |
| Add slash commands | `ctx.register_command("mycommand", handler)` |
| Inject messages | `ctx.inject_message(content, role="user")` — see [Injecting Messages](#injecting-messages) |
| Ship data files | `Path(__file__).parent / "data" / "file.yaml"` |
| Bundle skills | Copy `skill.md` to `~/.hermes/skills/` at load time |
| Gate on env vars | `requires_env: [API_KEY]` in plugin.yaml |
@ -57,34 +57,6 @@ Plugins can register callbacks for these lifecycle events. See the **[Event Hook
| `on_session_start` | New session created (first turn only) |
| `on_session_end` | End of every `run_conversation` call |
## Slash commands
Plugins can register slash commands that work in both CLI and messaging platforms:
```python
def register(ctx):
ctx.register_command(
name="greet",
handler=lambda args: f"Hello, {args or 'world'}!",
description="Greet someone",
args_hint="[name]",
aliases=("hi",),
)
```
The handler receives the argument string (everything after `/greet`) and returns a string to display. Registered commands automatically appear in `/help`, tab autocomplete, Telegram bot menu, and Slack subcommand mapping.
| Parameter | Description |
|-----------|-------------|
| `name` | Command name without slash |
| `handler` | Callable that takes `args: str` and returns `str | None` |
| `description` | Shown in `/help` |
| `args_hint` | Usage hint, e.g. `"[name]"` |
| `aliases` | Tuple of alternative names |
| `cli_only` | Only available in CLI |
| `gateway_only` | Only available in messaging platforms |
| `gateway_config_gate` | Config dotpath (e.g. `"display.my_option"`). When set on a `cli_only` command, the command becomes available in the gateway if the config value is truthy. |
## Managing plugins
```bash
@ -109,4 +81,27 @@ plugins:
In a running session, `/plugins` shows which plugins are currently loaded.
## Injecting Messages
Plugins can inject messages into the active conversation using `ctx.inject_message()`:
```python
ctx.inject_message("New data arrived from the webhook", role="user")
```
**Signature:** `ctx.inject_message(content: str, role: str = "user") -> bool`
How it works:
- If the agent is **idle** (waiting for user input), the message is queued as the next input and starts a new turn.
- If the agent is **mid-turn** (actively running), the message interrupts the current operation — the same as a user typing a new message and pressing Enter.
- For non-`"user"` roles, the content is prefixed with `[role]` (e.g. `[system] ...`).
- Returns `True` if the message was queued successfully, `False` if no CLI reference is available (e.g. in gateway mode).
This enables plugins like remote control viewers, messaging bridges, or webhook receivers to feed messages into the conversation from external sources.
:::note
`inject_message` is only available in CLI mode. In gateway mode, there is no CLI reference and the method returns `False`.
:::
See the **[full guide](/docs/guides/build-a-hermes-plugin)** for handler contracts, schema format, hook behavior, error handling, and common mistakes.