Add an official, production-grade WhatsApp integration via Meta's Business Cloud API as a complement to the existing Baileys bridge. No bridge subprocess, no QR codes, no account-ban risk, at the cost of a Meta Business account and a public HTTPS webhook URL.
Setup is fully wizard-driven: 'hermes whatsapp-cloud' walks through every credential with paste-time validation (catches the #1 trap of pasting a phone number into the Phone Number ID field), generates a verify token, and ends with copy-paste instructions for the cloudflared / Meta-dashboard / Business Manager pieces that can't be automated. The wizard also points users at Meta's Business Manager for setting the bot's display name and profile picture.
Feature set:
- Inbound: text, images (with native-vision routing), voice notes (STT), documents (small text inlined, larger cached), reply context.
- Outbound: text with WhatsApp-flavored markdown conversion, images, videos, documents, opus voice notes via ffmpeg with MP3 fallback.
- Native interactive buttons for clarify, dangerous-command approval, and slash-command confirmation flows; matches the Telegram / Discord UX and degrades gracefully to plain text.
- Read receipts (blue double-checkmarks) and typing indicator, using Meta's combined endpoint so they fire in a single API call.
- Webhook security: X-Hub-Signature-256 HMAC verification (raw body, constant-time), wamid deduplication, group-shaped-message refusal (groups deferred to v2; Baileys still covers them).
- Full integration with the gateway's session, cron, display-tier, prompt-hint, and auth-allowlist systems. Cloud and Baileys can run side-by-side against different phone numbers.
Also wires STT (speech-to-text) through Nous's managed audio gateway for Nous subscribers. Previously, the default stt.provider=local required a separate faster-whisper install; new subscribers now get voice-note transcription out of the box.
Docs: 418-line user guide at website/docs/user-guide/messaging/whatsapp-cloud.md, sidebar entry, environment-variables reference, and ADDING_A_PLATFORM.md updated with the optional interactive-UX contract for future adapter authors.
Tests: 100 dedicated tests for the adapter, 32 for the setup wizard, 20 for the Nous subscription STT wiring, plus regression coverage across display_config, prompt_builder, and the cron scheduler.
Known limitations (deferred until clear demand signal):
- Group chats: use the Baileys bridge if you need them.
- Message templates for sends outside the 24-hour conversation window: reactive chat is unaffected, but cron / delegate_task with gaps > 24h will fail with a clear error. The agent's system prompt warns the model about this so it knows to mention it when scheduling delayed messages.
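The webhook signature check named in the feature list can be sketched roughly as below. This is a minimal illustration rather than the adapter's actual code: verify_signature and its parameter names are hypothetical, and it assumes the X-Hub-Signature-256 header carries "sha256=" followed by the hex HMAC-SHA256 of the raw request body, keyed with the app secret.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, header_value: str, app_secret: str) -> bool:
    """Return True if header_value matches the HMAC-SHA256 of raw_body.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    # Hash the raw bytes exactly as received; re-serialized JSON would not match.
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # compare_digest is a constant-time comparison.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The comparison runs on bytes so that a malformed, non-ASCII header is rejected instead of raising.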
Adding a New Messaging Platform
There are two ways to add a platform to the Hermes gateway:
Plugin Path (Recommended for Community/Third-Party)
Create a plugin directory in ~/.hermes/plugins/ (or under plugins/platforms/
for bundled plugins) with a plugin.yaml and adapter.py. The adapter
inherits from BasePlatformAdapter and registers via
ctx.register_platform() in the register(ctx) entry point. This requires
zero changes to core Hermes code.
The plugin system automatically handles: adapter creation, config parsing, user authorization, cron delivery, send_message routing, system prompt hints, status display, gateway setup, and more.
Optional hooks cover the edges most adapters need:
- env_enablement_fn: () -> Optional[dict]: seeds PlatformConfig.extra (and an optional home_channel dict) from env vars BEFORE the adapter is constructed. Without this, env-only setups don't surface in hermes gateway status or get_connected_platforms() until the SDK instantiates.
- apply_yaml_config_fn: (yaml_cfg, platform_cfg) -> Optional[dict]: translates this platform's config.yaml keys into env vars and/or seeds PlatformConfig.extra directly. Lets a plugin own its YAML schema instead of growing core gateway/config.py boilerplate per platform. Mutating os.environ is allowed (use not os.getenv(...) guards to preserve env > YAML precedence); the returned dict is merged into PlatformConfig.extra. Called during load_gateway_config() after the generic shared-key loop and before _apply_env_overrides().
- cron_deliver_env_var: str: name of the *_HOME_CHANNEL env var. When set, deliver=<name> cron jobs route to this var without editing cron/scheduler.py's hardcoded sets.
- standalone_sender_fn: async (...) -> dict: out-of-process delivery for cron jobs that run separately from the gateway. Without this, a deliver=<name> job fires correctly but the actual send returns No live adapter for platform '<name>'. Pair with cron_deliver_env_var for end-to-end cron support. See the docsite for the signature.
- plugin.yaml requires_env / optional_env rich-dict entries: auto-populate OPTIONAL_ENV_VARS in hermes_cli/config.py so the setup wizard surfaces proper descriptions, prompts, password flags, and URLs.
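As a rough sketch of the first two hooks, assuming a made-up platform called example_chat: the env var names, the shape of the returned dicts, and the assumption that yaml_cfg is the full parsed config.yaml are all hypothetical. Only the two call signatures and the env > YAML guard come from the list above.

```python
import os
from typing import Optional


def env_enablement() -> Optional[dict]:
    """Seed PlatformConfig.extra from env vars (hypothetical variable names)."""
    token = os.getenv("EXAMPLE_CHAT_TOKEN")
    if not token:
        return None  # platform not configured via env
    extra = {"token": token}
    home = os.getenv("EXAMPLE_CHAT_HOME_CHANNEL")
    if home:
        # Assumed shape for the optional home_channel dict.
        extra["home_channel"] = {"chat_id": home}
    return extra


def apply_yaml_config(yaml_cfg: dict, platform_cfg: object) -> Optional[dict]:
    """Translate this platform's YAML keys; env vars keep precedence over YAML."""
    section = yaml_cfg.get("example_chat") or {}
    token = section.get("token")
    # Guard with "not os.getenv(...)" so an existing env value is never overwritten.
    if token and not os.getenv("EXAMPLE_CHAT_TOKEN"):
        os.environ["EXAMPLE_CHAT_TOKEN"] = str(token)
    extra = {}
    if "reply_prefix" in section:
        extra["reply_prefix"] = section["reply_prefix"]
    return extra or None
```

Both functions would be passed to ctx.register_platform() under the keyword names listed above; check the docsite for the exact registration call.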
Subclassing for platform-specific UX. When a platform has a hard
time-window constraint that the base adapter can't anticipate (LINE's
60s single-use reply token, WhatsApp's 24h session window, etc.), an
adapter can override _keep_typing to layer a mid-flight bubble at a
threshold without expanding the kwarg surface. Always
await super()._keep_typing(...) so the typing heartbeat keeps running,
and tear down your side task in finally. See plugins/platforms/line/
for the full pattern (Template Buttons postback at 45s, RequestCache
state machine, interrupt_session_activity override for /stop
orphans) and the developer-guide page for the prose walkthrough.
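The override pattern can be sketched in isolation as follows. The BasePlatformAdapter here is a stand-in stub, and the _keep_typing(chat_id, ...) signature, the threshold value, and the _send_bubble_after_threshold helper are assumptions for illustration; the real base class and the LINE plugin are the authority.

```python
import asyncio


class BasePlatformAdapter:
    """Stand-in stub for the real base class (assumed signature)."""

    async def _keep_typing(self, chat_id, *args, **kwargs):
        while True:  # typing heartbeat; runs until the task is cancelled
            await asyncio.sleep(0.01)


class SlowReplyAdapter(BasePlatformAdapter):
    THRESHOLD_SECONDS = 0.02  # hypothetical; a real adapter would use e.g. 45s

    def __init__(self):
        self.bubbles = []

    async def _send_bubble_after_threshold(self, chat_id):
        await asyncio.sleep(self.THRESHOLD_SECONDS)
        self.bubbles.append(chat_id)  # placeholder for the mid-flight bubble send

    async def _keep_typing(self, chat_id, *args, **kwargs):
        side_task = asyncio.create_task(self._send_bubble_after_threshold(chat_id))
        try:
            # Keep the base heartbeat running unchanged.
            await super()._keep_typing(chat_id, *args, **kwargs)
        finally:
            # Tear down the side task whether the reply finished or was cancelled.
            side_task.cancel()
            await asyncio.gather(side_task, return_exceptions=True)


async def demo(run_for: float) -> list:
    adapter = SlowReplyAdapter()
    typing = asyncio.create_task(adapter._keep_typing("chat-1"))
    await asyncio.sleep(run_for)
    typing.cancel()  # the gateway stops typing once the reply is ready
    await asyncio.gather(typing, return_exceptions=True)
    return adapter.bubbles
```

A reply that finishes before the threshold never sends the bubble, because the finally block cancels the side task first.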
Sibling adapters that share behavior. When a single platform has
two transport modes the user picks between — unofficial vs official
APIs, polling vs websocket, library A vs library B — the right
structure is two adapters that share a behavior mixin. WhatsApp does
this: gateway/platforms/whatsapp.py (Baileys bridge) and
gateway/platforms/whatsapp_cloud.py (Meta Cloud API) both inherit
from WhatsAppBehaviorMixin in gateway/platforms/whatsapp_common.py.
The mixin owns gating, allow-lists, mention parsing, broadcast
filters, and the WhatsApp-flavored markdown conversion — everything
that's platform-protocol-agnostic. Each adapter owns its transport.
Both register distinct Platform.* enum values so the gateway can run
both simultaneously against different phone numbers. The mixin must
come first in the bases list — class WhatsAppAdapter(Mixin, BasePlatformAdapter) — so the mixin's format_message overrides
BasePlatformAdapter's generic default.
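The base-ordering rule follows from Python's method resolution order, which a toy example makes visible. The classes below are simplified stand-ins that reuse the real names only for readability; the format_message bodies are invented.

```python
class BasePlatformAdapter:
    """Stand-in for the real base class."""

    def format_message(self, text: str) -> str:
        return text  # generic default: pass text through unchanged


class WhatsAppBehaviorMixin:
    """Stand-in for the shared mixin."""

    def format_message(self, text: str) -> str:
        return text.replace("**", "*")  # toy markdown-to-WhatsApp bold conversion


class MixinFirst(WhatsAppBehaviorMixin, BasePlatformAdapter):
    pass


class BaseFirst(BasePlatformAdapter, WhatsAppBehaviorMixin):
    pass


print(MixinFirst().format_message("**hi**"))  # mixin's version is used
print(BaseFirst().format_message("**hi**"))   # base default shadows the mixin
```

With the mixin listed second, the base class's generic format_message is found first and the mixin's conversion never runs.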
See plugins/platforms/irc/, plugins/platforms/teams/, and
plugins/platforms/google_chat/ for complete working examples, and
website/docs/developer-guide/adding-platform-adapters.md for the full
plugin guide with code examples and hook documentation.
Built-in Path (Core Contributors Only)
Checklist for integrating a platform directly into the Hermes core. Use this as a reference when building a built-in adapter — every item here is a real integration point. Missing any of them will cause broken functionality, missing features, or inconsistent behavior.
1. Core Adapter (gateway/platforms/<platform>.py)
The adapter is a subclass of BasePlatformAdapter from gateway/platforms/base.py.
Required methods
| Method | Purpose |
|---|---|
| __init__(self, config) | Parse config, init state. Call super().__init__(config, Platform.YOUR_PLATFORM) |
| connect() -> bool | Connect to the platform, start listeners. Return True on success |
| disconnect() | Stop listeners, close connections, cancel tasks |
| send(chat_id, text, ...) -> SendResult | Send a text message |
| send_typing(chat_id) | Send typing indicator |
| send_image(chat_id, image_url, caption) -> SendResult | Send an image |
| get_chat_info(chat_id) -> dict | Return {name, type, chat_id} for a chat |
Optional methods (have default stubs in base)
| Method | Purpose |
|---|---|
| send_document(chat_id, path, caption) | Send a file attachment |
| send_voice(chat_id, path) | Send a voice message |
| send_video(chat_id, path, caption) | Send a video |
| send_animation(chat_id, path, caption) | Send a GIF/animation |
| send_image_file(chat_id, path, caption) | Send image from local file |
Interactive UX (recommended if your platform supports tappable buttons)
If your platform supports interactive button/menu messages, implement these for a more polished agent experience. They all degrade gracefully to plain text when not overridden:
| Method | Purpose |
|---|---|
| send_clarify(chat_id, question, choices, clarify_id, session_key, ...) | Render the clarify tool's multi-choice question as tappable buttons. Pair with inbound dispatch that routes button taps to tools.clarify_gateway.resolve_gateway_clarify. |
| send_exec_approval(chat_id, command, session_key, description, ...) | Render dangerous-command approval as Approve/Deny buttons. Inbound dispatch routes to tools.approval.resolve_gateway_approval. |
| send_slash_confirm(chat_id, title, message, session_key, confirm_id, ...) | Render slash-command confirmations (e.g. /reload-mcp) as Once/Always/Cancel buttons. Inbound dispatch routes to tools.slash_confirm.resolve. |
| send_model_picker(...) | Interactive /model picker. Used by Telegram and Discord. |
See gateway/platforms/telegram.py, discord.py, and whatsapp_cloud.py for reference implementations. The button-callback id convention (cl:<id>:<idx>, appr:<id>:<choice>, sc:<choice>:<id>) is shared across adapters — match it so the gateway-side resolvers work without modification.
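On the inbound side, an adapter has to split these callback ids before handing them to the resolvers. The parser below is a sketch under stated assumptions: ButtonTap and parse_button_id are hypothetical names, ids are assumed not to contain colons, and the clarify index is assumed to be a non-negative integer. The reference adapters define the real parsing.

```python
from typing import NamedTuple, Optional


class ButtonTap(NamedTuple):
    kind: str        # "clarify", "approval", or "slash_confirm"
    request_id: str  # the <id> part of the callback id
    choice: str      # the <idx> or <choice> part


def parse_button_id(raw: str) -> Optional[ButtonTap]:
    """Split a callback id following the shared convention; None if it doesn't match."""
    parts = raw.split(":")
    if len(parts) != 3 or not all(parts):
        return None
    prefix, second, third = parts
    if prefix == "cl" and third.isdigit():   # cl:<id>:<idx>
        return ButtonTap("clarify", second, third)
    if prefix == "appr":                     # appr:<id>:<choice>
        return ButtonTap("approval", second, third)
    if prefix == "sc":                       # sc:<choice>:<id> (choice comes first)
        return ButtonTap("slash_confirm", third, second)
    return None
```

Note that the slash-confirm form puts the choice before the id, so a single positional split would mislabel it.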
Required function
def check_<platform>_requirements() -> bool:
    """Check if this platform's dependencies are available."""
Key patterns to follow
- Use self.build_source(...) to construct SessionSource objects
- Call self.handle_message(event) to dispatch inbound messages to the gateway
- Use MessageEvent, MessageType, SendResult from base
- Use cache_image_from_bytes, cache_audio_from_bytes, cache_document_from_bytes for attachments
- Filter self-messages (prevent reply loops)
- Filter sync/echo messages if the platform has them
- Redact sensitive identifiers (phone numbers, tokens) in all log output
- Implement reconnection with exponential backoff + jitter for streaming connections
- Set MAX_MESSAGE_LENGTH if the platform has message size limits
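For the reconnection item, one common way to compute the delay is the "full jitter" variant sketched below. The function name, the 1s base, and the 60s cap are illustrative choices, not values taken from the existing adapters.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    # Clamp the exponent so very long outages cannot overflow the float.
    exponent = min(max(attempt, 0), 32)
    ceiling = min(cap, base * (2 ** exponent))
    # Full jitter: pick uniformly in [0, ceiling) so clients don't reconnect in lockstep.
    return ceiling * rng()
```

A reconnect loop would await asyncio.sleep(backoff_delay(attempt)) between attempts and reset attempt to 0 after a successful connect.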
2. Platform Enum (gateway/config.py)
Add the platform to the Platform enum:
class Platform(Enum):
    ...
    YOUR_PLATFORM = "your_platform"
Add env var loading in _apply_env_overrides():
# Your Platform
your_token = os.getenv("YOUR_PLATFORM_TOKEN")
if your_token:
    if Platform.YOUR_PLATFORM not in config.platforms:
        config.platforms[Platform.YOUR_PLATFORM] = PlatformConfig()
    config.platforms[Platform.YOUR_PLATFORM].enabled = True
    config.platforms[Platform.YOUR_PLATFORM].token = your_token
Update get_connected_platforms() if your platform doesn't use token/api_key
(e.g., WhatsApp uses enabled flag, Signal uses extra dict).
3. Adapter Factory (gateway/run.py)
Add to _create_adapter():
elif platform == Platform.YOUR_PLATFORM:
    from gateway.platforms.your_platform import YourAdapter, check_your_requirements
    if not check_your_requirements():
        logger.warning("Your Platform: dependencies not met")
        return None
    return YourAdapter(config)
4. Authorization Maps (gateway/run.py)
Add to BOTH dicts in _is_user_authorized():
platform_env_map = {
    ...
    Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOWED_USERS",
}
platform_allow_all_map = {
    ...
    Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOW_ALL_USERS",
}
5. Session Source (gateway/session.py)
If your platform needs extra identity fields (e.g., Signal's UUID alongside
phone number), add them to the SessionSource dataclass with Optional defaults,
and update to_dict(), from_dict(), and build_source() in base.py.
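A simplified sketch of what adding such a field can look like, so that older serialized sessions still load. The dataclass below is a stand-in, not the real SessionSource: its existing fields, the user_uuid name, and the to_dict/from_dict bodies are assumptions.

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class SessionSource:
    """Stand-in with a reduced, assumed field set."""

    platform: str
    chat_id: str
    user_id: Optional[str] = None
    user_uuid: Optional[str] = None  # hypothetical new field, defaulting to None

    def to_dict(self) -> dict:
        # Omit unset optionals so stored payloads stay compact.
        return {k: v for k, v in asdict(self).items() if v is not None}

    @classmethod
    def from_dict(cls, data: dict) -> "SessionSource":
        # Ignore unknown keys; missing optionals fall back to their defaults.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Because the new field has an Optional default, a payload written before the field existed round-trips without a migration.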
6. System Prompt Hints (agent/prompt_builder.py)
Add a PLATFORM_HINTS entry so the agent knows what platform it's on:
PLATFORM_HINTS = {
    ...
    "your_platform": (
        "You are on Your Platform. "
        "Describe formatting capabilities, media support, etc."
    ),
}
Without this, the agent won't know it's on your platform and may use inappropriate formatting (e.g., markdown on platforms that don't render it).
7. Toolset (toolsets.py)
Add a named toolset for your platform:
"hermes-your-platform": {
    "description": "Your Platform bot toolset",
    "tools": _HERMES_CORE_TOOLS,
    "includes": []
},
And add it to the hermes-gateway composite:
"hermes-gateway": {
    "includes": [..., "hermes-your-platform"]
}
8. Cron Delivery (cron/scheduler.py)
Add to platform_map in _deliver_result():
platform_map = {
    ...
    "your_platform": Platform.YOUR_PLATFORM,
}
Without this, cronjob(action="create", deliver="your_platform", ...) silently fails.
9. Send Message Tool (tools/send_message_tool.py)
Add to platform_map in send_message_tool():
platform_map = {
    ...
    "your_platform": Platform.YOUR_PLATFORM,
}
Add routing in _send_to_platform():
elif platform == Platform.YOUR_PLATFORM:
    return await _send_your_platform(pconfig, chat_id, message)
Implement _send_your_platform() — a standalone async function that sends
a single message without requiring the full adapter (for use by cron jobs
and the send_message tool outside the gateway process).
Update the tool schema target description to include your platform example.
10. Cronjob Tool Schema (tools/cronjob_tools.py)
Update the deliver parameter description and docstring to mention your
platform as a delivery option.
11. Channel Directory (gateway/channel_directory.py)
If your platform can't enumerate chats (most can't), add it to the session-based discovery list:
for plat_name in ("telegram", "whatsapp", "signal", "your_platform"):
12. Status Display (hermes_cli/status.py)
Add to the platforms dict in the Messaging Platforms section:
platforms = {
    ...
    "Your Platform": ("YOUR_PLATFORM_TOKEN", "YOUR_PLATFORM_HOME_CHANNEL"),
}
13. Gateway Setup Wizard (hermes_cli/gateway.py)
Add to the _PLATFORMS list:
{
    "key": "your_platform",
    "label": "Your Platform",
    "emoji": "📱",
    "token_var": "YOUR_PLATFORM_TOKEN",
    "setup_instructions": [...],
    "vars": [...],
}
If your platform needs custom setup logic (connectivity testing, QR codes,
policy choices), add a _setup_your_platform() function and route to it
in the platform selection switch.
Update _platform_status() if your platform's "configured" check differs
from the standard bool(get_env_value(token_var)).
14. Phone/ID Redaction (agent/redact.py)
If your platform uses sensitive identifiers (phone numbers, etc.), add a
regex pattern and redaction function to agent/redact.py. This ensures
identifiers are masked in ALL log output, not just your adapter's logs.
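A redaction helper along these lines might look like the sketch below. The regex, the function name, and the choice to keep the last four digits are illustrative assumptions; match the conventions of the existing patterns in agent/redact.py rather than copying this.

```python
import re

# 8 to 15 digits with an optional leading "+", not embedded in a longer
# word or digit run (so long message ids are left alone).
_PHONE_RE = re.compile(r"(?<![\w+])\+?\d{8,15}(?!\d)")


def redact_phone_numbers(text: str) -> str:
    """Mask phone-number-like runs, keeping only the last four digits."""
    return _PHONE_RE.sub(lambda m: "***" + m.group(0)[-4:], text)


print(redact_phone_numbers("inbound from +15551234567 on port 8080"))
```

Keeping a short suffix preserves enough to correlate log lines for one sender without exposing the full number.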
15. Documentation
| File | What to update |
|---|---|
| README.md | Platform list in feature table + documentation table |
| AGENTS.md | Gateway description + env var config section |
| website/docs/user-guide/messaging/<platform>.md | NEW: Full setup guide (see existing platform docs for template) |
| website/docs/user-guide/messaging/index.md | Architecture diagram, toolset table, security examples, Next Steps links |
| website/docs/reference/environment-variables.md | All env vars for the platform |
16. Tests (tests/gateway/test_<platform>.py)
Recommended test coverage:
- Platform enum exists with correct value
- Config loading from env vars via _apply_env_overrides
- Adapter init (config parsing, allowlist handling, default values)
- Helper functions (redaction, parsing, file type detection)
- Session source round-trip (to_dict → from_dict)
- Authorization integration (platform in allowlist maps)
- Send message tool routing (platform in platform_map)
Optional but valuable:
- Async tests for message handling flow (mock the platform API)
- SSE/WebSocket reconnection logic
- Attachment processing
- Group message filtering
Quick Verification
After implementing everything, verify with:
# All tests pass
python -m pytest tests/ -q
# Grep for existing platform names to list every file that is an integration point
grep -r "telegram\|discord\|whatsapp\|slack" gateway/ tools/ agent/ cron/ hermes_cli/ toolsets.py \
    --include="*.py" -l | sort -u
# Check each file in the output: if it mentions other platforms but not yours, you missed it