---
sidebar_position: 7
title: "Gateway Internals"
description: "How the messaging gateway boots, authorizes users, routes sessions, and delivers messages"
---

# Gateway Internals

The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.

## Key Files

| File | Purpose |
|------|---------|
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~9,000 lines) |
| `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
| `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
| `gateway/pairing.py` | DM pairing flow for user authorization |
| `gateway/channel_directory.py` | Maps chat IDs to human-readable names for cron delivery |
| `gateway/hooks.py` | Hook discovery, loading, and lifecycle event dispatch |
| `gateway/mirror.py` | Cross-session message mirroring for `send_message` |
| `gateway/status.py` | Token lock management for profile-scoped gateway instances |
| `gateway/builtin_hooks/` | Always-registered hooks (e.g., BOOT.md system prompt hook) |
| `gateway/platforms/` | Platform adapters (one per messaging platform) |

## Architecture Overview

```text
┌─────────────────────────────────────────────────┐
│                  GatewayRunner                  │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Telegram │  │ Discord  │  │  Slack   │       │
│  │ Adapter  │  │ Adapter  │  │ Adapter  │       │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘       │
│       │             │             │             │
│       └─────────────┼─────────────┘             │
│                     ▼                           │
│             _handle_message()                   │
│                     │                           │
│         ┌───────────┼───────────┐               │
│         ▼           ▼           ▼               │
│   Slash command  AIAgent    Queue/BG            │
│     dispatch    creation    sessions            │
│                     │                           │
│                     ▼                           │
│               SessionStore                      │
│           (SQLite persistence)                  │
└─────────────────────────────────────────────────┘
```

## Message Flow

When a message arrives from any platform:

1. **Platform adapter** receives the raw event and normalizes it into a `MessageEvent`
2. **Base adapter** checks the active-session guard:
   - If an agent is running for this session → queue the message and set the interrupt event
   - If the message is `/approve`, `/deny`, or `/stop` → bypass the guard (dispatched inline)
3. **GatewayRunner._handle_message()** receives the event:
   - Resolve session key via `_session_key_for_source()` (format: `agent:main:{platform}:{chat_type}:{chat_id}`)
   - Check authorization (see Authorization below)
   - Check if it's a slash command → dispatch to command handler
   - Check if agent is already running → intercept commands like `/stop`, `/status`
   - Otherwise → create `AIAgent` instance and run conversation
4. **Response** is sent back through the platform adapter

### Session Key Format

Session keys encode the full routing context:

```
agent:main:{platform}:{chat_type}:{chat_id}
```

For example: `agent:main:telegram:private:123456789`

Thread-aware platforms (Telegram forum topics, Discord threads, Slack threads) may include thread IDs in the `chat_id` portion. **Never construct session keys manually** — always use `build_session_key()` from `gateway/session.py`.

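To make the key layout concrete, here is a minimal sketch that splits a key back into its routing fields. `SessionKeyParts` and `parse_session_key` are hypothetical names used only for illustration; they are not part of `gateway/session.py`. The sketch assumes the `agent:main:` prefix and colon separators shown in the format string, and that any thread ID stays inside the `chat_id` field. Keys should still be built with `build_session_key()`.

```python
from typing import NamedTuple


class SessionKeyParts(NamedTuple):
    """Routing fields encoded in a session key (illustrative only)."""
    platform: str
    chat_type: str
    chat_id: str


def parse_session_key(key: str) -> SessionKeyParts:
    """Split 'agent:main:{platform}:{chat_type}:{chat_id}' into its fields.

    maxsplit=4 keeps any further colons (for example a thread ID suffix)
    inside chat_id instead of producing extra fields.
    """
    parts = key.split(":", 4)
    if len(parts) != 5 or parts[:2] != ["agent", "main"]:
        raise ValueError(f"not a gateway session key: {key!r}")
    return SessionKeyParts(*parts[2:])


parts = parse_session_key("agent:main:telegram:private:123456789")
print(parts.platform, parts.chat_type, parts.chat_id)
```

Reading a key this way can help when debugging logs; it is not a substitute for the real key builder.
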
### Two-Level Message Guard

When an agent is actively running, incoming messages pass through two sequential guards:

1. **Level 1 — Base adapter** (`gateway/platforms/base.py`): Checks `_active_sessions`. If the session is active, queues the message in `_pending_messages` and sets an interrupt event. This catches messages *before* they reach the gateway runner.

2. **Level 2 — Gateway runner** (`gateway/run.py`): Checks `_running_agents`. Intercepts specific commands (`/stop`, `/new`, `/queue`, `/status`, `/approve`, `/deny`) and routes them appropriately. Everything else triggers `running_agent.interrupt()`.

Commands that must reach the runner while the agent is blocked (like `/approve`) are dispatched **inline** via `await self._message_handler(event)` — they bypass the background task system to avoid race conditions.

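The two guards can be pictured with a small synchronous model. This is a sketch under stated assumptions, not the gateway's code: `GuardSketch` and its attributes are hypothetical stand-ins for `_active_sessions`, `_pending_messages`, and `_running_agents`, the level 1 bypass set is taken from the Message Flow list, and the real implementation is asynchronous.

```python
from collections import defaultdict

# Commands the runner intercepts instead of interrupting the agent.
RUNNER_COMMANDS = {"/stop", "/new", "/queue", "/status", "/approve", "/deny"}
# Commands the base adapter lets through even while a session is active.
BYPASS_COMMANDS = {"/approve", "/deny", "/stop"}


class GuardSketch:
    def __init__(self):
        self.active_sessions = set()               # stands in for _active_sessions
        self.pending_messages = defaultdict(list)  # stands in for _pending_messages
        self.running_agents = set()                # stands in for _running_agents

    def base_adapter(self, session_key, text):
        """Level 1: queue ordinary messages that arrive for an active session."""
        command = text.split()[0] if text.startswith("/") else None
        if session_key in self.active_sessions and command not in BYPASS_COMMANDS:
            self.pending_messages[session_key].append(text)
            return "queued+interrupt"
        return self.runner(session_key, command)

    def runner(self, session_key, command):
        """Level 2: intercept known commands, interrupt on anything else."""
        if session_key in self.running_agents:
            if command in RUNNER_COMMANDS:
                return f"handled {command}"
            return "interrupt"
        return "start agent"


guard = GuardSketch()
print(guard.base_adapter("k", "hello"))      # no agent running yet
guard.active_sessions.add("k")
guard.running_agents.add("k")
print(guard.base_adapter("k", "another"))    # caught at level 1
print(guard.base_adapter("k", "/approve"))   # bypasses level 1, handled at level 2
```
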
## Authorization

The gateway uses a multi-layer authorization check, evaluated in order:

1. **Per-platform allow-all flag** (e.g., `TELEGRAM_ALLOW_ALL_USERS`) — if set, all users on that platform are authorized
2. **Platform allowlist** (e.g., `TELEGRAM_ALLOWED_USERS`) — comma-separated user IDs
3. **DM pairing** — authenticated users can pair new users via a pairing code
4. **Global allow-all** (`GATEWAY_ALLOW_ALL_USERS`) — if set, all users across all platforms are authorized
5. **Default: deny** — unauthorized users are rejected

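The ordering above can be sketched as a single function where the first matching layer wins. `is_authorized` is a hypothetical helper, not the gateway's API. It assumes that every platform follows the `{PLATFORM}_ALLOW_ALL_USERS` / `{PLATFORM}_ALLOWED_USERS` naming seen in the Telegram examples, that a flag counts as set when it holds a common truthy string, and that paired users are available as a set of `(platform, user_id)` tuples.

```python
def is_authorized(platform, user_id, env, paired_users=frozenset()):
    """Evaluate the layers in the documented order; the first match wins."""
    prefix = platform.upper()

    def flag(name):
        # Assumption: "1", "true", or "yes" (any case) means the flag is set.
        return env.get(name, "").strip().lower() in {"1", "true", "yes"}

    # 1. Per-platform allow-all flag
    if flag(f"{prefix}_ALLOW_ALL_USERS"):
        return True
    # 2. Platform allowlist (comma-separated user IDs)
    raw = env.get(f"{prefix}_ALLOWED_USERS", "")
    allowed = {item.strip() for item in raw.split(",") if item.strip()}
    if user_id in allowed:
        return True
    # 3. DM pairing
    if (platform, user_id) in paired_users:
        return True
    # 4. Global allow-all
    if flag("GATEWAY_ALLOW_ALL_USERS"):
        return True
    # 5. Default: deny
    return False


env = {"TELEGRAM_ALLOWED_USERS": "111, 222"}
print(is_authorized("telegram", "111", env))
print(is_authorized("telegram", "333", env))
```

Passing `env` as a plain dict (rather than reading `os.environ` directly) keeps the sketch easy to test.
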
### DM Pairing Flow

```text
Admin: /pair
Gateway: "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway: "Paired! You're now authorized."
```

Pairing state is persisted by `gateway/pairing.py` and survives restarts.

## Slash Command Dispatch

All slash commands in the gateway flow through the same resolution pipeline:

1. `resolve_command()` from `hermes_cli/commands.py` maps input to canonical name (handles aliases, prefix matching)
2. The canonical name is checked against `GATEWAY_KNOWN_COMMANDS`
3. Handler in `_handle_message()` dispatches based on canonical name
4. Some commands are gated on config (`gateway_config_gate` on `CommandDef`)

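As an illustration of step 1, the sketch below resolves user input to a canonical name using exact names, aliases, and prefix matching. It is not the code in `hermes_cli/commands.py`: the `COMMANDS` table, its aliases, and the rule that only an unambiguous prefix is accepted are all assumptions made for this example.

```python
# Hypothetical command table: canonical name -> aliases.
COMMANDS = {"model": ["m"], "status": [], "stop": [], "new": ["reset"]}


def resolve_command_sketch(text):
    """Map '/input args' to a canonical command name, or None if unresolved."""
    if not text.startswith("/"):
        return None
    words = text[1:].split()
    if not words:
        return None
    name = words[0].lower()
    # Exact canonical names and aliases win over prefix matches.
    for canonical, aliases in COMMANDS.items():
        if name == canonical or name in aliases:
            return canonical
    # Prefix matching: accept the prefix only if it identifies one command.
    matches = [canonical for canonical in COMMANDS if canonical.startswith(name)]
    return matches[0] if len(matches) == 1 else None


print(resolve_command_sketch("/reset"))   # alias of "new"
print(resolve_command_sketch("/sta"))     # unique prefix of "status"
print(resolve_command_sketch("/st"))      # ambiguous: "status" or "stop"
```
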
### Running-Agent Guard

Commands that must NOT execute while the agent is processing are rejected early:

```python
if _quick_key in self._running_agents:
    if canonical == "model":
        return "⏳ Agent is running — wait for it to finish or /stop first."
```

Bypass commands (`/stop`, `/new`, `/approve`, `/deny`, `/queue`, `/status`) have special handling.

## Config Sources

The gateway reads configuration from multiple sources:

| Source | What it provides |
|--------|-----------------|
| `~/.hermes/.env` | API keys, bot tokens, platform credentials |
| `~/.hermes/config.yaml` | Model settings, tool configuration, display options |
| Environment variables | Override any of the above |

Unlike the CLI (which uses `load_cli_config()` with hardcoded defaults), the gateway reads `config.yaml` directly via a YAML loader. This means config keys that exist in the CLI's defaults dict but not in the user's config file may behave differently between the CLI and the gateway.

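The difference between the two loading styles is easiest to see with plain dictionaries. The sketch below is illustrative only: `CLI_DEFAULTS`, `load_with_defaults`, and `load_raw` are hypothetical, the keys are invented, and the dict `user_config` stands in for the already-parsed content of a `config.yaml`.

```python
# Hypothetical defaults; the real CLI defaults dict has different keys.
CLI_DEFAULTS = {"display": {"compact": False}, "model": {"name": "default-model"}}


def load_with_defaults(user_config):
    """CLI style: overlay the user's values on hardcoded defaults (one level deep)."""
    merged = {section: dict(values) for section, values in CLI_DEFAULTS.items()}
    for section, values in user_config.items():
        merged.setdefault(section, {}).update(values)
    return merged


def load_raw(user_config):
    """Gateway style: use only what the user's file contains."""
    return user_config


user_config = {"model": {"name": "my-model"}}
cli = load_with_defaults(user_config)
gateway = load_raw(user_config)

print(cli["display"]["compact"])                    # False: filled in from defaults
print(gateway.get("display", {}).get("compact"))    # None: key absent from the file
```

Code that reads such a key in the gateway therefore needs its own fallback, for example through `dict.get` with an explicit default.
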
## Platform Adapters

Each messaging platform has an adapter in `gateway/platforms/`:

```text
gateway/platforms/
├── base.py           # BaseAdapter — shared logic for all platforms
├── telegram.py       # Telegram Bot API (long polling or webhook)
├── discord.py        # Discord bot via discord.py
├── slack.py          # Slack Socket Mode
├── whatsapp.py       # WhatsApp Business Cloud API
├── signal.py         # Signal via signal-cli REST API
├── matrix.py         # Matrix via mautrix (optional E2EE)
├── mattermost.py     # Mattermost WebSocket API
├── email.py          # Email via IMAP/SMTP
├── sms.py            # SMS via Twilio
├── dingtalk.py       # DingTalk WebSocket
├── feishu.py         # Feishu/Lark WebSocket or webhook
├── wecom.py          # WeCom (WeChat Work) callback
├── weixin.py         # Weixin (personal WeChat) via iLink Bot API
├── bluebubbles.py    # Apple iMessage via BlueBubbles macOS server
├── qqbot.py          # QQ Bot (Tencent QQ) via Official API v2
├── webhook.py        # Inbound/outbound webhook adapter
├── api_server.py     # REST API server adapter
└── homeassistant.py  # Home Assistant conversation integration
```

Adapters implement a common interface:
- `connect()` / `disconnect()` — lifecycle management
- `send_message()` — outbound message delivery
- `on_message()` — inbound message normalization → `MessageEvent`

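A toy adapter shows how the four methods fit together. This is a sketch, not the real `BaseAdapter`: the class names, the `MessageEvent` fields, the method signatures, and the choice of which methods are `async` are assumptions made for the example.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class MessageEvent:
    """Normalized inbound message; these field names are assumed."""
    platform: str
    chat_id: str
    text: str


class AdapterSketch(ABC):
    @abstractmethod
    async def connect(self): ...

    @abstractmethod
    async def disconnect(self): ...

    @abstractmethod
    async def send_message(self, chat_id, text): ...

    @abstractmethod
    def on_message(self, raw) -> MessageEvent: ...


class EchoAdapter(AdapterSketch):
    """Toy platform that records outbound messages in memory."""

    def __init__(self):
        self.connected = False
        self.sent = []

    async def connect(self):
        self.connected = True

    async def disconnect(self):
        self.connected = False

    async def send_message(self, chat_id, text):
        self.sent.append((chat_id, text))

    def on_message(self, raw):
        # Normalize the platform-specific payload into a MessageEvent.
        return MessageEvent(platform="echo", chat_id=str(raw["chat"]), text=raw["body"])


async def demo():
    adapter = EchoAdapter()
    await adapter.connect()
    event = adapter.on_message({"chat": 42, "body": "hello"})
    await adapter.send_message(event.chat_id, f"you said: {event.text}")
    await adapter.disconnect()
    return adapter.sent


print(asyncio.run(demo()))
```
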
### Token Locks

Adapters that connect with unique credentials call `acquire_scoped_lock()` in `connect()` and `release_scoped_lock()` in `disconnect()`. This prevents two profiles from using the same bot token simultaneously.

## Delivery Path

Outgoing deliveries (`gateway/delivery.py`) handle:

- **Direct reply** — send response back to the originating chat
- **Home channel delivery** — route cron job outputs and background results to a configured home channel
- **Explicit target delivery** — `send_message` tool specifying `telegram:-1001234567890`
- **Cross-platform delivery** — deliver to a different platform than the originating message

Cron job deliveries are NOT mirrored into gateway session history — they live in their own cron session only. This is a deliberate design choice to avoid message alternation violations.

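An explicit target such as `telegram:-1001234567890` pairs a platform name with a chat ID. The helper below is a hypothetical sketch of how such a string could be split; it assumes the `platform:chat_id` shape of that single example and is not taken from `gateway/delivery.py`.

```python
def parse_target(target):
    """Split 'platform:chat_id' on the first colon only."""
    platform, sep, chat_id = target.partition(":")
    if not sep or not platform or not chat_id:
        raise ValueError(f"expected 'platform:chat_id', got {target!r}")
    return platform, chat_id


print(parse_target("telegram:-1001234567890"))
```

Splitting on the first colon only leaves any later colons inside the chat ID untouched.
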
## Hooks

Gateway hooks are Python modules that respond to lifecycle events:

### Gateway Hook Events

| Event | When fired |
|-------|-----------|
| `gateway:startup` | Gateway process starts |
| `session:start` | New conversation session begins |
| `session:end` | Session completes or times out |
| `session:reset` | User resets session with `/new` |
| `agent:start` | Agent begins processing a message |
| `agent:step` | Agent completes one tool-calling iteration |
| `agent:end` | Agent finishes and returns response |
| `command:*` | Any slash command is executed |

Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.

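The `command:*` entry suggests wildcard matching on event names. The sketch below shows one way such dispatch could work, using `fnmatch.fnmatchcase` from the standard library. `HookRegistrySketch` and the `(event, context)` handler signature are assumptions; the actual manifest format and handler contract live in `gateway/hooks.py`.

```python
from fnmatch import fnmatchcase


class HookRegistrySketch:
    def __init__(self):
        self._handlers = []  # list of (pattern, callable) pairs

    def register(self, pattern, handler):
        self._handlers.append((pattern, handler))

    def emit(self, event, context):
        """Call every handler whose pattern matches the event name."""
        return [
            handler(event, context)
            for pattern, handler in self._handlers
            if fnmatchcase(event, pattern)
        ]


registry = HookRegistrySketch()
registry.register("command:*", lambda event, ctx: f"saw {event}")
registry.register("session:start", lambda event, ctx: "new session")

print(registry.emit("command:stop", {}))
print(registry.emit("session:start", {}))
print(registry.emit("agent:end", {}))
```
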
## Memory Provider Integration

When a memory provider plugin (e.g., Honcho) is enabled:

1. Gateway creates an `AIAgent` per message with the session ID
2. The `MemoryManager` initializes the provider with the session context
3. Provider tools (e.g., `honcho_profile`, `viking_search`) are routed through:

   ```text
   AIAgent._invoke_tool()
     → self._memory_manager.handle_tool_call(name, args)
       → provider.handle_tool_call(name, args)
   ```

4. On session end/reset, `on_session_end()` fires for cleanup and final data flush

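The routing chain in step 3 is plain delegation, which the following sketch models with three small classes. All class names, the `owns()` check, and the `tool_names` attribute are hypothetical; only the `handle_tool_call(name, args)` call shape comes from the chain shown above.

```python
class ProviderSketch:
    """Stand-in for a memory provider plugin."""
    tool_names = {"honcho_profile"}

    def handle_tool_call(self, name, args):
        return {"tool": name, "args": args, "handled_by": "provider"}


class MemoryManagerSketch:
    def __init__(self, provider):
        self._provider = provider

    def owns(self, name):
        # Assumption: the manager knows which tool names belong to the provider.
        return name in self._provider.tool_names

    def handle_tool_call(self, name, args):
        return self._provider.handle_tool_call(name, args)


class AgentSketch:
    def __init__(self, memory_manager):
        self._memory_manager = memory_manager

    def _invoke_tool(self, name, args):
        # Provider tools are forwarded; other tools would use the normal registry.
        if self._memory_manager.owns(name):
            return self._memory_manager.handle_tool_call(name, args)
        raise KeyError(f"unknown tool: {name}")


agent = AgentSketch(MemoryManagerSketch(ProviderSketch()))
print(agent._invoke_tool("honcho_profile", {"user": "alice"}))
```
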
### Memory Flush Lifecycle

When a session is reset, resumed, or expires:
1. Built-in memories are flushed to disk
2. Memory provider's `on_session_end()` hook fires
3. A temporary `AIAgent` runs a memory-only conversation turn
4. Context is then discarded or archived

## Background Maintenance

The gateway runs periodic maintenance alongside message handling:

- **Cron ticking** — checks job schedules and fires due jobs
- **Session expiry** — cleans up abandoned sessions after timeout
- **Memory flush** — proactively flushes memory before session expiry
- **Cache refresh** — refreshes model lists and provider status

## Process Management

The gateway runs as a long-lived process, managed via:

- `hermes gateway start` / `hermes gateway stop` — manual control
- `systemctl` (Linux) or `launchctl` (macOS) — service management
- PID file at `~/.hermes/gateway.pid` — profile-scoped process tracking

**Profile-scoped vs global**: `start_gateway()` uses profile-scoped PID files. `hermes gateway stop` stops only the current profile's gateway. `hermes gateway stop --all` uses global `ps aux` scanning to kill all gateway processes (used during updates).

## Related Docs

- [Session Storage](./session-storage.md)
- [Cron Internals](./cron-internals.md)
- [ACP Internals](./acp-internals.md)
- [Agent Loop Internals](./agent-loop.md)
- [Messaging Gateway (User Guide)](/docs/user-guide/messaging)