mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
* docs: fix ascii-guard border alignment errors
Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:
- architecture.md: outer box is 71 cols but inner-box content lines
and border corners were offset by 1 col, making content-line right
border at col 70/72 while top/bottom border was at col 71. Inner
boxes also had border corners at cols 19/36/53 but content pipes
at cols 20/37/54. Rewrote the diagram with consistent 71-col width
throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
2-space gaps and 15-space trailing padding.
- gateway-internals.md: same class of issue — outer box at 51 cols,
inner content lines varied 52-54 cols. Rewrote with consistent
51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
restructured the bottom-half message flow so it's bare text
(not half-open box cells) matching the intent of the original.
- agent-loop.md line 112-114: box 2 (API thread) content lines had
one extra space pushing the right border to col 46 while the top
and bottom borders of that box sat at col 45. Trimmed one trailing
space from each of the three content lines.
All 123 docs files now pass `npm run lint:diagrams`:
✓ Errors: 0 (warnings: 6, non-fatal)
Pre-existing failures on main — unrelated to any open PR.
* test(setup): accept description kwarg in prompt_choice mock lambdas
setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.
Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.
* test(matrix): update stale audio-caching assertion
test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.
Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.
* test(413): allow multi-pass preflight compression
run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).
test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.
The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.
* docs: drop '...' from gateway diagram, merge side-by-side boxes
ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:
1. gateway-internals.md L33: the '...' suffix after inner box 3's
right border got parsed as 'extra characters after inner-box right
border'. Dropped the '...' — the surrounding prose already conveys
'and more platforms' without needing the visual hint.
2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
boxes of different heights (main thread 7 rows, API thread 5 rows).
Even equalizing heights didn't help — the linter treats the left
box's right border as the end of the diagram. Merged into a single
54-char-wide outer box with both threads labeled as regions inside,
keeping the ▶ arrow to preserve the main→API flow direction.
278 lines
16 KiB
Markdown
278 lines
16 KiB
Markdown
---
|
|
sidebar_position: 1
|
|
title: "Architecture"
|
|
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
|
|
---
|
|
|
|
# Architecture
|
|
|
|
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.
|
|
|
|
## System Overview
|
|
|
|
```text
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ Entry Points │
|
|
│ │
|
|
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
|
|
│ Batch Runner API Server Python Library │
|
|
└──────────┬──────────────┬───────────────────────┬───────────────────┘
|
|
│ │ │
|
|
▼ ▼ ▼
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ AIAgent (run_agent.py) │
|
|
│ │
|
|
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
│ │ Prompt │ │ Provider │ │ Tool │ │
|
|
│ │ Builder │ │ Resolution │ │ Dispatch │ │
|
|
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
|
|
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
|
|
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
|
│ │ │ │ │
|
|
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
|
|
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
|
|
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
|
|
│ │ │ │ codex_resp. │ │ 47 tools │ │
|
|
│ │ │ │ anthropic │ │ 19 toolsets │ │
|
|
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
└─────────────────────────────────────────────────────────────────────┘
|
|
│ │
|
|
▼ ▼
|
|
┌───────────────────┐ ┌──────────────────────┐
|
|
│ Session Storage │ │ Tool Backends │
|
|
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
|
|
│ hermes_state.py │ │ Browser (5 backends) │
|
|
│ gateway/session.py│ │ Web (4 backends) │
|
|
└───────────────────┘ │ MCP (dynamic) │
|
|
│ File, Vision, etc. │
|
|
└──────────────────────┘
|
|
```
|
|
|
|
## Directory Structure
|
|
|
|
```text
|
|
hermes-agent/
|
|
├── run_agent.py # AIAgent — core conversation loop (~10,700 lines)
|
|
├── cli.py # HermesCLI — interactive terminal UI (~10,000 lines)
|
|
├── model_tools.py # Tool discovery, schema collection, dispatch
|
|
├── toolsets.py # Tool groupings and platform presets
|
|
├── hermes_state.py # SQLite session/state database with FTS5
|
|
├── hermes_constants.py # HERMES_HOME, profile-aware paths
|
|
├── batch_runner.py # Batch trajectory generation
|
|
│
|
|
├── agent/ # Agent internals
|
|
│ ├── prompt_builder.py # System prompt assembly
|
|
│ ├── context_engine.py # ContextEngine ABC (pluggable)
|
|
│ ├── context_compressor.py # Default engine — lossy summarization
|
|
│ ├── prompt_caching.py # Anthropic prompt caching
|
|
│ ├── auxiliary_client.py # Auxiliary LLM for side tasks (vision, summarization)
|
|
│ ├── model_metadata.py # Model context lengths, token estimation
|
|
│ ├── models_dev.py # models.dev registry integration
|
|
│ ├── anthropic_adapter.py # Anthropic Messages API format conversion
|
|
│ ├── display.py # KawaiiSpinner, tool preview formatting
|
|
│ ├── skill_commands.py # Skill slash commands
|
|
│ ├── memory_manager.py # Memory manager orchestration
|
|
│ ├── memory_provider.py # Memory provider ABC
|
|
│ └── trajectory.py # Trajectory saving helpers
|
|
│
|
|
├── hermes_cli/ # CLI subcommands and setup
|
|
│ ├── main.py # Entry point — all `hermes` subcommands (~6,000 lines)
|
|
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
|
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
|
|
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
|
|
│ ├── runtime_provider.py # Provider → api_mode + credentials
|
|
│ ├── models.py # Model catalog, provider model lists
|
|
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
|
|
│ ├── setup.py # Interactive setup wizard (~3,100 lines)
|
|
│ ├── skin_engine.py # CLI theming engine
|
|
│ ├── skills_config.py # hermes skills — enable/disable per platform
|
|
│ ├── skills_hub.py # /skills slash command
|
|
│ ├── tools_config.py # hermes tools — enable/disable per platform
|
|
│ ├── plugins.py # PluginManager — discovery, loading, hooks
|
|
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
|
|
│ └── gateway.py # hermes gateway start/stop
|
|
│
|
|
├── tools/ # Tool implementations (one file per tool)
|
|
│ ├── registry.py # Central tool registry
|
|
│ ├── approval.py # Dangerous command detection
|
|
│ ├── terminal_tool.py # Terminal orchestration
|
|
│ ├── process_registry.py # Background process management
|
|
│ ├── file_tools.py # read_file, write_file, patch, search_files
|
|
│ ├── web_tools.py # web_search, web_extract
|
|
│ ├── browser_tool.py # 10 browser automation tools
|
|
│ ├── code_execution_tool.py # execute_code sandbox
|
|
│ ├── delegate_tool.py # Subagent delegation
|
|
│ ├── mcp_tool.py # MCP client (~2,200 lines)
|
|
│ ├── credential_files.py # File-based credential passthrough
|
|
│ ├── env_passthrough.py # Env var passthrough for sandboxes
|
|
│ ├── ansi_strip.py # ANSI escape stripping
|
|
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
|
|
│
|
|
├── gateway/ # Messaging platform gateway
|
|
│ ├── run.py # GatewayRunner — message dispatch (~9,000 lines)
|
|
│ ├── session.py # SessionStore — conversation persistence
|
|
│ ├── delivery.py # Outbound message delivery
|
|
│ ├── pairing.py # DM pairing authorization
|
|
│ ├── hooks.py # Hook discovery and lifecycle events
|
|
│ ├── mirror.py # Cross-session message mirroring
|
|
│ ├── status.py # Token locks, profile-scoped process tracking
|
|
│ ├── builtin_hooks/ # Always-registered hooks
|
|
│ └── platforms/ # 18 adapters: telegram, discord, slack, whatsapp,
|
|
│ # signal, matrix, mattermost, email, sms,
|
|
│ # dingtalk, feishu, wecom, wecom_callback, weixin,
|
|
│ # bluebubbles, qqbot, homeassistant, webhook, api_server
|
|
│
|
|
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
|
|
├── cron/ # Scheduler (jobs.py, scheduler.py)
|
|
├── plugins/memory/ # Memory provider plugins
|
|
├── plugins/context_engine/ # Context engine plugins
|
|
├── environments/ # RL training environments (Atropos)
|
|
├── skills/ # Bundled skills (always available)
|
|
├── optional-skills/ # Official optional skills (install explicitly)
|
|
├── website/ # Docusaurus documentation site
|
|
└── tests/ # Pytest suite (~3,000+ tests)
|
|
```
|
|
|
|
## Data Flow
|
|
|
|
### CLI Session
|
|
|
|
```text
|
|
User input → HermesCLI.process_input()
|
|
→ AIAgent.run_conversation()
|
|
→ prompt_builder.build_system_prompt()
|
|
→ runtime_provider.resolve_runtime_provider()
|
|
→ API call (chat_completions / codex_responses / anthropic_messages)
|
|
→ tool_calls? → model_tools.handle_function_call() → loop
|
|
→ final response → display → save to SessionDB
|
|
```
|
|
|
|
### Gateway Message
|
|
|
|
```text
|
|
Platform event → Adapter.on_message() → MessageEvent
|
|
→ GatewayRunner._handle_message()
|
|
→ authorize user
|
|
→ resolve session key
|
|
→ create AIAgent with session history
|
|
→ AIAgent.run_conversation()
|
|
→ deliver response back through adapter
|
|
```
|
|
|
|
### Cron Job
|
|
|
|
```text
|
|
Scheduler tick → load due jobs from jobs.json
|
|
→ create fresh AIAgent (no history)
|
|
→ inject attached skills as context
|
|
→ run job prompt
|
|
→ deliver response to target platform
|
|
→ update job state and next_run
|
|
```
|
|
|
|
## Recommended Reading Order
|
|
|
|
If you are new to the codebase:
|
|
|
|
1. **This page** — orient yourself
|
|
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
|
|
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
|
|
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
|
|
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
|
|
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
|
|
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
|
|
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
|
|
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
|
|
10. **[ACP Internals](./acp-internals.md)** — IDE integration
|
|
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
|
|
|
|
## Major Subsystems
|
|
|
|
### Agent Loop
|
|
|
|
The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.
|
|
|
|
→ [Agent Loop Internals](./agent-loop.md)
|
|
|
|
### Prompt System
|
|
|
|
Prompt construction and maintenance across the conversation lifecycle:
|
|
|
|
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
|
|
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
|
|
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
|
|
|
|
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
|
|
|
### Provider Resolution
|
|
|
|
A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.
|
|
|
|
→ [Provider Runtime Resolution](./provider-runtime.md)
|
|
|
|
### Tool System
|
|
|
|
Central tool registry (`tools/registry.py`) with 47 registered tools across 19 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).
|
|
|
|
→ [Tools Runtime](./tools-runtime.md)
|
|
|
|
### Session Persistence
|
|
|
|
SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.
|
|
|
|
→ [Session Storage](./session-storage.md)
|
|
|
|
### Messaging Gateway
|
|
|
|
Long-running process with 18 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
|
|
|
→ [Gateway Internals](./gateway-internals.md)
|
|
|
|
### Plugin System
|
|
|
|
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Two specialized plugin types exist: memory providers (`plugins/memory/`) and context engines (`plugins/context_engine/`). Both are single-select — only one of each can be active at a time, configured via `hermes plugins` or `config.yaml`.
|
|
|
|
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
|
|
|
|
### Cron
|
|
|
|
First-class agent tasks (not shell tasks). Jobs store in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.
|
|
|
|
→ [Cron Internals](./cron-internals.md)
|
|
|
|
### ACP Integration
|
|
|
|
Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.
|
|
|
|
→ [ACP Internals](./acp-internals.md)
|
|
|
|
### RL / Environments / Trajectories
|
|
|
|
Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.
|
|
|
|
→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)
|
|
|
|
## Design Principles
|
|
|
|
| Principle | What it means in practice |
|
|
|-----------|--------------------------|
|
|
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
|
|
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
|
|
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
|
|
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
|
|
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
|
|
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
|
|
|
|
## File Dependency Chain
|
|
|
|
```text
|
|
tools/registry.py (no deps — imported by all tool files)
|
|
↑
|
|
tools/*.py (each calls registry.register() at import time)
|
|
↑
|
|
model_tools.py (imports tools/registry + triggers tool discovery)
|
|
↑
|
|
run_agent.py, cli.py, batch_runner.py, environments/
|
|
```
|
|
|
|
This chain means tool registration happens at import time, before any agent instance is created. Any `tools/*.py` file with a top-level `registry.register()` call is auto-discovered — no manual import list needed.
|