docs: add ACP and internal systems implementation guides

- add ACP user and developer docs covering setup, lifecycle, callbacks,
  permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
  prompt assembly, context caching/compression, gateway internals,
  session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
  environments docs to link the new implementation pages and ACP support
This commit is contained in:
teknium1 2026-03-14 00:29:48 -07:00
parent 29176f302e
commit d87a1615ce
17 changed files with 1256 additions and 170 deletions

View file

@ -1,218 +1,151 @@
---
sidebar_position: 1
title: "Architecture"
description: "Hermes Agent internals — project structure, agent loop, key classes, and design patterns"
description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
---
# Architecture
This guide covers the internal architecture of Hermes Agent for developers contributing to the project.
This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
## Project Structure
## High-level structure
```
```text
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings and presets
├── hermes_state.py # SQLite session database with FTS5 full-text search
├── batch_runner.py # Parallel batch processing for trajectory generation
├── run_agent.py # AIAgent core loop
├── cli.py # interactive terminal UI
├── model_tools.py # tool discovery/orchestration
├── toolsets.py # tool groupings and presets
├── hermes_state.py # SQLite session/state database
├── batch_runner.py # batch trajectory generation
├── agent/ # Agent internals (extracted modules)
│ ├── prompt_builder.py # System prompt assembly (identity, skills, memory)
│ ├── context_compressor.py # Auto-summarization when approaching context limits
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
│ ├── display.py # KawaiiSpinner, tool progress formatting
│ ├── model_metadata.py # Model context lengths, token estimation
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI command implementations
│ ├── main.py # Entry point, argument parsing, command dispatch
│ ├── config.py # Config management, migration, env var definitions
│ ├── setup.py # Interactive setup wizard
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
│ ├── models.py # OpenRouter model selection lists
│ ├── banner.py # Welcome banner, ASCII art
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
│ └── skills_hub.py # Skills Hub CLI + /skills slash command handler
├── tools/ # Tool implementations (self-registering)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
│ ├── file_operations.py # File tool implementations (read, write, search, patch)
│ ├── file_tools.py # File tool registration
│ ├── web_tools.py # web_search, web_extract
│ ├── vision_tools.py # Image analysis via multimodal models
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
│ ├── session_search_tool.py # Search past conversations
│ ├── cronjob_tools.py # Scheduled task management
│ ├── skills_tool.py # Skill search and load
│ ├── skill_manager_tool.py # Skill management
│ └── environments/ # Terminal execution backends
│ ├── base.py # BaseEnvironment ABC
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py, daytona.py
├── gateway/ # Messaging gateway
│ ├── run.py # GatewayRunner — platform lifecycle, message routing
│ ├── config.py # Platform configuration resolution
│ ├── session.py # Session store, context prompts, reset policies
│ └── platforms/ # Platform adapters
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
├── scripts/ # Installer and bridge scripts
│ ├── install.sh # Linux/macOS installer
│ ├── install.ps1 # Windows PowerShell installer
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
├── skills/ # Bundled skills (copied to ~/.hermes/skills/)
├── optional-skills/ # Official optional skills (discoverable via hub, not activated by default)
├── environments/ # RL training environments (Atropos integration)
└── tests/ # Test suite
├── agent/ # prompt building, compression, caching, metadata, trajectories
├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor
├── tools/ # tool implementations and terminal environments
├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks
├── cron/ # scheduled job storage and scheduler
├── honcho_integration/ # Honcho memory integration
├── acp_adapter/ # ACP editor integration server
├── acp_registry/ # ACP registry manifest + icon
├── environments/ # Hermes RL / benchmark environment framework
├── skills/ # bundled skills
├── optional-skills/ # official optional skills
└── tests/ # test suite
```
## Core Loop
## Recommended reading order
The main agent loop lives in `run_agent.py`:
If you are new to the codebase, read in this order:
```
User message → AIAgent._run_agent_loop()
├── Build system prompt (prompt_builder.py)
├── Build API kwargs (model, messages, tools, reasoning config)
├── Call LLM (OpenAI-compatible API)
├── If tool_calls in response:
│ ├── Execute each tool via registry dispatch
│ ├── Add tool results to conversation
│ └── Loop back to LLM call
├── If text response:
│ ├── Persist session to DB
│ └── Return final_response
└── Context compression if approaching token limit
```
1. this page
2. [Agent Loop Internals](./agent-loop.md)
3. [Prompt Assembly](./prompt-assembly.md)
4. [Provider Runtime Resolution](./provider-runtime.md)
5. [Tools Runtime](./tools-runtime.md)
6. [Session Storage](./session-storage.md)
7. [Gateway Internals](./gateway-internals.md)
8. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
9. [ACP Internals](./acp-internals.md)
10. [Environments, Benchmarks & Data Generation](./environments.md)
```python
while turns < max_turns:
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tool_schemas,
)
## Major subsystems
if response.tool_calls:
for tool_call in response.tool_calls:
result = execute_tool(tool_call)
messages.append(tool_result_message(result))
turns += 1
else:
return response.content
```
### Agent loop
## AIAgent Class
The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
```python
class AIAgent:
def __init__(
self,
model: str = "anthropic/claude-opus-4.6",
api_key: str = None,
base_url: str = None, # Resolved internally based on provider
max_iterations: int = 60,
enabled_toolsets: list = None,
disabled_toolsets: list = None,
verbose_logging: bool = False,
quiet_mode: bool = False,
tool_progress_callback: callable = None,
):
...
It is responsible for:
def chat(self, message: str) -> str:
# Main entry point - runs the agent loop
...
```
- provider/API-mode selection
- prompt construction
- tool execution
- retries and fallback
- callbacks
- compression and persistence
## File Dependency Chain
See [Agent Loop Internals](./agent-loop.md).
```
tools/registry.py (no deps — imported by all tool files)
tools/*.py (each calls registry.register() at import time)
model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
### Prompt system
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
Prompt-building logic is split between:
## Key Design Patterns
- `run_agent.py`
- `agent/prompt_builder.py`
- `agent/prompt_caching.py`
- `agent/context_compressor.py`
### Self-Registering Tools
See:
Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
### Toolset Grouping
### Provider/runtime resolution
Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
### Session Persistence
See [Provider Runtime Resolution](./provider-runtime.md).
All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
### Tooling runtime
### Ephemeral Injection
The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
See [Tools Runtime](./tools-runtime.md).
### Provider Abstraction
### Session persistence
The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
### Conversation Format
See [Session Storage](./session-storage.md).
Messages follow the OpenAI format:
### Messaging gateway
```python
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
```
The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
## CLI Architecture
See [Gateway Internals](./gateway-internals.md).
The interactive CLI (`cli.py`) uses:
### ACP integration
- **Rich** — Welcome banner and styled panels
- **prompt_toolkit** — Fixed input area with history, `patch_stdout`, slash command autocomplete
- **KawaiiSpinner** — Animated kawaii faces during API calls; clean activity feed for tool results
ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
Key UX behaviors:
See:
- Thinking spinner shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- Tool execution results appear as `┊ {emoji} {verb} {detail} {duration}`
- Prompt shows `⚕ ` when working, `` when idle
- Multi-line paste support with automatic formatting
- [ACP Editor Integration](../user-guide/features/acp.md)
- [ACP Internals](./acp-internals.md)
## Messaging Gateway Architecture
### Cron
The gateway (`gateway/run.py`) uses `GatewayRunner` to:
Cron jobs are implemented as first-class agent tasks, not just shell tasks.
1. Connect to all configured platforms
2. Route messages through per-chat session stores
3. Dispatch to AIAgent instances
4. Run the cron scheduler (ticks every 60s)
5. Handle interrupts and tool progress notifications
See [Cron Internals](./cron-internals.md).
Each platform adapter conforms to `BasePlatformAdapter`.
### RL / environments / trajectories
## Configuration System
Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
- `~/.hermes/config.yaml` — All settings
- `~/.hermes/.env` — API keys and secrets
- `_config_version` in `DEFAULT_CONFIG` — Bumped when required fields are added, triggers migration prompts
See:
- [Environments, Benchmarks & Data Generation](./environments.md)
- [Trajectories & Training Format](./trajectory-format.md)
## Design themes
Several cross-cutting design themes appear throughout the codebase:
- prompt stability matters
- tool execution must be observable and interruptible
- session persistence must survive long-running use
- platform frontends should share one agent core
- optional subsystems should remain loosely coupled where possible
## Implementation notes
The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
- multiple API modes
- auxiliary model routing
- ACP editor integration
- gateway-specific session and delivery semantics
- RL environment infrastructure
- prompt-caching and compression logic with lineage-aware persistence
Use this page as the map, then dive into subsystem-specific docs for the real implementation details.