mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks, permissions, tool rendering, and runtime behavior - add developer guides for agent loop, provider runtime resolution, prompt assembly, context caching/compression, gateway internals, session storage, tools runtime, trajectories, and cron internals - refresh architecture, quickstart, installation, CLI reference, and environments docs to link the new implementation pages and ACP support
This commit is contained in:
parent
29176f302e
commit
d87a1615ce
17 changed files with 1256 additions and 170 deletions
|
|
@ -1,218 +1,151 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Architecture"
|
||||
description: "Hermes Agent internals — project structure, agent loop, key classes, and design patterns"
|
||||
description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
|
||||
---
|
||||
|
||||
# Architecture
|
||||
|
||||
This guide covers the internal architecture of Hermes Agent for developers contributing to the project.
|
||||
This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
|
||||
|
||||
## Project Structure
|
||||
## High-level structure
|
||||
|
||||
```
|
||||
```text
|
||||
hermes-agent/
|
||||
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch
|
||||
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit
|
||||
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
|
||||
├── toolsets.py # Tool groupings and presets
|
||||
├── hermes_state.py # SQLite session database with FTS5 full-text search
|
||||
├── batch_runner.py # Parallel batch processing for trajectory generation
|
||||
├── run_agent.py # AIAgent core loop
|
||||
├── cli.py # interactive terminal UI
|
||||
├── model_tools.py # tool discovery/orchestration
|
||||
├── toolsets.py # tool groupings and presets
|
||||
├── hermes_state.py # SQLite session/state database
|
||||
├── batch_runner.py # batch trajectory generation
|
||||
│
|
||||
├── agent/ # Agent internals (extracted modules)
|
||||
│ ├── prompt_builder.py # System prompt assembly (identity, skills, memory)
|
||||
│ ├── context_compressor.py # Auto-summarization when approaching context limits
|
||||
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
|
||||
│ ├── display.py # KawaiiSpinner, tool progress formatting
|
||||
│ ├── model_metadata.py # Model context lengths, token estimation
|
||||
│ └── trajectory.py # Trajectory saving helpers
|
||||
│
|
||||
├── hermes_cli/ # CLI command implementations
|
||||
│ ├── main.py # Entry point, argument parsing, command dispatch
|
||||
│ ├── config.py # Config management, migration, env var definitions
|
||||
│ ├── setup.py # Interactive setup wizard
|
||||
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
|
||||
│ ├── models.py # OpenRouter model selection lists
|
||||
│ ├── banner.py # Welcome banner, ASCII art
|
||||
│ ├── commands.py # Slash command definitions + autocomplete
|
||||
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
|
||||
│ ├── doctor.py # Diagnostics
|
||||
│ └── skills_hub.py # Skills Hub CLI + /skills slash command handler
|
||||
│
|
||||
├── tools/ # Tool implementations (self-registering)
|
||||
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
|
||||
│ ├── approval.py # Dangerous command detection + per-session approval
|
||||
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
|
||||
│ ├── file_operations.py # File tool implementations (read, write, search, patch)
|
||||
│ ├── file_tools.py # File tool registration
|
||||
│ ├── web_tools.py # web_search, web_extract
|
||||
│ ├── vision_tools.py # Image analysis via multimodal models
|
||||
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
|
||||
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
|
||||
│ ├── session_search_tool.py # Search past conversations
|
||||
│ ├── cronjob_tools.py # Scheduled task management
|
||||
│ ├── skills_tool.py # Skill search and load
|
||||
│ ├── skill_manager_tool.py # Skill management
|
||||
│ └── environments/ # Terminal execution backends
|
||||
│ ├── base.py # BaseEnvironment ABC
|
||||
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py, daytona.py
|
||||
│
|
||||
├── gateway/ # Messaging gateway
|
||||
│ ├── run.py # GatewayRunner — platform lifecycle, message routing
|
||||
│ ├── config.py # Platform configuration resolution
|
||||
│ ├── session.py # Session store, context prompts, reset policies
|
||||
│ └── platforms/ # Platform adapters
|
||||
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
|
||||
│
|
||||
├── scripts/ # Installer and bridge scripts
|
||||
│ ├── install.sh # Linux/macOS installer
|
||||
│ ├── install.ps1 # Windows PowerShell installer
|
||||
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
|
||||
│
|
||||
├── skills/ # Bundled skills (copied to ~/.hermes/skills/)
|
||||
├── optional-skills/ # Official optional skills (discoverable via hub, not activated by default)
|
||||
├── environments/ # RL training environments (Atropos integration)
|
||||
└── tests/ # Test suite
|
||||
├── agent/ # prompt building, compression, caching, metadata, trajectories
|
||||
├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor
|
||||
├── tools/ # tool implementations and terminal environments
|
||||
├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks
|
||||
├── cron/ # scheduled job storage and scheduler
|
||||
├── honcho_integration/ # Honcho memory integration
|
||||
├── acp_adapter/ # ACP editor integration server
|
||||
├── acp_registry/ # ACP registry manifest + icon
|
||||
├── environments/ # Hermes RL / benchmark environment framework
|
||||
├── skills/ # bundled skills
|
||||
├── optional-skills/ # official optional skills
|
||||
└── tests/ # test suite
|
||||
```
|
||||
|
||||
## Core Loop
|
||||
## Recommended reading order
|
||||
|
||||
The main agent loop lives in `run_agent.py`:
|
||||
If you are new to the codebase, read in this order:
|
||||
|
||||
```
|
||||
User message → AIAgent._run_agent_loop()
|
||||
├── Build system prompt (prompt_builder.py)
|
||||
├── Build API kwargs (model, messages, tools, reasoning config)
|
||||
├── Call LLM (OpenAI-compatible API)
|
||||
├── If tool_calls in response:
|
||||
│ ├── Execute each tool via registry dispatch
|
||||
│ ├── Add tool results to conversation
|
||||
│ └── Loop back to LLM call
|
||||
├── If text response:
|
||||
│ ├── Persist session to DB
|
||||
│ └── Return final_response
|
||||
└── Context compression if approaching token limit
|
||||
```
|
||||
1. this page
|
||||
2. [Agent Loop Internals](./agent-loop.md)
|
||||
3. [Prompt Assembly](./prompt-assembly.md)
|
||||
4. [Provider Runtime Resolution](./provider-runtime.md)
|
||||
5. [Tools Runtime](./tools-runtime.md)
|
||||
6. [Session Storage](./session-storage.md)
|
||||
7. [Gateway Internals](./gateway-internals.md)
|
||||
8. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
9. [ACP Internals](./acp-internals.md)
|
||||
10. [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
|
||||
```python
|
||||
while turns < max_turns:
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
tools=tool_schemas,
|
||||
)
|
||||
## Major subsystems
|
||||
|
||||
if response.tool_calls:
|
||||
for tool_call in response.tool_calls:
|
||||
result = execute_tool(tool_call)
|
||||
messages.append(tool_result_message(result))
|
||||
turns += 1
|
||||
else:
|
||||
return response.content
|
||||
```
|
||||
### Agent loop
|
||||
|
||||
## AIAgent Class
|
||||
The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
|
||||
|
||||
```python
|
||||
class AIAgent:
|
||||
def __init__(
|
||||
self,
|
||||
model: str = "anthropic/claude-opus-4.6",
|
||||
api_key: str = None,
|
||||
base_url: str = None, # Resolved internally based on provider
|
||||
max_iterations: int = 60,
|
||||
enabled_toolsets: list = None,
|
||||
disabled_toolsets: list = None,
|
||||
verbose_logging: bool = False,
|
||||
quiet_mode: bool = False,
|
||||
tool_progress_callback: callable = None,
|
||||
):
|
||||
...
|
||||
It is responsible for:
|
||||
|
||||
def chat(self, message: str) -> str:
|
||||
# Main entry point - runs the agent loop
|
||||
...
|
||||
```
|
||||
- provider/API-mode selection
|
||||
- prompt construction
|
||||
- tool execution
|
||||
- retries and fallback
|
||||
- callbacks
|
||||
- compression and persistence
|
||||
|
||||
## File Dependency Chain
|
||||
See [Agent Loop Internals](./agent-loop.md).
|
||||
|
||||
```
|
||||
tools/registry.py (no deps — imported by all tool files)
|
||||
↑
|
||||
tools/*.py (each calls registry.register() at import time)
|
||||
↑
|
||||
model_tools.py (imports tools/registry + triggers tool discovery)
|
||||
↑
|
||||
run_agent.py, cli.py, batch_runner.py, environments/
|
||||
```
|
||||
### Prompt system
|
||||
|
||||
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
|
||||
Prompt-building logic is split between:
|
||||
|
||||
## Key Design Patterns
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `agent/prompt_caching.py`
|
||||
- `agent/context_compressor.py`
|
||||
|
||||
### Self-Registering Tools
|
||||
See:
|
||||
|
||||
Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
|
||||
### Toolset Grouping
|
||||
### Provider/runtime resolution
|
||||
|
||||
Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
|
||||
Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
|
||||
|
||||
### Session Persistence
|
||||
See [Provider Runtime Resolution](./provider-runtime.md).
|
||||
|
||||
All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
|
||||
### Tooling runtime
|
||||
|
||||
### Ephemeral Injection
|
||||
The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
|
||||
|
||||
System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
|
||||
See [Tools Runtime](./tools-runtime.md).
|
||||
|
||||
### Provider Abstraction
|
||||
### Session persistence
|
||||
|
||||
The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
|
||||
Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
|
||||
|
||||
### Conversation Format
|
||||
See [Session Storage](./session-storage.md).
|
||||
|
||||
Messages follow the OpenAI format:
|
||||
### Messaging gateway
|
||||
|
||||
```python
|
||||
messages = [
|
||||
{"role": "system", "content": "You are a helpful assistant..."},
|
||||
{"role": "user", "content": "Search for Python tutorials"},
|
||||
{"role": "assistant", "content": None, "tool_calls": [...]},
|
||||
{"role": "tool", "tool_call_id": "...", "content": "..."},
|
||||
{"role": "assistant", "content": "Here's what I found..."},
|
||||
]
|
||||
```
|
||||
The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
|
||||
|
||||
## CLI Architecture
|
||||
See [Gateway Internals](./gateway-internals.md).
|
||||
|
||||
The interactive CLI (`cli.py`) uses:
|
||||
### ACP integration
|
||||
|
||||
- **Rich** — Welcome banner and styled panels
|
||||
- **prompt_toolkit** — Fixed input area with history, `patch_stdout`, slash command autocomplete
|
||||
- **KawaiiSpinner** — Animated kawaii faces during API calls; clean activity feed for tool results
|
||||
ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
|
||||
|
||||
Key UX behaviors:
|
||||
See:
|
||||
|
||||
- Thinking spinner shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
|
||||
- Tool execution results appear as `┊ {emoji} {verb} {detail} {duration}`
|
||||
- Prompt shows `⚕ ❯` when working, `❯` when idle
|
||||
- Multi-line paste support with automatic formatting
|
||||
- [ACP Editor Integration](../user-guide/features/acp.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
|
||||
## Messaging Gateway Architecture
|
||||
### Cron
|
||||
|
||||
The gateway (`gateway/run.py`) uses `GatewayRunner` to:
|
||||
Cron jobs are implemented as first-class agent tasks, not just shell tasks.
|
||||
|
||||
1. Connect to all configured platforms
|
||||
2. Route messages through per-chat session stores
|
||||
3. Dispatch to AIAgent instances
|
||||
4. Run the cron scheduler (ticks every 60s)
|
||||
5. Handle interrupts and tool progress notifications
|
||||
See [Cron Internals](./cron-internals.md).
|
||||
|
||||
Each platform adapter conforms to `BasePlatformAdapter`.
|
||||
### RL / environments / trajectories
|
||||
|
||||
## Configuration System
|
||||
Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
|
||||
|
||||
- `~/.hermes/config.yaml` — All settings
|
||||
- `~/.hermes/.env` — API keys and secrets
|
||||
- `_config_version` in `DEFAULT_CONFIG` — Bumped when required fields are added, triggers migration prompts
|
||||
See:
|
||||
|
||||
- [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
- [Trajectories & Training Format](./trajectory-format.md)
|
||||
|
||||
## Design themes
|
||||
|
||||
Several cross-cutting design themes appear throughout the codebase:
|
||||
|
||||
- prompt stability matters
|
||||
- tool execution must be observable and interruptible
|
||||
- session persistence must survive long-running use
|
||||
- platform frontends should share one agent core
|
||||
- optional subsystems should remain loosely coupled where possible
|
||||
|
||||
## Implementation notes
|
||||
|
||||
The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
|
||||
|
||||
- multiple API modes
|
||||
- auxiliary model routing
|
||||
- ACP editor integration
|
||||
- gateway-specific session and delivery semantics
|
||||
- RL environment infrastructure
|
||||
- prompt-caching and compression logic with lineage-aware persistence
|
||||
|
||||
Use this page as the map, then dive into subsystem-specific docs for the real implementation details.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue