mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393)
Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
This commit is contained in:
parent
fec58ad99e
commit
43d468cea8
20 changed files with 1243 additions and 406 deletions
|
|
@ -1,152 +1,274 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Architecture"
|
||||
description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
|
||||
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
|
||||
---
|
||||
|
||||
# Architecture
|
||||
|
||||
This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
|
||||
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.
|
||||
|
||||
## High-level structure
|
||||
## System Overview
|
||||
|
||||
```text
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ Entry Points │
|
||||
│ │
|
||||
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
|
||||
│ Batch Runner API Server Python Library │
|
||||
└──────────┬──────────────┬───────────────────────┬────────────────────┘
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ AIAgent (run_agent.py) │
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ Prompt │ │ Provider │ │ Tool │ │
|
||||
│ │ Builder │ │ Resolution │ │ Dispatch │ │
|
||||
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
|
||||
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │ │
|
||||
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
|
||||
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
|
||||
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
|
||||
│ │ │ │ codex_resp. │ │ 47 tools │ │
|
||||
│ │ │ │ anthropic │ │ 37 toolsets │ │
|
||||
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────┘
|
||||
│ │
|
||||
▼ ▼
|
||||
┌───────────────────┐ ┌──────────────────────┐
|
||||
│ Session Storage │ │ Tool Backends │
|
||||
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
|
||||
│ hermes_state.py │ │ Browser (5 backends) │
|
||||
│ gateway/session.py│ │ Web (4 backends) │
|
||||
└───────────────────┘ │ MCP (dynamic) │
|
||||
│ File, Vision, etc. │
|
||||
└──────────────────────┘
|
||||
```
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```text
|
||||
hermes-agent/
|
||||
├── run_agent.py # AIAgent core loop
|
||||
├── cli.py # interactive terminal UI
|
||||
├── model_tools.py # tool discovery/orchestration
|
||||
├── toolsets.py # tool groupings and presets
|
||||
├── hermes_state.py # SQLite session/state database
|
||||
├── batch_runner.py # batch trajectory generation
|
||||
├── run_agent.py # AIAgent — core conversation loop (~9,200 lines)
|
||||
├── cli.py # HermesCLI — interactive terminal UI (~8,500 lines)
|
||||
├── model_tools.py # Tool discovery, schema collection, dispatch
|
||||
├── toolsets.py # Tool groupings and platform presets
|
||||
├── hermes_state.py # SQLite session/state database with FTS5
|
||||
├── hermes_constants.py # HERMES_HOME, profile-aware paths
|
||||
├── batch_runner.py # Batch trajectory generation
|
||||
│
|
||||
├── agent/ # prompt building, compression, caching, metadata, trajectories
|
||||
├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor
|
||||
├── tools/ # tool implementations and terminal environments
|
||||
├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks
|
||||
├── cron/ # scheduled job storage and scheduler
|
||||
├── plugins/memory/ # Memory provider plugins (honcho, openviking, mem0, etc.)
|
||||
├── acp_adapter/ # ACP editor integration server
|
||||
├── acp_registry/ # ACP registry manifest + icon
|
||||
├── environments/ # Hermes RL / benchmark environment framework
|
||||
├── skills/ # bundled skills
|
||||
├── optional-skills/ # official optional skills
|
||||
└── tests/ # test suite
|
||||
├── agent/ # Agent internals
|
||||
│ ├── prompt_builder.py # System prompt assembly
|
||||
│ ├── context_compressor.py # Conversation compression algorithm
|
||||
│ ├── prompt_caching.py # Anthropic prompt caching
|
||||
│ ├── auxiliary_client.py # Auxiliary LLM for side tasks (vision, summarization)
|
||||
│ ├── model_metadata.py # Model context lengths, token estimation
|
||||
│ ├── models_dev.py # models.dev registry integration
|
||||
│ ├── anthropic_adapter.py # Anthropic Messages API format conversion
|
||||
│ ├── display.py # KawaiiSpinner, tool preview formatting
|
||||
│ ├── skill_commands.py # Skill slash commands
|
||||
│ ├── memory_store.py # Persistent memory read/write
|
||||
│ └── trajectory.py # Trajectory saving helpers
|
||||
│
|
||||
├── hermes_cli/ # CLI subcommands and setup
|
||||
│ ├── main.py # Entry point — all `hermes` subcommands (~4,200 lines)
|
||||
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
||||
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
|
||||
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
|
||||
│ ├── runtime_provider.py # Provider → api_mode + credentials
|
||||
│ ├── models.py # Model catalog, provider model lists
|
||||
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
|
||||
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
|
||||
│ ├── skin_engine.py # CLI theming engine
|
||||
│ ├── skills_config.py # hermes skills — enable/disable per platform
|
||||
│ ├── skills_hub.py # /skills slash command
|
||||
│ ├── tools_config.py # hermes tools — enable/disable per platform
|
||||
│ ├── plugins.py # PluginManager — discovery, loading, hooks
|
||||
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
|
||||
│ └── gateway.py # hermes gateway start/stop
|
||||
│
|
||||
├── tools/ # Tool implementations (one file per tool)
|
||||
│ ├── registry.py # Central tool registry
|
||||
│ ├── approval.py # Dangerous command detection
|
||||
│ ├── terminal_tool.py # Terminal orchestration
|
||||
│ ├── process_registry.py # Background process management
|
||||
│ ├── file_tools.py # read_file, write_file, patch, search_files
|
||||
│ ├── web_tools.py # web_search, web_extract
|
||||
│ ├── browser_tool.py # 11 browser automation tools
|
||||
│ ├── code_execution_tool.py # execute_code sandbox
|
||||
│ ├── delegate_tool.py # Subagent delegation
|
||||
│ ├── mcp_tool.py # MCP client (~1,050 lines)
|
||||
│ ├── credential_files.py # File-based credential passthrough
|
||||
│ ├── env_passthrough.py # Env var passthrough for sandboxes
|
||||
│ ├── ansi_strip.py # ANSI escape stripping
|
||||
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
|
||||
│
|
||||
├── gateway/ # Messaging platform gateway
|
||||
│ ├── run.py # GatewayRunner — message dispatch (~5,800 lines)
|
||||
│ ├── session.py # SessionStore — conversation persistence
|
||||
│ ├── delivery.py # Outbound message delivery
|
||||
│ ├── pairing.py # DM pairing authorization
|
||||
│ ├── hooks.py # Hook discovery and lifecycle events
|
||||
│ ├── mirror.py # Cross-session message mirroring
|
||||
│ ├── status.py # Token locks, profile-scoped process tracking
|
||||
│ ├── builtin_hooks/ # Always-registered hooks
|
||||
│ └── platforms/ # 14 adapters: telegram, discord, slack, whatsapp,
|
||||
│ # signal, matrix, mattermost, email, sms,
|
||||
│ # dingtalk, feishu, wecom, homeassistant, webhook
|
||||
│
|
||||
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
|
||||
├── cron/ # Scheduler (jobs.py, scheduler.py)
|
||||
├── plugins/memory/ # Memory provider plugins
|
||||
├── environments/ # RL training environments (Atropos)
|
||||
├── skills/ # Bundled skills (always available)
|
||||
├── optional-skills/ # Official optional skills (install explicitly)
|
||||
├── website/ # Docusaurus documentation site
|
||||
└── tests/ # Pytest suite (~3,000+ tests)
|
||||
```
|
||||
|
||||
## Recommended reading order
|
||||
## Data Flow
|
||||
|
||||
If you are new to the codebase, read in this order:
|
||||
### CLI Session
|
||||
|
||||
1. this page
|
||||
2. [Agent Loop Internals](./agent-loop.md)
|
||||
3. [Prompt Assembly](./prompt-assembly.md)
|
||||
4. [Provider Runtime Resolution](./provider-runtime.md)
|
||||
5. [Adding Providers](./adding-providers.md)
|
||||
6. [Tools Runtime](./tools-runtime.md)
|
||||
7. [Session Storage](./session-storage.md)
|
||||
8. [Gateway Internals](./gateway-internals.md)
|
||||
9. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
10. [ACP Internals](./acp-internals.md)
|
||||
11. [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
```text
|
||||
User input → HermesCLI.process_input()
|
||||
→ AIAgent.run_conversation()
|
||||
→ prompt_builder.build_system_prompt()
|
||||
→ runtime_provider.resolve_runtime_provider()
|
||||
→ API call (chat_completions / codex_responses / anthropic_messages)
|
||||
→ tool_calls? → model_tools.handle_function_call() → loop
|
||||
→ final response → display → save to SessionDB
|
||||
```
|
||||
|
||||
## Major subsystems
|
||||
### Gateway Message
|
||||
|
||||
### Agent loop
|
||||
```text
|
||||
Platform event → Adapter.on_message() → MessageEvent
|
||||
→ GatewayRunner._handle_message()
|
||||
→ authorize user
|
||||
→ resolve session key
|
||||
→ create AIAgent with session history
|
||||
→ AIAgent.run_conversation()
|
||||
→ deliver response back through adapter
|
||||
```
|
||||
|
||||
The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
|
||||
### Cron Job
|
||||
|
||||
It is responsible for:
|
||||
```text
|
||||
Scheduler tick → load due jobs from jobs.json
|
||||
→ create fresh AIAgent (no history)
|
||||
→ inject attached skills as context
|
||||
→ run job prompt
|
||||
→ deliver response to target platform
|
||||
→ update job state and next_run
|
||||
```
|
||||
|
||||
- provider/API-mode selection
|
||||
- prompt construction
|
||||
- tool execution
|
||||
- retries and fallback
|
||||
- callbacks
|
||||
- compression and persistence
|
||||
## Recommended Reading Order
|
||||
|
||||
See [Agent Loop Internals](./agent-loop.md).
|
||||
If you are new to the codebase:
|
||||
|
||||
### Prompt system
|
||||
1. **This page** — orient yourself
|
||||
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
|
||||
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
|
||||
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
|
||||
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
|
||||
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
|
||||
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
|
||||
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
|
||||
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
|
||||
10. **[ACP Internals](./acp-internals.md)** — IDE integration
|
||||
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
|
||||
|
||||
Prompt-building logic is split between:
|
||||
## Major Subsystems
|
||||
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `agent/prompt_caching.py`
|
||||
- `agent/context_compressor.py`
|
||||
### Agent Loop
|
||||
|
||||
See:
|
||||
The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.
|
||||
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
→ [Agent Loop Internals](./agent-loop.md)
|
||||
|
||||
### Provider/runtime resolution
|
||||
### Prompt System
|
||||
|
||||
Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
|
||||
Prompt construction and maintenance across the conversation lifecycle:
|
||||
|
||||
See [Provider Runtime Resolution](./provider-runtime.md).
|
||||
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
|
||||
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
|
||||
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
|
||||
|
||||
### Tooling runtime
|
||||
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
|
||||
The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
|
||||
### Provider Resolution
|
||||
|
||||
See [Tools Runtime](./tools-runtime.md).
|
||||
A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.
|
||||
|
||||
### Session persistence
|
||||
→ [Provider Runtime Resolution](./provider-runtime.md)
|
||||
|
||||
Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
|
||||
### Tool System
|
||||
|
||||
See [Session Storage](./session-storage.md).
|
||||
Central tool registry (`tools/registry.py`) with 47 registered tools across 20 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).
|
||||
|
||||
### Messaging gateway
|
||||
→ [Tools Runtime](./tools-runtime.md)
|
||||
|
||||
The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
|
||||
### Session Persistence
|
||||
|
||||
See [Gateway Internals](./gateway-internals.md).
|
||||
SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.
|
||||
|
||||
### ACP integration
|
||||
→ [Session Storage](./session-storage.md)
|
||||
|
||||
ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
|
||||
### Messaging Gateway
|
||||
|
||||
See:
|
||||
Long-running process with 14 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
||||
|
||||
- [ACP Editor Integration](../user-guide/features/acp.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
→ [Gateway Internals](./gateway-internals.md)
|
||||
|
||||
### Plugin System
|
||||
|
||||
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`.
|
||||
|
||||
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
|
||||
|
||||
### Cron
|
||||
|
||||
Cron jobs are implemented as first-class agent tasks, not just shell tasks.
|
||||
First-class agent tasks (not shell tasks). Jobs store in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.
|
||||
|
||||
See [Cron Internals](./cron-internals.md).
|
||||
→ [Cron Internals](./cron-internals.md)
|
||||
|
||||
### RL / environments / trajectories
|
||||
### ACP Integration
|
||||
|
||||
Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
|
||||
Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.
|
||||
|
||||
See:
|
||||
→ [ACP Internals](./acp-internals.md)
|
||||
|
||||
- [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
- [Trajectories & Training Format](./trajectory-format.md)
|
||||
### RL / Environments / Trajectories
|
||||
|
||||
## Design themes
|
||||
Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.
|
||||
|
||||
Several cross-cutting design themes appear throughout the codebase:
|
||||
→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)
|
||||
|
||||
- prompt stability matters
|
||||
- tool execution must be observable and interruptible
|
||||
- session persistence must survive long-running use
|
||||
- platform frontends should share one agent core
|
||||
- optional subsystems should remain loosely coupled where possible
|
||||
## Design Principles
|
||||
|
||||
## Implementation notes
|
||||
| Principle | What it means in practice |
|
||||
|-----------|--------------------------|
|
||||
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
|
||||
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
|
||||
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
|
||||
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
|
||||
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
|
||||
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
|
||||
|
||||
The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
|
||||
## File Dependency Chain
|
||||
|
||||
- multiple API modes
|
||||
- auxiliary model routing
|
||||
- ACP editor integration
|
||||
- gateway-specific session and delivery semantics
|
||||
- RL environment infrastructure
|
||||
- prompt-caching and compression logic with lineage-aware persistence
|
||||
```text
|
||||
tools/registry.py (no deps — imported by all tool files)
|
||||
↑
|
||||
tools/*.py (each calls registry.register() at import time)
|
||||
↑
|
||||
model_tools.py (imports tools/registry + triggers tool discovery)
|
||||
↑
|
||||
run_agent.py, cli.py, batch_runner.py, environments/
|
||||
```
|
||||
|
||||
Use this page as the map, then dive into subsystem-specific docs for the real implementation details.
|
||||
This chain means tool registration happens at import time, before any agent instance is created. Adding a new tool requires an import in `model_tools.py`'s `_discover_tools()` list.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue