docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393)

Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-25 00:51:20 +00:00 · 2026-04-05 19:45:50 -07:00 · 2026-04-05 19:45:50 -07:00 · 43d468cea8
commit 43d468cea8
parent fec58ad99e
20 changed files with 1243 additions and 406 deletions
--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@ -1,152 +1,274 @@
 ---
 sidebar_position: 1
 title: "Architecture"
-description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
+description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
 ---

 # Architecture

-This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
+This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.

-## High-level structure
+## System Overview
+
+```text
+┌─────────────────────────────────────────────────────────────────────┐
+│                        Entry Points                                  │
+│                                                                      │
+│  CLI (cli.py)    Gateway (gateway/run.py)    ACP (acp_adapter/)     │
+│  Batch Runner    API Server                  Python Library          │
+└──────────┬──────────────┬───────────────────────┬────────────────────┘
+           │              │                       │
+           ▼              ▼                       ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│                     AIAgent (run_agent.py)                           │
+│                                                                      │
+│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐                │
+│  │ Prompt        │ │ Provider     │ │ Tool         │                │
+│  │ Builder       │ │ Resolution   │ │ Dispatch     │                │
+│  │ (prompt_      │ │ (runtime_    │ │ (model_      │                │
+│  │  builder.py)  │ │  provider.py)│ │  tools.py)   │                │
+│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘                │
+│         │                │                │                          │
+│  ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐                │
+│  │ Compression  │ │ 3 API Modes  │ │ Tool Registry│                │
+│  │ & Caching    │ │ chat_compl.  │ │ (registry.py)│                │
+│  │              │ │ codex_resp.  │ │ 47 tools     │                │
+│  │              │ │ anthropic    │ │ 37 toolsets   │                │
+│  └──────────────┘ └──────────────┘ └──────────────┘                │
+└─────────────────────────────────────────────────────────────────────┘
+           │                                    │
+           ▼                                    ▼
+┌───────────────────┐              ┌──────────────────────┐
+│ Session Storage   │              │ Tool Backends         │
+│ (SQLite + FTS5)   │              │ Terminal (6 backends) │
+│ hermes_state.py   │              │ Browser (5 backends)  │
+│ gateway/session.py│              │ Web (4 backends)      │
+└───────────────────┘              │ MCP (dynamic)         │
+                                   │ File, Vision, etc.    │
+                                   └──────────────────────┘
+```
+
+## Directory Structure

 ```text
 hermes-agent/
-├── run_agent.py              # AIAgent core loop
-├── cli.py                    # interactive terminal UI
-├── model_tools.py            # tool discovery/orchestration
-├── toolsets.py               # tool groupings and presets
-├── hermes_state.py           # SQLite session/state database
-├── batch_runner.py           # batch trajectory generation
+├── run_agent.py              # AIAgent — core conversation loop (~9,200 lines)
+├── cli.py                    # HermesCLI — interactive terminal UI (~8,500 lines)
+├── model_tools.py            # Tool discovery, schema collection, dispatch
+├── toolsets.py               # Tool groupings and platform presets
+├── hermes_state.py           # SQLite session/state database with FTS5
+├── hermes_constants.py       # HERMES_HOME, profile-aware paths
+├── batch_runner.py           # Batch trajectory generation
 │
-├── agent/                    # prompt building, compression, caching, metadata, trajectories
-├── hermes_cli/               # command entrypoints, auth, setup, models, config, doctor
-├── tools/                    # tool implementations and terminal environments
-├── gateway/                  # messaging gateway, session routing, delivery, pairing, hooks
-├── cron/                     # scheduled job storage and scheduler
-├── plugins/memory/           # Memory provider plugins (honcho, openviking, mem0, etc.)
-├── acp_adapter/              # ACP editor integration server
-├── acp_registry/             # ACP registry manifest + icon
-├── environments/             # Hermes RL / benchmark environment framework
-├── skills/                   # bundled skills
-├── optional-skills/          # official optional skills
-└── tests/                    # test suite
+├── agent/                    # Agent internals
+│   ├── prompt_builder.py     # System prompt assembly
+│   ├── context_compressor.py # Conversation compression algorithm
+│   ├── prompt_caching.py     # Anthropic prompt caching
+│   ├── auxiliary_client.py   # Auxiliary LLM for side tasks (vision, summarization)
+│   ├── model_metadata.py     # Model context lengths, token estimation
+│   ├── models_dev.py         # models.dev registry integration
+│   ├── anthropic_adapter.py  # Anthropic Messages API format conversion
+│   ├── display.py            # KawaiiSpinner, tool preview formatting
+│   ├── skill_commands.py     # Skill slash commands
+│   ├── memory_store.py       # Persistent memory read/write
+│   └── trajectory.py         # Trajectory saving helpers
+│
+├── hermes_cli/               # CLI subcommands and setup
+│   ├── main.py               # Entry point — all `hermes` subcommands (~4,200 lines)
+│   ├── config.py             # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
+│   ├── commands.py           # COMMAND_REGISTRY — central slash command definitions
+│   ├── auth.py               # PROVIDER_REGISTRY, credential resolution
+│   ├── runtime_provider.py   # Provider → api_mode + credentials
+│   ├── models.py             # Model catalog, provider model lists
+│   ├── model_switch.py       # /model command logic (CLI + gateway shared)
+│   ├── setup.py              # Interactive setup wizard (~3,500 lines)
+│   ├── skin_engine.py        # CLI theming engine
+│   ├── skills_config.py      # hermes skills — enable/disable per platform
+│   ├── skills_hub.py         # /skills slash command
+│   ├── tools_config.py       # hermes tools — enable/disable per platform
+│   ├── plugins.py            # PluginManager — discovery, loading, hooks
+│   ├── callbacks.py          # Terminal callbacks (clarify, sudo, approval)
+│   └── gateway.py            # hermes gateway start/stop
+│
+├── tools/                    # Tool implementations (one file per tool)
+│   ├── registry.py           # Central tool registry
+│   ├── approval.py           # Dangerous command detection
+│   ├── terminal_tool.py      # Terminal orchestration
+│   ├── process_registry.py   # Background process management
+│   ├── file_tools.py         # read_file, write_file, patch, search_files
+│   ├── web_tools.py          # web_search, web_extract
+│   ├── browser_tool.py       # 11 browser automation tools
+│   ├── code_execution_tool.py # execute_code sandbox
+│   ├── delegate_tool.py      # Subagent delegation
+│   ├── mcp_tool.py           # MCP client (~1,050 lines)
+│   ├── credential_files.py   # File-based credential passthrough
+│   ├── env_passthrough.py    # Env var passthrough for sandboxes
+│   ├── ansi_strip.py         # ANSI escape stripping
+│   └── environments/         # Terminal backends (local, docker, ssh, modal, daytona, singularity)
+│
+├── gateway/                  # Messaging platform gateway
+│   ├── run.py                # GatewayRunner — message dispatch (~5,800 lines)
+│   ├── session.py            # SessionStore — conversation persistence
+│   ├── delivery.py           # Outbound message delivery
+│   ├── pairing.py            # DM pairing authorization
+│   ├── hooks.py              # Hook discovery and lifecycle events
+│   ├── mirror.py             # Cross-session message mirroring
+│   ├── status.py             # Token locks, profile-scoped process tracking
+│   ├── builtin_hooks/        # Always-registered hooks
+│   └── platforms/            # 14 adapters: telegram, discord, slack, whatsapp,
+│                             #   signal, matrix, mattermost, email, sms,
+│                             #   dingtalk, feishu, wecom, homeassistant, webhook
+│
+├── acp_adapter/              # ACP server (VS Code / Zed / JetBrains)
+├── cron/                     # Scheduler (jobs.py, scheduler.py)
+├── plugins/memory/           # Memory provider plugins
+├── environments/             # RL training environments (Atropos)
+├── skills/                   # Bundled skills (always available)
+├── optional-skills/          # Official optional skills (install explicitly)
+├── website/                  # Docusaurus documentation site
+└── tests/                    # Pytest suite (~3,000+ tests)
 ```

-## Recommended reading order
+## Data Flow

-If you are new to the codebase, read in this order:
+### CLI Session

-1. this page
-2. [Agent Loop Internals](./agent-loop.md)
-3. [Prompt Assembly](./prompt-assembly.md)
-4. [Provider Runtime Resolution](./provider-runtime.md)
-5. [Adding Providers](./adding-providers.md)
-6. [Tools Runtime](./tools-runtime.md)
-7. [Session Storage](./session-storage.md)
-8. [Gateway Internals](./gateway-internals.md)
-9. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
-10. [ACP Internals](./acp-internals.md)
-11. [Environments, Benchmarks & Data Generation](./environments.md)
+```text
+User input → HermesCLI.process_input()
+  → AIAgent.run_conversation()
+    → prompt_builder.build_system_prompt()
+    → runtime_provider.resolve_runtime_provider()
+    → API call (chat_completions / codex_responses / anthropic_messages)
+    → tool_calls? → model_tools.handle_function_call() → loop
+    → final response → display → save to SessionDB
+```

-## Major subsystems
+### Gateway Message

-### Agent loop
+```text
+Platform event → Adapter.on_message() → MessageEvent
+  → GatewayRunner._handle_message()
+    → authorize user
+    → resolve session key
+    → create AIAgent with session history
+    → AIAgent.run_conversation()
+    → deliver response back through adapter
+```

-The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
+### Cron Job

-It is responsible for:
+```text
+Scheduler tick → load due jobs from jobs.json
+  → create fresh AIAgent (no history)
+  → inject attached skills as context
+  → run job prompt
+  → deliver response to target platform
+  → update job state and next_run
+```

- provider/API-mode selection
- prompt construction
- tool execution
- retries and fallback
- callbacks
- compression and persistence
+## Recommended Reading Order

-See [Agent Loop Internals](./agent-loop.md).
+If you are new to the codebase:

-### Prompt system
+1. **This page** — orient yourself
+2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
+3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
+4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
+5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
+6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
+7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
+8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
+9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
+10. **[ACP Internals](./acp-internals.md)** — IDE integration
+11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training

-Prompt-building logic is split between:
+## Major Subsystems

- `run_agent.py`
- `agent/prompt_builder.py`
- `agent/prompt_caching.py`
- `agent/context_compressor.py`
+### Agent Loop

-See:
+The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.

- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
+→ [Agent Loop Internals](./agent-loop.md)

-### Provider/runtime resolution
+### Prompt System

-Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
+Prompt construction and maintenance across the conversation lifecycle:

-See [Provider Runtime Resolution](./provider-runtime.md).
+- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
+- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
+- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds

-### Tooling runtime
+→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)

-The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
+### Provider Resolution

-See [Tools Runtime](./tools-runtime.md).
+A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.

-### Session persistence
+→ [Provider Runtime Resolution](./provider-runtime.md)

-Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
+### Tool System

-See [Session Storage](./session-storage.md).
+Central tool registry (`tools/registry.py`) with 47 registered tools across 20 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).

-### Messaging gateway
+→ [Tools Runtime](./tools-runtime.md)

-The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
+### Session Persistence

-See [Gateway Internals](./gateway-internals.md).
+SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.

-### ACP integration
+→ [Session Storage](./session-storage.md)

-ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
+### Messaging Gateway

-See:
+Long-running process with 14 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.

- [ACP Editor Integration](../user-guide/features/acp.md)
- [ACP Internals](./acp-internals.md)
+→ [Gateway Internals](./gateway-internals.md)
+
+### Plugin System
+
+Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`.
+
+→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)

 ### Cron

-Cron jobs are implemented as first-class agent tasks, not just shell tasks.
+First-class agent tasks (not shell tasks). Jobs store in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.

-See [Cron Internals](./cron-internals.md).
+→ [Cron Internals](./cron-internals.md)

-### RL / environments / trajectories
+### ACP Integration

-Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
+Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.

-See:
+→ [ACP Internals](./acp-internals.md)

- [Environments, Benchmarks & Data Generation](./environments.md)
- [Trajectories & Training Format](./trajectory-format.md)
+### RL / Environments / Trajectories

-## Design themes
+Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.

-Several cross-cutting design themes appear throughout the codebase:
+→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)

- prompt stability matters
- tool execution must be observable and interruptible
- session persistence must survive long-running use
- platform frontends should share one agent core
- optional subsystems should remain loosely coupled where possible
+## Design Principles

-## Implementation notes
+| Principle | What it means in practice |
+|-----------|--------------------------|
+| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
+| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
+| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
+| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
+| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
+| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |

-The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
+## File Dependency Chain

- multiple API modes
- auxiliary model routing
- ACP editor integration
- gateway-specific session and delivery semantics
- RL environment infrastructure
- prompt-caching and compression logic with lineage-aware persistence
+```text
+tools/registry.py  (no deps — imported by all tool files)
+       ↑
+tools/*.py  (each calls registry.register() at import time)
+       ↑
+model_tools.py  (imports tools/registry + triggers tool discovery)
+       ↑
+run_agent.py, cli.py, batch_runner.py, environments/
+```

-Use this page as the map, then dive into subsystem-specific docs for the real implementation details.
+This chain means tool registration happens at import time, before any agent instance is created. Adding a new tool requires an import in `model_tools.py`'s `_discover_tools()` list.