diff --git a/website/docs/developer-guide/acp-internals.md b/website/docs/developer-guide/acp-internals.md new file mode 100644 index 000000000..0db8d94cd --- /dev/null +++ b/website/docs/developer-guide/acp-internals.md @@ -0,0 +1,182 @@ +--- +sidebar_position: 2 +title: "ACP Internals" +description: "How the ACP adapter works: lifecycle, sessions, event bridge, approvals, and tool rendering" +--- + +# ACP Internals + +The ACP adapter wraps Hermes' synchronous `AIAgent` in an async JSON-RPC stdio server. + +Key implementation files: + +- `acp_adapter/entry.py` +- `acp_adapter/server.py` +- `acp_adapter/session.py` +- `acp_adapter/events.py` +- `acp_adapter/permissions.py` +- `acp_adapter/tools.py` +- `acp_adapter/auth.py` +- `acp_registry/agent.json` + +## Boot flow + +```text +hermes acp / hermes-acp / python -m acp_adapter + -> acp_adapter.entry.main() + -> load ~/.hermes/.env + -> configure stderr logging + -> construct HermesACPAgent + -> acp.run_agent(agent) +``` + +Stdout is reserved for ACP JSON-RPC transport. Human-readable logs go to stderr. + +## Major components + +### `HermesACPAgent` + +`acp_adapter/server.py` implements the ACP agent protocol. + +Responsibilities: + +- initialize / authenticate +- new/load/resume/fork/list/cancel session methods +- prompt execution +- session model switching +- wiring sync AIAgent callbacks into ACP async notifications + +### `SessionManager` + +`acp_adapter/session.py` tracks live ACP sessions. + +Each session stores: + +- `session_id` +- `agent` +- `cwd` +- `model` +- `history` +- `cancel_event` + +The manager is thread-safe and supports: + +- create +- get +- remove +- fork +- list +- cleanup +- cwd updates + +### Event bridge + +`acp_adapter/events.py` converts AIAgent callbacks into ACP `session_update` events. 
+ +Bridged callbacks: + +- `tool_progress_callback` +- `thinking_callback` +- `step_callback` +- `message_callback` + +Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses: + +```python +asyncio.run_coroutine_threadsafe(...) +``` + +### Permission bridge + +`acp_adapter/permissions.py` adapts dangerous terminal approval prompts into ACP permission requests. + +Mapping: + +- `allow_once` -> Hermes `once` +- `allow_always` -> Hermes `always` +- reject options -> Hermes `deny` + +Timeouts and bridge failures deny by default. + +### Tool rendering helpers + +`acp_adapter/tools.py` maps Hermes tools to ACP tool kinds and builds editor-facing content. + +Examples: + +- `patch` / `write_file` -> file diffs +- `terminal` -> shell command text +- `read_file` / `search_files` -> text previews +- large results -> truncated text blocks for UI safety + +## Session lifecycle + +```text +new_session(cwd) + -> create SessionState + -> create AIAgent(platform="acp", enabled_toolsets=["hermes-acp"]) + -> bind task_id/session_id to cwd override + +prompt(..., session_id) + -> extract text from ACP content blocks + -> reset cancel event + -> install callbacks + approval bridge + -> run AIAgent in ThreadPoolExecutor + -> update session history + -> emit final agent message chunk +``` + +### Cancelation + +`cancel(session_id)`: + +- sets the session cancel event +- calls `agent.interrupt()` when available +- causes the prompt response to return `stop_reason="cancelled"` + +### Forking + +`fork_session()` deep-copies message history into a new live session, preserving conversation state while giving the fork its own session ID and cwd. + +## Provider/auth behavior + +ACP does not implement its own auth store. + +Instead it reuses Hermes' runtime resolver: + +- `acp_adapter/auth.py` +- `hermes_cli/runtime_provider.py` + +So ACP advertises and uses the currently configured Hermes provider/credentials. 
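The event bridge and permission bridge described earlier on this page both have to cross from the `AIAgent` worker thread onto the ACP event loop. The following is a minimal sketch of that pattern, not the adapter's actual code: `map_outcome`, `request_approval_sync`, `start_background_loop`, and the `ask_client` coroutine are hypothetical names, and the timeout value is an assumption. It shows `asyncio.run_coroutine_threadsafe(...)` used from a synchronous caller, with the `allow_once` / `allow_always` mapping and deny-by-default handling for timeouts and failures.

```python
import asyncio
import threading

# Hypothetical table: ACP permission option kinds -> Hermes approval results.
_ACP_TO_HERMES = {"allow_once": "once", "allow_always": "always"}


def map_outcome(option_kind):
    """Anything that is not an explicit allow option maps to 'deny'."""
    return _ACP_TO_HERMES.get(option_kind, "deny")


def request_approval_sync(loop, ask_client, command, timeout=5.0):
    """Blocking call for a non-loop thread.

    Schedules the async permission request on the given event loop and
    waits for the answer. Timeouts and any bridge failure deny by default.
    """
    future = asyncio.run_coroutine_threadsafe(ask_client(command), loop)
    try:
        return map_outcome(future.result(timeout=timeout))
    except Exception:
        # Covers the timeout as well as errors raised inside the coroutine.
        future.cancel()
        return "deny"


def start_background_loop():
    """Demo helper: run an event loop in a daemon thread (stand-in for the ACP loop)."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop
```

In this sketch the caller blocks until the editor answers or the timeout expires, which matches the synchronous shape of an approval callback. The same scheduling call, without waiting on the result, would suit fire-and-forget progress notifications.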
+ +## Working directory binding + +ACP sessions carry an editor cwd. + +The session manager binds that cwd to the ACP session ID via task-scoped terminal/file overrides, so file and terminal tools operate relative to the editor workspace. + +## Duplicate same-name tool calls + +The event bridge tracks tool IDs FIFO per tool name, not just one ID per name. This is important for: + +- parallel same-name calls +- repeated same-name calls in one step + +Without FIFO queues, completion events would attach to the wrong tool invocation. + +## Approval callback restoration + +ACP temporarily installs an approval callback on the terminal tool during prompt execution, then restores the previous callback afterward. This avoids leaving ACP session-specific approval handlers installed globally forever. + +## Current limitations + +- ACP sessions are process-local from the ACP server's point of view +- non-text prompt blocks are currently ignored for request text extraction +- editor-specific UX varies by ACP client implementation + +## Related files + +- `tests/acp/` — ACP test suite +- `toolsets.py` — `hermes-acp` toolset definition +- `hermes_cli/main.py` — `hermes acp` CLI subcommand +- `pyproject.toml` — `[acp]` optional dependency + `hermes-acp` script diff --git a/website/docs/developer-guide/agent-loop.md b/website/docs/developer-guide/agent-loop.md new file mode 100644 index 000000000..26ec11a6e --- /dev/null +++ b/website/docs/developer-guide/agent-loop.md @@ -0,0 +1,110 @@ +--- +sidebar_position: 3 +title: "Agent Loop Internals" +description: "Detailed walkthrough of AIAgent execution, API modes, tools, callbacks, and fallback behavior" +--- + +# Agent Loop Internals + +The core orchestration engine is `run_agent.py`'s `AIAgent`. 
+ +## Core responsibilities + +`AIAgent` is responsible for: + +- assembling the effective prompt and tool schemas +- selecting the correct provider/API mode +- making interruptible model calls +- executing tool calls (sequentially or concurrently) +- maintaining session history +- handling compression, retries, and fallback models + +## API modes + +Hermes currently supports three API execution modes: + +| API mode | Used for | +|----------|----------| +| `chat_completions` | OpenAI-compatible chat endpoints, including OpenRouter and most custom endpoints | +| `codex_responses` | OpenAI Codex / Responses API path | +| `anthropic_messages` | Native Anthropic Messages API | + +The mode is resolved from explicit args, provider selection, and base URL heuristics. + +## Turn lifecycle + +```text +run_conversation() + -> generate effective task_id + -> append current user message + -> load or build cached system prompt + -> maybe preflight-compress + -> build api_messages + -> inject ephemeral prompt layers + -> apply prompt caching if appropriate + -> make interruptible API call + -> if tool calls: execute them, append tool results, loop + -> if final text: persist, cleanup, return response +``` + +## Interruptible API calls + +Hermes wraps API requests so they can be interrupted from the CLI or gateway. + +This matters because: + +- the agent may be in a long LLM call +- the user may send a new message mid-flight +- background systems may need cancellation semantics + +## Tool execution modes + +Hermes uses two execution strategies: + +- sequential execution for single or interactive tools +- concurrent execution for multiple non-interactive tools + +Concurrent tool execution preserves message/result ordering when reinserting tool responses into conversation history. 
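As an illustration of the ordering guarantee, here is a minimal sketch of concurrent tool execution that still returns results in call order. It is not the code in `run_agent.py`: `run_tools_concurrently`, the `execute` callable, and the `max_workers` default are assumptions made for the example, and tool calls are modeled as plain dicts with an `id` key.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tools_concurrently(tool_calls, execute, max_workers=4):
    """Run tool calls in parallel and return tool messages in call order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(execute, call) for call in tool_calls]
        # Walk the futures in submission order, not completion order, so each
        # result lines up with the tool call that produced it.
        return [
            {"role": "tool", "tool_call_id": call["id"], "content": future.result()}
            for call, future in zip(tool_calls, futures)
        ]
```

Collecting with `as_completed` would be slightly faster to first result but would reorder messages, which is why this sketch iterates the futures list directly.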
+ +## Callback surfaces + +`AIAgent` supports platform/integration callbacks such as: + +- `tool_progress_callback` +- `thinking_callback` +- `reasoning_callback` +- `clarify_callback` +- `step_callback` +- `message_callback` + +These are how the CLI, gateway, and ACP integrations stream intermediate progress and interactive approval/clarification flows. + +## Budget and fallback behavior + +Hermes tracks a shared iteration budget across parent and subagents. It also injects budget pressure hints near the end of the available iteration window. + +Fallback model support allows the agent to switch providers/models when the primary route fails in supported failure paths. + +## Compression and persistence + +Before and during long runs, Hermes may: + +- flush memory before context loss +- compress middle conversation turns +- split the session lineage into a new session ID after compression +- preserve recent context and structural tool-call/result consistency + +## Key files to read next + +- `run_agent.py` +- `agent/prompt_builder.py` +- `agent/context_compressor.py` +- `agent/prompt_caching.py` +- `model_tools.py` + +## Related docs + +- [Provider Runtime Resolution](./provider-runtime.md) +- [Prompt Assembly](./prompt-assembly.md) +- [Context Compression & Prompt Caching](./context-compression-and-caching.md) +- [Tools Runtime](./tools-runtime.md) diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md index ef5bd9d63..2ff148174 100644 --- a/website/docs/developer-guide/architecture.md +++ b/website/docs/developer-guide/architecture.md @@ -1,218 +1,151 @@ --- sidebar_position: 1 title: "Architecture" -description: "Hermes Agent internals — project structure, agent loop, key classes, and design patterns" +description: "Hermes Agent internals — major subsystems, execution paths, and where to read next" --- # Architecture -This guide covers the internal architecture of Hermes Agent for developers contributing to the 
project. +This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem. -## Project Structure +## High-level structure -``` +```text hermes-agent/ -├── run_agent.py # AIAgent class — core conversation loop, tool dispatch -├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit -├── model_tools.py # Tool orchestration (thin layer over tools/registry.py) -├── toolsets.py # Tool groupings and presets -├── hermes_state.py # SQLite session database with FTS5 full-text search -├── batch_runner.py # Parallel batch processing for trajectory generation +├── run_agent.py # AIAgent core loop +├── cli.py # interactive terminal UI +├── model_tools.py # tool discovery/orchestration +├── toolsets.py # tool groupings and presets +├── hermes_state.py # SQLite session/state database +├── batch_runner.py # batch trajectory generation │ -├── agent/ # Agent internals (extracted modules) -│ ├── prompt_builder.py # System prompt assembly (identity, skills, memory) -│ ├── context_compressor.py # Auto-summarization when approaching context limits -│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision) -│ ├── display.py # KawaiiSpinner, tool progress formatting -│ ├── model_metadata.py # Model context lengths, token estimation -│ └── trajectory.py # Trajectory saving helpers -│ -├── hermes_cli/ # CLI command implementations -│ ├── main.py # Entry point, argument parsing, command dispatch -│ ├── config.py # Config management, migration, env var definitions -│ ├── setup.py # Interactive setup wizard -│ ├── auth.py # Provider resolution, OAuth, Nous Portal -│ ├── models.py # OpenRouter model selection lists -│ ├── banner.py # Welcome banner, ASCII art -│ ├── commands.py # Slash command definitions + autocomplete -│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval) -│ ├── doctor.py # Diagnostics -│ └── skills_hub.py # Skills Hub CLI + 
/skills slash command handler -│ -├── tools/ # Tool implementations (self-registering) -│ ├── registry.py # Central tool registry (schemas, handlers, dispatch) -│ ├── approval.py # Dangerous command detection + per-session approval -│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends) -│ ├── file_operations.py # File tool implementations (read, write, search, patch) -│ ├── file_tools.py # File tool registration -│ ├── web_tools.py # web_search, web_extract -│ ├── vision_tools.py # Image analysis via multimodal models -│ ├── delegate_tool.py # Subagent spawning and parallel task execution -│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access -│ ├── session_search_tool.py # Search past conversations -│ ├── cronjob_tools.py # Scheduled task management -│ ├── skills_tool.py # Skill search and load -│ ├── skill_manager_tool.py # Skill management -│ └── environments/ # Terminal execution backends -│ ├── base.py # BaseEnvironment ABC -│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py, daytona.py -│ -├── gateway/ # Messaging gateway -│ ├── run.py # GatewayRunner — platform lifecycle, message routing -│ ├── config.py # Platform configuration resolution -│ ├── session.py # Session store, context prompts, reset policies -│ └── platforms/ # Platform adapters -│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py -│ -├── scripts/ # Installer and bridge scripts -│ ├── install.sh # Linux/macOS installer -│ ├── install.ps1 # Windows PowerShell installer -│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys) -│ -├── skills/ # Bundled skills (copied to ~/.hermes/skills/) -├── optional-skills/ # Official optional skills (discoverable via hub, not activated by default) -├── environments/ # RL training environments (Atropos integration) -└── tests/ # Test suite +├── agent/ # prompt building, compression, caching, metadata, trajectories +├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor +├── tools/ 
# tool implementations and terminal environments +├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks +├── cron/ # scheduled job storage and scheduler +├── honcho_integration/ # Honcho memory integration +├── acp_adapter/ # ACP editor integration server +├── acp_registry/ # ACP registry manifest + icon +├── environments/ # Hermes RL / benchmark environment framework +├── skills/ # bundled skills +├── optional-skills/ # official optional skills +└── tests/ # test suite ``` -## Core Loop +## Recommended reading order -The main agent loop lives in `run_agent.py`: +If you are new to the codebase, read in this order: -``` -User message → AIAgent._run_agent_loop() - ├── Build system prompt (prompt_builder.py) - ├── Build API kwargs (model, messages, tools, reasoning config) - ├── Call LLM (OpenAI-compatible API) - ├── If tool_calls in response: - │ ├── Execute each tool via registry dispatch - │ ├── Add tool results to conversation - │ └── Loop back to LLM call - ├── If text response: - │ ├── Persist session to DB - │ └── Return final_response - └── Context compression if approaching token limit -``` +1. this page +2. [Agent Loop Internals](./agent-loop.md) +3. [Prompt Assembly](./prompt-assembly.md) +4. [Provider Runtime Resolution](./provider-runtime.md) +5. [Tools Runtime](./tools-runtime.md) +6. [Session Storage](./session-storage.md) +7. [Gateway Internals](./gateway-internals.md) +8. [Context Compression & Prompt Caching](./context-compression-and-caching.md) +9. [ACP Internals](./acp-internals.md) +10. 
[Environments, Benchmarks & Data Generation](./environments.md) -```python -while turns < max_turns: - response = client.chat.completions.create( - model=model, - messages=messages, - tools=tool_schemas, - ) +## Major subsystems - if response.tool_calls: - for tool_call in response.tool_calls: - result = execute_tool(tool_call) - messages.append(tool_result_message(result)) - turns += 1 - else: - return response.content -``` +### Agent loop -## AIAgent Class +The core synchronous orchestration engine is `AIAgent` in `run_agent.py`. -```python -class AIAgent: - def __init__( - self, - model: str = "anthropic/claude-opus-4.6", - api_key: str = None, - base_url: str = None, # Resolved internally based on provider - max_iterations: int = 60, - enabled_toolsets: list = None, - disabled_toolsets: list = None, - verbose_logging: bool = False, - quiet_mode: bool = False, - tool_progress_callback: callable = None, - ): - ... +It is responsible for: - def chat(self, message: str) -> str: - # Main entry point - runs the agent loop - ... -``` +- provider/API-mode selection +- prompt construction +- tool execution +- retries and fallback +- callbacks +- compression and persistence -## File Dependency Chain +See [Agent Loop Internals](./agent-loop.md). -``` -tools/registry.py (no deps — imported by all tool files) - ↑ -tools/*.py (each calls registry.register() at import time) - ↑ -model_tools.py (imports tools/registry + triggers tool discovery) - ↑ -run_agent.py, cli.py, batch_runner.py, environments/ -``` +### Prompt system -Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer. +Prompt-building logic is split between: -## Key Design Patterns +- `run_agent.py` +- `agent/prompt_builder.py` +- `agent/prompt_caching.py` +- `agent/context_compressor.py` -### Self-Registering Tools +See: -Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules. 
+- [Prompt Assembly](./prompt-assembly.md) +- [Context Compression & Prompt Caching](./context-compression-and-caching.md) -### Toolset Grouping +### Provider/runtime resolution -Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform. +Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls. -### Session Persistence +See [Provider Runtime Resolution](./provider-runtime.md). -All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`. +### Tooling runtime -### Ephemeral Injection +The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own. -System prompts and prefill messages are injected at API call time, never persisted to the database or logs. +See [Tools Runtime](./tools-runtime.md). -### Provider Abstraction +### Session persistence -The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint). +Historical session state is stored primarily in SQLite, with lineage preserved across compression splits. -### Conversation Format +See [Session Storage](./session-storage.md). -Messages follow the OpenAI format: +### Messaging gateway -```python -messages = [ - {"role": "system", "content": "You are a helpful assistant..."}, - {"role": "user", "content": "Search for Python tutorials"}, - {"role": "assistant", "content": None, "tool_calls": [...]}, - {"role": "tool", "tool_call_id": "...", "content": "..."}, - {"role": "assistant", "content": "Here's what I found..."}, -] -``` +The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking. -## CLI Architecture +See [Gateway Internals](./gateway-internals.md). 
-The interactive CLI (`cli.py`) uses: +### ACP integration -- **Rich** — Welcome banner and styled panels -- **prompt_toolkit** — Fixed input area with history, `patch_stdout`, slash command autocomplete -- **KawaiiSpinner** — Animated kawaii faces during API calls; clean activity feed for tool results +ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC. -Key UX behaviors: +See: -- Thinking spinner shows animated kawaii face + verb (`(⌐■_■) deliberating...`) -- Tool execution results appear as `┊ {emoji} {verb} {detail} {duration}` -- Prompt shows `⚕ ❯` when working, `❯` when idle -- Multi-line paste support with automatic formatting +- [ACP Editor Integration](../user-guide/features/acp.md) +- [ACP Internals](./acp-internals.md) -## Messaging Gateway Architecture +### Cron -The gateway (`gateway/run.py`) uses `GatewayRunner` to: +Cron jobs are implemented as first-class agent tasks, not just shell tasks. -1. Connect to all configured platforms -2. Route messages through per-chat session stores -3. Dispatch to AIAgent instances -4. Run the cron scheduler (ticks every 60s) -5. Handle interrupts and tool progress notifications +See [Cron Internals](./cron-internals.md). -Each platform adapter conforms to `BasePlatformAdapter`. +### RL / environments / trajectories -## Configuration System +Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation. 
-- `~/.hermes/config.yaml` — All settings -- `~/.hermes/.env` — API keys and secrets -- `_config_version` in `DEFAULT_CONFIG` — Bumped when required fields are added, triggers migration prompts +See: + +- [Environments, Benchmarks & Data Generation](./environments.md) +- [Trajectories & Training Format](./trajectory-format.md) + +## Design themes + +Several cross-cutting design themes appear throughout the codebase: + +- prompt stability matters +- tool execution must be observable and interruptible +- session persistence must survive long-running use +- platform frontends should share one agent core +- optional subsystems should remain loosely coupled where possible + +## Implementation notes + +The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes: + +- multiple API modes +- auxiliary model routing +- ACP editor integration +- gateway-specific session and delivery semantics +- RL environment infrastructure +- prompt-caching and compression logic with lineage-aware persistence + +Use this page as the map, then dive into subsystem-specific docs for the real implementation details. 
diff --git a/website/docs/developer-guide/context-compression-and-caching.md b/website/docs/developer-guide/context-compression-and-caching.md new file mode 100644 index 000000000..92bf718cd --- /dev/null +++ b/website/docs/developer-guide/context-compression-and-caching.md @@ -0,0 +1,72 @@ +--- +sidebar_position: 6 +title: "Context Compression & Prompt Caching" +description: "How Hermes compresses long conversations and applies provider-side prompt caching" +--- + +# Context Compression & Prompt Caching + +Hermes manages long conversations with two complementary mechanisms: + +- prompt caching +- context compression + +Primary files: + +- `agent/prompt_caching.py` +- `agent/context_compressor.py` +- `run_agent.py` + +## Prompt caching + +For Anthropic/native and Claude-via-OpenRouter flows, Hermes applies Anthropic-style cache markers. + +Current strategy: + +- cache the system prompt +- cache the last 3 non-system messages +- default TTL is 5 minutes unless explicitly extended + +This is implemented in `agent/prompt_caching.py`. + +## Why prompt stability matters + +Prompt caching only helps when the stable prefix remains stable. That is why Hermes avoids rebuilding or mutating the core system prompt mid-session unless it has to. + +## Compression trigger + +Hermes can compress context when conversations become large. Configuration defaults live in `config.yaml`, and the compressor also has runtime checks based on actual prompt token counts. + +## Compression algorithm + +The compressor protects: + +- the first N turns +- the last N turns + +and summarizes the middle section. + +It also cleans up structural issues such as orphaned tool-call/result pairs so the API never receives invalid conversation structure after compression. + +## Pre-compression memory flush + +Before compression, Hermes can give the model one last chance to persist memory so facts are not lost when middle turns are summarized away. 
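The head/tail protection described under "Compression algorithm" can be sketched as follows. This is an illustrative outline rather than the logic in `agent/context_compressor.py`: the function name, the `keep_first` / `keep_last` defaults, the summary message format, and the `summarize` callable are all assumptions. It deliberately omits the orphaned tool-call/result cleanup and the pre-compression memory flush.

```python
def compress_messages(messages, summarize, keep_first=2, keep_last=4):
    """Keep the first and last turns verbatim and summarize the middle."""
    if len(messages) <= keep_first + keep_last:
        # Nothing in the middle to compress.
        return list(messages)

    split = len(messages) - keep_last
    head = messages[:keep_first]
    middle = messages[keep_first:split]
    tail = messages[split:]

    # Hypothetical summary message; the real role and wording may differ.
    summary = {
        "role": "user",
        "content": "[Summary of earlier turns] " + summarize(middle),
    }
    return head + [summary] + tail
```

In practice `summarize` would call an auxiliary model. Passing it in as a parameter keeps the sketch runnable without one.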
+ +## Session lineage after compression + +Compression can split the session into a new session ID while preserving parent lineage in the state DB. + +This lets Hermes continue operating with a smaller active context while retaining a searchable ancestry chain. + +## Re-injected state after compression + +After compression, Hermes may re-inject compact operational state such as: + +- todo snapshot +- prior-read-files summary + +## Related docs + +- [Prompt Assembly](./prompt-assembly.md) +- [Session Storage](./session-storage.md) +- [Agent Loop Internals](./agent-loop.md) diff --git a/website/docs/developer-guide/cron-internals.md b/website/docs/developer-guide/cron-internals.md new file mode 100644 index 000000000..574cc522a --- /dev/null +++ b/website/docs/developer-guide/cron-internals.md @@ -0,0 +1,56 @@ +--- +sidebar_position: 11 +title: "Cron Internals" +description: "How Hermes stores, schedules, locks, and delivers cron jobs" +--- + +# Cron Internals + +Hermes cron support is implemented primarily in: + +- `cron/jobs.py` +- `cron/scheduler.py` +- `gateway/run.py` + +## Scheduling model + +Hermes supports: + +- one-shot delays +- intervals +- cron expressions +- explicit timestamps + +## Job storage + +Cron jobs are stored in Hermes-managed local state with atomic save/update semantics. + +## Runtime behavior + +The scheduler: + +- loads jobs +- computes due work +- executes jobs in fresh agent sessions +- handles repeat counters +- updates next-run metadata + +In gateway mode, cron ticking is integrated into the long-running gateway loop. + +## Delivery model + +Cron jobs can deliver to: + +- origin chat +- local files +- platform home channels +- explicit platform/chat IDs + +## Locking + +Hermes uses lock-based protections so concurrent cron ticks or overlapping scheduler processes do not corrupt job state. 
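The page does not spell out the locking mechanism, so the following is only a sketch of one common way to combine a cross-process lock with atomic saves. `job_lock`, `save_jobs_atomically`, the lock-file approach, and the JSON format are all assumptions, not the behavior of `cron/jobs.py`.

```python
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def job_lock(lock_path):
    """Hypothetical lock: O_EXCL makes creation fail if the lock file exists."""
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def save_jobs_atomically(path, jobs):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(jobs, handle)
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A second scheduler that hits `FileExistsError` would simply skip the tick. A production version would also need to handle stale lock files left behind by a crashed process, which this sketch does not attempt.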
+ +## Related docs + +- [Cron feature guide](../user-guide/features/cron.md) +- [Gateway Internals](./gateway-internals.md) diff --git a/website/docs/developer-guide/environments.md b/website/docs/developer-guide/environments.md index 27f122832..6579b3787 100644 --- a/website/docs/developer-guide/environments.md +++ b/website/docs/developer-guide/environments.md @@ -14,6 +14,10 @@ Hermes Agent includes a full environment framework that connects its tool-callin All three share the same core: an **environment** class that defines tasks, runs an agent loop, and scores the output. +:::info Repo environments vs RL training tools +The Python environment framework documented here lives under the repo's `environments/` directory and is the implementation-level API for Hermes/Atropos integration. This is separate from the user-facing `rl_*` tools, which operate as an orchestration surface for remote RL training workflows. +::: + :::tip Quick Links - **Want to run benchmarks?** Jump to [Available Benchmarks](#available-benchmarks) - **Want to train with RL?** See [RL Training Tools](/user-guide/features/rl-training) for the agent-driven interface, or [Running Environments](#running-environments) for manual execution diff --git a/website/docs/developer-guide/gateway-internals.md b/website/docs/developer-guide/gateway-internals.md new file mode 100644 index 000000000..6edaf6504 --- /dev/null +++ b/website/docs/developer-guide/gateway-internals.md @@ -0,0 +1,95 @@ +--- +sidebar_position: 7 +title: "Gateway Internals" +description: "How the messaging gateway boots, authorizes users, routes sessions, and delivers messages" +--- + +# Gateway Internals + +The messaging gateway is the long-running process that connects Hermes to external platforms. 
+ +Key files: + +- `gateway/run.py` +- `gateway/config.py` +- `gateway/session.py` +- `gateway/delivery.py` +- `gateway/pairing.py` +- `gateway/channel_directory.py` +- `gateway/hooks.py` +- `gateway/mirror.py` +- `gateway/platforms/*` + +## Core responsibilities + +The gateway process is responsible for: + +- loading configuration from `.env`, `config.yaml`, and `gateway.json` +- starting platform adapters +- authorizing users +- routing incoming events to sessions +- maintaining per-chat session continuity +- dispatching messages to `AIAgent` +- running cron ticks and background maintenance tasks +- mirroring/proactively delivering output to configured channels + +## Config sources + +The gateway has a multi-source config model: + +- environment variables +- `~/.hermes/gateway.json` +- selected bridged values from `~/.hermes/config.yaml` + +## Session routing + +`gateway/session.py` and `GatewayRunner` cooperate to map incoming messages to active session IDs. + +Session keying can depend on: + +- platform +- user/chat identity +- thread/topic identity +- special platform-specific routing behavior + +## Authorization layers + +The gateway can authorize through: + +- platform allowlists +- gateway-wide allowlists +- DM pairing flows +- explicit allow-all settings + +Pairing support is implemented in `gateway/pairing.py`. + +## Delivery path + +Outgoing deliveries are handled by `gateway/delivery.py`, which knows how to: + +- deliver to a home channel +- resolve explicit targets +- mirror some remote deliveries back into local history/session tracking + +## Hooks + +Gateway events emit hook callbacks through `gateway/hooks.py`. Hooks are local trusted Python code and can observe or extend gateway lifecycle events. 
+ +## Background maintenance + +The gateway also runs maintenance tasks such as: + +- cron ticking +- cache refreshes +- session expiry checks +- proactive memory flush before reset/expiry + +## Honcho interaction + +When Honcho is enabled, the gateway can keep persistent Honcho managers aligned with session lifetimes and platform-specific session keys. + +## Related docs + +- [Session Storage](./session-storage.md) +- [Cron Internals](./cron-internals.md) +- [ACP Internals](./acp-internals.md) diff --git a/website/docs/developer-guide/prompt-assembly.md b/website/docs/developer-guide/prompt-assembly.md new file mode 100644 index 000000000..163647167 --- /dev/null +++ b/website/docs/developer-guide/prompt-assembly.md @@ -0,0 +1,85 @@ +--- +sidebar_position: 5 +title: "Prompt Assembly" +description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers" +--- + +# Prompt Assembly + +Hermes deliberately separates: + +- **cached system prompt state** +- **ephemeral API-call-time additions** + +This is one of the most important design choices in the project because it affects: + +- token usage +- prompt caching effectiveness +- session continuity +- memory correctness + +Primary files: + +- `run_agent.py` +- `agent/prompt_builder.py` +- `tools/memory_tool.py` + +## Cached system prompt layers + +The cached system prompt is assembled in roughly this order: + +1. default agent identity +2. tool-aware behavior guidance +3. Honcho static block (when active) +4. optional system message +5. frozen MEMORY snapshot +6. frozen USER profile snapshot +7. skills index +8. context files (`AGENTS.md`, `SOUL.md`, `.cursorrules`, `.cursor/rules/*.mdc`) +9. timestamp / optional session ID +10. 
platform hint + +## API-call-time-only layers + +These are intentionally *not* persisted as part of the cached system prompt: + +- `ephemeral_system_prompt` +- prefill messages +- gateway-derived session context overlays +- later-turn Honcho recall injected into the current-turn user message + +This separation keeps the stable prefix stable for caching. + +## Memory snapshots + +Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs. + +## Context files + +`agent/prompt_builder.py` scans and sanitizes: + +- `AGENTS.md` +- `SOUL.md` +- `.cursorrules` +- `.cursor/rules/*.mdc` + +Long files are truncated before injection. + +## Skills index + +The skills system contributes a compact skills index to the prompt when skills tooling is available. + +## Why prompt assembly is split this way + +The architecture is intentionally optimized to: + +- preserve provider-side prompt caching +- avoid mutating history unnecessarily +- keep memory semantics understandable +- let gateway/ACP/CLI add context without poisoning persistent prompt state + +## Related docs + +- [Context Compression & Prompt Caching](./context-compression-and-caching.md) +- [Session Storage](./session-storage.md) +- [Gateway Internals](./gateway-internals.md) diff --git a/website/docs/developer-guide/provider-runtime.md b/website/docs/developer-guide/provider-runtime.md new file mode 100644 index 000000000..9bfd48c28 --- /dev/null +++ b/website/docs/developer-guide/provider-runtime.md @@ -0,0 +1,116 @@ +--- +sidebar_position: 4 +title: "Provider Runtime Resolution" +description: "How Hermes resolves providers, credentials, API modes, and auxiliary models at runtime" +--- + +# Provider Runtime Resolution + +Hermes has a shared provider runtime resolver used across: + +- CLI +- gateway +- cron jobs +- ACP +- auxiliary model calls + +Primary 
implementation: + +- `hermes_cli/runtime_provider.py` +- `hermes_cli/auth.py` +- `agent/auxiliary_client.py` + +## Resolution precedence + +At a high level, provider resolution uses: + +1. explicit CLI/runtime request +2. environment variables +3. `config.yaml` model/provider config +4. provider-specific defaults or auto resolution + +## Providers + +Current provider families include: + +- OpenRouter +- Nous Portal +- OpenAI Codex +- Anthropic (native) +- Z.AI +- Kimi / Moonshot +- MiniMax +- MiniMax China +- custom OpenAI-compatible endpoints + +## Output of runtime resolution + +The runtime resolver returns data such as: + +- `provider` +- `api_mode` +- `base_url` +- `api_key` +- `source` +- provider-specific metadata like expiry/refresh info + +## Why this matters + +This resolver is the main reason Hermes can share auth/runtime logic between: + +- `hermes chat` +- gateway message handling +- cron jobs running in fresh sessions +- ACP editor sessions +- auxiliary model tasks + +## OpenRouter vs custom OpenAI-compatible base URLs + +Hermes contains logic to avoid leaking the wrong API key to a custom endpoint when both `OPENROUTER_API_KEY` and `OPENAI_API_KEY` exist. + +That distinction is especially important for: + +- local model servers +- non-OpenRouter OpenAI-compatible APIs +- switching providers without re-running setup + +## Native Anthropic path + +Anthropic is not just "via OpenRouter" anymore. 
+ +When provider resolution selects `anthropic`, Hermes uses: + +- `api_mode = anthropic_messages` +- the native Anthropic Messages API +- `agent/anthropic_adapter.py` for translation + +## OpenAI Codex path + +Codex uses a separate Responses API path: + +- `api_mode = codex_responses` +- dedicated credential resolution and auth store support + +## Auxiliary model routing + +Auxiliary tasks such as: + +- vision +- web extraction summarization +- context compression summaries +- session search summarization +- skills hub operations +- MCP helper operations +- memory flushes + +can use their own provider/model routing rather than the main conversational model. + +## Fallback models + +Hermes also supports a configured fallback model/provider, allowing runtime failover in supported error paths. + +## Related docs + +- [Agent Loop Internals](./agent-loop.md) +- [ACP Internals](./acp-internals.md) +- [Context Compression & Prompt Caching](./context-compression-and-caching.md) diff --git a/website/docs/developer-guide/session-storage.md b/website/docs/developer-guide/session-storage.md new file mode 100644 index 000000000..103a72b5d --- /dev/null +++ b/website/docs/developer-guide/session-storage.md @@ -0,0 +1,66 @@ +--- +sidebar_position: 8 +title: "Session Storage" +description: "How Hermes stores sessions in SQLite, maintains lineage, and exposes recall/search" +--- + +# Session Storage + +Hermes uses a SQLite-backed session store as the main source of truth for historical conversation state. 
+ +Primary files: + +- `hermes_state.py` +- `gateway/session.py` +- `tools/session_search_tool.py` + +## Main database + +The primary store lives at: + +```text +~/.hermes/state.db +``` + +It contains: + +- sessions +- messages +- metadata such as token counts and titles +- lineage relationships +- full-text search indexes + +## What is stored per session + +Examples of important session metadata: + +- session ID +- source/platform +- title +- created/updated timestamps +- token counts +- tool call counts +- stored system prompt snapshot +- parent session ID after compression splits + +## Lineage + +When Hermes compresses a conversation, it can continue in a new session ID while preserving ancestry via `parent_session_id`. + +This means resuming/searching can follow session families instead of treating each compressed shard as unrelated. + +## Gateway vs CLI persistence + +- CLI uses the state DB directly for resume/history/search +- gateway keeps active-session mappings and may also maintain additional platform transcript/state files +- some legacy JSON/JSONL artifacts still exist for compatibility, but SQLite is the main historical store + +## Session search + +The `session_search` tool uses the session DB's search features to retrieve and summarize relevant past work. 
+ +## Related docs + +- [Gateway Internals](./gateway-internals.md) +- [Prompt Assembly](./prompt-assembly.md) +- [Context Compression & Prompt Caching](./context-compression-and-caching.md) diff --git a/website/docs/developer-guide/tools-runtime.md b/website/docs/developer-guide/tools-runtime.md new file mode 100644 index 000000000..4cb4e0d1e --- /dev/null +++ b/website/docs/developer-guide/tools-runtime.md @@ -0,0 +1,65 @@ +--- +sidebar_position: 9 +title: "Tools Runtime" +description: "Runtime behavior of the tool registry, toolsets, dispatch, and terminal environments" +--- + +# Tools Runtime + +Hermes tools are self-registering functions grouped into toolsets and executed through a central registry/dispatch system. + +Primary files: + +- `tools/registry.py` +- `model_tools.py` +- `toolsets.py` +- `tools/terminal_tool.py` +- `tools/environments/*` + +## Tool registration model + +Each tool module calls `registry.register(...)` at import time. + +`model_tools.py` is responsible for importing/discovering tool modules and building the schema list used by the model. + +## Toolset resolution + +Toolsets are named bundles of tools. Hermes resolves them through: + +- explicit enabled/disabled toolset lists +- platform presets (`hermes-cli`, `hermes-telegram`, etc.) +- dynamic MCP toolsets +- curated special-purpose sets like `hermes-acp` + +## Dispatch + +At runtime, tools are dispatched through the central registry, with agent-loop exceptions for some agent-level tools such as memory/todo/session-search handling. + +## Terminal/runtime environments + +The terminal system supports multiple backends: + +- local +- docker +- ssh +- singularity +- modal +- daytona + +It also supports: + +- per-task cwd overrides +- background process management +- PTY mode +- approval callbacks for dangerous commands + +## Concurrency + +Tool calls may execute sequentially or concurrently depending on the tool mix and interaction requirements. 
+ +## Related docs + +- [Toolsets Reference](../reference/toolsets-reference.md) +- [Built-in Tools Reference](../reference/tools-reference.md) +- [Agent Loop Internals](./agent-loop.md) +- [ACP Internals](./acp-internals.md) diff --git a/website/docs/developer-guide/trajectory-format.md b/website/docs/developer-guide/trajectory-format.md new file mode 100644 index 000000000..0232846ca --- /dev/null +++ b/website/docs/developer-guide/trajectory-format.md @@ -0,0 +1,56 @@ +--- +sidebar_position: 10 +title: "Trajectories & Training Format" +description: "How Hermes saves trajectories, normalizes tool calls, and produces training-friendly outputs" +--- + +# Trajectories & Training Format + +Hermes can save conversation trajectories for training, evaluation, and batch data generation workflows. + +Primary files: + +- `agent/trajectory.py` +- `run_agent.py` +- `batch_runner.py` +- `trajectory_compressor.py` + +## What trajectories are for + +Trajectory outputs are used for: + +- SFT data generation +- debugging agent behavior +- benchmark/evaluation artifact capture +- post-processing and compression pipelines + +## Normalization strategy + +Hermes converts live conversation structure into a training-friendly format. + +Important behaviors include: + +- representing reasoning in explicit markup +- converting tool calls into structured XML-like regions for dataset compatibility +- grouping tool outputs appropriately +- separating successful and failed trajectories + +## Persistence boundaries + +Trajectory files do **not** blindly mirror all runtime prompt state. + +Some prompt-time-only layers are intentionally excluded from persisted trajectory content so datasets are cleaner and less environment-specific. 
+ +## Batch runner + +`batch_runner.py` emits richer metadata than single-session trajectory saving, including: + +- model/provider metadata +- toolset info +- partial/failure markers +- tool statistics + +## Related docs + +- [Environments, Benchmarks & Data Generation](./environments.md) +- [Agent Loop Internals](./agent-loop.md) diff --git a/website/docs/getting-started/installation.md b/website/docs/getting-started/installation.md index 04ba46e30..e273f6da2 100644 --- a/website/docs/getting-started/installation.md +++ b/website/docs/getting-started/installation.md @@ -123,6 +123,7 @@ uv pip install -e "." | `honcho` | AI-native memory (Honcho integration) | `uv pip install -e ".[honcho]"` | | `mcp` | Model Context Protocol support | `uv pip install -e ".[mcp]"` | | `homeassistant` | Home Assistant integration | `uv pip install -e ".[homeassistant]"` | +| `acp` | ACP editor integration support | `uv pip install -e ".[acp]"` | | `slack` | Slack messaging | `uv pip install -e ".[slack]"` | | `dev` | pytest & test utilities | `uv pip install -e ".[dev]"` | diff --git a/website/docs/getting-started/quickstart.md b/website/docs/getting-started/quickstart.md index a4c45a301..68d41ab34 100644 --- a/website/docs/getting-started/quickstart.md +++ b/website/docs/getting-started/quickstart.md @@ -147,6 +147,17 @@ hermes skills install official/security/1password Or use the `/skills` slash command inside chat. +### Use Hermes inside an editor via ACP + +Hermes can also run as an ACP server for ACP-compatible editors like VS Code, Zed, and JetBrains: + +```bash +pip install -e '.[acp]' +hermes acp +``` + +See [ACP Editor Integration](../user-guide/features/acp.md) for setup details. 
+ ### Try MCP servers Connect to external tools via the Model Context Protocol: diff --git a/website/docs/reference/cli-commands.md b/website/docs/reference/cli-commands.md index 1348bf54c..71a76b071 100644 --- a/website/docs/reference/cli-commands.md +++ b/website/docs/reference/cli-commands.md @@ -44,6 +44,7 @@ hermes [global-options] [subcommand/options] | `hermes pairing` | Approve or revoke messaging pairing codes. | | `hermes skills` | Browse, install, publish, audit, and configure skills. | | `hermes honcho` | Manage Honcho cross-session memory integration. | +| `hermes acp` | Run Hermes as an ACP server for editor integration. | | `hermes tools` | Configure enabled tools per platform. | | `hermes sessions` | Browse, export, prune, rename, and delete sessions. | | `hermes insights` | Show token/cost/activity analytics. | @@ -283,6 +284,29 @@ Subcommands: | `identity` | Seed or show the AI peer identity representation. | | `migrate` | Migration guide from openclaw-honcho to Hermes Honcho. | +## `hermes acp` + +```bash +hermes acp +``` + +Starts Hermes as an ACP (Agent Client Protocol) stdio server for editor integration. + +Related entrypoints: + +```bash +hermes-acp +python -m acp_adapter +``` + +Install support first: + +```bash +pip install -e '.[acp]' +``` + +See [ACP Editor Integration](../user-guide/features/acp.md) and [ACP Internals](../developer-guide/acp-internals.md). 
+ ## `hermes tools` ```bash diff --git a/website/docs/user-guide/features/acp.md b/website/docs/user-guide/features/acp.md new file mode 100644 index 000000000..acb948ecd --- /dev/null +++ b/website/docs/user-guide/features/acp.md @@ -0,0 +1,197 @@ +--- +sidebar_position: 11 +title: "ACP Editor Integration" +description: "Use Hermes Agent inside ACP-compatible editors such as VS Code, Zed, and JetBrains" +--- + +# ACP Editor Integration + +Hermes Agent can run as an ACP server, letting ACP-compatible editors talk to Hermes over stdio and render: + +- chat messages +- tool activity +- file diffs +- terminal commands +- approval prompts +- streamed thinking / response chunks + +ACP is a good fit when you want Hermes to behave like an editor-native coding agent instead of a standalone CLI or messaging bot. + +## What Hermes exposes in ACP mode + +Hermes runs with a curated `hermes-acp` toolset designed for editor workflows. It includes: + +- file tools: `read_file`, `write_file`, `patch`, `search_files` +- terminal tools: `terminal`, `process` +- web/browser tools +- memory, todo, session search +- skills +- execute_code and delegate_task +- vision + +It intentionally excludes things that do not fit typical editor UX, such as messaging delivery and cronjob management. + +## Installation + +Install Hermes normally, then add the ACP extra: + +```bash +pip install -e '.[acp]' +``` + +This installs the `agent-client-protocol` dependency and enables: + +- `hermes acp` +- `hermes-acp` +- `python -m acp_adapter` + +## Launching the ACP server + +Any of the following starts Hermes in ACP mode: + +```bash +hermes acp +``` + +```bash +hermes-acp +``` + +```bash +python -m acp_adapter +``` + +Hermes logs to stderr so stdout remains reserved for ACP JSON-RPC traffic. + +## Editor setup + +### VS Code + +Install an ACP client extension, then point it at the repo's `acp_registry/` directory. 
+ +Example settings snippet: + +```json +{ + "acpClient.agents": [ + { + "name": "hermes-agent", + "registryDir": "/path/to/hermes-agent/acp_registry" + } + ] +} +``` + +### Zed + +Example settings snippet: + +```json +{ + "acp": { + "agents": [ + { + "name": "hermes-agent", + "registry_dir": "/path/to/hermes-agent/acp_registry" + } + ] + } +} +``` + +### JetBrains + +Use an ACP-compatible plugin and point it at: + +```text +/path/to/hermes-agent/acp_registry +``` + +## Registry manifest + +The ACP registry manifest lives at: + +```text +acp_registry/agent.json +``` + +It advertises a command-based agent whose launch command is: + +```text +hermes acp +``` + +## Configuration and credentials + +ACP mode uses the same Hermes configuration as the CLI: + +- `~/.hermes/.env` +- `~/.hermes/config.yaml` +- `~/.hermes/skills/` +- `~/.hermes/state.db` + +Provider resolution uses Hermes' normal runtime resolver, so ACP inherits the currently configured provider and credentials. + +## Session behavior + +ACP sessions are tracked by the ACP adapter's in-memory session manager while the server is running. + +Each session stores: + +- session ID +- working directory +- selected model +- current conversation history +- cancel event + +The underlying `AIAgent` still uses Hermes' normal persistence/logging paths, but ACP `list/load/resume/fork` are scoped to the currently running ACP server process. + +## Working directory behavior + +ACP sessions bind the editor's cwd to the Hermes task ID so file and terminal tools run relative to the editor workspace, not the server process cwd. + +## Approvals + +Dangerous terminal commands can be routed back to the editor as approval prompts. ACP approval options are simpler than the CLI flow: + +- allow once +- allow always +- deny + +On timeout or error, the approval bridge denies the request. 
+ +## Troubleshooting + +### ACP agent does not appear in the editor + +Check: + +- the editor is pointed at the correct `acp_registry/` path +- Hermes is installed and on your PATH +- the ACP extra is installed (`pip install -e '.[acp]'`) + +### ACP starts but immediately errors + +Try these checks: + +```bash +hermes doctor +hermes status +hermes acp +``` + +### Missing credentials + +ACP mode does not have its own login flow. It uses Hermes' existing provider setup. Configure credentials with: + +```bash +hermes model +``` + +or by editing `~/.hermes/.env`. + +## See also + +- [ACP Internals](../../developer-guide/acp-internals.md) +- [Provider Runtime Resolution](../../developer-guide/provider-runtime.md) +- [Tools Runtime](../../developer-guide/tools-runtime.md) diff --git a/website/sidebars.ts b/website/sidebars.ts index e525ab58f..0861cdf08 100644 --- a/website/sidebars.ts +++ b/website/sidebars.ts @@ -42,6 +42,8 @@ const sidebars: SidebarsConfig = { 'user-guide/messaging/discord', 'user-guide/messaging/slack', 'user-guide/messaging/whatsapp', + 'user-guide/messaging/signal', + 'user-guide/messaging/email', 'user-guide/messaging/homeassistant', ], }, @@ -81,6 +83,7 @@ const sidebars: SidebarsConfig = { type: 'category', label: 'Integrations', items: [ + 'user-guide/features/acp', 'user-guide/features/mcp', 'user-guide/features/honcho', 'user-guide/features/provider-routing', @@ -101,6 +104,16 @@ const sidebars: SidebarsConfig = { label: 'Developer Guide', items: [ 'developer-guide/architecture', + 'developer-guide/agent-loop', + 'developer-guide/provider-runtime', + 'developer-guide/prompt-assembly', + 'developer-guide/context-compression-and-caching', + 'developer-guide/gateway-internals', + 'developer-guide/session-storage', + 'developer-guide/tools-runtime', + 'developer-guide/acp-internals', + 'developer-guide/trajectory-format', + 'developer-guide/cron-internals', 'developer-guide/environments', 'developer-guide/adding-tools', 
'developer-guide/creating-skills',