docs: add ACP and internal systems implementation guides

- add ACP user and developer docs covering setup, lifecycle, callbacks, permissions, tool rendering, and runtime behavior - add developer guides for agent loop, provider runtime resolution, prompt assembly, context caching/compression, gateway internals, session storage, tools runtime, trajectories, and cron internals - refresh architecture, quickstart, installation, CLI reference, and environments docs to link the new implementation pages and ACP support
2026-04-26 01:01:40 +00:00 · 2026-03-14 00:29:48 -07:00 · 2026-03-14 00:29:48 -07:00 · d87a1615ce
commit d87a1615ce
parent 29176f302e
17 changed files with 1256 additions and 170 deletions
--- a/website/docs/developer-guide/prompt-assembly.md
+++ b/website/docs/developer-guide/prompt-assembly.md
@ -0,0 +1,85 @@
+---
+sidebar_position: 5
+title: "Prompt Assembly"
+description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
+---
+
+# Prompt Assembly
+
+Hermes deliberately separates:
+
+- **cached system prompt state**
+- **ephemeral API-call-time additions**
+
+This is one of the most important design choices in the project because it affects:
+
+- token usage
+- prompt caching effectiveness
+- session continuity
+- memory correctness
+
+Primary files:
+
+- `run_agent.py`
+- `agent/prompt_builder.py`
+- `tools/memory_tool.py`
+
+## Cached system prompt layers
+
+The cached system prompt is assembled in roughly this order:
+
+1. default agent identity
+2. tool-aware behavior guidance
+3. Honcho static block (when active)
+4. optional system message
+5. frozen MEMORY snapshot
+6. frozen USER profile snapshot
+7. skills index
+8. context files (`AGENTS.md`, `SOUL.md`, `.cursorrules`, `.cursor/rules/*.mdc`)
+9. timestamp / optional session ID
+10. platform hint
+
+## API-call-time-only layers
+
+These are intentionally *not* persisted as part of the cached system prompt:
+
+- `ephemeral_system_prompt`
+- prefill messages
+- gateway-derived session context overlays
+- later-turn Honcho recall injected into the current-turn user message
+
+This separation keeps the stable prefix stable for caching.
+
+## Memory snapshots
+
+Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
+
+## Context files
+
+`agent/prompt_builder.py` scans and sanitizes:
+
+- `AGENTS.md`
+- `SOUL.md`
+- `.cursorrules`
+- `.cursor/rules/*.mdc`
+
+Long files are truncated before injection.
+
+## Skills index
+
+The skills system contributes a compact skills index to the prompt when skills tooling is available.
+
+## Why prompt assembly is split this way
+
+The architecture is intentionally optimized to:
+
+- preserve provider-side prompt caching
+- avoid mutating history unnecessarily
+- keep memory semantics understandable
+- let gateway/ACP/CLI add context without poisoning persistent prompt state
+
+## Related docs
+
+- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
+- [Session Storage](./session-storage.md)
+- [Gateway Internals](./gateway-internals.md)