From b922e3ff93c457e6079aea8637ffdc7a15dc15b8 Mon Sep 17 00:00:00 2001
From: WadydX <65117428+WadydX@users.noreply.github.com>
Date: Thu, 28 May 2026 23:25:51 +0100
Subject: [PATCH] docs(prompt): align precedence docs with system prompt runtime

- Replace outdated linear ordering in prompt-assembly guide with current stable/context/volatile tier contract from system_prompt.py
- Clarify where memory/profile snapshots live versus skills guidance
- Document that pre_llm_call context is user-message injection, not cached system-prompt mutation
- Update architecture guide wording to reference system_prompt.py + prompt_builder.py tiered assembly

Closes #34118
---
 website/docs/developer-guide/architecture.md  |  2 +-
 .../docs/developer-guide/prompt-assembly.md   | 28 ++++++++++---------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md
index 93077db0a64..f3698f8e52a 100644
--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@@ -197,7 +197,7 @@ The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles prov
 
 Prompt construction and maintenance across the conversation lifecycle:
 
-- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
+- **`system_prompt.py` + `prompt_builder.py`** — assembles the ordered system-prompt tiers (`stable` → `context` → `volatile`): identity/tool guidance/skills, context files, then memory/profile/timestamp blocks
 - **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
 - **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
 
diff --git a/website/docs/developer-guide/prompt-assembly.md b/website/docs/developer-guide/prompt-assembly.md
index f23705870ee..d4b31027e2f 100644
--- a/website/docs/developer-guide/prompt-assembly.md
+++ b/website/docs/developer-guide/prompt-assembly.md
@@ -26,18 +26,18 @@ Primary files:
 
 ## Cached system prompt layers
 
-The cached system prompt is assembled in roughly this order:
+The cached system prompt is assembled as three ordered tiers (see `agent/system_prompt.py`):
 
-1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
-2. tool-aware behavior guidance
-3. Honcho static block (when active)
-4. optional system message
-5. frozen MEMORY snapshot
-6. frozen USER profile snapshot
-7. skills index
-8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1
-9. timestamp / optional session ID
-10. platform hint
+1. **stable** — identity (`SOUL.md` or fallback), tool/model guidance, skills prompt, environment hints, platform hints
+2. **context** — caller-supplied `system_message` plus project context files (`.hermes.md` / `AGENTS.md` / `CLAUDE.md` / `.cursorrules`)
+3. **volatile** — built-in memory snapshot (`MEMORY.md`), user profile snapshot (`USER.md`), external memory-provider block, timestamp/session/model/provider line
+
+The final system prompt is then joined as: `stable` → `context` → `volatile`.
+
+This ordering matters for precedence discussions:
+- skills are part of the **stable** tier
+- memory/profile snapshots are part of the **volatile** tier
+- both are still in the cached system prompt (they are not injected as ad-hoc mid-turn overlays)
 
 When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
 
@@ -205,13 +205,15 @@ These are intentionally *not* persisted as part of the cached system prompt:
 - `ephemeral_system_prompt`
 - prefill messages
 - gateway-derived session context overlays
-- later-turn Honcho recall injected into the current-turn user message
+- later-turn Honcho/external recall injected into the current-turn user message
+
+`pre_llm_call` plugin context also lands in this API-call-time path: it is appended to the current turn's **user message**, not written into the cached system prompt. When multiple plugins return context, Hermes concatenates those context blocks (see [Hooks → `pre_llm_call`](../user-guide/features/hooks.md#pre_llm_call)).
 
 This separation keeps the stable prefix stable for caching.
 
 ## Memory snapshots
 
-Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
+Local memory and user profile data are captured in the system prompt's **volatile tier**. Mid-session writes update disk state but do not mutate the already-built cached system prompt until a rebuild path runs (new session, or explicit invalidation/rebuild flow such as compression-triggered rebuild).
 
 ## Context files
 
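
Reviewer note: the tier contract documented in this patch can be illustrated with a small sketch. This is not the code in `agent/system_prompt.py`; `PromptTiers`, `join_tiers`, `inject_plugin_context`, and the blank-line separator are hypothetical names and choices made only for illustration. It assumes each tier is a list of text blocks, that the cached system prompt is their concatenation in the order stable, context, volatile, and that `pre_llm_call`-style context is appended to the user message rather than to the system prompt.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTiers:
    """Hypothetical container for the three tiers named in the patch."""

    stable: list[str] = field(default_factory=list)    # identity, guidance, skills
    context: list[str] = field(default_factory=list)   # system_message, context files
    volatile: list[str] = field(default_factory=list)  # memory, profile, timestamp


def join_tiers(tiers: PromptTiers) -> str:
    """Join non-empty blocks in the order stable -> context -> volatile."""
    ordered = [*tiers.stable, *tiers.context, *tiers.volatile]
    # The blank-line separator is an assumption, not taken from the source.
    return "\n\n".join(block for block in ordered if block.strip())


def inject_plugin_context(user_message: str, plugin_contexts: list[str]) -> str:
    """Append plugin context blocks to the user message.

    The system prompt is never passed in, so it cannot be mutated here.
    """
    extras = [ctx for ctx in plugin_contexts if ctx.strip()]
    if not extras:
        return user_message
    return "\n\n".join([user_message, *extras])


tiers = PromptTiers(
    stable=["IDENTITY", "SKILLS"],
    context=["AGENTS.md contents"],
    volatile=["MEMORY snapshot", "USER snapshot"],
)
system_prompt = join_tiers(tiers)
user_turn = inject_plugin_context("hello", ["plugin A context", "plugin B context"])
```

Keeping the injection helper free of any reference to the system prompt mirrors the separation the docs describe: per-turn context changes the user message only, so the cached prefix stays byte-identical between turns.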