mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-09 03:11:58 +00:00
270 lines
11 KiB
Markdown
270 lines
11 KiB
Markdown
---
|
|
sidebar_position: 5
|
|
title: "Prompt Assembly"
|
|
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
|
|
---
|
|
|
|
# Prompt Assembly
|
|
|
|
Hermes deliberately separates:
|
|
|
|
- **cached system prompt state**
|
|
- **ephemeral API-call-time additions**
|
|
|
|
This is one of the most important design choices in the project because it affects:
|
|
|
|
- token usage
|
|
- prompt caching effectiveness
|
|
- session continuity
|
|
- memory correctness
|
|
|
|
Primary files:
|
|
|
|
- `run_agent.py`
|
|
- `agent/prompt_builder.py`
|
|
- `tools/memory_tool.py`
|
|
|
|
## Cached system prompt layers
|
|
|
|
The cached system prompt is assembled in roughly this order:
|
|
|
|
1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
|
|
2. tool-aware behavior guidance
|
|
3. Honcho static block (when active)
|
|
4. optional system message
|
|
5. frozen MEMORY snapshot
|
|
6. frozen USER profile snapshot
|
|
7. skills index
|
|
8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1
|
|
9. timestamp / optional session ID
|
|
10. platform hint
|
|
|
|
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
|
|
|
|
### Concrete example: assembled system prompt
|
|
|
|
Here is a simplified view of what the final system prompt looks like when all layers are present (comments show the source of each section):
|
|
|
|
```
|
|
# Layer 1: Agent Identity (from ~/.hermes/SOUL.md)
|
|
You are Hermes, an AI assistant created by Nous Research.
|
|
You are an expert software engineer and researcher.
|
|
You value correctness, clarity, and efficiency.
|
|
...
|
|
|
|
# Layer 2: Tool-aware behavior guidance
|
|
You have persistent memory across sessions. Save durable facts using
|
|
the memory tool: user preferences, environment details, tool quirks,
|
|
and stable conventions. Memory is injected into every turn, so keep
|
|
it compact and focused on facts that will still matter later.
|
|
...
|
|
When the user references something from a past conversation or you
|
|
suspect relevant cross-session context exists, use session_search
|
|
to recall it before asking them to repeat themselves.
|
|
|
|
# Tool-use enforcement (for GPT/Codex models only)
|
|
You MUST use your tools to take action — do not describe what you
|
|
would do or plan to do without actually doing it.
|
|
...
|
|
|
|
# Layer 3: Honcho static block (when active)
|
|
[Honcho personality/context data]
|
|
|
|
# Layer 4: Optional system message (from config or API)
|
|
[User-configured system message override]
|
|
|
|
# Layer 5: Frozen MEMORY snapshot
|
|
## Persistent Memory
|
|
- User prefers Python 3.12, uses pyproject.toml
|
|
- Default editor is nvim
|
|
- Working on project "atlas" in ~/code/atlas
|
|
- Timezone: US/Pacific
|
|
|
|
# Layer 6: Frozen USER profile snapshot
|
|
## User Profile
|
|
- Name: Alice
|
|
- GitHub: alice-dev
|
|
|
|
# Layer 7: Skills index
|
|
## Skills (mandatory)
|
|
Before replying, scan the skills below. If one clearly matches
|
|
your task, load it with skill_view(name) and follow its instructions.
|
|
...
|
|
<available_skills>
|
|
software-development:
|
|
- code-review: Structured code review workflow
|
|
- test-driven-development: TDD methodology
|
|
research:
|
|
- arxiv: Search and summarize arXiv papers
|
|
</available_skills>
|
|
|
|
# Layer 8: Context files (from project directory)
|
|
# Project Context
|
|
The following project context files have been loaded and should be followed:
|
|
|
|
## AGENTS.md
|
|
This is the atlas project. Use pytest for testing. The main
|
|
entry point is src/atlas/main.py. Always run `make lint` before
|
|
committing.
|
|
|
|
# Layer 9: Timestamp + session
|
|
Current time: 2026-03-30T14:30:00-07:00
|
|
Session: abc123
|
|
|
|
# Layer 10: Platform hint
|
|
You are a CLI AI Agent. Try not to use markdown but simple text
|
|
renderable inside a terminal.
|
|
```
|
|
|
|
## How SOUL.md appears in the prompt
|
|
|
|
`SOUL.md` lives at `~/.hermes/SOUL.md` and serves as the agent's identity — the very first section of the system prompt. The loading logic in `prompt_builder.py` works as follows:
|
|
|
|
```python
|
|
# From agent/prompt_builder.py (simplified)
|
|
def load_soul_md() -> Optional[str]:
|
|
soul_path = get_hermes_home() / "SOUL.md"
|
|
if not soul_path.exists():
|
|
return None
|
|
content = soul_path.read_text(encoding="utf-8").strip()
|
|
content = _scan_context_content(content, "SOUL.md") # Security scan
|
|
content = _truncate_content(content, "SOUL.md") # Cap at 20k chars
|
|
return content
|
|
```
|
|
|
|
When `load_soul_md()` returns content, it replaces the hardcoded `DEFAULT_AGENT_IDENTITY`. The `build_context_files_prompt()` function is then called with `skip_soul=True` to prevent SOUL.md from appearing twice (once as identity, once as a context file).
|
|
|
|
If `SOUL.md` doesn't exist, the system falls back to:
|
|
|
|
```
|
|
You are Hermes Agent, an intelligent AI assistant created by Nous Research.
|
|
You are helpful, knowledgeable, and direct. You assist users with a wide
|
|
range of tasks including answering questions, writing and editing code,
|
|
analyzing information, creative work, and executing actions via your tools.
|
|
You communicate clearly, admit uncertainty when appropriate, and prioritize
|
|
being genuinely useful over being verbose unless otherwise directed below.
|
|
Be targeted and efficient in your exploration and investigations.
|
|
```
|
|
|
|
## How context files are injected
|
|
|
|
`build_context_files_prompt()` uses a **priority system** — only one project context type is loaded (first match wins):
|
|
|
|
```python
|
|
# From agent/prompt_builder.py (simplified)
|
|
def build_context_files_prompt(cwd=None, skip_soul=False):
|
|
cwd_path = Path(cwd).resolve()
|
|
|
|
# Priority: first match wins — only ONE project context loaded
|
|
project_context = (
|
|
_load_hermes_md(cwd_path) # 1. .hermes.md / HERMES.md (walks to git root)
|
|
or _load_agents_md(cwd_path) # 2. AGENTS.md (cwd only)
|
|
or _load_claude_md(cwd_path) # 3. CLAUDE.md (cwd only)
|
|
or _load_cursorrules(cwd_path) # 4. .cursorrules / .cursor/rules/*.mdc
|
|
)
|
|
|
|
sections = []
|
|
if project_context:
|
|
sections.append(project_context)
|
|
|
|
# SOUL.md from HERMES_HOME (independent of project context)
|
|
if not skip_soul:
|
|
soul_content = load_soul_md()
|
|
if soul_content:
|
|
sections.append(soul_content)
|
|
|
|
if not sections:
|
|
return ""
|
|
|
|
return (
|
|
"# Project Context\n\n"
|
|
"The following project context files have been loaded "
|
|
"and should be followed:\n\n"
|
|
+ "\n".join(sections)
|
|
)
|
|
```
|
|
|
|
### Context file discovery details
|
|
|
|
| Priority | Files | Search scope | Notes |
|
|
|----------|-------|-------------|-------|
|
|
| 1 | `.hermes.md`, `HERMES.md` | CWD up to git root | Hermes-native project config |
|
|
| 2 | `AGENTS.md` | CWD only | Common agent instruction file |
|
|
| 3 | `CLAUDE.md` | CWD only | Claude Code compatibility |
|
|
| 4 | `.cursorrules`, `.cursor/rules/*.mdc` | CWD only | Cursor compatibility |
|
|
|
|
All context files are:
|
|
- **Security scanned** — checked for prompt injection patterns (invisible unicode, "ignore previous instructions", credential exfiltration attempts)
|
|
- **Truncated** — capped at 20,000 characters using 70/20 head/tail ratio with a truncation marker
|
|
- **YAML frontmatter stripped** — `.hermes.md` frontmatter is removed (reserved for future config overrides)
|
|
|
|
## API-call-time-only layers
|
|
|
|
These are intentionally *not* persisted as part of the cached system prompt:
|
|
|
|
- `ephemeral_system_prompt`
|
|
- prefill messages
|
|
- gateway-derived session context overlays
|
|
- later-turn Honcho recall injected into the current-turn user message
|
|
|
|
This separation keeps the stable prefix stable for caching.
|
|
|
|
## Memory snapshots
|
|
|
|
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
|
|
|
|
## Context files
|
|
|
|
`agent/prompt_builder.py` scans and sanitizes project context files using a **priority system** — only one type is loaded (first match wins):
|
|
|
|
1. `.hermes.md` / `HERMES.md` (walks to git root)
|
|
2. `AGENTS.md` (CWD at startup; subdirectories discovered progressively during the session via `agent/subdirectory_hints.py`)
|
|
3. `CLAUDE.md` (CWD only)
|
|
4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only)
|
|
|
|
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
|
|
|
|
Long files are truncated before injection.
|
|
|
|
## Skills index
|
|
|
|
The skills system contributes a compact skills index to the prompt when skills tooling is available.
|
|
|
|
## Supported prompt customization surfaces
|
|
|
|
Most users should treat `agent/prompt_builder.py` as implementation code, not a configuration surface. The supported customization path is to change the prompt inputs Hermes already loads, rather than editing Python templates in place.
|
|
|
|
### Use these surfaces first
|
|
|
|
- `~/.hermes/SOUL.md` — replace the built-in default identity block with your own agent persona and standing behavior.
|
|
- `~/.hermes/MEMORY.md` and `~/.hermes/USER.md` — provide durable cross-session facts and user profile data that should be snapshotted into new sessions.
|
|
- Project context files such as `.hermes.md`, `HERMES.md`, `AGENTS.md`, `CLAUDE.md`, or `.cursorrules` — inject repo-specific working rules.
|
|
- Skills — package reusable workflows and references without editing core prompt code.
|
|
- Optional system prompt config / API overrides — add deployment-specific instruction text without forking Hermes.
|
|
- Ephemeral overlays such as `HERMES_EPHEMERAL_SYSTEM_PROMPT` or prefill messages — add turn-scoped guidance that should not become part of the cached prompt prefix.
|
|
|
|
### When to edit code instead
|
|
|
|
Edit `agent/prompt_builder.py` only if you are intentionally maintaining a fork or contributing upstream behavior changes. That file assembles the prompt plumbing, cache boundaries, and injection order for every session. Direct edits there are global product changes, not per-user prompt customization.
|
|
|
|
In other words:
|
|
|
|
- if you want a different assistant identity, edit `SOUL.md`
|
|
- if you want different repo rules, edit project context files
|
|
- if you want reusable operating procedures, add or modify skills
|
|
- if you want to change how Hermes assembles prompts for everyone, change Python and treat it as a code contribution
|
|
|
|
## Why prompt assembly is split this way
|
|
|
|
The architecture is intentionally optimized to:
|
|
|
|
- preserve provider-side prompt caching
|
|
- avoid mutating history unnecessarily
|
|
- keep memory semantics understandable
|
|
- let gateway/ACP/CLI add context without poisoning persistent prompt state
|
|
|
|
## Related docs
|
|
|
|
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
|
- [Session Storage](./session-storage.md)
|
|
- [Gateway Internals](./gateway-internals.md)
|