mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-30 06:41:51 +00:00
Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026,
roughly 1,080 PRs). The audit found ~50 user-visible features that had landed
in code with no docs footprint, plus a handful of stale pages. This PR closes
every gap the scan turned up.
New pages
- user-guide/features/deliverable-mode.md — extension list, agent triggers,
kanban_complete artifacts pattern, [[as_document]] override (PR #27813).
- developer-guide/web-search-provider-plugin.md — authoring guide modeled on
image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448).
Providers / auth
- Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the
display label shows up; provider id stays `alibaba` (PR #24835).
- Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs
#28116 / #28118 / #28119).
- Document Nous JWT minting from refresh token + invalid-refresh quarantine
+ cross-profile shared token store (PRs #27663 / #19712).
- Add `## Microsoft Entra ID authentication (keyless)` section to
azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic
routing details (PR #28101 / #9df9816da).
- Custom providers `api_mode` is now prompted-and-persisted, not just URL
autodetected (PR #25068).
- Delegation honours `api_mode` + auto-detects anthropic_messages base URLs
(PR #26824).
- `x_search` auto-enables when xAI credentials are present (PR #27376).
- Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR
#26534).
- NVIDIA NIM billing-origin header is set automatically (PR #26585).
Windows / installer
- `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus
the BOM-strip / git-retry hardening (PR #28169).
- Document Hermes Desktop thin installer + first-launch bootstrap (PR
#27822).
- Document `dep_ensure` Windows bootstrap (PR #27845).
- Document install-method auto-detection (pip / git / homebrew / nixos) and
the matching update command (PR #27843).
Gateway / messaging
- `/platform list|pause|resume` full description + circuit-breaker
semantics (PR #26600).
- Slack / Matrix / Mattermost get parallel `allowed_channels` /
`allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk
(PR #21251).
- Discord `allow_any_attachment` + `max_attachment_bytes` (config and env
vars) (PR #27245).
- Discord clarify-choice button rendering (PR #25485).
- Telegram `guest_mode` @mention bypass for allowlisted groups (PR
#22759).
- Telegram `notifications` mode (`important` vs `all`) (PR #22793).
- `[[as_document]]` skill / response directive for forcing
document-style media delivery (PR #21210).
CLI / TUI
- `/new [name]` argument (PR #19637).
- `/subgoal` user-supplied criteria appended to `/goal` (PR #25449).
- `/exit --delete` flag confirmation prompts for destructive slash
commands (PR #22687).
- Status-bar additions: ▶ N background indicator (PR #27175), context
compression count (PR #21218), YOLO mode banner+statusbar warning (PR
#26238).
- `display.timestamps` + `docker_extra_args` config keys (PR #23599).
- TUI collapsible startup banner sections (PR #20625).
- `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847).
i18n
- Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja,
de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches
`agent/i18n.py:SUPPORTED_LANGUAGES`.
Tools / features
- `vision_analyze` native-pixel passthrough for vision-capable callers,
with auxiliary text-describer fallback (PR #22955).
- `session_search` rewrite to the single-shape tool (discovery / scroll /
browse modes) (PRs #27590 / #27840).
- Clarify MCP transport scope: client supports stdio + SSE; embedded
`hermes mcp serve` is stdio-only (PR #21227).
- Web search backends table: add Brave Search (free tier) and DDGS rows
(PR #21337).
- ACP session-scoped edit auto-approval modes (PR #27862).
- Curator rename map in the user-visible per-run summary (PR #22910).
- Prompt caching feature page reference in features/overview.md — Claude
cross-session 1-hour prefix cache on native Anthropic / OpenRouter /
Nous Portal (PR #23828).
- Cron per-job profile parameter (PR #28124).
- `--no-skills` flag for `hermes profile create` (PR #20986).
Build
- Verified with `npm run build` in `website/`; both `en` and `zh-Hans`
locales compile. Remaining broken-link/anchor warnings are pre-existing
(`rl-training.md` from learning-path / overview; the
zh-Hans translation lag the docs skill already calls out).
180 lines
10 KiB
Markdown
180 lines
10 KiB
Markdown
---
|
|
sidebar_position: 16
|
|
title: "Persistent Goals"
|
|
description: "Set a standing goal and let Hermes keep working across turns until it's done. Our take on the Ralph loop."
|
|
---
|
|
|
|
# Persistent Goals (`/goal`)
|
|
|
|
`/goal` gives Hermes a standing objective that survives across turns. After every turn a lightweight judge model checks whether the goal is satisfied by the assistant's last response. If not, Hermes automatically feeds a continuation prompt back into the same session and keeps working — until the goal is achieved, you pause or clear it, or the turn budget runs out.
|
|
|
|
It's our take on the **Ralph loop**, directly inspired by [Codex CLI 0.128.0's `/goal`](https://github.com/openai/codex) by Eric Traut (OpenAI). The core idea — keep a goal alive across turns and don't stop until it's achieved — is theirs. The implementation here is independent and adapted to Hermes' architecture.
|
|
|
|
## When to use it
|
|
|
|
Use `/goal` for tasks where you want Hermes to iterate on its own without you re-prompting every turn:
|
|
|
|
- "Fix every lint error in `src/` and verify `ruff check` passes"
|
|
- "Port feature X from repo Y, including tests, and get CI green"
|
|
- "Investigate why session IDs sometimes drift on mid-run compression and write up a report"
|
|
- "Build a small CLI to rename files by their EXIF dates, then test it against the photos/ folder"
|
|
|
|
Tasks where the agent does one turn and stops don't need `/goal`. Tasks where *you'd otherwise have to say "keep going" three times* are where this shines.
|
|
|
|
## Quick start
|
|
|
|
```
|
|
/goal Fix every failing test in tests/hermes_cli/ and make sure scripts/run_tests.sh passes for that directory
|
|
```
|
|
|
|
What you'll see:
|
|
|
|
1. **Goal accepted** — `⊙ Goal set (20-turn budget): <your goal>`
|
|
2. **Turn 1 runs** — Hermes starts working as if you'd sent the goal as a normal message.
|
|
3. **Judge runs** — after the turn, the judge model decides `done` or `continue`.
|
|
4. **Loop fires if needed** — if `continue`, you'll see `↻ Continuing toward goal (1/20): <judge's reason>` and Hermes takes the next step automatically.
|
|
5. **Terminates** — eventually you see either `✓ Goal achieved: <reason>` or `⏸ Goal paused — N/20 turns used`.
|
|
|
|
## Commands
|
|
|
|
| Command | What it does |
|
|
|---|---|
|
|
| `/goal <text>` | Set (or replace) the standing goal. Kicks off the first turn immediately so you don't need to send a separate message. |
|
|
| `/goal` or `/goal status` | Show the current goal, its status, and turns used. |
|
|
| `/goal pause` | Stop the auto-continuation loop without clearing the goal. |
|
|
| `/goal resume` | Resume the loop (resets the turn counter back to zero). |
|
|
| `/goal clear` | Drop the goal entirely. |
|
|
|
|
Works identically on the CLI and every gateway platform (Telegram, Discord, Slack, Matrix, Signal, WhatsApp, SMS, iMessage, Webhook, API server, and the web dashboard).
|
|
|
|
## Adding criteria mid-goal: `/subgoal`
|
|
|
|
While a goal is active you can append extra acceptance criteria with `/subgoal <text>` without resetting the loop. Each call adds one numbered item to the goal's subgoal list; the **continuation prompt** the agent sees on the next turn includes the original goal plus an "Additional criteria the user added mid-loop" block, and the **judge prompt** is rewritten so the verdict must consider every subgoal — the goal isn't marked done until the original objective **and** every subgoal are met.
|
|
|
|
| Command | What it does |
|
|
|---|---|
|
|
| `/subgoal <text>` | Append a new criterion to the active goal. Requires an active `/goal`. |
|
|
| `/subgoal` (no args) | Show the current numbered subgoal list. |
|
|
| `/subgoal remove <N>` | Remove the Nth subgoal (1-based). |
|
|
| `/subgoal clear` | Drop every subgoal but keep the original goal intact. |
|
|
|
|
Subgoals are persisted alongside the goal in `SessionDB.state_meta`, so they survive `/resume`. Setting a new `/goal <text>` replaces the goal and clears the subgoal list; `/goal clear` does the same.
|
|
|
|
Use this when you start a loop ("fix the failing tests") and notice partway through that you also want it to "and add a regression test for the bug you just patched" — `/subgoal add a regression test` tightens the success criteria without breaking the running loop.
|
|
|
|
## Behavior details
|
|
|
|
### The judge
|
|
|
|
After every turn, Hermes calls an auxiliary model with:
|
|
|
|
- The standing goal text
|
|
- The agent's most recent final response (last ~4 KB of text)
|
|
- A system prompt telling the judge to reply with strict JSON: `{"done": <bool>, "reason": "<one-sentence rationale>"}`
|
|
|
|
The judge is deliberately conservative: it marks a goal `done` only when the response **explicitly** confirms the goal is complete, when the final deliverable is clearly produced, or when the goal is unachievable/blocked (treated as DONE with a block reason so we don't burn budget on impossible tasks).
|
|
|
|
### Fail-open semantics
|
|
|
|
If the judge errors (network blip, malformed response, unavailable aux client), Hermes treats the verdict as `continue` — a broken judge never wedges progress. The **turn budget** is the real backstop.
|
|
|
|
### Turn budget
|
|
|
|
Default is 20 continuation turns (`goals.max_turns` in `config.yaml`). When the budget is hit, Hermes auto-pauses and tells you exactly how to proceed:
|
|
|
|
```
|
|
⏸ Goal paused — 20/20 turns used. Use /goal resume to keep going, or /goal clear to stop.
|
|
```
|
|
|
|
`/goal resume` resets the counter to zero, so you can keep going in measured chunks.
|
|
|
|
### User messages always preempt
|
|
|
|
Any real message you send while a goal is active takes priority over the continuation loop. On the CLI your message lands in `_pending_input` ahead of the queued continuation; on the gateway it goes through the adapter FIFO the same way. The judge runs again after your turn — so if your message happens to complete the goal, the judge will catch it and stop.
|
|
|
|
### Mid-run safety (gateway)
|
|
|
|
While an agent is already running, `/goal status`, `/goal pause`, and `/goal clear` are safe to run — they only touch control-plane state and don't interrupt the current turn. Setting a **new** goal mid-run (`/goal <new text>`) is rejected with a message telling you to `/stop` first, so the old continuation can't race the new one.
|
|
|
|
### Persistence
|
|
|
|
Goal state lives in `SessionDB.state_meta` keyed by `goal:<session_id>`. That means `/resume` picks up right where you left off — set a goal, close your laptop, come back tomorrow, `/resume`, and the goal is still standing exactly as you left it (active, paused, or done).
|
|
|
|
### Prompt cache
|
|
|
|
The continuation prompt is a plain user-role message appended to history. It does **not** mutate the system prompt, swap toolsets, or touch the conversation in any way that invalidates Hermes' prompt cache. Running a 20-turn goal costs the same cache-wise as 20 turns of normal conversation.
|
|
|
|
## Configuration
|
|
|
|
Add to `~/.hermes/config.yaml`:
|
|
|
|
```yaml
|
|
goals:
|
|
# Max continuation turns before Hermes auto-pauses and asks you to
|
|
# /goal resume. Default 20. Lower this if you want tighter loops;
|
|
# raise it for long-running refactors.
|
|
max_turns: 20
|
|
```
|
|
|
|
### Choosing the judge model
|
|
|
|
The judge uses the `goal_judge` auxiliary task. By default it resolves to your main model (see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models)). If you want to route the judge to a cheap fast model to keep costs down, add an override:
|
|
|
|
```yaml
|
|
auxiliary:
|
|
goal_judge:
|
|
provider: openrouter
|
|
model: google/gemini-3-flash-preview
|
|
```
|
|
|
|
The judge call is small (~200 output tokens) and runs once per turn, so a cheap fast model is usually the right call.
|
|
|
|
## Example walkthrough
|
|
|
|
```
|
|
You: /goal Create four files /tmp/note_{1..4}.txt, one per turn, each containing its number as text
|
|
|
|
⊙ Goal set (20-turn budget): Create four files /tmp/note_{1..4}.txt, one per turn, each containing its number as text
|
|
|
|
Hermes: Creating /tmp/note_1.txt now.
|
|
💻 echo "1" > /tmp/note_1.txt (0.1s)
|
|
I've created /tmp/note_1.txt with the content "1". I'll continue with the remaining files on the next turn as you specified.
|
|
|
|
↻ Continuing toward goal (1/20): Only 1 of 4 files has been created; 3 files remain.
|
|
|
|
Hermes: [Continuing toward your standing goal]
|
|
💻 echo "2" > /tmp/note_2.txt (0.1s)
|
|
Created /tmp/note_2.txt. Two more to go.
|
|
|
|
↻ Continuing toward goal (2/20): 2 of 4 files created; 2 remain.
|
|
|
|
Hermes: [Continuing toward your standing goal]
|
|
💻 echo "3" > /tmp/note_3.txt (0.1s)
|
|
Created /tmp/note_3.txt.
|
|
|
|
↻ Continuing toward goal (3/20): 3 of 4 files created; 1 remains.
|
|
|
|
Hermes: [Continuing toward your standing goal]
|
|
💻 echo "4" > /tmp/note_4.txt (0.1s)
|
|
All four files have been created: /tmp/note_1.txt through /tmp/note_4.txt, each containing its number.
|
|
|
|
✓ Goal achieved: All four files were created with the specified content, completing the goal.
|
|
|
|
You: _
|
|
```
|
|
|
|
Four turns, one `/goal` invocation, zero "keep going" prompts from you.
|
|
|
|
## When the judge gets it wrong
|
|
|
|
No judge is perfect. Two failure modes to watch for:
|
|
|
|
**False negative — judge says continue when the goal is actually done.** The turn budget catches this. You'll see `⏸ Goal paused` and can `/goal clear` or just send a new message.
|
|
|
|
**False positive — judge says done when work remains.** You'll see `✓ Goal achieved` but you know better. Send a follow-up message to continue, or re-set the goal more precisely: `/goal <more specific text>`. The judge's system prompt is deliberately conservative to make false positives rarer than false negatives.
|
|
|
|
If you find a judge verdict unconvincing, the reason text in the `↻ Continuing toward goal` or `✓ Goal achieved` line tells you exactly what the judge saw. That's usually enough to diagnose whether the goal text was ambiguous or the model's response was.
|
|
|
|
## Attribution
|
|
|
|
`/goal` is Hermes' take on the **Ralph loop** pattern. The user-facing design — keep a goal alive across turns, don't stop until it's achieved, with create/pause/resume/clear controls — was popularised and shipped in [Codex CLI 0.128.0](https://github.com/openai/codex) by Eric Traut on OpenAI's Codex team. Our implementation is independent (central `CommandDef` registry, `SessionDB.state_meta` persistence, auxiliary-client judge, adapter-FIFO continuation on the gateway side) but the idea is theirs. Credit where credit's due.
|