docs(hermes-agent skill): cover v0.13–v0.17 features, fix stale claims, tighten (#53566)

Refresh the hermes-agent skill against the last 5 major releases and the current codebase, and cut verbose prose.

Coverage added (v0.13.0–v0.17.0):
- New gateway platforms: iMessage (Photon), Teams, LINE, SimpleX, ntfy, Google Chat, Raft, official WhatsApp Business Cloud API (now 20+).
- New surfaces section: desktop app, web dashboard admin panel, hermes proxy (OpenAI-compatible OAuth proxy), Automation Blueprints.
- delegate_task(background=true) async subagents; memory-tool atomic batch operations; session_search three-mode shape; x_search/video_analyze toolsets; image_gen image-to-image; xAI Grok via SuperGrok OAuth.
- display.interface (cli/tui), curator.consolidate opt-in, PyPI install.

Accuracy fixes:
- Adding-a-Tool is two files (auto-discovery), not three.
- Testing uses scripts/run_tests.sh (canonical runner), not bare pytest.
- Dropped change-detector test count and a dangling references/ pointer.
- Refreshed overview (Windows-native, 20+ providers, many surfaces).

Conciseness: trimmed over-explained Windows keybinding/sandbox/test prose and deep prompt-builder internals to pointers.
2026-07-01 12:02:05 +00:00 · 2026-06-27 03:51:25 -07:00 · 2026-06-27 03:51:25 -07:00 · f67c0b3e60
commit f67c0b3e60
parent d3db73210c
1 changed file with 137 additions and 111 deletions
--- a/skills/autonomous-ai-agents/hermes-agent/SKILL.md
+++ b/skills/autonomous-ai-agents/hermes-agent/SKILL.md
@ -1,7 +1,7 @@
 ---
 name: hermes-agent
 description: "Configure, extend, or contribute to Hermes Agent."
-version: 2.2.0
+version: 2.3.0
 author: Hermes Agent + Teknium
 license: MIT
 platforms: [linux, macos, windows]
@ -14,13 +14,14 @@ metadata:

 # Hermes Agent

-Hermes Agent is an open-source AI agent framework by Nous Research that runs in your terminal, messaging platforms, and IDEs. It belongs to the same category as Claude Code (Anthropic), Codex (OpenAI), and OpenClaw — autonomous coding and task-execution agents that use tool calling to interact with your system. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, DeepSeek, local models, and 15+ others) and runs on Linux, macOS, and WSL.
+Hermes Agent is an open-source AI agent framework by Nous Research that runs in your terminal, a native desktop app, messaging platforms, and IDEs. It's in the same category as Claude Code (Anthropic), Codex (OpenAI), and OpenClaw — autonomous coding and task-execution agents that use tool calling to interact with your system. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, Google, DeepSeek, xAI, local models, and 20+ others) and runs on Linux, macOS, Windows, and WSL.

 What makes Hermes different:

 - **Self-improving through skills** — Hermes learns from experience by saving reusable procedures as skills. When it solves a complex problem, discovers a workflow, or gets corrected, it can persist that knowledge as a skill document that loads into future sessions. Skills accumulate over time, making the agent better at your specific tasks and environment.
 - **Persistent memory across sessions** — remembers who you are, your preferences, environment details, and lessons learned. Pluggable memory backends (built-in, Honcho, Mem0, and more) let you choose how memory works.
- **Multi-platform gateway** — the same agent runs on Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, and 10+ other platforms with full tool access, not just chat.
+- **Multi-platform gateway** — the same agent runs on Telegram, Discord, Slack, WhatsApp, iMessage, Signal, Matrix, Teams, Email, and a dozen more platforms with full tool access, not just chat.
+- **Many surfaces** — the same agent core drives the CLI, the Ink TUI, a native Electron desktop app, a web dashboard, and an ACP server for IDEs (VS Code / Zed / JetBrains).
 - **Provider-agnostic** — swap models and providers mid-workflow without changing anything else. Credential pools rotate across multiple API keys automatically.
 - **Profiles** — run multiple independent Hermes instances with isolated configs, sessions, skills, and memory.
 - **Extensible** — plugins, MCP servers, custom tools, webhook triggers, cron scheduling, and the full Python ecosystem.
@ -44,23 +45,27 @@ Good verification targets:
 ## Quick Start

 ```bash
-# Install
+# Install (shell installer — sets up uv, Python, the venv, and the launcher)
 curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

-# Interactive chat (default)
+# Or via PyPI (ships the TUI bundle + shell launcher)
+pip install hermes-agent       # or: uv pip install hermes-agent
+
+# Interactive chat (default surface; set display.interface: tui to launch the Ink TUI instead)
 hermes

 # Single query
 hermes chat -q "What is the capital of France?"

-# Setup wizard
+# Setup wizard  /  pick model+provider  /  health check
 hermes setup
-
-# Change model/provider
 hermes model
-
-# Check health
 hermes doctor
+
+# Other surfaces
+hermes desktop                 # launch the native desktop app (alias: hermes gui)
+hermes dashboard               # web admin panel + embedded chat
+hermes proxy                   # OpenAI-compatible local proxy backed by your OAuth provider
 ```

 ---
@ -110,14 +115,12 @@ hermes config path          Print config.yaml path
 hermes config env-path      Print .env path
 hermes config check         Check for missing/outdated config
 hermes config migrate       Update config with new options
-hermes auth                 Interactive credential manager
-hermes auth add PROVIDER    Add OAuth or API-key credential (e.g. nous, openai-codex, qwen-oauth)
-hermes auth list            List stored credentials
-hermes auth remove PROVIDER Remove a stored credential
 hermes doctor [--fix]       Check dependencies and config
 hermes status [--all]       Show component status
 ```

+Credentials (OAuth + API keys, with pooling) are managed under `hermes auth` — see the Credentials & Pools section below.
+
 ### Tools & Skills

 ```
@ -165,7 +168,7 @@ hermes gateway status       Check status
 hermes gateway setup        Configure platforms
 ```

-Supported platforms: Telegram, Discord, Slack, WhatsApp, Signal, Email, SMS, Matrix, Mattermost, Home Assistant, DingTalk, Feishu, WeCom, BlueBubbles (iMessage), Weixin (WeChat), API Server, Webhooks. Open WebUI connects via the API Server adapter.
+Supported platforms (20+): Telegram, Discord, Slack, WhatsApp (Baileys bridge + official Business Cloud API), iMessage (Photon — `hermes photon setup`, the BlueBubbles successor with no Mac relay), Signal, Email, SMS, Matrix, Mattermost, Microsoft Teams, LINE, SimpleX, ntfy, Google Chat, Home Assistant, DingTalk, Feishu, WeCom, Weixin (WeChat), Raft (agent network), API Server, Webhooks. Open WebUI connects via the API Server adapter. Most adapters ship under `plugins/platforms/`, so new ones drop in without touching core.

 Platform docs: https://hermes-agent.nousresearch.com/docs/user-guide/messaging/

@ -219,30 +222,42 @@ hermes profile export NAME  Export to tar.gz
 hermes profile import FILE  Import from archive
 ```

-### Credential Pools
+### Credentials & Pools

 ```
-hermes auth add             Interactive credential wizard
+hermes auth                 Interactive credential manager
+hermes auth add [PROVIDER]  Add OAuth or API-key credential
+                            (e.g. nous, openai-codex, qwen-oauth, anthropic)
 hermes auth list [PROVIDER] List pooled credentials
 hermes auth remove P INDEX  Remove by provider + index
 hermes auth reset PROVIDER  Clear exhaustion status
 ```

+Multiple credentials per provider form a pool that rotates automatically and skips exhausted keys.
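
To make the rotation behaviour concrete, here is a minimal sketch of a round-robin pool that skips exhausted keys. It is an illustration only, not the Hermes implementation: the `Credential` and `CredentialPool` classes and their fields are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    label: str
    exhausted: bool = False  # set when the provider reports the key is used up


@dataclass
class CredentialPool:
    """Hypothetical round-robin pool; not the actual Hermes code."""

    credentials: list[Credential] = field(default_factory=list)
    _cursor: int = 0

    def next(self) -> Credential:
        """Return the next non-exhausted credential and advance the cursor."""
        n = len(self.credentials)
        for offset in range(n):
            idx = (self._cursor + offset) % n
            cred = self.credentials[idx]
            if not cred.exhausted:
                self._cursor = (idx + 1) % n
                return cred
        raise RuntimeError("all credentials exhausted; reset the pool to retry")

    def reset(self) -> None:
        """Clear exhaustion flags, similar in spirit to `hermes auth reset`."""
        for cred in self.credentials:
            cred.exhausted = False
```

In this sketch a caller would mark a credential as exhausted after a quota error and call `next()` again, so the request is retried on the following key.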
+
 ### Other

 ```
 hermes insights [--days N]  Usage analytics
 hermes update               Update to latest version
+hermes desktop / gui        Launch the native desktop app
+hermes dashboard            Web admin panel + embedded chat
+hermes proxy                OpenAI-compatible local proxy backed by an OAuth provider
+hermes portal               Quick setup / sign in via Nous Portal
+hermes kanban <verb>        Multi-agent work-queue board (init/create/list/show/assign/…)
 hermes pairing list/approve/revoke  DM authorization
 hermes plugins list/install/remove  Plugin management
-hermes honcho setup/status  Honcho memory integration (requires honcho plugin)
+hermes secrets bitwarden …  External secret store (Bitwarden Secrets Manager)
 hermes memory setup/status/off  Memory provider config
+hermes send                 Send a one-off message through a gateway platform
 hermes completion bash|zsh  Shell completions
 hermes acp                  ACP server (IDE integration)
 hermes claw migrate         Migrate from OpenClaw
 hermes uninstall            Uninstall Hermes
 ```

+For the full, authoritative command list, run `hermes --help` (and `hermes <command> --help`). Plugin- and provider-supplied subcommands (e.g. `hermes photon setup` for iMessage) appear only once their plugin is installed/active.
+
 ---

 ## Slash Commands (In-Session)
@ -321,6 +336,7 @@ The registry of record is `hermes_cli/commands.py` — every consumer
 ### Utility
 ```
 /branch (/fork)      Branch the current session
+/handoff <platform>  Hand the live session off to a messaging platform (CLI)
 /fast                Toggle priority/fast processing
 /browser             Open CDP browser connection
 /history             Show conversation history (CLI)
@ -373,13 +389,14 @@ Edit with `hermes config edit` or `hermes config set section.key value`.
 | `agent` | `max_turns` (90), `tool_use_enforcement` |
 | `terminal` | `backend` (local/docker/ssh/modal), `cwd`, `timeout` (180) |
 | `compression` | `enabled`, `threshold` (0.50), `target_ratio` (0.20) |
-| `display` | `skin`, `tool_progress`, `show_reasoning`, `show_cost` |
+| `display` | `skin`, `interface` (cli/tui), `tool_progress`, `show_reasoning`, `show_cost`, `language` |
 | `stt` | `enabled`, `provider` (local/groq/openai/mistral) |
 | `tts` | `provider` (edge/elevenlabs/openai/minimax/mistral/neutts) |
 | `memory` | `memory_enabled`, `user_profile_enabled`, `provider` |
 | `security` | `tirith_enabled`, `website_blocklist` |
 | `delegation` | `model`, `provider`, `base_url`, `api_key`, `max_iterations` (50), `reasoning_effort` |
 | `checkpoints` | `enabled`, `max_snapshots` (50) |
+| `curator` | `enabled`, `consolidate` (false — opt-in aux-model skill consolidation), `interval_hours`, `stale_after_days` |

 Full config reference: https://hermes-agent.nousresearch.com/docs/user-guide/configuration

@ -426,8 +443,9 @@ Enable/disable via `hermes tools` (interactive) or `hermes tools enable/disable
 | `file` | File read/write/search/patch |
 | `code_execution` | Sandboxed Python execution |
 | `vision` | Image analysis |
-| `image_gen` | AI image generation |
-| `video` | Video analysis and generation |
+| `image_gen` | AI image generation and image-to-image editing |
+| `video` | Video analysis (`video_analyze`) and generation |
+| `x_search` | First-class X (Twitter) search (X OAuth or API key) |
 | `tts` | Text-to-speech |
 | `skills` | Skill browsing and management |
 | `memory` | Persistent cross-session memory |
@ -686,16 +704,19 @@ here; full developer notes live in `AGENTS.md`, user-facing docs under

 ### Delegation (`delegate_task`)

-Synchronous subagent spawn — the parent waits for the child's summary
-before continuing its own loop. Isolated context + terminal session.
+Spawn a subagent with an isolated context + terminal session.

 - **Single:** `delegate_task(goal, context, toolsets)`.
 - **Batch:** `delegate_task(tasks=[{goal, ...}, ...])` runs children in
  parallel, capped by `delegation.max_concurrent_children` (default 3).
+- **Background:** `delegate_task(background=true)` returns a handle
+  immediately and keeps the parent loop going; the child's result
+  re-enters the conversation as a new turn when it finishes.
 - **Roles:** `leaf` (default; cannot re-delegate) vs `orchestrator`
  (can spawn its own workers, bounded by `delegation.max_spawn_depth`).
- **Not durable.** If the parent is interrupted, the child is
-  cancelled. For work that must outlive the turn, use `cronjob` or
+- **Not durable.** A backgrounded child is still process-local — if the
+  parent process exits, the child is lost. For work that must outlive
+  the process, use `cronjob` or
  `terminal(background=True, notify_on_complete=True)`.

 Config: `delegation.*` in `config.yaml`.
@ -734,6 +755,11 @@ so nothing is lost.
  Bundled + hub-installed skills are off-limits. **Never deletes** —
  max destructive action is archive. Pinned skills are exempt from
  every auto-transition and every LLM review pass.
+- **Cost:** the deterministic inactivity/prune sweep runs for free. The
+  aux-model "consolidate overlapping skills into umbrellas" pass is
+  **off by default** — opt in with `curator.consolidate: true` or
+  `hermes curator run --consolidate`. Routine background curation costs
+  zero tokens.
 - **Telemetry:** sidecar at `~/.hermes/skills/.usage.json` holds
  per-skill `use_count`, `view_count`, `patch_count`,
  `last_activity_at`, `state`, `pinned`.
@ -773,6 +799,39 @@ User docs: https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban

 ---

+## Surfaces & Other Capabilities
+
+Beyond the CLI and gateway, a few other surfaces and capabilities are worth knowing about:
+
+- **Desktop app** (`hermes desktop` / `hermes gui`) — native Electron app
+  for macOS/Linux/Windows: streaming chat, session list, drag-and-drop +
+  clipboard-paste files, Cmd+K palette, status-bar model picker,
+  rebindable shortcuts, native notifications, live subagent watch-windows,
+  VS Code Marketplace themes, and per-profile remote-gateway login (OAuth
+  or username/password) so a thin local GUI can drive a heavy remote agent.
+- **Web dashboard** (`hermes dashboard`) — full admin panel: configure
+  every messaging channel, the MCP catalog, webhooks/hooks, memory, and a
+  complete profile builder (model + skills + MCPs) from the browser, plus
+  an embedded `hermes --tui` chat. Secured behind an OAuth/token gate.
+- **OpenAI-compatible proxy** (`hermes proxy`) — exposes a
+  `http://localhost:port` OpenAI API backed by whichever OAuth provider
+  you're signed into (Claude Pro, ChatGPT Pro, SuperGrok). Point Codex
+  CLI, Aider, Cline, Continue, or any script at it — no API key.
+- **Automation Blueprints** — pick a named automation and Hermes asks for
+  what it needs (no cron syntax). One definition renders as a dashboard
+  form, a slash command, an agent conversation, and a docs-catalog entry.
+- **`memory` tool batch operations** — pass an `operations` array of
+  add/replace/remove edits applied atomically against the final character
+  budget, so a single call can free space and add entries even when an add
+  alone would overflow.
+- **`session_search`** — FTS5-backed, no aux-LLM, effectively free. One
+  tool, three modes inferred from which args are set: discovery (`query`),
+  scroll (`session_id` + `around_message_id`), browse (no args).
+- **xAI Grok via SuperGrok OAuth** — sign in with your xAI account (no API
+  key); includes Cursor's `grok-composer-2.5-fast` coding model.
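
The atomic batch behaviour described for the `memory` tool can be sketched as follows. This is a simplified model under stated assumptions: the operation keys (`action`, `text`, `target`) and the `apply_batch` helper are hypothetical, and the real tool may count characters or match entries differently. The point it shows is that the budget is checked once, against the final state.

```python
def apply_batch(entries: list[str], operations: list[dict], budget: int) -> list[str]:
    """Apply add/replace/remove edits to a copy; commit only if the result fits."""
    working = list(entries)  # never mutate the caller's list
    for op in operations:
        action = op["action"]
        if action == "add":
            working.append(op["text"])
        elif action == "remove":
            working.remove(op["target"])  # raises ValueError if the entry is missing
        elif action == "replace":
            working[working.index(op["target"])] = op["text"]
        else:
            raise ValueError(f"unknown action: {action!r}")
    # The budget is checked against the final state, not after each step,
    # so an add may temporarily overflow if a later remove frees the space.
    total = sum(len(entry) for entry in working)
    if total > budget:
        raise ValueError(f"batch would use {total} chars, budget is {budget}")
    return working
```

Because the function works on a copy and raises before returning, a failed batch leaves the original entries untouched.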
+
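
One way to picture the three `session_search` modes is as a small dispatch on which arguments are set. The sketch below is an assumption about the shape, not the tool's actual code: the `infer_mode` function, the argument types, and the precedence of scroll over discovery are all hypothetical.

```python
from typing import Optional


def infer_mode(
    query: Optional[str] = None,
    session_id: Optional[str] = None,
    around_message_id: Optional[str] = None,
) -> str:
    """Pick a search mode from whichever arguments the caller supplied."""
    if session_id is not None and around_message_id is not None:
        return "scroll"  # read messages around a known position
    if query:
        return "discovery"  # full-text search across sessions
    if session_id is None and around_message_id is None:
        return "browse"  # no arguments: list recent sessions
    # Only one half of the scroll pair was given (assumed to be an error here).
    raise ValueError("scroll mode needs both session_id and around_message_id")
```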
+---
+
 ## Windows-Specific Quirks

 Hermes runs natively on Windows (PowerShell, cmd, Windows Terminal, git-bash
@ -783,54 +842,33 @@ rediscover them from scratch.

 ### Input / Keybindings

-**Alt+Enter doesn't insert a newline.** Windows Terminal intercepts Alt+Enter
-at the terminal layer to toggle fullscreen — the keystroke never reaches
-prompt_toolkit. Use **Ctrl+Enter** instead. Windows Terminal delivers
-Ctrl+Enter as LF (`c-j`), distinct from plain Enter (`c-m` / CR), and the
-CLI binds `c-j` to newline insertion on `win32` only (see
-`_bind_prompt_submit_keys` + the Windows-only `c-j` binding in `cli.py`).
-Side effect: the raw Ctrl+J keystroke also inserts a newline on Windows —
-unavoidable, because Windows Terminal collapses Ctrl+Enter and Ctrl+J to
-the same keycode at the Win32 console API layer. No conflicting binding
-existed for Ctrl+J on Windows, so this is a harmless side effect.
-
-mintty / git-bash behaves the same (fullscreen on Alt+Enter) unless you
-disable Alt+Fn shortcuts in Options → Keys. Easier to just use Ctrl+Enter.
-
-**Diagnosing keybindings.** Run `python scripts/keystroke_diagnostic.py`
-(repo root) to see exactly how prompt_toolkit identifies each keystroke
-in the current terminal. Answers questions like "does Shift+Enter come
-through as a distinct key?" (almost never — most terminals collapse it
-to plain Enter) or "what byte sequence is my terminal sending for
-Ctrl+Enter?" This is how the Ctrl+Enter = c-j fact was established.
+**Alt+Enter doesn't insert a newline** — Windows Terminal (and mintty) grab it
+for fullscreen before prompt_toolkit sees it. Use **Ctrl+Enter** instead (the
+CLI binds it to newline on Windows; raw Ctrl+J does the same, harmlessly).
+To inspect how your terminal reports a keystroke, run
+`python scripts/keystroke_diagnostic.py` from the repo root.

 ### Config / Files

-**HTTP 400 "No models provided" on first run.** `config.yaml` was saved
-with a UTF-8 BOM (common when Windows apps write it). Re-save as UTF-8
-without BOM. `hermes config edit` writes without BOM; manual edits in
-Notepad are the usual culprit.
+**HTTP 400 "No models provided" on first run** — `config.yaml` was saved with
+a UTF-8 BOM (Notepad does this). Re-save as UTF-8 without BOM;
+`hermes config edit` writes correctly.
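
If re-saving by hand is inconvenient, the BOM can also be stripped with a few lines of Python. This is a generic sketch using only the standard library; the `strip_utf8_bom` helper is a hypothetical name, not something shipped with Hermes.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path: Path) -> bool:
    """Rewrite `path` without a leading UTF-8 BOM. Returns True if one was removed."""
    raw = path.read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file is left untouched
    path.write_bytes(raw[len(codecs.BOM_UTF8):])
    return True
```

Pass it the path printed by `hermes config path`, for example `strip_utf8_bom(Path(r"C:/Users/you/.hermes/config.yaml"))` (the path here is a placeholder).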

 ### `execute_code` / Sandbox

-**WinError 10106** ("The requested service provider could not be loaded
-or initialized") from the sandbox child process — it can't create an
-`AF_INET` socket, so the loopback-TCP RPC fallback fails before
-`connect()`. Root cause is usually **not** a broken Winsock LSP; it's
-Hermes's own env scrubber dropping `SYSTEMROOT` / `WINDIR` / `COMSPEC`
-from the child env. Python's `socket` module needs `SYSTEMROOT` to locate
-`mswsock.dll`. Fixed via the `_WINDOWS_ESSENTIAL_ENV_VARS` allowlist in
-`tools/code_execution_tool.py`. If you still hit it, echo `os.environ`
-inside an `execute_code` block to confirm `SYSTEMROOT` is set. Full
-diagnostic recipe in `references/execute-code-sandbox-env-windows.md`.
+**WinError 10106** from the sandbox child process — it can't create an
+`AF_INET` socket. Root cause is usually Hermes's env scrubber dropping
+`SYSTEMROOT`/`WINDIR`/`COMSPEC` (Python's `socket` needs `SYSTEMROOT` to find
+`mswsock.dll`), not a broken Winsock LSP. The `_WINDOWS_ESSENTIAL_ENV_VARS`
+allowlist in `tools/code_execution_tool.py` covers it; if you still hit it,
+echo `os.environ` inside an `execute_code` block to confirm `SYSTEMROOT` is set.
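
The failure mode is easier to see in a stripped-down model of an env scrubber. The sketch below is not the code in `tools/code_execution_tool.py`: the `scrub_env` function and the tuple contents are assumptions made for illustration, showing only the idea of always preserving a few Windows essentials.

```python
import sys
from typing import Iterable, Mapping

# Assumed contents for illustration; the real allowlist is
# `_WINDOWS_ESSENTIAL_ENV_VARS` in tools/code_execution_tool.py.
WINDOWS_ESSENTIAL_ENV_VARS = ("SYSTEMROOT", "WINDIR", "COMSPEC")


def scrub_env(
    parent_env: Mapping[str, str],
    allowed: Iterable[str],
    platform: str = sys.platform,
) -> dict[str, str]:
    """Keep only `allowed` variables, plus the Windows essentials on win32."""
    allowed_set = set(allowed)
    if platform != "win32":
        return {k: v for k, v in parent_env.items() if k in allowed_set}
    # Windows variable names are case-insensitive, so match on upper-cased keys.
    keep = {name.upper() for name in allowed_set} | set(WINDOWS_ESSENTIAL_ENV_VARS)
    return {k: v for k, v in parent_env.items() if k.upper() in keep}
```

Without the extra names in `keep`, a child process on Windows would start with no `SYSTEMROOT`, which is the condition the paragraph above describes.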

-### Testing / Contributing
+### Testing on Windows

-**`scripts/run_tests.sh` doesn't work as-is on Windows** — it looks for
-POSIX venv layouts (`.venv/bin/activate`). The Hermes-installed venv at
-`venv/Scripts/` has no pip or pytest either (stripped for install size).
-Workaround: install `pytest + pytest-xdist + pyyaml` into a system Python
-3.11 user site, then invoke pytest directly with `PYTHONPATH` set:
+`scripts/run_tests.sh` is POSIX-only (expects `.venv/bin/activate`); the
+Hermes-installed `venv/Scripts/` has no pip/pytest (stripped for size).
+Install pytest into a system Python and run directly with `-n 0`
+(`pyproject.toml`'s `addopts` already sets `-n`):

 ```bash
 "/c/Program Files/Python311/python" -m pip install --user pytest pytest-xdist pyyaml
@ -838,24 +876,14 @@ export PYTHONPATH="$(pwd)"
 "/c/Program Files/Python311/python" -m pytest tests/foo/test_bar.py -v --tb=short -n 0
 ```

-Use `-n 0`, not `-n 4` — `pyproject.toml`'s default `addopts` already
-includes `-n`, and the wrapper's CI-parity guarantees don't apply off POSIX.
-
-**POSIX-only tests need skip guards.** Common markers already in the codebase:
- Symlinks — elevated privileges on Windows
- `0o600` file modes — POSIX mode bits not enforced on NTFS by default
- `signal.SIGALRM` — Unix-only (see `tests/conftest.py::_enforce_test_timeout`)
- Winsock / Windows-specific regressions — `@pytest.mark.skipif(sys.platform != "win32", ...)`
-
-Use the existing skip-pattern style (`sys.platform == "win32"` or
-`sys.platform.startswith("win")`) to stay consistent with the rest of the
-suite.
+(POSIX-only tests need skip guards — see the cross-platform guard list in the
+Contributor section below.)

 ### Path / Filesystem

-**Line endings.** Git may warn `LF will be replaced by CRLF the next time
-Git touches it`. Cosmetic — the repo's `.gitattributes` normalizes. Don't
-let editors auto-convert committed POSIX-newline files to CRLF.
+**Line endings.** Git may warn `LF will be replaced by CRLF`. Cosmetic — the
+repo's `.gitattributes` normalizes. Don't let editors auto-convert committed
+POSIX-newline files to CRLF.

 **Forward slashes work almost everywhere.** `C:/Users/...` is accepted by
 every Hermes tool and most Windows APIs. Prefer forward slashes in code
@ -961,13 +989,17 @@ hermes-agent/
 ├── gateway/              # Messaging gateway
 │   └── platforms/        # Platform adapters (telegram, discord, etc.)
 ├── cron/                 # Job scheduler
-├── tests/                # ~3000 pytest tests
+├── tests/                # Extensive pytest suite (run via scripts/run_tests.sh)
 └── website/              # Docusaurus docs site
 ```

 Config: `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys) — both under `$HERMES_HOME` when it is set.

-### Adding a Tool (3 files)
+### Adding a Tool
+
+Two files. Auto-discovery imports any `tools/*.py` with a top-level
+`registry.register()` call, but a tool is only *exposed* to an agent once
+its name appears in a toolset.

 **1. Create `tools/your_tool.py`:**
 ```python
@ -991,11 +1023,12 @@ registry.register(
 )
 ```

-**2. Add to `toolsets.py`** → `_HERMES_CORE_TOOLS` list.
+**2. Wire it into a toolset in `toolsets.py`** — add the name to
+`_HERMES_CORE_TOOLS` (every platform) or to a specific toolset.

-Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual list needed.
-
-All handlers must return JSON strings. Use `get_hermes_home()` for paths, never hardcode `~/.hermes`.
+All handlers must return JSON strings. Use `get_hermes_home()` for paths,
+never hardcode `~/.hermes`. For custom/local-only tools, write a plugin in
+`~/.hermes/plugins/` instead of editing core — see the developer docs.

 ### Adding a Slash Command

@ -1019,25 +1052,22 @@ run_conversation():

 ### Testing

-```bash
-python -m pytest tests/ -o 'addopts=' -q   # Full suite
-python -m pytest tests/tools/ -q            # Specific area
-```
-
- Tests auto-redirect `HERMES_HOME` to temp dirs — never touch real `~/.hermes/`
- Run full suite before pushing any change
- Use `-o 'addopts='` to clear any baked-in pytest flags
-
-**Windows contributors:** `scripts/run_tests.sh` currently looks for POSIX venvs (`.venv/bin/activate` / `venv/bin/activate`) and will error out on Windows where the layout is `venv/Scripts/activate` + `python.exe`. The Hermes-installed venv at `venv/Scripts/` also has no `pip` or `pytest` — it's stripped for end-user install size. Workaround: install pytest + pytest-xdist + pyyaml into a system Python 3.11 user site (`/c/Program Files/Python311/python -m pip install --user pytest pytest-xdist pyyaml`), then run tests directly:
+Use the canonical runner, which enforces CI parity (hermetic env, unset
+credentials, TZ=UTC, xdist workers, per-test subprocess isolation):

 ```bash
-export PYTHONPATH="$(pwd)"
-"/c/Program Files/Python311/python" -m pytest tests/tools/test_foo.py -v --tb=short -n 0
+scripts/run_tests.sh                          # full suite
+scripts/run_tests.sh tests/tools/             # one directory
+scripts/run_tests.sh tests/tools/test_x.py    # one file
+scripts/run_tests.sh -v --tb=long             # pass-through pytest flags
 ```

-Use `-n 0` (not `-n 4`) because `pyproject.toml`'s default `addopts` already includes `-n`, and the wrapper's CI-parity story doesn't apply off-POSIX.
+- Tests auto-redirect `HERMES_HOME` to temp dirs — never touch real `~/.hermes/`.
+- The script probes `.venv`, then `venv`, then the shared worktree venv.
+- **Windows:** the wrapper is POSIX-only; see the **Windows-Specific Quirks**
+  section above for the direct-pytest workaround.

-**Cross-platform test guards:** tests that use POSIX-only syscalls need a skip marker. Common ones already in the codebase:
+**Cross-platform test guards:** tests using POSIX-only syscalls need a skip marker. Common ones already in the codebase:
 - Symlink creation → `@pytest.mark.skipif(sys.platform == "win32", reason="Symlinks require elevated privileges on Windows")` (see `tests/cron/test_cron_script.py`)
 - POSIX file modes (0o600, etc.) → `@pytest.mark.skipif(sys.platform.startswith("win"), reason="POSIX mode bits not enforced on Windows")` (see `tests/hermes_cli/test_auth_toctou_file_modes.py`)
 - `signal.SIGALRM` → Unix-only (see `tests/conftest.py::_enforce_test_timeout`)
@ -1053,18 +1083,14 @@ monkeypatch.setattr(platform, "release", lambda: "6.8.0-generic")

 See `tests/agent/test_prompt_builder.py::TestEnvironmentHints` for a worked example.

-### Extending the system prompt's execution-environment block
+### System prompt's execution-environment block

-Factual guidance about the host OS, user home, cwd, terminal backend, and shell (bash vs. PowerShell on Windows) is emitted from `agent/prompt_builder.py::build_environment_hints()`. This is also where the WSL hint and per-backend probe logic live. The convention:
-
- **Local terminal backend** → emit host info (OS, `$HOME`, cwd) + Windows-specific notes (hostname ≠ username, `terminal` uses bash not PowerShell).
- **Remote terminal backend** (anything in `_REMOTE_TERMINAL_BACKENDS`: `docker, singularity, modal, daytona, ssh, managed_modal`) → **suppress** host info entirely and describe only the backend. A live `uname`/`whoami`/`pwd` probe runs inside the backend via `tools.environments.get_environment(...).execute(...)`, cached per process in `_BACKEND_PROBE_CACHE`, with a static fallback if the probe times out.
- **Key fact for prompt authoring:** when `TERMINAL_ENV != "local"`, *every* file tool (`read_file`, `write_file`, `patch`, `search_files`) runs inside the backend container, not on the host. The system prompt must never describe the host in that case — the agent can't touch it.
-
-Full design notes, the exact emitted strings, and testing pitfalls:
-`references/prompt-builder-environment-hints.md`.
-
-**Refactor-safety pattern (POSIX-equivalence guard):** when you extract inline logic into a helper that adds Windows/platform-specific behavior, keep a `_legacy_<name>` oracle function in the test file that's a verbatim copy of the old code, then parametrize-diff against it. Example: `tests/tools/test_code_execution_windows_env.py::TestPosixEquivalence`. This locks in the invariant that POSIX behavior is bit-for-bit identical and makes any future drift fail loudly with a clear diff.
+Factual host/backend guidance (OS, `$HOME`, cwd, terminal backend, shell)
+is emitted by `agent/prompt_builder.py::build_environment_hints()`. The key
+invariant for prompt authors: with a **remote** terminal backend
+(`docker, singularity, modal, daytona, ssh, managed_modal`), host info is
+suppressed and *every* file tool runs inside the backend container — the
+prompt must never describe the host, which the agent cannot touch.
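
The invariant can be illustrated with a small stand-in for the hint builder. This is a hedged sketch, not `build_environment_hints()` itself: the `environment_hints` function, its parameters, and the emitted strings are hypothetical, and only the backend names come from the text above.

```python
REMOTE_TERMINAL_BACKENDS = frozenset(
    {"docker", "singularity", "modal", "daytona", "ssh", "managed_modal"}
)


def environment_hints(backend: str, host_os: str, home: str, cwd: str) -> list[str]:
    """Return prompt lines; host details appear only for a local backend."""
    if backend in REMOTE_TERMINAL_BACKENDS:
        # Remote backend: say nothing about the host, describe only the backend.
        return [
            f"Terminal backend: {backend}",
            "File tools run inside the backend, not on the host.",
        ]
    return [
        f"Host OS: {host_os}",
        f"Home directory: {home}",
        f"Working directory: {cwd}",
    ]
```

A test for the real function could follow the same pattern: build hints for a remote backend and assert that no host path or OS name appears in the output.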

 ### Commit Conventions