* docs(providers): add model-provider-plugin authoring guide + fix stale refs
New docs:
- website/docs/developer-guide/model-provider-plugin.md — full authoring
guide (directory layout, minimal example, ProviderProfile fields,
overridable hooks, user overrides, api_mode selection, auth types,
testing, pip distribution)
- Wired into website/sidebars.ts under 'Extending'
- Cross-references added in:
- guides/build-a-hermes-plugin.md (tip block)
- developer-guide/adding-providers.md
- developer-guide/provider-runtime.md
User guide:
- user-guide/features/plugins.md: Plugin types table grows from 3 to 4
with 'Model providers' row
Stale comment cleanup (providers/*.py → plugins/model-providers/<name>/):
- hermes_cli/main.py:_is_profile_api_key_provider docstring
- hermes_cli/doctor.py:_build_apikey_providers_list docstring
- hermes_cli/auth.py: PROVIDER_REGISTRY + alias auto-extension comments
- hermes_cli/models.py: CANONICAL_PROVIDERS auto-extension comment
AGENTS.md:
- Project-structure tree: added plugins/model-providers/ row
- New section: 'Model-provider plugins' explaining discovery, override
semantics, PluginManager integration, kind auto-coerce heuristic
Verified: docusaurus build succeeds, new page renders, all 3 cross-links
resolve. 347/347 targeted tests pass (tests/providers/,
tests/hermes_cli/test_plugins.py, tests/hermes_cli/test_runtime_provider_resolution.py,
tests/run_agent/test_provider_parity.py).
* docs(plugins): add 'pluggable interfaces at a glance' maps to plugins.md + build-a-hermes-plugin
Devs landing on either the user-guide plugin page or the build-a-plugin
guide now get an upfront table of every distinct pluggable surface with
a link to the right authoring doc. Previously they'd have to read the
full general-plugin guide to discover that model providers / platforms
/ memory / context engines are separate systems.
user-guide/features/plugins.md:
- New 'Pluggable interfaces — where to go for each' section below the
existing 4-kinds table
- 10 rows covering every register_* surface (tool, hook, slash command,
CLI subcommand, skill, model provider, platform, memory, context
engine, image-gen)
- Explicit note: TTS/STT are NOT plugin-extensible yet — documented
with a pointer to the current config.yaml 'command providers' pattern
and a note that register_tts_provider()/register_stt_provider() may
come later
guides/build-a-hermes-plugin.md:
- New :::info 'Not sure which guide you need?' map at the top so devs
see all pluggable interfaces before investing in this 737-line
general-plugin walkthrough
- Existing bottom :::tip expanded to include platform adapters alongside
model/memory/context plugins
Verified:
- All 8 cross-doc links in the new plugins.md table resolve in a
docusaurus build (SUCCESS, no new broken links)
- TTS link corrected (features/voice → features/tts; latter exists)
- Pre-existing broken links/anchors (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist) are unchanged
* docs(plugins): correct TTS/STT pluggability \u2014 they ARE plugins (command-providers)
Previous commit incorrectly said TTS/STT 'aren't plugin-extensible'. They
are, via the config-driven command-provider pattern \u2014 any CLI that reads
text and writes audio (or vice versa for STT) is automatically a plugin
with zero Python. The tts.md docs cover this extensively and I missed it.
plugins.md:
- TTS row: 'Config-driven (not a Python plugin)', points at
tts.md#custom-command-providers
- STT row: points at tts.md#voice-message-transcription-stt (STT docs
live in tts.md despite the filename)
- Expanded note: TTS/STT use config-driven shell-command templates as
their plugin surface (full tts.providers.<name> registry for TTS;
HERMES_LOCAL_STT_COMMAND escape hatch for STT)
- Any CLI that reads/writes files is automatically a plugin \u2014 no Python
register_* API needed
- Future register_tts_provider()/register_stt_provider() hooks mentioned
as nice-to-have for SDK/streaming cases, not as the primary story
build-a-hermes-plugin.md:
- Same map update: TTS/STT rows explicit, footer note corrected
Verified:
- tts.md anchors (custom-command-providers, voice-message-transcription-stt)
exist and resolve in docusaurus build (SUCCESS, no new broken links)
* docs(plugins): expand pluggable interfaces table with MCP / event hooks / shell hooks / skill taps
Broadened the scope beyond Python register_* hooks. Hermes has MULTIPLE
plugin-style extension surfaces; they're now all in one table instead of
being scattered across feature docs.
Added rows for:
- **MCP servers** — config.yaml mcp_servers.<name> auto-registers external
tools from any MCP server. Huge extensibility surface, previously not
linked from the plugin map.
- **Gateway event hooks** — drop HOOK.yaml + handler.py into
~/.hermes/hooks/<name>/ to fire on gateway:startup, session:*, agent:*,
command:* events. Separate from Python plugin hooks.
- **Shell hooks** — hooks: block in config.yaml runs shell commands on
events (notifications, auditing, etc.).
- **Skill sources (taps)** — hermes skills tap add <repo> to pull in new
skill registries beyond the built-in sources.
Both docs updated:
- user-guide/features/plugins.md: table column renamed to 'How' (mixes
Python API + config-driven + drop-in-dir surfaces accurately)
- guides/build-a-hermes-plugin.md: :::info map at top mirrors the new
surfaces with a forward-link to the consolidated table
Note block rewritten: instead of singling out TTS/STT as the 'different
style' exception, now honestly describes that Hermes deliberately
supports three plugin styles — Python APIs, config-driven commands, and
drop-in manifest directories — and devs should pick the one that fits
their integration.
Not included (considered and rejected):
- Transport layer (register_transport) — internal, not user-facing
- Tool-call parsers — internal, VLLM phase-2 thing
- Cloud browser providers — hardcoded registry, not drop-in yet
- Terminal backends — hardcoded if/elif, not drop-in yet
- Skill sources (the ABC) — hardcoded list, only taps are user-extensible
Verified:
- All 5 new anchors resolve (gateway-event-hooks, shell-hooks, skills-hub,
custom-command-providers, voice-message-transcription-stt)
- Docusaurus build SUCCESS, zero new broken links
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): cover every pluggable surface in both the overview and how-to
Both plugins.md and build-a-hermes-plugin.md now cover every extension
surface end-to-end \u2014 general plugin APIs, specialized plugin types,
config-driven surfaces \u2014 with concrete authoring patterns for each.
plugins.md:
- 'What plugins can do' table grows from 9 rows (general ctx.register_*
only) to 14 rows covering register_platform, register_image_gen_provider,
register_context_engine, MemoryProvider subclass, register_provider
(model). Each row links to its full authoring guide.
- New 'Plugin sub-categories' section under Plugin Discovery explains
how plugins/platforms/, plugins/image_gen/, plugins/memory/,
plugins/context_engine/, plugins/model-providers/ are routed to
different loaders \u2014 PluginManager vs the per-category own-loader
systems.
- Explicit mention of user-override semantics at
~/.hermes/plugins/model-providers/ and ~/.hermes/plugins/memory/.
build-a-hermes-plugin.md:
- New '## Specialized plugin types' section (5 sub-sections):
- Model provider plugins \u2014 ProviderProfile + plugin.yaml example,
auto-wiring summary, link to full guide
- Platform plugins \u2014 BasePlatformAdapter + register_platform() skeleton
- Memory provider plugins \u2014 MemoryProvider subclass example
- Context engine plugins \u2014 ContextEngine subclass example
- Image-generation backends \u2014 ImageGenProvider + kind: backend example
- New '## Non-Python extension surfaces' section (5 sub-sections):
- MCP servers \u2014 config.yaml mcp_servers.<name> example
- Gateway event hooks \u2014 HOOK.yaml + handler.py example
- Shell hooks \u2014 hooks: block in config.yaml example
- Skill sources (taps) \u2014 hermes skills tap add example
- TTS / STT command templates \u2014 tts.providers.<name> with type: command
- Distribute via pip / NixOS promoted from ### to ## (they were orphaned
after the reorganization)
Each specialized / non-Python section has a concrete, copy-pasteable
example plus a 'Full guide:' link to the authoritative doc. Devs arriving
at the build-a-hermes-plugin guide now see every extension surface at
their disposal, not just the general tool/hook/slash-command surface.
Verified:
- Docusaurus build SUCCESS, zero new broken links
- All new cross-links (developer-guide/model-provider-plugin,
adding-platform-adapters, memory-provider-plugin, context-engine-plugin,
user-guide/features/mcp, skills#skills-hub, hooks#gateway-event-hooks,
hooks#shell-hooks, tts#custom-command-providers,
tts#voice-message-transcription-stt) resolve
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): fix opt-in inconsistency — not every plugin is gated
The 'Every plugin is disabled by default' statement was wrong. Several
plugin categories intentionally bypass plugins.enabled:
- Bundled platform plugins (IRC, Teams) auto-load so shipped gateway
channels are available out of the box. Activation per channel is via
gateway.platforms.<name>.enabled.
- Bundled backends (plugins/image_gen/*) auto-load so the default
backend 'just works'. Selection via <category>.provider config.
- Memory providers are all discovered; one is active via memory.provider.
- Context engines are all discovered; one is active via context.engine.
- Model providers: all 33 discovered at first get_provider_profile();
user picks via --provider / config.
The plugins.enabled allow-list specifically gates:
- Standalone plugins (general tools/hooks/slash commands)
- User-installed backends
- User-installed platforms (third-party gateway adapters)
- Pip entry-point backends
Which matches the actual code in hermes_cli/plugins.py:737 where the
bundled+backend/platform check bypasses the allow-list.
Rewrote '## Plugins are opt-in' to:
- Retitle to 'Plugins are opt-in (with a few exceptions)'
- Narrow opening claim to 'General plugins and user-installed backends
are disabled by default'
- Added 'What the allow-list does NOT gate' subsection with a full
table of which bypass the gate and how they're activated instead
- Fixed migration section wording (bundled platform/backend plugins
never needed grandfathering)
Verified: docusaurus build SUCCESS, zero new broken links.
45 KiB
Hermes Agent - Development Guide
Instructions for AI coding assistants and developers working on the hermes-agent codebase.
Development Environment
# Prefer .venv; fall back to venv if that's what your checkout has.
source .venv/bin/activate # or: source venv/bin/activate
scripts/run_tests.sh probes .venv first, then venv, then
$HOME/.hermes/hermes-agent/venv (for worktrees that share a venv with the
main checkout).
Project Structure
File counts shift constantly — don't treat the tree below as exhaustive. The canonical source is the filesystem. The notes call out the load-bearing entry points you'll actually edit.
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop (~12k LOC)
├── model_tools.py # Tool orchestration, discover_builtin_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── cli.py # HermesCLI class — interactive CLI orchestrator (~11k LOC)
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
├── hermes_constants.py # get_hermes_home(), display_hermes_home() — profile-aware paths
├── hermes_logging.py # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
├── batch_runner.py # Parallel batch processing
├── agent/ # Agent internals (provider adapters, memory, caching, compression, etc.)
├── hermes_cli/ # CLI subcommands, setup wizard, plugins loader, skin engine
├── tools/ # Tool implementations — auto-discovered via tools/registry.py
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging gateway — run.py + session.py + platforms/
│ ├── platforms/ # Adapter per platform (telegram, discord, slack, whatsapp,
│ │ # homeassistant, signal, matrix, mattermost, email, sms,
│ │ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Extension point for always-registered gateway hooks (none shipped)
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
│ ├── model-providers/ # Inference backend plugins (openrouter, anthropic, gmi, ...)
│ ├── kanban/ # Multi-agent board dispatcher + worker plugin
│ ├── hermes-achievements/ # Gamified achievement tracking
│ ├── observability/ # Metrics / traces / logs plugin
│ ├── image_gen/ # Image-generation providers
│ └── <others>/ # disk-cleanup, example-dashboard, google_meet, platforms,
│ # spotify, strike-freedom-cockpit, ...
├── optional-skills/ # Heavier/niche skills shipped but NOT active by default
├── skills/ # Built-in skills bundled with the repo
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
│ └── src/ # entry.tsx, app.tsx, gatewayClient.ts + app/components/hooks/lib
├── tui_gateway/ # Python JSON-RPC backend for the TUI
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler — jobs.py, scheduler.py
├── environments/ # RL training environments (Atropos)
├── scripts/ # run_tests.sh, release.py, auxiliary scripts
├── website/ # Docusaurus docs site
└── tests/ # Pytest suite (~17k tests across ~900 files as of May 2026)
User config: ~/.hermes/config.yaml (settings), ~/.hermes/.env (API keys only).
Logs: ~/.hermes/logs/ — agent.log (INFO+), errors.log (WARNING+),
gateway.log when running the gateway. Profile-aware via get_hermes_home().
Browse with hermes logs [--follow] [--level ...] [--session ...].
File Dependency Chain
tools/registry.py (no deps — imported by all tool files)
↑
tools/*.py (each calls registry.register() at import time)
↑
model_tools.py (imports tools/registry + triggers tool discovery)
↑
run_agent.py, cli.py, batch_runner.py, environments/
AIAgent Class (run_agent.py)
The real AIAgent.__init__ takes ~60 parameters (credentials, routing, callbacks,
session context, budget, credential pool, etc.). The signature below is the
minimum subset you'll usually touch — read run_agent.py for the full list.
class AIAgent:
def __init__(self,
base_url: str = None,
api_key: str = None,
provider: str = None,
api_mode: str = None, # "chat_completions" | "codex_responses" | ...
model: str = "", # empty → resolved from config/provider later
max_iterations: int = 90, # tool-calling iterations (shared with subagents)
enabled_toolsets: list = None,
disabled_toolsets: list = None,
quiet_mode: bool = False,
save_trajectories: bool = False,
platform: str = None, # "cli", "telegram", etc.
session_id: str = None,
skip_context_files: bool = False,
skip_memory: bool = False,
credential_pool=None,
# ... plus callbacks, thread/user/chat IDs, iteration_budget, fallback_model,
# checkpoints config, prefill_messages, service_tier, reasoning_config, etc.
): ...
def chat(self, message: str) -> str:
"""Simple interface — returns final response string."""
def run_conversation(self, user_message: str, system_message: str = None,
conversation_history: list = None, task_id: str = None) -> dict:
"""Full interface — returns dict with final_response + messages."""
Agent Loop
The core loop is inside run_conversation() — entirely synchronous, with
interrupt checks, budget tracking, and a one-turn grace call:
while (api_call_count < self.max_iterations and self.iteration_budget.remaining > 0) \
or self._budget_grace_call:
if self._interrupt_requested: break
response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
if response.tool_calls:
for tool_call in response.tool_calls:
result = handle_function_call(tool_call.name, tool_call.args, task_id)
messages.append(tool_result_message(result))
api_call_count += 1
else:
return response.content
Messages follow OpenAI format: {"role": "system/user/assistant/tool", ...}.
Reasoning content is stored in assistant_msg["reasoning"].
CLI Architecture (cli.py)
- Rich for banner/panels, prompt_toolkit for input with autocomplete
- KawaiiSpinner (
agent/display.py) — animated faces during API calls,┊activity feed for tool results load_cli_config()in cli.py merges hardcoded defaults + user config YAML- Skin engine (
hermes_cli/skin_engine.py) — data-driven CLI theming; initialized fromdisplay.skinconfig key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text process_command()is a method onHermesCLI— dispatches on canonical command name resolved viaresolve_command()from the central registry- Skill slash commands:
agent/skill_commands.pyscans~/.hermes/skills/, injects as user message (not system prompt) to preserve prompt caching
Slash Command Registry (hermes_cli/commands.py)
All slash commands are defined in a central COMMAND_REGISTRY list of CommandDef objects. Every downstream consumer derives from this registry automatically:
- CLI —
process_command()resolves aliases viaresolve_command(), dispatches on canonical name - Gateway —
GATEWAY_KNOWN_COMMANDSfrozenset for hook emission,resolve_command()for dispatch - Gateway help —
gateway_help_lines()generates/helpoutput - Telegram —
telegram_bot_commands()generates the BotCommand menu - Slack —
slack_subcommand_map()generates/hermessubcommand routing - Autocomplete —
COMMANDSflat dict feedsSlashCommandCompleter - CLI help —
COMMANDS_BY_CATEGORYdict feedsshow_help()
Adding a Slash Command
- Add a
CommandDefentry toCOMMAND_REGISTRYinhermes_cli/commands.py:
CommandDef("mycommand", "Description of what it does", "Session",
aliases=("mc",), args_hint="[arg]"),
- Add handler in
HermesCLI.process_command()incli.py:
elif canonical == "mycommand":
self._handle_mycommand(cmd_original)
- If the command is available in the gateway, add a handler in
gateway/run.py:
if canonical == "mycommand":
return await self._handle_mycommand(event)
- For persistent settings, use
save_config_value()incli.py
CommandDef fields:
name— canonical name without slash (e.g."background")description— human-readable descriptioncategory— one of"Session","Configuration","Tools & Skills","Info","Exit"aliases— tuple of alternative names (e.g.("bg",))args_hint— argument placeholder shown in help (e.g."<prompt>","[name]")cli_only— only available in the interactive CLIgateway_only— only available in messaging platformsgateway_config_gate— config dotpath (e.g."display.tool_progress_command"); when set on acli_onlycommand, the command becomes available in the gateway if the config value is truthy.GATEWAY_KNOWN_COMMANDSalways includes config-gated commands so the gateway can dispatch them; help/menus only show them when the gate is open.
Adding an alias requires only adding it to the aliases tuple on the existing CommandDef. No other file changes needed — dispatch, help text, Telegram menu, Slack mapping, and autocomplete all update automatically.
TUI Architecture (ui-tui + tui_gateway)
The TUI is a full replacement for the classic (prompt_toolkit) CLI, activated via hermes --tui or HERMES_TUI=1.
Process Model
hermes --tui
└─ Node (Ink) ──stdio JSON-RPC── Python (tui_gateway)
│ └─ AIAgent + tools + sessions
└─ renders transcript, composer, prompts, activity
TypeScript owns the screen. Python owns sessions, tools, model calls, and slash command logic.
Transport
Newline-delimited JSON-RPC over stdio. Requests from Ink, events from Python. See tui_gateway/server.py for the full method/event catalog.
Key Surfaces
| Surface | Ink component | Gateway method |
|---|---|---|
| Chat streaming | app.tsx + messageLine.tsx |
prompt.submit → message.delta/complete |
| Tool activity | thinking.tsx |
tool.start/progress/complete |
| Approvals | prompts.tsx |
approval.respond ← approval.request |
| Clarify/sudo/secret | prompts.tsx, maskedPrompt.tsx |
clarify/sudo/secret.respond |
| Session picker | sessionPicker.tsx |
session.list/resume |
| Slash commands | Local handler + fallthrough | slash.exec → _SlashWorker, command.dispatch |
| Completions | useCompletion hook |
complete.slash, complete.path |
| Theming | theme.ts + branding.tsx |
gateway.ready with skin data |
Slash Command Flow
- Built-in client commands (
/help,/quit,/clear,/resume,/copy,/paste, etc.) handled locally inapp.tsx - Everything else →
slash.exec(runs in persistent_SlashWorkersubprocess) →command.dispatchfallback
Dev Commands
cd ui-tui
npm install # first time
npm run dev # watch mode (rebuilds hermes-ink + tsx --watch)
npm start # production
npm run build # full build (hermes-ink + tsc)
npm run type-check # typecheck only (tsc --noEmit)
npm run lint # eslint
npm run fmt # prettier
npm test # vitest
TUI in the Dashboard (hermes dashboard → /chat)
The dashboard embeds the real hermes --tui — not a rewrite. See hermes_cli/pty_bridge.py + the @app.websocket("/api/pty") endpoint in hermes_cli/web_server.py.
- Browser loads
web/src/pages/ChatPage.tsx, which mounts xterm.js'sTerminalwith the WebGL renderer,@xterm/addon-fitfor container-driven resize, and@xterm/addon-unicode11for modern wide-character widths. /api/pty?token=…upgrades to a WebSocket; auth uses the same ephemeral_SESSION_TOKENas REST, via query param (browsers can't setAuthorizationon WS upgrade).- The server spawns whatever
hermes --tuiwould spawn, throughptyprocess(POSIX PTY — WSL works, native Windows does not). - Frames: raw PTY bytes each direction; resize via
\x1b[RESIZE:<cols>;<rows>]intercepted on the server and applied withTIOCSWINSZ.
Do not re-implement the primary chat experience in React. The main transcript, composer/input flow (including slash-command behavior), and PTY-backed terminal belong to the embedded hermes --tui — anything new you add to Ink shows up in the dashboard automatically. If you find yourself rebuilding the transcript or composer for the dashboard, stop and extend Ink instead.
Structured React UI around the TUI is allowed when it is not a second chat surface. Sidebar widgets, inspectors, summaries, status panels, and similar supporting views (e.g. ChatSidebar, ModelPickerDialog, ToolCall) are fine when they complement the embedded TUI rather than replacing the transcript / composer / terminal. Keep their state independent of the PTY child's session and surface their failures non-destructively so the terminal pane keeps working unimpaired.
Adding New Tools
For most custom or local-only tools, do not edit Hermes core. Use the plugin
route instead: create ~/.hermes/plugins/<name>/plugin.yaml and
~/.hermes/plugins/<name>/__init__.py, then register tools with
ctx.register_tool(...). Plugin toolsets are discovered automatically and can be
enabled or disabled without touching tools/ or toolsets.py.
Use the built-in route below only when the user is explicitly contributing a new core Hermes tool that should ship in the base system.
Built-in/core tools require changes in 2 files:
1. Create tools/your_tool.py:
import json, os
from tools.registry import registry
def check_requirements() -> bool:
return bool(os.getenv("EXAMPLE_API_KEY"))
def example_tool(param: str, task_id: str = None) -> str:
return json.dumps({"success": True, "data": "..."})
registry.register(
name="example_tool",
toolset="example",
schema={"name": "example_tool", "description": "...", "parameters": {...}},
handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
check_fn=check_requirements,
requires_env=["EXAMPLE_API_KEY"],
)
2. Add to toolsets.py — either _HERMES_CORE_TOOLS (all platforms) or a new toolset. This step is required: auto-discovery imports the tool and registers its schema, but the tool is only exposed to an agent if its name appears in a toolset. _HERMES_CORE_TOOLS is not dead code — it's the default bundle every platform's base toolset inherits from.
Auto-discovery: any tools/*.py file with a top-level registry.register() call is imported automatically — no manual import list to maintain. Wiring into a toolset is still a deliberate, manual step.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
Path references in tool schemas: If the schema description mentions file paths (e.g. default output directories), use display_hermes_home() to make them profile-aware. The schema is generated at import time, which is after _apply_profile_override() sets HERMES_HOME.
State files: If a tool stores persistent state (caches, logs, checkpoints), use get_hermes_home() for the base directory — never Path.home() / ".hermes". This ensures each profile gets its own state.
Agent-level tools (todo, memory): intercepted by run_agent.py before handle_function_call(). See tools/todo_tool.py for the pattern.
Adding Configuration
config.yaml options:
- Add to
DEFAULT_CONFIGinhermes_cli/config.py - Bump
_config_version(check the current value at the top ofDEFAULT_CONFIG) ONLY if you need to actively migrate/transform existing user config (renaming keys, changing structure). Adding a new key to an existing section is handled automatically by the deep-merge and does NOT require a version bump.
Top-level config.yaml sections (non-exhaustive):
model, agent, terminal, compression, display, stt, tts,
memory, security, delegation, smart_model_routing, checkpoints,
auxiliary, curator, skills, gateway, logging, cron, profiles,
plugins, honcho.
auxiliary holds per-task overrides for side-LLM work (curator, vision,
embedding, title generation, session_search, etc.) — each task can pin
its own provider/model/base_url/max_tokens/reasoning_effort. See
agent/auxiliary_client.py::_resolve_auto for resolution order.
curator holds the background skill-maintenance config —
enabled, interval_hours, min_idle_hours, stale_after_days,
archive_after_days, backup (nested).
.env variables (SECRETS ONLY — API keys, tokens, passwords):
- Add to
OPTIONAL_ENV_VARSinhermes_cli/config.pywith metadata:
"NEW_API_KEY": {
"description": "What it's for",
"prompt": "Display name",
"url": "https://...",
"password": True,
"category": "tool", # provider, tool, messaging, setting
},
Non-secret settings (timeouts, thresholds, feature flags, paths, display
preferences) belong in config.yaml, not .env. If internal code needs an
env var mirror for backward compatibility, bridge it from config.yaml to
the env var in code (see gateway_timeout, terminal.cwd → TERMINAL_CWD).
Config loaders (three paths — know which one you're in):
| Loader | Used by | Location |
|---|---|---|
load_cli_config() |
CLI mode | cli.py — merges CLI-specific defaults + user YAML |
load_config() |
hermes tools, hermes setup, most CLI subcommands |
hermes_cli/config.py — merges DEFAULT_CONFIG + user YAML |
| Direct YAML load | Gateway runtime | gateway/run.py + gateway/config.py — reads user YAML raw |
If you add a new key and the CLI sees it but the gateway doesn't (or vice
versa), you're on the wrong loader. Check DEFAULT_CONFIG coverage.
Working directory:
- CLI — uses the process's current directory (
os.getcwd()). - Messaging — uses
terminal.cwdfromconfig.yaml. The gateway bridges this to theTERMINAL_CWDenv var for child tools.MESSAGING_CWDhas been removed — the config loader prints a deprecation warning if it's set in.env. Same forTERMINAL_CWDin.env; the canonical setting isterminal.cwdinconfig.yaml.
Skin/Theme System
The skin engine (hermes_cli/skin_engine.py) provides data-driven CLI visual customization. Skins are pure data — no code changes needed to add a new skin.
Architecture
hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
~/.hermes/skins/*.yaml # User-installed custom skins (drop-in)
init_skin_from_config()— called at CLI startup, readsdisplay.skinfrom configget_active_skin()— returns cachedSkinConfigfor the current skinset_active_skin(name)— switches skin at runtime (used by/skincommand)load_skin(name)— loads from user skins first, then built-ins, then falls back to default- Missing skin values inherit from the
defaultskin automatically
What skins customize
| Element | Skin Key | Used By |
|---|---|---|
| Banner panel border | colors.banner_border |
banner.py |
| Banner panel title | colors.banner_title |
banner.py |
| Banner section headers | colors.banner_accent |
banner.py |
| Banner dim text | colors.banner_dim |
banner.py |
| Banner body text | colors.banner_text |
banner.py |
| Response box border | colors.response_border |
cli.py |
| Spinner faces (waiting) | spinner.waiting_faces |
display.py |
| Spinner faces (thinking) | spinner.thinking_faces |
display.py |
| Spinner verbs | spinner.thinking_verbs |
display.py |
| Spinner wings (optional) | spinner.wings |
display.py |
| Tool output prefix | tool_prefix |
display.py |
| Per-tool emojis | tool_emojis |
display.py → get_tool_emoji() |
| Agent name | branding.agent_name |
banner.py, cli.py |
| Welcome message | branding.welcome |
cli.py |
| Response box label | branding.response_label |
cli.py |
| Prompt symbol | branding.prompt_symbol |
cli.py |
Built-in skins
default— Classic Hermes gold/kawaii (the current look)ares— Crimson/bronze war-god theme with custom spinner wingsmono— Clean grayscale monochromeslate— Cool blue developer-focused theme
Adding a built-in skin
Add to _BUILTIN_SKINS dict in hermes_cli/skin_engine.py:
"mytheme": {
"name": "mytheme",
"description": "Short description",
"colors": { ... },
"spinner": { ... },
"branding": { ... },
"tool_prefix": "┊",
},
User skins (YAML)
Users create ~/.hermes/skins/<name>.yaml:
name: cyberpunk
description: Neon-soaked terminal theme
colors:
banner_border: "#FF00FF"
banner_title: "#00FFFF"
banner_accent: "#FF1493"
spinner:
thinking_verbs: ["jacking in", "decrypting", "uploading"]
wings:
- ["⟨⚡", "⚡⟩"]
branding:
agent_name: "Cyber Agent"
response_label: " ⚡ Cyber "
tool_prefix: "▏"
Activate with /skin cyberpunk or display.skin: cyberpunk in config.yaml.
Plugins
Hermes has two plugin surfaces. Both live under plugins/ in the repo so
repo-shipped plugins can be discovered alongside user-installed ones in
~/.hermes/plugins/ and pip-installed entry points.
General plugins (hermes_cli/plugins.py + plugins/<name>/)
PluginManager discovers plugins from ~/.hermes/plugins/, ./.hermes/plugins/,
and pip entry points. Each plugin exposes a register(ctx) function that
can:
- Register Python-callback lifecycle hooks:
pre_tool_call,post_tool_call,pre_llm_call,post_llm_call,on_session_start,on_session_end - Register new tools via
ctx.register_tool(...) - Register CLI subcommands via
ctx.register_cli_command(...)— the plugin's argparse tree is wired intohermesat startup sohermes <pluginname> <subcmd>works with no change tomain.py
Hooks are invoked from model_tools.py (pre/post tool) and run_agent.py
(lifecycle). Discovery timing pitfall: discover_plugins() only runs
as a side effect of importing model_tools.py. Code paths that read plugin
state without importing model_tools.py first must call discover_plugins()
explicitly (it's idempotent).
Memory-provider plugins (plugins/memory/<name>/)
Separate discovery system for pluggable memory backends. Current built-in providers include honcho, mem0, supermemory, byterover, hindsight, holographic, openviking, retaindb.
Each provider implements the MemoryProvider ABC (see agent/memory_provider.py)
and is orchestrated by agent/memory_manager.py. Lifecycle hooks include
sync_turn(turn_messages), prefetch(query), shutdown(), and optional
post_setup(hermes_home, config) for setup-wizard integration.
CLI commands via plugins/memory/<name>/cli.py: if a memory plugin
defines register_cli(subparser), discover_plugin_cli_commands() finds
it at argparse setup time and wires it into hermes <plugin>. The
framework only exposes CLI commands for the currently active memory
provider (read from memory.provider in config.yaml), so disabled
providers don't clutter hermes --help.
Rule (Teknium, May 2026): plugins MUST NOT modify core files
(run_agent.py, cli.py, gateway/run.py, hermes_cli/main.py, etc.).
If a plugin needs a capability the framework doesn't expose, expand the
generic plugin surface (new hook, new ctx method) — never hardcode
plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
honcho argparse from main.py for exactly this reason.
Model-provider plugins (plugins/model-providers/<name>/)
Every inference backend (openrouter, anthropic, gmi, deepseek, nvidia, …)
ships as a plugin here. Each plugin's __init__.py calls
providers.register_provider(ProviderProfile(...)) at module load.
providers/__init__.py._discover_providers() is a lazy, separate
discovery system — scanned on first get_provider_profile() or
list_providers() call, NOT by the general PluginManager.
Scan order:
- Bundled:
<repo>/plugins/model-providers/<name>/ - User:
$HERMES_HOME/plugins/model-providers/<name>/ - Legacy:
<repo>/providers/<name>.py(back-compat)
User plugins of the same name override bundled ones — register_provider()
is last-writer-wins. This lets third parties swap out any built-in
profile without a repo patch.
The general PluginManager records kind: model-provider manifests but does
NOT import them (would double-instantiate ProviderProfile). Plugins
without an explicit kind: get auto-coerced via a source-text heuristic
(register_provider + ProviderProfile in __init__.py).
Full authoring guide: website/docs/developer-guide/model-provider-plugin.md.
Dashboard / context-engine / image-gen plugin directories
plugins/context_engine/, plugins/image_gen/, plugins/example-dashboard/,
etc. follow the same pattern (ABC + orchestrator + per-plugin directory).
Context engines plug into agent/context_engine.py; image-gen providers
into agent/image_gen_provider.py.
Skills
Two parallel surfaces:
skills/— built-in skills shipped and loadable by default. Organized by category directories (e.g.skills/github/,skills/mlops/).optional-skills/— heavier or niche skills shipped with the repo but NOT active by default. Installed explicitly viahermes skills install official/<category>/<skill>. Adapter lives intools/skills_hub.py(OptionalSkillSource). Categories includeautonomous-ai-agents,blockchain,communication,creative,devops,email,health,mcp,migration,mlops,productivity,research,security,web-development.
When reviewing skill PRs, check which directory they target — heavy-dep or
niche skills belong in optional-skills/.
SKILL.md frontmatter
Standard fields: name, description, version, author, license,
platforms (OS-gating list: [macos], [linux, macos], ...),
metadata.hermes.tags, metadata.hermes.category,
metadata.hermes.related_skills, metadata.hermes.config (config.yaml
settings the skill needs — stored under skills.config.<key>, prompted
during setup, injected at load time).
Top-level tags: and category: are also accepted and mirrored from
metadata.hermes.* by the loader.
Toolsets
All toolsets are defined in toolsets.py as a single TOOLSETS dict.
Each platform's adapter picks a base toolset (e.g. Telegram uses
"messaging"); _HERMES_CORE_TOOLS is the default bundle most
platforms inherit from.
Current toolset keys: browser, clarify, code_execution, cronjob,
debugging, delegation, discord, discord_admin, feishu_doc,
feishu_drive, file, homeassistant, image_gen, kanban, memory,
messaging, moa, rl, safe, search, session_search, skills,
spotify, terminal, todo, tts, video, vision, web, yuanbao.
Enable/disable per platform via hermes tools (the curses UI) or the
tools.<platform>.enabled / tools.<platform>.disabled lists in
config.yaml.
Delegation (delegate_task)
tools/delegate_tool.py spawns a subagent with an isolated
context + terminal session. Synchronous: the parent waits for the
child's summary before continuing its own loop — if the parent is
interrupted, the child is cancelled.
Two shapes:
- Single: pass
goal(+ optionalcontext,toolsets). - Batch (parallel): pass
tasks: [...]— each gets its own subagent running concurrently. Concurrency is capped bydelegation.max_concurrent_children(default 3).
Roles:
role="leaf"(default) — focused worker. Cannot calldelegate_task,clarify,memory,send_message,execute_code.role="orchestrator"— retainsdelegate_taskso it can spawn its own workers. Gated bydelegation.orchestrator_enabled(default true) and bounded bydelegation.max_spawn_depth(default 2).
Key config knobs (under delegation: in config.yaml):
max_concurrent_children, max_spawn_depth, child_timeout_seconds,
orchestrator_enabled, subagent_auto_approve, inherit_mcp_toolsets,
max_iterations.
Synchronicity rule: delegate_task is not durable. For long-running
work that must outlive the current turn, use cronjob or
terminal(background=True, notify_on_complete=True) instead.
Curator (skill lifecycle)
Background skill-maintenance system that tracks usage on agent-created
skills and auto-archives stale ones. Users never lose skills; archives
go to ~/.hermes/skills/.archive/ and are restorable.
- Core:
agent/curator.py(review loop, auto-transitions, LLM review prompt) +agent/curator_backup.py(pre-run tar.gz snapshots). - CLI:
hermes_cli/curator.pywireshermes curator <verb>where verbs are:status,run,pause,resume,pin,unpin,archive,restore,prune,backup,rollback. - Telemetry:
tools/skill_usage.pyowns the sidecar~/.hermes/skills/.usage.json— per-skilluse_count,view_count,patch_count,last_activity_at,state(active / stale / archived),pinned.
Invariants:
- Curator only touches skills with
created_by: "agent"provenance — bundled + hub-installed skills are off-limits. - Never deletes; max destructive action is archive.
- Pinned skills are exempt from every auto-transition and from the LLM review pass.
skill_manage(action="delete")refuses pinned skills; patch/edit/ write_file/remove_file go through so the agent can keep improving pinned skills.
Config section (curator: in config.yaml):
enabled, interval_hours, min_idle_hours, stale_after_days,
archive_after_days, backup.*.
Full user-facing docs: website/docs/user-guide/features/curator.md.
Cron (scheduled jobs)
cron/jobs.py (job store) + cron/scheduler.py (tick loop). Agents
schedule jobs via the cronjob tool; users via hermes cron <verb>
(list, add, edit, pause, resume, run, remove) or the
/cron slash command.
Supported schedule formats:
- Duration:
"30m","2h","1d" - "every" phrase:
"every 2h","every monday 9am" - 5-field cron expression:
"0 9 * * *" - ISO timestamp (one-shot):
"2026-06-01T09:00:00Z"
Per-job fields include skills (load specific skills), model /
provider overrides, script (pre-run data-collection script whose
stdout is injected into the prompt; no_agent=True turns the script
into the entire job), context_from (chain job A's last output into
job B's prompt), workdir (run in a specific directory with its
AGENTS.md/CLAUDE.md loaded), and multi-platform delivery.
Hardening invariants:
- 3-minute hard interrupt on cron sessions — runaway agent loops cannot monopolize the scheduler.
- Catchup window: half the job's period, clamped to 120s–2h.
- Grace window: 120s for one-shot jobs whose fire time was missed.
- File lock at
~/.hermes/cron/.tick.lockprevents duplicate ticks across processes. - Cron sessions pass
skip_memory=Trueby default; memory providers intentionally do not run during cron.
Cron deliveries are not mirrored into the target gateway session — they land in their own cron session with a header/footer frame so the main conversation's message-role alternation stays intact.
Kanban (multi-agent work queue)
Durable SQLite-backed board that lets multiple profiles / workers
collaborate on shared tasks. Users drive it via hermes kanban <verb>;
workers spawned by the dispatcher drive it via a dedicated kanban_*
toolset so their schema footprint is zero when they're not inside a
kanban task.
- CLI:
hermes_cli/kanban.pywireshermes kanbanwith verbsinit,create,list(aliasls),show,assign,link,unlink,comment,complete,block,unblock,archive,tail, plus less-commonly-usedwatch,stats,runs,log,assignees,heartbeat,notify-*,dispatch,daemon,gc. - Worker toolset:
tools/kanban_tools.pyexposeskanban_show,kanban_complete,kanban_block,kanban_heartbeat,kanban_comment,kanban_create,kanban_link— gated byHERMES_KANBAN_TASKso the schema only appears for processes actually running as a worker. - Dispatcher: long-lived loop that (default every 60s) reclaims
stale claims, promotes ready tasks, atomically claims, and spawns
assigned profiles. Runs inside the gateway by default via
kanban.dispatch_in_gateway: true. - Plugin assets:
plugins/kanban/dashboard/(web UI) +plugins/kanban/systemd/(hermes-kanban-dispatcher.servicefor standalone dispatcher deployment).
Isolation model:
- Board is the hard boundary — workers are spawned with
HERMES_KANBAN_BOARDpinned in their env so they can't see other boards. - Tenant is a soft namespace within a board — one specialist fleet can serve multiple businesses with workspace-path + memory-key isolation.
- After ~5 consecutive spawn failures on the same task the dispatcher auto-blocks it to prevent spin loops.
Full user-facing docs: website/docs/user-guide/features/kanban.md.
Important Policies
Prompt Caching Must Not Break
Hermes-Agent ensures caching remains valid throughout a conversation. Do NOT implement changes that would:
- Alter past context mid-conversation
- Change toolsets mid-conversation
- Reload memories or rebuild system prompts mid-conversation
Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.
Slash commands that mutate system-prompt state (skills, tools, memory, etc.)
must be cache-aware: default to deferred invalidation (change takes
effect next session), with an opt-in --now flag for immediate
invalidation. See /skills install --now for the canonical pattern.
Background Process Notifications (Gateway)
When terminal(background=true, notify_on_complete=true) is used, the gateway runs a watcher that
detects process completion and triggers a new agent turn. Control verbosity of background process
messages with display.background_process_notifications
in config.yaml (or HERMES_BACKGROUND_NOTIFICATIONS env var):
all— running-output updates + final message (default)result— only the final completion messageerror— only the final message when exit code != 0off— no watcher messages at all
Profiles: Multi-Instance Support
Hermes supports profiles — multiple fully isolated instances, each with its own
HERMES_HOME directory (config, API keys, memory, sessions, skills, gateway, etc.).
The core mechanism: _apply_profile_override() in hermes_cli/main.py sets
HERMES_HOME before any module imports. All get_hermes_home() references
automatically scope to the active profile.
Rules for profile-safe code
-
Use
get_hermes_home()for all HERMES_HOME paths. Import fromhermes_constants. NEVER hardcode~/.hermesorPath.home() / ".hermes"in code that reads/writes state.# GOOD from hermes_constants import get_hermes_home config_path = get_hermes_home() / "config.yaml" # BAD — breaks profiles config_path = Path.home() / ".hermes" / "config.yaml" -
Use
display_hermes_home()for user-facing messages. Import fromhermes_constants. This returns~/.hermesfor default or~/.hermes/profiles/<name>for profiles.# GOOD from hermes_constants import display_hermes_home print(f"Config saved to {display_hermes_home()}/config.yaml") # BAD — shows wrong path for profiles print("Config saved to ~/.hermes/config.yaml") -
Module-level constants are fine — they cache
get_hermes_home()at import time, which is AFTER_apply_profile_override()sets the env var. Just useget_hermes_home(), notPath.home() / ".hermes". -
Tests that mock
Path.home()must also setHERMES_HOME— since code now usesget_hermes_home()(reads env var), notPath.home() / ".hermes":with patch.object(Path, "home", return_value=tmp_path), \ patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}): ... -
Gateway platform adapters should use token locks — if the adapter connects with a unique credential (bot token, API key), call
acquire_scoped_lock()fromgateway.statusin theconnect()/start()method andrelease_scoped_lock()indisconnect()/stop(). This prevents two profiles from using the same credential. Seegateway/platforms/telegram.pyfor the canonical pattern. -
Profile operations are HOME-anchored, not HERMES_HOME-anchored —
_get_profiles_root()returnsPath.home() / ".hermes" / "profiles", NOTget_hermes_home() / "profiles". This is intentional — it letshermes -p coder profile listsee all profiles regardless of which one is active.
Known Pitfalls
DO NOT hardcode ~/.hermes paths
Use get_hermes_home() from hermes_constants for code paths. Use display_hermes_home()
for user-facing print/log messages. Hardcoding ~/.hermes breaks profiles — each profile
has its own HERMES_HOME directory. This was the source of 5 bugs fixed in PR #3575.
DO NOT introduce new simple_term_menu usage
Existing call sites in hermes_cli/main.py remain for legacy fallback only;
the preferred UI is curses (stdlib) because simple_term_menu has
ghost-duplication rendering bugs in tmux/iTerm2 with arrow keys. New
interactive menus must use hermes_cli/curses_ui.py — see
hermes_cli/tools_config.py for the canonical pattern.
DO NOT use \033[K (ANSI erase-to-EOL) in spinner/display code
Leaks as literal ?[K text under prompt_toolkit's patch_stdout. Use space-padding: f"\r{line}{' ' * pad}".
_last_resolved_tool_names is a process-global in model_tools.py
_run_single_child() in delegate_tool.py saves and restores this global around subagent execution. If you add new code that reads this global, be aware it may be temporarily stale during child agent runs.
DO NOT hardcode cross-tool references in schema descriptions
Tool schema descriptions must not mention tools from other toolsets by name (e.g., browser_navigate saying "prefer web_search"). Those tools may be unavailable (missing API keys, disabled toolset), causing the model to hallucinate calls to non-existent tools. If a cross-reference is needed, add it dynamically in get_tool_definitions() in model_tools.py — see the browser_navigate / execute_code post-processing blocks for the pattern.
The gateway has TWO message guards — both must bypass approval/control commands
When an agent is running, messages pass through two sequential guards:
(1) base adapter (gateway/platforms/base.py) queues messages in
_pending_messages when session_key in self._active_sessions, and
(2) gateway runner (gateway/run.py) intercepts /stop, /new,
/queue, /status, /approve, /deny before they reach
running_agent.interrupt(). Any new command that must reach the runner
while the agent is blocked (e.g. approval prompts) MUST bypass BOTH
guards and be dispatched inline, not via _process_message_background()
(which races session lifecycle).
Squash merges from stale branches silently revert recent fixes
Before squash-merging a PR, ensure the branch is up to date with main
(git fetch origin main && git reset --hard origin/main in the worktree,
then re-apply the PR's commits). A stale branch's version of an unrelated
file will silently overwrite recent fixes on main when squashed. Verify
with git diff HEAD~1..HEAD after merging — unexpected deletions are a
red flag.
Don't wire in dead code without E2E validation
Unused code that was never shipped was dead for a reason. Before wiring an
unused module into a live code path, E2E test the real resolution chain
with actual imports (not mocks) against a temp HERMES_HOME.
Tests must not write to ~/.hermes/
The _isolate_hermes_home autouse fixture in tests/conftest.py redirects HERMES_HOME to a temp dir. Never hardcode ~/.hermes/ paths in tests.
Profile tests: When testing profile features, also mock Path.home() so that
_get_profiles_root() and _get_default_hermes_home() resolve within the temp dir.
Use the pattern from tests/hermes_cli/test_profiles.py:
@pytest.fixture
def profile_env(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
return home
Testing
ALWAYS use scripts/run_tests.sh — do not call pytest directly. The script enforces
hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
4 xdist workers matching GHA ubuntu-latest). Direct pytest on a 16+ core
developer machine with API keys set diverges from CI in ways that have caused
multiple "works locally, fails in CI" incidents (and the reverse).
scripts/run_tests.sh # full suite, CI-parity
scripts/run_tests.sh tests/gateway/ # one directory
scripts/run_tests.sh tests/agent/test_foo.py::test_x # one test
scripts/run_tests.sh -v --tb=long # pass-through pytest flags
Why the wrapper (and why the old "just call pytest" doesn't work)
Five real sources of local-vs-CI drift the script closes:
| Without wrapper | With wrapper | |
|---|---|---|
| Provider API keys | Whatever is in your env (auto-detects pool) | All *_API_KEY/*_TOKEN/etc. unset |
HOME / ~/.hermes/ |
Your real config+auth.json | Temp dir per test |
| Timezone | Local TZ (PDT etc.) | UTC |
| Locale | Whatever is set | C.UTF-8 |
| xdist workers | -n auto = all cores (20+ on a workstation) |
-n 4 matching CI |
tests/conftest.py also enforces points 1-4 as an autouse fixture so ANY pytest
invocation (including IDE integrations) gets hermetic behavior — but the wrapper
is belt-and-suspenders.
Running without the wrapper (only if you must)
If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
pytest directly), at minimum activate the venv and pass -n 4:
source .venv/bin/activate # or: source venv/bin/activate
python -m pytest tests/ -q -n 4
Worker count above 4 will surface test-ordering flakes that CI never sees.
Always run the full suite before pushing changes.
Don't write change-detector tests
A test is a change-detector if it fails whenever data that is expected to change gets updated — model catalogs, config version numbers, enumeration counts, hardcoded lists of provider models. These tests add no behavioral coverage; they just guarantee that routine source updates break CI and cost engineering time to "fix."
Do not write:
# catalog snapshot — breaks every model release
assert "gemini-2.5-pro" in _PROVIDER_MODELS["gemini"]
assert "MiniMax-M2.7" in models
# config version literal — breaks every schema bump
assert DEFAULT_CONFIG["_config_version"] == 21
# enumeration count — breaks every time a skill/provider is added
assert len(_PROVIDER_MODELS["huggingface"]) == 8
Do write:
# behavior: does the catalog plumbing work at all?
assert "gemini" in _PROVIDER_MODELS
assert len(_PROVIDER_MODELS["gemini"]) >= 1
# behavior: does migration bump the user's version to current latest?
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
# invariant: no plan-only model leaks into the legacy list
assert not (set(moonshot_models) & coding_plan_only_models)
# invariant: every model in the catalog has a context-length entry
for m in _PROVIDER_MODELS["huggingface"]:
assert m.lower() in DEFAULT_CONTEXT_LENGTHS_LOWER
The rule: if the test reads like a snapshot of current data, delete it. If it reads like a contract about how two pieces of data must relate, keep it. When a PR adds a new provider/model and you want a test, make the test assert the relationship (e.g. "catalog entries all have context lengths"), not the specific names.
Reviewers should reject new change-detector tests; authors should convert them into invariants before re-requesting review.