hermes-agent/AGENTS.md
alt-glitch 76aebd73c3 refactor(restructure): update infrastructure for hermes_agent package
Update pyproject.toml entry points, packages.find, and package-data.
Delete py-modules (all top-level modules moved into hermes_agent/).
Add hermes-skills-sync console_script entry point.
Update Dockerfile HERMES_WEB_DIST path.
Update docker/entrypoint.sh, scripts/install.sh, setup-hermes.sh
to use hermes-skills-sync console_script.
Update web/vite.config.ts output directory.
Update MANIFEST.in to graft hermes_agent.
Update AGENTS.md project structure to reflect new layout.

Part of #14182, #14183
2026-04-23 12:10:25 +05:30

26 KiB

Hermes Agent - Development Guide

Instructions for AI coding assistants and developers working on the hermes-agent codebase.

Development Environment

source venv/bin/activate  # ALWAYS activate before running Python

Project Structure

hermes-agent/
├── hermes_agent/             # Single installable package
│   ├── agent/                # Core conversation loop and agent internals
│   │   ├── loop.py               # AIAgent class — core conversation loop
│   │   ├── prompt_builder.py     # System prompt assembly
│   │   ├── context/              # Context management (engine, compressor, references)
│   │   ├── memory/               # Memory management (manager, provider)
│   │   ├── image_gen/            # Image generation (provider, registry)
│   │   ├── display.py            # KawaiiSpinner, tool preview formatting
│   │   ├── skill_commands.py     # Skill slash commands (shared CLI/gateway)
│   │   └── trajectory.py         # Trajectory saving helpers
│   ├── providers/            # LLM provider adapters and transports
│   │   ├── anthropic_adapter.py  # Anthropic adapter
│   │   ├── anthropic_transport.py # Anthropic transport
│   │   ├── metadata.py           # Model context lengths, token estimation
│   │   ├── auxiliary.py           # Auxiliary LLM client (vision, summarization)
│   │   ├── caching.py            # Anthropic prompt caching
│   │   └── credential_pool.py    # Credential management
│   ├── tools/                # Tool implementations
│   │   ├── dispatch.py           # Tool orchestration, discover_builtin_tools()
│   │   ├── toolsets.py           # Toolset definitions
│   │   ├── registry.py           # Central tool registry
│   │   ├── terminal.py           # Terminal orchestration
│   │   ├── browser/              # Browser tools (tool, cdp, camofox, providers/)
│   │   ├── mcp/                  # MCP client and server
│   │   ├── skills/               # Skill management (manager, tool, hub, guard, sync)
│   │   ├── media/                # Voice, TTS, transcription, image gen
│   │   ├── files/                # File operations (tools, operations, state)
│   │   └── security/             # Path security, URL safety, approval
│   ├── backends/             # Terminal backends (local, docker, ssh, modal, daytona, singularity)
│   ├── cli/                  # CLI subcommands and setup
│   │   ├── main.py               # Entry point — all `hermes` subcommands
│   │   ├── repl.py               # HermesCLI class — interactive CLI orchestrator
│   │   ├── config.py             # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│   │   ├── commands.py           # Slash command definitions
│   │   ├── auth/                 # Provider credential resolution
│   │   ├── models/               # Model catalog, provider lists, switching
│   │   └── ui/                   # Banner, colors, skin engine, callbacks, tips
│   ├── gateway/              # Messaging platform gateway
│   │   ├── run.py                # Main loop, slash commands, message dispatch
│   │   ├── session.py            # SessionStore — conversation persistence
│   │   └── platforms/            # Adapters: telegram, discord, slack, whatsapp, etc.
│   ├── acp/                  # ACP server (VS Code / Zed / JetBrains integration)
│   ├── cron/                 # Scheduler (jobs.py, scheduler.py)
│   ├── plugins/              # Plugin system (memory providers, context engines)
│   ├── constants.py          # Shared constants
│   ├── state.py              # SessionDB — SQLite session store
│   ├── logging.py            # Logging configuration
│   └── utils.py              # Shared utilities
├── tui_gateway/          # Python JSON-RPC backend for the TUI
├── ui-tui/               # Ink (React) terminal UI — `hermes --tui`
├── environments/         # RL training environments (Atropos)
├── tests/                # Pytest suite
└── web/                  # Vite + React web dashboard

User config: ~/.hermes/config.yaml (settings), ~/.hermes/.env (API keys)

File Dependency Chain

hermes_agent/tools/registry.py  (no deps — imported by all tool files)
       ↑
hermes_agent/tools/*.py  (each calls registry.register() at import time)
       ↑
hermes_agent/tools/dispatch.py  (imports registry + triggers tool discovery)
       ↑
hermes_agent/agent/loop.py, hermes_agent/cli/repl.py, environments/

AIAgent Class (hermes_agent/agent/loop.py)

class AIAgent:
    def __init__(self,
        model: str = "anthropic/claude-opus-4.6",
        max_iterations: int = 90,
        enabled_toolsets: list = None,
        disabled_toolsets: list = None,
        quiet_mode: bool = False,
        save_trajectories: bool = False,
        platform: str = None,           # "cli", "telegram", etc.
        session_id: str = None,
        skip_context_files: bool = False,
        skip_memory: bool = False,
        # ... plus provider, api_mode, callbacks, routing params
    ): ...

    def chat(self, message: str) -> str:
        """Simple interface — returns final response string."""

    def run_conversation(self, user_message: str, system_message: str = None,
                         conversation_history: list = None, task_id: str = None) -> dict:
        """Full interface — returns dict with final_response + messages."""

Agent Loop

The core loop is inside run_conversation() — entirely synchronous:

while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
    response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
    if response.tool_calls:
        for tool_call in response.tool_calls:
            result = handle_function_call(tool_call.name, tool_call.args, task_id)
            messages.append(tool_result_message(result))
        api_call_count += 1
    else:
        return response.content

Messages follow OpenAI format: {"role": "system/user/assistant/tool", ...}. Reasoning content is stored in assistant_msg["reasoning"].


CLI Architecture (hermes_agent/cli/repl.py)

  • Rich for banner/panels, prompt_toolkit for input with autocomplete
  • KawaiiSpinner (hermes_agent/agent/display.py) — animated faces during API calls, activity feed for tool results
  • load_cli_config() in repl.py merges hardcoded defaults + user config YAML
  • Skin engine (hermes_agent/cli/ui/skin_engine.py) — data-driven CLI theming; initialized from display.skin config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
  • process_command() is a method on HermesCLI — dispatches on canonical command name resolved via resolve_command() from the central registry
  • Skill slash commands: hermes_agent/agent/skill_commands.py scans ~/.hermes/skills/, injects as user message (not system prompt) to preserve prompt caching

Slash Command Registry (hermes_cli/commands.py)

All slash commands are defined in a central COMMAND_REGISTRY list of CommandDef objects. Every downstream consumer derives from this registry automatically:

  • CLIprocess_command() resolves aliases via resolve_command(), dispatches on canonical name
  • GatewayGATEWAY_KNOWN_COMMANDS frozenset for hook emission, resolve_command() for dispatch
  • Gateway helpgateway_help_lines() generates /help output
  • Telegramtelegram_bot_commands() generates the BotCommand menu
  • Slackslack_subcommand_map() generates /hermes subcommand routing
  • AutocompleteCOMMANDS flat dict feeds SlashCommandCompleter
  • CLI helpCOMMANDS_BY_CATEGORY dict feeds show_help()

Adding a Slash Command

  1. Add a CommandDef entry to COMMAND_REGISTRY in hermes_cli/commands.py:
CommandDef("mycommand", "Description of what it does", "Session",
           aliases=("mc",), args_hint="[arg]"),
  1. Add handler in HermesCLI.process_command() in cli.py:
elif canonical == "mycommand":
    self._handle_mycommand(cmd_original)
  1. If the command is available in the gateway, add a handler in gateway/run.py:
if canonical == "mycommand":
    return await self._handle_mycommand(event)
  1. For persistent settings, use save_config_value() in cli.py

CommandDef fields:

  • name — canonical name without slash (e.g. "background")
  • description — human-readable description
  • category — one of "Session", "Configuration", "Tools & Skills", "Info", "Exit"
  • aliases — tuple of alternative names (e.g. ("bg",))
  • args_hint — argument placeholder shown in help (e.g. "<prompt>", "[name]")
  • cli_only — only available in the interactive CLI
  • gateway_only — only available in messaging platforms
  • gateway_config_gate — config dotpath (e.g. "display.tool_progress_command"); when set on a cli_only command, the command becomes available in the gateway if the config value is truthy. GATEWAY_KNOWN_COMMANDS always includes config-gated commands so the gateway can dispatch them; help/menus only show them when the gate is open.

Adding an alias requires only adding it to the aliases tuple on the existing CommandDef. No other file changes needed — dispatch, help text, Telegram menu, Slack mapping, and autocomplete all update automatically.


TUI Architecture (ui-tui + tui_gateway)

The TUI is a full replacement for the classic (prompt_toolkit) CLI, activated via hermes --tui or HERMES_TUI=1.

Process Model

hermes --tui
  └─ Node (Ink)  ──stdio JSON-RPC──  Python (tui_gateway)
       │                                  └─ AIAgent + tools + sessions
       └─ renders transcript, composer, prompts, activity

TypeScript owns the screen. Python owns sessions, tools, model calls, and slash command logic.

Transport

Newline-delimited JSON-RPC over stdio. Requests from Ink, events from Python. See tui_gateway/server.py for the full method/event catalog.

Key Surfaces

Surface Ink component Gateway method
Chat streaming app.tsx + messageLine.tsx prompt.submitmessage.delta/complete
Tool activity thinking.tsx tool.start/progress/complete
Approvals prompts.tsx approval.respondapproval.request
Clarify/sudo/secret prompts.tsx, maskedPrompt.tsx clarify/sudo/secret.respond
Session picker sessionPicker.tsx session.list/resume
Slash commands Local handler + fallthrough slash.exec_SlashWorker, command.dispatch
Completions useCompletion hook complete.slash, complete.path
Theming theme.ts + branding.tsx gateway.ready with skin data

Slash Command Flow

  1. Built-in client commands (/help, /quit, /clear, /resume, /copy, /paste, etc.) handled locally in app.tsx
  2. Everything else → slash.exec (runs in persistent _SlashWorker subprocess) → command.dispatch fallback

Dev Commands

cd ui-tui
npm install       # first time
npm run dev       # watch mode (rebuilds hermes-ink + tsx --watch)
npm start         # production
npm run build     # full build (hermes-ink + tsc)
npm run type-check # typecheck only (tsc --noEmit)
npm run lint      # eslint
npm run fmt       # prettier
npm test          # vitest

Adding New Tools

Requires changes in 2 files:

1. Create tools/your_tool.py:

import json, os
from tools.registry import registry

def check_requirements() -> bool:
    return bool(os.getenv("EXAMPLE_API_KEY"))

def example_tool(param: str, task_id: str = None) -> str:
    return json.dumps({"success": True, "data": "..."})

registry.register(
    name="example_tool",
    toolset="example",
    schema={"name": "example_tool", "description": "...", "parameters": {...}},
    handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
    check_fn=check_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)

2. Add to toolsets.py — either _HERMES_CORE_TOOLS (all platforms) or a new toolset.

Auto-discovery: any hermes_agent/tools/*.py file with a top-level registry.register() call is imported automatically — no manual import list to maintain.

The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.

Path references in tool schemas: If the schema description mentions file paths (e.g. default output directories), use display_hermes_home() to make them profile-aware. The schema is generated at import time, which is after _apply_profile_override() sets HERMES_HOME.

State files: If a tool stores persistent state (caches, logs, checkpoints), use get_hermes_home() for the base directory — never Path.home() / ".hermes". This ensures each profile gets its own state.

Agent-level tools (todo, memory): intercepted by run_agent.py before handle_function_call(). See todo_tool.py for the pattern.


Adding Configuration

config.yaml options:

  1. Add to DEFAULT_CONFIG in hermes_cli/config.py
  2. Bump _config_version (currently 5) to trigger migration for existing users

.env variables:

  1. Add to OPTIONAL_ENV_VARS in hermes_cli/config.py with metadata:
"NEW_API_KEY": {
    "description": "What it's for",
    "prompt": "Display name",
    "url": "https://...",
    "password": True,
    "category": "tool",  # provider, tool, messaging, setting
},

Config loaders (two separate systems):

Loader Used by Location
load_cli_config() CLI mode cli.py
load_config() hermes tools, hermes setup hermes_cli/config.py
Direct YAML load Gateway gateway/run.py

Skin/Theme System

The skin engine (hermes_cli/skin_engine.py) provides data-driven CLI visual customization. Skins are pure data — no code changes needed to add a new skin.

Architecture

hermes_cli/skin_engine.py    # SkinConfig dataclass, built-in skins, YAML loader
~/.hermes/skins/*.yaml       # User-installed custom skins (drop-in)
  • init_skin_from_config() — called at CLI startup, reads display.skin from config
  • get_active_skin() — returns cached SkinConfig for the current skin
  • set_active_skin(name) — switches skin at runtime (used by /skin command)
  • load_skin(name) — loads from user skins first, then built-ins, then falls back to default
  • Missing skin values inherit from the default skin automatically

What skins customize

Element Skin Key Used By
Banner panel border colors.banner_border banner.py
Banner panel title colors.banner_title banner.py
Banner section headers colors.banner_accent banner.py
Banner dim text colors.banner_dim banner.py
Banner body text colors.banner_text banner.py
Response box border colors.response_border cli.py
Spinner faces (waiting) spinner.waiting_faces display.py
Spinner faces (thinking) spinner.thinking_faces display.py
Spinner verbs spinner.thinking_verbs display.py
Spinner wings (optional) spinner.wings display.py
Tool output prefix tool_prefix display.py
Per-tool emojis tool_emojis display.pyget_tool_emoji()
Agent name branding.agent_name banner.py, cli.py
Welcome message branding.welcome cli.py
Response box label branding.response_label cli.py
Prompt symbol branding.prompt_symbol cli.py

Built-in skins

  • default — Classic Hermes gold/kawaii (the current look)
  • ares — Crimson/bronze war-god theme with custom spinner wings
  • mono — Clean grayscale monochrome
  • slate — Cool blue developer-focused theme

Adding a built-in skin

Add to _BUILTIN_SKINS dict in hermes_cli/skin_engine.py:

"mytheme": {
    "name": "mytheme",
    "description": "Short description",
    "colors": { ... },
    "spinner": { ... },
    "branding": { ... },
    "tool_prefix": "┊",
},

User skins (YAML)

Users create ~/.hermes/skins/<name>.yaml:

name: cyberpunk
description: Neon-soaked terminal theme

colors:
  banner_border: "#FF00FF"
  banner_title: "#00FFFF"
  banner_accent: "#FF1493"

spinner:
  thinking_verbs: ["jacking in", "decrypting", "uploading"]
  wings:
    - ["⟨⚡", "⚡⟩"]

branding:
  agent_name: "Cyber Agent"
  response_label: " ⚡ Cyber "

tool_prefix: "▏"

Activate with /skin cyberpunk or display.skin: cyberpunk in config.yaml.


Important Policies

Prompt Caching Must Not Break

Hermes-Agent ensures caching remains valid throughout a conversation. Do NOT implement changes that would:

  • Alter past context mid-conversation
  • Change toolsets mid-conversation
  • Reload memories or rebuild system prompts mid-conversation

Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.

Working Directory Behavior

  • CLI: Uses current directory (.os.getcwd())
  • Messaging: Uses MESSAGING_CWD env var (default: home directory)

Background Process Notifications (Gateway)

When terminal(background=true, notify_on_complete=true) is used, the gateway runs a watcher that detects process completion and triggers a new agent turn. Control verbosity of background process messages with display.background_process_notifications in config.yaml (or HERMES_BACKGROUND_NOTIFICATIONS env var):

  • all — running-output updates + final message (default)
  • result — only the final completion message
  • error — only the final message when exit code != 0
  • off — no watcher messages at all

Profiles: Multi-Instance Support

Hermes supports profiles — multiple fully isolated instances, each with its own HERMES_HOME directory (config, API keys, memory, sessions, skills, gateway, etc.).

The core mechanism: _apply_profile_override() in hermes_cli/main.py sets HERMES_HOME before any module imports. All 119+ references to get_hermes_home() automatically scope to the active profile.

Rules for profile-safe code

  1. Use get_hermes_home() for all HERMES_HOME paths. Import from hermes_constants. NEVER hardcode ~/.hermes or Path.home() / ".hermes" in code that reads/writes state.

    # GOOD
    from hermes_constants import get_hermes_home
    config_path = get_hermes_home() / "config.yaml"
    
    # BAD — breaks profiles
    config_path = Path.home() / ".hermes" / "config.yaml"
    
  2. Use display_hermes_home() for user-facing messages. Import from hermes_constants. This returns ~/.hermes for default or ~/.hermes/profiles/<name> for profiles.

    # GOOD
    from hermes_constants import display_hermes_home
    print(f"Config saved to {display_hermes_home()}/config.yaml")
    
    # BAD — shows wrong path for profiles
    print("Config saved to ~/.hermes/config.yaml")
    
  3. Module-level constants are fine — they cache get_hermes_home() at import time, which is AFTER _apply_profile_override() sets the env var. Just use get_hermes_home(), not Path.home() / ".hermes".

  4. Tests that mock Path.home() must also set HERMES_HOME — since code now uses get_hermes_home() (reads env var), not Path.home() / ".hermes":

    with patch.object(Path, "home", return_value=tmp_path), \
         patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
        ...
    
  5. Gateway platform adapters should use token locks — if the adapter connects with a unique credential (bot token, API key), call acquire_scoped_lock() from gateway.status in the connect()/start() method and release_scoped_lock() in disconnect()/stop(). This prevents two profiles from using the same credential. See gateway/platforms/telegram.py for the canonical pattern.

  6. Profile operations are HOME-anchored, not HERMES_HOME-anchored_get_profiles_root() returns Path.home() / ".hermes" / "profiles", NOT get_hermes_home() / "profiles". This is intentional — it lets hermes -p coder profile list see all profiles regardless of which one is active.

Known Pitfalls

DO NOT hardcode ~/.hermes paths

Use get_hermes_home() from hermes_constants for code paths. Use display_hermes_home() for user-facing print/log messages. Hardcoding ~/.hermes breaks profiles — each profile has its own HERMES_HOME directory. This was the source of 5 bugs fixed in PR #3575.

DO NOT use simple_term_menu for interactive menus

Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use curses (stdlib) instead. See hermes_cli/tools_config.py for the pattern.

DO NOT use \033[K (ANSI erase-to-EOL) in spinner/display code

Leaks as literal ?[K text under prompt_toolkit's patch_stdout. Use space-padding: f"\r{line}{' ' * pad}".

_last_resolved_tool_names is a process-global in hermes_agent/tools/dispatch.py

_run_single_child() in delegate_tool.py saves and restores this global around subagent execution. If you add new code that reads this global, be aware it may be temporarily stale during child agent runs.

DO NOT hardcode cross-tool references in schema descriptions

Tool schema descriptions must not mention tools from other toolsets by name (e.g., browser_navigate saying "prefer web_search"). Those tools may be unavailable (missing API keys, disabled toolset), causing the model to hallucinate calls to non-existent tools. If a cross-reference is needed, add it dynamically in get_tool_definitions() in hermes_agent/tools/dispatch.py — see the browser_navigate / execute_code post-processing blocks for the pattern.

Tests must not write to ~/.hermes/

The _isolate_hermes_home autouse fixture in tests/conftest.py redirects HERMES_HOME to a temp dir. Never hardcode ~/.hermes/ paths in tests.

Profile tests: When testing profile features, also mock Path.home() so that _get_profiles_root() and _get_default_hermes_home() resolve within the temp dir. Use the pattern from tests/hermes_cli/test_profiles.py:

@pytest.fixture
def profile_env(tmp_path, monkeypatch):
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    monkeypatch.setenv("HERMES_HOME", str(home))
    return home

Testing

ALWAYS use scripts/run_tests.sh — do not call pytest directly. The script enforces hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8, 4 xdist workers matching GHA ubuntu-latest). Direct pytest on a 16+ core developer machine with API keys set diverges from CI in ways that have caused multiple "works locally, fails in CI" incidents (and the reverse).

scripts/run_tests.sh                                  # full suite, CI-parity
scripts/run_tests.sh tests/gateway/                   # one directory
scripts/run_tests.sh tests/agent/test_foo.py::test_x  # one test
scripts/run_tests.sh -v --tb=long                     # pass-through pytest flags

Why the wrapper (and why the old "just call pytest" doesn't work)

Five real sources of local-vs-CI drift the script closes:

Without wrapper With wrapper
Provider API keys Whatever is in your env (auto-detects pool) All *_API_KEY/*_TOKEN/etc. unset
HOME / ~/.hermes/ Your real config+auth.json Temp dir per test
Timezone Local TZ (PDT etc.) UTC
Locale Whatever is set C.UTF-8
xdist workers -n auto = all cores (20+ on a workstation) -n 4 matching CI

tests/conftest.py also enforces points 1-4 as an autouse fixture so ANY pytest invocation (including IDE integrations) gets hermetic behavior — but the wrapper is belt-and-suspenders.

Running without the wrapper (only if you must)

If you can't use the wrapper (e.g. on Windows or inside an IDE that shells pytest directly), at minimum activate the venv and pass -n 4:

source venv/bin/activate
python -m pytest tests/ -q -n 4

Worker count above 4 will surface test-ordering flakes that CI never sees.

Always run the full suite before pushing changes.

Don't write change-detector tests

A test is a change-detector if it fails whenever data that is expected to change gets updated — model catalogs, config version numbers, enumeration counts, hardcoded lists of provider models. These tests add no behavioral coverage; they just guarantee that routine source updates break CI and cost engineering time to "fix."

Do not write:

# catalog snapshot — breaks every model release
assert "gemini-2.5-pro" in _PROVIDER_MODELS["gemini"]
assert "MiniMax-M2.7" in models

# config version literal — breaks every schema bump
assert DEFAULT_CONFIG["_config_version"] == 21

# enumeration count — breaks every time a skill/provider is added
assert len(_PROVIDER_MODELS["huggingface"]) == 8

Do write:

# behavior: does the catalog plumbing work at all?
assert "gemini" in _PROVIDER_MODELS
assert len(_PROVIDER_MODELS["gemini"]) >= 1

# behavior: does migration bump the user's version to current latest?
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]

# invariant: no plan-only model leaks into the legacy list
assert not (set(moonshot_models) & coding_plan_only_models)

# invariant: every model in the catalog has a context-length entry
for m in _PROVIDER_MODELS["huggingface"]:
    assert m.lower() in DEFAULT_CONTEXT_LENGTHS_LOWER

The rule: if the test reads like a snapshot of current data, delete it. If it reads like a contract about how two pieces of data must relate, keep it. When a PR adds a new provider/model and you want a test, make the test assert the relationship (e.g. "catalog entries all have context lengths"), not the specific names.

Reviewers should reject new change-detector tests; authors should convert them into invariants before re-requesting review.