hermes-agent/website/docs/reference/environment-variables.md
Teknium 091d8e1030
feat(codex-runtime): optional codex app-server runtime for OpenAI/Codex models (#24182)
* feat(codex-runtime): scaffold optional codex app-server runtime

Foundational commit for an opt-in alternate runtime that hands OpenAI/Codex
turns to a 'codex app-server' subprocess instead of Hermes' tool dispatch.
Default behavior is unchanged.

Lands in three pieces:

1. agent/transports/codex_app_server.py — JSON-RPC 2.0 over stdio speaker
   for codex's app-server protocol (codex-rs/app-server). Spawn, init
   handshake, request/response, notification queue, server-initiated
   request queue (for approval round-trips), interrupt-friendly blocking
   reads. Tested against real codex 0.130.0 binary end-to-end during
   development.

2. hermes_cli/runtime_provider.py:
   - Adds 'codex_app_server' to _VALID_API_MODES.
   - Adds _maybe_apply_codex_app_server_runtime() helper, called at the
     end of _resolve_runtime_from_pool_entry(). Inert unless
     'model.openai_runtime: codex_app_server' is set in config.yaml AND
     provider in {openai, openai-codex}. Other providers cannot be
     rerouted (anthropic, openrouter, etc. preserved).

3. tests/agent/transports/test_codex_app_server_runtime.py — 24 tests
   covering api_mode registration, the rewriter helper (default-off,
   case-insensitive, opt-in, non-eligible providers preserved), version
   parser, missing-binary handling, error class. Does NOT require codex
   CLI installed.

This commit is wire-only: the api_mode is recognized but AIAgent does
not yet branch on it. Followup commits add the session adapter, event
projector, approval bridge, transcript projection (so memory/skill
review still works), plugin migration, and slash command.

Existing tests remain green:
- tests/cli/test_cli_provider_resolution.py (29 passed)
- tests/agent/test_credential_pool_routing.py (included above)

* feat(codex-runtime): add codex item projector for memory/skill review

The translator that lets Hermes' self-improvement loop keep working under the
Codex runtime: converts codex 'item/*' notifications into Hermes' standard
{role, content, tool_calls, tool_call_id} message shape that
agent/curator.py already knows how to read.

Item taxonomy (matches codex-rs/app-server-protocol/src/protocol/v2/item.rs):
  - userMessage          → {role: user, content}
  - agentMessage         → {role: assistant, content: text}
  - reasoning            → stashed in next assistant's 'reasoning' field
  - commandExecution     → assistant tool_call(name='exec_command') + tool result
  - fileChange           → assistant tool_call(name='apply_patch') + tool result
  - mcpToolCall          → assistant tool_call(name='mcp.<server>.<tool>') + tool result
  - dynamicToolCall      → assistant tool_call(name=<tool>) + tool result
  - plan/hookPrompt/etc  → opaque assistant note, no fabricated tool_calls

Invariants preserved:
  - Message role alternation never violated: each tool item produces at most
    one assistant + one tool message in that order, correlated by call_id.
  - Streaming deltas (item/<type>/outputDelta, item/agentMessage/delta)
    don't materialize messages — only item/completed does. Mirrors how
    Hermes already only writes the assistant message after streaming ends.
  - Tool call ids are deterministic (codex item id-based) so replays produce
    identical messages and prefix caches stay valid (AGENTS.md pitfall #16).
  - JSON args use sorted_keys for the same reason.

Real wire formats verified against codex 0.130.0 by capturing live
notifications from thread/shellCommand and including one as a fixture
(COMMAND_EXEC_COMPLETED).

23 new tests, all green:
  - Streaming deltas don't materialize (3 paths)
  - Turn/thread frame events are silent
  - commandExecution: 5 tests including non-zero exit annotation +
    deterministic id stability across replays
  - agentMessage + reasoning attachment + reasoning consumption
  - fileChange: summary without inlined content
  - mcpToolCall: namespaced naming + error surfacing
  - userMessage: text fragments only (drops images/etc)
  - opaque items: no fabricated tool_calls
  - Helpers: deterministic id stability + sorted JSON args
  - Role alternation invariant across all four tool-shaped item types

This commit is a pure addition. AIAgent integration (the wire that uses the
projector) is the next commit.

* feat(codex-runtime): add session adapter + approval bridge

The third self-contained module: CodexAppServerSession owns one Codex
thread per Hermes session, drives turn/start, consumes streaming
notifications via CodexEventProjector, handles server-initiated approval
requests, and translates cancellation into turn/interrupt.

The adapter has a single public per-turn method:

    result = session.run_turn(user_input='...', turn_timeout=600)
    # result.final_text          → assistant text for the caller
    # result.projected_messages  → list ready to splice into AIAgent.messages
    # result.tool_iterations     → tick count for _iters_since_skill nudge
    # result.interrupted         → True on Ctrl+C / deadline / interrupt
    # result.error               → error string when the turn cannot complete
    # result.turn_id, thread_id  → for sessions DB / resume

Behavior:

  - ensure_started() spawns codex, does the initialize handshake, and
    issues thread/start with cwd + permissions profile. Idempotent.
  - run_turn() blocks until turn/completed, drains server-initiated
    requests (approvals) before reading notifications so codex never
    deadlocks waiting for us, projects every item/completed via the
    projector, and increments tool_iterations for the skill nudge gate.
  - request_interrupt() is thread-safe (threading.Event); the next loop
    iteration issues turn/interrupt and unwinds.
  - turn_timeout deadlock guard issues turn/interrupt and records an
    error if the turn never completes.
  - close() escalates terminate → kill via the underlying client.

Approval bridge:

  Codex emits server-initiated requests for execCommandApproval and
  applyPatchApproval. The adapter translates Hermes' approval choice
  vocabulary onto codex's decision vocabulary:

    Hermes 'once'                → codex 'approved'
    Hermes 'session' or 'always' → codex 'approvedForSession'
    Hermes 'deny' / anything else → codex 'denied'

  Routing precedence:
    1. _ServerRequestRouting.auto_approve_* flags (cron / non-interactive)
    2. approval_callback wired by the CLI (defers to
       tools.approval.prompt_dangerous_approval())
    3. Fail-closed denial when neither is wired

  Unknown server-request methods are answered with JSON-RPC error -32601
  so codex doesn't hang waiting for us.

Permission profile mapping mirrors AGENTS.md:
    Hermes 'auto'              → codex 'workspace-write'
    Hermes 'approval-required' → codex 'read-only-with-approval'
    Hermes 'unrestricted/yolo' → codex 'full-access'

20 new tests, all green. Combined with prior commits this PR now has
67 tests across three modules:
  - test_codex_app_server_runtime.py: 24 (api_mode + transport surface)
  - test_codex_event_projector.py: 23 (item taxonomy projections)
  - test_codex_app_server_session.py: 20 (turn loop + approvals + interrupts)

Full tests/agent/transports/ directory: 249/249 pass — no regressions
to existing transport tests.

Still no wire into AIAgent.run_conversation(); that integration commit
is small and goes next.

* feat(codex-runtime): wire codex_app_server runtime into AIAgent

The integration commit. AIAgent.run_conversation() now early-returns to a
new helper _run_codex_app_server_turn() when self.api_mode ==
'codex_app_server', bypassing the chat_completions tool loop entirely.

Three small surgical edits to run_agent.py (~105 LOC total):

1. Line ~1204 (constructor api_mode validation set):
   Add 'codex_app_server' so an explicit api_mode='codex_app_server'
   passed to AIAgent() isn't silently rewritten to 'chat_completions'.

2. Line ~12048 (run_conversation, just before the while loop):
   Early-return to _run_codex_app_server_turn() when self.api_mode is
   'codex_app_server'. Placed AFTER all standard pre-loop setup —
   logging context, session DB, surrogate sanitization, _user_turn_count
   and _turns_since_memory increments, _ext_prefetch_cache, memory
   manager on_turn_start — so behavior outside the model-call loop is
   identical between paths. Default Hermes flow is unchanged when the
   flag is off.

3. End-of-class (line ~15497):
   New method _run_codex_app_server_turn(). Lazy-instantiates one
   CodexAppServerSession per AIAgent (reused across turns), runs the
   turn, splices projected_messages into messages, increments
   _iters_since_skill by tool_iterations (since the chat_completions
   loop normally does that per iteration), fires
   _spawn_background_review on the same cadence as the default path.

Counter accounting:

  _turns_since_memory  ← already incremented at run_conversation:11817
                         (gated on memory store configured) — codex
                         helper does NOT touch it (would double-count).
  _user_turn_count     ← already incremented at run_conversation:11793
                         — codex helper does NOT touch it.
  _iters_since_skill   ← incremented in the chat_completions loop per
                         tool iteration. Codex helper increments by
                         turn.tool_iterations since the loop is bypassed.

User message:

  ALREADY appended to messages by run_conversation pre-loop (line 11823)
  before the early-return reaches us. Helper does NOT append again.
  Regression test test_user_message_not_duplicated guards this.

Approval callback wiring:

  Lazy-fetches tools.terminal_tool._get_approval_callback at session
  spawn time, passes to CodexAppServerSession. CLI threads with
  prompt_toolkit get interactive approvals; gateway/cron contexts get
  the codex-side fail-closed deny.

Error path:

  Codex session exceptions become a 'partial' result with completed=False
  and a final_response that explicitly tells the user how to switch back:
  'Codex app-server turn failed: ... Fall back to default runtime with
  /codex-runtime auto.' Same return-dict shape as the chat_completions
  path so all callers (gateway, CLI, batch_runner, ACP) work unchanged.

9 new integration tests in tests/run_agent/test_codex_app_server_integration.py:
  - api_mode='codex_app_server' is accepted on AIAgent construction
  - run_conversation returns the expected codex shape
    (final_response, codex_thread_id, codex_turn_id, completed, partial)
  - Projected messages are spliced into messages list
  - _iters_since_skill ticks per tool iteration
  - _user_turn_count delegated to standard flow (not double-counted)
  - User message appears exactly once (regression guard)
  - _spawn_background_review IS invoked (memory/skill review keeps working)
  - chat.completions.create is NEVER called (loop fully bypassed)
  - Session exception → partial result with /codex-runtime auto hint
  - Interrupted turn → partial result with error preserved

Adjacent test runs confirm no regressions:
  - tests/run_agent/test_memory_nudge_counter_hydration.py: green
  - tests/run_agent/test_background_review.py: green
  - tests/run_agent/test_fallback_model.py: green
  - tests/agent/transports/: 249/249 green

Still missing for full feature: /codex-runtime slash command, plugin
migration helper, docs page, live e2e test gated on codex binary. Those
are the remaining followup commits.

* feat(codex-runtime): add /codex-runtime slash command (CLI + gateway)

User-facing toggle for the optional codex app-server runtime. Follows the
'Adding a Slash Command (All Platforms)' pattern from AGENTS.md exactly:
single CommandDef in the central registry → CLI handler → gateway handler
→ running-agent guard → all surfaces (autocomplete, /help, Telegram menu,
Slack subcommands) update automatically.

Surface:
    /codex-runtime                    — show current state + codex CLI status
    /codex-runtime auto               — Hermes default runtime
    /codex-runtime codex_app_server   — codex subprocess runtime
    /codex-runtime on / off           — synonyms

Files changed:

  hermes_cli/codex_runtime_switch.py (new):
    Pure-Python state machine shared by CLI and gateway. Parse args,
    read/write model.openai_runtime in the config dict, gate enabling
    behind a codex --version check (don't let users opt in to a runtime
    they have no binary for; print npm install hint instead).
    Returns a CodexRuntimeStatus dataclass that callers render however
    suits their surface.

  hermes_cli/commands.py:
    Single CommandDef entry, no aliases (codex-runtime is its own thing).

  cli.py:
    Dispatch in process_command() + _handle_codex_runtime() handler that
    delegates to the shared module and renders results via _cprint.

  gateway/run.py:
    Dispatch in _handle_message() + _handle_codex_runtime_command() that
    returns a string (gateway sends as message). On a successful change
    that requires a new session, _evict_cached_agent() forces the next
    inbound message to construct a fresh AIAgent with the new api_mode —
    avoids prompt-cache invalidation mid-session.

  gateway/run.py running-agent guard:
    /codex-runtime joins /model in the early-intercept block so a runtime
    flip mid-turn can't split a turn across two transports.

Tests:
  tests/hermes_cli/test_codex_runtime_switch.py — 25 tests covering the
  state machine: arg parsing (10 cases incl. case-insensitive and
  synonyms), reading current runtime (5 cases incl. malformed configs),
  writing runtime (3 cases), apply() entry point covering read-only,
  no-op, codex-missing-blocked, codex-present-success, disable-no-binary-check,
  and persist-failure paths (8 cases). All green.

Adjacent test suites confirm no regressions:
  - tests/hermes_cli/test_commands.py + test_codex_runtime_switch.py:
    167/167 green
  - tests/agent/transports/: 283/283 green when combined with prior commits

Still missing: plugin migration helper, docs page, live e2e test gated on
codex binary. Followup commits.

* feat(codex-runtime): auto-migrate Hermes MCP servers to ~/.codex/config.toml

Translates the user's mcp_servers config from ~/.hermes/config.yaml into
the TOML format codex's MCP client expects. Wired into the
/codex-runtime codex_app_server enable path so users get their MCP tool
surface in the spawned subprocess automatically.

The migration runs on every enable. Failures are non-fatal — the runtime
change still proceeds and the user gets a warning so they can fix the
codex config manually.

What translates (mapping verified against codex-rs/core/src/config/edit.rs):
  Hermes mcp_servers.<n>.command/args/env  → codex stdio transport
  Hermes mcp_servers.<n>.url/headers       → codex streamable_http transport
  Hermes mcp_servers.<n>.timeout           → codex tool_timeout_sec
  Hermes mcp_servers.<n>.connect_timeout   → codex startup_timeout_sec
  Hermes mcp_servers.<n>.cwd               → codex stdio cwd
  Hermes mcp_servers.<n>.enabled: false    → codex enabled = false

What does NOT translate (warned + skipped per server):
  Hermes-specific keys (sampling, etc.) — codex's MCP client has no
  equivalent. Listed in the per-server skipped[] field of the report.

What's NOT migrated (intentional):
  AGENTS.md — codex respects this file natively in its cwd. Hermes' own
  AGENTS.md (project-level) is already in the worktree, so codex picks
  it up without translation. No code needed.

Idempotency design:
  All managed content lives between a 'managed by hermes-agent' marker
  and the next non-mcp_servers section header. _strip_existing_managed_block
  removes the prior managed region cleanly, preserving any user-added
  codex config (model, providers.openai, sandbox profiles, etc.) above
  or below.

Files added:
  hermes_cli/codex_runtime_plugin_migration.py — pure-Python migration
    helper. Public API: migrate(hermes_config, codex_home=None,
    dry_run=False) returns MigrationReport with .migrated/.errors/
    .skipped_keys_per_server. No external TOML dependency — minimal
    formatter handles strings/numbers/booleans/lists/inline-tables.

  tests/hermes_cli/test_codex_runtime_plugin_migration.py — 39 tests
  covering:
    - per-server translation (12): stdio/http/sse, cwd, timeouts,
      enabled flag, command+url precedence, sampling drop, unknown keys
    - TOML formatter (8): types, escaping, inline tables, error case
    - existing-block stripping (4): no marker, alone, with user content
      above, with user content below
    - end-to-end migrate() (8): empty, dry-run, round-trip, idempotent
      re-run, preserves user config, error reporting, invalid input,
      summary formatting

Files changed:
  hermes_cli/codex_runtime_switch.py — apply() now calls migrate() in
    the codex_app_server enable branch. Migration failure logs a warning
    in the result message but does NOT fail the runtime change. Disable
    path (auto) explicitly skips migration.

  tests/hermes_cli/test_codex_runtime_switch.py — 3 new tests:
    test_enable_triggers_mcp_migration, test_disable_does_not_trigger_migration,
    test_migration_failure_does_not_block_enable.

All 325 feature tests green:
  - tests/agent/transports/: 249 (incl. 67 new)
  - tests/run_agent/test_codex_app_server_integration.py: 9
  - tests/hermes_cli/test_codex_runtime_switch.py: 28 (3 new)
  - tests/hermes_cli/test_codex_runtime_plugin_migration.py: 39 (new)

* perf(codex-runtime): cache codex --version check within apply()

Single /codex-runtime invocation could spawn 'codex --version' up to 3
times (state report, enable gate, success message). Each spawn is ~50ms,
so the cumulative cost wasn't a crisis, but it was wasteful and turned a
trivial slash command into something noticeably laggy on slower systems.

Refactored to lazy-once via a closure over a nonlocal cache. First call
spawns; subsequent calls in the same apply() reuse the result.

Behavior unchanged — same return shape, same error handling, same install
hint when codex is missing. Just one subprocess per call instead of three.

Two regression-guard tests added:
  - test_binary_check_cached_within_apply: enable path → call_count == 1
  - test_binary_check_cached_on_read_only_call: state-report path → call_count == 1

Total tests for /codex-runtime now 30 (was 28); all 143 codex-runtime
tests still green.

* fix(codex-runtime): correct protocol field names found via live e2e test

Three real bugs caught only by running a turn end-to-end against codex
0.130.0 with a real ChatGPT subscription. Unit tests passed because they
asserted on our own (incorrect) wire shapes; the wire format from
codex-rs/app-server-protocol/src/protocol/v2/* is the source of truth and
my initial reading of the README was incomplete.

Bug 1: thread/start.permissions wire format

Was sending {"profileId": "workspace-write"}.
Real format per PermissionProfileSelectionParams enum (tagged union):
  {"type": "profile", "id": "workspace-write"}
AND requires the experimentalApi capability declared during initialize.
AND requires a matching [permissions] table in ~/.codex/config.toml or
codex fails the request with 'default_permissions requires a [permissions]
table'.

Fix: stop overriding permissions on thread/start. Codex picks its default
profile (read-only unless user configures otherwise), which matches what
codex CLI users expect — they configure their default permission profile
in ~/.codex/config.toml the standard way. Trying to be clever about
profile selection broke every turn we tested.

Live error before fix: 'Invalid request: missing field type' on every
turn/start, even though our turn/start payload was correct — the field
codex was complaining about was inside the permissions sub-object we
shouldn't have been sending.

Bug 2: server-request method names

Was matching 'execCommandApproval' and 'applyPatchApproval'.
Real names per common.rs ServerRequest enum:
  item/commandExecution/requestApproval
  item/fileChange/requestApproval
  item/permissions/requestApproval (new third method)

Fix: match the documented names. Added handler for
item/permissions/requestApproval that always declines — codex sometimes
asks to escalate permissions mid-turn and silent acceptance would surprise
users.

Live symptom before fix: agent.log showed
'Unknown codex server request: item/commandExecution/requestApproval'
and codex stalled because we replied with -32601 (unsupported method)
instead of an approval decision. The agent reported back 'The write
command was rejected' even though Hermes never showed the user an
approval prompt.

Bug 3: approval decision values

Was sending decision strings 'approved'/'approvedForSession'/'denied'.
Real values per CommandExecutionApprovalDecision enum (camelCase):
  accept, acceptForSession, decline, cancel
(also AcceptWithExecpolicyAmendment and ApplyNetworkPolicyAmendment
variants we don't currently use).

Fix: rename _approval_choice_to_codex_decision return values; update
auto_approve_* fallbacks; update fail-closed default from 'denied' to
'decline'. Test mapping table updated to match.

Live test verified after fixes:
  $ hermes (with model.openai_runtime: codex_app_server)
  > Run the shell command: echo hermes-codex-livetest > .../proof.txt
    then read it back

  Approval prompt fired with 'Codex requests exec in <cwd>'.
  User chose 'Allow once'. Codex executed the command, wrote the file,
  read it back. Final response: 'Read back from proof.txt:
  hermes-codex-livetest'. File contents on disk match.

agent.log confirms:
  codex app-server thread started: id=019e200e profile=workspace-write
                                    cwd=/tmp/hermes-codex-livetest/workspace

All 20 session tests still green after wire-format updates.

* fix(codex-runtime): correct apply_patch approval params + ship docs

Live e2e revealed FileChangeRequestApprovalParams doesn't carry the
changeset (just itemId, threadId, turnId, reason, grantRoot) — Codex's
'reason' field describes what the patch wants to do. Test config and
display logic updated to use it. The first 'apply_patch (0 change(s))'
display from the live test is now 'apply_patch: <reason>'.

Adds website/docs/user-guide/features/codex-app-server-runtime.md
covering enable/disable, prerequisites, approval UX, MCP migration
behavior, permission profile delegation to ~/.codex/config.toml, known
limitations, and the architecture diagram. Wired into the Automation
category in sidebars.ts.

Live e2e validation across the path matrix:
  ✓ thread/start handshake
  ✓ turn/start with text input
  ✓ commandExecution items + projection
  ✓ item/commandExecution/requestApproval → Hermes UI → response
  ✓ Approve once → command runs
  ✓ Deny → command rejected, codex falls back to read-only message
  ✓ Multi-turn (codex remembers prior turn's results)
  ✓ apply_patch via Codex's fileChange path
  ✓ item/fileChange/requestApproval → Hermes UI
  ✓ MCP server migration loads inside spawned codex (verified via
    'use the filesystem MCP tool' prompt)
  ✓ /codex-runtime auto → codex_app_server toggle cycle
  ✓ Disable doesn't trigger migration
  ✓ Enable with codex CLI present succeeds + migrates
  ✓ Hermes-side interrupt path (turn/interrupt request issued cleanly
    even if codex finishes before the interrupt lands)

Known live-validated limitations now documented in the docs page:
  - delegate_task subagents unavailable on this runtime
  - permission profile selection delegated to ~/.codex/config.toml
  - apply_patch approval prompt has no inline changeset (codex protocol
    doesn't expose it)

145/145 codex-runtime tests still green.

* feat(codex-runtime): native plugin migration + UX polish (quirks 2/4/5/10/11)

Major: migrate native Codex plugins (#7 in OpenClaw's PR list)

Discovers installed curated plugins via codex's plugin/list RPC and
writes [plugins."<name>@<marketplace>"] entries to ~/.codex/config.toml
so they're enabled in the spawned Codex sessions. This is the
'YouTube-video-worthy' bit Pash highlighted: when a user has
google-calendar, github, etc. installed in their Codex CLI, those
plugins activate automatically when they enable Hermes' codex runtime.

Implementation:
  - hermes_cli/codex_runtime_plugin_migration.py: new _query_codex_plugins()
    helper spawns 'codex app-server' briefly and walks plugin/list. Returns
    (plugins, error) — failures are non-fatal so MCP migration still works.
  - render_codex_toml_section() now takes plugins + permissions args.
  - migrate() defaults: discover_plugins=True, default_permission_profile=
    'workspace-write'. Explicit None on either disables that side.
  - _strip_existing_managed_block() now also strips [plugins.*] and
    [permissions]/[permissions.*] sections inside the managed block, so
    re-runs replace plugins cleanly without touching codex's own config.

Quirk fixes:

#2 Default permissions profile written on enable.
   Without this, Codex's read-only default kicks in and EVERY write
   triggers an approval prompt. Now writes [permissions] default =
   'workspace-write' so the runtime feels normal out of the box. Set
   default_permission_profile=None to opt out.

#4 apply_patch approval prompt now shows what's changing.
   Codex's FileChangeRequestApprovalParams doesn't carry the changeset.
   Session adapter now caches the fileChange item from item/started
   notifications and looks it up by itemId when codex requests approval.
   Prompt shows '1 add, 1 update: /tmp/new.py, /tmp/old.py' instead of
   'apply_patch (0 change(s))'.

   Side benefit: also drains pending notifications BEFORE handling a
   server request, so the projector and per-turn caches are up to date
   when the approval decision fires. Bounded to 8 notifications per
   loop iter to avoid starving codex's response.

#5/#10 Exec approval prompt never shows empty cwd.
   When codex omits cwd in CommandExecutionRequestApprovalParams, fall
   back to the session's cwd. If somehow neither is available, show
   '<unknown>' explicitly instead of an empty string.

   Also surfaces 'reason' from the approval params when codex provides
   it — gives users more context on why codex wants to run something.

#11 Banner indicates the codex_app_server runtime when active.
   New 'Runtime: codex app-server (terminal/file ops/MCP run inside
   codex)' line appears in the welcome banner only when the runtime is
   on. Default banner is unchanged.

Tests:
  - 7 new tests in test_codex_runtime_plugin_migration.py covering
    plugin discovery (mocked), failure handling, dry-run skip, opt-out
    flag, idempotent re-runs, and permissions writing.
  - 3 new tests in test_codex_app_server_session.py covering the
    enriched approval prompts: cwd fallback, change summary on
    apply_patch, fallback when no item/started cache exists.
  - All 26 session tests + 46 migration tests green; 153 total in PR.

* feat(codex-runtime): hermes-tools MCP callback + native plugin migration

The big architectural addition: when codex_app_server runtime is on,
Hermes registers its own tool surface as an MCP server in
~/.codex/config.toml so the codex subprocess can call back into Hermes
for tools codex doesn't ship with — web_search, browser_*, vision,
image_generate, skills, TTS.

Also: 'migrate native codex plugins' (Pash's YouTube-video-worthy bit) —
when the user has plugins like Linear, GitHub, Gmail, Calendar, Canva
installed via 'codex plugin', Hermes discovers them via plugin/list and
writes [plugins.<name>@openai-curated] entries so they activate
automatically.

New module: agent/transports/hermes_tools_mcp_server.py
  FastMCP stdio server exposing 17 Hermes tools. Each call dispatches
  through model_tools.handle_function_call() — same code path as the
  Hermes default runtime. Run with:
    python -m agent.transports.hermes_tools_mcp_server [--verbose]

  Exposed: web_search, web_extract, browser_navigate / _click / _type /
    _press / _snapshot / _scroll / _back / _get_images / _console /
    _vision, vision_analyze, image_generate, skill_view, skills_list,
    text_to_speech.

  NOT exposed (deliberately):
    - terminal/shell/read_file/write_file/patch — codex has built-ins
    - delegate_task/memory/session_search/todo — _AGENT_LOOP_TOOLS in
      model_tools.py:493, require running AIAgent context. Documented
      as a limitation and surfaced in the slash command output.

Migration changes (hermes_cli/codex_runtime_plugin_migration.py):
  - _query_codex_plugins() spawns 'codex app-server' briefly to walk
    plugin/list and pull installed openai-curated plugins. Failures are
    non-fatal — MCP migration still completes.
  - render_codex_toml_section() now takes plugins + permissions args
    AND wraps the managed block with a MIGRATION_END_MARKER comment so
    the stripper can reliably find both ends, even when the block
    contains top-level keys (default_permissions = ...).
  - migrate() defaults: discover_plugins=True, expose_hermes_tools=True,
    default_permission_profile=':workspace' (built-in codex profile name
    — must be prefixed with ':'). All three opt-out via explicit args.
  - _build_hermes_tools_mcp_entry() builds the codex stdio entry with
    HERMES_HOME and PYTHONPATH passthrough so a worktree-launched
    Hermes points the MCP subprocess at the same module layout.

Live-caught wire bugs fixed during this turn:
  1. Permission profile config key is top-level , NOT a [permissions] table. The [permissions] table is
     for *user-defined* profiles with structured fields. Built-in
     profile names start with ':' (':workspace', ':read-only',
     ':danger-no-sandbox'). Was emitting
     which codex rejected with 'invalid type: string "X", expected
     struct PermissionProfileToml'.
  2. Built-in profile is , NOT . Codex
     rejected  with 'unknown built-in profile'.
  3. Codex's MCP layer sends  for
     tool-call confirmation. We weren't handling it, so codex stalled
     and returned 'MCP tool call was rejected'. Now: auto-accept for
     our own hermes-tools server (user already opted in by enabling
     the runtime), decline for third-party servers.

Quirk fixes shipped (from the limitations list):
  #2 default permissions: workspace profile written on enable. No more
     approval prompt on every write.
  #4 apply_patch approval shows what's changing: cache fileChange
     items from item/started, look up by itemId when codex sends
     item/fileChange/requestApproval. Prompt: '1 add, 1 update:
     /tmp/new.py, /tmp/old.py' instead of '0 change(s)'.
  #5/#10 exec approval cwd never empty: fall back to session cwd, then
     '<unknown>'. Also surfaces 'reason' from codex when present.
  #11 banner shows 'Runtime: codex app-server' line when active so
     users understand why tool counts may not match what's reachable.

Tests:
  - 5 new tests in test_codex_runtime_plugin_migration.py covering
    plugin discovery, expose_hermes_tools entry generation, idempotent
    re-runs, opt-out flag, permissions profile.
  - 3 new tests in test_codex_app_server_session.py covering enriched
    approval prompts (cwd fallback, fileChange summary).
  - 2 new tests for mcpServer/elicitation/request handling (accept
    hermes-tools, decline others).
  - New test file test_hermes_tools_mcp_server.py covering module
    surface, EXPOSED_TOOLS safety invariants (no shell/file_ops,
    no agent-loop tools), and main() error paths.
  - 166 codex-runtime tests total, all green.

Live e2e validated against codex 0.130.0 + ChatGPT subscription:
  ✓ /codex-runtime codex_app_server enables, migrates filesystem MCP,
    registers hermes-tools, writes default_permissions = ':workspace'
  ✓ Banner shows 'Runtime: codex app-server' line in subsequent sessions
  ✓ Shell command runs without approval prompt (workspace profile works)
  ✓ Multi-turn — codex remembers prior turn's results
  ✓ apply_patch path via fileChange request approval
  ✓ web_search via hermes-tools MCP callback returns real Firecrawl
    results: 'OpenAI Codex CLI – Getting Started' end-to-end in 13s
  ✓ Disable cycle clean

Docs updated: website/docs/user-guide/features/codex-app-server-runtime.md
  Full re-write covering native plugin migration, the hermes-tools
  callback architecture, the prerequisites change ('codex login is
  separate from hermes auth login codex'), the trade-off table now
  reflecting which Hermes tools work via callback, and the limitations
  list updated with what's actually unavailable on this runtime.

* feat(codex-runtime): pin user-config preservation invariant for quirk #6

Quirk #6 from the limitations list — user MCP servers / overrides /
codex-only sections in ~/.codex/config.toml that live OUTSIDE the
hermes-managed block must survive re-migration verbatim.

This already worked thanks to the MIGRATION_MARKER + MIGRATION_END_MARKER
pair I added when fixing the default_permissions wire format (so the
strip can find both ends of the managed region even with top-level
keys like default_permissions). But it was an emergent property
without a test pinning it.

Now explicitly tested:
  - User MCP server above the managed block survives migration
  - User MCP server below the managed block survives migration
  - Both above + below survive a second re-migration
  - User content (model, providers, sandbox, otel, etc.) outside our
    region is left untouched

Docs added a section "Editing ~/.codex/config.toml safely" explaining
the marker contract — so users know they can add their own MCP
servers, override permissions, configure codex-only options, etc.
without fear of Hermes overwriting their work.

167 codex-runtime tests, all green.

* docs(codex-runtime): clarify the actual tool surface — shell covers terminal/read/write/find

Previous docs and PR description undersold what codex's built-in
toolset actually provides. apply_patch alone made it sound like the
runtime could only edit files in patch format — implying you'd lose
terminal use, read_file, write_file, search/find. That was wrong.

Codex's 'shell' tool runs arbitrary shell commands inside the sandbox,
which covers everything you'd do in bash: cat/head/tail (read), echo>
or heredocs (write), find/rg/grep (search), ls/cd (navigate), build/
test/git/etc. apply_patch is for structured multi-file edits on top
of that. update_plan is its in-runtime todo. view_image loads images.
And codex has its own web_search built in (in addition to the
Firecrawl-backed one Hermes exposes via MCP callback).

Docs now have a 'What tools the model actually has' section right
after Why, breaking the surface into three clearly-labeled buckets:

  1. Codex's built-in toolset (always on) — shell, apply_patch,
     update_plan, view_image, web_search; covers everything terminal-
     adjacent.
  2. Native Codex plugins (auto-migrated from your codex plugin
     install) — Linear, GitHub, Gmail, Calendar, Outlook, Canva, etc.
  3. Hermes tool callback (MCP server in ~/.codex/config.toml) —
     web_search/web_extract via Firecrawl, browser_*, vision_analyze,
     image_generate, skill_view/skills_list, text_to_speech.

Plus a 'What's NOT available' callout listing the four agent-loop tools
(delegate_task, memory, session_search, todo) that need running
AIAgent context and can't reach the codex runtime.

Trade-offs table broken out: shell, apply_patch, update_plan,
view_image, sandbox each get their own row with a one-line description
so users can see at a glance what's available natively.

Architecture diagram updated to list the codex built-ins by name
instead of 'apply_patch + shell + sandbox'.

No code changes — purely docs clarification. 167 codex-runtime tests
still green.

* fix(codex-runtime): _spawn_background_review signature + review fork api_mode downgrade

Two real bugs in the self-improvement loop integration that the previous
test mocked away.

Bug 1: wrong call signature

The codex helper was calling self._spawn_background_review() with no
args after every turn. That function actually requires:
  messages_snapshot=list   (positional or keyword)
  review_memory=bool       (at least one trigger must be True)
  review_skills=bool

So the call would have raised TypeError at runtime — except the only
test that exercised this path mocked _spawn_background_review entirely
and just asserted spawn.called, so the wrong-arg shape never surfaced.

Bug 2: review fork inherits codex_app_server api_mode

The review fork is constructed with:
  api_mode = _parent_runtime.get('api_mode')

So when the parent is codex_app_server, the review fork ALSO runs as
codex_app_server. But the review fork's whole job is to call agent-loop
tools (memory, skill_manage) which require Hermes' own dispatch — they
short-circuit with 'must be handled by the agent loop' on the codex
runtime. So the review fork would have run, decided to save something,
called memory or skill_manage, and silently no-op'd.

Fixed in run_agent.py:_spawn_background_review() — when the parent
api_mode is 'codex_app_server', the review fork is downgraded to
'codex_responses' (same OAuth credentials, same openai-codex provider,
but talks to OpenAI's Responses API directly so Hermes owns the loop).

Also rewrote the codex helper's review wiring to match the
chat_completions path:
  - Computes _should_review_memory in the pre-loop block (was already
    being computed; now passed through to the helper as an arg).
  - Computes _should_review_skills AFTER the codex turn returns +
    counters tick (line ~15432 pattern in chat_completions).
  - Calls _spawn_background_review(messages_snapshot=, review_memory=,
    review_skills=) only when at least one trigger fires.
  - Adds the external memory provider sync (_sync_external_memory_for_turn)
    that the chat_completions path runs after every turn.

Tests:

  Replaced the broken test_background_review_invoked (which only
  asserted spawn.called) with three sharper tests:
    - test_background_review_NOT_invoked_below_threshold:
      single turn at default thresholds → no review fires (would have
      caught the original 'every turn calls spawn with no args' bug)
    - test_background_review_skill_trigger_fires_above_threshold:
      10 tool_iterations at threshold=10 → review fires with
      messages_snapshot=list, review_skills=True, counter resets
    - test_background_review_signature_never_breaks: regression guard
      asserting positional args are always empty and kwargs include
      messages_snapshot

  New TestReviewForkApiModeDowngrade class:
    - test_codex_app_server_parent_downgrades_review_fork: drives the
      real _spawn_background_review function (no mock at that level),
      asserts the review_agent gets api_mode='codex_responses' when
      the parent was codex_app_server.

Live-validated against real run_conversation:
  - Counter ticked from 0 to 5 after a 5-tool-iteration turn
  - _spawn_background_review fired exactly once with kwargs-only signature
  - review_skills=True, review_memory=False
  - messages_snapshot was 12 entries (5 assistant tool_calls + 5 tool
    results + 1 final assistant + initial system/user)
  - Counter reset to 0 after fire

170 codex-runtime tests, all green.

Docs: added a Self-improvement loop section to the codex runtime page
explaining both how the trigger logic stays equivalent and that the
review fork is auto-downgraded to codex_responses for the agent-loop
tools. Also clarified that apply_patch and update_plan ARE codex's
built-in tools (the previous version made it sound like they were
separate from 'codex's stuff' — they're not, all five tools listed
in 'What tools the model actually has' section 1 are codex built-ins).

* feat(codex-runtime): expose kanban tools through Hermes MCP callback

Kanban workers spawn as separate hermes chat -q subprocesses that read
the user's config.yaml. If model.openai_runtime: codex_app_server is set
globally (which is the whole point of opt-in), every dispatched worker
ALSO comes up on the codex runtime.

That mostly works — codex's built-in shell + apply_patch + update_plan
do the actual task work fine — but it had one critical break: the
worker handoff tools (kanban_complete, kanban_block, kanban_comment,
kanban_heartbeat) are Hermes-registered tools, not codex built-ins.
On the codex runtime, codex builds its own tool list and these never
reach the model, so the worker would do the work but not be able to
report back, hanging until the dispatcher's timeout escalates it as
zombie.

Fix: add all 9 kanban tools to the EXPOSED_TOOLS list in the Hermes
MCP callback. They dispatch statelessly through handle_function_call()
just like web_search and the others — they read HERMES_KANBAN_TASK
from env (set by the dispatcher), gate correctly (worker tools require
the env var, orchestrator tools require it unset), and write to
~/.hermes/kanban.db.

Why kanban tools work via stateless dispatch when delegate_task/memory/
session_search/todo don't: those four are listed in _AGENT_LOOP_TOOLS
(model_tools.py:493) and short-circuit in handle_function_call() with
'must be handled by the agent loop' — they need to mutate AIAgent's
mid-loop state. Kanban tools have no such requirement; they're pure
side-effect functions against the kanban.db plus state_meta.

Tools exposed:
  Worker handoff (require HERMES_KANBAN_TASK):
    kanban_complete, kanban_block, kanban_comment, kanban_heartbeat
  Read-only board queries:
    kanban_show, kanban_list
  Orchestrator (require HERMES_KANBAN_TASK unset):
    kanban_create, kanban_unblock, kanban_link

Tests:
  - test_kanban_worker_tools_exposed: complete/block/comment/heartbeat
    in EXPOSED_TOOLS (regression guard for the would-hang-worker bug)
  - test_kanban_orchestrator_tools_exposed: create/show/list/unblock/link

Docs:
  - New 'Workflow features' section in the docs page covering /goal,
    kanban, and cron behavior on this runtime
  - /goal: works fully via run_conversation feedback; only caveat is
    approval-prompt noise on long writes-heavy goals (mitigated by
    the default :workspace permission profile)
  - Kanban: enumerated which tools are reachable via the callback and
    why the env var propagates correctly through the codex subprocess
    to the MCP server subprocess
  - Cron: documented as 'not specifically tested' — same rules as the
    CLI apply since cron runs through AIAgent.run_conversation
  - Trade-offs table gained rows for /goal, kanban worker, kanban
    orchestrator

172/172 codex-runtime tests green (+2 from kanban tests).

* docs(codex-runtime): wire /codex-runtime into slash-commands ref + flag aux token cost

Three docs gaps caught during a final audit:

1. /codex-runtime was only in the feature docs page, not in the
   slash-commands reference. Added rows to both the CLI section and
   the Messaging section so users discover it where they'd look for
   slash command syntax.

2. CODEX_HOME and HERMES_KANBAN_TASK weren't in environment-variables.md.
   CODEX_HOME lets users redirect Codex CLI's config dir (the migration
   honors it). HERMES_KANBAN_TASK is set by the kanban dispatcher and
   propagates to the codex subprocess + the hermes-tools MCP subprocess
   so kanban worker tools gate correctly — documented as 'don't set
   manually' since it's an internal handoff.

3. Aux client behavior on this runtime. When openai_runtime=
   codex_app_server is on with the openai-codex provider, every aux
   task (title generation, context compression, vision auto-detect,
   session search summarization, the background self-improvement review
   fork) flows through the user's ChatGPT subscription by default.

   This is true for the existing codex_responses path too, but it's
   more visible / important here because users explicitly opted in for
   subscription billing. Added a 'Auxiliary tasks and ChatGPT
   subscription token cost' section to the docs page with a YAML
   example showing how to override specific aux tasks to a cheaper
   model (typically google/gemini-3-flash-preview via OpenRouter).

   Also documents how the self-improvement review fork gets
   auto-downgraded from codex_app_server to codex_responses by the
   fix earlier in this PR.

No code changes — pure docs. 172 codex-runtime tests still green.

* docs+test(codex-runtime): pin HOME passthrough, document multi-profile + CODEX_HOME

OpenClaw hit a real footgun in openclaw/openclaw#81562: when spawning
codex app-server they were synthesizing a per-agent HOME alongside
CODEX_HOME. That made every subprocess codex's shell tool launches
(gh, git, aws, npm, gcloud, ...) see a fake $HOME and miss the user's
real config files. They had to back it out in PR #81562 — keep
CODEX_HOME isolation, leave HOME alone.

Audit confirms Hermes' codex spawn doesn't have this problem. We do
os.environ.copy() and only overlay CODEX_HOME (when provided) and
RUST_LOG. HOME passes through unchanged. But it was an emergent
property without a test pinning it, so adding a regression guard:

  test_spawn_env_preserves_HOME — confirms parent HOME survives intact
                                  in the subprocess env
  test_spawn_env_sets_CODEX_HOME_when_provided — confirms codex_home
                                                  arg still isolates
                                                  codex state correctly

Docs additions:

  'HOME environment variable passthrough' section — calls out the
  contract explicitly: CODEX_HOME isolates codex's own state, HOME
  stays user-real so gh/git/aws/npm/etc. find their normal config.
  Cites openclaw#81562 as the cautionary tale.

  'Multi-profile / multi-tenant setups' section — addresses the
  related concern: profiles share ~/.codex/ by default. For users who
  want per-profile codex isolation (separate auth, separate plugins),
  documents the manual CODEX_HOME=<profile-scoped-dir> approach.

  Explains why we DON'T auto-scope CODEX_HOME per profile: doing so
  would silently invalidate existing codex login state for anyone
  upgrading to this PR with tokens already at ~/.codex/auth.json.
  Opt-in is safer than surprising users.

174 codex-runtime tests (+2 from HOME guards), all green.

* fix(codex-runtime): TOML control-char escapes + atomic config.toml write

Two footguns caught in a final audit pass before merge.

Bug 1: TOML control characters not escaped

The _format_toml_value() helper escaped backslashes and double quotes
but passed literal control characters (\n, \t, \r, \f, \b) through
unchanged. TOML basic strings don't allow literal control characters
— a path or env var containing a newline would produce invalid TOML
that codex refuses to load.

Realistic exposure: pathological cases like a HERMES_HOME with a
trailing newline (env var concatenation accident), or a PYTHONPATH
with a tab from a multi-line shell heredoc.

Fix: escape all five TOML basic-string control sequences (\b \t \n
\f \r) in addition to \\ and \" that we already did. Order
matters — backslash must come first or the other escapes get
re-escaped.

Bug 2: config.toml write wasn't atomic

If the python process crashed between target.mkdir() and the
write_text() finishing, a half-written config.toml could be left
behind. On NFS / Windows / some FUSE mounts this is a real concern;
on ext4/APFS small writes are usually atomic in practice but not
guaranteed.

Fix: write to a tempfile.mkstemp() temp file in the same directory,
then Path.replace() (atomic same-dir rename on POSIX, ReplaceFile on
Windows). On rename failure, clean up the temp file so repeated
failed migrations don't pile up .config.toml.* files.

Tests:
  - test_string_with_newline_escaped — \n in value → \n in output
  - test_string_with_tab_escaped — \t in value → \t in output
  - test_string_with_other_controls_escaped — \r, \f, \b
  - test_windows_path_escaped_correctly — backslash doubling
  - test_atomic_write_no_temp_leak_on_success — no .config.toml.*
    left over after a successful write
  - test_atomic_write_cleanup_on_rename_failure — temp file removed
    when Path.replace raises (simulated disk full)

180 codex-runtime tests, all green (+6 from this commit).

Footguns audited but NOT fixed (with rationale):

- Concurrent migrations race. Two Hermes processes hitting
  /codex-runtime codex_app_server within seconds of each other could
  cause one writer to lose entries. Low probability (you'd have to
  enable from two surfaces simultaneously) and low impact (just re-run
  migration). Adding fcntl/msvcrt locking is more code than it's
  worth here. The atomic rename above means each individual write is
  consistent — only the merge step is racy.

- Codex protocol version drift. We pin MIN_CODEX_VERSION=0.125 and
  check at runtime but don't reject too-new versions. Right call —
  the protocol has been stable through 0.125 → 0.130. If OpenAI
  breaks it later we'd see the error in test_codex_app_server_runtime
  on CI before users hit it.
2026-05-13 17:18:15 -07:00

57 KiB
Raw Blame History

sidebar_position title description
2 Environment Variables Complete reference of all environment variables used by Hermes Agent

Environment Variables Reference

All variables go in ~/.hermes/.env. You can also set them with hermes config set VAR value.

LLM Providers

Variable Description
OPENROUTER_API_KEY OpenRouter API key (recommended for flexibility)
OPENROUTER_BASE_URL Override the OpenRouter-compatible base URL
HERMES_OPENROUTER_CACHE Enable OpenRouter response caching (1/true/yes/on). Overrides openrouter.response_cache in config.yaml. See Response Caching.
HERMES_OPENROUTER_CACHE_TTL Cache TTL in seconds (1-86400). Overrides openrouter.response_cache_ttl in config.yaml.
NOUS_BASE_URL Override Nous Portal base URL (rarely needed; development/testing only)
NOUS_INFERENCE_BASE_URL Override Nous inference endpoint directly
AI_GATEWAY_API_KEY Vercel AI Gateway API key (ai-gateway.vercel.sh)
AI_GATEWAY_BASE_URL Override AI Gateway base URL (default: https://ai-gateway.vercel.sh/v1)
OPENAI_API_KEY API key for custom OpenAI-compatible endpoints (used with OPENAI_BASE_URL)
OPENAI_BASE_URL Base URL for custom endpoint (VLLM, SGLang, etc.)
COPILOT_GITHUB_TOKEN GitHub token for Copilot API — first priority (OAuth gho_* or fine-grained PAT github_pat_*; classic PATs ghp_* are not supported)
GH_TOKEN GitHub token — second priority for Copilot (also used by gh CLI)
GITHUB_TOKEN GitHub token — third priority for Copilot
HERMES_COPILOT_ACP_COMMAND Override Copilot ACP CLI binary path (default: copilot)
COPILOT_CLI_PATH Alias for HERMES_COPILOT_ACP_COMMAND
HERMES_COPILOT_ACP_ARGS Override Copilot ACP arguments (default: --acp --stdio)
COPILOT_ACP_BASE_URL Override Copilot ACP base URL
GLM_API_KEY z.ai / ZhipuAI GLM API key (z.ai)
ZAI_API_KEY Alias for GLM_API_KEY
Z_AI_API_KEY Alias for GLM_API_KEY
GLM_BASE_URL Override z.ai base URL (default: https://api.z.ai/api/paas/v4)
KIMI_API_KEY Kimi / Moonshot AI API key (moonshot.ai)
KIMI_BASE_URL Override Kimi base URL (default: https://api.moonshot.ai/v1)
KIMI_CN_API_KEY Kimi / Moonshot China API key (moonshot.cn)
ARCEEAI_API_KEY Arcee AI API key (chat.arcee.ai)
ARCEE_BASE_URL Override Arcee base URL (default: https://api.arcee.ai/api/v1)
GMI_API_KEY GMI Cloud API key (gmicloud.ai)
GMI_BASE_URL Override GMI Cloud base URL (default: https://api.gmi-serving.com/v1)
MINIMAX_API_KEY MiniMax API key — global endpoint (minimax.io). Not used by minimax-oauth (OAuth path uses browser login instead).
MINIMAX_BASE_URL Override MiniMax base URL (default: https://api.minimax.io/anthropic — Hermes uses MiniMax's Anthropic Messages-compatible endpoint). Not used by minimax-oauth.
MINIMAX_CN_API_KEY MiniMax API key — China endpoint (minimaxi.com). Not used by minimax-oauth (OAuth path uses browser login instead).
MINIMAX_CN_BASE_URL Override MiniMax China base URL (default: https://api.minimaxi.com/anthropic). Not used by minimax-oauth.
KILOCODE_API_KEY Kilo Code API key (kilo.ai)
KILOCODE_BASE_URL Override Kilo Code base URL (default: https://api.kilo.ai/api/gateway)
XIAOMI_API_KEY Xiaomi MiMo API key (platform.xiaomimimo.com)
XIAOMI_BASE_URL Override Xiaomi MiMo base URL (default: https://api.xiaomimimo.com/v1)
TOKENHUB_API_KEY Tencent TokenHub API key (tokenhub.tencentmaas.com)
TOKENHUB_BASE_URL Override Tencent TokenHub base URL (default: https://tokenhub.tencentmaas.com/v1)
AZURE_FOUNDRY_API_KEY Azure AI Foundry / Azure OpenAI API key (ai.azure.com)
AZURE_FOUNDRY_BASE_URL Azure AI Foundry endpoint URL (e.g. https://<resource>.openai.azure.com/openai/v1 for OpenAI-style, or https://<resource>.services.ai.azure.com/anthropic for Anthropic-style)
AZURE_ANTHROPIC_KEY Azure Anthropic API key for provider: anthropic + base_url pointing at an Azure Foundry Claude deployment (alternative to ANTHROPIC_API_KEY when both Anthropic and Azure Anthropic are configured)
HF_TOKEN Hugging Face token for Inference Providers (huggingface.co/settings/tokens)
HF_BASE_URL Override Hugging Face base URL (default: https://router.huggingface.co/v1)
GOOGLE_API_KEY Google AI Studio API key (aistudio.google.com/app/apikey)
GEMINI_API_KEY Alias for GOOGLE_API_KEY
GEMINI_BASE_URL Override Google AI Studio base URL
HERMES_GEMINI_CLIENT_ID OAuth client ID for google-gemini-cli PKCE login (optional; defaults to Google's public gemini-cli client)
HERMES_GEMINI_CLIENT_SECRET OAuth client secret for google-gemini-cli (optional)
HERMES_GEMINI_PROJECT_ID GCP project ID for paid Gemini tiers (free tier auto-provisions)
ANTHROPIC_API_KEY Anthropic Console API key (console.anthropic.com)
ANTHROPIC_TOKEN Manual or legacy Anthropic OAuth/setup-token override
DASHSCOPE_API_KEY Alibaba Cloud DashScope API key for Qwen models (modelstudio.console.alibabacloud.com)
DASHSCOPE_BASE_URL Custom DashScope base URL (default: https://dashscope-intl.aliyuncs.com/compatible-mode/v1; use https://dashscope.aliyuncs.com/compatible-mode/v1 for mainland-China region)
DEEPSEEK_API_KEY DeepSeek API key for direct DeepSeek access (platform.deepseek.com)
DEEPSEEK_BASE_URL Custom DeepSeek API base URL
NVIDIA_API_KEY NVIDIA NIM API key — Nemotron and open models (build.nvidia.com)
NVIDIA_BASE_URL Override NVIDIA base URL (default: https://integrate.api.nvidia.com/v1; set to http://localhost:8000/v1 for a local NIM endpoint)
STEPFUN_API_KEY StepFun API key — Step-series models (platform.stepfun.com)
STEPFUN_BASE_URL Override StepFun base URL (default: https://api.stepfun.com/v1)
OLLAMA_API_KEY Ollama Cloud API key — managed Ollama catalog without local GPU (ollama.com/settings/keys)
OLLAMA_BASE_URL Override Ollama Cloud base URL (default: https://ollama.com/v1)
XAI_API_KEY xAI (Grok) API key for chat + TTS (console.x.ai)
XAI_BASE_URL Override xAI base URL (default: https://api.x.ai/v1)
MISTRAL_API_KEY Mistral API key for Voxtral TTS and Voxtral STT (console.mistral.ai)
AWS_REGION AWS region for Bedrock inference (e.g. us-east-1, eu-central-1). Read by boto3.
AWS_PROFILE AWS named profile for Bedrock authentication (reads ~/.aws/credentials). Leave unset to use default boto3 credential chain.
BEDROCK_BASE_URL Override Bedrock runtime base URL (default: https://bedrock-runtime.us-east-1.amazonaws.com; usually leave unset and use AWS_REGION instead)
HERMES_QWEN_BASE_URL Qwen Portal base URL override (default: https://portal.qwen.ai/v1)
OPENCODE_ZEN_API_KEY OpenCode Zen API key — pay-as-you-go access to curated models (opencode.ai)
OPENCODE_ZEN_BASE_URL Override OpenCode Zen base URL
OPENCODE_GO_API_KEY OpenCode Go API key — $10/month subscription for open models (opencode.ai)
OPENCODE_GO_BASE_URL Override OpenCode Go base URL
CLAUDE_CODE_OAUTH_TOKEN Explicit Claude Code token override if you export one manually
HERMES_MODEL Override model name at process level (used by cron scheduler; prefer config.yaml for normal use)
VOICE_TOOLS_OPENAI_KEY Preferred OpenAI key for OpenAI speech-to-text and text-to-speech providers
HERMES_LOCAL_STT_COMMAND Optional local speech-to-text command template. Supports {input_path}, {output_dir}, {language}, and {model} placeholders
HERMES_LOCAL_STT_LANGUAGE Default language passed to HERMES_LOCAL_STT_COMMAND or auto-detected local whisper CLI fallback (default: en)
HERMES_HOME Override Hermes config directory (default: ~/.hermes). Also scopes the gateway PID file and systemd service name, so multiple installations can run concurrently
HERMES_GIT_BASH_PATH Windows only. Override bash.exe discovery for the terminal tool. Points at any bash — full Git-for-Windows install, WSL bash via symlink, MSYS2, Cygwin. The installer sets this automatically to the PortableGit it provisioned. See the Windows (Native) Guide
HERMES_DISABLE_WINDOWS_UTF8 Windows only. Set to 1 to disable the UTF-8 stdio shim (configure_windows_stdio()) and fall back to the console's locale code page. Useful for bisecting encoding bugs; rarely the right setting in normal operation
HERMES_KANBAN_HOME Override the shared Hermes root that anchors the kanban board (db + workspaces + worker logs). Falls back to get_default_hermes_root() (the parent of any active profile). Useful for tests and unusual deployments
HERMES_KANBAN_BOARD Pin the active kanban board for this process. Takes precedence over ~/.hermes/kanban/current; the dispatcher injects this into worker subprocess env so workers physically cannot see tasks on other boards. Defaults to default. Slug validation: lowercase alphanumerics + hyphens + underscores, 1-64 chars
HERMES_KANBAN_DB Pin the kanban database file path directly (highest precedence; beats HERMES_KANBAN_BOARD and HERMES_KANBAN_HOME). The dispatcher injects this into worker subprocess env so profile workers converge on the dispatcher's board
HERMES_KANBAN_WORKSPACES_ROOT Pin the kanban workspaces root directly (highest precedence for workspaces; beats HERMES_KANBAN_HOME). The dispatcher injects this into worker subprocess env

Provider Auth (OAuth)

For native Anthropic auth, Hermes prefers Claude Code's own credential files when they exist because those credentials can refresh automatically. OAuth against Anthropic requires a Claude Max plan with purchased extra usage credits — Hermes routes as Claude Code, which only draws from the Max plan's extra/overage credits, not the base Max allowance, and does not work on Claude Pro. Without Max + extra credits, use an API key instead. Environment variables such as ANTHROPIC_TOKEN remain useful as manual overrides, but they are no longer the preferred path for Claude Max login.

Variable Description
HERMES_INFERENCE_PROVIDER Override provider selection: auto, custom, openrouter, nous, openai-codex, copilot, copilot-acp, anthropic, huggingface, gemini, zai, kimi-coding, kimi-coding-cn, minimax, minimax-cn, minimax-oauth (browser OAuth login — no API key required; see MiniMax OAuth guide), kilocode, xiaomi, arcee, gmi, stepfun, alibaba, alibaba-coding-plan (alias alibaba_coding), deepseek, nvidia, ollama-cloud, xai (alias grok), google-gemini-cli, qwen-oauth, bedrock, opencode-zen, opencode-go, ai-gateway, tencent-tokenhub (default: auto)
HERMES_PORTAL_BASE_URL Override Nous Portal URL (for development/testing)
NOUS_INFERENCE_BASE_URL Override Nous inference API URL
HERMES_NOUS_MIN_KEY_TTL_SECONDS Min agent key TTL before re-mint (default: 1800 = 30min)
HERMES_NOUS_TIMEOUT_SECONDS HTTP timeout for Nous credential / token flows
HERMES_DUMP_REQUESTS Dump API request payloads to log files (true/false)
HERMES_PREFILL_MESSAGES_FILE Path to a JSON file of ephemeral prefill messages injected at API-call time
HERMES_TIMEZONE IANA timezone override (for example America/New_York)

Tool APIs

Variable Description
PARALLEL_API_KEY AI-native web search (parallel.ai)
FIRECRAWL_API_KEY Web scraping and cloud browser (firecrawl.dev)
FIRECRAWL_API_URL Custom Firecrawl API endpoint for self-hosted instances (optional)
TAVILY_API_KEY Tavily API key for AI-native web search, extract, and crawl (app.tavily.com)
SEARXNG_URL SearXNG instance URL for free self-hosted web search — no API key required (searxng.github.io)
TAVILY_BASE_URL Override the Tavily API endpoint. Useful for corporate proxies and self-hosted Tavily-compatible search backends. Same pattern as GROQ_BASE_URL.
EXA_API_KEY Exa API key for AI-native web search and contents (exa.ai)
BROWSERBASE_API_KEY Browser automation (browserbase.com)
BROWSERBASE_PROJECT_ID Browserbase project ID
BROWSER_USE_API_KEY Browser Use cloud browser API key (browser-use.com)
FIRECRAWL_BROWSER_TTL Firecrawl browser session TTL in seconds (default: 300)
BROWSER_CDP_URL Chrome DevTools Protocol URL for local browser (set via /browser connect, e.g. ws://localhost:9222)
CAMOFOX_URL Camofox local anti-detection browser URL (default: http://localhost:9377)
CAMOFOX_USER_ID Optional externally managed Camofox user ID for shared visible sessions
CAMOFOX_SESSION_KEY Optional Camofox session key used when creating tabs for CAMOFOX_USER_ID
CAMOFOX_ADOPT_EXISTING_TAB Set to true to reuse an existing Camofox tab before creating a new one
BROWSER_INACTIVITY_TIMEOUT Browser session inactivity timeout in seconds
FAL_KEY Image generation (fal.ai)
GROQ_API_KEY Groq Whisper STT API key (groq.com)
ELEVENLABS_API_KEY ElevenLabs premium TTS voices (elevenlabs.io)
STT_GROQ_MODEL Override the Groq STT model (default: whisper-large-v3-turbo)
GROQ_BASE_URL Override the Groq OpenAI-compatible STT endpoint
STT_OPENAI_MODEL Override the OpenAI STT model (default: whisper-1)
STT_OPENAI_BASE_URL Override the OpenAI-compatible STT endpoint
GITHUB_TOKEN GitHub token for Skills Hub (higher API rate limits, skill publish)
HONCHO_API_KEY Cross-session user modeling (honcho.dev)
HONCHO_BASE_URL Base URL for self-hosted Honcho instances (default: Honcho cloud). No API key required for local instances
HINDSIGHT_TIMEOUT Timeout in seconds for Hindsight memory-provider API calls (default: 60). Bump this if your Hindsight instance is slow to respond during /sync or on_session_switch and you're seeing timeouts in errors.log.
SUPERMEMORY_API_KEY Semantic long-term memory with profile recall and session ingest (supermemory.ai)
TINKER_API_KEY RL training (tinker-console.thinkingmachines.ai)
WANDB_API_KEY RL training metrics (wandb.ai)
DAYTONA_API_KEY Daytona cloud sandboxes (daytona.io)
VERCEL_TOKEN Vercel Sandbox access token (vercel.com)
VERCEL_PROJECT_ID Vercel project ID (required with VERCEL_TOKEN)
VERCEL_TEAM_ID Vercel team ID (required with VERCEL_TOKEN)
VERCEL_OIDC_TOKEN Vercel short-lived OIDC token (development-only alternative)

Langfuse Observability

Environment variables for the bundled observability/langfuse plugin. Set these with hermes tools → Langfuse Observability or manually in ~/.hermes/.env. The plugin must also be enabled (hermes plugins enable observability/langfuse) before any of these take effect.

Variable Description
HERMES_LANGFUSE_PUBLIC_KEY Langfuse project public key (pk-lf-...). Required.
HERMES_LANGFUSE_SECRET_KEY Langfuse project secret key (sk-lf-...). Required.
HERMES_LANGFUSE_BASE_URL Langfuse server URL (default: https://cloud.langfuse.com). Set for self-hosted.
HERMES_LANGFUSE_ENV Environment tag on traces (production, staging, …)
HERMES_LANGFUSE_RELEASE Release/version tag on traces
HERMES_LANGFUSE_SAMPLE_RATE SDK sampling rate 0.01.0 (default: 1.0)
HERMES_LANGFUSE_MAX_CHARS Per-field truncation for serialized payloads (default: 12000)
HERMES_LANGFUSE_DEBUG true enables verbose plugin logging to agent.log
LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_BASE_URL Standard Langfuse SDK names. Accepted as fallbacks when the HERMES_LANGFUSE_* equivalents are unset.

Nous Tool Gateway

These variables configure the Tool Gateway for paid Nous subscribers or self-hosted gateway deployments. Most users don't need to set these — the gateway is configured automatically via hermes model or hermes tools.

Variable Description
TOOL_GATEWAY_DOMAIN Base domain for Tool Gateway routing (default: nousresearch.com)
TOOL_GATEWAY_SCHEME HTTP or HTTPS scheme for gateway URLs (default: https)
TOOL_GATEWAY_USER_TOKEN Auth token for the Tool Gateway (normally auto-populated from Nous auth)
FIRECRAWL_GATEWAY_URL Override URL for the Firecrawl gateway endpoint specifically

Terminal Backend

Variable Description
TERMINAL_ENV Backend: local, docker, ssh, singularity, modal, daytona, vercel_sandbox
HERMES_DOCKER_BINARY Override the container binary Hermes shells out to (e.g. podman, /usr/local/bin/docker). When unset, Hermes auto-discovers docker or podman on PATH. Needed when both are installed and you want the non-default, or when the binary lives outside PATH.
TERMINAL_DOCKER_IMAGE Docker image (default: nikolaik/python-nodejs:python3.11-nodejs20)
TERMINAL_DOCKER_FORWARD_ENV JSON array of env var names to explicitly forward into Docker terminal sessions. Note: skill-declared required_environment_variables are forwarded automatically — you only need this for vars not declared by any skill.
TERMINAL_DOCKER_VOLUMES Additional Docker volume mounts (comma-separated host:container pairs)
TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE Advanced opt-in: mount the launch cwd into Docker /workspace (true/false, default: false)
TERMINAL_SINGULARITY_IMAGE Singularity image or .sif path
TERMINAL_MODAL_IMAGE Modal container image
TERMINAL_DAYTONA_IMAGE Daytona sandbox image
TERMINAL_VERCEL_RUNTIME Vercel Sandbox runtime (node24, node22, python3.13)
TERMINAL_TIMEOUT Command timeout in seconds
TERMINAL_LIFETIME_SECONDS Max lifetime for terminal sessions in seconds
TERMINAL_CWD Working directory for terminal sessions (gateway/cron only; CLI uses launch dir)
SUDO_PASSWORD Enable sudo without interactive prompt

For cloud sandbox backends, persistence is filesystem-oriented. TERMINAL_LIFETIME_SECONDS controls when Hermes cleans up an idle terminal session, and later resumes may recreate the sandbox rather than keep the same live processes running.

SSH Backend

Variable Description
TERMINAL_SSH_HOST Remote server hostname
TERMINAL_SSH_USER SSH username
TERMINAL_SSH_PORT SSH port (default: 22)
TERMINAL_SSH_KEY Path to private key
TERMINAL_SSH_PERSISTENT Override persistent shell for SSH (default: follows TERMINAL_PERSISTENT_SHELL)

Container Resources (Docker, Singularity, Modal, Daytona)

Variable Description
TERMINAL_CONTAINER_CPU CPU cores (default: 1)
TERMINAL_CONTAINER_MEMORY Memory in MB (default: 5120)
TERMINAL_CONTAINER_DISK Disk in MB (default: 51200)
TERMINAL_CONTAINER_PERSISTENT Persist container filesystem across sessions (default: true)
TERMINAL_SANDBOX_DIR Host directory for workspaces and overlays (default: ~/.hermes/sandboxes/)

Persistent Shell

Variable Description
TERMINAL_PERSISTENT_SHELL Enable persistent shell for non-local backends (default: true). Also settable via terminal.persistent_shell in config.yaml
TERMINAL_LOCAL_PERSISTENT Enable persistent shell for local backend (default: false)
TERMINAL_SSH_PERSISTENT Override persistent shell for SSH backend (default: follows TERMINAL_PERSISTENT_SHELL)

Messaging

Variable Description
TELEGRAM_BOT_TOKEN Telegram bot token (from @BotFather)
TELEGRAM_ALLOWED_USERS Comma-separated user IDs allowed to use the bot (applies to DMs, groups, and forums)
TELEGRAM_GROUP_ALLOWED_USERS Comma-separated sender user IDs authorized in groups/forums only (does NOT grant DM access). Chat-ID-shaped values (starting with -) are still honored as chat IDs for backward compat with pre-#17686 configs, with a deprecation warning.
TELEGRAM_GROUP_ALLOWED_CHATS Comma-separated group/forum chat IDs; any member is authorized
TELEGRAM_HOME_CHANNEL Default Telegram chat/channel for cron delivery
TELEGRAM_HOME_CHANNEL_NAME Display name for the Telegram home channel
TELEGRAM_WEBHOOK_URL Public HTTPS URL for webhook mode (enables webhook instead of polling)
TELEGRAM_WEBHOOK_PORT Local listen port for webhook server (default: 8443)
TELEGRAM_WEBHOOK_SECRET Secret token Telegram echoes back in each update for verification. Required whenever TELEGRAM_WEBHOOK_URL is set — the gateway refuses to start without it (GHSA-3vpc-7q5r-276h). Generate with openssl rand -hex 32.
TELEGRAM_REACTIONS Enable emoji reactions on messages during processing (default: false)
TELEGRAM_REPLY_TO_MODE Reply-reference behavior: off, first (default), or all. Matches the Discord pattern.
TELEGRAM_IGNORED_THREADS Comma-separated Telegram forum topic/thread IDs where the bot never responds
TELEGRAM_PROXY Proxy URL for Telegram connections — overrides HTTPS_PROXY. Supports http://, https://, socks5://
DISCORD_BOT_TOKEN Discord bot token
DISCORD_ALLOWED_USERS Comma-separated Discord user IDs allowed to use the bot
DISCORD_ALLOWED_ROLES Comma-separated Discord role IDs allowed to use the bot (OR with DISCORD_ALLOWED_USERS). Auto-enables the Members intent. Useful when moderation teams churn — role grants propagate automatically.
DISCORD_ALLOWED_CHANNELS Comma-separated Discord channel IDs. When set, the bot only responds in these channels (plus DMs if allowed). Overrides config.yaml discord.allowed_channels.
DISCORD_PROXY Proxy URL for Discord connections — overrides HTTPS_PROXY. Supports http://, https://, socks5://
DISCORD_HOME_CHANNEL Default Discord channel for cron delivery
DISCORD_HOME_CHANNEL_NAME Display name for the Discord home channel
DISCORD_COMMAND_SYNC_POLICY Discord slash-command startup sync policy: safe (diff and reconcile), bulk (legacy tree.sync()), or off
DISCORD_REQUIRE_MENTION Require an @mention before responding in server channels
DISCORD_FREE_RESPONSE_CHANNELS Comma-separated channel IDs where mention is not required
DISCORD_AUTO_THREAD Auto-thread long replies when supported
DISCORD_REACTIONS Enable emoji reactions on messages during processing (default: true)
DISCORD_IGNORED_CHANNELS Comma-separated channel IDs where the bot never responds
DISCORD_NO_THREAD_CHANNELS Comma-separated channel IDs where bot responds without auto-threading
DISCORD_REPLY_TO_MODE Reply-reference behavior: off, first (default), or all
DISCORD_ALLOW_MENTION_EVERYONE Allow the bot to ping @everyone/@here (default: false). See Mention Control.
DISCORD_ALLOW_MENTION_ROLES Allow the bot to ping @role mentions (default: false).
DISCORD_ALLOW_MENTION_USERS Allow the bot to ping individual @user mentions (default: true).
DISCORD_ALLOW_MENTION_REPLIED_USER Ping the author when replying to their message (default: true).
SLACK_BOT_TOKEN Slack bot token (xoxb-...)
SLACK_APP_TOKEN Slack app-level token (xapp-..., required for Socket Mode)
SLACK_ALLOWED_USERS Comma-separated Slack user IDs
SLACK_HOME_CHANNEL Default Slack channel for cron delivery
SLACK_HOME_CHANNEL_NAME Display name for the Slack home channel
GOOGLE_CHAT_PROJECT_ID GCP project hosting the Pub/Sub topic (falls back to GOOGLE_CLOUD_PROJECT)
GOOGLE_CHAT_SUBSCRIPTION_NAME Full Pub/Sub subscription path, projects/{proj}/subscriptions/{sub} (legacy alias: GOOGLE_CHAT_SUBSCRIPTION)
GOOGLE_CHAT_SERVICE_ACCOUNT_JSON Path to Service Account JSON, or the JSON inline (falls back to GOOGLE_APPLICATION_CREDENTIALS)
GOOGLE_CHAT_ALLOWED_USERS Comma-separated user emails allowed to chat with the bot
GOOGLE_CHAT_ALLOW_ALL_USERS Allow any Google Chat user to trigger the bot (dev only)
GOOGLE_CHAT_HOME_CHANNEL Default space (e.g. spaces/AAAA...) for cron delivery
GOOGLE_CHAT_HOME_CHANNEL_NAME Display name for the Google Chat home space
GOOGLE_CHAT_MAX_MESSAGES Pub/Sub FlowControl max in-flight messages (default: 1)
GOOGLE_CHAT_MAX_BYTES Pub/Sub FlowControl max in-flight bytes (default: 16777216, 16 MiB)
GOOGLE_CHAT_BOOTSTRAP_SPACES Comma-separated extra space IDs to probe at startup when resolving the bot's own users/{id}
GOOGLE_CHAT_DEBUG_RAW Set to any value to log redacted Pub/Sub envelopes at DEBUG level (debugging only)
WHATSAPP_ENABLED Enable the WhatsApp bridge (true/false)
WHATSAPP_MODE bot (separate number) or self-chat (message yourself)
WHATSAPP_ALLOWED_USERS Comma-separated phone numbers (with country code, no +), or * to allow all senders
WHATSAPP_ALLOW_ALL_USERS Allow all WhatsApp senders without an allowlist (true/false)
WHATSAPP_DEBUG Log raw message events in the bridge for troubleshooting (true/false)
SIGNAL_HTTP_URL signal-cli daemon HTTP endpoint (for example http://127.0.0.1:8080)
SIGNAL_ACCOUNT Bot phone number in E.164 format
SIGNAL_ALLOWED_USERS Comma-separated E.164 phone numbers or UUIDs
SIGNAL_GROUP_ALLOWED_USERS Comma-separated group IDs, or * for all groups
SIGNAL_HOME_CHANNEL_NAME Display name for the Signal home channel
SIGNAL_IGNORE_STORIES Ignore Signal stories/status updates
SIGNAL_ALLOW_ALL_USERS Allow all Signal users without an allowlist
TWILIO_ACCOUNT_SID Twilio Account SID (shared with telephony skill)
TWILIO_AUTH_TOKEN Twilio Auth Token (shared with telephony skill; also used for webhook signature validation)
TWILIO_PHONE_NUMBER Twilio phone number in E.164 format (shared with telephony skill)
SMS_WEBHOOK_URL Public URL for Twilio signature validation — must match the webhook URL in Twilio Console (required)
SMS_WEBHOOK_PORT Webhook listener port for inbound SMS (default: 8080)
SMS_WEBHOOK_HOST Webhook bind address (default: 0.0.0.0)
SMS_INSECURE_NO_SIGNATURE Set to true to disable Twilio signature validation (local dev only — not for production)
SMS_ALLOWED_USERS Comma-separated E.164 phone numbers allowed to chat
SMS_ALLOW_ALL_USERS Allow all SMS senders without an allowlist
SMS_HOME_CHANNEL Phone number for cron job / notification delivery
SMS_HOME_CHANNEL_NAME Display name for the SMS home channel
EMAIL_ADDRESS Email address for the Email gateway adapter
EMAIL_PASSWORD Password or app password for the email account
EMAIL_IMAP_HOST IMAP hostname for the email adapter
EMAIL_IMAP_PORT IMAP port
EMAIL_SMTP_HOST SMTP hostname for the email adapter
EMAIL_SMTP_PORT SMTP port
EMAIL_ALLOWED_USERS Comma-separated email addresses allowed to message the bot
EMAIL_HOME_ADDRESS Default recipient for proactive email delivery
EMAIL_HOME_ADDRESS_NAME Display name for the email home target
EMAIL_POLL_INTERVAL Email polling interval in seconds
EMAIL_ALLOW_ALL_USERS Allow all inbound email senders
DINGTALK_CLIENT_ID DingTalk bot AppKey from developer portal (open.dingtalk.com)
DINGTALK_CLIENT_SECRET DingTalk bot AppSecret from developer portal
DINGTALK_ALLOWED_USERS Comma-separated DingTalk user IDs allowed to message the bot
FEISHU_APP_ID Feishu/Lark bot App ID from open.feishu.cn
FEISHU_APP_SECRET Feishu/Lark bot App Secret
FEISHU_DOMAIN feishu (China) or lark (international). Default: feishu
FEISHU_CONNECTION_MODE websocket (recommended) or webhook. Default: websocket
FEISHU_ENCRYPT_KEY Optional encryption key for webhook mode
FEISHU_VERIFICATION_TOKEN Optional verification token for webhook mode
FEISHU_ALLOWED_USERS Comma-separated Feishu user IDs allowed to message the bot
FEISHU_ALLOW_BOTS none (default) / mentions / all — accept inbound messages from other bots. See bot-to-bot messaging
FEISHU_REQUIRE_MENTION true (default) / false — whether group messages must @mention the bot. Override per-chat via group_rules.<chat_id>.require_mention.
FEISHU_HOME_CHANNEL Feishu chat ID for cron delivery and notifications
WECOM_BOT_ID WeCom AI Bot ID from admin console
WECOM_SECRET WeCom AI Bot secret
WECOM_WEBSOCKET_URL Custom WebSocket URL (default: wss://openws.work.weixin.qq.com)
WECOM_ALLOWED_USERS Comma-separated WeCom user IDs allowed to message the bot
WECOM_HOME_CHANNEL WeCom chat ID for cron delivery and notifications
WECOM_CALLBACK_CORP_ID WeCom enterprise Corp ID for callback self-built app
WECOM_CALLBACK_CORP_SECRET Corp secret for the self-built app
WECOM_CALLBACK_AGENT_ID Agent ID of the self-built app
WECOM_CALLBACK_TOKEN Callback verification token
WECOM_CALLBACK_ENCODING_AES_KEY AES key for callback encryption
WECOM_CALLBACK_HOST Callback server bind address (default: 0.0.0.0)
WECOM_CALLBACK_PORT Callback server port (default: 8645)
WECOM_CALLBACK_ALLOWED_USERS Comma-separated user IDs for allowlist
WECOM_CALLBACK_ALLOW_ALL_USERS Set true to allow all users without an allowlist
WEIXIN_ACCOUNT_ID Weixin account ID obtained via QR login through iLink Bot API
WEIXIN_TOKEN Weixin authentication token obtained via QR login through iLink Bot API
WEIXIN_BASE_URL Override Weixin iLink Bot API base URL (default: https://ilinkai.weixin.qq.com)
WEIXIN_CDN_BASE_URL Override Weixin CDN base URL for media (default: https://novac2c.cdn.weixin.qq.com/c2c)
WEIXIN_DM_POLICY Direct message policy: open, allowlist, pairing, disabled (default: open)
WEIXIN_GROUP_POLICY Group message policy: open, allowlist, disabled (default: disabled)
WEIXIN_ALLOWED_USERS Comma-separated Weixin user IDs allowed to DM the bot
WEIXIN_GROUP_ALLOWED_USERS Comma-separated Weixin group chat IDs (not member user IDs) allowed to interact with the bot. The variable name is legacy — it expects group IDs. Only takes effect when iLink actually delivers group events; QR-login iLink bot identities (...@im.bot) typically don't receive ordinary WeChat group messages.
WEIXIN_HOME_CHANNEL Weixin chat ID for cron delivery and notifications
WEIXIN_HOME_CHANNEL_NAME Display name for the Weixin home channel
WEIXIN_ALLOW_ALL_USERS Allow all Weixin users without an allowlist (true/false)
BLUEBUBBLES_SERVER_URL BlueBubbles server URL (e.g. http://192.168.1.10:1234)
BLUEBUBBLES_PASSWORD BlueBubbles server password
BLUEBUBBLES_WEBHOOK_HOST Webhook listener bind address (default: 127.0.0.1)
BLUEBUBBLES_WEBHOOK_PORT Webhook listener port (default: 8645)
BLUEBUBBLES_HOME_CHANNEL Phone/email for cron/notification delivery
BLUEBUBBLES_ALLOWED_USERS Comma-separated authorized users
BLUEBUBBLES_ALLOW_ALL_USERS Allow all users (true/false)
QQ_APP_ID QQ Bot App ID from q.qq.com
QQ_CLIENT_SECRET QQ Bot App Secret from q.qq.com
QQ_STT_API_KEY API key for external STT fallback provider (optional, used when QQ built-in ASR returns no text)
QQ_STT_BASE_URL Base URL for external STT provider (optional)
QQ_STT_MODEL Model name for external STT provider (optional)
QQ_ALLOWED_USERS Comma-separated QQ user openIDs allowed to message the bot
QQ_GROUP_ALLOWED_USERS Comma-separated QQ group IDs for group @-message access
QQ_ALLOW_ALL_USERS Allow all users (true/false, overrides QQ_ALLOWED_USERS)
QQBOT_HOME_CHANNEL QQ user/group openID for cron delivery and notifications
QQBOT_HOME_CHANNEL_NAME Display name for the QQ home channel
QQ_PORTAL_HOST Override the QQ portal host (set to sandbox.q.qq.com to route through the sandbox gateway; default: q.qq.com).
MATTERMOST_URL Mattermost server URL (e.g. https://mm.example.com)
MATTERMOST_TOKEN Bot token or personal access token for Mattermost
MATTERMOST_ALLOWED_USERS Comma-separated Mattermost user IDs allowed to message the bot
MATTERMOST_HOME_CHANNEL Channel ID for proactive message delivery (cron, notifications)
MATTERMOST_REQUIRE_MENTION Require @mention in channels (default: true). Set to false to respond to all messages.
MATTERMOST_FREE_RESPONSE_CHANNELS Comma-separated channel IDs where bot responds without @mention
MATTERMOST_REPLY_MODE Reply style: thread (threaded replies) or off (flat messages, default)
MATRIX_HOMESERVER Matrix homeserver URL (e.g. https://matrix.org)
MATRIX_ACCESS_TOKEN Matrix access token for bot authentication
MATRIX_USER_ID Matrix user ID (e.g. @hermes:matrix.org) — required for password login, optional with access token
MATRIX_PASSWORD Matrix password (alternative to access token)
MATRIX_ALLOWED_USERS Comma-separated Matrix user IDs allowed to message the bot (e.g. @alice:matrix.org)
MATRIX_HOME_ROOM Room ID for proactive message delivery (e.g. !abc123:matrix.org)
MATRIX_ENCRYPTION Enable end-to-end encryption (true/false, default: false)
MATRIX_DEVICE_ID Stable Matrix device ID for E2EE persistence across restarts (e.g. HERMES_BOT). Without this, E2EE keys rotate every startup and historic-room decrypt breaks.
MATRIX_REACTIONS Enable processing-lifecycle emoji reactions on inbound messages (default: true). Set to false to disable.
MATRIX_REQUIRE_MENTION Require @mention in rooms (default: true). Set to false to respond to all messages.
MATRIX_FREE_RESPONSE_ROOMS Comma-separated room IDs where bot responds without @mention
MATRIX_AUTO_THREAD Auto-create threads for room messages (default: true)
MATRIX_DM_MENTION_THREADS Create a thread when bot is @mentioned in a DM (default: false)
MATRIX_RECOVERY_KEY Recovery key for cross-signing verification after device key rotation. Recommended for E2EE setups with cross-signing enabled.
HASS_TOKEN Home Assistant Long-Lived Access Token (enables HA platform + tools)
HASS_URL Home Assistant URL (default: http://homeassistant.local:8123)
WEBHOOK_ENABLED Enable the webhook platform adapter (true/false)
WEBHOOK_PORT HTTP server port for receiving webhooks (default: 8644)
WEBHOOK_SECRET Global HMAC secret for webhook signature validation (used as fallback when routes don't specify their own)
API_SERVER_ENABLED Enable the OpenAI-compatible API server (true/false). Runs alongside other platforms.
API_SERVER_KEY Bearer token for API server authentication. Enforced for non-loopback binding.
API_SERVER_CORS_ORIGINS Comma-separated browser origins allowed to call the API server directly (for example http://localhost:3000,http://127.0.0.1:3000). Default: disabled.
API_SERVER_PORT Port for the API server (default: 8642)
API_SERVER_HOST Host/bind address for the API server (default: 127.0.0.1). Use 0.0.0.0 for network access — requires API_SERVER_KEY and a narrow API_SERVER_CORS_ORIGINS allowlist.
API_SERVER_MODEL_NAME Model name advertised on /v1/models. Defaults to the profile name (or hermes-agent for the default profile). Useful for multi-user setups where frontends like Open WebUI need distinct model names per connection.
GATEWAY_PROXY_URL URL of a remote Hermes API server to forward messages to (proxy mode). When set, the gateway handles platform I/O only — all agent work is delegated to the remote server. Also configurable via gateway.proxy_url in config.yaml.
GATEWAY_PROXY_KEY Bearer token for authenticating with the remote API server in proxy mode. Must match API_SERVER_KEY on the remote host.
MESSAGING_CWD Working directory for terminal commands in messaging mode (default: ~)
GATEWAY_ALLOWED_USERS Comma-separated user IDs allowed across all platforms
GATEWAY_ALLOW_ALL_USERS Allow all users without allowlists (true/false, default: false)

Microsoft Graph (Teams Meetings)

App-only credentials for the Microsoft Graph REST client used by the upcoming Teams meeting summary pipeline. See Register a Microsoft Graph application for the Azure portal walkthrough and the exact API permissions required.

Variable Description
MSGRAPH_TENANT_ID Azure AD tenant ID (directory GUID) for the Graph app registration.
MSGRAPH_CLIENT_ID Application (client) ID of the Azure app registration.
MSGRAPH_CLIENT_SECRET Client secret value for the app registration. Store in ~/.hermes/.env with chmod 600; rotate periodically via the Azure portal.
MSGRAPH_SCOPE OAuth2 scope for the client-credentials token request (default: https://graph.microsoft.com/.default).
MSGRAPH_AUTHORITY_URL Microsoft identity platform authority (default: https://login.microsoftonline.com). Override only for national/sovereign clouds (e.g. https://login.microsoftonline.us for GCC High).

Microsoft Graph Webhook Listener

Inbound change-notification listener for Graph events (Teams meetings, calendar, chat, etc.). See Microsoft Graph Webhook Listener for setup and security hardening.

Variable Description
MSGRAPH_WEBHOOK_ENABLED Enable the msgraph_webhook gateway platform (true/1/yes).
MSGRAPH_WEBHOOK_PORT Port the listener binds to (default: 8646).
MSGRAPH_WEBHOOK_CLIENT_STATE Shared secret Graph echoes in every notification; compared with hmac.compare_digest. Generate with openssl rand -hex 32.
MSGRAPH_WEBHOOK_ACCEPTED_RESOURCES Comma-separated allowlist of Graph resource paths/patterns (e.g. communications/onlineMeetings,chats/*/messages). Trailing * is prefix-matching. Empty = accept all.
MSGRAPH_WEBHOOK_ALLOWED_SOURCE_CIDRS Comma-separated CIDR ranges allowed to POST to the listener (e.g. 52.96.0.0/14,52.104.0.0/14). Empty = allow all (default). Restrict to Microsoft Graph's published egress ranges in production.

Teams Meeting Summary Delivery

Only used when the teams_pipeline plugin is enabled. Settings are also configurable under platforms.teams.extra in config.yaml — env vars take priority when both are set. See Microsoft Teams → Meeting Summary Delivery.

Variable Description
TEAMS_DELIVERY_MODE graph or incoming_webhook.
TEAMS_INCOMING_WEBHOOK_URL Teams-generated webhook URL; required when TEAMS_DELIVERY_MODE=incoming_webhook.
TEAMS_GRAPH_ACCESS_TOKEN Pre-acquired delegated access token for Graph delivery. Rarely needed — the writer falls back to the MSGRAPH_* app credentials when unset.
TEAMS_TEAM_ID Target Team ID for channel delivery (graph mode).
TEAMS_CHANNEL_ID Target channel ID (paired with TEAMS_TEAM_ID).
TEAMS_CHAT_ID Target 1:1 or group chat ID (alternative to team+channel for graph mode).

LINE Messaging API

Used by the bundled LINE platform plugin (plugins/platforms/line/). See Messaging Gateway → LINE for full setup.

Variable Description
LINE_CHANNEL_ACCESS_TOKEN Long-lived channel access token from the LINE Developers Console (Messaging API tab). Required.
LINE_CHANNEL_SECRET Channel secret (Basic settings tab); used for HMAC-SHA256 webhook signature verification. Required.
LINE_HOST Webhook bind host (default: 0.0.0.0).
LINE_PORT Webhook bind port (default: 8646).
LINE_PUBLIC_URL Public HTTPS base URL (e.g. https://my-tunnel.example.com). Required for image / audio / video sends — LINE only accepts HTTPS-reachable URLs.
LINE_ALLOWED_USERS Comma-separated user IDs allowed to DM the bot (U-prefixed).
LINE_ALLOWED_GROUPS Comma-separated group IDs the bot will respond in (C-prefixed).
LINE_ALLOWED_ROOMS Comma-separated room IDs the bot will respond in (R-prefixed).
LINE_ALLOW_ALL_USERS Dev-only escape hatch — accepts any source. Default: false.
LINE_HOME_CHANNEL Default delivery target for cron jobs with deliver: line.
LINE_SLOW_RESPONSE_THRESHOLD Seconds before the slow-LLM Template Buttons postback fires (default: 45). Set 0 to disable and always Push-fallback.
LINE_PENDING_TEXT Bubble text shown alongside the postback button.
LINE_BUTTON_LABEL Postback button label (default: Get answer).
LINE_DELIVERED_TEXT Reply when an already-delivered postback is tapped again (default: Already replied ✅).
LINE_INTERRUPTED_TEXT Reply when a /stop-orphaned postback button is tapped (default: Run was interrupted before completion.).

Advanced Messaging Tuning

Advanced per-platform knobs for throttling the outbound message batcher. Most users never need to touch these; defaults are set to respect each platform's rate limits without feeling sluggish.

Variable Description
HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS Grace window before flushing a queued Telegram text chunk (default: 0.6).
HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS Delay between split chunks when a single Telegram message exceeds the length limit (default: 2.0).
HERMES_TELEGRAM_MEDIA_BATCH_DELAY_SECONDS Grace window before flushing queued Telegram media (default: 0.6).
HERMES_TELEGRAM_FOLLOWUP_GRACE_SECONDS Delay before sending a follow-up after the agent finishes, to avoid racing the last stream chunk.
HERMES_TELEGRAM_HTTP_CONNECT_TIMEOUT / _READ_TIMEOUT / _WRITE_TIMEOUT / _POOL_TIMEOUT Override the underlying python-telegram-bot HTTP timeouts (seconds).
HERMES_TELEGRAM_HTTP_POOL_SIZE Max concurrent HTTP connections to the Telegram API.
HERMES_TELEGRAM_DISABLE_FALLBACK_IPS Disable the hard-coded Cloudflare fallback IPs used when DNS fails (true/false).
HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS Grace window before flushing a queued Discord text chunk (default: 0.6).
HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS Delay between split chunks when a Discord message exceeds the length limit (default: 2.0).
HERMES_MATRIX_TEXT_BATCH_DELAY_SECONDS / _SPLIT_DELAY_SECONDS Matrix equivalents of the Telegram batch knobs.
HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS / _SPLIT_DELAY_SECONDS / _MAX_CHARS / _MAX_MESSAGES Feishu batcher tuning — delay, split delay, max chars per message, max messages per batch.
HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS Feishu media flush delay.
HERMES_FEISHU_DEDUP_CACHE_SIZE Size of the Feishu webhook dedup cache (default: 1024).
HERMES_WECOM_TEXT_BATCH_DELAY_SECONDS / _SPLIT_DELAY_SECONDS WeCom batcher tuning.
HERMES_VISION_DOWNLOAD_TIMEOUT Timeout in seconds for downloading an image before handing it to vision models (default: 30).
HERMES_RESTART_DRAIN_TIMEOUT Gateway: seconds to wait for active runs to drain on /restart before forcing the restart (default: 900).
HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT Per-platform connect timeout during gateway startup (seconds).
HERMES_GATEWAY_BUSY_INPUT_MODE Default gateway busy-input behavior: queue, steer, or interrupt. Can be overridden per chat with /busy.
HERMES_GATEWAY_BUSY_ACK_ENABLED Whether the gateway sends an acknowledgment message (//) when a user sends input while the agent is busy (default: true). Set to false to suppress these messages entirely — the input is still queued/steered/interrupts as normal, only the chat reply is silenced. Bridged from display.busy_ack_enabled in config.yaml.
HERMES_FILE_MUTATION_VERIFIER Enable the per-turn file-mutation verifier footer (default: true). When enabled, Hermes appends an advisory listing any write_file / patch calls that failed during the turn and were not superseded by a successful write. Set to 0, false, no, or off to suppress. Mirrors display.file_mutation_verifier in config.yaml; the env var wins when set.
HERMES_CRON_TIMEOUT Inactivity timeout for cron job agent runs in seconds (default: 600). The agent can run indefinitely while actively calling tools or receiving stream tokens — this only triggers when idle. Set to 0 for unlimited.
HERMES_CRON_SCRIPT_TIMEOUT Timeout for pre-run scripts attached to cron jobs in seconds (default: 120). Override for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via cron.script_timeout_seconds in config.yaml.
HERMES_CRON_MAX_PARALLEL Max cron jobs run in parallel per tick (default: 4).

Agent Behavior

Variable Description
HERMES_MAX_ITERATIONS Max tool-calling iterations per conversation (default: 90)
HERMES_INFERENCE_MODEL Override model name at process level (takes priority over config.yaml for the session). Also settable via -m/--model flag.
HERMES_YOLO_MODE Set to 1 to bypass dangerous-command approval prompts. Equivalent to --yolo.
HERMES_ACCEPT_HOOKS Auto-approve any unseen shell hooks declared in config.yaml without a TTY prompt. Equivalent to --accept-hooks or hooks_auto_accept: true.
HERMES_IGNORE_USER_CONFIG Skip ~/.hermes/config.yaml and use built-in defaults (credentials in .env still load). Equivalent to --ignore-user-config.
HERMES_IGNORE_RULES Skip auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills. Equivalent to --ignore-rules.
HERMES_MD_NAMES Comma-separated list of rules-file names to auto-inject (default: AGENTS.md,CLAUDE.md,.cursorrules,SOUL.md).
HERMES_TOOL_PROGRESS Deprecated compatibility variable for tool progress display. Prefer display.tool_progress in config.yaml.
HERMES_TOOL_PROGRESS_MODE Deprecated compatibility variable for tool progress mode. Prefer display.tool_progress in config.yaml.
HERMES_HUMAN_DELAY_MODE Response pacing: off/natural/custom
HERMES_HUMAN_DELAY_MIN_MS Custom delay range minimum (ms)
HERMES_HUMAN_DELAY_MAX_MS Custom delay range maximum (ms)
HERMES_QUIET Suppress non-essential output (true/false)
CODEX_HOME When Codex app-server runtime is enabled, override the directory Codex CLI reads its config + auth from (default: ~/.codex). Hermes' migration writes the managed block to <CODEX_HOME>/config.toml.
HERMES_KANBAN_TASK Set by the kanban dispatcher when spawning a worker (task UUID). Workers and the spawned hermes-tools MCP subprocess inherit it so kanban tools gate correctly. Don't set manually.
HERMES_API_TIMEOUT LLM API call timeout in seconds (default: 1800)
HERMES_API_CALL_STALE_TIMEOUT Non-streaming stale-call timeout in seconds (default: 300). Auto-disabled for local providers when left unset. Also configurable via providers.<id>.stale_timeout_seconds or providers.<id>.models.<model>.stale_timeout_seconds in config.yaml.
HERMES_STREAM_READ_TIMEOUT Streaming socket read timeout in seconds (default: 120). Auto-increased to HERMES_API_TIMEOUT for local providers. Increase if local LLMs time out during long code generation.
HERMES_STREAM_STALE_TIMEOUT Stale stream detection timeout in seconds (default: 180). Auto-disabled for local providers. Triggers connection kill if no chunks arrive within this window.
HERMES_STREAM_RETRIES Number of mid-stream reconnect attempts on transient network errors (default: 3).
HERMES_AGENT_TIMEOUT Gateway inactivity timeout for a running agent in seconds (default: 900). Resets on every tool call and streamed token. Set to 0 to disable.
HERMES_AGENT_TIMEOUT_WARNING Gateway: send a warning message after this many seconds of inactivity (default: 75% of HERMES_AGENT_TIMEOUT).
HERMES_AGENT_NOTIFY_INTERVAL Gateway: interval in seconds between progress notifications on long-running agent turns.
HERMES_CHECKPOINT_TIMEOUT Timeout for filesystem checkpoint creation in seconds (default: 30).
HERMES_EXEC_ASK Enable execution approval prompts in gateway mode (true/false)
HERMES_ENABLE_PROJECT_PLUGINS Enable auto-discovery of repo-local plugins from ./.hermes/plugins/ (true/false, default: false)
HERMES_PLUGINS_DEBUG 1/true to surface verbose plugin-discovery logs on stderr — directories scanned, manifests parsed, skip reasons, and full tracebacks on parse or register() failure. Aimed at plugin authors.
HERMES_BACKGROUND_NOTIFICATIONS Background process notification mode in gateway: all (default), result, error, off
HERMES_EPHEMERAL_SYSTEM_PROMPT Ephemeral system prompt injected at API-call time (never persisted to sessions)
HERMES_PREFILL_MESSAGES_FILE Path to a JSON file of ephemeral prefill messages injected at API-call time.
HERMES_ALLOW_PRIVATE_URLS true/false — allow tools to fetch localhost/private-network URLs. Off by default in gateway mode.
HERMES_REDACT_SECRETS true/false — control secret redaction in tool output, logs, and chat responses (default: true).
HERMES_WRITE_SAFE_ROOT Optional directory prefix that restricts write_file/patch writes; paths outside require approval.
HERMES_DISABLE_FILE_STATE_GUARD Set to 1 to turn off the "file changed since you read it" guard on patch/write_file.
HERMES_CORE_TOOLS Comma-separated override for the canonical core tool list (advanced; rarely needed).
HERMES_BUNDLED_SKILLS Comma-separated override for the list of bundled skills loaded at startup.
HERMES_OPTIONAL_SKILLS Comma-separated list of optional-skill names to auto-install on first run.
HERMES_DEBUG_INTERRUPT Set to 1 to log detailed interrupt/cancel tracing to agent.log.
HERMES_DUMP_REQUESTS Dump API request payloads to log files (true/false)
HERMES_DUMP_REQUEST_STDOUT Dump API request payloads to stdout instead of log files.
HERMES_OAUTH_TRACE Set to 1 to log OAuth token exchange and refresh attempts. Includes redacted timing info.
HERMES_OAUTH_FILE Override the path used for OAuth credential storage (default: ~/.hermes/auth.json).
HERMES_AGENT_HELP_GUIDANCE Append additional guidance text to the system prompt for custom deployments.
HERMES_AGENT_LOGO Override the ASCII banner logo at CLI startup.
DELEGATION_MAX_CONCURRENT_CHILDREN Max parallel subagents per delegate_task batch (default: 3, floor of 1, no ceiling). Also configurable via delegation.max_concurrent_children in config.yaml — the config value takes priority.

Interface

Variable Description
HERMES_TUI Launch the TUI instead of the classic CLI when set to 1. Equivalent to passing --tui.
HERMES_TUI_DIR Path to a prebuilt ui-tui/ directory (must contain dist/entry.js and populated node_modules). Used by distros and Nix to skip the first-launch npm install.
HERMES_TUI_RESUME Resume a specific TUI session by ID on launch. When set, hermes --tui skips forging a fresh session and picks up the named session instead — useful for re-attaching after a disconnect or terminal crash.
HERMES_TUI_THEME Force the TUI color theme: light, dark, or a raw 6-character background hex (e.g. ffffff or 1a1a2e). When unset, Hermes auto-detects using COLORFGBG and terminal background queries; this variable overrides detection on terminals (Ghostty, Warp, iTerm2, etc.) that don't set COLORFGBG.
HERMES_INFERENCE_MODEL Force the model for hermes -z / hermes chat without mutating config.yaml. Pairs with HERMES_INFERENCE_PROVIDER. Useful for scripted callers (sweeper, CI, batch runners) that need to override the default model per run.

Session Settings

Variable Description
SESSION_IDLE_MINUTES Reset sessions after N minutes of inactivity (default: 1440)
SESSION_RESET_HOUR Daily reset hour in 24h format (default: 4 = 4am)

Context Compression (config.yaml only)

Context compression is configured exclusively through config.yaml — there are no environment variables for it. Threshold settings live in the compression: block, while the summarization model/provider lives under auxiliary.compression:.

compression:
  enabled: true
  threshold: 0.50
  target_ratio: 0.20         # fraction of threshold to preserve as recent tail
  protect_last_n: 20         # minimum recent messages to keep uncompressed

:::info Legacy migration Older configs with compression.summary_model, compression.summary_provider, and compression.summary_base_url are automatically migrated to auxiliary.compression.* on first load. :::

Auxiliary Task Overrides

Variable Description
AUXILIARY_VISION_PROVIDER Override provider for vision tasks
AUXILIARY_VISION_MODEL Override model for vision tasks
AUXILIARY_VISION_BASE_URL Direct OpenAI-compatible endpoint for vision tasks
AUXILIARY_VISION_API_KEY API key paired with AUXILIARY_VISION_BASE_URL
AUXILIARY_WEB_EXTRACT_PROVIDER Override provider for web extraction/summarization
AUXILIARY_WEB_EXTRACT_MODEL Override model for web extraction/summarization
AUXILIARY_WEB_EXTRACT_BASE_URL Direct OpenAI-compatible endpoint for web extraction/summarization
AUXILIARY_WEB_EXTRACT_API_KEY API key paired with AUXILIARY_WEB_EXTRACT_BASE_URL

For task-specific direct endpoints, Hermes uses the task's configured API key or OPENAI_API_KEY. It does not reuse OPENROUTER_API_KEY for those custom endpoints.

Fallback Providers (config.yaml only)

The primary model fallback chain is configured exclusively through config.yaml — there are no environment variables for it. Add a top-level fallback_providers list with provider and model keys to enable automatic failover when your main model encounters errors.

fallback_providers:
  - provider: openrouter
    model: anthropic/claude-sonnet-4

The older top-level fallback_model single-provider shape is still read for backward compatibility, but new configuration should use fallback_providers.

See Fallback Providers for full details.

Provider Routing (config.yaml only)

These go in ~/.hermes/config.yaml under the provider_routing section:

Key Description
sort Sort providers: "price" (default), "throughput", or "latency"
only List of provider slugs to allow (e.g., ["anthropic", "google"])
ignore List of provider slugs to skip
order List of provider slugs to try in order
require_parameters Only use providers supporting all request params (true/false)
data_collection "allow" (default) or "deny" to exclude data-storing providers

:::tip Use hermes config set to set environment variables — it automatically saves them to the right file (.env for secrets, config.yaml for everything else). :::