The 300s default was too tight for high-reasoning models on non-trivial
delegated tasks — e.g. gpt-5.5 xhigh reviewing 12 files would burn >5min
on reasoning tokens before issuing its first tool call, tripping the
hard wall-clock timeout with 0 api_calls logged.
- tools/delegate_tool.py: DEFAULT_CHILD_TIMEOUT 300 -> 600
- hermes_cli/config.py: surface delegation.child_timeout_seconds in
DEFAULT_CONFIG so it's discoverable (previously the key was read by
_get_child_timeout() but absent from the default config schema)
Users can still override via config.yaml delegation.child_timeout_seconds
or DELEGATION_CHILD_TIMEOUT_SECONDS env var (floor 30s, no ceiling).
cmd_update no longer SIGKILLs in-flight agent runs, and users get
'still working' status every 3 min instead of 10. Two long-standing
sources of '@user — agent gives up mid-task' reports on Telegram and
other gateways.
Drain-aware update:
- New helper hermes_cli.gateway._graceful_restart_via_sigusr1(pid,
drain_timeout) sends SIGUSR1 to the gateway and polls os.kill(pid,
0) until the process exits or the budget expires.
- cmd_update's systemd loop now reads MainPID via 'systemctl show
--property=MainPID --value' and tries the graceful path first. The
gateway's existing SIGUSR1 handler -> request_restart(via_service=
True) -> drain -> exit(75) is wired in gateway/run.py and is
respawned by systemd's Restart=on-failure (and the explicit
RestartForceExitStatus=75 on newer units).
- Falls back to 'systemctl restart' when MainPID is unknown, the
drain budget elapses, or the unit doesn't respawn after exit (older
units missing Restart=on-failure). Old install behavior preserved.
- Drain budget = max(restart_drain_timeout, 30s) + 15s margin so the
drain loop in run_agent + final exit have room before fallback
fires. Composes with #14728's tool-subprocess reaping.
Notification interval:
- agent.gateway_notify_interval default 600 -> 180.
- HERMES_AGENT_NOTIFY_INTERVAL env-var fallback in gateway/run.py
matched.
- 9-minute weak-model spinning runs now ping at 3 min and 6 min
instead of 27 seconds before completion, removing the 'is the bot
dead?' reflex that drives gateway-restart cycles.
Tests:
- Two new tests in tests/hermes_cli/test_update_gateway_restart.py:
one asserts SIGUSR1 is sent and 'systemctl restart' is NOT called
when MainPID is known and the helper succeeds; one asserts the
fallback fires when the helper returns False.
- E2E: spawned detached bash processes confirm the helper returns
True on SIGUSR1-handling exit (~0.5s) and False on SIGUSR1-ignoring
processes (timeout). Verified non-existent PID and pid=0 edge cases.
- 41/41 in test_update_gateway_restart.py (was 39, +2 new).
- 154/154 in shutdown-related suites including #14728's new tests.
Reported by @GeoffWellman and @ANT_1515 on X.
Closes#11616.
The agent's API retry loop hardcoded max_retries = 3, so users with
fallback providers on flaky primaries burned through ~3 × provider
timeout (e.g. 3 × 180s = 9 minutes) before their fallback chain got a
chance to kick in.
Expose a new config key:
agent:
api_max_retries: 3 # default unchanged
Set it to 1 for fast failover when you have fallback providers, or
raise it if you prefer longer tolerance on a single provider. Values
< 1 are clamped to 1 (single attempt, no retry); non-integer values
fall back to the default.
This wraps the Hermes-level retry loop only — the OpenAI SDK's own
low-level retries (max_retries=2 default) still run beneath this for
transient network errors.
Changes:
- hermes_cli/config.py: add agent.api_max_retries default 3 with comment.
- run_agent.py: read self._api_max_retries in AIAgent.__init__; replace
hardcoded max_retries = 3 in the retry loop with self._api_max_retries.
- cli-config.yaml.example: documented example entry.
- hermes_cli/tips.py: discoverable tip line.
- tests/run_agent/test_api_max_retries_config.py: 4 tests covering
default, override, clamp-to-one, and invalid-value fallback.
## Merged
Adds MiMo v2.5-pro and v2.5 support to Xiaomi native provider, OpenCode Go, and setup wizard.
### Changes
- Context lengths: added v2.5-pro (1M) and v2.5 (1M), corrected existing MiMo entries to exact values (262144)
- Provider lists: xiaomi, opencode-go, setup wizard
- Vision: upgraded from mimo-v2-omni to mimo-v2.5 (omnimodal)
- Config description updated for XIAOMI_API_KEY
- Tests updated for new vision model preference
### Verification
- 4322 tests passed, 0 new regressions
- Live API tested on Xiaomi portal: basic, reasoning, tool calling, multi-tool, file ops, system prompt, vision — all pass
- Self-review found and fixed 2 issues (redundant vision check, stale HuggingFace context length)
Replaces the blanket 'always allow' change from the previous commit with
an opt-in config flag so users who want belt-and-suspenders security can
still get the keyword scan on skill_manage output.
## Default behavior (flag off)
skill_manage(action='create'|'edit'|'patch') no longer runs the keyword
scanner. The agent can write skills that mention risky keywords in prose
(documenting what reviewers should watch for, describing cache-bust
semantics in a PR-review skill, referencing AGENTS.md, etc.) without
getting blocked.
Rationale: the agent can already execute the same code paths via
terminal() with no gate, so the scan adds friction without meaningful
security against a compromised or malicious agent.
## Opt-in behavior (flag on)
Set skills.guard_agent_created: true in config.yaml to get the original
behavior back. Scanner runs on every skill_manage write; dangerous
verdicts surface as a tool error the agent can react to (retry without
the flagged content).
## External hub installs unaffected
trusted/community sources (hermes skills install) always get scanned
regardless of this flag. The gate is specifically for skill_manage,
which only agents call.
## Changes
- hermes_cli/config.py: add skills.guard_agent_created: False to DEFAULT_CONFIG
- tools/skill_manager_tool.py: _guard_agent_created_enabled() reads the flag;
_security_scan_skill() short-circuits to None when the flag is off
- tools/skills_guard.py: restore INSTALL_POLICY['agent-created'] =
('allow', 'allow', 'ask') so the scan remains strict when it does run
- tests/tools/test_skills_guard.py: restore original ask/force tests
- tests/tools/test_skill_manager_tool.py: new TestSecurityScanGate class
covering both flag states + config error handling
## Validation
- tests/tools/test_skills_guard.py + test_skill_manager_tool.py: 115/115 pass
- E2E: flagged-keyword skill creates with default config, blocks with flag on
The environment-snapshot login shell was auto-sourcing only ~/.bashrc when
building the PATH snapshot. On Debian/Ubuntu the default ~/.bashrc starts
with a non-interactive short-circuit:
case $- in *i*) ;; *) return;; esac
Sourcing it from a non-interactive shell returns before any PATH export
below that guard runs. Node version managers like n and nvm append their
PATH line under that guard, so Hermes was capturing a PATH without
~/n/bin — and the terminal tool saw 'node: command not found' even when
node was on the user's interactive shell PATH.
Expand the auto-source list (when auto_source_bashrc is on) to:
~/.profile → ~/.bash_profile → ~/.bashrc
~/.profile and ~/.bash_profile have no interactivity guard — installers
that write their PATH there (n's n-install, nvm's curl installer on most
setups) take effect. ~/.bashrc still runs last to preserve behaviour for
users who put PATH logic there without the guard.
Added two tests covering the new behaviour plus an E2E test that spins up
a real LocalEnvironment with a guard-prefixed ~/.bashrc and a ~/.profile
PATH export, and verifies the captured snapshot PATH contains the profile
entry.
_normalize_custom_provider_entry silently drops the models field when it's
a list. Hand-edited configs (and the shape used by older Hermes versions)
still write models as a plain list of ids, so after the normalize pass the
entry reaches list_authenticated_providers() with no models and /model
shows the provider with (0) models — even though the underlying picker
code handles lists fine.
Convert list-format models into the empty-value dict shape the rest of
the pipeline already expects. Dict-format entries keep passing through
unchanged.
Repro (before the fix):
custom_providers:
- name: acme
base_url: https://api.example.com/v1
models: [foo, bar, baz]
/model shows "acme (0)"; bypassing normalize in list_authenticated_providers
returns three models, confirming the drop happens in normalize.
Adds four unit tests covering list→dict conversion, dict pass-through,
filtering of empty/non-string entries, and the empty-list case.
When a user manually sets fallback_model as a YAML list instead of a
dict, save_config_value() crashes with:
AttributeError: 'list' object has no attribute 'get'
at the fb.get('provider') call on hermes_cli/config.py.
The fix adds isinstance(fb, dict) so list-format values are treated as
unconfigured — the fallback_model comment block is appended to guide
correct usage — instead of crashing.
Fixes#4091
Co-authored-by: [AI-assisted — Claude Sonnet 4.6 via Milo/Hermes]
Follow-up on helix4u's PR #14211:
- Flip default to true: narrowing toolsets=['web','browser'] expresses
'I want these extras', not 'silently strip MCP'. Parent MCP tools
(registered at runtime) should survive narrowing by default.
- Drop _config_version bump (22->23); additive nested key under
delegation.* is handled by _deep_merge, no migration needed.
- Update tests to reflect new default behavior.
Adds security.allow_private_urls / HERMES_ALLOW_PRIVATE_URLS toggle so
users on OpenWrt routers, TUN-mode proxies (Clash/Mihomo/Sing-box),
corporate split-tunnel VPNs, and Tailscale networks — where DNS resolves
public domains to 198.18.0.0/15 or 100.64.0.0/10 — can use web_extract,
browser, vision URL fetching, and gateway media downloads.
Single toggle in tools/url_safety.py; all 23 is_safe_url() call sites
inherit automatically. Cached for process lifetime.
Cloud metadata endpoints stay ALWAYS blocked regardless of the toggle:
169.254.169.254 (AWS/GCP/Azure/DO/Oracle), 169.254.170.2 (AWS ECS task
IAM creds), 169.254.169.253 (Azure IMDS wire server), 100.100.100.200
(Alibaba), fd00:ec2::254 (AWS IPv6), the entire 169.254.0.0/16
link-local range, and the metadata.google.internal / metadata.goog
hostnames (checked pre-DNS so they can't be bypassed on networks where
those names resolve to local IPs).
Supersedes #3779 (narrower HERMES_ALLOW_RFC2544 for the same class of
users).
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
* feat(state): auto-prune old sessions + VACUUM state.db at startup
state.db accumulates every session, message, and FTS5 index entry forever.
A heavy user (gateway + cron) reported 384MB with 982 sessions / 68K messages
causing slowdown; manual 'hermes sessions prune --older-than 7' + VACUUM
brought it to 43MB. The prune command and VACUUM are not wired to run
automatically anywhere — sessions grew unbounded until users noticed.
Changes:
- hermes_state.py: new state_meta key/value table, vacuum() method, and
maybe_auto_prune_and_vacuum() — idempotent via last-run timestamp in
state_meta so it only actually executes once per min_interval_hours
across all Hermes processes for a given HERMES_HOME. Never raises.
- hermes_cli/config.py: new 'sessions:' block in DEFAULT_CONFIG
(auto_prune=True, retention_days=90, vacuum_after_prune=True,
min_interval_hours=24). Added to _KNOWN_ROOT_KEYS.
- cli.py: call maintenance once at HermesCLI init (shared helper
_run_state_db_auto_maintenance reads config and delegates to DB).
- gateway/run.py: call maintenance once at GatewayRunner init.
- Docs: user-guide/sessions.md rewrites 'Automatic Cleanup' section.
Why VACUUM matters: SQLite does NOT shrink the file on DELETE — freed
pages get reused on next INSERT. Without VACUUM, a delete-heavy DB stays
bloated forever. VACUUM only runs when the prune actually removed rows,
so tight DBs don't pay the I/O cost.
Tests: 10 new tests in tests/test_hermes_state.py covering state_meta,
vacuum, idempotency, interval skipping, VACUUM-only-when-needed,
corrupt-marker recovery. All 246 existing state/config/gateway tests
still pass.
Verified E2E with real imports + isolated HERMES_HOME: DEFAULT_CONFIG
exposes the new block, load_config() returns it for fresh installs,
first call prunes+vacuums, second call within min_interval_hours skips,
and the state_meta marker persists across connection close/reopen.
* sessions.auto_prune defaults to false (opt-in)
Session history powers session_search recall across past conversations,
so silently pruning on startup could surprise users. Ship the machinery
disabled and let users opt in when they notice state.db is hurting
performance.
- DEFAULT_CONFIG.sessions.auto_prune: True → False
- Call-site fallbacks in cli.py and gateway/run.py match the new default
(so unmigrated configs still see off)
- Docs: flip 'Enable in config.yaml' framing + tip explains the tradeoff
Adds a first-class 'stepfun' API-key provider surfaced as Step Plan:
- Support Step Plan setup for both International and China regions
- Discover Step Plan models live from /step_plan/v1/models, with a
small coding-focused fallback catalog when discovery is unavailable
- Thread StepFun through provider metadata, setup persistence, status
and doctor output, auxiliary routing, and model normalization
- Add tests for provider resolution, model validation, metadata
mapping, and StepFun region/model persistence
Based on #6005 by @hengm3467.
Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>
A single global MAX_TEXT_LENGTH = 4000 truncated every TTS provider at
4000 chars, causing long inputs to be silently chopped even though the
underlying APIs allow much more:
- OpenAI: 4096
- xAI: 15000
- MiniMax: 10000
- ElevenLabs: 5000 / 10000 / 30000 / 40000 (model-aware)
- Gemini: ~5000
- Edge: ~5000
The schema description also told the model 'Keep under 4000 characters',
which encouraged the agent to self-chunk long briefs into multiple TTS
calls (producing 3 separate audio files instead of one).
New behavior:
- PROVIDER_MAX_TEXT_LENGTH table + ELEVENLABS_MODEL_MAX_TEXT_LENGTH
encode the documented per-provider limits.
- _resolve_max_text_length(provider, cfg) resolves:
1. tts.<provider>.max_text_length user override
2. ElevenLabs model_id lookup
3. provider default
4. 4000 fallback
- text_to_speech_tool() and stream_tts_to_speaker() both call the
resolver; old MAX_TEXT_LENGTH alias kept for back-compat.
- Schema description no longer hardcodes 4000.
Tests: 27 new unit + E2E tests; all 53 existing TTS tests and 253
voice-command/voice-cli tests still pass.
Adds role='leaf'|'orchestrator' to delegate_task. With max_spawn_depth>=2,
an orchestrator child retains the 'delegation' toolset and can spawn its
own workers; leaf children cannot delegate further (identical to today).
Default posture is flat — max_spawn_depth=1 means a depth-0 parent's
children land at the depth-1 floor and orchestrator role silently
degrades to leaf. Users opt into nested delegation by raising
max_spawn_depth to 2 or 3 in config.yaml.
Also threads acp_command/acp_args through the main agent loop's delegate
dispatch (previously silently dropped in the schema) via a new
_dispatch_delegate_task helper, and adds a DelegateEvent enum with
legacy-string back-compat for gateway/ACP/CLI progress consumers.
Config (hermes_cli/config.py defaults):
delegation.max_concurrent_children: 3 # floor-only, no upper cap
delegation.max_spawn_depth: 1 # 1=flat (default), 2-3 unlock nested
delegation.orchestrator_enabled: true # global kill switch
Salvaged from @pefontana's PR #11215. Overrides vs. the original PR:
concurrency stays at 3 (PR bumped to 5 + cap 8 — we keep the floor only,
no hard ceiling); max_spawn_depth defaults to 1 (PR defaulted to 2 which
silently enabled one level of orchestration for every user).
Co-authored-by: pefontana <fontana.pedro93@gmail.com>
Full AST-based scan of all .py files to find every case where a module
or name is imported locally inside a function body but is already
available at module level. This is the second pass — the first commit
handled the known cases from the lint report; this one catches
everything else.
Files changed (19):
cli.py — 16 removals: time as _time/_t/_tmod (×10),
re / re as _re (×2), os as _os, sys,
partial os from combo import,
from model_tools import get_tool_definitions
gateway/run.py — 8 removals: MessageEvent as _ME /
MessageType as _MT (×3), os as _os2,
MessageEvent+MessageType (×2), Platform,
BasePlatformAdapter as _BaseAdapter
run_agent.py — 6 removals: get_hermes_home as _ghh,
partial (contextlib, os as _os),
cleanup_vm, cleanup_browser,
set_interrupt as _sif (×2),
partial get_toolset_for_tool
hermes_cli/main.py — 4 removals: get_hermes_home, time as _time,
logging as _log, shutil
hermes_cli/config.py — 1 removal: get_hermes_home as _ghome
hermes_cli/runtime_provider.py
— 1 removal: load_config as _load_bedrock_config
hermes_cli/setup.py — 2 removals: importlib.util (×2)
hermes_cli/nous_subscription.py
— 1 removal: from hermes_cli.config import load_config
hermes_cli/tools_config.py
— 1 removal: from hermes_cli.config import load_config, save_config
cron/scheduler.py — 3 removals: concurrent.futures, json as _json,
from hermes_cli.config import load_config
batch_runner.py — 1 removal: list_distributions as get_all_dists
(kept print_distribution_info, not at top level)
tools/send_message_tool.py
— 2 removals: import os (×2)
tools/skills_tool.py — 1 removal: logging as _logging
tools/browser_camofox.py
— 1 removal: from hermes_cli.config import load_config
tools/image_generation_tool.py
— 1 removal: import fal_client
environments/tool_context.py
— 1 removal: concurrent.futures
gateway/platforms/bluebubbles.py
— 1 removal: httpx as _httpx
gateway/platforms/whatsapp.py
— 1 removal: import asyncio
tui_gateway/server.py — 2 removals: from datetime import datetime,
import time
All alias references (_time, _t, _tmod, _re, _os, _os2, _json, _ghh,
_ghome, _sif, _ME, _MT, _BaseAdapter, _load_bedrock_config, _httpx,
_logging, _log, get_all_dists) updated to use the top-level names.
Sweep ~74 redundant local imports across 21 files where the same module
was already imported at the top level. Also includes type fixes and lint
cleanups on the same branch.
* feat(skills): inject absolute skill dir and expand ${HERMES_SKILL_DIR} templates
When a skill loads, the activation message now exposes the absolute
skill directory and substitutes ${HERMES_SKILL_DIR} /
${HERMES_SESSION_ID} tokens in the SKILL.md body, so skills with
bundled scripts can instruct the agent to run them by absolute path
without an extra skill_view round-trip.
Also adds opt-in inline-shell expansion: !`cmd` snippets in SKILL.md
are pre-executed (with the skill directory as CWD) and their stdout is
inlined into the message before the agent reads it. Off by default —
enable via skills.inline_shell in config.yaml — because any snippet
runs on the host without approval.
Changes:
- agent/skill_commands.py: template substitution, inline-shell
expansion, absolute skill-dir header, supporting-files list now
shows both relative and absolute forms.
- hermes_cli/config.py: new skills.template_vars,
skills.inline_shell, skills.inline_shell_timeout knobs.
- tests/agent/test_skill_commands.py: coverage for header, both
template tokens (present and missing session id), template_vars
disable, inline-shell default-off, enabled, CWD, and timeout.
- website/docs/developer-guide/creating-skills.md: documents the
template tokens, the absolute-path header, and the opt-in inline
shell with its security caveat.
Validation: tests/agent/ 1591 passed (includes 9 new tests).
E2E: loaded a real skill in an isolated HERMES_HOME; confirmed
${HERMES_SKILL_DIR} resolves to the absolute path, ${HERMES_SESSION_ID}
resolves to the passed task_id, !`date` runs when opt-in is set, and
stays literal when it isn't.
* feat(terminal): source ~/.bashrc (and user-listed init files) into session snapshot
bash login shells don't source ~/.bashrc, so tools that install themselves
there — nvm, asdf, pyenv, cargo, custom PATH exports — stay invisible to
the environment snapshot Hermes builds once per session. Under systemd
or any context with a minimal parent env, that surfaces as
'node: command not found' in the terminal tool even though the binary
is reachable from every interactive shell on the machine.
Changes:
- tools/environments/local.py: before the login-shell snapshot bootstrap
runs, prepend guarded 'source <file>' lines for each resolved init
file. Missing files are skipped, each source is wrapped with a
'[ -r ... ] && . ... || true' guard so a broken rc can't abort the
bootstrap.
- hermes_cli/config.py: new terminal.shell_init_files (explicit list,
supports ~ and ${VAR}) and terminal.auto_source_bashrc (default on)
knobs. When shell_init_files is set it takes precedence; when it's
empty and auto_source_bashrc is on, ~/.bashrc gets auto-sourced.
- tests/tools/test_local_shell_init.py: 10 tests covering the resolver
(auto-bashrc, missing file, explicit override, ~/${VAR} expansion,
opt-out) and the prelude builder (quoting, guarded sourcing), plus
a real-LocalEnvironment snapshot test that confirms exports in the
init file land in subsequent commands' environment.
- website/docs/reference/faq.md: documents the fix in Troubleshooting,
including the zsh-user pattern of sourcing ~/.zshrc or nvm.sh
directly via shell_init_files.
Validation: 10/10 new tests pass; tests/tools/test_local_*.py 40/40
pass; tests/agent/ 1591/1591 pass; tests/hermes_cli/test_config.py
50/50 pass. E2E in an isolated HERMES_HOME: confirmed that a fake
~/.bashrc setting a marker var and PATH addition shows up in a real
LocalEnvironment().execute() call, that auto_source_bashrc=false
suppresses it, that an explicit shell_init_files entry wins over the
auto default, and that a missing bashrc is silently skipped.
Users can declare shell scripts in config.yaml under a hooks: block that
fire on plugin-hook events (pre_tool_call, post_tool_call, pre_llm_call,
subagent_stop, etc). Scripts receive JSON on stdin, can return JSON on
stdout to block tool calls or inject context pre-LLM.
Key design:
- Registers closures on existing PluginManager._hooks dict — zero changes
to invoke_hook() call sites
- subprocess.run(shell=False) via shlex.split — no shell injection
- First-use consent per (event, command) pair, persisted to allowlist JSON
- Bypass via --accept-hooks, HERMES_ACCEPT_HOOKS=1, or hooks_auto_accept
- hermes hooks list/test/revoke/doctor CLI subcommands
- Adds subagent_stop hook event fired after delegate_task children exit
- Claude Code compatible response shapes accepted
Cherry-picked from PR #13143 by @pefontana.
Replaces the serial for-loop in tick() with ThreadPoolExecutor so all
jobs due in a single tick run concurrently. A slow job no longer blocks
others from executing, fixing silent job skipping (issue #9086).
Thread safety:
- Session/delivery env vars migrated from os.environ to ContextVars
(gateway/session_context.py) so parallel jobs can't clobber each
other's delivery targets. Each thread gets its own copied context.
- jobs.json read-modify-write cycles (advance_next_run, mark_job_run)
protected by threading.Lock to prevent concurrent save clobber.
- send_message_tool reads delivery vars via get_session_env() for
ContextVar-aware resolution with os.environ fallback.
Configuration:
- cron.max_parallel_jobs in config.yaml (null = unbounded, 1 = serial)
- HERMES_CRON_MAX_PARALLEL env var override
Based on PR #9169 by @VenomMoth1.
Fixes#9086
Cherry-picked from PR #9359 by @luyao618.
- Accept camelCase aliases (apiKey, baseUrl, apiMode, keyEnv, defaultModel,
contextLength, rateLimitDelay) with auto-mapping to snake_case + warning
- Validate URL field values with urlparse (scheme + netloc check) — reject
non-URL strings like 'openai-reverse-proxy' that were silently accepted
- Warn on unknown keys in provider config entries
- Re-order URL field priority: base_url > url > api (was api > url > base_url)
- 12 new tests covering all scenarios
Closes#9332
Plugins now require explicit consent to load. Discovery still finds every
plugin — user-installed, bundled, and pip — so they all show up in
`hermes plugins` and `/plugins`, but the loader only instantiates
plugins whose name appears in `plugins.enabled` in config.yaml. This
removes the previous ambient-execution risk where a newly-installed or
bundled plugin could register hooks, tools, and commands on first run
without the user opting in.
The three-state model is now explicit:
enabled — in plugins.enabled, loads on next session
disabled — in plugins.disabled, never loads (wins over enabled)
not enabled — discovered but never opted in (default for new installs)
`hermes plugins install <repo>` prompts "Enable 'name' now? [y/N]"
(defaults to no). New `--enable` / `--no-enable` flags skip the prompt
for scripted installs. `hermes plugins enable/disable` manage both lists
so a disabled plugin stays explicitly off even if something later adds
it to enabled.
Config migration (schema v20 → v21): existing user plugins already
installed under ~/.hermes/plugins/ (minus anything in plugins.disabled)
are auto-grandfathered into plugins.enabled so upgrades don't silently
break working setups. Bundled plugins are NOT grandfathered — even
existing users have to opt in explicitly.
Also: HERMES_DISABLE_BUNDLED_PLUGINS env var removed (redundant with
opt-in default), cmd_list now shows bundled + user plugins together with
their three-state status, interactive UI tags bundled entries
[bundled], docs updated across plugins.md and built-in-plugins.md.
Validation: 442 plugin/config tests pass. E2E: fresh install discovers
disk-cleanup but does not load it; `hermes plugins enable disk-cleanup`
activates hooks; migration grandfathers existing user plugins correctly
while leaving bundled plugins off.
Smart model routing (auto-routing short/simple turns to a cheap model
across providers) was opt-in and disabled by default. This removes the
feature wholesale: the routing module, its config keys, docs, tests, and
the orchestration scaffolding it required in cli.py / gateway/run.py /
cron/scheduler.py.
The /fast (Priority Processing / Anthropic fast mode) feature kept its
hooks into _resolve_turn_agent_config — those still build a route dict
and attach request_overrides when the model supports it; the route now
just always uses the session's primary model/provider rather than
running prompts through choose_cheap_model_route() first.
Also removed:
- DEFAULT_CONFIG['smart_model_routing'] block and matching commented-out
example sections in hermes_cli/config.py and cli-config.yaml.example
- _load_smart_model_routing() / self._smart_model_routing on GatewayRunner
- self._smart_model_routing / self._active_agent_route_signature on
HermesCLI (signature kept; just no longer initialised through the
smart-routing pipeline)
- route_label parameter on HermesCLI._init_agent (only set by smart
routing; never read elsewhere)
- 'Smart Model Routing' section in website/docs/integrations/providers.md
- tip in hermes_cli/tips.py
- entries in hermes_cli/dump.py + hermes_cli/web_server.py
- row in skills/autonomous-ai-agents/hermes-agent/SKILL.md
Tests:
- Deleted tests/agent/test_smart_model_routing.py
- Rewrote tests/agent/test_credential_pool_routing.py to target the
simplified _resolve_turn_agent_config directly (preserves credential
pool propagation + 429 rotation coverage)
- Dropped 'cheap model' test from test_cli_provider_resolution.py
- Dropped resolve_turn_route patches from cli + gateway test_fast_command
— they now exercise the real method end-to-end
- Removed _smart_model_routing stub assignments from gateway/cron test
helpers
Targeted suites: 74/74 in the directly affected test files;
tests/agent + tests/cron + tests/cli pass except 5 failures that
already exist on main (cron silent-delivery + alias quick-command).
* feat: add Discord server introspection and management tool
Add a discord_server tool that gives the agent the ability to interact
with Discord servers when running on the Discord gateway. Uses Discord
REST API directly with the bot token — no dependency on the gateway
adapter's discord.py client.
The tool is only included in the hermes-discord toolset (zero cost for
users on other platforms) and gated on DISCORD_BOT_TOKEN via check_fn.
Actions (14):
- Introspection: list_guilds, server_info, list_channels, channel_info,
list_roles, member_info, search_members
- Messages: fetch_messages, list_pins, pin_message, unpin_message
- Management: create_thread, add_role, remove_role
This addresses a gap where users on Discord could not ask Hermes to
review server structure, channels, roles, or members — a task competing
agents (OpenClaw) handle out of the box.
Files changed:
- tools/discord_tool.py (new): Tool implementation + registration
- model_tools.py: Add to discovery list
- toolsets.py: Add to hermes-discord toolset only
- tests/tools/test_discord_tool.py (new): 43 tests covering all actions,
validation, error handling, registration, and toolset scoping
* feat(discord): intent-aware schema filtering + config allowlist + schema cleanup
- _detect_capabilities() hits GET /applications/@me once per process
to read GUILD_MEMBERS / MESSAGE_CONTENT privileged intent bits.
- Schema is rebuilt per-session in model_tools.get_tool_definitions:
hides search_members / member_info when GUILD_MEMBERS intent is off,
annotates fetch_messages description when MESSAGE_CONTENT is off.
- New config key discord.server_actions (comma-separated or YAML list)
lets users restrict which actions the agent can call, intersected
with intent availability. Unknown names are warned and dropped.
- Defense-in-depth: runtime handler re-checks the allowlist so a stale
cached schema cannot bypass a tightened config.
- Schema description rewritten as an action-first manifest (signature
per action) instead of per-parameter 'required for X, Y, Z' cross-refs.
~25% shorter; model can see each action's required params at a glance.
- Added bounds: limit gets minimum=1 maximum=100, auto_archive_duration
becomes an enum of the 4 valid Discord values.
- 403 enrichment: runtime 403 errors are mapped to actionable guidance
(which permission is missing and what to do about it) instead of the
raw Discord error body.
- 36 new tests: capability detection with caching and force refresh,
config allowlist parsing (string/list/invalid/unknown), intent+allowlist
intersection, dynamic schema build, runtime allowlist enforcement,
403 enrichment, and model_tools integration wiring.
Add approvals.cron_mode config option that controls how cron jobs handle
dangerous commands. Previously, cron jobs silently auto-approved all
dangerous commands because there was no user present to approve them.
Now the behavior is configurable:
- deny (default): block dangerous commands and return a message telling
the agent to find an alternative approach. The agent loop continues —
it just can't use that specific command.
- approve: auto-approve all dangerous commands (previous behavior).
When a command is blocked, the agent receives the same response format as
a user denial in the CLI — exit_code=-1, status=blocked, with a message
explaining why and pointing to the config option. This keeps the agent
loop running and encourages it to adapt.
Implementation:
- config.py: add approvals.cron_mode to DEFAULT_CONFIG
- scheduler.py: set HERMES_CRON_SESSION=1 env var before agent runs
- approval.py: both check_command_approval() and check_all_command_guards()
now check for cron sessions and apply the configured mode
- 21 new tests covering config parsing, deny/approve behavior, and
interaction with other bypass mechanisms (yolo, containers)
Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.
Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:
project (new default):
- cwd = session's TERMINAL_CWD (falls back to os.getcwd())
- python = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
with a Python 3.8+ version check; falls back cleanly to
sys.executable if no venv or the candidate fails
- result : 'import pandas' works, '.env' resolves, matches terminal()
strict (opt-in):
- cwd = staging tmpdir (today's behavior)
- python = sys.executable (today's behavior)
- result : maximum reproducibility and isolation; project deps
won't resolve
Security-critical invariants are identical across both modes and covered by
explicit regression tests:
- env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD,
*_CREDENTIAL, *_PASSWD, *_AUTH substrings)
- SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
delegate_task, no MCP from inside scripts)
- resource caps (5-min timeout, 50KB stdout, 50 tool calls)
Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).
Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
Previously users had to hand-edit config.yaml to route individual auxiliary
tasks (vision, compression, web_extract, etc.) to a specific provider+model.
Add a first-class picker reachable from the bottom of the existing `hermes
model` provider list.
Flow:
hermes model
→ Configure auxiliary models...
→ <task picker: 9 tasks, shows current setting inline>
→ <provider picker: authenticated providers + auto + custom>
→ <model picker: curated list + live pricing>
The aux picker does NOT re-run credential/OAuth setup; users authenticate
providers through the normal `hermes model` flow, then route aux tasks to
them here. `list_authenticated_providers()` gates the list to providers
the user has configured.
Also:
- 'Cancel' entry relabeled 'Leave unchanged' (sentinel still 'cancel'
internally, so dispatch logic is unchanged)
- 'Reset all to auto' entry to bulk-clear aux overrides; preserves
user-tuned timeout / download_timeout values
- Adds `title_generation` task to DEFAULT_CONFIG.auxiliary — the task
was called from agent/title_generator.py but was missing from defaults,
so config-backed timeout overrides never worked for it
Co-authored-by: teknium1 <teknium@nousresearch.com>
Follow-up to WideLee's salvaged PR #11582.
Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename:
- gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL
with a one-shot deprecation warning so users on the old name aren't
silently broken.
- cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new
name; _get_home_target_chat_id falls back to the legacy name via a
_LEGACY_HOME_TARGET_ENV_VARS table.
- hermes_cli/status.py + hermes_cli/setup.py: honor both names when
displaying or checking for missing home channels.
- hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in
_EXTRA_ENV_KEYS so .env sanitization still recognizes them.
Scope cleanup:
- Drop qrcode from core dependencies and requirements.txt (remains in
messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades
gracefully when qrcode is missing, printing a 'pip install qrcode' tip
and falling back to URL-only display.
- Restore @staticmethod on QQAdapter._detect_message_type (it doesn't
use self). Revert the test change that was only needed when it was
converted to an instance method.
- Reset uv.lock to origin/main; the PR's stale lock also included
unrelated changes (atroposlib source URL, hermes-agent version bump,
fastapi additions) that don't belong.
Verified E2E:
- Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the
legacy name; deprecation warning logs once.
- Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name,
no warning.
- Both set: new name wins on both surfaces.
Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).
Gaps closed:
- hermes_cli/main.py:
- Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
provider flow (previously fell through silently).
- Add 'nvidia' to `hermes chat --provider` argparse choices so the
documented test command (`hermes chat --provider nvidia --model ...`)
parses successfully.
- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
auto-added to the subprocess env blocklist.
- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
`hermes doctor` probes https://integrate.api.nvidia.com/v1/models.
- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
`hermes dump` credential masking.
- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.
- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
so all Nemotron variants get 128K context via substring match (rather
than falling back to MINIMUM_CONTEXT_LENGTH).
- hermes_cli/models.py: Fix hallucinated model ID
'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
(verified against live integrate.api.nvidia.com/v1/models catalog).
Expand curated list from 5 to 9 agentic models mapping to OpenRouter
defaults per provider-guide convention: add qwen3.5-397b-a17b,
deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.
- cli-config.yaml.example: Document 'nvidia' provider option.
- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
for CI attribution.
E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
Replace the HERMES_ENABLE_NOUS_MANAGED_TOOLS env-var feature flag with
subscription-based detection. The Tool Gateway is now available to any
paid Nous subscriber without needing a hidden env var.
Core changes:
- managed_nous_tools_enabled() checks get_nous_auth_status() +
check_nous_free_tier() instead of an env var
- New use_gateway config flag per tool section (web, tts, browser,
image_gen) records explicit user opt-in and overrides direct API
keys at runtime
- New prefers_gateway(section) shared helper in tool_backend_helpers.py
used by all 4 tool runtimes (web, tts, image gen, browser)
UX flow:
- hermes model: after Nous login/model selection, shows a curses
prompt listing all gateway-eligible tools with current status.
User chooses to enable all, enable only unconfigured tools, or skip.
Defaults to Enable for new users, Skip when direct keys exist.
- hermes tools: provider selection now manages use_gateway flag —
selecting Nous Subscription sets it, selecting any other provider
clears it
- hermes status: renamed section to Nous Tool Gateway, added
free-tier upgrade nudge for logged-in free users
- curses_radiolist: new description parameter for multi-line context
that survives the screen clear
Runtime behavior:
- Each tool runtime (web_tools, tts_tool, image_generation_tool,
browser_use) checks prefers_gateway() before falling back to
direct env-var credentials
- get_nous_subscription_features() respects use_gateway flags,
suppressing direct credential detection when the user opted in
Removed:
- HERMES_ENABLE_NOUS_MANAGED_TOOLS env var and all references
- apply_nous_provider_defaults() silent TTS auto-set
- get_nous_subscription_explainer_lines() static text
- Override env var warnings (use_gateway handles this properly now)
config.yaml terminal.cwd is now the single source of truth for working
directory. MESSAGING_CWD and TERMINAL_CWD in .env are deprecated with a
migration warning.
Changes:
1. config.py: Remove MESSAGING_CWD from OPTIONAL_ENV_VARS (setup wizard
no longer prompts for it). Add warn_deprecated_cwd_env_vars() that
prints a migration hint when deprecated env vars are detected.
2. gateway/run.py: Replace all MESSAGING_CWD reads with TERMINAL_CWD
(which is bridged from config.yaml terminal.cwd). MESSAGING_CWD is
still accepted as a backward-compat fallback with deprecation warning.
Config bridge skips cwd placeholder values so they don't clobber
the resolved TERMINAL_CWD.
3. cli.py: Guard against lazy-import clobbering — when cli.py is
imported lazily during gateway runtime (via delegate_tool), don't
let load_cli_config() overwrite an already-resolved TERMINAL_CWD
with os.getcwd() of the service's working directory. (#10817)
4. hermes_cli/main.py: Add 'hermes memory reset' command with
--target all/memory/user and --yes flags. Profile-scoped via
HERMES_HOME.
Migration path for users with .env settings:
Remove MESSAGING_CWD / TERMINAL_CWD from .env
Add to config.yaml:
terminal:
cwd: /your/project/path
Addresses: #10225, #4672, #10817, #7663
Camofox automatically maps each userId to a persistent Firefox profile
on the server side — no CAMOFOX_PROFILE_DIR env var exists. Our docs
incorrectly told users to configure this on the server.
Removed the fabricated env var from:
- browser docs (:::note block)
- config.py DEFAULT_CONFIG comment
- test docstring
Add a theme engine for the web dashboard that mirrors the CLI skin
engine philosophy — pure data, no code changes needed for new themes.
Frontend:
- ThemeProvider context that loads active theme from backend on mount
and applies CSS variable overrides to document.documentElement
- ThemeSwitcher dropdown component in the header (next to language
switcher) with instant preview on click
- 6 built-in themes: Hermes Teal (default), Midnight, Ember, Mono,
Cyberpunk, Rosé — each defines all 21 color tokens + overlay settings
- Theme types, presets, and context in web/src/themes/
Backend:
- GET /api/dashboard/themes — returns available themes + active name
- PUT /api/dashboard/theme — persists selection to config.yaml
- User custom themes discoverable from ~/.hermes/dashboard-themes/*.yaml
- Theme list endpoint added to public API paths (no auth needed)
Config:
- dashboard.theme key in DEFAULT_CONFIG (default: 'default')
- Schema override for select dropdown in config page
- Category merged into 'display' tab in config UI
i18n: theme switcher strings added for en + zh.
Pass platform_env_var="TELEGRAM_PROXY" to resolve_proxy_url() in both
telegram.py (main connect) and telegram_network.py (fallback transport),
so a Telegram-specific proxy takes priority over the generic HTTPS_PROXY.
Also bridge telegram.proxy_url from config.yaml to the TELEGRAM_PROXY
env var (env var takes precedence if both are set), add OPTIONAL_ENV_VARS
entry, docs, and tests.
Composite salvage of four community PRs:
- Core approach (both call sites): #9414 by @leeyang1990
- config.yaml bridging + docs: #6530 by @WhiteWorld
- Naming convention: #9074 by @brantzh6
- Earlier proxy work: #7786 by @ten-ltw
Closes#9414, closes#9074, closes#7786, closes#6530
Co-authored-by: WhiteWorld <WhiteWorld@users.noreply.github.com>
Co-authored-by: brantzh6 <brantzh6@users.noreply.github.com>
Co-authored-by: ten-ltw <ten-ltw@users.noreply.github.com>
atomic_yaml_write() and atomic_json_write() used tempfile.mkstemp()
which creates files with 0o600 (owner-only). After os.replace(), the
original file's permissions were destroyed. Combined with _secure_file()
forcing 0o600, this broke Docker/NAS setups where volume-mounted config
files need broader permissions (e.g. 0o666).
Changes:
- atomic_yaml_write/atomic_json_write: capture original permissions
before write, restore after os.replace()
- _secure_file: skip permission tightening in container environments
(detected via /.dockerenv, /proc/1/cgroup, or HERMES_SKIP_CHMOD env)
- save_env_value: preserve original .env permissions, remove redundant
third os.chmod call
- remove_env_value: same permission preservation
On desktop installs, _secure_file() still tightens to 0o600 as before.
In containers, the user's original permissions are respected.
Reported by Cedric Weber (Docker/Portainer on NAS).
Extract resolve_channel_prompt() shared helper into
gateway/platforms/base.py. Refactor Discord to use it.
Wire channel_prompts into Telegram (groups + forum topics),
Slack (channels), and Mattermost (channels).
Config bridging now applies to all platforms (not just Discord).
Added channel_prompts defaults to telegram/slack/mattermost
config sections.
Docs added to all four platform pages with platform-specific
examples (topic inheritance for Telegram, channel IDs for Slack,
etc.).