- Replace async create_bind_task/poll_bind_result with synchronous
httpx.Client equivalents, eliminating manual event loop management
- Move _render_qr and full qr_register() entry-point into onboard.py,
mirroring the Feishu onboarding pattern
- Remove _qqbot_render_qr and _qqbot_qr_flow from gateway.py (~90 lines);
call site becomes a single qr_register() import
- Fix potential segfault: previous code called loop.close() in the EXPIRED
branch and again in the finally block (double-close crashed under uvloop)
* feat(state): auto-prune old sessions + VACUUM state.db at startup
state.db accumulates every session, message, and FTS5 index entry forever.
A heavy user (gateway + cron) reported 384MB with 982 sessions / 68K messages
causing slowdown; manual 'hermes sessions prune --older-than 7' + VACUUM
brought it to 43MB. The prune command and VACUUM are not wired to run
automatically anywhere — sessions grew unbounded until users noticed.
Changes:
- hermes_state.py: new state_meta key/value table, vacuum() method, and
maybe_auto_prune_and_vacuum() — idempotent via last-run timestamp in
state_meta so it only actually executes once per min_interval_hours
across all Hermes processes for a given HERMES_HOME. Never raises.
- hermes_cli/config.py: new 'sessions:' block in DEFAULT_CONFIG
(auto_prune=True, retention_days=90, vacuum_after_prune=True,
min_interval_hours=24). Added to _KNOWN_ROOT_KEYS.
- cli.py: call maintenance once at HermesCLI init (shared helper
_run_state_db_auto_maintenance reads config and delegates to DB).
- gateway/run.py: call maintenance once at GatewayRunner init.
- Docs: user-guide/sessions.md rewrites 'Automatic Cleanup' section.
Why VACUUM matters: SQLite does NOT shrink the file on DELETE — freed
pages get reused on next INSERT. Without VACUUM, a delete-heavy DB stays
bloated forever. VACUUM only runs when the prune actually removed rows,
so tight DBs don't pay the I/O cost.
Tests: 10 new tests in tests/test_hermes_state.py covering state_meta,
vacuum, idempotency, interval skipping, VACUUM-only-when-needed,
corrupt-marker recovery. All 246 existing state/config/gateway tests
still pass.
Verified E2E with real imports + isolated HERMES_HOME: DEFAULT_CONFIG
exposes the new block, load_config() returns it for fresh installs,
first call prunes+vacuums, second call within min_interval_hours skips,
and the state_meta marker persists across connection close/reopen.
* sessions.auto_prune defaults to false (opt-in)
Session history powers session_search recall across past conversations,
so silently pruning on startup could surprise users. Ship the machinery
disabled and let users opt in when they notice state.db is hurting
performance.
- DEFAULT_CONFIG.sessions.auto_prune: True → False
- Call-site fallbacks in cli.py and gateway/run.py match the new default
(so unmigrated configs still see off)
- Docs: flip 'Enable in config.yaml' framing + tip explains the tradeoff
Adds a first-class 'stepfun' API-key provider surfaced as Step Plan:
- Support Step Plan setup for both International and China regions
- Discover Step Plan models live from /step_plan/v1/models, with a
small coding-focused fallback catalog when discovery is unavailable
- Thread StepFun through provider metadata, setup persistence, status
and doctor output, auxiliary routing, and model normalization
- Add tests for provider resolution, model validation, metadata
mapping, and StepFun region/model persistence
Based on #6005 by @hengm3467.
Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>
* feat(plugins): pluggable image_gen backends + OpenAI provider
Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:
- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
`image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
`plugin.yaml` of their own).
Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.
FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.
- 41 unit tests (scanner recursion, kind parsing, gate logic,
registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
`_handle_image_generate` routes to OpenAI when configured
* fix(image_gen/openai): don't send response_format to gpt-image-*
The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.
* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog
gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.
Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.
* feat(image_gen/openai): expose gpt-image-2 as three quality tiers
Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:
gpt-image-2-low ~15s fast iteration
gpt-image-2-medium ~40s default
gpt-image-2-high ~2min highest fidelity
Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.
Config:
image_gen.openai.model: gpt-image-2-high
# or
image_gen.model: gpt-image-2-low
# or env var for scripts/tests
OPENAI_IMAGE_MODEL=gpt-image-2-medium
Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.
* feat(tools_config): plugin image_gen providers inject themselves into picker
'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.
Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
(name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
writes image_gen.provider and routes to the plugin's list_models()
catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
FAL key when any plugin provider reports is_available().
FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.
Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.
397 tests pass across plugins/, tools_config, registry, and picker.
* fix(image_gen): close final gaps for plugin-backend parity with FAL
Two small places that still hardcoded FAL:
- hermes_cli/setup.py status line: an OpenAI-only setup showed
'Image Generation: missing FAL_KEY'. Now probes plugin providers
and reports '(OpenAI)' when one is_available() — or falls back to
'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.
- image_generate tool schema description: said 'using FAL.ai, default
FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
user-configured' — and notes the 'image' field can be a URL or an
absolute path, which the gateway delivers either way via
extract_local_files().
Surfaces the free variant alongside the paid minimax-m2.5 entry in
both the OPENROUTER_MODELS fallback snapshot and the nous/openrouter
provider model list.
Remove nvidia/nemotron-3-super-120b-a12b:free, arcee-ai/trinity-large-preview:free,
and openrouter/elephant-alpha from _PROVIDER_MODELS['nous']. The paid nemotron and
arcee-thinking variants remain.
Wire the auxiliary client (compaction, vision, session search, web extract)
to the Nous Portal's curated recommended-models endpoint when running on
Nous Portal, with a TTL-cached fetch that mirrors how we pull /models for
pricing.
hermes_cli/models.py
- fetch_nous_recommended_models(portal_base_url, force_refresh=False)
10-minute TTL cache, keyed per portal URL (staging vs prod don't
collide). Public endpoint, no auth required. Returns {} on any
failure so callers always get a dict.
- get_nous_recommended_aux_model(vision, free_tier=None, ...)
Tier-aware pick from the payload:
- Paid tier → paidRecommended{Vision,Compaction}Model, falling back
to freeRecommended* when the paid field is null (common during
staged rollouts of new paid models).
- Free tier → freeRecommended* only, never leaks paid models.
When free_tier is None, auto-detects via the existing
check_nous_free_tier() helper (already cached 3 min against
/api/oauth/account). Detection errors default to paid so we never
silently downgrade a paying user.
agent/auxiliary_client.py — _try_nous()
- Replaces the hardcoded xiaomi/mimo free-tier branch with a single call
to get_nous_recommended_aux_model(vision=vision).
- Falls back to _NOUS_MODEL (google/gemini-3-flash-preview) when the
Portal is unreachable or returns a null recommendation.
- The Portal is now the source of truth for aux model selection; the
xiaomi allowlist we used to carry is effectively dead.
Tests (15 new)
- tests/hermes_cli/test_models.py::TestNousRecommendedModels
Fetch caching, per-portal keying, network failure, force_refresh;
paid-prefers-paid, paid-falls-to-free, free-never-leaks-paid,
auto-detect, detection-error → paid default, null/blank modelName
handling.
- tests/agent/test_auxiliary_client.py::TestNousAuxiliaryRefresh
_try_nous honors Portal recommendation for text + vision, falls
back to google/gemini-3-flash-preview on None or exception.
Behavior won't visibly change today — both tier recommendations currently
point at google/gemini-3-flash-preview — but the moment the Portal ships
a better paid recommendation, subscribers pick it up within 10 minutes
without a Hermes release.
Drop _NOUS_ALLOWED_FREE_MODELS + filter_nous_free_models and its two call
sites. Whatever Nous Portal prices as free now shows up in the picker as-is
— no local allowlist gatekeeping. Free-tier partitioning (paid vs free in
the menu) still runs via partition_nous_models_by_tier.
Follow-ups after salvaging xiaoqiang243's kimi-for-coding patches:
- KIMI_CODE_BASE_URL: drop trailing /v1 (was /coding/v1).
The /coding endpoint speaks Anthropic Messages, and the Anthropic SDK
appends /v1/messages internally. /coding/v1 + SDK suffix produced
/coding/v1/v1/messages (a 404). /coding + SDK suffix now yields
/coding/v1/messages correctly.
- kimi-coding ProviderConfig: keep legacy default api.moonshot.ai/v1 so
non-sk-kimi- moonshot keys still authenticate. sk-kimi- keys are
already redirected to api.kimi.com/coding via _resolve_kimi_base_url.
- doctor.py: update Kimi UA to claude-code/0.1.0 (was KimiCLI/1.30.0)
and rewrite /coding base URLs to /coding/v1 for the /models health
check (Anthropic surface has no /models).
- test_kimi_env_vars: accept KIMI_CODING_API_KEY as a secondary env var.
E2E verified:
sk-kimi-<key> → https://api.kimi.com/coding/v1/messages (Anthropic)
sk-<legacy> → https://api.moonshot.ai/v1/chat/completions (OpenAI)
UA: claude-code/0.1.0, x-api-key: <sk-kimi-*>
A single global MAX_TEXT_LENGTH = 4000 truncated every TTS provider at
4000 chars, causing long inputs to be silently chopped even though the
underlying APIs allow much more:
- OpenAI: 4096
- xAI: 15000
- MiniMax: 10000
- ElevenLabs: 5000 / 10000 / 30000 / 40000 (model-aware)
- Gemini: ~5000
- Edge: ~5000
The schema description also told the model 'Keep under 4000 characters',
which encouraged the agent to self-chunk long briefs into multiple TTS
calls (producing 3 separate audio files instead of one).
New behavior:
- PROVIDER_MAX_TEXT_LENGTH table + ELEVENLABS_MODEL_MAX_TEXT_LENGTH
encode the documented per-provider limits.
- _resolve_max_text_length(provider, cfg) resolves:
1. tts.<provider>.max_text_length user override
2. ElevenLabs model_id lookup
3. provider default
4. 4000 fallback
- text_to_speech_tool() and stream_tts_to_speaker() both call the
resolver; old MAX_TEXT_LENGTH alias kept for back-compat.
- Schema description no longer hardcodes 4000.
Tests: 27 new unit + E2E tests; all 53 existing TTS tests and 253
voice-command/voice-cli tests still pass.
Adds role='leaf'|'orchestrator' to delegate_task. With max_spawn_depth>=2,
an orchestrator child retains the 'delegation' toolset and can spawn its
own workers; leaf children cannot delegate further (identical to today).
Default posture is flat — max_spawn_depth=1 means a depth-0 parent's
children land at the depth-1 floor and orchestrator role silently
degrades to leaf. Users opt into nested delegation by raising
max_spawn_depth to 2 or 3 in config.yaml.
Also threads acp_command/acp_args through the main agent loop's delegate
dispatch (previously silently dropped in the schema) via a new
_dispatch_delegate_task helper, and adds a DelegateEvent enum with
legacy-string back-compat for gateway/ACP/CLI progress consumers.
Config (hermes_cli/config.py defaults):
delegation.max_concurrent_children: 3 # floor-only, no upper cap
delegation.max_spawn_depth: 1 # 1=flat (default), 2-3 unlock nested
delegation.orchestrator_enabled: true # global kill switch
Salvaged from @pefontana's PR #11215. Overrides vs. the original PR:
concurrency stays at 3 (PR bumped to 5 + cap 8 — we keep the floor only,
no hard ceiling); max_spawn_depth defaults to 1 (PR defaulted to 2 which
silently enabled one level of orchestration for every user).
Co-authored-by: pefontana <fontana.pedro93@gmail.com>
Reported during TUI v2 blitz testing: typing `@folder:` in the composer
pulled up .dockerignore, .env, .gitignore, and every other file in the
cwd alongside the actual directories. The completion loop yielded every
entry regardless of the explicit prefix and auto-rewrote each completion
to @file: vs @folder: based on is_dir — defeating the user's choice.
Also fixed a pre-existing adjacent bug: a bare `@file:` or `@folder:`
(no path) used expanded=="." as both search_dir AND match_prefix,
filtering the list to dotfiles only. When expanded is empty or ".",
search in cwd with no prefix filter.
- want_dir = prefix == "@folder:" drives an explicit is_dir filter
- preserve the typed prefix in completion text instead of rewriting
- three regression tests cover: folder-only, file-only, and the bare-
prefix case where completions keep the `@folder:` prefix
DNS rebinding attack: a victim browser that has the dashboard (or the
WhatsApp bridge) open could be tricked into fetching from an
attacker-controlled hostname that TTL-flips to 127.0.0.1. Same-origin
and CORS checks don't help — the browser now treats the attacker origin
as same-origin with the local service. Validating the Host header at
the app layer rejects any request whose Host isn't one we bound for.
Changes:
hermes_cli/web_server.py:
- New host_header_middleware runs before auth_middleware. Reads
app.state.bound_host (set by start_server) and rejects requests
whose Host header doesn't match the bound interface with HTTP 400.
- Loopback binds accept localhost / 127.0.0.1 / ::1. Non-loopback
binds require exact match. 0.0.0.0 binds skip the check (explicit
--insecure opt-in; no app-layer defence possible).
- IPv6 bracket notation parsed correctly: [::1] and [::1]:9119 both
accepted.
scripts/whatsapp-bridge/bridge.js:
- Express middleware rejects non-loopback Host headers. Bridge
already binds 127.0.0.1-only, this adds the complementary app-layer
check for DNS rebinding defence.
Tests: 8 new in tests/hermes_cli/test_web_server_host_header.py
covering loopback/non-loopback/zero-zero binds, IPv6 brackets, case
insensitivity, and end-to-end middleware rejection via TestClient.
Reported in GHSA-ppp5-vxwm-4cf7 by @bupt-Yy-young. Hardening — not
CVE per SECURITY.md §3. The dashboard's main trust boundary is the
loopback bind + session token; DNS rebinding defeats the bind assumption
but not the token (since the rebinding browser still sees a first-party
fetch to 127.0.0.1 with the token-gated API). Host-header validation
adds the missing belt-and-braces layer.
Two call sites still used a raw substring check to identify ollama.com:
hermes_cli/runtime_provider.py:496:
_is_ollama_url = "ollama.com" in base_url.lower()
run_agent.py:6127:
if fb_base_url_hint and "ollama.com" in fb_base_url_hint.lower() ...
Same bug class as GHSA-xf8p-v2cg-h7h5 (OpenRouter substring leak), which
was fixed in commit dbb7e00e via base_url_host_matches() across the
codebase. The earlier sweep missed these two Ollama sites. Self-discovered
during April 2026 security-advisory triage; filed as GHSA-76xc-57q6-vm5m.
Impact is narrow — requires a user with OLLAMA_API_KEY configured AND a
custom base_url whose path or look-alike host contains 'ollama.com'.
Users on default provider flows are unaffected. Filed as a draft advisory
to use the private-fork flow; not CVE-worthy on its own.
Fix is mechanical: replace substring check with base_url_host_matches
at both sites. Same helper the rest of the codebase uses.
Tests: 67 -> 71 passing. 7 new host-matcher cases in
tests/test_base_url_hostname.py (path injection, lookalike host,
localtest.me subdomain, ollama.ai TLD confusion, localhost, genuine
ollama.com, api.ollama.com subdomain) + 4 call-site tests in
tests/hermes_cli/test_runtime_provider_resolution.py verifying
OLLAMA_API_KEY is selected only when base_url actually targets
ollama.com.
Fixes GHSA-76xc-57q6-vm5m
Gateway /model <name> --provider opencode-go (or any provider whose /models
endpoint is down, 404s, or doesn't exist) silently failed. validate_requested_model
returned accepted=False whenever fetch_api_models returned None, switch_model
returned success=False, and the gateway never wrote _session_model_overrides —
so the switch appeared to succeed in the error message flow but the next turn
kept calling the old provider.
The validator already had static-catalog fallbacks for MiniMax and Codex
(providers without a /models endpoint). Extended the same pattern as the
terminal fallback: when the live probe fails, consult provider_model_ids()
for the curated catalog. Known models → accepted+recognized. Close typos →
auto-corrected. Unknown models → soft-accepted with a 'Not in curated
catalog' warning. Providers with no catalog at all → soft-accepted with a
generic 'Note:' warning, finally honoring the in-code comment ('Accept and
persist, but warn') that had been lying since it was written.
Tests: 7 new tests in test_opencode_go_validation_fallback.py covering the
catalog lookup, case-insensitive match, auto-correct, unknown-with-suggestion,
unknown-without-suggestion, and no-catalog paths. TestValidateApiFallback in
test_model_validation.py updated — its four 'rejected_when_api_down' tests
were encoding exactly the bug being fixed.
* feat(models): hide OpenRouter models that don't advertise tool support
Port from Kilo-Org/kilocode#9068.
hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.
Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).
Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.
Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.
* refactor(acp): validate method_id against advertised provider in authenticate()
Previously authenticate() accepted any method_id whenever the server had
provider credentials configured. This was not a vulnerability under the
personal-assistant trust model (ACP is stdio-only, local-trust — anything
that can reach the transport is already code-execution-equivalent to the
user), but it was sloppy API hygiene: the advertised auth_methods list
from initialize() was effectively ignored.
Now authenticate() only returns AuthenticateResponse when method_id
matches the currently-advertised provider (case-insensitive). Mismatched
or missing method_id returns None, consistent with the no-credentials
case.
Raised by xeloxa via GHSA-g5pf-8w9m-h72x. Declined as a CVE
(ACP transport is stdio, local-trust model), but the correctness fix is
worth having on its own.
Follow-up to PR #2504. The original fix covered the two direct FAL_KEY
checks in image_generation_tool but left four other call sites intact,
including the managed-gateway gate where a whitespace-only FAL_KEY
falsely claimed 'user has direct FAL' and *skipped* the Nous managed
gateway fallback entirely.
Introduce fal_key_is_configured() in tools/tool_backend_helpers.py as a
single source of truth (consults os.environ, falls back to .env for
CLI-setup paths) and route every FAL_KEY presence check through it:
- tools/image_generation_tool.py : _resolve_managed_fal_gateway,
image_generate_tool's upfront check, check_fal_api_key
- hermes_cli/nous_subscription.py : direct_fal detection, selected
toolset gating, tools_ready map
- hermes_cli/tools_config.py : image_gen needs-setup check
Verified by extending tests/tools/test_image_generation_env.py and by
E2E exercising whitespace + managed-gateway composition directly.
OpenCode Go's published model list (opencode.ai/docs/go) includes kimi-k2.6,
qwen3.5-plus, and qwen3.6-plus, but Hermes' curated lists didn't carry them.
When the live /models probe fails during `hermes model`, users fell back to
the stale curated list and had to type newer models via 'Enter custom model
name'.
Adds kimi-k2.6 (now first in the Go list), qwen3.6-plus, and qwen3.5-plus
to both the model picker (hermes_cli/models.py) and setup defaults
(hermes_cli/setup.py). All routed through the existing opencode-go
chat_completions path — no api_mode changes needed.
Every credential source Hermes reads from now behaves identically on
`hermes auth remove`: the pool entry stays gone across fresh load_pool()
calls, even when the underlying external state (env var, OAuth file,
auth.json block, config entry) is still present.
Before this, auth_remove_command was a 110-line if/elif with five
special cases, and three more sources (qwen-cli, copilot, custom
config) had no removal handler at all — their pool entries silently
resurrected on the next invocation. Even the handled cases diverged:
codex suppressed, anthropic deleted-without-suppressing, nous cleared
without suppressing. Each new provider added a new gap.
What's new:
agent/credential_sources.py — RemovalStep registry, one entry per
source (env, claude_code, hermes_pkce, nous device_code, codex
device_code, qwen-cli, copilot gh_cli + env vars, custom config).
auth_remove_command dispatches uniformly via find_removal_step().
Changes elsewhere:
agent/credential_pool.py — every upsert in _seed_from_env,
_seed_from_singletons, and _seed_custom_pool now gates on
is_source_suppressed(provider, source) via a shared helper.
hermes_cli/auth_commands.py — auth_remove_command reduced to 25
lines of dispatch; auth_add_command now clears ALL suppressions for
the provider on re-add (was env:* only).
Copilot is special: the same token is seeded twice (gh_cli via
_seed_from_singletons + env:<VAR> via _seed_from_env), so removing one
entry without suppressing the other variants lets the duplicate
resurrect. The copilot RemovalStep suppresses gh_cli + all three env
variants (COPILOT_GITHUB_TOKEN, GH_TOKEN, GITHUB_TOKEN) at once.
Tests: 11 new unit tests + 4059 existing pass. 12 E2E scenarios cover
every source in isolated HERMES_HOME with simulated fresh processes.
Removing an env-seeded credential only cleared ~/.hermes/.env and the
current process's os.environ, leaving shell-exported vars (shell profile,
systemd EnvironmentFile, launchd plist) to resurrect the entry on the
next load_pool() call. This matched the pre-#11485 codex behaviour.
Now we suppress env:<VAR> in auth.json on remove, gate _seed_from_env()
behind is_source_suppressed(), clear env:* suppressions on auth add,
and print a diagnostic pointing at the shell when the var lives there.
Applies to every env:* seeded credential (xai, deepseek, moonshot, zai,
nvidia, openrouter, anthropic, etc.), not just xai.
Reported by @teknium1 from community user 'Artificial Brain' — couldn't
remove their xAI key via hermes auth remove.
Builds on @AxDSan's PR #2109 to finish the KittenTTS wiring so the
provider behaves like every other TTS backend end to end.
- tools/tts_tool.py: `_check_kittentts_available()` helper and wire
into `check_tts_requirements()`; extend Opus-conversion list to
include kittentts (WAV → Opus for Telegram voice bubbles); point the
missing-package error at `hermes setup tts`.
- hermes_cli/tools_config.py: add KittenTTS entry to the "Text-to-Speech"
toolset picker, with a `kittentts` post_setup hook that auto-installs
the wheel + soundfile via pip.
- hermes_cli/setup.py: `_install_kittentts_deps()`, new choice + install
flow in `_setup_tts_provider()`, provider_labels entry, and status row
in the `hermes setup` summary.
- website/docs/user-guide/features/tts.md: add KittenTTS to the provider
table, config example, ffmpeg note, and the zero-config voice-bubble tip.
- tests/tools/test_tts_kittentts.py: 10 unit tests covering generation,
model caching, config passthrough, ffmpeg conversion, availability
detection, and the missing-package dispatcher branch.
E2E verified against the real `kittentts` wheel:
- WAV direct output (pcm_s16le, 24kHz mono)
- MP3 conversion via ffmpeg (from WAV)
- Telegram flow (provider in Opus-conversion list) produces
`codec_name=opus`, 48kHz mono, `voice_compatible=True`, and the
`[[audio_as_voice]]` marker
- check_tts_requirements() returns True when kittentts is installed
Follow-up to the redundant-imports sweep. _install_hangup_protection
used to import get_hermes_home locally; the sweep hoisted it to the
module-level binding already present at line 164.
test_non_fatal_if_log_setup_fails monkeypatches
hermes_cli.config.get_hermes_home to raise, which only works when the
function late-binds its lookup. The hoisted version captures the
reference at import time and bypasses the monkeypatch.
Restore the local import (with a distinct local alias) so the test
seam works and the stdio-untouched-on-setup-failure invariant is
actually exercised.
Full AST-based scan of all .py files to find every case where a module
or name is imported locally inside a function body but is already
available at module level. This is the second pass — the first commit
handled the known cases from the lint report; this one catches
everything else.
Files changed (19):
cli.py — 16 removals: time as _time/_t/_tmod (×10),
re / re as _re (×2), os as _os, sys,
partial os from combo import,
from model_tools import get_tool_definitions
gateway/run.py — 8 removals: MessageEvent as _ME /
MessageType as _MT (×3), os as _os2,
MessageEvent+MessageType (×2), Platform,
BasePlatformAdapter as _BaseAdapter
run_agent.py — 6 removals: get_hermes_home as _ghh,
partial (contextlib, os as _os),
cleanup_vm, cleanup_browser,
set_interrupt as _sif (×2),
partial get_toolset_for_tool
hermes_cli/main.py — 4 removals: get_hermes_home, time as _time,
logging as _log, shutil
hermes_cli/config.py — 1 removal: get_hermes_home as _ghome
hermes_cli/runtime_provider.py
— 1 removal: load_config as _load_bedrock_config
hermes_cli/setup.py — 2 removals: importlib.util (×2)
hermes_cli/nous_subscription.py
— 1 removal: from hermes_cli.config import load_config
hermes_cli/tools_config.py
— 1 removal: from hermes_cli.config import load_config, save_config
cron/scheduler.py — 3 removals: concurrent.futures, json as _json,
from hermes_cli.config import load_config
batch_runner.py — 1 removal: list_distributions as get_all_dists
(kept print_distribution_info, not at top level)
tools/send_message_tool.py
— 2 removals: import os (×2)
tools/skills_tool.py — 1 removal: logging as _logging
tools/browser_camofox.py
— 1 removal: from hermes_cli.config import load_config
tools/image_generation_tool.py
— 1 removal: import fal_client
environments/tool_context.py
— 1 removal: concurrent.futures
gateway/platforms/bluebubbles.py
— 1 removal: httpx as _httpx
gateway/platforms/whatsapp.py
— 1 removal: import asyncio
tui_gateway/server.py — 2 removals: from datetime import datetime,
import time
All alias references (_time, _t, _tmod, _re, _os, _os2, _json, _ghh,
_ghome, _sif, _ME, _MT, _BaseAdapter, _load_bedrock_config, _httpx,
_logging, _log, get_all_dists) updated to use the top-level names.
Sweep ~74 redundant local imports across 21 files where the same module
was already imported at the top level. Also includes type fixes and lint
cleanups on the same branch.
* feat(skills): inject absolute skill dir and expand ${HERMES_SKILL_DIR} templates
When a skill loads, the activation message now exposes the absolute
skill directory and substitutes ${HERMES_SKILL_DIR} /
${HERMES_SESSION_ID} tokens in the SKILL.md body, so skills with
bundled scripts can instruct the agent to run them by absolute path
without an extra skill_view round-trip.
Also adds opt-in inline-shell expansion: !`cmd` snippets in SKILL.md
are pre-executed (with the skill directory as CWD) and their stdout is
inlined into the message before the agent reads it. Off by default —
enable via skills.inline_shell in config.yaml — because any snippet
runs on the host without approval.
Changes:
- agent/skill_commands.py: template substitution, inline-shell
expansion, absolute skill-dir header, supporting-files list now
shows both relative and absolute forms.
- hermes_cli/config.py: new skills.template_vars,
skills.inline_shell, skills.inline_shell_timeout knobs.
- tests/agent/test_skill_commands.py: coverage for header, both
template tokens (present and missing session id), template_vars
disable, inline-shell default-off, enabled, CWD, and timeout.
- website/docs/developer-guide/creating-skills.md: documents the
template tokens, the absolute-path header, and the opt-in inline
shell with its security caveat.
Validation: tests/agent/ 1591 passed (includes 9 new tests).
E2E: loaded a real skill in an isolated HERMES_HOME; confirmed
${HERMES_SKILL_DIR} resolves to the absolute path, ${HERMES_SESSION_ID}
resolves to the passed task_id, !`date` runs when opt-in is set, and
stays literal when it isn't.
* feat(terminal): source ~/.bashrc (and user-listed init files) into session snapshot
bash login shells don't source ~/.bashrc, so tools that install themselves
there — nvm, asdf, pyenv, cargo, custom PATH exports — stay invisible to
the environment snapshot Hermes builds once per session. Under systemd
or any context with a minimal parent env, that surfaces as
'node: command not found' in the terminal tool even though the binary
is reachable from every interactive shell on the machine.
Changes:
- tools/environments/local.py: before the login-shell snapshot bootstrap
runs, prepend guarded 'source <file>' lines for each resolved init
file. Missing files are skipped, each source is wrapped with a
'[ -r ... ] && . ... || true' guard so a broken rc can't abort the
bootstrap.
- hermes_cli/config.py: new terminal.shell_init_files (explicit list,
supports ~ and ${VAR}) and terminal.auto_source_bashrc (default on)
knobs. When shell_init_files is set it takes precedence; when it's
empty and auto_source_bashrc is on, ~/.bashrc gets auto-sourced.
- tests/tools/test_local_shell_init.py: 10 tests covering the resolver
(auto-bashrc, missing file, explicit override, ~/${VAR} expansion,
opt-out) and the prelude builder (quoting, guarded sourcing), plus
a real-LocalEnvironment snapshot test that confirms exports in the
init file land in subsequent commands' environment.
- website/docs/reference/faq.md: documents the fix in Troubleshooting,
including the zsh-user pattern of sourcing ~/.zshrc or nvm.sh
directly via shell_init_files.
Validation: 10/10 new tests pass; tests/tools/test_local_*.py 40/40
pass; tests/agent/ 1591/1591 pass; tests/hermes_cli/test_config.py
50/50 pass. E2E in an isolated HERMES_HOME: confirmed that a fake
~/.bashrc setting a marker var and PATH addition shows up in a real
LocalEnvironment().execute() call, that auto_source_bashrc=false
suppresses it, that an explicit shell_init_files entry wins over the
auto default, and that a missing bashrc is silently skipped.
The re-pair branch had a redundant 'import shutil' inside cmd_whatsapp,
which made shutil a function-local throughout the whole scope. The
earlier 'shutil.which("npm")' call at the dependency-install step then
crashed with UnboundLocalError before control ever reached the local
import.
shutil is already imported at module level (line 48), so the local
import was dead code anyway. Drop it.
The WhatsApp bridge depends on @whiskeysockets/baileys pulled directly
from a GitHub commit tarball, which on slower connections or when
GitHub is sluggish routinely exceeds 120s. The hardcoded timeout
surfaced as a raw TimeoutExpired traceback during 'hermes whatsapp'
setup.
Switch to the same pattern used by the TUI npm install at line
~945: no timeout, --no-fund/--no-audit/--progress=false to keep
output clean, stderr captured and tailed on failure. Also resolve
npm via shutil.which so missing Node.js gives a clean error instead
of FileNotFoundError, and handle Ctrl+C cleanly.
Co-authored-by: teknium1 <teknium@nousresearch.com>
Delete the stale literal `_PROVIDER_MODELS["ai-gateway"]` (gpt-5,
gemini-2.5-pro, claude-4.5 — outdated the moment PR #13223 landed with
its curated `AI_GATEWAY_MODELS` snapshot) and derive it from
`AI_GATEWAY_MODELS` instead, so the picker tuples and the bare-id
fallback catalog stay in sync automatically. Also fixes
`get_default_model_for_provider('ai-gateway')` to return kimi-k2.6
(the curated recommendation) instead of claude-opus-4.6.
Aslaaen's fix in the original PR covered _detect_api_mode_for_url and the
two openai/xai sites in run_agent.py. This finishes the sweep: the same
substring-match false-positive class (e.g. https://api.openai.com.evil/v1,
https://proxy/api.openai.com/v1, https://api.anthropic.com.example/v1)
existed in eight more call sites, and the hostname helper was duplicated
in two modules.
- utils: add shared base_url_hostname() (single source of truth).
- hermes_cli/runtime_provider, run_agent: drop local duplicates, import
from utils. Reuse the cached AIAgent._base_url_hostname attribute
everywhere it's already populated.
- agent/auxiliary_client: switch codex-wrap auto-detect, max_completion_tokens
gate (auxiliary_max_tokens_param), and custom-endpoint max_tokens kwarg
selection to hostname equality.
- run_agent: native-anthropic check in the Claude-style model branch
and in the AIAgent init provider-auto-detect branch.
- agent/model_metadata: Anthropic /v1/models context-length lookup.
- hermes_cli/providers.determine_api_mode: anthropic / openai URL
heuristics for custom/unknown providers (the /anthropic path-suffix
convention for third-party gateways is preserved).
- tools/delegate_tool: anthropic detection for delegated subagent
runtimes.
- hermes_cli/setup, hermes_cli/tools_config: setup-wizard vision-endpoint
native-OpenAI detection (paired with deduping the repeated check into
a single is_native_openai boolean per branch).
Tests:
- tests/test_base_url_hostname.py covers the helper directly
(path-containing-host, host-suffix, trailing dot, port, case).
- tests/hermes_cli/test_determine_api_mode_hostname.py adds the same
regression class for determine_api_mode, plus a test that the
/anthropic third-party gateway convention still wins.
Also: add asslaenn5@gmail.com → Aslaaen to scripts/release.py AUTHOR_MAP.
Load-time sanitizer silently removed non-ASCII codepoints from any
env var ending in _API_KEY / _TOKEN / _SECRET / _KEY, turning
copy-paste artifacts (Unicode lookalikes, ZWSP, NBSP) into opaque
provider-side API_KEY_INVALID errors.
Warn once per key to stderr with the offending codepoints (U+XXXX)
and guidance to re-copy from the provider dashboard.
The original list was copied from OpenRouter conventions and didn't
match what Vercel actually hosts. Verified against the live
/v1/models endpoint (266 models):
- qwen/qwen3.6-plus → alibaba/qwen3.6-plus (Vercel hosts Qwen under alibaba/)
- z-ai/glm-5.1 → zai/glm-5.1 (no hyphen)
- x-ai/grok-4.20 → xai/grok-4.20-reasoning (no hyphen, picks reasoning variant)
- google/gemini-3-flash-preview → google/gemini-3-flash (no -preview suffix)
- moonshotai/kimi-k2.5 → moonshotai/kimi-k2.6 (newest available)
Vercel provides a d?to= redirect URL that routes users through their
team picker to the AI Gateway API keys management page. Using this
specific URL lands users directly on the "Create key" page instead of
the generic AI Gateway dashboard.
When the live Vercel AI Gateway catalog exposes a Moonshot model with
zero input AND output pricing, it's promoted to position #1 as the
recommended default — even if the exact ID isn't in the curated
AI_GATEWAY_MODELS list. This enables dynamic discovery of new free
Moonshot variants without requiring a PR to update curation.
Paid Moonshot models are unaffected; falls back to the normal curated
recommended tag when no free Moonshot is live.
Moves Vercel AI Gateway from the bottom of the list to near the top,
adjacent to other multi-model aggregators. The existing bottom
position was a result of the list growing by appending new providers
over time — the new position makes it more discoverable.
- Curated AI_GATEWAY_MODELS list in hermes_cli/models.py (OSS first,
kimi-k2.5 as recommended default).
- fetch_ai_gateway_models() filters the curated list against the live
/v1/models catalog; falls back to the snapshot on network failure.
- fetch_ai_gateway_pricing() translates Vercel's input/output field
names to the prompt/completion shape the shared picker expects;
carries input_cache_read / input_cache_write through unchanged.
- get_pricing_for_provider() now handles ai-gateway.
- _model_flow_ai_gateway() provides a guided URL prompt when no key
is set and a pricing-column picker; routes ai-gateway to it instead
of the generic api-key flow.
Users can declare shell scripts in config.yaml under a hooks: block that
fire on plugin-hook events (pre_tool_call, post_tool_call, pre_llm_call,
subagent_stop, etc). Scripts receive JSON on stdin, can return JSON on
stdout to block tool calls or inject context pre-LLM.
Key design:
- Registers closures on existing PluginManager._hooks dict — zero changes
to invoke_hook() call sites
- subprocess.run(shell=False) via shlex.split — no shell injection
- First-use consent per (event, command) pair, persisted to allowlist JSON
- Bypass via --accept-hooks, HERMES_ACCEPT_HOOKS=1, or hooks_auto_accept
- hermes hooks list/test/revoke/doctor CLI subcommands
- Adds subagent_stop hook event fired after delegate_task children exit
- Claude Code compatible response shapes accepted
Cherry-picked from PR #13143 by @pefontana.
Six small fixes, all valid review feedback:
- gatewayClient: onTimeout is now a class-field arrow so setTimeout gets a
stable reference — no per-request bind allocation (the whole point of
the original refactor).
- memory: growth rate was lifetime average of rss/uptime, which reports
phantom growth for stable processes. Now computed as delta since a
module-load baseline (STARTED_AT). Sanity-checked: 0.00 MB/hr at
steady-state, non-zero after an allocation.
- hermes_cli: NODE_OPTIONS merge is now token-aware — respects a
user-supplied --max-old-space-size (don't downgrade a deliberate 16GB
setting) and avoids duplicating --expose-gc.
- useVirtualHistory: if items shrink past the frozen range's start
mid-freeze (/clear, compaction), drop the freeze and fall through to
the normal range calc instead of collapsing to an empty mount.
- circularBuffer: throw on non-positive capacity instead of silently
producing NaN indices.
- debug slash help: /heapdump mentions HERMES_HEAPDUMP_DIR override
instead of hardcoding the default path.
Validation: tsc clean, eslint clean, vitest 102/102, growth-rate smoke
test confirms baseline=0 → post-alloc>0.