Adds a per-task override for the consecutive-failure circuit breaker,
so individual tasks can opt out of the global ``kanban.failure_limit``
without dragging everyone else with them.
Resolution order (now three tiers):
1. per-task ``max_retries`` (new, this commit)
2. caller-supplied ``failure_limit`` — the gateway threads
``kanban.failure_limit`` from config here
3. ``DEFAULT_FAILURE_LIMIT`` (2)
Changes:
- ``tasks.max_retries INTEGER`` column + migration for existing DBs
(NULL = no override, matches pre-column behavior).
- ``Task.max_retries`` field + ``from_row`` plumbing.
- ``create_task(..., max_retries=N)`` kwarg.
- ``_record_task_failure`` reads the per-task value first and records
``limit_source`` + ``effective_limit`` on the ``gave_up`` event so
operators can see which tier won.
- CLI: ``hermes kanban create --max-retries N`` (rejects ``< 1``).
- CLI: ``hermes kanban show`` surfaces the effective threshold +
source (``(task)``, ``(config kanban.failure_limit)``, ``(default)``).
- CLI: ``_task_to_dict`` includes ``max_retries`` in ``--json`` output.
Key design choice vs. the earlier #20972 attempt:
- No new config key. The existing ``kanban.failure_limit`` (landed in
#21183) is the dispatcher-tier source — no silent break for users
who already tuned it.
- No ``!=`` sentinel for "is config set" (which would misfire when
config equals the default). The tier-winner is determined purely
by "is per-task override set" — the dispatcher always wins when
per-task is NULL, regardless of whether the caller passed the
default or a configured value.
E2E verified across four scenarios: default-only (trips at 2),
config-only (trips at caller's value), per-task-only beats default
(trips at task value), per-task beats larger config (trips at task
value). ``gave_up`` event metadata correctly records ``limit_source``
and ``effective_limit`` in all cases.
Tests:
- ``test_per_task_max_retries_overrides_dispatcher_limit`` — task=1
beats caller=10.
- ``test_per_task_max_retries_allows_more_than_default`` — task=5
does not trip at caller=default of 2.
- ``test_max_retries_none_falls_through_to_dispatcher_limit`` — None
honors caller's config value (4), records ``limit_source=dispatcher``.
Full kanban trio (db + core + cli + tools + dashboard-plugin): 342
passed, no regressions.
Supersedes: #20972 (@jelrod27) — credit in PR close comment.
Ref: #20263 (tangentially — the reporter asked about adapter API
drift, not retry caps, but the CLI discussion there is what
surfaced the original ask).
PairingStore.approve_code() didn't consult _is_locked_out(), so after
MAX_FAILED_ATTEMPTS bad approvals the lockout flag was set but a valid
code still got accepted — any pending code (legitimately issued or
attacker-obtained) could be approved during the 1-hour lockout window,
nullifying the brute-force protection.
- gateway/pairing.py: lockout check runs in approve_code() right after
_cleanup_expired, before the pending lookup. Returns None on lockout.
- tests/gateway/test_pairing.py: test_lockout_blocks_code_approval pins
the regression — reporter's exact reproducer (generate valid code,
exhaust attempts with WRONGCODE, try to approve valid code) must
return None and leave is_approved == False. Also pins recovery: once
lockout expires, the still-pending code approves normally.
- hermes_cli/pairing.py: _cmd_approve distinguishes the two None cases.
On lockout, prints 'Platform locked out... clears in N minutes. To
reset sooner, delete the _lockout:<platform> entry from
_rate_limits.json' instead of the misleading 'Code not found or
expired' message. 29/29 pairing tests pass; E2E-verified with
reporter's exact Python reproducer.
Widen the platform-plugin surface so plugins can self-configure from env
vars and opt into cron home-channel delivery without editing core files.
Closes the scope gap that forced every new platform (Google Chat, Teams,
IRC, future) to either touch gateway/config.py, cron/scheduler.py, and
hermes_cli/config.py or live without env-only setup.
Changes:
- gateway/platform_registry.py: two new optional PlatformEntry fields.
- env_enablement_fn: () -> Optional[dict]. Called during
_apply_env_overrides BEFORE the adapter is constructed. Returned
dict fields are merged into PlatformConfig.extra; the special
'home_channel' key (if present) becomes a proper HomeChannel
dataclass on the PlatformConfig.
- cron_deliver_env_var: name of the *_HOME_CHANNEL env var. When set,
the plugin platform is a valid cron deliver= target and cron reads
the env var to resolve the default chat/room ID.
- gateway/config.py: the existing plugin-platform enable pass at the
bottom of _apply_env_overrides now calls env_enablement_fn and seeds
extras/home_channel. No effect on plugins that don't set the new
field.
- cron/scheduler.py: _is_known_delivery_platform and
_resolve_home_env_var fall through to the registry when the platform
isn't in the hardcoded built-in sets. New _iter_home_target_platforms
helper iterates built-ins + plugin platforms for the deliver=origin
fallback.
- gateway/run.py: _home_target_env_var now consults the new resolver so
plugin-defined home channels work for non-cron call sites too.
- hermes_cli/config.py: new _inject_platform_plugin_env_vars() sibling
of _inject_profile_env_vars(). Scans plugins/platforms/*/plugin.yaml
at import time and contributes entries to OPTIONAL_ENV_VARS so
'hermes config' UI discovers them. Supports bare-string and rich-dict
requires_env entries plus a new optional_env list for non-required
vars (home channels, allowlists).
All additions are strictly opt-in. Existing plugins (IRC, Teams,
image_gen, memory) see zero behavior change until they adopt the new
fields.
Mirrors the Slack `allowed_channels` feature (PR #7401) and Discord's
`allowed_channels` (PR #7044) across the remaining group-capable platforms.
All five platforms (Slack + Discord + the four added here) now follow the
same pattern: primary config via config.yaml, env-var fallback as an escape
hatch — matching the project policy that .env is for secrets only and
behavioral settings belong in config.yaml.
Also fixes a duplicate `slack` key in DEFAULT_CONFIG introduced by PR
#7401 (the later entry silently overwrote `allowed_channels`, `require_mention`,
and `free_response_channels` at dict-literal evaluation time).
Platforms added:
- Telegram: `telegram.allowed_chats` (env alias: `TELEGRAM_ALLOWED_CHATS`)
- Mattermost: `mattermost.allowed_channels` (env alias: `MATTERMOST_ALLOWED_CHANNELS`)
- Matrix: `matrix.allowed_rooms` (env alias: `MATRIX_ALLOWED_ROOMS`)
- DingTalk: `dingtalk.allowed_chats` (env alias: `DINGTALK_ALLOWED_CHATS`)
Mattermost and Matrix previously had NO config.yaml bridging for any of
their gating settings; this PR adds `load_gateway_config` bridges for them
(Mattermost gets require_mention + free_response_channels + allowed_channels;
Matrix gets allowed_rooms on top of its existing bridges for require_mention
and free_response_rooms).
Semantics identical everywhere:
- Empty = no restriction (fully backward compatible).
- Non-empty = hard whitelist: non-listed chats are silently ignored,
even when the bot is @mentioned.
- DMs bypass the check entirely.
DEFAULT_CONFIG merges the duplicate `slack` block and adds new `mattermost`
and `matrix` blocks so all gating settings surface in defaults.
Not included: Feishu (has its own per-chat `chat_rules` system that covers
this use case differently), WhatsApp (already has `group_allow_from` via
`group_policy: allowlist`), pure-DM platforms (Signal, SMS, BlueBubbles,
Yuanbao — no group concept).
The Hermes dashboard previously assumed it was served at the root of its
host (e.g. https://kanban.tilos.com/). When mounted behind a path-prefix
reverse proxy (e.g. https://mission-control.tilos.com/hermes/), the SPA
404'd because:
- index.html shipped absolute /assets/index-*.js URLs
- React Router had no basename
- The plugin loader hit /dashboard-plugins/<name>/... at the root host
- CSS in the bundle had absolute url(/fonts/...) references
This patch makes the dashboard prefix-aware at runtime, no rebuild
required. The proxy injects 'X-Forwarded-Prefix: /hermes' on every
request and the Python server:
- Rewrites href/src in served index.html to '${prefix}/assets/...'
- Injects 'window.__HERMES_BASE_PATH__="${prefix}"' for the SPA to read
- Rewrites url() refs in CSS at serve time
The SPA reads window.__HERMES_BASE_PATH__ once at boot and:
- Prefixes all /api/... fetches via api.ts
- Prefixes all /dashboard-plugins/... script/css URLs in usePlugins
- Sets <BrowserRouter basename={...}> so client-side routing works
When no X-Forwarded-Prefix header is present, behavior is unchanged
(empty prefix => serves at root, kanban.tilos.com keeps working).
Refs: MC-AUTO-13
Add tencent/hy3-preview (without :free suffix) as a paid model route
alongside the existing free variant. This allows seamless transition
when the model moves from free to paid on OpenRouter — both routes
coexist so neither side's timing causes breakage.
Changes:
- models.py: add ("tencent/hy3-preview", "") to OPENROUTER_MODELS
- model-catalog.json: add paid variant entry
- tests: add assertions for paid route presence
The :free entry can be removed in a follow-up PR once OpenRouter
confirms the free route is deprecated.
Co-authored-by: simonweng <simonweng@tencent.com>
The alibaba-coding-plan provider (DashScope coding-intl endpoint) was
defined in providers.py but missing from _PROVIDER_MODELS in models.py.
This caused /model to show "0 models" for this provider even though
credentials were configured and the provider was functional.
Add the curated model list so the provider picker displays available
models correctly.
- Add hermes dashboard examples to the CLI help epilogue so users can
discover the web UI command from 'hermes --help' output
- Add an independent 'Test dashboard subcommand' CI step that verifies
'hermes dashboard --help' works in the Docker image, with its own
mkdir/chown setup to remain independent of the prior smoke test step
- Prevents regressions like #9153 where the dashboard subcommand was
present in source but missing from the published Docker image
Closes#9153
Per repo policy, ~/.hermes/.env is for secrets only. Guild IDs are
behavioral configuration, not secrets. Replacing the
DISCORD_DM_ROLE_AUTH_GUILD env var from the original fix with
discord.dm_role_auth_guild in config.yaml.
- New module-level _read_dm_role_auth_guild() helper reads
hermes_cli.config.read_raw_config()['discord']['dm_role_auth_guild'].
Fails closed on any parse error (safe default = DM role-auth off).
- DEFAULT_CONFIG['discord'] gains dm_role_auth_guild: '' with a comment
documenting the opt-in.
- Tests patch hermes_cli.config.read_raw_config directly (via the
_set_dm_role_auth_guild helper) instead of setenv/delenv. 12 tests
in test_discord_roles_dm_scope pass; no env var involvement.
- Docstring + module docstring + comments updated to reference
discord.dm_role_auth_guild.
- E2E verified with real imports across 6 scenarios: unset, int,
string, garbage, zero, and (crucially) env-var-only-no-config all
return None except the valid int/string cases. Env var has zero
effect — policy compliance confirmed.
Lists the skills sitting in ~/.hermes/skills/.archive/ so users have
something to pass to `hermes curator restore`. `curator status` already
shows counts; this fills the name-discovery gap.
Archive layout is flat (`archive_skill` writes to `.archive/<skill>/`),
so the directory name IS the skill name — no frontmatter parsing
needed. Timestamped collision directories (`<skill>-<ts>`) are listed
literally; user can still pass them to `restore`.
Reshape of @EvilDrag0n's #20651, simplified: drop the frontmatter
rglob + preamble/trailer output + duplicate subcommand registration.
Co-authored-by: EvilDrag0n <lxl694522264@gmail.com>
Enables plugins to transform LLM output text after generation,
useful for vocabulary/personality transformation without burning
inference tokens.
Follows same pattern as transform_tool_result and transform_terminal_output:
- First non-empty string result wins
- Fail-open: exceptions logged as warnings, agent continues
- Signature: (response_text, session_id, model, platform)
Add a new `hermes gateway list` subcommand that shows the running
status of gateways across all profiles in a single view:
Gateways:
✓ default (current) — PID 155469
✓ wx1 — PID 166893
✗ dev — not running
Also includes `_print_other_profiles_gateway_status()` which appends
an "Other profiles" section to `hermes gateway status` output when
other profile gateways are running.
Both use existing `list_profiles()` and `find_profile_gateway_processes()`
— no new dependencies.
Closes#19127
Related: #19113, #4402, #4587
When multiple custom_providers share the same base_url but have different API keys,
get_custom_provider_pool_key() always returned the first match, causing wrong-key
unauthorized errors. Add provider_name parameter to prefer exact name matches
over base_url-only matching, with fallback for backward compatibility.
Fixes#19083
When a kanban worker subprocess exits rc=0 but its task is still in
status='running', the agent almost certainly answered the task
conversationally without calling kanban_complete or kanban_block. The
dispatcher used to classify this as a generic crash and respawn, which
loops forever on small local models (gemma4-e2b q4 etc.) that keep
returning clean but unproductive output.
Dispatcher changes:
- The waitpid reap loop at the top of dispatch_once now records each
reaped child's raw exit status in a bounded module registry
(_recent_worker_exits, TTL 600s, size cap 4096).
- _classify_worker_exit distinguishes clean_exit / nonzero_exit /
signaled / unknown using os.WIFEXITED / WIFSIGNALED.
- detect_crashed_workers consults the classification when a worker
is found dead. clean_exit → protocol_violation event + immediate
circuit-breaker trip (failure_limit=1). Everything else keeps the
existing crashed-event + counter behavior.
- DispatchResult.auto_blocked now includes protocol-violation trips.
Gateway fix (Bug A in #20894):
- gateway.run._notify_active_sessions_of_shutdown snapshots
self.adapters with list(...) before iterating. adapter.send() can
hit a fatal-error path that pops the adapter from the dict, which
was raising 'RuntimeError: dictionary changed size during iteration'
during shutdown.
Regression tests:
- test_detect_crashed_workers_protocol_violation_auto_blocks verifies
rc=0 + still-running → status=blocked on first occurrence with
protocol_violation + gave_up events and NO crashed event.
- test_detect_crashed_workers_nonzero_exit_uses_default_limit verifies
non-zero exits keep the existing 2-strike behavior.
Closes#20894.
custom_providers entries (section 4 of list_authenticated_providers) only
read the static models: dict from config.yaml, ignoring the live /v1/models
endpoint. This means gateways like Bifrost that expose hundreds of models
only show the handful explicitly listed in config.
Add live discovery via fetch_api_models() for custom_providers entries
that have api_key + base_url, matching the existing behavior for user
providers: entries (section 3). When the endpoint is reachable and
returns models, the live list replaces the static subset.
Fixes: /model picker showing only 9 models from a Bifrost gateway that
actually exposes 581.
The --command flag of `hermes mcp add` shared its argparse dest with the
top-level subparser (`dest="command"` in `hermes_cli/_parser.py`). When
the flag was omitted, argparse still wrote `args.command = None`,
clobbering the top-level value of `"mcp"`. The dispatcher then saw
`args.command is None` and fell through to interactive chat, so
`hermes mcp add ...` silently launched chat instead of registering the
server. `cmd_mcp_add` was never reached.
Use `dest="mcp_command"` on the flag and read it from `cmd_mcp_add`.
The user-facing CLI flag `--command` is unchanged; only the in-memory
namespace attribute moves. Also updates the `_make_args` helper in
`tests/hermes_cli/test_mcp_config.py` to populate the new dest, and
adds `tests/hermes_cli/test_mcp_add_command_dest.py` with a parser-
level regression test.
Closes#19785.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`_save_auth_store`, `_save_qwen_cli_tokens`, and `_write_shared_nous_state`
all created the temp file via `Path.open('w')` / `Path.write_text` and only
tightened permissions to 0o600 afterward. Between create and chmod the file
existed at the process umask (commonly 0o644 = world-readable on multi-user
hosts), briefly exposing OAuth access/refresh tokens for Nous, Codex,
Copilot, Claude, Qwen, Gemini, and every other native OAuth provider that
flows through auth.json.
Switch all three to `os.open(O_WRONLY|O_CREAT|O_EXCL, 0o600)` + `os.fdopen`
+ `fsync` so the file is atomic at 0o600 on creation. Tighten each parent
directory (`~/.hermes/`, Qwen auth dir, Nous shared auth dir) to 0o700 so
siblings can't traverse to the creds. `_save_auth_store` also gains a
per-process random temp suffix to match `agent/google_oauth.py` (#19673)
and `tools/mcp_oauth.py` (#21148).
Adds `tests/hermes_cli/test_auth_toctou_file_modes.py` asserting final
file mode 0o600 and parent dir mode 0o700 across all three writers, plus
an explicit `os.open(flags, mode)` check on the main auth.json writer
that would fail if anyone reintroduces the `Path.open('w')` pattern.
POSIX-only (mode bits skipped on Windows).
Flip the default for HERMES_REDACT_SECRETS from off to on so the redactor
already wired into send_message_tool, logs, and tool output actually runs
on a fresh install.
- agent/redact.py: env-var default "" → "true"
- hermes_cli/config.py: DEFAULT_CONFIG security.redact_secrets True;
two config-template comments rewritten
- gateway/run.py + cli.py: startup log / banner warning when the user
has explicitly opted out, so the downgrade is visible in agent.log
and at CLI banner time
- docs/reference/environment-variables.md: description reconciled
- tests: flipped the default-pin, restructured the force=True
regression test to explicit-false instead of unset
Users who need raw credential values (redactor development) can still
opt out via security.redact_secrets: false in config.yaml or
HERMES_REDACT_SECRETS=false in .env.
Closes#17691.
Addresses #20785 (short-term output-pipeline recommendation).
Widen PR #20314's fix to the other timeout-polling sites in the codebase
that share the same wall-clock-jump bug class. All of these measure elapsed
timeout duration, not civil time, so they belong on time.monotonic().
- hermes_cli/auth.py: auth-store file-lock timeout, Spotify OAuth callback
wait, Nous portal device-auth token poll.
- hermes_cli/copilot_auth.py: Copilot OAuth device-flow token poll.
- hermes_cli/gateway.py: gateway systemd restart wait.
- hermes_cli/web_server.py: dashboard Codex device-auth user_code wait,
dashboard Nous device-auth token poll. (sess["expires_at"] stays on
time.time() — it's a persisted absolute timestamp, not a local
deadline-polling variable.)
- agent/copilot_acp_client.py: Copilot ACP JSON-RPC request timeout.
Extract the shared flock/msvcrt boilerplate from _auth_store_lock and
_nous_shared_store_lock into a single _file_lock(lock_path, holder,
timeout, message) helper. Each caller keeps its own threading.local
holder so reentrancy state stays per-lock.
Also document the lock-ordering invariant on both wrappers:
_auth_store_lock is OUTER, _nous_shared_store_lock is INNER for all
runtime refresh paths. The one exception is _try_import_shared_nous_state,
which holds the shared lock alone across the full HTTP refresh+mint
cycle to prevent concurrent sibling imports from racing on the single-
use shared refresh token; that helper must not be called with the auth
lock already held.
The gateway-embedded dispatcher (default since `kanban.dispatch_in_gateway
= true`) is the parent of every spawned kanban worker. `_default_spawn`
calls `subprocess.Popen(..., start_new_session=True)` and returns the
pid — `start_new_session` detaches the controlling tty but does not
reparent to init, so the gateway keeps each worker as a child until it
`wait()`s for them.
Nothing in the dispatch loop ever calls `waitpid`. Result: every
completed worker becomes a `<defunct>` zombie that lingers until the
gateway exits. We hit ~430 zombies on a single hermes-agent container
after ~40 days of steady kanban traffic, approaching process-table
exhaustion on the host.
Fix: add a non-blocking reap loop at the top of `dispatch_once`, so
every dispatcher tick (default 60s) drains zombies that accumulated
since the last tick. WNOHANG keeps the call non-blocking; ChildProcessError
means no children to reap.
Why here, not a SIGCHLD handler:
- signal.signal requires the main thread; gateway threading model makes
that placement non-trivial.
- Bounded staleness: at default interval=60s the maximum live zombie
count is one tick's worth of worker completions.
- No interaction with detect_crashed_workers: that function only inspects
rows where status='running', and rows reach 'done' (and stop being
inspected) before their workers exit.
Adds `hermes profile create <name> --no-skills` to create a profile with
zero bundled skills. Writes a `.no-bundled-skills` marker file in the
profile root so `hermes update`'s all-profile skill sync loop also skips
the profile — without the marker, every update would re-seed skills and
the user would have to delete them again.
Use case (from @hiut1u): orchestrator profiles and narrow-task profiles
don't need 100+ bundled skills polluting their system prompt.
- create_profile() gains a `no_skills` param, mutually exclusive with
`--clone` / `--clone-all` (cloning explicitly copies skills).
- seed_profile_skills() no-ops on opted-out profiles and returns
`{skipped_opt_out: True}` so callers can report cleanly.
- Web API (POST /api/profiles) accepts `no_skills: bool`.
- Delete `.no-bundled-skills` to opt back in — next `hermes update`
re-seeds normally.
6 new tests in TestNoSkillsOptOut cover marker write, mutual exclusion
with clone, seed_profile_skills opt-out, fresh profile unaffected, and
delete-marker-re-enables-seeding.
The setup wizard dropped non-root users at a bare shell prompt when
trying to start a system-scope gateway service. Previously
_require_root_for_system_service called sys.exit(1), which the
wizard's `except Exception` guards cannot catch (SystemExit is a
BaseException). Users with a pre-existing /etc/systemd/system unit
(e.g. from an earlier `sudo hermes setup` run) hit this whenever
they re-ran `hermes setup` as a regular user.
- Convert _require_root_for_system_service to raise a typed
SystemScopeRequiresRootError (RuntimeError subclass) instead of
sys.exit(1). The direct CLI path (`hermes gateway install|start|stop|
restart|uninstall` without sudo) still exits 1 cleanly via a new
catch at the top of gateway_command, matching the existing
UserSystemdUnavailableError pattern.
- Add _system_scope_wizard_would_need_root() pre-check and
_print_system_scope_remediation() helper. Both setup wizards
(hermes_cli/setup.py and hermes_cli/gateway.py::gateway_setup) now
detect the dead-end before prompting and print actionable guidance:
either `sudo systemctl start <service>` this time, or uninstall the
system unit and install a per-user one.
- Defense-in-depth: all 5 wizard prompt sites also catch
SystemScopeRequiresRootError and fall back to the remediation
helper if the pre-check is bypassed (race, etc.).
Tests: 12 new tests in TestSystemScopeRequiresRootError,
TestSystemScopeWizardPreCheck, TestSystemScopeRemediationOutput, and
TestGatewayCommandCatchesSystemScopeError covering the exception
contract, pre-check matrix (root vs non-root, system-only vs
user-present vs none vs explicit system=True), remediation output
for each action, and the direct-CLI exit-1 path.
* fix(tui): restore classic CLI voice push-to-talk parity
(cherry picked from commit 93b9ae301b)
* fix(tui): harden voice push-to-talk stop flow
Address review feedback from PR #16189 by stopping the active recorder before background transcription, documenting single-shot voice capture, and covering the TUI gateway flags with regression tests.
* fix(tui): preserve silent voice strike tracking
Keep single-shot voice recording's no-speech counter alive across starts so the TUI can still emit the three-strikes auto-disable event, and bind the auto-restart state at module scope for type checking.
* fix(tui): clean up voice stop failure path
Address follow-up review by naming the TUI flow as single-shot push-to-talk and cancelling the recorder when forced stop cannot produce a WAV.
* fix(tui): report busy voice capture starts
Return explicit start state from the voice wrapper so the TUI gateway does not report recording while forced-stop transcription is still cleaning up.
* fix(tui): handle busy voice record responses
Apply the gateway busy status immediately in the TUI and route forced-stop voice events to the session that sent the stop request.
* fix(tui): clear voice recording on null response
Treat a null voice.record RPC result as a failed optimistic start so the REC badge cannot stick after gateway-side errors.
* fix(tui): count silent manual voice stops
Preserve single-shot voice no-speech strikes through forced stop transcription so empty push-to-talk captures still trigger the three-strikes guard.
---------
Co-authored-by: Montbra <montbra@gmail.com>
Profile processes (kanban workers, cron subprocesses, delegated subagents)
read the profile's auth.json only. If a provider was authenticated at the
global root but not inside the profile, the profile's credential_pool
comes back empty and the process fails with 'No LLM provider configured'
— even though the credentials are sitting in ~/.hermes/auth.json. #18594
propagated HERMES_HOME correctly, which is what surfaced this: workers
now land in the right profile, and the profile turns out to shadow global
with no fallback.
Semantics (read-only, per-provider shadowing):
* Profile has any entries for provider X → use profile only (global ignored).
* Profile has zero entries for provider X → fall back to global.
* Writes (write_credential_pool, _save_auth_store) still target the profile.
* Classic mode (HERMES_HOME == global root) skips the fallback entirely —
_global_auth_file_path() returns None.
Also mirrors the fallback in get_provider_auth_state so OAuth singletons
(nous, minimax-oauth, openai-codex, spotify) inherit cleanly — the Nous
shared-token store (PR #19712) remains the authoritative path for Nous
OAuth rotation, this just makes the read side consistent with it.
Seat belt: _load_global_auth_store() refuses to read the real user's
~/.hermes/auth.json under PYTEST_CURRENT_TEST even when HERMES_HOME points
to a profile-shaped path. Guard uses $HOME (stable across fixtures) rather
than Path.home() (which fixtures often monkeypatch to a tmp root).
Reported by @SeedsForbidden on Twitter as the credential_pool shadowing
follow-up to the #18594 fix.
- Expand migration comment to name the primary failure mode (missing
column OperationalError from #20842) ahead of the secondary SQLite
schema-reparse concern; also document the stale-cols-snapshot invariant
- Add clarifying comments on from_row() legacy fallback branches noting
they are belt-and-suspenders dead code post-migration
- Add task_events comment in existing test explaining why the table is
required by the migrator
- Add test_legacy_migration_no_legacy_columns_at_all: Scenario A —
explicitly asserts the exact #20842 crash no longer occurs and that
consecutive_failures defaults to 0 on a DB that never had spawn_failures
- Add test_legacy_migration_both_columns_already_present: Scenario D —
asserts the migration is a no-op when both columns already exist,
preserving the existing counter value
Adds SearXNG as a free, self-hosted web search provider. SearXNG is a
privacy-respecting metasearch engine that requires no API key — just a
running instance and SEARXNG_URL pointing at it.
## What this adds
- `tools/web_providers/searxng.py` — `SearXNGSearchProvider` implementing
`WebSearchProvider` (search only; no extract capability)
- `_is_backend_available("searxng")` — gates on SEARXNG_URL
- `_get_backend()` — accepts "searxng" as a configured value; adds it to
auto-detect candidates (lower priority than paid services)
- `web_search_tool` — dispatches to SearXNG when it is the active backend
- `check_web_api_key()` — includes SearXNG in availability check
- `OPTIONAL_ENV_VARS["SEARXNG_URL"]` — registered with tools=["web_search"]
- `tools_config.py` — SearXNG appears in the `hermes tools` provider picker
- `nous_subscription.py` — `direct_searxng` detection, web_active / web_available
- `setup.py` — SEARXNG_URL listed in the missing-credential hint
- 23 tests covering: is_configured, happy-path search, score sorting, limit,
HTTP/request errors, _is_backend_available, _get_backend, check_web_api_key
## Config
```yaml
# Use SearXNG for search, any paid provider for extract
web:
search_backend: "searxng"
extract_backend: "firecrawl"
# Or: SearXNG as the sole backend (web_extract will use the next available)
web:
backend: "searxng"
```
SearXNG is search-only — it does not implement WebExtractProvider. Users
who only configure SEARXNG_URL get web_search available; web_extract falls
back to the next available extract provider (or is unavailable if none).
Closes#19198 (Phase 2 Task 4 — SearXNG provider)
Ref: #11562 (original SearXNG PR)
Introduce the foundation for independently selecting web search and
extract backends — enabling future combinations like SearXNG for
search + Firecrawl for extract.
Architecture:
- tools/web_providers/base.py: WebSearchProvider and WebExtractProvider
ABCs with normalized result contracts (mirrors CloudBrowserProvider)
- tools/web_tools.py: _get_search_backend() and _get_extract_backend()
read per-capability config keys, fall through to shared web.backend
- hermes_cli/config.py: web.search_backend and web.extract_backend in
DEFAULT_CONFIG (empty = inherit from web.backend)
Behavioral change:
- web_search_tool() now dispatches via _get_search_backend()
- web_extract_tool() now dispatches via _get_extract_backend()
- When per-capability keys are empty (default), behavior is identical
to before — _get_search_backend() falls through to _get_backend()
This is purely structural — no new backends are added. SearXNG and
other search-only/extract-only providers can now be added as simple
drop-in modules in follow-up PRs.
12 new tests, 49 existing tests pass with zero regressions.
Ref: #19198
Same Hermes Teal palette as the default theme, but with baseSize 18px,
lineHeight 1.65, and spacious density so the whole dashboard scales up.
Gives users a one-click bigger-text preset and a copyable reference for
authoring custom YAML themes with their own typography settings.
OpenCode Go and OpenCode Zen are flat-namespace model resellers — their
/v1/models returns bare IDs (deepseek-v4-flash, minimax-m2.7), and the
inference API rejects vendor-prefixed names with HTTP 401 'Model not
supported'. Two bugs fixed:
1. `switch_model` in hermes_cli/model_switch.py was silently switching the
user off opencode-go to native deepseek when they typed
`/model deepseek-v4-flash`. Step d found the model in opencode-go's live
catalog, but step e (detect_provider_for_model) still ran and matched
the bare name against deepseek's static catalog. Fix: track whether
the live catalog resolved it; skip step e when it did.
2. `normalize_model_for_provider` in hermes_cli/model_normalize.py only
stripped the exact `opencode-zen/` prefix, leaving arbitrary vendor
prefixes like `minimax/minimax-m2.7` (commonly copied from aggregator
slugs into fallback_model configs) intact — causing HTTP 401s when
the fallback chain activated. Fix: opencode-go/opencode-zen strip ANY
leading vendor prefix because their APIs are flat-namespace.
Tests: 11 new cases in tests/hermes_cli/test_opencode_go_flat_namespace.py
covering both normalization (prefix stripping, regression guards for
opencode-zen Claude hyphenation and openrouter vendor-prepending) and
switch_model (bare-name resolution on opencode-go's live catalog must
not trigger cross-provider hijack).
Reported by @Ufonik via Discord; Kimi K2.6 always worked because moonshotai
has no overlapping entry in a native provider's static catalog. Deepseek
and minimax failed because their v4/v2.7 names existed in the native
deepseek/minimax catalogs.
* feat(skills/linear): add Documents support + Python helper script
The bundled Linear skill (PR #1230) covered issues, projects, teams, and
workflow states via curl. It had no coverage for Linear's Documents API,
so fetching an RFC/doc from a linear.app URL required hand-writing
GraphQL against an underdocumented schema.
Adds:
- Documents section in SKILL.md explaining slugId extraction from URLs,
the contentState (markdown) vs contentState (ProseMirror) split, and
four canonical curl examples (fetch by slugId, fetch by UUID, list
recent, title-search).
- scripts/linear_api.py — stdlib-only Python CLI wrapping the most
common operations (whoami, list-teams, list/get/search/create/update
issues, add-comment, update-status, list/get/search documents, raw
GraphQL passthrough). Zero deps, reads LINEAR_API_KEY from env.
Auth header quirk (personal key takes bare $LINEAR_API_KEY, no Bearer
prefix) is already documented in the skill.
Found during RFC review: the existing skill's lack of document support
forced falling back to the browser (which hit Linear's login wall).
Also fixes a schema gotcha — the Document field is `contentState`, not
`contentData` (which returns 400).
Tested end-to-end against the production API:
python3 linear_api.py whoami
python3 linear_api.py get-document 38359beef67c
Both return expected payloads.
* fix(skills/linear): point LINEAR_API_KEY setup to the correct page
The org-level Settings > API page (/settings/api) only shows OAuth apps
and workspace-member keys. Personal API keys live under Account,
Security, access (/settings/account/security). Update both the setup
link in config.py (shown during hermes setup) and the setup step in
SKILL.md so users land on the page that can create a personal key.
* docs(providers): add model-provider-plugin authoring guide + fix stale refs
New docs:
- website/docs/developer-guide/model-provider-plugin.md — full authoring
guide (directory layout, minimal example, ProviderProfile fields,
overridable hooks, user overrides, api_mode selection, auth types,
testing, pip distribution)
- Wired into website/sidebars.ts under 'Extending'
- Cross-references added in:
- guides/build-a-hermes-plugin.md (tip block)
- developer-guide/adding-providers.md
- developer-guide/provider-runtime.md
User guide:
- user-guide/features/plugins.md: Plugin types table grows from 3 to 4
with 'Model providers' row
Stale comment cleanup (providers/*.py → plugins/model-providers/<name>/):
- hermes_cli/main.py:_is_profile_api_key_provider docstring
- hermes_cli/doctor.py:_build_apikey_providers_list docstring
- hermes_cli/auth.py: PROVIDER_REGISTRY + alias auto-extension comments
- hermes_cli/models.py: CANONICAL_PROVIDERS auto-extension comment
AGENTS.md:
- Project-structure tree: added plugins/model-providers/ row
- New section: 'Model-provider plugins' explaining discovery, override
semantics, PluginManager integration, kind auto-coerce heuristic
Verified: docusaurus build succeeds, new page renders, all 3 cross-links
resolve. 347/347 targeted tests pass (tests/providers/,
tests/hermes_cli/test_plugins.py, tests/hermes_cli/test_runtime_provider_resolution.py,
tests/run_agent/test_provider_parity.py).
* docs(plugins): add 'pluggable interfaces at a glance' maps to plugins.md + build-a-hermes-plugin
Devs landing on either the user-guide plugin page or the build-a-plugin
guide now get an upfront table of every distinct pluggable surface with
a link to the right authoring doc. Previously they'd have to read the
full general-plugin guide to discover that model providers / platforms
/ memory / context engines are separate systems.
user-guide/features/plugins.md:
- New 'Pluggable interfaces — where to go for each' section below the
existing 4-kinds table
- 10 rows covering every register_* surface (tool, hook, slash command,
CLI subcommand, skill, model provider, platform, memory, context
engine, image-gen)
- Explicit note: TTS/STT are NOT plugin-extensible yet — documented
with a pointer to the current config.yaml 'command providers' pattern
and a note that register_tts_provider()/register_stt_provider() may
come later
guides/build-a-hermes-plugin.md:
- New :::info 'Not sure which guide you need?' map at the top so devs
see all pluggable interfaces before investing in this 737-line
general-plugin walkthrough
- Existing bottom :::tip expanded to include platform adapters alongside
model/memory/context plugins
Verified:
- All 8 cross-doc links in the new plugins.md table resolve in a
docusaurus build (SUCCESS, no new broken links)
- TTS link corrected (features/voice → features/tts; latter exists)
- Pre-existing broken links/anchors (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist) are unchanged
* docs(plugins): correct TTS/STT pluggability \u2014 they ARE plugins (command-providers)
Previous commit incorrectly said TTS/STT 'aren't plugin-extensible'. They
are, via the config-driven command-provider pattern \u2014 any CLI that reads
text and writes audio (or vice versa for STT) is automatically a plugin
with zero Python. The tts.md docs cover this extensively and I missed it.
plugins.md:
- TTS row: 'Config-driven (not a Python plugin)', points at
tts.md#custom-command-providers
- STT row: points at tts.md#voice-message-transcription-stt (STT docs
live in tts.md despite the filename)
- Expanded note: TTS/STT use config-driven shell-command templates as
their plugin surface (full tts.providers.<name> registry for TTS;
HERMES_LOCAL_STT_COMMAND escape hatch for STT)
- Any CLI that reads/writes files is automatically a plugin \u2014 no Python
register_* API needed
- Future register_tts_provider()/register_stt_provider() hooks mentioned
as nice-to-have for SDK/streaming cases, not as the primary story
build-a-hermes-plugin.md:
- Same map update: TTS/STT rows explicit, footer note corrected
Verified:
- tts.md anchors (custom-command-providers, voice-message-transcription-stt)
exist and resolve in docusaurus build (SUCCESS, no new broken links)
* docs(plugins): expand pluggable interfaces table with MCP / event hooks / shell hooks / skill taps
Broadened the scope beyond Python register_* hooks. Hermes has MULTIPLE
plugin-style extension surfaces; they're now all in one table instead of
being scattered across feature docs.
Added rows for:
- **MCP servers** — config.yaml mcp_servers.<name> auto-registers external
tools from any MCP server. Huge extensibility surface, previously not
linked from the plugin map.
- **Gateway event hooks** — drop HOOK.yaml + handler.py into
~/.hermes/hooks/<name>/ to fire on gateway:startup, session:*, agent:*,
command:* events. Separate from Python plugin hooks.
- **Shell hooks** — hooks: block in config.yaml runs shell commands on
events (notifications, auditing, etc.).
- **Skill sources (taps)** — hermes skills tap add <repo> to pull in new
skill registries beyond the built-in sources.
Both docs updated:
- user-guide/features/plugins.md: table column renamed to 'How' (mixes
Python API + config-driven + drop-in-dir surfaces accurately)
- guides/build-a-hermes-plugin.md: :::info map at top mirrors the new
surfaces with a forward-link to the consolidated table
Note block rewritten: instead of singling out TTS/STT as the 'different
style' exception, now honestly describes that Hermes deliberately
supports three plugin styles — Python APIs, config-driven commands, and
drop-in manifest directories — and devs should pick the one that fits
their integration.
Not included (considered and rejected):
- Transport layer (register_transport) — internal, not user-facing
- Tool-call parsers — internal, VLLM phase-2 thing
- Cloud browser providers — hardcoded registry, not drop-in yet
- Terminal backends — hardcoded if/elif, not drop-in yet
- Skill sources (the ABC) — hardcoded list, only taps are user-extensible
Verified:
- All 5 new anchors resolve (gateway-event-hooks, shell-hooks, skills-hub,
custom-command-providers, voice-message-transcription-stt)
- Docusaurus build SUCCESS, zero new broken links
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): cover every pluggable surface in both the overview and how-to
Both plugins.md and build-a-hermes-plugin.md now cover every extension
surface end-to-end \u2014 general plugin APIs, specialized plugin types,
config-driven surfaces \u2014 with concrete authoring patterns for each.
plugins.md:
- 'What plugins can do' table grows from 9 rows (general ctx.register_*
only) to 14 rows covering register_platform, register_image_gen_provider,
register_context_engine, MemoryProvider subclass, register_provider
(model). Each row links to its full authoring guide.
- New 'Plugin sub-categories' section under Plugin Discovery explains
how plugins/platforms/, plugins/image_gen/, plugins/memory/,
plugins/context_engine/, plugins/model-providers/ are routed to
different loaders \u2014 PluginManager vs the per-category own-loader
systems.
- Explicit mention of user-override semantics at
~/.hermes/plugins/model-providers/ and ~/.hermes/plugins/memory/.
build-a-hermes-plugin.md:
- New '## Specialized plugin types' section (5 sub-sections):
- Model provider plugins \u2014 ProviderProfile + plugin.yaml example,
auto-wiring summary, link to full guide
- Platform plugins \u2014 BasePlatformAdapter + register_platform() skeleton
- Memory provider plugins \u2014 MemoryProvider subclass example
- Context engine plugins \u2014 ContextEngine subclass example
- Image-generation backends \u2014 ImageGenProvider + kind: backend example
- New '## Non-Python extension surfaces' section (5 sub-sections):
- MCP servers \u2014 config.yaml mcp_servers.<name> example
- Gateway event hooks \u2014 HOOK.yaml + handler.py example
- Shell hooks \u2014 hooks: block in config.yaml example
- Skill sources (taps) \u2014 hermes skills tap add example
- TTS / STT command templates \u2014 tts.providers.<name> with type: command
- Distribute via pip / NixOS promoted from ### to ## (they were orphaned
after the reorganization)
Each specialized / non-Python section has a concrete, copy-pasteable
example plus a 'Full guide:' link to the authoritative doc. Devs arriving
at the build-a-hermes-plugin guide now see every extension surface at
their disposal, not just the general tool/hook/slash-command surface.
Verified:
- Docusaurus build SUCCESS, zero new broken links
- All new cross-links (developer-guide/model-provider-plugin,
adding-platform-adapters, memory-provider-plugin, context-engine-plugin,
user-guide/features/mcp, skills#skills-hub, hooks#gateway-event-hooks,
hooks#shell-hooks, tts#custom-command-providers,
tts#voice-message-transcription-stt) resolve
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): fix opt-in inconsistency — not every plugin is gated
The 'Every plugin is disabled by default' statement was wrong. Several
plugin categories intentionally bypass plugins.enabled:
- Bundled platform plugins (IRC, Teams) auto-load so shipped gateway
channels are available out of the box. Activation per channel is via
gateway.platforms.<name>.enabled.
- Bundled backends (plugins/image_gen/*) auto-load so the default
backend 'just works'. Selection via <category>.provider config.
- Memory providers are all discovered; one is active via memory.provider.
- Context engines are all discovered; one is active via context.engine.
- Model providers: all 33 discovered at first get_provider_profile();
user picks via --provider / config.
The plugins.enabled allow-list specifically gates:
- Standalone plugins (general tools/hooks/slash commands)
- User-installed backends
- User-installed platforms (third-party gateway adapters)
- Pip entry-point backends
Which matches the actual code in hermes_cli/plugins.py:737 where the
bundled+backend/platform check bypasses the allow-list.
Rewrote '## Plugins are opt-in' to:
- Retitle to 'Plugins are opt-in (with a few exceptions)'
- Narrow opening claim to 'General plugins and user-installed backends
are disabled by default'
- Added 'What the allow-list does NOT gate' subsection with a full
table of which bypass the gate and how they're activated instead
- Fixed migration section wording (bundled platform/backend plugins
never needed grandfathering)
Verified: docusaurus build SUCCESS, zero new broken links.
Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.
Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:
1. Each working directory got its own full shadow git repo — no object
dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
/rollback listing. Loose objects accumulated forever.
3. Defaults: enabled=True, auto_prune=False — users paid the disk cost
without ever asking for /rollback.
Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.
Changes
-------
- tools/checkpoint_manager.py: full rewrite. Single bare store, per-project
refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>),
per-project metadata (store/projects/<hash>.json with workdir +
created_at + last_touch). On first v2 init, any pre-v2 per-directory
shadow repos are auto-migrated into legacy-<timestamp>/ so the new
store starts clean. _prune() now actually rewrites the per-project ref
to the last max_snapshots commits and runs git gc --prune=now. New
_enforce_size_cap() drops oldest commits round-robin across projects
when the store exceeds max_total_size_mb. _drop_oversize_from_index()
filters any single file larger than max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
(status / list / prune / clear / clear-legacy) for managing the store
outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
*.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
coverage for the pre-v2 prune path (per-directory shadow repos under
base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
shared store, new defaults, migration, and the new CLI;
reference/cli-commands.md documents 'hermes checkpoints'.
E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
--retention-days 0' removes its ref, index, and metadata; GC reclaims
the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
CLI without an agent running.
Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
On Termux/Android aarch64 (and other platforms without prebuilt wheels
for some optional extras), 'pip install -e .[all]' compiles C/Rust
extensions from source. This can run for several minutes with zero
network activity and — with --quiet — zero stdout. Users report
'hermes update hangs at Updating Python dependencies', Ctrl+C it, then
re-run and see 'up to date' (because git pull already succeeded and the
pip step was still working when they interrupted).
Pip's default output is proportional to actual work (one line per
Collecting / Building wheel for X / Installing), so removing --quiet
costs nothing on fast hardware and prevents the false-hang interrupt
loop on slow hardware.
Reported via Discord on Termux/Android. Supersedes #20466 which
misdiagnosed the hang as PYTHONPATH shadowing (install.sh doesn't run
during 'hermes update', and terminal() doesn't inherit PYTHONPATH).