Add _find_bundled_tui() that checks for hermes_cli/tui_dist/entry.js
(present in wheel installs) and wire it into _make_tui_argv() between
the HERMES_TUI_DIR prebuilt path and the npm install fallback.
Stop the gateway from exiting (or systemd-restart-looping) when a single
messaging adapter fails at startup or runtime. A misconfigured WhatsApp
(npm install timeout, unpaired bridge, missing creds.json) used to take
the entire gateway down, killing cron jobs and any other connected
platforms with it.
Changes:
• Startup (gateway/run.py): when connected_count==0 but the only
errors are retryable, log a degraded-state warning and keep the
gateway alive instead of returning False. Reconnect watcher then
recovers platforms as their underlying problem clears.
• Runtime (gateway/run.py _handle_adapter_fatal_error): when the last
adapter goes down with a retryable error and is queued for
reconnection, stay alive instead of exit-with-failure. Previously
this triggered systemd Restart=on-failure, which created infinite
restart loops on persistent retryable failures (proxy outage,
repeated bridge crashes).
• Reconnect watcher (gateway/run.py _platform_reconnect_watcher):
replace the 20-attempt hard drop with a circuit-breaker pause.
After _PAUSE_AFTER_FAILURES (10) consecutive retryable failures, the
platform stays in _failed_platforms with paused=True so the watcher
skips it but the operator can still see and resume it. Non-retryable
errors still drop out of the queue immediately. Resolves#17063
(gateway giving up on Telegram after 20 attempts).
• WhatsApp preflight (gateway/platforms/whatsapp.py): refuse to start
the Node bridge when creds.json is missing. Sets a non-retryable
whatsapp_not_paired fatal error so the watcher drops it cleanly
with a single 'run hermes whatsapp' log line instead of paying the
30s bridge bootstrap timeout on every gateway start.
• WhatsApp setup ordering (hermes_cli/main.py cmd_whatsapp): only set
WHATSAPP_ENABLED=true once pairing actually succeeds. Previously
the wizard wrote the env var at step 2 (before npm install and QR
pairing), so any Ctrl+C left .env claiming WhatsApp was ready when
the bridge had no creds.json. Also propagate the env var when the
user keeps an existing pairing on a re-run.
• /platform slash command (hermes_cli/commands.py + gateway/run.py):
new gateway-only command for manual circuit-breaker control.
/platform list — show connected + failed/paused platforms
/platform pause <name> — silence a known-broken platform
/platform resume <name> — re-queue a paused platform
Tests:
• New: pause/resume helpers, /platform list|pause|resume command,
WhatsApp creds.json preflight, WhatsApp setup ordering.
• Updated: stale assertions that codified the old 'exit and let
systemd restart' behavior in test_runner_fatal_adapter.py,
test_runner_startup_failures.py, and test_platform_reconnect.py
(the 20-attempt give-up test became a circuit-breaker pause test).
5488 tests pass in tests/gateway/.
Per @mark-xai's review on PR #26457 and the xAI model retirement on
2026-05-15: grok-code-fast-1 is being retired today and aliases redirect
to grok-4.3 (already pinned to the top of the xAI model list by this
PR). Update the two xAI Responses-API test fixtures Mark flagged plus
the picker fallback default in hermes_cli/main.py that uses the same
literal.
Adds a new authentication provider that lets SuperGrok subscribers sign
in to Hermes with their xAI account via the standard OAuth 2.0 PKCE
loopback flow, instead of pasting a raw API key from console.x.ai.
Highlights
----------
* OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery,
state/nonce, and a strict CORS-origin allowlist on the callback.
* Authorize URL carries `plan=generic` (required for non-allowlisted
loopback clients) and `referrer=hermes-agent` for best-effort
attribution in xAI's OAuth server logs.
* Token storage in `auth.json` with file-locked atomic writes; JWT
`exp`-based expiry detection with skew; refresh-token rotation
synced both ways between the singleton store and the credential
pool so multi-process / multi-profile setups don't tear each other's
refresh tokens.
* Reactive 401 retry: on a 401 from the xAI Responses API, the agent
refreshes the token, swaps it back into `self.api_key`, and retries
the call once. Guarded against silent account swaps when the active
key was sourced from a different (manual) pool entry.
* Auxiliary tasks (curator, vision, embeddings, etc.) route through a
dedicated xAI Responses-mode auxiliary client instead of falling back
to OpenRouter billing.
* Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen
plugin) resolve credentials through a unified runtime → singleton →
env-var fallback chain so xai-oauth users get them for free.
* `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are
wired through the standard auth-commands surface; remove cleans up
the singleton loopback_pkce entry so it doesn't silently reinstate.
* `hermes model` provider picker shows
"xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls
back to pool credentials when the singleton is missing.
Hardening
---------
* Discovery and refresh responses validate the returned
`token_endpoint` host against the same `*.x.ai` allowlist as the
authorization endpoint, blocking MITM persistence of a hostile
endpoint.
* Discovery / refresh / token-exchange `response.json()` calls are
wrapped to raise typed `AuthError` on malformed bodies (captive
portals, proxy error pages) instead of leaking JSONDecodeError
tracebacks.
* `prompt_cache_key` is routed through `extra_body` on the codex
transport (sending it as a top-level kwarg trips xAI's SDK with a
TypeError).
* Credential-pool sync-back preserves `active_provider` so refreshing
an OAuth entry doesn't silently flip the active provider out from
under the running agent.
Testing
-------
* New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests)
covers JWT expiry, OAuth URL params (plan + referrer), CORS origins,
redirect URI validation, singleton↔pool sync, concurrency races,
refresh error paths, runtime resolution, and malformed-JSON guards.
* Extended `test_credential_pool.py`, `test_codex_transport.py`, and
`test_run_agent_codex_responses.py` cover the pool sync-back,
`extra_body` routing, and 401 reactive refresh paths.
* 165 tests passing on this branch via `scripts/run_tests.sh`.
The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp)
gets a Python-only install. Browser tools depend on the agent-browser npm
package + Chromium, neither of which are in the wheel. Without an
explicit bootstrap, registry users have no path to working browser tools.
Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows
PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New
entry points:
hermes acp --setup-browser # interactive; prompts before Chromium download
hermes acp --setup-browser --yes # non-interactive
hermes-acp --setup-browser
The terminal-auth flow (hermes acp --setup) also offers the browser
bootstrap as a follow-up after model selection, so first-run registry
users get the option without knowing the flag exists.
Key design choices:
- npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node
on PATH is respected; only the install target is redirected to the
user-writable Hermes-managed Node prefix.
- tools/browser_tool.py::_browser_candidate_path_dirs() already walks
$HERMES_HOME/node/bin, so installed binaries are discovered with no
agent-side code change.
- System Chrome/Chromium detection short-circuits the ~400 MB Playwright
download when a suitable browser already exists.
- Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/.
Not duplicated under scripts/. install.sh and install.ps1 keep their
inline browser blocks for the source-checkout path.
E2E validated end-to-end:
bash bootstrap_browser_tools.sh --skip-chromium
→ installs agent-browser into ~/.hermes/node/bin/
tools.browser_tool._find_agent_browser()
→ returns the installed path
check_browser_requirements()
→ returns True (browser tools register)
Tests:
- tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch
(linux + windows + --yes forwarding + failure propagation), the
terminal-auth follow-up prompt path, and a package-data wheel-shipping
assertion that catches any future pyproject.toml regression.
Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools
(optional)' subsection with the two-line install + what-it-does.
Codex review pointed out that even with the sync-assets fix applied,
_build_web_ui still crashes on a stock Windows console before reaching
npm: Python stdout defaults to cp1252 (or similar) and raises
UnicodeEncodeError when print() hits the arrow/check glyphs used for
status messages (→, ✗, ⚠, ✓). Reproduced locally in PowerShell:
$ PYTHONIOENCODING=cp1252 python -c "from hermes_cli.main import _build_web_ui; _build_web_ui(Path('web'), fatal=True)"
UnicodeEncodeError: 'charmap' codec can't encode character '\u2192' ...
The previous PR body claimed "end-to-end verified on Windows 11", but
that was under the venv's default (utf-8) stdout. A plain `py` or
PowerShell invocation would still fail before sync-assets ever ran.
Fix: inner _say() helper that falls back to
text.encode(sys.stdout.encoding, errors="replace")
when print() raises UnicodeEncodeError. Glyphs degrade to '?' on
ASCII / cp1252 consoles; utf-8 consoles are unaffected. Verified the
full build pipeline runs to completion with PYTHONIOENCODING=cp1252.
Scoped tightly to _build_web_ui (the function this PR already touches);
other call sites in the codebase with the same risk are out of scope.
Three Windows-only bugs in the web-dashboard build path. Each is small,
scoped, and verified end-to-end on Windows 11 — including under a stock
cmd.exe / PowerShell console with its default cp1252 encoding.
1. `sync-assets` shells out to Unix-only commands
web/package.json hard-codes `rm -rf … && cp -r …`. Neither exists on
Windows cmd.exe. `hermes_cli/main.py::_build_web_ui` runs npm via
subprocess (which on Windows defaults to cmd.exe), so the prebuild
hook crashed before Vite ever ran and the dashboard never built.
Fix: web/scripts/sync-assets.mjs — ~20 lines of Node using fs.rmSync
+ fs.cpSync (stdlib, Node >= 16.7). No new deps, identical behavior
on POSIX and Windows.
2. Build failures were silent
_build_web_ui ran both subprocess calls with capture_output=True and
never relayed the captured buffers on failure. Users saw 'Web UI
build failed' and nothing else — no stdout, no stderr, no hint that
the real problem was 'rm is not recognized'.
Fix: inner _relay() helper that decodes and prints stdout + stderr
(utf-8, errors='replace') whenever a step returns non-zero. Replaces
the existing stderr_tail-only relay on the build path; success path
is unchanged. (stderr_tail is preserved for the stale-dist fallback
branch added by #23817.)
Salvaged from #13368 by @johnisag onto current main. Conflict
resolution preserves main's improvements:
- _run_npm_install_deterministic() (replaces bare subprocess.run for
npm install)
- npm-build retry-after-sleep for Windows boot-time races (#23817)
- stale-dist fallback for non-interactive callers (#23817)
Closes#25073, #13368.
Adds 'hermes proxy start' — a local HTTP server that lets external apps
(OpenViking, Karakeep, Open WebUI, ...) use a Hermes-managed provider
subscription as their LLM endpoint. The proxy attaches the user's real
OAuth-resolved credentials to each forwarded request, refreshing them
automatically; the client can send any bearer (it gets stripped).
Ships with one adapter — Nous Portal. The UpstreamAdapter ABC and
registry in hermes_cli/proxy/adapters/ are designed for additional
OAuth providers to plug in by name without server changes.
Commands:
hermes proxy start [--provider nous] [--host 127.0.0.1] [--port 8645]
hermes proxy status
hermes proxy providers
Allowed Portal paths: /v1/chat/completions, /v1/completions,
/v1/embeddings, /v1/models. Anything else returns 404 with a clear
error pointing at the allowed list.
aiohttp is gated like gateway/platforms/api_server.py (try-import,
clean runtime error if missing). No new core dependency.
Tests: 24 unit tests + 1 separate E2E that spawns the real subprocess
and verifies the upstream receives the right bearer with the client's
header stripped.
Pyproject's [all] extra was slimmed down in May 2026 — ~20 optional
backends moved to tools/lazy_deps.py and only install on first use.
hermes update runs uv pip install -e .[all] which doesn't touch any of
them, so pin bumps in LAZY_DEPS (CVE response, transitive fixes) were
silently ignored on already-activated backends.
Two changes:
1. _is_satisfied() now parses the spec and checks the installed version
against the constraint via packaging.specifiers. Previously it
returned True the moment the package name was importable, which made
ensure() a name-presence gate rather than a version-pin gate.
2. New active_features() / refresh_active_features() pair: lists every
feature with at least one of its packages currently installed, then
re-runs ensure() on each. Refresh is invoked at the end of
_cmd_update_impl, right after the [all] install completes. Cold
backends (never activated) stay quiet — no churn for them.
Output during update is one summary block:
→ Refreshing 4 active lazy backend(s)...
↑ 1 refreshed: provider.anthropic
✓ 3 already current
or
⚠ memory.honcho failed to refresh: <pip stderr>
Failures never raise out of update — backends keep their previously-
installed version and we tell the user to rerun once upstream is fixed.
security.allow_lazy_installs=false is honored: features get marked
"skipped" with the reason shown.
Tests: 18 new unit tests covering version-aware satisfaction (exact pin,
range, extras blocks, missing package, malformed spec), active feature
discovery, and refresh status reporting. All 61 lazy_deps tests pass.
Before: when `OPENROUTER_API_KEY` (or `AI_GATEWAY_API_KEY`) was already
set in ~/.hermes/.env, `hermes model openrouter` / `hermes model
ai-gateway` skipped the API-key prompt entirely and jumped straight to
the model picker. Users with a broken / expired / wrong key had no way
to replace it without editing ~/.hermes/.env by hand or re-running
`hermes setup` from scratch.
Both flows now route through the existing `_prompt_api_key()` helper,
which surfaces [K]eep / [R]eplace / [C]lear when a key is already
configured — the same UX the generic API-key providers (z.ai, MiniMax,
Gemini, etc.) and the Daytona setup already use.
Add NovitaAI as a first-class provider with dedicated model selection
flow, live pricing, and authoritative context length resolution.
- Register provider in PROVIDER_REGISTRY, HERMES_OVERLAYS, and all
alias/label maps (ID: novita, aliases: novita-ai, novitaai)
- Add dedicated _model_flow_novita() with 3-tier model list fallback:
Novita API → models.dev → static curated list
- Fetch live pricing from /v1/models with correct unit conversion
(input_token_price_per_m is 0.0001 USD per Mtok)
- Add Novita-specific context length resolution (step 4b) in
get_model_context_length(), prioritized over models.dev/OpenRouter
- Register api.novita.ai in _URL_TO_PROVIDER to prevent early return
from the custom-endpoint code path
- Add models.dev mapping (novita → novita-ai)
- Add default auxiliary model (deepseek/deepseek-v3-0324)
- Add NOVITA_API_KEY to test isolation (conftest.py)
- Update docs: providers page, env vars reference, CLI reference,
.env.example, README, and landing page
Adds an explicit API compatibility mode prompt to the `hermes model -> custom`
flow so Codex-compatible third-party endpoints (and any other non-default
backend whose URL doesn't match the existing heuristics in
`_detect_api_mode_for_url`) can be selected explicitly instead of silently
falling back to chat_completions.
Choices: Auto-detect / chat_completions / codex_responses / anthropic_messages.
Persists `api_mode` to:
- `model.api_mode` (active session config)
- the matching `custom_providers[*]` entry (so re-activating the named
provider next time replays the same transport)
Salvaged from PR #6125 onto current main: kept the new prompt and the
`_save_custom_provider(api_mode=...)` plumbing; the named-custom flow
already extracts and applies `api_mode` from the saved entry on current
main so those changes are preserved as-is. Test fixtures updated for the
new prompt and the existing display-name prompt.
Co-authored-by: littlewwwhite <1095245867@qq.com>
`lsp` is registered as a top-level subparser in `main()` (lines 9539-9545)
via `agent.lsp.cli.register_subparser`, so it shows up in `hermes --help`
output alongside the other built-ins. The `_BUILTIN_SUBCOMMANDS` set used
by `_plugin_cli_discovery_needed` to short-circuit the ~500-650ms plugin
import pass did not list it, so every `hermes lsp ...` invocation paid
the full discovery cost despite being a fully-built-in command.
This is also caught by the parity guard added in #22120:
`tests/hermes_cli/test_startup_plugin_gating.py::test_builtin_set_covers_every_registered_subcommand`
has been failing on clean origin/main with:
AssertionError: _BUILTIN_SUBCOMMANDS is missing these live
subcommands: ['lsp']. Add them to hermes_cli/main.py::_BUILTIN_SUBCOMMANDS
so plugin discovery can be skipped when the user targets them.
Fix: add `"lsp"` to the frozenset (alphabetical position between `logs`
and `mcp`). The accompanying `test_builtin_set_has_no_phantom_entries`
guard still passes because `lsp` is genuinely live — registered via the
guarded `try/except Exception` in main() since #24168.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Salvage of #21063 — adds 'Weixin, and more' to module-level docstrings
in gateway/__init__.py, gateway/config.py, gateway/platforms/base.py
and the 'hermes gateway' subparser description.
Co-authored-by: wuwuzhijing <chuang.guo@hopechart.com>
* feat(lsp): semantic diagnostics from real language servers in write_file/patch
Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server,
clangd, bash-language-server, ...) into the post-write lint check used by write_file
and patch. The model now sees type errors, undefined names, missing imports, and
project-wide semantic issues introduced by its edits, not just syntax errors.
LSP is gated on git workspace detection: when the agent's cwd or the file being
edited is inside a git worktree, LSP runs against that workspace; otherwise the
existing in-process syntax checks are the only tier. This keeps users on
user-home cwds (Telegram/Discord gateway chats) from spawning daemons.
The post-write check is layered: in-process syntax check first (microseconds),
then LSP semantic diagnostics second when syntax is clean. Diagnostics are
delta-filtered against a baseline captured at write start, so the agent only
sees errors its edit introduced. A flaky/missing language server can never
break a write -- every LSP failure path falls back silently to the syntax-only
result.
New module agent/lsp/ split into:
- protocol.py: Content-Length JSON-RPC framer + envelope helpers
- client.py: async LSPClient (spawn, initialize, didOpen/didChange,
ContentModified retry, push/pull diagnostic stores)
- workspace.py: git worktree walk-up + per-server NearestRoot resolver
- servers.py: registry of 26 language servers (extension match,
root resolver, spawn builder per language)
- install.py: auto-install dispatch (npm install --prefix, go install
with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/
- manager.py: LSPService (per-(server_id, root) client registry, lazy
spawn, broken-set, in-flight dedupe, sync facade for tools layer)
- reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file)
- cli.py: hermes lsp {status,list,install,install-all,restart,which}
Wired into tools/file_operations.py:
- write_file/patch_replace now call _snapshot_lsp_baseline before write
- _check_lint_delta gains a third tier: LSP semantic diagnostics when
syntax is clean
- All LSP code paths swallow exceptions; write_file's contract unchanged
Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true),
wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server
overrides (disabled, command, env, initialization_options).
Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and
read_message round-trip, EOF/truncation/missing Content-Length), workspace
gate (git walk-up, exclude markers, fallback to file location), reporter
(severity filter, max-per-file cap, truncation), service-level delta filter,
and an in-process mock LSP server that exercises the full client lifecycle
including didChange version bumps, dedup, crash recovery, and idempotent
teardown.
Live E2E verified end-to-end through ShellFileOperations: pyright
auto-installed via npm into HERMES_HOME, baseline captured, type error
introduced, single delta diagnostic surfaced with correct line/column/code/
source, then patch fix removes the diagnostic from the output.
Docs: new website/docs/user-guide/features/lsp.md page covering supported
languages, configuration knobs, performance characteristics, and
troubleshooting; cli-commands.md updated with the 'hermes lsp' reference;
sidebar updated.
* feat(lsp): structured logging, backend gate, defensive walk caps
Cherry-picks the substantive ideas from #24155 (different scope, same
problem space) onto our PR.
agent/lsp/eventlog.py (new): dedicated structured logger
``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets
keep a 1000-write session at exactly ONE INFO line ("active for
<root>") at the default INFO threshold; clean writes log at DEBUG so
they never reach agent.log under normal config. State transitions
(server starts, no project root for a file, server unavailable) fire
at INFO/WARNING once per (server_id, key); novel events (timeouts,
unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\\['``.
agent/lsp/manager.py: wire the eventlog into _get_or_spawn and
get_diagnostics_sync so users can answer "did LSP fire on this edit?"
with a single grep, plus surface "binary not on PATH" warnings once
instead of silently retrying every write.
tools/file_operations.py: backend-type gate. ``_lsp_local_only()``
returns False for non-local backends (Docker / Modal / SSH /
Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics``
now skip entirely on remote envs. The host-side language server
can't see files inside a sandbox, so this prevents pretending to
lint a file the host process can't open.
agent/lsp/protocol.py: 8 KiB cap on the header block in
``read_message``. A pathological server that streams headers
without ever emitting CRLF-CRLF would have looped forever consuming
bytes; now raises ``LSPProtocolError`` instead.
agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and
``nearest_root`` upward walks, plus try/except containment around
``Path(...).resolve()`` and child ``.exists()`` calls. Defensive
against pathological inputs (symlink loops, encoding errors,
permission failures mid-walk) — the lint hook is hot-path code and
must never raise.
Tests:
- tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state
silence (clean writes stay DEBUG), state-transition INFO-once
semantics (active for, no project root), action-required
WARNING-once (server unavailable), per-call WARNING (timeouts,
spawn failures), and the "1000 clean writes => 1 INFO" contract.
- tests/agent/lsp/test_backend_gate.py: 5 tests verifying
_lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip
the LSP layer for non-local backends and route correctly for
LocalEnvironment.
- tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header
exercising the 8 KiB cap.
Validation:
- 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap)
- 198/198 pass when run alongside existing file_operations tests
- Live E2E re-run with pyright still surfaces "ERROR [2:12] Type
... reportReturnType (Pyright)" through the full path, then patch
fix removes it on the next call.
* feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field
Two improvements salvaged from #24414's plugin-form alternative,
keeping our core-integrated design:
1. atexit cleanup of spawned language servers
----------------------------------------------------------------
``agent/lsp/__init__.get_service`` now registers an ``atexit``
handler on first creation that tears down the LSPService on
Python exit. Without this, every ``hermes chat`` exit was
leaking pyright/gopls/etc. processes for a few seconds while
their stdout buffers drained -- they got reaped by the kernel
eventually but a watchful ``ps aux`` would catch them.
The handler runs once per process (gated by
``_atexit_registered``); idempotent ``shutdown_service``
ensures double-fire is a no-op. Errors during shutdown are
swallowed at debug level since by the time atexit fires the
user has already seen the agent's final response.
2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult
----------------------------------------------------------------
Previously the LSP layer folded its diagnostic block into the
``lint.output`` string, conflating the syntax-check tier with
the semantic tier. The agent (and any downstream parsers) now
read syntax errors and semantic errors as independent signals:
{
"bytes_written": 42,
"lint": {"status": "ok", "output": ""},
"lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..."
}
``_check_lint_delta`` returns to its original two-tier shape
(syntax check + delta filter); ``write_file`` and
``patch_replace`` independently fetch LSP diagnostics via
``_maybe_lsp_diagnostics`` and pass them into the new field.
``patch_replace`` propagates the inner write_file's
``lsp_diagnostics`` so the outer PatchResult carries the patch's
delta correctly.
Tests: 19 new
- tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration
fires once and only once across N get_service calls; the
registered callable is our internal shutdown wrapper;
shutdown_service is idempotent and safe when never started;
exceptions during shutdown are swallowed; inactive service is
cached so we don't rebuild on every check.
- tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult
/ PatchResult dataclass shape, to_dict include/omit semantics,
channel separation (lint and lsp_diagnostics carry independent
signals), write_file populates the field via
_maybe_lsp_diagnostics only when the syntax tier is clean,
patch_replace propagates the field forward from its internal
write_file.
Validation:
- 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field)
- 217/217 pass with file_operations + LSP combined
- Live E2E reverified: clean writes -> both fields empty/none; type
error introduced -> lint clean (parses), lsp_diagnostics carries
the pyright reportReturnType block; patch fix -> both fields
clean again.
* fix(lsp): broken-set short-circuit so a wedged server isn't paid every write
Discovered while auditing failure paths: a language server binary that
hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY
subsequent write to re-pay the 8s snapshot_baseline timeout. Five
writes = ~64s of dead time.
The bug: ``_get_or_spawn`` adds the (server_id, root) pair to
``_broken`` inside its inner exception handler, but when the OUTER
``_loop.run`` timeout fires, it cancels the inner task before that
handler runs. The pair never makes it to broken-set, so the next
write re-enters the spawn path and re-pays the timeout.
Fix:
- New ``_mark_broken_for_file`` helper at the service layer marks
the (server_id, workspace_root) pair broken from the OUTSIDE when
the outer timeout fires. Called from the except branches in
``snapshot_baseline``, ``get_diagnostics_sync`` (asyncio.TimeoutError
+ generic Exception). Also kills any orphan client process that
survived the cancelled future, fire-and-forget with a 1s ceiling.
- ``enabled_for`` now consults the broken-set BEFORE returning True.
Files in already-broken (server_id, root) pairs short-circuit to
False, so the file_operations layer skips the LSP path entirely
with no spawn cost. Until the service is restarted (``hermes lsp
restart``) or the process exits.
- A single eventlog WARNING is emitted on first mark-broken so the
user knows which server gave up. Subsequent edits in the same
project stay silent.
Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the
key shape (server_id, per_server_root), enabled_for short-circuit,
sibling-file skip in same project, project isolation (broken in
A doesn't affect B), graceful no-op for missing-server / no-workspace,
and an end-to-end test that snapshots after a failure and verifies
the next ``enabled_for`` returns False.
Validation:
- Live retest of the wedged-binary scenario: 5 sequential writes,
first 8.88s (the one snapshot timeout), subsequent four ~0.84s
(no LSP cost). Down from 5x12.85s = 64s before this fix.
- 99/99 LSP tests pass (92 prior + 7 broken-set)
- 224/224 pass with file_operations + LSP combined
- Happy path E2E reverified — clean write, type error introduced,
patch fix all behave correctly with the new broken-set logic.
Note: the FIRST write to a wedged binary still pays 8s (the
snapshot_baseline timeout). We could shorten that, but pyright/
tsserver normally take 2-3s and slow CI rust-analyzer can need
5+ seconds, so 8s is the conservative ceiling. Subsequent writes
are instant.
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
Free-tier users were seeing 'No free models currently available.' in the
`hermes model` and post-login pickers even though qwen/qwen3.6-plus is
free on the Portal right now. Three independent breakages compounded:
1. The docs-hosted catalog manifest at website/static/api/model-catalog.json
was not regenerated when _PROVIDER_MODELS['nous'] was updated, so users
fetching the manifest got a list that didn't include qwen/qwen3.6-plus.
2. _resolve_nous_pricing_credentials() returned ('', '') on any auth blip,
collapsing get_pricing_for_provider('nous') to {} and making every
curated model fall through the free-tier filter as 'paid'.
3. Even with healthy pricing, the picker only ever showed models from the
in-repo curated list intersected with live pricing — a Portal-flagged
free model not yet in the curated list could never appear.
Changes:
- hermes_cli/models.py: new union_with_portal_free_recommendations() that
augments the curated list with Portal freeRecommendedModels entries
(with synthetic free pricing so partition keeps them). The Portal's
/api/nous/recommended-models endpoint is now the source of truth for
free-tier surfacing — old Hermes builds will see new free models
without a CLI release.
- hermes_cli/models.py: _resolve_nous_pricing_credentials() falls back to
the public inference base URL when runtime cred resolution fails.
The /v1/models endpoint exposes pricing without auth, so silently
returning {} just because a refresh token expired was wrong.
- hermes_cli/auth.py + hermes_cli/main.py: both free-tier picker call
sites call union_with_portal_free_recommendations() before partition.
- tests/hermes_cli/test_models.py: 7 tests covering union behaviour
(prepend, dedup, end-to-end with stale pricing, empty/missing/error
payloads, invalid entries).
- tests/hermes_cli/test_model_catalog.py: drift guard
TestManifestMatchesInRepoLists fails CI when _PROVIDER_MODELS['nous']
or OPENROUTER_MODELS is edited without re-running
scripts/build_model_catalog.py. Verified empirically that removing a
manifest entry triggers an assertion with an actionable error message.
Validation:
- 133/133 targeted tests pass (test_models, test_model_catalog,
test_auth_nous_provider).
- Live E2E against the real Portal:
- Stale curated list ['claude-opus','claude-sonnet','gpt-5.4'] (no
qwen) → after union: ['qwen/qwen3.6-plus', ...] →
partition(free_tier=True): selectable=['qwen/qwen3.6-plus'].
- Simulated expired refresh token → anon fetch returns 403 pricing
entries including qwen/qwen3.6-plus -> {prompt:0, completion:0}.
- ruff: clean.
cua-driver was only installed once on toolset enable: `_run_post_setup` early-returns when the binary is already on PATH, so upstream fixes (e.g. v0.1.6 Safari window-focus fix) never reached existing users without manual reinstall.
Two refresh points now:
- `hermes update` re-runs the upstream installer at the end of the update if cua-driver is on PATH (macOS-only, no-op otherwise). Ties driver freshness to the user-controlled update cadence — no startup latency, no per-launch GitHub API call.
- `hermes computer-use install --upgrade` for manual force-refresh.
The upstream `install.sh` always pulls the latest release, so re-running is the canonical upgrade path. No version-comparison logic needed.
`hermes computer-use status` now shows the installed version, and points at `--upgrade` for refreshing.
The old mtime-tracking staleness machinery (_tui_build_needed,
_hermes_ink_bundle_stale, _find_bundled_tui) tried to avoid rebuilding
by comparing source timestamps to dist/entry.js. This was fragile and
added ~100 lines of code. Replace with three clear paths:
1. HERMES_TUI_DIR set (prebuilt/nix): just node dist/entry.js, no build
2. --dev mode: tsx src/entry.tsx, no build, hot reload
3. Normal: always npm run build (esbuild is ~1s, correctness > caching)
Also error when HERMES_TUI_DIR is set with --dev (footgun: prebuilt
bundle has no source code to hot-reload).
Replace with for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining.
133 files, +626/-626 (net zero).
Follow-up to PR #23824. Adds two correctness fixes on top of the
contributor's salvaged commit:
1. Stale-dist fallback no longer gated on `fatal=False`. `cmd_dashboard`
passes `fatal=True` and is the primary scenario this fallback is for
(issue #23817 — Windows Scheduled Task at logon). The previous gate
meant the fallback never fired in the case it was designed for.
2. `--skip-build` now verifies the dist actually exists before starting
the server. Without this, a misconfigured pre-build would launch the
dashboard pointing at a missing dist and silently serve 404s. We now
exit 1 with a clear "pre-build first: cd web && npm run build"
message, and on success print which dist directory is being used.
Verified end-to-end on Linux:
- build fails + stale dist (fatal=True) -> fallback fires
- build fails + no dist (fatal=True) -> exit 1 with stderr surfaced
- build fails + stale dist (fatal=False) -> fallback fires
- --skip-build + missing dist -> exit 1 with clear guidance
- --skip-build + valid dist -> 'Skipping web UI build...'
Three improvements for non-interactive contexts (Windows Scheduled
Tasks, CI/CD) where the web UI build may fail (issue #23817):
1. Retry build once after 3s — covers boot-time races (antivirus
scanning Node.js, npm cache not ready, transient disk I/O)
2. Fall back to existing dist when build fails (non-fatal mode) —
a stale UI is far better than no UI at all
3. Add --skip-build flag — lets callers pre-build in their wrapper
script and start the dashboard without internal build attempt
4. Surface npm stderr in build failure output for easier debugging
Fixes#23817
On Windows systems using a Chinese GBK locale, `hermes update` could misreport the Web UI build as failed even when `npm run build` actually succeeded. The failure was caused by Python decoding captured npm output with the process locale inside a background subprocess reader thread. When npm emitted bytes such as `0x85`, decoding under GBK raised `UnicodeDecodeError`, and Hermes then surfaced a misleading "Web UI build failed" warning.
This change makes the npm install/npm ci path and the Web UI build step decode captured output explicitly as UTF-8 with `errors="replace"`. That keeps unexpected bytes from crashing output collection, preserves successful builds, and prevents false negatives during update on Windows.
The patch also adds regression tests that verify these subprocess calls always use explicit UTF-8 decoding with replacement semantics.
Three issues hit during a fresh Windows install + first `hermes update`:
1. `pyproject.toml` re-introduced the invalid `exclude-newer = "7 days"`
under [tool.uv]. uv requires an RFC 3339 / ISO date — relative-duration
strings parse-fail. The line was removed in PR #21221 on May 7 and
accidentally added back in the v0.13.0 release commit (498bfc7bc1)
the same day. Every uv invocation throughout install logged a TOML
parse error, confusing users into thinking the install was broken.
Fix: remove the line (and the now-empty [tool.uv] section).
2. `hermes update` failed on Windows with
`Access is denied. (os error 5)` when uv tried to overwrite
`venv\\Scripts\\hermes.exe` — the running entry-point shim. Windows
blocks REPLACE on a mapped/loaded executable but allows RENAME (kernel
tracks the file by handle, not path; same trick Chrome/Firefox use for
self-update). Pre-rename live shims to `hermes.exe.old.<unix-ms>`
before each `uv pip install -e .`; uv writes a fresh shim at the
original path; the .old files are swept on the next hermes invocation.
Wraps every install attempt (primary, base-only fallback, and
per-extra retries). Restores shims if uv fails before writing
replacements.
3. Tools post-setup hooks (ddgs, piper-tts, kittentts, langfuse,
tinker-atropos) shelled out to `[sys.executable, '-m', 'pip', ...]`
and died with `No module named pip` on every fresh Windows install.
install.ps1 creates the venv via `uv venv` which doesn't seed pip;
install.ps1 bootstraps pip later, but only inside the platform-SDK
verify block — by then the wizard's post-setup hooks have already
run and failed.
New `_pip_install` helper tries uv pip first (works in pip-less
venvs), then python -m pip, then ensurepip-bootstrap-then-pip. All
five post-setup sites now route through it.
E2E:
- uv pip compile pyproject.toml — no parse warning
- quarantine + cleanup with simulated Windows scripts dir; rollback
works when uv install fails before writing replacement shim
- _pip_install in a real `uv venv`-created (pip-less) venv: bootstraps
pip via ensurepip and completes the install
Tests: tests/hermes_cli/ — 4135 pass, 8 pre-existing failures on main
unrelated to this PR (kanban_boards, openclaw_migration,
update_gateway_restart, web_server PluginAPIAuth).
* feat(curator): show rename map (where skills went) in user-visible summary
The full data has always been on disk in REPORT.md, but the user-visible
curator summary (gateway 💾 line, CLI session-start panel,
`hermes curator status`) was counts-only — "consolidated 4 into 2
umbrellas" with no names. Users only discovered renames when something
they expected was gone.
New `_build_rename_summary()` formats the rename map and appends it to
`final_summary`:
auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1
archived 3 skill(s):
• docx-extraction → document-tools
• pdf-extraction → document-tools
• old-stale-thing — pruned (stale)
full report: hermes curator status
Empty on no-op ticks (no archives), so most ticks add zero log noise.
Cap of 10 entries keeps agent.log readable when a 50-skill
consolidation lands; the full list is always in REPORT.md.
`hermes curator status` indents continuation lines so the multi-line
summary reads as one logical field.
5 new tests in tests/agent/test_curator_classification.py covering
empty / consolidation / pruning / cap / mixed cases.
* feat(curator): show recent run summary once on `hermes update`
The rename map is now visible from where users actually look — the
update flow they explicitly run, instead of just the live gateway log
or transient CLI session-start panel.
Behavior:
- After `hermes update`, if the most recent curator run produced a
rename map (multi-line summary) that the user hasn't seen yet, print
it once with a 'last run Xh ago' header and a one-time-message
footer.
- Stamp `last_run_summary_shown_at = last_run_at` after printing so
subsequent `hermes update` invocations are silent until a newer
curator run lands.
- Silent on no-op runs (single-line summary like 'auto: no changes;
llm: no change'). Still stamps shown so we don't reconsider on
every update.
- Silent when the curator has never run (the existing first-run
notice handles that case).
Output:
ℹ Skill curator — last run 4h ago
auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1
archived 3 skill(s):
• docx-extraction → document-tools
• pdf-extraction → document-tools
• old-stale-thing — pruned (stale)
full report: hermes curator status
(This message shows once per curator run. View anytime: hermes curator status)
State migration:
- `_default_state()` gains `last_run_summary_shown_at: None`. Existing
state files lack the field; `.get()` returns None; the comparison
treats any prior run as 'not yet shown' and prints once on next
update. Self-healing.
Wiring:
- Both `hermes update` paths in main.py call the new
`_print_curator_recent_run_notice()` right after the existing
first-run notice. Best-effort try/except so a state-load bug
never breaks the update flow.
6 tests in tests/hermes_cli/test_curator_recent_run_notice.py:
no-run / single-line / multi-line / show-once / new-run-resets /
time-formatter buckets.
The Termux update path (PR #22814) prebuilds psutil from a marker-patched
sdist so 'platform android is not supported' doesn't kill it. The same
psutil setup.py error blocks fresh installs via scripts/install.sh — only
the update path was wired up. Without this, a brand-new Termux user can't
get past the very first 'pip install -e .[termux-all]' call.
- New scripts/install_psutil_android.py — standalone version of the same
patcher hermes_cli/main.py uses, callable from bash.
- scripts/install.sh detects sys.platform == 'android' and runs the
patcher before pip install.
- TODO note added to both copies pointing at upstream
https://github.com/giampaolo/psutil/pull/2762; remove both when that
ships.
Note: we keep psutil as a base dep on Android (do not adopt the proposed
sys_platform != 'android' marker in pyproject). Removing it would crash
five unguarded 'import psutil' sites at runtime
(tools/code_execution_tool.py, tools/tts_tool.py, tools/process_registry.py
(2x), gateway/platforms/whatsapp.py).
Returning users who enabled '🖱️ Computer Use (macOS)' via 'hermes tools'
saw '✓ Saved configuration' but no install — cua-driver was never on
PATH and the toolset failed at first use. Two compounding causes:
1. _toolset_needs_configuration_prompt fell through to _toolset_has_keys,
which returned True for any provider with empty env_vars. cua-driver
has no env vars, so the gate skipped _configure_toolset entirely and
_run_post_setup('cua_driver') never ran.
2. No stable CLI entry-point existed for re-running the install when
the picker no-op'd it (e.g. when toggling the toolset off+on inside
one picker session, where 'added' is empty).
Changes:
- hermes_cli/tools_config.py: add _POST_SETUP_INSTALLED registry
mapping post_setup keys to installed-state predicates. The gate
now returns True when any visible provider has a registered
post_setup whose predicate fails. cua_driver is the only opt-in
for now; other post_setup hooks keep their existing behaviour.
- hermes_cli/main.py: add 'hermes computer-use install' and
'hermes computer-use status' as a stable docs target. install
reuses the same _run_post_setup('cua_driver') path that the
picker invokes; status reports whether cua-driver is on PATH.
- tools/computer_use/cua_backend.py: install hint now points users
at 'hermes computer-use install' first.
- website/docs/user-guide/features/computer-use.md: document the
new command as the primary install path.
- website/docs/reference/cli-commands.md: catalog 'hermes
computer-use' alongside 'hermes tools'.
- tests/hermes_cli/test_post_setup_gating.py: regression coverage
for the gate predicate (missing -> setup forced, installed ->
setup skipped, broken predicate -> non-blocking, unregistered
keys -> behaviour unchanged).
Fixes#22737. Reported by @f-trycua.
Problem:
After `hermes profile use NAME`, the gateway (started via systemd with
HERMES_HOME=/root/.hermes hardcoded) ignores the active profile and
always runs as the Default profile. WebUI, Telegram, and all non-CLI
platforms are affected.
Root cause:
_apply_profile_override() contained an early-return guard:
if profile_name is None and os.environ.get("HERMES_HOME"):
return # trust the inherited value
The intent was to let child processes inherit their parent's profile via
HERMES_HOME without redundantly re-reading active_profile. But
systemd also sets HERMES_HOME — to the hermes root (/root/.hermes),
not a profile directory — so the guard fired and silently skipped the
active_profile check. The user's `hermes profile use NAME` write to
~/.hermes/active_profile was never seen by the gateway process.
Fix:
Only skip the active_profile check when HERMES_HOME is already a
profile directory, identified by its immediate parent directory being
named "profiles" (e.g. ~/.hermes/profiles/coder or
/opt/data/profiles/coder). When HERMES_HOME points to a root
directory (parent name != "profiles"), continue to read active_profile.
Tests:
- test_hermes_home_at_root_with_active_profile_is_redirected: the
bug scenario — HERMES_HOME=/root/.hermes + active_profile=coder →
HERMES_HOME must be redirected to .../profiles/coder.
Stash-verified: FAILS without fix, PASSES with fix.
- test_hermes_home_already_profile_dir_is_trusted: child-process
inheritance contract unchanged — .../profiles/coder is trusted as-is.
- test_hermes_home_unset_reads_active_profile: classic path unchanged.
- test_hermes_home_unset_default_profile_no_redirect: "default" still
produces no redirect.
4/4 tests green.
Closes#22502.
After a clean SIGUSR1 drain, cmd_update passively polled for systemd's
auto-restart to fire. Our unit file sets RestartSec=60 (a crash-loop
guard), so the voluntary-restart path waited a full minute of dead air
before the gateway came back — the user saw 'draining (up to 75s)...'
and stared at it.
Change: after the drain exits with code 75, call 'reset-failed' +
'start' explicitly. Manual start bypasses RestartSec entirely
(RestartSec only governs systemd's own auto-restart logic). Takes
about as long as the gateway needs to come up (~1-3s on a warm box)
instead of ~60s.
The RestartSec=60 default stays — it's the right crash-loop guard for
actual crashes. This only short-circuits the voluntary-restart path.
Matches the pattern already used in 'hermes gateway restart'
(systemd_restart() in hermes_cli/gateway.py, PR #20949).
Tests:
- tests/hermes_cli/test_update_gateway_restart.py: new
test_update_bypasses_restartsec_after_graceful_drain asserts both
'reset-failed hermes-gateway' AND 'start hermes-gateway' (NOT
'restart') are issued after a successful graceful drain.
- All existing tests in the affected classes still pass
(TestCmdUpdateLaunchdRestart, TestCmdUpdateResetFailedBeforeRestart
are green; one pre-existing flake in the latter is unrelated).
`hermes --help` drops from ~700ms to ~180ms; `hermes version` from
~950ms to ~240ms. ~4-5x startup speedup on inspection / diagnostic
invocations.
Changes:
- hermes_cli/main.py: gate the argparse-setup `discover_plugins()` call
behind `_plugin_cli_discovery_needed()`. Eager plugin imports
(google.cloud.pubsub_v1, aiohttp, grpc, PIL) cost 500-650ms and are
pure waste when the user is running a built-in subcommand that
doesn't take plugin extensions (`--help`, `version`, `logs`,
`config`, `sessions`, etc.). New `_BUILTIN_SUBCOMMANDS` frozenset
+ `_first_positional_argv` helper handle flag-value skipping
(`-m gpt5 chat` → still fast).
- hermes_cli/main.py: `cmd_version` now reads the OpenAI SDK version
via `importlib.metadata` (~2ms) instead of `import openai` (~800ms
of pydantic type-module loading).
Agent-running paths (`hermes chat`, `hermes gateway run`) are
unaffected — the second `discover_plugins()` call later in `main()`
still runs so plugin hooks / tools wire up normally.
Tests:
- tests/hermes_cli/test_startup_plugin_gating.py: parity test guards
the `_BUILTIN_SUBCOMMANDS` set against drift (every registered
subparser must be declared; no phantom entries). Behavior tests for
flag-value skipping, `--` terminator, inline `--flag=value` form.
37 tests.
teknium1 hit ModuleNotFoundError: No module named 'hermes_bootstrap' after
a code update, on both his Windows machine AND his Linux workstation. The
failure mode is real and affects every user who updates hermes by any path
OTHER than a fully-successful ``hermes update``.
## What happens
hermes_bootstrap.py is a top-level module registered via pyproject.toml's
``py-modules`` list (added by Brooklyn's Windows UTF-8 stdio work). It
must be registered in the venv's editable-install .pth file before Python
can find it as a bare ``import hermes_bootstrap``.
``hermes update`` handles this correctly: (1) git reset --hard, (2) clear
__pycache__, (3) uv pip install -e . (re-registers the package including
the new py-modules list), (4) restart.
BUT if any step AFTER (1) fails — network blip during pip install, PEP 668
on a system Python, venv locked, uv not in PATH, a crash mid-update — the
user is left with new code that references hermes_bootstrap and a venv
that doesn't know about it. Every hermes invocation after that crashes
with ModuleNotFoundError, including ``hermes update`` itself. No recovery
path without manual `uv pip install -e .`.
Also affects users who ``git pull`` the repo directly without running
hermes update — relatively common for developers.
## Fix
Wrap ``import hermes_bootstrap`` in a try/except ModuleNotFoundError
across all 6 entry points (hermes_cli/main, run_agent, gateway/run,
acp_adapter/entry, cli, batch_runner). On Windows, missing bootstrap
means the UTF-8 stdio setup doesn't run — degraded behavior (Unicode
chars may fail to print) but NOT a crash. POSIX is unaffected either way
since the bootstrap is a no-op there.
Once hermes is running again, the user can ``hermes update`` to fully
recover.
## Test update
tests/test_hermes_bootstrap.py::test_entry_point_imports_bootstrap
scans for the first top-level import in each entry point and asserts it
is hermes_bootstrap. Extended the check to accept a Try block whose body
is a lone Import of hermes_bootstrap — that's the recovery-friendly form
we just introduced.
Verified behavior by ``mv hermes_bootstrap.py hermes_bootstrap.py.bak``
and confirming ``python -c "import hermes_cli.main"`` succeeds. 82/82
tests pass (hermes_bootstrap + windows-native + windows-compat).
## Why
Hermes supports Linux, macOS, and native Windows, but the codebase grew up
POSIX-first and has accumulated patterns that silently break (or worse,
silently kill!) on Windows:
- `os.kill(pid, 0)` as a liveness probe — on Windows this maps to
CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console
process group (bpo-14484, open since 2012).
- `os.killpg` — doesn't exist on Windows at all (AttributeError).
- `os.setsid` / `os.getuid` / `os.geteuid` — same.
- `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr
errors at runtime on Windows.
- `open(path)` / `open(path, "r")` without explicit encoding= — inherits
the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX),
causing mojibake round-tripping between hosts.
- `wmic` — removed from Windows 10 21H1+.
This commit does three things:
1. Makes `psutil` a core dependency and migrates critical callsites to it.
2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that
blocks new instances of any of the above patterns.
3. Fixes every existing instance in the codebase so the baseline is clean.
## What changed
### 1. psutil as a core dependency (pyproject.toml)
Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical
cross-platform answer for "is this PID alive" and "kill this process
tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on
Windows (NOT a signal call), and its `Process.children(recursive=True)`
+ `.kill()` combo replaces `os.killpg()` portably.
### 2. `gateway/status.py::_pid_exists`
Rewrote to call `psutil.pid_exists()` first, falling back to the
hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows
(and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing —
e.g. during the scaffold phase of a fresh install before pip finishes.
### 3. `os.killpg` migration to psutil (7 callsites, 5 files)
- `tools/code_execution_tool.py`
- `tools/process_registry.py`
- `tools/tts_tool.py`
- `tools/environments/local.py` (3 sites kept as-is, suppressed with
`# windows-footgun: ok` — the pgid semantics psutil can't replicate,
and the calls are already Windows-guarded at the outer branch)
- `gateway/platforms/whatsapp.py`
### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines)
Grep-based checker with 11 rules covering every Windows cross-platform
footgun we've hit so far:
1. `os.kill(pid, 0)` — the silent killer
2. `os.setsid` without guard
3. `os.killpg` (recommends psutil)
4. `os.getuid` / `os.geteuid` / `os.getgid`
5. `os.fork`
6. `signal.SIGKILL`
7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT`
8. `subprocess` shebang script invocation
9. `wmic` without `shutil.which` guard
10. Hardcoded `~/Desktop` (OneDrive trap)
11. `asyncio.add_signal_handler` without try/except
12. `open()` without `encoding=` on text mode
Features:
- Triple-quoted-docstring aware (won't flag prose inside docstrings)
- Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments)
- Guard-hint aware (skips lines with `hasattr(os, ...)`,
`shutil.which(...)`, `if platform.system() != 'Windows'`, etc.)
- Inline suppression with `# windows-footgun: ok — <reason>`
- `--list` to print all rules with fixes
- `--all` / `--diff <ref>` / staged-files (default) modes
- Scans 380 files in under 2 seconds
### 5. CI integration
A GitHub Actions workflow that runs the checker on every PR and push is
staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this
commit because the GH token on the push machine lacks `workflow` scope.
A maintainer with `workflow` permissions should add it as
`.github/workflows/windows-footguns.yml` in a follow-up. Content:
```yaml
name: Windows footgun check
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: {python-version: "3.11"}
- run: python scripts/check-windows-footguns.py --all
```
### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion
Expanded from 5 to 16 rules, each with message, example, and fix.
Recommends psutil as the preferred API for PID / process-tree operations.
### 7. Baseline cleanup (91 → 0 findings)
- 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or
`encoding='utf-8-sig'` (user-editable files that Notepad may BOM)
- 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin
tool subprocess management → annotated with
`# windows-footgun: ok — <reason>`
- 7 `os.killpg` sites → migrated to psutil (see §3 above)
## Verification
```
$ python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (380 file(s) scanned).
$ python -c "from gateway.status import _pid_exists; import os
> print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))"
self: True
bogus: False
```
Proof-of-repro that `os.kill(pid, 0)` was actually killing processes
before this fix — see commit `1cbe39914` and bpo-14484. This commit
removes the last hand-rolled ctypes path from the hot liveness-check
path and defers to the best-maintained cross-platform answer.
On Windows, Python's ``os.kill(pid, 0)`` is NOT a no-op. CPython's
implementation (``Modules/posixmodule.c::os_kill_impl``) treats sig=0
as ``CTRL_C_EVENT`` because the two integer values collide at the C
layer, and routes it through ``GenerateConsoleCtrlEvent(0, pid)`` —
which sends a Ctrl+C to the ENTIRE console process group containing
the target PID, not just the PID itself. Any caller that wanted to
check "is PID X alive" via the classic POSIX ``os.kill(pid, 0)``
idiom was silently killing that process (and often unrelated
processes in the same console group) on Windows. Long-standing
Python Windows quirk; see bpo-14484 (open since 2012).
This manifested in Hermes as: every ``hermes gateway status``
invocation would read the gateway's PID from the PID file, call
``os.kill(pid, 0)`` via ``gateway.status.get_running_pid()`` as a
"liveness check", and instantly terminate the gateway it was trying
to report on. No shutdown log, no traceback, no atexit hook fire,
no exit-diag entry — just silent termination of the detached pythonw
process. "Bot answered one message then stopped typing" was the
characteristic end-user symptom because `os.kill(pid, 0)` fires
mid-response-send and kills the gateway between logs.
Reproduction (verified in this branch before the fix):
$ hermes gateway start # gateway alive, PID 37520
$ hermes gateway status # reports "No gateway process detected"
$ tasklist /FI "PID eq 37520" # INFO: No tasks are running
# — gateway terminated silently
Root-cause fix is a new ``gateway.status._pid_exists(pid)`` helper:
- On Windows: Win32 ``OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION |
SYNCHRONIZE, False, pid)`` + ``WaitForSingleObject(handle, 0)``
via ctypes. Zero signal delivery, zero console-group side effects.
Pins ctypes return types to avoid DWORD-vs-signed-int parse bugs
on WAIT_TIMEOUT (0x102). Distinguishes ERROR_INVALID_PARAMETER
(PID gone) from ERROR_ACCESS_DENIED (alive but another user).
- On POSIX: the canonical ``os.kill(pid, 0)`` idiom that actually is
a no-op there.
Then patch every ``os.kill(pid, 0)`` liveness-check callsite to
route through ``_pid_exists`` instead. Total 14 callsites across
11 files; every single one was a latent silent-kill on Windows:
gateway/run.py:2810 — /restart watcher (inline subprocess)
gateway/run.py:15195 — --replace wait loop
gateway/status.py:572 — acquire_gateway_runtime_lock stale check
gateway/status.py:828 — get_running_pid (THE killer for status)
gateway/platforms/whatsapp.py:111
hermes_cli/gateway.py:228, 522, 1012 — gateway-related drain loops
hermes_cli/kanban_db.py:2826 — _pid_alive was claiming to
be cross-platform but used
os.kill(pid, 0) on Windows
hermes_cli/main.py:5792 — CLI process-kill polling
hermes_cli/profiles.py:782 — profile stop wait loop
plugins/google_meet/process_manager.py:74
tools/browser_tool.py:1215, 1255 — browser daemon ownership probes
tools/mcp_tool.py:1255, 3374 — MCP stdio orphan tracking
The watcher source in gateway/run.py:2810 is a multi-line string
that gets spawned as an inline ``python -c "..."`` subprocess, so
it can't import gateway.status. The fix for that callsite inlines
the same ctypes probe directly into the watcher source.
Tested on Windows 10 with the hermes gateway + Telegram bot:
- gateway start → alive
- 5 consecutive ``hermes gateway status`` invocations → gateway
alive after every one, same PID reported each time (37520, 21952)
- gateway.log shows uninterrupted operation; no spurious shutdown
entries; cron ticker and kanban dispatcher still running on
their 60-second cadence
- bot continues answering Telegram messages throughout
Ships alongside an exit-path diagnostic wrapper in
``hermes_cli/gateway.py::run_gateway()`` that captures every way
``asyncio.run(start_gateway(...))`` can return (success, SystemExit,
KeyboardInterrupt, BaseException, atexit) with full traceback to
``logs/gateway-exit-diag.log``. This was used to prove the gateway
was being hard-killed externally (no exit event fired) and should
be kept for future Windows debugging.
Refs: https://bugs.python.org/issue14484
See also: references/windows-subprocess-sigint-storm.md in
the hermes-agent skill.
Codebase-wide fix for Python-on-Windows UTF-8 footguns, complementing
the earlier execute_code sandbox fixes (which remain load-bearing for
when the sandbox explicitly scrubs child env).
Problem: Python on Windows has two long-standing text-encoding pitfalls:
1. sys.stdout/stderr are bound to the console code page (cp1252 on
US-locale installs) — print('café') crashes with UnicodeEncodeError.
2. Subprocess children don't know to use UTF-8 unless PYTHONUTF8 and/or
PYTHONIOENCODING are set in their env — so any Python we spawn
(linters, sandbox children, delegation workers) hits the same bug.
Solution: A tiny bootstrap module (hermes_bootstrap.py) imported as the
first statement of every Hermes entry point:
- hermes_cli/main.py (hermes / hermes-agent console_script)
- run_agent.py (hermes-agent direct)
- acp_adapter/entry.py (hermes-acp)
- gateway/run.py (messaging gateway)
- batch_runner.py (parallel batch mode)
- cli.py (legacy direct-launch CLI)
On Windows, the bootstrap:
- os.environ.setdefault('PYTHONUTF8', '1') (PEP 540 UTF-8 mode)
- os.environ.setdefault('PYTHONIOENCODING', 'utf-8')
- sys.stdout/stderr/stdin.reconfigure(encoding='utf-8', errors='replace')
Children inherit the env vars → they run in UTF-8 mode.
Current process's stdio is reconfigured → print('café') works now.
On POSIX (Linux/macOS), the bootstrap is a complete no-op. We don't
touch LANG, LC_*, or anything else — users who have intentionally
configured a non-UTF-8 locale aren't affected. POSIX systems are
already UTF-8 by default in 99% of modern setups, so there's nothing
to fix.
setdefault() (not overwrite) means users who explicitly set PYTHONUTF8=0
or PYTHONIOENCODING=cp1252 in their environment are respected.
What this does NOT fix: bare open(path, 'w') calls in the *parent*
process still default to locale encoding because PYTHONUTF8 is only
read at interpreter init. A ruff PLW1514 sweep (separate follow-up)
will add explicit encoding='utf-8' at those ~219 call sites for
belt-and-suspenders.
Tests (17): 16 passed, 1 skipped on Windows.
- Windows: env vars set, stdio reconfigured, child inherits UTF-8 mode
- POSIX: complete no-op (verified on fake POSIX + skipped on real
POSIX since we don't have a Linux box in this session)
- Idempotence: multiple calls safe
- Graceful degradation: non-reconfigurable streams don't crash
- User opt-out: explicit PYTHONUTF8=0 is respected
- Load order: every entry point's FIRST top-level import is
hermes_bootstrap, enforced by an AST-level parametrized test
pyproject.toml: added hermes_bootstrap to py-modules so it ships with
pip installs.
Native Windows (with Git for Windows installed) can now run the Hermes CLI
and gateway end-to-end without crashing. install.ps1 already existed and
the Git Bash terminal backend was already wired up — this PR fills the
remaining gaps discovered by auditing every Windows-unsafe primitive
(`signal.SIGKILL`, `os.kill(pid, 0)` probes, bare `fcntl`/`termios`
imports) and by comparing hermes against how Claude Code, OpenCode, Codex,
and Cline handle native Windows.
## What changed
### UTF-8 stdio (new module)
- `hermes_cli/stdio.py` — single `configure_windows_stdio()` entry point.
Flips the console code page to CP_UTF8 (65001), reconfigures
`sys.stdout`/`stderr`/`stdin` to UTF-8, sets `PYTHONIOENCODING` + `PYTHONUTF8`
for subprocesses. No-op on non-Windows. Opt out via `HERMES_DISABLE_WINDOWS_UTF8=1`.
- Called early in `cli.py::main`, `hermes_cli/main.py::main`, and
`gateway/run.py::main` so Unicode banners (box-drawing, geometric
symbols, non-Latin chat text) don't `UnicodeEncodeError` on cp1252
consoles.
### Crash sites fixed
- `hermes_cli/main.py:7970` (hermes update → stuck gateway sweep): raw
`os.kill(pid, _signal.SIGKILL)` → `gateway.status.terminate_pid(pid, force=True)`
which routes through `taskkill /T /F` on Windows.
- `hermes_cli/profiles.py::_stop_gateway_process`: same fix — also
converted SIGTERM path to `terminate_pid()` and widened OSError catch
on the intermediate `os.kill(pid, 0)` probe.
- `hermes_cli/kanban_db.py:2914, 3041`: raw `signal.SIGKILL` →
`getattr(signal, "SIGKILL", signal.SIGTERM)` fallback (matches the
pattern already used in `gateway/status.py`).
### OSError widening on `os.kill(pid, 0)` probes
Windows raises `OSError` (WinError 87) for a gone PID instead of
`ProcessLookupError`. Widened the catch at:
- `gateway/run.py:15101` (`--replace` wait-for-exit loop — without this,
the loop busy-spins the full 10s every Windows gateway start)
- `hermes_cli/gateway.py:228, 460, 940`
- `hermes_cli/profiles.py:777`
- `tools/process_registry.py::_is_host_pid_alive`
- `tools/browser_tool.py:1170, 1206`
### Dashboard PTY graceful degradation
`hermes_cli/pty_bridge.py` depends on `fcntl`/`termios`/`ptyprocess`,
none of which exist on native Windows. Previously a Windows dashboard
would crash on `import hermes_cli.web_server` because of a top-level
import. Now:
- `hermes_cli/web_server.py` wraps the pty_bridge import in
`try/except ImportError` and sets `_PTY_BRIDGE_AVAILABLE=False`.
- The `/api/pty` WebSocket handler returns a friendly "use WSL2 for
this tab" message instead of exploding.
- Every other dashboard feature (sessions, jobs, metrics, config
editor) runs natively on Windows.
### Dependency
- `pyproject.toml`: add `tzdata>=2023.3; sys_platform == 'win32'` so
Python's `zoneinfo` works on Windows (which has no IANA tzdata
shipped with the OS). Credits @sprmn24 (PR #13182).
### Docs
- README.md: removed "Native Windows is not supported"; added
PowerShell one-liner and Git-for-Windows prerequisite note.
- `website/docs/getting-started/installation.md`: new Windows section
with capability matrix (everything native except the dashboard
`/chat` PTY tab, which is WSL2-only).
- `website/docs/user-guide/windows-wsl-quickstart.md`: reframed as
"WSL2 as an alternative to native" rather than "the only way".
- `website/docs/developer-guide/contributing.md`: updated
cross-platform guidance with the `signal.SIGKILL` / `OSError`
rules we enforce now.
- `website/docs/user-guide/features/web-dashboard.md`: acknowledged
native Windows works for everything except the embedded PTY pane.
## Why this shape
Pulled from a survey of how other agent codebases handle native
Windows (Claude Code, OpenCode, Codex, Cline):
- All four treat Git Bash as the canonical shell on Windows, same as
hermes already does in `tools/environments/local.py::_find_bash()`.
- None of them force `SetConsoleOutputCP` — but they don't have to,
Node/Rust write UTF-16 to the Win32 console API. Python does not get
that for free, so we flip CP_UTF8 via ctypes.
- None of them ship PowerShell-as-primary-shell (Claude Code exposes
PS as a secondary tool; scope creep for this PR).
- All of them use `taskkill /T /F` for force-kill on Windows, which
is exactly what `gateway.status.terminate_pid(force=True)` does.
## Non-goals (deliberate scope limits)
- No PowerShell-as-a-second-shell tool — worth designing separately.
- No terminal routing rewrite (#12317, #15461, #19800 cluster) — that's
the hardest design call and needs a separate doc.
- No wholesale `open()` → `open(..., encoding="utf-8")` sweep (Tianworld
cluster) — will do as follow-up if users hit actual breakage; most
modern code already specifies it.
## Validation
- 28 new tests in `tests/tools/test_windows_native_support.py` — all
platform-mocked, pass on Linux CI. Cover:
- `configure_windows_stdio` idempotency, opt-out, env-preservation
- `terminate_pid` taskkill routing, failure → OSError, FileNotFoundError fallback
- `getattr(signal, "SIGKILL", …)` fallback shape
- `_is_host_pid_alive` OSError widening (Windows-gone-PID behavior)
- Source-level checks that all entry points call `configure_windows_stdio`
- pty_bridge import-guard present in `web_server.py`
- README no longer says "not supported"
- 12 pre-existing tests in `tests/tools/test_windows_compat.py` still pass.
- `tests/hermes_cli/` ran fully (3909 passed, 9 failures — all confirmed
pre-existing on main by stash-test).
- `tests/gateway/` ran fully (5021 passed, 1 pre-existing failure).
- `tests/tools/test_process_registry.py` + `test_browser_*` pass.
- Manual smoke: `import hermes_cli.stdio; import gateway.run;
import hermes_cli.web_server` — all clean, `_PTY_BRIDGE_AVAILABLE=True`
on Linux (as expected).
## Files
- New: `hermes_cli/stdio.py`, `tests/tools/test_windows_native_support.py`
- Modified: `cli.py`, `gateway/run.py`, `hermes_cli/main.py`,
`hermes_cli/profiles.py`, `hermes_cli/gateway.py`,
`hermes_cli/kanban_db.py`, `hermes_cli/pty_bridge.py`,
`hermes_cli/web_server.py`, `tools/browser_tool.py`,
`tools/process_registry.py`, `pyproject.toml`, `README.md`, and 4
docs pages.
Credits to everyone whose prior PR work informed these fixes — see
the co-author trailers. All of the PRs listed in
`~/.hermes/plans/windows-support-prs.md` fixing `os.kill` / `signal.SIGKILL`
/ UTF-8 stdio / tzdata / README patterns found the same issues; this PR
consolidates them.
Co-authored-by: Philip D'Souza <9472774+PhilipAD@users.noreply.github.com>
Co-authored-by: Arecanon <42595053+ArecaNon@users.noreply.github.com>
Co-authored-by: XiaoXiao0221 <263113677+XiaoXiao0221@users.noreply.github.com>
Co-authored-by: Lars Hagen <1360677+lars-hagen@users.noreply.github.com>
Co-authored-by: Luan Dias <65574834+luandiasrj@users.noreply.github.com>
Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com>
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: adybag14-cyber <252811164+adybag14-cyber@users.noreply.github.com>
Co-authored-by: Prasanna28Devadiga <54196612+Prasanna28Devadiga@users.noreply.github.com>
Third slice of the Microsoft Teams meeting pipeline stack, salvaged
onto current main. Adds the standalone teams_pipeline plugin that
consumes Graph change notifications from the webhook listener,
resolves meeting artifacts (transcript first, recording + STT fallback
later), persists job state in a durable store, and exposes an operator
CLI for inspection, replay, subscription management, and validation.
Design choices follow maintainer review feedback on PR #19815:
- Standalone plugin rather than bolted-on core surface
(plugins/teams_pipeline/, kind: standalone in plugin.yaml).
- Zero new model tools. The agent drives the pipeline by invoking
the operator CLI via the terminal tool, guided by the skill that
ships with a follow-up PR.
- Reuses the existing msgraph_webhook gateway platform for Graph
ingress. Pipeline runtime is wired in via bind_gateway_runtime and
gated on plugins.enabled so gateways that don't run the plugin
boot cleanly.
Additions:
- plugins/teams_pipeline/: runtime (gateway wiring + config builder),
pipeline core, durable SQLite store, subscription maintenance
helpers, Graph artifact resolution, operator CLI (list, show,
run/replay, fetch dry-run, subscriptions list, subscribe,
renew-subscription, delete-subscription, maintain-subscriptions,
token-health, validate).
- hermes_cli/main.py: second-pass plugin CLI discovery so any
standalone plugin registered via ctx.register_cli_command()
outside the memory-plugin convention path gets its subcommand
wired into argparse without touching core.
- gateway/run.py: _teams_pipeline_plugin_enabled() config gate,
_wire_teams_pipeline_runtime() binding after adapter setup, and
the two runner attributes used by the runtime.
Credit to @dlkakbs for the entire plugin implementation.
* feat(profile): shareable profile distributions (pack/install/update/info)
Closes#20456.
Turns a profile into a portable, versioned artifact. Packs SOUL.md, config,
skills, cron, and an env-var manifest into a tar.gz that others can install
from a local path, URL, or git repo. Updates re-pull the distribution while
preserving user data (memories, sessions, auth.json, .env) and the user's
config.yaml overrides.
New subcommands (under hermes profile, no parallel tree):
hermes profile pack <name> [-o FILE]
hermes profile install <source> [--name N] [--alias] [--force] [-y]
hermes profile update <name> [--force-config] [-y]
hermes profile info <name>
Manifest (distribution.yaml at the profile root): name, version,
hermes_requires, author, env_requires, distribution_owned.
Security:
- Installer shows manifest + env-var requirements before mutating disk;
confirmation required unless -y.
- auth.json and .env are never packed (same exclude set as profile export).
- Cron jobs are packed but NOT auto-scheduled — user is pointed at
'hermes -p <name> cron list' to review.
- Archive extraction rejects path traversal (../ members).
- Alias creation is opt-in via --alias.
Update semantics:
- Distribution-owned paths (SOUL.md, skills/, cron/, mcp.json, manifest):
replaced from the new archive.
- config.yaml: preserved by default; --force-config to overwrite.
- User-owned paths (memories/, sessions/, auth.json, .env, state.db*,
logs/, workspace/, plans/, home/, *_cache/, local/): never touched.
Version pin:
hermes_requires accepts >=, <=, ==, !=, >, < or a bare version (treated
as >=). Install fails with a clear error when the running Hermes version
doesn't satisfy the spec.
Sources supported by 'install':
- Local .tar.gz / .tgz archive
- Local directory
- HTTP(S) URL pointing to a .tar.gz (uses httpx, already a dep)
- Git URL (github.com/user/repo, https://..., git@..., ssh://, git://)
Tests: 43 new unit tests (manifest parsing, version checks, env template,
pack/install/update round-trip, config-preservation, security).
E2E validated via real CLI invocations against an isolated HERMES_HOME
covering pack, install with confirmation, update preservation, update
--force-config, decline-preview, duplicate-install rejection, and
version-requirement rejection.
* refactor(profile-dist): git-only — drop tar.gz/HTTP transports and pack
Scope-cut on top of the original distribution PR: a profile distribution
is now exclusively a git repository (or a local directory during
development). The tar.gz / HTTP archive transports and the matching
`hermes profile pack` subcommand have been removed.
Why:
* GitHub tags, branches, and commits are already the right versioning
primitive. Tag pushes do for us what 'pack + upload' did.
* `hermes profile export` / `import` already cover local backup and
restore; they are not a distribution format and stay untouched.
* One transport means one install/update code path, one doc page,
and one mental model. The extra source types doubled the surface
for no real user win — GitHub auto-attaches release tarballs, and
`git bundle` / `git clone --mirror` cover the airgap case.
Changes:
* hermes_cli/profile_distribution.py — removed pack_profile,
_fetch_tar_archive (_http_fetch), _safe_extract, _archive_roots,
_safe_parts, _find_dist_root, tarfile/io/urlparse imports. The
new _stage_source has two arms: git URL → clone, local directory
→ use in place.
* hermes_cli/main.py — removed the 'pack' subparser and action
handler. Install help text updated to match the reduced source list.
* tests/hermes_cli/test_profile_distribution.py — rewritten around a
local-directory staging fixture. The install/update/describe suites
now build a distribution tree on disk directly and install from it,
which is what a real git clone produces after .git is stripped.
Dropped TestPack, TestFindDistRoot, and the tar-specific security
test. New tests cover _looks_like_git_url, env_example emission,
hermes_requires enforcement, and 'installer does not import
credentials if an author mistakenly leaks them in the staging tree'.
* website/docs/reference/profile-commands.md — 'Distribution commands'
section rewritten around git. Added a 'Publishing a distribution'
section. export/import stay documented as local backup/restore.
* website/docs/reference/cli-commands.md — dropped 'pack' from the
profile subcommand table.
* website/package.json — 'lint:diagrams' now passes
--exclude-code-blocks to ascii-guard. Without it, markdown tables
and box-drawing diagrams inside fenced code blocks were being
misidentified as malformed ASCII boxes, blocking the PR's
docs-site-checks CI with 8 false-positive errors.
Validation:
* Targeted suite: tests/hermes_cli/test_profile_distribution.py —
56/56 pass (down from 43 — reorganized to cover the new
local-dir paths).
* Regression: test_profiles.py + test_profile_export_credentials.py
102/102 still pass. export/import behaviour unchanged.
* Docs lint: ascii-guard lint --exclude-code-blocks docs returns
0 errors (was 8 on the PR before the flag bump).
* E2E: ran the real `hermes profile install`/`info` against a
local staging dir under an isolated HERMES_HOME — install writes
SOUL.md + skills to the target profile, info reads the manifest
back, a bogus source produces a clear error, and `hermes profile
pack` is now rejected by argparse as expected.
* feat(profile-dist): distribution-aware list/show/delete + installed_at + env preview
Polish pass on top of the git-only scope cut. Five additions, all small,
wiring into existing commands rather than adding new surface.
1. `installed_at` timestamp on the manifest
* Stamped automatically inside plan_install() on both fresh install
and update — ISO-8601 UTC, seconds resolution.
* Surfaced in `hermes profile info` as `Installed: <ts>`.
* Lets users tell "installed 6 months ago, needs update" from
"installed yesterday" without guessing from file mtimes.
2. `hermes profile list` grows a `Distribution` column
* Plain profiles: "—"
* Distribution profiles: "<name>@<version>" (e.g. `telemetry@1.2.3`)
* ProfileInfo gains three optional fields — distribution_name,
distribution_version, distribution_source — populated by a new
_read_distribution_meta() helper that swallows manifest read errors
so a broken distribution.yaml in one profile can't break `list`
for the others.
3. `hermes profile show` and `hermes profile delete` surface
distribution provenance
* show: `Distribution: name@version` + `Installed from: <source>`
plus a pointer to `hermes profile info <name>` for the full
manifest.
* delete: same lines in the pre-confirmation preview, so a user
deleting "telemetry" can see it came from
`github.com/kyle/telemetry-distribution` before they type
`telemetry` to confirm. No change to the confirmation gate itself —
deletion semantics are identical to plain profiles.
4. Install preview checks env vars against the current environment
* Replaces the "Env vars you'll need to set:" header with a simpler
"Env vars:" block.
* Each required var is labeled:
- `✓ set` — already in `os.environ` OR present as a key in the
target profile's existing .env (update case).
- `needs setting` — required but not found in either place.
- `—` — optional.
* Mirrors pip's "Requirement already satisfied" UX: no unnecessary
nagging about keys the user already has configured.
5. Docs: private distributions
* New "Private distributions" section in
website/docs/reference/profile-commands.md explaining that we
shell out to the user's `git` binary, so SSH keys / credential
helpers / GitHub CLI stored creds all work transparently. One
paragraph, two examples.
* `hermes profile info` section updated to mention `Installed:`.
Module-level hoist:
* `from datetime import datetime, timezone` was previously lazy-imported
inside plan_install(). Hoisted to module scope so tests can monkeypatch
`hermes_cli.profile_distribution.datetime` to freeze time.
Tests (+7):
* TestInstalledAtStamp.test_install_stamps_installed_at — format check
(4-digit year, 'T', +00:00 suffix).
* TestInstalledAtStamp.test_update_refreshes_installed_at — freezes
datetime.now() to 2099-01-01 and confirms update writes a new stamp.
* TestProfileInfoDistribution.test_installed_distribution_shows_in_list
— ProfileInfo.distribution_{name,version,source} populated after install.
* TestProfileInfoDistribution.test_plain_profile_has_no_distribution_fields
— plain profiles have None.
* TestProfileInfoDistribution.test_malformed_manifest_does_not_break_list
— broken distribution.yaml in one profile doesn't break list_profiles().
Validation:
* 163/163 tests pass (56 distribution + 102 profile regression +
5 new from this commit — up from 158).
* docs-lint: 0 errors.
* E2E verified: install preview shows ✓/needs-setting per env var,
`profile list` shows Distribution column, `profile show` + `delete`
preview mentions source URL, `info` shows Installed: timestamp.
* fix(profile-dist): clean errors + warn when overwriting plain profiles
Two small polish fixes found during collision sweeps of the PR:
1. ValueError from validate_profile_name now caught cleanly
* A distribution.yaml whose 'name' field can't be used as a profile
identifier (spaces, path traversal, etc.) raises ValueError from
hermes_cli.profiles.validate_profile_name, which was escaping as a
raw Python traceback from 'hermes profile install/update/info'.
* Broadened the except clause in all three handlers to catch
(DistributionError, ValueError) — users now see:
Error: Invalid profile name '../../etc/passwd'. Must match
[a-z0-9][a-z0-9_-]{0,63}
instead of a stack trace.
2. Install preview distinguishes plain profile overwrite from
distribution re-install
* When plan.target_dir exists and IS a distribution (has
distribution.yaml), preview still shows the mild
(profile exists — will overwrite distribution-owned files only)
* When plan.target_dir exists but is a HAND-BUILT plain profile (no
distribution.yaml), preview now shows a loud warning:
⚠ Profile exists but is NOT a distribution. Installing here will
overwrite its SOUL.md, skills/, cron/, and mcp.json.
Your memories, sessions, auth.json, and .env will be preserved,
but any hand-edits to distribution-owned files will be lost.
* Users who type 'hermes profile install foo --force' against a
profile they hand-built now see what they're signing up for. User
data is still safe (memories, sessions, auth, .env are in
USER_OWNED_EXCLUDE), but custom SOUL/skills get stomped.
Tests (+2):
* TestErrorSurfaces.test_bad_profile_name_raises_valueerror_not_traceback
* TestErrorSurfaces.test_path_traversal_name_rejected
Validation:
* 165/165 tests pass (was 163).
* E2E: bad manifest names produce 'Error: Invalid profile name ...'
with no traceback; installing over a plain profile shows the warning;
re-installing over an existing distribution shows the normal
overwrite message.
* Bad HTTPS URLs still produce 'Error: git clone failed: ...' — git
itself generates a clean enough message that no wrapper is needed.
* 'install .' works correctly from any cwd.
* fix(profiles): reject reserved names at validate time
Before: `hermes profile create hermes` / `profile install` / `profile rename`
all silently accepted reserved names like `hermes`, `test`, `tmp`, `root`,
`sudo`. The profile directory was created; only alias creation failed (via
check_alias_collision), leaving a confusingly-named profile on disk — e.g.
`~/.hermes/profiles/hermes/` sitting next to `~/.hermes/` itself.
The reserved set already exists (_RESERVED_NAMES, introduced alongside alias
collision detection). This commit moves the check up one layer to
validate_profile_name so every entry point — create, install, import,
rename, dashboard web API — shares the same gate.
The error message points the user at the cause without being cryptic:
Error: Profile name 'hermes' is reserved — it collides with either the
Hermes installation itself or a common system binary. Pick a different
name.
`default` continues to pass through (it's a special alias for ~/.hermes).
_HERMES_SUBCOMMANDS (`chat`, `model`, `gateway`, etc.) stays at
alias-collision time only — those are fine as bare profile names with
`--no-alias`.
Tests (+5): test_reserved_names_rejected parametrized over the full
_RESERVED_NAMES set, matching the existing pattern in TestValidateProfileName.
No existing test uses a reserved name as a profile identifier (greppped
create_profile("hermes|test|tmp|root|sudo") — zero hits).
Validation:
* 170/170 tests pass in the profile suites.
* E2E: `profile create hermes`, `profile install` with manifest
name=hermes, and `profile install ... --name hermes` all produce the
same clean `Error: Profile name 'hermes' is reserved ...` with rc=1
and no traceback. Normal names (`mybot`) still work.
cmd_update's auto-restart path could leave the gateway dead after a
transient failure in systemd's own auto-restart window. Reproduced
on Ubuntu 25.10 + systemd 257: after update, gateway drains and exits 75,
systemd's first respawn 60s later fails (status=200/CHDIR with
"No such file or directory" on a WorkingDirectory that demonstrably
exists), the unit ends up in RestartMaxDelaySec=300 backoff, and
cmd_update's fallback 'systemctl restart' never recovers it — leaving
users with a permanently silent gateway until they manually run
'systemctl reset-failed'.
The fix mirrors the recovery pattern 'hermes gateway restart'
(systemd_restart) got in PR #20949: always reset-failed before
restart, on both the initial fallback and the retry. Also rewrites
the final failure message to tell the user to reset-failed +
restart (not just restart, which is the step that already failed
twice).
Add a new `hermes gateway list` subcommand that shows the running
status of gateways across all profiles in a single view:
Gateways:
✓ default (current) — PID 155469
✓ wx1 — PID 166893
✗ dev — not running
Also includes `_print_other_profiles_gateway_status()` which appends
an "Other profiles" section to `hermes gateway status` output when
other profile gateways are running.
Both use existing `list_profiles()` and `find_profile_gateway_processes()`
— no new dependencies.
Closes#19127
Related: #19113, #4402, #4587
The --command flag of `hermes mcp add` shared its argparse dest with the
top-level subparser (`dest="command"` in `hermes_cli/_parser.py`). When
the flag was omitted, argparse still wrote `args.command = None`,
clobbering the top-level value of `"mcp"`. The dispatcher then saw
`args.command is None` and fell through to interactive chat, so
`hermes mcp add ...` silently launched chat instead of registering the
server. `cmd_mcp_add` was never reached.
Use `dest="mcp_command"` on the flag and read it from `cmd_mcp_add`.
The user-facing CLI flag `--command` is unchanged; only the in-memory
namespace attribute moves. Also updates the `_make_args` helper in
`tests/hermes_cli/test_mcp_config.py` to populate the new dest, and
adds `tests/hermes_cli/test_mcp_add_command_dest.py` with a parser-
level regression test.
Closes#19785.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds `hermes profile create <name> --no-skills` to create a profile with
zero bundled skills. Writes a `.no-bundled-skills` marker file in the
profile root so `hermes update`'s all-profile skill sync loop also skips
the profile — without the marker, every update would re-seed skills and
the user would have to delete them again.
Use case (from @hiut1u): orchestrator profiles and narrow-task profiles
don't need 100+ bundled skills polluting their system prompt.
- create_profile() gains a `no_skills` param, mutually exclusive with
`--clone` / `--clone-all` (cloning explicitly copies skills).
- seed_profile_skills() no-ops on opted-out profiles and returns
`{skipped_opt_out: True}` so callers can report cleanly.
- Web API (POST /api/profiles) accepts `no_skills: bool`.
- Delete `.no-bundled-skills` to opt back in — next `hermes update`
re-seeds normally.
6 new tests in TestNoSkillsOptOut cover marker write, mutual exclusion
with clone, seed_profile_skills opt-out, fresh profile unaffected, and
delete-marker-re-enables-seeding.