The card was macOS-only. cua-driver also runs on Windows and Linux, so
fold `cua-driver doctor` (cross-platform binary/health probes) into a
single OS-aware `ready` signal:
- macOS: ready == both TCC grants; keeps the permission rows + grant flow.
- Windows/Linux: no TCC toggles, so ready == driver health, with a
per-OS note (SmartScreen/UIAccess on Windows; X11/XWayland on Linux).
`computer_use_status()` replaces the macOS-only `permissions_status()` and
surfaces `platform`, `ready`, `can_grant`, and the doctor `checks` (non-ok
ones render as warnings). CLI `permissions status`, the REST endpoint, and
the desktop card all key off the one payload. Grant stays macOS-only (400
elsewhere — nothing to grant).
Computer Use already worked through the desktop backend (the cua-driver
toolset enables + installs via Settings -> Skills & Tools), but there was
no in-app way to see or grant the two macOS permissions it needs, so "give
a model my Mac" was tribal knowledge.
The grants attach to cua-driver's OWN TCC identity (com.trycua.driver /
the installed CuaDriver.app), not Hermes -- so no app entitlement is
involved. cua-driver 0.5+ exposes `permissions status/grant`, which we wrap:
- tools/computer_use/permissions.py: thin client over the two subcommands
- hermes computer-use permissions {status,grant}: CLI parity
- GET /api/tools/computer-use/status, POST .../permissions/grant: desktop REST
- ComputerUsePanel: live Accessibility + Screen Recording state with a
Grant button (dialog attributed to CuaDriver), shown in the expanded
Computer Use toolset row. Binary install stays in the existing provider
post-setup runner.
Follow-ups: i18n the card copy; a "Stop driver" control (cua-driver stop)
for the runaway-`serve` case.
Salvages #50469 by @libre-7.
_dashboard_local_update_managed_externally() previously blocked every containerized dashboard from the local update API, even when the running install was a bind-mounted git checkout that can be updated with hermes update.
Allow the dashboard updater only for git installs inside containers, while keeping hosted /opt/data, docker, and pip installs managed externally. Pip remains blocked because its apply path mutates the running container filesystem and is not the self-managed checkout case.
Adds regression coverage for docker, git, and pip install-method handling inside containers, and maps the contributor email for release attribution.
Co-authored-by: libre-7 <libre-7@users.noreply.github.com>
/simplify-code (LOW, flagged by two reviewers): the source tags 'user' /
'project' / 'bundled' were bare string literals scattered across the discovery
scrub and the two mount-time refuse guards. A typo in any one site (e.g.
'users') would SILENTLY disable a security gate with no error — the exact
failure mode this RCE boundary must not have.
Introduce a shared module-level _NON_BUNDLED_PLUGIN_SOURCES frozenset referenced
by both the discovery scrub and the (now single) mount guard, so the
auto-import policy lives in one place. The two mount guards collapse into one
gate that still emits the distinct per-source operator message via a map (no
loss of guidance). Behavior unchanged: 39 RCE-bypass tests pass, and the
constant is mutation-checked (typo'ing it fails the bypass tests).
Defence-in-depth (discovery scrub + mount refuse) is retained intentionally.
* feat(computer_use): disable cua-driver telemetry by default, add opt-in
cua-driver ships anonymous PostHog usage telemetry ENABLED by default
upstream (fires cua_driver_install / cua_driver_doctor events to
eu.i.posthog.com). Hermes now disables it for our users unless they
explicitly opt in.
- New config key `computer_use.cua_telemetry` (default false) in
DEFAULT_CONFIG.
- `cua_backend.cua_driver_child_env()` injects
`CUA_DRIVER_RS_TELEMETRY_ENABLED=0` into the child env when telemetry is
disabled (the default); leaves the var untouched on opt-in so the driver
uses its own default. Reads config fail-safe — any error defaults to
telemetry off.
- Routed every cua-driver spawn site through the policy: MCP backend
(StdioServerParameters env), `cua_driver_update_check`, doctor's
health_report Popen, the install.sh/install.ps1 runner, and the
`--version` / status probes.
- Docs: new Telemetry subsection in computer-use.md (EN).
- Tests: tests/computer_use/test_cua_telemetry.py — default disables,
explicit-false disables, opt-in leaves var untouched, config-failure
fails safe, inherited-enabled is overridden off.
Verified live on Linux against the real cua-driver-rs 0.6.0 binary: with
the var=0 the driver reports "telemetry: disabled via
CUA_DRIVER_RS_TELEMETRY_ENABLED" and sends no event; with it unset it logs
"sending event: cua_driver_doctor". 213 computer_use + install tests green.
* fix(dashboard): fold computer_use config category into agent tab
The new computer_use.cua_telemetry key created a single-field dashboard
config category, tripping test_no_single_field_categories (web_server's
invariant that categories with <2 fields must be merged to avoid tab
sprawl). Add computer_use -> agent to _CATEGORY_MERGE, matching the
existing onboarding/telegram single-field folds.
Defense-in-depth for the dashboard plugin auto-import path. The web server
auto-imports and mounts the Python backend (dashboard/manifest.json -> api file)
of plugins found in ~/.hermes/plugins/ (user) and ./.hermes/plugins/ (project),
not just bundled plugins. So any plugin that reaches one of those dirs gets
arbitrary Python executed on the next dashboard start.
NOTE ON THREAT MODEL: #43719's originally-documented delivery chain (a public
--insecure dashboard + open API used to git clone a malicious repo into
~/.hermes/plugins/) is ALREADY mitigated on main — since the June 2026
hermes-0day hardening, a non-loopback bind ALWAYS requires an auth provider and
--insecure no longer bypasses the auth gate. This change is therefore NOT
closing that (now-authenticated) network path; it removes the residual
'arbitrary code executes merely because a plugin is on disk' hazard, which still
applies when a plugin arrives by other means: a socially-engineered git clone,
a supply-chain drop, an authenticated-but-malicious actor, or a future
regression in the auth gate. Untrusted on-disk code should not auto-execute.
Restrict dashboard backend Python auto-import to BUNDLED plugins only. User and
project plugins may still extend the dashboard UI via static JS/CSS, but their
api Python file is never auto-imported. Two layers: _discover_dashboard_plugins
scrubs api/_api_file for user/project sources (and bundled wins name conflicts
so a non-bundled plugin cannot shadow a trusted backend route);
_mount_plugin_api_routes re-refuses user/project at mount time. Tightens the
prior GHSA-5qr3-c538-wm9j / #29156 hardening (bundled+user) to bundled-only.
Salvaged from #44472 (@egilewski) onto current main.
When `hermes dashboard --host 0.0.0.0` is run interactively with the auth
gate engaged but no DashboardAuthProvider configured, prompt to set up the
bundled username/password provider on the spot (or point at `hermes dashboard
register` for OAuth) instead of only emitting the fail-closed error.
- main.py: `_maybe_setup_dashboard_auth_interactively()` runs before
start_server. No-ops on loopback binds, when a provider is already
registered, or when stdin/stdout isn't a TTY (Docker/s6, CI, piped runs) so
the fail-closed SystemExit stays the backstop for unattended deploys. On the
password path it writes dashboard.basic_auth.{username,password_hash,secret}
to config.yaml (scrypt hash, never plaintext), then force-rediscovers
plugins so the basic provider registers before the gate check.
- web_server.py: fix the fail-closed hint — it told operators to set
`dashboard_auth.basic.username` but the provider reads `dashboard.basic_auth`.
- docs: note the interactive setup under Fail-closed semantics.
No new env vars; reuses the existing dashboard.basic_auth config surface.
* feat(providers): remove google-gemini-cli + google-antigravity OAuth providers
Google now actively bans accounts for third-party tools that piggyback on
Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention
sits at a backend layer the ban can extend to the entire Google account
(Gmail/Drive), with a second violation being permanent.
Ref: https://github.com/google-gemini/gemini-cli/discussions/20632
Removes both OAuth inference providers entirely (modules, provider profiles,
auth/runtime/config/models wiring, the /gquota Code Assist quota command,
the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans).
The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against
generativelanguage.googleapis.com) is unaffected and stays fully supported.
* fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed
The antigravity-cli optional skill orchestrates the external `agy` binary as
a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference
through the banned google-antigravity OAuth provider, so it carries none of
the account-ban risk that motivated removing that provider. Restore the skill,
its docs page, the sidebar entry, and the optional-skills catalog row. The
google-antigravity / google-gemini-cli inference providers stay fully removed.
Remove the dashboard --insecure auth-bypass, add an MCP persistence guard +
IOC blocklist, and raise the API-server key entropy floor.
Driven by the June 2026 hermes-0day campaign (r/hermesagent, live 854.media
instance): scanners find exposed Hermes dashboards/API servers, drive the
root agent to plant a 'command: bash' MCP entry that appends an attacker SSH
key to authorized_keys, which cron + startup then re-execute every tick.
- dashboard: --insecure no longer disables the auth gate. should_require_auth
returns True for every non-loopback bind; a public bind ALWAYS requires an
auth provider (bundled password provider or OAuth). --insecure kept as a
warned no-op for backward compat. Fail-closed error now points at the
password provider, not at --insecure.
- mcp_security: validate_mcp_server_entry now also rejects shell payloads that
write to OS persistence surfaces (authorized_keys/.ssh/pam.d/sudoers/cron/
rc files) and hard-rejects a hermes-0day IOC blocklist (attacker SSH key +
source IPs) anywhere in command/args/env. Runs at save AND spawn time.
- api_server: raise network-bind API_SERVER_KEY entropy floor 8->16 chars;
warn when a network-accessible API server runs an unsandboxed local backend.
Per @egilewski's audit on this PR (#15544), the original fix was
correct but the file has refactored since: the four endpoint-local
empty-peer checks have been consolidated into _ws_client_is_allowed
and _ws_client_reason, but the helpers were left fail-open ('no peer
host known means allow' / 'no reason to block').
On a loopback-bound dashboard with auth disabled, an ASGI server
behind a misconfigured proxy or a unix-socket transport can deliver
ws.client == None or ws.client.host == ''. The helpers were treating
that as 'allowed', so the loopback-only peer gate could be bypassed
by anything that suppressed the client tuple in transit. All four
WebSocket endpoints (/api/pty, /api/ws, /api/pub, /api/events) route
through _ws_request_is_allowed -> _ws_client_is_allowed, so the gap
applied uniformly.
Fix:
* _ws_client_is_allowed: return False when client_host is empty
instead of True. Only reached on loopback bind with auth disabled
(auth_required=True and explicit non-loopback binds short-circuit
earlier), so the fail-closed behavior is scoped to the surface
that needs it.
* _ws_client_reason: return a 'missing_or_empty_peer bound=...'
block reason instead of None, so the dispatcher's existing
reason-based rejection path picks it up and the close gets logged
with a machine-parseable token for diagnosability.
Behavior unchanged for:
* gated mode (auth_required=True) — early-returns True before the
empty-peer check runs. The OAuth ticket is the auth at that point.
* explicit non-loopback bind (--host 0.0.0.0/::, or a specific LAN
address, always with --insecure) — early-returns True before the
empty-peer check runs. DNS-rebinding is still blocked by the
Host/Origin guard in _ws_host_origin_is_allowed.
* legitimate loopback peers (client_host == '127.0.0.1' / '::1') —
not affected by the empty-peer branch.
Regression tests added in tests/hermes_cli/test_dashboard_auth_ws_auth.py:
* test_empty_client_host_rejected_in_loopback_mode
* test_missing_client_object_rejected_in_loopback_mode
* test_empty_client_host_reason_is_block
Plus two regression guards to ensure the fix does not over-reach:
* test_empty_client_host_still_allowed_in_insecure_public_mode
* test_empty_client_host_still_allowed_in_gated_mode
All three new fail-closed tests fail without this patch (the helpers
return True / None for an empty peer) and pass with it. The 45
pre-existing tests in test_dashboard_auth_ws_auth.py continue to pass.
Fixes a regression introduced by the prior approach (synchronous import
hermes_cli.gateway inside _lifespan) that caused a new failure mode:
the blocking import stalled the asyncio event loop before uvicorn could
bind its port, pushing HERMES_DASHBOARD_READY past the desktop shell's
45 s announcement deadline and triggering a respawn loop that accumulated
orphaned backend processes.
Two-part fix:
_lifespan: replace the blocking import with a fire-and-forget
run_in_executor call (_warm_gateway_module). The import runs in a
worker thread while the server socket is already open, so
HERMES_DASHBOARD_READY fires without delay.
get_status: replace the inline lazy import with
await run_in_executor(None, _resolve_restart_drain_timeout). This is
the root fix for the original 15 s socket-timeout: the blocking
.pyc-compilation + Defender scan is offloaded to a thread, keeping the
event loop free for every /api/status probe. After the first call the
module is in sys.modules and the executor returns in microseconds.
Both helpers are extracted as module-level sync functions so they can
be unit-tested independently of FastAPI or uvicorn.
Closes#50209
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Second cleanup pass (simplify-code review of the first follow-up):
- write_runtime_status now clamps active_agents via parse_active_agents
instead of an inline max(0, int(...)). Removes the duplicated clamp the
helper's docstring acknowledged AND closes a write-side ValueError gap
(a non-numeric active_agents previously raised; now degrades to 0).
- hermes_cli/gateway.py draining-status line routes its active-agents count
through parse_active_agents too — the third coercion site of the same
persisted field, now consistent and non-raising with the two HTTP surfaces.
- web_server.py /api/status: the drain-timeout resolver fallback now catches
ImportError specifically and falls back to DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT
(a real float) instead of a blanket 'except Exception -> None'. None would
have violated the surfaced field's int/float contract and stripped NAS's
poll-deadline hint silently.
- Dropped a redundant 'if runtime else 0' branch (parse_active_agents already
handles the empty/None case) and tightened the parse_active_agents docstring
to describe the actual single-contract role (write + both reads).
Follow-up cleanups on top of the busy/idle readout (PR #50103):
- web_server.py /api/status reused the single drain-timeout resolver
hermes_cli.gateway._get_restart_drain_timeout() (HERMES_RESTART_DRAIN_TIMEOUT
env -> agent.restart_drain_timeout config -> default) instead of inlining a
third hand-rolled copy of that precedence chain. Also fixes a subtle
divergence: the inline copy used os.environ.get() so a set-but-empty env var
was treated as a value rather than falling through to config; the shared
resolver .strip()s and falls through correctly.
- Added gateway.status.parse_active_agents() and routed BOTH HTTP surfaces
(/api/status and /health/detailed) through it, so the exposed active_agents
field is consistently clamped non-negative. Previously /api/status clamped
while /health/detailed exposed the raw file value, diverging on a corrupt
count.
- Added TestParseActiveAgents covering the shared coercion contract.
Give an external consumer (NAS) a trustworthy, always-reachable busy/idle
readout it can poll before a disruptive lifecycle action (restart,
migrate, stop, auto-update). The dashboard /api/status is the only HTTP
surface guaranteed up on a hosted agent regardless of which gateway
platforms are enabled, and it already reads gateway_state.json.
Add to /api/status (additive, non-breaking):
- active_agents — in-flight gateway-turn count (now refreshed
per-turn by the companion gateway-side commit)
- gateway_busy — running AND active_agents > 0
- gateway_drainable — running and live (a valid begin-drain target)
- restart_drain_timeout — resolved seconds, so the consumer can size its
poll deadline without out-of-band knowledge
(env HERMES_RESTART_DRAIN_TIMEOUT → config
agent.restart_drain_timeout → default)
The busy/drainable contract is defined once in gateway.status
(derive_gateway_busy / derive_gateway_drainable) and consumed by both
/api/status and /health/detailed so the two surfaces can never disagree.
Liveness keys off gateway_running (a live PID/health probe), NEVER
gateway_updated_at — a healthy idle gateway never advances that timestamp.
All derived fields degrade to safe falsy values when the gateway is down
or the status file is absent/corrupt (never a spurious "busy" that would
wedge the consumer). active_sessions (the 5-min DB recency heuristic the
SPA reads) is left exactly as-is — new signal, new fields.
Tests (behaviour contracts, not snapshots): the pure derivation contract
across every running/state/count/liveness combination; /api/status
integration for busy, idle-drainable, draining, down, stale-busy-file,
corrupt-count, and timeout surfacing; and /health/detailed parity.
Accounts-tab cards derived from the unified provider_catalog() carry
status_fn=None and had no hardcoded branch in _resolve_provider_status,
so any future OAuth/account provider plugin rendered permanently
logged-out. Fall through to the canonical hermes_cli.auth.get_auth_status
slug dispatcher and adapt its shape, so membership AND status both
auto-extend with the hermes model universe.
Adds the end-to-end parity contract test: every CANONICAL_PROVIDERS entry (the
`hermes model` universe) must be configurable on a desktop Providers tab —
keys(/api/env) ∪ ids(/api/providers/oauth) ⊇ canonical. Asserted as an
invariant against the live endpoints so the GUI can never silently drift from
the CLI again.
Surfacing this contract caught Bedrock: it's aws_sdk (no api-key vars), so it
had no Keys card. /api/env now tags AWS_REGION/AWS_PROFILE to the bedrock
provider card. Anthropic is whitelisted as a legitimate dual-tab provider
(direct API key + subscription OAuth).
Also refreshes the _OAUTH_PROVIDER_CATALOG docstring to describe its new role
as the override base for _build_oauth_catalog().
/api/providers/oauth now unions the explicit hand-tuned OAuth cards
(_OAUTH_PROVIDER_CATALOG — bespoke flow/status/cli, plus the api-key Anthropic
PKCE card and synthetic claude-code row) with every accounts-tab provider in
provider_catalog(). Any OAuth/external provider in the `hermes model` universe
now appears automatically, closing the drift where google-gemini-cli and
copilot-acp had no Accounts card despite being CLI-configurable.
Adds read-only status cards for google-gemini-cli (via existing
get_gemini_oauth_auth_status) and copilot-acp (managed-by-CLI, like claude-code).
DELETE handler routes through the same _build_oauth_catalog() builder.
Parity test asserts the Accounts tab offers every accounts-tab catalog provider
as an invariant.
The Keys tab now surfaces every keys-tab provider in provider_catalog() (the
`hermes model` universe), synthesizing a card even when the env var has no hand
entry in OPTIONAL_ENV_VARS. Closes the drift where openai-api, kilocode, novita,
tencent-tokenhub, and copilot were CLI-configurable but invisible in the desktop
Providers → API keys tab.
Each provider row now carries backend-derived provider/provider_label grouping
hints so the desktop can group by the same provider identity the CLI picker
uses. Hand OPTIONAL_ENV_VARS prose still wins where present (enrichment, not a
gate). Shared non-provider credentials (e.g. tool-category GITHUB_TOKEN) are
explicitly not hijacked into a provider card — Copilot uses its provider-owned
COPILOT_GITHUB_TOKEN.
- Drop empty entries before validating SLACK_ALLOWED_USERS so a trailing or
interior comma (which the gateway silently tolerates in
gateway/platforms/slack.py) is no longer rejected at the dashboard.
- Hoist the member-ID regex to a module-level _SLACK_MEMBER_ID_RE constant
and note it stays in sync with the frontend SLACK_MEMBER_ID_RE.
- Add a regression test for the trailing-comma case.
The new SLACK_ALLOWED_USERS validation rejected '*', but the Slack gateway
honors '*' as an allow-all wildcard (gateway/platforms/slack.py DM auth,
slash-confirm, and approval-button paths). Accept '*' as a valid list entry
in both the API validator and the dashboard form so a value the runtime
honors is no longer blocked at setup.
The desktop model picker had no way to force a fresh model fetch: model.options
went through the 1h-cached provider_models_cache.json, and there was no flag to
bust it. When a provider's cached list expired and its next live fetch failed,
the picker fell back to the curated static list — silently dropping live-only
models (e.g. OpenCode Zen's free tier like deepseek-v4-flash-free) the user had
been using.
- Thread refresh through model.options (RPC + REST /api/model/options) ->
build_models_payload -> list_authenticated_providers, which calls
clear_provider_models_cache() up front when set so every row re-fetches live.
- Add a 'Refresh Models' control to the desktop picker (5-locale i18n, spinning
sync icon). Normal opens leave refresh=false to stay snappy on the cache.
Verified: stale cache hides deepseek-v4-flash-free -> refresh busts it -> live
re-fetch surfaces it. refresh=false never touches the cache.
Live-test finding: the Chronos fire webhook was only on the APIServerAdapter
(aiohttp), but hosted agents expose `hermes dashboard` (the FastAPI web_server
app on :9119) as their public URL — NOT the api_server adapter. So NAS's relay
callback to {callback_url}/api/cron/fire could never reach the verifier on a
hosted agent (the exact target environment). Two layers were wrong:
1. Wrong server: /api/cron/fire didn't exist on the dashboard app. Added
cron_fire_webhook there, alongside the existing /api/cron/* dashboard routes.
It resolves the job's profile (_find_cron_job_profile) and runs fire_due via
the resolved provider under the cron-profile retarget lock
(_fire_cron_job_for_profile, mirroring _call_cron_for_profile) so the CAS
claim + run_one_job operate on the right profile's jobs.json. Runs with no
live adapters (delivery falls back to the per-platform send path, like the
desktop cron path). 202 + background so a long turn never trips NAS's
timeout; the store CAS de-dupes a NAS retry. job-not-found -> 200 "gone".
2. Auth gate: the dashboard auth middleware 401s any non-cookie request before
the handler runs. Added /api/cron/fire to the shared PUBLIC_API_PATHS so the
NAS bearer-JWT callback reaches the verifier — the JWT (purpose=cron_fire),
not the cookie, is the real gate. One shared frozenset feeds both the
loopback and OAuth middlewares, so no drift.
Kept the APIServerAdapter route too (valid self-host api_server surface).
Contract doc updated to name the dashboard app as the hosted-agent callback
surface.
Tests: test_cron_fire_dashboard (6) — route registered on the dashboard app,
in PUBLIC_API_PATHS, 401 on bad token WITH the cookie gate engaged (proves it's
reachable past the gate + JWT is the gate), 400 missing job_id, 200 gone for
unknown job, 202 + fire_due invoked for the resolved profile on a valid token.
Full hermes_cli + cron + chronos + webhook suites green (7637).
Why the original tests missed it: the api_server webhook test built an
APIServerAdapter client directly and never asserted which server the hosted
public URL exposes — green-but-wrong-integration. The new test pins the route
to the dashboard app.
* fix(dashboard): resolve chat TUI argv off event loop
Dashboard chat now resolves its TUI launch command off the
FastAPI/WebSocket event loop. The resolver can run `npm install` /
`npm run build` through `_make_tui_argv()`, and doing that synchronously
in `/api/pty` can block proxy keepalives and other dashboard WebSocket
work long enough for reverse-proxy deployments to drop the chat
connection.
This keeps the current TUI build policy intact: normal production
launches still run the correctness-first `npm run build` path, while
`HERMES_TUI_DIR` remains the prebuilt/no-build path for distros and
containers. The change only moves the potentially slow resolver work to
a worker thread for the dashboard chat path, serialized by an
`asyncio.Lock` so concurrent chat tabs preserve one-build-at-a-time
behavior. `SystemExit` (node/npm missing) and the profile `HTTPException`
path still propagate cleanly through `asyncio.to_thread()`.
Salvaged from #26124 — rebased onto current main. The async wrapper now
threads the `profile` parameter that `_resolve_chat_argv` gained on main
since the PR was opened, so cross-profile chat is preserved.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
* chore: add 0xdany to AUTHOR_MAP
* fix(dashboard): bind chat-argv lock to app.state; cover error propagation
Self-review hardening on top of the salvaged fix:
- Move `_chat_argv_lock` from a module-level `asyncio.Lock()` onto
`app.state` (initialised in `_lifespan`, lazy fallback via
`_get_chat_argv_lock`), mirroring `event_lock`. A module-level
`asyncio.Lock()` binds to whatever event loop is active at import time,
which is the exact pattern `_get_event_state`'s docstring warns against
(breaks across TestClient instances / uvicorn reloads). This keeps the
lock on the running loop.
- Add two tests exercising the real `_resolve_chat_argv_async` →
`asyncio.to_thread` → lock → re-raise chain: `SystemExit` (node/npm
missing) and `HTTPException` (invalid profile) both propagate out of the
worker thread and are caught by `pty_ws`'s existing handlers. The prior
tests mocked `asyncio.to_thread` away and never covered this path.
* test(dashboard): dedupe pty error-propagation tests; assert close code
simplify-code cleanup pass on the salvage stack:
- Extract the shared scaffolding of the two pty_ws error-propagation tests
into `_assert_pty_propagates`, keeping the two tests as distinct contracts
for the `except SystemExit` and `except HTTPException` arms.
- Assert the stable WebSocket close code (1011) instead of relying solely on
the user-facing "Chat unavailable" notice wording — a behavior contract per
the AGENTS.md "behavior contracts over snapshots" rule, robust to notice
rewording. The detail substring ("unknown profile") is still checked for the
HTTPException case since proving the detail survives the thread hop is the
point of that test.
No production-code change; the helper exercises the same real
_resolve_chat_argv_async -> asyncio.to_thread -> lock -> re-raise chain.
---------
Co-authored-by: draihan <draihan@student.ubc.ca>
* fix(desktop): show Hindsight memory provider
* feat(desktop): configure Hindsight memory provider
* fix(desktop): limit Hindsight modes to supported setup
* refactor(desktop): generic memory-provider config surface
Replace the bespoke Hindsight settings surface with a declarative,
schema-driven path so adding a memory provider is pure declaration —
no per-provider page, conditional, or endpoint.
- memory_providers.py: declarative registry. Each provider lists its
fields {key, label, kind, default, options, secret-vs-plain}. Hindsight's
mode is a select(cloud, local_external), so rejecting local_embedded
falls out of generic enum validation instead of a hand-written check.
- One generic endpoint pair GET/PUT /api/memory/providers/{name}/config.
GET returns declared fields + current values (secrets only as is_set,
never read back); PUT validates selects against their options, writes
plain fields to the provider config file, secrets to the env store,
and flips memory.provider.
- ProviderConfigPanel renders straight from the schema, replacing
hindsight-settings.tsx and the memory.provider === 'hindsight'
conditional in config-settings.tsx — same pattern as
toolset-config-panel.tsx off env_vars.
Scoped to memory providers; storage layout is unchanged so the runtime
Hindsight plugin reads the same config.json / HINDSIGHT_API_KEY / provider
keys as before. Tests cover the registry, endpoint behavior (defaults,
write+secret, select rejection, unknown provider, secret-never-returned),
and the generic panel.
DELETE /api/sessions/{id} was the only session endpoint that didn't
resolve the id (detail, messages, rename, export all call
resolve_session_id) and 404'd when the row was already gone. The desktop
optimistically removes the sidebar row, then RESTORES it and shows the
error on any failure — so deleting a session that had just been reaped
(empty-session hygiene) or removed by a concurrent client resurrected a
ghost row and surfaced "session not found". /goal + auto-compression churn
leaves transient empty rows that race the sidebar snapshot, which is the
exact "I deleted the empty one and got 'session not found'" report.
Resolve exact ids / unique prefixes, and treat an already-absent session
as an idempotent success — DELETE's contract is "ensure it's gone". This
mirrors the bulk-delete endpoint, which already treats ghost ids as
success.
Tests: deleting an absent id is idempotent (200, not 404); delete resolves
a unique prefix; a real session still deletes.
The interactive model pickers (Desktop REST API, TUI model.options, CLI
/model) were hard-capped at max_models=50, which truncated large provider
catalogs like Kilo Gateway (336 models) to just 50 entries. This made
most models undiscoverable via the picker search box.
Changes:
- Change build_models_payload() default from max_models=50 to None (unlimited)
- Change list_authenticated_providers() default from max_models=8 to None
- Change list_picker_providers() default from max_models=8 to None
- Fix all [:max_models] slicing to handle None as 'no limit'
- Remove max_models=50 from 5 interactive picker callers:
* web_server.py: get_model_options (Desktop /api/model/options)
* web_server.py: get_recommended_default_model
* model_switch.py: prewarm_picker_cache_async
* tui_gateway/server.py: model.options JSON-RPC
* cli.py: HermesCLI model picker
- Telegram/Discord inline keyboard picker (gateway/slash_commands.py)
still passes max_models=50 explicitly — unchanged behavior.
The total_models field was already in the response payload and is now
meaningful since models.length == total_models for interactive pickers.
Fixes#48279
The dashboard MCP catalog only showed name/description/transport and a
non-clickable source. Users couldn't see what an entry connects to or runs
before installing — the exact detail the docs trust model tells them to vet.
- /api/mcp/catalog now returns transport target (url, or command+args),
auth_type, git install source/ref + bootstrap commands, default-enabled
tool hint, and post-install guidance per entry.
- McpPage renders the endpoint URL (http) or command+args (stdio), the git
install source/ref, a collapsible bootstrap-commands list, setup notes,
and the source as a clickable link when it's a URL.
- Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does
not support pip installs; MCP ships with the standard install) and note
the dashboard now surfaces this detail.
- Strengthen the catalog endpoint test to assert the new inspection fields.
Follow-up to #47663 (streaming multipart upload), fixing two issues that
landed with it.
1. Temp file leaked on client disconnect. The streaming upload endpoint's
except chain caught only HTTPException / PermissionError / OSError — all
Exception subclasses. asyncio.CancelledError, raised when a browser aborts
a large upload mid-stream (the exact NS-501 scenario), is a BaseException,
so it bypassed every except clause and reached a finally that only closed
the file handle and never unlinked the temp file. Every aborted large
upload orphaned a partial `.{name}.*.upload` file (up to ~100 MB) in the
target directory. Cleanup now lives in finally, keyed on a `renamed`
success flag, so the temp file is removed on every non-success exit
including BaseException paths. Added test_stream_upload_cleans_temp_on_cancellation,
which fails on the pre-fix code (leaks the temp file) and passes with the fix.
2. python-multipart pinned to ==0.0.27 instead of ==0.0.20. The package was
already resolved at 0.0.27 transitively (via daytona) before #47663; the
explicit ==0.0.20 pin in the [web] extra and the tool.dashboard lazy-install
set downgraded it. Bumped both to ==0.0.27 and regenerated with `uv lock`,
keeping the lockfile coherent. The base dependency stays >=0.0.9,<1.
* fix(dashboard): stream file uploads via multipart instead of base64 JSON
The dashboard file manager uploaded files (including backup/restore zip
archives) by reading them client-side with FileReader.readAsDataURL and
POSTing a base64 data URL inside a JSON body to /api/files/upload. For a
large backup this (a) inflates the payload ~33%, (b) buffers the whole
file plus its decoded copy in memory, and (c) reliably trips an upstream
proxy body-size/timeout limit, surfacing as a 502 with the upload
appearing to hang indefinitely (NS-501). Dashboard-only hosted users have
no shell fallback to place the archive, so backup restore was unusable.
Add a streaming multipart endpoint POST /api/files/upload-stream
(UploadFile + Form) that reads the request body in 1 MiB chunks straight
to a sibling temp file, enforces the existing 100 MB size cap as it
streams (413 on overflow, before buffering the whole file), and
atomically renames into place so a partial/aborted/over-limit upload
never clobbers an existing file. The frontend api.uploadFile now sends
multipart/form-data (raw bytes, no base64, browser-set boundary) and
FilesPage passes the File object directly; the dead readAsDataUrl helper
is removed. The legacy base64 JSON endpoint stays for backward compat.
FastAPI's UploadFile/Form require python-multipart, which is NOT pulled in
by fastapi itself, so it is added to the base deps, the [web] extra, and
the tool.dashboard lazy-install set (kept in sync).
Validated: 5 new endpoint tests (roundtrip, multi-chunk >1 MiB,
over-limit 413 without clobbering + no temp-file leak, overwrite=false
conflict, forced-root traversal containment); existing base64 tests still
pass; web typecheck + vite build clean; and a real uvicorn server E2E
(5 MB multipart upload -> HTTP 200 in 0.21s, exact byte match) plus a
30 MB TestClient roundtrip confirm constant-memory streaming end to end.
Reported via beta (NS-501).
* build(deps): regenerate uv.lock for python-multipart (NS-501)
CI ran uv lock --check / uv sync --locked which failed because the
python-multipart dependency add was not reflected in uv.lock. Regenerate
the lockfile (resolves to 0.0.20, matching the [web] extra pin) after
merging current main.
Phase 3 — rebind both ticker call sites to resolve_cron_scheduler(). Default
(built-in) path is byte-identical; Phase 0 characterization tests + the full
gateway suite (6919) stay green.
Task 3.1: split gateway/run.py _start_cron_ticker into:
- _start_gateway_housekeeping() — the gateway-only chores (channel-dir
refresh, image/doc cache cleanup, paste sweep, curator poll), now on their
own loop/thread, independent of which cron provider is active.
- _start_cron_ticker() — kept as a DEPRECATED shim that runs only the
built-in InProcessCronScheduler().start(), preserving the symbol for
hermes_cli/debug.py and the Phase 0 characterization test.
Task 3.2: start_gateway() resolves the provider and runs provider.start() in
the 'cron-scheduler' thread, plus a second 'gateway-housekeeping' thread;
teardown sets the shared cron_stop, calls provider.stop(), joins both.
Task 3.3: desktop _start_desktop_cron_ticker() swapped its inline tick loop for
resolve_cron_scheduler().start() (no adapters/loop — desktop has none).
The provider owns ONLY the cron tick (so an external scale-to-zero provider
with no 60s loop fits); gateway housekeeping is decoupled from the cron
trigger. Both threads share cron_stop.
Verified: full tests/cron/ (453) + full tests/gateway/ (6919) green. Manual
gateway smoke (Task 3.4) is operator-run, pending.
get_runtime_status_running_pid() validates liveness with a local
os.kill(pid, 0) probe. In /api/status the runtime record can be the
REMOTE health-probe body (cross-container), whose PID belongs to another
host and is display-only — probing it locally is wrong and trips the
test live-system guard (os.kill on a PID outside the test subtree).
Run the fallback only against the local read_runtime_status() record.
_profile_scope swaps process-global skills_tool/skill_manager module
attrs under an RLock; /api/status holds that scope across the
run_in_executor remote-health probe await, so a concurrent
/api/skills?profile=X request can cross-restore the status profile's
skill dir on its finally. Add _config_profile_scope (contextvar-only,
task-local, await-safe) and use it for status, which only resolves
get_hermes_home() at call time for config/env/gateway state and never
needs the skills-module globals.
External providers (Claude Code) store creds outside Hermes, so the
disconnect API refuses them. The backend now hands the GUI a per-OS
`disconnect_command` that clears the credential the same way the CLI's
logout does (macOS Keychain entry + ~/.claude/.credentials.json), and
the misleading "use claude setup-token" hint is corrected.
Settings → Providers offers a Disconnect button for these: it confirms,
leaves Settings, and runs the removal command in the embedded terminal
via a new runInTerminal() (queues onto $terminalInjection; the terminal
pane flushes and clears it once its session is live). The expanded list
also gets its own "Other providers" header so it no longer reads as
grouped under "Connected". API-managed providers keep the one-click
(trash) disconnect.
On a remote gateway connection, agent-written files live on the gateway
host, not the desktop's disk, so the Artifacts view's file:// hrefs failed
("Invalid external URL") and image thumbnails broke.
Make mediaExternalUrl() remote-aware in one place: in remote mode it
rewrites gateway-local paths to GET /api/files/download (a new endpoint
that streams the file as a Content-Disposition: attachment). The artifacts
view now resolves through it, and so do the existing chat-media and
generated-image callers, for free.
The download endpoint stays auth-gated; auth_middleware additionally
accepts the session token as a ?token= query param for this one path so a
shell/browser-opened download (which can't set the session header) still
authenticates — the same query-token tradeoff as the /api/pty WebSocket.
It is NOT added to PUBLIC_API_PATHS.
Salvages #46663 (which carried ~19k lines of CRLF noise and made the
endpoint public). Reimplemented on a clean LF base with the security hole
closed and tests added.
Co-authored-by: qingshan89 <qs2816661685@gmail.com>