Commit graph

650 commits

Author SHA1 Message Date
Austin Pickett
ce99a81123 fix(dashboard): suppress unicode-animations postinstall during npm ci
Set CI=1 in _run_npm_install_deterministic so the package's /dev/tty
postinstall demo is skipped during hermes dashboard web UI builds.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 11:49:08 -07:00
Teknium
5508f4bc54 fix(cli): utf-8 decode for whatsapp-bridge npm install capture (sibling of #43790) 2026-06-11 09:00:55 -07:00
helix4u
b2043cf157 fix(tui): decode startup subprocess output as utf-8 2026-06-11 09:00:55 -07:00
helix4u
dca11b6650 fix(mcp): preserve stdio argv passthrough 2026-06-11 08:59:55 -07:00
Teknium
875aa8f162
feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher (#44007)
* feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher

The dashboard becomes a machine-level management surface with one
write-target selector, replacing per-profile dashboard fragmentation.

Backend:
- profile param (query or body) on /api/config (get/put/raw), /api/env
  (get/put/delete/reveal), /api/mcp/servers (list/add/remove/test/enabled),
  /api/mcp/catalog (list/install), /api/model/info, /api/model/set —
  all scoped through the existing _profile_scope() context manager
- model/set restructured: expensive-model warning (await) runs before the
  scope; the config write runs sync inside the scope in a worker thread
- MCP catalog installs + git-bootstrap entries spawn 'hermes -p <profile>'
- chat PTY: ?profile= on /api/pty points the child's HERMES_HOME at the
  profile dir (its own gateway subprocess, config/skills/memory/state.db
  all profile-bound); in-process gateway attach skipped when scoped

CLI launch unification:
- '<profile> dashboard' routes to the machine dashboard: attach (open
  browser at ?profile=) when one is listening, else re-exec pinned to the
  default profile with --open-profile preselecting the launcher
- --isolated preserves the old dedicated per-profile server behavior
- start_server(initial_profile=...) appends ?profile= to the auto-open URL

Frontend:
- ProfileProvider + sidebar ProfileSwitcher: ONE global selector, URL-
  persisted (?profile=), mirrored into fetchJSON which auto-appends the
  param to the scoped endpoint families (explicit params win)
- app-wide amber banner names the managed profile
- SkillsPage's page-local selector (from the skills-scoping PR) folded
  into the global context — single source of truth
- ChatPage threads the scope into the PTY WS URL; switching profiles
  remounts the terminal into a fresh scoped session

Omitted profile keeps legacy behavior everywhere.

* docs(dashboard): document machine-level multi-profile management

- web-dashboard.md: 'Managing multiple profiles' section (switcher, URL
  deep-links, unified launch, --isolated, scoped Chat, what stays
  per-profile) + --isolated in the options table
- profiles.md: 'From the dashboard' subsection + set-as-active vs
  switcher clarification
- cli-commands.md: --isolated flag + profile-alias launch example

* fix(dashboard): address profile-unification review findings

Review findings (dev review on PR #44007):

1. HIGH — stale page state on profile switch: pages load data on mount
   and didn't consume the profile scope, so a page opened under profile A
   kept showing A's state while writes silently targeted the newly
   selected B. Fixed structurally: ProfileKeyedRoutes wraps the routed
   page tree and keys it by the selected profile, remounting every page
   (fresh state + refetch) on switch. ChatPage keeps its own remount
   (channel keyed on scopedProfile).

2. HIGH — /api/model/auxiliary read was unscoped while /api/model/set
   wrote scoped (Models page could show default's aux pins while editing
   worker's). Endpoint now takes profile + _profile_scope, added to
   PROFILE_SCOPED_PREFIXES, HTTPException re-raise so ghost profiles 404
   instead of 500. Regression test asserts read/write symmetry with
   differing worker/default aux config.

3. MEDIUM — tools post-setup spawned unscoped from the profile-aware
   drawer. Now spawns 'hermes -p <profile> tools post-setup <key>'
   (same mechanism as hub installs); drawer threads its profile prop.
   Most hooks install machine-level artifacts where the scope is inert,
   but hooks reading config/env now see the drawer's HERMES_HOME.

4. LOW — ty warnings: env Optional asserts before subscript/membership,
   fastapi import replaced with web_server.HTTPException re-use.

298 tests green across the four affected suites; tsc -b + vite build
green; aux scoping E2E-verified with real imports.

* fix(dashboard): address second profile-unification review (gille)

1. BLOCKER — profile scope dropped on sidebar navigation: ProfileProvider
   derived the selection from the current URL, and nav links are bare
   paths, so clicking Config from /skills?profile=worker silently reset
   the write target. State is now the source of truth; an effect
   re-asserts ?profile= onto the new location after every navigation
   (URL stays a synchronized projection for deep links/refresh), and an
   incoming URL param (e.g. 'Manage skills & tools' links) still wins.

2. BLOCKER — /api/model/options unscoped while model/set wrote scoped:
   the picker context (current model/provider, custom providers,
   per-profile .env auth state) now loads inside _profile_scope; added
   to PROFILE_SCOPED_PREFIXES. Test: a worker-only current-model pin
   appears in the scoped payload and not the unscoped one.

3. BLOCKER — MCP test-server probe escaped the scope after the config
   read: the probe now re-enters _profile_scope inside the worker thread
   so env-placeholder expansion resolves against the selected profile's
   .env. Known limit (documented): the probe's dedicated MCP event-loop
   thread doesn't inherit the contextvar (OAuth token paths). Test
   asserts get_hermes_home() inside the probe == the worker profile dir.

4. BLOCKER — broad excepts swallowed unknown-profile 404s: /api/model/info
   degraded to 200-with-empty-model-info and /api/mcp/catalog to a
   silently-empty catalog. Both re-raise HTTPException; 404 regression
   tests added for info/options/catalog.

Polish: scope banner clears the fixed mobile header (mt-14 lg:mt-0);
--open-profile hidden via argparse.SUPPRESS (internal re-exec flag);
attach-path test now asserts the opened ?profile= URL.

(Stale-page-state + /api/model/auxiliary findings from this review were
already fixed in 92bcd1568 — the review ran against e600f6951.)

35 tests in the two new suites + 274 in the adjacent ones, all green;
tsc -b + vite build green; scoping E2E-verified with real imports.

* docs(dashboard)+fix: self-review pass — Profiles page section, REST profile-param tip, body-beats-query precedence

Docs:
- web-dashboard.md: add the missing 'Profiles' subsection to Pages
  (cards, create/builder, manage-skills jump, set-as-active vs switcher
  distinction, editors); REST API section gets a profile-scoped-endpoints
  tip documenting ?profile= / body profile / 404 semantics / /api/pty
- (profiles.md + cli-commands.md were already updated in e600f6951)

Precedence fix: scoped endpoints taking BOTH a query param and a body
field now resolve body.profile first. The SPA's fetchJSON injects the
query param from the GLOBAL switcher; an explicit body.profile (e.g.
Profile Builder flows writing into a specific new profile) is the more
specific intent and must not be overridden by whatever the sidebar
happens to be set to. Matches the documented 'explicit beats global'
contract in api.ts.

Verified: 304 tests green across the four suites; tsc -b + vite build
green; docusaurus build green (only pre-existing broken-link warnings,
none from this PR's pages).
2026-06-11 03:29:33 -07:00
brooklyn!
975edd4140
fix(cli): omit --workspace when subpackage has its own package-lock.json (#42973) (#43986)
* fix(cli): omit --workspace when subpackage has its own package-lock.json

When ui-tui/ (or web/) contains its own package-lock.json, _workspace_root()
returns the subpackage directory itself.  Passing --workspace ui-tui in that
case fails because npm cannot find a workspace named 'ui-tui' inside ui-tui/.

Fix: skip the --workspace flag when npm_cwd equals the target directory,
running a plain 'npm install' from the standalone project root instead.

Applies the same fix to both _make_tui_argv (TUI) and _build_web_ui (web).

Fixes #42973

* test(cli): fix web workspace-scope fixture + cover own-lockfile fallback (#42973)

The web half of the #42977 fix broke test_npm_install_uses_workspace_web_scope,
which built its fixture with no lockfile anywhere. Without a root lockfile,
_workspace_root(web_dir) already returns web_dir, so the new
"() if npm_cwd == web_dir" branch correctly drops --workspace and the
assertion failed. Model a real workspace checkout instead: the single
package-lock.json lives at the root, so --workspace web scopes the install.

Also add the symmetric web regression test (web/ carrying its own lockfile =>
--workspace must be dropped and the install runs plainly from web_dir via
npm ci), matching the TUI coverage already in test_tui_npm_install.py.

---------

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-11 05:01:25 +00:00
Barron Roth
2c19208224 feat(tts): add Gemini audio tag rewrite 2026-06-10 02:57:39 -07:00
Teknium
a5c32cdf30
fix(update): self-heal a venv left half-built by an interrupted install (#42172)
* fix(update): self-heal a venv left half-built by an interrupted install

An update killed mid dependency-install (Ctrl-C, terminal close, WSL OOM)
could leave the venv with pip wiped and core deps (e.g. Pillow) missing,
with no automatic recovery — the user had to manually run ensurepip +
reinstall.

Drop an install-scoped .update-incomplete breadcrumb right before the dep
install and clear it only after core-dependency verification passes. On the
next launch (any command except 'update' itself), if the marker is present,
unconditionally bootstrap pip via ensurepip then re-run the .[all] install +
verification, then clear the marker. Failure leaves the marker for retry and
prints the manual recovery command. Never raises — recovery cannot block
launch.

* fix(update): address review — stderr-only recovery output, single-flight lock, gitignore marker

- Route all recovery output (status lines + streamed pip/uv install via
  fd-level dup2) to stderr so protocol-on-stdout launches (hermes acp)
  never get install noise on the JSON-RPC stream.
- Single-flight O_EXCL lockfile (.update-incomplete.lock) so a gateway
  start + CLI launch (or two profiles) can't run concurrent installs
  into the shared venv; stale locks (>1h) are broken for the next launch.
- gitignore .update-incomplete + lock so source-tree installs keep a
  clean git status and update's autostash skips them.
- Document why the loose 'update' argv substring match is intentional
  (over-match defers one launch; under-match would race the real update).
- 4 new tests: lock held → skip, stale lock broken, lock released,
  output lands on stderr only.
2026-06-10 02:57:05 -07:00
Robin Fernandes
af978ecb17 fix(model): require confirmation for expensive model selections
Rebased onto current main and re-ported across the restructured
surfaces: model flows now thread confirm_provider/base_url/api_key
through hermes_cli/model_setup_flows.py, the Discord picker lives in
plugins/platforms/discord/adapter.py, and the web dashboard picker
applies chat-mode switches via config.set so the expensive-model
confirmation can ride the response.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 00:24:06 -07:00
Ondrej Drapalik
1c055a4c58 fix(xai): accept Grok Build code during loopback wait + tiny screenshot guard
xAI's consent page renders the authorization code in-page instead of
redirecting to the loopback callback, so the listener just hangs and the
manual-paste flow demands a callback URL that never contains the token.

- auth.py: poll stdin non-blockingly while waiting for the xAI loopback
  callback; accept a pasted bare Grok Build code and substitute the locally
  generated state (PKCE code_verifier still binds the exchange). No need to
  wait for timeout or re-run with --manual-paste.
- computer_use: parse PNG/JPEG dimensions from base64 and fall back to the
  text/AX/SOM payload when the screenshot is below the provider minimum
  (8x8), which xAI rejects with HTTP 400.
- model_setup_flows.py: xAI credential reuse prompt uses the standard radio
  picker via a shared _prompt_auth_credentials_choice helper.
- main.py: thread a title through _prompt_provider_choice; re-home the helper
  import (flows live in model_setup_flows.py post-decomposition).

Salvaged from #36781 onto current main (contributor's main.py edits re-homed
to model_setup_flows.py, where the flows were extracted since the PR opened).
2026-06-09 23:21:24 -07:00
brooklyn!
218452b050
fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149)
* fix(state.db): recover from malformed sqlite_master so hidden sessions reappear

The corruption class behind "Desktop/Dashboard show no sessions while
hundreds of session files sit on disk" is a malformed sqlite_master — most
often a duplicate object row, e.g. two CREATE VIRTUAL TABLE messages_fts
entries — surfacing as:

    sqlite3.DatabaseError: malformed database schema (messages_fts) -
    table messages_fts already exists

SQLite parses the whole schema while preparing the FIRST statement on a
connection, so on this class every statement fails before it runs: PRAGMA
journal_mode (which is where SessionDB.__init__ actually trips, in
apply_wal_with_fallback, BEFORE _init_schema), PRAGMA integrity_check, and
even DROP TABLE. The only operations that still work are
PRAGMA writable_schema=ON plus direct sqlite_master surgery. A plain
FTS-index rebuild at the _init_schema layer therefore cannot reach or fix
this; the canonical sessions/messages rows are intact — only the derived
schema is broken.

Add a dedicated recovery that operates where the failure actually happens:

- hermes_state.repair_state_db_schema(): backs up the raw file first, then a
  least-destructive ladder — (1) de-duplicate sqlite_master keeping the
  lowest rowid per object (preserves the existing FTS index), escalating to
  (2) drop every messages_fts* schema object + VACUUM and let the next open
  rebuild the FTS index from messages. sessions/messages are never modified.
  Plus is_malformed_db_error() to discriminate this class.
- SessionDB.__init__ auto-heals: on a malformed-schema open error it repairs
  once (process-guarded against loops / concurrent web_server opens) and
  reopens, so Desktop/Dashboard recover on their own instead of silently
  showing "no sessions".
- hermes doctor --fix detects the malformed class and repairs it (reporting
  the recovered session count + backup name).
- hermes sessions repair [--check-only] [--no-backup] runs on the raw file
  path, since SessionDB() itself cannot open a malformed DB.

Supersedes #32589 and #33869: both targeted FTS corruption but gated their
repair behind statements (integrity_check / SELECT / DROP TABLE) that
themselves fail on this class, and neither addressed the apply_wal_with_fallback
open-time failure. Credit preserved via Co-authored-by.

Closes #33865.

Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com>
Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>

* test(state.db): cover strat-B escalation + unrepairable safe-fail paths

---------

Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com>
Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>
2026-06-09 18:49:08 -05:00
brooklyn!
ba44de06da
fix(install): self-heal a stuck Electron download (salvage of #42894) (#42998)
* fix(install): self-heal a stuck Electron download on the desktop build

The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached
zip, or a blocked/throttled GitHub release host (the repeating "retrying" log),
hard-failed the install — and install.sh had no recovery at all while
install.ps1 / `hermes desktop` only purged the cache.

All three build paths now escalate on a failed `npm run pack`:
GitHub → purge corrupt electron-*.zip + stale *-unpacked and retry → one retry
via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the
download, and a user-pinned ELECTRON_MIRROR is always respected (never
overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror
the existing PowerShell/Python helpers.

* test(install): cover the Electron mirror fallback

Verify `hermes desktop` falls back to a mirror when the cache purge finds
nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt,
not overridden).

* docs(desktop): troubleshoot a stuck Electron download

Document the automatic cache-purge + mirror fallback, how to pin your own
ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand.

* docs(install): correct the Electron mirror trust framing

The mirror-fallback comments and the desktop troubleshooting doc implied
`@electron/get`'s SHASUM check makes the npmmirror.com download safe against
tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so
the check guards against a corrupt/partial download, not a compromised mirror.

Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the
docs) to state the trust trade-off honestly — npmmirror.com is the de-facto
Electron community mirror, we only fall back to it after the canonical GitHub
download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No
behavior change.

---------

Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
2026-06-09 18:19:14 +00:00
helix4u
f8adefdebf fix(tui): apply terminal backend config before launch
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Build Skills Index / build-index (push) Has been cancelled
Build Skills Index / trigger-deploy (push) Has been cancelled
2026-06-09 00:31:27 -07:00
Teknium
54318c65b0
feat(models): seed model-catalog disk cache from checkout on update (#42614)
hermes update pulls the latest repo, so the freshly-pulled
website/static/api/model-catalog.json is already the newest catalog. Copy
it straight over ~/.hermes/cache/model_catalog.json instead of relying on a
network fetch (which can be Vercel bot-gated or hit a Portal hiccup and
silently degrade the picker to a stale/short list).

Adds seed_cache_from_checkout() in model_catalog.py (read shipped manifest,
validate, atomic write via _write_disk_cache, reset in-process cache) and
calls it from both update paths in main.py: _cmd_update_impl (git pull) and
_update_via_zip (Docker/no-git). Non-fatal on missing/malformed/invalid
files — the normal network refresh still applies on next picker open.
2026-06-08 22:31:06 -07:00
Brooklyn Nicholson
e88116256c fix(update): scope git fetch to target branch
A bare `git fetch origin` (and `git fetch upstream`) pulls every ref. The
repo carries thousands of auto-generated branches, so on any
non-single-branch checkout the installer's update path and `hermes update`
spend minutes downloading the full branch list — long enough to stall the
desktop installer or trip the follow-up `git pull --ff-only`.

Scope every update-path fetch to the branch we actually compare/merge
against:
- scripts/install.sh: collapse the remote to single-branch and fetch only
  $BRANCH on the "existing install, updating" path.
- hermes_cli/main.py: fetch the resolved branch in the apply path, the
  --check path (upstream + origin), and the fork upstream-sync.

Tracking-ref updates still happen via git's opportunistic refspec, so the
later origin/<branch> rev-parse/rev-list checks are unaffected.

Tests assert the apply-path fetch is branch-scoped and never bare.
2026-06-08 15:24:31 -04:00
xxxigm
96fd9d4979
fix(desktop): stop running Hermes.exe locking win-unpacked before Windows pack (#42100)
* fix(desktop): stop running app locking win-unpacked before pack

On Windows a running Hermes.exe keeps an exclusive lock on
release/win-unpacked/Hermes.exe, so electron-builder's pack cannot
replace it and dies with "remove ...\Hermes.exe: Access is denied" /
ERR_ELECTRON_BUILDER_CANNOT_EXECUTE (before-pack hits the same EPERM
cleaning the dir, and the cache-purge retry repeats the failure since
the lock is still held).

Before building the packaged app, terminate any process whose
executable lives inside this build's release/ tree so the rebuild --
including the installer's headless --update rebuild -- can replace the
binary. Scope is narrow (only exes under release/), POSIX is a no-op
(it can unlink a running binary), and the final error now points
Windows users at the running-app cause.

* test(desktop): cover the win-unpacked lock-breaker helper

Verify _stop_desktop_processes_locking_build is a no-op off-Windows,
terminates only processes whose exe lives under release/ (sparing our
own PID and unrelated installs), and short-circuits when no release dir
exists.
2026-06-08 11:51:31 -07:00
teknium1
a77efada5f refactor(cli): extract 18 model-flow wizard functions into model_setup_flows (god-file Phase 2)
Lift the 18 _model_flow_* provider-setup wizard functions out of hermes_cli/main.py
into hermes_cli/model_setup_flows.py. Behavior-neutral; main.py 14050 -> 11479 LOC.

select_provider_and_model (the dispatcher) STAYS in main.py and re-imports the
flows via an explicit 'from hermes_cli.model_setup_flows import (...)' block, so
both its bare-name calls and existing test monkeypatches targeting
hermes_cli.main._model_flow_* keep resolving against main's namespace unchanged.

Imports: 3 neutral deps (argparse, os, subprocess) at the module top; the 14
main.py-internal helpers the flows call (_prompt_api_key, _save_custom_provider,
the reasoning-effort/stepfun/qwen helpers, _run_anthropic_oauth_flow, ...) are
lazy-imported per-flow (from hermes_cli.main import ...) so the new module never
imports main at module scope -> no import cycle.

Repointed one source-inspection change-detector (test_setup_ollama_cloud_force_refresh)
to read the module the ollama-cloud branch moved to.

Validation: 6563/6563 hermes_cli tests pass; live flow-dispatch probe confirms the
lazy main-internal imports resolve at runtime.
2026-06-08 09:42:44 -07:00
floory
15c99b437f
fix(cli): set PYTHON env for node-gyp native builds on NixOS (#40690)
* fix(cli): set PYTHON env for node-gyp native builds on NixOS

node-gyp (triggered by node-pty during npm ci) looks for python3 on
PATH, which fails on NixOS because python3 lives in the nix store and
is not on the system PATH.

Add _nixos_build_env() — a two-tier helper that detects NixOS and:
1. Fast path: hermes venv python3 (~0s)
2. Fallback: nix-shell which python3 (~2-5s)

Wire it into _run_npm_install_deterministic() via a new env= parameter,
then pass it through cmd_gui() and _update_node_dependencies().

Non-NixOS systems: _nixos_build_env() returns None, behavior unchanged.

* fix(cli): merge _nixos_build_env() with os.environ, fix NixOS detection, add explicit return None

- Critical fix: both Tier 1 (venv) and Tier 2 (nix-shell) now return
  {**os.environ, "PYTHON": ...} instead of {"PYTHON": ...} — subprocess.run
  with env= replaces the entire environment, so the old code wiped PATH
  and broke npm/node on NixOS entirely.
- Uses re.search(r"^ID=nixos$", ...) for anchored NixOS detection instead
  of unanchored substring match (could match ID_LIKE=...nixos).
- Removes redundant Path.exists() guard before read_text(); just catches
  OSError (one filesystem read instead of two).
- Adds explicit return None at end of function for type-hint consistency.
2026-06-08 13:57:37 +05:30
teknium1
1a626470ca refactor(cli): promote 9 closure handlers to top-level + extract their parsers (god-file Phase 2 follow-up)
Subcommands whose handler was a closure defined inside main() — memory, acp,
tools, insights, skills, pairing, plugins, mcp, claw — have their handler
promoted to a top-level function and their parser block extracted into
hermes_cli/subcommands/<name>.py (build_<name>_parser, injected handler).

These 9 had zero closure-over-main-locals, so promotion is a pure relocation.
acp/mcp parser blocks use the shared add_accept_hooks_flag helper.

main() 1798 -> 954 LOC (71% below the 3297 Phase-2 starting point);
add_parser calls in main.py 89 -> 28.

Deferred: sessions, computer-use, secrets handlers reference <name>_parser
(for a no-subcommand print_help fallback) — left in place to avoid the
_self_parser indirection; minority, low value.

Behavior-neutral: all 9 subcommands' --help (incl nested subactions) byte-
identical to pre-extraction (diff-verified). tests/hermes_cli/ 6519 passed /
0 failed; new test_subcommands_followup.py covers the 9 builders.
2026-06-07 22:56:23 -07:00
teknium1
568e127612 refactor(cli): extract 25 more subcommand parsers into hermes_cli/subcommands/
Batch extraction of every remaining subcommand whose handler is top-level and
whose parser block is pure argparse: model, setup, postinstall, whatsapp, slack,
login, logout, auth, status, webhook, hooks, doctor, security, dump, debug,
backup, import, config, version, update, uninstall, dashboard, gui, logs,
prompt-size.

Each becomes hermes_cli/subcommands/<name>.py with build_<name>_parser() and an
injected handler (no main import). dashboard also injects cmd_dashboard_register
for its nested 'register' action.

Behavior-neutral: all 25 subcommands' --help output (and nested subaction help)
diff-verified byte-identical to pre-extraction. Two RawDescriptionHelpFormatter
epilogs (debug, logs) needed their multi-line string interiors preserved at
column 0 — caught by the --help diff, not compile.

main() 3297 -> 1798 LOC across this PR; add_parser calls in main.py 179 -> 89.

Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new test_subcommands_batch.py smoke-tests all 25 builders + the
dashboard two-handler case.
2026-06-07 22:18:14 -07:00
teknium1
4da45e8727 refactor(cli): extract profile + gateway/proxy parsers into hermes_cli/subcommands/
Follow-on to the cron extraction in the same Phase 2 PR. Same pattern:
per-group build_<name>_parser() functions with injected handlers, no main
import.

- subcommands/profile.py: build_profile_parser (190-line block out of main()).
- subcommands/gateway.py: build_gateway_parser (gateway + proxy, 238-line block;
  they shared one inline section). Imports argparse for SUPPRESS defaults.
- main(): two more inline blocks become single builder calls.

Behavior-neutral: 'profile [sub] --help' and 'gateway/proxy [sub] --help'
byte-identical to pre-extraction (diff-verified).

main() now 2723 LOC (was 3297 at Phase 2 start); add_parser calls in main.py
179 -> 141.

Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new builder unit tests cover subactions, aliases, dispatch, flags.
2026-06-07 22:18:14 -07:00
teknium1
b2e6053243 refactor(cli): extract hermes cron parser into hermes_cli/subcommands/ (god-file Phase 2)
Phase 2 of the god-file decomposition plan. main()'s argparse tree is 179
inline add_parser calls in one 3,297-line function. This establishes the
hermes_cli/subcommands/ package and extracts the first group (cron) as the
proof-of-pattern:

- hermes_cli/subcommands/_shared.py: shared parser helpers (add_accept_hooks_flag),
  re-exported from main.py for backwards compat.
- hermes_cli/subcommands/cron.py: build_cron_parser(subparsers, cmd_cron=...).
  Handler injected so the module never imports main (cycle avoidance).
- main()'s ~155-line inline cron block becomes one build_cron_parser() call.

Behavior-neutral: 'hermes cron create --help' output is byte-identical to
origin/main. main() 3297 -> 3143 LOC.

Validation: tests/hermes_cli/ 6466 passed / 0 failed under per-file process
isolation; new test_subcommands_cron.py covers subactions, aliases, options,
no-agent tristate, injected dispatch, and --accept-hooks.
2026-06-07 22:18:14 -07:00
Teknium
5b43bf7d02
feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355)
* feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI)

Adds a GUI-only uninstall path so people can remove the desktop Chat GUI
while keeping the Hermes agent + their config/sessions/.env, and surfaces
the three CLI uninstall modes inside the desktop app's Settings → About.

CLI:
- New hermes_cli/gui_uninstall.py: cross-platform discovery + removal of the
  desktop GUI's artifacts (source-built dist/release/node_modules + build
  stamp, the packaged app bundle, and the Electron userData dir) on Linux,
  macOS, and Windows. Never touches the agent source, venv, or user data.
- `hermes uninstall --gui` removes only the Chat GUI; `--gui-summary` prints a
  JSON install snapshot (used by the desktop UI to gate options + detect a
  missing agent for a future lite client).
- `hermes uninstall --yes` / `--full --yes` now run non-interactively, sharing
  the destructive sequence via a new _perform_uninstall() helper. The keep-data
  and full flows also sweep the GUI artifacts.

Desktop:
- electron/desktop-uninstall.cjs: pure helpers mapping each mode (gui/lite/full)
  to CLI flags, resolving the running app bundle per OS, and building the
  detached cleanup script that waits for the app to exit, runs the Python
  uninstall, and removes the bundle.
- IPC hermes:uninstall:summary / :run, preload bridge, and types.
- Settings → About "Danger zone" with the three options; agent-removing
  options hide when no local agent is detected.

Tests: tests/hermes_cli/test_gui_uninstall.py (22 pass with the existing
uninstall tests), electron/desktop-uninstall.test.cjs (17 pass, wired into
test:desktop:platforms). Docs: desktop.md "Uninstalling" + cli-commands.md.

* fix(desktop): tear down backend process tree before GUI uninstall (Windows lock safety)

The desktop uninstall cleanup script waited only on the desktop app's own
PID, but a backend grandchild (gateway / pty terminal / hermes REPL) can
outlive it and keep hermes.exe + venv files mandatory-locked on Windows —
making the script's rmdir half-fail and leaving a partial install, the same
failure class as the self-update path's #37532.

- main.cjs: runDesktopUninstall now awaits releaseBackendLock() before
  spawning the cleanup script — tree-kills every backend PID the desktop owns
  (primary + pool) via taskkill /T /F and polls the venv shim until unlocked.
  Extracted the shared core out of releaseBackendLockForUpdate so both the
  update hand-off and the uninstaller use the identical, incident-hardened
  teardown. No-op on macOS/Linux (no mandatory locks).
- desktop-uninstall.cjs: Windows cleanup script removes the bundle via a
  bounded rmdir retry loop (10x, 1s) instead of a single rmdir, since Windows
  releases directory handles lazily even after the holding process exits.
- Dropped a fragile tasklist|findstr reap-by-path attempt; the Electron-side
  tree-kill-by-PID is the reliable mechanism.

Tests: desktop-uninstall.test.cjs updated for the retry-loop output (17 pass).

* fix(desktop): address review on GUI uninstall (venv self-delete, gates, wait-loop)

Resolves @OutThisLife's review on #40355:

1. full mode now gated on agent presence (needsAgent: true). It removes the
   agent + user data, so on a lite client with no local agent it's hidden
   like lite — no more offering to remove an agent that isn't there.

2. (Finding 3, the real bug) lite/full no longer rmtree the venv from the
   venv's OWN python. On Windows a running python.exe is mandatory-locked, so
   that half-fails. New lightweight 'python -m hermes_cli.uninstall --mode X'
   entrypoint (stdlib-only imports) lets the desktop run agent-removing modes
   under the SYSTEM python (findSystemPython) with PYTHONPATH=<agentRoot>, so
   import hermes_cli resolves from source while the venv is torn down. Falls
   back to venv python + logs when no system python (gui-only unaffected).

3. Windows wait-loop is now bounded (60 tries, matching POSIX) and matches the
   PID as a whole space-delimited token via findstr (no substring 99->990
   trap, no redundant bare find). set HERMES_HOME/PID/PYTHONPATH now quoted.

4. Renamed the misleading 'returns null for dev run' test — the dev-run safety
   is shouldRemoveAppBundle(isPackaged=false), which the test now asserts.

Docs: note that --gui on a source checkout also sweeps node_modules/build
output. Tests: 18 python + 19 desktop pass.
2026-06-06 18:22:38 -07:00
Zak B. Elep
4bf52022e5 fix(tui): correct --skip-build hint and add TUI workspace install test
- Update the --skip-build pre-build hint in the dashboard startup path
  to use `npm install --workspace web && npm run build -w web` so users
  don't accidentally trigger a desktop rebuild by following the hint.

- Add test_tui_launch_install_uses_workspace_scope to assert that the
  TUI launch npm install carries --workspace ui-tui, covering the call
  site added in the prior commit.
2026-06-06 18:22:20 -07:00
Zak B. Elep
0416f852f2 fix(tui): scope TUI launch install and fix stale hints/test
- Add --workspace ui-tui to the TUI launch npm install, the one call
  site missed by the prior commit. Without scoping it ran from
  PROJECT_ROOT and still resolved apps/desktop via the apps/* glob.

- Update the two manual-recovery hints in _build_web_ui (npm install
  failure and build failure paths) to use the scoped form
  `npm install --workspace web && npm run build -w web` so users
  following the hint don't accidentally trigger a desktop rebuild.

- Update the stale test assertion in test_cmd_update.py to expect
  --workspace web in the _build_web_ui npm ci call, which was
  previously unreachable through the if-guard and left the workspace-
  scoping change from the prior commit unverified.
2026-06-06 18:22:20 -07:00
Zak B. Elep
1c0437dfc5 fix(install): scope npm installs/audits to avoid pulling in apps/desktop
Root package.json uses apps/* workspaces glob which unconditionally
includes apps/desktop (Electron + node-pty@1.1.0, ~200MB, requires
make/g++ to build) in every unscoped npm command run from the repo root.

This commit addresses the core problem by adding explicit workspace
scoping to all internal npm calls:

hermes_cli/main.py (_build_web_ui):
  - Add --workspace web to the npm install call so only the web
    workspace deps are resolved, never apps/desktop.

hermes_cli/tools_config.py:
  - Add --workspaces=false to agent-browser and Camofox root installs
    so only root-level deps (agent-browser, @streamdown/math) are
    installed, bypassing the workspace graph entirely.

hermes_cli/doctor.py (run_doctor npm audit):
  - Replace the single unscoped 'npm audit --json' at PROJECT_ROOT with
    three scoped invocations:
      * --workspaces=false for root deps (Browser tools)
      * --workspace web for the web workspace
      * --workspace ui-tui for the TUI workspace
  - Update remediation hints to use matching scoped 'npm audit fix'
    commands so users don't accidentally trigger a desktop rebuild.

package.json:
  - Add convenience scripts for scoped operations:
      npm run install:root  / install:web / install:tui / install:desktop
      npm run audit:root    / audit:web   / audit:tui
      npm run audit:fix:root / audit:fix:web / audit:fix:tui
    These give developers and CI a safe, explicit interface for the
    most common per-workspace tasks without accidentally pulling desktop.

Fixes #38772
2026-06-06 18:22:20 -07:00
kshitij
ebed881d46
fix(cli): quarantine running hermes.exe during update dep-verification repair on Windows (#40409)
The dependency-verification repair in _verify_core_dependencies_installed
ran 'pip install --reinstall -e .' via _run_install_with_heartbeat directly,
bypassing the Windows shim-quarantine that the primary install path performs.

That reinstall rewrites the entry-point shims, and on Windows the live
hermes.exe is the running process — pip can neither delete nor overwrite it.
With no quarantine, the shim was left missing and 'hermes' dropped off PATH
('hermes' is not recognized... after update).

Extract the rename-out-of-the-way / restore-on-failure logic into a reusable
_run_quarantined_install helper and route both the primary editable installs
and the --reinstall -e . repair through it. The per-package repair installs
only third-party deps (never hermes-agent), so they don't touch the shims and
are left untouched. Add a regression test (fails on old code, passes on new).
2026-06-06 12:50:58 -05:00
The Garden
2820d87ea5 fix(cli): tolerate stale dashboard --tui from old desktop shells
Older Hermes desktop app shells (<= 0.15.x) spawn the backend as
`hermes dashboard --no-open --tui --host ... --port ...`. The --tui flag
was removed from the dashboard subcommand in cae6b5486 (embedded chat is
always on now).

When a user's CLI updates past that commit but their desktop app binary
has not, argparse hard-errored with 'unrecognized arguments: --tui' and
exit(2). The backend died before becoming ready and the desktop GUI showed
only 'Hermes couldn't start' with no actionable cause — a confusing brick
for anyone whose app and CLI versions drift apart across an update.

Add a hidden, deprecated, accepted-and-ignored --tui flag to the dashboard
subparser so an old app shell + new CLI degrades gracefully. Hidden from
--help via argparse.SUPPRESS so we don't re-advertise a removed feature.
Safe to delete once the floor app version is well past 0.16.0.

Adds tests/hermes_cli/test_dashboard_tui_backcompat.py pinning: the flag
parses without error, stays hidden from --help, and the modern (no --tui)
invocation is unaffected.
2026-06-06 12:43:28 -05:00
Teknium
2bf0a6e760
feat(dashboard): full tool backend configuration in the GUI (#40418)
Replicate the `hermes tools` configurator in the dashboard Skills →
Toolsets view. Each toolset now opens a config drawer that covers the
full lifecycle the CLI offers: enable/disable, pick a provider/backend,
enter and save API keys, and run a provider's post-setup install hook
with a live log tail.

The toolset view was previously read+toggle only — the provider matrix
and key-status endpoints existed but the page never called them, and
there was no way to save a key or run a backend install (npm/pip/binary)
from the browser.

Backend:
- New CLI subcommand `hermes tools post-setup <KEY>` — non-interactive,
  scriptable target that runs a provider's install hook (agent_browser,
  camofox, cua_driver, kittentts, piper, ddgs, spotify, langfuse,
  xai_grok). Validated against valid_post_setup_keys() so an arbitrary
  key can't drive _run_post_setup.
- PUT /api/tools/toolsets/{name}/env — save API keys to ~/.hermes/.env
  via save_env_value (same store the CLI writes), validated against the
  toolset category's env-var allowlist; blank values skipped.
- POST /api/tools/toolsets/{name}/post-setup — spawn-action that runs
  `hermes tools post-setup <key>`; frontend tails the log via the
  existing /api/actions/tools-post-setup/status. Registered in
  _ACTION_LOG_FILES.

Frontend:
- New ToolsetConfigDrawer component (provider radios, password key
  inputs with saved-state, get-a-key links, Run-setup + live install
  log). Toolset cards get a Configure button + the drawer also exposes
  the enable toggle.
- api.ts: toggleToolset, getToolsetConfig, selectToolsetProvider,
  saveToolsetEnv, runToolsetPostSetup + ToolsetConfig/Provider/EnvVar/
  EnvResult types.

Validation: 56 admin-endpoint tests pass (10 new: env save w/ CLI
parity + allowlist reject + blank-skip, post-setup spawn validation,
auth gate); 232 web_server tests pass; web npm run build + eslint clean;
HTTP E2E exercises save-key (CLI reads it back) and spawn+poll
post-setup to exit 0.
2026-06-06 07:45:36 -07:00
kshitij
5af899c7ca
feat(cli): display custom profile alias names in profile list/show (#40371)
profile list and profile show assumed the wrapper script is always named
after the profile (wrapper_dir / name). When a custom alias exists — e.g.
`hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi
pointing at `hermes -p steve` — the display silently showed the profile
name (or nothing) instead of the alias the user actually typed.

The custom-alias *creation* path (create_wrapper_script(name, target)) was
added later; the *display* path was never updated to match.

Add find_alias_for_profile() — a reverse lookup that scans the wrapper dir
for our own wrappers (alias-named file containing 'hermes -p <profile>'),
prefers a custom alias over the profile-named one, strips .bat on Windows,
and sorts for deterministic output. Populate ProfileInfo.alias_name and wire
it into the three display sites (profile describe, list, show).

Credit: salvages the intent of #11506 by wss434631143, reimplemented on
current main against the post-#11506 custom-alias (--name/target) mechanism.

Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection,
windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass.
E2E verified against the real CLI for both custom and profile-named aliases.
2026-06-06 08:08:07 +00:00
Brooklyn Nicholson
30340eae2f Include git SHA in /version output via banner label helper.
Reuses format_banner_version_label() so CLI, TUI, gateway, and desktop show upstream/local commit when available.
2026-06-05 18:05:05 -07:00
kshitijk4poor
7ae8aac3b9 feat(model): honor discover_models in terminal hermes model named-custom flow
The terminal `hermes model` wizard (_model_flow_named_custom) always
live-probed a custom provider's /models endpoint, ignoring the configured
`models:` list. For plans whose endpoint exposes a large catalog (e.g. Baidu
Qianfan Coding Plan returns 100+ models for a 2-3 model plan) the picker
flooded with models the user can't use.

This wires `discover_models` (and the `models:` list) through
_named_custom_provider_map into the flow and honors `discover_models: false`
the same way the slash-command picker (model_switch.py sections 3 & 4) does:
- Default stays True — live probe, no behaviour change.
- discover_models: false → use the configured `models:` list verbatim,
  skip the probe (string 'false'/'no'/'0' normalised to False).
- If the probe is on but returns empty, fall back to the configured list
  instead of forcing manual entry.

Closes #18726
2026-06-06 01:29:41 +05:30
adybag14-cyber
af8b917dab fix(termux): scope frontend npm installs 2026-06-05 06:56:51 -07:00
Teknium
72eb42d9ec
feat(update): stash/restore by default + settable discard for non-interactive updates (reverts #38542, #39568) (#39645)
* Revert "fix(update): require managed marker before destructive clean"

This reverts commit c8e80cd0bf.

* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"

This reverts commit 8a19884bf3.

* chore(install): keep npm ci desktop-build fix after stash revert

The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.

* feat(update): settable stash-or-discard for non-interactive local changes

Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.

- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
  reads the setting; new _discard_stashed_changes() drops the stash
  (stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
  Post-pull restore site branches on it; the bail-out and up-to-date
  restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
  select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
  missing section defaults to stash).

* fix(install.ps1): stash/restore instead of reset --hard on Windows update

The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).

- Replace `git reset --hard HEAD` with stash-before-checkout +
  restore-after-checkout, mirroring install.sh. Untracked files are
  included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
  the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
  `git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
  + non-redirected stdin/stdout + ConsoleHost); the desktop Update button
  and bootstrap have no usable console, so they default to restore and
  never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
  recovery instructions — work is never silently dropped.

Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-06-05 17:30:10 +05:30
helix4u
c8e80cd0bf fix(update): require managed marker before destructive clean 2026-06-05 00:05:30 -07:00
ethernet
fb853a1783 fix(install): scrap rebuild venv 2026-06-04 23:20:29 -04:00
Brooklyn Nicholson
89baf02919 Merge origin/main into bb/desktop-profile-support
Resolve conflicts in desktop settings/cron/messaging/sidebar: adopt main's
ListRow + actions-menu refactors for credential rows; keep our profileColor
import on the sidebar. Drop the now-orphaned Tip-based helpers.
2026-06-04 20:17:07 -05:00
ethernet
80672754a8 fix(docs): update all install instructions everywhere 2026-06-04 21:07:45 -04:00
Brooklyn Nicholson
b94b3622b5 feat(desktop): per-session profile switching + cross-profile sessions
Add first-class profile support to the desktop app without app reloads.

- Swap the single live gateway onto a session's profile lazily (spawned on
  demand by the Electron backend pool), so one backend serves the active
  profile and others stay cold — no OOM with many profiles.
- Aggregate sessions across profiles by reading each profile's state.db
  read-only; unified "All profiles" view groups sessions per profile with
  per-profile pagination, while the default view stays scoped to one profile.
- Add an Arc-style profile rail at the sidebar foot: a default<->all toggle
  pinned left, colored named-profile squares scrolling between, Manage pinned
  right. Profile identity is a deterministic per-name color.
- Route profile-scoped REST (config/env/skills/tools/model) to the active
  gateway profile and invalidate React Query caches on swap. Single-profile
  users never trigger a swap, so their path is unchanged.

Backend:
- web_server: profile-aware active/list endpoints + per-profile session
  totals; hermes_state: session_count(exclude_children); main.py: honor
  --profile over HERMES_HOME env for pooled backends.

UI primitives:
- Add a position-aware Tip tooltip (instant, themed) as a drop-in for native
  title=, and strip redundant tooltips from self-descriptive chrome.
2026-06-04 16:35:34 -05:00
Jeff
1f347ee543 fix(uv): move venv aside instead of gutting it in place on Windows rebuild
hermes update can brick a Windows install. When 'hermes update --force' runs
past the concurrent-process guard, rebuild_venv runs while the venv is still in
use: shutil.rmtree(ignore_errors=True) deletes site-packages + certifi's cert
bundle but can't remove the locked python.exe, leaving a half-gutted venv that
uv venv then refuses to overwrite. Every later HTTPS call dies with
FileNotFoundError for the missing cacert and there is no recovery.

--clear alone (the c136eb4de retry path) does not fix the real lock case: when
the locked interpreter is *inside* the venv being rebuilt, neither rmtree nor
uv venv --clear can delete it. os.replace of the parent directory *is* allowed
on Windows (a running .exe is tracked by handle, not path), so we move the old
venv aside atomically to <venv>.old, rebuild with --clear in its place, and the
still-running gateway/desktop keep using the moved-aside copy until they
restart. If the venv genuinely can't be moved, we abort cleanly and leave it
fully intact; if the rebuild fails, we restore the moved-aside copy.

Folds in the call-site guards from #38511 (@f3rs3n):
- rebuild_venv() returns False (and restores the backup) if uv exits 0 without
  producing an interpreter.
- both hermes update venv-rebuild call sites abort with RuntimeError instead of
  continuing into dependency install when rebuild_venv() returns False.

Also gitignore /venv.old/ so the update autostash (git stash --include-untracked)
doesn't sweep the moved-aside venv on every run.

Root-cause fix for #37881. Supersedes the --clear-only retry from c136eb4de.

Co-authored-by: f3rs3n <32328813+f3rs3n@users.noreply.github.com>
2026-06-04 12:18:38 -04:00
Teknium
fef04a197e fix(desktop): purge electron cache unconditionally, not via stdlib zipfile gate
The salvaged detector validated each cached electron-*.zip with
zipfile.testzip() and only purged ones it judged corrupt. But stdlib
zipfile reads from the end-of-central-directory backward, so it silently
tolerates prepended/concatenated junk — which is exactly the corruption
the bug report names ('86257938 extra bytes at beginning or within
zipfile', a partial download resumed into the same file). testzip()
returns clean on those zips, so the self-heal never fired for the
reported failure mode.

Drop the self-rolled validator: on any packaged-build failure, purge the
version's cached zips AND the half-written unpacked dir, then retry once.
@electron/get re-downloads with its own SHASUM verification — the real
source of truth, which catches prepend/concat/truncate alike. An
unrelated failure just costs one clean re-download and fails the same way.

Verified empirically: zipfile.testzip() returns None (clean) on a
prepended-junk zip; the unconditional purge removes it correctly.
2026-06-04 07:17:33 -07:00
Harry Riddle
f583c6ebd5 fix(desktop): recover from corrupt cached Electron download on build
hermes desktop failed on Linux with an ENOENT renaming
release/linux-unpacked/electron -> Hermes. Root cause is a corrupt
cached Electron zip (~/.cache/electron/electron-*.zip): app-builder
unpack-electron extracts a partial tree from the bad zip that is
missing the electron binary, so electron-builder dies on the final
rename. Re-running repeats the broken extraction, leaving the desktop
app permanently unlaunchable until the cache is manually purged.

- Add _electron_download_cache_dirs() + _purge_corrupt_electron_cache()
  to hermes_cli/main.py: validate every electron-*.zip via
  zipfile.testzip() and delete corrupt ones; honor electron_config_cache
  / ELECTRON_CACHE overrides with per-OS defaults.
- Wire purge + single retry into cmd_gui packaged-build failure path so
  a poisoned download self-heals (electron re-downloads clean).
- Add beforePack hook (apps/desktop/scripts/before-pack.cjs) to wipe the
  target unpacked dir before staging, making packaging idempotent across
  interrupted runs. Cross-platform, best-effort.
- Tests: corrupt-zip detector, cmd_gui purge/retry/launch path,
  no-retry-when-clean path, and node --test for the cleanup helper.
2026-06-04 07:17:33 -07:00
ethernet
a6a0a5b1b0 fix(desktop): detect linux arm64 binary 2026-06-04 09:51:26 -04:00
teknium1
c136eb4de1 fix(update): harden venv rebuild + verify core deps after install
Two complementary fixes for a silent partial-install failure that bit
``hermes update`` in the wild: a fresh checkout pulled 145 commits,
``rebuild_venv`` failed to recreate the venv on Windows because
``shutil.rmtree(ignore_errors=True)`` couldn't delete files held open by
the running ``hermes.exe`` shim. ``uv venv`` then refused with
"A directory already exists at: venv" and the update fell back to
installing on top of the stale venv. The resulting partial install
missed exactly one newly-added base dep — ``pathspec==1.1.1`` — which
``hermes desktop --build-only`` imports at the top of its content-hash
check. The desktop rebuild died with ModuleNotFoundError and the parent
update only logged "⚠ Desktop build failed (non-fatal)". Same root cause
made the "default: sync failed" line in the skill-sync stage, because
that sync subprocess hit the same missing import.

Fix 1: ``rebuild_venv`` retries with ``--clear``
------------------------------------------------
If ``uv venv`` fails with "already exists" in stderr (which is what uv
prints, and what uv's own hint tells you to fix with --clear), retry
once with ``--clear``. Only this specific failure pattern triggers the
retry — disk-full / interpreter-download failures still surface as
before so we don't mask real problems.

Fix 2: post-install dep verification
------------------------------------
Belt-and-suspenders so future uv resolver quirks (or any other cause of
partial installs) surface immediately instead of hours later in a
downstream subprocess. After ``_install_python_dependencies_with_optional_fallback``
runs, ``_verify_core_dependencies_installed``:

  1. Reads ``[project.dependencies]`` straight from pyproject.toml
     (so we don't trust the venv's stale metadata).
  2. Filters by environment markers via ``packaging.requirements.Requirement``
     so cross-platform exclusions (``ptyprocess ; sys_platform != 'win32'``)
     don't false-positive on Windows.
  3. Runs ``importlib.metadata.version()`` for each remaining dep inside
     the *target* venv interpreter (resolved from ``VIRTUAL_ENV``, not
     ``sys.executable``).
  4. If anything is missing, reinstalls the base group with
     ``--reinstall`` to force re-resolution. If a second probe still
     reports missing deps, force-installs each one with its pinned spec.
  5. Treats final failure as a warning rather than a hard error — a
     single broken-on-PyPI dep shouldn't block an otherwise-successful
     update — but the message points at ``hermes update --force`` and
     names the missing packages so the user knows what's wrong.

Tests
-----
- ``TestRebuildVenv::test_retries_with_clear_when_dir_already_exists`` —
  simulates the rmtree-couldn't-delete-it failure mode and asserts the
  ``--clear`` retry path is taken and succeeds.
- ``TestRebuildVenv::test_does_not_retry_when_first_failure_is_not_dir_exists``
  — guards against masking real failures (disk full, etc.).
- ``test_verify_core_dependencies.py`` — 7 tests covering the happy
  path, the regression (missing pathspec triggers --reinstall), the
  per-package fallback when --reinstall doesn't help, the platform-
  marker filter so Windows doesn't try to install ptyprocess, the
  missing-pyproject noop, and the VIRTUAL_ENV resolver.

Co-authored-by: Kyssta <218078013+kyssta-exe@users.noreply.github.com>
2026-06-04 06:05:41 -07:00
Ben
cae6b5486f feat(dashboard): always enable embedded chat; remove dashboard --tui flag
The dashboard's embedded Chat surface (/chat, /api/ws, /api/pty) was gated
behind `hermes dashboard --tui` / HERMES_DASHBOARD_TUI=1. The desktop app and
the dashboard's own Chat tab both drive the agent over the /api/ws + /api/pty
WebSockets, so a dashboard started without the flag would pass the /api/status
health check but slam the chat WebSocket shut with WS code 4403 — the app
connects, reports "ready", and chat stays dead. This was the root cause behind
multiple user reports of the desktop app failing to connect to a self-hosted
gateway/dashboard, and it bit Docker and host installs alike.

Make the embedded chat unconditional:

- web_server.py: _DASHBOARD_EMBEDDED_CHAT_ENABLED defaults to True; drop the
  embedded_chat parameter and the runtime reassignment from start_server().
  The WS gates still read the constant (now always true) so the seam — and its
  "rejects when disabled" contract test — stays meaningful.
- main.py: remove the `--tui` argument from the dashboard subparser and the
  `embedded_chat = args.tui or HERMES_DASHBOARD_TUI==1` derivation.
- web/: isDashboardEmbeddedChatEnabled() returns true unconditionally; drop the
  deprecated __HERMES_DASHBOARD_TUI__ alias and the dead LEGACY_TUI_RE scrape in
  the vite dev-token plugin.
- apps/desktop/electron/main.cjs: drop `--tui` from the spawned dashboardArgs
  (it would now error with "unrecognized arguments: --tui") and the redundant
  HERMES_DASHBOARD_TUI env injection.
- Docker: no s6 run-script change needed — the script never passed --tui; the
  HERMES_DASHBOARD_TUI env var is now simply a no-op, so the image works out of
  the box with no extra var.
- Docs: remove every dashboard --tui / HERMES_DASHBOARD_TUI reference across the
  CLI reference, env-var reference, docker/desktop/web-dashboard guides, in-app
  tips, and the zh-Hans translations. The terminal `hermes --tui` / HERMES_TUI
  references are intentionally left untouched.

Tests: 270 passing across web_server, dashboard lifecycle, host-header,
auth-gate, and docker-override-scripts suites.
2026-06-04 03:03:35 -07:00
Teknium
4ed63170e4
fix(update): don't fail desktop rebuild / skills sync on mid-rebuild venv (#38885)
When 'hermes update' rebuilds the project venv (rmtree + uv venv on the
first managed-uv migration), the desktop-rebuild and profile-skills-sync
steps that follow both spawn sys.executable. Firing while the venv is
mid-rewrite makes the child interpreter abort with the bare stderr line
'No pyvenv.cfg file', surfacing as a spurious 'Desktop build failed' /
'default: sync failed' on an update that actually succeeded.

Add _wait_for_interpreter_venv_ready(): resolve the venv hosting
sys.executable and poll briefly for pyvenv.cfg to (re)appear before each
of those subprocess steps. No-op when the interpreter isn't venv-hosted.
The desktop rebuild also retries once after re-waiting, and keeps
streaming its output live (no capture). Best-effort throughout — callers
proceed regardless, so a genuinely broken venv still surfaces the real
error.
2026-06-04 02:20:11 -07:00
Ben
c2ca3f01ab fix(dashboard): honor --portal-url / HERMES_DASHBOARD_PORTAL_URL override in register
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix Lockfile Fix / auto-fix-main (push) Waiting to run
Nix Lockfile Fix / fix (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Has been cancelled
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Has been cancelled
The register command resolved the portal base URL purely from the stored
login, ignoring any override. That meant `HERMES_DASHBOARD_PORTAL_URL` (and
the absence of any flag) gave no way to point registration at a staging or
preview portal — the request always hit the login's portal, returning 404
against a branch that wasn't deployed there.

- _resolve_portal_base_url now takes an optional override (precedence:
  override > stored login portal > prod default).
- New --portal-url flag; falls back to HERMES_DASHBOARD_PORTAL_URL env.
- Documents that the access token must be valid at the overridden portal
  (it's minted by whoever you logged into).
- 3 new tests for override precedence.

Verified live against the PR #324 Vercel preview: CLI -> preview endpoint ->
real agent:{id} client_id written to .env.
2026-06-04 00:17:57 -07:00
Ben
bb291b6bbc feat(dashboard): hermes dashboard register for self-hosted OAuth client
Adds a CLI command that registers this install as a self-hosted dashboard
with the user's Nous Portal account, automating the manual browser flow on
/local-dashboards.

- New hermes_cli/dashboard_register.py: resolves a fresh Nous access token
  from auth.json (fast-fails with a `hermes setup` hint when not logged in),
  POSTs to {portal}/api/oauth/self-hosted-client, and writes
  HERMES_DASHBOARD_OAUTH_CLIENT_ID into ~/.hermes/.env idempotently.
- Docker-style adjective_noun auto-naming; --name and --redirect-uri overrides.
- Persists HERMES_DASHBOARD_PORTAL_URL only when non-default and unset (so a
  Vercel preview / staging portal sticks, prod default stays implicit).
- Refuses in managed/hosted installs (the orchestrator stamps the client_id).
- Post-register hint explains the OAuth gate only engages on a non-loopback bind.
- Nested 'register' subparser leaves bare `hermes dashboard` unchanged.
- 9 unit tests (name gen, fast-fails, POST shape, env writes, redirect URI,
  portal-URL persistence, 401/403 mapping); dashboard lifecycle tests still green.

Depends on NousResearch/nous-account-service#324 (the portal endpoint).
2026-06-04 00:17:57 -07:00
Teknium
2f523a4691
fix(tui): cgroup-aware V8 heap cap so memory-limited containers stop dying silently (#38541)
The TUI hardcoded --max-old-space-size=8192. V8 is not cgroup-aware, so in a
Docker/k8s container capped below ~9-10GB the heap grows past the container
limit and the cgroup OOM-killer SIGKILLs the Node parent BEFORE V8's own heap
monitor fires. SIGKILL runs no JS handler, writes no [tui-parent] breadcrumb,
and closes the gateway child's stdin — the user sees only a bare gateway
'stdin EOF'. Complements #38224 (trail-text cap), which reduced pressure but
left the 8GB-vs-container mismatch in place.

- _read_cgroup_memory_limit(): read cgroup v2 (memory.max) then v1
  (memory.limit_in_bytes); handle 'max', the v1 unlimited sentinel, blank/zero,
  and >=1PB as unconstrained.
- _resolve_tui_heap_mb(): unconstrained -> 8192; constrained -> 75% of the
  cgroup limit (headroom for non-heap RSS + the Python child sharing the
  cgroup), floored at 1536MB, never above 8192.
- NODE_OPTIONS block uses the sized value; still respects a user-supplied
  --max-old-space-size.

Net: V8 now GCs/exits gracefully (onCritical breadcrumb fires) instead of being
reaped silently. Display/transport only — no agent context or behavior change.

Tests: tests/hermes_cli/test_tui_heap_sizing.py (20 tests).
2026-06-03 16:40:28 -07:00
Teknium
8a19884bf3
fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)
The stash/restore cycle in the update path was observed to clobber
freshly-pulled source files (apps/desktop/ deletion -> Vite
'[UNRESOLVED_ENTRY] Cannot resolve entry module index.html'). On a
managed clone the user never edits the source tree, so any 'dirty' state
is pure git artifact (CRLF renormalization, npm lockfile churn, files
left behind when a directory was deleted upstream such as
apps/bootstrap-installer/). Stashing that and re-applying it after a pull
is fragile and unnecessary.

- hermes update (hermes_cli/main.py): on a non-fork (managed) clone,
  discard working-tree dirt via reset --hard HEAD + clean -fd instead of
  stash/apply. Forks keep the stash machinery so intentional edits
  survive. Also pin core.autocrlf=false on Windows so the dirt is never
  created (mirrors install.ps1 #38239).
- install.sh: replace the update-path stash/restore dance with a hard
  reset to origin/<branch>; the installer is a managed-only entry point.
- install.sh + install.ps1 desktop stage: prefer 'npm ci' (wipes and
  reinstalls node_modules from the lockfile) over bare 'npm install',
  which can report 'up to date' against a stale marker while node_modules
  is empty -- leaving tsc unresolved so 'npm run pack' fails.

Tests: managed clone cleans instead of stashing; fork still stashes;
existing stash tests force the stash path explicitly.
2026-06-03 16:40:13 -07:00