Commit graph

1314 commits

Author SHA1 Message Date
teknium1
76f01780f0 fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests
Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace
returned early for a non-scratch ('dir'/'worktree') task and never ran the
parent sweep, so a scratch parent waiting on a 'dir' child would leak its
deferred workspace forever. Run the parent sweep before the early return.

Adds regression tests: deferred-while-child-active, swept-after-last-child,
and dir-child-unblocks-scratch-parent.
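
The ordering change can be illustrated with a minimal sketch. The `Task` model, its fields, and `sweep_parent` are hypothetical stand-ins rather than the real kanban code; only the idea of running the parent sweep before the non-scratch early return comes from the message above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    # Hypothetical model: workspace_kind is 'scratch', 'dir', or 'worktree'.
    name: str
    workspace_kind: str
    done: bool = False
    parent: Optional["Task"] = None
    children: List["Task"] = field(default_factory=list)
    cleanup_deferred: bool = False
    workspace_removed: bool = False


def sweep_parent(task: Task) -> None:
    """Remove a deferred scratch parent workspace once every child is done."""
    parent = task.parent
    if parent is None or not parent.cleanup_deferred:
        return
    if all(child.done for child in parent.children):
        parent.workspace_removed = True
        parent.cleanup_deferred = False


def cleanup_workspace(task: Task) -> None:
    task.done = True
    # Run the parent sweep first so it also happens for non-scratch tasks.
    sweep_parent(task)
    if task.workspace_kind != "scratch":
        return  # a 'dir'/'worktree' task has no scratch workspace of its own
    if any(not child.done for child in task.children):
        task.cleanup_deferred = True  # defer while a child is still active
        return
    task.workspace_removed = True
```

With the sweep placed after the early return instead, a scratch parent whose last child is a 'dir' task would stay deferred indefinitely, which is the leak the commit describes.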
2026-06-07 09:50:44 -07:00
Teknium
9e63109522
feat(dashboard): change UI font from the theme picker, independent of theme (#41145)
The dashboard font is now selectable from the UI, not just YAML. A new Font
section in the header theme picker overrides the UI font of whatever theme is
active; the choice is orthogonal to the theme and survives theme switches.
Each theme keeps its own font as the default — picking "Theme default" clears
the override.

- web/src/themes/fonts.ts: curated font catalog (system + Google Fonts across
  sans/serif/mono), each with a family stack and optional webfont URL. The
  catalog is the only injected-font surface — no free-text URL box, so the
  injected <link> origins stay fixed.
- web/src/themes/context.tsx: font-override state (localStorage + server),
  applied after theme typography so it wins; theme apply re-asserts it, and
  clearing re-runs theme apply to restore the theme's own font. Mono is left
  to the theme so code/terminal are untouched.
- web/src/components/ThemeSwitcher.tsx: Font section with grouped, self-
  previewing font rows and a "Theme default" clear option.
- hermes_cli/web_server.py: GET/PUT /api/dashboard/font persisting to
  config.yaml dashboard.font, with a server-side id allow-list (unknown ids
  coerce to the theme sentinel).
- i18n + types, api client methods, tests, and docs.
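
The server-side allow-list behaviour could look roughly like the sketch below. The sentinel value, the font ids, and the function name are assumptions made for illustration; the real catalog lives in web/src/themes/fonts.ts and the real check in hermes_cli/web_server.py.

```python
# Hypothetical sentinel meaning "use the active theme's own font".
THEME_DEFAULT = "theme"

# Hypothetical ids; the real allow-list mirrors the curated font catalog.
ALLOWED_FONT_IDS = frozenset({THEME_DEFAULT, "system-sans", "inter", "lora"})


def coerce_font_id(value: object) -> str:
    """Return a known font id, or the theme sentinel for anything else."""
    if isinstance(value, str) and value in ALLOWED_FONT_IDS:
        return value
    return THEME_DEFAULT
```

Coercing instead of rejecting means a stale or hand-edited config value degrades to the theme font rather than producing an error.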

Validation: 6 new backend endpoint tests pass; tsc + vite build clean; live
browser test confirmed pick/persist/survive-theme-switch/clear all work.
2026-06-07 03:39:01 -07:00
Teknium
0507e4630d
fix(desktop): preserve configured base_url on same-provider model switch (#41121)
The desktop model picker calls POST /api/model/set with provider+model only
(no base_url). _apply_main_model_assignment cleared model.base_url for every
non-custom provider, so re-picking a Xiaomi MiMo model wiped a Token Plan
endpoint (https://token-plan-*.xiaomimimo.com/v1) back to the registry default
api.xiaomimimo.com — breaking valid tp- keys with 401s.

Now base_url is cleared only when switching to a different provider (the stale
URL belonged to the old one); same-provider re-assignment preserves it, and an
explicitly supplied base_url is honored for any provider.
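
A minimal sketch of the rule described above, assuming a flat dict for the model config. The function name, config shape, and example URLs are illustrative and are not the actual `_apply_main_model_assignment` implementation.

```python
from typing import Optional


def apply_main_model_assignment(
    model_cfg: dict, provider: str, model: str, base_url: Optional[str] = None
) -> dict:
    """Return an updated copy of the model config (hypothetical shape)."""
    updated = dict(model_cfg)
    if base_url:
        # An explicitly supplied base_url is honored for any provider.
        updated["base_url"] = base_url
    elif updated.get("provider") != provider:
        # Switching providers: the old URL belonged to the old provider.
        updated.pop("base_url", None)
    # Same provider and no explicit URL: keep whatever is configured.
    updated["provider"] = provider
    updated["model"] = model
    return updated
```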
2026-06-07 02:48:21 -07:00
kshitijk4poor
44c0c2d4ac refactor(inventory): make force_fresh_nous_tier keyword-only + pin contract
Follow-up to the salvaged perf fix. The new force_fresh_nous_tier param was
inserted into list_authenticated_providers between custom_providers and
max_models. Make it keyword-only (*) so a positional caller passing max_models
as the 5th arg can never silently mis-bind it to the tier-refresh flag, and
add a signature-contract test that fails if the keyword-only separator is
later dropped. All in-repo callers already use keyword args; verified no
caller breaks.
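
The keyword-only separator and a signature-contract test can be sketched as follows. The first four parameter names and the default values are placeholders, not the real signature; only `custom_providers`, `force_fresh_nous_tier`, `max_models`, and the `*` separator come from the message.

```python
import inspect


def list_authenticated_providers(
    current_provider="",      # placeholder name
    current_base_url="",      # placeholder name
    user_providers=None,      # placeholder name
    custom_providers=None,
    *,                        # everything after this is keyword-only
    force_fresh_nous_tier=False,
    max_models=8,             # placeholder default
):
    """Stand-in body; only the shape of the signature matters here."""
    return {"force_fresh_nous_tier": force_fresh_nous_tier, "max_models": max_models}


def test_force_fresh_nous_tier_is_keyword_only():
    params = inspect.signature(list_authenticated_providers).parameters
    assert params["force_fresh_nous_tier"].kind is inspect.Parameter.KEYWORD_ONLY
```

With the separator in place, a fifth positional argument raises `TypeError` at the call site instead of binding to the tier-refresh flag.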
2026-06-07 00:41:13 -07:00
helix4u
eb70ab894b fix(inventory): avoid fresh Nous tier checks in picker payloads
2026-06-07 00:41:13 -07:00
brooklyn!
846821d8c0
Merge pull request #40684 from NousResearch/bb/cron-sessions-sidebar
feat(desktop): first-class cron jobs in the sidebar + dashboard scheduler
2026-06-07 00:32:25 -05:00
Teknium
fc086da8bd
fix(gateway,windows): reliability — JOB breakaway + status --deep probes + test-leak fix (#40909)
* fix(gateway,windows): reliability — supervisor task, JOB breakaway, status --deep

Three coordinated fixes for the Windows gateway reliability story:

1. CREATE_BREAKAWAY_FROM_JOB on every detached spawn

   The 'hermes update' triggered from the Electron Desktop GUI ran inside
   Electron's job object. Without breakaway, the post-update gateway
   watcher spawned by update — already DETACHED_PROCESS — was still
   reaped when Electron's job tore down, so the gateway never came back
   after a GUI-initiated update. Adds CREATE_BREAKAWAY_FROM_JOB (0x01000000)
   to:
     - hermes_cli/_subprocess_compat.py::windows_detach_flags() — used by
       every helper that calls windows_detach_popen_kwargs(), including
       launch_detached_profile_gateway_restart()
     - The watcher subprocess's own respawn snippet in
       hermes_cli/gateway.py (inlined flags so the watcher's child
       respawn also breaks away)

   _spawn_detached() in gateway_windows.py already had the flag; this
   change brings the rest of the codebase to parity.

2. Per-minute supervisor Scheduled Task — Windows equivalent of
   systemd Restart=always

   Introduces hermes_cli/gateway_supervisor.py and registers it as a
   second Scheduled Task ('Hermes_Gateway_Supervisor', /SC MINUTE /MO 1,
   LIMITED rights) alongside the existing ONLOGON task. Every minute,
   the supervisor uses the same gateway.status.get_running_pid() probe
   as 'hermes gateway status' and, if no gateway is alive, calls
   gateway_windows._spawn_detached() (which now includes BREAKAWAY) to
   bring one back.

   Covers every crash mode, not just 'machine rebooted': taskkill,
   OOM, GUI update SIGTERM, parent job teardown. Cheap — one pythonw
   startup per minute when down, one PID-existence check per minute
   when up.

   Wired into both the schtasks-success and Startup-folder-fallback
   install paths via _install_supervisor_best_effort(), and removed in
   uninstall(). Best-effort: a failing supervisor install logs a
   warning but doesn't roll back the primary install.

3. 'hermes gateway status --deep' shows per-probe PASS/FAIL

   Replaces the existing terse '--deep' output (which only printed
   paths) with an actual diagnostic table:
     [1] PID file present
     [2] Lock file held by a live process
     [3] get_running_pid() result
     [4] _pid_exists(pid) — OS-level liveness
     [5] gateway_state.json (state + age)
     [6] Last lifecycle event from gateway-exit-diag.log

   When the high-level summary disagrees with reality, the user can
   see exactly which signal is lying.
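
For item 1, the flag composition might look like the sketch below. The function names are taken from the message, but the bodies are assumptions: in particular, which other flags are combined with the breakaway flag and which extra Popen kwargs are returned are guesses. The numeric values are the documented Win32 process-creation flags.

```python
import sys

# Documented Win32 process-creation flag values.
DETACHED_PROCESS = 0x00000008
CREATE_NEW_PROCESS_GROUP = 0x00000200
CREATE_BREAKAWAY_FROM_JOB = 0x01000000


def windows_detach_flags() -> int:
    """Flags for a child that should outlive the parent's job object."""
    # Including CREATE_NEW_PROCESS_GROUP is an assumption of this sketch.
    return DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP | CREATE_BREAKAWAY_FROM_JOB


def windows_detach_popen_kwargs() -> dict:
    """Keyword arguments to pass to subprocess.Popen; empty off Windows."""
    if sys.platform != "win32":
        return {}
    return {"creationflags": windows_detach_flags(), "close_fds": True}
```

A caller would use it as `subprocess.Popen(cmd, **windows_detach_popen_kwargs())`. Breakaway only succeeds when the enclosing job permits it, so a real helper may need to handle a failed spawn.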

Test-leak fix
-------------

tests/hermes_cli/test_gateway_wsl.py::TestGatewayCommandWSLMessages
monkey-patched is_linux/is_wsl/supports_systemd_services to simulate
WSL but did NOT stub is_windows(). On a Windows host, the dispatcher
in _gateway_command_inner takes the is_windows() branch BEFORE the
WSL guidance branch, so the test invoked gateway_windows.install()
for real. install() writes to %APPDATA%\...\Startup\Hermes_Gateway.cmd
— the REAL user Startup folder, never sandboxed by tmp_path — pointing
at the test's pytest-of-<user>/pytest-<N>/.../gateway-service/ wrapper.
When pytest tore down the tmp_path, every subsequent Windows login
flashed a cmd.exe window that failed to find the missing target.

Stubs is_windows=False on all four affected tests:
  test_install_wsl_no_systemd
  test_start_wsl_no_systemd
  test_status_wsl_running_manual
  test_status_wsl_not_running

Defense-in-depth: _build_startup_launcher() now prefixes the launcher
with 'if not exist <target> exit /b 0', so any future stale Startup
entry silently no-ops instead of flashing a console window.
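
A rough sketch of the guard follows. The function name and the other lines of the launcher are hypothetical; only the `if not exist <target> exit /b 0` prefix is taken from the message.

```python
def build_startup_launcher(target: str) -> str:
    """Build .cmd launcher text (hypothetical layout) with a stale-target guard."""
    lines = [
        "@echo off",
        # Exit quietly when the wrapper this entry points at no longer exists.
        f'if not exist "{target}" exit /b 0',
        f'call "{target}"',
    ]
    return "\r\n".join(lines) + "\r\n"
```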

Status enhancements
-------------------

- status() now reports supervisor task presence alongside the existing
  schtasks/Startup info, and nudges the user to reinstall if the
  supervisor isn't registered.
- Deep mode dumps both the supervisor task name + script path.

* fix(gateway,windows): drop the per-minute supervisor task — keep breakaway + deep probes

Earlier in this branch we added a per-minute schtasks-based supervisor to
respawn the gateway after crashes / GUI-update SIGTERMs. The implementation
flashed a brief console window on every firing, which stole window focus.
We tried several variants:

  - cmd.exe wrapper invoking pythonw  -> flashes (cmd.exe is console-subsystem)
  - schtasks /TR pointing at pythonw  -> flashes (uv venv launcher pythonw is
    actually subsystem=Console, not GUI; it respawns the real pythonw)
  - schtasks /TR pointing at base uv  -> still flashes (Task Scheduler-side
    conhost preallocation; documented Windows quirk)
  - XML registration with <Hidden>true</Hidden>  -> still flashes (<Hidden> only hides
    the task in the Task Scheduler UI, not the spawned window)

Researched what leading projects do:

  - Ollama: GUI-subsystem tray exe + Startup-folder shortcut. No supervisor.
  - Tailscale: real Windows Service via SCM. Session 0, no console possible.
  - Syncthing: --no-console flag inside the binary + Startup folder.
  - openclaw: VBS Run(..., 0, False) wrapper. Suppresses the *window* but
    Super User Q971162 confirms focus-steal still occurs in some cases.

None of these use a per-minute polling scheduled task. The 'auto-restart on
crash' responsibility belongs INSIDE the daemon (Tailscale's in-process
recovery / Ollama's monitor+worker pair) OR is delegated to the Windows
Service Control Manager — not Task Scheduler.

So this commit drops the supervisor entirely. The CREATE_BREAKAWAY_FROM_JOB
fix in _subprocess_compat.py (from commit c1e5fa433) survives — that is the
*real* fix for problem #2 (GUI-update kills gateway): the post-update
watcher in launch_detached_profile_gateway_restart() now breaks out of
Electron's job object, so the gateway respawn watcher survives the GUI
quit and successfully respawns the gateway.

Surviving from c1e5fa433:
  * CREATE_BREAKAWAY_FROM_JOB in hermes_cli/_subprocess_compat.py (fixes #2)
  * Inlined breakaway flag in the watcher respawn snippet in gateway.py
  * hermes gateway status --deep PASS/FAIL probes (fixes #1 — visibility)
  * 'if not exist <target> exit /b 0' guard in _build_startup_launcher
    (fixes #3 — silent no-op for stale Startup entries)
  * tests/hermes_cli/test_gateway_wsl.py is_windows=False stubs (root cause
    of #3 — pytest WSL tests no longer leak Startup entries on Win hosts)

Removed in this commit:
  * hermes_cli/gateway_supervisor.py (entire file)
  * Supervisor section in hermes_cli/gateway_windows.py (~180 lines):
      get_supervisor_task_name, get_supervisor_script_path,
      _build_supervisor_cmd_script, _write_supervisor_script,
      _install_supervisor_task, is_supervisor_task_registered,
      _install_supervisor_best_effort
  * _install_supervisor_best_effort() calls in install() (3 spots)
  * supervisor cleanup block in uninstall()
  * supervisor display lines in status() / status(deep=True)

Future direction (out of scope for this PR): the right place for Windows
'Restart=always' semantics is a real Windows Service installed via
pywin32's win32serviceutil.ServiceFramework — session-0 isolation, SCM
auto-restart, no console window possible. That's a meaningful next-PR
project, not a band-aid.

Tests: 51 pass / 2 pre-existing failures in
tests/hermes_cli/test_gateway_{windows,wsl}.py (the 2 failures are
TestSupportsSystemdServicesWSL cases that fail on origin/main too —
unrelated to this PR).
2026-06-06 19:53:58 -07:00
Teknium
887295ba54
fix(config): preserve custom-provider models maps and metadata through v11->v12 migration (#40573)
Salvaged from #40410; cleaned up, re-verified against main, tests added.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
2026-06-06 18:43:20 -07:00
Teknium
89040e0db3
fix(secrets): fail early with clear error when bitwarden setup runs without TTY (#40571)
Salvaged from #40280; cleaned up, re-verified against main, tests added.

Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>
2026-06-06 18:36:40 -07:00
Teknium
5b43bf7d02
feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355)
* feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI)

Adds a GUI-only uninstall path so people can remove the desktop Chat GUI
while keeping the Hermes agent + their config/sessions/.env, and surfaces
the three CLI uninstall modes inside the desktop app's Settings → About.

CLI:
- New hermes_cli/gui_uninstall.py: cross-platform discovery + removal of the
  desktop GUI's artifacts (source-built dist/release/node_modules + build
  stamp, the packaged app bundle, and the Electron userData dir) on Linux,
  macOS, and Windows. Never touches the agent source, venv, or user data.
- `hermes uninstall --gui` removes only the Chat GUI; `--gui-summary` prints a
  JSON install snapshot (used by the desktop UI to gate options + detect a
  missing agent for a future lite client).
- `hermes uninstall --yes` / `--full --yes` now run non-interactively, sharing
  the destructive sequence via a new _perform_uninstall() helper. The keep-data
  and full flows also sweep the GUI artifacts.

Desktop:
- electron/desktop-uninstall.cjs: pure helpers mapping each mode (gui/lite/full)
  to CLI flags, resolving the running app bundle per OS, and building the
  detached cleanup script that waits for the app to exit, runs the Python
  uninstall, and removes the bundle.
- IPC hermes:uninstall:summary / :run, preload bridge, and types.
- Settings → About "Danger zone" with the three options; agent-removing
  options hide when no local agent is detected.

Tests: tests/hermes_cli/test_gui_uninstall.py (22 pass with the existing
uninstall tests), electron/desktop-uninstall.test.cjs (17 pass, wired into
test:desktop:platforms). Docs: desktop.md "Uninstalling" + cli-commands.md.

* fix(desktop): tear down backend process tree before GUI uninstall (Windows lock safety)

The desktop uninstall cleanup script waited only on the desktop app's own
PID, but a backend grandchild (gateway / pty terminal / hermes REPL) can
outlive it and keep hermes.exe + venv files mandatory-locked on Windows —
making the script's rmdir half-fail and leaving a partial install, the same
failure class as the self-update path's #37532.

- main.cjs: runDesktopUninstall now awaits releaseBackendLock() before
  spawning the cleanup script — tree-kills every backend PID the desktop owns
  (primary + pool) via taskkill /T /F and polls the venv shim until unlocked.
  Extracted the shared core out of releaseBackendLockForUpdate so both the
  update hand-off and the uninstaller use the identical, incident-hardened
  teardown. No-op on macOS/Linux (no mandatory locks).
- desktop-uninstall.cjs: Windows cleanup script removes the bundle via a
  bounded rmdir retry loop (10x, 1s) instead of a single rmdir, since Windows
  releases directory handles lazily even after the holding process exits.
- Dropped a fragile tasklist|findstr reap-by-path attempt; the Electron-side
  tree-kill-by-PID is the reliable mechanism.
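
The bounded retry idea (10 attempts, 1 second apart) is expressed in the real change as a Windows cleanup script. The Python sketch below only illustrates the same loop shape; the function name and parameters are hypothetical.

```python
import os
import shutil
import time


def remove_tree_with_retry(path: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Try to remove a directory tree a bounded number of times.

    Returns True once the path is gone, False if it still exists after
    the final attempt (for example because a handle is still open).
    """
    for attempt in range(attempts):
        shutil.rmtree(path, ignore_errors=True)
        if not os.path.exists(path):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give the OS time to release directory handles
    return False
```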

Tests: desktop-uninstall.test.cjs updated for the retry-loop output (17 pass).

* fix(desktop): address review on GUI uninstall (venv self-delete, gates, wait-loop)

Resolves @OutThisLife's review on #40355:

1. full mode now gated on agent presence (needsAgent: true). It removes the
   agent + user data, so on a lite client with no local agent it's hidden
   like lite — no more offering to remove an agent that isn't there.

2. (Finding 3, the real bug) lite/full no longer rmtree the venv from the
   venv's OWN python. On Windows a running python.exe is mandatory-locked, so
   that half-fails. New lightweight 'python -m hermes_cli.uninstall --mode X'
   entrypoint (stdlib-only imports) lets the desktop run agent-removing modes
   under the SYSTEM python (findSystemPython) with PYTHONPATH=<agentRoot>, so
   import hermes_cli resolves from source while the venv is torn down. Falls
   back to venv python + logs when no system python (gui-only unaffected).

3. Windows wait-loop is now bounded (60 tries, matching POSIX) and matches the
   PID as a whole space-delimited token via findstr (no substring 99->990
   trap, no redundant bare find). set HERMES_HOME/PID/PYTHONPATH now quoted.

4. Renamed the misleading 'returns null for dev run' test — the dev-run safety
   is shouldRemoveAppBundle(isPackaged=false), which the test now asserts.
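
The substring trap in item 3 (PID 99 matching a line that mentions PID 990) can be shown with a small Python sketch. The real fix lives in a findstr-based cmd wait-loop; the function below is a hypothetical illustration of whole-token matching, and the listing format is made up.

```python
def pid_in_listing(pid: int, listing: str) -> bool:
    """True only if `pid` appears as a whole whitespace-delimited token."""
    target = str(pid)
    return any(target in line.split() for line in listing.splitlines())
```

A plain `str(pid) in listing` check would report PID 99 as alive whenever any longer number containing "99" appears in the output.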

Docs: note that --gui on a source checkout also sweeps node_modules/build
output. Tests: 18 python + 19 desktop pass.
2026-06-06 18:22:38 -07:00
Teknium
f2e8234307 test: update non-Termux workspace-scope fixtures for #38358 fix
The non-Termux web/TUI install path now scopes to --workspace <name>;
update two fixtures that asserted the old unscoped install commands.
2026-06-06 18:22:20 -07:00
Teknium
7db7a9462d fix: align test fixture arg order + add zakame to AUTHOR_MAP
Conflict resolution prefixes --workspace web before --silent (preserving
the Termux npm_workspace_args path); update test_cmd_update fixture to match.
Add zakame@zakame.net -> zakame mapping so CI author check passes.
2026-06-06 18:22:20 -07:00
Zak B. Elep
675fb10240 fix(install): correct check_dir tautology and add --workspace web test
- check_dir = npm_dir if audit_extra else npm_dir evaluated identically in
  both branches; change to PROJECT_ROOT if audit_extra else npm_dir so
  workspace-scoped audits check the workspace root's node_modules
- Add test_npm_install_uses_workspace_web_scope asserting --workspace web is
  passed adjacently in the _build_web_ui npm install invocation
2026-06-06 18:22:20 -07:00
Zak B. Elep
4bf52022e5 fix(tui): correct --skip-build hint and add TUI workspace install test
- Update the --skip-build pre-build hint in the dashboard startup path
  to use `npm install --workspace web && npm run build -w web` so users
  don't accidentally trigger a desktop rebuild by following the hint.

- Add test_tui_launch_install_uses_workspace_scope to assert that the
  TUI launch npm install carries --workspace ui-tui, covering the call
  site added in the prior commit.
2026-06-06 18:22:20 -07:00
Zak B. Elep
0416f852f2 fix(tui): scope TUI launch install and fix stale hints/test
- Add --workspace ui-tui to the TUI launch npm install, the one call
  site missed by the prior commit. Without scoping it ran from
  PROJECT_ROOT and still resolved apps/desktop via the apps/* glob.

- Update the two manual-recovery hints in _build_web_ui (npm install
  failure and build failure paths) to use the scoped form
  `npm install --workspace web && npm run build -w web` so users
  following the hint don't accidentally trigger a desktop rebuild.

- Update the stale test assertion in test_cmd_update.py to expect
  --workspace web in the _build_web_ui npm ci call, which was
  previously unreachable through the if-guard and left the workspace-
  scoping change from the prior commit unverified.
2026-06-06 18:22:20 -07:00
Brooklyn Nicholson
f491260365 Merge remote-tracking branch 'origin/main' into bb/cron-sessions-sidebar
# Conflicts:
#	apps/desktop/src/app/cron/index.tsx
2026-06-06 16:34:23 -05:00
kshitij
ebed881d46
fix(cli): quarantine running hermes.exe during update dep-verification repair on Windows (#40409)
The dependency-verification repair in _verify_core_dependencies_installed
ran 'pip install --reinstall -e .' via _run_install_with_heartbeat directly,
bypassing the Windows shim-quarantine that the primary install path performs.

That reinstall rewrites the entry-point shims, and on Windows the live
hermes.exe is the running process — pip can neither delete nor overwrite it.
With no quarantine, the shim was left missing and 'hermes' dropped off PATH
('hermes' is not recognized... after update).

Extract the rename-out-of-the-way / restore-on-failure logic into a reusable
_run_quarantined_install helper and route both the primary editable installs
and the --reinstall -e . repair through it. The per-package repair installs
only third-party deps (never hermes-agent), so they don't touch the shims and
are left untouched. Add a regression test (fails on old code, passes on new).
2026-06-06 12:50:58 -05:00
kshitij
d4a7bfd3aa
Merge pull request #29724 from bbednarski9/bbednarski/nmf-41B-nemoflow-plugin
feat(middleware): add adaptive middleware to hermes-agent, consumed by NeMo-Relay
2026-06-06 10:46:41 -07:00
Brooklyn Nicholson
003110c107 fix(ci): map @TheGardenGallery email + drop unused pytest import
- check-attribution: add chilltulpa@gmail.com -> TheGardenGallery to
  AUTHOR_MAP in scripts/release.py (new external contributor via the
  carried-over commits).
- ty: the dashboard back-compat test imported pytest but never used it,
  tripping unresolved-import. Drop the dead import — tests are plain
  functions driving the parser via subprocess, no pytest API needed.
2026-06-06 12:43:28 -05:00
The Garden
2820d87ea5 fix(cli): tolerate stale dashboard --tui from old desktop shells
Older Hermes desktop app shells (<= 0.15.x) spawn the backend as
`hermes dashboard --no-open --tui --host ... --port ...`. The --tui flag
was removed from the dashboard subcommand in cae6b5486 (embedded chat is
always on now).

When a user's CLI updates past that commit but their desktop app binary
has not, argparse hard-errored with 'unrecognized arguments: --tui' and
exit(2). The backend died before becoming ready and the desktop GUI showed
only 'Hermes couldn't start' with no actionable cause — a confusing brick
for anyone whose app and CLI versions drift apart across an update.

Add a hidden, deprecated, accepted-and-ignored --tui flag to the dashboard
subparser so an old app shell + new CLI degrades gracefully. Hidden from
--help via argparse.SUPPRESS so we don't re-advertise a removed feature.
Safe to delete once the floor app version is well past 0.16.0.

Adds tests/hermes_cli/test_dashboard_tui_backcompat.py pinning: the flag
parses without error, stays hidden from --help, and the modern (no --tui)
invocation is unaffected.
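
A minimal sketch of the hidden, accepted-and-ignored flag using `argparse.SUPPRESS`. The parser layout, default host, and default port here are assumptions for illustration and are not the real hermes CLI definition.

```python
import argparse


def build_parser():
    """Return (top-level parser, dashboard subparser) for a toy 'hermes' CLI."""
    parser = argparse.ArgumentParser(prog="hermes")
    sub = parser.add_subparsers(dest="command")
    dashboard = sub.add_parser("dashboard")
    dashboard.add_argument("--no-open", action="store_true")
    dashboard.add_argument("--host", default="127.0.0.1")      # assumed default
    dashboard.add_argument("--port", type=int, default=8000)   # assumed default
    # Deprecated: accepted and ignored so old app shells keep working.
    # help=SUPPRESS keeps it out of --help and the usage line.
    dashboard.add_argument("--tui", action="store_true", help=argparse.SUPPRESS)
    return parser, dashboard
```

Parsing `["dashboard", "--no-open", "--tui"]` then succeeds instead of exiting with 'unrecognized arguments', while the dashboard help text does not mention the flag.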
2026-06-06 12:43:28 -05:00
Brooklyn Nicholson
3e2d758816 feat(desktop): fire cron jobs from the dashboard backend
The cron scheduler tick loop only ran inside `hermes gateway run`, but the
desktop app spawns a `hermes dashboard` backend with no gateway — so any cron
a user created in the app was saved and never fired (silently).

Run a minimal scheduler ticker inside the dashboard lifespan, gated on a new
HERMES_DESKTOP=1 marker the electron shell injects, so server `hermes dashboard`
is unaffected. Cross-process safe via the existing cron/.tick.lock, so it never
double-fires alongside a real gateway.
2026-06-06 12:42:32 -05:00
kshitijk4poor
c4c5548eb4 fix(middleware): single-use next_call guard + deepcopy-safe request copies
Address the two non-blocking follow-ups from review:

- next_call is now single-use per middleware frame. A second invocation
  raises instead of silently re-running the downstream provider/tool, so
  the terminal call cannot execute twice via the chain. The error surfaces
  through the existing handler, which preserves the first downstream result.
- Request-middleware payload copies go through _safe_copy(), which falls
  back to a shallow dict copy when deepcopy() fails on a non-deepcopyable
  member (clients, callbacks, file handles) instead of aborting the pass.

Adds regression coverage for both: double next_call() keeps the terminal
single-run, and a non-deepcopyable (threading.Lock) request payload still
runs middleware via the shallow fallback.
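
Both follow-ups can be sketched in a few lines. The helper names, the exception type, and the error message are assumptions; the real `_safe_copy()` and the next_call guard may differ in detail.

```python
import copy
import threading
from typing import Any, Callable, Dict


def safe_copy(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Deep-copy a request payload, falling back to a shallow dict copy."""
    try:
        return copy.deepcopy(payload)
    except Exception:
        # A member such as a lock, client, or file handle refused deepcopy.
        return dict(payload)


def make_single_use(downstream: Callable[[], Any]) -> Callable[[], Any]:
    """Wrap `downstream` so it can run at most once per middleware frame."""
    called = False

    def next_call() -> Any:
        nonlocal called
        if called:
            raise RuntimeError("next_call() may only be invoked once per frame")
        called = True
        return downstream()

    return next_call
```

The shallow fallback shares the non-copyable member with the original payload, so middleware must not mutate it.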
2026-06-06 23:07:25 +05:30
Bryan Bednarski
5abe45674d
fix(middleware): preserve translated downstream failures
Track successful next_call completion separately from invocation so execution
middleware that catches and translates a downstream provider/tool failure does
not accidentally convert that failure into a successful None result.

Also avoid wrapping BaseException from downstream execution, and document the
execution middleware error semantics.

Tests cover:
- pre-next_call middleware failures fail open to the remaining chain
- post-next_call middleware failures preserve the downstream result
- translated downstream failures propagate instead of returning None
- downstream BaseException is not wrapped

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
2026-06-06 09:26:18 -07:00
Brooklyn Nicholson
3606307339 fix(gateway): use user launchd domain + Background session, detached fallback (macOS 26)
Salvages the primary fix from #24275 (asdlem) and layers a last-resort
fallback on top:

Primary (from #24275): the real macOS 26 root cause is that `gui/<uid>`
isn't reachable from non-Aqua/background sessions. Switch the launchd
domain to `user/<uid>` and mark the plist valid for both Aqua and
Background sessions (LimitLoadToSessionType), restoring a real supervised
service. Treat exit code 125 as "job unloaded" so start/restart
re-bootstrap and retry.

Last resort (this PR): the #23387 reporter saw `user/<uid>` bootstrap
also fail with error 5 on some hosts. When even a fresh bootstrap can't
manage the domain (codes 5/125 persist), degrade to a CLI-managed
detached background process instead of crashing — logs to gateway.log,
PID tracked via gateway.pid so stop/status/restart keep working. Print
guidance that it won't auto-start at login or auto-restart on crash.
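
As an illustration of the two pieces named above, the sketch below builds a plist that is valid for both session types and classifies the launchctl exit codes. The function names, label, and program arguments are hypothetical; `LimitLoadToSessionType` is the launchd plist key named in the message, and codes 5 and 125 are the ones it reports.

```python
import plistlib
from typing import List


def launchd_domain(uid: int) -> str:
    # user/<uid> instead of gui/<uid>, per the message above.
    return f"user/{uid}"


def build_gateway_plist(label: str, program_args: List[str]) -> bytes:
    """Serialize a minimal launchd job definition (illustrative keys only)."""
    plist = {
        "Label": label,
        "ProgramArguments": program_args,
        # Valid for both Aqua (GUI login) and Background sessions.
        "LimitLoadToSessionType": ["Aqua", "Background"],
    }
    return plistlib.dumps(plist)


def is_unmanageable_code(returncode: int) -> bool:
    """Exit codes that, if they persist after a re-bootstrap, trigger the fallback."""
    return returncode in (5, 125)
```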

Co-authored-by: asdlem <asdlem@users.noreply.github.com>
2026-06-06 09:08:37 -07:00
Brooklyn Nicholson
59c273ba3a fix(gateway): fall back to detached launch when launchd rejects domain (macOS 26)
macOS 26+ broke launchctl management of the gui/<uid> (and user/<uid>)
domains: `bootstrap` returns error 5 and `kickstart` returns error 125
("Domain does not support specified action"), so `hermes gateway
start/install/restart` crashed with a cryptic traceback (#23387).

Detect these codes and degrade gracefully: launch the gateway as a
CLI-managed detached background process (the documented `nohup hermes
gateway run --replace` workaround), with logs to gateway.log and the PID
tracked via gateway.pid so stop/status/restart keep working. Print clear
guidance that the service won't auto-start at login or auto-restart on
crash on this macOS version. launchd_stop also tolerates 125/5 from
bootout and falls through to the PID-based kill.
2026-06-06 09:08:37 -07:00
Teknium
2bf0a6e760
feat(dashboard): full tool backend configuration in the GUI (#40418)
Replicate the `hermes tools` configurator in the dashboard Skills →
Toolsets view. Each toolset now opens a config drawer that covers the
full lifecycle the CLI offers: enable/disable, pick a provider/backend,
enter and save API keys, and run a provider's post-setup install hook
with a live log tail.

The toolset view was previously read+toggle only — the provider matrix
and key-status endpoints existed but the page never called them, and
there was no way to save a key or run a backend install (npm/pip/binary)
from the browser.

Backend:
- New CLI subcommand `hermes tools post-setup <KEY>` — non-interactive,
  scriptable target that runs a provider's install hook (agent_browser,
  camofox, cua_driver, kittentts, piper, ddgs, spotify, langfuse,
  xai_grok). Validated against valid_post_setup_keys() so an arbitrary
  key can't drive _run_post_setup.
- PUT /api/tools/toolsets/{name}/env — save API keys to ~/.hermes/.env
  via save_env_value (same store the CLI writes), validated against the
  toolset category's env-var allowlist; blank values skipped.
- POST /api/tools/toolsets/{name}/post-setup — spawn-action that runs
  `hermes tools post-setup <key>`; frontend tails the log via the
  existing /api/actions/tools-post-setup/status. Registered in
  _ACTION_LOG_FILES.
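
The env-save validation (allowlist check plus blank-skip) might be sketched as below. The function name, error type, and example variable names are assumptions; the real endpoint persists via save_env_value, which is not reproduced here.

```python
from typing import Dict, Iterable


def validate_env_updates(
    updates: Dict[str, str], allowed_keys: Iterable[str]
) -> Dict[str, str]:
    """Reject keys outside the toolset's allowlist and drop blank values."""
    allowed = set(allowed_keys)
    rejected = sorted(key for key in updates if key not in allowed)
    if rejected:
        raise ValueError(f"env vars not allowed for this toolset: {', '.join(rejected)}")
    # Blank values are skipped so an empty form field never overwrites a saved key.
    return {key: value for key, value in updates.items() if value.strip()}
```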

Frontend:
- New ToolsetConfigDrawer component (provider radios, password key
  inputs with saved-state, get-a-key links, Run-setup + live install
  log). Toolset cards get a Configure button + the drawer also exposes
  the enable toggle.
- api.ts: toggleToolset, getToolsetConfig, selectToolsetProvider,
  saveToolsetEnv, runToolsetPostSetup + ToolsetConfig/Provider/EnvVar/
  EnvResult types.

Validation: 56 admin-endpoint tests pass (10 new: env save w/ CLI
parity + allowlist reject + blank-skip, post-setup spawn validation,
auth gate); 232 web_server tests pass; web npm run build + eslint clean;
HTTP E2E exercises save-key (CLI reads it back) and spawn+poll
post-setup to exit 0.
2026-06-06 07:45:36 -07:00
Teknium
56236b16e3
feat(dashboard): rehaul Skills hub browser — connected hubs, featured, preview + security scan (#40384)
The Browse-hub tab was a blank search box with sparse result cards (name +
source + one Install button), no way to read a skill before installing, no
visual security scan, and no indication it was even connected to any hubs.

Backend (web_server.py):
- GET /api/skills/hub/sources — lists the configured hubs (label + trust
  tier + GitHub rate-limit + index availability) and featured skills pulled
  from the centralized index (zero extra API calls), plus installed-skill
  provenance so the UI can mark already-installed results.
- GET /api/skills/hub/preview — fetches a skill's SKILL.md text + file
  manifest WITHOUT installing (decodes byte-stored text, masks binaries).
- GET /api/skills/hub/scan — runs the SAME quarantine + scan_skill +
  should_allow_install pipeline the CLI installer uses, then cleans up
  quarantine, returning verdict / per-finding detail / severity tally /
  install-policy decision.
- search now returns per-source counts + timed-out sources + installed map.

Frontend (SkillsPage HubBrowser):
- Landing state: connected-hubs strip + featured skill grid (no more blank
  page).
- Rich cards: trust-level color coding, source, tags, identifier,
  Details + Install (or Installed state).
- Detail dialog: read the actual SKILL.md, on-demand visual security scan
  (verdict pill, severity tally, per-finding list, allow/block policy),
  GitHub repo link.
- Search meta line: result count + timing + per-source breakdown (the
  'feels slow / no feedback' complaint).

Tests: 4 new endpoint test classes (sources/preview/scan + updated search
shape) in test_dashboard_admin_endpoints.py.
2026-06-06 02:44:50 -07:00
kshitij
5af899c7ca
feat(cli): display custom profile alias names in profile list/show (#40371)
profile list and profile show assumed the wrapper script is always named
after the profile (wrapper_dir / name). When a custom alias exists — e.g.
`hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi
pointing at `hermes -p steve` — the display silently showed the profile
name (or nothing) instead of the alias the user actually typed.

The custom-alias *creation* path (create_wrapper_script(name, target)) was
added later; the *display* path was never updated to match.

Add find_alias_for_profile() — a reverse lookup that scans the wrapper dir
for our own wrappers (alias-named file containing 'hermes -p <profile>'),
prefers a custom alias over the profile-named one, strips .bat on Windows,
and sorts for deterministic output. Populate ProfileInfo.alias_name and wire
it into the three display sites (profile describe, list, show).
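
A sketch of the reverse lookup under stated assumptions: wrappers are small text files containing `hermes -p <profile>`, and the profile is matched as a whole token. The body is illustrative and is not the actual `find_alias_for_profile()` implementation.

```python
import re
from pathlib import Path
from typing import Optional


def find_alias_for_profile(wrapper_dir: Path, profile: str) -> Optional[str]:
    """Return the wrapper name that launches `hermes -p <profile>`, if any."""
    if not wrapper_dir.is_dir():
        return None
    # Match the profile as a whole token so 'steve' does not match 'steve2'.
    pattern = re.compile(rf"hermes -p {re.escape(profile)}(?!\S)")
    names = set()
    for path in wrapper_dir.iterdir():
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if pattern.search(text):
            # Strip the .bat suffix used for Windows wrappers.
            names.add(path.stem if path.suffix.lower() == ".bat" else path.name)
    custom = sorted(name for name in names if name != profile)
    if custom:
        return custom[0]  # prefer a custom alias over the profile-named wrapper
    return profile if profile in names else None
```

Sorting before picking keeps the output deterministic when more than one custom alias points at the same profile.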

Credit: salvages the intent of #11506 by wss434631143, reimplemented on
current main against the post-#11506 custom-alias (--name/target) mechanism.

Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection,
windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass.
E2E verified against the real CLI for both custom and profile-named aliases.
2026-06-06 08:08:07 +00:00
Teknium
b91aade176
feat(desktop): warn when main-model switch leaves auxiliary tasks pinned to another provider (#40286)
Switching the main model never touches auxiliary slot pins (they're
independent, sticky per-task overrides). A user who switches main away
from a now-unpaid provider keeps paying 402s on every background aux call
until they manually reset those pins — silently, with no UI signal.

- /api/model/set scope:'main' now returns stale_aux: slots still pinned
  to a provider different from the new main (additive field).
- Desktop Model Settings shows a switch-time notice after Apply AND a
  persistent banner when any loaded aux slot mismatches the main provider,
  both wired to the existing 'Reset all to main' action.
- Never auto-clears pins — a dedicated cheaper aux model is a legitimate
  config; surface-and-offer instead of nuking.
- Fixes a stale pre-existing assertion in the panel test (main model now
  renders via selectors, not a standalone label).
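
A small sketch of how a `stale_aux` list could be derived. The function name, the slot names, and the convention that an unpinned slot is stored as `None` are assumptions for illustration; only the rule itself (a slot is stale when it is pinned to a provider other than the new main) comes from the message above.

```python
from typing import Dict, List, Optional


def find_stale_aux(main_provider: str, aux_pins: Dict[str, Optional[str]]) -> List[str]:
    """Return aux slots pinned to a provider that differs from the new main provider."""
    stale = []
    for slot, pinned in aux_pins.items():
        if pinned is None:
            continue  # unpinned slots follow the main model, so they cannot be stale
        if pinned != main_provider:
            stale.append(slot)
    return sorted(stale)  # report only; pins are never cleared automatically
```

For example, `find_stale_aux("openrouter", {"vision": "nous", "title": None})` reports only the `vision` slot.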
2026-06-05 23:35:36 -07:00
Teknium
50f9ad70fc
fix(dashboard): populate cron delivery dropdown from configured platforms (#40218)
* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
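
The guard can be pictured as a single early return ahead of the recovery dispatch. The sketch below is illustrative only: the error-class labels, the function name, and the returned dict shapes are hypothetical, apart from the `compaction_disabled: True` marker named above.

```python
from typing import Callable, Dict

# Hypothetical labels for the three overflow classes named in the commit message.
OVERFLOW_CLASSES = {"long_context_tier_429", "payload_too_large", "context_overflow"}


def handle_overflow(error_class: str, compression_enabled: bool,
                    compress: Callable[[], None]) -> Dict[str, object]:
    """Sketch of the overflow-recovery dispatch with the disabled-compaction guard."""
    if error_class not in OVERFLOW_CLASSES:
        return {"retry": False}
    if not compression_enabled:
        # Guard: never compress automatically against the user's explicit setting.
        return {
            "error": "Context overflow. Run /compress, start /new, switch to a "
                     "larger-context model, or reduce attachments.",
            "compaction_disabled": True,
        }
    compress()  # automatic recovery remains available when compression is enabled
    return {"retry": True}
```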

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* fix(dashboard): populate cron delivery dropdown from configured platforms

The dashboard cron-create/edit dropdown hardcoded five delivery options
(local, telegram, discord, slack, email), so users on Matrix — or any
other backend-supported platform — had no way to pick their channel even
though the cron scheduler delivers to all of them. It also offered
Telegram/Discord/etc. to users who never set those up.

- cron/scheduler.py: add cron_delivery_targets() — the single source of
  truth. Intersects gateway-configured platforms with cron-deliverable
  ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the
  implicit local option) to the dashboard.
- CronPage.tsx: both modals render options from the endpoint. Configured
  platforms missing a home channel still appear, annotated "set a home
  channel first", so the user knows what to fix. Edit modal
  preserves a job's current target even if it's no longer configured.
  Local-only state shows a "configure a platform under Channels" hint.
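
The intersection at the heart of `cron_delivery_targets()` can be sketched as below. The parameters and the returned dict keys are assumptions chosen for illustration; the real function reads gateway configuration itself rather than taking these arguments.

```python
from typing import Dict, Iterable, List, Optional


def cron_delivery_targets(configured: Iterable[str], deliverable: Iterable[str],
                          home_channels: Dict[str, Optional[str]]) -> List[dict]:
    """Sketch: platforms that are both configured and cron-deliverable."""
    targets = []
    for platform in sorted(set(configured) & set(deliverable)):
        targets.append({
            "platform": platform,
            # The UI can annotate entries whose home channel is still missing.
            "home_channel_set": bool(home_channels.get(platform)),
        })
    return targets
```

A platform that is configured but has no home channel stays in the list with `home_channel_set` false, which is what lets the dropdown show it with a hint instead of hiding it.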

Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set
and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server
green (366 passed).
2026-06-05 20:23:54 -07:00
Teknium
78122c52cf test(slack): drop /q alias assertion now displaced by /version cap clamp
Slack's native-slash manifest hard-caps at 50 (_SLACK_MAX_SLASH_COMMANDS).
Adding the /version canonical claims a pass-1 slot, so the lowest-priority
pass-2 alias (/q for /quit) clamps off the end. /q stays reachable via
/hermes q. Surviving aliases (/btw /bg /reset) still prove alias parity.
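
A toy model of the clamp, assuming the manifest is simply "canonicals first, aliases second, cut at the cap". The function name and the command counts are made up for illustration; only the cap of 50 and the two-pass ordering come from the message above.

```python
from typing import List, Sequence

SLACK_MAX_SLASH_COMMANDS = 50  # cap named in the commit message


def build_slash_manifest(canonicals: Sequence[str], aliases: Sequence[str],
                         cap: int = SLACK_MAX_SLASH_COMMANDS) -> List[str]:
    """Pass 1 lists canonical commands, pass 2 appends aliases; the tail is clamped."""
    ordered = list(canonicals) + list(aliases)
    return ordered[:cap]
```

With 46 canonicals and four aliases everything fits; adding a 47th canonical pushes the last alias past the cap, which is the displacement the test change accounts for.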
2026-06-05 18:05:05 -07:00
kshitij
e6f7e217ce
Merge pull request #40093 from kshitijk4poor/feat/named-custom-discover-models-18726
feat(model): honor discover_models in terminal hermes model named-custom flow (closes #18726)
2026-06-05 13:08:33 -07:00
kshitijk4poor
7ae8aac3b9 feat(model): honor discover_models in terminal hermes model named-custom flow
The terminal `hermes model` wizard (_model_flow_named_custom) always
live-probed a custom provider's /models endpoint, ignoring the configured
`models:` list. For plans whose endpoint exposes a large catalog (e.g. Baidu
Qianfan Coding Plan returns 100+ models for a 2-3 model plan) the picker
flooded with models the user can't use.

This wires `discover_models` (and the `models:` list) through
_named_custom_provider_map into the flow and honors `discover_models: false`
the same way the slash-command picker (model_switch.py sections 3 & 4) does:
- Default stays True — live probe, no behaviour change.
- discover_models: false → use the configured `models:` list verbatim,
  skip the probe (string 'false'/'no'/'0' normalised to False).
- If the probe is on but returns empty, fall back to the configured list
  instead of forcing manual entry.
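
The three rules above can be sketched as one resolver. The function names and the `probe` callable are hypothetical stand-ins for the wizard's internals; the behaviour (default True, string normalisation, fallback to the configured list on an empty probe) follows the bullets.

```python
from typing import Callable, List, Optional


def normalize_discover(value: object) -> bool:
    """Treat the strings 'false'/'no'/'0' as False; a missing value defaults to True."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().lower() not in {"false", "no", "0"}
    return bool(value)


def resolve_models(entry: dict, probe: Callable[[], Optional[List[str]]]) -> List[str]:
    """Pick the model list for a named custom provider (illustrative sketch)."""
    configured = list(entry.get("models") or [])
    if not normalize_discover(entry.get("discover_models")):
        return configured          # opt-out: use the configured list verbatim
    live = list(probe() or [])
    return live or configured      # empty probe: fall back to the configured list
```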

Closes #18726
2026-06-06 01:29:41 +05:30
ohMyJason
4b2d00f845 feat(model_switch): honor discover_models in custom_providers section 4
Section 3 (user `providers:`) already honors `discover_models: false` to
skip live /models discovery and keep the explicit `models:` list. Section 4
(`custom_providers:` list) did not — `should_probe` ignored the field, so any
grouped custom provider with an api_key always had its configured subset
replaced by the full live /models catalog.

This adds the same `discover_models` support to section 4:
- Default True — no behaviour change for existing configs.
- `discover_models: false` keeps the explicit `models:` list even when an
  api_key is present.
- String values ("false"/"no"/"0") are normalised to False, matching
  section 3.
- If any entry in a grouped endpoint opts out, the whole group opts out.
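
The group rule in the last bullet amounts to an "any opts out" check. This is a hedged sketch with hypothetical helper names; the actual `should_probe` logic also considers the api_key and is not reproduced here.

```python
from typing import Iterable


def _opted_out(entry: dict) -> bool:
    """True when an entry sets discover_models to a false-like value."""
    value = entry.get("discover_models")
    if value is None:
        return False  # missing key keeps the default (discover)
    if isinstance(value, str):
        return value.strip().lower() in {"false", "no", "0"}
    return not value


def group_discovers(entries: Iterable[dict]) -> bool:
    """A grouped endpoint is probed only if no entry in the group opts out."""
    return not any(_opted_out(e) for e in entries)
```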

Use case: endpoints that expose a full aggregator catalog via /models but
only serve a configured subset.

Salvaged from #29810 — rebased onto current main. The PR's other change
(`key_env` resolution in section 4) landed independently in commit aa283d1e4
(custom provider picker credential isolation), so only the discover_models
portion is carried here.

Co-authored-by: ohMyJason <42903577+ohMyJason@users.noreply.github.com>
2026-06-06 01:04:13 +05:30
adybag14-cyber
af8b917dab fix(termux): scope frontend npm installs 2026-06-05 06:56:51 -07:00
Teknium
9ca11b35d5
perf(/model): prewarm picker provider-models cache in background (#39847)
* perf(/model): prewarm picker provider-models cache in background

The no-args /model picker calls list_authenticated_providers(), which
fetches each authenticated provider's live /v1/models list serially. On a
cold or stale (>1h TTL) cache that blocks ~1.5s on the user's critical path
the first time /model is opened in a session.

Warm that exact path off-thread during the idle window right after the CLI
banner is shown: a once-per-process daemon thread runs
list_authenticated_providers() to populate provider_models_cache.json for
every authed provider. By the time the user types /model, the picker hits
the warm disk cache (~136ms vs ~1500ms).

Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done)
ensures at most one thread per process; fully exception-isolated so an
offline/no-creds provider can never affect the session.
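
A minimal sketch of the once-per-process prewarm pattern described above. The names are hypothetical, and the extra lock around the Event check is an assumption added here so that the check-and-set is atomic; the commit only mentions a process-level Event guard.

```python
import threading
from typing import Callable, Optional

_prewarm_started = threading.Event()
_prewarm_lock = threading.Lock()


def prewarm_once(warm: Callable[[], object]) -> Optional[threading.Thread]:
    """Run `warm` on a daemon thread at most once per process; return the thread or None."""
    with _prewarm_lock:
        if _prewarm_started.is_set():
            return None
        _prewarm_started.set()

    def _run() -> None:
        try:
            warm()  # e.g. populate the provider-models disk cache
        except Exception:
            pass  # exception-isolated: a failing provider must not affect the session

    thread = threading.Thread(target=_run, name="model-prewarm", daemon=True)
    thread.start()
    return thread
```

Because the thread is a daemon, it never blocks interpreter exit if a provider hangs.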
2026-06-05 06:55:09 -07:00
Teknium
7583aedacd
fix(completion): remove /model <arg> autocomplete from CLI/TUI (#39727)
* fix(completion): remove /model <arg> autocomplete from CLI/TUI

The TUI frontend already suppressed /model argument completion in favor of
the two-step ModelPicker (useCompletion.ts), but the CLI prompt_toolkit
completer and the gateway-backed complete.slash RPC (TUI + desktop) still
emitted model aliases and probed LM Studio on every keystroke.

Drops the /model branch in SlashCommandCompleter.get_completions, the
_model_completions method, and the LM Studio probe/cache helper that only
fed it. Command-name completion (/mod -> model) and sibling arg completers
(/skin, /personality) are untouched. Removes the now-dead TestModelTabCompletion
tests.
2026-06-05 06:43:51 -07:00
brooklyn!
d880b5be09
fix(update/windows): don't return _UvResult on Windows (subprocess argv crash) (#39820)
PR #39780 made ensure_uv() return a _UvResult — a str subclass whose
__iter__ yields (path, fresh_bootstrap) so old `uv_bin, fresh = ensure_uv()`
call sites survive the update boundary. That trick is unsafe on Windows.

The dependency installer passes uv straight into the command list
(`[uv_bin, "pip", "install", ...]`). On Windows, subprocess serializes argv
via subprocess.list2cmdline, which iterates every entry *as a string*
(`for c in arg`). Because _UvResult overrides __iter__, that iteration yields
(path, fresh_bootstrap) instead of characters, injecting the bool into the
command line and crashing the first update with:

    TypeError: sequence item 1: expected str instance, bool found

This bites the common single-assignment caller (`uv_bin = ensure_uv()`) on
its first update after #39780: the freshly pulled _UvResult flows into the
old in-memory call site and into the argv. Reported in the field on a
~10-commits-behind Windows install.

A single return value cannot satisfy both legacy 2-target unpacking and
Windows char-iteration — both use the iterator protocol with contradictory
results. So gate the wrapper to POSIX: Windows returns a plain str/None
(the historical, subprocess-safe contract). POSIX keeps _UvResult and the
#39780 update-boundary fix.
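
The conflict can be reproduced on any platform, because `subprocess.list2cmdline` is importable everywhere even though it is only used to build command lines on Windows. The class below is a stand-in written for this illustration, not the project's `_UvResult`; it only copies the one property that matters, a `str` subclass whose `__iter__` yields `(path, fresh_bootstrap)`.

```python
import subprocess


class UvResult(str):
    """Illustrative str subclass that unpacks as (path, fresh_bootstrap)."""

    def __new__(cls, path: str, fresh_bootstrap: bool = False):
        obj = super().__new__(cls, path)
        obj.fresh_bootstrap = fresh_bootstrap
        return obj

    def __iter__(self):
        # Legacy call sites do `uv_bin, fresh = ensure_uv()`, so yield two items.
        return iter((str(self) or None, self.fresh_bootstrap))


def serialize(argv):
    """Return the joined command line, or the error raised while building it."""
    try:
        return subprocess.list2cmdline(argv)
    except TypeError as exc:
        return "TypeError: %s" % exc
```

`serialize([UvResult("/usr/bin/uv"), "pip", "install"])` reports a TypeError because the bool lands in the list of characters being joined, while wrapping the value in `str(...)` first produces a normal command line.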

Tests: list2cmdline canary proving _UvResult breaks Windows, plus Windows
returns-plain-str and POSIX dual-contract coverage.
2026-06-05 07:54:08 -05:00
brooklyn!
db204ae203
fix(update): make ensure_uv() survive the update boundary (no first-run crash) (#39780)
* fix(update): make ensure_uv() survive the update boundary (no first-run crash)

`hermes update` runs the `ensure_uv()` call site from the old, already-imported
`hermes_cli.main` against the *freshly pulled* `managed_uv` (managed_uv is only
ever lazily imported, so it loads from disk post-pull). `ensure_uv()`'s return
arity flipped from a single path string to `(path, fresh_bootstrap)` (4df280d51)
and back to a single string (fb853a178). Installs parked on a 2-tuple release
unpack `uv_bin, fresh_bootstrap = ensure_uv()` against the new single-value
module and crash the first update with
`ValueError: not enough values to unpack (expected 2, got 1)` — inside the
dependency-install step, *before* the PR #39763 subprocess hand-off can run.

Return a `_UvResult` (a `str` subclass) that is usable as the bare path AND
unpackable as `(path|None, fresh_bootstrap)`. Missing uv is `""` (falsy) instead
of `None` so legacy 2-target call sites can unpack a failure without raising,
while `if not uv_bin` keeps working for single-value callers. fresh_bootstrap is
always False (the rebuild-venv path it gated was scrapped in fb853a178).

* docs(update): correct the verified error string + mechanism for ensure_uv()

A hermetic repro (old 2-target call site vs the freshly-pulled single-value
module) shows the first-update crash is exactly the string from PR #39763's
report: `ValueError: too many values to unpack (expected 2)` — not "not enough".
The returned path is a plain `str`, which is iterable, so `uv_bin, fresh =
ensure_uv()` walks its characters; the failure path's `None` return raises
`TypeError: cannot unpack non-iterable NoneType`. Both are fixed by `_UvResult`.
Comment/test wording updated to match; no behavior change.
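
The corrected mechanism is plain Python behaviour and is easy to check in isolation. The helper below is written for this illustration only; it mimics the legacy two-target call site against different return values.

```python
def unpack_two(value):
    """Mimic `uv_bin, fresh = ensure_uv()` and report what happens."""
    try:
        uv_bin, fresh = value
    except (ValueError, TypeError) as exc:
        return "%s: %s" % (type(exc).__name__, exc)
    return (uv_bin, fresh)
```

A path string is iterated character by character, so any path longer than two characters raises ValueError ("too many values to unpack"), and `None` raises TypeError. A two-character string such as `"uv"` would unpack silently into `("u", "v")`, which is another reason a bare string is not a safe stand-in for a 2-tuple.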
2026-06-05 07:08:43 -05:00
Teknium
72eb42d9ec
feat(update): stash/restore by default + settable discard for non-interactive updates (reverts #38542, #39568) (#39645)
* Revert "fix(update): require managed marker before destructive clean"

This reverts commit c8e80cd0bf.

* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"

This reverts commit 8a19884bf3.

* chore(install): keep npm ci desktop-build fix after stash revert

The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.

* feat(update): settable stash-or-discard for non-interactive local changes

Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.

- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
  reads the setting; new _discard_stashed_changes() drops the stash
  (stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
  Post-pull restore site branches on it; the bail-out and up-to-date
  restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
  select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
  missing section defaults to stash).
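
How the setting is consulted can be sketched as a small resolver. The function name is hypothetical, and treating an unrecognised value as `stash` is an assumption made here for safety; the commit only states that a missing section defaults to `stash` and that interactive updates ignore the setting.

```python
from typing import Optional


def resolve_local_changes_policy(config: Optional[dict], interactive: bool) -> str:
    """Return 'stash' or 'discard' for the current update run (illustrative sketch)."""
    if interactive:
        return "stash"  # interactive terminal updates always stash-and-ask
    updates = (config or {}).get("updates") or {}
    value = updates.get("non_interactive_local_changes", "stash")
    # Assumption: anything other than the two known values falls back to stash.
    return value if value in ("stash", "discard") else "stash"
```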

* fix(install.ps1): stash/restore instead of reset --hard on Windows update

The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).

- Replace `git reset --hard HEAD` with stash-before-checkout +
  restore-after-checkout, mirroring install.sh. Untracked files are
  included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
  the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
  `git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
  + non-redirected stdin/stdout + ConsoleHost); the desktop Update button
  and bootstrap have no usable console, so they default to restore and
  never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
  recovery instructions — work is never silently dropped.

Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-06-05 17:30:10 +05:30
Frowtek
3cd1bd971f fix(cli): require Chromium for local browser readiness in setup/status surfaces 2026-06-05 04:06:17 -07:00
Shannon Sands
6bf55a473e Add CLI Telegram QR onboarding
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-05 03:20:10 -07:00
harjoth
b459bac02c fix(cli): gitignore Desktop bootstrap marker so hermes update stops autostashing it
The Desktop bootstrap installer writes `.hermes-bootstrap-complete` into the
managed git checkout root. Because it wasn't gitignored, `hermes update`'s
`git stash push --include-untracked` treated it as a local change and created an
autostash on every run — prompting the user to restore "local changes" that were
really Hermes-managed runtime state (and risking the marker getting stranded in a
stash, which re-triggers Desktop bootstrap).

Add the marker to .gitignore; `git stash -u` and `git status --porcelain` both
skip ignored files, so the updater now sees a clean tree.

Fixes #38529
2026-06-05 02:54:32 -07:00
Acean
b0d234f068 fix(cron): don't crash on cron list when a job's repeat is null
`cron_list` read `job.get("repeat", {})`, but the dict-default only
applies to a MISSING key. A one-shot job persisted with `"repeat": null`
returns None, and the next `.get("times")` raised AttributeError, taking
down the whole `cron list` output. Coalesce with `or {}` so a
present-but-null repeat renders as ∞ like the other cron readers already
do. Adds a regression test.
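
The difference between a missing key and a present-but-null key is standard `dict.get` behaviour. The two helpers below are written for this illustration and are not the project's `cron_list` code.

```python
def repeat_times_buggy(job: dict):
    # The default only applies when "repeat" is absent, not when it is None.
    return job.get("repeat", {}).get("times")


def repeat_times_fixed(job: dict):
    # `or {}` also covers a key that is present with a null value.
    return (job.get("repeat") or {}).get("times")
```

With `job = {"repeat": None}`, the buggy variant raises AttributeError because `None` has no `.get`, while the fixed variant returns `None`, which the listing can then render as unlimited.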

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-05 00:19:45 -07:00
helix4u
c8e80cd0bf fix(update): require managed marker before destructive clean 2026-06-05 00:05:30 -07:00
Ben Barclay
b1e399de95
fix(update-check): stop reporting phantom "N commits behind" inside Docker (#39559)
Inside the published Docker image, both the `--tui` banner and the
dashboard-embedded TUI report `1 commit behind — run docker pull
nousresearch/hermes-agent:latest to update` even though the container
has no git repo and no way to compute a commit delta.

Root cause: two independent update-detection paths, only one of which
knows it's running in Docker.

- `recommended_update_command()` → `detect_install_method()` reads the
  `.install_method` stamp that `docker/stage2-hook.sh` writes at boot →
  returns "docker", so the *command string* correctly says `docker pull`.
- `banner.check_for_updates()` (the source of the "N commits behind"
  *count*) has no notion of the docker install method. It only detects a
  build via `HERMES_REVISION` (nix-only, unset in the image) or a `.git`
  dir (excluded from the image by .dockerignore). Neither matches, so it
  silently falls through to `check_via_pypi()`, whose PyPI-version
  mismatch flag (1) is then rendered verbatim by the CLI banner
  (build_welcome_banner), the Ink TUI badge (branding.tsx), and `hermes
  version` as "1 commit behind" — a phantom count, no commit math
  involved. `hermes update` already refuses to run in-place in the
  container.

The dashboard's REST `/api/hermes/update/check` endpoint already
short-circuits docker (returns behind=None + the docker guidance). This
mirrors that guard inside `check_for_updates()` so the banner/TUI/version
surfaces agree: when `detect_install_method() == "docker"`, return None
before any git/pypi probe (and before writing a cache entry). None makes
the render guards (`typeof === 'number' && > 0`, `behind and behind > 0`)
stay false, so the badge/line disappears entirely — matching the System
page.
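
The shape of the guard, and why returning None hides the badge, can be sketched as follows. The injected callables are an assumption made so the example is self-contained; the real `check_for_updates()` calls `detect_install_method()` and its git/PyPI probes directly.

```python
from typing import Callable, Optional


def check_for_updates(detect_install_method: Callable[[], str],
                      probe: Callable[[], Optional[int]]) -> Optional[int]:
    """Sketch: short-circuit before any git/PyPI probe when running in Docker."""
    if detect_install_method() == "docker":
        return None  # no probe and no cache write happen past this point
    return probe()


def should_render_behind(behind: Optional[int]) -> bool:
    """Python-side render guard quoted above: `behind and behind > 0`."""
    return bool(behind and behind > 0)
```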

Fix is in one place (check_for_updates) because all three consumers route
through it via get_update_result()/_update_result.

Tests: test_check_for_updates_docker_returns_none asserts None + no
git/pypi probe + no cache write; test_check_for_updates_non_docker_still_checks
guards against over-broadening (pip still version-checks). Mutation-tested:
removing the guard fails the docker test.

Verified against a real `docker build` of the image — see PR description.
2026-06-05 15:37:19 +10:00
ethernet
fb853a1783 fix(install): scrap rebuild venv 2026-06-04 23:20:29 -04:00
Brooklyn Nicholson
89baf02919 Merge origin/main into bb/desktop-profile-support
Resolve conflicts in desktop settings/cron/messaging/sidebar: adopt main's
ListRow + actions-menu refactors for credential rows; keep our profileColor
import on the sidebar. Drop the now-orphaned Tip-based helpers.
2026-06-04 20:17:07 -05:00
Kewe63
4a4b9bd2dc fix(test): add platform guard for grp import
Tests in test_gateway_service.py imported grp inline without a
platform guard, causing ImportError on systems where grp is
unavailable (e.g. macOS, WSL without grp module).

Added pytest.importorskip('grp') at module level alongside the
existing pwd guard, and removed three redundant inline import grp
statements.

Fixes #24531
2026-06-04 17:52:50 -07:00
rob-maron
54cae7d1cb switch model order 2026-06-04 17:29:31 -07:00