mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-09 08:21:50 +00:00
71 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a5c12f5f59 |
fix(install): move broken checkout aside instead of deleting it
Review feedback (#40998): `rm -rf` / `Remove-Item -Recurse -Force` on the install dir is destructive -- a user might still want whatever is there. Rename the broken checkout to a timestamped `<dir>.broken-<ts>` backup and re-clone fresh, so nothing is ever deleted. Transient cleanup of a clone attempt that fails within the same run is left as-is. |
||
|
|
fc0900d120 |
fix(install): re-clone interrupted (commit-less) checkout instead of failing
An interrupted previous clone leaves the install dir's .git present but with no initial commit. rev-parse --is-inside-work-tree and git status both still succeed there, so the installer entered the update path and ran `git stash`, which aborts with "You do not have the initial commit yet" and failed the desktop install at the "Cloning Hermes repository" stage. - install.ps1: add a `git rev-parse --verify HEAD` probe to the repo-validity check so a commit-less checkout is treated as broken and re-cloned fresh. - install.sh: mirror it at the top of clone_repo() — drop a partial checkout with no resolvable HEAD so the fresh-clone path handles it (POSIX parity). Fixes #40998 |
||
|
|
fd234bad62
|
fix(install): detect TLS cert-trust failures during npm install on Windows (#40588)
* fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(install): detect TLS cert-trust failures during npm install on Windows Corporate MITM proxies and missing root CAs surface as 'unable to get local issuer certificate' while npm (most often Electron's install.js postinstall) downloads over HTTPS. The installer surfaced this as an opaque 'desktop workspace npm install failed (exit 1)', so users misread it as a permissions/admin-rights problem (issue #38016). Add a shared Show-NpmCertHint detector and route all three npm-install failure paths (agent-browser global install, browser-tools workspace, desktop workspace) through it. On a cert error it prints actionable NODE_EXTRA_CA_CERTS / strict-ssl remediation; on any other failure it stays silent. |
||
|
|
72eb42d9ec
|
feat(update): stash/restore by default + settable discard for non-interactive updates (reverts #38542, #39568) (#39645)
* Revert "fix(update): require managed marker before destructive clean" This reverts commit |
||
|
|
338f0b2234 |
fix(desktop): recover from corrupt Electron cache in bootstrap install (Windows)
Windows counterpart of #39127: scripts/install.ps1 `Install-Desktop` runs `npm run pack` once and throws on the opaque ENOENT a corrupt cached Electron download produces, with no recovery. Add `Clear-ElectronBuildCache` plus a purge-and-retry-once on pack failure, mirroring the install.sh fix: remove the cached electron-*.zip (%LOCALAPPDATA%\electron\Cache + ELECTRON_CACHE / electron_config_cache overrides) and stale *-unpacked output, then retry so @electron/get re-downloads with its own SHASUM verification. Refs #37544. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
80672754a8 | fix(docs): update all install instructions everywhere | ||
|
|
36f1cd7dea |
feat(installer): do shallow clones
no need to get the whole repo history :) |
||
|
|
475ecea3d7
|
fix(install): cap requires-python at <3.14 and pin UV_PYTHON to the venv (#38535)
uv selects the project Python from requires-python and from the UV_PYTHON env var, both of which override an already-created venv on the next 'uv sync'. With no upper bound on requires-python, an inherited UV_PYTHON=3.14 (or a fresh distro whose newest interpreter uv auto-picks) silently recreated the installer's 3.11 venv at 3.14, where Rust-backed transitives (pydantic-core) have no cp314 wheel and fall back to a maturin source build that fails. This bit a Windows/WSL user with UV_PYTHON set in their shell and a fresh WSL-arch box where uv auto-picked 3.14. Two layers: - pyproject: requires-python '>=3.11' -> '>=3.11,<3.14' (+ uv lock regen). uv now refuses a 3.14 interpreter with a clear error instead of attempting the maturin build. Backstop independent of the installer. - install.sh / install.ps1: pin UV_PYTHON to the venv interpreter after creating it (in both the venv step and the deps step, since bootstrap runs those stages as separate processes). An inherited UV_PYTHON can no longer hijack the sync/pip tiers, so the install just works regardless of shell env. Verified E2E: hostile UV_PYTHON=3.14 + uv venv --python 3.11 + uv sync now installs into 3.11 with pydantic-core's 3.11 wheel; without the re-pin the capped requires-python produces a legible incompatibility error rather than a cryptic build failure. |
||
|
|
8a19884bf3
|
fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)
The stash/restore cycle in the update path was observed to clobber freshly-pulled source files (apps/desktop/ deletion -> Vite '[UNRESOLVED_ENTRY] Cannot resolve entry module index.html'). On a managed clone the user never edits the source tree, so any 'dirty' state is pure git artifact (CRLF renormalization, npm lockfile churn, files left behind when a directory was deleted upstream such as apps/bootstrap-installer/). Stashing that and re-applying it after a pull is fragile and unnecessary. - hermes update (hermes_cli/main.py): on a non-fork (managed) clone, discard working-tree dirt via reset --hard HEAD + clean -fd instead of stash/apply. Forks keep the stash machinery so intentional edits survive. Also pin core.autocrlf=false on Windows so the dirt is never created (mirrors install.ps1 #38239). - install.sh: replace the update-path stash/restore dance with a hard reset to origin/<branch>; the installer is a managed-only entry point. - install.sh + install.ps1 desktop stage: prefer 'npm ci' (wipes and reinstalls node_modules from the lockfile) over bare 'npm install', which can report 'up to date' against a stale marker while node_modules is empty -- leaving tsc unresolved so 'npm run pack' fails. Tests: managed clone cleans instead of stashing; fork still stashes; existing stash tests force the stash path explicitly. |
||
|
|
1dca7c6207 |
fix(install): require Node >=20.19/22.12 for the desktop build
The "Build desktop app" install step failed with an opaque "exit code 1" on machines with an old Node, and nothing in the logs explained it. Reproduced: on Node 20.5.1, `npm run pack`'s `vite build` crashes with You are using Node.js 20.5.1. Vite requires Node.js version 20.19+ or 22.12+. SyntaxError: The requested module 'node:util' does not provide an export named 'styleText' Vite 8 (rolldown) imports node:util.styleText, which doesn't exist before Node 20.12, so the build dies before producing the app. The installer's check_node / Test-Node accepted ANY pre-existing Node with no version floor, so a too-old system Node was used for the build instead of the bundled Node 22. Add a version floor (^20.19 || >=22.12) to check_node (install.sh) and Test-Node (install.ps1): a too-old system Node is replaced with the Hermes-managed Node 22 LTS, and the desktop stage re-resolves Node so the build always runs on a satisfying version. Declare the same range in apps/desktop/package.json engines. Verified: build succeeds on Node 22, fails on 20.5.1 with the error above; the floor logic matches Vite's range across boundary versions (20.18/20.19, 21.x, 22.11/22.12). |
||
|
|
214b7e070f
|
fix(install.ps1): handle dirty worktree on Windows update (#38239)
Git for Windows defaults to core.autocrlf=true, which renormalizes the repo's LF-only text files to CRLF in the working tree. On a managed, never-user-edited clone this makes tracked files (.envrc, AGENTS.md, agent/*.py, workflows) show as locally modified, so the update path's bare git checkout aborts with 'Your local changes would be overwritten by checkout' and the desktop bootstrap fails at stage=repository. The bash installer already autostashes before checkout; the PowerShell path had no dirty-tree handling at all and never pinned autocrlf. Fix: (1) git reset --hard HEAD before fetch/checkout in the update path to discard any pre-existing dirt, and (2) pin core.autocrlf=false on both the update and fresh-clone paths so the dirt is never created again. |
||
|
|
43fd63b4b5 |
fix(windows): rip out unused submodule support in installer & docker & docs
we have no submodules anymore, so #37702 was kinda right, but we can just delete it entirely. |
||
|
|
4df280d511 |
refactor(uv): single managed-uv path, delete fts5 installer escalation
Replace the multi-path UV resolution chain (PATH probing, conda guards,
5-location trust ordering, temp-dir fallback installs) with a single
managed uv binary at $HERMES_HOME/bin/uv. Every code path that needs
uv resolves it from that one location; if missing, ensure_uv()
bootstraps it via the official standalone installer.
Key changes:
- New hermes_cli/managed_uv.py: managed_uv_path(), resolve_uv(),
ensure_uv() (returns (path, freshly_bootstrapped) tuple),
update_managed_uv(), rebuild_venv(), installer internals.
- hermes_cli/main.py: replace all shutil.which('uv') with ensure_uv(),
add venv rebuild on first-time managed uv bootstrap, update_managed_uv
before dep install on all 3 update paths.
- scripts/install.sh: install_uv() always installs to
$HERMES_HOME/bin/uv; delete ensure_fts5, _python_has_fts5,
_reinstall_python_with_fts5, _warn_no_fts5 (61 lines).
Managed uv always installs current Python with FTS5.
- scripts/install.ps1: Install-Uv always installs to
$HermesHome\bin\uv.exe; Resolve-UvCmd checks managed location first.
- hermes_state.py: simplified FTS5 warning now suggests 'hermes update'
as the fix instead of blaming install method.
- tests: 15 tests in test_managed_uv.py, autouse _patch_managed_uv
fixture in test_cmd_update.py.
Closes #37605, Closes #37622
|
||
|
|
51c68d4ab1
|
Add Hermes desktop app (#20059)
* feat: better composer etc * docs: add desktop and dashboard run instructions * fix(desktop): address security scan findings * fix(dashboard): resolve @nous-research/ui path under npm workspaces The sync-assets prebuild step shelled out to 'cp -r node_modules/@nous-research/ui/dist/fonts ...' with a path relative to apps/dashboard/. That works only when the dep is installed locally in the dashboard workspace, but 'npm install' at the repo root (the documented setup — see apps/desktop/README.md) hoists shared deps to the root node_modules under npm workspaces. The relative cp then fails with 'No such file or directory', sync-assets exits 1, the Vite build aborts, and 'hermes dashboard' surfaces a generic 'Web UI build failed' message. Replace the shell one-liner with scripts/sync-assets.cjs, which walks up from the dashboard directory looking for node_modules/ @nous-research/ui — working in both the hoisted (workspaces) and co-located (standalone) layouts. Also guards against a missing dist/fonts or dist/assets with a clearer error pointing at a rebuild of the UI package rather than silently copying nothing. * feat(desktop): support connecting to a remote Hermes backend Add HERMES_DESKTOP_REMOTE_URL and HERMES_DESKTOP_REMOTE_TOKEN env vars that, when set, short-circuit the local-child spawn in startHermes() and connect the Electron renderer to an already- running 'hermes dashboard' server reachable over the network. Motivating use case: WSL2 users who want to run the Hermes core (agent loop, tools, filesystem access) inside their WSL distribution while rendering the Electron GUI on native Windows. Before this change, the desktop app always spawned a local Python child on the same host as the renderer, which doesn't cross the WSL/Windows boundary. The remote path reuses waitForHermes() as a liveness probe (/api/status is in the backend's public endpoint allowlist), so the connection is only returned once the backend is actually ready. WebSocket URL derivation picks ws:// or wss:// based on the input scheme. URL validation rejects non-http(s) schemes and requires both env vars together to avoid a half-configured connection that would silently fall through to the spawn path. No behaviour change when the env vars are unset — the default local-spawn flow is untouched. Typical usage: # in WSL2 hermes dashboard --tui --no-open --host 0.0.0.0 --port 9119 --insecure # on Windows set HERMES_DESKTOP_REMOTE_URL=http://localhost:9119 set HERMES_DESKTOP_REMOTE_TOKEN=<session token> set HERMES_DESKTOP_IGNORE_EXISTING=1 (launch Hermes desktop) * ci(desktop): automate desktop releases Add GitHub Actions release channels for signed desktop installers and document the stable/nightly download paths. * feat: file tabs * refactor(desktop): tighten right-rail tab close API Promote closeRightRailTab/closeActiveRightRailTab as the single public entry point. Drops the activeTabRef + handleCloseDocument indirection in ChatPreviewRail, the unused $rightRailHasContent atom, and the legacy dismissFilePreviewTarget alias. -70 LOC. * feat(desktop): polish composer pill toward reference look Solid foreground-on-background send/voice-conversation circle (black-on-white in light, white-on-black in dark) anchors the right edge as the primary CTA instead of the orange theme primary. Bumps the primary control to 2.125rem so it visually outranks the ghost mic/plus controls. Opens up the surface padding (0.625rem x / 0.5rem y) so the input row breathes around its controls, and nudges the corner radius from 20 to 24px for a slightly pill-ier silhouette. LiquidGlass distortion is preserved. * feat(desktop): add startup and onboarding flow Add phase-based desktop boot progress, fresh-install sandbox testing, and first-run provider credential onboarding so packaged installs can start cleanly without manual settings detours. * fix(desktop): gate prompts on provider setup Show the desktop provider onboarding flow before prompt submission when no inference provider is configured, preventing fresh installs from falling through to backend credential errors. * fix(desktop): surface provider onboarding from session warnings Propagate credential warnings through session runtime info and open desktop onboarding whenever a session reports no usable provider, so unconfigured installs cannot fall through to prompt errors. * fix(desktop): route gateway provider errors to onboarding The "No inference provider configured" auth error reaches the renderer through gateway error events, not the prompt.submit promise; the previous patch only caught the latter, so the error toast still surfaced and onboarding never opened. Also strip credential-shaped env vars from the test:desktop:fresh sandbox so the packaged backend can't see provider keys leaking from the launching shell. * fix(desktop): use strict runtime check to drive onboarding setup.status returned True whenever any provider auth state was discoverable, including indirect fallbacks like a gh-CLI Copilot token. That made desktop think the user was set up while the agent's actual resolve_runtime_provider call still raised AuthError, leaving the user with a useless toast and no onboarding. Add a setup.runtime_check gateway method that runs the same resolver the agent uses on session creation, and switch the desktop onboarding overlay and prompt precheck to use it. * feat(desktop): OAuth-first onboarding using existing dashboard provider API Replace the engineer-flavored API key form with a Sign-in-first onboarding overlay that uses the dashboard's existing /api/providers/oauth catalog and PKCE/device-code endpoints (Anthropic, Nous, OpenAI Codex, etc.). API key entry is now a fallback tab with friendly provider names instead of env var prefixes, and the loud raw resolver error is gone in favor of a one-line welcome message. * fix(desktop): polish onboarding provider list Reorder OAuth providers so Nous Portal is first, give the segmented Sign in / API key control equal column widths, and replace the engineer-flavored backend names like "Anthropic (Claude API)" / "MiniMax (OAuth)" with friendlier in-app titles. External-CLI providers now show a softer subtitle and an external-link icon instead of a chevron. * refactor(desktop): split onboarding overlay into store + view Move the OAuth state machine, runtime check, copy-to-clipboard, and api-key save into store/onboarding.ts (matching the boot.ts pattern), leaving the overlay as a presentation layer that subscribes via useStore. Tabs are now table-driven, child panels read flow from the store instead of prop-drilling, and the polling/PKCE/error/success branches share a small Status atom. * fix(desktop): external CLI providers + center mode tabs External-CLI providers (Claude Code, Qwen Code) now open an in-overlay panel with the CLI command, copy button, and an "I've signed in" recheck instead of firing an invisible toast. Center the Sign in / API key tab control so it sits under the heading instead of hugging the left edge. * fix(desktop): drop onboarding tabs for an inline link, group device-code waiting state Replace the Sign in / API key tab pair with an "I have an API key" footer link under the OAuth provider list, with a "Back to sign in" affordance inside the API key form. Group the device-code "Waiting for you to authorize..." status next to the Cancel button so the alignment matches the action. * refactor(desktop): tighten onboarding store + overlay Drop the dead isOnboardingBusy/BUSY set, factor the catch-fallback dance into safeReq, and share a single reloadAndConnect helper between PKCE submit, device-code success, external recheck, and api-key save. In the overlay, extract Step / CodeBlock / FlowFooter / CancelBtn / DocsLink atoms so the four sign-in panels share the same chrome instead of repeating it inline. Net effect: fewer literal divs, one place to touch the spacing, and the code-block + footer rows are reusable across future flows. * fix(desktop): mount onboarding from frame 1 to kill the FOUT Default onboarding.configured to null (unknown until the runtime check resolves) and have the onboarding overlay render whenever it's not yet confirmed true. The boot overlay now yields to it, so the very first paint is the Welcome card with a "While we get you set up..." progress strip instead of a flash of the chat shell between boot dismiss and onboarding mount. The picker swaps in cleanly once the gateway opens and the runtime check confirms the user is not configured. Already-configured users see the same prep card briefly while their existing runtime warms up, then the overlay dismisses without touching the chat shell. * fix(desktop): top-align empty sessions placeholder The "Start a chat to build your history." empty state used a min-h-35 grid place-items-center container, which floated the text in a tall dead zone. Render it as a flat paragraph that sits right under the section header like the empty pinned state does. * refactor(desktop): drop dead boot overlay Onboarding overlay subsumes the boot card now that it mounts from frame 1 and renders boot progress inline. The standalone DesktopBootOverlay is unreachable in every flow (yields whenever onboarding has not confirmed configured, dismisses once it has). * fix(desktop): hide pinned/recents sections until first session A fresh sidebar showed the Pinned and Recent chats headers with floating empty-state copy underneath. Drop both sections (and the now-orphan SidebarEmptySessionState) when there are no sessions yet — they reappear after the first chat. Skeletons during initial load are unchanged. * feat(gui): route embedded TUI through dashboard gateway (#21979) Inject HERMES_TUI_GATEWAY_URL into dashboard PTY sessions so embedded ui-tui instances attach to the in-process websocket gateway, with coverage for the new env wiring. * Add desktop remote gateway settings Make the desktop gateway connection configurable from settings so local remains the default while remote backends can be saved, tested, and applied without environment variables. * feat(gui): first-class Messaging page + gateway menu redesign - Add Messaging page to the desktop app with per-platform setup, status, and inline guidance. Catalog derives from gateway.config Platform enum + plugin registry, so every messaging adapter the CLI supports (Telegram, Discord, Slack, Mattermost, Matrix, WhatsApp, Signal, BlueBubbles, Home Assistant, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ, Yuanbao, API server, Webhooks, plugins) shows up without per-platform code. - New REST endpoints: GET /api/messaging/platforms, PUT and POST /test on the same path. Secrets go through the existing .env pipeline; enable/disable writes config.yaml. - Replace gateway statusbar dropdown with a richer panel: status row, icon-only restart + system-panel actions, recent activity (with timestamps trimmed in display, full text on hover), platform list. - Auto-poll the messaging page every 6s (paused when hidden) so status updates without a manual check. - Drop Settings / Command Center from the sidebar nav (still reachable via shortcuts and the titlebar cog). - Flatten top corners on Messaging/Skills/Artifacts/Chat panes. - Share new StatusDot component across messaging + gateway menu. - Fix gateway/config.py so an explicit platforms.<name>.enabled=false in config.yaml is honored when env tokens are present. - pb-9 on the chat content area for breathing room above the composer. * Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * pin electron version * hide application menu on non-mac systems * interpret compactPreview for non-string vlaues as JSON or an empty string * fix(desktop): keep composer contenteditable mounted across stacked toggle The composer rendered {input} inside two different parent fragments depending on `stacked`. When auto-expand flipped `stacked` (e.g. the moment typed text wrapped past two lines), React reconciled the two branches as different positions and unmounted/remounted the contenteditable. The fresh mount started empty, so any in-flight characters — most reliably reproduced by holding a key — were lost. Replace the conditional with a single CSS Grid whose template-areas swap on `stacked`. The three children (menu, input, controls) keep stable identities across the toggle; only their grid placement changes, which the browser handles without React tearing down the editor. * refactor(desktop): align install layout with install.ps1 / install.sh Make the desktop app's runtime layout match what scripts/install.ps1 and scripts/install.sh produce, so a desktop-only user and a CLI-only user end up with the same files in the same places and can share one install. Layout - ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only) - VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime) - desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log) - HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere The packaged .app/.exe still ships a read-only payload at process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch or after an installer-driven upgrade we sync factory -> active, then provision the venv and run pip install -e . against the active root. Key behaviors - Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves to the same path resolveHermesHome() picked. Without this, Python falls back to ~/.hermes on every platform - fine on mac/linux, a split-state bug on Windows where our default is %LOCALAPPDATA%\hermes. - Detect developer installs by .git presence at ACTIVE; never overwrite a user's checkout via factory sync. - Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks pyproject hash + factory version + runtime schema version. depsFresh fast-paths when nothing changed. - Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run their local edits, not whatever's under HERMES_HOME. - Better error messages distinguish "no payload" from "no Python". - Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes exists, so users with prior pip/manual installs aren't orphaned. pyproject.toml - Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and pywinpty (Windows) to main dependencies. The dashboard backend (hermes dashboard) needs them at runtime; the previous lazy-import fallback was a footgun for fresh installs. - Empty the [pty] optional-extra; kept as a no-op back-compat alias for any existing pip install hermes-agent[pty] invocations. Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the desktop now installs whatever pyproject.toml says, single source of truth. Files - apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin, factory->active sync, marker v4 - apps/desktop/scripts/test-desktop.mjs: track new venv location - apps/desktop/README.md: new Setup, Runtime Bootstrap, and Debugging sections - pyproject.toml: fastapi/uvicorn/pty backends in main dependencies; [pty] extra emptied Tested locally on Windows: npm run dev boots cleanly, sessions land at the new location, type-check + lint + test:desktop:platforms all pass. Verified end-to-end on a fresh Win11 VM via dist:win installer. Known gaps (filed as follow-ups, not in this PR): - Skills not seeded on packaged installs (sync_skills only runs in cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch. - Git Bash not bundled or detected; agent's terminal tool errors out with a useful message but desktop bootstrapper should pre-flight it. - install.ps1 / install.sh should be decomposed into composable phase libraries so the desktop bootstrapper can reuse them as a single source of truth across all install surfaces. * feat(desktop): theme polish, prose chat typography, composer chrome - DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose - Composer liquid/radius utilities, thread font parity, tool/thinking cues - File tree label scale, preview flex, thread retry loading + streaming tests * feat(desktop): NSIS prereq detection page + auto-install via winget The packaged Windows installer now detects Python 3.11+ and Git for Windows at install time and offers to install missing prereqs via winget. Mirrors the prereq logic scripts/install.ps1 already runs for CLI installs, so desktop installer users get the same out-of-the-box experience as install.ps1 users. Why - Hermes' terminal tool calls bash.exe directly (tools/environments/ local.py); on Windows that's Git Bash from Git for Windows. Without it, the agent fails on the first terminal() call. - Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper errors out at venv creation. - Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python pre-installed but no Git, so the agent's first terminal call failed with "Git Bash isn't installed." - install.ps1 has had Install-Git + Install-Uv functions for ages. The desktop installer was the asymmetric outlier. How — NSIS prereq page - New file: apps/desktop/installer/prereq-check.nsh (plugged into electron-builder via build.nsis.include) - Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir hook (between the Directory page and InstFiles). - Group boxes for Python and Git, each showing detection status. - Pre-checked install checkboxes when winget is available. - Auto-skips silently if both prereqs are already installed. - Falls back to manual download URLs when winget itself is missing. - Detection: - Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python launcher. Microsoft Store "Python stub" (no py.exe) is correctly classified as not-installed. - Git: `where git`. - winget: `where winget` (Win10 1809+ / Win11 with App Installer). - Install execution (in customInstall macro): - Python: nsExec::ExecToLog with `--scope user --silent`. Per-user install, no UAC prompt, output streams to install log. - Git: ExecShellWait via Windows ShellExecute. Critical because Git always installs per-machine and triggers UAC; ShellExecute preserves the foreground focus chain across non-elevated → elevated process spawns, so UAC actually comes to the foreground. nsExec::ExecToLog breaks the chain because winget runs hidden. - Both pass `--disable-interactivity --accept-package-agreements --accept-source-agreements` to suppress winget's own dialogs. - Verification: probes Git's standard install locations via FileExists rather than `where git`. NSIS's process inherits PATH at startup, so a freshly-installed Git won't be visible to `where` until restart. - Silent installs (/S) skip the prompts; managed deploys handle prereqs out-of-band via Group Policy / Intune. How — Electron-side safety net - New findGitBash() in main.cjs, parallel to findSystemPython(). Probes the same locations as tools/environments/local.py:_find_bash() so a positive result here means the agent's terminal tool will work. - ensureRuntime now throws a clear, actionable error on Windows when Git Bash isn't found, matching the existing "Python 3.11+ is required" error path. - Catches users the NSIS page doesn't: .msi installer users (NSIS prereq page doesn't run for MSI), `npm run dev` users, manual installers, anyone who unchecked the install boxes on the NSIS prereq page. - All gated on `IS_WINDOWS`; macOS / Linux unaffected. NSIS build issue (resolved) - electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer emits "warning 6010: function not referenced" for our page functions because Page custom directives don't count as references in its static-analysis pass. The functions ARE called at runtime when NSIS invokes the page; the optimizer just can't see it statically. - Set `build.nsis.warningsAsErrors=false` in package.json so this spurious warning doesn't fail the build. (Documented option from electron-builder's nsisOptions.) Out of scope (filed for future work) - MSI prereq detection: Windows Installer custom actions are a different mechanism. Enterprise deploys typically handle prereqs via GP/Intune. - Bundle PortableGit + python-build-standalone in extraResources for zero-network installs. ~80MB increase. - Mac / Linux GUI prereq flows (different installer formats; Xcode CLT covers most macOS prereqs already; Linux is per-distro hard). Files - apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS) - apps/desktop/package.json (build.nsis.include + warningsAsErrors) - apps/desktop/electron/main.cjs (findGitBash + preflight) - apps/desktop/README.md (Runtime prerequisites section) Cross-platform impact - macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis config is ignored entirely; .nsh is dormant. - npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS. - scripts/install.ps1, scripts/install.sh: no reference to any new files; CLI install paths untouched. - Hermes CLI / dashboard / gateway: no reference; runtime untouched. - All checks: node --check on main.cjs and test-desktop.mjs pass; npm run test:desktop:platforms 4/4 passing; node --test green. Tested - npm run dist:win produces signed .exe and .msi without errors. - Fresh Win11 VM (Python pre-installed, no Git): prereq page renders, Python check shows detected, Git checkbox pre-checked. Click Next → Git installs via winget with UAC prompt in foreground. - After install completes, Hermes launches and the agent's terminal tool can run bash commands. Verified Git Bash is detected at `C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight. * feat: theme changes, composer tweaks, in app update ux, finesse * fix(cli): seed bundled skills on dashboard + gateway entrypoints `sync_skills(quiet=True)` was only being called from inside `cmd_chat`, which meant `hermes dashboard` (the desktop GUI's backend) and `hermes gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled skill library into ~/.hermes/skills/. This surfaced as "No skills found" in the desktop GUI's skills panel on fresh installs, despite the agent having access to the full bundled library when invoked via `hermes chat`. scripts/install.ps1 worked around it by running skills_sync.py as part of Copy-ConfigTemplates, but that's not part of the desktop installer's bootstrap chain. Fix - Extract the skills-sync block from cmd_chat into a module-level `_sync_bundled_skills_quietly()` helper. - Call the helper from cmd_chat (preserving existing behavior), cmd_dashboard (after the --status/--stop early-return paths and fastapi import check, so we don't run skills_sync on management commands or when deps aren't installed), and cmd_gateway. Why these three entrypoints - cmd_chat: the user's primary CLI entrypoint - cmd_dashboard: the desktop GUI's backend; this is what `hermes dashboard --tui` invokes when the desktop bootstrapper spawns Hermes - cmd_gateway: long-running daemons where the user expects the agent to have full skill access Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status, etc.) are management commands that don't need skill discovery and were never running skills_sync in the first place — leaving them alone. Idempotence - tools/skills_sync.py is manifest-based: skipped skills cost milliseconds. Calling it from multiple entrypoints adds no real cost, and users running `hermes chat` then `hermes dashboard` get two fast no-ops on the second call. Failure handling - Helper wraps skills_sync in try/except. Skills are an enhancement, not a hard dependency — Hermes runs fine with an empty skills/ dir. Files - hermes_cli/main.py: + new helper `_sync_bundled_skills_quietly()` at module level + cmd_chat: replace inline block with helper call + cmd_dashboard: add helper call after fastapi import succeeds + cmd_gateway: add helper call before delegating to gateway_command * feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes - Hoist todo to first-class widget (shadcn checkboxes, brand colors, no tool-accordion). Header derives label from active task; non-active rows fade. - Replace raw JSON dumps with structured key/value summaries via formatToolResultSummary; nested error extraction for clearer failures. - Fix loaded-session grouping: stitch interleaved assistant/tool iterations into one bubble instead of orphaned synthetic messages. - Stable tool/thinking timers via keyed registry so unmount/scroll doesn't reset elapsed counts; gate "running" on real live thread state. - Reorganize chat-only assistant-ui components under components/chat/. * fix(desktop): address CodeQL alerts on PR #20059 - settings/helpers.ts: harden setNested against prototype pollution. POLLUTING_PATH_PARTS check is now applied at every assignment site (loop + leaf) and uses Object.defineProperty so CodeQL can see the guard inline rather than via a helper function call. - lib/markdown-preprocess.ts: rebuild the dangling-fence close regex from a fence-char + length instead of marker.replace(...). The marker is captured by `(`{3,}|~{3,})` so it can only be backticks or tildes, but CodeQL was tracing tainted input text into the RegExp source and flagging hostname dots from input as part of the pattern (false positive js/incomplete-hostname-regexp on the test fixture URLs). Reconstructing from a literal char breaks the dataflow. - scripts/notarize-artifact.cjs: drop args from the run() rejection message. Args carry --key-id / --issuer / key file path; the existing outer catch already squashes errors to a generic line, but CodeQL was flagging the args.join(' ') as clear-text logging of APPLE_API_KEY_ID. Composer DOM-text-as-HTML alerts (composer/index.tsx:379, :547) are already addressed in |
||
|
|
60bb98e003
|
fix(install.ps1): pin PortableGit instead of hitting rate-limited GitHub API (#28943)
The Windows installer fetched the latest git-for-windows release via api.github.com/repos/git-for-windows/git/releases/latest, which is rate-limited to 60 requests/hour/IP for unauthenticated callers. Users behind CGNAT, corporate NAT, dorm WiFi, or shared ISP routinely hit the limit, and the installer aborts asking them to install Git manually. Switch to a pinned release tag (v2.54.0.windows.1) and a static github.com/.../releases/download/<tag>/<asset> URL. Static download URLs are served by GitHub's blob storage and are not subject to the API rate limit. Trade-offs: - We have to bump the pin when we want a newer Git for Windows. The installer doesn't depend on Git features beyond 'works', so this is a once-a-year maintenance cost at most. - Loses the (cosmetic) MB size display, since we no longer have asset metadata. Replaced with the version string in the 'Downloading ...' line instead. |
||
|
|
4229facc01 | docs(windows): avoid piping installer directly into iex | ||
|
|
a53e8ca733 |
feat(install.ps1): strip BOM, add -Commit/-Tag pin params, harden git ops
Three install.ps1 improvements pulled from the thin-installer work on
bb/gui (PR #27822) that benefit the canonical CLI install flow on main:
1. Strip UTF-8 BOM from scripts/install.ps1.
The canonical 'irm <raw URL> | iex' install flow has been broken
since commit
|
||
|
|
e3a254d65b
|
feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (#27845)
* feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection dep_ensure.py gains Windows awareness: PowerShell invocation, platform- specific browser detection, (path, shell) tuple returns. install.ps1 gains -Ensure/-PostInstall modes using npm -g --prefix (aligned with install.sh) and agent-browser install for Chromium. browser_tool.py gains node/ in candidate dirs for Windows .cmd shims. Both install scripts bundled in pip wheel. Tracking: #27826 * fix(install.ps1): add --ignore-scripts to npm install for camofox @askjo/camofox-browser has a dependency (impit) whose postinstall script runs `npx only-allow pnpm`, which fails under npm. Adding --ignore-scripts avoids the spurious failure without affecting functionality. Tracking: #27826 * fix: remove duplicate install scripts from git CI already copies scripts/install.{sh,ps1} into hermes_cli/scripts/ during wheel build. No need to commit copies — .gitignore keeps them out, _find_install_script() falls back to scripts/ for git-clone users. Tracking: #27826 * fix: address review — remove env_extra, fix ps1 error handling - Remove unused env_extra parameter from ensure_dependency() - Invoke-EnsureMode node case now uses Test-Node consistently - Install-AgentBrowser uses throw instead of exit 1 |
||
|
|
fb138d91ca |
fix(install.ps1): Stage-Node honest reporting + reject empty -Stage
Two protocol-correctness gaps from review:
1. Stage-Node used [void](Test-Node) which discarded Test-Node's return
value, so the JSON frame always reported ok=true even when Node
install fully failed. A GUI driver consuming the manifest couldn't
tell 'node ready' from 'node missing'. Wire a soft-skip channel
($script:_StageSkippedReason) that workers can populate to surface
'ran, but the thing it was supposed to set up is not available' as
skipped=true with a reason in the JSON, without aborting the install
(Node is optional -- browser tools degrade gracefully, matches
Write-Completion's existing 'Note: Node.js could not be installed'
behavior). Reset before each stage so a prior reason can't leak.
2. The -Stage dispatch used 'if ($Stage)' which is falsy for empty
string, so 'install.ps1 -Stage ""' fell through to Main and silently
kicked off a full destructive install. Switch to
PSBoundParameters.ContainsKey('Stage') so an explicit empty value
surfaces as unknown-stage exit 2 with a structured JSON frame, the
way every other bad stage name does.
|
||
|
|
3925be2791 |
fix(install.ps1): trim completion banner + strip em-dash in test
Address the two cosmetic items from review: - Completion banner middle line was 62 chars vs 59-char top/bottom borders (replacing the 1-char checkmark with [OK] added width that wasn't reflected in the trailing whitespace). Drop 3 trailing spaces. - Smoke test file had a single em-dash in a comment -- the only non-ASCII byte across both files. Replace with -- for consistency with install.ps1's pure-ASCII goal. |
||
|
|
c0b64f0877 |
fix(install.ps1): address Copilot review on #27224
Three issues flagged by the Copilot review on this PR: 1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name> ran a worker that threw, Invoke-Stage's finally emitted a JSON result frame AND the entry-point catch emitted a second error frame -- producing two concatenated JSON objects on stdout and breaking the one-line-per-invocation contract that drivers parse against. Same issue applied to -Json mode on a full install (every stage's finally plus a final error frame missing duration_ms/skipped). Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame when it emits a failure frame; the entry-point catch checks the flag and skips its own emit, still exit 1. 2. $prevEAP uninitialized on early try-block throw (Copilot #3). In Install-Uv, Test-Python, Test-Node's winget fallback, _Run-NpmInstall, and the playwright block, '$prevEAP = $ErrorActionPreference' lived as the first statement INSIDE the try. If anything between 'try {' and that line threw (Write-Info on an unusual host, the npx-finding loop, etc.), the catch's 'if ($prevEAP) { ... }' restore was a no-op and EAP could remain relaxed. Fix: hoist '$prevEAP = $ErrorActionPreference' to the line immediately before 'try {' in all five sites. Catch's restore is now always meaningful regardless of where in the try the throw originated. No change to Invoke-Stage's success path or to the four lint-clean EAP sites (Test-Node was the only winget-related catch). All 19 metadata smoke tests still pass. |
||
|
|
e5f19af2a5 |
feat(install.ps1): stage protocol + Windows clean-VM hardening pass
Adds an opt-in stage protocol that lets programmatic drivers (the
desktop GUI's onboarding wizard, CI, future install.sh parity) drive
install.ps1 one step at a time with structured JSON results. Default
invocation (`irm | iex` one-liner) behaves unchanged.
Entry points:
install.ps1 Today's interactive install (unchanged)
install.ps1 -ProtocolVersion Emit protocol version integer
install.ps1 -Manifest Emit JSON manifest of available stages
install.ps1 -Stage <name> Run one stage, emit JSON result
install.ps1 -NonInteractive Suppress Read-Host prompts (skips the
setup wizard and gateway autostart)
install.ps1 -Json Machine-readable completion frame
Manifest exposes 14 stages across prereqs/install/finalize/post-install
categories, with 2 (configure, gateway) flagged needs_user_input=true
so GUI drivers can skip them and handle the equivalent UX themselves.
Along the way, clean-VM testing on stock Windows 10/11 surfaced a
series of latent install.ps1 bugs that were never exercised by
developer machines. Fixed in the same commit:
* Encoding: file is now pure ASCII with no BOM. Windows PowerShell
5.1 reads BOM-less files as Windows-1252 and chokes on em-dashes
(and other UTF-8 sequences), while iex chokes on a leading U+FEFF.
Pure-ASCII satisfies both invocation paths.
* EAP=Stop + native `2>&1` captures: PowerShell wraps stderr lines
from native commands as ErrorRecord objects under EAP=Stop and
throws even when the command exits 0. Relaxed to EAP=Continue
around the astral.sh uv installer, `uv python install`, `npm
install`, `npx playwright install`, the venv import probes, and
the Node winget fallback. Check $LASTEXITCODE for the real signal.
* Cross-process state: each `-Stage <name>` invocation spawns a
fresh powershell child. $script:UvCmd set by Stage-Uv was invisible
to Stage-Python; PATH updated by Stage-Git/Stage-Node was invisible
to subsequent stages spawned by the driver shell. Added Resolve-UvCmd
helper called at the top of every stage that needs uv, and a
Sync-EnvPath helper called at the top of Invoke-Stage to refresh
PATH from the registry.
* UAC avoidance: `winget install OpenJS.NodeJS.LTS` triggers a UAC
prompt that often appears minimized in the taskbar -- looks like a
hang. Switched Test-Node to prefer the official portable Node zip
dropped into %LOCALAPPDATA%\hermes\node\ (mirrors the PortableGit
pattern Install-Git already uses). winget kept as fallback.
* npx hangs on confirmation: `npx playwright install chromium` blocks
on stdin waiting for "Need to install playwright@X.Y.Z (y/N)" when
playwright isn't in local node_modules. Tee-Object pipelines
disconnect stdin from the user's TTY so the install hangs forever.
Pass `--yes` to auto-accept.
* Silent long-running installs: `*> $logPath` redirected every stream
to disk and left the user staring at a frozen "Installing..." line
for the 5-10 minutes Playwright Chromium takes to download. Switched
to `2>&1 | ForEach-Object { "$_" } | Tee-Object -FilePath $log` so
output streams live to the console AND captures to log for failure
diagnostics. ForEach-Object coercion strips PowerShell's red
NativeCommandError formatter from stderr items.
* Console encoding: forced [Console]::OutputEncoding to UTF-8 so
playwright/git/npm progress bars, box-drawing, and check marks render
correctly instead of as IBM437/Windows-1252 mojibake.
* Performance: set $ProgressPreference = "SilentlyContinue" so
Invoke-WebRequest doesn't paint its per-chunk progress bar. The
PS 5.1 progress UI throttles downloads by 10-100x (a 57MB PortableGit
grab takes 5 minutes with the bar on vs ~20 seconds with it off,
same network). Affects PortableGit, Node portable zip, and the
Hermes repo zip fallback.
Tests: scripts/tests/test-install-ps1-stage-protocol.ps1 provides 19
metadata-only assertions covering -ProtocolVersion, -Manifest schema,
and unknown -Stage error frame. No install side effects.
End-to-end validated on a clean Windows 10 VM via:
1. `irm <branch>/scripts/install.ps1 | iex` (canonical CLI path)
2. `powershell -File install.ps1 -Stage X` iterated through every
stage (GUI driver path, exercises cross-process fixes)
|
||
|
|
4279da4db6 | fix(windows): make PowerShell installer parse in 5.1 | ||
|
|
622c27e55c
|
fix(install.ps1): restore EAP=Continue around uv python install, skip Store stub (#26586)
Fresh Windows installs were failing on first run with:
⚠ uv python install error: Downloading cpython-3.11.15-windows-x86_64-none (24.5MiB)
✗ Installation failed: Python was not found; run without arguments
to install from the Microsoft Store...
Two bugs compounding:
1) EAP=Stop swallows uv's stderr progress as an exception. uv writes
download progress ("Downloading cpython-3.11.15-windows-x86_64-none
(24.5MiB)") to stderr. With $ErrorActionPreference = "Stop" set at
the top of the script plus 2>&1 capture, PowerShell wraps each stderr
line as an ErrorRecord and throws on the first one — even though uv
exits 0 and Python was installed successfully. This was previously
fixed in commit
|
||
|
|
5af672c753
|
chore: remove Atropos RL environments and tinker-atropos integration (#26106)
* chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule Delete: - environments/ (43 files — base env, agent loop, tool call parsers, benchmarks) - rl_cli.py (standalone RL training CLI) - tools/rl_training_tool.py (all 10 rl_* tools) - tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support, test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling, test_terminalbench2_env_security - optional-skills/mlops/hermes-atropos-environments/ - tinker-atropos git submodule + .gitmodules * chore: remove RL/Atropos references from Python source - toolsets.py: remove rl toolset block + update comment - model_tools.py: remove rl_tools group + update async bridging comment - hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS, setup block, and rl_training post-setup handler - tools/budget_config.py: remove RL environment reference in docstring - tests/test_model_tools.py: remove rl_tools from expected groups - tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference * chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml - Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb) - Remove yc-bench extra - Remove rl_cli from py-modules - Remove [tool.ty.src] exclude for tinker-atropos - Remove [tool.ruff] exclude for tinker-atropos - Regenerate uv.lock * chore: remove tinker-atropos from install/setup scripts - setup-hermes.sh: remove entire tinker-atropos submodule install block - scripts/install.sh: remove both tinker-atropos blocks (Termux + standard) - scripts/install.ps1: remove tinker-atropos block - nix/hermes-agent.nix: remove tinker-atropos pip install line * chore: remove RL references from cli-config.yaml.example * docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md * docs: remove RL/Atropos references from website - Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md - sidebars.ts: remove rl-training and environments sidebar entries - optional-skills-catalog.md: remove hermes-atropos-environments row - tools-reference.md: remove entire rl toolset section - toolsets-reference.md: remove rl row + update example - integrations/index.md: remove RL Training bullet - architecture.md: remove environments/ from tree + RL section - contributing.md: remove tinker-atropos setup - updating.md: remove tinker-atropos install + stale submodule update * chore: remove remaining RL/Atropos stragglers - hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs - hermes_cli/doctor.py: remove Submodules check section (tinker-atropos) - hermes_cli/setup.py: remove RL Training status check - hermes_cli/status.py: remove Tinker + WandB from API key status display - agent/display.py: remove both rl_* tool preview/activity blocks - website/docs: remove RL references from providers.md + env-variables.md - tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script * chore: remove RL training section from .env.example |
||
|
|
524490a409
|
fix(install.ps1): pin uv sync to venv\, verify baseline imports on Windows (#25755)
* fix(cli): allow rotating broken OpenRouter / AI Gateway key in `hermes model` flow Before: when `OPENROUTER_API_KEY` (or `AI_GATEWAY_API_KEY`) was already set in ~/.hermes/.env, `hermes model openrouter` / `hermes model ai-gateway` skipped the API-key prompt entirely and jumped straight to the model picker. Users with a broken / expired / wrong key had no way to replace it without editing ~/.hermes/.env by hand or re-running `hermes setup` from scratch. Both flows now route through the existing `_prompt_api_key()` helper, which surfaces [K]eep / [R]eplace / [C]lear when a key is already configured — the same UX the generic API-key providers (z.ai, MiniMax, Gemini, etc.) and the Daytona setup already use. * fix(install.ps1): pin uv sync target to venv\, verify baseline imports Two related Windows-installer bugs that produce a broken venv with `ModuleNotFoundError: No module named 'dotenv'` on first `hermes` run. ## Bug 1: uv sync ignores VIRTUAL_ENV, syncs into .venv\ instead of venv\ `Install-Dependencies` creates the venv at `venv\` via `uv venv venv`, sets `$env:VIRTUAL_ENV = "$InstallDir\venv"`, then runs `uv sync --extra all --locked`. Modern uv (>=0.5) ignores `VIRTUAL_ENV` for the `sync` subcommand and uses the project default `.venv\` instead. Result: deps land in `$InstallDir\.venv\`, `venv\` stays empty except for the python.exe stub from the earlier `uv venv` call, `hermes.exe` ends up wired to the wrong site-packages. The bash installer (`scripts/install.sh`) already worked around this in `install_deps()` line 1127 by passing `UV_PROJECT_ENVIRONMENT` — that flag tells uv exactly where to put the project env regardless of `VIRTUAL_ENV`. Port the same fix to PowerShell. ## Bug 2: no post-install verification If the sync still misdirects for any other reason (uv version drift, filesystem quirk, user re-run scenarios), the installer reports success and the user only finds out by running `hermes` and getting an unhelpful traceback. Add a baseline-import probe that runs the venv's own python against the four packages every `hermes` invocation needs (`dotenv`, `openai`, `rich`, `prompt_toolkit`). On failure, throw with a recovery command tailored to whether a sibling `.venv\` exists. User report (Windows 11, Python 3.13.5, Hermes v0.13.0): manual repro steps were exactly this — `uv sync` landed in `.venv\`, recovered by junctioning `venv\` → `.venv\` to bridge the path mismatch. |
||
|
|
3955aefced
|
fix(install): use --extra all not --all-extras; drop lazy-covered extras from [all] (#24515)
* fix(install): use `--extra all` not `--all-extras`; drop lazy-covered extras from [all]
Two coupled fixes for the Windows install hang where uv sync built
python-olm from sdist and failed on missing make.
# Root cause: --all-extras vs --extra all (credit: ethernet)
`uv sync --all-extras` installs every key in [project.optional-
dependencies], bypassing the curated [all] extra entirely. So even
when [all] excluded [matrix], [rl], [yc-bench], etc., the installer
pulled them anyway because they were still defined as extras. On
Windows that meant python-olm (no wheel, needs make to build from
sdist) and the install died there.
The right flag is `--extra all` — install just the [all] extra's
contents, respecting curation. Empirically verified via dry-run:
--all-extras: pulls python-olm, mautrix, ctranslate2, onnxruntime,
atroposlib, tinker, wandb, modal, daytona, vercel,
python-telegram-bot, discord.py, slack-bolt,
dingtalk-stream, lark-oapi, anthropic, boto3,
edge-tts, elevenlabs, exa-py, fal-client, faster-
whisper, firecrawl-py, honcho-ai, parallel-web
--extra all: pulls none of those — just [all]'s curated set
Dockerfile already uses `--extra all` (with comment explaining the
gotcha) — knowledge existed; the gap was install.sh / install.ps1 /
setup-hermes.sh.
Sites fixed: scripts/install.sh L1118, scripts/install.ps1 L809,
setup-hermes.sh L245.
# Companion fix: drop lazy-covered extras from [all]
`tools/lazy_deps.py` already covers anthropic, bedrock, exa,
firecrawl, parallel-web, fal, edge-tts, elevenlabs, modal, daytona,
vercel, all messaging platforms (telegram/discord/slack/matrix/
dingtalk/feishu), honcho, and faster-whisper. They were ALSO in
[all], which defeats the whole point of lazy-install — fresh
installs eager-pulled them and inherited whatever was broken
upstream (the matrix → python-olm → no Windows wheel chain being
the proximate symptom).
[all] now contains only what genuinely can't be lazy-installed:
cron, cli, dev, pty, mcp, homeassistant, sms, acp, google, web,
youtube. Same trim applied to [termux-all]. New regression test
asserts the contract: every extra in LAZY_DEPS must NOT also appear
in [all].
# Companion fix: surface uv progress + errors
setup-hermes.sh's hash-verified path swallowed uv's stderr to a
tempfile, identical to the install.sh bug fixed in PR #24504. Same
fix applied: stream stderr through directly so users see live
progress instead of staring at a frozen prompt.
# Files
- pyproject.toml: trim [all] and [termux-all] to non-lazy extras only.
- scripts/install.sh: --all-extras → --extra all; trim _ALL_EXTRAS /
_PYPI_EXTRAS to match.
- scripts/install.ps1: --all-extras → --extra all; trim $allExtras /
$pypiExtras to match.
- setup-hermes.sh: --all-extras → --extra all; stream stderr.
- tests/test_project_metadata.py: invert matrix-in-[all] assertion;
add lazy-coverage contract test.
- uv.lock: regenerated.
# Validation
5/5 metadata tests pass. 37/37 in update_autostash + tool_token_
estimation. `uv lock --check` passes. Empirical dry-run confirms
`--extra all` excludes python-olm + RL chain on the new lockfile.
* fix(install): parse [all] from pyproject.toml instead of mirroring it
ethernet's review point: the previous patch left two hand-mirrored
copies of [all]'s contents (in install.sh's $_ALL_EXTRAS and
install.ps1's $allExtras). That guarantees future drift the next
time pyproject.toml's [all] changes.
Now both scripts parse pyproject.toml at install time using stdlib
tomllib (Python 3.11+, which the bootstrap step already requires).
Single source of truth. The only purpose of the parsed list is to
build the 'Tier 2: [all] minus broken extras' fallback spec — so we
parse, filter against $brokenExtras, and rebuild the .[a,b,c] spec.
Also: removed redundant fallback tiers.
Before: Tier 1 [all]
Tier 2 [all] minus broken
Tier 3 PyPI-only extras (no git deps)
Tier 4 [web,mcp,cron,cli,messaging,dev]
Tier 5 .
After: Tier 1 [all]
Tier 2 [all] minus broken
Tier 3 .
Tier 3 (PyPI-only) and Tier 4 (dashboard+core) used to dodge the [rl]
git+sdist deps and the [matrix] python-olm build. Both are no longer
in [all] post-2026-05-12 lazy-install migration, so the carve-out
tiers had no remaining content. Tier 4 also referenced [messaging],
which is now lazy-installed — the hardcoded fallback was actually
inconsistent with the new policy.
Defensive fallback: if tomllib parse fails (corrupted pyproject,
unexpected schema), Tier 2 collapses to '.[all]' (same as Tier 1) so
the broken-extras path becomes a no-op rather than crashing.
* fix(gateway): hide Matrix from setup picker on Windows
Matrix is the one messaging platform that has no working install path
on Windows: [matrix] -> mautrix[encryption] -> python-olm, which has
Linux-only wheels and needs make + libolm to build from sdist. The
[all] cleanup in this PR keeps mautrix out of fresh installs, but a
user who picked Matrix in 'hermes setup gateway' would still walk
into the same sdist build failure when the wizard tried to install
the extra.
Hide the option at the picker so users never get the chance to try.
The gate lives in _all_platforms() — single source of truth for the
setup wizard, the curses gateway-config menu, and any future picker.
Adapter loading at runtime is intentionally NOT gated: users who
already have MATRIX_* env vars set (e.g. config copied from a Linux
install) keep working if they somehow have python-olm available.
This is the lowest-friction fix — picker visibility only.
Tests cover linux/darwin/win32 and verify other platforms aren't
collateral damage.
|
||
|
|
c1eb2dcda7
|
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
|
||
|
|
59fbcd5ccb |
fix(install.ps1): strip UTF-8 BOM that broke [scriptblock]::Create
Commit
|
||
|
|
0548facc50 |
fix(windows): gateway status dedup + install.ps1 platform-SDK bootstrap
## Two residual Windows fixes that were hanging from earlier commits. ### 1. `hermes gateway status` reported 2 PIDs per gateway — TWO bugs compounded Diagnosed with psutil parent/child walk against live gateway PIDs: **Bug A (the real one): `_get_parent_pid` silently failed on Windows.** The helper shelled out to `ps -o ppid= -p <pid>`, which doesn't exist on Windows — `FileNotFoundError` → returns `None` → the ancestor walk terminated at `os.getpid()` alone. Consequence: the PID table scan in `_scan_gateway_pids` couldn't filter out `hermes gateway status`'s own launcher stub (a venv `pythonw.exe`/`python.exe` that matches the same `-m hermes_cli.main gateway` pattern as the gateway). Every status call saw "itself" as a second gateway. Fix: `_get_parent_pid` now calls `psutil.Process(pid).ppid()` first (psutil is a core dependency since |
||
|
|
52e497ce7f |
fix(windows installer): UTF-8 BOM, tiered extras, skip tinker-atropos by default
install.ps1 had three related problems that compounded into `hermes dashboard` failing to boot on Windows with 'No module named fastapi': 1. UTF-8 BOM missing. Windows PowerShell 5.1 (the default on Windows 10/11, which is what `irm | iex` runs under) reads files without a BOM as cp1252. install.ps1 has em-dashes, arrows, check marks, etc. — PS 5.1 mangled them and the file failed to parse. Added UTF-8 BOM so PS 5.1, PS 7, and the in-memory `irm | iex` path all read the file identically. 2. `uv pip install -e .[all]` had a single-tier silent fallback to bare `.` on any failure, with `2>&1 | Out-Null` swallowing the error. Any transient extras install failure (network hiccup, wheel build issue, etc.) would drop every optional extra including [web], and the installer would still print 'Main package installed'. Replaced with a four-tier fallback (.[all] -> PyPI-only extras -> dashboard+core -> bare) that prints output at every step and a targeted [web] verify+repair at the end so `hermes dashboard` specifically is never silently broken. 3. tinker-atropos was installed unconditionally after the main install. tinker-atropos/pyproject.toml pulls atroposlib and tinker from git+https://github.com/... which can fail on locked-down networks, flaky DNS, or rate-limited github.com and would half-install the venv. install.sh already skipped it by default with a one-liner for users who actually do RL training — install.ps1 now matches that behavior. Parse-checked clean under Windows PowerShell 5.1.26100.8115 (5318 tokens, 0 parse errors). |
||
|
|
03566e5124 |
fix(windows): auto-install Playwright Chromium + surface it in doctor
scripts/install.sh runs 'npx playwright install --with-deps chromium' on every Linux distro after the npm-install step, which is why browser tools Just Work on Linux. scripts/install.ps1 never did the equivalent step, so on native Windows installs check_browser_requirements() in tools/browser_tool.py would return False (no Chromium under %LOCALAPPDATA%\ms-playwright) and every browser_* tool got silently filtered out of the agent's tool schema — no error, no log entry, user just wondered why the tools didn't exist. Two-part fix: 1. scripts/install.ps1: after 'npm install' in InstallDir succeeds, run 'npx playwright install chromium'. Resolves npx via the same execution-policy-aware logic already used for npm (prefer npx.cmd next to npmExe, fall back to Get-Command). Surfaces a warning + manual-recovery hint when the install fails, matching install.sh behaviour for distros. 2. hermes_cli/doctor.py: after the agent-browser check, lazily import tools.browser_tool and reuse the exact same _chromium_installed() predicate check_browser_requirements() uses, so the doctor signal cannot drift from the runtime gate. Skip the check when Camofox / CDP override / a cloud provider / Lightpanda is configured (those bypass local Chromium). On missing Chromium, the hint is platform-correct: '--with-deps' on POSIX, plain 'install chromium' on win32. Verified on Windows 10: - 'npx playwright install chromium' completes successfully, drops Chrome Headless Shell under %LOCALAPPDATA%\ms-playwright - check_browser_requirements() flips from False -> True - 'hermes doctor' now prints either '✓ Playwright Chromium (browser engine)' or '⚠ Playwright Chromium not installed' + fix command - tests/hermes_cli/test_doctor.py: 38/38 pass - tests/tools/test_browser_chromium_check.py: 16/16 pass |
||
|
|
a2efad6bea |
fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch
Two fixes from teknium1's next install run:
1. **npm install: "npm.ps1 cannot be loaded because running scripts is
disabled on this system."** Get-Command's default PATHEXT ordering
picked up ``npm.ps1`` (the PowerShell shim) ahead of ``npm.cmd`` (the
batch shim). Most Windows users have PowerShell's execution policy
set to Restricted or RemoteSigned, which blocks unsigned ``.ps1``
files. ``npm.cmd`` has no such restriction and works universally.
Install-NodeDeps now detects when Get-Command returned npm.ps1, looks
for a sibling npm.cmd in the same directory, and prefers it. Prints
an info line so the user sees why. Emits a warning + hint if only
npm.ps1 is available.
2. **"Launch hermes chat now? Y" crashes with "%1 is not a valid Win32
application" on Windows installs.** The setup wizard calls
``relaunch(["chat"])``; ``resolve_hermes_bin()`` returned
``sys.argv[0]`` which was ``...\\hermes_cli\\main.py`` (because hermes
was launched via ``python -m hermes_cli.main`` during setup).
On Windows, ``os.access(script.py, os.X_OK)`` returns True because
PATHEXT lists ``.py`` when the Python launcher is registered — but
``subprocess.run([script.py, ...])`` can't actually execute a ``.py``
directly. CreateProcessW needs a real PE file.
Fixed ``resolve_hermes_bin`` to reject ``.py``/``.pyc`` argv0 values
on Windows specifically. Falls through to ``shutil.which("hermes")``
(hermes.exe in the venv Scripts dir) or, as a final fallback, lets
build_relaunch_argv build ``[sys.executable, "-m", "hermes_cli.main"]``
which is bulletproof. POSIX behaviour unchanged — ``.py`` argv0 with
a shebang + chmod+x is still a valid exec target there.
3 new tests cover the Windows paths: .py argv0 + hermes.exe on PATH →
returns hermes.exe; .py argv0 + no PATH → returns None (caller uses
python -m); POSIX + executable .py → still accepted.
26 relaunch tests pass, no POSIX regressions.
|
||
|
|
8f91d7bfa9 |
fix(windows): %1 install error, patch CRLF false-negative, SOUL.md BOM
Three bugs from teknium1's successful install + diagnostic chat on Windows:
1. **Start-Process -FilePath npm.cmd fails with "%1 is not a valid Win32
application".** Start-Process bypasses cmd.exe and PATHEXT to call
CreateProcessW directly, which refuses .cmd batch shims. Switched
Install-NodeDeps to use PowerShell's invocation operator (``& $npmExe
install --silent *> $log``) which DOES honour PATHEXT. Extracted a
``_Run-NpmInstall`` helper so the browser + TUI paths share the same
logic. Captures $LASTEXITCODE correctly, still surfaces the real
stderr on failure with a log-file pointer for the full output.
2. **patch tool returns false-negative on Windows due to CRLF round-trip.**
Root cause was upstream of patch: ``subprocess.Popen(..., text=True,
stdin=PIPE)`` on Windows translates ``\\n`` → ``\\r\\n`` when data flows
through the stdin pipe. ``_pipe_stdin()`` was writing the patch's
new_content string through a text-mode pipe, bash then wrote those
CRLF bytes to disk, and patch's post-write verify compared the
on-disk CRLF bytes against the original LF-only string — fail.
Fixed in two places for defense in depth:
- ``_pipe_stdin()`` now writes through ``proc.stdin.buffer`` with
explicit UTF-8 encoding, bypassing Python's newline translation on
every platform. No behaviour change on POSIX (bytes are identical)
but stops the CRLF injection on Windows.
- ``patch_replace``'s post-write verify normalizes CRLF→LF on both
sides before comparing, so even if some future backend still
translates newlines the patch tool won't report a bogus failure.
3. **SOUL.md gets a UTF-8 BOM on Windows PowerShell 5.1.** ``Set-Content
-Encoding UTF8`` on PS5.1 writes UTF-8 WITH a byte-order-mark (changed
in PS7 via ``utf8NoBOM``). Hermes's prompt-injection scanner sees
the BOM (U+FEFF invisible char) and refuses to load the file, so
SOUL.md's persona instructions never get applied.
Fixed by writing the file via ``[System.IO.File]::WriteAllText``
with an explicit ``UTF8Encoding($false)`` — BOM-free on every
PowerShell version.
All POSIX behaviour verified unchanged: 198 tests pass across
test_file_operations, test_local_env_cwd_recovery, test_code_execution,
test_windows_native_support, test_windows_compat.
|
||
|
|
d52e54170a |
fix(install.ps1): step out of $InstallDir before touching it + harden repo probe
User hit 'fatal: not in a git directory' on re-install because: 1. They ran Remove-Item -Force $env:LOCALAPPDATA\hermes -ErrorAction SilentlyContinue WHILE cd'd inside the install dir. Windows silently refuses to delete a directory any shell is currently cd'd inside and leaves the skeleton intact, but the -ErrorAction SilentlyContinue swallowed every partial-delete failure so they thought the wipe succeeded. 2. The installer then walked into Install-Repository, saw $InstallDir still exists with a partial .git stub, my repo-validity probe returned success (the probe's git rev-parse may have exit-code-zeroed in a way I didn't expect), and the real git fetch died with three 'fatal: not a git repository' errors. Two fixes belt-and-braces: - Main() now cds to $env:USERPROFILE at start if the current shell is inside $InstallDir. Harmless when the user ran from elsewhere; critical when they didn't. This alone fixes the user's case. - Install-Repository's 'is this a valid repo' probe now runs BOTH git rev-parse --is-inside-work-tree AND git status, resets $LASTEXITCODE before each to avoid picking up a stale 0, and requires BOTH to succeed. Also requires rev-parse's output to match 'true' (not just exit 0) to rule out exit-0-with-empty-output edge cases. |
||
|
|
c469a05ce5 |
fix(install.ps1): validate existing repo via git itself + clean up broken stubs
teknium1 hit "fatal: not in a git directory" on re-install when the previous install left a $InstallDir\.git stub that Test-Path matched but git didn't recognize (three "fatal: not a git repository" lines, then the script exited before touching anything). Two bugs: 1. Test-Path "$InstallDir\.git" was a weak gate — it matches .git whether it's a directory, file, symlink, submodule gitfile, OR a broken stub from a failed previous Remove-Item. Replaced with a real repo probe: Push-Location + git rev-parse --is-inside-work-tree + $LASTEXITCODE check. If git itself can't see a repo, we treat the directory as not-a-repo and fall through to fresh clone. 2. The original update path ignored $LASTEXITCODE. fetch/checkout/pull all emitted fatals but the script kept going. Now each command checks $LASTEXITCODE and throws with an explicit message. Also: when the directory exists but isn't a valid repo, the new code wipes it (Remove-Item -ErrorAction Stop) and falls through to fresh clone, instead of dying with the old "Directory exists but is not a git repository" error. If the wipe itself fails (file locked, hermes still running), we throw with a user-readable "close any programs using files in <dir>" hint. Refactored the function to use a $didUpdate flag instead of my earlier draft's early `return` — that was skipping the submodule init block at the bottom of the function. Both the update and fresh-clone paths now fall through to the submodule init step, which is correct (git pull doesn't auto-update submodules). PowerShell structural check: 21 functions defined, braces balanced. |
||
|
|
3601e20f47 |
fix(windows): use PortableGit (not MinGit), fix relaunch os.execvp crash, surface npm errors
Three real bugs from teknium1's first Windows install run:
1. **MinGit has no bash.exe.** MinGit is the minimal-automation Git for Windows
distribution — it ships git.exe but deliberately strips bash and the POSIX
coreutils. Installer logged "Could not locate bash.exe" and Hermes would
fail to run any shell command. Switched to PortableGit — the full Git for
Windows minus the installer UI. PortableGit ships bash.exe at
<root>\bin\bash.exe plus sh, awk, sed, grep, curl, ssh in usr\bin\. ARM64
variant is detected separately (PortableGit-*-arm64.7z.exe). 32-bit falls
back to MinGit-32-bit with a warning (PortableGit is 64-bit only).
PortableGit ships as a 7z self-extractor (56MB vs MinGit's 38MB). We
invoke it with `-o<target> -y` to extract silently — no 7z install needed,
it's self-contained.
Updated tools/environments/local.py::_find_bash candidate order to prefer
the PortableGit layout (<root>\bin\bash.exe) with the MinGit layout
(<root>\usr\bin\bash.exe) as a fallback so existing installs keep working.
2. **os.execvp "Exec format error" on Windows.** Setup wizard's "Launch
hermes chat now? Y" called `os.execvp(["hermes", "chat"])` which on
Windows can only swap to real Win32 .exe files — chokes with OSError(8)
on .cmd batch shims and Python console-script wrappers. Added a
win32 branch in hermes_cli/relaunch.py::relaunch() that uses
subprocess.run + sys.exit — functionally identical (user sees "hermes
exited, then new hermes started") with one extra PID in play. POSIX
path is UNCHANGED — still uses os.execvp for in-place replacement.
Catches OSError in the Windows branch and surfaces a "open a new
terminal so PATH picks up, then re-run hermes" hint instead of a
cryptic traceback.
3. **npm install failures silent on Windows.** The install.ps1 was invoking
`npm install --silent 2>&1 | Out-Null` inside a try/catch. PowerShell's
try/catch does NOT trigger on non-zero process exit codes — only on
unhandled .NET exceptions — so npm failing printed a generic "npm
install failed" with zero information about WHY. The silent pipe ate
the stderr.
Rewrote Install-NodeDeps to:
- Resolve npm.cmd via Get-Command (respects PATHEXT) instead of
relying on bare `npm` name resolution.
- Use Start-Process with -PassThru to capture the actual exit code.
- Redirect stderr to a temp log and surface the first ~800 chars of
the real npm error when install fails, plus the log path for the
full text.
- Fail loudly with the right exit code instead of a misleading success.
- Bail cleanly with a helpful message when npm isn't on PATH at all.
4. **"True" printing to console after Node check.** `Test-Node` returns $true;
installer called it as a bare statement (no assignment, no cast). PowerShell
prints bare return values. Wrapped the call in `[void](Test-Node)`.
## Tests
- Added 3 new tests in tests/hermes_cli/test_relaunch.py covering the
Windows branch: subprocess is called (not execvp), child exit code
propagates, OSError surfaces a helpful message. All 23 tests pass
(20 existing + 3 new).
- 77 Windows-compat tests still pass, POSIX behaviour unchanged.
|
||
|
|
b7fe7ed7bd |
feat(windows-install): bundle portable MinGit instead of relying on winget
User hit a real failure case: their system Git was in a half-installed state
(can neither uninstall nor reinstall) and winget refused to work around it.
We were one step away from shipping an installer that would have left users
with exactly the problem he already had.
What other agents do (reality check):
- Claude Code: requires pre-installed Git; breaks if user doesn't have it.
- OpenCode, Codex: don't need bash at all — PowerShell-first design.
- Cline: uses whatever shell VSCode is configured with; installs nothing.
None of them solve the "broken system Git" problem. We need to own our Git.
Changes:
- scripts/install.ps1::Install-Git: dropped winget path entirely. Now:
(1) use existing git if present; (2) download portable MinGit from the
official git-for-windows GitHub release to %LOCALAPPDATA%\hermes\git.
No winget, no admin, no Windows installer registry, no system impact.
- Added %LOCALAPPDATA%\hermes\git\{cmd,usr\bin} to User PATH so git + bash
+ POSIX coreutils (which, env, grep, …) resolve in fresh shells.
- tools/environments/local.py::_find_bash: reorder so Hermes' portable
MinGit install is checked BEFORE falling through to shutil.which("bash")
or system install locations. This way a broken system Git can't
hijack the bash lookup.
- README + installation docs reworded to reflect the new story: "portable
Git Bash, isolated from any system install, recoverable via rm -rf if it
ever breaks."
Recoverability: if Hermes' Git install ever breaks, ``Remove-Item %LOCALAPPDATA%\hermes\git``
and re-run the installer — no system impact, no uninstall drama, no winget
to fight with.
|
||
|
|
7242afaa5f
|
chore: defer WhatsApp bridge install to first use (#12992)
Remove eager npm install of @whiskeysockets/baileys during install.sh, install.ps1, and Docker build. The bridge deps are already installed on-demand by `hermes whatsapp` (Step 4 checks for node_modules and runs npm install if missing), so there is no need to pay the cost at initial install for users who never use WhatsApp. |
||
|
|
a3cfb1de86 | feat: auto install tui deps | ||
|
|
79aeaa97e6 | fix: re-order providers,Quick Install, subscription polling | ||
|
|
ad1bf16f28
|
chore: remove all remaining mini-swe-agent references
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804): - Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py - Remove mini-swe-agent health check from hermes doctor - Remove 'minisweagent' from logger suppression lists - Remove litellm/typer/platformdirs from requirements.txt - Remove mini-swe-agent install steps from install.ps1 (Windows) - Remove mini-swe-agent install steps from website docs - Update all stale comments/docstrings referencing mini-swe-agent in terminal_tool.py, tools/__init__.py, code_execution_tool.py, environments/README.md, environments/agent_loop.py - Remove mini_swe_runner from pyproject.toml py-modules (still exists as standalone script for RL training use) - Shrink test_minisweagent_path.py to empty stub The orphaned mini-swe-agent/ directory on disk needs manual removal: rm -rf mini-swe-agent/ |
||
|
|
4766b3cdb9 |
fix: fall back to ZIP download when git clone fails on Windows
Git for Windows can completely fail to write files during clone due to antivirus software, Windows Defender Controlled Folder Access, or NTFS filter drivers. Even with windows.appendAtomically=false, the checkout phase fails with 'unable to create file: Invalid argument'. New install strategy (3 attempts): 1. git clone with -c windows.appendAtomically=false (SSH then HTTPS) 2. If clone fails: download GitHub ZIP archive, extract with Expand-Archive (Windows native, no git file I/O), then git init the result for future updates 3. All git commands now use -c flag to inject the atomic write fix Also passes -c flag on update path (fetch/checkout/pull) and makes submodule init failure non-fatal with a warning. |
||
|
|
354af6ccee |
chore: remove unnecessary migration code from install.ps1
No existing Windows installations to migrate from. |
||
|
|
c9afbbac0b |
feat: install to %LOCALAPPDATA%\hermes on Windows
Move Windows install location from ~\.hermes (user profile root) to %LOCALAPPDATA%\hermes (C:\Users\<user>\AppData\Local\hermes). The user profile directory is prone to issues from OneDrive sync, Windows Defender Controlled Folder Access, and NTFS filter drivers that break git's atomic file operations. %LOCALAPPDATA% is the standard Windows location for per-user app data (used by VS Code, Discord, etc.) and avoids these issues. Changes: - Default HermesHome to $env:LOCALAPPDATA\hermes - Set HERMES_HOME user env var so Python code finds the new location - Auto-migrate existing ~\.hermes installations on first run - Update completion message to show actual paths |
||
|
|
83fa442c1b |
fix: use env vars for git windows.appendAtomically on Windows
The previous fix set git config --global before clone, but on systems where atomic writes are broken (OneDrive, antivirus, NTFS filter drivers), even writing ~/.gitconfig fails with 'Invalid argument'. Fix: inject the config via GIT_CONFIG_COUNT/KEY/VALUE environment variables, which git reads before performing any file I/O. This bypasses the chicken-and-egg problem where git can't write the config file that would fix its file-writing issue. |
||
|
|
1900e5238b |
fix: git clone fails on Windows with 'copy-fd: Invalid argument'
Git for Windows can fail during clone when copying hook template files
from the system templates directory. The error:
fatal: cannot copy '.../templates/hooks/fsmonitor-watchman.sample'
to '.git/hooks/...': Invalid argument
The script already set windows.appendAtomically=false but only AFTER
clone, which is too late since clone itself triggers the error.
Fix:
- Set git config --global windows.appendAtomically false BEFORE clone
- Add a third fallback: clone with --template='' to skip hook template
copying entirely (they're optional .sample files)
|
||
|
|
ddae1aa2e9 |
fix: install.ps1 exits entire PowerShell window when run via iex
When running via 'irm ... | iex', the script executes in the caller's session scope. The 'exit 1' calls (lines 424, 460, 849-851) would kill the entire PowerShell window instead of just stopping the script. Fix: - Replace all 'exit 1' with 'throw' for proper error propagation - Wrap Main() call in try/catch so errors are caught and displayed with a helpful message instead of silently closing the terminal - Show fallback instructions to download and run as a .ps1 file if the piped install keeps failing |
||
|
|
16274d5a82 |
fix: Windows git 'unable to write loose object' + venv pip path
- Set 'git config windows.appendAtomically false' in hermes update command (win32 only) and in install.ps1 after cloning. Fixes the 'fatal: unable to write loose object file: Invalid argument' error on Windows filesystems. - Fix venv pip fallback path: Scripts/pip on Windows vs bin/pip on Unix - Gate .env encoding fix behind _IS_WINDOWS (no change to Linux/macOS) |
||
|
|
245c766512 |
fix: remove 2>&1 from git commands in PowerShell installer
Root cause: PowerShell with $ErrorActionPreference = 'Stop' only creates NativeCommandError from stderr when you CAPTURE it via 2>&1. Without the redirect, stderr flows directly to the console and PowerShell never intercepts it. This is how OpenClaw's install.ps1 handles it — bare git commands with no stderr redirection. Wrap SSH clone attempt in try/catch since it's expected to fail (falls back to HTTPS). |