mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-25 11:02:03 +00:00
50 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
fcf6cb3d73
|
fix(docker): supervised gateway uses --replace to take over stale holder (NS-505) (#47555)
* fix(docker): supervised gateway uses --replace to take over stale holder Inside the s6 container image the per-profile gateway service rendered a bare `hermes gateway run` (no --replace). When a gateway is started OUTSIDE s6 — a stray shell `hermes gateway run`, an agent action, or the Open WebUI helper (scripts/setup_open_webui.sh) — it grabs the per-HERMES_HOME PID lock first. The supervised slot then execs the bare `gateway run`, hits the "Another gateway instance is already running" guard, exits non-zero, and s6 restarts it: a restart loop that floods the log every ~12s and never binds. The container looks up but the gateway is permanently down, and dashboard-only users (no shell) cannot recover. Render the supervised run script as `gateway run --replace` so s6 is authoritative for its slot: it reaps the stale holder via the hardened takeover path (takeover marker + SIGTERM->SIGKILL-with-confirmation + scoped-lock cleanup in gateway/run.py) and binds. This matches the systemd service path, which already builds its argv with --replace (_build_gateway_argv / 'nohup hermes gateway run --replace'), and the intent already documented in _maybe_redirect_run_to_s6_supervision. The existing HERMES_S6_SUPERVISED_CHILD sentinel still prevents the run->start->run redirect recursion. Each profile is scoped to its own HERMES_HOME and s6 guarantees one supervised instance per slot, so there is no legitimate supervised sibling for --replace to clobber. Reported via beta (NS-505): gateway.log showed PID 17907 'running (manual process)' with the guard error repeating every ~12s on v2026.6.5. Adds a regression test asserting every gateway-run exec line in the rendered script (default + named profile, both privilege branches) carries --replace, and updates the existing render-script assertion. * fix(ci): remove stray .venv symlink committed into repo The PR's commit accidentally tracked a .venv symlink pointing at the developer's local venv (mode 120000 -> /home/ben/nous/hermes-agent/.venv). The CI test/e2e/build jobs run `uv venv` to create .venv and failed with `failed to create directory .venv: File exists (os error 17)` because the checkout already contained the symlink. All test shards aborted in <15s during setup, before any test ran. Untrack the symlink and add a bare `.venv` entry to .gitignore (the existing `.venv/` rule only matches a directory, so a symlink slipped through). |
||
|
|
d62979a6f3
|
feat(desktop): composer status stack, live subagent windows, editable prompts (#44630)
* feat(desktop): session-scoped status stack + kill new-window theme flash Stack subagents, background tasks, and the queue into one collapsible "sink" above the composer, reusing the queue's chrome so every status reads as one piece. Extracts shared StatusSection / StatusRow / TerminalOutput primitives and a unified $statusItemsBySession store (subagents mirrored, background owned here, merged + grouped for render). Renames BrailleSpinner → GlyphSpinner now that it drives more than braille. Separately, fix the white flash on every new/cmd-clicked window: macOS `vibrancy` paints an NSVisualEffectView that follows the OS appearance and ignores `backgroundColor`, so a dark app on a light-mode Mac flashed white until the renderer painted over it. Pin `nativeTheme.themeSource` to the app theme (persisted to userData so cold launches paint right before the renderer loads), hold windows with `show:false` until `ready-to-show`, and pre-paint the themed background via an inline script before the bundle runs. * feat(desktop): dock the slash popover to the composer via one shared fill var The slash·@ popover (and ? help) now docks onto the composer's edge with the same chrome as the queue/status stack — rounded outer corners, fused borderless edge, no shadow — but keeps its own narrow width. Surface + drawer paint a single --composer-fill var; the state ladder (rest / scrolled / focused / drawer-open) lives once in styles.css on [data-slot='composer-root']. The :has() drawer-open rule is last and forces an opaque fill, since translucent glass sampling different backdrops (thread vs fade gradient) can never match. This replaces the focus-within !important override that repainted the surface behind every previous matching attempt. Also drop the chevron column from the project file tree — the folder open/closed icon already carries the expand state. * feat(desktop): base inset for file tree rows (post-chevron alignment) * feat(desktop): wire the status stack's background tasks to the real process registry The background group was UI-only (dev-mock seeded). Now it's live e2e: - tui_gateway: new session-scoped `process.list` (registry snapshot filtered by the session's session_key, plus a 4KB output tail for the inline terminal viewer) and `process.kill` (single process, ownership-checked — unlike process.stop's kill_all). - Renderer: `reconcileBackgroundProcesses` syncs snapshots into the store layout-stably — rows keep their position when state flips (never re-sort), new processes append, unchanged rows keep object identity so memoised rows skip re-rendering, and a dismissed-set stops the registry's retained finished procs from resurrecting X-ed rows. - Refresh triggers: session open, terminal/process tool.complete, status.update(kind=process) from the gateway's notification poller, and a 5s poll armed only while a running row is visible (catches silent exits). - Stop = real `process.kill` + optimistic dismiss; Dismiss = client-side with resurrection guard. - Re-keyed the stack to the RUNTIME session id: it was keyed by the stored session id, where neither subagent events nor process.list would ever land. - Deleted dev-status-mocks.ts (__hermesStatusMocks) — no more seed shit. Reconcile invariants covered in store/composer-status.test.ts. * feat(desktop): todos + openable subagents in the status stack, self-healing file tree - todo lists move out of the inline chat panel into the composer status stack (checklist icon, dashed ring = pending, spinner = in progress, check = done), fed live from todo tool events and seeded from history on session open - subagent rows carry the child's real session id end-to-end (delegate_tool → gateway → renderer) so clicking one opens ITS session window - status stack publishes its measured height so the thread's bottom clearance grows with it; card paints the shared --composer-fill so focused/scrolled states match the composer exactly - file tree self-heals: ENOENT roots retry on a 3s cadence + Try again button, and the main process expands ~ in IPC paths (gateway cwds arrive as ~/...) - composer drag-drop of tree entries inserts inline refs instead of attachments * fix(desktop): file tree falls back to the workspace dir when a session's cwd is gone Sessions record their launch cwd; deleted worktrees leave that path dead, so opening such a session swapped the tree from the default workspace to a directory that ENOENTs forever — the 3s retry just spun on it. On a root read error the tree now asks main to sanitize the cwd (prefers the configured default project dir), displays that fallback, and quietly re-probes the original path so it switches back if the dir reappears. * feat(desktop): working restore-checkpoint button on past user prompts The discard icon on hover of a past user bubble was decorative — clicking did nothing. It's now a real control: a confirmation dialog explains that everything after the prompt is removed, then the session rewinds to that turn and reruns the same prompt (prompt.submit with truncate_before_user_ordinal, the same mechanism the edit composer uses). Failures rethrow into the dialog's inline error instead of toasting. * fix(desktop): show the restore-checkpoint button on the latest user prompt too Restoring the most recent prompt is just 'retry this turn' — no reason to exclude it. Stop still takes the slot while the turn is running. * fix(desktop): finished todo lists clear themselves out of the status stack A list whose every item is completed/cancelled lingers ~4s so the final checkmark is visible, then the todo group drops out of the stack. A fresh active list arriving within the linger cancels the scheduled clear. * chore(desktop): drop dead editableCheckpoint copy, terser restore confirm * fix(desktop): rewind clears the abandoned timeline's todos + background Restoring to (or editing) an earlier prompt rewinds the conversation, but the todos and background processes spawned by the now-discarded turns kept showing in the status stack — and the real background processes kept running. Both rewind paths now clear the session's todo rows and kill + drop its background processes before the fresh run repopulates them. Also drops the click-to-edit clamp transition, which flashed a half-expanded bubble on the way into the edit composer. * feat(desktop): user messages are always editable; edit/restore revert mid-stream The bubble is now always click-to-edit — even while a turn streams — instead of going inert during a run. Sending an edit acts like restore: it rewinds to that prompt and re-runs with the new text. Both edit and restore can fire mid-stream now; the gateway refuses prompt.submit while a turn runs (4009 "session busy"), so they interrupt the live turn first and retry the submit until the cooperative interrupt winds it down. Restore (re-run as-is) shows on every prompt except the latest running one, which keeps the Stop button. * fix(desktop): label preview-pane ⌘L selections with the filename, not "zsh" The terminal owns a global ⌘/Ctrl+L "send selection to composer" shortcut, so selecting text in the file preview pane and hitting it fell through to the terminal handler — which imported the right text but labelled the composer ref "zsh:N lines" off the shell name. When the selection isn't an xterm selection, label it with the previewed file instead. * fix(desktop): ⌘L on a preview line selection inserts the @line ref, like dragging The source preview lets you select lines in the gutter and drag them into the composer as an @line:path:start-end ref. ⌘/Ctrl+L now does the same when a line selection is active — it drops the identical ref instead of falling through to the terminal's global handler (which grabbed the native text selection and sent a bogus terminal block). Capture-phase + stopPropagation so it wins; with a line selection there's no native selection, so the terminal handler stays out of it. * chore: gitignore apps/desktop/demo/ scratch output The desktop demo prompt writes demo/*.txt during recorded walkthroughs; it's throwaway, never part of the app. Ignore it so it stops cluttering git status. * feat(desktop): subagent watch windows, hard stop, sidebar hygiene Child-session mirror for live subagent windows, delegate sessions tagged and excluded from the sidebar, composer focus/stop polish, and WS stall resilience on the gateway transport. * refactor: DRY delegate SQL + trim status-stack noise Extract shared listable-child and delegate-delete helpers in hermes_state, collapse cancelRun busy release, and cut comment bloat in resume/status paths. * fix(desktop): hide orphaned subagent sessions in sidebar Cascade-delete all ephemeral children on parent delete (not just tagged rows), run v16 backfill to tag legacy orphans, and record new delegates as source=subagent. * fix: restore orphan contract for untagged children + lazy session eviction Cascade-delete only _delegate_from-tagged rows (v16 backfill covers legacy), walk marker chains recursively with FK-safe orphaning, gate lazy watch sessions out of the still-starting eviction exemption via an explicit flag, pass session_id to _make_agent only when resuming, and hide source=subagent from session search. * fix(gateway): gate child mirror off upgraded sessions + age out stale run entries Review findings: the mirror could interleave synthetic events with a real native stream once a watch window upgrades (prompt.submit builds an agent), and a lost subagent.complete left _active_child_runs pinning running=true forever. Mirror now stops when the live session owns an agent; liveness reads ignore entries older than an hour. * fix(gateway): reject prompt.submit into a watch session while its child runs A lazy watch session's running flag is False (the run lives in the parent turn), so typing mid-run sailed past the busy guard and built a second agent racing the in-flight child on the same stored session. Busy error until the run completes; afterwards the submit upgrades into a normal conversation. * refactor(gateway): DRY watch-resume payload + compose listable-child SQL Fold the duplicated child-run busy overlay into one _reuse_live_payload helper across both resume reuse paths, collapse the twin mirror early-returns, and build _LISTABLE_CHILD_SQL from _BRANCH_CHILD_SQL instead of restating it. * fix(desktop): clip horizontal overflow on sidebar scroll areas Add overflow-x-hidden alongside overflow-y-auto on session list scrollers and the shared SidebarContent primitive — vertical scroll unchanged. |
||
|
|
cb29e8a82e |
refactor(cron): rebrand Cron Recipes -> Automation Blueprints
Product rename across every surface: module/file names (blueprint_catalog, tools/blueprints, blueprint_cmd), slash command /cron-recipe -> /blueprint (alias /bp), dashboard API /api/cron/blueprints, desktop deep-link hermes://blueprint/<key>, docs catalog page + extract script, and the skill frontmatter block metadata.hermes.blueprint. No behavior change. |
||
|
|
1593ca5406 |
feat(cron): Cron Recipes — parameterized automation templates across every surface
A 'recipe' is a one-place definition of an automation that every surface renders natively. The slot schema (cron/recipe_catalog.py) is the single source of truth; four renderers consume it, and all paths end at the same cron.jobs.create_job — no second job engine. Form where there's a screen, conversation where there's a chat line: - Dashboard / GUI app: a Recipes sub-tab on the Cron page renders each recipe's typed slots as a form (time-picker, enum dropdown, free-text); submit POSTs /api/cron/recipes/instantiate which fills + creates the job. - CLI / TUI / messengers: /cron-recipe lists the catalog, shows a recipe's fields, or fills + creates from a pasted 'key slot=val' command. The shared handler (hermes_cli/cron_recipe_cmd.py) names any missing/invalid slot so the agent can ask a targeted follow-up. - Docs: a generated Cron Recipes catalog page (website, .mdx + React cards) shows each recipe with a copy-paste command and a 'Send to App' button. - Desktop: a hermes:// URL scheme (Electron single-instance lock + setAsDefaultProtocolClient + open-url/second-instance) routes hermes://cron-recipe/<key>?slot=val into the chat composer pre-filled. Typed slots (time/enum/text/weekdays) with defaults: users never type raw cron — recipes parameterize time-of-day and weekday sets and translate to cron expressions; a free-text 'schedule' slot is the full-flexibility escape hatch. Consent-first throughout: nothing schedules without an explicit submit or send. Core: - cron/recipe_catalog.py — CronRecipe + RecipeSlot, 5 curated recipes, recipe_form_schema / recipe_slash_command / recipe_deeplink / recipe_catalog_entry renderers, fill_recipe (validate + translate to create_job kwargs). - hermes_cli/cron_recipe_cmd.py — shared /cron-recipe handler (CLI + TUI + gateway never drift). CommandDef + dispatch in commands.py / cli.py / gateway/run.py. Dashboard: GET /api/cron/recipes + POST /api/cron/recipes/instantiate (web_server.py), CronRecipes.tsx gallery+form, Segmented sub-tab on CronPage, api.ts methods + types. Desktop: hermes:// scheme end to end (main.cjs deep-link router + ready-queue, preload onDeepLink/signalDeepLinkReady, global.d.ts types, desktop-controller composer prefill, electron-builder protocols key). Docs: extract-cron-recipes.py generator wired into prebuild.mjs, cron-recipes-catalog.mdx + CronRecipesCatalog React component, sidebar entry. Generated index json gitignored like skills.json. Tests: 23 core (catalog/slots/schedule-resolution/validation/renderers/command handler/generator) + 5 web_server endpoint tests. E2E verified end to end: slot fill -> create_job -> persisted job with correct schedule/deliver/origin. |
||
|
|
a5c32cdf30
|
fix(update): self-heal a venv left half-built by an interrupted install (#42172)
* fix(update): self-heal a venv left half-built by an interrupted install An update killed mid dependency-install (Ctrl-C, terminal close, WSL OOM) could leave the venv with pip wiped and core deps (e.g. Pillow) missing, with no automatic recovery — the user had to manually run ensurepip + reinstall. Drop an install-scoped .update-incomplete breadcrumb right before the dep install and clear it only after core-dependency verification passes. On the next launch (any command except 'update' itself), if the marker is present, unconditionally bootstrap pip via ensurepip then re-run the .[all] install + verification, then clear the marker. Failure leaves the marker for retry and prints the manual recovery command. Never raises — recovery cannot block launch. * fix(update): address review — stderr-only recovery output, single-flight lock, gitignore marker - Route all recovery output (status lines + streamed pip/uv install via fd-level dup2) to stderr so protocol-on-stdout launches (hermes acp) never get install noise on the JSON-RPC stream. - Single-flight O_EXCL lockfile (.update-incomplete.lock) so a gateway start + CLI launch (or two profiles) can't run concurrent installs into the shared venv; stale locks (>1h) are broken for the next launch. - gitignore .update-incomplete + lock so source-tree installs keep a clean git status and update's autostash skips them. - Document why the loose 'update' argv substring match is intentional (over-match defers one launch; under-match would race the real update). - 4 new tests: lock held → skip, stale lock broken, lock released, output lands on stderr only. |
||
|
|
b459bac02c |
fix(cli): gitignore Desktop bootstrap marker so hermes update stops autostashing it
The Desktop bootstrap installer writes `.hermes-bootstrap-complete` into the managed git checkout root. Because it wasn't gitignored, `hermes update`'s `git stash push --include-untracked` treated it as a local change and created an autostash on every run — prompting the user to restore "local changes" that were really Hermes-managed runtime state (and risking the marker getting stranded in a stash, which re-triggers Desktop bootstrap). Add the marker to .gitignore; `git stash -u` and `git status --porcelain` both skip ignored files, so the updater now sees a clean tree. Fixes #38529 |
||
|
|
1f347ee543 |
fix(uv): move venv aside instead of gutting it in place on Windows rebuild
hermes update can brick a Windows install. When 'hermes update --force' runs past the concurrent-process guard, rebuild_venv runs while the venv is still in use: shutil.rmtree(ignore_errors=True) deletes site-packages + certifi's cert bundle but can't remove the locked python.exe, leaving a half-gutted venv that uv venv then refuses to overwrite. Every later HTTPS call dies with FileNotFoundError for the missing cacert and there is no recovery. --clear alone (the |
||
|
|
64202200a6
|
chore: remove committed RELEASE_v*.md changelogs from repo root (#37855)
These per-release changelog files are transient working files used only to feed `gh release create --notes-file` at release time; the GitHub Release itself permanently stores the published notes. They were never a build artifact (no package-data glob, no MANIFEST.in include, no CI reference) and don't belong in the tracked tree. - Delete all 15 (v0.2.0 through v0.15.1) - Add RELEASE_v*.md to .gitignore so an accidental `git add -A` can't recommit them The hermes-release skill is updated separately to write the changelog to /tmp/ for the whole release process and never stage it. |
||
|
|
51c68d4ab1
|
Add Hermes desktop app (#20059)
* feat: better composer etc * docs: add desktop and dashboard run instructions * fix(desktop): address security scan findings * fix(dashboard): resolve @nous-research/ui path under npm workspaces The sync-assets prebuild step shelled out to 'cp -r node_modules/@nous-research/ui/dist/fonts ...' with a path relative to apps/dashboard/. That works only when the dep is installed locally in the dashboard workspace, but 'npm install' at the repo root (the documented setup — see apps/desktop/README.md) hoists shared deps to the root node_modules under npm workspaces. The relative cp then fails with 'No such file or directory', sync-assets exits 1, the Vite build aborts, and 'hermes dashboard' surfaces a generic 'Web UI build failed' message. Replace the shell one-liner with scripts/sync-assets.cjs, which walks up from the dashboard directory looking for node_modules/ @nous-research/ui — working in both the hoisted (workspaces) and co-located (standalone) layouts. Also guards against a missing dist/fonts or dist/assets with a clearer error pointing at a rebuild of the UI package rather than silently copying nothing. * feat(desktop): support connecting to a remote Hermes backend Add HERMES_DESKTOP_REMOTE_URL and HERMES_DESKTOP_REMOTE_TOKEN env vars that, when set, short-circuit the local-child spawn in startHermes() and connect the Electron renderer to an already- running 'hermes dashboard' server reachable over the network. Motivating use case: WSL2 users who want to run the Hermes core (agent loop, tools, filesystem access) inside their WSL distribution while rendering the Electron GUI on native Windows. Before this change, the desktop app always spawned a local Python child on the same host as the renderer, which doesn't cross the WSL/Windows boundary. The remote path reuses waitForHermes() as a liveness probe (/api/status is in the backend's public endpoint allowlist), so the connection is only returned once the backend is actually ready. WebSocket URL derivation picks ws:// or wss:// based on the input scheme. URL validation rejects non-http(s) schemes and requires both env vars together to avoid a half-configured connection that would silently fall through to the spawn path. No behaviour change when the env vars are unset — the default local-spawn flow is untouched. Typical usage: # in WSL2 hermes dashboard --tui --no-open --host 0.0.0.0 --port 9119 --insecure # on Windows set HERMES_DESKTOP_REMOTE_URL=http://localhost:9119 set HERMES_DESKTOP_REMOTE_TOKEN=<session token> set HERMES_DESKTOP_IGNORE_EXISTING=1 (launch Hermes desktop) * ci(desktop): automate desktop releases Add GitHub Actions release channels for signed desktop installers and document the stable/nightly download paths. * feat: file tabs * refactor(desktop): tighten right-rail tab close API Promote closeRightRailTab/closeActiveRightRailTab as the single public entry point. Drops the activeTabRef + handleCloseDocument indirection in ChatPreviewRail, the unused $rightRailHasContent atom, and the legacy dismissFilePreviewTarget alias. -70 LOC. * feat(desktop): polish composer pill toward reference look Solid foreground-on-background send/voice-conversation circle (black-on-white in light, white-on-black in dark) anchors the right edge as the primary CTA instead of the orange theme primary. Bumps the primary control to 2.125rem so it visually outranks the ghost mic/plus controls. Opens up the surface padding (0.625rem x / 0.5rem y) so the input row breathes around its controls, and nudges the corner radius from 20 to 24px for a slightly pill-ier silhouette. LiquidGlass distortion is preserved. * feat(desktop): add startup and onboarding flow Add phase-based desktop boot progress, fresh-install sandbox testing, and first-run provider credential onboarding so packaged installs can start cleanly without manual settings detours. * fix(desktop): gate prompts on provider setup Show the desktop provider onboarding flow before prompt submission when no inference provider is configured, preventing fresh installs from falling through to backend credential errors. * fix(desktop): surface provider onboarding from session warnings Propagate credential warnings through session runtime info and open desktop onboarding whenever a session reports no usable provider, so unconfigured installs cannot fall through to prompt errors. * fix(desktop): route gateway provider errors to onboarding The "No inference provider configured" auth error reaches the renderer through gateway error events, not the prompt.submit promise; the previous patch only caught the latter, so the error toast still surfaced and onboarding never opened. Also strip credential-shaped env vars from the test:desktop:fresh sandbox so the packaged backend can't see provider keys leaking from the launching shell. * fix(desktop): use strict runtime check to drive onboarding setup.status returned True whenever any provider auth state was discoverable, including indirect fallbacks like a gh-CLI Copilot token. That made desktop think the user was set up while the agent's actual resolve_runtime_provider call still raised AuthError, leaving the user with a useless toast and no onboarding. Add a setup.runtime_check gateway method that runs the same resolver the agent uses on session creation, and switch the desktop onboarding overlay and prompt precheck to use it. * feat(desktop): OAuth-first onboarding using existing dashboard provider API Replace the engineer-flavored API key form with a Sign-in-first onboarding overlay that uses the dashboard's existing /api/providers/oauth catalog and PKCE/device-code endpoints (Anthropic, Nous, OpenAI Codex, etc.). API key entry is now a fallback tab with friendly provider names instead of env var prefixes, and the loud raw resolver error is gone in favor of a one-line welcome message. * fix(desktop): polish onboarding provider list Reorder OAuth providers so Nous Portal is first, give the segmented Sign in / API key control equal column widths, and replace the engineer-flavored backend names like "Anthropic (Claude API)" / "MiniMax (OAuth)" with friendlier in-app titles. External-CLI providers now show a softer subtitle and an external-link icon instead of a chevron. * refactor(desktop): split onboarding overlay into store + view Move the OAuth state machine, runtime check, copy-to-clipboard, and api-key save into store/onboarding.ts (matching the boot.ts pattern), leaving the overlay as a presentation layer that subscribes via useStore. Tabs are now table-driven, child panels read flow from the store instead of prop-drilling, and the polling/PKCE/error/success branches share a small Status atom. * fix(desktop): external CLI providers + center mode tabs External-CLI providers (Claude Code, Qwen Code) now open an in-overlay panel with the CLI command, copy button, and an "I've signed in" recheck instead of firing an invisible toast. Center the Sign in / API key tab control so it sits under the heading instead of hugging the left edge. * fix(desktop): drop onboarding tabs for an inline link, group device-code waiting state Replace the Sign in / API key tab pair with an "I have an API key" footer link under the OAuth provider list, with a "Back to sign in" affordance inside the API key form. Group the device-code "Waiting for you to authorize..." status next to the Cancel button so the alignment matches the action. * refactor(desktop): tighten onboarding store + overlay Drop the dead isOnboardingBusy/BUSY set, factor the catch-fallback dance into safeReq, and share a single reloadAndConnect helper between PKCE submit, device-code success, external recheck, and api-key save. In the overlay, extract Step / CodeBlock / FlowFooter / CancelBtn / DocsLink atoms so the four sign-in panels share the same chrome instead of repeating it inline. Net effect: fewer literal divs, one place to touch the spacing, and the code-block + footer rows are reusable across future flows. * fix(desktop): mount onboarding from frame 1 to kill the FOUT Default onboarding.configured to null (unknown until the runtime check resolves) and have the onboarding overlay render whenever it's not yet confirmed true. The boot overlay now yields to it, so the very first paint is the Welcome card with a "While we get you set up..." progress strip instead of a flash of the chat shell between boot dismiss and onboarding mount. The picker swaps in cleanly once the gateway opens and the runtime check confirms the user is not configured. Already-configured users see the same prep card briefly while their existing runtime warms up, then the overlay dismisses without touching the chat shell. * fix(desktop): top-align empty sessions placeholder The "Start a chat to build your history." empty state used a min-h-35 grid place-items-center container, which floated the text in a tall dead zone. Render it as a flat paragraph that sits right under the section header like the empty pinned state does. * refactor(desktop): drop dead boot overlay Onboarding overlay subsumes the boot card now that it mounts from frame 1 and renders boot progress inline. The standalone DesktopBootOverlay is unreachable in every flow (yields whenever onboarding has not confirmed configured, dismisses once it has). * fix(desktop): hide pinned/recents sections until first session A fresh sidebar showed the Pinned and Recent chats headers with floating empty-state copy underneath. Drop both sections (and the now-orphan SidebarEmptySessionState) when there are no sessions yet — they reappear after the first chat. Skeletons during initial load are unchanged. * feat(gui): route embedded TUI through dashboard gateway (#21979) Inject HERMES_TUI_GATEWAY_URL into dashboard PTY sessions so embedded ui-tui instances attach to the in-process websocket gateway, with coverage for the new env wiring. * Add desktop remote gateway settings Make the desktop gateway connection configurable from settings so local remains the default while remote backends can be saved, tested, and applied without environment variables. * feat(gui): first-class Messaging page + gateway menu redesign - Add Messaging page to the desktop app with per-platform setup, status, and inline guidance. Catalog derives from gateway.config Platform enum + plugin registry, so every messaging adapter the CLI supports (Telegram, Discord, Slack, Mattermost, Matrix, WhatsApp, Signal, BlueBubbles, Home Assistant, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ, Yuanbao, API server, Webhooks, plugins) shows up without per-platform code. - New REST endpoints: GET /api/messaging/platforms, PUT and POST /test on the same path. Secrets go through the existing .env pipeline; enable/disable writes config.yaml. - Replace gateway statusbar dropdown with a richer panel: status row, icon-only restart + system-panel actions, recent activity (with timestamps trimmed in display, full text on hover), platform list. - Auto-poll the messaging page every 6s (paused when hidden) so status updates without a manual check. - Drop Settings / Command Center from the sidebar nav (still reachable via shortcuts and the titlebar cog). - Flatten top corners on Messaging/Skills/Artifacts/Chat panes. - Share new StatusDot component across messaging + gateway menu. - Fix gateway/config.py so an explicit platforms.<name>.enabled=false in config.yaml is honored when env tokens are present. - pb-9 on the chat content area for breathing room above the composer. * Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * pin electron version * hide application menu on non-mac systems * interpret compactPreview for non-string vlaues as JSON or an empty string * fix(desktop): keep composer contenteditable mounted across stacked toggle The composer rendered {input} inside two different parent fragments depending on `stacked`. When auto-expand flipped `stacked` (e.g. the moment typed text wrapped past two lines), React reconciled the two branches as different positions and unmounted/remounted the contenteditable. The fresh mount started empty, so any in-flight characters — most reliably reproduced by holding a key — were lost. Replace the conditional with a single CSS Grid whose template-areas swap on `stacked`. The three children (menu, input, controls) keep stable identities across the toggle; only their grid placement changes, which the browser handles without React tearing down the editor. * refactor(desktop): align install layout with install.ps1 / install.sh Make the desktop app's runtime layout match what scripts/install.ps1 and scripts/install.sh produce, so a desktop-only user and a CLI-only user end up with the same files in the same places and can share one install. Layout - ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only) - VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime) - desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log) - HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere The packaged .app/.exe still ships a read-only payload at process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch or after an installer-driven upgrade we sync factory -> active, then provision the venv and run pip install -e . against the active root. Key behaviors - Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves to the same path resolveHermesHome() picked. Without this, Python falls back to ~/.hermes on every platform - fine on mac/linux, a split-state bug on Windows where our default is %LOCALAPPDATA%\hermes. - Detect developer installs by .git presence at ACTIVE; never overwrite a user's checkout via factory sync. - Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks pyproject hash + factory version + runtime schema version. depsFresh fast-paths when nothing changed. - Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run their local edits, not whatever's under HERMES_HOME. - Better error messages distinguish "no payload" from "no Python". - Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes exists, so users with prior pip/manual installs aren't orphaned. pyproject.toml - Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and pywinpty (Windows) to main dependencies. The dashboard backend (hermes dashboard) needs them at runtime; the previous lazy-import fallback was a footgun for fresh installs. - Empty the [pty] optional-extra; kept as a no-op back-compat alias for any existing pip install hermes-agent[pty] invocations. Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the desktop now installs whatever pyproject.toml says, single source of truth. Files - apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin, factory->active sync, marker v4 - apps/desktop/scripts/test-desktop.mjs: track new venv location - apps/desktop/README.md: new Setup, Runtime Bootstrap, and Debugging sections - pyproject.toml: fastapi/uvicorn/pty backends in main dependencies; [pty] extra emptied Tested locally on Windows: npm run dev boots cleanly, sessions land at the new location, type-check + lint + test:desktop:platforms all pass. Verified end-to-end on a fresh Win11 VM via dist:win installer. Known gaps (filed as follow-ups, not in this PR): - Skills not seeded on packaged installs (sync_skills only runs in cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch. - Git Bash not bundled or detected; agent's terminal tool errors out with a useful message but desktop bootstrapper should pre-flight it. - install.ps1 / install.sh should be decomposed into composable phase libraries so the desktop bootstrapper can reuse them as a single source of truth across all install surfaces. * feat(desktop): theme polish, prose chat typography, composer chrome - DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose - Composer liquid/radius utilities, thread font parity, tool/thinking cues - File tree label scale, preview flex, thread retry loading + streaming tests * feat(desktop): NSIS prereq detection page + auto-install via winget The packaged Windows installer now detects Python 3.11+ and Git for Windows at install time and offers to install missing prereqs via winget. Mirrors the prereq logic scripts/install.ps1 already runs for CLI installs, so desktop installer users get the same out-of-the-box experience as install.ps1 users. Why - Hermes' terminal tool calls bash.exe directly (tools/environments/ local.py); on Windows that's Git Bash from Git for Windows. Without it, the agent fails on the first terminal() call. - Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper errors out at venv creation. - Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python pre-installed but no Git, so the agent's first terminal call failed with "Git Bash isn't installed." - install.ps1 has had Install-Git + Install-Uv functions for ages. The desktop installer was the asymmetric outlier. How — NSIS prereq page - New file: apps/desktop/installer/prereq-check.nsh (plugged into electron-builder via build.nsis.include) - Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir hook (between the Directory page and InstFiles). - Group boxes for Python and Git, each showing detection status. - Pre-checked install checkboxes when winget is available. - Auto-skips silently if both prereqs are already installed. - Falls back to manual download URLs when winget itself is missing. - Detection: - Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python launcher. Microsoft Store "Python stub" (no py.exe) is correctly classified as not-installed. - Git: `where git`. - winget: `where winget` (Win10 1809+ / Win11 with App Installer). - Install execution (in customInstall macro): - Python: nsExec::ExecToLog with `--scope user --silent`. Per-user install, no UAC prompt, output streams to install log. - Git: ExecShellWait via Windows ShellExecute. Critical because Git always installs per-machine and triggers UAC; ShellExecute preserves the foreground focus chain across non-elevated → elevated process spawns, so UAC actually comes to the foreground. nsExec::ExecToLog breaks the chain because winget runs hidden. - Both pass `--disable-interactivity --accept-package-agreements --accept-source-agreements` to suppress winget's own dialogs. - Verification: probes Git's standard install locations via FileExists rather than `where git`. NSIS's process inherits PATH at startup, so a freshly-installed Git won't be visible to `where` until restart. - Silent installs (/S) skip the prompts; managed deploys handle prereqs out-of-band via Group Policy / Intune. How — Electron-side safety net - New findGitBash() in main.cjs, parallel to findSystemPython(). Probes the same locations as tools/environments/local.py:_find_bash() so a positive result here means the agent's terminal tool will work. - ensureRuntime now throws a clear, actionable error on Windows when Git Bash isn't found, matching the existing "Python 3.11+ is required" error path. - Catches users the NSIS page doesn't: .msi installer users (NSIS prereq page doesn't run for MSI), `npm run dev` users, manual installers, anyone who unchecked the install boxes on the NSIS prereq page. - All gated on `IS_WINDOWS`; macOS / Linux unaffected. NSIS build issue (resolved) - electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer emits "warning 6010: function not referenced" for our page functions because Page custom directives don't count as references in its static-analysis pass. The functions ARE called at runtime when NSIS invokes the page; the optimizer just can't see it statically. - Set `build.nsis.warningsAsErrors=false` in package.json so this spurious warning doesn't fail the build. (Documented option from electron-builder's nsisOptions.) Out of scope (filed for future work) - MSI prereq detection: Windows Installer custom actions are a different mechanism. Enterprise deploys typically handle prereqs via GP/Intune. - Bundle PortableGit + python-build-standalone in extraResources for zero-network installs. ~80MB increase. - Mac / Linux GUI prereq flows (different installer formats; Xcode CLT covers most macOS prereqs already; Linux is per-distro hard). Files - apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS) - apps/desktop/package.json (build.nsis.include + warningsAsErrors) - apps/desktop/electron/main.cjs (findGitBash + preflight) - apps/desktop/README.md (Runtime prerequisites section) Cross-platform impact - macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis config is ignored entirely; .nsh is dormant. - npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS. - scripts/install.ps1, scripts/install.sh: no reference to any new files; CLI install paths untouched. - Hermes CLI / dashboard / gateway: no reference; runtime untouched. - All checks: node --check on main.cjs and test-desktop.mjs pass; npm run test:desktop:platforms 4/4 passing; node --test green. Tested - npm run dist:win produces signed .exe and .msi without errors. - Fresh Win11 VM (Python pre-installed, no Git): prereq page renders, Python check shows detected, Git checkbox pre-checked. Click Next → Git installs via winget with UAC prompt in foreground. - After install completes, Hermes launches and the agent's terminal tool can run bash commands. Verified Git Bash is detected at `C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight. * feat: theme changes, composer tweaks, in app update ux, finesse * fix(cli): seed bundled skills on dashboard + gateway entrypoints `sync_skills(quiet=True)` was only being called from inside `cmd_chat`, which meant `hermes dashboard` (the desktop GUI's backend) and `hermes gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled skill library into ~/.hermes/skills/. This surfaced as "No skills found" in the desktop GUI's skills panel on fresh installs, despite the agent having access to the full bundled library when invoked via `hermes chat`. scripts/install.ps1 worked around it by running skills_sync.py as part of Copy-ConfigTemplates, but that's not part of the desktop installer's bootstrap chain. Fix - Extract the skills-sync block from cmd_chat into a module-level `_sync_bundled_skills_quietly()` helper. - Call the helper from cmd_chat (preserving existing behavior), cmd_dashboard (after the --status/--stop early-return paths and fastapi import check, so we don't run skills_sync on management commands or when deps aren't installed), and cmd_gateway. Why these three entrypoints - cmd_chat: the user's primary CLI entrypoint - cmd_dashboard: the desktop GUI's backend; this is what `hermes dashboard --tui` invokes when the desktop bootstrapper spawns Hermes - cmd_gateway: long-running daemons where the user expects the agent to have full skill access Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status, etc.) are management commands that don't need skill discovery and were never running skills_sync in the first place — leaving them alone. Idempotence - tools/skills_sync.py is manifest-based: skipped skills cost milliseconds. Calling it from multiple entrypoints adds no real cost, and users running `hermes chat` then `hermes dashboard` get two fast no-ops on the second call. Failure handling - Helper wraps skills_sync in try/except. Skills are an enhancement, not a hard dependency — Hermes runs fine with an empty skills/ dir. Files - hermes_cli/main.py: + new helper `_sync_bundled_skills_quietly()` at module level + cmd_chat: replace inline block with helper call + cmd_dashboard: add helper call after fastapi import succeeds + cmd_gateway: add helper call before delegating to gateway_command * feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes - Hoist todo to first-class widget (shadcn checkboxes, brand colors, no tool-accordion). Header derives label from active task; non-active rows fade. - Replace raw JSON dumps with structured key/value summaries via formatToolResultSummary; nested error extraction for clearer failures. - Fix loaded-session grouping: stitch interleaved assistant/tool iterations into one bubble instead of orphaned synthetic messages. - Stable tool/thinking timers via keyed registry so unmount/scroll doesn't reset elapsed counts; gate "running" on real live thread state. - Reorganize chat-only assistant-ui components under components/chat/. * fix(desktop): address CodeQL alerts on PR #20059 - settings/helpers.ts: harden setNested against prototype pollution. POLLUTING_PATH_PARTS check is now applied at every assignment site (loop + leaf) and uses Object.defineProperty so CodeQL can see the guard inline rather than via a helper function call. - lib/markdown-preprocess.ts: rebuild the dangling-fence close regex from a fence-char + length instead of marker.replace(...). The marker is captured by `(`{3,}|~{3,})` so it can only be backticks or tildes, but CodeQL was tracing tainted input text into the RegExp source and flagging hostname dots from input as part of the pattern (false positive js/incomplete-hostname-regexp on the test fixture URLs). Reconstructing from a literal char breaks the dataflow. - scripts/notarize-artifact.cjs: drop args from the run() rejection message. Args carry --key-id / --issuer / key file path; the existing outer catch already squashes errors to a generic line, but CodeQL was flagging the args.join(' ') as clear-text logging of APPLE_API_KEY_ID. Composer DOM-text-as-HTML alerts (composer/index.tsx:379, :547) are already addressed in |
||
|
|
1709776120 |
test(tool-search): add live A/B harness, drop checked-in transcripts
Brings in the tool_search live-test harness from the original PR but leaves out the 11 checked-in scripts/out/*.json transcript files — those are non-deterministic model output that goes stale the moment the model changes and were the bulk of the diff. scripts/out/ is now gitignored so a harness run never re-commits them. Fixes on top: - API-key loading goes through hermes_cli.env_loader.load_hermes_dotenv instead of hand-parsing ~/.hermes/.env and assigning the value to a local. The canonical loader never materializes the secret in a local variable in this module, which clears the four CodeQL high alerts (py/clear-text-storage / py/clear-text-logging-sensitive-data at the transcript write/print sites — they were tracing the key from the hand-rolled parser into the records) and removes a hand-rolled parser. - encoding='utf-8' on every write_text/read_text in both harness scripts (Windows-footgun hygiene). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com> |
||
|
|
a1eaad2fc0
|
perf(skills-page): lazy-fetch the catalog instead of bundling 34MB into JS (#33809)
PR #33748 grew the live skills index from ~2k skills to ~69k, which made the previous build-time bundling strategy untenable: the skills page's JS chunk was about to balloon from ~1MB to ~35MB. Initial page load on mobile became unusable, search lagged on every keystroke against the 68k-item array, and JSON.parse blocked the main thread at startup. Three changes: 1. extract-skills.py writes skills.json + skills-meta.json into website/static/api/ instead of website/src/data/. Static-served by Vercel as /docs/api/skills.json (gzipped on the wire), same CDN that already serves skills-index.json. 2. skills/index.tsx drops the static import and fetches both files in parallel on mount. Loading state shows '…' for the count; failures surface a small error pill instead of blanking the page. 3. Search is debounced 150ms and runs against a precomputed lowercase haystack stamped onto each row at load time. Before: array-join + toLowerCase per row per keystroke on a 68k array. After: single .includes() per row, deferred until typing settles. Validation: | | before | after | |---|---|---| | skills.json location | src/data/ (bundled) | static/api/ (CDN) | | Largest JS chunk | would be ~35MB at 68k skills | 659 KB | | Initial page render | wait for full parse | immediate, fetch async | | Per-keystroke filter | join+lowercase x 68k rows | single includes x 68k rows | | Debounce | none | 150ms | Built locally for both en and zh-Hans locales; the 34MB skills.json now lives in build/api/ and is served separately rather than inlined into the page's bundle. skills.json and skills-meta.json added to .gitignore — they were already build artifacts, but the gitignore only listed skills-index.json before. |
||
|
|
e2a92ce649 |
chore: gitignore .hermes/ working directory; drop tracked plan artifact
The 4533-line dashboard-OAuth plan was checked into .hermes/plans/ during initial development. .hermes/ is the Hermes Agent's runtime working directory (logs, session caches, in-flight plans) — its contents are never artifacts of the codebase and should not have been tracked. Add .hermes/ to .gitignore so future agent runs that materialise plans/audits/cache files in the working tree don't accidentally stage them. Remove the existing plan file from version control. The plan content is preserved in the branch history if anyone needs to reference it. |
||
|
|
1e267c4859
|
Merge pull request #29025 from slowtokki0409/codex/ignore-local-runtime-files
Ignore local Hermes runtime files |
||
|
|
b689624aee |
feat(ci): 4-way matrix slicing with LPT duration-balanced distribution
run_tests_parallel.py:
- --slice I/N flag (also HERMES_TEST_SLICE env var) runs only the
I-th slice of N, distributing files across slices by cached
duration using LPT (Longest Processing Time first) greedy
algorithm so each slice gets roughly equal wall time
- Duration cache (test_durations.json): maps relative file paths to
last-observed subprocess wall time. _save_durations merges with
existing cache so entries from other slices are preserved.
- Per-file subprocess timing in progress output + end-of-run
distribution summary (percentiles, top-10 slowest, <1s/<2s counts)
- Unknown files default to 2.0s estimate (~P50), spread evenly by LPT
.github/workflows/tests.yml:
- Matrix strategy: slice [1, 2, 3, 4] with fail-fast: false
- Each slice restores duration cache from main (stable key, no SHA),
runs its portion, uploads per-slice durations as artifacts
- save-durations job (main only, if: always()) downloads all 4
artifacts, merges into single cache entry for future PRs
- Timeout reduced from 60min to 30min per slice (~1/4 the work)
Cache design:
- Stable key (test-durations) not keyed by commit SHA — durations
are about files, not commits, and SHA-keyed caches miss on every
new commit and on PR merge commits
- actions/cache scoping: main's cache is visible to all PRs targeting
main; feature branches without a cache still work (default 2.0s)
- No dotfile prefix (upload-artifact v7 skips hidden files)
|
||
|
|
48be2e0e4d
|
test: use subprocesses for each test file (#29016)
* ci(tests): install ripgrep from prebuilt tarball instead of apt
apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu
runners (the apt-get update against archive.ubuntu.com is the slow
part; ripgrep itself is small). Switching to the upstream musl
binary tarball cuts the step to a few seconds.
- Pinned to ripgrep 15.1.0 with sha256 verification (same hash as
published in the releases sha256 sidecar file).
- Drops the `rg` binary into /usr/local/bin so it is on PATH for
every subsequent step without GITHUB_PATH manipulation.
- Applied to both the test and e2e jobs in tests.yml.
* fix(cli): compile syntax check to tempdir, not source __pycache__
`_validate_critical_files_syntax` runs `py_compile.compile()` on each
critical bootstrap file after a successful `git pull`. The default
`py_compile` writes the resulting `.pyc` next to the source under
`__pycache__/`, which causes two real problems:
1. Parallel test workers walking the same source tree (e.g. running
the suite under per-file process isolation) can race against each
other on the `__pycache__` write — manifests as flaky 'directory
not empty' errors during teardown.
2. In production, the post-pull syntax check leaves a `.pyc` behind
that the next interpreter run might pick up — fine when the
interpreter version matches, sketchy if it doesn't.
Fix: write the compiled output to a `tempfile.TemporaryDirectory()`
that's discarded on function exit. We only care about the compile-or-not
signal, not the artifact.
* test(runner): per-file process isolation, drop manual state reset + xdist
Replace fragile manual _reset_module_state test fixtures with robust
per-file subprocess isolation. Each test file runs in a fresh
`python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist,
no custom pytest plugin, no shared worker state.
Key changes:
* scripts/run_tests_parallel.py — new runner: discovers test files,
runs N in parallel via ThreadPoolExecutor, captures stdout per file,
treats exit code 5 (no tests collected) as pass, kills all children
on exit. Change from cpu_count to cpu_count*2. The runner is
I/O-bound (waiting on subprocess.communicate() from pytest children)
The parent process does almost no CPU work, so 2x oversubscription
keeps more pipes full. When a file fails, immediately show the last
30 lines of pytest output (stack traces + FAILED summary) plus a
ready-to-copy repro command:
python -m pytest tests/agent/test_auxiliary_client.py
* scripts/run_tests.sh — delegates to run_tests_parallel.py
* .github/workflows/tests.yml — test step: python
scripts/run_tests_parallel.py
* pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts
* tests/conftest.py — remove ~200 lines of manual state-reset fixtures
* AGENTS.md — update Testing section for per-file design
* test(runner): speed gateway test antipattern scan up
* fix(test): web search provider plugin test missing xai
* fix(tests): make 14 test files pass under per-file subprocess isolation
Tests that relied on cross-file state pollution from xdist workers
fail when run in isolation (per-file subprocess model). Root causes
and fixes:
Tool registry not populated:
- test_video_generation_tool_surface_matrix: add discover_builtin_tools()
- test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures
registering all 8 bundled web providers, reset after each test
- test_website_policy: same provider registration pattern
- test_web_tools_tavily: same pattern across 3 dispatch test classes
- Also add is_safe_url/check_website_access mocks where SSRF check
blocks example.com (DNS resolution fails in isolated envs)
Stale check_fn cache:
- test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache()
in both kanban guidance tests (prior test cached False for kanban_show)
- test_discord_tool: cache invalidation in setup/teardown
- test_homeassistant_tool: invalidate_check_fn_cache() before registry queries
Module-level state pollution:
- test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache
- test_skill_commands: set_session_vars() instead of patch.dict(os.environ)
(ContextVar takes precedence over os.environ)
- test_dm_topics: overwrite sys.modules + separate telegram.constants mock
+ force-reimport of gateway.platforms.telegram
- test_terminal_tool_requirements: removed duplicate class declaration,
autouse _clear_caches fixture
* change(tests): run_tests.sh explicitly includes env vars
instead of manually dropping some vars, now we just only include some
* fix(tests): 5 more isolation/NixOS fixes
- test_approval_plugin_hooks: isolate HERMES_HOME so real user's
command_allowlist doesn't short-circuit the approval path
- test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum
(feature not merged on this branch)
- test_write_deny: test systemd prefix against tmp_path instead of
/etc/systemd which resolves to /nix/store on NixOS
- test_pty_bridge: use shutil.which('cat') instead of /bin/cat
(doesn't exist on NixOS)
- profiles.py: rmtree onexc handler chmod's parent dirs too, fixing
profile deletion when copytree preserved read-only modes from
nix store
* fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client
* fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor
* fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test
* fix: address PR #29016 review feedback
- Remove tracked .pytest-cache/ artifact and add to .gitignore
- Fix stale 'xdist worker' comment in conftest.py
- Deduplicate web provider registration into tests/tools/conftest.py
shared helper (register_all_web_providers), replacing 8 copy-pasted
blocks across 6 test files
- Update PR description: remove stale recovered-test-files claim,
fix worker count to match code (cpu_count*2)
* fix: eliminate race in stale-cache achievements test
The background scan thread could complete and overwrite _SNAPSHOT_CACHE
before evaluate_all() returned the stale data — only 10 fake sessions
made the scan finish instantly. Added scan_delay param to _FakeSessionDB
and set it to 2s in the stale-cache test so the background thread can't
win the race.
|
||
|
|
ec641d497a |
chore: ignore local Hermes runtime files
Keep local Hermes Docker runtime data, NotebookLM auth/cache, and personal compose overrides out of Git and Docker build contexts. This protects tokens, OAuth state, sessions, logs, and caches while preserving the source tree. Constraint: Only .gitignore and .dockerignore are in scope for this commit. Tested: git diff --cached --name-only and git diff --cached --stat Co-authored-by: OmX <omx@oh-my-codex.dev> |
||
|
|
b1edf3dfc8 | chore: gitignore hermes_cli/scripts/ (bundled at wheel build time) | ||
|
|
c53fcb0173 |
feat(providers): add GMI Cloud as a first-class API-key provider (#11955)
Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider with built-in auth, aliases, model catalog, CLI entry points, auxiliary client routing, context length resolution, doctor checks, env var tracking, and docs. - auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL) - providers.py: HermesOverlay with extra_env_vars for models.dev detection - models.py: curated slash-form model catalog; live /v1/models fetch - main.py: 'gmi' in _named_custom_provider_map and --provider choices - model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated context-length probe block (GMI's /models has authoritative data) - auxiliary_client.py: alias entries; _compat_model fix for slash-form models on cached aggregator-style clients; gmi aux default model - doctor.py: GMI in provider connectivity checks - config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS - conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix) - docs: providers.md, environment-variables.md, fallback-providers.md, configuration.md, quickstart.md (expands provider table) Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local> |
||
|
|
82fbd4771a |
Update .gitignore
Filter out .DS_Store (Desktop Services Store) |
||
|
|
923539a46b | fix: add nous-research/ui package | ||
|
|
7e4dd6ea02 | Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor | ||
|
|
e2a9b5369f
|
feat: web UI dashboard for managing Hermes Agent (#8756)
* feat: web UI dashboard for managing Hermes Agent (salvage of #8204/#7621) Adds an embedded web UI dashboard accessible via `hermes web`: - Status page: agent version, active sessions, gateway status, connected platforms - Config editor: schema-driven form with tabbed categories, import/export, reset - API Keys page: set, clear, and view redacted values with category grouping - Sessions, Skills, Cron, Logs, and Analytics pages Backend: - hermes_cli/web_server.py: FastAPI server with REST endpoints - hermes_cli/config.py: reload_env() utility for hot-reloading .env - hermes_cli/main.py: `hermes web` subcommand (--port, --host, --no-open) - cli.py / commands.py: /reload slash command for .env hot-reload - pyproject.toml: [web] optional dependency extra (fastapi + uvicorn) - Both update paths (git + zip) auto-build web frontend when npm available Frontend: - Vite + React + TypeScript + Tailwind v4 SPA in web/ - shadcn/ui-style components, Nous design language - Auto-refresh status page, toast notifications, masked password inputs Security: - Path traversal guard (resolve().is_relative_to()) on SPA file serving - CORS localhost-only via allow_origin_regex - Generic error messages (no internal leak), SessionDB handles closed properly Tests: 47 tests covering reload_env, redact_key, API endpoints, schema generation, path traversal, category merging, internal key stripping, and full config round-trip. Original work by @austinpickett (PR #1813), salvaged by @kshitijk4poor (PR #7621 → #8204), re-salvaged onto current main with stale-branch regressions removed. * fix(web): clean up status page cards, always rebuild on `hermes web` - Remove config version migration alert banner from status page - Remove config version card (internal noise, not surfaced in TUI) - Reorder status cards: Agent → Gateway → Active Sessions (3-col grid) - `hermes web` now always rebuilds from source before serving, preventing stale web_dist when editing frontend files * feat(web): full-text search across session messages - Add GET /api/sessions/search endpoint backed by FTS5 - Auto-append prefix wildcards so partial words match (e.g. 'nimb' → 'nimby') - Debounced search (300ms) with spinner in the search icon slot - Search results show FTS5 snippets with highlighted match delimiters - Expanding a search hit auto-scrolls to the first matching message - Matching messages get a warning ring + 'match' badge - Inline term highlighting within Markdown (text, bold, italic, headings, lists) - Clear button (x) on search input for quick reset --------- Co-authored-by: emozilla <emozilla@nousresearch.com> |
||
|
|
76019320fb |
feat(skills): centralized skills index — eliminate GitHub API calls for search/install
Add a CI-built skills index served from the docs site. The index is crawled daily by GitHub Actions, resolves all GitHub paths upfront, and is cached locally by the client. When the index is available: - Search uses the cached index (0 GitHub API calls, was 23+) - Install uses resolved paths from index (6 API calls for file downloads only, was 31-45 for discovery + downloads) Total: 68 → 6 GitHub API calls for a typical search + install flow. Unauthenticated users (60 req/hr) can now search and install without hitting rate limits. Components: - scripts/build_skills_index.py: Crawl all sources (skills.sh, GitHub taps, official, clawhub, lobehub), batch-resolve GitHub paths via tree API, output JSON index - tools/skills_hub.py: HermesIndexSource class — search/fetch/inspect backed by the index, with lazy GitHubSource for file downloads - parallel_search_sources() skips external API sources when index is available (0 GitHub calls for search) - .github/workflows/skills-index.yml: twice-daily CI build + deploy - .github/workflows/deploy-site.yml: also builds index during docs deploy Graceful degradation: when the index is unavailable (first run, network down, stale), all methods return empty/None and downstream sources handle the request via direct API as before. |
||
|
|
74241328f0 | direnv: watch lockfiles/nix files; gitignore .nix-stamps | ||
|
|
b6461903ff
|
feat: nix flake — uv2nix build, NixOS module, persistent container mode (#20)
* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
689344430c
|
chore: gitignore orphaned mini-swe-agent directory | ||
|
|
323ca70846 |
feat: add versioning infrastructure and release script
- Fix version mismatch: __init__.py had 'v1.0.0', pyproject.toml had '0.1.0' Now both use '0.1.0' (no v prefix — added in display code only) - Add __release_date__ for CalVer date tracking alongside SemVer version - Fix double-v bug in cmd_version (was printing 'vv1.0.0') - Update banner title to show 'Hermes Agent v0.1.0 (2026.3.12)' format - Update cli.py banner to match new format - Add scripts/release.py: full release automation tool - Generates categorized changelogs from git history - Maps git authors to GitHub @mentions (70+ contributors) - Supports dry-run preview and --publish mode - Creates annotated CalVer git tags + GitHub Releases - Bumps semver in source files automatically - Usage: python scripts/release.py --bump minor --publish - Add .release_notes.md to .gitignore Versioning scheme: CalVer tags (v2026.3.12) + SemVer display (v0.1.0) |
||
|
|
09fc64c6b6 | add eval output to gitignore | ||
|
|
78e19ebc95 |
chore: update .gitignore to include .worktrees directory
Added .worktrees to the .gitignore file to prevent tracking of worktree-specific files, ensuring a cleaner repository. |
||
|
|
9ec4f7504b | Provide example datagen config scripts | ||
|
|
7cb6427dea |
refactor: streamline cron job handling and update CLI commands
- Removed legacy cron daemon functionality, integrating cron job execution directly into the gateway process for improved efficiency. - Updated CLI commands to reflect changes, replacing `hermes cron daemon` with `hermes cron status` and enhancing documentation for cron job management. - Clarified messaging in the README and other documentation regarding the gateway's role in managing cron jobs. - Removed obsolete terminal_hecate tool and related configurations to simplify the codebase. |
||
|
|
4d5f29c74c |
feat: introduce skill management tool for agent-created skills and skills migration to ~/.hermes
- Added a new `skill_manager_tool` to enable agents to create, update, and delete their own skills, enhancing procedural memory capabilities. - Updated the skills directory structure to support user-created skills in `~/.hermes/skills/`, allowing for better organization and management. - Enhanced the CLI and documentation to reflect the new skill management functionalities, including detailed instructions on creating and modifying skills. - Implemented a manifest-based syncing mechanism for bundled skills to ensure user modifications are preserved during updates. |
||
|
|
14e59706b7 |
Add Skills Hub — universal skill search, install, and management from online registries
Implements the Hermes Skills Hub with agentskills.io spec compliance, multi-registry skill discovery, security scanning, and user-driven management via CLI and /skills slash command. Core features: - Security scanner (tools/skills_guard.py): 120 threat patterns across 12 categories, trust-aware install policy (builtin/trusted/community), structural checks, unicode injection detection, LLM audit pass - Hub client (tools/skills_hub.py): GitHub, ClawHub, Claude Code marketplace, and LobeHub source adapters with shared GitHubAuth (PAT + gh CLI + GitHub App), lock file provenance tracking, quarantine flow, and unified search across all sources - CLI interface (hermes_cli/skills_hub.py): search, install, inspect, list, audit, uninstall, publish (GitHub PR), snapshot export/import, and tap management — powers both `hermes skills` and `/skills` Spec conformance (Phase 0): - Upgraded frontmatter parser to yaml.safe_load with fallback - Migrated 39 SKILL.md files: tags/related_skills to metadata.hermes.* - Added assets/ directory support and compatibility/metadata fields - Excluded .hub/ from skill discovery in skills_tool.py Updated 13 config/doc files including README, AGENTS.md, .env.example, setup wizard, doctor, status, pyproject.toml, and docs. |
||
|
|
d999d9876d |
Enhance async tool execution and error handling in Hermes agent for Atropos integration
- Updated `.gitignore` to exclude `testlogs` directory. - Refactored `handle_web_function_call` in `model_tools.py` to support running async functions in existing event loops, improving compatibility with Atropos. - Introduced a thread pool executor in `agent_loop.py` for running synchronous tool calls that internally use `asyncio.run()`, preventing deadlocks. - Added `ToolError` class to track tool execution errors, enhancing error reporting during agent loops. - Updated `wandb_log` method in `hermes_base_env.py` to log tool error statistics for better monitoring. - Implemented patches in `patches.py` to ensure async-safe operation of tools within Atropos's event loop. - Enhanced `ToolContext` and `terminal_tool.py` to utilize the new async handling, improving overall tool execution reliability. |
||
|
|
07b615e96e |
Add support for Atropos Agentic RL environments (requires branch tool_call_support in Atropos atm)
- Added new environments for reinforcement learning, including `HermesSweEnv` for software engineering tasks and `TerminalTestEnv` for inline testing. - Introduced `ToolContext` for unrestricted access to tools during reward computation. - Updated `.gitignore` to exclude `wandb/` directory. - Enhanced `README.md` with detailed architecture and usage instructions for Atropos environments. - Added configuration files for SWE and terminal test environments to streamline setup. - Removed unnecessary compiled Python files from `__pycache__`. |
||
|
|
ac79725923 |
Update dependencies and enhance installation scripts
- Added `prompt_toolkit` as a direct dependency for interactive CLI support. - Updated `modal` optional dependency to require `swe-rex[modal]>=1.4.0` for improved cloud execution capabilities. - Enhanced `messaging` optional dependencies to include `aiohttp>=3.9.0` for WhatsApp bridge communication. - Refined installation scripts to check for Python version requirements, emphasizing the need for Python 3.11+ for RL training tools. - Improved setup scripts to ensure proper installation of submodules and dependencies, enhancing user experience during setup. |
||
|
|
9c8d707530 |
Update .gitignore to include additional ignored files
- Added 'images/' to the ignore list to prevent tracking of image files. - Retained existing entries for private keys and CLI config to maintain security and privacy. |
||
|
|
8e986584f4 |
Update .gitignore to include private keys and CLI config
- Added patterns to ignore private key files (*.ppk, *.pem) and any files starting with 'privvy'. - Included cli-config.yaml in the ignore list to prevent sensitive SSH paths from being tracked. |
||
|
|
54ca0997ee |
Update .gitignore to include additional directories and files
- Added entries for `node_modules/`, `browser-use/`, and `agent-browser/` to prevent unnecessary files from being tracked. - Updated `data/*` entry to `data/*` for consistency in ignoring data files. - Ensured no newline at the end of the file for proper formatting. |
||
|
|
6eb76c7c1a |
Enhance batch processing and image generation tools
- Updated batch processing to include robust resume functionality by scanning completed prompts based on content rather than indices, improving recovery from failures. - Implemented retry logic for image downloads with exponential backoff to handle transient failures effectively. - Refined image generation tool to utilize the FLUX 2 Pro model, updating descriptions and parameters for clarity and consistency. - Added new configuration scripts for GLM 4.7 and Imagen tasks, enhancing usability and logging capabilities. - Removed outdated scripts and test files to streamline the codebase. |
||
|
|
f957ec2267 | update distribution and gitignore | ||
|
|
c27787f09f | fix gitignore again | ||
|
|
d90fcd4e2b | update gitignore | ||
|
|
8d256779d8 |
Update vision_tools.py to include image downloading and base64 conversion features.
add excluding tmp image dl's in .gitignore |
||
|
|
a398d320b7 | update gitignore | ||
|
|
c5386ed7e6 | add better logging when requests fail | ||
|
|
2082c7caa3 | update gitignore | ||
|
|
f4ff1f496b | update gitignore | ||
|
|
bf4223f381 | implement first pass of scrape/crawl content compression | ||
|
|
21d80ca683 | initital commit |