hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-29 18:46:59 +00:00

Author	SHA1	Message	Date
helix4u	cedd9b6d47	fix(update): avoid SSH auth for passive official checks	2026-06-11 12:45:07 +05:30
Shannon Sands	fa7f24e898	Enable webhooks from dashboard page	2026-06-10 22:55:06 -07:00
brooklyn!	975edd4140	fix(cli): omit --workspace when subpackage has its own package-lock.json (#42973 ) (#43986 ) * fix(cli): omit --workspace when subpackage has its own package-lock.json When ui-tui/ (or web/) contains its own package-lock.json, _workspace_root() returns the subpackage directory itself. Passing --workspace ui-tui in that case fails because npm cannot find a workspace named 'ui-tui' inside ui-tui/. Fix: skip the --workspace flag when npm_cwd equals the target directory, running a plain 'npm install' from the standalone project root instead. Applies the same fix to both _make_tui_argv (TUI) and _build_web_ui (web). Fixes #42973 * test(cli): fix web workspace-scope fixture + cover own-lockfile fallback (#42973) The web half of the #42977 fix broke test_npm_install_uses_workspace_web_scope, which built its fixture with no lockfile anywhere. Without a root lockfile, _workspace_root(web_dir) already returns web_dir, so the new "() if npm_cwd == web_dir" branch correctly drops --workspace and the assertion failed. Model a real workspace checkout instead: the single package-lock.json lives at the root, so --workspace web scopes the install. Also add the symmetric web regression test (web/ carrying its own lockfile => --workspace must be dropped and the install runs plainly from web_dir via npm ci), matching the TUI coverage already in test_tui_npm_install.py. --------- Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-11 05:01:25 +00:00
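The own-lockfile rule from the commit above can be illustrated with a minimal sketch. The function name `npm_install_argv` and its parameters are hypothetical, not the project's `_make_tui_argv` or `_build_web_ui`; only the decision described in the message is modelled (drop `--workspace` when the npm working directory is the target directory itself).

```python
from pathlib import Path
from typing import List


def npm_install_argv(npm_cwd: Path, target_dir: Path) -> List[str]:
    """Build an npm install command line (hypothetical helper)."""
    argv = ["npm", "install"]
    # If the target carries its own package-lock.json, the resolved npm
    # working directory is the target itself, and npm has no workspace of
    # that name to select. Only scope the install when the root differs.
    if npm_cwd.resolve() != target_dir.resolve():
        argv += ["--workspace", target_dir.name]
    return argv


if __name__ == "__main__":
    print(npm_install_argv(Path("/repo"), Path("/repo/ui-tui")))
    print(npm_install_argv(Path("/repo/ui-tui"), Path("/repo/ui-tui")))
```

The first call models a real workspace checkout with a single root lockfile; the second models a subpackage that ships its own lockfile.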
Teknium	7d8d000b19	revert(cron): remove per-job profile support (PR #28124) (#43956) Fully removes the cron per-job 'profile' arg added in #28124: the cronjob tool schema field, CLI --profile flags on cron create/edit, job-record storage/validation, the scheduler's _job_profile_context wrapper, and the script-runner env override. Sequential-partition logic reverts to workdir-only. The context-local HERMES_HOME override in hermes_constants and the subprocess bridging in tools/environments/local.py are kept; they now have other consumers (dashboard multi-profile, TUI gateway).	2026-06-10 20:46:17 -07:00
Teknium	914befa9aa	feat(dashboard): profile-scoped skills & toolsets management 'Set as active' on the Profiles page only flips the sticky active_profile file (future CLI/gateway runs) — it never retargets the running dashboard process. The skills/toolsets endpoints called bare load_config()/ save_config(), so after 'activating' a profile in the web UI, deactivating a skill silently wrote into the dashboard's own profile and the activated profile was untouched. Backend: - _profile_scope() context manager on the skills/toolsets endpoints: context-local HERMES_HOME override for call-time config resolution + cron-style locked swap of tools.skills_tool's import-time SKILLS_DIR - profile param on /api/skills, /api/skills/toggle, /api/tools/toolsets* (list/toggle/config/provider/env), hub sources/search installed-state - hub install/uninstall/update spawn 'hermes -p <profile> skills ...' so the child rebinds skills_hub.SKILLS_DIR at import (the override cannot reach import-time globals); profile validated -> 404/400 before spawn Frontend: - Skills page: profile selector (deep-linkable /skills?profile=<name>), amber banner naming the managed profile, threaded through skill toggles, toolset drawer, and hub browser - Profiles page: 'Manage skills & tools' action per card; 'Set as active' toast now says it applies to new CLI/gateway runs only Omitted profile keeps legacy behavior (dashboard's own profile).	2026-06-10 20:34:53 -07:00
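A context-local home override of the kind this commit relies on could look roughly like the sketch below. Everything here is an assumption for illustration: the names `current_home` and `profile_scope`, the `~/.hermes` default, and the use of `contextvars`. The project's actual `_profile_scope()` also swaps an import-time `SKILLS_DIR` under a lock, which is not modelled.

```python
import contextvars
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional

# Hypothetical context-local override; None means "use the process default".
_home_override: contextvars.ContextVar = contextvars.ContextVar(
    "home_override", default=None
)


def current_home() -> Path:
    """Resolve the home directory at call time, honouring any override."""
    override: Optional[Path] = _home_override.get()
    if override is not None:
        return override
    # Assumed default location, used only for this illustration.
    return Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")))


@contextmanager
def profile_scope(profile_home: Path) -> Iterator[None]:
    """Point call-time config resolution at another profile, then restore."""
    token = _home_override.set(profile_home)
    try:
        yield
    finally:
        _home_override.reset(token)
```

Because the override lives in a `ContextVar`, it affects only code that resolves the path at call time. Values bound at import time are out of its reach, which matches the reason the commit gives for spawning a child process for hub installs.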
Matt Harris	e0e2571711	feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed Make Parallel the web search/extract backend with a zero-setup free tier: - Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel becomes the default backend when no other web credentials are configured (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing web_search/web_extract tools are the only tools registered. - Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints (client.search / client.extract with advanced_settings.full_content) — no beta. Bumps parallel-web 0.4.2 -> 0.6.0. - Attribution: on the free path only, results carry provider/attribution and the CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is unbranded. - Selection/registration: web tools register unconditionally (free MCP backstop) while check_web_api_key remains a real usability probe; explicit per-capability backends are honored (so misconfig surfaces) rather than masked by the fallback. Tested: live web_search/web_extract against search.parallel.ai in keyless and keyed modes; unit suites for the MCP client, backend selection, and display labeling; full agent run shows the "Parallel search" label on the free path.	2026-06-10 19:54:38 -07:00
brooklyn!	3ffbdfbcc0	desktop: registry-driven slash commands + first-class /resume & /handoff (#42351 ) * desktop: surface /tools, /save, /personality and fix /help skill count Move /tools and /save out of TERMINAL_ONLY_COMMANDS and /personality out of ADVANCED_COMMANDS so they appear in the desktop slash palette and execute via the existing slash.exec → command.dispatch fallback. The backend gateway already accepts these through slash.exec (none are in _PENDING_INPUT_COMMANDS or the skill list), so no backend change is required. Recompute skill_count in filterDesktopCommandsCatalog from the filtered pairs. Previously the /help footer echoed the unfiltered backend total — e.g. "60 skill commands available" while only ~29 actually appeared in the rendered list, because the desktop hides terminal-only, picker-owned, and advanced commands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * desktop: keep slash popover live while typing args The trigger regex `(?:^\|[\s])([@/])([^\s@/])$` stopped matching the moment the user typed a space after a slash command, so the popover never showed arg completions for `/personality`, `/tools`, etc. — even though the backend's `complete.slash` already returns them with a `replace_from` indicator. Split the trigger detection so `/` allows args (`/cmd arg1 arg2`) while `@` keeps the strict no-space behavior. Restrict the slash command name to `[a-zA-Z][\w-]` so file paths like `src/foo/bar` don't accidentally trigger the popover. Rewrite arg-completion items in useSlashCompletions to insert the full `/personality alice` token instead of stranding `/alice`: when `replace_from` is past the command base, prepend the existing prefix to each item's text so the chip serializer produces a coherent replacement. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * cli: complete toolset names after /tools enable\|disable SlashCommandCompleter previously only auto-derived the first subcommand level from args_hint, so `/tools enable <tab>` yielded nothing — the user had to remember every toolset key (web, file, spotify, …) and every MCP server prefix. Add `_tools_completions` that handles both stages: subcommand (list\|disable\|enable) and tool name. Filter by current enable state so `/tools enable <tab>` only offers disabled toolsets and `/tools disable <tab>` only offers enabled ones — no point suggesting a no-op. MCP server prefixes (server:) come from the saved mcp_servers config; per-tool completion under a server would require runtime MCP introspection and is left as follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * desktop: registry-driven slash commands with first-class pickers Collapse the if/else slash dispatch into one DESKTOP_COMMAND_SPECS table that drives popover suggestions, per-type composer pills, and execution. - /resume, /sessions, /switch: inline session completions (like /skin) plus a "Browse all sessions…" entry that opens a dedicated session picker overlay - /handoff: inline platform completion + handoff.request/handoff.state gateway bridge so desktop reaches CLI parity - colored per-type pills (command/skill/theme) in the composer - strip ANSI and fix width/alignment of slash output in the chat panel * desktop: fold repeated slash session/output boilerplate into one helper runExec, /title, /help and the unavailable case each re-derived the same ensure-session → bail-with-notify → build-renderSlashOutput dance. withSlashOutput() returns {sessionId, render} or null, so each handler is a two-line resolve instead of an eight-line preamble. 
* desktop: keep backend meta on slash arg completions Arg suggestions (/personality <name>, /tools enable <toolset>, /handoff <platform>) were having their meta overwritten with the parent command's registry description: desktopSlashDescription("/personality none") canonicalizes back to /personality and returns its blurb. Skip the lookup for arg rows so the backend's own display_meta ("clear personality overlay", etc.) survives. * cli: list real personalities in /personality completion _personality_completions resolved load_config().agent.personalities — but that schema has no agent.personalities key, so completion always returned just `none` even though the runtime (load_cli_config().agent.personalities) ships a dozen built-ins (helpful, kawaii, pirate, …). Read from the same source the command actually applies, so `/personality ` surfaces the real options. * desktop: expand bare arg-commands to their options on pick Picking a command like /personality from the slash popover committed it immediately instead of advancing to its argument list. Mark arg-taking commands (/skin, /resume, /handoff, /personality, /tools) in the registry and, when one is picked bare, insert "/cmd " as plain text and re-open the popover on its inline options — mirroring typing "/cmd " by hand. Arg picks (serialized text already contains a space) still commit a single pill. Also realign trigger-popover loading test with the redesigned popover (the /help empty-state hint shows when resolved, not while the spinner is up); the merge from main reintroduced the pre-redesign expectation. * tui_gateway: fold session-db close into a context manager Both handoff RPCs repeated the same `db, close_db = _session_db_handle()` + `finally: if close_db: db.close()` dance. Turn the helper into a `_session_db` contextmanager that owns the close, so callers just `with _session_db(session) as db:`. 
* desktop: unblock handoff retries and exact resume ids Clear timed-out desktop handoffs through the gateway so retries are not stuck behind a pending row, and let typed /resume session ids bypass the loaded sidebar cache. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-11 01:49:24 +00:00
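The `tui_gateway` change in this commit (folding a repeated "get handle, close in finally" pattern into a context manager) follows a common Python idiom. The sketch below is not the project's `_session_db`; it uses an in-memory `sqlite3` connection and a hypothetical `session_db` helper to show the ownership rule: the helper closes only what it opened itself.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def session_db(
    existing: Optional[sqlite3.Connection] = None,
) -> Iterator[sqlite3.Connection]:
    """Yield a database handle and close it only if this helper opened it."""
    owns_handle = existing is None
    db = sqlite3.connect(":memory:") if owns_handle else existing
    try:
        yield db
    finally:
        # A caller-supplied connection stays open; it has another owner.
        if owns_handle:
            db.close()
```

Callers then write `with session_db(shared) as db:` and no longer repeat the `finally` block at each call site.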
xxxigm	f7a6d6a6a1	test(cron): cover provider "custom" → providers.custom resolution Add execution-time coverage that bare `provider="custom"` resolves a literal providers.custom endpoint (and still falls through when none exists), plus creation-time coverage that `_resolve_model_override` keeps a resolvable "custom" and only pins the main provider when it is unresolvable.	2026-06-10 14:39:03 -07:00
Tranquil-Flow	a8f404b29f	fix(gateway): probe launchd domain instead of hardcoding user/<uid> (#40831) The previous fix for #23387 changed _launchd_domain() from gui/<uid> to user/<uid> to support Background/SSH sessions on macOS 26+. However, this broke Aqua sessions where gui/<uid> is the only working domain and user/<uid> cannot bootstrap or manage the service. Now _launchd_domain() probes which domain actually contains the loaded service: 1. Try gui/<uid> first (Aqua sessions); 2. Fall back to user/<uid> (Background/SSH sessions); 3. Use launchctl managername as heuristic when neither has the service; 4. Cache the result for the process lifetime. Regression tests cover all four paths plus caching behavior.	2026-06-10 12:39:48 -07:00
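The four-step probe listed in this commit can be sketched as a pure function. The names below are hypothetical, and the probe is injected as a callable so the sketch runs without `launchctl`. Mapping a manager name of "Aqua" to `gui/<uid>` is an assumption about how the heuristic in step 3 works.

```python
from typing import Callable, Dict, Optional

# Process-lifetime cache keyed by uid (step 4).
_domain_cache: Dict[int, str] = {}


def pick_launchd_domain(
    uid: int,
    has_service: Callable[[str], bool],
    manager_name: Optional[str] = None,
) -> str:
    """Choose the launchd domain that actually holds the service."""
    gui, user = f"gui/{uid}", f"user/{uid}"
    if has_service(gui):    # step 1: Aqua sessions
        return gui
    if has_service(user):   # step 2: Background/SSH sessions
        return user
    # Step 3: neither domain has the service, so fall back to a heuristic
    # (assumed mapping: an "Aqua" manager means a GUI session).
    return gui if manager_name == "Aqua" else user


def launchd_domain(
    uid: int,
    has_service: Callable[[str], bool],
    manager_name: Optional[str] = None,
) -> str:
    """Cached wrapper so the probe runs at most once per uid."""
    if uid not in _domain_cache:
        _domain_cache[uid] = pick_launchd_domain(uid, has_service, manager_name)
    return _domain_cache[uid]
```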
Shannon Sands	6fe4821926	Add dashboard file browser paths	2026-06-10 09:53:12 -07:00
Teknium	d986bb0c6d	feat(dashboard): full-featured profile builder (model + skills + MCPs) (#39084 ) * feat(profiles): extend create endpoint for full profile-builder (model + MCPs + skills) Backend foundation for the dashboard profile builder. Extends POST /api/profiles to accept, in one call, everything a profile needs beyond name/clone: - mcp_servers[] -> written into the new profile's config.yaml - keep_skills[] -> replace-semantics: disable every seeded skill not kept - hub_skills[] -> async install via 'hermes -p <name> skills install <id>' All applied best-effort AFTER the profile dir exists, so a hiccup in any one never 500s the create. Model/MCP/keep-skills writes are profile-scoped via the HERMES_HOME context override (same mechanism as the existing _write_profile_model). Hub installs go through a subprocess scoped with -p because skills_hub.SKILLS_DIR is import-time-bound and the runtime override can't redirect it. Adds two helpers (_write_profile_mcp_servers, _disable_unselected_skills) and a TestClient test asserting all four paths land in the NEW profile's config and the hub spawn is scoped to it. Design doc at docs/design/profile-builder.md. * feat(dashboard): full-featured profile builder page Adds a dedicated /profiles/new builder that composes everything a profile needs into one stepped create flow, reusing the existing Models/Skills/MCP data paths instead of duplicating them: - Identity name + description - Model provider+model picker (api.getModelOptions) - Skills keep-which-built-in/optional (replace semantics, default = full bundle) + skills-hub search/add (api.getSkills, searchSkillsHub) - MCPs add HTTP/stdio servers inline - Review blueprint -> single POST /api/profiles create Nothing writes until Create; the one call commits model+MCPs+skill selection and spawns hub-skill installs (reported in the success toast). ProfilesPage header gets a 'Build' button (full builder) alongside 'Create' (quick modal). 
Route is page-only (not in the sidebar nav). Verified with vite build (2258 modules, green).	2026-06-10 09:18:32 -07:00
Teknium	a5c32cdf30	fix(update): self-heal a venv left half-built by an interrupted install (#42172 ) * fix(update): self-heal a venv left half-built by an interrupted install An update killed mid dependency-install (Ctrl-C, terminal close, WSL OOM) could leave the venv with pip wiped and core deps (e.g. Pillow) missing, with no automatic recovery — the user had to manually run ensurepip + reinstall. Drop an install-scoped .update-incomplete breadcrumb right before the dep install and clear it only after core-dependency verification passes. On the next launch (any command except 'update' itself), if the marker is present, unconditionally bootstrap pip via ensurepip then re-run the .[all] install + verification, then clear the marker. Failure leaves the marker for retry and prints the manual recovery command. Never raises — recovery cannot block launch. * fix(update): address review — stderr-only recovery output, single-flight lock, gitignore marker - Route all recovery output (status lines + streamed pip/uv install via fd-level dup2) to stderr so protocol-on-stdout launches (hermes acp) never get install noise on the JSON-RPC stream. - Single-flight O_EXCL lockfile (.update-incomplete.lock) so a gateway start + CLI launch (or two profiles) can't run concurrent installs into the shared venv; stale locks (>1h) are broken for the next launch. - gitignore .update-incomplete + lock so source-tree installs keep a clean git status and update's autostash skips them. - Document why the loose 'update' argv substring match is intentional (over-match defers one launch; under-match would race the real update). - 4 new tests: lock held → skip, stale lock broken, lock released, output lands on stderr only.	2026-06-10 02:57:05 -07:00
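The single-flight `O_EXCL` lockfile mentioned in the review follow-up is a standard pattern. The sketch below uses hypothetical names, and it assumes that a stale lock is broken and retried within the same call. The commit text says stale locks are broken "for the next launch", so the real behaviour may differ.

```python
import os
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_SECONDS = 3600  # locks older than one hour count as abandoned


def try_acquire_lock(lock_path: Path, now: Optional[float] = None) -> bool:
    """Atomically create the lockfile; return False if a live holder exists."""
    now = time.time() if now is None else now
    for _ in range(2):  # one retry after breaking a stale lock
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can win the creation race.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                age = now - lock_path.stat().st_mtime
            except FileNotFoundError:
                continue  # holder released between open() and stat(); retry
            if age <= STALE_AFTER_SECONDS:
                return False  # someone else is installing; skip recovery
            lock_path.unlink(missing_ok=True)  # break the stale lock
            continue
        os.close(fd)
        return True
    return False


def release_lock(lock_path: Path) -> None:
    """Remove the lockfile; tolerate it already being gone."""
    lock_path.unlink(missing_ok=True)
```

A caller would wrap the recovery install in `if try_acquire_lock(path): try: ... finally: release_lock(path)` and otherwise skip recovery for this launch.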
Ben Barclay	15813336cc	fix(config): preserve original .env file mode in remove_env_value too (#43349 ) #33699 fixed save_env_value so an operator-set .env mode (e.g. 0640 on a Docker bind-mount) survives a config write instead of being re-tightened to 0600 by the unconditional _secure_file() call. The sibling remove_env_value() had the identical bug: it restores original_mode and then unconditionally called _secure_file(env_path), clobbering the mode back to 0600 on every `hermes config remove KEY`. Apply the same fix: move _secure_file() into the else branch so it only runs when no original mode was captured (a freshly created .env still gets 0600 hardening; existing operator-set modes survive). Added test_remove_env_value_preserves_existing_file_mode_on_posix, which fails on the unfixed remove path (expected 0o640, got 0o600) and passes with the fix.	2026-06-10 19:53:07 +10:00
kshitij	2f19512341	fix(cli): repair non-UTF-8 stdout/stderr on all platforms, not just Windows (#43439 ) `hermes setup` (and other banner-printing commands) crash with an unhandled UnicodeEncodeError on Linux hosts whose locale selects a non-UTF-8 codec — e.g. a fresh Raspberry Pi / minimal Debian with a latin-1 or C/POSIX locale. The setup wizard prints box-drawing characters (┌│├└─) and the ⚕ glyph before any stream repair runs, so the command dies before it can start. The existing _ensure_utf8() shim already knew how to re-wrap the standard streams as UTF-8, but it returned early on `sys.platform != "win32"`, so the identical crash class on Linux was never covered. - Drop the win32 gate: repair any stdout/stderr whose encoding is not UTF-8. - Prefer TextIOWrapper.reconfigure() so the stream object is fixed in place (cached sys.stdout references keep working); fall back to reopening the fd with closefd=False (the CPython-recommended safe variant). - Use errors="replace" — matching the sibling hermes_cli/stdio.py shim — so a stray un-encodable byte degrades gracefully instead of crashing. - Only set the PYTHONUTF8/PYTHONIOENCODING child-process hints when a repair actually happened, so a healthy UTF-8 host sees zero footprint (no stream swap, no env mutation). This is intentionally the earliest, platform-agnostic guard, running at import time before any banner prints. hermes_cli/stdio.py::configure_windows_stdio() still runs later from the entry points for the Windows-only extras (console code-page flip, EDITOR default, PATH augmentation); it early-returns on non-Windows and its stream reconfigure is an idempotent no-op once we've already repaired the streams here. Add regression tests covering latin-1 and ascii/POSIX streams, the reconfigure fallback, already-UTF-8 no-op (identity preserved + no env mutation), the repair-sets-env and respects-explicit-env contracts, and hostile/None streams.	2026-06-10 02:21:00 -07:00
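The repair order described above (reconfigure in place, otherwise reopen the file descriptor with `closefd=False`, always with `errors="replace"`) can be sketched as follows. This is not the project's `_ensure_utf8()`: the function name and return convention are assumptions, and the environment-variable hints are left out.

```python
import codecs
import io


def ensure_utf8(stream):
    """Return a stream that encodes as UTF-8, repairing it if necessary."""
    if stream is None:
        return None
    try:
        already_utf8 = codecs.lookup(stream.encoding).name == "utf-8"
    except (AttributeError, LookupError, TypeError):
        already_utf8 = False
    if already_utf8:
        return stream  # healthy host: no stream swap at all
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        try:
            # Fixes the object in place, so cached references keep working.
            reconfigure(encoding="utf-8", errors="replace")
            return stream
        except Exception:
            pass
    try:
        # Fallback: wrap the same fd without taking ownership of it.
        return open(stream.fileno(), "w", encoding="utf-8",
                    errors="replace", closefd=False)
    except Exception:
        return stream  # hostile stream: leave it untouched


if __name__ == "__main__":
    latin = io.TextIOWrapper(io.BytesIO(), encoding="latin-1")
    print(ensure_utf8(latin).encoding)
```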
teknium1	fa32af886f	fix: dedupe concurrent gateway restarts + surface restart outcome in onboarding UI Follow-ups to the salvaged Telegram QR onboarding auto-restart: - _spawn_gateway_restart() reuses a live in-flight 'hermes gateway restart' child instead of spawning a second racing one (stale cached frontend + new backend both requesting a restart, or restart-button double-click). Both /api/gateway/restart and the onboarding apply path go through it. - ChannelsPage polls /api/actions/gateway-restart/status after a server-initiated restart and surfaces a non-zero exit (e.g. systemd linger missing) via the manual-restart banner, since restart_started only means the child spawned. - Test for the reuse path + _ACTION_PROCS isolation in existing tests.	2026-06-10 01:35:12 -07:00
Shannon Sands	984e69ff62	Auto-restart gateway after Telegram QR onboarding	2026-06-10 01:35:12 -07:00
Teknium	298bb93d39	feat(skills): show live per-source progress while browsing (#43398) do_browse waited on a frozen 'Fetching skills...' spinner while sources resolved, so a slow source looked like a hang. parallel_search_sources already exposes an on_source_done(sid, count) callback fired as each source completes; wire it into the status line so it ticks off sources live (official (12), + github (4), + clawhub (500)). 
The page is still rendered once, after the full set is merged and trust-sorted, so browse's official-first ordering and pagination contract are untouched.	2026-06-10 01:02:40 -07:00
Robin Fernandes	af978ecb17	fix(model): require confirmation for expensive model selections Rebased onto current main and re-ported across the restructured surfaces: model flows now thread confirm_provider/base_url/api_key through hermes_cli/model_setup_flows.py, the Discord picker lives in plugins/platforms/discord/adapter.py, and the web dashboard picker applies chat-mode switches via config.set so the expensive-model confirmation can ride the response. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:24:06 -07:00
Ondrej Drapalik	1c055a4c58	fix(xai): accept Grok Build code during loopback wait + tiny screenshot guard xAI's consent page renders the authorization code in-page instead of redirecting to the loopback callback, so the listener just hangs and the manual-paste flow demands a callback URL that never contains the token. - auth.py: poll stdin non-blockingly while waiting for the xAI loopback callback; accept a pasted bare Grok Build code and substitute the locally generated state (PKCE code_verifier still binds the exchange). No need to wait for timeout or re-run with --manual-paste. - computer_use: parse PNG/JPEG dimensions from base64 and fall back to the text/AX/SOM payload when the screenshot is below the provider minimum (8x8), which xAI rejects with HTTP 400. - model_setup_flows.py: xAI credential reuse prompt uses the standard radio picker via a shared _prompt_auth_credentials_choice helper. - main.py: thread a title through _prompt_provider_choice; re-home the helper import (flows live in model_setup_flows.py post-decomposition). Salvaged from #36781 onto current main (contributor's main.py edits re-homed to model_setup_flows.py, where the flows were extracted since the PR opened).	2026-06-09 23:21:24 -07:00
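The tiny-screenshot guard from this commit depends on reading image dimensions out of a base64 payload. For PNG the width and height sit at fixed offsets in the IHDR chunk, so a sketch is short. The helper names are hypothetical, JPEG parsing is omitted, and treating an unrecognised format as usable is an assumption.

```python
import base64
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_base64(data_b64: str) -> Optional[Tuple[int, int]]:
    """Return (width, height) for a PNG payload, or None if not a PNG."""
    raw = base64.b64decode(data_b64)
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR",
    # then big-endian 32-bit width and height.
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", raw[16:24])
    return width, height


def screenshot_is_usable(data_b64: str, minimum: int = 8) -> bool:
    """False when a PNG is smaller than the provider minimum on either side."""
    size = png_size_from_base64(data_b64)
    if size is None:
        return True  # unknown format: do not block (assumption)
    return min(size) >= minimum
```

When the check fails, the commit falls back to the text/AX/SOM payload instead of sending the image.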
Teknium	095f526b11	refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354) The shipped tri-state write_mode (on|off|approve) conflated two concepts (whether writes are enabled and whether they're gated), so 'on' (writes flow freely, gate inactive) read like 'gating is on'. Replace it with a single clear boolean gate that defaults off. memory.write_approval / skills.write_approval: false (default) = write freely; the approval gate is off (pre-gate behaviour). true = require approval: memory foreground prompts inline, memory background-review + all skill writes stage for review. The old 'off = block all writes' mode is dropped; memory_enabled: false already disables memory entirely, so a third 'block' state was redundant. - tools/write_approval.py: get_write_mode/MODE_* → write_approval_enabled() bool; evaluate_gate() loses the config-driven 'blocked' path (blocked now only comes from an interactive user denial). - tools/memory_tool.py, tools/skill_manager_tool.py: comment + behaviour follow. - hermes_cli/config.py: memory/skills write_mode → write_approval (False); _config_version 28→29 with a 28→29 migration that renames any persisted write_mode (approve→true, on/off/unset→false) and drops the old key. - slash commands: '/memory|/skills mode <on|off|approve>' → 'approval <on|off>' ('mode' kept as a back-compat alias); set_mode_fn callback now takes a bool. - write_approval_commands.py, cli_commands_mixin.py, gateway/slash_commands.py, commands.py: handlers + registry args/subcommands updated. - docs + tests rewritten for the boolean model; added migration tests.	2026-06-09 23:21:14 -07:00
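The 28→29 migration rule stated above (approve→true, on/off/unset→false, old key dropped) is small enough to sketch directly. The function name and the plain-dict config shape are assumptions. The keys `write_mode`, `write_approval` and `_config_version` come from the commit message.

```python
import copy
from typing import Any, Dict


def migrate_28_to_29(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a migrated copy: write_mode (tri-state) -> write_approval (bool)."""
    migrated = copy.deepcopy(config)
    for section in ("memory", "skills"):
        settings = dict(migrated.get(section) or {})
        old_mode = settings.pop("write_mode", None)  # drop the old key
        # Only 'approve' maps to a gated write; on/off/unset all become False.
        settings.setdefault("write_approval", old_mode == "approve")
        migrated[section] = settings
    migrated["_config_version"] = 29
    return migrated


if __name__ == "__main__":
    before = {"_config_version": 28, "memory": {"write_mode": "approve"}}
    print(migrate_28_to_29(before))
```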
Ben Barclay	63a421d4c0	fix(dashboard): _require_token endpoints all 401 behind the OAuth gate (#42578 ) * fix(dashboard): let _require_token endpoints work behind the OAuth gate In gated/OAuth mode (non-loopback bind without --insecure) the dashboard authenticates the SPA via a session cookie and deliberately does NOT inject the legacy ephemeral _SESSION_TOKEN into index.html. gated_auth_middleware verifies the cookie and attaches request.state.session before any non-public /api/ route runs; the legacy auth_middleware short-circuits in this mode too. But several handlers call _require_token() directly, which only validated the (absent) _SESSION_TOKEN header. So every cookie-authenticated request to those endpoints 401'd — making plugin install/enable/disable, /api/dashboard/plugins/hub, and the other _require_token routes permanently unreachable behind the gate. In the UI this surfaced as a 401: {"detail":"Unauthorized"} popup on plugin install for any publicly-bound (e.g. Fly-hosted NAS) dashboard. Fix: _require_token now defers to the active gate. When auth_required is True it accepts the request iff the gate attached a verified session (and 401s otherwise); loopback/--insecure behavior is unchanged (still validates the session token). Adds two regression tests driving the full in-process stub OAuth round trip: the install endpoint must NOT 401 a logged-in request, and must still 401 with no cookie. Verified the accept-test fails on the pre-fix code. * test(dashboard): cover the whole _require_token route class under the gate The install popup was one symptom of a class-wide bug: all 14 endpoints that call _require_token directly (API-key reveal, provider validation, the OAuth-provider connect/disconnect flow, and plugin enable/disable/update/ delete/visibility/providers) 401'd cookie-authenticated requests in gated mode. 
Add a parametrized test hitting a representative spread (plugins/hub, env/reveal, providers/validate, an oauth provider route, agent-plugin enable) asserting a logged-in caller is never 401'd — proving the fix covers the class, not just agent-plugins/install.	2026-06-09 22:57:49 -07:00
Ben Barclay	e4a1b35a39	fix(config): preserve original .env file mode instead of unconditionally tightening to 0600 (#33699 ) `save_env_value()` captures the original .env file mode (e.g. 0640 for Docker volume mounts) and restores it via `os.chmod` — but then unconditionally calls `_secure_file(env_path)` on the next line, which re-tightens the mode to 0600 and defeats the entire preservation logic. The intent (preserve when `original_mode` is captured, secure otherwise) was already in the code but got short-circuited. Move `_secure_file()` into the `else` branch so it only runs when no original mode was captured — fresh `.env` files written for the first time still get the 0600 hardening treatment, but operator-set modes survive subsequent writes. Salvages #31518 by @blut-agent (config.py portion only). Their PR also bundled unrelated lowercase-lookup changes in `hermes_cli/commands.py`; this salvage takes only the focused config fix. The commands.py changes are reasonable on their own merits but belong in a separate PR. Co-authored-by: blut-agent <278569635+blut-agent@users.noreply.github.com>	2026-06-10 15:42:16 +10:00
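The control-flow fix described here (hardening moves into the `else` branch so it no longer overrides a restored mode) looks roughly like this. `write_env_file` is a hypothetical stand-in for `save_env_value()`, and a plain `os.chmod(..., 0o600)` stands in for `_secure_file()`. The sketch assumes POSIX permission semantics.

```python
import os
import stat
from pathlib import Path


def write_env_file(env_path: Path, content: str) -> None:
    """Write the file, keeping an operator-set mode if one already existed."""
    original_mode = None
    if env_path.exists():
        original_mode = stat.S_IMODE(env_path.stat().st_mode)

    env_path.write_text(content)

    if original_mode is not None:
        # Existing file: restore what the operator chose (e.g. 0640).
        os.chmod(env_path, original_mode)
    else:
        # Fresh file: apply the 0600 hardening. Before the fix this ran
        # unconditionally and undid the restore above.
        os.chmod(env_path, 0o600)
```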
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199) Adds memory.write_mode and skills.write_mode (on|off|approve), applied to both foreground turns and the background self-improvement review fork, the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt). Review staged writes from CLI or any messaging platform: /memory pending|approve|reject|mode /skills pending|approve|reject|diff|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
Ben Barclay	5cf6e28a2f	fix(gateway): auto-start after container restart via planned-stop marker (#42675 ) (#43236 ) * fix(gateway): auto-start after container restart via planned-stop marker On Docker (s6-overlay), the gateway runs as a dynamically-registered s6 service. When the container stops/restarts/upgrades, s6 sends the gateway a plain SIGTERM. The shutdown path (_stop_impl) ended with an unconditional _update_runtime_status("stopped"), persisting gateway_state=stopped to the volume. container_boot.py reads that on the next boot and only auto-starts gateways whose last state was "running" (_AUTOSTART_STATES) — so after a routine `docker compose up --force-recreate` the gateway stays down and messaging channels silently go dark, with no error surfaced (issue #42675). The codebase already distinguishes intentional stops from unexpected signals via the planned-stop marker (write_planned_stop_marker / consume_planned_stop_marker_for_self): `hermes gateway stop`, systemd/launchd ExecStop, and Ctrl+C write a marker before signalling, so the handler classifies them as planned. An unmarked SIGTERM (container/s6 restart, OOM, bare kill) is signal-initiated. This wires that existing classification through to the state persist, rather than adding unreliable signal-source inference: - run.py: GatewayRunner._signal_initiated_shutdown, set in shutdown_signal_handler's unmarked-signal branch. In _stop_impl, a signal-initiated (non-restart) teardown now persists "running" instead of "stopped" — preserving the operator's run-intent and overwriting the mid-shutdown "draining" marker so _AUTOSTART_STATES matches on reboot. Operator stops and restarts persist "stopped" as before. - service_manager.py: S6ServiceManager.stop() now writes the planned-stop marker for the supervised PID (read from s6-svstat) before `s6-svc -d`, so an in-container `hermes gateway stop` is correctly classified as intentional (parity with the systemd/launchd/host stop paths, which already mark). 
Best-effort: a marker-write failure falls back to the safe signal-initiated path. Tests: shutdown persist-decision table (signal→running, operator→stopped, restart→stopped), s6 stop marker write + svstat PID parse + failure tolerance. The signal→running and s6-marker tests fail without the respective source change. Verified end-to-end against a container built from this branch: an unmarked SIGTERM to the live gateway leaves gateway_state=running (shutdown-context log confirms signal path); existing real container-restart suite still green. * docs(docker): clarify gateway autostart distinguishes operator-stop from container-kill The per-profile-supervision section described the autostart-across-restart contract as "running gateways come back, stopped stay stopped" without spelling out what records 'stopped'. That contract was the source of #42675 confusion: users expected a restart to bring the gateway back and it didn't. With the write-side fix, only an explicit `hermes gateway stop` records 'stopped'; container/s6 restart SIGTERMs (incl. image upgrades and unexpected exits) leave the state 'running' so the gateway auto-starts. Make that distinction explicit in both the multi-profile and per-profile-supervision sections. * test(docker): real-restart autostart E2E for #42675 Adds test_live_gateway_autostarts_after_real_restart_without_manual_state_stamp: a live s6-supervised gateway is killed by an actual `docker restart` SIGTERM (no manual gateway_state stamp, no planned-stop marker) and must auto-start on the next boot. Exercises the WRITE side of the fix that the existing stamp-based tests bypass. Verified to FAIL against an origin/main image (reconciler logs prior_state=stopped action=registered — the #42675 bug) and PASS against the fixed image (prior_state=running action=started).	2026-06-10 14:01:34 +10:00
Ben Barclay	7df3aa34b1	fix(dashboard-auth): warn when public_url override is silently rejected (#43214 ) A non-empty HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url value that fails URL validation (overwhelmingly: a missing http(s):// scheme, e.g. "hermes.domain.com") was silently discarded by resolve_public_url(), falling back to reconstructing the OAuth redirect_uri from request headers. Behind a reverse proxy that doesn't forward X-Forwarded-Proto reliably, that yields an http:// callback even though the operator explicitly set the public URL — with no signal as to why (#42780). Emit a deduplicated operator-facing WARNING (once per distinct value, since resolve_public_url runs per request) naming the offending value and the required scheme. Turns a silent footgun into a self-diagnosing one; behaviour is otherwise unchanged. Tests assert the warning fires for a scheme-less value, is deduplicated across repeated calls, and stays silent for a valid value — all three fail without the fix.	2026-06-10 12:14:57 +10:00
Teknium	57c6714995	fix(models): keep curated Anthropic aliases in /model picker (#43103 ) The Anthropic picker returned the live /v1/models dump verbatim whenever credentials were configured. Anthropic's API lags newly-routed curated aliases (e.g. claude-fable-5, reachable on Anthropic before the models endpoint enumerates it), so the curated entry vanished from the picker. Merge curated _PROVIDER_MODELS["anthropic"] with the live catalog — curated first, live-only appended, deduped — mirroring the OpenAI curated-merge path. Live failure / no creds falls back to curated verbatim.	2026-06-09 14:45:19 -07:00
brooklyn!	ba44de06da	fix(install): self-heal a stuck Electron download (salvage of #42894 ) (#42998 ) * fix(install): self-heal a stuck Electron download on the desktop build The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached zip, or a blocked/throttled GitHub release host (the repeating "retrying" log), hard-failed the install — and install.sh had no recovery at all while install.ps1 / `hermes desktop` only purged the cache. All three build paths now escalate on a failed `npm run pack`: GitHub → purge corrupt electron-.zip + stale -unpacked and retry → one retry via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the download, and a user-pinned ELECTRON_MIRROR is always respected (never overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror the existing PowerShell/Python helpers. * test(install): cover the Electron mirror fallback Verify `hermes desktop` falls back to a mirror when the cache purge finds nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt, not overridden). * docs(desktop): troubleshoot a stuck Electron download Document the automatic cache-purge + mirror fallback, how to pin your own ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand. * docs(install): correct the Electron mirror trust framing The mirror-fallback comments and the desktop troubleshooting doc implied `@electron/get`'s SHASUM check makes the npmmirror.com download safe against tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so the check guards against a corrupt/partial download, not a compromised mirror. Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the docs) to state the trust trade-off honestly — npmmirror.com is the de-facto Electron community mirror, we only fall back to it after the canonical GitHub download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No behavior change. 
--------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>	2026-06-09 18:19:14 +00:00
Teknium	f6f573ebaa	feat(plugins): install from a subdirectory within a repo (#42963 ) Support installing a plugin that lives in a subdirectory of a larger repo (docs/tests at root, plugin in a subdir) without forcing a dedicated single-plugin repo. Identifier syntax: owner/repo/path/to/plugin (shorthand + subpath) <url>.git/path/to/plugin (.git boundary on GitHub-style URLs) <url>#path/to/plugin (explicit fragment, any scheme) _resolve_git_url now returns (git_url, subdir); _install_plugin_core reads the manifest from and moves only the subdir, so root-level docs and tests no longer leak into ~/.hermes/plugins. _resolve_subdir_within guards against path traversal, missing dirs, and non-directories. Both the CLI (hermes plugins install) and the dashboard install endpoint inherit this for free since they share _install_plugin_core. Dashboard install hint + placeholder updated to advertise the subdir syntax. Co-authored-by: Austin Pickett <pickett.austin@gmail.com>	2026-06-09 13:42:51 -04:00
Gille	c6dc2fcd21	fix(desktop): release profile backends before delete (#42613)	2026-06-09 10:52:02 -05:00
helix4u	f8adefdebf	fix(tui): apply terminal backend config before launch	2026-06-09 00:31:27 -07:00
Teknium	50ad191a8b	test(hermes_cli): harden concurrent-gate fixture against partial-import race (#42626 ) The autouse _suppress_concurrent_hermes_gate fixture did monkeypatch.setattr(main, '_detect_concurrent_hermes_instances', ...) with no raising=False. Its try/except guards the import but not the setattr, so under pytest's per-test spawn isolation a transiently partial hermes_cli.main module (one a concurrent worker is mid-importing) made setattr raise AttributeError and errored unrelated tests in the slice. Add raising=False so a transiently-absent attribute is a no-op default rather than a hard error. The attribute always exists once main.py finishes importing; the real-function opt-out (@pytest.mark.real_concurrent_gate) is unaffected.	2026-06-08 22:54:25 -07:00
Ben Barclay	a46462ec65	fix(cli): persist custom --portal-url to .env on dashboard register (#42435 ) * fix(cli): persist custom --portal-url to .env on dashboard register `hermes dashboard register --portal-url <url>` resolved the custom portal for the registration request but only persisted it to .env when the var was absent AND non-default. So a user who re-registered against a different portal (e.g. switching preview deploys) silently kept the stale HERMES_DASHBOARD_PORTAL_URL, and an explicit request for the production portal was never written at all. Track whether a custom portal was explicitly supplied (--portal-url flag or HERMES_DASHBOARD_PORTAL_URL env), separately from the resolved value: - explicit custom URL -> always persist (update in place via save_env_value, which overwrites the matching key rather than appending a duplicate), even when it equals the production default; no-op when it already matches. - no custom URL supplied -> unchanged conservative behaviour: only write an inferred portal when absent and non-default; never alter an existing entry unexpectedly. save_env_value already preserves other lines/comments and dedups in place; this only changes the decision of when to call it. Adds TestCustomPortalPersistence covering all four cases. Co-authored-by: Hermes Agent <agent@nousresearch.com> * feat(cli): persist dashboard public URL from --redirect-uri on register When the user registers a publicly-exposed dashboard with --redirect-uri (the full OAuth callback, e.g. https://hermes.example.com/auth/callback), derive its origin and persist it as HERMES_DASHBOARD_PUBLIC_URL — the env var the dashboard auth layer actually consumes at serve time. dashboard_auth/routes._redirect_uri reconstructs the callback as HERMES_DASHBOARD_PUBLIC_URL + "/auth/callback" (verbatim), and dashboard_auth/prefix.resolve_public_url reads that var (then config.yaml dashboard.public_url) to decide the public origin. 
Previously --redirect-uri was sent to the portal at registration but never persisted, so the operator had to set HERMES_DASHBOARD_PUBLIC_URL by hand for the login gate to engage and the callback to round-trip. We now wire it automatically. Persist the ORIGIN (scheme://host[:port]), not the full callback path — persisting the raw redirect would double the path when the runtime appends /auth/callback. Mirrors the portal-url persistence semantics already in this PR: always write an explicitly-derived value (updating in place, no duplicate), no-op when it already matches, never written on a localhost-only install (no --redirect-uri), and skipped for a non-http(s)/malformed redirect. Verified end-to-end: cmd_dashboard_register writes the origin to .env, then resolve_public_url() reads it back and public_url + /auth/callback reconstructs exactly the originally-supplied --redirect-uri. Adds TestPublicUrlPersistence (8 cases) incl. origin-derivation, port preservation, update-in-place, no-op, no-flag, non-http skip, and both-portal-and-public-url-persisted. Co-authored-by: Hermes Agent <agent@nousresearch.com> --------- Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-09 13:56:33 +10:00
Ben Barclay	52ae9d9f02	feat(dashboard): make `hermes dashboard register` idempotent (#42455 ) Re-running `hermes dashboard register` now updates the existing dashboard record in nous-account-service instead of creating a duplicate. The stable key is the client_id this install already persisted in HERMES_DASHBOARD_OAUTH_CLIENT_ID on a prior run: - No stored client_id -> first registration -> create a fresh client with an auto-generated name (unchanged behavior). - Stored client_id present -> re-send it as `client_id` so the portal updates that row in place. Without an explicit --name, the name is omitted so the portal-stored name isn't churned to a new random value on every re-run. - Prints "Updated dashboard" vs "Registered dashboard" based on whether the portal echoed back the same client_id. A stale/deleted id safely falls through to a fresh create server-side. Requires the matching nous-account-service change (POST /api/oauth/self-hosted-client accepting an optional client_id + optional name). Tests: 7 new TestIdempotentRerun cases (key sent, name preserved/overridden, Updated message, persisted id, stale-id fall-through, blank-id first-run); existing create-path tests unchanged (23 pass).	2026-06-09 13:19:35 +10:00
helix4u	732ababa1a	fix(doctor): allow vendor slugs for named custom providers	2026-06-08 15:53:09 -07:00
Robin Fernandes	639c1e3636	feat(sessions): add optional max session cap	2026-06-08 15:12:12 -07:00
Brooklyn Nicholson	e88116256c	fix(update): scope git fetch to target branch A bare `git fetch origin` (and `git fetch upstream`) pulls every ref. The repo carries thousands of auto-generated branches, so on any non-single-branch checkout the installer's update path and `hermes update` spend minutes downloading the full branch list — long enough to stall the desktop installer or trip the follow-up `git pull --ff-only`. Scope every update-path fetch to the branch we actually compare/merge against: - scripts/install.sh: collapse the remote to single-branch and fetch only $BRANCH on the "existing install, updating" path. - hermes_cli/main.py: fetch the resolved branch in the apply path, the --check path (upstream + origin), and the fork upstream-sync. Tracking-ref updates still happen via git's opportunistic refspec, so the later origin/<branch> rev-parse/rev-list checks are unaffected. Tests assert the apply-path fetch is branch-scoped and never bare.	2026-06-08 15:24:31 -04:00
teknium1	c78b3e1d3c	fix(auth): add Codex OAuth accounts as distinct pool entries hermes auth add openai-codex now creates an independent manual:device_code pool entry per account instead of routing through the singleton _save_codex_tokens save path, which collapsed every added account into the latest login (the second add overwrote the first account's singleton-mirrored device_code entry). This is the add-path half of #39236; PR #39243 (already on this branch) fixes the re-auth half. manual:device_code entries refresh from their own token pair (_sync_codex_entry_from_auth_store only adopts the singleton for source=="device_code"), so they need no providers.openai-codex shadow. Adding the first credential marks openai-codex active (the singleton path did this implicitly) so the setup wizard's get_active_provider() check still passes; subsequent adds leave the active provider untouched. Adds SOURCE_MANUAL_DEVICE_CODE constant and a regression test that two distinct accounts keep distinct token pairs. Updates two existing add tests to the pool-only behavior. Co-authored-by: glesperance <info@glesperance.com>	2026-06-08 11:57:03 -07:00
Ted Malone	761b744abb	fix(auth): preserve independent Codex pool entries on re-auth (#39236) The #33538 fix refreshed every credential_pool entry with source "manual:device_code" on every Codex OAuth re-auth, on the assumption that such entries were always legacy aliases of the singleton from the #33000 workaround era. That assumption is no longer true: `hermes auth add openai-codex` also produces "manual:device_code" entries for independent ChatGPT accounts, and the broad sync silently clobbered them with the latest-authenticated token pair (labels preserved, token material overwritten, status / quota readings then lie). Narrow the sync: refresh a "manual:device_code" entry only when its existing access_token matches the previous singleton access_token (true legacy alias). Entries with distinct token material represent independent accounts and are now left alone. Error markers are cleared only on entries actually rewritten, so an independent account's own 429 / 401 state survives a re-auth that targeted a different account. Tests: * New: independent acctB/acctC are not overwritten when acctA re-auths. * New: legacy singleton-alias still refreshed (preserves #33538). * New: missing previous singleton state handled (no crash, no false alias match). * New: access_token-only alias match (legacy schema without refresh_token still recognized). * New: error markers cleared only on entries actually refreshed. * Updated: existing manual-device-code sync test now covers both the legacy-alias path AND the independent-account path in one fixture. Behaviour change is zero for users with a single Codex account and zero for users whose only "manual:device_code" entry is the legacy alias of the singleton. Users with multiple independent Codex accounts added via `hermes auth add` now keep their distinct token material across re-auths. Local: 29 passed in tests/hermes_cli/test_auth_codex_provider.py, no new failures in tests/hermes_cli/ vs upstream/main baseline. Fixes #39236.	2026-06-08 11:57:03 -07:00
xxxigm	96fd9d4979	fix(desktop): stop running Hermes.exe locking win-unpacked before Windows pack (#42100 ) * fix(desktop): stop running app locking win-unpacked before pack On Windows a running Hermes.exe keeps an exclusive lock on release/win-unpacked/Hermes.exe, so electron-builder's pack cannot replace it and dies with "remove ...\Hermes.exe: Access is denied" / ERR_ELECTRON_BUILDER_CANNOT_EXECUTE (before-pack hits the same EPERM cleaning the dir, and the cache-purge retry repeats the failure since the lock is still held). Before building the packaged app, terminate any process whose executable lives inside this build's release/ tree so the rebuild -- including the installer's headless --update rebuild -- can replace the binary. Scope is narrow (only exes under release/), POSIX is a no-op (it can unlink a running binary), and the final error now points Windows users at the running-app cause. * test(desktop): cover the win-unpacked lock-breaker helper Verify _stop_desktop_processes_locking_build is a no-op off-Windows, terminates only processes whose exe lives under release/ (sparing our own PID and unrelated installs), and short-circuits when no release dir exists.	2026-06-08 11:51:31 -07:00
Teknium	abcf996b1f	feat(windows): enable dashboard /chat tab via ConPTY (win_pty_bridge) + tests (#42251 ) * feat(windows): enable dashboard chat tab via ConPTY (win_pty_bridge) Add hermes_cli/win_pty_bridge.py — a pywinpty-backed drop-in for PtyBridge with the same spawn/read/write/resize/close surface — and wire it into the web_server PTY import block so Windows picks it up instead of falling back to None. pywinpty is already a declared win32 dependency (pyproject.toml). The ConPTY read path runs inside run_in_executor so the event loop is never blocked. Spawn/read/write/terminate call shapes are taken directly from tools/process_registry.py which already exercises the same pywinpty version. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: remove WSL2-only caveat for dashboard chat tab The chat pane now works on native Windows via the ConPTY bridge added in the previous commit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(windows): cover ConPTY bridge + web_server platform-branched import Companion to the bridge added in the previous commits. Verified live on native Windows 11 (pywinpty 2.0.15) against `hermes dashboard`'s `/api/pty` WebSocket: the spawned `hermes --tui` (node entry.js) renders through ConPTY, resize escapes reach `setwinsize`, and closing the WS reaps both the node child and the pywinpty agent with zero orphans. tests/hermes_cli/test_win_pty_bridge.py Mirrors the layout of the existing POSIX test_pty_bridge.py: spawn/io/resize/close/env coverage against cmd.exe and python -c, plus the cross-platform fallback surface (PtyUnavailableError, the off-Windows `spawn -> raises PtyUnavailableError` guard, and the load-bearing _clamp() helper that protects setwinsize from garbage winsize values out of xterm.js). 
tests/hermes_cli/test_web_server_pty_import.py Asserts that web_server.PtyBridge resolves to WinPtyBridge on win32 and to the POSIX PtyBridge on POSIX, that PtyUnavailableError is the matching class on each side (so isinstance checks in /api/pty's spawn fallback path work), and a source-text check that pins the platform-branched import shape so a future refactor can't quietly collapse it back to a POSIX-only import. scripts/release.py AUTHOR_MAP entries so CI release-note generation can resolve both authors' plain (non-noreply) emails to their GitHub logins. Co-Authored-By: JoelJJohnson <josephjohnson.joel@gmail.com> Co-Authored-By: Nea74 <andreas@schwarz-ketsch.de> --------- Co-authored-by: JoelJJohnson <josephjohnson.joel@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Nea74 <andreas@schwarz-ketsch.de>	2026-06-08 11:32:43 -07:00
Teknium	9c9d9113a8	fix(auth): auto-detect OpenRouter credential from the pool, not just env (#42263 ) resolve_provider() auto-detection only checked OPENROUTER_API_KEY/ OPENAI_API_KEY env vars, never the credential pool. A key added via `hermes auth add openrouter` (manual pool entry, no env var) was invisible: the provider failed to resolve or resolved with an empty api_key, so requests went out with no Authorization header and OpenRouter returned "HTTP 401: Missing Authentication header" while `hermes auth list` showed the credential. Closes #42130. - auth.py: check load_pool("openrouter").has_credentials() after the env check - dump.py: `debug share` shows 'openrouter set (auth pool)' instead of the misleading 'not set' when the key lives in the pool - add regression tests (pool credential auto-detects; empty pool still raises)	2026-06-08 10:01:47 -07:00
teknium1	a77efada5f	refactor(cli): extract 18 model-flow wizard functions into model_setup_flows (god-file Phase 2) Lift the 18 _model_flow_* provider-setup wizard functions out of hermes_cli/main.py into hermes_cli/model_setup_flows.py. Behavior-neutral; main.py 14050 -> 11479 LOC. select_provider_and_model (the dispatcher) STAYS in main.py and re-imports the flows via an explicit 'from hermes_cli.model_setup_flows import (...)' block, so both its bare-name calls and existing test monkeypatches targeting hermes_cli.main._model_flow_* keep resolving against main's namespace unchanged. Imports: 3 neutral deps (argparse, os, subprocess) at the module top; the 14 main.py-internal helpers the flows call (_prompt_api_key, _save_custom_provider, the reasoning-effort/stepfun/qwen helpers, _run_anthropic_oauth_flow, ...) are lazy-imported per-flow (from hermes_cli.main import ...) so the new module never imports main at module scope -> no import cycle. Repointed one source-inspection change-detector (test_setup_ollama_cloud_force_refresh) to read the module the ollama-cloud branch moved to. Validation: 6563/6563 hermes_cli tests pass; live flow-dispatch probe confirms the lazy main-internal imports resolve at runtime.	2026-06-08 09:42:44 -07:00
yoniebans	9e360681f8	feat(dashboard): return recent commits from /api/hermes/update/check Add a best-effort `commits` list (sha/summary/author/at) to the update-check response for git/pip installs that are behind upstream, so the desktop's remote update overlay can show what's changed before applying. Additive and non-breaking: existing consumers (legacy dashboard, tests using subset assertions) ignore the new field. Leaves the shared check_for_updates() int contract untouched — commits come from a separate best-effort git call.	2026-06-08 08:58:26 -07:00
paulb26	b31c6c33b2	fix(pty-bridge): terminate PTY process groups on teardown	2026-06-08 07:03:12 -07:00
kshitij	b99c6c4277	Merge #42076 : nested category plugin discovery + alias-normalized enable/disable (#41066 ) Merge #42076: nested category plugin discovery + alias-normalized enable/disable (#41066) Lands the complete nested category plugin fix: - Discovery in `hermes plugins list` (from @islam666's #41076, carried in this PR) - Alias-normalized enable/disable mutation path so nested plugins can be toggled - Fixes the #41076 base breakages (web_server 6-tuple unpack + stale test fixtures) Co-authored work: discovery by @islam666 (#41076). Closes #41066.	2026-06-08 05:47:27 -07:00
kshitijk4poor	2b89afec79	fix(plugins): alias-normalize enable/disable for nested category plugins (follow-up to #41076 ) #41076 makes `hermes plugins list` discover nested category plugins (e.g. observability/nemo_relay). This adds the missing enable/disable mutation path so those plugins can actually be toggled, and fixes two incomplete-update breakages on the #41076 base. Before: `hermes plugins enable nemo_relay` -> "Plugin 'nemo_relay' is not installed or bundled." (exit 1), because cmd_enable/cmd_disable went through _plugin_exists(), which only checked top-level plugins/<name>/. Changes: - Add _resolve_plugin_key(): resolve a bare manifest/leaf name OR a full path-derived key (observability/nemo_relay) to the canonical key the runtime loader gates on, reusing #41076's _discover_all_plugins(). A bare leaf name ambiguous across two categories resolves to None rather than silently picking one. - cmd_enable/cmd_disable resolve first, persist the canonical key, and drop any stale legacy bare-name alias so the enabled/disabled lists can't drift into a contradictory state. _plugin_exists delegates to the same resolver. - Fix #41076 base breakages: _discover_all_plugins now returns 6-tuples, but web_server._merged_plugins_hub() still unpacked 5 (ValueError on the dashboard plugins-hub endpoint) and several test_plugins_cmd_list.py fixtures were still 5-tuples. Both updated; the hub status check is now key-aware. Verified e2e on the real CLI + runtime loader (isolated HERMES_HOME): `hermes plugins enable nemo_relay` writes observability/nemo_relay to config.yaml and the loader then loads it (enabled=True, error=None); a stale bare-name alias is cleared on disable; the dashboard _merged_plugins_hub() runs without crashing. Adds resolution + enable/disable tests; full tests/hermes_cli/test_plugins_cmd* + web_server plugin tests green. Follow-up to #41076 (#41066). Branched from that PR's head.	2026-06-08 17:57:37 +05:30
floory	15c99b437f	fix(cli): set PYTHON env for node-gyp native builds on NixOS (#40690 ) * fix(cli): set PYTHON env for node-gyp native builds on NixOS node-gyp (triggered by node-pty during npm ci) looks for python3 on PATH, which fails on NixOS because python3 lives in the nix store and is not on the system PATH. Add _nixos_build_env() — a two-tier helper that detects NixOS and: 1. Fast path: hermes venv python3 (~0s) 2. Fallback: nix-shell which python3 (~2-5s) Wire it into _run_npm_install_deterministic() via a new env= parameter, then pass it through cmd_gui() and _update_node_dependencies(). Non-NixOS systems: _nixos_build_env() returns None, behavior unchanged. * fix(cli): merge _nixos_build_env() with os.environ, fix NixOS detection, add explicit return None - Critical fix: both Tier 1 (venv) and Tier 2 (nix-shell) now return {**os.environ, "PYTHON": ...} instead of {"PYTHON": ...} — subprocess.run with env= replaces the entire environment, so the old code wiped PATH and broke npm/node on NixOS entirely. - Uses re.search(r"^ID=nixos$", ...) for anchored NixOS detection instead of unanchored substring match (could match ID_LIKE=...nixos). - Removes redundant Path.exists() guard before read_text(); just catches OSError (one filesystem read instead of two). - Adds explicit return None at end of function for type-hint consistency.	2026-06-08 13:57:37 +05:30
Teknium	4d18717b6c	fix(gateway): drop --replace from systemd unit templates (#41892 ) Under systemd's Restart=always, --replace turns every restart into a self-kill loop: the new instance reads gateway.pid, kills the previous process, writes its own PID, and on the next restart the cycle repeats. A process supervisor owns the lifecycle — --replace is for manual one-shot takeovers and fights the supervisor. Remove --replace from both the system-level and user-level systemd ExecStart lines. The --replace flag stays available for manual 'hermes gateway run --replace' and on the macOS launchd fallback path (#23387), which is a deliberate manual takeover, not a supervised unit. Also drop RestartMaxDelaySec / RestartSteps from the templates — they require systemd v255+ and are silently ignored on older versions. The _strip_optional_systemd_directives normalizer stays so existing installs whose on-disk unit still carries those directives aren't flagged as outdated. Credit: reported and diagnosed by @Skippy-the-Magnificent-one (PR #37145); reimplemented here under project authorship because the original commit was authored under a non-existent email.	2026-06-08 00:20:08 -07:00
konsisumer	3714caa1b9	fix(session): follow compression continuations for transcript reads	2026-06-07 23:57:20 -07:00
teknium1	1c68f6f81f	refactor(gateway): extract kanban watcher loops into GatewayKanbanWatchersMixin (god-file Phase 3) gateway/run.py is the largest god file (20k LOC, GatewayRunner with 220 methods). This lifts the cohesive kanban-watcher cluster — _kanban_notifier_watcher, _kanban_dispatcher_watcher, _kanban_advance/unsub/rewind, _deliver_kanban_artifacts (~1,035 LOC, 6 methods) — into gateway/kanban_watchers.py as a mixin that GatewayRunner inherits. Mixin (not free functions) because the methods use only self state: inheriting keeps every self._kanban_* call site working unchanged via the MRO, making this a behavior-neutral move. The methods' lazy imports (_kb, _decomp, _load_config, Platform) travel with them; the mixin needs only stdlib + a matching logging.getLogger('gateway.run'). run.py 20187 -> 19157 LOC; GatewayRunner direct methods 220 -> 214. Behavior-neutral: gateway test suite 6582 passed / 0 failed; start() still wires both watchers via self._kanban_*; MRO resolves all 6 to the mixin. One test (corrupt-board quarantine retry) keyed its time-travel mock on the caller's filename being gateway/run.py — updated to also accept gateway/kanban_watchers.py. Establishes the mixin-extraction pattern for further GatewayRunner decomposition (the 2406-LOC _run_agent and 1164-LOC _handle_message remain, but their callback closures need a context-object redesign — deferred).	2026-06-07 23:14:18 -07:00

1377 commits