hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00

Author	SHA1	Message	Date
Austin Pickett	0bbff1fc7e	fix(deps): declare websockets as core dep + relax dev setuptools pin (salvage #45486, #44693) (#46744) * fix: declare websockets as a core dependency * fix(deps): relax dev setuptools pin 82.0.1 -> 81.0.0 (torch caps setuptools<82) torch >= 2.11 publishes Requires-Dist: setuptools<82, so any environment that resolves the dev extra together with torch is unsatisfiable: $ uv pip install --dry-run ".[dev]" "torch==2.12.0" x No solution found when resolving dependencies: ... torch==2.12.0 and all versions of hermes-agent[dev] are incompatible. 81.0.0 is the latest release under the cap and stays inside the declared build-system window (setuptools>=77.0,<83). uv.lock regenerated with 'uv lock'; diff is scoped to the setuptools entry. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * chore: map salvaged contributor emails for attribution Add AUTHOR_MAP entries for the two cherry-picked contributors so the check-attribution CI gate passes: - yehaotian@xuanshudeMac-mini.local -> ArcanePivot (#45486) - dbeyer7@gmail.com -> benegessarit (#44693) --------- Co-authored-by: 玄枢 <yehaotian@xuanshudeMac-mini.local> Co-authored-by: David Beyer <dbeyer7@gmail.com> Co-authored-by: Claude Fable 5 <noreply@anthropic.com> Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-15 12:44:44 -04:00
ethernet	ae433634db	fix(desktop): move tsconfig to es2023 Co-authored-by: ibrahim özsaraç <160004724+iborazzi@users.noreply.github.com>	2026-06-15 12:07:17 -04:00
ethernet	9eb0bcd60f	change(ci): rip out nix ci for now to be re-added later when we have more stable ci flows	2026-06-15 12:06:54 -04:00
xxxigm	45e2f4fdcd	nix: refresh npmDepsHash for the @assistant-ui/store pin The store pin changed package-lock.json, so the workspace-wide npmDepsHash in nix/lib.nix is stale and the Nix flake check fails on the hash mismatch. Use the hash reported by the real fetchNpmDeps build (the flake check's `got:`), which is authoritative — it differs from prefetch-npm-deps' lockfile-contents hash, exactly the divergence nix/lib.nix already documents.	2026-06-15 11:55:02 -04:00
xxxigm	30377e108c	ci(desktop): build the renderer on PRs so vite breaks fail in CI The desktop build break shipped because nothing in CI runs the apps/desktop production build. typecheck only runs `tsc`, which does not exercise Vite/Rolldown module resolution, so an unresolvable package export (the @assistant-ui/tap "./react-shim" split) sailed through green checks and only failed when users built from source on install/update. Add a desktop-build job that runs `npm run build` (tsc -b + vite build + assert-dist-built) for apps/desktop. This closes the gap so the same class of break fails in CI instead of on every user's machine.	2026-06-15 11:55:02 -04:00
xxxigm	f02484feba	test(deps): guard @assistant-ui cluster on one tap version Lockfile invariant that would have caught the desktop build break: the single hoisted @assistant-ui/tap must satisfy every @assistant-ui/* package's declared tap requirement (deps or non-optional peer). It is a contract, not a snapshot -- no hardcoded versions -- so it stays green across routine bumps but fails the moment the cluster splits its tap requirement again.	2026-06-15 11:55:02 -04:00
xxxigm	eae3836eb6	fix(desktop): pin @assistant-ui/store so the cluster shares one tap The desktop app is built from source on every install/update (install.ps1 -> npm ci/install -> tsc -b && vite build). The @assistant-ui packages share an internal reactivity lib, @assistant-ui/tap, and only interoperate when they all resolve the SAME tap version. @assistant-ui/react@0.12.28 and @assistant-ui/core pin tap@^0.5.x (which exports only "." and "./react"), but the caret range react -> store@^0.2.9 floated store up to 0.2.18, which bumped its tap peer to ^0.9.0 and began importing "@assistant-ui/tap/react-shim" -- an entry point that only exists in the tap 0.9.x line. With the hoisted tap stuck on 0.5.x, vite build crashed: "./react-shim" is not exported ... from package @assistant-ui/tap i.e. the opaque "apps/desktop build failed (exit 1)" everyone hit when updating today. Pin @assistant-ui/store via root overrides to 0.2.13 -- the last release that targets tap@^0.5.x -- so react/core/store all agree on the hoisted tap@0.5.14 again. Verified: tsc -b and vite build both pass.	2026-06-15 11:55:02 -04:00
Teknium	3e7e9b24d4	fix: harden salvaged session and browser improvements Polish salvaged contributor work before PR review: - read browser inactivity timeout from config with documented fallback - skip redundant v10 trigram backfill before v11 FTS rebuild - show delegate_task goals safely in progress previews - show gateway status model/context without redundant token wording - wire gateway /sessions to shared session-listing helpers - map Ravenwolf author emails for release attribution Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de> Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>	2026-06-15 07:46:34 -07:00
Wolfram Ravenwolf	ead38107a2	feat(status): restore model and context in gateway status PROBLEM: The old public /status PR drifted out of the current Amy patch stack, leaving /status without the model/provider, context window, or explicit cumulative token label that Wolfram uses to monitor context pressure from chat. SOLUTION: Re-port the feature onto the current gateway status handler. Prefer live/cached agent runtime metadata, fall back to SessionDB + SessionStore state between turns, add localized status model/context lines, and keep token totals explicitly labeled cumulative. Verification: tests/gateway/test_status_command.py, tests/hermes_cli/test_commands.py	2026-06-15 07:46:34 -07:00
Amy Ravenwolf	5035fa9029	feat(display): show delegate_task goals in tool progress notifications Previously, delegate_task in batch mode only showed '3 parallel tasks' without revealing what the tasks actually are. Single-task mode showed the goal via the primary_args fallback, but batch mode had no goal extraction. Changes: - build_tool_preview(): Add dedicated delegate_task handler that extracts individual task goals from both single and batch modes. Batch shows '3 tasks: Goal A | Goal B | Goal C'. - _get_cute_tool_message_impl(): Show individual goals in CLI cute messages for batch delegate calls ('3x: Goal A | Goal B'). - Add 4 tests covering single goal, batch goals, missing goals, and no-goal edge case.	2026-06-15 07:46:34 -07:00
Wolfram Ravenwolf	5b2604df99	fix(state): skip redundant trigram backfill before v11 FTS rebuild	2026-06-15 07:46:34 -07:00
Amy Ravenwolf	2f2e3616b4	fix(config): read browser inactivity timeout from config	2026-06-15 07:46:34 -07:00
xxxigm	bee13817f0	test(desktop): cover $connection resync on profile switch Asserts ensureGatewayProfile keeps $connection in lockstep with the active profile's backend: activating a remote pool profile flips mode to remote, returning to default resyncs to local, a failed descriptor fetch leaves the prior connection intact, and a same-profile activation doesn't churn it. Regression coverage for #46651.	2026-06-15 07:11:02 -07:00
xxxigm	fbabf438a1	fix(desktop): sync $connection on profile switch so remote profiles attach images as bytes The renderer's $connection seeds from the PRIMARY (window) backend at boot and otherwise only refreshes on a sleep/wake reconnect. Activating a background profile (ensureGatewayProfile) pointed the live gateway + REST at that profile's backend but never updated $connection, so its `mode` stayed stuck on the primary. With a local primary and a remote pool profile active, every code path that branches on local-vs-remote misfired: image attachments went out via the path-based `image.attach` instead of `image.attach_bytes`, handing the remote gateway a client-only Windows path it can't resolve ("image not found: C:\..."), and the /api/fs/* file browser and /api/media fetches targeted the wrong machine. Resync $connection from the now-active profile's descriptor right after the gateway swap, so the remote-aware paths follow the live backend. Best-effort: a failed descriptor fetch leaves the prior connection intact for boot/reconnect to resync. Single-profile users are unaffected (the same-profile fast path never runs the swap). Fixes #46651	2026-06-15 07:11:02 -07:00
Teknium	49e743985a	fix: route minimax m3 reasoning controls through profile Follow up PR #46609's api.minimax.io reasoning report by moving the behavior out of the broad run_agent host gate and into the MiniMax provider profile. Only MiniMax-M3 on the documented OpenAI-compatible /v1 route gets reasoning_split/thinking/reasoning_effort; Anthropic-format MiniMax and non-M3 models keep their existing wire shapes. Co-authored-by: goku94123 <gooku94123@gmail.com>	2026-06-15 07:08:43 -07:00
goku94123	ba3883cd18	fix(minimax): enable reasoning extra_body for api.minimax.io	2026-06-15 07:08:43 -07:00
Teknium	be7c919bf9	fix(process): label background completion causes (#46659) Track why a background process finished and include that source in notify-on-complete messages so SIGTERM from process.kill, kill_all, backend loss, and ordinary exits are distinguishable.	2026-06-15 07:08:24 -07:00
Teknium	733472952a	fix: complete cron jobs lock salvage Route curator rollback through the same cross-process cron job lock, make save_jobs lock for legacy direct callers without deadlocking nested mutation paths, and harden the regression test so a second _jobs_lock caller really blocks across processes.	2026-06-15 06:29:00 -07:00
CiarasClaws	e5b4cf7bea	fix(cron): make jobs.json writes safe across processes `hermes cron pause`/`resume`/`remove` run in their own CLI process (CLI → cronjob tool → pause_job → update_job → save_jobs), entirely separate from the gateway process that also writes jobs.json (mark_job_run, advance_next_run, due-fast-forward in get_due_jobs). The only synchronization was a module-level `threading.Lock`, which serializes writers within a single process but does nothing across processes — and update_job/pause_job/remove_job/create_job did not even take it. The result is a classic lost update: a `cron pause` issued while the gateway is live loads jobs.json, sets enabled=False, and saves; concurrently the gateway loads the same file and saves back its run-bookkeeping, clobbering the pause. The CLI prints "Paused" (it succeeded against its own in-memory copy) but the job stays enabled and keeps firing, with no error surfaced. The scheduler's `.tick.lock` flock can't be reused for this — it is held for the entire tick, including multi-minute agent runs, so a CLI mutation would block for minutes. Add `_jobs_lock()`: a short-held cross-process advisory file lock (fcntl/msvcrt flock on `<hermes_home>/cron/.jobs.lock`) layered over the existing in-process lock, and wrap every load→modify→save critical section with it — create_job, update_job, remove_job, mark_job_run, advance_next_run, get_due_jobs, rewrite_skill_refs. The lock degrades to in-process-only if neither fcntl nor msvcrt is available, preserving prior behaviour. All critical sections are short (field edits, no agent execution), so contention resolves in milliseconds. Adds a regression test that proves the lock excludes a second process (an in-process threading.Lock cannot). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 06:29:00 -07:00
Teknium	29c6985590	fix(nix): refresh npm deps hash	2026-06-15 06:18:27 -07:00
FT_IOxCS	92a456f711	fix(cli,deps): clear esbuild audit loop Upgrade the Vite/esbuild surfaces that kept web, ui-tui, and the bootstrap installer on vulnerable esbuild versions, regenerate the root lockfile, and preserve intentional package+lock dependency edits during update lockfile cleanup.	2026-06-15 06:18:27 -07:00
Teknium	975b9f0a54	docs: recommend standard installer for development (#46646)	2026-06-15 06:14:57 -07:00
Teknium	0d82060c74	fix: harden WhatsApp target alias salvage Add a parser-only routing regression that proves raw WhatsApp group JIDs bypass channel-directory resolution and home-channel fallback, include channel_aliases.json in quick state snapshots, harden malformed alias handling, and map Keiron McCammon for release attribution.	2026-06-15 05:51:47 -07:00
Keiron McCammon	ea49a79633	fix(messaging): route WhatsApp group JIDs to the target, not the home DM send_message(target="whatsapp:<group-jid>") silently delivered to the configured home DM instead of the requested group. Two gaps: 1. _parse_target_ref had no WhatsApp branch. Group JIDs (<id>@g.us), user JIDs (<id>@s.whatsapp.net), linked-identity JIDs (<id>@lid), and broadcast/newsletter JIDs matched no pattern and fell through to `return None, None, False`, so the caller treated them as unresolvable and used the home channel. The bridge's /send endpoint accepts any chatId, so only the tool-side target parsing was at fault. Add a whatsapp branch that recognizes native JIDs as explicit targets. The pre-existing '+'-prefixed E.164 path is preserved. 2. WhatsApp groups have no human-friendly name — the channel directory is regenerated from session data on a timer, so a group shows up as its raw 18-digit JID and any hand-edit to channel_directory.json is clobbered on the next rebuild. Add a user-maintained alias overlay (~/.hermes/channel_aliases.json) re-applied on every build AND every load, giving durable friendly names and letting a freshly-created group be pre-named before its first message. Tests: TestParseTargetRefWhatsAppJID (7 cases) for the parser; TestChannelAliases (7 cases) for the overlay, plus an autouse fixture isolating CHANNEL_ALIASES_PATH so a real alias file can't leak into the existing directory tests.	2026-06-15 05:51:47 -07:00
Teknium	c17469cb19	chore: map Veritas-7 release attribution Add the contributor noreply email used by the salvaged xAI OAuth refresh-skew commit so release notes credit the original author.	2026-06-15 05:40:23 -07:00
Veritas-7	febdddb41a	fix(auth): refresh xAI OAuth tokens earlier	2026-06-15 05:40:23 -07:00
Teknium	aab2e99bae	test: cover request debug dump redaction Keep request dump writes on the shared atomic JSON path, add regression coverage for request body/error/stdout redaction, and map the salvaged contributor email for release attribution.	2026-06-15 05:31:21 -07:00
xtymac	ad58dd51ac	redact secrets in API request debug dumps dump_api_request_debug() masks the provider Authorization header but writes the request `body` (system prompt, tool defs, context-embedded values) and the error message raw via atomic_json_write. This path also fires unconditionally on API errors (not only under HERMES_DUMP_REQUESTS), so any secret surfaced into context (e.g. an integration token) lands in cleartext at request_dump_*.json on every failed call. Run the serialized dump through the existing redact_sensitive_text() scrubber (already used for logs/tool output) before persisting and before the HERMES_DUMP_REQUEST_STDOUT print; preserve atomicity via temp-file + Path.replace. Also add the Notion internal-integration prefix (ntn_) to _PREFIX_PATTERNS so bare values are caught. Per SECURITY.md §3.2 this is a redaction (in-process heuristic) hardening, not a §3.1 vulnerability. Refs #46583.	2026-06-15 05:31:21 -07:00
Teknium	a688d2a1bd	test: assert disk cleanup prunes protected walks	2026-06-15 05:25:27 -07:00
墨綠BG	40699c3292	🐛 fix(disk-cleanup): avoid brittle sweep review issues	2026-06-15 05:25:27 -07:00
墨綠BG	c1a70a5439	🐛 fix(disk-cleanup): prune protected cleanup walks	2026-06-15 05:25:27 -07:00
liuhao1024	2cddc9c895	fix(bedrock): check boto3 version >= 1.34.59 before using converse_stream converse() and converse_stream() were added in boto3 1.34.59. When Hermes is installed editable into system Python (e.g. Ubuntu 24.04 ships 1.34.46), the system boto3 takes precedence and calls to converse_stream fail with AttributeError. Add an early version check in _require_boto3() that raises a clear RuntimeError with upgrade instructions.	2026-06-15 05:25:17 -07:00
Teknium	f79b109f4f	chore: map 0xneobyte release author	2026-06-15 05:25:07 -07:00
Tharushka Dinujaya	ec05d2bc3e	fix(gateway): evict scoped lock when PID+start_time match but process is not a gateway On Linux, systemd spawns core services (cron, nginx, sshd) with deterministic PIDs and jiffy start_times across reboots. A service can land on the exact same PID and start_time as a previous gateway, causing acquire_scoped_lock to mistake it for a live gateway and block startup. The existing stale-detection paths only covered: - start_times both non-None and different (clear mismatch) - start_times both None (macOS/Windows fallback to cmdline check) The boot-time collision falls through both: times are non-None and equal, so neither branch fired. Add a third check: when both start_times are known and match but the live process fails _looks_like_gateway_process, read its cmdline. If the cmdline is readable (non-None), we have positive evidence of an impostor and mark the lock stale. Requiring a readable cmdline keeps the check conservative — if cmdline is unreadable we do not evict.	2026-06-15 05:25:07 -07:00
Nicolò Boschi	a376ca0081	feat(hindsight): make observation scopes configurable on retain Adds an observation_scopes config key (and HINDSIGHT_RETAIN_OBSERVATION_SCOPES env var) so retained memories can opt into per_tag / all_combinations / custom scoping instead of Hindsight's default combined pass. Threaded through _build_retain_kwargs so all three retain paths honor it: auto-retain and flush-on-switch already use aretain_batch; the tool retain path is switched from aretain to aretain_batch (functionally equivalent, aretain just wraps a single-item batch) since aretain doesn't accept the observation_scopes parameter.	2026-06-15 04:59:17 -07:00
kshitij	8844e091c1	Merge pull request #46614 from kshitijk4poor/salvage/xai-oauth-profile-writethrough fix(auth): resolve xAI OAuth credentials across profiles + write rotated tokens back to root	2026-06-15 17:16:19 +05:30
kshitijk4poor	1227007aed	chore: map capt-marbles contributor email for attribution Salvaged commit in this PR is authored by capt-marbles (andrewdmwalker@gmail.com), a bare gmail that does not auto-resolve in the check-attribution job. Add the AUTHOR_MAP entry.	2026-06-15 17:09:27 +05:30
kshitijk4poor	497352bc4e	fix(auth): write rotated xAI OAuth tokens back to global root (#43589) The salvaged read-side fix lets a profile resolve the xAI OAuth grant from the global-root auth store when it has no own providers.xai-oauth block. But _save_xai_oauth_tokens still wrote rotated tokens only to the active profile store. Because xAI rotates the refresh_token on every refresh, a profile that reads root's grant and refreshes it left root holding a now-revoked refresh token — killing every other profile reading the stale root grant with invalid_grant once its access token expired (#43589). Detect the read-from-root case (profile lacks its own providers.xai-oauth block) and, after the profile save, write the rotated chain back to the global root too via a best-effort, TOCTOU-safe write-through that reuses _save_auth_store with an explicit target path. A profile that genuinely shadows root (has its own block) is left untouched, classic mode is a no-op, and a failed root write never breaks the profile's own save. Pairs with the read fallback in the preceding commit so the cross-profile xAI grant stays coherent in both directions.	2026-06-15 17:08:19 +05:30
Andrew Walker	f1d6f04362	fix(auth): resolve xAI OAuth credentials across profiles (cherry picked from commit `8d8b9f50e4`)	2026-06-15 17:03:35 +05:30
helix4u	dcc3216955	fix(mcp): fail fast for noninteractive oauth without tokens	2026-06-15 04:22:07 -07:00
Teknium	aca11c227e	fix(docker): skip gateway reconciliation in dashboard container (autodetect) (#46293) * fix(docker): skip per-profile gateway reconciliation in dashboard container When gateway and dashboard containers share a bind-mounted HERMES_HOME, both run the cont-init.d profile reconciliation script, which creates s6-log processes for every persisted profile. These s6-log processes in different containers race to flock() the same log-directory lock files under logs/gateways/<profile>/lock, producing repeated "s6-log: fatal: unable to lock ... Resource busy" errors and a supervision restart storm. Add HERMES_SKIP_PROFILE_RECONCILE env var support to container_boot.py and set it in the official docker-compose.yml dashboard service so the dashboard container no longer creates per-profile gateway s6 services it never uses. * chore(release): map salvaged contributor * refactor(docker): autodetect dashboard container instead of env-var gate Replace the HERMES_SKIP_PROFILE_RECONCILE env var with PID 1 argv role detection. A dashboard-only container never spawns or supervises per-profile gateways, so the reconcile boot hook now skips itself when /proc/1/cmdline is the dashboard command — no operator flag to set (or forget in a hand-written manifest, which would reintroduce the s6-log flock storm this prevents). - Extract _strip_container_argv_prefix() shared by the legacy-gateway and new dashboard detectors (DRY the init/wrapper/hermes peel). - Add _is_dashboard_container(); gate reconcile main() on it. - Drop HERMES_SKIP_PROFILE_RECONCILE from code + docker-compose.yml. - Tests: argv matrix for both roles + main()-level skip/reconcile proof and a regression that the removed env var is now inert. Co-authored-by: 895252509 <895252509@qq.com> --------- Co-authored-by: zhouxiang <895252509@qq.com> Co-authored-by: Ben <ben@nousresearch.com>	2026-06-15 20:51:48 +10:00
kshitij	6cb88a0874	Merge pull request #46552 from kshitijk4poor/salvage/file-tools-session-cwd fix(tools): respect session cwd in file tools (salvage of #46460)	2026-06-15 14:13:15 +05:30
kshitijk4poor	8fce54499f	refactor(tools): extract shared sentinel-free abs cwd validator _configured_terminal_cwd and _registered_task_cwd_override carried a byte-identical sentinel + expanduser + isabs validation tail. Extract it into _sentinel_free_abs_cwd(raw) so the relative/sentinel rejection rule lives in one place. Behaviour unchanged (the str() coercion the override path relied on is preserved in the helper).	2026-06-15 14:03:41 +05:30
kshitijk4poor	b0c99c12dd	docs(tools): document registered-cwd step in resolver docstrings The session-cwd fix inserted a registered task/session cwd override step between the live-cwd and $TERMINAL_CWD fallbacks, but three docstrings still described the old two-step order — _resolve_base_dir's numbered list was outright wrong. Update _authoritative_workspace_root, _resolve_base_dir, and _path_resolution_warning to reflect the actual four-step resolution order. No behaviour change.	2026-06-15 14:02:54 +05:30
kshitijk4poor	ddf7c7af81	refactor(tools): consolidate task-override lookup into one helper The raw-key-first-then-collapsed override lookup was hand-rolled in three places with subtly different spellings: terminal_tool's command setup, and both file_tools._registered_task_cwd_override and _get_file_ops. Since that exact raw-vs-collapsed invariant is what the session-cwd fix depends on, keeping three copies invites the drift that caused the original bug. Add terminal_tool.resolve_task_overrides(task_id) as the single source and route all three sites through it. Behaviour is unchanged (verified byte-equivalent across raw/collapsed/isolation/None/subagent inputs).	2026-06-15 14:02:17 +05:30
Gille	d6a8d9dcab	fix(tools): respect session cwd in file tools	2026-06-15 14:00:42 +05:30
Ben Barclay	95715dcb03	fix(s6): reserved default gateway must not follow sticky active_profile (#46483) The supervised `gateway-default` s6 slot runs bare `hermes gateway run` (no -p) to mean "the root HERMES_HOME profile". But `_apply_profile_override` falls through its #22502 HERMES_HOME guard for the container root (/opt/data, whose parent is not `profiles`) and reads the sticky `active_profile` file. If the user set another profile active (e.g. via the dashboard), the reserved default gateway gets redirected into that profile — producing a duplicate gateway for the active profile and no real default gateway. The profile page and `gateway status` then correctly report default as "not running" because there genuinely isn't one. Guard step 2 (the sticky active_profile fallback) with the existing HERMES_S6_SUPERVISED_CHILD sentinel that the container run-script already exports. Supervised named-profile slots pass -p explicitly (step 1, never reaches step 2); only the bare default slot was affected. Inert outside the s6 container — the sentinel is never set elsewhere. Reported in the 'Docker & Profiles & Dashboard' support thread.	2026-06-15 05:36:20 +00:00
Ben Barclay	80f8ffc74c	fix(dashboard): pin machine-dashboard reroute to the machine root, not $HOME/.hermes (#46487) The unified machine-dashboard reroute (cmd_dashboard) re-execs a named-profile dashboard launch as the machine dashboard and dropped HERMES_HOME from the child env with the comment "so the child binds the machine root". That holds for a standard install (root == ~/.hermes) but breaks the Docker layout: the published image sets `ENV HERMES_HOME=/opt/data`, so once HERMES_HOME is unset the child falls back to $HOME/.hermes = /opt/data/.hermes — an empty, auto-seeded home. Two user-visible symptoms, one root cause (reported via support): 1. Dashboard Profiles page shows only an empty `default` — the real default/oracle/saga profiles live under /opt/data/profiles, but the rerouted child resolves _get_profiles_root() to /opt/data/.hermes/profiles. 2. The "Update Hermes" button runs `hermes update` inside the container repeatedly instead of bailing with the docker-update guidance. The Docker guard keys off detect_install_method(), which reads $HERMES_HOME/.install_method; the image stamps that at /opt/data, but the misresolved home has no stamp, no HERMES_MANAGED, and no .git → falls through to "pip", so the guard never fires. The reporter's workaround was to bind-mount the host dir at both /opt/data and /opt/data/.hermes so the two paths converge (at the cost of a self-referential recursion). Fix: resolve the machine root explicitly with get_default_hermes_root() and set it on the child env instead of popping HERMES_HOME. That helper returns the root for both layouts — ~/.hermes for a standard install, and /opt/data for Docker (it strips a trailing profiles/<name>). Falls back to the old pop behaviour only if root resolution raises, so the reroute is never blocked. Regression tests in test_dashboard_unified_launch.py: the existing standard-install test now asserts the child carries HERMES_HOME == get_default_hermes_root() (not absent), and a new test_reexec_pins_docker_machine_root covers the Docker layout (HERMES_HOME=/opt/data/profiles/oracle → child gets /opt/data). Both fail against the pre-fix pop behaviour (mutation-verified).	2026-06-15 15:33:15 +10:00
Teknium	c2b7669ad3	fix(s6): clear stale log lock before startup (#46289) * fix(cli): clear stale s6-log lock file before startup on virtiofs * chore(release): map salvaged contributor --------- Co-authored-by: zxcasongs <35259607+zxcasongs@users.noreply.github.com> Co-authored-by: Ben <ben@nousresearch.com>	2026-06-15 14:10:51 +10:00
Teknium	b770967263	fix(s6): persist profile gateway desired state (#46292) * fix: persist s6 gateway desired state * chore(release): map salvaged contributor --------- Co-authored-by: Alfred Smith <alfred@my-cloud.me> Co-authored-by: Ben <ben@nousresearch.com>	2026-06-15 14:02:10 +10:00
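
Commit e5b4cf7bea above describes wrapping every load, modify, save cycle on jobs.json in a short-held cross-process advisory file lock layered over an in-process lock. The following is a minimal sketch of that pattern, not the repository's code: `jobs_lock`, `update_job`, and the file layout are hypothetical names chosen for illustration, and the sketch only handles the fcntl case (it degrades to in-process locking where fcntl is unavailable, and omits the msvcrt branch the commit mentions).

```python
import contextlib
import json
import os
import tempfile
import threading
from pathlib import Path

try:
    import fcntl  # POSIX only
except ImportError:  # no fcntl: this sketch degrades to in-process locking
    fcntl = None

# Serializes writers inside one process. Not reentrant: do not nest jobs_lock().
_thread_lock = threading.Lock()


@contextlib.contextmanager
def jobs_lock(lock_path: Path):
    """Hold the in-process lock plus an advisory file lock (hypothetical helper)."""
    with _thread_lock:
        lock_path.parent.mkdir(parents=True, exist_ok=True)
        with open(lock_path, "a+") as fh:
            if fcntl is not None:
                # Blocks until any other process holding the lock releases it.
                fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
            try:
                yield
            finally:
                if fcntl is not None:
                    fcntl.flock(fh.fileno(), fcntl.LOCK_UN)


def update_job(jobs_path: Path, job_id: str, **fields) -> dict:
    """Run load -> modify -> save as one critical section to avoid lost updates."""
    with jobs_lock(jobs_path.parent / ".jobs.lock"):
        jobs = json.loads(jobs_path.read_text()) if jobs_path.exists() else {}
        jobs.setdefault(job_id, {}).update(fields)
        # Write to a temp file and rename, so readers never see a partial file.
        fd, tmp = tempfile.mkstemp(dir=jobs_path.parent)
        with os.fdopen(fd, "w") as out:
            json.dump(jobs, out)
        os.replace(tmp, jobs_path)
        return jobs
```

Because the lock is held only for a field edit and a file rename, a CLI caller such as a pause command would wait milliseconds rather than the duration of a scheduler tick, which is the reason the commit gives for not reusing the tick lock.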


11750 commits