Commit graph

11758 commits

Author SHA1 Message Date
liuhao1024
60cc42e38b fix(inventory): deduplicate models between user-defined and aggregator providers
When a user-defined provider (e.g. litellm-proxy) and an aggregator
(e.g. openrouter) both advertise the same model name, the Desktop/TUI
model picker would show the model under both groups. Selecting it from
the aggregator row silently set model.provider to the aggregator,
breaking calls because the aggregator doesn't actually serve that model
ID.

Fix: after list_authenticated_providers() returns, collect all models
from user-defined provider rows and filter them out of aggregator rows.
Uses is_aggregator() from hermes_cli/providers.py to identify
aggregators. Case-insensitive matching.

Fixes #45954
2026-06-15 12:25:41 -07:00
liuhao1024
9df1a1a8de fix(doctor): recognize nvidia as vendor-slug-accepting provider
NVIDIA NIM API uses vendor-prefixed model IDs (e.g. qwen/qwen3.5-122b-a10b,
nvidia/nemotron-3-super-120b-a12b). The doctor command incorrectly warns that
vendor-prefixed slugs belong to aggregators like openrouter when nvidia is
the configured provider.

Add 'nvidia' to the providers_accepting_vendor_slugs set so doctor no longer
raises false-positive warnings for valid NVIDIA NIM configurations.

Fixes #35425
2026-06-15 12:24:46 -07:00
brooklyn!
c33e0457d7
Merge pull request #46836 from NousResearch/bb/salvage-macos-electron-pack
fix(desktop): restore Electron binary before macOS pack rename (salvage #38673)
2026-06-15 14:22:05 -05:00
Austin Pickett
f7c1cbe66f
docs: point desktop download links to site root (deprecate /desktop) (#46795)
The /desktop page is deprecated and redirects to the home page. The
landing page for the desktop app is now simply
https://hermes-agent.nousresearch.com/. Update all docs and the
Docusaurus nav/footer links accordingly.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-15 15:02:24 -04:00
Brooklyn Nicholson
c23a2eec15 chore: map salvaged contributor email for attribution (#38673) 2026-06-15 13:53:23 -05:00
ChasLui
f3b32e9f52 fix(desktop): restore Electron binary before macOS pack rename (salvage #38673)
electron-builder 26.8.x can stage an Electron.app without its
Contents/MacOS/Electron binary, then fail renaming it to Hermes:

    ENOENT: no such file or directory, rename .../MacOS/Electron -> .../MacOS/Hermes

This breaks `npm run pack` and the installer desktop stage before a
launchable Hermes.app exists.

- Point build.electronDist at the already-installed Electron dist so
  electron-builder reuses it instead of re-unpacking from cache.
- Add a darwin-only prebuilder patch that restores the missing main
  binary from the runtime dist before the rename. Idempotent (marker
  guard), soft-fails on shape mismatch, survives node_modules reinstall.

Co-authored-by: ChasLui <chaslui@outlook.com>
2026-06-15 13:53:01 -05:00
Austin Pickett
5f6be7f31b
fix(teams): package Microsoft Teams SDK as an installable extra (salvage #43945) (#46764)
* fix(teams): package Microsoft Teams SDK as an installable extra

The Teams adapter imports the microsoft-teams-apps SDK, but it was never
declared as a dependency, so source/local installs hit ImportError and the
adapter silently reported the SDK as unavailable. Add a 'teams' extra
(microsoft-teams-apps==2.0.13.4 + aiohttp) and document 'uv sync --extra teams'.

Per the 2026-05-12 [all] policy, opt-in messaging-platform SDKs are NOT added
to [all] (they would break every fresh install on a quarantined release); the
teams extra is installed on demand like the other platform backends.

Co-authored-by: rio-jeong <rio.jeong@thebytesize.ai>

* chore: map rio-jeong contributor email for attribution (#43945)

* feat(teams): lazy-install the Teams SDK on demand (parity with other channels)

The teams extra alone left Teams as the only messaging platform that wouldn't
auto-install its SDK — every other channel (telegram, discord, slack, matrix,
dingtalk, feishu) lazy-installs via tools.lazy_deps on first connect. Bring
Teams to parity:

- Add 'platform.teams' to LAZY_DEPS (microsoft-teams-apps + aiohttp).
- Replace the passive 'check_teams_requirements = check_requirements' alias with
  a real lazy-installer that calls ensure_and_bind('platform.teams', ...),
  rebinding all Teams SDK globals on success (mirrors check_slack_requirements).
- Call check_teams_requirements() at the top of TeamsAdapter.connect() so
  enabling Teams installs the SDK on demand.
- Keep the passive check_requirements() as the registry check_fn so 'gateway
  status' probes never trigger a pip install.

The 'teams' extra remains for packagers / explicit 'uv sync --extra teams'.

Tests: rework the alias test into shortcircuit + lazy-install assertions, and
update test_connect_fails_without_sdk to simulate an uninstallable SDK.

---------

Co-authored-by: rio-jeong <rio.jeong@thebytesize.ai>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-15 14:35:15 -04:00
Austin Pickett
0bbf325a8f
fix(dashboard): scope chat sidebar model card to selected profile (#46665)
* fix(dashboard): scope chat sidebar model card to selected profile

The PTY already honors ?profile= on profile switch, but the JSON-RPC
sidecar created sessions against the dashboard launch profile. Pass the
management profile through session.create and reconnect on switch.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(dashboard): sync active profile with management scope

Align the sidebar switcher with the sticky active profile on load and
when "Set as active" is clicked, so Chat and management pages match
what the Profiles page shows as active.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(dashboard): auto-reconnect chat sidebar on profile switch

Bump the sidecar connection version when profile or PTY channel changes,
matching the manual Reconnect path so gateway and events sockets come
back without clicking the error banner.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(dashboard): prevent model selector chevron overlapping label

Use inline flex layout instead of Button suffix, which is absolutely
positioned and overlapped truncated model names at px-0.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 12:50:19 -04:00
Austin Pickett
0bbff1fc7e
fix(deps): declare websockets as core dep + relax dev setuptools pin (salvage #45486, #44693) (#46744)
* fix: declare websockets as a core dependency

* fix(deps): relax dev setuptools pin 82.0.1 -> 81.0.0 (torch caps setuptools<82)

torch >= 2.11 publishes Requires-Dist: setuptools<82, so any environment
that resolves the dev extra together with torch is unsatisfiable:

    $ uv pip install --dry-run ".[dev]" "torch==2.12.0"
    x No solution found when resolving dependencies:
      ... torch==2.12.0 and all versions of hermes-agent[dev] are incompatible.

81.0.0 is the latest release under the cap and stays inside the declared
build-system window (setuptools>=77.0,<83). uv.lock regenerated with
'uv lock'; diff is scoped to the setuptools entry.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* chore: map salvaged contributor emails for attribution

Add AUTHOR_MAP entries for the two cherry-picked contributors so the
check-attribution CI gate passes:
- yehaotian@xuanshudeMac-mini.local -> ArcanePivot (#45486)
- dbeyer7@gmail.com -> benegessarit (#44693)

---------

Co-authored-by: 玄枢 <yehaotian@xuanshudeMac-mini.local>
Co-authored-by: David Beyer <dbeyer7@gmail.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-15 12:44:44 -04:00
ethernet
ae433634db fix(desktop): move tsconfig to es2023
Co-authored-by: ibrahim özsaraç <160004724+iborazzi@users.noreply.github.com>
2026-06-15 12:07:17 -04:00
ethernet
9eb0bcd60f change(ci): rip out nix ci for now
to be re-added later when we have more stable ci flows
2026-06-15 12:06:54 -04:00
xxxigm
45e2f4fdcd nix: refresh npmDepsHash for the @assistant-ui/store pin
The store pin changed package-lock.json, so the workspace-wide
npmDepsHash in nix/lib.nix is stale and the Nix flake check fails on
the hash mismatch. Use the hash reported by the real fetchNpmDeps
build (the flake check's `got:`), which is authoritative — it differs
from prefetch-npm-deps' lockfile-contents hash, exactly the divergence
nix/lib.nix already documents.
2026-06-15 11:55:02 -04:00
xxxigm
30377e108c ci(desktop): build the renderer on PRs so vite breaks fail in CI
The desktop build break shipped because nothing in CI runs the
apps/desktop production build. typecheck only runs `tsc`, which does
not exercise Vite/Rolldown module resolution, so an unresolvable
package export (the @assistant-ui/tap "./react-shim" split) sailed
through green checks and only failed when users built from source on
install/update.

Add a desktop-build job that runs `npm run build` (tsc -b + vite build
+ assert-dist-built) for apps/desktop. This closes the gap so the same
class of break fails in CI instead of on every user's machine.
2026-06-15 11:55:02 -04:00
xxxigm
f02484feba test(deps): guard @assistant-ui cluster on one tap version
Lockfile invariant that would have caught the desktop build break: the
single hoisted @assistant-ui/tap must satisfy every @assistant-ui/*
package's declared tap requirement (deps or non-optional peer). It is a
contract, not a snapshot -- no hardcoded versions -- so it stays green
across routine bumps but fails the moment the cluster splits its tap
requirement again.
2026-06-15 11:55:02 -04:00
xxxigm
eae3836eb6 fix(desktop): pin @assistant-ui/store so the cluster shares one tap
The desktop app is built from source on every install/update
(install.ps1 -> npm ci/install -> tsc -b && vite build). The
@assistant-ui packages share an internal reactivity lib,
@assistant-ui/tap, and only interoperate when they all resolve the
SAME tap version.

@assistant-ui/react@0.12.28 and @assistant-ui/core pin tap@^0.5.x
(which exports only "." and "./react"), but the caret range
react -> store@^0.2.9 floated store up to 0.2.18, which bumped its
tap peer to ^0.9.0 and began importing "@assistant-ui/tap/react-shim"
-- an entry point that only exists in the tap 0.9.x line. With the
hoisted tap stuck on 0.5.x, vite build crashed:

    "./react-shim" is not exported ... from package @assistant-ui/tap

i.e. the opaque "apps/desktop build failed (exit 1)" everyone hit when
updating today.

Pin @assistant-ui/store via root overrides to 0.2.13 -- the last
release that targets tap@^0.5.x -- so react/core/store all agree on the
hoisted tap@0.5.14 again. Verified: tsc -b and vite build both pass.
2026-06-15 11:55:02 -04:00
Teknium
3e7e9b24d4 fix: harden salvaged session and browser improvements
Polish salvaged contributor work before PR review:
- read browser inactivity timeout from config with documented fallback
- skip redundant v10 trigram backfill before v11 FTS rebuild
- show delegate_task goals safely in progress previews
- show gateway status model/context without redundant token wording
- wire gateway /sessions to shared session-listing helpers
- map Ravenwolf author emails for release attribution

Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de>
Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>
2026-06-15 07:46:34 -07:00
Wolfram Ravenwolf
ead38107a2 feat(status): restore model and context in gateway status
PROBLEM: The old public /status PR drifted out of the current Amy patch stack, leaving /status without the model/provider, context window, or explicit cumulative token label that Wolfram uses to monitor context pressure from chat.

SOLUTION: Re-port the feature onto the current gateway status handler. Prefer live/cached agent runtime metadata, fall back to SessionDB + SessionStore state between turns, add localized status model/context lines, and keep token totals explicitly labeled cumulative.

Verification: tests/gateway/test_status_command.py, tests/hermes_cli/test_commands.py
2026-06-15 07:46:34 -07:00
Amy Ravenwolf
5035fa9029 feat(display): show delegate_task goals in tool progress notifications
Previously, delegate_task in batch mode only showed '3 parallel tasks'
without revealing what the tasks actually are. Single-task mode showed
the goal via the primary_args fallback, but batch mode had no goal
extraction.

Changes:
- build_tool_preview(): Add dedicated delegate_task handler that
  extracts individual task goals from both single and batch modes.
  Batch shows '3 tasks: Goal A | Goal B | Goal C'.
- _get_cute_tool_message_impl(): Show individual goals in CLI cute
  messages for batch delegate calls ('3x: Goal A | Goal B').
- Add 4 tests covering single goal, batch goals, missing goals,
  and no-goal edge case.
2026-06-15 07:46:34 -07:00
Wolfram Ravenwolf
5b2604df99 fix(state): skip redundant trigram backfill before v11 FTS rebuild 2026-06-15 07:46:34 -07:00
Amy Ravenwolf
2f2e3616b4 fix(config): read browser inactivity timeout from config 2026-06-15 07:46:34 -07:00
xxxigm
bee13817f0 test(desktop): cover $connection resync on profile switch
Asserts ensureGatewayProfile keeps $connection in lockstep with the active
profile's backend: activating a remote pool profile flips mode to remote,
returning to default resyncs to local, a failed descriptor fetch leaves the
prior connection intact, and a same-profile activation doesn't churn it.
Regression coverage for #46651.
2026-06-15 07:11:02 -07:00
xxxigm
fbabf438a1 fix(desktop): sync $connection on profile switch so remote profiles attach images as bytes
The renderer's $connection seeds from the PRIMARY (window) backend at boot and
otherwise only refreshes on a sleep/wake reconnect. Activating a background
profile (ensureGatewayProfile) pointed the live gateway + REST at that profile's
backend but never updated $connection, so its `mode` stayed stuck on the
primary. With a local primary and a remote pool profile active, every code path
that branches on local-vs-remote misfired: image attachments went out via the
path-based `image.attach` instead of `image.attach_bytes`, handing the remote
gateway a client-only Windows path it can't resolve ("image not found: C:\..."),
and the /api/fs/* file browser and /api/media fetches targeted the wrong
machine.

Resync $connection from the now-active profile's descriptor right after the
gateway swap, so the remote-aware paths follow the live backend. Best-effort: a
failed descriptor fetch leaves the prior connection intact for boot/reconnect to
resync. Single-profile users are unaffected (the same-profile fast path never
runs the swap).

Fixes #46651
2026-06-15 07:11:02 -07:00
Teknium
49e743985a fix: route minimax m3 reasoning controls through profile
Follow up PR #46609's api.minimax.io reasoning report by moving the behavior out of the broad run_agent host gate and into the MiniMax provider profile. Only MiniMax-M3 on the documented OpenAI-compatible /v1 route gets reasoning_split/thinking/reasoning_effort; Anthropic-format MiniMax and non-M3 models keep their existing wire shapes.

Co-authored-by: goku94123 <gooku94123@gmail.com>
2026-06-15 07:08:43 -07:00
goku94123
ba3883cd18 fix(minimax): enable reasoning extra_body for api.minimax.io 2026-06-15 07:08:43 -07:00
Teknium
be7c919bf9
fix(process): label background completion causes (#46659)
Track why a background process finished and include that source in notify-on-complete messages so SIGTERM from process.kill, kill_all, backend loss, and ordinary exits are distinguishable.
2026-06-15 07:08:24 -07:00
Teknium
733472952a fix: complete cron jobs lock salvage
Route curator rollback through the same cross-process cron job lock, make save_jobs lock for legacy direct callers without deadlocking nested mutation paths, and harden the regression test so a second _jobs_lock caller really blocks across processes.
2026-06-15 06:29:00 -07:00
CiarasClaws
e5b4cf7bea fix(cron): make jobs.json writes safe across processes
`hermes cron pause`/`resume`/`remove` run in their own CLI process (CLI →
cronjob tool → pause_job → update_job → save_jobs), entirely separate from
the gateway process that also writes jobs.json (mark_job_run, advance_next_run,
due-fast-forward in get_due_jobs). The only synchronization was a module-level
`threading.Lock`, which serializes writers *within a single process* but does
nothing across processes — and update_job/pause_job/remove_job/create_job did
not even take it.

The result is a classic lost update: a `cron pause` issued while the gateway is
live loads jobs.json, sets enabled=False, and saves; concurrently the gateway
loads the same file and saves back its run-bookkeeping, clobbering the pause.
The CLI prints "Paused" (it succeeded against its own in-memory copy) but the
job stays enabled and keeps firing, with no error surfaced. The scheduler's
`.tick.lock` flock can't be reused for this — it is held for the entire tick,
including multi-minute agent runs, so a CLI mutation would block for minutes.

Add `_jobs_lock()`: a short-held cross-process advisory file lock (fcntl/msvcrt
flock on `<hermes_home>/cron/.jobs.lock`) layered over the existing in-process
lock, and wrap every load→modify→save critical section with it — create_job,
update_job, remove_job, mark_job_run, advance_next_run, get_due_jobs,
rewrite_skill_refs. The lock degrades to in-process-only if neither fcntl nor
msvcrt is available, preserving prior behaviour. All critical sections are short
(field edits, no agent execution), so contention resolves in milliseconds.

Adds a regression test that proves the lock excludes a second process (an
in-process threading.Lock cannot).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 06:29:00 -07:00
Teknium
29c6985590 fix(nix): refresh npm deps hash 2026-06-15 06:18:27 -07:00
FT_IOxCS
92a456f711 fix(cli,deps): clear esbuild audit loop
Upgrade the Vite/esbuild surfaces that kept web, ui-tui, and the bootstrap installer on vulnerable esbuild versions, regenerate the root lockfile, and preserve intentional package+lock dependency edits during update lockfile cleanup.
2026-06-15 06:18:27 -07:00
Teknium
975b9f0a54
docs: recommend standard installer for development (#46646) 2026-06-15 06:14:57 -07:00
Teknium
0d82060c74 fix: harden WhatsApp target alias salvage
Add a parser-only routing regression that proves raw WhatsApp group JIDs bypass channel-directory resolution and home-channel fallback, include channel_aliases.json in quick state snapshots, harden malformed alias handling, and map Keiron McCammon for release attribution.
2026-06-15 05:51:47 -07:00
Keiron McCammon
ea49a79633 fix(messaging): route WhatsApp group JIDs to the target, not the home DM
send_message(target="whatsapp:<group-jid>") silently delivered to the
configured home DM instead of the requested group. Two gaps:

1. _parse_target_ref had no WhatsApp branch. Group JIDs (<id>@g.us),
   user JIDs (<id>@s.whatsapp.net), linked-identity JIDs (<id>@lid), and
   broadcast/newsletter JIDs matched no pattern and fell through to
   `return None, None, False`, so the caller treated them as
   unresolvable and used the home channel. The bridge's /send endpoint
   accepts any chatId, so only the tool-side target parsing was at fault.
   Add a whatsapp branch that recognizes native JIDs as explicit targets.
   The pre-existing '+'-prefixed E.164 path is preserved.

2. WhatsApp groups have no human-friendly name — the channel directory
   is regenerated from session data on a timer, so a group shows up as
   its raw 18-digit JID and any hand-edit to channel_directory.json is
   clobbered on the next rebuild. Add a user-maintained alias overlay
   (~/.hermes/channel_aliases.json) re-applied on every build AND every
   load, giving durable friendly names and letting a freshly-created
   group be pre-named before its first message.

Tests: TestParseTargetRefWhatsAppJID (7 cases) for the parser;
TestChannelAliases (7 cases) for the overlay, plus an autouse fixture
isolating CHANNEL_ALIASES_PATH so a real alias file can't leak into the
existing directory tests.
2026-06-15 05:51:47 -07:00
Teknium
c17469cb19 chore: map Veritas-7 release attribution
Add the contributor noreply email used by the salvaged xAI OAuth refresh-skew commit so release notes credit the original author.
2026-06-15 05:40:23 -07:00
Veritas-7
febdddb41a fix(auth): refresh xAI OAuth tokens earlier 2026-06-15 05:40:23 -07:00
Teknium
aab2e99bae test: cover request debug dump redaction
Keep request dump writes on the shared atomic JSON path, add regression coverage for request body/error/stdout redaction, and map the salvaged contributor email for release attribution.
2026-06-15 05:31:21 -07:00
xtymac
ad58dd51ac redact secrets in API request debug dumps
dump_api_request_debug() masks the provider Authorization header but writes
the request `body` (system prompt, tool defs, context-embedded values) and the
error message raw via atomic_json_write. This path also fires unconditionally
on API errors (not only under HERMES_DUMP_REQUESTS), so any secret surfaced
into context (e.g. an integration token) lands in cleartext at
request_dump_*.json on every failed call.

Run the serialized dump through the existing redact_sensitive_text() scrubber
(already used for logs/tool output) before persisting and before the
HERMES_DUMP_REQUEST_STDOUT print; preserve atomicity via temp-file +
Path.replace. Also add the Notion internal-integration prefix (ntn_) to
_PREFIX_PATTERNS so bare values are caught.

Per SECURITY.md §3.2 this is a redaction (in-process heuristic) hardening, not
a §3.1 vulnerability. Refs #46583.
2026-06-15 05:31:21 -07:00
Teknium
a688d2a1bd test: assert disk cleanup prunes protected walks 2026-06-15 05:25:27 -07:00
墨綠BG
40699c3292 🐛 fix(disk-cleanup): avoid brittle sweep review issues 2026-06-15 05:25:27 -07:00
墨綠BG
c1a70a5439 🐛 fix(disk-cleanup): prune protected cleanup walks 2026-06-15 05:25:27 -07:00
liuhao1024
2cddc9c895 fix(bedrock): check boto3 version >= 1.34.59 before using converse_stream
converse() and converse_stream() were added in boto3 1.34.59. When Hermes
is installed editable into system Python (e.g. Ubuntu 24.04 ships 1.34.46),
the system boto3 takes precedence and calls to converse_stream fail with
AttributeError. Add an early version check in _require_boto3() that raises
a clear RuntimeError with upgrade instructions.
2026-06-15 05:25:17 -07:00
Teknium
f79b109f4f chore: map 0xneobyte release author 2026-06-15 05:25:07 -07:00
Tharushka Dinujaya
ec05d2bc3e fix(gateway): evict scoped lock when PID+start_time match but process is not a gateway
On Linux, systemd spawns core services (cron, nginx, sshd) with
deterministic PIDs and jiffy start_times across reboots. A service can
land on the exact same PID and start_time as a previous gateway, causing
acquire_scoped_lock to mistake it for a live gateway and block startup.

The existing stale-detection paths only covered:
  - start_times both non-None and different (clear mismatch)
  - start_times both None (macOS/Windows fallback to cmdline check)

The boot-time collision falls through both: times are non-None and
equal, so neither branch fired.

Add a third check: when both start_times are known and match but the
live process fails _looks_like_gateway_process, read its cmdline. If
the cmdline is readable (non-None), we have positive evidence of an
impostor and mark the lock stale. Requiring a readable cmdline keeps the
check conservative — if cmdline is unreadable we do not evict.
2026-06-15 05:25:07 -07:00
Nicolò Boschi
a376ca0081 feat(hindsight): make observation scopes configurable on retain
Adds an observation_scopes config key (and HINDSIGHT_RETAIN_OBSERVATION_SCOPES
env var) so retained memories can opt into per_tag / all_combinations /
custom scoping instead of Hindsight's default combined pass.

Threaded through _build_retain_kwargs so all three retain paths honor it:
auto-retain and flush-on-switch already use aretain_batch; the tool retain
path is switched from aretain to aretain_batch (functionally equivalent,
aretain just wraps a single-item batch) since aretain doesn't accept the
observation_scopes parameter.
2026-06-15 04:59:17 -07:00
kshitij
8844e091c1
Merge pull request #46614 from kshitijk4poor/salvage/xai-oauth-profile-writethrough
fix(auth): resolve xAI OAuth credentials across profiles + write rotated tokens back to root
2026-06-15 17:16:19 +05:30
kshitijk4poor
1227007aed chore: map capt-marbles contributor email for attribution
Salvaged commit in this PR is authored by capt-marbles
(andrewdmwalker@gmail.com), a bare gmail that does not auto-resolve in
the check-attribution job. Add the AUTHOR_MAP entry.
2026-06-15 17:09:27 +05:30
kshitijk4poor
497352bc4e fix(auth): write rotated xAI OAuth tokens back to global root (#43589)
The salvaged read-side fix lets a profile resolve the xAI OAuth grant from
the global-root auth store when it has no own providers.xai-oauth block.
But _save_xai_oauth_tokens still wrote rotated tokens only to the active
profile store. Because xAI rotates the refresh_token on every refresh, a
profile that reads root's grant and refreshes it left root holding a now-
revoked refresh token — killing every other profile reading the stale root
grant with invalid_grant once its access token expired (#43589).

Detect the read-from-root case (profile lacks its own providers.xai-oauth
block) and, after the profile save, write the rotated chain back to the
global root too via a best-effort, TOCTOU-safe write-through that reuses
_save_auth_store with an explicit target path. A profile that genuinely
shadows root (has its own block) is left untouched, classic mode is a
no-op, and a failed root write never breaks the profile's own save.

Pairs with the read fallback in the preceding commit so the cross-profile
xAI grant stays coherent in both directions.
2026-06-15 17:08:19 +05:30
Andrew Walker
f1d6f04362 fix(auth): resolve xAI OAuth credentials across profiles
(cherry picked from commit 8d8b9f50e4)
2026-06-15 17:03:35 +05:30
helix4u
dcc3216955 fix(mcp): fail fast for noninteractive oauth without tokens 2026-06-15 04:22:07 -07:00
Teknium
aca11c227e
fix(docker): skip gateway reconciliation in dashboard container (autodetect) (#46293)
* fix(docker): skip per-profile gateway reconciliation in dashboard container

When gateway and dashboard containers share a bind-mounted HERMES_HOME,
both run the cont-init.d profile reconciliation script, which creates
s6-log processes for every persisted profile.  These s6-log processes
in different containers race to flock() the same log-directory lock
files under logs/gateways/<profile>/lock, producing repeated
"s6-log: fatal: unable to lock ... Resource busy" errors and a
supervision restart storm.

Add HERMES_SKIP_PROFILE_RECONCILE env var support to container_boot.py
and set it in the official docker-compose.yml dashboard service so the
dashboard container no longer creates per-profile gateway s6 services
it never uses.

* chore(release): map salvaged contributor

* refactor(docker): autodetect dashboard container instead of env-var gate

Replace the HERMES_SKIP_PROFILE_RECONCILE env var with PID 1 argv role
detection. A dashboard-only container never spawns or supervises
per-profile gateways, so the reconcile boot hook now skips itself when
/proc/1/cmdline is the dashboard command — no operator flag to set (or
forget in a hand-written manifest, which would reintroduce the s6-log
flock storm this prevents).

- Extract _strip_container_argv_prefix() shared by the legacy-gateway
  and new dashboard detectors (DRY the init/wrapper/hermes peel).
- Add _is_dashboard_container(); gate reconcile main() on it.
- Drop HERMES_SKIP_PROFILE_RECONCILE from code + docker-compose.yml.
- Tests: argv matrix for both roles + main()-level skip/reconcile proof
  and a regression that the removed env var is now inert.

Co-authored-by: 895252509 <895252509@qq.com>

---------

Co-authored-by: zhouxiang <895252509@qq.com>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 20:51:48 +10:00
kshitij
6cb88a0874
Merge pull request #46552 from kshitijk4poor/salvage/file-tools-session-cwd
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Waiting to run
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
fix(tools): respect session cwd in file tools (salvage of #46460)
2026-06-15 14:13:15 +05:30