Commit graph

1476 commits

Author SHA1 Message Date
kshitij
17251e865b
Merge pull request #46857 from liuhao1024/fix/model-picker-merge-live-static
fix(models): merge live API results with curated static catalog in generic provider path
2026-06-16 23:30:34 +05:30
kshitijk4poor
658ac1d866 fix(models): keep curated-first ordering in live+curated merge; use pure-catalog helper in validation
The generic live+curated merge (commit 630b438) seeded the merged list
from live results, demoting curated-only models below live ones. That
regressed #46309, which deliberately surfaces the newest curated model
(kimi-k2.7-code) FIRST in the native picker even when the live /models
listing lags. Restore curated-first ordering: curated entries lead (in
catalog order), live-only entries are appended for discovery. This keeps
the #46850 fix (zai glm-5.2 now appears) without the kimi regression.

Also switch the validate_requested_model curated fallback (commit
ee7b8a4) from provider_model_ids() — which triggers a second, uncached
live /models fetch with its own 8s timeout and may resolve different
credentials than the api_key/base_url just probed — to the pure-catalog
helper _model_in_provider_catalog(). Membership is checked against the
shipped catalog only, with no extra network call.

Tests: restore the curated-first assertion in
test_kimi_coding_live_catalog_does_not_hide_curated_k2_7_code; update
the new merge tests to curated-first semantics; de-circularize the
validation fallback tests to patch _PROVIDER_MODELS (the real source)
instead of mocking the function under test.
2026-06-16 23:25:07 +05:30
brooklyn!
c6e99ab375
Merge pull request #46959 from NousResearch/bb/composer-model-selector
feat(desktop): composer model selector, per-model presets & external-provider disconnect
2026-06-16 09:55:57 -05:00
MrDiamondBallz
9a59ad73dd fix(auth): preserve Codex pool-only rate-limit state
Classify exhausted pool-only openai-codex credentials as quota/rate-limited instead of missing auth. This prevents auth status and runtime credential resolution from reporting missing credentials when a valid manual:device_code pool credential exists but is temporarily in a 429 usage-limit cooldown.

Adds regression coverage for pool-only Codex auth status and runtime resolution.
2026-06-16 05:56:11 -07:00
liuhao1024
ee7b8a4672 fix(models): validate_requested_model falls back to curated catalog when live API omits model
When live /v1/models responds but omits a model that exists in the
curated static catalog, validate_requested_model now accepts it with
a note instead of rejecting. This covers the /model slash-command path
(the picker path was already fixed in the parent commit).

Addresses review feedback from potatogim on #46857.
2026-06-16 16:24:11 +08:00
liuhao1024
630b43892d fix(models): merge live API results with curated static catalog in generic provider path
When a provider's live /v1/models endpoint returns a stale or incomplete
list (e.g. Z.AI missing glm-5.2), the generic profile-based code path
returned only the live results, silently dropping curated models.

Generalize the kimi-coding merge pattern to all providers: live entries
come first (provider's preferred order), then curated-only entries are
appended with case-insensitive dedup. This ensures models that the live
endpoint omits still appear in /model picker.

Fixes #46850
2026-06-16 16:21:01 +08:00
Brooklyn Nicholson
a0ec4f52b9 feat(desktop): disconnect external (CLI-managed) providers
External providers (Claude Code) store creds outside Hermes, so the
disconnect API refuses them. The backend now hands the GUI a per-OS
`disconnect_command` that clears the credential the same way the CLI's
logout does (macOS Keychain entry + ~/.claude/.credentials.json), and
the misleading "use claude setup-token" hint is corrected.

Settings → Providers offers a Disconnect button for these: it confirms,
leaves Settings, and runs the removal command in the embedded terminal
via a new runInTerminal() (queues onto $terminalInjection; the terminal
pane flushes and clears it once its session is live). The expanded list
also gets its own "Other providers" header so it no longer reads as
grouped under "Connected". API-managed providers keep the one-click
(trash) disconnect.
2026-06-16 00:08:21 -05:00
brooklyn!
c6b0eb4de0
fix(desktop): open remote-gateway artifacts via authenticated download (#46895)
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
Typecheck / desktop-build (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Has been cancelled
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Has been cancelled
OSV-Scanner / Scan lockfiles (push) Has been cancelled
uv.lock check / uv lock --check (push) Has been cancelled
On a remote gateway connection, agent-written files live on the gateway
host, not the desktop's disk, so the Artifacts view's file:// hrefs failed
("Invalid external URL") and image thumbnails broke.

Make mediaExternalUrl() remote-aware in one place: in remote mode it
rewrites gateway-local paths to GET /api/files/download (a new endpoint
that streams the file as a Content-Disposition: attachment). The artifacts
view now resolves through it, and so do the existing chat-media and
generated-image callers, for free.

The download endpoint stays auth-gated; auth_middleware additionally
accepts the session token as a ?token= query param for this one path so a
shell/browser-opened download (which can't set the session header) still
authenticates — the same query-token tradeoff as the /api/pty WebSocket.
It is NOT added to PUBLIC_API_PATHS.

Salvages #46663 (which carried ~19k lines of CRLF noise and made the
endpoint public). Reimplemented on a clean LF base with the security hole
closed and tests added.

Co-authored-by: qingshan89 <qs2816661685@gmail.com>
2026-06-15 23:50:19 -05:00
Gille
0441b7f19f
fix(desktop): route global remote profile REST calls (#47011)
* fix(desktop): route global remote profile REST calls

* fix(dashboard): scope oauth provider routes by profile

* test(tui): isolate notification poller queue
2026-06-15 23:24:55 -05:00
Shannon Sands
7cd71de1f4 Simplify dashboard update detection to containers 2026-06-15 20:08:39 -07:00
Shannon Sands
b1d6a57883 Detect containerized dashboard update management 2026-06-15 20:08:39 -07:00
Shannon Sands
0b6b29a30c Hide hosted dashboard update controls 2026-06-15 20:08:39 -07:00
xxxigm
2a08b8c86f test(dump): cover terminal backend override reporting
Verifies `hermes debug` surfaces a TERMINAL_ENV override of
terminal.backend, reports the config value when no override is present,
and emits no spurious note when env and config agree.
2026-06-15 12:31:23 -07:00
liuhao1024
60cc42e38b fix(inventory): deduplicate models between user-defined and aggregator providers
When a user-defined provider (e.g. litellm-proxy) and an aggregator
(e.g. openrouter) both advertise the same model name, the Desktop/TUI
model picker would show the model under both groups. Selecting it from
the aggregator row silently set model.provider to the aggregator,
breaking calls because the aggregator doesn't actually serve that model
ID.

Fix: after list_authenticated_providers() returns, collect all models
from user-defined provider rows and filter them out of aggregator rows.
Uses is_aggregator() from hermes_cli/providers.py to identify
aggregators. Case-insensitive matching.

Fixes #45954
2026-06-15 12:25:41 -07:00
liuhao1024
9df1a1a8de fix(doctor): recognize nvidia as vendor-slug-accepting provider
NVIDIA NIM API uses vendor-prefixed model IDs (e.g. qwen/qwen3.5-122b-a10b,
nvidia/nemotron-3-super-120b-a12b). The doctor command incorrectly warns that
vendor-prefixed slugs belong to aggregators like openrouter when nvidia is
the configured provider.

Add 'nvidia' to the providers_accepting_vendor_slugs set so doctor no longer
raises false-positive warnings for valid NVIDIA NIM configurations.

Fixes #35425
2026-06-15 12:24:46 -07:00
FT_IOxCS
92a456f711 fix(cli,deps): clear esbuild audit loop
Upgrade the Vite/esbuild surfaces that kept web, ui-tui, and the bootstrap installer on vulnerable esbuild versions, regenerate the root lockfile, and preserve intentional package+lock dependency edits during update lockfile cleanup.
2026-06-15 06:18:27 -07:00
Teknium
0d82060c74 fix: harden WhatsApp target alias salvage
Add a parser-only routing regression that proves raw WhatsApp group JIDs bypass channel-directory resolution and home-channel fallback, include channel_aliases.json in quick state snapshots, harden malformed alias handling, and map Keiron McCammon for release attribution.
2026-06-15 05:51:47 -07:00
Veritas-7
febdddb41a fix(auth): refresh xAI OAuth tokens earlier 2026-06-15 05:40:23 -07:00
kshitijk4poor
497352bc4e fix(auth): write rotated xAI OAuth tokens back to global root (#43589)
The salvaged read-side fix lets a profile resolve the xAI OAuth grant from
the global-root auth store when it has no own providers.xai-oauth block.
But _save_xai_oauth_tokens still wrote rotated tokens only to the active
profile store. Because xAI rotates the refresh_token on every refresh, a
profile that reads root's grant and refreshes it left root holding a now-
revoked refresh token — killing every other profile reading the stale root
grant with invalid_grant once its access token expired (#43589).

Detect the read-from-root case (profile lacks its own providers.xai-oauth
block) and, after the profile save, write the rotated chain back to the
global root too via a best-effort, TOCTOU-safe write-through that reuses
_save_auth_store with an explicit target path. A profile that genuinely
shadows root (has its own block) is left untouched, classic mode is a
no-op, and a failed root write never breaks the profile's own save.

Pairs with the read fallback in the preceding commit so the cross-profile
xAI grant stays coherent in both directions.
2026-06-15 17:08:19 +05:30
Andrew Walker
f1d6f04362 fix(auth): resolve xAI OAuth credentials across profiles
(cherry picked from commit 8d8b9f50e4)
2026-06-15 17:03:35 +05:30
helix4u
dcc3216955 fix(mcp): fail fast for noninteractive oauth without tokens 2026-06-15 04:22:07 -07:00
Teknium
aca11c227e
fix(docker): skip gateway reconciliation in dashboard container (autodetect) (#46293)
* fix(docker): skip per-profile gateway reconciliation in dashboard container

When gateway and dashboard containers share a bind-mounted HERMES_HOME,
both run the cont-init.d profile reconciliation script, which creates
s6-log processes for every persisted profile.  These s6-log processes
in different containers race to flock() the same log-directory lock
files under logs/gateways/<profile>/lock, producing repeated
"s6-log: fatal: unable to lock ... Resource busy" errors and a
supervision restart storm.

Add HERMES_SKIP_PROFILE_RECONCILE env var support to container_boot.py
and set it in the official docker-compose.yml dashboard service so the
dashboard container no longer creates per-profile gateway s6 services
it never uses.

* chore(release): map salvaged contributor

* refactor(docker): autodetect dashboard container instead of env-var gate

Replace the HERMES_SKIP_PROFILE_RECONCILE env var with PID 1 argv role
detection. A dashboard-only container never spawns or supervises
per-profile gateways, so the reconcile boot hook now skips itself when
/proc/1/cmdline is the dashboard command — no operator flag to set (or
forget in a hand-written manifest, which would reintroduce the s6-log
flock storm this prevents).

- Extract _strip_container_argv_prefix() shared by the legacy-gateway
  and new dashboard detectors (DRY the init/wrapper/hermes peel).
- Add _is_dashboard_container(); gate reconcile main() on it.
- Drop HERMES_SKIP_PROFILE_RECONCILE from code + docker-compose.yml.
- Tests: argv matrix for both roles + main()-level skip/reconcile proof
  and a regression that the removed env var is now inert.

Co-authored-by: 895252509 <895252509@qq.com>

---------

Co-authored-by: zhouxiang <895252509@qq.com>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 20:51:48 +10:00
Ben Barclay
95715dcb03
fix(s6): reserved default gateway must not follow sticky active_profile (#46483)
The supervised `gateway-default` s6 slot runs bare `hermes gateway run`
(no -p) to mean "the root HERMES_HOME profile". But `_apply_profile_override`
falls through its #22502 HERMES_HOME guard for the container root
(/opt/data, whose parent is not `profiles`) and reads the sticky
`active_profile` file. If the user set another profile active (e.g. via
the dashboard), the reserved default gateway gets redirected into that
profile — producing a duplicate gateway for the active profile and no
real default gateway. The profile page and `gateway status` then
correctly report default as "not running" because there genuinely isn't
one.

Guard step 2 (the sticky active_profile fallback) with the existing
HERMES_S6_SUPERVISED_CHILD sentinel that the container run-script already
exports. Supervised named-profile slots pass -p explicitly (step 1, never
reaches step 2); only the bare default slot was affected. Inert outside
the s6 container — the sentinel is never set elsewhere.

Reported in the 'Docker & Profiles & Dashboard' support thread.
2026-06-15 05:36:20 +00:00
Ben Barclay
80f8ffc74c
fix(dashboard): pin machine-dashboard reroute to the machine root, not $HOME/.hermes (#46487)
The unified machine-dashboard reroute (cmd_dashboard) re-execs a named-profile
dashboard launch as the machine dashboard and dropped HERMES_HOME from the
child env with the comment "so the child binds the machine root". That holds
for a standard install (root == ~/.hermes) but breaks the Docker layout: the
published image sets `ENV HERMES_HOME=/opt/data`, so once HERMES_HOME is unset
the child falls back to $HOME/.hermes = /opt/data/.hermes — an empty,
auto-seeded home.

Two user-visible symptoms, one root cause (reported via support):

1. Dashboard Profiles page shows only an empty `default` — the real
   default/oracle/saga profiles live under /opt/data/profiles, but the
   rerouted child resolves _get_profiles_root() to /opt/data/.hermes/profiles.

2. The "Update Hermes" button runs `hermes update` inside the container
   repeatedly instead of bailing with the docker-update guidance. The Docker
   guard keys off detect_install_method(), which reads
   $HERMES_HOME/.install_method; the image stamps that at /opt/data, but the
   misresolved home has no stamp, no HERMES_MANAGED, and no .git → falls
   through to "pip", so the guard never fires.

The reporter's workaround was to bind-mount the host dir at both /opt/data and
/opt/data/.hermes so the two paths converge (at the cost of a self-referential
recursion).

Fix: resolve the machine root explicitly with get_default_hermes_root() and set
it on the child env instead of popping HERMES_HOME. That helper returns the
root for both layouts — ~/.hermes for a standard install, and /opt/data for
Docker (it strips a trailing profiles/<name>). Falls back to the old pop
behaviour only if root resolution raises, so the reroute is never blocked.

Regression tests in test_dashboard_unified_launch.py: the existing standard-
install test now asserts the child carries HERMES_HOME == get_default_hermes_root()
(not absent), and a new test_reexec_pins_docker_machine_root covers the Docker
layout (HERMES_HOME=/opt/data/profiles/oracle → child gets /opt/data). Both
fail against the pre-fix pop behaviour (mutation-verified).
2026-06-15 15:33:15 +10:00
Teknium
b770967263
fix(s6): persist profile gateway desired state (#46292)
* fix: persist s6 gateway desired state

* chore(release): map salvaged contributor

---------

Co-authored-by: Alfred Smith <alfred@my-cloud.me>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 14:02:10 +10:00
Teknium
61ee2dbfdb
fix(s6): make profile gateway log parent writable (#46291)
* fix(gateway): chown logs/gateways parent so late-added profiles can log

The per-profile log service script created $HERMES_HOME/logs/gateways/
via 'mkdir -p' but only chowned the leaf logs/gateways/<profile>. When
the first log service boots in root context, the gateways/ parent stays
root:root; every profile registered later runs its log service as the
dropped hermes user, 'mkdir -p' fails with EACCES, and s6-log enters a
sub-second fatal crash-loop flooding the container log. The stage2
recursive heal does not catch it either: it is gated on needs_chown,
which is false when the top-level $HERMES_HOME is already hermes-owned.

Two complementary fixes:

- service_manager._render_log_run: chown the gateways/ parent
  (non-recursively) before the leaf chown. Runs on every root-context
  boot, so it also heals volumes already poisoned by older images.
- docker/stage2-hook.sh: seed logs/gateways in the as_hermes mkdir -p
  block; cont-init runs before any service starts, so the parent
  already exists hermes-owned when the first log/run does 'mkdir -p'.

The needs_chown repair loop needs no twin entry: it already chowns
logs/ recursively, which covers logs/gateways.

Fixes #45258

* chore(release): map salvaged contributor

---------

Co-authored-by: tangtaizhong666 <tangtaizhong792@gmail.com>
2026-06-15 13:47:05 +10:00
Teknium
40d7c264f0
fix(s6): register profile gateways without auto-starting (#46266)
* fix(s6): prevent profile create from auto-starting gateway service

When hermes profile create runs inside an s6 container,
_maybe_register_gateway_service() calls register_profile_gateway()
which creates the service directory and triggers s6-svscanctl -a.
Previously the service always started immediately, causing profiles
that share the main gateway's bot token (e.g. Kanban worker profiles)
to fail with a token-lock conflict and persist gateway_state: running
— becoming zombies that resurrect on every container restart.

Wire the existing start_now parameter through the S6 implementation:
when start_now=False, write a  marker file (same pattern as
container_boot.py _register_gateway_slot) so s6-supervise leaves the
service stopped until the user explicitly runs hermes -p <profile>
gateway start.

4 files, +61/-6, 4 new tests (all passing).

* test(docker): wait for gateway running state before restart

---------

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-15 11:43:23 +10:00
Teknium
4eb0ff639b
Remove is_container check when restarting over dashboard (#46290)
Co-authored-by: IAvecilla <ignacio.avecilla@lambdaclass.com>
2026-06-15 11:09:23 +10:00
Teknium
f3fe99863d
revert(web): remove keyless Parallel search fallback (#46350)
Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced.

Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.
2026-06-14 16:47:57 -07:00
Teknium
a829e04d62
fix: migrate cloned profile configs (#46345) 2026-06-14 16:30:23 -07:00
Teknium
2a14e8957d
fix(kimi): surface K2.7 Code in native picker (#46309) 2026-06-14 14:01:03 -07:00
kshitijk4poor
ce19fdb7ce fix(skills): apply global|platform disabled union to all resolution sites
The platform-disabled fix landed only in agent.skill_utils.get_disabled_skill_names
(the system-prompt path). Two sibling resolvers still used the old
replace-not-union semantics, so the same skill could be hidden from the
<available_skills> prompt yet reported enabled elsewhere:

- hermes_cli/skills_config.get_disabled_skills (the 'hermes skills config' UI)
  returned only the platform list, so a globally-disabled skill showed as
  enabled (unchecked) on any platform with a platform_disabled entry.
- tools/skills_tool._is_skill_disabled (gates whether skill_view loads a skill)
  ignored the global list when a platform list existed, so a globally-disabled
  skill could still be loaded on such a platform.

Both now union the global list with the platform list, matching
get_disabled_skill_names. An explicit empty platform list no longer re-enables
a globally-disabled skill — global disables hold on every platform (#46201).

Also: fix the now-stale get_disabled_skill_names docstring and drop a stray
blank line. Regression tests added for both sites (proven to fail on the old
replace semantics).
2026-06-14 22:54:54 +05:30
ibrahim özsaraç
7bbe7024c2 fix: filter platform-disabled skills from <available_skills> prompt (#46201)
build_skills_system_prompt() already resolved _platform_hint but called
get_disabled_skill_names() with no argument, so the resolved platform never
reached the filter and the prompt cache_key varied by platform while the
disabled set did not. Pass _platform_hint or None.

get_disabled_skill_names() also fully ignored the global 'disabled' list once
a platform-specific list was found. Return the union (global | platform) so a
globally-disabled skill stays disabled on every platform.

Salvaged from #46203 by @iborazzi; the unrelated apps/shared/tsconfig.json
ES2023 bump is intentionally dropped (one concern per PR).
2026-06-14 22:52:57 +05:30
Teknium
7433d5f0eb fix(gateway): scope early duplicate guard to pid file 2026-06-14 08:42:06 -07:00
konsisumer
1436793051 fix(gateway): block shell gateway run when a service supervises the profile 2026-06-14 08:42:06 -07:00
Diyon18
288f7026e3 fix(messaging): correct Weixin personal account labeling 2026-06-14 04:52:54 -07:00
Teknium
a27d7e68cc
fix(mcp): block suspicious stdio configs before probe (#46112) 2026-06-14 04:46:54 -07:00
Teknium
972a9885ee
fix(mcp): block exfil-shaped stdio server configs (#46083) 2026-06-14 04:24:14 -07:00
Teknium
0428945b5b
fix(desktop): keep profile homes out of bootstrap (#46073) 2026-06-14 03:08:52 -07:00
LeonSGP43
89bdb1e546 fix: read dashboard spa assets as utf-8
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-14 02:31:04 -07:00
Teknium
7b9dc7cd0a test(gateway): align web profile wrapper expectation 2026-06-14 02:20:55 -07:00
helix4u
d76a58bd15 fix(gateway): resolve sudo profile system installs 2026-06-14 02:20:55 -07:00
helix4u
85e6232a07 fix(providers): support anthropic proxy v1 endpoints 2026-06-14 02:09:16 -07:00
Teknium
1b16c48170 fix: guard OAuth account removal 2026-06-13 21:47:13 -07:00
Teknium
c8e5f34f24
fix(gemini): strip native self prefixes before generateContent (#36141)
Strip `google/` and `gemini/` self-prefixes before native Gemini generateContent calls, and keep provider-normalization expectations aligned.
2026-06-13 13:47:08 -07:00
Teknium
08890d77e6
fix(plugins): normalize browser-pasted GitHub repo URLs (#33539)
Accept common GitHub web URLs in `hermes plugins install` by normalizing repository views back to cloneable `.git` URLs, with focused parser coverage.
2026-06-13 13:23:59 -07:00
helix4u
78c11d99e3 fix(update): stop Windows gateways before mutating install 2026-06-13 10:46:08 -07:00
WompaJango
28bf8fb47d feat(dashboard): clone profiles from any source 2026-06-13 07:33:58 -07:00
Que0x
3380563d94 fix(security): stop /api/status leaking host paths and PID on gated binds
The dashboard's public /api/status liveness endpoint is in PUBLIC_API_PATHS
and bypasses dashboard auth, yet it returned absolute hermes_home,
config_path, env_path, the gateway PID, and the internal gateway health URL.
That exceeds the shape its own allowlist documents as public ("version,
gateway state, active session count, and the dashboard auth-gate shape. No
bodies, no session content, no secrets"), leaking deployment recon to any
unauthenticated caller on a network-exposed (gated) bind.

Withhold host-local detail unless the bind is loopback / --insecure, where
the dashboard is local-only and the caller is already inside the trust
envelope -- the same split should_require_auth draws. The NAS liveness probe
and the auth-gate badge are unaffected.

Adds invariant tests for both modes (gated withholds, loopback keeps).
2026-06-13 07:18:59 -07:00
Teknium
d206e1f51d fix(dashboard): keep local file browser on home 2026-06-13 06:39:38 -07:00