Commit graph

2806 commits

Author SHA1 Message Date
kshitij
49d7481dfb
Merge pull request #47706 from NousResearch/fix/cli-login-deprecation-graceful
fix(cli): deprecated `hermes login` fails gracefully for any provider
2026-06-17 23:02:32 +05:30
definitelynotguru
eaddeaf2e6 feat(xai): add grok-composer-2.5-fast to xAI OAuth model picker
The model is callable via xAI OAuth but omitted from models.dev and
/v1/models listings. Merge it into the curated xAI catalog so it appears
in `hermes model` without requiring a custom model name.
2026-06-17 09:49:46 -07:00
Teknium
c6c8abbadb
refactor: remove agent-callable send_message tool (#47856)
* feat(mcp): raise default tool-call timeout 120s -> 300s

Port from openai/codex#28234. Long-running MCP tools (web fetches,
sandboxed builds, deep-research servers) routinely exceed 120s, causing
spurious timeout failures. Codex bumped its default MCP tool timeout from
120 to 300 for the same reason.

- _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server
  'timeout' config override unchanged)
- update test_default_timeout assertion
- document the default in mcp-config-reference.md

* refactor: remove agent-callable send_message tool

The agent should not decide on its own to fire off cross-platform
messages or reactions. Outbound platform messaging is handled outside
the agent loop — cron delivery, the gateway kanban notifier
(dashboard-toggled), and the `hermes send` CLI.

Removes the model-tool registration only; the send engine in
send_message_tool.py (_send_to_platform, _send_via_adapter,
_parse_target_ref, per-platform _send_* helpers) is kept intact for
those non-agent callers. Drops the now-empty 'messaging' toolset and
its `hermes tools` toggle. Yuanbao DM guidance now points at the
native yb_send_dm tool.
2026-06-17 07:11:23 -07:00
Teknium
cbfa018aef
fix(auth): retry Codex device-code login on 429 with clear rate-limit message (#47860)
The OpenAI device-code login (POST auth.openai.com/.../deviceauth/usercode)
had no retry or 429 handling — a transient throttle from OpenAI surfaced as
a bare "Device code request returned status 429" with no guidance, reading
as a hard login failure.

- Retry the device-code request with capped exponential backoff (honoring
  Retry-After), up to 4 attempts.
- On persistent 429, raise a clear AuthError tagged CODEX_RATE_LIMITED_CODE
  (classified transient, not a credential problem) with a wait hint.
- Apply the same 429 classification to the token-exchange step (same bug
  class).

Unrelated to PR #47399 (Responses-API cache headers); this is the OAuth
device-code path in hermes_cli/auth.py.
2026-06-17 05:48:35 -07:00
teknium1
06d907dc4e fix(dashboard): only run runtime-pid liveness fallback against local status
get_runtime_status_running_pid() validates liveness with a local
os.kill(pid, 0) probe. In /api/status the runtime record can be the
REMOTE health-probe body (cross-container), whose PID belongs to another
host and is display-only — probing it locally is wrong and trips the
test live-system guard (os.kill on a PID outside the test subtree).
Run the fallback only against the local read_runtime_status() record.
2026-06-17 05:40:57 -07:00
teknium1
dc86d48a3e fix(dashboard): use await-safe config-only scope for /api/status profile
_profile_scope swaps process-global skills_tool/skill_manager module
attrs under an RLock; /api/status holds that scope across the
run_in_executor remote-health probe await, so a concurrent
/api/skills?profile=X request can cross-restore the status profile's
skill dir on its finally. Add _config_profile_scope (contextvar-only,
task-local, await-safe) and use it for status, which only resolves
get_hermes_home() at call time for config/env/gateway state and never
needs the skills-module globals.
2026-06-17 05:40:57 -07:00
Shannon Sands
674e8b098a Fix dashboard gateway profile scoping 2026-06-17 05:40:57 -07:00
Teknium
f80381c456
feat(prompt): scale context-file cap to model window + point agent at truncated file (#47846)
Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were
hard-capped at a flat 20K chars before head/tail truncation. Among the agent
harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code,
OpenCode, and Cline load them whole. The flat 20K predates large context
windows and silently truncates real-world AGENTS.md files.

B — dynamic cap: when context_file_max_chars is unset (now the shipped
default), the cap scales with the model's context window
(ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at
the historical 20K; a 200K model gets 48K; large models stop truncating real
docs. An explicit context_file_max_chars still wins. Context length is resolved
once per conversation (stable -> prompt cache untouched).

C — when truncation does happen, the marker now names the concrete file path
and tells the agent to read_file it for the full content.

Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config
(0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate
with the path-bearing marker, large windows load whole, and the system prompt
is byte-stable across rebuilds.
2026-06-17 05:40:26 -07:00
Teknium
7bbffceb9c
feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840)
The curator now defaults to prune-only: the deterministic inactivity pass
(mark stale / archive long-unused skills) still runs whenever the curator is
enabled, but the opinionated LLM umbrella-building consolidation fork is OFF
by default.

- agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate
  the forked aux-model review in run_curator_review behind it (new consolidate
  param, None=read config). When off, the LLM pass is skipped entirely (no
  aux-model cost); the run is still recorded and reported.
- config.py: add curator.consolidate (default false); v29->v30 migration seeds
  the key for existing installs without clobbering a user-set value.
- hermes_cli/curator.py: 'hermes curator run --consolidate' override; status
  shows consolidate state; prune-only notice on run.
- docs + tests.
2026-06-17 05:20:32 -07:00
Teknium
e48803daec
fix(gateway): defer macOS launchd reload when run inside the gateway tree (#47842)
When refresh_launchd_plist_if_needed() runs from inside the gateway's own
launchd process tree (agent-initiated self-update via the terminal tool), a
direct launchctl bootout tears down the service's process group — including
the CLI doing the refresh — before the follow-up bootstrap can run. The
gateway is left unloaded and KeepAlive can't revive it (#43842).

Detect in-service execution via gateway.status.get_running_pid() +
_is_pid_ancestor_of_current_process(), and delegate the bootout->bootstrap to
a detached (start_new_session=True) helper that survives the process-group
teardown. The normal out-of-tree CLI path is unchanged.

Fixes #43842.
2026-06-17 05:19:21 -07:00
kshitijk4poor
a7ec334448 fix(cli): deprecated hermes login fails gracefully for any provider
`hermes login` was removed in favor of `hermes auth` / `hermes model`, but
the subparser still validated `--provider` against a hardcoded choices list
(nous, openai-codex, xai-oauth). Running `hermes login --provider anthropic`
therefore crashed in argparse with `invalid choice: 'anthropic'` *before* the
deprecation handler could print the redirect to `hermes model` — so a user
trying to authenticate a perfectly valid provider just saw a hard error and
assumed the feature was broken rather than relocated.

- Drop the restrictive `choices=` so every `--provider` value reaches the
  deprecation handler (which ignores the value and prints guidance).
- Omit the subparser `help=` kwarg so the dead command no longer advertises
  itself in `hermes --help` (#24756). Avoids the `==SUPPRESS==` placeholder
  leak that `help=argparse.SUPPRESS` emits for a top-level subparser on 3.12+.
- `hermes login [--flags]` still reaches the actionable deprecation message
  for old scripts/aliases; `hermes login --help` shows the redirect.

Picks up the intent of the inactivity-closed #24902, rebased onto the
post-refactor parser location (hermes_cli/subcommands/login.py) and extended
to fix the whole bug class (any provider value), not just hiding from --help.

Tests: parametrized provider acceptance + help-suppression (no SUPPRESS leak).
2026-06-17 12:55:40 +05:30
kshitijk4poor
ca6542f602 docs(cli): note URL exclusion in _extract_path_word docstring
The docstring described a token as path-like when it contains a "/"
separator, but the keystroke-latency fix now excludes "://" scheme tokens
(URLs) even though they contain "/". Document the exclusion so the contract
matches the behavior.
2026-06-17 12:36:01 +05:30
xxxigm
f48b312037 fix(cli): keep typing responsive by not blocking the keystroke loop
The interactive CLI input box runs its completer with
`complete_while_typing=True`, so `SlashCommandCompleter.get_completions`
is invoked on *every* keystroke. That completer does blocking I/O:
fuzzy `@`-file indexing shells out to `rg`/`fd` (up to a 2s timeout) and
file-path completion calls `os.listdir` + `stat`. Because the completer
was passed inline (never wrapped in `ThreadedCompleter`), all of this ran
synchronously on the prompt_toolkit event loop, stalling the render after
each key — very noticeable on WSL2 and other slow-filesystem setups
("typing in the prompt box being very latent").

Two fixes:

- Wrap the input completer in `ThreadedCompleter` so completion work runs
  off the UI event loop and never blocks rendering between keystrokes.
- Stop treating URLs as file paths in `_extract_path_word`: a token like
  `https://example.com/x` contains `/`, so it triggered `os.listdir` on
  every keystroke while typing/pasting a link (listing a bogus `https:`
  dir) for a completion that can never be useful. Skip any token with a
  `://` scheme separator.

(cherry picked from commit b5be2ba276)
2026-06-17 12:32:38 +05:30
teknium
36ae958473 feat(gateway): gate message timestamps behind opt-in (default off)
Follow-up to salvaged PR #41633: the timestamp prefix injection was
unconditional. Gate the in-context render behind
gateway.message_timestamps.enabled (default false) at both the live-message
and history-replay sites; timestamp metadata is still captured + persisted
regardless so the toggle can be flipped on later. Add DEFAULT_CONFIG entry,
docs, and gate tests.
2026-06-16 15:49:59 -07:00
xxxigm
d1ecebcbfd
fix(desktop): re-download Electron binary via mirror when pack fails (#47266) (#47276)
* fix(desktop): re-download Electron binary via mirror when pack fails (#47266)

Since #38673 pinned build.electronDist to node_modules/electron/dist,
electron-builder reads the Electron binary straight from there and never
downloads it during `npm run pack`. That dist tree is only produced by the
electron package's postinstall (install.js) during `npm ci`. When that
download is blocked or throttled (GitHub's release host is unreachable in
some regions), the dist is missing and the build dies with:

    The specified electronDist does not exist: .../node_modules/electron/dist

The existing ELECTRON_MIRROR fallback in all three desktop-build paths
(scripts/install.ps1, scripts/install.sh, and `hermes desktop` in
hermes_cli/main.py) re-ran `npm run pack` with ELECTRON_MIRROR set — but
pack never downloads Electron anymore, so the mirror was never used and the
retry re-read the same missing dist. The fallback was effectively dead.

Drive the mirror through electron's own downloader instead:

- Add a dist-presence check + a downloader helper (Test-ElectronDist /
  Restore-ElectronDist, _electron_dist_ok / _restore_electron_dist,
  _electron_dist_ok / _redownload_electron_dist) that wipes a partial dist
  + the path.txt version marker (electron's install.js short-circuits on it)
  and re-runs `node install.js`, optionally via a mirror.
- On the first retry, repopulate a missing dist from the canonical source;
  on the mirror retry, re-fetch through npmmirror.com, then pack.
- Gate the re-download on the dist check so an unrelated build failure
  (tsc/vite) doesn't trigger a pointless ~200 MB refetch, and skip the final
  pack when the binary still can't be fetched instead of failing the same way.

* test(desktop): cover Electron dist re-download mirror fallback (#47266)

Add behavior coverage for the electronDist re-download fix:

- _electron_dist_ok across linux/win32/darwin, including the partial-dist
  case (dir present but binary missing) that makes the pinned electronDist
  fail.
- _redownload_electron_dist: no-op when the binary is present, bail when
  install.js is absent, wipe a stale dist + path.txt marker and run
  electron's downloader with ELECTRON_MIRROR injected, and report failure
  when the download still produces no binary.
- `hermes desktop`: the mirror fallback now drives electron's own downloader
  before re-running pack, and skips the final pack entirely when the binary
  can't be fetched.

Replaces the old mirror test that asserted the (now-fixed) dead behavior of
re-running `npm run pack` with ELECTRON_MIRROR set — pack never downloads
Electron under the pinned electronDist, so that retry could never help.
2026-06-16 15:40:55 -05:00
liuhao1024
1b962f001e fix(models): pass model.base_url to fetch_models in /model picker
The /model interactive picker resolved a base_url from user credentials
but never passed it to ProviderProfile.fetch_models(), causing the
picker to always query the provider's hardcoded default endpoint
instead of the user's custom URL (e.g. a company litellm proxy).

- providers/base.py: add optional base_url parameter to fetch_models()
- hermes_cli/models.py: pass resolved base_url to fetch_models()
- Update all subclass overrides for signature compatibility
- Add 6 regression tests covering override, fallback, and integration
2026-06-16 13:09:40 -07:00
chimpera
1039e90b5e fix(model-switch): probe /v1/models for providers without api_key
Section 3 of list_authenticated_providers (user-defined endpoints from
the providers: config section) required an api_key before probing the
endpoint's /v1/models for live model discovery. This broke local
self-hosted backends (llama.cpp, Ollama, vLLM, etc.) that don't require
authentication — they would only ever show the single default_model
from config instead of the full model catalog.

Section 4 (custom_providers list) already handled this correctly with
the policy: probe when api_key is set OR when no explicit models are
configured. Apply the same logic to Section 3 so local backends get
full model discovery without requiring a placeholder api_key workaround.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-16 13:07:52 -07:00
cyb0rgk1tty
b7fa62c530 fix(inventory): keep user-defined custom providers in model dedup
The #45954 model-dedup builds `user_models` from every is_user_defined
row, then strips those model IDs from every row where is_aggregator(slug)
is True. But is_aggregator() returns True for *every* `custom:*` slug, and
list_authenticated_providers emits named custom providers with slug
`custom:<name>` and is_user_defined=True. So a user's own custom provider
is treated as an aggregator and filtered against user_models — which holds
exactly its own models (the row helped build that set). Every model is
removed, the row drops to zero, and the provider disappears from the model
picker.

Guard the dedup loop to skip is_user_defined rows: a user's configured
provider is never an aggregator duplicate of itself. Built-in aggregators
(openrouter, etc.) are still deduped as before. Adds a regression test.
2026-06-16 13:04:07 -07:00
Jaaneek
bbc842d31e feat(xai): default to grok-build-0.1
Switch the default model for the xAI/Grok provider and the xAI web
search backend from grok-4.3 to grok-build-0.1. grok-build-0.1 is
already recognized by the model metadata, so no new model definition
is required; grok-4.3 remains selectable.
2026-06-16 11:50:17 -07:00
kshitij
8fa562a399
Merge pull request #47391 from kshitijk4poor/feat/add-glm-5.2
feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists
2026-06-17 00:02:05 +05:30
Wolfram Ravenwolf
f6a42b1acf feat(prompt): make context-file truncation limit configurable
PROBLEM: Automatic context files such as SOUL.md and AGENTS.md were capped by a hardcoded CONTEXT_FILE_MAX_CHARS value. Amy's local fork had raised that constant from 20K to 25K so a larger SOUL.md would not be silently truncated, but the hardcoded 25K value changed upstream default behavior and made the patch less generally useful.

SOLUTION: Restore the upstream-compatible 20K default, add a context_file_max_chars config setting for users who intentionally keep larger identity/project-context files, keep chat-visible truncation warnings, and document the new setting. Tests cover the default, config override, explicit max_chars precedence, and the warning text.
2026-06-16 11:28:35 -07:00
kshitijk4poor
b2da39a0f3 feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists
Z.ai released GLM 5.2 on 2026-06-15, available on OpenRouter:
  - https://openrouter.ai/z-ai/glm-5.2

GLM-5.2 is Z.ai's flagship for long-horizon tasks, shipping a 1M-token
context window (up from 200K on GLM 5.1) and tool calling. Per the
OpenRouter API: text-only, context_length 1048576, tools supported.
No separate -fast variant exists.

The 1M context length, native zai picker entry, setup wizard, and Z.ai
coding-plan auth entries for glm-5.2 already landed on main. This fills
the remaining gap: the two aggregator surfaces where glm-5.1 appears but
glm-5.2 did not.

Changes:

  hermes_cli/models.py
    - Add z-ai/glm-5.2 to the OpenRouter fallback snapshot (OPENROUTER_MODELS)
      and the Nous Portal curated list (_PROVIDER_MODELS["nous"]), newest
      flagship first. Live catalogs surface it automatically when reachable;
      the fallback lists matter when the manifest fetch fails.

  website/static/api/model-catalog.json
    - Regenerated via scripts/build_model_catalog.py (not hand-edited) so the
      manifest stays in sync with the source lists; guarded by
      tests/hermes_cli/test_model_catalog.py.
2026-06-16 23:35:45 +05:30
kshitij
17251e865b
Merge pull request #46857 from liuhao1024/fix/model-picker-merge-live-static
fix(models): merge live API results with curated static catalog in generic provider path
2026-06-16 23:30:34 +05:30
kshitijk4poor
658ac1d866 fix(models): keep curated-first ordering in live+curated merge; use pure-catalog helper in validation
The generic live+curated merge (commit 630b438) seeded the merged list
from live results, demoting curated-only models below live ones. That
regressed #46309, which deliberately surfaces the newest curated model
(kimi-k2.7-code) FIRST in the native picker even when the live /models
listing lags. Restore curated-first ordering: curated entries lead (in
catalog order), live-only entries are appended for discovery. This keeps
the #46850 fix (zai glm-5.2 now appears) without the kimi regression.

Also switch the validate_requested_model curated fallback (commit
ee7b8a4) from provider_model_ids() — which triggers a second, uncached
live /models fetch with its own 8s timeout and may resolve different
credentials than the api_key/base_url just probed — to the pure-catalog
helper _model_in_provider_catalog(). Membership is checked against the
shipped catalog only, with no extra network call.

Tests: restore the curated-first assertion in
test_kimi_coding_live_catalog_does_not_hide_curated_k2_7_code; update
the new merge tests to curated-first semantics; de-circularize the
validation fallback tests to patch _PROVIDER_MODELS (the real source)
instead of mocking the function under test.
2026-06-16 23:25:07 +05:30
brooklyn!
c6e99ab375
Merge pull request #46959 from NousResearch/bb/composer-model-selector
feat(desktop): composer model selector, per-model presets & external-provider disconnect
2026-06-16 09:55:57 -05:00
Teknium
4d470b3dbb
fix(slack): route /debug via /hermes to restore Telegram-parity (#47248)
Slack caps apps at 50 slash commands and the registry is at that ceiling, so
adding /debug clamped it out of the native list and broke the telegram-parity
test (debug on Telegram, absent from Slack native slashes, in neither
exclusion set). Add 'debug' to _SLACK_VIA_HERMES_ONLY — same treatment credits
already gets. /debug stays native on CLI/TUI/Telegram/Discord and reachable via
/hermes debug on Slack.
2026-06-16 06:20:01 -07:00
MrDiamondBallz
9a59ad73dd fix(auth): preserve Codex pool-only rate-limit state
Classify exhausted pool-only openai-codex credentials as quota/rate-limited instead of missing auth. This prevents auth status and runtime credential resolution from reporting missing credentials when a valid manual:device_code pool credential exists but is temporarily in a 429 usage-limit cooldown.

Adds regression coverage for pool-only Codex auth status and runtime resolution.
2026-06-16 05:56:11 -07:00
teknium
6373aba80f feat(gateway): rename to tool_progress_grouping, add config/docs/tests
Follow-up to salvaged PR #41620:
- Rename tool_progress_style -> tool_progress_grouping (clearer intent)
- Add display.tool_progress_grouping to DEFAULT_CONFIG (accumulate default)
- Document in messaging docs incl. 'separate is noisier, only where progress enabled'
- Add resolver tests (default/global/override/invalid/case)
2026-06-16 05:49:24 -07:00
teknium
98ae28657f feat(display): document and test memory_notifications setting
Follow-up to salvaged PR #4684:
- Add display.memory_notifications to DEFAULT_CONFIG (off|on|verbose, default on)
- Document the setting in docs/user-guide/features/memory.md
- Add resolver tests for off/on/verbose memory + skill paths
2026-06-16 05:45:40 -07:00
Teknium
a6364bfa08
fix(telegram): edit streamed previews in place as rich (Bot API 10.1) (#46890)
Streamed Telegram replies that finalize through editMessageText were
converted to MarkdownV2, which has no table syntax and rewrites pipe
tables into bullet lists — users saw a table while streaming that
collapsed to a list at the last moment.

Finalize now edits the existing preview IN PLACE via Bot API 10.1's
editMessageText rich_message parameter when the content has constructs
the legacy path degrades (tables, task lists, <details>, block math).
No fresh send + delete, so no duplicate-preview flicker — the reason
#46206 reverted the fresh-final re-send path. prefers_fresh_final_streaming
stays False; the in-place edit replaces it.

- _needs_rich_rendering(): rich reserved for table/task-list/details/math
  (adapted from #45995, @YonganZhang); plain replies stay on MarkdownV2.
- _try_edit_rich(): editMessageText + rich_message via do_api_request,
  mirroring _try_send_rich's fallback/latch/transient contract.
- edit_message finalize tries rich in place before the 4,096 overflow
  pre-flight (rich cap is 32,768), falling back to legacy on rejection.
- rich_messages default flipped back to True (DEFAULT_CONFIG + adapter).
- docs (en + zh-Hans) + cli-config example updated to default-on.

Closes the root cause behind #45911 / #46009.
2026-06-16 05:26:04 -07:00
liuhao1024
ee7b8a4672 fix(models): validate_requested_model falls back to curated catalog when live API omits model
When live /v1/models responds but omits a model that exists in the
curated static catalog, validate_requested_model now accepts it with
a note instead of rejecting. This covers the /model slash-command path
(the picker path was already fixed in the parent commit).

Addresses review feedback from potatogim on #46857.
2026-06-16 16:24:11 +08:00
liuhao1024
630b43892d fix(models): merge live API results with curated static catalog in generic provider path
When a provider's live /v1/models endpoint returns a stale or incomplete
list (e.g. Z.AI missing glm-5.2), the generic profile-based code path
returned only the live results, silently dropping curated models.

Generalize the kimi-coding merge pattern to all providers: live entries
come first (provider's preferred order), then curated-only entries are
appended with case-insensitive dedup. This ensures models that the live
endpoint omits still appear in /model picker.

Fixes #46850
2026-06-16 16:21:01 +08:00
Brooklyn Nicholson
a0ec4f52b9 feat(desktop): disconnect external (CLI-managed) providers
External providers (Claude Code) store creds outside Hermes, so the
disconnect API refuses them. The backend now hands the GUI a per-OS
`disconnect_command` that clears the credential the same way the CLI's
logout does (macOS Keychain entry + ~/.claude/.credentials.json), and
the misleading "use claude setup-token" hint is corrected.

Settings → Providers offers a Disconnect button for these: it confirms,
leaves Settings, and runs the removal command in the embedded terminal
via a new runInTerminal() (queues onto $terminalInjection; the terminal
pane flushes and clears it once its session is live). The expanded list
also gets its own "Other providers" header so it no longer reads as
grouped under "Connected". API-managed providers keep the one-click
(trash) disconnect.
2026-06-16 00:08:21 -05:00
brooklyn!
c6b0eb4de0
fix(desktop): open remote-gateway artifacts via authenticated download (#46895)
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
Typecheck / desktop-build (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Has been cancelled
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Has been cancelled
OSV-Scanner / Scan lockfiles (push) Has been cancelled
uv.lock check / uv lock --check (push) Has been cancelled
On a remote gateway connection, agent-written files live on the gateway
host, not the desktop's disk, so the Artifacts view's file:// hrefs failed
("Invalid external URL") and image thumbnails broke.

Make mediaExternalUrl() remote-aware in one place: in remote mode it
rewrites gateway-local paths to GET /api/files/download (a new endpoint
that streams the file as a Content-Disposition: attachment). The artifacts
view now resolves through it, and so do the existing chat-media and
generated-image callers, for free.

The download endpoint stays auth-gated; auth_middleware additionally
accepts the session token as a ?token= query param for this one path so a
shell/browser-opened download (which can't set the session header) still
authenticates — the same query-token tradeoff as the /api/pty WebSocket.
It is NOT added to PUBLIC_API_PATHS.

Salvages #46663 (which carried ~19k lines of CRLF noise and made the
endpoint public). Reimplemented on a clean LF base with the security hole
closed and tests added.

Co-authored-by: qingshan89 <qs2816661685@gmail.com>
2026-06-15 23:50:19 -05:00
Gille
0441b7f19f
fix(desktop): route global remote profile REST calls (#47011)
* fix(desktop): route global remote profile REST calls

* fix(dashboard): scope oauth provider routes by profile

* test(tui): isolate notification poller queue
2026-06-15 23:24:55 -05:00
Shannon Sands
7cd71de1f4 Simplify dashboard update detection to containers 2026-06-15 20:08:39 -07:00
Shannon Sands
b1d6a57883 Detect containerized dashboard update management 2026-06-15 20:08:39 -07:00
Shannon Sands
0b6b29a30c Hide hosted dashboard update controls 2026-06-15 20:08:39 -07:00
Teknium
c66ecf0bc3
feat(delegation): async background subagents via delegate_task(background=true) (#40946)
* feat(delegation): async background subagents via delegate_task(background=true)

delegate_task(background=true) dispatches a subagent that runs in the
background and returns a handle immediately, so the user and model keep
working while it runs. The full result — plus the original task source —
re-enters the conversation as a new turn when the subagent finishes,
riding the same completion-queue rail as terminal background processes.

- tools/async_delegation.py: daemon-executor registry, capacity cap,
  rich self-contained completion event pushed onto the shared
  process_registry.completion_queue (type='async_delegation').
- delegate_tool.py: background param + single-task dispatch branch;
  batch async rejected (v1).
- process_registry.py: format_process_notification renders the rich
  task-source block (goal/context/toolsets/model/status/result).
- gateway/run.py: dedicated _async_delegation_watcher drains + injects
  results into the originating session (idle + post-turn), session_key
  routing enrichment, shutdown interrupt of dangling delegations.
- config: delegation.max_async_children (default 3).

Reuses the existing idle-drain wiring rather than mutating a running
agent loop, preserving message-role alternation and prompt-cache
invariants. 13 targeted tests; CLI + gateway paths E2E-verified.

* test(delegation): make async non-blocking tests environment-independent

CI 'test (5)' flaked on a cold, 8-worker runner: the first
delegate_task(background=true) call measured 2.27s of one-time setup
(config load + child-agent construction + imports), tripping the
elapsed < 1.0 wall-clock assertion. That assertion was testing setup
overhead, not blocking.

Replace the wall-clock thresholds with the real invariant: dispatch
returns while the child is still gated (active_count == 1, completion
queue empty), which a synchronous impl could not do. Keep only a loose
4s sanity backstop well under the runner's 5s gate.

* fix(delegation): harden async background delegation

Follow-up review fixes:
- Detach background child from parent._active_children at dispatch —
  otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache
  evicts (release_clients), and session close (/new) kill/close the
  detached subagent mid-run, defeating the point of background mode.
  Lifecycle is owned by the async registry's interrupt_fn.
- Make the capacity check atomic with the record insert (TOCTOU: two
  concurrent dispatches could both pass active_count() and exceed the cap).
- TUI dedup: key async_delegation events by delegation_id — the
  fallthrough keyed them all as ("", type), suppressing every completion
  after the first in the desktop/TUI status feed.
- CLI /stop now interrupts running background delegations and /agents
  lists them (they live outside the process registry and were invisible).
- Drop stray unbalanced ']' line from the re-injection block and the
  unused _ASYNC_DEFAULT import.

Tests: detach-at-dispatch + concurrent-capacity race added (15 total in
test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch
+ 7 TUI dedup tests pass.

* fix(delegation): harden async background completion drains
2026-06-15 13:33:12 -07:00
xxxigm
b2a4766463 fix(dump): report effective terminal backend in hermes debug
`terminal.backend` in config.yaml is bridged to the TERMINAL_ENV env var,
but a TERMINAL_ENV set in .env / the shell overrides config and is what
terminal_tool actually uses. The dump printed only the config value, so a
user whose agent was jailed in a docker/podman sandbox via a stale
TERMINAL_ENV still saw `terminal: local` — hiding the real cause. Report
the effective backend and flag when TERMINAL_ENV overrides config.yaml.
2026-06-15 12:31:23 -07:00
liuhao1024
60cc42e38b fix(inventory): deduplicate models between user-defined and aggregator providers
When a user-defined provider (e.g. litellm-proxy) and an aggregator
(e.g. openrouter) both advertise the same model name, the Desktop/TUI
model picker would show the model under both groups. Selecting it from
the aggregator row silently set model.provider to the aggregator,
breaking calls because the aggregator doesn't actually serve that model
ID.

Fix: after list_authenticated_providers() returns, collect all models
from user-defined provider rows and filter them out of aggregator rows.
Uses is_aggregator() from hermes_cli/providers.py to identify
aggregators. Case-insensitive matching.

Fixes #45954
2026-06-15 12:25:41 -07:00
liuhao1024
9df1a1a8de fix(doctor): recognize nvidia as vendor-slug-accepting provider
NVIDIA NIM API uses vendor-prefixed model IDs (e.g. qwen/qwen3.5-122b-a10b,
nvidia/nemotron-3-super-120b-a12b). The doctor command incorrectly warns that
vendor-prefixed slugs belong to aggregators like openrouter when nvidia is
the configured provider.

Add 'nvidia' to the providers_accepting_vendor_slugs set so doctor no longer
raises false-positive warnings for valid NVIDIA NIM configurations.

Fixes #35425
2026-06-15 12:24:46 -07:00
Teknium
3e7e9b24d4 fix: harden salvaged session and browser improvements
Polish salvaged contributor work before PR review:
- read browser inactivity timeout from config with documented fallback
- skip redundant v10 trigram backfill before v11 FTS rebuild
- show delegate_task goals safely in progress previews
- show gateway status model/context without redundant token wording
- wire gateway /sessions to shared session-listing helpers
- map Ravenwolf author emails for release attribution

Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de>
Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>
2026-06-15 07:46:34 -07:00
Wolfram Ravenwolf
ead38107a2 feat(status): restore model and context in gateway status
PROBLEM: The old public /status PR drifted out of the current Amy patch stack, leaving /status without the model/provider, context window, or explicit cumulative token label that Wolfram uses to monitor context pressure from chat.

SOLUTION: Re-port the feature onto the current gateway status handler. Prefer live/cached agent runtime metadata, fall back to SessionDB + SessionStore state between turns, add localized status model/context lines, and keep token totals explicitly labeled cumulative.

Verification: tests/gateway/test_status_command.py, tests/hermes_cli/test_commands.py
2026-06-15 07:46:34 -07:00
FT_IOxCS
92a456f711 fix(cli,deps): clear esbuild audit loop
Upgrade the Vite/esbuild surfaces that kept web, ui-tui, and the bootstrap installer on vulnerable esbuild versions, regenerate the root lockfile, and preserve intentional package+lock dependency edits during update lockfile cleanup.
2026-06-15 06:18:27 -07:00
Teknium
0d82060c74 fix: harden WhatsApp target alias salvage
Add a parser-only routing regression that proves raw WhatsApp group JIDs bypass channel-directory resolution and home-channel fallback, include channel_aliases.json in quick state snapshots, harden malformed alias handling, and map Keiron McCammon for release attribution.
2026-06-15 05:51:47 -07:00
Veritas-7
febdddb41a fix(auth): refresh xAI OAuth tokens earlier 2026-06-15 05:40:23 -07:00
kshitijk4poor
497352bc4e fix(auth): write rotated xAI OAuth tokens back to global root (#43589)
The salvaged read-side fix lets a profile resolve the xAI OAuth grant from
the global-root auth store when it has no own providers.xai-oauth block.
But _save_xai_oauth_tokens still wrote rotated tokens only to the active
profile store. Because xAI rotates the refresh_token on every refresh, a
profile that reads root's grant and refreshes it left root holding a now-
revoked refresh token — killing every other profile reading the stale root
grant with invalid_grant once its access token expired (#43589).

Detect the read-from-root case (profile lacks its own providers.xai-oauth
block) and, after the profile save, write the rotated chain back to the
global root too via a best-effort, TOCTOU-safe write-through that reuses
_save_auth_store with an explicit target path. A profile that genuinely
shadows root (has its own block) is left untouched, classic mode is a
no-op, and a failed root write never breaks the profile's own save.

Pairs with the read fallback in the preceding commit so the cross-profile
xAI grant stays coherent in both directions.
2026-06-15 17:08:19 +05:30
Andrew Walker
f1d6f04362 fix(auth): resolve xAI OAuth credentials across profiles
(cherry picked from commit 8d8b9f50e4)
2026-06-15 17:03:35 +05:30
Teknium
aca11c227e
fix(docker): skip gateway reconciliation in dashboard container (autodetect) (#46293)
* fix(docker): skip per-profile gateway reconciliation in dashboard container

When gateway and dashboard containers share a bind-mounted HERMES_HOME,
both run the cont-init.d profile reconciliation script, which creates
s6-log processes for every persisted profile.  These s6-log processes
in different containers race to flock() the same log-directory lock
files under logs/gateways/<profile>/lock, producing repeated
"s6-log: fatal: unable to lock ... Resource busy" errors and a
supervision restart storm.

Add HERMES_SKIP_PROFILE_RECONCILE env var support to container_boot.py
and set it in the official docker-compose.yml dashboard service so the
dashboard container no longer creates per-profile gateway s6 services
it never uses.

* chore(release): map salvaged contributor

* refactor(docker): autodetect dashboard container instead of env-var gate

Replace the HERMES_SKIP_PROFILE_RECONCILE env var with PID 1 argv role
detection. A dashboard-only container never spawns or supervises
per-profile gateways, so the reconcile boot hook now skips itself when
/proc/1/cmdline is the dashboard command — no operator flag to set (or
forget in a hand-written manifest, which would reintroduce the s6-log
flock storm this prevents).

- Extract _strip_container_argv_prefix() shared by the legacy-gateway
  and new dashboard detectors (DRY the init/wrapper/hermes peel).
- Add _is_dashboard_container(); gate reconcile main() on it.
- Drop HERMES_SKIP_PROFILE_RECONCILE from code + docker-compose.yml.
- Tests: argv matrix for both roles + main()-level skip/reconcile proof
  and a regression that the removed env var is now inert.

Co-authored-by: 895252509 <895252509@qq.com>

---------

Co-authored-by: zhouxiang <895252509@qq.com>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 20:51:48 +10:00