Commit graph

10741 commits

Author SHA1 Message Date
Teknium
f8a241e105 fix(delegate): flatten content blocks in live overlay tail + AUTHOR_MAP
Follow-up on the cherry-picked content-block fix. _extract_output_tail
(the live subagent overlay) still used crude str(content), which renders
a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped
"Error: ..." result as is_error=False. Route it through the same
_stringify_tool_content helper so error detection and previews work at
both consumer sites.

- delegate_tool.py: _extract_output_tail uses _stringify_tool_content
- tests: add _extract_output_tail content-block test (error detection +
  clean preview)
- release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)
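
A minimal sketch of why routing the tail through a flattening helper matters for error detection. The helper below is a hypothetical stand-in for `_stringify_tool_content`; the block shape (`{"type": "text", "text": ...}`) and the `Error:` prefix heuristic are assumptions made for illustration, not the repository's code.

```python
from typing import Any


def stringify_tool_content(content: Any) -> str:
    """Flatten a tool result into plain text (hypothetical stand-in)."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for block in content:
            # Assumed block shape: {"type": "text", "text": "..."}.
            if isinstance(block, dict) and block.get("type") == "text":
                parts.append(str(block.get("text", "")))
            elif isinstance(block, str):
                parts.append(block)
        return "\n".join(parts)
    return str(content)


def looks_like_error(content: Any) -> bool:
    # Assumed heuristic: a failed result starts with "Error:" once flattened.
    return stringify_tool_content(content).lstrip().startswith("Error:")


blocks = [{"type": "text", "text": "Error: file not found"}]
crude = str(blocks)                    # the "[{'type': 'text'...}]" blob
flat = stringify_tool_content(blocks)  # clean preview text
```

With `str(content)` the preview starts with `[{` and a prefix check for `Error:` fails; the flattened form restores both the preview and the error flag.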
2026-06-05 23:34:00 -07:00
Alexander Lehmann
f83918c31d fix(delegate): handle content-block tool results 2026-06-05 23:34:00 -07:00
teknium1
16beab421f fix(desktop): About panel shows live Hermes version, not stale package.json
The native macOS About panel showed the Electron package.json version
(e.g. 0.15.1) while the status bar showed the real Hermes version
(0.16.0). setAboutPanelOptions() set applicationName + copyright but
omitted applicationVersion, so macOS fell back to app.getVersion() =
package.json, which drifts (release.py's desktop lockstep bump didn't
land for 0.16.0).

resolveHermesVersion() already reads the live version from
hermes_cli/__init__.py and was built 'so the desktop About panel shows
the real Hermes version' per its own comment, but was never wired in.

- Seed applicationVersion: resolveHermesVersion() at module load.
- Replace the macOS About menu item's role:'about' with a click handler
  (showAboutPanelFresh) that re-resolves the version on every open, so an
  in-place `hermes update` is reflected without an app restart.
2026-06-05 23:32:16 -07:00
helix4u
338c074336 fix(send-message): treat ntfy topic targets as explicit 2026-06-05 20:38:28 -07:00
Teknium
50f9ad70fc
fix(dashboard): populate cron delivery dropdown from configured platforms (#40218)
* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
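
The guard described above can be sketched as a single decision at the top of the overflow-recovery dispatch. The error-class labels, the function name, and the returned dictionary shape are hypothetical; only the `compaction_disabled: True` flag and the user-facing hint come from the commit message.

```python
from typing import Callable

# Hypothetical labels for the three overflow classes named in the commit.
OVERFLOW_CLASSES = {"long_context_429", "payload_too_large_413", "context_overflow"}


def recover_from_error(error_class: str, compression_enabled: bool,
                       compress: Callable[[], None]) -> dict:
    """Decide what the agent loop does with a provider error."""
    if error_class not in OVERFLOW_CLASSES:
        return {"action": "raise"}
    if not compression_enabled:
        # Guard: never auto-compact against the user's explicit setting.
        return {
            "action": "terminal_error",
            "compaction_disabled": True,
            "hint": "/compress manually, /new, switch to a larger-context "
                    "model, or reduce attachments",
        }
    compress()
    return {"action": "retry"}


calls = []
disabled = recover_from_error("payload_too_large_413", False, lambda: calls.append(1))
enabled = recover_from_error("payload_too_large_413", True, lambda: calls.append(1))
```

Only the enabled case invokes the compression callback, which mirrors the control case in the tests the commit describes.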

* fix(dashboard): populate cron delivery dropdown from configured platforms

The dashboard cron-create/edit dropdown hardcoded five delivery options
(local, telegram, discord, slack, email), so users on Matrix — or any
other backend-supported platform — had no way to pick their channel even
though the cron scheduler delivers to all of them. It also offered
Telegram/Discord/etc. to users who never set those up.

- cron/scheduler.py: add cron_delivery_targets() — the single source of
  truth. Intersects gateway-configured platforms with cron-deliverable
  ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the
  implicit local option) to the dashboard.
- CronPage.tsx: both modals render options from the endpoint. Configured
  platforms missing a home channel still appear, annotated "set a home
  channel first" (option B), so the user knows what to fix. Edit modal
  preserves a job's current target even if it's no longer configured.
  Local-only state shows a "configure a platform under Channels" hint.
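
As a rough illustration of the intersection described for `cron_delivery_targets()`, the sketch below uses simplified, hypothetical signatures and field names (`platform`, `home_channel_set`); the real function's return shape is not shown in this log.

```python
from typing import Dict, Iterable, List


def delivery_targets(configured: Iterable[str], deliverable: Iterable[str],
                     home_channels: Dict[str, str]) -> List[dict]:
    """Platforms that are both gateway-configured and cron-deliverable."""
    usable = sorted(set(configured) & set(deliverable))
    return [
        {"platform": name, "home_channel_set": bool(home_channels.get(name))}
        for name in usable
    ]


def dropdown_options(targets: List[dict]) -> List[dict]:
    # The implicit local option is always offered first.
    return [{"platform": "local", "home_channel_set": True}] + targets


targets = delivery_targets(
    configured=["matrix", "telegram"],
    deliverable=["matrix", "telegram", "discord", "slack", "email"],
    home_channels={"matrix": "!room:example.org"},
)
options = dropdown_options(targets)
```

Here Telegram is configured but has no home channel, so it still appears in the list with `home_channel_set` false, which is the state the dashboard annotates.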

Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set
and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server
green (366 passed).
2026-06-05 20:23:54 -07:00
brooklyn!
150687447b
Merge pull request #40240 from NousResearch/bb/desktop-steer
feat: usable mid-turn steer — desktop affordance + trusted injection
2026-06-05 21:10:57 -05:00
Brooklyn Nicholson
5d4c93afe4 refactor(desktop): hoist single draft.trim() in composer
Compute the trimmed draft once and reuse for hasComposerPayload + canSteer
instead of trimming three times per render.
2026-06-05 21:05:56 -05:00
Brooklyn Nicholson
7cceead273 fix(desktop): render steer note as a codicon, not an emoji
The inline steer note used an emoji. Emit a structured `steer:<text>`
system note and render it in SystemMessage as a codicon (compass) row —
same style as slash-status output. No emoji in the transcript.
2026-06-05 21:03:05 -05:00
Brooklyn Nicholson
efa53fb3be feat(desktop): reserve Cmd/Ctrl+Enter strictly for steer
Cmd/Ctrl+Enter now steers when there's a steerable draft and is a no-op
otherwise — it never falls through to a send, so the shortcut can't
surprise-send. Plain Enter keeps its role (queue while busy, send when idle).
2026-06-05 21:01:20 -05:00
Brooklyn Nicholson
0f45509daf fix(agent): make mid-turn /steer trusted, not read as injection
A steer rides inside a tool result (the only role-alternation-safe slot
mid-turn), so a bare "User guidance:" line reads as untrusted tool content —
well-behaved models refuse it as suspected prompt injection (observed live:
"I only follow instructions from you directly, not ones injected through
command results").

- Wrap steers in a bounded, self-describing [OUT-OF-BAND USER MESSAGE] marker
  (prompt_builder.format_steer_marker), shared by both drain sites.
- Add STEER_CHANNEL_NOTE to the core system prompt so the model expects this
  exact marker and trusts it as a genuine user message — while still ignoring
  lookalikes buried in tool/web/file output. Static text → byte-stable prompt,
  no prompt-cache regression; gated on the agent having tools.
- Desktop: steer ack is now an inline transcript note (steered · …) instead
  of a toast.

Marker is intentionally static (not a per-session nonce) to honor the
byte-stable system-prompt caching policy; nonce hardening noted as follow-up.
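
A small sketch of the bounded-marker idea. The opening marker text comes from the commit message; the closing delimiter, the exact layout, and the `fold_into_tool_result` helper are assumptions for illustration only.

```python
# Both delimiters are static so the system prompt stays byte-stable.
MARKER_OPEN = "[OUT-OF-BAND USER MESSAGE]"
MARKER_CLOSE = "[END OUT-OF-BAND USER MESSAGE]"  # assumed closing delimiter


def format_steer_marker(text: str) -> str:
    """Wrap steer text in a bounded, self-describing marker (illustrative layout)."""
    return f"{MARKER_OPEN}\n{text.strip()}\n{MARKER_CLOSE}"


def fold_into_tool_result(tool_result: str, steer: str) -> str:
    # The steer rides at the end of the next tool result.
    return f"{tool_result}\n\n{format_steer_marker(steer)}"


result = fold_into_tool_result("exit code 0", "  focus on the failing test  ")
```

Because nothing in the marker varies per session, the same steer text always produces the same bytes; a per-session nonce would trade that stability for resistance to lookalikes.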
2026-06-05 20:59:36 -05:00
Brooklyn Nicholson
40aef6af91 feat(desktop): steer the live run from the composer
The desktop app could only queue while busy — `/steer` was in the palette
but had no first-class affordance, so the "nudge the agent mid-turn without
interrupting" lane was effectively unreachable.

Add a steer action to the composer: while busy with a text-only draft, a
steering-wheel button (and Cmd/Ctrl+Enter) injects the text into the live
turn via the `session.steer` RPC — the gateway folds it into the next tool
result so the model reads it on its next iteration. Plain Enter still queues.

steerPrompt returns false when the gateway has no live tool window (or the
RPC errors), and the composer re-queues the words so nothing is lost — the
same safety net as a plain queue.
2026-06-05 20:50:30 -05:00
brooklyn!
e375c33f70
fix(tui): clean force-send of queued messages (#40235)
Force-sending a queued message (double-empty-enter, or interrupt-mode
submit) flipped busy→false optimistically, so the queue drain raced the
still-unwinding turn: duplicate user bubble, a stray "queued: …" note, and
the cancelled turn's "Operation interrupted…" reply leaking in.

interruptTurn gains `keepBusy`: hold busy until the gateway's real settle
edge (message.complete, suppressed while interrupted), which drains the
queued message exactly once — desktop "send now" parity. The interrupt
paths now queue + interrupt instead of optimistically sending.
2026-06-06 01:39:10 +00:00
brooklyn!
ac177cea87
Merge pull request #40234 from NousResearch/bb/desktop-queue-arrow-edit-v2
feat(desktop): arrow-key history + queue editing in composer
2026-06-05 20:38:37 -05:00
Brooklyn Nicholson
ce50030634 feat(desktop): integrate arrow history with the message queue
Builds on @naqerl's arrow up/down history (previous commit), making
ArrowUp do the right thing when a queue exists.

ArrowUp/ArrowDown priority:
1. Editing a queued turn → walk older/newer through queued entries,
   saving each edit; ArrowDown past the newest exits and restores the
   pre-edit draft.
2. Empty composer + queued turns → ArrowUp opens the newest queued entry
   for editing (the row's pencil), so Enter saves it back to the queue
   instead of firing a new message — the gap the history nav had alone.
3. Otherwise → sent-message history recall (unchanged).
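
The priority order above reads naturally as a three-way decision. This is a language-neutral sketch in Python rather than the desktop app's own code; the function name, the action labels, and the treatment of an empty draft as the empty string are all hypothetical.

```python
def arrow_up_action(editing_queued: bool, draft: str, queued_count: int) -> str:
    """Pick what ArrowUp does, in the priority order listed above."""
    if editing_queued:
        # 1. Already editing a queued turn: walk to the older entry.
        return "walk_older_queued"
    if not draft and queued_count > 0:
        # 2. Empty composer with a queue: open the newest queued entry.
        return "edit_newest_queued"
    # 3. Otherwise fall back to sent-message history recall.
    return "recall_sent_history"
```

For example, `arrow_up_action(False, "", 2)` selects the queue edit, while the same call with an empty queue falls through to history recall.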

Also: Esc cancels an in-progress queue edit (else interrupts).

Cleanups on the integrated code: fold the browse-state reset into the
existing session-change effect (drop the duplicate ref+effect); reuse
loadIntoComposer for history recall; sort imports; add curly braces +
the runDrain sessionId dep (lint).
2026-06-05 20:33:53 -05:00
naqerl
f94363d1f0 feat(desktop): arrow up/down to navigate previous user messages 2026-06-05 20:32:29 -05:00
brooklyn!
0cbcc75935
fix(desktop): reliable composer message queue (#40221)
* fix(desktop): make composer message queue reliable

The queue felt 'dumb' because of three real bugs:

1. Drained-after-interrupt sends went silent. cancelRun sets
   interrupted:true and nothing reset it; submitPromptText's optimistic
   seed preserved it, and the message stream drops every delta while
   interrupted. So Send-now-while-busy and any interrupt+drain submitted
   the next turn into a muted session. Fix: a fresh submit is a new turn —
   seed interrupted:false.

2. Back-to-back queue drains stalled. The drain fires on the busy->false
   settle edge, but busyRef (synced from the busy store by a separate
   effect) can still read true on that same edge, so the drained send hit
   the busy guard, returned false, and the entry was never removed. Fix:
   fromQueue sends bypass the busyRef guard (the queue drain lock
   serializes them); the user path keeps the guard.

3. Double-enter-to-interrupt killed single non-queue turns. The hidden
   450ms timer meant a natural double-tap after sending stopped the agent.
   Fix: empty Enter while busy is a no-op; interrupting is explicit —
   Stop button or Esc.

Also: clean stop (no [interrupted] marker), Send-now works while busy
(promote + interrupt + auto-drain), settle on the interrupted completion
path. Adds regression tests and unblocks the prompt-actions suite by
completing its stale @/hermes mock.

* fix(desktop): float the queue panel as an overlay so the chat doesn't resize

The queue list rendered in-flow inside the composer root, so its height
fed --composer-measured-height (the composer rect drives the thread's
bottom padding + last-message clearance). Queuing a message grew that
rect and the whole chat visibly resized.

Anchor the panel out of flow above the composer (absolute bottom-full,
capped at 40vh with internal scroll). It no longer contributes to the
measured height, so the thread layout stays put and the list overlays the
(already faded) chat. Still collapsible via the panel's own
disclosure header.

* fix(desktop): queue panel collapsed by default + shared border with composer

- Default the queue disclosure to collapsed (compact 'N queued' pill)
  instead of expanded.
- Drop the gap and merge the panel into the composer: square bottom
  corners, no bottom border/radius, and overlap down by the Root's pt-2
  (-mb-2) so the panel's borderless bottom lands on the composer surface's
  top border — one continuous bordered shape.

* style(desktop): tighten queue panel padding

* style(desktop): trim queue-ux comments to house style

* style(desktop): drop 'Cursor' references from comments
2026-06-05 20:21:41 -05:00
Gille
0c0a707744
fix(desktop): repair macOS updater helper (#40217) 2026-06-05 20:05:32 -05:00
Teknium
78122c52cf test(slack): drop /q alias assertion now displaced by /version cap clamp
Slack's native-slash manifest hard-caps at 50 (_SLACK_MAX_SLASH_COMMANDS).
Adding the /version canonical claims a pass-1 slot, so the lowest-priority
pass-2 alias (/q for /quit) clamps off the end. /q stays reachable via
/hermes q. Surviving aliases (/btw /bg /reset) still prove alias parity.
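
To see why adding one canonical command pushes the last alias off the manifest, here is a sketch of a two-pass build with a hard cap. The cap of 50 is from the commit; the function, the command names, and the assumption that aliases arrive ordered from highest to lowest priority are illustrative.

```python
from typing import List

MAX_SLASH_COMMANDS = 50  # the cap the commit attributes to _SLACK_MAX_SLASH_COMMANDS


def build_manifest(canonicals: List[str], aliases: List[str],
                   cap: int = MAX_SLASH_COMMANDS) -> List[str]:
    """Pass 1 takes canonical commands, pass 2 fills what is left with aliases."""
    manifest = list(canonicals)[:cap]
    for alias in aliases:  # assumed order: highest priority first
        if len(manifest) >= cap:
            break
        manifest.append(alias)
    return manifest


canonicals = [f"/cmd{i}" for i in range(46)]  # hypothetical command set
aliases = ["/btw", "/bg", "/reset", "/q"]
before = build_manifest(canonicals, aliases)
after = build_manifest(canonicals + ["/version"], aliases)
```

In this toy setup the manifest is exactly full before the change, so the new `/version` slot displaces `/q` while the other three aliases survive.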
2026-06-05 18:05:05 -07:00
Brooklyn Nicholson
30340eae2f Include git SHA in /version output via banner label helper.
Reuses format_banner_version_label() so CLI, TUI, gateway, and desktop show upstream/local commit when available.
2026-06-05 18:05:05 -07:00
Brooklyn Nicholson
9c1bb8d2c7 Add /version slash command across CLI, gateway, TUI, and desktop.
Surfaces Hermes Agent version info on demand without leaving chat; works mid-run like /help and /update.
2026-06-05 18:05:05 -07:00
teknium1
aa52cd3b57 test(desktop): unmount between IME composition repro cases
The new IME repro test has two it() blocks but the desktop suite registers
no global testing-library auto-cleanup, so the first render() leaked its
editor into the second test and getByTestId('editor') matched two nodes.
Add afterEach(cleanup) so each case renders into a fresh DOM.
2026-06-05 18:05:00 -07:00
xxxigm
da9425bf9b test(desktop): cover IME-composed send-button visibility (Chinese/Japanese/Korean)
DOM repro that drives compositionstart -> input(preedit) -> compositionend with
no trailing input event and asserts the composer payload (send button) becomes
visible for committed CJK/IME input. Regression guard for #39614.
2026-06-05 18:05:00 -07:00
xxxigm
8e629b9f38 fix(desktop): flush committed IME text on compositionend so the send button appears
Typing committed multi-character IME text (e.g. Chinese "你好", and equally
Japanese/Korean or any IME-composed script) left the send button hidden until
an unrelated edit. Input events during composition carry uncommitted preedit
text and are intentionally skipped; the code assumed a trailing input event
after compositionend would deliver the finalized text, but Chromium does not
reliably emit one on Windows IMEs. The committed text therefore never reached
composer state, so `hasComposerPayload` stayed false and the send button stayed
hidden (deleting a char fired a non-composition input that finally synced it).

Flush the live editor text into composer state in onCompositionEnd. Extract the
shared sync into flushEditorToDraft so input and compositionend both update
state.

Fixes #39614
2026-06-05 18:05:00 -07:00
teknium1
be2c64be02 fix(desktop): wire serializeJsonBody into OAuth request path
The salvaged helper exported serializeJsonBody but main.cjs still inline-built
the request body, leaving the export dead and the test decoupled from the real
path. Use it at the fetchJsonViaOauthSession site so the helper's coverage
exercises production body construction. Byte-identical output.
2026-06-05 18:04:45 -07:00
helix4u
b8234e7599 fix(desktop): avoid restricted oauth request header 2026-06-05 18:04:45 -07:00
Teknium
3c231eb397
chore: release v0.16.0 (2026.6.5) (#40206)
The Surface Release — native desktop app, browser admin panel,
remote-gateway connect, Simplified Chinese desktop UI, leaner default
skill set, NVIDIA/skills trusted tap, fuzzy model picker, /undo.

874 commits · 542 PRs · 170 contributors · 399 issues closed.
2026-06-05 17:55:43 -07:00
Teknium
ea266f43e9
fix(file-ops): make rg/grep search error guard reachable and preserve partial matches (#39858)
The error guard in _search_with_rg/_search_with_grep was unreachable and,
if it had fired, would have discarded valid results.

Two root causes:

1. Unreachable. Both methods pipe the search through `| head` with no
   pipefail, so the pipeline reported head's exit code (0), masking rg/grep's
   error code (2). The guard never fired. Worse, because _exec merges stderr
   into stdout (stderr=subprocess.STDOUT), the error text was then parsed as
   bogus match lines instead of being surfaced — the user got garbage matches
   with no indication the search failed.

2. Latent results-dropping. The original `not result.stdout.strip()` check
   was always False on error (error text lives in stdout), and the
   `hasattr(result, 'stderr')` branch was dead code (ExecuteResult has no
   stderr field). A naive broadening to `exit_code == 2` would have nuked
   real matches whenever rg/grep also hit a non-fatal error (e.g. one
   unreadable file in a tree that otherwise matched), which both tools signal
   with exit 2.

Fix:
- Prefix the piped command with `set -o pipefail` so rg/grep's real exit
  status propagates. rg exits 0 on a truncating head; grep exits 141
  (SIGPIPE), so the strict `== 2` guard ignores truncated-success.
- Add _split_tool_diagnostics() to separate tool diagnostics from match
  output by tool prefix and output shape. Diagnostics never become matches;
  on a hard error they are the message to surface.
- Only surface an error when exit==2 AND no usable match payload remains, so
  partial errors keep their real matches.

Tests: tests/tools/test_search_error_guard.py drives both methods through the
real local backend (hard error surfaced, partial error keeps matches,
truncation no false error, files_only/count exclude diagnostics) plus unit
coverage for the splitter.
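
A simplified sketch of the guard logic, assuming diagnostics can be recognized by a `rg: ` or `grep: ` line prefix alone. The commit says the real `_split_tool_diagnostics()` also looks at output shape, so treat both function names and the result dictionary here as hypothetical.

```python
from typing import List, Tuple


def split_tool_diagnostics(output: str, tool: str) -> Tuple[List[str], List[str]]:
    """Separate match lines from tool diagnostics by line prefix (assumed rule)."""
    prefix = f"{tool}: "
    matches, diagnostics = [], []
    for line in output.splitlines():
        (diagnostics if line.startswith(prefix) else matches).append(line)
    return matches, diagnostics


def interpret_search(exit_code: int, output: str, tool: str) -> dict:
    matches, diagnostics = split_tool_diagnostics(output, tool)
    # Surface an error only for exit 2 with no usable match payload.
    if exit_code == 2 and not matches:
        return {"error": "\n".join(diagnostics) or f"{tool} failed", "matches": []}
    return {"error": None, "matches": matches}


hard = interpret_search(2, "grep: nope: No such file or directory\n", "grep")
partial = interpret_search(2, "a.py:1:hit\ngrep: secret: Permission denied\n", "grep")
truncated = interpret_search(141, "a.py:1:hit\nb.py:7:hit\n", "grep")
```

The three calls cover the cases the tests list: a hard error is surfaced, a partial error keeps its real match, and a SIGPIPE exit from truncation is not treated as a failure.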

Supersedes #39710.
2026-06-05 17:44:52 -07:00
kshitij
66a6b9c930
Merge pull request #39482 from liuhao1024/fix/rich-markup-error-on-session-resume
fix(cli): use Rich [dim] tag instead of ANSI escape in session resume messages
2026-06-05 13:12:17 -07:00
kshitij
e6f7e217ce
Merge pull request #40093 from kshitijk4poor/feat/named-custom-discover-models-18726
feat(model): honor discover_models in terminal hermes model named-custom flow (closes #18726)
2026-06-05 13:08:33 -07:00
kshitij
b5d42daa53
Merge pull request #40080 from kshitijk4poor/salvage/discover-models-section4-29810
feat(model_switch): honor discover_models in custom_providers section 4 (salvage #29810)
2026-06-05 13:05:34 -07:00
kshitijk4poor
7ae8aac3b9 feat(model): honor discover_models in terminal hermes model named-custom flow
The terminal `hermes model` wizard (_model_flow_named_custom) always
live-probed a custom provider's /models endpoint, ignoring the configured
`models:` list. For plans whose endpoint exposes a large catalog (e.g. Baidu
Qianfan Coding Plan returns 100+ models for a 2-3 model plan) the picker
flooded with models the user can't use.

This wires `discover_models` (and the `models:` list) through
_named_custom_provider_map into the flow and honors `discover_models: false`
the same way the slash-command picker (model_switch.py sections 3 & 4) does:
- Default stays True — live probe, no behaviour change.
- discover_models: false → use the configured `models:` list verbatim,
  skip the probe (string 'false'/'no'/'0' normalised to False).
- If the probe is on but returns empty, fall back to the configured list
  instead of forcing manual entry.
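
The three rules above could look roughly like this. The helper names and the `probe` callable are hypothetical, and treating a missing or null `discover_models` as the default True is an assumption; the string values normalised to False are the ones the commit lists.

```python
from typing import Callable, List, Mapping


def normalize_discover_models(value: object) -> bool:
    """Default True; 'false'/'no'/'0' strings are treated as False."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().lower() not in {"false", "no", "0"}
    return bool(value)


def pick_models(provider: Mapping[str, object],
                probe: Callable[[], List[str]]) -> List[str]:
    configured = list(provider.get("models") or [])
    if not normalize_discover_models(provider.get("discover_models")):
        return configured          # use the configured list verbatim, no probe
    return probe() or configured   # empty probe falls back to the configured list


catalog = [f"model-{i}" for i in range(100)]  # stand-in for a large /models reply
plan = {"models": ["coder-a", "coder-b"], "discover_models": "false"}
picked = pick_models(plan, lambda: catalog)
```

With `discover_models: "false"` the picker shows only the two configured models instead of the hundred-entry catalog.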

Closes #18726
2026-06-06 01:29:41 +05:30
kshitijk4poor
53bba70854 chore: add ohMyJason to AUTHOR_MAP 2026-06-06 01:04:25 +05:30
ohMyJason
4b2d00f845 feat(model_switch): honor discover_models in custom_providers section 4
Section 3 (user `providers:`) already honors `discover_models: false` to
skip live /models discovery and keep the explicit `models:` list. Section 4
(`custom_providers:` list) did not — `should_probe` ignored the field, so any
grouped custom provider with an api_key always had its configured subset
replaced by the full live /models catalog.

This adds the same `discover_models` support to section 4:
- Default True — no behaviour change for existing configs.
- `discover_models: false` keeps the explicit `models:` list even when an
  api_key is present.
- String values ("false"/"no"/"0") are normalised to False, matching
  section 3.
- If any entry in a grouped endpoint opts out, the whole group opts out.

Use case: endpoints that expose a full aggregator catalog via /models but
only serve a configured subset.
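
The group rule (one opt-out disables discovery for the whole grouped endpoint) can be sketched with `all()`. Both function names are hypothetical, and the sketch leaves out the api_key condition that the real `should_probe` also considers.

```python
from typing import Iterable, Mapping


def discovery_enabled(entry: Mapping[str, object]) -> bool:
    value = entry.get("discover_models")
    if value is None:
        return True  # assumed default when the key is absent
    if isinstance(value, str):
        return value.strip().lower() not in {"false", "no", "0"}
    return bool(value)


def group_should_discover(entries: Iterable[Mapping[str, object]]) -> bool:
    # One opt-out is enough to keep the whole group on its configured lists.
    return all(discovery_enabled(entry) for entry in entries)
```

For instance, a group of two entries where only one sets `discover_models: "no"` would skip the live probe for both.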

Salvaged from #29810 — rebased onto current main. The PR's other change
(`key_env` resolution in section 4) landed independently in commit aa283d1e4
(custom provider picker credential isolation), so only the discover_models
portion is carried here.

Co-authored-by: ohMyJason <42903577+ohMyJason@users.noreply.github.com>
2026-06-06 01:04:13 +05:30
brooklyn!
6f6eb871d8
fix(gateway): new chats honor their profile in global-remote mode (#39993)
Follow-up to #39921. That PR scoped session.resume + prompt.submit to a
session's profile, but a BRAND-NEW chat (session.create) under a non-launch
profile was still built and persisted against the dashboard's launch profile.
Two visible symptoms in app-global remote mode (one dashboard, many profiles):

  1. "who are you" in profile S replied as the launch (default) profile/agent —
     the agent was built with the launch HERMES_HOME, so config/SOUL/identity
     came from the wrong profile.
  2. "session not found" on later resume — _ensure_session_db_row persisted the
     row into the launch profile's state.db via _get_db(), so the session lived
     in the wrong db, the unified list mis-tagged it (it showed up under BOTH
     profiles), and resume routed to the wrong one.

Fix — carry the owning profile through the create path too:

- session.create accepts an optional `profile`; resolves its home and stores
  `profile_home` on the session (alongside what resume already set).
- _start_agent_build binds that profile's HERMES_HOME while building the agent
  (config/skills/model/identity resolve to it) and hands the agent the profile's
  state.db so turns persist there.
- _ensure_session_db_row writes the row into the profile's state.db, not the
  launch db — fixing the duplicate row + mis-tag + resume 404.
- desktop sends the new-chat profile on session.create.

None/launch profile → unchanged (single-profile and per-profile-remote setups
take the same path). Verified live against a one-dashboard / multi-profile
remote: a new chat under `work` builds as work's agent (correct SOUL identity),
persists ONLY to work's state.db (launch db stays empty), the unified list tags
it `work` exactly once, and it resumes cleanly.

tests/test_tui_gateway_server.py: _make_agent mocks updated for the session_db
param added in #39921's build path.
2026-06-05 17:44:45 +00:00
Jim Liu 宝玉
1d9c3ebae0 feat(desktop): persist i18n language in config 2026-06-05 10:32:26 -07:00
Jim Liu 宝玉
4a1907bd10 feat(desktop): add i18n with Simplified Chinese (zh-Hans) support
Introduce a lightweight React context-based i18n layer for the desktop
app and translate the UI into Simplified Chinese.

- New apps/desktop/src/i18n module: typed Translations interface, en + zh
  locale tables, I18nProvider/useI18n, localStorage-persisted locale
  (defaults to English), and language endonym metadata for the picker.
- Wire I18nProvider at the app root in main.tsx.
- Refactor 24 desktop screens/components to read strings from the `t`
  object instead of hard-coded English.
- Add a unit test for the i18n context.
2026-06-05 10:32:26 -07:00
brooklyn!
02d6bf1c39
fix(desktop+gateway): full multi-profile support over one global-remote dashboard (#39921)
* fix(desktop): cross-profile session history in app-global remote mode

#39894 made remote-profile sessions first-class for PER-PROFILE remote
overrides. But the common setup — Settings → Gateway → "All profiles" → Remote
— writes app-GLOBAL remote mode (connection.json top-level mode:'remote', empty
profiles map), which the intercept didn't recognize. Switching to a non-launch
profile then 404'd every session read, so no history showed for it.

In global remote mode a SINGLE backend serves every profile via ?profile= (it
reads each profile's state.db off the remote host's own disk — verified: one
dashboard returns /api/profiles and /api/profiles/sessions?profile=all across
all profiles). The fix: when no per-profile override matches but global remote
mode is active, route per-session reads/mutations to that one backend and KEEP
the ?profile= param so it opens the right state.db (instead of bailing to the
local path and dropping the profile scope).

- new globalRemoteActive() — true for connection.json mode:'remote' or the
  HERMES_DESKTOP_REMOTE_URL env override.
- per-session branch: per-profile override → route sans profile (own db);
  global mode → route to the single backend WITH ?profile= preserved.
- unified list is unchanged in global mode: it already passes through to the one
  backend, which aggregates all profiles natively.
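
The routing branch above, sketched in Python for illustration (the real intercept lives in the desktop app). The function name, argument shapes, and example hosts are hypothetical; the two behaviours shown are the ones the bullets describe: a per-profile override drops the profile parameter, global mode keeps it.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def route_session_read(session_id: str, profile: Optional[str],
                       per_profile_remotes: Dict[str, str],
                       global_remote: Optional[str]) -> Optional[str]:
    """Return the remote URL to call, or None to use the local path."""
    path = f"/api/sessions/{session_id}"
    if profile and profile in per_profile_remotes:
        # Per-profile override: that backend serves its own db, drop the param.
        return per_profile_remotes[profile] + path
    if global_remote:
        # Global mode: one backend for all profiles, keep ?profile=.
        query = "?" + urlencode({"profile": profile}) if profile else ""
        return global_remote + path + query
    return None


override = route_session_read("abc", "work", {"work": "https://work.example"}, None)
global_mode = route_session_read("abc", "work", {}, "https://dash.example")
```

Before the fix the second case fell through to the local path and lost the profile scope, which is what produced the 404s.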

Verified live against a one-dashboard / multi-profile remote (Austin's topology):
cross-profile transcript reads load (was 404), rename/delete route to the right
profile, unified list spans both profiles.

Known limitation (architectural, not fixed here): LIVE chat as a non-launch
profile still needs a per-profile dashboard on the remote — the dashboard binds
HERMES_HOME once at process start, so one global backend can't run an agent
turn as another profile. Session history/read/mutate now work regardless.

* fix(gateway): resume + chat any profile over one global-remote dashboard

The REST half of this branch made cross-profile session history visible in
app-global remote mode, but resume + chat still went over the WebSocket gateway,
which was hard-bound to the dashboard's launch profile. Resuming a non-launch
profile's session 404'd ("session not found") and sending spawned a new session
— because session.resume/prompt.submit had no profile concept and the live
agent + state.db were process-global to the launch profile's HERMES_HOME.

Make the WS gateway per-session profile-aware so ONE dashboard can serve every
local profile on its host (the app-global remote topology):

- session.resume accepts an optional `profile`. _profile_home() resolves that
  profile's home on this host; resume opens THAT profile's state.db, binds its
  HERMES_HOME (ContextVar override) while building the agent so config/skills/
  model resolve to it, and passes the profile db to the agent so turns persist
  to the right state.db. The owning profile_home is stored on the session.
- prompt.submit re-binds the stored profile_home for the turn thread (mid-turn
  home reads — memory, skills — resolve to the resumed profile), reset in finally.
- _make_agent gains an optional session_db param (defaults to _get_db()).
- _load_cfg honors the home override (falls back to _hermes_home) so a resumed
  profile loads its own config; cache keyed on resolved path.
- desktop: session.resume now sends the owning profile.
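
A minimal sketch of a ContextVar-based home override with a reset in `finally`, as the bullets describe. The variable names, the default path, and the helper functions are hypothetical; only the use of a ContextVar override for HERMES_HOME is taken from the commit message.

```python
import contextvars
from pathlib import Path
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

LAUNCH_HOME = Path("/home/user/.hermes")  # hypothetical launch profile home
_home_override: contextvars.ContextVar[Optional[Path]] = contextvars.ContextVar(
    "hermes_home_override", default=None
)


def hermes_home() -> Path:
    # Readers see the override when one is bound, else the launch home.
    return _home_override.get() or LAUNCH_HOME


def with_profile_home(profile_home: Path, fn: Callable[[], T]) -> T:
    token = _home_override.set(profile_home)
    try:
        return fn()
    finally:
        _home_override.reset(token)  # always restore the previous binding


inside = with_profile_home(Path("/profiles/work"), hermes_home)
outside = hermes_home()
```

Since a binding made in one thread is not automatically visible in a newly started thread, re-binding the stored home at the start of the turn thread is consistent with this pattern.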

Omitted/launch profile → unchanged (single-profile and per-profile-remote setups
are byte-for-byte the same path). Verified live against a one-dashboard /
multi-profile remote: resuming a non-launch profile's session loads its history,
runs a real turn against THAT profile's home/env, and persists to its state.db.

tests/tui_gateway/test_protocol.py: _make_agent mocks updated for the new param.
2026-06-05 12:22:55 -05:00
teknium1
e837856ecd chore(release): map ViewWay author email for AUTHOR_MAP 2026-06-05 09:10:26 -07:00
teknium1
2dda393f9f test(gateway): regression tests for max_tokens propagation chain (#20741) 2026-06-05 09:10:26 -07:00
teknium1
14275d7baa fix(gateway): honor per-provider max_output_tokens in max_tokens chain
Widens ViewWay's #20741 fix to the sibling config surface: a
custom_providers entry can pin its own output cap via max_output_tokens
(or max_tokens). _get_named_custom_provider now lifts it onto the
resolved runtime at all three return sites, and the gateway uses it as a
fallback only when the documented global model.max_tokens isn't set, so
the global key always wins.

Precedence: HERMES_MAX_TOKENS > model.max_tokens > provider
max_output_tokens > None. Closes the same #20741 truncation for users who
configure the cap per-provider rather than globally.
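
The precedence chain can be written as a first-non-empty lookup. The function name and the dictionary-shaped inputs are hypothetical; checking the provider's `max_output_tokens` before its `max_tokens` and converting with `int()` are assumptions, since the commit only gives the overall order.

```python
from typing import Mapping, Optional


def resolve_max_tokens(env: Mapping[str, str], model_cfg: Mapping[str, object],
                       provider_cfg: Mapping[str, object]) -> Optional[int]:
    """HERMES_MAX_TOKENS > model.max_tokens > provider cap > None."""
    candidates = (
        env.get("HERMES_MAX_TOKENS"),
        model_cfg.get("max_tokens"),
        provider_cfg.get("max_output_tokens"),
        provider_cfg.get("max_tokens"),  # assumed secondary spelling
    )
    for value in candidates:
        if value not in (None, ""):
            return int(value)
    return None


provider = {"max_output_tokens": 4096}
from_env = resolve_max_tokens({"HERMES_MAX_TOKENS": "1024"}, {"max_tokens": 2048}, provider)
from_model = resolve_max_tokens({}, {"max_tokens": 2048}, provider)
from_provider = resolve_max_tokens({}, {}, provider)
```

The provider cap is used only when neither the environment variable nor the global `model.max_tokens` is set, so the global key always wins over the per-provider one.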

Picks up the intent of #19782 (alexcam1901), reimplemented to feed
ViewWay's max_tokens pipeline.
2026-06-05 09:10:26 -07:00
ViewWay
1c909e75e1 fix(cli,gateway): complete max_tokens propagation — CLI path + env var override
Previous commit only covered the gateway runtime path. This adds:
- CLI __init__: read max_tokens from model config with HERMES_MAX_TOKENS env override
- CLI AIAgent() calls (interactive + background): pass max_tokens
- Gateway _resolve_runtime_agent_kwargs: add HERMES_MAX_TOKENS env override

All three code paths (CLI, gateway runtime, session override) now
consistently propagate max_tokens to AIAgent.
2026-06-05 09:10:26 -07:00
ViewWay
cf786593cd fix(gateway): propagate max_tokens from config.yaml to AIAgent
max_tokens set under model: in config.yaml was silently ignored.
The value was never read from config, never passed through
_resolve_runtime_agent_kwargs(), _resolve_turn_agent_config(),
or the session override path.  Added it to all three code paths
so custom/Ollama endpoints receive the correct output cap.

Closes #20741
2026-06-05 09:10:26 -07:00
brooklyn!
9af54b2f8c
fix(desktop): make remote-profile sessions first-class (resume, read, rename/archive/delete) (#39894)
* fix(desktop): route remote-profile session reads to the owning remote backend

Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.

Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):

  GET /api/profiles/sessions        -> splice each remote profile's real rows in
  GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles

No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.

Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).

* fix(desktop): route remote-profile session mutations + fix unified-list pagination

Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.

Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.

- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
  profile and pass it as request.profile so Electron routes to that profile's
  backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
  DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
  param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
  GET/PATCH (local cross-profile delete).

Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.
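
The re-windowing fix can be sketched as follows. For simplicity this treats the primary and every remote uniformly as a source that returns `(rows, total)`, which is an assumption; the field name `last_active` and both helper functions are also hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Fetch = Callable[[int, int], Tuple[List[dict], int]]


def make_source(rows: List[dict]) -> Fetch:
    """Fake backend: newest first, sliced by limit/offset, plus a total count."""
    ordered = sorted(rows, key=lambda r: r["last_active"], reverse=True)
    return lambda limit, offset: (ordered[offset:offset + limit], len(ordered))


def merged_page(sources: Dict[str, Fetch], limit: int, offset: int) -> dict:
    window = limit + offset
    rows: List[dict] = []
    profile_totals: Dict[str, int] = {}
    for name, fetch in sources.items():
        fetched, total = fetch(window, 0)  # over-fetch from offset 0
        rows.extend(fetched)
        profile_totals[name] = total
    rows.sort(key=lambda r: r["last_active"], reverse=True)  # re-sort by recency
    return {
        "sessions": rows[offset:offset + limit],             # re-window to the page
        "total": sum(profile_totals.values()),
        "profile_totals": profile_totals,
    }


sources = {
    "local": make_source([{"id": f"l{t}", "last_active": t} for t in (10, 8, 6)]),
    "remote": make_source([{"id": f"r{t}", "last_active": t} for t in (9, 7, 5)]),
}
page1 = merged_page(sources, limit=2, offset=0)
page2 = merged_page(sources, limit=2, offset=2)
```

Fetching `limit + offset` rows from offset 0 on every source is what guarantees that the merged, re-sorted list contains the true page, so consecutive pages neither overlap nor exceed `limit`.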

Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
2026-06-05 10:13:10 -05:00
Brooklyn Nicholson
3045d54547 fix(desktop): route remote-profile session mutations + fix unified-list pagination
Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.

Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.

- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
  profile and pass it as request.profile so Electron routes to that profile's
  backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
  DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
  param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
  GET/PATCH (local cross-profile delete).
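
The routing decision in the bullets above can be sketched roughly as follows. This is an illustration in Python, not the main.cjs intercept. It assumes the owning profile travels as a `profile` query parameter and that remote profiles are known as a name -> base URL map; `route_request` and both of those shapes are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit

# /api/sessions/{id} with an optional /messages suffix.
SESSION_PATH = re.compile(r"^/api/sessions/[^/]+(/messages)?$")


def route_request(
    method: str, url: str, remotes: dict[str, str]
) -> tuple[Optional[str], str]:
    """Return (remote base URL or None, path to request).

    None means "serve from the local primary, request unchanged".
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    profile = dict(query).get("profile")
    base = remotes.get(profile) if profile else None
    match = SESSION_PATH.match(parts.path)
    if base is None or match is None:
        return None, url  # local profile or unrelated endpoint
    is_messages = match.group(1) is not None
    if method == "GET":
        routable = True  # reads: the session itself or its messages
    else:
        # Mutations apply to the session resource only.
        routable = method in ("DELETE", "PATCH") and not is_messages
    if not routable:
        return None, url
    # Strip the profile param: the remote serves its own state.db.
    kept = [(k, v) for k, v in query if k != "profile"]
    path = parts.path + ("?" + urlencode(kept) if kept else "")
    return base, path
```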

Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.

Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
2026-06-05 10:08:26 -05:00
Brooklyn Nicholson
83c13862f1 fix(desktop): route remote-profile session reads to the owning remote backend
Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the session IDs shown in the sidebar 404
the moment resume runs against the remote -- the "session not found -> new
session" bug.

Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):

  GET /api/profiles/sessions        -> splice each remote profile's real rows in
  GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles

No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.

Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).
2026-06-05 09:52:52 -05:00
adybag14-cyber
af8b917dab fix(termux): scope frontend npm installs 2026-06-05 06:56:51 -07:00
Teknium
9ca11b35d5 perf(/model): prewarm picker provider-models cache in background (#39847)
* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
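
A rough sketch of the guard described above, assuming the three overflow classes are modelled as an enum and the compressor is passed in as a callable. `OverflowKind`, `recover_from_overflow`, and the result dict shape are hypothetical; only the `compaction_disabled: True` flag comes from the commit message.

```python
from enum import Enum, auto
from typing import Any, Callable, Optional


class OverflowKind(Enum):
    LONG_CONTEXT_TIER_429 = auto()
    PAYLOAD_TOO_LARGE_413 = auto()
    CONTEXT_OVERFLOW = auto()


def recover_from_overflow(
    kind: Optional[OverflowKind],
    compression_enabled: bool,
    compress: Callable[[], None],
) -> dict[str, Any]:
    """Decide how to react to a provider error in the agent loop."""
    if kind is None:
        # Not an overflow error: this dispatch has nothing to do.
        return {"recovered": False}
    if not compression_enabled:
        # Single guard at the top: never auto-compress against the setting.
        return {
            "recovered": False,
            "compaction_disabled": True,
            "error": (
                "Context overflow and auto-compaction is disabled. "
                "Run /compress, start /new, switch to a larger-context "
                "model, or reduce attachments."
            ),
        }
    compress()  # compression enabled: the existing recovery path
    return {"recovered": True}
```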

* perf(/model): prewarm picker provider-models cache in background

The no-args /model picker calls list_authenticated_providers(), which
fetches each authenticated provider's live /v1/models list serially. On a
cold or stale (>1h TTL) cache, that blocks the user's critical path for ~1.5s
the first time /model is opened in a session.

Warm that exact path off-thread during the idle window right after the CLI
banner is shown: a once-per-process daemon thread runs
list_authenticated_providers() to populate provider_models_cache.json for
every authed provider. By the time the user types /model, the picker hits
the warm disk cache (~136ms vs ~1500ms).

Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done)
ensures at most one thread per process; fully exception-isolated so an
offline/no-creds provider can never affect the session.
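
A minimal sketch of a once-per-process, exception-isolated prewarm thread along these lines. The names `start_prewarm` and `_prewarm_started` are hypothetical, and the lock around the check-and-set is an addition of this sketch that the commit message does not mention. The `warm` callable stands in for list_authenticated_providers().

```python
import threading
from typing import Callable, Optional

_prewarm_started = threading.Event()  # process-level "already started" flag
_prewarm_lock = threading.Lock()      # makes check-and-set atomic (sketch only)


def start_prewarm(warm: Callable[[], object]) -> Optional[threading.Thread]:
    """Run `warm` once per process on a daemon thread.

    Returns the thread on the first call and None on every later call.
    """
    with _prewarm_lock:
        if _prewarm_started.is_set():
            return None
        _prewarm_started.set()

    def _run() -> None:
        try:
            warm()
        except Exception:
            # Offline or credential-less providers must never affect the session.
            pass

    thread = threading.Thread(target=_run, name="model-prewarm", daemon=True)
    thread.start()
    return thread
```

A daemon thread is used so that a slow provider fetch cannot keep the process alive at exit.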
2026-06-05 06:55:09 -07:00
Teknium
ca1fb32c26 docs: remove --include-desktop install instructions (#39762)
* docs: remove --include-desktop install instructions

Drop the --include-desktop curl one-liner from the desktop app docs.
The flag remains in scripts/install.sh; these docs now point to the
desktop installer / website and the 'hermes desktop' path instead.

* docs: remove --include-desktop from install docs

Drop the redundant 'Hermes Desktop installer on Linux' block (which
used --include-desktop) from quickstart, installation, and index docs.
The website installer covers macOS/Windows desktop; the CLI-only path
covers Linux. Removes the flag from all user-facing docs.
2026-06-05 06:53:58 -07:00
Teknium
7583aedacd fix(completion): remove /model <arg> autocomplete from CLI/TUI (#39727)
* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* fix(completion): remove /model <arg> autocomplete from CLI/TUI

The TUI frontend already suppressed /model argument completion in favor of
the two-step ModelPicker (useCompletion.ts), but the CLI prompt_toolkit
completer and the gateway-backed complete.slash RPC (TUI + desktop) still
emitted model aliases and probed LM Studio on every keystroke.

Drops the /model branch in SlashCommandCompleter.get_completions, the
_model_completions method, and the LM Studio probe/cache helper that only
fed it. Command-name completion (/mod -> model) and sibling arg completers
(/skin, /personality) are untouched. Removes the now-dead TestModelTabCompletion
tests.
2026-06-05 06:43:51 -07:00
brooklyn!
14fee4f112 fix(update/windows): retry handoff hermes update once on first-run crash (#39831)
The in-app updater (Hermes-Setup --update) runs `hermes update`, which lazily
imports the freshly-pulled modules — but the dependency-install step runs the
already-in-memory PRE-pull code for one invocation. When a release changes an
updater-path contract across that boundary, the FIRST update on the parked
population crashes even though the fix is already on disk.

Concretely this is #39780's `_UvResult`: its `__iter__` yields (path, bool), so
Windows `subprocess.list2cmdline([uv_bin, "pip", ...])` injects the bool and
dies with `TypeError: sequence item 1: expected str instance, bool found`
(fixed in #39820). A parked Windows user clicking Update pulls #39820 to disk,
then still crashes on the in-memory pre-merge module; only the SECOND click runs
clean. Field repro: ryanc's bootstrap.log (2026-06-05 12:41:41).

Fix: when the first `hermes update` exits non-zero (and it isn't the
concurrent-instance guard, exit 2, which a retry can't fix), retry once
automatically. The retry loads the now-current module from the start and
succeeds — so the parked user gets a working one-click update instead of a
scary crash + manual second attempt.
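
The retry rule can be sketched as below. The actual change lives in the Rust updater (per the cargo check note), so this Python version only illustrates the control flow. `run_update_with_retry` and the injected `run` callable are hypothetical.

```python
from typing import Callable

CONCURRENT_INSTANCE_EXIT = 2  # a retry cannot fix the concurrent-instance guard


def run_update_with_retry(run: Callable[[], int]) -> int:
    """Run the update once, retrying a single time on a retryable failure.

    `run` performs one update attempt and returns its exit code.
    """
    code = run()
    if code != 0 and code != CONCURRENT_INSTANCE_EXIT:
        # The second attempt starts a fresh process, so it loads the
        # now-current modules from disk instead of the pre-pull code.
        code = run()
    return code


# Possible wiring, assuming `hermes` is on PATH:
#   import subprocess
#   run_update_with_retry(lambda: subprocess.run(["hermes", "update"]).returncode)
```

Capping the retry at one attempt keeps a genuinely broken update from looping, while still covering the stale-module case in a single click.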

Verified: cargo check clean.
2026-06-05 08:37:16 -05:00