Tool-progress now shows a terminal command in a ```bash fenced block —
full command, no surrounding quotes, no label, no 40-char truncation —
instead of the noisy `terminal: "cmd…"` line, on every platform that
renders markdown code blocks (Telegram, Slack, Matrix, WhatsApp, Feishu,
Weixin, Discord). Plain-text platforms keep the compact preview line.
Gated on a new `BasePlatformAdapter.supports_code_blocks` capability
(default False) rather than a hardcoded platform list, so plugin adapters
(Discord lives in plugins/platforms/) opt in by setting the flag. Applies
to both all/new and verbose progress modes, with a safe fallback when the
command arg is missing or blank.
The desktop chat GUI pinned the viewport to the bottom on every content
growth while a turn streamed, so the window chased tokens as they arrived.
Remove that follow behavior: once a turn is running the viewport stays
exactly where the user left it.
- Delete the streaming ResizeObserver re-pin loop in useThreadScrollAnchor.
- Delete the post-run bottom lock (kept pinning ~1.2s after completion).
- Keep the one-time jump-to-bottom on user submit / new turn / session
change so a freshly submitted message still lands in view.
- Update streaming.test.tsx to assert the viewport no longer follows
streaming growth or snaps down on final code-highlight remeasure.
After sleep/wake, a remote (global-remote) primary backend can become
unreachable, but it has no child process whose 'exit' clears the main
process's cached connectionPromise. The renderer then re-dials the same
dead remote forever and the composer stays stuck on "Starting Hermes…";
only a quit+reopen recovered.
Fix: the renderer's existing backoff-paced reconnect loop now asks the
main process to revalidate the cached connection before re-dialing. The
main process liveness-probes the cached REMOTE backend's public
/api/status and, if unreachable, drops the cache (resetHermesConnection
only nulls connectionPromise for a remote — no child to SIGTERM) so the
next getConnection() rebuilds a reachable descriptor. Local backends are
never touched here; they self-heal via the child 'exit' handler. The
renderer's loop already provides retry pacing and rides out transient
blips, so no streak/episode bookkeeping is needed in the main process.
The boot hook dismisses the boot-progress overlay on the post-rebuild
'open' so an in-place rebuild can't leave it stuck at ~94%.
Reimplements #40135 by @AlchemistChaos on a smaller, more interpretable
path (63 added lines vs 555): no extracted helper module, no
failure-streak / episode-window state, the renderer's backoff loop is
the retry mechanism. Original diagnosis and fix by @AlchemistChaos.
Co-authored-by: AlchemistChaos <alchemistchaos@protonmail.com>
Follow-up to #34306. The provider fix made SearXNG *usable* with a
config-only SEARXNG_URL, but tools/web_tools._has_env still read raw
os.getenv, so the backend auto-detect cascade and check_web_api_key
remained blind to it — SearXNG worked when explicitly selected but was
never auto-selected. Route _has_env (and the SearXNG diagnostic print)
through a config-aware _env_value helper mirroring the provider's
_searxng_url(). Fixing the shared helper covers every provider key in
one place. Adds regression tests for config-only auto-detect and
check_web_api_key. See #34290.
When apps/desktop's `npm ci`/`npm install` fails, install_desktop printed a
single "Desktop workspace npm install failed" line and aborted, leaving the
user with a wall of raw npm output. A common trigger is a root-owned ~/.npm
cache left by an earlier `sudo npm`/`sudo npx`: the non-root install then
cannot write the shared cache, and npm reports it as EEXIST / "File exists"
while the real errno is EACCES (-13) -- so it reads like an installer bug.
Add a targeted remediation hint on that failure path pointing at:
sudo chown -R "$(id -un)" ~/.npm && npm cache verify
followed by the manual rebuild command. The stage stays a hard failure by
design (a silent skip yields a "complete" install with no app); only the
failure output changes.
hermes skills browse capped the hermes-index source at 5000, so it
surfaced ~5.4k of the ~90.7k skills the index actually carries. Raise
the per-source ceiling above catalog size; browse already paginates
client-side and the index is disk-cached, so no extra fetch cost.
Desktop connected to a remote gateway can now attach images and PDFs and
display agent-written images. Previously the desktop passed a LOCAL file path
to image.attach; on a remote gateway that path doesn't exist, so the image was
silently dropped ("skipped unreadable path") and the vision model never saw it.
The reverse direction was also broken — images the agent wrote on the gateway
rendered as dead links in the remote client.
Gateway (tui_gateway/server.py):
- image.attach_bytes: base64 byte upload written into the gateway's own images
dir and queued via the existing native-image-attach pipeline. Magic-byte
extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap,
structured error codes. Accepts content_base64/filename (canonical) and
data/ext (older-desktop aliases).
- pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI
and queues the pages as images; 50 MB / 25-page caps. Accepts host path or
base64 upload.
- Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image)
so the two methods and the existing image.attach don't duplicate logic.
Gateway (hermes_cli/web_server.py):
- GET /api/media: returns a gateway-local image as a base64 data URL so remote
clients can display it. Auth-gated like every /api route, extension
allowlist + size cap, AND confined to the gateway's own media roots
(images/screenshots/cache, resolved symlink-safe) so an authed caller can't
read image-extension files anywhere on disk.
Desktop (apps/desktop):
- syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the
connection mode is 'remote'; the local fast path is unchanged.
- media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and
markdown-text fetch images over /api/media in remote mode.
Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437)
into one coherent implementation, taking the strongest parts of each and adding
shared-helper cleanup plus the /api/media root-confinement hardening on top.
The per-profile gateway switching from #38876 is intentionally left out as a
separable feature. TUI file uploads (#40492) remain a separate surface.
Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop
media.remote unit tests; full tui_gateway + web_server suites green (472
passed); tsc -b clean; E2E verified the full attach→disk→queue and
gateway-path→data-URL display round-trip plus the out-of-root security block.
Co-authored-by: Max Mitcham <maxmitcham@mac.home>
Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com>
Co-authored-by: Chris Cook <ccook@nvms.com>
Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>
* feat(desktop): surface TTS/STT/terminal backends as Settings dropdowns
Every native tool backend that the agent supports now shows up as a
clickable picker in the desktop Settings UI instead of a free-text box.
Desktop Settings renders a config field as a <Select> only if its dotpath
is a key in ENUM_OPTIONS (helpers.ts::enumOptionsFor returns undefined ->
free-text <Input> otherwise). Three backend-selector fields were surfaced
in their sections but missing from the map, so users had to hand-type the
provider name and could reasonably assume it was unsupported:
- tts.provider — now lists all built-in TTS backends incl. xai (Grok)
- stt.provider — local/groq/openai/mistral/elevenlabs
- terminal.backend — local/docker/singularity/modal/daytona/ssh
Each list is kept in sync with its backend source of truth (TTS:
agent/tts_registry.py::_BUILTIN_NAMES + tools/tts_tool.py; STT + terminal:
hermes_cli/config.py / tools/terminal_tool.py). The existing
enumOptionsFor current-value-append keeps any hand-typed/legacy value
selected, and command-type TTS providers still work.
Reported for Grok/xAI TTS, which was already a fully-wired built-in
provider (tts.provider: xai + XAI_API_KEY) with no picker entry.
* feat(desktop): expose per-backend TTS/STT/terminal config fields in Settings
Completes the backend-coverage pass: not just the provider PICKER but every
backend's own config fields are now tunable from desktop Settings, so a user
who picks (e.g.) Grok TTS can also set its voice/language without hand-editing
config.yaml.
Also fixes the STT provider dropdown: added 'xai' (Grok STT), which the
transcription dispatcher (tools/transcription_tools.py) handles but the
config.py comment had omitted — the dispatch ladder is the source of truth.
New Settings fields (Voice section):
- TTS xai (voice_id, language), minimax (model, voice_id), mistral
(model, voice_id), gemini (model, voice), neutts (model, device),
kittentts (model, voice), piper (voice)
- STT openai (model), groq (model), mistral (model)
New Settings fields (Advanced section):
- terminal docker_image / singularity_image / modal_image / daytona_image
New ENUM_OPTIONS dropdowns: stt.provider (+xai), stt.openai.model,
stt.mistral.model, tts.openai.model, tts.elevenlabs.model_id,
tts.neutts.device. Each list mirrors the backend generator's accepted values
(tools/tts_tool.py, tools/transcription_tools.py, hermes_cli/config.py).
i18n: FIELD_LABELS/FIELD_DESCRIPTIONS cover all locales via the English
fallback in config-settings.tsx; added native translations to ja/zh/zh-hant.
Secrets (provider API keys, modal/daytona tokens, ssh host/key) intentionally
stay in Settings -> Keys as env vars, not duplicated as config fields.
The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt()
before creating/updating a job (tools/cronjob_tools.py); the REST cron
endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not
content. This adds the same scan to both handlers so an exfiltration/injection
prompt is rejected the same way regardless of which surface created the job.
NOT a security boundary, defense-in-depth / parity only: the REST cron
endpoints are authenticated (every handler runs _check_auth, and connect()
refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented
in-process heuristic, not a containment boundary (SECURITY.md 3.2).
Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The
report's load-bearing 'no auth by default' premise was already closed three
weeks after it was filed by the API_SERVER_KEY-required guard (commit
1a9ef8314); this lands the create/update prompt-validation parity the report
also pointed at. Scanner imported defensively so a missing scanner cannot
disable the cron REST API.
Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace
returned early for a non-scratch ('dir'/'worktree') task and never ran the
parent sweep, so a scratch parent waiting on a 'dir' child would leak its
deferred workspace forever. Run the parent sweep before the early return.
Adds regression tests: deferred-while-child-active, swept-after-last-child,
and dir-child-unblocks-scratch-parent.
When a Kanban task with workspace_kind=scratch completes, the
_cleanup_workspace() function immediately deletes the workspace
directory. If the task has children linked via task_links, those
children find the workspace deleted when they start.
This fix adds two checks:
1. Before deleting, check if any children are still active
(todo/ready/running). If so, defer cleanup.
2. After a child completes, check if parent workspace can now
be cleaned up (all children terminal).
FixesNousResearch/hermes-agent#33774
* feat(onboarding): opt-in structured profile-build path on first contact
On a user's very first gateway message, Hermes now optionally offers to
build a short profile of them — then, only with consent, gathers durable
facts and persists them to the user-profile memory store (memory tool,
target="user") so future sessions start already knowing who they are.
Inspired by Poke's zero-input onboarding, but consent-first by design:
- The agent OFFERS, never assumes. Declining stops it immediately.
- Before ANY external lookup it states what it will look up and asks.
- It never reads connected accounts (email/calendar) silently — the
exact privacy concern that made naive implementations feel invasive.
Wiring reuses existing infrastructure end-to-end:
- gateway/run.py first-message hook (was a plain self-intro) now swaps in
the profile-build directive when enabled and not yet offered.
- agent/onboarding.py gains profile_build_mode()/profile_build_directive()
+ PROFILE_BUILD_FLAG, latched once via the existing onboarding.seen
mechanism so the offer fires at most once per install.
- config default onboarding.profile_build: "ask" (set "off" to disable).
Added to an existing section, so no _config_version bump needed.
No new storage layer, no new injection path, no prompt-cache impact.
* fix(dashboard): fold onboarding into agent tab to avoid 1-field category
onboarding.profile_build is the only schema-surfaced onboarding field
(onboarding.seen is an internal latch dict), so the dashboard CONFIG_SCHEMA
single-field-category invariant rejected it. Merge onboarding -> agent like
the other small categories.
Compaction summaries now receive the current date and instruct the
summarizer to rewrite completed actions as absolute, dated, past-tense
facts (e.g. "email John about the proposal" -> "Sent the proposal email
to John on 2026-06-07"). A resumed conversation no longer re-issues work
that already happened or treats a finished action as still pending.
The date is resolved via hermes_time.now() (date-only, user-configured
timezone) inside _generate_summary. The compaction summary is a
mid-conversation message that is never part of the cached prefix, so the
date does not affect prompt-cache stability. Date resolution is
best-effort: a clock failure omits the rule rather than blocking
compaction. The rule rides the shared template, so both first-compaction
and iterative-update prompts carry it.
Inspired by Poke's summarization (temporal anchoring + semantic
preservation).
Three gateway tests broke on main after the component-auth security
hardening (test_discord_component_auth.py) made empty Discord component
allowlists fail-closed: a view built with allowed_user_ids=set() now
rejects every click instead of allowing anyone.
The clarify and model-picker BEHAVIOR tests still constructed their views
with an empty allowlist and expected the click to succeed — a stale
assumption from before the hardening. Fixed by giving each view an
allowlist containing the clicking user (the interaction's own id), which
is the realistic shape and what the security model requires.
Production code unchanged — this only updates the test fixtures to match
the intended (and separately pinned) fail-closed contract. The security
regression suite and these behavior suites now both pass.
Fixes:
- test_discord_clarify_buttons.py: test_choice_falls_back_to_label_text_when_entry_missing, test_other_flips_entry_to_awaiting_text
- test_discord_model_picker.py: test_model_picker_clears_controls_before_running_switch_callback
Salvage of the Discord half of PR #30964 by @LaPhilosophie. Discord
component button callbacks (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) bypass the normal message dispatch
authorization path. _component_check_auth previously returned True when
both the user and role allowlists were empty, so any guild member who
could see an approval prompt could click Approve on a dangerous command.
Fail closed instead: require DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES
/ GATEWAY_ALLOWED_USERS membership, or an explicit DISCORD_ALLOW_ALL_USERS
/ GATEWAY_ALLOW_ALL_USERS opt-in for deliberately-open deployments.
Mirrors the Telegram (#24457) and Matrix fail-closed precedent.
The Slack half of #30964 is superseded by PR #33844's helper.
Reported via GHSA-mc26-p6fw-7pp6 (@whyiug).
Co-authored-by: LaPhilosophie <804436395@qq.com>
A non-numeric value in env vars like HERMES_STREAM_RETRIES,
HERMES_KANBAN_SPECIFY_MAX_TOKENS, GOOGLE_CHAT_MAX_BYTES, IRC_PORT, etc.
raised ValueError at import/init and crashed startup. Parse them safely,
falling back to the default.
Unified onto the existing utils.env_int(key, default) helper for core/
hermes_cli/tools modules instead of the original PR's three duplicate
local helpers; plugins keep minimal inline guards (no core-utils import).
All existing max()/min()/`or extra.get()` wrappers preserved.
Co-authored-by: annguyenNous <annguyenNous@users.noreply.github.com>
hermes doctor and hermes honcho status warned 'Honcho config not found'
whenever ~/.honcho/config.json was absent, even though HONCHO_API_KEY in
.env resolves a working config via HonchoClientConfig.from_global_config()
-> from_env(). Both now check hcfg.api_key/base_url before warning.
Co-authored-by: oxngon <98992931+oxngon@users.noreply.github.com>
## What does this PR do?
The trajectory compressor could corrupt training trajectories by cutting a
conversation in the middle of a tool-call/tool-response pair. In the from/value
trajectory format a `tool` turn (carrying `<tool_response>` markers) is always
emitted immediately after the `gpt` turn whose `<tool_call>` it answers, so the
two turns must stay together. The compressible region's end boundary, however,
was chosen purely by token accumulation: the loop stopped at the first turn where
the accumulated tokens met the savings target, with no regard for turn roles. For
any over-budget trajectory whose savings boundary happened to land between a `gpt`
turn and its `tool` turn, the `gpt` (with its `<tool_call>`) was summarised away
into the replacement `human` message while the now-orphaned `tool` turn (with its
`<tool_response>`) was kept verbatim in the tail — producing an unmatched marker
and silently corrupting the training signal. The head boundary had the mirror
problem when the first tool turn was not protected.
This change snaps both compression boundaries to a clean turn boundary before the
region is extracted and replaced, so the summary always covers whole gpt+tool
blocks and a `tool` turn is never separated from the `gpt` turn that precedes it.
The boundary is moved forward when possible (folding an orphaned tool turn into
the region that already holds its gpt) and falls back to moving backward when no
clean boundary exists ahead, such as when the protected tail itself begins on a
tool turn.
## Related Issue
N/A
## Type of Change
- [x] 🐛 Bug fix (non-breaking change that fixes an issue)
## Changes Made
- `trajectory_compressor.py`: added `_is_boundary_clean()` and `_snap_boundary()`
helpers on `TrajectoryCompressor`, and applied them to both the head and tail
compression boundaries in `compress_trajectory()` and
`compress_trajectory_async()`. When snapping collapses the region to nothing
safe to compress, the trajectory is returned unchanged and flagged as still
over the limit rather than being corrupted.
- `tests/test_trajectory_compressor.py`: added `TestCompressionToolPairIntegrity`
covering the sync and async paths plus direct unit tests for the boundary
snapping (forward skip and backward fallback).
## How to Test
1. Run the focused tests: `pytest tests/test_trajectory_compressor.py -q`.
2. The new sync/async cases build a trajectory of gpt/tool pairs with an oversized
middle gpt turn and choose a token target that forces the accumulation
boundary to stop between a `<tool_call>` and its `<tool_response>`. They assert
that `<tool_call>` and `<tool_response>` markers stay balanced after
compression and that every kept `tool` turn is immediately preceded by a `gpt`
turn (never the inserted summary or another tool turn).
## Checklist
### Code
- [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md)
- [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.)
- [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate
- [x] My PR contains **only** changes related to this fix/feature (no unrelated commits)
- [x] I've run `pytest tests/ -q` and all tests pass
- [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features)
- [x] I've tested on my platform: macOS 15 (Darwin 25.5)
### Documentation & Housekeeping
- [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A
- [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A
- [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A
- [x] I've considered cross-platform impact (Windows, macOS) per the [compatibility guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility) — or N/A
- [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A
SIMPLEX_ALLOWED_USERS silently denied every contact when operators
listed display names instead of numeric contactIds. The SimpleX UI
never surfaces the numeric id, so display names are what operators
naturally put in the env var. _is_user_authorized only compared
source.user_id (the contactId), so the allowlist never matched.
Expand check_ids to include source.user_name for the simplex platform,
mirroring the existing WhatsApp phone-LID aliasing pattern. Adds doc +
setup-prompt clarification and three regression tests.
Salvaged from PR #40393. Adds manishbyatroy to release.py AUTHOR_MAP.
The desktop statusbar turn timer read a single process-global $turnStartedAt,
set/cleared only for the active session. With multiple same-profile sessions
running at once, switching to session B reset the one shared clock, so
session A's still-running turn "restarted from zero" the moment you left it —
exactly the behaviour @Da7_Tech reported after the profile-scoped session work.
Move turnStartedAt onto ClientSessionState so each session owns its own turn
clock. The global atom now just mirrors whichever session is focused, written
on view-sync (the flush that already stages the active session's state). A
backgrounded turn keeps counting in its own cache entry, and focusing it
restores its real elapsed time instead of zeroing it.
Set/clear sites: message.start (seed), message.complete + error + interrupted
bail (clear), and the session.info running-state path (seed if missing / clear
on stop) so a turn that goes busy via session.info — e.g. resuming a session
that's already running — also gets a clock.
Note: the agent loop itself never froze — every same-profile session runs in
its own backend thread and background deltas are buffered per-session. This
fixes the timer-reset symptom; the "no live progress until you return" is
inherent to a single-view transcript and is out of scope here.
DANGEROUS_PATTERNS and HARDLINE_PATTERNS are matched on the raw command string,
so backslash-escape (r\m) and empty-quote split (r''m) bypass both lists.
_normalize_command_for_detection now strips these before pattern matching.
tui_gateway shell.exec had a bare 'except ImportError: pass' that silently
disabled the entire safety gate if tools.approval wasn't importable. Changed
to fail-closed (return 5001 error). Added detect_hardline_command check.
Fixes#36846, #36847.
Two isolated reliability fixes:
- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason,
no content/reasoning/tool_calls) so retry handles it instead of
fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing
("maximum context length is N ... (A of text input, B of tool input,
C in the output)") so parse_available_output_tokens_from_error returns
a real cap and the caller stops looping on it.
Salvaged from #40405 (@ashishpatel26) — took the two stream/error-parsing
fixes. The PR also bundled compression-state changes (on_session_start
clearing _previous_summary; cron session-id prefix preservation, #38788);
those touch the compression hot path and are split out for separate review.
Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>
Packaged Desktop first-launch bootstrap no longer dies with a fatal HTTP
404 when install-stamp.json pins a commit that isn't fetchable from GitHub.
This only happens for locally-built desktop apps: write-build-stamp.cjs's
fromLocalGit() pins `git rev-parse HEAD`, which can be an unpushed commit
or dirty tree. CI builds stamp $GITHUB_SHA and are unaffected. The fix
unblocks the dev / self-builder workflow.
resolveInstallScript() now wraps the GitHub download in try/catch; on
failure it resolves ~/.hermes/hermes-agent/scripts/install.sh (the
already-installed agent checkout), copies it into bootstrap-cache, and
returns it as source 'installed-agent'. If the cache copy fails (read-only
FS), it uses the source path directly. With no installed checkout to fall
back to, the original error rethrows unchanged.
Download is now injectable via an optional _download param so the fallback
path is tested hermetically (no network).
Reported with a precise repro and suggested fix by @Tamaz-sujashvili (#40815).
Co-authored-by: Tamaz-sujashvili <56168197+Tamaz-sujashvili@users.noreply.github.com>
The dashboard font is now selectable from the UI, not just YAML. A new Font
section in the header theme picker overrides the UI font of whatever theme is
active; the choice is orthogonal to the theme and survives theme switches.
Each theme keeps its own font as the default — picking "Theme default" clears
the override.
- web/src/themes/fonts.ts: curated font catalog (system + Google Fonts across
sans/serif/mono), each with a family stack and optional webfont URL. The
catalog is the only injected-font surface — no free-text URL box, so the
injected <link> origins stay fixed.
- web/src/themes/context.tsx: font-override state (localStorage + server),
applied after theme typography so it wins; theme apply re-asserts it, and
clearing re-runs theme apply to restore the theme's own font. Mono is left
to the theme so code/terminal are untouched.
- web/src/components/ThemeSwitcher.tsx: Font section with grouped, self-
previewing font rows and a "Theme default" clear option.
- hermes_cli/web_server.py: GET/PUT /api/dashboard/font persisting to
config.yaml dashboard.font, with a server-side id allow-list (unknown ids
coerce to the theme sentinel).
- i18n + types, api client methods, tests, and docs.
Validation: 6 new backend endpoint tests pass; tsc + vite build clean; live
browser test confirmed pick/persist/survive-theme-switch/clear all work.
process_command() is typed -> bool, but the /clear, /new, and /undo
cancel paths did a bare `return` (None) when _confirm_destructive_slash
was declined, leaking None through the bool contract. Return True
(command handled, keep the REPL alive) on cancel.
Co-authored-by: yubingz <yubingz@users.noreply.github.com>
The desktop model picker calls POST /api/model/set with provider+model only
(no base_url). _apply_main_model_assignment cleared model.base_url for every
non-custom provider, so re-picking a Xiaomi MiMo model wiped a Token Plan
endpoint (https://token-plan-*.xiaomimimo.com/v1) back to the registry default
api.xiaomimimo.com — breaking valid tp- keys with 401s.
Now base_url is cleared only when switching to a different provider (the stale
URL belonged to the old one); same-provider re-assignment preserves it, and an
explicitly supplied base_url is honored for any provider.
The desktop markdown preprocessor autolinks bare URLs by wrapping them in
<...>. RAW_URL_RE allowed '*' in its character classes, so a bold line with
a URL and no separating space — e.g. '**PR opened: https://.../pull/123**' —
greedily pulled the closing '**' into the href, producing a broken link and
an unterminated bold run. Exclude '*' from both URL character classes; '_'
and '~' (which can appear in real paths) are preserved.
The cron run-history endpoint (GET /api/cron/jobs/{id}/runs, added in
#40684) reused list_sessions_rich's order_by_last_active path with a
leading-wildcard id_query. That routes through the recursive
compression-chain CTE, which seeds from EVERY source='cron' row in the DB
and runs per-row preview/last_active subqueries before filtering to one
job and applying LIMIT. Work scaled with the total cron history, so a
large pile made the run-history load time out before eventually
populating.
Cron runs are flat, never-compressed sessions with ids of the form
cron_{job_id}_{ts}, so the chain machinery is pure overhead and the
job binding is a true prefix, not a substring.
- New SessionDB.list_cron_job_runs(): bounded [prefix, hi) id-range scan
on source='cron', ordered by started_at DESC, with the same
preview/last_active enrichment. No CTE, no leading-wildcard LIKE.
- Add idx_sessions_source(source, id) so the range is an index scan;
bump SCHEMA_VERSION 14 -> 15 (index reconciles onto existing DBs via
CREATE INDEX IF NOT EXISTS on startup).
- Point the endpoint at the new method.
Measured on a real SessionDB with 30k cron rows: 5ms vs 85ms for the old
path (16x), and the new path stays flat as the pile grows while the old
one scaled with it. Verified the query plan uses idx_sessions_source_id
(range scan, no full table scan), runs are correctly scoped (substring
collisions like cron_xalpha_ excluded), newest-first, and paged.
* fix(desktop): scope in-session /model switch per-session, stop process-env leak
The desktop/dashboard tui_gateway backend hosts every same-profile session
in ONE process. An in-session /model switch wrote process-global env vars
(HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER /
HERMES_INFERENCE_PROVIDER), which _resolve_startup_runtime() reads when
building a fresh agent. So switching the model in one session leaked into
every other live session's next agent rebuild (/new, resume) — changing the
model in session B silently changed it in session A.
Fix: record the switch as a per-session model_override on the session dict
instead of mutating os.environ. _make_agent honors that override on rebuild
(carrying the concrete base_url/api_key/api_mode the switch resolved), and
falls back to global config when absent. Global persistence on the --global
flag is unchanged.
Also a cleaner fix for #16857 (/new after switching to a custom-provider
model): the override carries the resolved credentials, so the rebuild keeps
the right endpoint without relying on the leaky env vars.
Reported via Twitter (@Da7_Tech): MiniMax M3 in one session + GLM 5.1 in
another interfere when switching between them.
* test(tui_gateway): align /model switch tests with per-session override contract
The three test_config_set_model_syncs_* tests asserted the old leaky contract
(switch writes HERMES_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER to
process env). That env-sync IS the cross-session contamination bug this PR
removes. Updated to assert the new contract: shared process env untouched, the
switch recorded as a per-session model_override carrying provider/model/base_url/
api_key/api_mode. #16857's intent (a custom-provider switch survives /new) is
still covered — now via the override _make_agent honors on rebuild.
The desktop sidebar fetched the unified cross-profile session list as
profile='all' and filtered it client-side by the active profile. On a
large multi-profile install the active profile's rows could be windowed
out of the cross-profile recency page entirely, so switching to a profile
agent showed an empty history panel (and the 'all' fetch could exceed the
15s IPC timeout on startup). Scope the fetch to the active profile so its
own page comes back on its merits, and bump the session-list IPC timeout
to 60s. profileScope is now a refreshSessions dep, so the existing
gateway-open effect re-pulls on profile switch.
Persist the inbound user turn before provider/tool execution so a crash
before run_conversation() (e.g. provider/httpx client init failure) keeps
the inbound message in the transcript. Repair stale/missing SSL_CERT_FILE
state on gateway startup, and avoid duplicate gateway fallback writes.
The salvaged main-agent fix (sanidhyasin) applies model.default_headers
to the primary OpenAI client, but the auxiliary client (title generation,
context compression, vision routing) builds its own clients and did not
read the override. For a `provider: custom` endpoint behind a gateway/WAF
that rejects the OpenAI SDK's identifying headers, the main turn would
succeed while auxiliary calls to the same endpoint still failed with the
opaque 502/4xx from #40033.
Add agent.auxiliary_client._apply_user_default_headers() (user values win
over provider/SDK defaults; no-op when unconfigured) and apply it at every
OpenAI-wire client construction site:
- _try_custom_endpoint() — config-level `model.provider: custom`
- the named custom-provider branch (custom_providers/providers entries),
including the anthropic-SDK-missing OpenAI-wire fallback
- the api-key-provider, async-conversion, and main resolve_provider_client
fallback branches
To prevent the two clients ever drifting on precedence/value handling,
AIAgent._apply_user_default_headers (run_agent.py) now delegates the config
read + merge to this shared helper (run_agent already imports from
auxiliary_client). Native Anthropic/Bedrock branches are untouched (they
don't use the OpenAI wire).
8 new tests (helper semantics + config-level custom + named custom);
full aux + attribution header suites green (295).
Custom OpenAI-compatible endpoints sitting behind a gateway/WAF can reject
the OpenAI Python SDK's default identifying headers (User-Agent: OpenAI/Python,
X-Stainless-*) and return an opaque 502/4xx even though the same request body
succeeds under curl. There was no supported way to override those headers.
Add a model.default_headers config key whose values are merged onto the
OpenAI client's default_headers, taking precedence over provider- and
SDK-supplied defaults. Applied at client construction and on every credential
swap / client rebuild so the override survives reconnects. No-op for native
Anthropic / Bedrock modes and when unconfigured.
Mirrors the EN deep-audit fixes (PR #40952) into the zh-Hans translation so the
two locales agree. zh-Hans is the only non-English locale; 26 translated pages
carried the same stale claims.
Corrections ported (code tokens identical across locales; prose re-translated
where the surrounding text was already Chinese):
- reference: /version slash command + dual-surface list; cli --provider adds
openai-api + novita aliases; tool count 70->71 (+ removed phantom "10 RL tools"
and fixed kanban 7->9); model_catalog ttl 24->1.
- user-guide: hermes -w -q -> -w -z; language list 8->16; aux slots 8->11;
docker separate-dashboard claim; gateway-streaming per-platform note;
computer-use frontmatter.
- features: curator prune_builtins truth; codex-runtime aux keys
(context_compression->compression, vision_detect->vision); voice-mode STT/TTS
enums; removed phantom rl toolset.
- integrations: StepFun step-3-mini->step-3.5-flash; web-search backends 4->8;
nous-portal status subcommand.
- messaging: WeCom typing/streaming columns; telegram transport default
edit->auto; sms host 0.0.0.0->127.0.0.1; simplex/ntfy gateway-setup + pairing
approve; line smart-chunking; matrix MATRIX_DM_AUTO_THREAD; msgraph host note.
- developer-guide: entry-point group hermes.plugins->hermes_agent.plugins;
PLUGIN.yaml->plugin.yaml.
Net-new EN sections (mcp mTLS, api-server run-approval, kanban CLI verbs) are
untranslated in zh-Hans and fall back to English source, consistent with the
mirror's existing partial-coverage state. Verified: docusaurus build --locale
zh-Hans succeeds; no new broken anchors from these edits.