* fix(deps): bump urllib3 and PyJWT to clear CVEs
urllib3 2.6.3 → 2.7.0: fixes GHSA-mf9v-mfxr-j63j (decompression-bomb
bypass in streaming API) and GHSA-qccp-gfcp-xxvc (sensitive headers
forwarded across origins in proxied redirects).
PyJWT 2.12.1 → 2.13.0: fixes PYSEC-2026-175/177/178/179.
Note: python-multipart and idna are already at patched versions in
uv.lock (0.0.27 and 3.15 respectively).
Fixes#40176
* fix(deps): add upper bound for urllib3 dependency spec
Add '<3' ceiling to urllib3 specifier to satisfy the PyPI dependency
upper bounds CI check. Per CONTRIBUTING.md policy, all PyPI deps must
use '>=floor,<next_major' pinning.
* fix(photon): override transitive CVEs in the sidecar deps
`npm audit` flagged 7 high-severity transitive CVEs (protobufjs code injection
GHSA-66ff-xgx4-vchm + outdated @opentelemetry OTLP exporters) pulled in via
spectrum-ts -> @photon-ai/otel. npm's suggested fix downgrades spectrum-ts to a
version that targets the decommissioned spectrum host, so instead pin patched
versions via `overrides` (protobufjs 8.6.1, @opentelemetry/* 0.218.0) without
touching spectrum-ts. `npm audit` -> 0; spectrum-ts + provider still import.
* fix(photon): harden the sidecar bridge + bound the dedup cache
- constant-time sidecar control-token comparison (was `!==`, timing-attackable).
- cap the control-channel request body (2 MiB) so a compromised local peer can't
OOM the sidecar.
- wrap the inbound gRPC stream consumer in a re-subscribe loop with capped
exponential backoff + jitter — if the async iterator throws/ends it would
otherwise stop inbound forever (the adapter dedupes any replay).
- add an unhandledRejection handler so a stray rejection logs instead of killing
the process.
- dedup cache (adapter) was a true bounded LRU only for expired entries; a burst
of unique ids within the window grew it without limit. Evict oldest at the cap.
* chore: add AUTHOR_MAP entry for PhilipAD
---------
Co-authored-by: PhilipAD <philipadsouza@gmail.com>
* fix(deps): declare packaging as a core dependency so it ships everywhere
packaging is imported directly on three production paths but was never
declared in [project.dependencies], so it only reached users transitively
(pip/uv pull it for other tools). The slim official Docker image ships
without it, where each try/except-ImportError fallback silently degrades:
- plugins/memory/hindsight/__init__.py (_meets_minimum_version) returns
False when packaging is absent, disabling update_mode='append' so every
session leaks separate Hindsight documents (the reported #40503 symptom).
- tools/lazy_deps.py (_is_satisfied) falls back to "installed counts as
satisfied", defeating every version-constraint check on lazy extras.
- hermes_cli/main.py drops to naive name==version requirement parsing.
Promote it to a declared core dep pinned to packaging==26.0 — the exact
version already resolved in uv.lock, so there is zero resolution churn (the
lock change is two edge annotations marking it transitive->direct). It is a
pure-Python py3-none-any wheel with no compiled extensions, safe to ship on
every platform. Declaring it also wires it into the
_verify_core_dependencies_installed() update-repair guard, which reinstalls
missing [project.dependencies] on hermes update.
Adds a hermetic tomllib-parse regression test that fails before the
declaration and passes after.
Fixes#40503
* test(deps): make packaging dep-name extraction PEP 508-robust
Address Copilot review on #40522: the inline name-extraction only handled
==, >=, [ and ; and could mis-parse valid requirement strings using <=, ~=,
!=, <, > or a direct reference (name @ url). Factor a _distribution_name
helper that drops markers, direct-reference URLs and extras, then strips any
version operator via regex, so a future dep declared with any PEP 508
specifier shape is matched correctly.
---------
Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
* fix(desktop): keep chat recents focused and reset hotkey target
Exclude messaging platform threads from chat recents pagination so Load More returns chat sessions, and clear stale quick-create profile state before Ctrl+N starts a new session.
* fix(desktop): surface new sessions in sidebar + unstick new-chat Thinking
Two renderer regressions in the desktop chat app:
- Sidebar ordering: orderByIds/reconcileOrderIds appended ids missing from
the persisted order to the BOTTOM. Callers pass recency-sorted lists
(newest first), so a brand-new Ctrl+N session sank below the saved order
and read as "my latest session never showed up". Prepend fresh ids so new
activity surfaces at the top.
- New-chat stuck on "Thinking": terminal/attention state transitions
(turn finished, error, or agent now waiting on user) were RAF-batched.
Electron throttles requestAnimationFrame to ~0 while the window is
backgrounded, occluded, or unfocused, stranding the deferred flush. Flush
critical transitions (!busy || needsInput) synchronously; keep the busy
heartbeat RAF-batched to avoid scroll churn.
Does not touch the messaging-source exclusion in chat recents queries.
* fix(desktop): stop excluding messaging platforms from chat recents
The "keep chat recents focused" change excluded every messaging-platform
source (telegram, discord, slack, …) from the recents query. That silently
undid the messaging-source-folder feature already on main (ede4f5a4a): the
sidebar builds those folders purely from the loaded recents page, so once the
sources were filtered out the folders never rendered — telegram and friends
vanished from the left sidebar.
Only cron stays excluded (it has its own dedicated section). Messaging
sessions belong in the sidebar and render with their platform folder/icon.
Removes the now-unused MESSAGING_SESSION_SOURCE_IDS export.
* fix(desktop): give each messaging platform its own self-managed sidebar section
Recents are local-only again: cron and every messaging platform are excluded
from the chat-recents query, so "Load more" pages through interactive local
chats instead of interleaving gateway threads that bury them.
Each messaging platform (telegram, discord, ...) is now fetched as its own
slice (refreshMessagingSessions) and rendered as a self-managed sidebar
section with its platform icon, count, and per-platform "load more" — no
source-grouping magic inside recents.
Handed-off sessions (live source becomes local after a handoff) keep their
origin-platform badge on the row via handoff_platform, so a Telegram thread
continued in the desktop still reads as Telegram.
* fix(desktop): self-heal a stranded routed session in route-resume
An intermittent create/stream race can leave selected/active session ids
null while the route stays on /:sid — the transcript then sticks empty
even though the turn completed and persisted (the "second Ctrl+N shows no
response" symptom). The pathname didn't change, so route-resume's normal
gate skipped and the view stayed stuck.
Resume whenever the routed session isn't the loaded one, gated on
freshDraftReady so the /:sid -> /new transition (which also momentarily
nulls selected/active a render before the pathname flips) is NOT treated
as stranded. selectedStoredSessionIdRef is set synchronously at resume
entry, so this can't loop, and the resume cached fast-path restores the
already-streamed messages without a refetch.
* fix(desktop): bypass smooth reveal on primary markdown stream
Render main assistant text through deferred markdown directly instead of the smooth-reveal wrapper. This isolates the wrapper to reasoning surfaces and avoids the intermittent blank-response regression after consecutive new-session flows.
Add TestParseCharBasedOutputCap for the LM Studio / llama.cpp phrasing
(context in tokens, prompt in characters): the reported error resolves to
the available output budget, the retried cap plus the estimated input
stays inside the window, and a prompt larger than the window falls through
to None so the prompt-too-long/compression path still owns that case.
LM Studio / llama.cpp-style servers report the context window in tokens
but the prompt size in characters, e.g. "maximum context length is 65536
tokens. However, you requested 65536 output tokens and your prompt
contains 77409 characters". When a provider profile's default_max_tokens
equals the model's context window, the very first request asks for the
whole window as output and the server returns a hard HTTP 400 — even on a
trivial "hi".
parse_available_output_tokens_from_error did not recognise this phrasing,
so the overflow was misrouted to the prompt-too-long/compression path
(which can't help when the input already fits) instead of the output-cap
reduction + retry path. Detect the "requested N output tokens" form,
estimate the input from the character count (~3 chars/token, conservative
so the retried cap stays inside the window), and return the available
output budget so the existing retry logic shrinks max_tokens and succeeds.
The salvaged refactor's new tests use unittest.mock.patch (25 call sites)
but the import line only brought in AsyncMock and MagicMock, so 10 of the
new tests failed with NameError. Add patch to the import.
Adds three vitest cases for the recovery path: resume+retry on
"session not found", no-resume passthrough on other errors, and
no-resume when there is no stored session id. Also maps the
contributor's commit email in release.py AUTHOR_MAP.
When the laptop sleeps and wakes, the WebSocket reconnects but the
gateway's in-memory session table is cleared. The desktop app still
holds the old activeSessionId, so the next prompt.submit call returns
error 4001 ('session not found'), surfaced to the user as:
'Prompt failed: session not found'
Fix: wrap prompt.submit in a try/catch. On 'session not found', call
session.resume with the durable SQLite session ID (selectedStoredSessionIdRef)
to re-register the session in the gateway, update activeSessionIdRef to
the fresh live session_id, then retry prompt.submit once.
If recovery fails or the error is unrelated, the original error is
re-thrown and surfaces normally.
Based on #42658 by @mnajafian-nv.
Preserves the real downstream provider/tool exception when NeMo Relay's
managed adaptive execution wraps a failing callback as an internal runtime
error. Without this, the original exception (and its retry-classification
signal, e.g. status_code) is lost behind Relay's wrapper.
Salvage changes on top of the original PR:
- Tolerant Relay-wrapper match: _is_relay_wrapped_callback_error now uses
str.startswith on the "internal error: <cls>: <msg>" prefix instead of
exact equality, so a future Relay version appending a traceback/suffix
doesn't silently defeat the unwrap. On a total format change it returns
False and falls back to the pre-fix behavior (surfacing Relay's error)
rather than masking it.
- Deduplicated the LLM and tool execute paths into a shared
_run_managed_with_downstream_preservation helper, removing ~20 lines of
copy-pasted nonlocal/try-except scaffolding that could drift out of sync.
- Added a real-middleware regression guard
(test_nemo_relay_downstream_unwrap_matches_real_middleware_wrapper_shape)
that drives hermes_cli.middleware._run_execution_chain and asserts the
plugin's _original_downstream_error unwraps the actual private
_DownstreamExecutionError wrapper. The original synthetic tests modeled the
wrapper with a local class, so a rename or shape change in core middleware
would not have been caught; this test fails loudly if that contract drifts.
Co-authored-by: mnajafian-nv <mnajafian@nvidia.com>
The markdown code-block change rendered args['command'] in full in both
verbose AND non-verbose (all/new) modes, so a long or multi-line terminal
command bypassed the tool_preview_length cap (default 40) and rendered as
a huge block. Non-verbose now collapses to a single line capped at the
preview length while keeping the fence; verbose keeps the full command.
* fix(terminal): complete sane PATH entries on POSIX
Fixes macOS gateway/launchd terminal sessions whose PATH already
includes /usr/bin while omitting Apple Silicon Homebrew paths.
LocalEnvironment._make_run_env() now appends each missing _SANE_PATH
entry individually on POSIX, preserving caller precedence and avoiding
duplicate sane entries.
Root cause: the previous logic used /usr/bin as the sentinel for sane
PATH injection. macOS launchd commonly provides /usr/bin while leaving
out /opt/homebrew/bin and /opt/homebrew/sbin, so Homebrew-installed
CLIs stayed unavailable in terminal tool calls.
Salvaged from #35614 by @y0shua1ee. Fixes#35613.
Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com>
* test(terminal): harden sane PATH completion against dup/empty entries
Follow-up to the #35613 fix. Strengthens _append_missing_sane_path_entries:
- De-duplicate the caller-supplied PATH (first occurrence wins) so a PATH
that already contains duplicate entries is collapsed rather than carried
through. Previously only newly-appended sane entries were guarded against
duplication; pre-existing caller duplicates were preserved verbatim.
- Drop empty PATH entries (leading/trailing/double ':'), which POSIX shells
interpret as the current working directory — a mild foot-gun in a
default terminal environment.
Behaviour for well-formed PATHs (no duplicates, no empty entries) is
byte-identical to before; only malformed/duplicated inputs change.
Adds regression tests for: the literal macOS launchd PATH
(/usr/bin:/bin:/usr/sbin:/sbin), caller-duplicate collapsing with
order preservation, and empty-entry stripping.
* docs(terminal): clarify PATH normalisation semantics; drop dead set add
Addresses review findings on the sane-PATH completion follow-up:
- Sharpen the _append_missing_sane_path_entries docstring to state
explicitly that on POSIX the caller PATH is rewritten (empty entries
stripped, duplicates collapsed) rather than merely appended to, and
that well-formed PATHs remain byte-identical bar the appended sane
entries. This makes the intentional semantic change visible rather
than buried under "hardening".
- Document why _path_env_key is a deliberate second Windows guard
distinct from the helper's early return (key-casing selection vs
standalone safety), so neither is mistaken for redundant and removed.
- Drop the dead `seen.add(entry)` in the sane-entry loop: _SANE_PATH is
a static duplicate-free constant, so the membership check against the
caller entries is sufficient and `seen` is never read afterwards.
No behaviour change: verified byte-identical output across the launchd,
minimal, empty, duplicate, empty-entry and already-full cases, and
re-confirmed gh/brew resolve through the real LocalEnvironment.execute()
path under a launchd-style PATH. 133 targeted tests pass.
Intentionally NOT consolidating with tools/browser_tool._merge_browser_path:
it prepends (vs append), filters on os.path.isdir, uses os.pathsep, and
draws from a dynamic candidate set — a shared helper is a separate
refactor, out of scope for this bugfix.
---------
Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com>
`test_terminal_config_env_sync.py::_save_config_env_sync_keys()`
AST-scanned `hermes_cli/config.py:set_config_value` for a
`_config_to_env_sync = {...}` literal. The terminal-config env bridging
was consolidated onto the canonical `TERMINAL_CONFIG_ENV_MAP` (now read
via `terminal_config_env_var_for_key()`), so that literal no longer
exists and the scanner raised:
AssertionError: Could not find `_config_to_env_sync = {...}` literal in source
failing 8 of 9 tests on main for every PR.
Read the live `TERMINAL_CONFIG_ENV_MAP` instead — the actual source of
truth `set_config_value` bridges through — mirroring its `terminal.cwd`
exclusion. Refresh the stale module docstring and the now-incorrect
error-message hints that still referenced `_config_to_env_sync`.
Verified: the suite goes green, and a mutation (dropping `docker_volumes`
from `TERMINAL_CONFIG_ENV_MAP`) still trips the pinned regression test,
so the drift guard retains its teeth.
@file: attachments now work when the desktop is connected to a remote
gateway. Previously a referenced file resolved to a client-disk path the
gateway couldn't see, so context_references rejected it with "path is
outside the allowed workspace" and the agent never saw the file.
Adds a file.attach RPC (sibling to the existing image.attach_bytes /
pdf.attach byte-upload pipeline): the desktop uploads the file bytes, the
gateway stages them into <workspace>/.hermes/desktop-attachments/ and
returns a workspace-relative @file: ref that resolves cleanly. Local mode
passes the path directly; a gateway-visible file outside the workspace is
copied in; an in-workspace file is referenced as-is with no copy.
Consolidates the file-sync design from #38615 (LeonSGP43) and the
host-file-staging idea from #33455 (Carry00), rebased onto the
image/PDF remote-media helpers already on main.
Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
The Portal's /api/nous/recommended-models endpoint is the source of truth for
which models are free/paid right now, but its result was cached in-process
only. When the live fetch failed (network, parse, non-2xx), the function
returned {} and the model picker silently dropped the free/paid
recommendations — free models would vanish with no indication anything went
wrong.
Add a per-base disk cache at $HERMES_HOME/cache/nous_recommended_cache.json:
a successful live fetch is persisted as last-known-good, and a failed fetch
with an empty in-process cache falls back to the disk copy instead of {}.
Self-heals on the next successful fetch. With no disk copy, still degrades to
{} (callers already handle that). Keyed by portal base URL so staging/prod
don't collide.
E2E: live fetch writes disk; simulated Portal failure returns the cached free
models from disk; no-disk + failure returns {}.
Two new free-tier slugs surfaced in /model and `hermes model`. owl-alpha
was already present. Regenerated website/static/api/model-catalog.json to
keep the manifest sync test green.
The autouse _suppress_concurrent_hermes_gate fixture did
monkeypatch.setattr(main, '_detect_concurrent_hermes_instances', ...) with
no raising=False. Its try/except guards the import but not the setattr, so
under pytest's per-test spawn isolation a transiently partial hermes_cli.main
module (one a concurrent worker is mid-importing) made setattr raise
AttributeError and errored unrelated tests in the slice.
Add raising=False so a transiently-absent attribute is a no-op default rather
than a hard error. The attribute always exists once main.py finishes
importing; the real-function opt-out (@pytest.mark.real_concurrent_gate) is
unaffected.
Follow-up to the salvaged fallback-chain fix:
- Replace the hand-rolled fallback loader with the shared
hermes_cli.fallback_config.get_fallback_chain() helper so the TUI path
matches HermesCLI and gateway/run.py exactly: fallback_providers stays
first and keeps order, with distinct legacy fallback_model entries
merged in after (deduped). Previously the TUI loader picked one key OR
the other, diverging from CLI/gateway when both were set.
- Update the test to assert the merged canonical semantics.
- Add psionic73 to scripts/release.py AUTHOR_MAP (CI gate).
Extend the sidecar and Python adapter to handle `voice` content
alongside `attachment`. Voice notes are inlined as base64 (same
size-cap logic), surfaced as `MessageType.VOICE`, and include an
optional `duration` field in fallback markers when bytes are
unavailable.
Switch `list_users`, `find_user_by_phone`, `create_user`,
`register_user_if_absent`, and `refresh_user_numbers` from the
Dashboard API (Bearer token) to the Spectrum API (Basic auth with
project credentials). Update response unwrapping to handle the nested
`data.users` envelope returned by Spectrum, add `_spectrum_host()`
resolver, `_basic()` header helper, and structured error helpers.
Update tests, docs, and plugin.yaml accordingly.
Store operator and assigned iMessage numbers in `auth.json` after
setup, and surface them in `hermes photon status`. When numbers are
missing, status auto-refreshes from the dashboard without provisioning
new lines.
image_generate returns its artifact as JSON ({"image": "/abs/path.png"})
with no MEDIA: tag, so the gateway auto-append path (which only recognized
text_to_speech MEDIA: tags) never delivered it — image delivery silently
depended on the model restating the path in its reply. Add image_generate to
the producer allowlist and extract the local path from its JSON result
(host_image > image > agent_visible_image), reusing the existing
extension-anchored matcher and history-dedupe so remote URLs, unknown
extensions, failures, and already-sent paths are rejected.
Closes the remaining unfixed path from #19105.
The blanket stdin=subprocess.DEVNULL pass added the kwarg to the docker
'version' preflight call; the test pinned the exact kwargs dict. Update
the expected dict to match.
Regression tests for the salvage follow-up: the interactive 'claude
setup-token' login must keep inherited stdin, and the guard's inline
'noqa: subprocess-stdin' marker must exempt a call.
The blanket DEVNULL pass muzzled run_oauth_setup_token()'s interactive
'claude setup-token' login, which needs inherited stdin to prompt the
user. Revert that one call and replace the guard's brittle file:line
whitelist with an inline 'noqa: subprocess-stdin' marker that travels
with the code.
scripts/check_subprocess_stdin.py scans agent/, tools/, plugins/, and
tui_gateway/ for subprocess.run() and subprocess.Popen() calls that
don't explicitly set stdin=. Missing stdin= means the child inherits the
parent's fd, which in TUI mode is the JSON-RPC pipe — causing gateway
crashes on stdin EOF.
Exits 0 (pass) or 1 (violations found). Can be run manually or added to
CI. Skips comments, docstring references, and calls that use input= (which
creates its own pipe).
Usage: python scripts/check_subprocess_stdin.py
When Hermes runs in TUI mode, the gateway child process communicates with
the Node.js parent over a JSON-RPC protocol on stdin. Subprocess calls that
inherit this stdin fd can trigger a race condition where the child's stdin
read returns EOF, causing the gateway to exit cleanly (exit code 0) mid-tool-
execution.
This is the same root cause as issue #14036 (byterover plugin) and PR #39257
(SSH environment backend). This commit applies the fix — stdin=subprocess.DEVNULL
— to all 85 subprocess.run() and subprocess.Popen() calls that execute inside
the TUI gateway child process.
Scope: TUI-context code only (agent/, tools/, plugins/, tui_gateway/server.py).
CLI code (cli.py, hermes_cli/), tests, scripts, and gateway process management
are excluded — they don't run inside the TUI child and inherit the terminal's
stdin, not the JSON-RPC pipe.
85 call sites across 28 files. All files pass syntax check.
hermes update pulls the latest repo, so the freshly-pulled
website/static/api/model-catalog.json is already the newest catalog. Copy
it straight over ~/.hermes/cache/model_catalog.json instead of relying on a
network fetch (which can be Vercel bot-gated or hit a Portal hiccup and
silently degrade the picker to a stale/short list).
Adds seed_cache_from_checkout() in model_catalog.py (read shipped manifest,
validate, atomic write via _write_disk_cache, reset in-process cache) and
calls it from both update paths in main.py: _cmd_update_impl (git pull) and
_update_via_zip (Docker/no-git). Non-fatal on missing/malformed/invalid
files — the normal network refresh still applies on next picker open.
The desktop code uses Array.prototype.findLast (chat/composer/index.tsx) and
findLastIndex (session/hooks/use-session-actions.ts), which are ES2023 APIs,
but tsconfig declared only the ES2022 lib. Some TypeScript builds tolerate this,
but a correct/stricter tsc fails the desktop build with:
TS2550: Property 'findLast' does not exist on type 'ChatMessage[]'.
Do you need to change your target library? Try changing 'lib' to 'es2023'.
Declare es2023 so the build is correct regardless of the resolved TypeScript
version (reported on Windows with Node 24).
Refs #38970
Terminal tool progress on markdown-capable gateways (Telegram, Slack,
Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a
fenced code block again, in all/new AND verbose modes — gated on the
adapter's supports_code_blocks capability. Plain-text platforms keep the
short truncated preview.
No language tag is emitted: Slack mrkdwn renders a '```bash' fence with
'bash' as a literal first code line, so a bare '```' fence is used, which
renders correctly on every platform that supports blocks.
This restores the #41215 feature (removed in #41950 due to the command
showing in group chats) as the default. For a personal assistant the
command display is desired; the group-chat concern is a preference, not a
vulnerability.
Drop `replyTo` from all outbound send paths and update the `/typing`
endpoint to use the documented `typing("start" | "stop")` content
builder. Adds a `stop_typing` method on the adapter to pair with
`send_typing`.
Replace raw `{ replyTo }` send options with the `spectrumReply` content
builder from spectrum-ts, which is the correct API for threading
replies.
Adds `maybeReplyContent` helper with graceful fallback to normal send
when
the reply target cannot be resolved.
Allow PHOTON_HOME_CHANNEL to accept a bare E.164 phone number or a
`any;-;+1...` DM chat GUID in addition to a Spectrum space id. Inbound
DM spaces are cached so replies resolve without a second SDK lookup,
and `photon` is added to _PHONE_PLATFORMS so send_message treats E.164
strings as explicit targets rather than falling through to channel-name
resolution.
During `hermes photon setup`, allowlist the operator's number and set
their DM as the cron home channel when those env vars are unset. Without
this, the gateway denies the operator's own messages and cron has no
default delivery target. Re-runs never overwrite hand-tuned values.
Also teaches the sidecar's `resolveSpace` to accept a bare E.164 number
as a space identifier, resolving it to the user's DM space so
`PHOTON_HOME_CHANNEL` can be set to a phone number instead of an opaque
space id.
On shared-number plans, `/lines` has no dedicated entry, so the
`assignedPhoneNumber` field on the user object is the source of truth
for which number to text the agent. Fall back to the line inventory
only when no per-user assignment exists.