Commit graph

13525 commits

Author SHA1 Message Date
HexLab98
88e6f9b98c fix(auxiliary): preserve max_tokens for NVIDIA NIM aux calls
NVIDIA integrate.api.nvidia.com models such as minimaxai/minimax-m3 can
return HTTP 200 with empty choices when max_tokens is omitted. Keep the
output cap on auxiliary chat-completions routes, matching the main NVIDIA
provider profile behavior.
2026-06-29 18:04:39 +05:30
Ben Barclay
f53ba9bb54
fix(s6): dot-prefix gateway staging dir so svscan ignores it mid-build (#54834)
Some checks are pending
CI / Detect affected areas (push) Waiting to run
CI / Python tests (push) Blocked by required conditions
CI / Python lints (push) Blocked by required conditions
CI / TypeScript (push) Blocked by required conditions
CI / Docs Site (push) Blocked by required conditions
CI / Deny unrelated histories (push) Blocked by required conditions
CI / Check contributors (push) Blocked by required conditions
CI / Check uv.lock (push) Blocked by required conditions
CI / Lint Docker scripts (push) Blocked by required conditions
CI / Build&Test Docker image (push) Blocked by required conditions
CI / Supply-chain scan (push) Blocked by required conditions
CI / OSV scan (push) Waiting to run
CI / All required checks pass (push) Blocked by required conditions
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
The register path builds each profile-gateway slot in a sibling staging
dir under /run/service (the scandir s6-svscan watches), then atomically
renames it to the live gateway-<profile> name. The staging dir was named
gateway-<profile>.tmp — a NON-dotfile — so a concurrent `s6-svscanctl -a`
rescan (fired by the cont-init reconciler registering gateway-default, or
by a sibling register) would supervise the half-built slot the moment it
had a valid type/run: s6-supervise spawns AS ROOT and mkdirs supervise/
root-owned 0700, then the in-flight _seed_supervise_skeleton early-returns
on the now-existing supervise/ and the next `mkdir supervise/event` hits
PermissionError.

That is the arm64-only CI flake on
test_s6_unregister_removes_service_dir_in_live_container
(PermissionError: /run/service/gateway-phase3test.tmp/supervise/event) —
arm64-only because the native-arm runner's wider scheduling jitter lets
the rescan land inside the ~ms seed window; amd64 ran 30/30 clean.

Fix: dot-prefix the staging dir (.gateway-<profile>.tmp) in both register
paths (S6ServiceManager.register_profile_gateway and
container_boot._register_service). s6-svscan skips any scandir entry whose
name begins with '.', so the half-built slot can never be supervised
mid-build. The atomic rename to the dotless live name is unchanged.

Verified on a real s6 image (amd64): a non-dotted staging dir is picked up
by an svscanctl -a rescan (SUPERVISED owner=root) while a dot-prefixed one
is ignored (NOT-SUPERVISED). Added a docker-harness regression test that
asserts both, plus a unit test that the staging dir is dot-prefixed.
2026-06-29 21:33:00 +10:00
Teknium
dbad6d47d3 fix(gateway): also neutralize untrusted Matrix room name in prompt
Widen #5961's _format_untrusted_prompt_value coverage to the Matrix
room display name (**Matrix Room:**), a sibling attacker-controllable
field the original fix missed. chat_name is user-settable, so an
injected room name could render as literal markdown in the system
prompt. Adds a regression test.
2026-06-29 04:25:51 -07:00
Xowiek
09666ceb76 fix(gateway): neutralize untrusted session metadata in prompts 2026-06-29 04:25:51 -07:00
teknium1
ea1372d2af fix(security): wire session-id sanitizer into artifact paths + API boundary
Defense-in-depth on top of _safe_session_filename_component (#5958):

Sink (makes the bad write impossible regardless of entry point):
- run_agent._save_session_log: sanitize session_id before building the
  session_{sid}.json snapshot path.
- agent_runtime_helpers.dump_api_request_debug: sanitize before building
  the request_dump_{sid}_{ts}.json path.

Boundary (clean 400 instead of a silently-hashed filename):
- api_server rejects path-traversal-shaped X-Hermes-Session-Id on the
  session-continuation path and the explicit /api/sessions create path,
  reusing gateway.session._is_path_unsafe (mirrors the native gateway's
  entry-boundary guard). Also enforces the session-header length cap on
  the continuation path.

Tests: traversal session_id stays contained at the write site; sanitizer
always yields a traversal-free segment; the API header rejects
../, absolute, and Windows-traversal IDs with 400.
2026-06-29 04:25:45 -07:00
Xowiek
1debd5e8f9 fix(security): add session-id filename sanitizer to prevent path traversal
Session IDs can originate from untrusted input (e.g. the
X-Hermes-Session-Id API header) and are interpolated raw into on-disk
artifact filenames under ~/.hermes/sessions/. A traversal-shaped ID
(../../../../etc/pwned) would let a caller write the session snapshot
or request dump outside the sessions directory.

_safe_session_filename_component() collapses every non [A-Za-z0-9_-]
character to _, caps the length, and appends a short content hash when
sanitization changed the string, always yielding a single traversal-free
path segment.

Closes #5958.
2026-06-29 04:25:45 -07:00
teknium1
cdd8e0a271 test(gateway): exercise last_prompt_tokens in reset-activity tests
The reset-had-activity tests set total_tokens (dead state) to simulate
activity; production records activity via last_prompt_tokens. Update
the fixtures to match the field the fix and runtime actually use.
2026-06-29 04:25:37 -07:00
Mibayy
0fe9755016 fix(gateway): use last_prompt_tokens for session-reset activity check
reset_had_activity gated on entry.total_tokens, which is never written
(token counts migrated to agent-direct persistence) so it was always 0.
That suppressed session-reset notifications for sessions that genuinely
had activity. Switch to last_prompt_tokens, which is updated on every
turn.
2026-06-29 04:25:37 -07:00
Mibayy
9e490138a0 fix(security): fail-closed feishu webhook rate limiter + whatsapp bridge path guard
Salvages the two still-valid hardenings from #5381 onto the relocated
plugin adapters (the discord/feishu/whatsapp adapters moved to
plugins/platforms/ since the PR was opened, and 4 of its 6 hunks are
already on main or superseded).

- feishu: rate limiter now denies untracked keys when the tracking table
  is at capacity after pruning stale entries (was: allow through without
  tracking). At-capacity-with-all-fresh-entries only happens under abuse,
  so allowing untracked requests let an attacker who flooded the table
  bypass the limiter entirely. Already-tracked keys and post-prune room
  are unaffected.
- whatsapp: absolute file paths handed back by the Baileys bridge are now
  validated to resolve inside a known media cache dir before being
  attached. A compromised/buggy bridge could otherwise return an
  arbitrary path (e.g. /etc/passwd) that would be sent verbatim to the
  model. Guard resolves symlinks and accepts both the canonical
  cache/<kind> and legacy <kind>_cache layouts.
2026-06-29 04:25:31 -07:00
Ruzzgar
576424cc1c fix(security): redact browser CDP endpoint logs 2026-06-29 04:25:26 -07:00
teknium1
23c03ced75 fix(session-db): enrich NULL session metadata via upsert instead of INSERT OR IGNORE
The gateway's get_or_create_session() creates a bare session row (source +
user_id) before the agent exists. The agent's later create_session() carries
the real model/model_config/system_prompt, but _insert_session_row used
INSERT OR IGNORE — silently dropping that enrichment. Gateway sessions were
left with NULL model and NULL billing metadata.

Switch to INSERT ... ON CONFLICT(id) DO UPDATE with COALESCE so NULL columns
get backfilled while values an earlier writer already set are never
overwritten (a later bare write with source='unknown' can't clobber a real
source/model). Credit: original report and fix direction by @LucidPaths (#5048).
2026-06-29 04:25:23 -07:00
teknium1
61f56d27db refactor(dashboard-auth): drop redundant _interactive_providers helper
list_session_providers() already filters on supports_session=True, so the
new helper re-filtered an already-filtered list. Call it directly at the
single auto-SSO call site.
2026-06-29 04:25:18 -07:00
Ben
f5ecbe1ec6 feat(dashboard): auto-initiate portal SSO redirect on unauthenticated load
When the dashboard gateway has no local session cookie, it rendered a
click-through /login interstitial — even though the Nous portal's
/oauth/authorize auto-approves any current member of the dashboard's org
and is a silent 302 when the user already holds a portal session. For the
common case (clicking a hosted-agent dashboard link while signed in to the
portal) that interstitial click is pure friction.

This makes the gate auto-initiate the OAuth redirect on an unauthenticated
HTML document load instead of rendering the interstitial, when exactly one
interactive provider is registered. A one-shot loop-guard cookie
(hermes_sso_attempt, 60s TTL) ensures that a genuinely absent portal
session (the portal bounces back still-unauthenticated) falls back to the
/login page after exactly one bounce rather than ping-ponging forever. The
marker is cleared on a successful callback and whenever the gate falls back
to /login.

Security: this removes a human CLICK, not a security check. The redirect
lands on the existing /auth/login route and runs the unchanged PKCE
auth-code flow; token verification, audience checks, redirect-URI match,
and org-membership checks are all untouched. /api/* fetches still get the
401 JSON envelope (never a 302 a fetch() would follow opaquely), and with
two or more providers the /login chooser still renders.

Phase 1 of the cloud-auto-discovery work.
2026-06-29 04:25:18 -07:00
Teknium
dc5ef20d89
test(reasoning-floor): isolate stale-timeout floor tests from config-module reload races (#54775)
The five _resolved_api_call_stale_timeout_base integration tests reloaded
hermes_cli.config + hermes_cli.timeouts via importlib.reload to clear cached
config. Under xdist that mutates module-global state shared across the worker
process, so a sibling test could leave the config cache in a state that made
get_provider_stale_timeout return a leaked value — intermittently failing
test_reasoning_floor_applies_to_opus_4_thinking (shard 6 flake, #52217 area).

Patch run_agent.get_provider_stale_timeout per-test instead: floor-path tests
get None (resolver falls through to the reasoning floor / env var / default),
the explicit-config test gets 60.0 (priority-1 short-circuit). Same assertions,
no shared-module mutation, deterministic under parallel execution.
2026-06-29 02:42:54 -07:00
sgaofen
194bff0687 fix(gateway): confirm final delivery before suppressing send
Fixes #14238. During a compression/session split at the response
boundary, the interim callback delivered unrelated commentary, setting
response_previewed=True. The suppression logic treated that as proof the
final reply had been delivered and skipped the normal send — the response
was persisted to the child session but never sent to chat.

Only suppress the normal final send when the stream consumer confirms
final delivery (final_response_sent / final_content_delivered) or the
exact final response text was delivered as a preview.
2026-06-29 02:37:11 -07:00
teknium1
fa3dba4b30 docs(infographic): add list_profiles perf-fix infographic 2026-06-29 02:35:57 -07:00
teknium1
10c9eafde2 chore(attribution): map mango001@126.com -> max-chen for salvaged #51194 2026-06-29 02:35:57 -07:00
Sahil-SS9
1bb7b59c5d fix: offload blocking profiles endpoints from asyncio event loop (#54523)
(cherry picked from commit 09f10e2b77)
2026-06-29 02:35:57 -07:00
chenxiang
d5eee133eb perf(profiles): fix list_profiles O(N*M) wrapper rescan (6.4s -> 0.4s)
find_alias_for_profile re-scanned the whole wrapper dir (~/.local/bin) and
read_text every file for EACH profile — including large unrelated binaries
(ffmpeg etc.) read 15x over. With 16 profiles this took ~6.4s, long enough
that the desktop's per-request backend calls timed out (15s) and the sidebar
rendered '全部智能体 0 / 会话 0'.

- Add build_alias_map(): single-pass {profile -> alias} reverse map, reads
  only an 8KB head slice per wrapper, skips binaries via UnicodeDecodeError.
- find_alias_for_profile now delegates to it (behavior preserved).
- Cache _count_skills by skills-dir mtime signature (+30s TTL).

list_profiles: 6.37s -> 0.84s cold / 0.44s warm. 138 profile tests pass.

(cherry picked from commit 89e593749a)
2026-06-29 02:35:57 -07:00
teknium1
2f5950a83a chore(release): add telos-oc to AUTHOR_MAP for PR #14353 salvage 2026-06-29 02:25:48 -07:00
Telos
fa11b11cf5 fix: propagate key_env from custom_providers into ProviderDef
resolve_custom_provider() previously returned api_key_env_vars=()
for every custom provider entry, silently dropping the configured
key_env field. This caused 401 errors for any custom provider that
required an API key via environment variable (e.g. Xiaomi MiMo Token
Plan, self-hosted OpenAI-compatible servers).

The key_env field is already documented in _VALID_CUSTOM_PROVIDER_FIELDS
and normalized by normalize_custom_provider_entry(), so this was just
an oversight in the ProviderDef construction.

Also adds a regression test that verifies key_env is properly
propagated into the resolved ProviderDef.
2026-06-29 02:25:48 -07:00
teknium1
9f97915163 fix(browser): route open-timeout base through _safe_command_timeout
Wire the salvaged _safe_command_timeout() guard into the surviving
open-timeout call site. _get_open_command_timeout() feeds the
browser_navigate 'open' path; this closes the last call site that
could observe a None timeout from a torn cache (#14331), since the
original PR's max(_get_command_timeout(), 60) site no longer exists
on main (now routed through _get_open_command_timeout).
2026-06-29 02:24:57 -07:00
Sanjay Santhanam
c79e6bceae fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331)
# Conflicts:
#	tools/browser_tool.py
2026-06-29 02:24:57 -07:00
Teknium
bf0d8fed8e
fix(config): v32 migration flips baked-in verify_on_stop=true to false (#54740)
The first ship of verify-on-stop (config v30) defaulted
DEFAULT_CONFIG agent.verify_on_stop to a literal True, and migrate_config
persists defaults with strip_defaults=False — so every install that updated
through v30 had verify_on_stop: true written into config.yaml as a literal.

The v30->v31 migration only flipped missing/'auto' values to false and
deliberately preserved an explicit bool, so it skipped that entire population
and left verify-on-stop ON for everyone who had updated. A literal true was
never a user choice: the feature had no off-switch worth setting it against
until v31 introduced one, so a true persisted before v32 is always the old
machine default.

v32 migration flips a literal true -> false once, for both v30 (skipped v31)
and v31 (preserved-by-bug) installs. A true the user sets AFTER v32 is a
deliberate opt-in and is never touched.
2026-06-29 01:51:08 -07:00
teknium1
75317d82d0 fix(vision): narrow the fan-out cap to the CPU encode burst only
The original cap held a process-global slot across the WHOLE vision
analysis (image load + encode + LLM call) with a default of min(CPUs, 4).
That serialized legitimate multi-image workflows — "compare these 6
screenshots", "read this 10-page scan", "analyze every frame" — behind a
4-wide gate, and on the native fast path it even throttled calls that make
no LLM request at all. Excess calls queued (blocking acquire, nothing
dropped), but the latency hit on real fan-out was the wrong tradeoff.

The incident was CPU exhaustion, not call count: concurrent base64/resize
bursts saturated every core and left none to service the shared event loop
serving /api/status. So cap ONLY that:

- A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the
  encode/resize/dimension-check off the caller's loop, sized to the host's
  usable core count with NO fixed ceiling — the cap tracks the actual
  exhausted resource (cores), not a magic number. Excess encodes queue on
  the executor; cores stay free for the loop.
- The LLM call is deliberately OUTSIDE the executor, so multi-image
  workflows keep full request concurrency.
- Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY
  (honored verbatim, including above core count); sub-1 ignored.
- _vision_concurrency_slot() is now a no-op shim for back-compat.

Tests assert: resolver defaults to host cores with no ceiling; env/config
override (incl. above cores); sub-1 rejection; the executor is dedicated and
core-sized; encode runs on a vision-encode thread; and crucially that encode
bursts are bounded to the cap while the analyses themselves stay fully
concurrent (calls_peak > cap).
2026-06-29 01:27:10 -07:00
Ben Barclay
eddfecd2ce fix(vision): cap vision_analyze fan-out concurrency process-wide
A single agent turn can fan out N vision_analyze calls at once — the
classic trigger is "analyze every frame of this video", where ffmpeg
explodes a clip into dozens of frames and the model calls vision_analyze
on each. Every call does a CPU-heavy base64-encode/resize burst AND holds
a long-lived LLM stream open. The tool executor runs concurrent tool calls
on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple
agent sessions share one process (the dashboard runs the agent in-process),
so there was no global ceiling. In prod (June 2026) a video-frame fan-out
pinned a worker thread at ~100% CPU and starved the shared asyncio event
loop that also serves the dashboard's /api/status liveness probe, flapping
the instance to UNHEALTHY even though nothing had crashed.

Add a process-global threading.BoundedSemaphore that bounds how many vision
analyses run concurrently across the whole process, held across the entire
analysis (image load + encode + LLM call) in the single _handle_vision_analyze
chokepoint (covers both the native fast path and the legacy aux-LLM path).

It is a threading semaphore, NOT asyncio: each vision call is dispatched
through model_tools._run_async on a per-thread event loop, so an asyncio
primitive bound to one loop cannot coordinate across them. The acquire is
offloaded via run_in_executor so waiting for a slot never blocks the calling
loop.

Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency,
or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or
HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can
never be disabled into an unbounded fan-out.

Tests: bounded-fan-out regression guard + a control proving it would fail
without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu
host, env override, and sub-1 rejection. Pre-existing handler tests updated
for the now-async _handle_vision_analyze. Verified via the real
registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls,
peak bounded to cap).
2026-06-29 01:27:10 -07:00
teknium1
115e78c377 test(camofox): accept headers= kwarg in persistence test mocks
The auth-header fix adds headers=_auth_headers() to all Camofox HTTP
calls. Two _capture_post mocks in the persistence test lacked a headers
parameter, so navigate raised TypeError and the success assertions
failed. Add headers=None to both mock signatures.
2026-06-29 01:26:24 -07:00
teknium1
41095fdb04 fix(camofox): register CAMOFOX_API_KEY in OPTIONAL_ENV_VARS
The auth-header fix reads CAMOFOX_API_KEY but it was never registered,
so it didn't surface in `hermes setup` / `hermes tools`. Add it as an
advanced password-category tool env var alongside CAMOFOX_URL.
2026-06-29 01:26:24 -07:00
kaishi00
08d6195bc4 fix(camofox): auto-recover from stale tab 404 on navigate
When a Camofox browser tab is garbage collected (idle timeout, browser
recycle), the held tab_id becomes stale. The next browser_navigate call
hits /tabs/{stale_id}/navigate -> HTTP 404 -> unhandled HTTPError.

Catch the 404 in camofox_navigate, clear the stale tab_id, and create a
fresh tab via _ensure_tab. The agent recovers transparently without
requiring a session restart.

Other tab operations (snapshot, click, type, etc.) use the same pattern
but only fail if the tab dies between successful calls — much rarer.
The navigate fix covers 95%+ of cases since navigate is always the entry
point.
2026-06-29 01:26:24 -07:00
liuhao1024
fe38d50833 fix(tools): read browser.command_timeout in Camofox HTTP client
The Camofox browser backend hardcoded a 30s HTTP timeout via
_DEFAULT_TIMEOUT, ignoring the user's browser.command_timeout config.
The main browser_tool path already reads this config via
_get_command_timeout().

This commit adds an equivalent _get_command_timeout() to
browser_camofox.py that reads browser.command_timeout from config
with caching, and switches all HTTP helper methods (_post, _get,
_get_raw, _delete) to use it as the default timeout.

Fixes #40843
2026-06-29 01:26:24 -07:00
刘昊
babd9168ba fix(browser): send Authorization header in Camofox HTTP calls when CAMOFOX_API_KEY is set
The five HTTP call sites in browser_camofox.py (_ensure_tab, _post,
_get, _get_raw, _delete) did not include Authorization headers, causing
403 Forbidden when the Camofox server has API key auth enabled.

Added _auth_headers() helper and wired it into all five call sites.
The health check endpoint (/health) is left without auth since it is
a connectivity probe, not a browser operation.

Regression test covers: header present when key set, absent when unset,
blank key produces empty headers.

Fixes #20476
2026-06-29 01:26:24 -07:00
liuhao1024
270456308c fix(tools): send listItemId instead of sessionKey in Camofox tab creation
The Camoufox REST API server expects `listItemId` in the `POST /tabs`
body, but `_ensure_tab` was sending `sessionKey`.  This caused a 400
Bad Request on every `browser_navigate` call.

The parameter name mismatch is visible in the same file: line 283
already reads `tab.get("listItemId")` when adopting existing tabs,
confirming the server-side field name.

Fixes #37960
2026-06-29 01:26:24 -07:00
teknium1
34e616e778 feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required
Follow-up to the group-DM manifest fix. The manifest change only helps
NEW installs; existing apps keep their old (mpim-less) scopes until the
admin reinstalls. Since a missing message.mpim event delivers nothing
(no runtime API error to catch), detect stale installs at connect time
from the auth.test x-oauth-scopes header and log an actionable reinstall
nudge when im:history is granted but mpim:history is not. Also promote
message.mpim from Recommended to Required in the docs event tables so the
default setup path can't drop it.
2026-06-29 01:02:53 -07:00
Ben
4125cc3b7c fix(slack): subscribe to message.mpim + mpim scopes so group DMs work
Group DMs (multi-person DMs, channel_type=mpim) were never delivered to
the Slack bot. The adapter already classifies mpim as a DM and replies
ambiently (adapter.py:2526, is_dm = channel_type in {im, mpim}), but the
generated app manifest only subscribed to message.im / im:history — the
1:1 DM pair. Without the message.mpim event subscription Slack drops
group-DM messages before the adapter ever sees them, so 1:1 DMs worked
while group-DM ambient mode was dead.

Add message.mpim to bot_events and mpim:history (the scope that event
requires per Slack docs) + mpim:read (mirrors im:read for the
conversations.info classification call) to bot_scopes. Update the
SLACK_BOT_TOKEN / SLACK_APP_TOKEN setup-help strings and the Slack docs
(EN + zh-Hans: scope table, event table, troubleshooting) so existing
installs are told to add the new scopes and reinstall.

Reported by an enterprise customer. Note: this is a manifest/scope
change, so it only takes effect after the app is reinstalled and the
new scopes are accepted.

Tests: assert message.mpim + mpim:history + mpim:read are in the
manifest (with and without assistant mode); both fail on current main
and pass with this change.
2026-06-29 01:02:53 -07:00
Teknium
29f0968275
test(windows): harden pid-scan no-window assertion against captured-call leakage (#54707)
test_gateway_pid_scan_hides_wmic_and_powershell_windows flaked once in CI
(slice 7/8) with 'KeyError: creationflags' while passing 15/15 under exact
CI-parity locally. The positional 'kwargs["creationflags"]' indexing raises
a bare KeyError the moment any stray subprocess.run call is captured, masking
the real contract. Filter captured calls to the two intended Windows console
spawns (wmic + PowerShell fallback) and assert each is windowless via
.get('creationflags'); a leaked/extra call now surfaces as a readable
len-mismatch with the full captured list, not a cryptic KeyError.
2026-06-29 01:01:29 -07:00
Teknium
0434a9a5ec chore: regenerate uv.lock for supermemory + mem0 extras 2026-06-29 00:25:36 -07:00
Ben Barclay
1289f12812 fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight
The supermemory and mem0 memory providers shipped third-party SDKs
(supermemory / mem0ai) that are not core dependencies, but — unlike the
honcho and hindsight providers — they imported those SDKs directly with
no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist
entry. On the published Docker image the agent venv is sealed
(HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a
writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight
route through ensure() and install fine there; supermemory/mem0 never
called it, so their SDK was never installed on a hosted instance and the
provider silently reported itself unavailable even with the API key set.

Fixes:
- Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist
  (tools/lazy_deps.py), pinned to current PyPI releases.
- Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint
  (_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend),
  mirroring honcho's wrapped try/except shape.
- Drop the SDK-import gate from supermemory's is_available() — it was a
  chicken-and-egg trap (provider never loaded on a sealed venv, so
  ensure() never ran). Now key-presence only, like honcho/mem0.
- Add matching pyproject extras [supermemory]/[mem0]; update the
  lazy-covered-extras contract test (excluded from [all] by policy).

Tests prove each path fails without the fix and the real sealed-venv
durable-target gate accepts both features.
2026-06-29 00:25:36 -07:00
teknium1
f860492842 test(desktop): match multiline spawn(ps, fullArgs) via regex like sibling sites
The bootstrap-runner PowerShell spawn is formatted multiline (spawn(\n  ps,\n  fullArgs,...), so the literal substring 'spawn(ps, fullArgs' never matched and the assertion was failing on main independent of #54635. Convert it to a whitespace-tolerant regex like every other call-site assertion in this file.
2026-06-28 23:59:32 -07:00
emozilla
aa2ae36c3f fix(desktop): launch Windows backend as console python so child consoles are inherited, not flashed
The recurring Windows desktop console-flash bug (#54220) is governed by the
*parent's* console, not by each child spawn. The desktop backend was launched as
GUI-subsystem pythonw.exe, which has no console at all — so every
console-subsystem child it spawns (git, gh, cmd, wmic, powershell, ...) had to
allocate its own console, flashing a window. That is why the fix had become an
endless per-call-site sweep of CREATE_NO_WINDOW flags: each leaf spawn was
papering over a missing console on the root.

Launch the backend as the venv's console python.exe instead. Under the existing
hiddenWindowsChildOptions() wrapper (windowsHide: true -> CREATE_NO_WINDOW) the
backend owns a single *windowless* console, and every descendant spawn inherits
it instead of allocating a visible one. This makes "no flashing windows" a
property of the one backend launch rather than a flag that must be remembered at
every spawn site — including spawns inside third-party libraries that no
call-site sweep can reach.

Verified on Windows 11 25H2 (Windows Terminal default): with the per-site hide
flag forcibly neutered, the canonical culprits (git/gh/cmd/wmic/powershell)
spawned naively and none flashed, while the same naive spawn from the old
console-less pythonw parent did flash — isolating the parent console as the cause.

Two premises behind the old pythonw approach did not hold up on current Windows
and are dropped here:
- The venv Scripts\python.exe uv shim, under CREATE_NO_WINDOW, re-execs base
  python *windowless* — it does not flash a conhost (the #52239 concern), so the
  base-pythonw detour is unnecessary.
- Console python restores stdout, so the backend announces its port on the normal
  HERMES_DASHBOARD_READY stdout line; the pythonw-only ready-file side channel is
  no longer needed and the readyFile opt-in is removed.

Removes the now-dead pythonw machinery (getNoConsoleVenvPython, toNoConsolePython,
applyWindowsNoConsoleSpawnHints, readVenvHome) and updates the test to assert the
new invariant: backend command is never pythonw, both backend spawns still go
through hiddenWindowsChildOptions, and no backend opts into the ready-file path.

Scope: this fixes the high-frequency backend-descendant flash classes. The
updater/UAC handoff (#54543) and embedded-terminal PTY accumulation (#53555)
classes have separate root causes and are unaffected.
2026-06-28 23:59:32 -07:00
Ben
20b03d9aee i18n: add Custom Keys strings to all locale files
The env translation block is type-checked across every locale (tsc -b), so
the 8 new customKeys strings must exist in all of them, not just en/zh. Add
translated entries to the remaining 14 locales (de, es, fr, it, ja, ko, pt,
ru, tr, uk, hu, ga, af, zh-hant).
2026-06-28 22:53:56 -07:00
Ben
1c75e7c9d8 feat(dashboard): list & add arbitrary custom .env keys on the Keys page
The Keys page only rendered env vars present in a catalog (OPTIONAL_ENV_VARS
or the provider catalog); any other key a user set in .env was invisible, and
there was no way to add an arbitrary env var from the GUI (e.g. to inject a
var a skill or MCP server needs).

Backend: GET /api/env now also emits a row for every on-disk .env key that
isn't in any catalog, flagged category="custom" + custom=true and
password-masked (an unrecognised key could hold anything, so it's redacted and
reveal-gated like any secret). Channel-managed credentials stay excluded. The
write (PUT /api/env) and reveal (POST /api/env/reveal) paths already handle
arbitrary keys, with the existing env-name guard + denylist (PATH, LD_PRELOAD,
PYTHONPATH, …) enforced server-side — no new write surface.

Frontend: a new "Custom Keys" section lists those custom rows and carries an
add-a-key form (client-side name validation mirroring the backend regex; the
new row reuses the normal edit/save flow, so on save it round-trips back from
the backend as a durable custom row). i18n added for en + zh + types.

Tests: behavior-contract coverage that an unknown .env key surfaces as a
masked custom row and a catalogued key does not — verified to fail on the
pre-fix backend.
2026-06-28 22:53:56 -07:00
HexLab98
23f245eda5 test(vision): cover Ollama /api/show vision capability routing (#54511) 2026-06-28 22:52:59 -07:00
HexLab98
d7e573e54d fix(vision): detect Ollama vision models via /api/show (#54511)
When local Ollama models are absent from models.dev, probe the Ollama
server's /api/show capabilities so attached images are routed natively
instead of being stripped as non-vision input.
2026-06-28 22:52:59 -07:00
sgaofen
b481348fbc fix(agent): stream copilot ACP chat completions 2026-06-28 22:52:51 -07:00
sgaofen
0106082d1f fix(agent): return OpenAI-shaped copilot ACP tool calls 2026-06-28 22:52:51 -07:00
sgaofen
032d702140 fix(agent): omit stream_options for native Gemini streaming
Google's native Gemini REST endpoint (generativelanguage.googleapis.com,
non-/openai) rejects OpenAI-only stream_options={"include_usage": true},
crashing every streaming chat-completions call with TypeError. Omit it for
that endpoint while keeping it for the Gemini OpenAI-compat shim and all
OpenAI-compatible aggregators (OpenRouter, etc.) so usage accounting is
preserved.

Reuses is_native_gemini_base_url() so the compat shim (.../openai), which
accepts stream_options, is correctly excluded from the omission.

Fixes #14387

Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>
2026-06-28 22:52:46 -07:00
Teknium
25d35cce18 infographic: Windows CLH lock-timeout traceback suppression (#54436 salvage) 2026-06-28 22:35:56 -07:00
helix4u
98a7cfb8f9 fix(logging): suppress Windows lock timeout tracebacks 2026-06-28 22:35:56 -07:00
Teknium
74541beb9c
fix(security): cap WeCom callback body size before pre-auth XML parse (#54615)
The WeCom callback endpoint (internet-facing, 0.0.0.0) parsed untrusted
request bodies before signature verification. defusedxml already guards
the entity-expansion class on main, but there was no cap on raw body
size, so an unauthenticated POST could still force unbounded read work
pre-auth.

Set client_max_size=64KB on the aiohttp app (413 at the framework layer)
plus an explicit length guard in _handle_callback as defense in depth.
WeCom callbacks are small encrypted XML envelopes — media is delivered
out-of-band via MediaId, never inline — so 64KB is ample for legitimate
traffic. Adds tests for oversized (413) and normal-sized (not 413) bodies.

Salvaged from #10192 by @memosr (body-size limit half; defusedxml half
already superseded on main).
2026-06-28 22:35:43 -07:00
teknium1
0b733a8418 test(gateway): pin auto-reset cached-agent eviction (#10710)
Relocate marco0158's eviction into the dedicated auto-reset cleanup block
(single source of truth for dropping session-scoped transient state) and
add an AST invariant pinning _evict_cached_agent into that block. Add
AUTHOR_MAP entry for marco0158.
2026-06-28 22:35:17 -07:00