Commit graph

6568 commits

Author SHA1 Message Date
Telos
fa11b11cf5 fix: propagate key_env from custom_providers into ProviderDef
resolve_custom_provider() previously returned api_key_env_vars=()
for every custom provider entry, silently dropping the configured
key_env field. This caused 401 errors for any custom provider that
required an API key via environment variable (e.g. Xiaomi MiMo Token
Plan, self-hosted OpenAI-compatible servers).

The key_env field is already documented in _VALID_CUSTOM_PROVIDER_FIELDS
and normalized by normalize_custom_provider_entry(), so this was just
an oversight in the ProviderDef construction.

Also adds a regression test that verifies key_env is properly
propagated into the resolved ProviderDef.
2026-06-29 02:25:48 -07:00
Sanjay Santhanam
c79e6bceae fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331)
# Conflicts:
#	tools/browser_tool.py
2026-06-29 02:24:57 -07:00
Teknium
bf0d8fed8e
fix(config): v32 migration flips baked-in verify_on_stop=true to false (#54740)
The first ship of verify-on-stop (config v30) defaulted
DEFAULT_CONFIG agent.verify_on_stop to a literal True, and migrate_config
persists defaults with strip_defaults=False — so every install that updated
through v30 had verify_on_stop: true written into config.yaml as a literal.

The v30->v31 migration only flipped missing/'auto' values to false and
deliberately preserved an explicit bool, so it skipped that entire population
and left verify-on-stop ON for everyone who had updated. A literal true was
never a user choice: the feature had no off-switch worth setting it against
until v31 introduced one, so a true persisted before v32 is always the old
machine default.

v32 migration flips a literal true -> false once, for both v30 (skipped v31)
and v31 (preserved-by-bug) installs. A true the user sets AFTER v32 is a
deliberate opt-in and is never touched.
2026-06-29 01:51:08 -07:00
teknium1
75317d82d0 fix(vision): narrow the fan-out cap to the CPU encode burst only
The original cap held a process-global slot across the WHOLE vision
analysis (image load + encode + LLM call) with a default of min(CPUs, 4).
That serialized legitimate multi-image workflows — "compare these 6
screenshots", "read this 10-page scan", "analyze every frame" — behind a
4-wide gate, and on the native fast path it even throttled calls that make
no LLM request at all. Excess calls queued (blocking acquire, nothing
dropped), but the latency hit on real fan-out was the wrong tradeoff.

The incident was CPU exhaustion, not call count: concurrent base64/resize
bursts saturated every core and left none to service the shared event loop
serving /api/status. So cap ONLY that:

- A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the
  encode/resize/dimension-check off the caller's loop, sized to the host's
  usable core count with NO fixed ceiling — the cap tracks the actual
  exhausted resource (cores), not a magic number. Excess encodes queue on
  the executor; cores stay free for the loop.
- The LLM call is deliberately OUTSIDE the executor, so multi-image
  workflows keep full request concurrency.
- Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY
  (honored verbatim, including above core count); sub-1 ignored.
- _vision_concurrency_slot() is now a no-op shim for back-compat.

Tests assert: resolver defaults to host cores with no ceiling; env/config
override (incl. above cores); sub-1 rejection; the executor is dedicated and
core-sized; encode runs on a vision-encode thread; and crucially that encode
bursts are bounded to the cap while the analyses themselves stay fully
concurrent (calls_peak > cap).
2026-06-29 01:27:10 -07:00
Ben Barclay
eddfecd2ce fix(vision): cap vision_analyze fan-out concurrency process-wide
A single agent turn can fan out N vision_analyze calls at once — the
classic trigger is "analyze every frame of this video", where ffmpeg
explodes a clip into dozens of frames and the model calls vision_analyze
on each. Every call does a CPU-heavy base64-encode/resize burst AND holds
a long-lived LLM stream open. The tool executor runs concurrent tool calls
on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple
agent sessions share one process (the dashboard runs the agent in-process),
so there was no global ceiling. In prod (June 2026) a video-frame fan-out
pinned a worker thread at ~100% CPU and starved the shared asyncio event
loop that also serves the dashboard's /api/status liveness probe, flapping
the instance to UNHEALTHY even though nothing had crashed.

Add a process-global threading.BoundedSemaphore that bounds how many vision
analyses run concurrently across the whole process, held across the entire
analysis (image load + encode + LLM call) in the single _handle_vision_analyze
chokepoint (covers both the native fast path and the legacy aux-LLM path).

It is a threading semaphore, NOT asyncio: each vision call is dispatched
through model_tools._run_async on a per-thread event loop, so an asyncio
primitive bound to one loop cannot coordinate across them. The acquire is
offloaded via run_in_executor so waiting for a slot never blocks the calling
loop.

Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency,
or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or
HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can
never be disabled into an unbounded fan-out.

Tests: bounded-fan-out regression guard + a control proving it would fail
without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu
host, env override, and sub-1 rejection. Pre-existing handler tests updated
for the now-async _handle_vision_analyze. Verified via the real
registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls,
peak bounded to cap).
2026-06-29 01:27:10 -07:00
teknium1
115e78c377 test(camofox): accept headers= kwarg in persistence test mocks
The auth-header fix adds headers=_auth_headers() to all Camofox HTTP
calls. Two _capture_post mocks in the persistence test lacked a headers
parameter, so navigate raised TypeError and the success assertions
failed. Add headers=None to both mock signatures.
2026-06-29 01:26:24 -07:00
liuhao1024
fe38d50833 fix(tools): read browser.command_timeout in Camofox HTTP client
The Camofox browser backend hardcoded a 30s HTTP timeout via
_DEFAULT_TIMEOUT, ignoring the user's browser.command_timeout config.
The main browser_tool path already reads this config via
_get_command_timeout().

This commit adds an equivalent _get_command_timeout() to
browser_camofox.py that reads browser.command_timeout from config
with caching, and switches all HTTP helper methods (_post, _get,
_get_raw, _delete) to use it as the default timeout.

Fixes #40843
2026-06-29 01:26:24 -07:00
刘昊
babd9168ba fix(browser): send Authorization header in Camofox HTTP calls when CAMOFOX_API_KEY is set
The five HTTP call sites in browser_camofox.py (_ensure_tab, _post,
_get, _get_raw, _delete) did not include Authorization headers, causing
403 Forbidden when the Camofox server has API key auth enabled.

Added _auth_headers() helper and wired it into all five call sites.
The health check endpoint (/health) is left without auth since it is
a connectivity probe, not a browser operation.

Regression test covers: header present when key set, absent when unset,
blank key produces empty headers.

Fixes #20476
2026-06-29 01:26:24 -07:00
liuhao1024
270456308c fix(tools): send listItemId instead of sessionKey in Camofox tab creation
The Camoufox REST API server expects `listItemId` in the `POST /tabs`
body, but `_ensure_tab` was sending `sessionKey`.  This caused a 400
Bad Request on every `browser_navigate` call.

The parameter name mismatch is visible in the same file: line 283
already reads `tab.get("listItemId")` when adopting existing tabs,
confirming the server-side field name.

Fixes #37960
2026-06-29 01:26:24 -07:00
teknium1
34e616e778 feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required
Follow-up to the group-DM manifest fix. The manifest change only helps
NEW installs; existing apps keep their old (mpim-less) scopes until the
admin reinstalls. Since a missing message.mpim event delivers nothing
(no runtime API error to catch), detect stale installs at connect time
from the auth.test x-oauth-scopes header and log an actionable reinstall
nudge when im:history is granted but mpim:history is not. Also promote
message.mpim from Recommended to Required in the docs event tables so the
default setup path can't drop it.
2026-06-29 01:02:53 -07:00
Ben
4125cc3b7c fix(slack): subscribe to message.mpim + mpim scopes so group DMs work
Group DMs (multi-person DMs, channel_type=mpim) were never delivered to
the Slack bot. The adapter already classifies mpim as a DM and replies
ambiently (adapter.py:2526, is_dm = channel_type in {im, mpim}), but the
generated app manifest only subscribed to message.im / im:history — the
1:1 DM pair. Without the message.mpim event subscription Slack drops
group-DM messages before the adapter ever sees them, so 1:1 DMs worked
while group-DM ambient mode was dead.

Add message.mpim to bot_events and mpim:history (the scope that event
requires per Slack docs) + mpim:read (mirrors im:read for the
conversations.info classification call) to bot_scopes. Update the
SLACK_BOT_TOKEN / SLACK_APP_TOKEN setup-help strings and the Slack docs
(EN + zh-Hans: scope table, event table, troubleshooting) so existing
installs are told to add the new scopes and reinstall.

Reported by an enterprise customer. Note: this is a manifest/scope
change, so it only takes effect after the app is reinstalled and the
new scopes are accepted.

Tests: assert message.mpim + mpim:history + mpim:read are in the
manifest (with and without assistant mode); both fail on current main
and pass with this change.
2026-06-29 01:02:53 -07:00
Teknium
29f0968275
test(windows): harden pid-scan no-window assertion against captured-call leakage (#54707)
test_gateway_pid_scan_hides_wmic_and_powershell_windows flaked once in CI
(slice 7/8) with 'KeyError: creationflags' while passing 15/15 under exact
CI-parity locally. The positional 'kwargs["creationflags"]' indexing raises
a bare KeyError the moment any stray subprocess.run call is captured, masking
the real contract. Filter captured calls to the two intended Windows console
spawns (wmic + PowerShell fallback) and assert each is windowless via
.get('creationflags'); a leaked/extra call now surfaces as a readable
len-mismatch with the full captured list, not a cryptic KeyError.
2026-06-29 01:01:29 -07:00
Ben Barclay
1289f12812 fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight
The supermemory and mem0 memory providers shipped third-party SDKs
(supermemory / mem0ai) that are not core dependencies, but — unlike the
honcho and hindsight providers — they imported those SDKs directly with
no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist
entry. On the published Docker image the agent venv is sealed
(HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a
writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight
route through ensure() and install fine there; supermemory/mem0 never
called it, so their SDK was never installed on a hosted instance and the
provider silently reported itself unavailable even with the API key set.

Fixes:
- Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist
  (tools/lazy_deps.py), pinned to current PyPI releases.
- Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint
  (_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend),
  mirroring honcho's wrapped try/except shape.
- Drop the SDK-import gate from supermemory's is_available() — it was a
  chicken-and-egg trap (provider never loaded on a sealed venv, so
  ensure() never ran). Now key-presence only, like honcho/mem0.
- Add matching pyproject extras [supermemory]/[mem0]; update the
  lazy-covered-extras contract test (excluded from [all] by policy).

Tests prove each path fails without the fix and the real sealed-venv
durable-target gate accepts both features.
2026-06-29 00:25:36 -07:00
Ben
1c75e7c9d8 feat(dashboard): list & add arbitrary custom .env keys on the Keys page
The Keys page only rendered env vars present in a catalog (OPTIONAL_ENV_VARS
or the provider catalog); any other key a user set in .env was invisible, and
there was no way to add an arbitrary env var from the GUI (e.g. to inject a
var a skill or MCP server needs).

Backend: GET /api/env now also emits a row for every on-disk .env key that
isn't in any catalog, flagged category="custom" + custom=true and
password-masked (an unrecognised key could hold anything, so it's redacted and
reveal-gated like any secret). Channel-managed credentials stay excluded. The
write (PUT /api/env) and reveal (POST /api/env/reveal) paths already handle
arbitrary keys, with the existing env-name guard + denylist (PATH, LD_PRELOAD,
PYTHONPATH, …) enforced server-side — no new write surface.

Frontend: a new "Custom Keys" section lists those custom rows and carries an
add-a-key form (client-side name validation mirroring the backend regex; the
new row reuses the normal edit/save flow, so on save it round-trips back from
the backend as a durable custom row). i18n added for en + zh + types.

Tests: behavior-contract coverage that an unknown .env key surfaces as a
masked custom row and a catalogued key does not — verified to fail on the
pre-fix backend.
2026-06-28 22:53:56 -07:00
HexLab98
23f245eda5 test(vision): cover Ollama /api/show vision capability routing (#54511) 2026-06-28 22:52:59 -07:00
sgaofen
b481348fbc fix(agent): stream copilot ACP chat completions 2026-06-28 22:52:51 -07:00
sgaofen
0106082d1f fix(agent): return OpenAI-shaped copilot ACP tool calls 2026-06-28 22:52:51 -07:00
sgaofen
032d702140 fix(agent): omit stream_options for native Gemini streaming
Google's native Gemini REST endpoint (generativelanguage.googleapis.com,
non-/openai) rejects OpenAI-only stream_options={"include_usage": true},
crashing every streaming chat-completions call with TypeError. Omit it for
that endpoint while keeping it for the Gemini OpenAI-compat shim and all
OpenAI-compatible aggregators (OpenRouter, etc.) so usage accounting is
preserved.

Reuses is_native_gemini_base_url() so the compat shim (.../openai), which
accepts stream_options, is correctly excluded from the omission.

Fixes #14387

Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>
2026-06-28 22:52:46 -07:00
helix4u
98a7cfb8f9 fix(logging): suppress Windows lock timeout tracebacks 2026-06-28 22:35:56 -07:00
Teknium
74541beb9c
fix(security): cap WeCom callback body size before pre-auth XML parse (#54615)
The WeCom callback endpoint (internet-facing, 0.0.0.0) parsed untrusted
request bodies before signature verification. defusedxml already guards
the entity-expansion class on main, but there was no cap on raw body
size, so an unauthenticated POST could still force unbounded read work
pre-auth.

Set client_max_size=64KB on the aiohttp app (413 at the framework layer)
plus an explicit length guard in _handle_callback as defense in depth.
WeCom callbacks are small encrypted XML envelopes — media is delivered
out-of-band via MediaId, never inline — so 64KB is ample for legitimate
traffic. Adds tests for oversized (413) and normal-sized (not 413) bodies.

Salvaged from #10192 by @memosr (body-size limit half; defusedxml half
already superseded on main).
2026-06-28 22:35:43 -07:00
teknium1
0b733a8418 test(gateway): pin auto-reset cached-agent eviction (#10710)
Relocate marco0158's eviction into the dedicated auto-reset cleanup block
(single source of truth for dropping session-scoped transient state) and
add an AST invariant pinning _evict_cached_agent into that block. Add
AUTHOR_MAP entry for marco0158.
2026-06-28 22:35:17 -07:00
Junass1
61a4526ac7 fix(gateway): clear session-scoped model overrides on /resume
/resume is a conversation boundary, but unlike /new it did not clear the
chat-keyed _session_model_overrides / _pending_model_notes. A /model switch
made in the previous session under the same chat session_key leaked into the
resumed conversation, running it on the wrong model.

Clear both maps for the session_key after the switch (mirroring /new), scoped
to that key so other chats' overrides are untouched. The cached-agent eviction
this leak also implied already landed via #6672.

Closes #10702.
2026-06-28 22:35:12 -07:00
Shannon Sands
476875acb9 Add dashboard backup upload and download 2026-06-28 22:35:09 -07:00
Ben Barclay
8fe800ee1a
fix(file-tools): sanitize host/relative cwd override before it reaches container sandbox (#54447) (#54616)
(cherry picked from commit 82132f7911)

Co-authored-by: Tranquil-Flow <66773372+Tranquil-Flow@users.noreply.github.com>
2026-06-29 15:32:20 +10:00
brooklyn!
388268ecde
Merge pull request #54568 from NousResearch/bb/shared-websocket-layer
refactor(desktop+dashboard): shared WebSocket layer + decouple desktop from dashboard (hermes serve)
2026-06-28 23:43:49 -05:00
Brooklyn Nicholson
1af109c79c test(cli): drop pytest dep + use real sentinel handlers in serve test
Clears the ty diff bot's warnings on the new test: pass real callables to
build_dashboard_parser (not object()) and replace the pytest.mark.parametrize
with a plain loop so the file is stdlib-only.
2026-06-28 23:24:45 -05:00
Ruzzgar
313a8c6833 fix(skills): replace string prefix check with strict path containment 2026-06-28 21:14:01 -07:00
Ben Barclay
0943e2a272
fix(cron): don't report a false 'gateway not running' on external-provider instances (#54600)
`hermes cron status` (and the create/list 'gateway not running' nag)
judge whether cron will fire purely from the in-process ticker's
heartbeat file + a live gateway PID. That heuristic is correct for the
built-in ticker but WRONG for an external provider like Chronos:

Chronos arms exactly one external one-shot per job and is fired by a
NAS-mediated webhook (POST /api/cron/fire). Its `start()` returns
immediately and it deliberately runs no 60s loop and writes no ticker
heartbeat — that's the whole point of scale-to-zero (the machine is at
zero between fires). So on a perfectly healthy Chronos instance,
`cron status` always printed '✗ Gateway is not running — cron jobs will
NOT fire' (or a STALLED-ticker warning), and `cron create` always
appended the 'jobs won't fire automatically' nag — both false.

Verified live on a staging Chronos instance: jobs fired and completed on
schedule via the relay while `cron status` insisted the gateway wasn't
running and the heartbeat was 370s+ stale.

Fix: resolve the active provider (offline — `resolve_cron_scheduler`,
whose `is_available()` contract forbids network) and, for any non-builtin
provider, report the managed-scheduler state instead of the ticker
heuristics, and suppress the ticker-only 'gateway not running' warning.
The built-in path is byte-unchanged. Active-job summary is factored into
a shared helper so both paths print it identically.

New tests prove both directions (chronos: no false negative even with no
gateway PID / no heartbeat; builtin: historical warning preserved) and
fail without the fix.
2026-06-29 14:03:02 +10:00
Teknium
e20ff352b9 test(matrix): authorize inviter in DM-invite fixture for new invite-auth gate
_on_invite now rejects auto-joins from users not on the allow-list. The
DM-recording tests invite @alice and expect a join, so the shared
_make_adapter fixture now puts @alice on _allowed_user_ids.
2026-06-28 20:47:33 -07:00
lkevincc
163562bf88 fix: normalize lmstudio base urls 2026-06-28 20:46:44 -07:00
teknium1
14204b0646 test(agent): cover .hermes.md no-git-root cwd-only behavior
Regression tests for the injection fix: outside a git repo only cwd is
checked (planted ancestor .hermes.md is ignored), a cwd-local .hermes.md
is still found, and inside a git repo the parent walk to the git root
still works.
2026-06-28 20:46:32 -07:00
Brooklyn Nicholson
9d9a50c2bc test(cli): pin the hermes serve decoupling contract
Add a focused contract test for the headless `serve` command (routes to the
shared dashboard handler, headless by default while `dashboard` is not, accepts
the legacy --no-open, shares the same runtime/lifecycle flag surface). Also
refresh the dashboard.py module docstring to cover both commands.
2026-06-28 22:11:48 -05:00
Brooklyn Nicholson
ae465e9fb8 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-multiterminal 2026-06-28 21:37:52 -05:00
Ben
dee41d0716 feat(dashboard): catalogue all memory-provider API keys in OPTIONAL_ENV_VARS
The dashboard Keys page and `hermes setup` render API-key rows from
OPTIONAL_ENV_VARS, but only Honcho had an entry — so Hindsight,
Supermemory, Mem0, RetainDB, ByteRover, and OpenViking read their keys
straight from os.environ yet had no place to set them in the GUI.

Add catalog entries (category=tool, password-masked, with get-key URLs
and the tool each powers) for all six, plus the relevant base-URL/endpoint
companions. Pure declaration: the generic GET /api/env endpoint, the
save/reveal write path, and the sandbox env blocklist (which auto-derives
from tool-category OPTIONAL_ENV_VARS) all pick these up with no further
wiring.

Adds a behavior-contract test asserting every memory provider's primary
credential key is catalogued, tool-categorised, and password-masked.
2026-06-28 19:17:02 -07:00
Brooklyn Nicholson
e117cfdff0 feat(desktop): live agent terminals + agent-driven tab close
Make the read-only agent terminal mirrors stream in real time and give
the agent a desktop-only way to dismiss its own tabs.

- Stream background output live: the local reader used a blocking
  read(4096) that buffered small periodic output until EOF, so agent
  tabs only "filled in" at process exit. Switch to buffer.read1(4096)
  (decoded) for incremental chunks.
- Route agent.terminal.output / terminal.close to the window that owns
  the process (its gateway session) instead of an empty session id, so
  events actually reach the desktop renderer.
- Add close_terminal: a HERMES_DESKTOP-gated tool (sibling of
  read_terminal) that drops a process's read-only tab WITHOUT killing it
  via process_registry.on_close; output keeps buffering and the user can
  reopen from the status stack.
- ⌘W now closes a focused agent tab: mark the agent instance
  data-terminal and focus it on activation so isFocusWithin routes there.
- ensureTerminal() no longer spawns an extra user shell when a tab
  already exists (e.g. opening a background task from the status stack).
2026-06-28 21:15:14 -05:00
LIC99
dda3268d09 fix(approvals): warn and default to manual on unknown approvals.mode
_normalize_approval_mode() previously accepted any string, so an unknown
value like 'auto' fell through every downstream mode check (off/smart) and
silently behaved like manual with no signal. Validate against the known
modes (manual/smart/off), emit a warning for anything else, and default to
manual to match the config default and the rest of the function.

Bug 1 from the original PR (/approve & /deny bypassing the running-agent
guard) already landed on main independently, so only the mode-validation
fix is salvaged here.

Fixes #4261

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-28 19:04:18 -07:00
Teknium
11183e8332 fix(profiles): validate custom alias names to prevent path traversal
`hermes profile alias <profile> --name <custom>` accepted arbitrary
strings and used them verbatim as a filename under ~/.local/bin. Because
normalize_profile_name only lowercases/strips (no regex gate), a value
like `../../.bashrc` escaped the wrapper directory and clobbered
arbitrary user-writable files. remove_wrapper_script had the same sink.

Add validate_alias_name (reusing the profile-id regex, which forbids
`/`, `.`, and `..`) and wire it into check_alias_collision,
create_wrapper_script, remove_wrapper_script, and the CLI alias action so
the rejection surfaces a clear "Invalid alias name" error instead of
silently writing or unlinking outside the wrapper dir.

Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
Co-authored-by: Xowiek <xowiekk@gmail.com>
2026-06-28 18:53:33 -07:00
Teknium
490f215a19 test: cover export-prefix stripping in .env parsers (PR #6659) 2026-06-28 18:53:00 -07:00
Teknium
3483424aaa
fix(security): redact bare-token credentials in URL userinfo (#6396) (#54475)
git remote set-url with an embedded password (https://PASSWORD@github.com)
leaked the credential into agent output — the redaction engine only masked
user:pass@ DB connection strings, never the colon-less bare-token userinfo
form a git remote uses.

Add _URL_BARE_TOKEN_RE: scheme://TOKEN@host for web/transport schemes
(http/https/wss/git/ssh/ftp), 8+ char floor to skip short usernames, token
class forbidding /:@ so an @ in a path/query is never treated as userinfo.

Deliberately scoped to the bare-token form only. The user:pass@ colon form
and query-string tokens stay passing through (#34029, 'pass web URLs through
unchanged') so magic-link / OAuth round-trip skills keep working — a bare
credential in userinfo is never a workflow token (those live in the query
string), so masking it can't break a skill.
2026-06-28 18:52:42 -07:00
Teknium
9860d93f2a
fix(terminal): require approval for host-bound Docker commands (#54483)
* fix(terminal): require approval for host-bound Docker commands

The Docker terminal backend blanket-skips dangerous-command approval on
the assumption that the container is isolated from the host. That holds
only when nothing is bind-mounted in. Once a host path is exposed (via
TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE or a host-path entry in
TERMINAL_DOCKER_VOLUMES), a command like `rm -rf /workspace` reaches
real host files but is still auto-approved.

Detect host bind mounts and route those sessions through the normal
approval flow. Isolated Docker keeps the fast path. The same gating is
applied to the execute_code guard, which had the identical blanket skip.

Co-authored-by: Hermes Agent <agent@nousresearch.com>

* chore: add AUTHOR_MAP entry for PR #6436 salvage (Kolektori)

* test: accept has_host_access kwarg in _check_all_guards mocks

The host-bound Docker approval fix adds a has_host_access kwarg to the
_check_all_guards wrapper. Six pre-existing tests monkeypatch it with a
fixed (command, env_type) / (cmd, env) lambda signature, which now
raises TypeError when terminal_tool passes the new kwarg. Widen those
mock signatures to accept **kwargs.

---------

Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com>
Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-29 11:35:41 +10:00
Ben Barclay
7cfa2fa13f
fix(docker): gate resource limit flags on cgroup controller availability (#54516)
On hosts where the cgroup v2 cpu/memory/pids controllers are not delegated
to the docker/podman process (unprivileged Proxmox LXCs, some rootless and
nested setups), --pids-limit/--cpus/--memory cause every container start to
fail with OCI runtime error / exit 126, breaking terminal + execute_code.

- Add _cgroup_limits_available(image): one-shot, host-wide cached probe that
  spawns a throwaway container from the sandbox image itself (sleep 0) with
  all three flags together, mirroring the existing _storage_opt_supported
  probe-and-degrade pattern.
- Remove --pids-limit from static _BASE_SECURITY_ARGS; apply it (default 256
  via _DEFAULT_PIDS_LIMIT) in resource_args gated on the probe.
- Gate --cpus and --memory on the same probe.

Behavior unchanged on cgroup-capable hosts; graceful degradation with a
one-time warning where controllers aren't delegated.

Fixes #6568.

(cherry picked from commit c933880b7e)

Co-authored-by: angelos <angelos@oikos.lan.home.malaiwah.com>
2026-06-29 11:01:08 +10:00
Brooklyn Nicholson
f34cf7e3a4 test(gmi): stub profile fetch_models in static-fallback test
The fallback test only mocked fetch_api_models; CI still hit the real GMI
/v1/models endpoint via ProviderProfile.fetch_models and merged live
models into the result.
2026-06-28 18:05:28 -05:00
Brooklyn Nicholson
cb1bb1a48d refactor(windows): unify windowless spawn form across the touched sites
windows_hide_flags() already returns 0 on POSIX (and creationflags=0 is
the no-op default there, exactly how server.py::_list_repo_files does it),
so drop the IS_WINDOWS import + ternary/one-use-dict gating and just pass
creationflags=windows_hide_flags() directly. Tests lose the now-pointless
IS_WINDOWS monkeypatch.
2026-06-28 17:44:47 -05:00
Brooklyn Nicholson
32087e4bc9 fix(windows): hide console flash on checkpoint git + skills_hub gh probes
The #54236/#54417 backend git/gh sweep routed git_probe, the repo-file
picker, coding_context, context_references, copilot_auth, and the gateway
process scans through CREATE_NO_WINDOW, but two sibling spawn legs that
also run inside the console-less desktop/gateway backend were missed:

- tools/checkpoint_manager.py `_run_git` (and the one-shot `git init
  --bare` in `_init_store`) — when checkpoints are enabled, every
  file-mutating turn fires multiple bare `git` calls (status, add,
  write-tree/commit-tree, update-ref). Spawned from a parent with no
  console (Electron spawns the backend with windowsHide → CREATE_NO_WINDOW),
  each one allocates its own conhost window → a flurry of terminal popups.
- tools/skills_hub.py `GitHubAuth._try_gh_cli` — `gh auth token`, the same
  bug class as the already-fixed copilot_auth gh probe.

Route both through `windows_hide_flags()` (no-op on POSIX), matching the
established per-site pattern. Tests added to
tests/test_windows_subprocess_no_window_flags.py.
2026-06-28 17:41:47 -05:00
Teknium
980622d0ec
perf(startup): parse config + plugin manifests with libyaml CSafeLoader (#54486)
The startup config/manifest reads used PyYAML's pure-Python SafeLoader,
which is ~8x slower than the libyaml-backed CSafeLoader C extension.
config.yaml is parsed several times during launch (cli config, raw
config, early interface/redaction bridge, logging config) and every
plugin manifest is parsed once — all on the slow path.

Add utils.fast_safe_load (CSafeLoader-preferring, pure-Python fallback,
true drop-in for safe_load) and route the hot startup parse sites
through it: hermes_cli/config.py (config + manifest reads),
hermes_cli/plugins.py (manifest parse), env_loader, cli.load_cli_config,
hermes_logging, and the two pre-config early YAML bridges in main.py.

Behavior is identical (same restricted safe tag set); only speed changes.
safe_load calls on the startup path drop from ~79 to ~0, cutting the
YAML parse cost from ~0.9s to ~0.15s under profiling.

Adds tests/test_fast_safe_load.py asserting equivalence with safe_load
across input shapes, empty-doc falsiness, C-loader preference, and that
python/object tags are still rejected (safe, not full loader).
2026-06-28 15:38:39 -07:00
Teknium
d65468e7ff
fix(security): SSRF guard yuanbao media download_url (#54470)
yuanbao_media.download_url() fetched model-supplied (outbound) and inbound
image/file URLs server-side via httpx with follow_redirects=True and no
SSRF check. A model response containing <img src="http://169.254.169.254/...">
routed through ImageUrlHandler -> download_url and would fetch cloud-metadata
endpoints; same for inbound media.

Add an is_safe_url() pre-flight plus an async redirect event-hook that
re-validates every 30x target, matching the cache_image_from_url() guard in
gateway/platforms/base.py. The other gateway adapters already guard their
URL-fetch paths; this was the remaining unguarded one.
2026-06-28 15:29:59 -07:00
brooklyn!
16ff1a3b93
Merge pull request #54457 from NousResearch/bb/windows-console-launcher-repair
fix(windows): repair missing console script launchers
2026-06-28 17:15:56 -05:00
奥森木
e7d4ade8cf fix(anthropic): ignore stale non-Anthropic base_url across all resolution paths
A config left with `provider: anthropic` but a leftover
`base_url: https://openrouter.ai/api/v1` (e.g. after a provider switch)
would route Anthropic OAuth/setup-token traffic to OpenRouter and 404.

Add `_anthropic_base_url_override_ok()` and gate the three native-Anthropic
resolution branches (pool, explicit, native) on it. The guard honors a
configured `model.base_url` only when it plausibly speaks the Anthropic
Messages protocol — official `*.anthropic.com` / `*.claude.com` hosts, Azure
Foundry endpoints, and `/anthropic`-suffixed or Kimi `/coding` proxies — and
falls back to `https://api.anthropic.com` otherwise. Aggregator URLs like
openrouter.ai / api.openai.com are treated as stale.

Reconstructed from @clovericbot's PR #3661 onto current main: the original
patched one branch with an anthropic-only allow-list, which would have broken
Azure-via-anthropic; widened to all three sites and made Azure/proxy-safe.
2026-06-28 15:12:03 -07:00
Mibayy
b0b7ff0d75 fix(provider): auto+base_url bypasses cloud API when custom endpoint configured (#3846)
When config.yaml has `provider: auto` and a non-cloud `base_url` (e.g. Ollama
at localhost:11434), requests were silently sent to https://api.anthropic.com
whenever ANTHROPIC_API_KEY was present in the environment, ignoring the
configured local endpoint and returning HTTP 401 / "credit balance too low".

Root cause: resolve_provider("auto") scans env vars and returns "anthropic"
when ANTHROPIC_API_KEY is set, before config.model.base_url is ever consulted.

In resolve_runtime_provider(), before calling resolve_provider(), short-circuit
to the OpenAI-compatible resolver when no explicit creds were passed, provider
is "auto"/unset, and a non-cloud base_url is configured. Well-known cloud roots
(openrouter.ai, anthropic.com, openai.com) are matched on HOST (not substring)
so look-alike hosts can't evade the bypass and leak a cloud credential.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-06-28 15:11:55 -07:00
Teknium
86e64900b9
fix(gateway): preserve sessions across restarts (#54442) 2026-06-28 15:10:39 -07:00