hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
Teknium	ee8cbfdc03	feat(web_extract): truncate-and-store instead of LLM summarization (#54843) * feat(web_extract): truncate-and-store instead of LLM summarization web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate-stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost. Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them. Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty). Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page). web_search is unchanged. No own scraper added; no changes to web_search. * fix(web_extract): add char_limit to execute_code web_extract stub The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails — the stub schema must cover every real schema param.	2026-06-29 10:00:49 -07:00
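The head+tail window described in the commit above can be sketched roughly as follows. This is only an illustration: the function name `truncate_head_tail`, the even head/tail split, the `[...]` marker, and the footer wording are all assumptions, not the repository's actual implementation. Only the 15000-character default and the `cache/web` location come from the commit message.

```python
def truncate_head_tail(text: str, char_limit: int = 15000,
                       stored_path: str = "cache/web/page.md") -> str:
    """Return text whole when it fits the budget, else a head+tail window.

    Hypothetical helper: the even split and footer wording are assumptions.
    """
    if len(text) <= char_limit:
        return text  # pages within the budget come back unchanged

    head_len = char_limit // 2
    tail_len = char_limit - head_len
    omitted = len(text) - head_len - tail_len

    head = text[:head_len]
    tail = text[len(text) - tail_len:]  # safe even when tail_len is 0

    # The footer tells the reader where the full text lives.
    footer = (
        f"\n\n[{omitted} characters omitted from the middle. "
        f"Full text stored at {stored_path}; use read_file on that path "
        "to page through the omitted section.]"
    )
    return head + "\n\n[...]\n\n" + tail + footer
```

Because the full text is stored separately, the window only has to be good enough to orient the reader; nothing is lost by cutting the middle.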
Ruzzgar	576424cc1c	fix(security): redact browser CDP endpoint logs	2026-06-29 04:25:26 -07:00
teknium1	9f97915163	fix(browser): route open-timeout base through _safe_command_timeout Wire the salvaged _safe_command_timeout() guard into the surviving open-timeout call site. _get_open_command_timeout() feeds the browser_navigate 'open' path; this closes the last call site that could observe a None timeout from a torn cache (#14331), since the original PR's max(_get_command_timeout(), 60) site no longer exists on main (now routed through _get_open_command_timeout).	2026-06-29 02:24:57 -07:00
Sanjay Santhanam	c79e6bceae	fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331) # Conflicts: # tools/browser_tool.py	2026-06-29 02:24:57 -07:00
teknium1	75317d82d0	fix(vision): narrow the fan-out cap to the CPU encode burst only The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows — "compare these 6 screenshots", "read this 10-page scan", "analyze every frame" — behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff. The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that: - A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling — the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop. - The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency. - Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored. - _vision_concurrency_slot() is now a no-op shim for back-compat. Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).	2026-06-29 01:27:10 -07:00
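The split described above (CPU-bound encode on a dedicated, core-sized pool; model call left outside it) might look like the sketch below. The names `_cpu_executor`, `_encode`, and `analyze` are hypothetical, and the blocking `.result()` stands in for whatever off-loop dispatch the real code uses.

```python
import base64
import os
import threading
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool sized to the host's core count; excess encodes queue here.
_cpu_executor = ThreadPoolExecutor(
    max_workers=max(1, os.cpu_count() or 1),
    thread_name_prefix="vision-encode",
)


def _encode(data: bytes):
    """CPU-heavy step: runs only on a vision-encode worker thread."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded, threading.current_thread().name


def analyze(data: bytes, call_model):
    """Encode on the bounded pool, then call the model OUTSIDE the pool."""
    encoded, thread_name = _cpu_executor.submit(_encode, data).result()
    # The model call is not gated by the executor, so request concurrency
    # is limited only by the callers themselves.
    return call_model(encoded), thread_name
```

Bounding only the encode step keeps cores available for other work while multi-image workflows still issue their model requests in parallel.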
Ben Barclay	eddfecd2ce	fix(vision): cap vision_analyze fan-out concurrency process-wide A single agent turn can fan out N vision_analyze calls at once — the classic trigger is "analyze every frame of this video", where ffmpeg explodes a clip into dozens of frames and the model calls vision_analyze on each. Every call does a CPU-heavy base64-encode/resize burst AND holds a long-lived LLM stream open. The tool executor runs concurrent tool calls on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple agent sessions share one process (the dashboard runs the agent in-process), so there was no global ceiling. In prod (June 2026) a video-frame fan-out pinned a worker thread at ~100% CPU and starved the shared asyncio event loop that also serves the dashboard's /api/status liveness probe, flapping the instance to UNHEALTHY even though nothing had crashed. Add a process-global threading.BoundedSemaphore that bounds how many vision analyses run concurrently across the whole process, held across the entire analysis (image load + encode + LLM call) in the single _handle_vision_analyze chokepoint (covers both the native fast path and the legacy aux-LLM path). It is a threading semaphore, NOT asyncio: each vision call is dispatched through model_tools._run_async on a per-thread event loop, so an asyncio primitive bound to one loop cannot coordinate across them. The acquire is offloaded via run_in_executor so waiting for a slot never blocks the calling loop. Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency, or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can never be disabled into an unbounded fan-out. Tests: bounded-fan-out regression guard + a control proving it would fail without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu host, env override, and sub-1 rejection. Pre-existing handler tests updated for the now-async _handle_vision_analyze. Verified via the real registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls, peak bounded to cap).	2026-06-29 01:27:10 -07:00
kaishi00	08d6195bc4	fix(camofox): auto-recover from stale tab 404 on navigate When a Camofox browser tab is garbage collected (idle timeout, browser recycle), the held tab_id becomes stale. The next browser_navigate call hits /tabs/{stale_id}/navigate -> HTTP 404 -> unhandled HTTPError. Catch the 404 in camofox_navigate, clear the stale tab_id, and create a fresh tab via _ensure_tab. The agent recovers transparently without requiring a session restart. Other tab operations (snapshot, click, type, etc.) use the same pattern but only fail if the tab dies between successful calls — much rarer. The navigate fix covers 95%+ of cases since navigate is always the entry point.	2026-06-29 01:26:24 -07:00
liuhao1024	fe38d50833	fix(tools): read browser.command_timeout in Camofox HTTP client The Camofox browser backend hardcoded a 30s HTTP timeout via _DEFAULT_TIMEOUT, ignoring the user's browser.command_timeout config. The main browser_tool path already reads this config via _get_command_timeout(). This commit adds an equivalent _get_command_timeout() to browser_camofox.py that reads browser.command_timeout from config with caching, and switches all HTTP helper methods (_post, _get, _get_raw, _delete) to use it as the default timeout. Fixes #40843	2026-06-29 01:26:24 -07:00
刘昊	babd9168ba	fix(browser): send Authorization header in Camofox HTTP calls when CAMOFOX_API_KEY is set The five HTTP call sites in browser_camofox.py (_ensure_tab, _post, _get, _get_raw, _delete) did not include Authorization headers, causing 403 Forbidden when the Camofox server has API key auth enabled. Added _auth_headers() helper and wired it into all five call sites. The health check endpoint (/health) is left without auth since it is a connectivity probe, not a browser operation. Regression test covers: header present when key set, absent when unset, blank key produces empty headers. Fixes #20476	2026-06-29 01:26:24 -07:00
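A helper in the spirit of `_auth_headers()` could be sketched as below. The commit does not state the header scheme, so the `Bearer` prefix here is an assumption, as is the `env` parameter (added only to make the sketch testable without touching the real environment).

```python
import os


def auth_headers(env=None) -> dict:
    """Return an Authorization header only when CAMOFOX_API_KEY is set.

    Assumption: the 'Bearer' scheme is illustrative, not confirmed by the source.
    """
    env = os.environ if env is None else env
    key = (env.get("CAMOFOX_API_KEY") or "").strip()
    # A missing or blank key yields no headers at all.
    return {"Authorization": f"Bearer {key}"} if key else {}
```

Returning an empty dict for a blank key mirrors the regression cases listed in the commit: header present when the key is set, absent when unset, empty for a blank key.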
liuhao1024	270456308c	fix(tools): send listItemId instead of sessionKey in Camofox tab creation The Camoufox REST API server expects `listItemId` in the `POST /tabs` body, but `_ensure_tab` was sending `sessionKey`. This caused a 400 Bad Request on every `browser_navigate` call. The parameter name mismatch is visible in the same file: line 283 already reads `tab.get("listItemId")` when adopting existing tabs, confirming the server-side field name. Fixes #37960	2026-06-29 01:26:24 -07:00
Ben Barclay	1289f12812	fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight The supermemory and mem0 memory providers shipped third-party SDKs (supermemory / mem0ai) that are not core dependencies, but — unlike the honcho and hindsight providers — they imported those SDKs directly with no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist entry. On the published Docker image the agent venv is sealed (HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight route through ensure() and install fine there; supermemory/mem0 never called it, so their SDK was never installed on a hosted instance and the provider silently reported itself unavailable even with the API key set. Fixes: - Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist (tools/lazy_deps.py), pinned to current PyPI releases. - Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint (_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend), mirroring honcho's wrapped try/except shape. - Drop the SDK-import gate from supermemory's is_available() — it was a chicken-and-egg trap (provider never loaded on a sealed venv, so ensure() never ran). Now key-presence only, like honcho/mem0. - Add matching pyproject extras [supermemory]/[mem0]; update the lazy-covered-extras contract test (excluded from [all] by policy). Tests prove each path fails without the fix and the real sealed-venv durable-target gate accepts both features.	2026-06-29 00:25:36 -07:00
Ben Barclay	8fe800ee1a	fix(file-tools): sanitize host/relative cwd override before it reaches container sandbox (#54447) (#54616) (cherry picked from commit `82132f7911`) Co-authored-by: Tranquil-Flow <66773372+Tranquil-Flow@users.noreply.github.com>	2026-06-29 15:32:20 +10:00
Ruzzgar	313a8c6833	fix(skills): replace string prefix check with strict path containment	2026-06-28 21:14:01 -07:00
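The commit above gives no detail, so the following is only a generic illustration of why a string prefix check is weaker than path containment. The helper name `is_within` and the example paths are hypothetical.

```python
from pathlib import Path


def is_within(base: str, candidate: str) -> bool:
    """True only if candidate resolves to a location inside base."""
    base_path = Path(base).resolve()
    target = Path(candidate).resolve()
    try:
        # relative_to raises ValueError when target is not under base_path.
        target.relative_to(base_path)
    except ValueError:
        return False
    return True
```

A prefix check such as `"/srv/skills-evil/x".startswith("/srv/skills")` passes even though the path is a sibling directory, and it does not normalize `..` segments; resolving both paths first and comparing path components avoids both problems.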
Brooklyn Nicholson	ae465e9fb8	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-multiterminal	2026-06-28 21:37:52 -05:00
Brooklyn Nicholson	e117cfdff0	feat(desktop): live agent terminals + agent-driven tab close Make the read-only agent terminal mirrors stream in real time and give the agent a desktop-only way to dismiss its own tabs. - Stream background output live: the local reader used a blocking read(4096) that buffered small periodic output until EOF, so agent tabs only "filled in" at process exit. Switch to buffer.read1(4096) (decoded) for incremental chunks. - Route agent.terminal.output / terminal.close to the window that owns the process (its gateway session) instead of an empty session id, so events actually reach the desktop renderer. - Add close_terminal: a HERMES_DESKTOP-gated tool (sibling of read_terminal) that drops a process's read-only tab WITHOUT killing it via process_registry.on_close; output keeps buffering and the user can reopen from the status stack. - ⌘W now closes a focused agent tab: mark the agent instance data-terminal and focus it on activation so isFocusWithin routes there. - ensureTerminal() no longer spawns an extra user shell when a tab already exists (e.g. opening a background task from the status stack).	2026-06-28 21:15:14 -05:00
LIC99	dda3268d09	fix(approvals): warn and default to manual on unknown approvals.mode _normalize_approval_mode() previously accepted any string, so an unknown value like 'auto' fell through every downstream mode check (off/smart) and silently behaved like manual with no signal. Validate against the known modes (manual/smart/off), emit a warning for anything else, and default to manual to match the config default and the rest of the function. Bug 1 from the original PR (/approve & /deny bypassing the running-agent guard) already landed on main independently, so only the mode-validation fix is salvaged here. Fixes #4261 Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-28 19:04:18 -07:00
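The validation described above can be sketched as follows. The function name and the warning text are illustrative; only the three known modes and the fallback to manual come from the commit message.

```python
import logging

logger = logging.getLogger(__name__)

KNOWN_APPROVAL_MODES = ("manual", "smart", "off")


def normalize_approval_mode(value) -> str:
    """Return a known mode, warning and falling back to 'manual' otherwise."""
    mode = str(value).strip().lower() if value is not None else ""
    if mode in KNOWN_APPROVAL_MODES:
        return mode
    if mode:
        # Unknown values such as 'auto' are reported instead of silently
        # behaving like manual.
        logger.warning("Unknown approvals.mode %r; defaulting to 'manual'", value)
    return "manual"
```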
aaronagent	5c1ac6c70d	fix(config): strip `export` prefix in .env parsers across three modules All three .env parsers use `line.partition("=")` without stripping the bash-compatible `export ` prefix first. A line like `export API_KEY=sk-...` produces key `"export API_KEY"` instead of `"API_KEY"`, silently ignoring the variable and causing auth failures for users who copy-paste from bash profiles or follow tutorials that include `export`. - tools/skills_tool.py: `load_env()` for skill environment - hermes_cli/config.py: `load_env()` for core config - hermes_cli/main.py: `_has_any_provider_configured()` inline parser Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-06-28 18:53:00 -07:00
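A minimal sketch of the parsing fix, assuming a hypothetical `parse_env_line` helper rather than the three actual parsers named in the commit:

```python
def parse_env_line(line: str):
    """Parse one .env line into (key, value), or None if it is not an assignment."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None

    # Drop a leading bash-style `export` keyword before splitting on "=".
    parts = stripped.split(None, 1)
    if len(parts) == 2 and parts[0] == "export":
        stripped = parts[1]

    key, sep, value = stripped.partition("=")
    if not sep:
        return None
    return key.strip(), value.strip()
```

Without the prefix handling, `export API_KEY=sk-...` would yield the key `"export API_KEY"`, which never matches the variable name the caller looks up.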
Teknium	9860d93f2a	fix(terminal): require approval for host-bound Docker commands (#54483) * fix(terminal): require approval for host-bound Docker commands The Docker terminal backend blanket-skips dangerous-command approval on the assumption that the container is isolated from the host. That holds only when nothing is bind-mounted in. Once a host path is exposed (via TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE or a host-path entry in TERMINAL_DOCKER_VOLUMES), a command like `rm -rf /workspace` reaches real host files but is still auto-approved. Detect host bind mounts and route those sessions through the normal approval flow. Isolated Docker keeps the fast path. The same gating is applied to the execute_code guard, which had the identical blanket skip. Co-authored-by: Hermes Agent <agent@nousresearch.com> * chore: add AUTHOR_MAP entry for PR #6436 salvage (Kolektori) * test: accept has_host_access kwarg in _check_all_guards mocks The host-bound Docker approval fix adds a has_host_access kwarg to the _check_all_guards wrapper. Six pre-existing tests monkeypatch it with a fixed (command, env_type) / (cmd, env) lambda signature, which now raises TypeError when terminal_tool passes the new kwarg. Widen those mock signatures to accept **kwargs. --------- Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com> Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-29 11:35:41 +10:00
Ben Barclay	7cfa2fa13f	fix(docker): gate resource limit flags on cgroup controller availability (#54516) On hosts where the cgroup v2 cpu/memory/pids controllers are not delegated to the docker/podman process (unprivileged Proxmox LXCs, some rootless and nested setups), --pids-limit/--cpus/--memory cause every container start to fail with OCI runtime error / exit 126, breaking terminal + execute_code. - Add _cgroup_limits_available(image): one-shot, host-wide cached probe that spawns a throwaway container from the sandbox image itself (sleep 0) with all three flags together, mirroring the existing _storage_opt_supported probe-and-degrade pattern. - Remove --pids-limit from static _BASE_SECURITY_ARGS; apply it (default 256 via _DEFAULT_PIDS_LIMIT) in resource_args gated on the probe. - Gate --cpus and --memory on the same probe. Behavior unchanged on cgroup-capable hosts; graceful degradation with a one-time warning where controllers aren't delegated. Fixes #6568. (cherry picked from commit `c933880b7e`) Co-authored-by: angelos <angelos@oikos.lan.home.malaiwah.com>	2026-06-29 11:01:08 +10:00
Brooklyn Nicholson	520212cc59	feat(desktop): stream agent terminal output live instead of polling Replace the 5s output_tail poll (which often showed nothing) with a real push stream. The process registry gains an on_output sink called from its reader threads with each chunk; the tui_gateway wires it to emit agent.terminal.output {process_id, chunk} (write_json is _stdout_lock-guarded, so emitting from the reader thread is safe). The desktop routes chunks by process id straight into the read-only agent xterm via a small writer registry, with a capped backlog so a tab opened mid-stream (or reopened) replays what it missed. Drops the fragile poll/tail path: no session-key matching, no truncation, no lag — full-fidelity ANSI, env-agnostic (local/docker/ssh).	2026-06-28 19:33:43 -05:00
Brooklyn Nicholson	cb1bb1a48d	refactor(windows): unify windowless spawn form across the touched sites windows_hide_flags() already returns 0 on POSIX (and creationflags=0 is the no-op default there, exactly how server.py::_list_repo_files does it), so drop the IS_WINDOWS import + ternary/one-use-dict gating and just pass creationflags=windows_hide_flags() directly. Tests lose the now-pointless IS_WINDOWS monkeypatch.	2026-06-28 17:44:47 -05:00
Brooklyn Nicholson	32087e4bc9	fix(windows): hide console flash on checkpoint git + skills_hub gh probes The #54236/#54417 backend git/gh sweep routed git_probe, the repo-file picker, coding_context, context_references, copilot_auth, and the gateway process scans through CREATE_NO_WINDOW, but two sibling spawn legs that also run inside the console-less desktop/gateway backend were missed: - tools/checkpoint_manager.py `_run_git` (and the one-shot `git init --bare` in `_init_store`) — when checkpoints are enabled, every file-mutating turn fires multiple bare `git` calls (status, add, write-tree/commit-tree, update-ref). Spawned from a parent with no console (Electron spawns the backend with windowsHide → CREATE_NO_WINDOW), each one allocates its own conhost window → a flurry of terminal popups. - tools/skills_hub.py `GitHubAuth._try_gh_cli` — `gh auth token`, the same bug class as the already-fixed copilot_auth gh probe. Route both through `windows_hide_flags()` (no-op on POSIX), matching the established per-site pattern. Tests added to tests/test_windows_subprocess_no_window_flags.py.	2026-06-28 17:41:47 -05:00
Teknium	e5d22ab80d	fix(daytona): quote single-upload mkdir parent path (#54440) * fix(daytona): quote single-upload mkdir parent path The single-file _daytona_upload() path shelled out 'mkdir -p {parent}' with the remote parent interpolated unquoted, so shell metacharacters in the path could break the command or inject arbitrary commands into the sandbox. The bulk-upload, bulk-download, and delete paths were already hardened with shlex-quoting helpers; this single-upload path was missed. Route it through the existing quoted_mkdir_command() helper and add a regression test covering a path with shell metacharacters. Reported by @Gutslabs (#3960); the original branch predated the file_sync refactor, so the fix is re-applied to the current code path. * docs(infographic): daytona quote-sync fix	2026-06-28 14:33:03 -07:00
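The quoting fix can be illustrated with the standard library's `shlex.quote`. The helper below, `build_mkdir_command`, is a hypothetical stand-in; the signature of the repository's `quoted_mkdir_command()` is not shown in the commit.

```python
import posixpath
import shlex


def build_mkdir_command(remote_path: str) -> str:
    """Build `mkdir -p <parent>` with the parent directory shell-quoted."""
    parent = posixpath.dirname(remote_path)
    # shlex.quote wraps the path in single quotes when it contains shell
    # metacharacters, so it is passed to mkdir as one literal argument.
    return f"mkdir -p {shlex.quote(parent)}"
```

With an unquoted interpolation, a parent such as `/work/a b; rm -rf ~` would be split by the shell into a `mkdir` call followed by a second command; quoted, it stays a single directory name.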
srojk34	61210097a5	fix(browser): extend private-network guard to browser_get_images The SSRF cluster (`7a6fe9bb`, `48f5c425`, `7ef04ae7`) sealed browser_snapshot, browser_vision, and _browser_eval against eval-navigated private pages, but browser_get_images bypasses _browser_eval and calls _run_browser_command("eval", ...) directly. An eval-driven navigation to a private address followed by browser_get_images would leak image src URLs and alt text from the private page. Add the same _eval_ssrf_guard_active + _current_page_private_url recheck before returning image data, matching the pattern established by the sibling guards. 5 new tests cover: block on private page, allow on public page, skip for local backend, skip when private URLs allowed, no guard needed on failed eval.	2026-06-28 14:25:10 -07:00
Teknium	9a0010fd46	fix(windows): cover remaining console-flash spawn legs (#54417)	2026-06-28 13:49:08 -07:00
Brooklyn Nicholson	70292596ef	feat(browser): auto-install Chromium binary on local cold-start failure When a local browser_navigate (or any browser command) fails fast because Chromium isn't on disk, attempt a one-shot binary download via `agent-browser install` and retry instead of only printing a hint. Scope is narrow on purpose: - binary only, never `--with-deps` (that shells apt/needs root, so missing system libraries stay a user action) - gated by `security.allow_lazy_installs` (same opt-out as every lazy install) - skipped in Docker (Chromium ships in the image) - attempted once per process Follow-up to #54353, which made the cold-start failure legible; this closes the "doesn't actually install the missing browser" gap for the common case.	2026-06-28 12:25:15 -05:00
Brooklyn Nicholson	1ab5c3cdda	refactor(browser): drop redundant sandbox-hint substring check	2026-06-28 12:14:47 -05:00
infinitycrew39	a10727a555	fix(browser): extend first-open timeout and surface daemon errors Local browser_navigate cold-starts the agent-browser daemon and Chromium; 60s was too short on slow Linux hosts and timeouts discarded stderr, leaving users with a generic failure. Use a 120s floor on first open, inject --no-sandbox in Docker, include captured daemon output plus install hints when commands time out, and show "Failed to open" in the desktop tool chip when navigation returns success=false.	2026-06-28 12:14:21 -05:00
Brooklyn Nicholson	eeca59f489	fix(windows): hide remaining backend console-flash legs missed on main main (`cb982ad99`) wired windows_hide_flags() into the auxiliary git/gh/wmic/bash/powershell/taskkill legs but left two it didn't reach, plus the Electron backend-launch leg it explicitly deferred. Cover them the same way: - apps/desktop/electron/main.cjs: getNoConsoleVenvPython resolves the BASE pythonw.exe instead of the venv Scripts\pythonw.exe shim, which re-execs a console python.exe and flashes a conhost the desktop backend can't suppress. Both backend creators put the venv site-packages on PYTHONPATH so imports still resolve under the base interpreter. (main's commit said this Electron leg "needs a Windows-tested change of its own".) - tools/tts_tool.py, tools/transcription_tools.py, plugins/platforms/discord: ffmpeg conversions (voice notes / TTS / STT) via windows_hide_flags(). - plugins/platforms/whatsapp: netstat + taskkill bridge-port cleanup via windows_hide_flags(). All no-ops on POSIX. Tests assert the base-pythonw preference and the ffmpeg legs pass CREATE_NO_WINDOW.	2026-06-28 10:19:21 -05:00
homelab-ha-agent	d05cc8f4d6	fix(mcp): skip preflight content-type probe for OAuth servers OAuth-protected MCP servers (e.g. Hospitable) return 200 text/html on an unauthenticated HEAD probe — a login/landing page the server cannot substitute for a real MCP response without a Bearer token. The preflight cannot distinguish this from a misconfigured URL, so it raises NonMcpEndpointError before the OAuth browser flow has a chance to run. Add `and self._auth_type != "oauth"` to the preflight condition in MCPServerTask.run(). The probe is inapplicable to OAuth servers: their URL legitimacy is established by .well-known/oauth-protected-resource during the OAuth handshake, not by a GET content-type check. Concrete repro: Hospitable (https://mcp.hospitable.com/mcp) returns `200 text/html` to an unauthenticated httpx HEAD. Without the guard: ✗ NonMcpEndpointError at `hermes mcp test` With the guard: ✓ Connected (1487ms) — 63 tools discovered Relation to open PRs: - #37598 adds a POST probe fallback for POST-only non-OAuth servers (e.g. DocuSeal), but only passes when POST returns 2xx + MCP content-type. Hospitable returns 401 on the POST probe (Bearer challenge), so #37598 does not cover this case. - #49463 extends the POST probe to also pass on non-2xx auth challenges (making it OAuth-aware), but is labeled duplicate of #37598 and may not land independently. This fix is complementary: it handles OAuth servers with zero extra round-trips rather than adding a POST probe step. Tests: - test_oauth_server_html_response_raises_without_skip: documents that _preflight_content_type raises NonMcpEndpointError for 200 text/html (the underlying issue), with an OAuth-server docstring. - test_run_skips_preflight_for_oauth: verifies that run() does NOT invoke _preflight_content_type when auth_type=="oauth", using class-level monkeypatching so the gate is exercised without a live MCP transport. 23 passed tests/tools/test_mcp_preflight_content_type.py	2026-06-28 04:47:39 -07:00
kshitijk4poor	de928bccde	fix(redact): non-reusable sentinel for prefix secrets in file reads (#35519) When security.redact_secrets is on (default), read_file/search_files/cat applied redact_sensitive_text(code_file=True) to file content, which still ran prefix masking. An API key in config.yaml (ghp_..., sk-..., xai-..., etc.) came back as a head/tail mask like `ghp_S1...Pn2T` — a plausible-looking truncated key. When an agent read that and wrote it back to config, the masked value replaced the real credential, silently breaking auth (401). Production evidence: a config.yaml found containing the exact 13-char masked GitHub PAT. The two community PRs (#35529, #35534) fixed the corruption by NOT redacting prefixes for config reads — but that exposes the user's real keys to the agent context, model, and logs (a security regression). This takes the safer route: keep redacting, but for file content emit a NON-REUSABLE sentinel. - New `_mask_token_nonreusable`: prefix secrets -> `«redacted:ghp_…»` (vendor label preserved for debuggability; zero secret bytes; angle-bracket/ellipsis wrapper is syntactically invalid as a token so it can't be mistaken for or written back as a usable key). - New `redact_sensitive_text(file_read=True)` routes prefix matches through it (implies code_file=True). Default/log/display mode is UNCHANGED — `_mask_token` still keeps head/tail (fine for logs, never written back). - Wired the 3 file_tools.py call sites (read_file / search_files / cat) to file_read=True. Fixes both the corruption AND avoids the secret-exposure of the un-redact approach. 6 new tests (sentinel shape, no-leak, not-a-plausible-key, default mode unchanged, file_read implies code_file, sk- prefix); 88 redact tests pass; mutation-verified (reverting to the old mask fails the sentinel/leak tests). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com> Co-authored-by: adammatski1972 <289282750+adammatski1972@users.noreply.github.com> Closes #35519. Supersedes #35529, #35534.	2026-06-28 04:13:20 -07:00
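The non-reusable sentinel from the redaction commit above can be sketched as follows. The regular expression and the function name `mask_nonreusable` are simplified assumptions; only the `«redacted:ghp_…»` output shape and the three example prefixes come from the commit message.

```python
import re

# Simplified, hypothetical pattern covering three of the prefixes named above.
_PREFIX_SECRET_RE = re.compile(r"\b(ghp_|sk-|xai-)[A-Za-z0-9_\-]{10,}")


def mask_nonreusable(text: str) -> str:
    """Replace prefix secrets with a sentinel that keeps only the vendor label."""
    return _PREFIX_SECRET_RE.sub(
        # No secret bytes survive; the wrapper is not a valid token shape.
        lambda m: f"«redacted:{m.group(1)}…»",
        text,
    )
```

Unlike a head/tail mask, the sentinel cannot be mistaken for a truncated key, so writing the redacted text back to a config file fails visibly instead of silently replacing the credential.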
tymrtn	d7f655f370	fix: accept typed clarify choice replies	2026-06-28 04:13:19 -07:00
Teknium	c1c179a239	fix(security): redact secrets in background process + foreground env-dump output (#43025) (#54149) * fix(security): redact secrets in background process + foreground env-dump output Terminal-output redaction was incomplete (#43025): - Gap 1: process(action=poll/log/wait) returned background stdout verbatim — no redaction at all. A background printenv/server/test emitting a key leaked raw to the model, session.db, and CLI display. Same for the gateway background-process watcher's completion/progress notifications. - Gap 2: the foreground terminal path hardcoded code_file=True, which skips the ENV-assignment pass, so an opaque token (no vendor prefix) from env/printenv leaked even there. Adds agent.redact.redact_terminal_output(output, command) as the single policy for ALL terminal-output surfaces: env-dump commands (env/printenv/set/export/declare) get the ENV-assignment pass (code_file=False) to mask opaque tokens; other commands stay on code_file=True to avoid false positives on source dumps. Wired into terminal_tool, process_registry (_handle_process boundary), and the gateway watcher. Respects security.redact_secrets (no force) — opt-out preserved. * docs: add infographic for #43025 terminal-output redaction fix	2026-06-28 02:44:21 -07:00
teknium1	7ef04ae7a7	fix(browser): close eval return-value SSRF bypass (sibling of #44731) The snapshot/vision guards re-check the page URL before returning content, but browser_console(expression=...) -> _browser_eval returns arbitrary JS results directly, leaving two same-class bypasses open: 1. Direct fetch: fetch('http://127.0.0.1/secret').then(r=>r.text()) reads a private endpoint and returns the body — the page URL stays public so the post-eval recheck never sees it. 2. Navigate-then-read: location.href='http://127.0.0.1/' then a later eval reads document.body.innerText. Guard _browser_eval on the same condition as navigate/snapshot/vision (not local backend, not local sidecar, not allow_private_urls): - pre-scan the expression for private/always-blocked URL literals - re-check window.location.href after the eval at both success-return sites (supervisor fast-path + subprocess fallback) Probe failures fail-open (matching the snapshot/vision guards).	2026-06-28 02:42:01 -07:00
liuhao1024	0ae6196087	fix(browser): allow local sidecar sessions to bypass SSRF guard The private-network guard in browser_snapshot() and browser_vision() blocked all private URLs, including those accessed via local sidecar sessions (hybrid routing). Local sidecar sessions intentionally access private URLs — the cloud provider never sees the URL in that case. Add `_is_local_sidecar_key(effective_task_id)` check to both guards, matching the existing pattern in browser_navigate(). Fixes #45101 review feedback from egilewski.	2026-06-28 02:42:01 -07:00
liuhao1024	48f5c42599	fix(browser): extend private-network guard to browser_vision The SSRF bypass in #44731 was only patched for browser_snapshot(), but browser_vision() exposes the same vulnerability — it takes a screenshot and sends it to the vision model without checking if eval-driven navigation moved the page to a private/internal URL. Add the same current-page URL safety check to browser_vision() before any screenshot is captured, encoded, or forwarded to the vision model. This covers both the normal screenshot path and the Lightpanda Chrome fallback path. 7 new tests: blocks private URL, allows public URL, skips in local backend, skips when private URLs allowed, handles eval failure/empty/exception.	2026-06-28 02:42:01 -07:00
liuhao1024	7a6fe9bbfa	fix(browser): block snapshot from eval-navigated private pages browser_snapshot() now checks the current page URL before returning content. When browser_console() changes location.href to a private or internal address (e.g., http://127.0.0.1:8080/), the snapshot returns an error instead of exposing the private page content. This closes the SSRF bypass where an attacker could: 1. Navigate to a public page 2. Use browser_console to eval location.href = 'http://127.0.0.1:port/' 3. Use browser_snapshot to read the private page content The fix reuses the existing _is_safe_url() and _allow_private_urls() infrastructure, and fails open if the URL check itself fails. Fixes #44731	2026-06-28 02:42:01 -07:00
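The kind of current-page check described above can be approximated with the standard library. This is a simplified sketch, not the repository's `_is_safe_url()`: the name `looks_private` is hypothetical, and it only inspects literal IP addresses and `localhost`, with no DNS resolution.

```python
import ipaddress
from urllib.parse import urlparse


def looks_private(url: str) -> bool:
    """True when the URL's host is localhost or a private/loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False  # nothing to classify in this sketch
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A regular hostname; a fuller check would resolve it first.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

A guard would call a check like this on the page's current URL just before returning snapshot content, so a navigation performed through an eval is caught even though the original navigate call targeted a public page.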
Teknium	9f17f16c66	fix(environments): use $BASHPID for atomic snapshot temp + harden failure path The atomic mv approach (kyssta-exe's commit) narrows but does not close the #38249 race: the temp name used $$ (parent shell PID), which is identical across &-launched concurrent subshells. Two concurrent writers pick the same temp file, clobber each other mid-write, and mv then publishes a torn snapshot — a reader sourcing it absorbs declare-x/export fragments into PATH. - Use $BASHPID (actual per-subshell PID) so concurrent writers never collide. - Chain mv on export success (&&) and rm the temp on failure so a partial dump never replaces a good snapshot; apply the same to the init_session bootstrap. - shlex-quote the static temp-path portion (Windows/spaces), $BASHPID outside. - LocalEnvironment.cleanup sweeps orphaned snap.tmp.* temps. - Regression tests: string-shape + a behavioral concurrent writers/readers test that proves the snapshot never tears (would still tear with $$).	2026-06-28 02:08:57 -07:00
kyssta-exe	6a2958a521	fix(environments): use atomic file replacement for snapshot writes Fix race condition in terminal environment snapshots that could corrupt PATH with declare -x entries. When concurrent terminal calls share the same snapshot file, the non-atomic 'export -p > snapshot.sh' write could be read mid-write by another process, causing partial/corrupted env vars to be sourced and mixed into PATH. The fix uses atomic file replacement: - Write to a temp file: export -p > snapshot.sh.tmp.303651 - Atomically replace: mv -f snapshot.sh.tmp.303651 snapshot.sh On POSIX, mv within the same filesystem is atomic, so source() will either see the old complete snapshot or the new complete one, never a partial/truncated file. Fixes #38249	2026-06-28 02:08:57 -07:00
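The two snapshot commits above (6a2958a521 and 9f17f16c66) fix a shell-level race; the same write-then-rename idea can be sketched in Python. This is an analogue under stated assumptions, not the project's implementation: `write_snapshot_atomically` is a hypothetical helper, and `tempfile.mkstemp` plays the role that `$BASHPID` plays in the shell fix by giving every writer its own temp file.

```python
import contextlib
import os
import tempfile


def write_snapshot_atomically(path: str, text: str) -> None:
    """Write text to a unique temp file, then publish it with an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # A unique temp name per writer means concurrent writers never share a temp file.
    fd, tmp_path = tempfile.mkstemp(prefix="snap.tmp.", dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # Only reached when the write succeeded; readers see the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, remove the partial temp so it never replaces a good snapshot.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is the condition under which the replacement is atomic on POSIX.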
Coy Geek	d7a1052424	fix(env-passthrough): fail closed when provider blocklist import fails When tools.environments.local can't be imported (partial install, import-time error), _is_hermes_provider_credential() returned False — fail-open. A skill could then register a Hermes provider credential (ANTHROPIC_API_KEY, etc.) as env passthrough; _scrub_child_env lets passthrough vars bypass the secret-substring net (rule 1), so the operator's real key would land in the execute_code child. Reopens the GHSA-rhgp-j443-p4rf bypass. Fail closed instead: on import failure, treat the name as a protected provider credential and refuse passthrough. Regression test exercises the full register -> scrub path under a simulated import failure. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-28 02:05:43 -07:00
Teknium	674e16e7c6	fix(redact): stop DB-connstr redaction from corrupting code output (#33801) (#54061) Secret redaction is display/output-scoped on main — write_file writes content verbatim, terminal/execute_code redact only output not the command/source. The real bug is in displayed tool OUTPUT (read_file, terminal, execute_code): _DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a multi-line block it scanned past the DSN line to the next stray '@' (a Python @decorator), replacing every intervening character — including line breaks — with *. That dropped lines and concatenated the next line onto the f-string line, making read_file output look corrupted (the file on disk was always correct). Reported in #33801. Fix: - Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so the match can never span a line break. A real DSN password never contains whitespace. This alone kills the catastrophic line-dropping. - Under code_file=True, preserve a password group that is a pure {...} brace expression — f"postgresql://{user}:{pass}@{host}" is an f-string template, not a live credential. Literal passwords are still masked. - Pass code_file=True at the terminal and execute_code output redaction call sites (file_tools already did) so code-execution output isn't corrupted by ENV/JSON/template false positives. Real prefixes, auth headers, JWTs, and private keys are still redacted. Verified E2E against the reporter's exact pydantic-settings module: file written verbatim, read_file shows the DSN f-string + @model_validator intact with zero * corruption, while a literal postgresql://admin:pw@host DSN and a real sk- key are still masked. Reported-by: koishi70 Reported-by: pfrenssen	2026-06-28 01:15:39 -07:00
teknium1	aacc15b2c9	fix(clarify): raise default clarify_timeout to 3600s (#32762) The 600s default evicted the gateway clarify entry while users were still away (meeting/AFK); a later button tap then landed on a dead entry and the agent hung on 'running: clarify'. Raise the default to 1h in DEFAULT_CONFIG and the get_clarify_timeout() code-level fallback, documenting the running-agent-guard tradeoff. User overrides still win.	2026-06-28 01:07:53 -07:00
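The line-dropping behaviour described in 674e16e7c6 is easy to reproduce with a reduced pattern. The two regexes below are simplified assumptions, not the actual `_DB_CONNSTR_RE`; they differ only in whether the userinfo and password groups may contain whitespace, which is the change the commit names.

```python
import re

# Simplified stand-ins for the real pattern (assumption, not the project's regex).
GREEDY = re.compile(r"(postgresql://[^:]+:)([^@]+)(@)")
LINE_SAFE = re.compile(r"(postgresql://[^:\s]+:)([^@\s]+)(@)")

# A DSN line with no '@' of its own, followed by a decorator that supplies a stray '@'.
SOURCE = (
    'dsn = f"postgresql://{user}:{pw}" + host\n'
    "@model_validator\n"
    "def check(): ...\n"
)


def mask(pattern: re.Pattern, text: str) -> str:
    """Replace every character of the password group with '*'."""
    return pattern.sub(
        lambda m: m.group(1) + "*" * len(m.group(2)) + m.group(3),
        text,
    )


corrupted = mask(GREEDY, SOURCE)   # the password group swallows the newline before '@'
intact = mask(LINE_SAFE, SOURCE)   # no match can span a line break, so nothing changes
literal = mask(LINE_SAFE, "postgresql://admin:pw@host/db")  # a literal password is still masked
```

With the greedy pattern the line break before `@model_validator` becomes a `*`, so two source lines fuse into one; with whitespace excluded the source passes through untouched while a literal `admin:pw@host` DSN is still redacted.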
kshitijk4poor	fc7a01b6cb	test+harden: modernize salvaged Matrix path for current plugin layout Two follow-ups on top of the salvaged #46365 fix: 1. Tests: the salvaged tests injected the ephemeral MatrixAdapter via sys.modules["gateway.platforms.matrix"], but Matrix migrated to a plugin (#41112) and the fallback now imports from plugins.platforms.matrix.adapter. Point the three sys.modules patches at the current module path so the ephemeral-fallback tests actually exercise the injected fake adapter. 2. Harden the live-adapter lookup: split the gateway import guard from the adapter lookup and log (instead of silently swallowing) when a runner exists but adapters.get() raises. A silent fall-through there would re-introduce the per-send reconnect/OTK-exhaustion storm this fix exists to prevent (#46310). Documented that the live adapter is gateway-owned and must not be disconnected, and why the ephemeral finally never touches it.	2026-06-28 12:48:08 +05:30
liuhao1024	a7fd62d824	fix(send_message): reuse live gateway adapter for Matrix media sends When a live gateway adapter is available (i.e. the tool runs inside a running gateway), reuse the persistent connection instead of creating a new MatrixAdapter per call. This eliminates per-message E2EE re-init storms that exhaust recipient OTKs and silently drop messages. The fix follows the same pattern as _send_to_platform (line 618): gateway_runner_ref → runner.adapters[Platform.MATRIX]. Falls back to the ephemeral connect/disconnect cycle for standalone contexts. Also extracts the shared send logic into _send_via_matrix_adapter() to avoid duplicating the media dispatch code between the two paths. Fixes #46310	2026-06-28 12:48:08 +05:30
zccyman	db11849c9d	fix(skills): skip shadowing when external_dirs provides the skill Fixes #28126. sync_skills() was unconditionally writing bundled skills into the local <profile_home>/skills/ tree even when the profile's config.yaml delegated skill resolution to an external directory via skills.external_dirs. The skill loader then saw two candidates for the same name (local shadow + external canonical), refused to resolve on collision, and every worker that auto-loaded such a skill crashed with 'Unknown skill(s): <name>'. Changes: - _build_external_skill_index() indexes skills available in external dirs (by directory name and frontmatter name) - sync_skills() skips writing a bundled skill when it finds the same name in the external index; records the hash in the manifest so subsequent syncs treat it as already handled - Self-healing: removes stale local shadows left by prior buggy syncs (only when origin_hash == bundled_hash == user_hash, i.e. we wrote it and user didn't touch it) - New 'shadowed_by_external' key in sync_skills() return dict 3 new tests in TestExternalDirsIndexing (all passing). All 48 tests in test_skills_sync.py pass. Closes #28126	2026-06-27 21:07:53 -07:00
Teknium	56abbaeac3	fix(curator): fail closed on unverified skill deletes during consolidation (#53935) The curator's LLM consolidation pass could archive whole clusters of active skills with zero verified consolidations (#29912): a bare prune (skill_manage delete with absorbed_into empty/omitted) from the forked review agent was accepted, removing the skill's name from lookup even though counts.consolidated_this_run was 0. - _delete_skill now fails closed during the curator/background-review pass: a delete is only allowed when it declares a verified consolidation (absorbed_into=<umbrella>, umbrella must exist). A prune with no forwarding target is refused; the skill stays active. The deterministic inactivity prune (archive_skill) is unaffected. - A verified consolidation delete during the curator pass now routes through the recoverable archive primitive instead of shutil.rmtree, so a misjudged consolidation can be undone with hermes curator restore. The usage record is kept (state=archived) rather than forgotten. - Foreground, user-directed deletes keep their existing hard-delete semantics.	2026-06-27 20:45:57 -07:00
teknium1	a1ac6baac4	fix(gateway): make bg-process reset TTL configurable + surface session-scoped processes Follow-up to the cherry-picked #29212 (#29177): - Promote the 24h stale-process threshold to config.yaml (session_reset.bg_process_max_age_hours) instead of a hardcoded constant. 0 disables the cutoff (legacy: any live process blocks reset). Wired through GatewayConfig.default_reset_policy in gateway/run.py. - Bug 2: process(action=list) now resolves the gateway session_key from the contextvar and surfaces session-scoped background processes (a forgotten preview server under a different task), flagged session_scoped — so the agent/user can discover and kill the blocker. Previously the task-scoped list returned [] and the blocker was invisible. - Tests: config round-trip for the new field, cross-task list visibility. - Docs: messaging session-reset section.	2026-06-27 20:45:43 -07:00
annguyenNous	33d8b66d5b	fix: stale background processes no longer permanently block session reset Background processes (e.g. http.server preview) that Hermes starts and forgets about previously blocked session idle/daily reset indefinitely. The reset guard in session.py checked has_active_for_session() with no max age — a 3-day-old preview server blocked reset the same as a task started 30 seconds ago. Changes: - Add max_active_age parameter to has_active_for_session() in process_registry.py. Processes older than this threshold are ignored. - Add MAX_ACTIVE_PROCESS_AGE constant (24h / 86400s). - Wire max_active_age into the gateway's session store callback in run.py so stale processes no longer block session lifecycle. - Add debug logging when reset is skipped due to active processes. - Add 3 tests covering recent, stale, and legacy (None) max age. Fixes #29177	2026-06-27 20:45:43 -07:00
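A minimal sketch of the age cutoff that 33d8b66d5b adds to the reset guard. Only the names `has_active_for_session` and `max_active_age` come from the commit message; the `TrackedProcess` record, the argument list, and the `now` parameter are assumptions made so the example is self-contained.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_ACTIVE_PROCESS_AGE = 86400  # 24h in seconds, the threshold the commit names


@dataclass
class TrackedProcess:  # hypothetical record; the real registry entry may differ
    session_key: str
    started_at: float
    running: bool = True


def has_active_for_session(
    processes: Iterable[TrackedProcess],
    session_key: str,
    max_active_age: Optional[float] = None,
    now: Optional[float] = None,
) -> bool:
    """Return True if the session has a running process young enough to block a reset."""
    current = time.time() if now is None else now
    for proc in processes:
        if proc.session_key != session_key or not proc.running:
            continue
        if max_active_age is not None and current - proc.started_at > max_active_age:
            continue  # stale process: ignored by the reset guard
        return True
    return False
```

Passing `max_active_age=None` keeps the legacy behaviour in which any live process blocks the reset, which matches the third test case the commit lists.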
teknium1	8c8967a50b	fix: defer hermes_subprocess_env import in browser_tool The module-level import broke tests/tools/test_managed_browserbase_and_modal.py, which loads browser_tool.py via spec_from_file_location against a stubbed 'tools' package that does not include tools.environments.local. Move the import into a _build_browser_env() helper called at the two agent-browser spawn sites, matching the lazy-import pattern already used by lazy_deps.py.	2026-06-27 20:45:31 -07:00
teknium1	9c6229ce24	fix(security): centralize credential-safe subprocess env (#29157) Subprocesses spawned outside the terminal/execute_code path (agent-browser, copilot ACP, dep-ensure, lazy_deps uv install, TUI Node host, cli.exec) inherited the operator's full credential environment via os.environ.copy(). The terminal path was already scrubbed by _HERMES_PROVIDER_ENV_BLOCKLIST (#1002/#1264/#32314); these spawn sites bypassed it. Adds hermes_subprocess_env(inherit_credentials=) in tools/environments/local.py reusing the existing dynamic blocklist as the single source of truth: - Tier 1 (_ALWAYS_STRIP_KEYS): gateway bot tokens, GitHub auth, infra secrets -- stripped even for credential-inheriting children. - Tier 2 (_HERMES_PROVIDER_ENV_BLOCKLIST): provider/tool keys -- stripped unless inherit_credentials=True. The opt-in is grep-able for audit. Browser worker keeps a _BROWSER_PASSTHROUGH_KEYS allowlist (BROWSERBASE/ FIRECRAWL) re-added after the strip. Model-driving children (ACP, TUI Node host, cli.exec) use inherit_credentials=True so they still get provider keys while losing Tier-1 secrets. Installers (dep-ensure, lazy_deps) inherit nothing sensitive. cua_backend already routed through _sanitize_subprocess_env on main -- left as-is. Gateway adapter utility spawns (gh pr comment, ffmpeg) are left inheriting env: gh needs GH_TOKEN by design, ffmpeg is a trusted system binary -- no untrusted-dependency exposure. This is defense-in-depth (personal-assistant trust model: same-user spawns), making the existing scrub policy uniform across the spawn surface; the main real payoff is shrinking the blast radius if a transitive npm dep in agent-browser is compromised. Reconstructed on current main from the design in #31959 (Tranquil-Flow); also credits #39003 (rodboev), #37843 (coygeek), #35769 (egilewski). 
Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com> Co-authored-by: rodboev <rod.boev@gmail.com> Co-authored-by: egilewski <egilewski@egilewski.com>	2026-06-27 20:45:31 -07:00
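The two-tier scrub described in 9c6229ce24 might look roughly like this. It is a sketch under assumptions: `subprocess_env` is a hypothetical stand-in for `hermes_subprocess_env`, the key sets are illustrative examples rather than the real `_ALWAYS_STRIP_KEYS` / `_HERMES_PROVIDER_ENV_BLOCKLIST` contents, and the `passthrough` parameter models the browser allowlist in simplified form.

```python
import os
from typing import Dict, FrozenSet, Mapping, Optional

# Illustrative key sets only (assumptions, not the project's actual lists).
ALWAYS_STRIP: FrozenSet[str] = frozenset({"GH_TOKEN", "TELEGRAM_BOT_TOKEN"})
PROVIDER_BLOCKLIST: FrozenSet[str] = frozenset(
    {"ANTHROPIC_API_KEY", "OPENAI_API_KEY", "FIRECRAWL_API_KEY"}
)


def subprocess_env(
    base: Optional[Mapping[str, str]] = None,
    inherit_credentials: bool = False,
    passthrough: FrozenSet[str] = frozenset(),
) -> Dict[str, str]:
    """Build a child environment with Tier 1 always removed and Tier 2 removed by default."""
    source = os.environ if base is None else base
    env: Dict[str, str] = {}
    for key, value in source.items():
        if key in ALWAYS_STRIP:
            continue  # Tier 1: stripped even for credential-inheriting children
        if key in PROVIDER_BLOCKLIST and not inherit_credentials and key not in passthrough:
            continue  # Tier 2: stripped unless explicitly opted in or allowlisted
        env[key] = value
    return env
```

A model-driving child would call this with `inherit_credentials=True`, an installer with the defaults, and a browser worker with a small `passthrough` set, so each opt-in is a visible argument at the spawn site.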

1914 commits