hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-26 17:38:36 +00:00

Author	SHA1	Message	Date
teknium1	189ffe7362	test: port voice-reply suffix assertions, fix change-detector cap test, add AUTHOR_MAP entry - Add output_path suffix assertions (.ogg Telegram / .mp3 non-Telegram) to _send_voice_reply tests, covering the OGG voice-note path that landed on main in `ae82eed2b` (the PR's third commit was redundant with it). - Convert test_gemini_default_is_32000 back to an invariant against PROVIDER_MAX_TEXT_LENGTH instead of a hardcoded literal. - Map barronlroth@gmail.com -> barronlroth in scripts/release.py.	2026-06-10 02:57:39 -07:00
Barron Roth	2c19208224	feat(tts): add Gemini audio tag rewrite	2026-06-10 02:57:39 -07:00
Barron Roth	5718811de0	feat(tts): add Gemini persona prompt file	2026-06-10 02:57:39 -07:00
Teknium	af3c8b80b5	fix(tests): close pid-file read race in test_grandchild_reaped_via_pgroup (#43447) The grandchild wrote its pid with open('w').write(...), so the polling reader in the test could observe the file after creation but before the write flushed, parsing '' -> ValueError: invalid literal for int(). Write to a temp file and os.replace() it into place so the pid file only ever appears fully written.	2026-06-10 02:57:27 -07:00
Teknium	70d5d7e39b	fix(memory,skills): repair write-approval inline prompt, gateway staging, and gateway /skills review (#43452) Follow-ups to #38199/#43354 found in post-merge review: - Inline CLI memory approval never worked: the per-thread approval callback was not passed to prompt_dangerous_approval, so the prompt_toolkit fail-closed guard (#15216) denied every gated foreground write without showing a prompt. Now invokes the registered callback directly; a crashed prompt falls back to staging instead of a silent deny. - Gateway sessions claimed inline support but prompt_dangerous_approval has no gateway round-trip (that lives in the pending-approval queue), so gated gateway memory writes hit the input() fallback and denied. Gateway contexts now stage for /memory pending review. - /skills pending|approve|reject|diff|approval now works on the gateway (gateway_config_gate on skills.write_approval), so skills staged from a messaging session can be reviewed there. Diff output truncated for chat. - memory_tool validates required params before the gate so invalid writes are rejected immediately instead of staged and failing at approve time. - Stale tri-state write_mode docstrings updated to the boolean gate; docs table corrected (inline prompt is interactive-CLI-only). - 6 new tests covering the interactive approve/deny/error paths, gateway staging, skills never-prompt invariant, and pre-gate validation.	2026-06-10 02:57:15 -07:00
Evi Nova	5d8c44a393	fix(docker): pre-install matrix deps in Docker image (#30399) (#42413) The Matrix gateway requires mautrix[encryption] which pulls in python-olm. While python-olm was removed from [all] due to missing Windows/macOS wheels, it has binary manylinux wheels for Linux amd64/arm64. The Docker image only runs on Linux, so adding --extra matrix to the uv sync line is safe. libolm-dev is already in the apt-get install line for runtime linking. Fixes: #30399	2026-06-10 19:23:06 +10:00
Teknium	eee1da45f0	fix(skills): bound ClawHub catalog walk to requested page on cold start (#43395) Browse renders one page but the cold-cache fallback walked the entire 50k+ ClawHub catalog, then sliced off the first N — pure waste behind the 12s budget band-aid. _load_catalog_index now takes max_items: browse's empty-query path bounds the walk to its limit and stops early; the offline index builder still passes limit=0 (unbounded) and walks to exhaustion. A bounded walk is partial, so it is not written to the shared full-catalog cache (same poison-guard as the budget-truncated case).	2026-06-10 01:01:53 -07:00
tomekpanek	383d44bc9a	fix(web): rank explicit credentials above managed-gateway probe Backend selection ordered firecrawl (including the Nous-managed-tool-gateway probe) ahead of explicit-credential backends, so a user who had both a Nous OAuth token AND a TAVILY_API_KEY (or EXA/PARALLEL key) got firecrawl auto-selected — then the request failed at runtime because the free Nous tier does not include web search, and there is no fallback to the next available backend. Explicit user setup lost to a managed convenience. Reorder so direct-credential backends (tavily > exa > parallel > firecrawl- direct) are tried first, then the managed-gateway firecrawl probe, then free-tier fallbacks. Behaviour for users with only Nous OAuth (no explicit key) is unchanged — firecrawl-via-gateway is still selected. Behaviour change to flag: a user with BOTH a Nous OAuth token AND a TAVILY_API_KEY (or EXA/PARALLEL key) now gets the explicit backend instead of the managed gateway. This matches the principle of least surprise — a user does not set TAVILY_API_KEY without intent — and sidesteps the silent runtime failure of the gateway path on free tiers.	2026-06-10 00:34:38 -07:00
briandevans	105625d650	fix(skills): honour overall_timeout and bound ClawHub catalog walk parallel_search_sources accepted an overall_timeout but never honoured it. The ThreadPoolExecutor ran inside a `with ... as pool` block, whose __exit__ calls shutdown(wait=True); even after as_completed() raised TimeoutError on schedule, leaving the block blocked the caller until every worker finished. A single slow source (e.g. ClawHub) therefore stalled the entire browse for minutes. Manage the executor manually and shut it down with wait=False, cancel_futures=True in a finally, so the timeout actually returns and not-yet-started work is dropped. ClawHubSource._load_catalog_index walked up to 750 sequential pages with no wall-clock bound (each request under its own timeout=30, so nothing errored), and wrote the result to the index cache unconditionally — so an interrupted or slow walk poisoned the cache with a partial catalog. Add a CATALOG_WALK_BUDGET_SECONDS deadline that breaks the walk early, and only write the cache when the walk reaches a natural stop (cursor exhausted or page cap), never on a budget-truncated walk. Adds regression tests covering both bugs (timeout honoured + slow source flagged; budget abort does not poison cache) plus their happy-path invariants.	2026-06-09 23:22:54 -07:00
Ondrej Drapalik	1c055a4c58	fix(xai): accept Grok Build code during loopback wait + tiny screenshot guard xAI's consent page renders the authorization code in-page instead of redirecting to the loopback callback, so the listener just hangs and the manual-paste flow demands a callback URL that never contains the token. - auth.py: poll stdin non-blockingly while waiting for the xAI loopback callback; accept a pasted bare Grok Build code and substitute the locally generated state (PKCE code_verifier still binds the exchange). No need to wait for timeout or re-run with --manual-paste. - computer_use: parse PNG/JPEG dimensions from base64 and fall back to the text/AX/SOM payload when the screenshot is below the provider minimum (8x8), which xAI rejects with HTTP 400. - model_setup_flows.py: xAI credential reuse prompt uses the standard radio picker via a shared _prompt_auth_credentials_choice helper. - main.py: thread a title through _prompt_provider_choice; re-home the helper import (flows live in model_setup_flows.py post-decomposition). Salvaged from #36781 onto current main (contributor's main.py edits re-homed to model_setup_flows.py, where the flows were extracted since the PR opened).	2026-06-09 23:21:24 -07:00
Teknium	095f526b11	refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354) The shipped tri-state write_mode (on|off|approve) conflated two concepts — whether writes are enabled and whether they're gated — so 'on' (writes flow freely, gate inactive) read like 'gating is on'. Replace it with a single clear boolean gate that defaults off. memory.write_approval / skills.write_approval: false (default) — write freely; the approval gate is off (pre-gate behaviour) true — require approval: memory foreground prompts inline, memory background-review + all skill writes stage for review The old 'off = block all writes' mode is dropped; memory_enabled: false already disables memory entirely, so a third 'block' state was redundant. - tools/write_approval.py: get_write_mode/MODE_* → write_approval_enabled() bool; evaluate_gate() loses the config-driven 'blocked' path (blocked now only comes from an interactive user denial). - tools/memory_tool.py, tools/skill_manager_tool.py: comment + behaviour follow. - hermes_cli/config.py: memory/skills write_mode → write_approval (False); _config_version 28→29 with a 28→29 migration that renames any persisted write_mode (approve→true, on/off/unset→false) and drops the old key. - slash commands: '/memory|/skills mode <on|off|approve>' → 'approval <on|off>' ('mode' kept as a back-compat alias); set_mode_fn callback now takes a bool. - write_approval_commands.py, cli_commands_mixin.py, gateway/slash_commands.py, commands.py: handlers + registry args/subcommands updated. - docs + tests rewritten for the boolean model; added migration tests.	2026-06-09 23:21:14 -07:00
kshitijk4poor	9caa12f4ec	fix(skills): resolve skill_view by frontmatter name when dir name differs skills_list() surfaces each skill's frontmatter `name:`, but skill_view() only matched on the on-disk directory name (Strategy 2). When a skill's directory is a shorter category/alias that differs from its frontmatter name, skill_view(name) failed to find it. Extend the recursive Strategy-2 walk to also match frontmatter `name:`, guarded by a try/except so an unreadable/malformed SKILL.md can't break discovery. Adds a regression test that creates a skill whose directory name differs from its frontmatter name and asserts skill_view resolves it (fails on current main, passes with this change). Salvaged the skill_view fix from #39682 onto current main as a standalone, single-concern change with the test the original PR lacked. Co-authored-by: foras910521-lab <foras910521-lab@users.noreply.github.com>	2026-06-10 10:51:45 +05:30
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199) Adds memory.write_mode and skills.write_mode (on|off|approve), applied to both foreground turns and the background self-improvement review fork — the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt) Review staged writes from CLI or any messaging platform: /memory pending|approve|reject|mode /skills pending|approve|reject|diff|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/ write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
BROCCOLO1D	29036155ce	fix(terminal): lazy-parse docker env config (#42733) Co-authored-by: BROCCOLO1D <279959838+BROCCOLO1D@users.noreply.github.com>	2026-06-10 11:04:27 +10:00
kshitij	a38cc69bcc	fix(terminal): complete sane PATH entries on POSIX (salvage of #35614 ) (#42653 ) * fix(terminal): complete sane PATH entries on POSIX Fixes macOS gateway/launchd terminal sessions whose PATH already includes /usr/bin while omitting Apple Silicon Homebrew paths. LocalEnvironment._make_run_env() now appends each missing _SANE_PATH entry individually on POSIX, preserving caller precedence and avoiding duplicate sane entries. Root cause: the previous logic used /usr/bin as the sentinel for sane PATH injection. macOS launchd commonly provides /usr/bin while leaving out /opt/homebrew/bin and /opt/homebrew/sbin, so Homebrew-installed CLIs stayed unavailable in terminal tool calls. Salvaged from #35614 by @y0shua1ee. Fixes #35613. Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com> * test(terminal): harden sane PATH completion against dup/empty entries Follow-up to the #35613 fix. Strengthens _append_missing_sane_path_entries: - De-duplicate the caller-supplied PATH (first occurrence wins) so a PATH that already contains duplicate entries is collapsed rather than carried through. Previously only newly-appended sane entries were guarded against duplication; pre-existing caller duplicates were preserved verbatim. - Drop empty PATH entries (leading/trailing/double ':'), which POSIX shells interpret as the current working directory — a mild foot-gun in a default terminal environment. Behaviour for well-formed PATHs (no duplicates, no empty entries) is byte-identical to before; only malformed/duplicated inputs change. Adds regression tests for: the literal macOS launchd PATH (/usr/bin:/bin:/usr/sbin:/sbin), caller-duplicate collapsing with order preservation, and empty-entry stripping. 
* docs(terminal): clarify PATH normalisation semantics; drop dead set add Addresses review findings on the sane-PATH completion follow-up: - Sharpen the _append_missing_sane_path_entries docstring to state explicitly that on POSIX the caller PATH is rewritten (empty entries stripped, duplicates collapsed) rather than merely appended to, and that well-formed PATHs remain byte-identical bar the appended sane entries. This makes the intentional semantic change visible rather than buried under "hardening". - Document why _path_env_key is a deliberate second Windows guard distinct from the helper's early return (key-casing selection vs standalone safety), so neither is mistaken for redundant and removed. - Drop the dead `seen.add(entry)` in the sane-entry loop: _SANE_PATH is a static duplicate-free constant, so the membership check against the caller entries is sufficient and `seen` is never read afterwards. No behaviour change: verified byte-identical output across the launchd, minimal, empty, duplicate, empty-entry and already-full cases, and re-confirmed gh/brew resolve through the real LocalEnvironment.execute() path under a launchd-style PATH. 133 targeted tests pass. Intentionally NOT consolidating with tools/browser_tool._merge_browser_path: it prepends (vs append), filters on os.path.isdir, uses os.pathsep, and draws from a dynamic candidate set — a shared helper is a separate refactor, out of scope for this bugfix. --------- Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com>	2026-06-09 02:21:12 -07:00
kshitij	76f89d66de	fix(test): track TERMINAL_CONFIG_ENV_MAP after env-sync consolidation (#42695) `test_terminal_config_env_sync.py::_save_config_env_sync_keys()` AST-scanned `hermes_cli/config.py:set_config_value` for a `_config_to_env_sync = {...}` literal. The terminal-config env bridging was consolidated onto the canonical `TERMINAL_CONFIG_ENV_MAP` (now read via `terminal_config_env_var_for_key()`), so that literal no longer exists and the scanner raised: AssertionError: Could not find `_config_to_env_sync = {...}` literal in source failing 8 of 9 tests on main for every PR. Read the live `TERMINAL_CONFIG_ENV_MAP` instead — the actual source of truth `set_config_value` bridges through — mirroring its `terminal.cwd` exclusion. Refresh the stale module docstring and the now-incorrect error-message hints that still referenced `_config_to_env_sync`. Verified: the suite goes green, and a mutation (dropping `docker_volumes` from `TERMINAL_CONFIG_ENV_MAP`) still trips the pinned regression test, so the drift guard retains its teeth.	2026-06-09 02:11:46 -07:00
teknium	18ead88273	test: update docker preflight assertion for stdin=DEVNULL kwarg The blanket stdin=subprocess.DEVNULL pass added the kwarg to the docker 'version' preflight call; the test pinned the exact kwargs dict. Update the expected dict to match.	2026-06-08 22:46:57 -07:00
teknium	dba6380ca6	test: guard OAuth setup-token stays interactive + marker exemption Regression tests for the salvage follow-up: the interactive 'claude setup-token' login must keep inherited stdin, and the guard's inline 'noqa: subprocess-stdin' marker must exempt a call.	2026-06-08 22:46:57 -07:00
m4dni5	8bb60ff039	test: add pytest guard for subprocess stdin= in TUI-context code Wraps scripts/check_subprocess_stdin.py as a pytest so CI catches regressions when new subprocess calls are added without stdin=.	2026-06-08 22:46:57 -07:00
underthestars-zhy	0646656884	fix(photon): support E.164 and DM GUID targets for home channel Allow PHOTON_HOME_CHANNEL to accept a bare E.164 phone number or a `any;-;+1...` DM chat GUID in addition to a Spectrum space id. Inbound DM spaces are cached so replies resolve without a second SDK lookup, and `photon` is added to _PHONE_PLATFORMS so send_message treats E.164 strings as explicit targets rather than falling through to channel-name resolution.	2026-06-08 21:03:58 -07:00
helix4u	b0efe1d64b	fix(approval): gate resolved Hermes config paths	2026-06-08 11:55:40 -07:00
qWait	cef00ae602	fix(tui): handle Windows PTY stdin and detached WS frames (#41953 ) Two narrow Windows desktop fixes: 1. tools/process_registry.py — PTY stdin writes are now platform-aware. pywinpty (Windows) expects str; ptyprocess (POSIX) expects bytes. Previously bytes was unconditionally passed, producing a TypeError on Windows ("'bytes' object cannot be converted to 'PyString'"). 2. tui_gateway/server.py + ws.py — Detached WebSocket sessions now park on a _DropTransport sink instead of _stdio_transport. In the desktop the gateway runs in-process and stdout is captured by Electron into desktop.log, so falling back to stdio leaked raw JSON-RPC frames into the desktop log after WS disconnects. Orphan-reap semantics are preserved via _ws_session_is_orphaned. Verified on a Windows desktop install: - pywinpty 2.0.15 rejects bytes / accepts str — reproduced exactly - Focused suite green (write_stdin × 2, write_json_drops_detached_ws_frames, ws_orphan_reap × 2) - All 6 CI test shards green, e2e green, nix (ubuntu/macos) green Salvage commit (`21be7ca`) fixes the new test referencing an undefined _ThreadUnsafeStdout — uses the existing _ChunkyStdout helper.	2026-06-08 09:41:20 -07:00
Teknium	e45b745835	fix(file-tools): reject sentinel TERMINAL_CWD; anchor worktree edits before live cwd exists (#41861 ) Completes the worktree-misroute fix from #35399, which made misroutes visible (resolved_path) but did not prevent them: its divergence warning only fired once a terminal command had populated the live cwd registry. A fresh worktree session (registry still empty) with a stale TERMINAL_CWD='.' got neither a worktree anchor nor a warning, so a relative write_file/patch silently landed in the MAIN checkout. Two changes in tools/file_tools.py: - Treat sentinel TERMINAL_CWD values ('', '.', './', 'auto', 'cwd') and any relative value as UNSET rather than a literal anchor. Previously '.' was joined onto the process cwd, silently routing edits to wherever the process happened to be (the main repo, in a worktree session). The gateway already sanitizes the same set at import time; the file-tool layer now matches. - New _authoritative_workspace_root(): prefers the live terminal cwd, else a sentinel-free absolute TERMINAL_CWD (the worktree path cli.py/main.py set for -w). _resolve_base_dir() and _path_resolution_warning() both use it, so a worktree session resolves into — and warns about escaping — the worktree from the very first write, before any cd has run. Validation: 11 new/parametrized tests (sentinel handling, empty-registry anchoring, early divergence warning, live-cwd precedence). 32/32 pass under scripts/run_tests.sh. Live E2E: relative write in an empty-registry worktree session lands in the worktree, main untouched.	2026-06-07 23:58:47 -07:00
liuhao1024	6459b3d991	fix(terminal): collapse CWD-only overrides to shared container When register_task_env_overrides is called with only a 'cwd' key (ACP adapter workspace tracking), the task_id should collapse to 'default' so all interactive surfaces (TUI, gateway, dashboard) share one long-lived container. Previously, any override registration — even CWD-only — caused _resolve_container_task_id to return the session key unchanged, spinning up a separate container per session. This made it impossible to authenticate into external services once and have that auth available across all surfaces. Now only overrides containing isolation keys (docker_image, modal_image, singularity_image, daytona_image, env_type) trigger per-task container isolation. Fixes #37361	2026-06-07 23:04:54 -07:00
Teknium	86c537d209	fix(memory): instruct in-turn consolidation + retry on overflow (#41755) * fix(memory): make overflow errors instruct in-turn consolidation + retry When bounded memory is full, the add/replace overflow errors now explicitly tell the model to consolidate (merge/remove/shorten) and retry the write in the same turn, matching the documented behavior. The replace-overflow path now also echoes current_entries + usage for parity with add-overflow, so the model has the same context to act on. Closes #23378 (working-as-documented; this sharpens runtime to match docs). * fix(memory): broaden overflow remediation hint beyond 'stale' Say 'stale or less important' — entries don't have to be stale to be the right ones to drop when making room.	2026-06-07 22:16:28 -07:00
Teknium	48ae8029aa	fix(delegate): resolve custom-endpoint subagent pools by endpoint identity (#41730) Subagents delegated to a custom endpoint were misrouted when the parent ran on a different custom endpoint. Both runtimes collapse to provider="custom", so _resolve_child_credential_pool() treated them as interchangeable and handed the child the parent's pool. Leasing from it then overwrote the child's delegated base_url with the parent's endpoint via _swap_credential() — the child sent the delegated model name to the wrong endpoint. Custom runtimes now resolve by endpoint identity (the custom:<name> pool key derived from base_url). The parent pool is reused only when both parent and child resolve to the same custom endpoint; unregistered raw endpoints return None so the child keeps its fixed delegated credential. Non-custom provider paths are unchanged. Fixes #7833.	2026-06-07 22:05:14 -07:00
Teknium	69a293b419	hardening(todo): bound TodoStore item content length and count The todo list is re-injected into the model's context after every context-compression event (TodoStore.format_for_injection), so an oversized todo item or an unbounded number of items defeats the compression it is meant to ride through. TodoStore.write/_validate previously enforced no size or count bounds, so a single 50KB item produced a ~50KB re-injection block on every subsequent turn. Add two caps: - MAX_TODO_CONTENT_CHARS (4000): per-item content is truncated with a marker. Routed through a shared _cap_content() so the merge-update path (which writes content directly, bypassing _validate) is capped too. - MAX_TODO_ITEMS (256): total list length is bounded, keeping the highest-priority head (list order is priority). Both caps are generous relative to real plans — a todo item is a short task description and active lists are a handful of items. NOT a security fix. Raised externally via GHSA-5g4g-6jrg-mw3g, which framed a caller-supplied conversation_history on the authenticated API server replaying into _hydrate_todo_store as a DoS. That path is authenticated (the API server refuses to start without API_SERVER_KEY) and self-scoped (the caller supplies their own entire history and can only inflate their own response chain — forged role=tool entries are never persisted to the session DB), so it is out of scope as a vulnerability under SECURITY.md 3.2. These bounds are footgun containment that also applies to the trusted agent path, where the model itself authors the todos. Credit to the reporter for the observation. Co-authored-by: YLChen-007 <30854794+YLChen-007@users.noreply.github.com>	2026-06-07 18:06:27 -07:00
kshitijk4poor	7df81d0557	fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config Follow-up to #34306. The provider fix made SearXNG usable with a config-only SEARXNG_URL, but tools/web_tools._has_env still read raw os.getenv, so the backend auto-detect cascade and check_web_api_key remained blind to it — SearXNG worked when explicitly selected but was never auto-selected. Route _has_env (and the SearXNG diagnostic print) through a config-aware _env_value helper mirroring the provider's _searxng_url(). Fixing the shared helper covers every provider key in one place. Adds regression tests for config-only auto-detect and check_web_api_key. See #34290.	2026-06-08 01:12:32 +05:30
kshitij	0c0fbf763b	Merge pull request #41430 from helix4u/fix-url-tools-unicode-normalization fix(tools): percent-encode non-ascii URL components	2026-06-07 12:39:30 -07:00
helix4u	333f01bc7f	fix(tools): percent-encode non-ascii URL components	2026-06-07 11:42:26 -06:00
Teknium	af08c43f3e	fix: skip MCP preflight content-type probe on reconnect when already ready (#40604) Closes #40366. Salvaged from #40548; re-verified on main, tightened, tested. Co-authored-by: mohamedorigami-jpg <mohamedorigami-jpg@users.noreply.github.com>	2026-06-07 09:51:11 -07:00
teknium1	1a4010edf5	test(approval): regression for shell-escape denylist bypass (#36846, #36847)	2026-06-07 03:57:21 -07:00
helix4u	591e6fb8f4	fix(computer_use): honor custom vision routing	2026-06-07 02:09:20 -07:00
Teknium	fe0b3f2338	fix(windows): retry watcher Popen without breakaway when parent job denies it, plus regression tests for the breakaway bit (#40956 ) #40909 added `CREATE_BREAKAWAY_FROM_JOB` to `windows_detach_flags()`, which fixed the headline bug (gateway dies after Desktop GUI update and never comes back). The flag's own docstring acknowledges that restrictive parent job objects can still refuse breakaway with `ERROR_ACCESS_DENIED`, surfacing as `OSError` on the `subprocess.Popen` call: "Callers in this codebase already wrap detached spawns in try/except OSError and fall back to a cmd.exe wrapper, so the breakaway-denied case degrades gracefully rather than crashing." That's true for `_spawn_detached` in `gateway_windows.py` (the `hermes gateway start` path), which has both the breakaway bit AND a retry-without-breakaway fallback. It's NOT true for the post-update watcher path in `launch_detached_profile_gateway_restart` (`hermes_cli/gateway.py`), which only has `except OSError: return False` and gives up entirely. If a user's shell/terminal/container wraps Hermes in a breakaway-denying job, the gateway-respawn watcher silently fails to launch instead of trying again without breakaway. This PR closes that gap and adds the regression tests that were missing from the original fix. ## Changes ### `hermes_cli/_subprocess_compat.py` Adds a sibling helper `windows_detach_flags_without_breakaway()` so callers can express the fallback symbolically (via the helper) rather than coding the magic `& ~0x01000000` mask at every site. Documented on `windows_detach_flags` and `windows_detach_flags_without_breakaway` with the recommended try/except pattern. ### `hermes_cli/gateway.py::launch_detached_profile_gateway_restart` Two changes, both aligned with the canonical pattern in `gateway_windows._spawn_detached`: 1. 
The outer watcher Popen now wraps in `try/except OSError`, and on failure retries with `windows_detach_flags_without_breakaway()` (POSIX never reaches this branch — `start_new_session=True` can't raise OSError). 2. The inlined respawn payload (the `python -c` watcher) also wraps its CreateProcess in try/except OSError and retries with `_flags & ~_CREATE_BREAKAWAY_FROM_JOB` on failure. This matters because the watcher's job-object inheritance is independent of the outer process's — even if the outer Popen succeeds with breakaway, the respawned gateway might inherit a job that doesn't. ### Regression tests in `tests/tools/test_windows_native_support.py` #40909 shipped the fix without any test that the breakaway bit is present (the existing `test_windows_detach_flags_has_expected_win32_bits` asserted only the three legacy bits). Four new tests close that: - `test_windows_detach_flags_includes_breakaway_from_job` — explicit assertion that the breakaway bit is in the default bundle, with the rationale spelled out in the docstring so a future maintainer staring at this test understands why removing it would resurrect the gateway-dies-after-GUI-update bug. - `test_windows_detach_flags_without_breakaway_drops_only_that_bit` — fallback payload keeps the other three detach bits intact. - `test_launch_detached_profile_gateway_restart_inlined_watcher_uses_breakaway` — static-text check on the stringified watcher payload. The inlined Python program isn't reachable via normal import-time inspection because it lives in a `textwrap.dedent("""...""")` literal that gets passed to a separate `python -c` interpreter. Asserting that both `_CREATE_BREAKAWAY_FROM_JOB` (symbolic) and `0x01000000` (hex literal) appear inside the dedent block is a sufficient regression guard against accidental refactors. - `test_launch_detached_profile_gateway_restart_outer_popen_has_access_denied_fallback` — static check that this PR's fallback retry is wired up symbolically. 
Without standing up a real Windows job object that refuses breakaway, we can't trigger the OSError in a unit test; the text guard catches the case where a future refactor removes the helper import or the `& ~_CREATE_BREAKAWAY_FROM_JOB` retry. Also extends `test_windows_detach_flags_has_expected_win32_bits` to include the breakaway bit assertion and updates `test_windows_flags_zero_on_posix` to cover the new helper. ## Tests Locally on Windows: 8/8 in the `-k "detach or breakaway or popen_kwargs or launch_detached or gateway_run_update or hermes_cli_gateway"` slice pass. Broader `tests/hermes_cli/test_gateway*.py + test_windows_native_support.py`: 172 passed, 10 failed. All 10 failures are pre-existing POSIX-only tests running on a Windows host (os.geteuid, SIGKILL fallback, is_linux fixture mismatches). Stashing this PR and re-running on bare post-#40909 main reproduces all 10 identically — none are regressions. POSIX paths unchanged: `windows_detach_flags()` and `windows_detach_flags_without_breakaway()` both return 0 off Windows, `windows_detach_popen_kwargs()` still yields `{"start_new_session": True}`. ## Out of scope - The other detached-spawn site in `hermes_cli/gateway.py` (around line 3068) also uses `windows_detach_popen_kwargs()` + `except OSError`. It deserves the same fallback treatment but the codepath is different enough (not the update-flow watcher) that it warrants a separate PR with its own scrutiny. - `gateway/run.py` has Windows branches with `windows_detach_popen_kwargs` too — same reasoning. ## Context Follow-up to #40909 (merged). I had a parallel PR (#40934, closed) that duplicated the core breakaway fix; the bits unique to that PR that #40909 didn't cover are the contents of this one. Closing #40934 and opening this slimmed-down version as the focused follow-up.	2026-06-07 01:21:58 -07:00
Teknium	887295ba54	fix(config): preserve custom-provider models maps and metadata through v11->v12 migration (#40573) Salvaged from #40410; cleaned up, re-verified against main, tests added. Co-authored-by: rodboev <rodboev@users.noreply.github.com>	2026-06-06 18:43:20 -07:00
Teknium	365437e4aa	fix(cua-driver): reconnect MCP stdio session once on ClosedResourceError after daemon restart (#40570) Salvaged from #40282; cleaned up, re-verified against main, tests added. Co-authored-by: jeeves-assistant <jeeves-assistant@users.noreply.github.com>	2026-06-06 18:35:12 -07:00
Teknium	5a36f76a00	fix(skill_manager): allow SKILL.md in _validate_file_path without weakening traversal guard (#40568) Salvaged from #40453; cleaned up, re-verified against main, tests added. Co-authored-by: l37525778-coder <l37525778-coder@users.noreply.github.com>	2026-06-06 18:32:37 -07:00
Teknium	c0424b06af	fix(osv_check): honor npx --package/-p install target when parsing package arg (#40567) Salvaged from #40461; cleaned up, re-verified against main, tests added. Co-authored-by: HeLLGURD <HeLLGURD@users.noreply.github.com>	2026-06-06 18:30:39 -07:00
Teknium	56f833efa4	fix(skills): block path traversal via skill_view name argument (#40566) Closes #38643. Salvaged from #40521; cleaned up, re-verified against main, tests added. Co-authored-by: xy200303 <xy200303@users.noreply.github.com>	2026-06-06 18:29:52 -07:00
kshitijk4poor	c79e3fd0ba	refactor(image_gen): delegate cache-path mapping to shared helper Follow-up on the backend-visible artifact-path fix. - Extract the cache-mount iteration loop into a reusable, backend-agnostic credential_files.map_cache_path_to_container(host_path, container_base) that returns the POSIX container path or None. to_agent_visible_cache_path() now delegates to it (keeping its Docker-only gate), and image_generation_tool's _agent_visible_cache_path() delegates to it too — eliminating the duplicated loop and the divergent path-join (posixpath vs Path) between the two. - Drop the now-unused posixpath/Path imports from image_generation_tool.py. - Document the agent_visible_cache_base getattr probe as a forward-looking optional hook (no producer yet) so it doesn't read as a typo'd attribute. - Add unit tests for map_cache_path_to_container.	2026-06-06 13:19:07 -07:00
Gille	7c4aa3e4da	fix(image_gen): expose backend-visible artifact paths	2026-06-06 13:19:07 -07:00
kshitijk4poor	c37c6eaf29	refactor(gateway): migrate Home Assistant adapter to bundled plugin Move gateway/platforms/homeassistant.py into plugins/platforms/homeassistant/ following the same shape as the Mattermost and Discord migrations. - Adapter file is renamed via git mv (history is preserved). - register() exposes the platform via the plugin system instead of the hardcoded Platform.HOMEASSISTANT elif in gateway/run.py::build_adapter(). - _standalone_send() replaces the legacy _send_homeassistant() helper in tools/send_message_tool.py. Out-of-process cron delivery (deliver=homeassistant from a cron process not co-located with the gateway) now flows through the registry's standalone_sender_fn path instead of the hardcoded elif. - _is_connected() probes HASS_TOKEN via hermes_cli.gateway.get_env_value so existing connected-platform checks behave identically. The HASS_TOKEN / HASS_URL env-to-PlatformConfig seeding in gateway/config.py stays in core — same pattern bluebubbles, mattermost, and discord migrations followed. No setup_fn or apply_yaml_config_fn is registered because Home Assistant has no _setup_homeassistant wizard in hermes_cli/setup.py and no homeassistant: YAML block in config.yaml today; setup runs through the existing hermes_cli/tools_config.py toolset wizard. Test imports were rewritten across tests/gateway/test_homeassistant.py, tests/integration/test_ha_integration.py, and tests/tools/test_send_message_missing_platforms.py; the legacy (token, extra, chat_id, message)-shaped _send_homeassistant call site is preserved via a small SimpleNamespace shim in test_send_message_missing_platforms.py (same approach used when mattermost moved). - Focused HA suites (64 tests across the three rewritten files) pass. - Broader gateway/cron sweep produces 10 failures identical to main baseline (telegram approval/model-picker xdist isolation flakes, wecom_callback defusedxml issue, cron script_timeout fixture issue). Zero net new failures.	2026-06-06 11:46:24 -07:00
Teknium	f8a241e105	fix(delegate): flatten content blocks in live overlay tail + AUTHOR_MAP Follow-up on the cherry-picked content-block fix. _extract_output_tail (the live subagent overlay) still used crude str(content), which renders a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped "Error: ..." result as is_error=False. Route it through the same _stringify_tool_content helper so error detection and previews work at both consumer sites. - delegate_tool.py: _extract_output_tail uses _stringify_tool_content - tests: add _extract_output_tail content-block test (error detection + clean preview) - release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)	2026-06-05 23:34:00 -07:00
Alexander Lehmann	f83918c31d	fix(delegate): handle content-block tool results	2026-06-05 23:34:00 -07:00
helix4u	338c074336	fix(send-message): treat ntfy topic targets as explicit	2026-06-05 20:38:28 -07:00
Teknium	ea266f43e9	fix(file-ops): make rg/grep search error guard reachable and preserve partial matches (#39858 ) The error guard in _search_with_rg/_search_with_grep was unreachable and, if it had fired, would have discarded valid results. Two root causes: 1. Unreachable. Both methods pipe the search through `\| head` with no pipefail, so the pipeline reported head's exit code (0), masking rg/grep's error code (2). The guard never fired. Worse, because _exec merges stderr into stdout (stderr=subprocess.STDOUT), the error text was then parsed as bogus match lines instead of being surfaced — the user got garbage matches with no indication the search failed. 2. Latent results-dropping. The original `not result.stdout.strip()` check was always False on error (error text lives in stdout), and the `hasattr(result, 'stderr')` branch was dead code (ExecuteResult has no stderr field). A naive broadening to `exit_code == 2` would have nuked real matches whenever rg/grep also hit a non-fatal error (e.g. one unreadable file in a tree that otherwise matched), which both tools signal with exit 2. Fix: - Prefix the piped command with `set -o pipefail` so rg/grep's real exit status propagates. rg exits 0 on a truncating head; grep exits 141 (SIGPIPE), so the strict `== 2` guard ignores truncated-success. - Add _split_tool_diagnostics() to separate tool diagnostics from match output by tool prefix and output shape. Diagnostics never become matches; on a hard error they are the message to surface. - Only surface an error when exit==2 AND no usable match payload remains, so partial errors keep their real matches. Tests: tests/tools/test_search_error_guard.py drives both methods through the real local backend (hard error surfaced, partial error keeps matches, truncation no false error, files_only/count exclude diagnostics) plus unit coverage for the splitter. Supersedes #39710.	2026-06-05 17:44:52 -07:00
Teknium	d41427504e	feat(delegation): uncap max_spawn_depth (floor 1, no ceiling) (#39772 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * feat(delegation): uncap max_spawn_depth to match max_concurrent_children Removed the hard ceiling of 3 on delegation.max_spawn_depth. Depth now has a floor of 1 and no upper limit, mirroring max_concurrent_children. Cost (each level multiplies API spend) is the practical limiter, not a constant. 
- delegate_tool.py: drop _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() floors at 1 instead of clamping to [1,3]; depth-limit error string reworded - config.py / cli-config.yaml.example: doc comments say floor 1, no ceiling - docs (configuration, delegation, delegation-patterns): range 1-3 -> >=1 - tests: convert clamp-above-3 change-detector into a no-ceiling invariant, drop the _MAX_SPAWN_DEPTH_CAP==3 snapshot assert, fix warning-text assert	2026-06-05 04:46:02 -07:00
Coy Geek	3278b423d5	fix(dashboard): strip session token from subprocess env Add HERMES_DASHBOARD_SESSION_TOKEN to the Hermes-managed subprocess environment blocklist so dashboard authorization material does not propagate into shell, PTY, or background process launches. Extend the local environment blocklist regression coverage to prove the dashboard session token is stripped like other Hermes-managed secrets.	2026-06-05 02:31:19 -07:00
Baris Sencan	ad69d3edc7	fix(terminal): guard os.getcwd() against a deleted CWD `os.getcwd()` raises FileNotFoundError when the process's working directory was removed out from under it (e.g. a scratch workspace cleaned up mid-session), crashing terminal env setup. Extract a `_safe_getcwd()` helper that falls back to TERMINAL_CWD, then the user's home, on FileNotFoundError, and route all three `os.getcwd()` call sites in terminal_tool.py through it (local default_cwd, the Docker cwd-passthrough source, and the debug-config print) so the same crash can't resurface at a sibling site. Adds unit tests for the real-cwd path and both fallback branches. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-04 23:39:34 -07:00
kyssta-exe	25742372eb	fix(approval): check is_approved in execute_code guard (#39275) check_execute_code_guard() never called is_approved() before entering the approval flow, and never persisted session/permanent approvals from the gateway response. This meant 'Approve session' and 'Always' buttons had no effect — every execute_code call re-prompted the user. - Add is_approved() check after get_current_session_key(), matching check_all_command_guards() - Persist session ('approve_session') and permanent ('approve_permanent') approvals based on the gateway choice, same as terminal command guard - Add 3 regression tests for session persistence, permanent persistence, and short-circuit on pre-existing approval	2026-06-04 19:40:30 -07:00
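
The deleted-CWD guard described in ad69d3edc7 can be sketched roughly as below. This is a minimal illustration, not the repository's `_safe_getcwd()`: the function name `safe_getcwd`, the `env` parameter, and the treatment of TERMINAL_CWD as an environment variable are assumptions made for the example.

```python
import os
from pathlib import Path


def safe_getcwd(env=None):
    """Return the working directory, tolerating a CWD that was deleted.

    Hypothetical sketch: `env` is an injectable mapping (defaults to
    os.environ) and TERMINAL_CWD is assumed to be an environment variable.
    """
    env = os.environ if env is None else env
    try:
        # Normal case: the working directory still exists.
        return os.getcwd()
    except FileNotFoundError:
        # The directory was removed out from under the process.
        # Prefer the configured terminal CWD, then the user's home.
        fallback = env.get("TERMINAL_CWD")
        if fallback:
            return fallback
        return str(Path.home())
```

Routing every call site through one helper like this, as the commit message describes, keeps the fallback order in a single place so a sibling `os.getcwd()` call cannot reintroduce the crash.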

1104 commits