hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
teknium1	a1ac6baac4	fix(gateway): make bg-process reset TTL configurable + surface session-scoped processes Follow-up to the cherry-picked #29212 (#29177): - Promote the 24h stale-process threshold to config.yaml (session_reset.bg_process_max_age_hours) instead of a hardcoded constant. 0 disables the cutoff (legacy: any live process blocks reset). Wired through GatewayConfig.default_reset_policy in gateway/run.py. - Bug 2: process(action=list) now resolves the gateway session_key from the contextvar and surfaces session-scoped background processes (a forgotten preview server under a different task), flagged session_scoped — so the agent/user can discover and kill the blocker. Previously the task-scoped list returned [] and the blocker was invisible. - Tests: config round-trip for the new field, cross-task list visibility. - Docs: messaging session-reset section.	2026-06-27 20:45:43 -07:00
annguyenNous	33d8b66d5b	fix: stale background processes no longer permanently block session reset Background processes (e.g. http.server preview) that Hermes starts and forgets about previously blocked session idle/daily reset indefinitely. The reset guard in session.py checked has_active_for_session() with no max age — a 3-day-old preview server blocked reset the same as a task started 30 seconds ago. Changes: - Add max_active_age parameter to has_active_for_session() in process_registry.py. Processes older than this threshold are ignored. - Add MAX_ACTIVE_PROCESS_AGE constant (24h / 86400s). - Wire max_active_age into the gateway's session store callback in run.py so stale processes no longer block session lifecycle. - Add debug logging when reset is skipped due to active processes. - Add 3 tests covering recent, stale, and legacy (None) max age. Fixes #29177	2026-06-27 20:45:43 -07:00
teknium1	9c6229ce24	fix(security): centralize credential-safe subprocess env (#29157) Subprocesses spawned outside the terminal/execute_code path (agent-browser, copilot ACP, dep-ensure, lazy_deps uv install, TUI Node host, cli.exec) inherited the operator's full credential environment via os.environ.copy(). The terminal path was already scrubbed by _HERMES_PROVIDER_ENV_BLOCKLIST (#1002/#1264/#32314); these spawn sites bypassed it. Adds hermes_subprocess_env(inherit_credentials=) in tools/environments/local.py reusing the existing dynamic blocklist as the single source of truth: - Tier 1 (_ALWAYS_STRIP_KEYS): gateway bot tokens, GitHub auth, infra secrets -- stripped even for credential-inheriting children. - Tier 2 (_HERMES_PROVIDER_ENV_BLOCKLIST): provider/tool keys -- stripped unless inherit_credentials=True. The opt-in is grep-able for audit. Browser worker keeps a _BROWSER_PASSTHROUGH_KEYS allowlist (BROWSERBASE/FIRECRAWL) re-added after the strip. Model-driving children (ACP, TUI Node host, cli.exec) use inherit_credentials=True so they still get provider keys while losing Tier-1 secrets. Installers (dep-ensure, lazy_deps) inherit nothing sensitive. cua_backend already routed through _sanitize_subprocess_env on main -- left as-is. Gateway adapter utility spawns (gh pr comment, ffmpeg) are left inheriting env: gh needs GH_TOKEN by design, ffmpeg is a trusted system binary -- no untrusted-dependency exposure. This is defense-in-depth (personal-assistant trust model: same-user spawns), making the existing scrub policy uniform across the spawn surface; the main real payoff is shrinking the blast radius if a transitive npm dep in agent-browser is compromised. Reconstructed on current main from the design in #31959 (Tranquil-Flow); also credits #39003 (rodboev), #37843 (coygeek), #35769 (egilewski).
Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com> Co-authored-by: rodboev <rod.boev@gmail.com> Co-authored-by: egilewski <egilewski@egilewski.com>	2026-06-27 20:45:31 -07:00
Hermes Agent	88b3d8638e	test: de-flake SIGKILL-tree, compression-tip resume, and fallback-cooldown tests Three CI flakes hit while landing the credential-pool restore fix; all three were timing/wall-clock races in the tests, not product bugs (each passes locally and the assertions are correct): - test_entire_tree_is_sigkilled_not_just_parent: _terminate_host_pid SIGKILLs synchronously, but the test's 4s budget after a 1s in-function SIGTERM grace left almost no slack for the kernel to tear down 3 processes + reparent the children to zombies under loaded-CI scheduling. Widen the wait to 15s and make the liveness predicate tolerant of vanished-pid / zombie races. The assertion never weakens: every tree member must end up dead or zombie. - test_session_resume_follows_compression_tip: appended messages got time.time() timestamps (~now) while the test forced session started_at into the past, so the get_compression_tip MAX(m.timestamp) tiebreaker depended on wall-clock ordering. Pass explicit, well-separated message timestamps so the chain resolution is deterministic by construction. - test_non_retryable_exhaustion_arms_cooldown: asserted the short (5s) exhaustion cooldown with a tight +1.0s slack, which false-fails when wall-clock jitter between the 'before' snapshot and the cooldown computation exceeds a second on a loaded runner. Widen to +30s — still cleanly below the 60s rate-limit window it must distinguish from.	2026-06-27 20:04:45 -07:00
teknium1	457c8a0a7c	fix(file-ops): keep worktree isolation when restoring preserved cwd (#26211) The durable _last_known_cwd anchor is keyed by the shared 'default' container, so a non-owning worktree session could inherit the owning session's cwd through it — breaking the wrong-worktree-routing fix (test_file_tools_cwd_resolution::test_resolution_routes_to_resolving_sessions_worktree). Reorder _authoritative_workspace_root so the session-specific registered cwd override (keyed by raw session id) is checked BEFORE the shared-container _last_known_cwd fallback. A non-owning session now resolves into its own registered worktree; the durable anchor only fills in when there's no session-specific override (the #26211 single-session case). Adds a regression test covering the owner-mirrors-then-other-session-resolves interaction.	2026-06-27 19:29:06 -07:00
teknium1	b2faeba182	fix(file-ops): make preserved cwd reachable at write-time resolution (#26211) Belt-and-suspenders on top of the cherry-picked cwd-preservation fix: - Proactively mirror every live terminal cwd into _last_known_cwd on each successful read, so the durable anchor survives even when the cleanup thread pops both _file_ops_cache and _active_environments before _get_file_ops' stale-cache save branch can fire. - Fall back to _last_known_cwd in _authoritative_workspace_root. write_file_tool resolves the path (via _resolve_path_for_task) BEFORE _get_file_ops rebuilds the env, so restoring only the rebuilt env's cwd was insufficient — the resolution that decides where the file lands runs first. This closes that gap. The local env's persisted _cwd_file can't serve this role: it's keyed by a random per-session uuid and deleted on cleanup (the same cleanup that triggers the bug). The in-memory _last_known_cwd registry is the durable anchor instead. Adds a real-IO E2E regression (TestSilentFileMisplacementE2E) exercising the actual write_file_tool path after env cleanup.	2026-06-27 19:29:06 -07:00
zccyman	adeba1d7a8	fix(file-ops): preserve CWD across terminal environment re-creation (#26211) Root cause: when the terminal environment (`_active_environments` entry) is cleaned up and re-created during a long conversation, the new environment always starts with the default config CWD (typically `~/.hermes/hermes-agent`) instead of preserving the user's last-known working directory. Subsequent relative-path writes (`write_file`, `execute_code`, shell commands) silently land in the default CWD, making files appear to be "created but absent." Fix: add `_last_known_cwd` dict that preserves the old environment's CWD before the stale cache entry is invalidated. When a new environment is created for the same task_id, we check `_last_known_cwd` first and use the preserved CWD instead of the config default. Changes: - tools/file_tools.py: add `_last_known_cwd` dict, save CWD before stale cache invalidation, restore CWD on env recreation - tests/tools/test_file_tools.py: add `TestLastKnownCwd` with 2 tests verifying CWD preservation and fallback behavior Fixes #26211	2026-06-27 19:29:06 -07:00
teknium1	926a1b915d	fix(tools): suppress transient check_fn flakes so subagents keep file/terminal tools A flaky external probe in a tool's check_fn (e.g. check_terminal_requirements running `docker version` with a 5s timeout, momentarily timing out under load) would return False for a single get_tool_definitions() call. Because file tools delegate their check_fn to the terminal check, that one flake silently stripped read_file/write_file/patch/search_files AND terminal from whatever agent was being constructed at that instant — most visibly a delegate_task subagent, which then reported "Tool read_file does not exist". This explains both the intermittent (~80% success) user-session failures and the deterministic cron failures in #21658 / #5304. The existing _check_fn TTL cache made this worse: it cached the transient False for the full 30s window, poisoning every subagent spawned in that span. Fix: remember the last time each check_fn returned True; when a fresh probe fails within a short grace window of that success, treat it as a flake — serve the last-good True and do NOT cache the failure (so the next call re-probes). A failure with no recent success, or past the grace window, is honored normally so a backend that genuinely went down stops advertising its tools. Probe failures now log at WARNING regardless of quiet mode, making the previously-silent tool loss diagnosable in subagent (quiet) sessions. Co-authored-by: Stuart Horner <5261694+djstunami@users.noreply.github.com>	2026-06-27 19:29:00 -07:00
Teknium	d3d621f7c3	revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) * Revert "fix(windows): capture is not a no-window boundary; route flashing spawns through chokepoint (#53829)" This reverts commit `2ecca1e7d3`. * Revert "fix(windows): stop terminal-window popups from background spawns (#53810)" This reverts commit `5db1430af9`. * Revert "fix(windows): stop subprocess console-window popups + add CI guard (#53791)" This reverts commit `ef17cd204d`.	2026-06-27 15:59:00 -07:00
brooklyn!	5db1430af9	fix(windows): stop terminal-window popups from background spawns (#53810) * fix(windows): stop terminal-window popups from background spawns Native-Windows desktop/gateway users saw cmd/conhost windows flash on gateway restart, image paste, the dashboard Projects tree, voice notes, and ~5 min after closing the app (detached cron). Two root causes: - Console-subsystem exes (taskkill, schtasks, wmic, netstat, tasklist, agent-browser, git, ffmpeg, powershell, git-bash) spawned via raw subprocess allocate a fresh console when the launching process has none (pythonw desktop backend / detached gateway) - even with output captured. - uv venv pythonw shims re-exec console python.exe, so Python children get a console regardless of how they're launched. Fixes: - Single hidden-spawn primitive (_subprocess_compat.run/.popen) that ORs CREATE_NO_WINDOW on Windows, no-op on POSIX. Route every Hermes-owned console-exe spawn through it. - FreeConsole() catch-all in hermes_bootstrap: any Python child that exclusively owns an auto-allocated console detaches it at startup (GetConsoleProcessList()==1 gate leaves shared interactive consoles untouched). - Replace PowerShell/wmic gateway PID scans with in-process psutil. - Skip schtasks queries on non-interactive desktop restarts. - Prefer native agent-browser .exe over .cmd shims. - Guard test bans raw subprocess spawns of the Windows-only console tools repo-wide so the popup class can't regress. * fix(windows): scope FreeConsole to background entry points; fix merge fallout Console detach review (per #53810 feedback): GetConsoleProcessList()==1 can't tell a uv pythonw->python phantom console apart from a user opening the interactive CLI/TUI in its own fresh console (double-click, shortcut, ConPTY) — both report a single attached process with a tty. Running FreeConsole() in the import-time bootstrap therefore risked detaching a legitimately-interactive terminal.
- Extract FreeConsole into explicit hermes_bootstrap.detach_orphan_console(); remove it from apply_windows_utf8_bootstrap() (import side effect). - Call it only from known background mains: gateway run, dashboard backend (start_server, what the desktop spawns), cron standalone, tui_gateway entry, slash worker. Interactive CLI/TUI never calls it. - Behavior-contract tests: frees only when solo owner, leaves shared console, no-op without console / on POSIX, and asserts it's not an import side effect. Merge fallout from origin/main (#53791): - local.py: 3-way merge left a dangling *_popen_kwargs (NameError crashing every terminal init). _subprocess_compat.popen already hides the window, so drop it. - discord adapter: merge stacked an undefined windows_hide_flags() onto the primitive call; drop the redundant arg. - test_gateway: scan now goes psutil-first (zero spawn); rewrite the case-variant test to drive that production path. test(claw): mock _subprocess_compat.run seam for Windows process scan claw.py's Windows tasklist/powershell scan routes through the hidden-spawn primitive; the tests still patched claw_mod.subprocess, so on win32 the mock was never hit and real spawns returned nothing. Patch the actual seam.	2026-06-27 14:02:24 -07:00
Dale Nguyen	dbbf102b8e	fix(terminal): strip VIRTUAL_ENV/CONDA_PREFIX from terminal subprocess env The Hermes gateway runs inside its own venv, so its process environment carries VIRTUAL_ENV (and possibly CONDA_PREFIX). The terminal tool spawned subprocesses inheriting those markers. When the agent ran `uv sync`, `uv pip install`, `poetry install`, etc. in ANY other project directory, those tools honored the inherited VIRTUAL_ENV and rebuilt/synced that project's dependencies into the Hermes venv path — wiping Hermes' own runtime deps (and, when the other project pinned a different Python, replacing the interpreter), bricking the gateway on the next restart (#23473). Strip VIRTUAL_ENV/CONDA_PREFIX in both subprocess-env construction points in tools/environments/local.py — `_sanitize_subprocess_env` and `_make_run_env` — via a shared `_ACTIVE_VENV_MARKER_VARS` constant. The Hermes venv stays reachable because its bin dir is already first on PATH, so removing the active-environment markers is safe and only prevents the cross-project clobber. Adds TestActiveVenvMarkerStripping: end-to-end (markers in os.environ don't reach the spawned subprocess) and unit coverage for both functions, plus a guard on the marker constant. Also adds the AUTHOR_MAP entry for the salvaged contributor. Closes #23473	2026-06-28 01:04:20 +05:30
Brandon Zarnitz	9c81c938d3	fix(approval): honour tirith_fail_open=false on Tirith ImportError (#20733) check_all_command_guards() swallowed ImportError from tools.tirith_security with an unconditional pass, leaving tirith_result["action"] as "allow" regardless of security.tirith_fail_open. When an operator sets tirith_fail_open: false they have explicitly opted into fail-closed behaviour; a missing or broken Tirith module must not silently permit command execution. Inside the except ImportError handler, read the live security config. When tirith_enabled is true and tirith_fail_open is false, synthesise a "warn"-action Tirith result so the command flows through the normal approval path (prompt the user, or block in cron/gateway contexts) instead of bypassing it. The default tirith_fail_open: true behaviour is unchanged. Adds three regression tests to tests/tools/test_approval.py: - fail_open=true + ImportError → silently allowed (no regression) - fail_open=false + ImportError → approval callback invoked, command denied - tirith_enabled=false → always allowed regardless of fail_open Fixes #20733 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> # Conflicts: # tests/tools/test_approval.py	2026-06-27 04:41:24 -07:00
Teknium	fe1c1c1121	fix(session_search): demote cron below interactive sessions in discover ranking (#53597) Cron jobs accumulate large volumes of repetitive vocabulary (recurring project names, dates, summaries) and out-number a user's interactive sessions. Under bare BM25 they dominate the top FTS rows, so discover's early-exit-at-N dedup collects only cron sessions and the user's own conversations never surface — "recall blindness" (#19434). - _order_for_recall() stable-sorts FTS rows so interactive sources rank above cron before lineage dedup; within each class BM25/recency order is preserved. Cron is demoted, not excluded, so it still surfaces when it is the only match. - raise discover scan limit 50 -> 300 so buried interactive matches are in hand for the demotion pass. Fixes the cron-flooding sub-bug of #19434. The split-brain sub-bug is covered by #52798; the child-session sub-bug is superseded by in-place compaction.	2026-06-27 04:41:22 -07:00
Teknium	cd592c105c	feat(send_message): native WhatsApp media delivery via Baileys bridge (#53598) send_message with MEDIA:/path to a WhatsApp target previously dropped the attachment: the WhatsApp branch never passed media_files, the plugin's _standalone_send accepted the param but only POSTed text, and WhatsApp was absent from the media-supported platform list. - send_message_tool: add a Platform.WHATSAPP media block (mirrors Feishu) that routes media_files through the whatsapp plugin's standalone_sender_fn, and add whatsapp to the supported-media list strings. - whatsapp adapter: _standalone_send now sends text first (skipped when the chunk is media-only), then uploads each file via the bridge /send-media endpoint with a mediaType derived from extension/is_voice/force_document, so images/videos/voice arrive as native bubbles instead of documents. - _bridge_media_type classifier maps ext -> image|video|audio|document. Closes #19105 (remaining send_message gap). Other items in the report (inbound video paths, image_generate auto-deliver, history dedup, native gateway bubbles) already landed on main.	2026-06-27 04:40:05 -07:00
Teknium	88c02469cc	fix(mcp): never permanently wedge the circuit breaker on a dead transport (#53599) A long-running gateway session could permanently lose an MCP server: once a stdio subprocess died (or transient drops accumulated over the session), the run loop exhausted its reconnect budget and returned, orphaning the task. With no listener for _reconnect_event, the circuit breaker's half-open probe could never revive the server — every probe hit a dead/absent session, re-armed the 60s cooldown, and looped forever until a full gateway restart (#16788). Root cause was split ownership of transport liveness between the run loop and the tool handler, plus a permanent give-up path. Fixed by one invariant: a non-shutdown server task is always reconnectable. - run loop parks (deregisters phantom tools, then awaits _reconnect_event) instead of returning when the reconnect budget is exhausted, so the task stays alive as a dormant listener - retry budget resets on every successful (re)connect, so a healthy long-lived server can't accumulate lifetime drops into a death sentence - half-open probe with no live session signals a reconnect (reviving a parked/dead task and respawning a dead stdio subprocess) and returns a clean 'reconnecting' error instead of writing into a dead pipe - breaker resets on successful session init across all transports (stdio/HTTP/SSE) — fully transport-agnostic, no PID/pipe polling Builds on the closed-PR cluster for this issue: keeps #49255's deregister-on-exhaustion insight and #21006's signal-don't-probe insight, discards the racy os.kill PID machinery. Co-authored-by: LeonSGP43 <LeonSGP43@users.noreply.github.com> Co-authored-by: srojk34 <srojk34@users.noreply.github.com>	2026-06-27 04:39:54 -07:00
ethernet	c918d07b50	refactor(ci): rewrite docker tests to check built container	2026-06-26 19:15:18 -07:00
ethernet	638243726e	refactor(ci): faster docker builds via --link and chmod removal	2026-06-26 19:15:18 -07:00
zapabob	e55ddc3e33	fix(mcp): suppress interactive OAuth stdin prompts during background discovery (#35927) When an MCP server requires OAuth, the interactive `hermes` TUI froze on startup: background MCP discovery hit the OAuth flow, which on an interactive TTY spawns a daemon thread doing a blocking `sys.stdin.readline()` (the "paste the redirect URL" fallback in mcp_oauth._wait_for_callback). That thread competes with the TUI's own stdin reader for the same terminal, so keystrokes get swallowed and the TUI appears frozen (up to the 300s OAuth timeout). Reported symptom: "MCP OAuth: authorization required / Open this URL ... the tui is freezing, not respond to typing." Add a thread-local `suppress_interactive_oauth()` context manager in tools/mcp_oauth.py; `_is_interactive()` returns False while it's active, so the stdin paste-thread and prompt are never created. Background discovery (hermes_cli/mcp_startup.py, tui_gateway/entry.py) now runs discovery inside that context, so OAuth-requiring servers soft-skip (raise OAuthNonInteractiveError, already handled) instead of stealing the TUI's stdin. A real `hermes mcp login` on the main thread is unaffected (thread-local). Salvaged from #35945 by @zapabob (authorship preserved via cherry-pick; resolved a conflict against main's new mcp_discovery_timeout / wait_for_mcp_discovery refactor, keeping both). Verified E2E: with suppression the paste prompt is NOT printed and no stdin thread spawns (raises OAuthNonInteractive soft-skip); without it the prompt shows (the freeze). Mutation-verified (removing the suppress check in _is_interactive fails the regression test). 76 tests pass, ruff clean. Closes #35927.
SELF-REVIEW FIX: the original #35945 used threading.local(), which does NOT propagate to the dedicated mcp-event-loop thread where OAuth actually runs (discover_mcp_tools dispatches the connect via run_coroutine_threadsafe), so the suppression was a NO-OP in production (the tests passed only by stubbing out the cross-thread dispatch). Converted to a contextvars.ContextVar, which asyncio copies onto the scheduled coroutine — empirically verified suppression now holds on the mcp-event-loop thread through the real _run_on_mcp_loop path. Added a cross-thread regression test (fails on threading.local, passes on the ContextVar) so the no-op can't regress.	2026-06-27 04:59:23 +05:30
briandevans	2d8c44ac87	fix(hermes-home): only honour legacy dir layout when it has content get_hermes_dir(new_subpath, old_name) returned the legacy <old_name>/ location as soon as it existed on disk — even when empty. When an empty legacy stub is created on a profile that already has populated data at the new consolidated <new_subpath>/ (install scaffolds, profile init, a stray mkdir, or ensure_hermes_home() recreating legacy dirs), the resolver silently flipped to the empty legacy dir and the real data became invisible. No log, no error — the feature behaved as if state was wiped. Reproduced as a Discord pairing store losing every approved user when an empty pairing/ shadowed the populated platforms/pairing/. Resolve the legacy path only when it has content: a populated directory (any entry) or a non-directory file counts; an empty directory falls through to the new layout. Inspection failures (PermissionError on lstat/iterdir, or any OSError short of FileNotFoundError) are treated as "occupied" so a transient error never orphans legacy data — only a genuine FileNotFoundError counts as absent. The lstat()-based gate also fixes the prior exists()/is_dir() path swallowing PermissionError and mis-reading an unreadable legacy dir as absent. This hardens all 11+ call sites that share the resolver (pairing, image/audio/video/document caches, matrix/whatsapp session stores, vision/credential/tts/browser dirs). Adds TestGetHermesDir regression coverage (empty/populated/subdir/file/unreadable/unstatable cases) and updates test_credential_files to populate its legacy dirs so they still count as content. Closes #27602 Closes #27715	2026-06-27 04:57:15 +05:30
kshitijk4poor	cdb1dfbc49	fix: use os.pathsep, add tests, update tips for multi-root support - Use os.pathsep instead of literal ':' so Windows paths (C:\dir) and the Windows separator ';' work correctly. - Add 9 tests covering multi-root behavior: writes inside first/second root, writes outside all roots, trailing/leading/double separators, all-separators edge case, static deny priority, duplicate dedup. - Update hermes_cli/tips.py tip string to mention multiple paths. - Update docs to mention os.pathsep / ; on Windows. Follow-up for salvaged PR #49557.	2026-06-27 04:01:12 +05:30
xxxigm	2608f78b93	test(delegate): cover stale parent base_url inheritance for subagents Add regression tests ensuring delegate_task passes the parent's active localhost endpoint to child agents instead of a leftover OpenRouter URL.	2026-06-27 03:59:36 +05:30
liuhao1024	515192c4b9	fix(tools): use start_new_session instead of preexec_fn to prevent SIGSEGV in multi-threaded processes preexec_fn=os.setsid runs Python code in the forked child before exec, which is unsafe in multi-threaded processes (CPython docs). When the Desktop gateway loads native libraries (onnxruntime, BLAS, provider SDKs) with active thread pools, the fork can SIGSEGV before the child execs. Replace all preexec_fn usage with start_new_session=True, which provides the same setsid/process-group semantics without running Python in the fork. This is already the pattern used throughout hermes_cli/gateway.py and hermes_cli/_subprocess_compat.py. Fixes #46789	2026-06-27 03:08:41 +05:30
Teknium	525e1e775d	fix(skills): background review fork respects pinned skills (#53226) The autonomous self-improvement review fork could still write to a pinned skill — only external/bundled/hub-installed/protected-builtin skills were guarded. The curator skips pinned skills from every auto-transition; the review fork is the same kind of no-user-present actor and must too. Adds a pin check to _background_review_write_guard so background-origin edit/patch/delete/write_file/remove_file on a pinned skill are refused. Stricter than the foreground _pinned_guard (delete-only) by design: with no user in the loop there is no one to consent to an edit. Fixes #25839	2026-06-26 12:49:33 -07:00
briandevans	3c8d3ecfa0	fix(approval): extend gateway-lifecycle guard to launchctl and pidof-based kills The dangerous-command approval layer already blocks `hermes gateway (stop|restart)`, `pkill/killall hermes|gateway`, and `kill ... $(pgrep ...)`. A reporter noted on #33071 that the agent can still achieve the same effect by driving launchd directly against the gateway's service label (`launchctl stop ai.hermes.gateway`, `launchctl kickstart -k system/ai.hermes.gateway`, etc.) or by substituting `pidof` for `pgrep` in the kill-expansion form. This widens the "Gateway lifecycle protection" block in `tools/approval.py` to cover both vectors: - `launchctl (stop|kickstart|bootout|unload|kill|disable|remove)` scoped to commands that target a Hermes label (`hermes`, `ai.hermes`). Read-only inspection (`launchctl print …`, `launchctl list`) and operations against unrelated labels remain unflagged. - `kill ... $(pidof …)` and the backtick form, alongside the existing `pgrep` expansion. `pidof` is the BSD/Linux equivalent and is equally opaque to the `(pkill|killall) … hermes` name pattern. Intentionally left out of scope: plain `kill -TERM <numeric_pid>` with a PID looked up out-of-band. Catching that would require runtime PID state and would break the existing `TestPgrepKillExpansion::test_safe_kill_pid_not_flagged` contract, which guarantees that a plain literal-PID `kill 12345` stays safe.	2026-06-26 11:38:28 -07:00
Teknium	3d735fe156	fix(skills-hub): surface per-tap providers (NVIDIA/OpenAI/...) in runtime search (#53191) Natural-language skill search returned a short, arbitrary list and never surfaced NVIDIA (or OpenAI/Anthropic/HuggingFace) skills. Two causes: 1. The runtime index collapses every GitHub tap into source="github", so there was no way to find or filter by provider at the CLI — the per-tap identity only existed in the docs-site catalog. 2. HermesIndexSource.search matched only name/description/tags (not the identifier or provider) and broke at the first `limit` hits in raw index order, burying the most relevant skills. `search` also defaulted to --limit 10 against an 86k-entry catalog. Changes: - GitHubSource stamps a per-tap provider label (extra.provider) on each skill via github_provider_for(); source stays "github" so dedup/floor/index-skip logic is untouched. Flows into the built index. - HermesIndexSource.search now matches identifier + provider too, and collect-then-ranks (exact > prefix > whole-word > substring) instead of break-at-limit. - --source nvidia|openai|anthropic|huggingface|voltagent|gstack|minimax provider filters for browse/search (narrows merged results by provider). - search --limit default 10 -> 25; table Source column shows the provider label for github skills. Tested: 181 unit tests pass; E2E against the live runtime index confirms 'nvidia'/'cuda' searches now surface NVIDIA-provider skills and --source nvidia narrows to exactly the NVIDIA catalog.	2026-06-26 11:04:41 -07:00
Teknium	d430684d7c	fix(gateway,windows): respawn gateway windowless after GUI update (#52239) The post-update gateway restart path relaunched the gateway with the venv's console `python.exe` (via `get_python_path()` in `_gateway_run_args_for_profile`). On Windows this leaves a terminal window open permanently: uv's `venv\Scripts\python.exe` is a launcher shim that re-execs the base console interpreter, which allocates its own conhost — and `CREATE_NO_WINDOW` cannot suppress that second window. The clean-start path (`_spawn_detached`) already dodges this by routing through `_resolve_detached_python` to use the windowless base `pythonw.exe`; the restart watcher did not. Symptom (reported on Windows 11): after an in-app GUI update, a console window for the gateway stays open and never closes. Confirmed on the reporter's box — the running gateway was `python.exe ... gateway run --replace` with a live conhost child and the foreground "Press Ctrl+C to stop" banner, born exactly at the update's "Restarting Windows gateway" log line. Fix: - Add `gateway_windows.windowless_gateway_restart_spec(run_argv)` which rewrites a console-python gateway argv into the windowless `pythonw.exe` equivalent and returns the cwd + env overlay (VIRTUAL_ENV / PYTHONPATH / HERMES_HOME) the base interpreter needs to import `hermes_cli` without the venv launcher's site config. No-op on POSIX. - `_spawn_gateway_restart_watcher` now applies that rewrite on Windows and threads cwd= / env= into the inlined respawn Popen. Covers both restart entry points (`launch_detached_profile_gateway_restart` and `launch_detached_gateway_restart_by_cmdline`). CREATE_NO_WINDOW | DETACHED_PROCESS | CREATE_BREAKAWAY_FROM_JOB and the breakaway-denied fallback are all preserved.
Verified E2E on a real Windows 11 box: drove the actual watcher against a dummy old-pid; the respawned gateway came up as `pythonw.exe` (zero console python, no conhost child) and booted fully (housekeeping + kanban dispatcher started → imports resolved under the base interpreter). Tests: TestWindowlessGatewayRestartSpec (behavior) + TestGatewayDetachedWatcherWindowsFlags regression assert. Pre-existing Linux-only failures on a Windows host (SIGKILL, systemd, docker-root) confirmed identical on the bare base.	2026-06-26 17:39:46 +00:00
liuhao1024	d9f1f1a1de	fix(terminal): prefer $SHELL over bash for background process spawning (#42203) On macOS, terminal(background=true) silently failed: the process returned a session_id and exit_code=0 but the command never ran (empty stdout, no side effects). Root cause is two interacting issues: 1. _find_shell was aliased to _find_bash, which prefers `shutil.which("bash")` → /bin/bash (GNU bash 3.2, still shipped on macOS) over $SHELL (/bin/zsh). 2. process_registry.spawn_local runs [shell, "-lic", "set +m; <cmd>"] with stdin=/dev/null. bash 3.2 as a login shell sources ~/.bash_profile, which on many macOS setups contains `exec /bin/zsh -l`; that exec replaces bash but drops the -c argument, so the command is swallowed (exit 0, no output). Decouple _find_shell from _find_bash: _find_shell now prefers the user's configured $SHELL on POSIX (the shell they actually log in with), falling back to _find_bash when $SHELL is unset/missing. _find_bash is unchanged, so callers that genuinely need bash (e.g. the _run_bash login-shell snapshot) keep bash semantics. zsh handles -lic correctly even with redirected stdin. Salvaged from #42219 by @liuhao1024 (authorship preserved via cherry-pick). On top of the original (8 unit tests covering $SHELL-set/unset/missing/empty, Windows-ignores-$SHELL, _find_bash-unchanged), added an E2E regression test that reproduces the real bash-3.2 login-shell swallow (exit 0 / no file) and asserts the shell _find_shell selects actually executes a -lic background command. Mutation-verified: reverting _find_shell to the bash alias fails the $SHELL-preference test. Bug reproduced directly: /bin/bash 3.2 -lic with a .bash_profile->exec-zsh creates no file; zsh -lic does. Closes #42203. Supersedes #42290.	2026-06-26 20:45:32 +05:30
kyssta-exe	07cc567dfa	fix(security): add circuit breaker for tirith crashes to prevent agent hangs (#41400)	2026-06-26 15:26:08 +05:30
teknium1	fbfccbb3ee	fix(security): align cron invisible-unicode set with install-time scanner The cron runtime tripwire (_scan_cron_prompt) used a 10-char invisible-unicode set while the install-time scanner (threat_patterns.INVISIBLE_CHARS) flags 17. The cron-local set was missing U+2062-U+2064 (invisible math operators) and U+2066-U+2069 (directional isolates), so a directive obfuscated with one of those codepoints (e.g. "ig<U+2063>nore all previous instructions") slipped past the runtime cron gate while being caught at install time. Import the canonical set so the cron tripwire and install scanner can't drift apart again. Emoji-ZWJ protection (_zwj_has_emoji_neighbour) is unchanged. Fixes #35075 Co-authored-by: rlaope <piyrw9754@gmail.com>	2026-06-26 01:11:11 -07:00
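A minimal sketch of the gap fbfccbb3ee closes. It contains only the seven codepoints the commit message names (U+2062-U+2064 and U+2066-U+2069); the remaining members of the 17-character canonical set are not listed in this log and are deliberately left out. `NAMED_CODEPOINTS` and `find_invisible` are hypothetical names, not the project's `_scan_cron_prompt` or `threat_patterns.INVISIBLE_CHARS`.

```python
# Only the codepoints named in the commit message; the real canonical set is larger.
NAMED_CODEPOINTS = frozenset(
    chr(cp) for cp in (*range(0x2062, 0x2065), *range(0x2066, 0x206A))
)


def find_invisible(text, invisible=NAMED_CODEPOINTS):
    """Return (index, 'U+XXXX') for every flagged character in `text`."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in invisible]
```

Sharing one set between the runtime and install-time checks, as the commit does by importing the canonical set, is what keeps the two scanners from drifting apart.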
Teknium	099df3cd89	fix(security): stop blocking AGENTS.md/SOUL.md that name an agent 'Praxis' (#52925 ) The known_c2_framework threat pattern included 'praxis' in its alternation alongside genuine offensive-security tool brands (Cobalt Strike, Sliver, Havoc, Mythic, Metasploit, Brainworm). Unlike those distinctive brand names, 'praxis' is a common English word (Greek for practice/action) and a legitimate agent name, so any context file that mentioned an agent named Praxis matched at 'context' scope and the whole AGENTS.md / SOUL.md was replaced with a [BLOCKED] placeholder before it reached the system prompt. Remove 'praxis' from the alternation and add a guard comment: every token in this list must be a distinctive tool brand, not a common word. Real C2 brands still fire.	2026-06-26 00:36:01 -07:00
teknium1	4d0dd6bd52	test(mcp): make invalid_client tests interactive under hermetic env The new _maybe_flag_poisoned_client tests built a provider via get_or_build_provider without an interactive stdin. Under the hermetic test env (no TTY, no cached tokens), the non-interactive guard in mcp_oauth_manager._make_provider raised OAuthNonInteractiveError before the provider was built, failing 6 tests in CI parity (they passed locally where stdin was a TTY). Thread monkeypatch into _provider_with_token_endpoint and present an interactive stdin, matching the sibling test_manager_builds_hermes_provider_subclass.	2026-06-26 00:35:27 -07:00
Max Hsu	075f93ad78	fix(mcp): auto-recover from invalid_client on stale OAuth client registration Fixes #36767. Two complementary recoveries for the recurring "delete three cache files and re-auth by hand" ritual when an MCP server's dynamically-registered OAuth client goes dead server-side (IdP redeploy / DB wipe / rebrand): - Auto-heal (token-endpoint subset): HermesMCPOAuthProvider now sniffs auth-flow responses and, on a 400/401 `invalid_client` from the discovered token endpoint, backs up + deletes `<server>.client.json` and `.meta.json` and clears the in-memory client so the SDK re-runs RFC 7591 dynamic client registration on the next flow. Conservative by construction: only dynamically-registered (non config-supplied) clients, only the token endpoint, only on a word-boundary `invalid_client` match (so RFC 7591's `invalid_client_metadata` does not trip it); best-effort so a miss never breaks the live flow. Covers both code-exchange and refresh when the token endpoint was discovered. Tokens are preserved. - `hermes mcp reauth [<name>\|--all]`: the reporter's primary symptom — the IdP's in-browser "Redirect URI Mismatch" — produces no HTTP signal (the SDK only sees a callback timeout), so it cannot be auto-detected. The new command re-auths one or ALL `auth: oauth` servers, serially: one browser flow at a time, which also fixes the startup popup storm when several servers are stale at once. Single-server reauth is factored out of `mcp login` and shared. Tests: +14 (poison helper x2; token-endpoint detection x5 incl. wrong-endpoint, success-response, pre-registered, and invalid_client_metadata negative guards; a bridge integration test driving the real async_auth_flow generator to prove the detection hook preserves the bidirectional asend() forwarding contract; reauth CLI x6). Verified against the pinned mcp==1.26.0: scripts/run_tests.sh 122/122 green for the touched suites; check-windows-footguns.py and ruff clean. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 00:35:27 -07:00
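The word-boundary guard in 075f93ad78 can be illustrated with a small sketch. The regex and the `looks_like_dead_client` helper are assumptions made for illustration; the commit only states that the match is word-bounded and limited to 400/401 responses from the token endpoint.

```python
import re

# "_" counts as a word character, so \b does not fire inside "invalid_client_metadata".
_INVALID_CLIENT = re.compile(r"\binvalid_client\b")


def looks_like_dead_client(status_code, body):
    """True only for a 400/401 whose body names the exact `invalid_client` error."""
    return status_code in (400, 401) and bool(_INVALID_CLIENT.search(body))
```

Because the underscore is a word character, `invalid_client_metadata` never satisfies the trailing `\b`, which is the negative guard the commit's tests describe.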
kshitij	a28b939092	Merge pull request #52678 from kshitijk4poor/salvage/52502-fuzzy-boundary fix(fuzzy-match): preserve boundary space after whitespace-normalized match (#52491)	2026-06-26 10:59:14 +05:30
yu-xin-c	96bc524a71	fix(curator): protect external skills from background curation	2026-06-25 22:03:02 -07:00
teknium1	6c58878e7d	fix(browser): force secret-pattern redaction on browser_type display Force redact_sensitive_text(force=True) on the browser_type text arg so recognized credentials (API keys, tokens, JWTs) are masked in tool progress, previews, callbacks, and return payloads even when the global security.redact_secrets opt-out is set — a typed credential reaching chat history is a security boundary, not log hygiene. Normal typed text matches no pattern and stays fully readable for debuggability. Tests assert the API-key-shaped secret is masked across every surface and that normal text passes through unchanged.	2026-06-25 22:02:22 -07:00
rebel	8ff426e53b	fix: redact browser typed text surfaces	2026-06-25 22:02:22 -07:00
teknium1	43b8ba4181	fix(telegram): preserve Bot API update queue on watcher reconnect After a prolonged outage the in-process network-error ladder escalates to fatal and GatewayRunner._platform_reconnect_watcher rebuilds a fresh adapter that reconnects through the bootstrap path. That path called start_polling(drop_pending_updates=True), discarding every update Telegram queued during the outage — all messages sent while the bot was down were silently lost. The in-process ladder and 409-conflict handler already passed drop_pending_updates=False; only bootstrap did not distinguish a cold first boot from a reconnect. Thread an is_reconnect signal from the watcher through _connect_adapter_with_timeout into adapter.connect(). The base BasePlatformAdapter.connect() gains a keyword-only is_reconnect=False so every adapter inherits a tolerant signature (no per-platform breakage when the runner forwards the kwarg). Telegram translates is_reconnect into drop_pending_updates=not is_reconnect on both the polling and webhook bootstrap calls. Cold boot still drops the stale queue; a watcher reconnect preserves it. Fixes #46621. Co-authored-by: annguyenNous <annguyen@nousresearch.com> Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com> Co-authored-by: Kewe63 <Kewe63@users.noreply.github.com>	2026-06-25 21:29:57 -07:00
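The signal threading in 43b8ba4181 reduces to one translation, sketched here with stand-in classes. `SketchBaseAdapter`, `SketchTelegramAdapter` and `_start_polling` are hypothetical simplifications, not the real `BasePlatformAdapter` or the Telegram library call.

```python
class SketchBaseAdapter:
    """Stand-in base class: every adapter tolerates the keyword-only flag."""

    def connect(self, *, is_reconnect=False):
        return True


class SketchTelegramAdapter(SketchBaseAdapter):
    def __init__(self):
        self.polling_calls = []  # records the drop_pending_updates value used

    def _start_polling(self, *, drop_pending_updates):
        # Placeholder for the real polling/webhook bootstrap call.
        self.polling_calls.append(drop_pending_updates)

    def connect(self, *, is_reconnect=False):
        # Cold boot drops the stale queue; a watcher reconnect preserves it.
        self._start_polling(drop_pending_updates=not is_reconnect)
        return True
```

Making the flag keyword-only with a default means adapters that ignore it keep working when the runner forwards the kwarg.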
Harjoth Khara	233ef98afe	fix(docker): skip symlinked stage2 chown targets (#52789) Prevents stage2-hook.sh recursive chown from following a symlinked $HERMES_HOME/home (or profiles/cron) and destroying the host user's home directory. Also guards top-level state-file chowns and refuses first-boot seeding through symlinks. Fixes #52781. Co-authored-by: harjoth <harjoth.khara@gmail.com>	2026-06-26 12:09:52 +10:00
Que0x	b8fc8c908b	fix(approval): fold Windows absolute home paths in dangerous-command detection The detector folds absolute home / Hermes-home prefixes into their canonical ~/ and ~/.hermes/ forms so static patterns catch /home/alice/.bashrc the same way they catch ~/.bashrc (`abd69b81`). On native Windows this fold never fired, so terminal commands writing to shell startup files, ~/.ssh/authorized_keys, or ~/.hermes/config.yaml / .env returned "safe" and skipped the approval prompt — and config.yaml carries the approval policy itself. Two compounding causes: 1. The fold ran after the backslash-escape strip (r\m -> rm), which dissolves the backslash separators in a Windows path (C:\Users\alice\.bashrc -> C:Usersalice...) before the fold could match. It now runs before the strip. 2. The fold only recognized POSIX absolute paths and only the home prefix, leaving multi-segment backslash suffixes (\.ssh\authorized_keys) to be mangled by the strip. Consolidated into _home_prefix_fold_regex / _fold_home_prefixes: match a home prefix with either separator, capture the rest of the path token, and normalize its separators to / so multi-segment patterns match. The degenerate-path guard generalizes count("/") >= 2 to "at least two components below the root" (also rejecting a bare drive root C:\). HOME is consulted directly because Windows' expanduser ignores it; the more specific Hermes home is folded first, longest candidate first, so neither fold clobbers the other. POSIX behavior unchanged; the r\m -> rm anti-obfuscation strip still runs. Adds TestWindowsAbsolutePathFolding, which monkeypatches a Windows-style HOME/HERMES_HOME so the behavior is also exercised on the CI runner.	2026-06-25 17:49:39 -07:00
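A rough sketch of the fold described in b8fc8c908b, and of why it has to run before the backslash strip. Both functions are hypothetical simplifications of `_fold_home_prefixes` and the anti-obfuscation strip: quoting is ignored, only one home is folded, and the exact regexes are assumptions.

```python
import re

_SEP = r"[\\/]+"  # one or more path separators of either kind


def fold_home_prefix(command, home):
    """Rewrite an absolute home prefix in `command` to ~ and normalize separators."""
    parts = [p for p in re.split(_SEP, home) if p]
    has_drive = bool(parts) and parts[0].endswith(":")
    below_root = parts[1:] if has_drive else parts
    if len(below_root) < 2:  # degenerate home such as "/" or a bare drive root
        return command
    lead = "" if has_drive else _SEP
    prefix = lead + _SEP.join(re.escape(p) for p in parts)
    # Stop at a separator, whitespace, or end so /home/alicebob is left alone.
    pattern = re.compile(prefix + r"(?=[\\/]|\s|$)(?P<rest>\S*)")
    return pattern.sub(lambda m: "~" + m.group("rest").replace("\\", "/"), command)


def strip_backslash_escapes(command):
    """Assumed form of the anti-obfuscation strip: drop a backslash before any char."""
    return re.sub(r"\\(.)", r"\1", command)
```

Running `strip_backslash_escapes` first dissolves the separators of a Windows path, so the fold has nothing left to match; folding first yields `~/.ssh/authorized_keys`, which the static patterns can catch.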
brooklyn!	ffa3d3c811	Merge pull request #49037 from NousResearch/bb/projects-paradigm feat(desktop): first-class projects — sidebar, coding rail, review pane, and agent project tools	2026-06-25 17:49:05 -05:00
Gille	bf0513bca0	test(windows): align gateway restart CI coverage	2026-06-25 14:42:38 -07:00
Brooklyn Nicholson	cb3f8ec03d	fix(tools): isolate per-session worktree cwd	2026-06-25 16:40:27 -05:00
Teknium	c6575df927	feat(moa): expose MoA presets as selectable virtual models (#46081 ) * feat(moa): expose MoA presets as selectable virtual models Reconstructed onto current main (PR #46081's base had diverged with no common ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual provider: each named preset is a selectable model under provider 'moa', and the preset's aggregator is the acting model that answers and calls tools. Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same batch pattern delegate_task uses) — all references dispatched at once, collected when every one finishes, then handed to the aggregator. Output order is preserved, failures and the MoA-recursion guard stay isolated per reference. - Removed the old mixture_of_agents model tool and moa toolset. - Added moa as a virtual provider in the provider/model inventory. - /moa is shortcut behavior over model selection (default preset / named preset / one-shot prompt). - Dashboard + Desktop manage named presets; presets appear in model pickers. - Parallel reference fan-out in agent/moa_loop.py with regression test. * fix(moa): thread moa_config through _run_agent to _run_agent_inner The reconstructed gateway MoA wiring declared moa_config on _run_agent (the profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper never forwarded it — _run_agent_inner had no such parameter, so the runtime hit NameError: name 'moa_config' is not defined on the compression-failure session sync path. Add moa_config to _run_agent_inner's signature and forward it from both wrapper call sites (multiplex and non-multiplex). Caught by tests/gateway/test_compression_failure_session_sync.py on CI shard test(4). 
* fix(moa): classify moa as a virtual provider in the catalog The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so provider_catalog() fell through to the default auth_type="api_key" with no env vars — tripping two catalog invariants: - test_provider_catalog: api_key providers must expose a credential env var - test_provider_parity: every hermes-model provider must be desktop-configurable moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that overlay as an auth_type fallback so the catalog reports moa as virtual (no real credential, no network endpoint). Exempt virtual providers from the desktop parity union check the same way 'custom' is exempt — derived from the catalog, not a hardcoded slug, so future virtual providers are covered too.	2026-06-25 13:52:06 -07:00
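The parallel reference fan-out in c6575df927 can be sketched with the standard library alone. `fan_out`, its result dictionaries and the `max_workers` default are hypothetical; the real logic lives in agent/moa_loop.py and is not shown in this log.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(references, call, max_workers=4):
    """Run `call(ref)` for every reference in parallel; keep input order."""

    def _safe(ref):
        # Isolate failures per reference so one bad model cannot sink the batch.
        try:
            return {"ref": ref, "ok": True, "output": call(ref)}
        except Exception as exc:
            return {"ref": ref, "ok": False, "error": str(exc)}

    if not references:
        return []
    workers = min(max_workers, len(references))  # bounded pool
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, not completion order.
        return list(pool.map(_safe, references))
```

The collected list would then be handed to the aggregator in one piece, matching the "dispatched at once, collected when every one finishes" behavior the commit describes.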
kshitij	42bea9e298	Merge pull request #52618 from NousResearch/salvage/14185-todo-coercion fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185)	2026-06-26 02:02:18 +05:30
liuhao1024	f23d077b5f	fix(fuzzy-match): preserve boundary space after whitespace-normalized match The trailing-whitespace expansion in _map_normalized_positions unconditionally consumed whitespace after the matched region — including the word-boundary space that separates the match from the next token. This caused silent file corruption when the fuzzy matcher fell back to the whitespace_normalized strategy. Guard the expansion on the normalized match actually ending with whitespace (i.e. the original had a run of spaces that were collapsed). When the match ends with a non-space character, the first whitespace in the original is a boundary and must not be consumed. Fixes #52491	2026-06-26 01:55:27 +05:30
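The guard from f23d077b5f, reduced to its core decision. `expand_trailing_whitespace` is a hypothetical helper, not `_map_normalized_positions` itself; it assumes `end` is the index in the original text just past the mapped match.

```python
def expand_trailing_whitespace(original, end, normalized_match):
    """Extend `end` over a collapsed whitespace run, but never eat a boundary space."""
    # Match ends on a non-space character: the next space separates tokens, keep it.
    if not normalized_match or not normalized_match[-1].isspace():
        return end
    # Match ends on whitespace: the original may hold a longer run that was collapsed.
    while end < len(original) and original[end] in " \t":
        end += 1
    return end
```

Without the first check, a match on `foo` in `foo bar` would swallow the separating space and a replacement would fuse the two tokens.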
helix4u	4efec63a34	fix(tools): let session_search match session titles	2026-06-26 01:12:26 +05:30
rob-maron	525ee58b43	krea	2026-06-25 12:38:33 -07:00
Tranquil-Flow	0be10607d9	fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185 ) todo_tool crashed with `AttributeError: 'str' object has no attribute 'get'` when the LLM emitted the `todos` param as a JSON-encoded string instead of an array, or as a list containing non-dict items (observed intermittently on Claude 4.5/4.6/4.7, and after a prior tool-call rejection where the model "self-corrects" by wrapping the list in json.dumps). Three additive guards, no behavior change for well-formed input: - todo_tool(): if `todos` is a str, json.loads it; reject unparseable strings and non-list values with a clear tool_error instead of crashing downstream. - _validate(): non-dict items return a {id:"?", content:"(invalid item)"} placeholder rather than calling .get() on a str/int/None. - _dedupe_by_id(): non-dict items get a synthetic key so _validate handles them. Salvaged from #14785 by @Tranquil-Flow (authorship preserved via cherry-pick). Comprehensive tests: JSON-string coercion (parse / unparseable / non-list / non-string), non-dict list items (str/None/int/mixed), and a well-formed- unchanged regression class — both guards mutation-verified to fail without them. Closes #14185. Supersedes #14187, #22505, #14350 (same fix, less/no test coverage) and #16952 (bundled unrelated scope-creep).	2026-06-25 23:42:42 +05:30
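The coercion guards from 0be10607d9 can be approximated as below. `coerce_todos` and its `(items, error)` return shape are hypothetical; only the placeholder's `id` and `content` values come from the commit message, and the real tool may carry additional fields.

```python
import json

_PLACEHOLDER = {"id": "?", "content": "(invalid item)"}


def coerce_todos(todos):
    """Return (items, None) on success or (None, error_message) on bad input."""
    if isinstance(todos, str):
        # Some models emit the list wrapped in json.dumps; try to unwrap it.
        try:
            todos = json.loads(todos)
        except ValueError:
            return None, "todos must be a list or a JSON-encoded list"
    if not isinstance(todos, list):
        return None, "todos must be a list"
    # Replace non-dict items instead of calling .get() on a str/int/None.
    items = [t if isinstance(t, dict) else dict(_PLACEHOLDER) for t in todos]
    return items, None
```

Well-formed input passes through unchanged, so the guards are additive.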
kshitij	d682f320b3	Merge pull request #52147 from NousResearch/salvage/29184-mcp-osv-nonblocking fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)	2026-06-25 23:39:44 +05:30
qdaszx	6305ac0e4b	fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184 ) During stdio MCP server startup, _run_stdio (an async method) called the synchronous check_package_for_malware() inline. That makes a blocking urllib HTTPS POST to api.osv.dev whose own timeout doesn't reliably cover a stalled SSL handshake, so an intermittent network issue froze the entire asyncio event loop for up to ~120s — blowing past the TUI/gateway's 15s startup budget and showing "gateway startup timeout". Run the check via asyncio.to_thread (off the loop) AND bound it with asyncio.wait_for(timeout=_OSV_MALWARE_CHECK_TIMEOUT_S=12s). The malware check is fail-open, so on timeout we log and proceed rather than blocking startup. Salvaged from #29190 by @qdaszx (re-applied on current main — the call site moved since the PR was opened), combining the to_thread approach also proposed in #29192 by @ygd58. Two load-bearing tests: event-loop-not-blocked-during- check and timeout-fails-open — both mutation-verified to fail against the old inline blocking call. Closes #29184. Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-06-25 23:30:41 +05:30
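The pattern in 6305ac0e4b, `asyncio.to_thread` bounded by `asyncio.wait_for`, looks roughly like this. `malware_preflight` and `slow_check` are hypothetical names, and returning `None` on timeout is an assumed way to represent "fail open"; only the 12-second budget comes from the commit message.

```python
import asyncio
import time

OSV_CHECK_TIMEOUT_S = 12.0  # budget quoted in the commit message


async def malware_preflight(check, package, timeout_s=OSV_CHECK_TIMEOUT_S):
    """Run a blocking `check(package)` off the event loop, bounded by a timeout."""
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(check, package), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Fail open: treat a stalled check as "no finding" and let startup proceed.
        return None


def slow_check(package):
    """Stand-in for a blocking network call that stalls."""
    time.sleep(0.3)
    return f"{package}: flagged"
```

Note that the worker thread is not killed on timeout; it simply stops blocking the event loop, which is what the startup budget needs.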

1274 commits