hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
Dale Nguyen	dbbf102b8e	fix(terminal): strip VIRTUAL_ENV/CONDA_PREFIX from terminal subprocess env The Hermes gateway runs inside its own venv, so its process environment carries VIRTUAL_ENV (and possibly CONDA_PREFIX). The terminal tool spawned subprocesses inheriting those markers. When the agent ran `uv sync`, `uv pip install`, `poetry install`, etc. in ANY other project directory, those tools honored the inherited VIRTUAL_ENV and rebuilt/synced that project's dependencies into the Hermes venv path — wiping Hermes' own runtime deps (and, when the other project pinned a different Python, replacing the interpreter), bricking the gateway on the next restart (#23473). Strip VIRTUAL_ENV/CONDA_PREFIX in both subprocess-env construction points in tools/environments/local.py — `_sanitize_subprocess_env` and `_make_run_env` — via a shared `_ACTIVE_VENV_MARKER_VARS` constant. The Hermes venv stays reachable because its bin dir is already first on PATH, so removing the active-environment markers is safe and only prevents the cross-project clobber. Adds TestActiveVenvMarkerStripping: end-to-end (markers in os.environ don't reach the spawned subprocess) and unit coverage for both functions, plus a guard on the marker constant. Also adds the AUTHOR_MAP entry for the salvaged contributor. Closes #23473	2026-06-28 01:04:20 +05:30
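The environment scrubbing described in dbbf102b8e could look roughly like the sketch below. The names `ACTIVE_VENV_MARKER_VARS`, `strip_active_venv_markers`, and `make_subprocess_env` are illustrative stand-ins, not the actual helpers in tools/environments/local.py. The sketch only shows the idea of dropping the markers while leaving PATH alone.

```python
import os

# Hypothetical stand-in for the shared marker constant named in the commit.
ACTIVE_VENV_MARKER_VARS = ("VIRTUAL_ENV", "CONDA_PREFIX")


def strip_active_venv_markers(env):
    """Return a copy of env without the active-environment marker variables."""
    return {k: v for k, v in env.items() if k not in ACTIVE_VENV_MARKER_VARS}


def make_subprocess_env(base=None):
    """Build the environment for a spawned command (assumed helper name).

    PATH is left untouched, so a venv whose bin dir is already first on
    PATH stays reachable even after the markers are removed.
    """
    source = os.environ if base is None else base
    return strip_active_venv_markers(dict(source))
```

Passing the result as `env=` to `subprocess.run` would keep tools such as `uv` or `poetry` from treating the parent's venv as the active one.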
Brandon Zarnitz	9c81c938d3	fix(approval): honour tirith_fail_open=false on Tirith ImportError (#20733 ) check_all_command_guards() swallowed ImportError from tools.tirith_security with an unconditional pass, leaving tirith_result["action"] as "allow" regardless of security.tirith_fail_open. When an operator sets tirith_fail_open: false they have explicitly opted into fail-closed behaviour; a missing or broken Tirith module must not silently permit command execution. Inside the except ImportError handler, read the live security config. When tirith_enabled is true and tirith_fail_open is false, synthesise a "warn"-action Tirith result so the command flows through the normal approval path (prompt the user, or block in cron/gateway contexts) instead of bypassing it. The default tirith_fail_open: true behaviour is unchanged. Adds three regression tests to tests/tools/test_approval.py: - fail_open=true + ImportError → silently allowed (no regression) - fail_open=false + ImportError → approval callback invoked, command denied - tirith_enabled=false → always allowed regardless of fail_open Fixes #20733 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> # Conflicts: # tests/tools/test_approval.py	2026-06-27 04:41:24 -07:00
Teknium	fe1c1c1121	fix(session_search): demote cron below interactive sessions in discover ranking (#53597 ) Cron jobs accumulate large volumes of repetitive vocabulary (recurring project names, dates, summaries) and out-number a user's interactive sessions. Under bare BM25 they dominate the top FTS rows, so discover's early-exit-at-N dedup collects only cron sessions and the user's own conversations never surface — "recall blindness" (#19434). - _order_for_recall() stable-sorts FTS rows so interactive sources rank above cron before lineage dedup; within each class BM25/recency order is preserved. Cron is demoted, not excluded, so it still surfaces when it is the only match. - raise discover scan limit 50 -> 300 so buried interactive matches are in hand for the demotion pass. Fixes the cron-flooding sub-bug of #19434. The split-brain sub-bug is covered by #52798; the child-session sub-bug is superseded by in-place compaction.	2026-06-27 04:41:22 -07:00
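A minimal sketch of the demote-not-exclude ordering from fe1c1c1121, assuming each FTS row is a dict with a `source` field (the real row shape and the body of `_order_for_recall()` are not shown in the log). It relies on Python's stable sort so that the incoming BM25/recency order survives within each class.

```python
def order_for_recall(rows):
    """Rank interactive rows above cron rows without dropping either.

    sorted() is stable, so rows with the same key keep their incoming
    order. The "source" key is an assumption about the row shape.
    """
    return sorted(rows, key=lambda row: row["source"] == "cron")


rows = [
    {"id": 1, "source": "cron"},
    {"id": 2, "source": "cli"},
    {"id": 3, "source": "cron"},
    {"id": 4, "source": "telegram"},
]
ordered_ids = [row["id"] for row in order_for_recall(rows)]
```

Because cron rows are only moved to the back, a query whose only matches are cron sessions still returns them.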
Teknium	cd592c105c	feat(send_message): native WhatsApp media delivery via Baileys bridge (#53598) send_message with MEDIA:/path to a WhatsApp target previously dropped the attachment: the WhatsApp branch never passed media_files, the plugin's _standalone_send accepted the param but only POSTed text, and WhatsApp was absent from the media-supported platform list. - send_message_tool: add a Platform.WHATSAPP media block (mirrors Feishu) that routes media_files through the whatsapp plugin's standalone_sender_fn, and add whatsapp to the supported-media list strings. - whatsapp adapter: _standalone_send now sends text first (skipped when the chunk is media-only), then uploads each file via the bridge /send-media endpoint with a mediaType derived from extension/is_voice/force_document, so images/videos/voice arrive as native bubbles instead of documents. - _bridge_media_type classifier maps ext -> image|video|audio|document. Closes #19105 (remaining send_message gap). Other items in the report (inbound video paths, image_generate auto-deliver, history dedup, native gateway bubbles) already landed on main.	2026-06-27 04:40:05 -07:00
Teknium	88c02469cc	fix(mcp): never permanently wedge the circuit breaker on a dead transport (#53599 ) A long-running gateway session could permanently lose an MCP server: once a stdio subprocess died (or transient drops accumulated over the session), the run loop exhausted its reconnect budget and returned, orphaning the task. With no listener for _reconnect_event, the circuit breaker's half-open probe could never revive the server — every probe hit a dead/absent session, re-armed the 60s cooldown, and looped forever until a full gateway restart (#16788). Root cause was split ownership of transport liveness between the run loop and the tool handler, plus a permanent give-up path. Fixed by one invariant: a non-shutdown server task is always reconnectable. - run loop parks (deregisters phantom tools, then awaits _reconnect_event) instead of returning when the reconnect budget is exhausted, so the task stays alive as a dormant listener - retry budget resets on every successful (re)connect, so a healthy long-lived server can't accumulate lifetime drops into a death sentence - half-open probe with no live session signals a reconnect (reviving a parked/dead task and respawning a dead stdio subprocess) and returns a clean 'reconnecting' error instead of writing into a dead pipe - breaker resets on successful session init across all transports (stdio/HTTP/SSE) — fully transport-agnostic, no PID/pipe polling Builds on the closed-PR cluster for this issue: keeps #49255's deregister-on- exhaustion insight and #21006's signal-don't-probe insight, discards the racy os.kill PID machinery. Co-authored-by: LeonSGP43 <LeonSGP43@users.noreply.github.com> Co-authored-by: srojk34 <srojk34@users.noreply.github.com>	2026-06-27 04:39:54 -07:00
teknium1	ab1f9b94c5	fix(telegram): accept @username chat_id in delivery paths (#13206 ) TELEGRAM_HOME_CHANNEL set to an @username (not a numeric chat ID) crashed all webhook/cron->Telegram home-channel delivery with 'ValueError: invalid literal for int()'. The Telegram Bot API accepts both a numeric chat_id and an @username string; Hermes was force-coercing every chat_id with int(). Add normalize_telegram_chat_id() (returns int for numeric values, passes @username strings through) and apply it at the Bot API send/edit sites in the Telegram adapter and the send_message tool. Username targets are now recognized as explicit targets in _parse_target_ref. Reapplies the approach from #13274 (season179), whose branch predated the gateway/platforms/telegram.py -> plugins/platforms/telegram/adapter.py relocation. Dupes: #13535 (Tranquil-Flow), #37572 (chewkaah). Co-authored-by: season179 <season.saw@gmail.com>	2026-06-27 04:01:58 -07:00
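The chat-id handling in ab1f9b94c5 can be illustrated as follows. The function name comes from the commit message, but the body is an assumed reconstruction from the stated behaviour (int for numeric values, `@username` passed through), not the shipped code.

```python
def normalize_telegram_chat_id(chat_id):
    """Return an int for numeric chat IDs and pass @username strings through."""
    if isinstance(chat_id, int):
        return chat_id
    text = str(chat_id).strip()
    if text.startswith("@"):
        # The Bot API accepts @username targets as strings.
        return text
    # Numeric IDs (including negative group/channel IDs) become ints.
    return int(text)
```

A value that is neither numeric nor an `@username` still raises `ValueError`, which seems preferable to silently sending to the wrong target.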
zapabob	e55ddc3e33	fix(mcp): suppress interactive OAuth stdin prompts during background discovery (#35927 ) When an MCP server requires OAuth, the interactive `hermes` TUI froze on startup: background MCP discovery hit the OAuth flow, which on an interactive TTY spawns a daemon thread doing a blocking `sys.stdin.readline()` (the "paste the redirect URL" fallback in mcp_oauth._wait_for_callback). That thread competes with the TUI's own stdin reader for the same terminal, so keystrokes get swallowed and the TUI appears frozen (up to the 300s OAuth timeout). Reported symptom: "MCP OAuth: authorization required / Open this URL ... the tui is freezing, not respond to typing." Add a thread-local `suppress_interactive_oauth()` context manager in tools/mcp_oauth.py; `_is_interactive()` returns False while it's active, so the stdin paste-thread and prompt are never created. Background discovery (hermes_cli/mcp_startup.py, tui_gateway/entry.py) now runs discovery inside that context, so OAuth-requiring servers soft-skip (raise OAuthNonInteractiveError, already handled) instead of stealing the TUI's stdin. A real `hermes mcp login` on the main thread is unaffected (thread-local). Salvaged from #35945 by @zapabob (authorship preserved via cherry-pick; resolved a conflict against main's new mcp_discovery_timeout / wait_for_mcp_ discovery refactor, keeping both). Verified E2E: with suppression the paste prompt is NOT printed and no stdin thread spawns (raises OAuthNonInteractive soft-skip); without it the prompt shows (the freeze). Mutation-verified (removing the suppress check in _is_interactive fails the regression test). 76 tests pass, ruff clean. Closes #35927. 
SELF-REVIEW FIX: the original #35945 used threading.local(), which does NOT propagate to the dedicated mcp-event-loop thread where OAuth actually runs (discover_mcp_tools dispatches the connect via run_coroutine_threadsafe), so the suppression was a NO-OP in production (the tests passed only by stubbing out the cross-thread dispatch). Converted to a contextvars.ContextVar, which asyncio copies onto the scheduled coroutine — empirically verified suppression now holds on the mcp-event-loop thread through the real _run_on_mcp_loop path. Added a cross-thread regression test (fails on threading.local, passes on the ContextVar) so the no-op can't regress.	2026-06-27 04:59:23 +05:30
kshitijk4poor	a67ddf5983	fix: drop isinstance(str) guard so client.base_url fallback works with httpx.URL The OpenAI SDK exposes client.base_url as an httpx.URL object, not str. The isinstance(live_raw, str) guard made this branch dead code in production. Use _normalized_runtime_url (which coerces via str()) so the fallback actually fires.	2026-06-27 03:59:36 +05:30
xxxigm	25b7348457	fix(delegate): inherit subagent endpoint from parent active client When parent_agent.base_url still carries a stale OpenRouter URL but the live OpenAI client already points at local Ollama, subagents were routing API calls to OpenRouter and failing with HTTP 401. Prefer _client_kwargs and the mounted client base_url when they disagree with the surface field.	2026-06-27 03:59:36 +05:30
liuhao1024	515192c4b9	fix(tools): use start_new_session instead of preexec_fn to prevent SIGSEGV in multi-threaded processes preexec_fn=os.setsid runs Python code in the forked child before exec, which is unsafe in multi-threaded processes (CPython docs). When the Desktop gateway loads native libraries (onnxruntime, BLAS, provider SDKs) with active thread pools, the fork can SIGSEGV before the child execs. Replace all preexec_fn usage with start_new_session=True, which provides the same setsid/process-group semantics without running Python in the fork. This is already the pattern used throughout hermes_cli/gateway.py and hermes_cli/_subprocess_compat.py. Fixes #46789	2026-06-27 03:08:41 +05:30
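For 515192c4b9, a POSIX-only sketch of the replacement pattern: `start_new_session=True` asks `subprocess` to call `setsid()` in the child without running any Python code between fork and exec. The helper name is illustrative.

```python
import subprocess
import sys


def spawn_in_new_session(argv):
    """Start argv as the leader of a new session (POSIX only).

    Equivalent in effect to preexec_fn=os.setsid, but the setsid() call
    happens inside subprocess's C fork/exec path, not in Python code.
    """
    return subprocess.Popen(argv, start_new_session=True, stdout=subprocess.PIPE)


# The child prints its own session id so the effect can be observed.
proc = spawn_in_new_session(
    [sys.executable, "-c", "import os; print(os.getsid(0))"]
)
child_sid = int(proc.communicate()[0])
```

After `setsid()`, the child is its own session leader, so its session id equals its PID.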
Teknium	525e1e775d	fix(skills): background review fork respects pinned skills (#53226 ) The autonomous self-improvement review fork could still write to a pinned skill — only external/bundled/hub-installed/protected-builtin skills were guarded. The curator skips pinned skills from every auto-transition; the review fork is the same kind of no-user-present actor and must too. Adds a pin check to _background_review_write_guard so background-origin edit/patch/delete/write_file/remove_file on a pinned skill are refused. Stricter than the foreground _pinned_guard (delete-only) by design: with no user in the loop there is no one to consent to an edit. Fixes #25839	2026-06-26 12:49:33 -07:00
briandevans	3c8d3ecfa0	fix(approval): extend gateway-lifecycle guard to launchctl and pidof-based kills The dangerous-command approval layer already blocks `hermes gateway (stop\|restart)`, `pkill/killall hermes\|gateway`, and `kill ... $(pgrep ...)`. A reporter noted on #33071 that the agent can still achieve the same effect by driving launchd directly against the gateway's service label (`launchctl stop ai.hermes.gateway`, `launchctl kickstart -k system/ai.hermes.gateway`, etc.) or by substituting `pidof` for `pgrep` in the kill-expansion form. This widens the "Gateway lifecycle protection" block in `tools/approval.py` to cover both vectors: - `launchctl (stop\|kickstart\|bootout\|unload\|kill\|disable\|remove)` scoped to commands that target a Hermes label (`hermes`, `ai.hermes`). Read-only inspection (`launchctl print …`, `launchctl list`) and operations against unrelated labels remain unflagged. - `kill ... $(pidof …)` and the backtick form, alongside the existing `pgrep` expansion. `pidof` is the BSD/Linux equivalent and is equally opaque to the `(pkill\|killall) … hermes` name pattern. Intentionally left out of scope: plain `kill -TERM <numeric_pid>` with a PID looked up out-of-band. Catching that would require runtime PID state and would break the existing `TestPgrepKillExpansion::test_safe_kill_pid_not_flagged` contract, which guarantees that a plain literal-PID `kill 12345` stays safe.	2026-06-26 11:38:28 -07:00
Teknium	3d735fe156	fix(skills-hub): surface per-tap providers (NVIDIA/OpenAI/...) in runtime search (#53191 ) Natural-language skill search returned a short, arbitrary list and never surfaced NVIDIA (or OpenAI/Anthropic/HuggingFace) skills. Two causes: 1. The runtime index collapses every GitHub tap into source="github", so there was no way to find or filter by provider at the CLI — the per-tap identity only existed in the docs-site catalog. 2. HermesIndexSource.search matched only name/description/tags (not the identifier or provider) and broke at the first `limit` hits in raw index order, burying the most relevant skills. `search` also defaulted to --limit 10 against an 86k-entry catalog. Changes: - GitHubSource stamps a per-tap provider label (extra.provider) on each skill via github_provider_for(); source stays "github" so dedup/floor/ index-skip logic is untouched. Flows into the built index. - HermesIndexSource.search now matches identifier + provider too, and collect-then-ranks (exact > prefix > whole-word > substring) instead of break-at-limit. - --source nvidia\|openai\|anthropic\|huggingface\|voltagent\|gstack\|minimax provider filters for browse/search (narrows merged results by provider). - search --limit default 10 -> 25; table Source column shows the provider label for github skills. Tested: 181 unit tests pass; E2E against the live runtime index confirms 'nvidia'/'cuda' searches now surface NVIDIA-provider skills and --source nvidia narrows to exactly the NVIDIA catalog.	2026-06-26 11:04:41 -07:00
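The collect-then-rank behaviour described in 3d735fe156 (exact > prefix > whole-word > substring) might be sketched as below. This is a simplified illustration over plain strings. The real `HermesIndexSource.search` matches several fields and is not reproduced here.

```python
import re


def match_rank(query, text):
    """Lower is better: 0 exact, 1 prefix, 2 whole word, 3 substring, None no match."""
    q, t = query.lower(), text.lower()
    if t == q:
        return 0
    if t.startswith(q):
        return 1
    if re.search(r"\b" + re.escape(q) + r"\b", t):
        return 2
    if q in t:
        return 3
    return None


def search(query, names, limit=25):
    """Score every candidate first, then sort and truncate to limit."""
    hits = [(match_rank(query, n), n) for n in names]
    hits = [(rank, n) for rank, n in hits if rank is not None]
    hits.sort(key=lambda pair: pair[0])  # stable: ties keep index order
    return [n for _, n in hits[:limit]]


names = ["mycuda-tools", "fast cuda kernels", "cuda-toolkit", "cuda", "numpy"]
results = search("cuda", names)
```

Truncating only after ranking is what keeps the best matches from being cut off by whatever happens to come first in the index.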
liuhao1024	d9f1f1a1de	fix(terminal): prefer $SHELL over bash for background process spawning (#42203 ) On macOS, terminal(background=true) silently failed: the process returned a session_id and exit_code=0 but the command never ran (empty stdout, no side effects). Root cause is two interacting issues: 1. _find_shell was aliased to _find_bash, which prefers `shutil.which("bash")` → /bin/bash (GNU bash 3.2, still shipped on macOS) over $SHELL (/bin/zsh). 2. process_registry.spawn_local runs [shell, "-lic", "set +m; <cmd>"] with stdin=/dev/null. bash 3.2 as a login shell sources ~/.bash_profile, which on many macOS setups contains `exec /bin/zsh -l`; that exec replaces bash but drops the -c argument, so the command is swallowed (exit 0, no output). Decouple _find_shell from _find_bash: _find_shell now prefers the user's configured $SHELL on POSIX (the shell they actually log in with), falling back to _find_bash when $SHELL is unset/missing. _find_bash is unchanged, so callers that genuinely need bash (e.g. the _run_bash login-shell snapshot) keep bash semantics. zsh handles -lic correctly even with redirected stdin. Salvaged from #42219 by @liuhao1024 (authorship preserved via cherry-pick). On top of the original (8 unit tests covering $SHELL-set/unset/missing/empty, Windows-ignores-$SHELL, _find_bash-unchanged), added an E2E regression test that reproduces the real bash-3.2 login-shell swallow (exit 0 / no file) and asserts the shell _find_shell selects actually executes a -lic background command. Mutation-verified: reverting _find_shell to the bash alias fails the $SHELL-preference test. Bug reproduced directly: /bin/bash 3.2 -lic with a .bash_profile->exec-zsh creates no file; zsh -lic does. Closes #42203. Supersedes #42290.	2026-06-26 20:45:32 +05:30
kyssta-exe	07cc567dfa	fix(security): add circuit breaker for tirith crashes to prevent agent hangs (#41400)	2026-06-26 15:26:08 +05:30
teknium1	fbfccbb3ee	fix(security): align cron invisible-unicode set with install-time scanner The cron runtime tripwire (_scan_cron_prompt) used a 10-char invisible-unicode set while the install-time scanner (threat_patterns.INVISIBLE_CHARS) flags 17. The cron-local set was missing U+2062-U+2064 (invisible math operators) and U+2066-U+2069 (directional isolates), so a directive obfuscated with one of those codepoints (e.g. "ig<U+2063>nore all previous instructions") slipped past the runtime cron gate while being caught at install time. Import the canonical set so the cron tripwire and install scanner can't drift apart again. Emoji-ZWJ protection (_zwj_has_emoji_neighbour) is unchanged. Fixes #35075 Co-authored-by: rlaope <piyrw9754@gmail.com>	2026-06-26 01:11:11 -07:00
Teknium	099df3cd89	fix(security): stop blocking AGENTS.md/SOUL.md that name an agent 'Praxis' (#52925 ) The known_c2_framework threat pattern included 'praxis' in its alternation alongside genuine offensive-security tool brands (Cobalt Strike, Sliver, Havoc, Mythic, Metasploit, Brainworm). Unlike those distinctive brand names, 'praxis' is a common English word (Greek for practice/action) and a legitimate agent name, so any context file that mentioned an agent named Praxis matched at 'context' scope and the whole AGENTS.md / SOUL.md was replaced with a [BLOCKED] placeholder before it reached the system prompt. Remove 'praxis' from the alternation and add a guard comment: every token in this list must be a distinctive tool brand, not a common word. Real C2 brands still fire.	2026-06-26 00:36:01 -07:00
Max Hsu	075f93ad78	fix(mcp): auto-recover from invalid_client on stale OAuth client registration Fixes #36767. Two complementary recoveries for the recurring "delete three cache files and re-auth by hand" ritual when an MCP server's dynamically-registered OAuth client goes dead server-side (IdP redeploy / DB wipe / rebrand): - Auto-heal (token-endpoint subset): HermesMCPOAuthProvider now sniffs auth-flow responses and, on a 400/401 `invalid_client` from the discovered token endpoint, backs up + deletes `<server>.client.json` and `.meta.json` and clears the in-memory client so the SDK re-runs RFC 7591 dynamic client registration on the next flow. Conservative by construction: only dynamically-registered (non config-supplied) clients, only the token endpoint, only on a word-boundary `invalid_client` match (so RFC 7591's `invalid_client_metadata` does not trip it); best-effort so a miss never breaks the live flow. Covers both code-exchange and refresh when the token endpoint was discovered. Tokens are preserved. - `hermes mcp reauth [<name>\|--all]`: the reporter's primary symptom — the IdP's in-browser "Redirect URI Mismatch" — produces no HTTP signal (the SDK only sees a callback timeout), so it cannot be auto-detected. The new command re-auths one or ALL `auth: oauth` servers, serially: one browser flow at a time, which also fixes the startup popup storm when several servers are stale at once. Single-server reauth is factored out of `mcp login` and shared. Tests: +14 (poison helper x2; token-endpoint detection x5 incl. wrong-endpoint, success-response, pre-registered, and invalid_client_metadata negative guards; a bridge integration test driving the real async_auth_flow generator to prove the detection hook preserves the bidirectional asend() forwarding contract; reauth CLI x6). Verified against the pinned mcp==1.26.0: scripts/run_tests.sh 122/122 green for the touched suites; check-windows-footguns.py and ruff clean. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 00:35:27 -07:00
kshitij	a28b939092	Merge pull request #52678 from kshitijk4poor/salvage/52502-fuzzy-boundary fix(fuzzy-match): preserve boundary space after whitespace-normalized match (#52491)	2026-06-26 10:59:14 +05:30
yu-xin-c	96bc524a71	fix(curator): protect external skills from background curation	2026-06-25 22:03:02 -07:00
teknium1	6c58878e7d	fix(browser): force secret-pattern redaction on browser_type display Force redact_sensitive_text(force=True) on the browser_type text arg so recognized credentials (API keys, tokens, JWTs) are masked in tool progress, previews, callbacks, and return payloads even when the global security.redact_secrets opt-out is set — a typed credential reaching chat history is a security boundary, not log hygiene. Normal typed text matches no pattern and stays fully readable for debuggability. Tests assert the API-key-shaped secret is masked across every surface and that normal text passes through unchanged.	2026-06-25 22:02:22 -07:00
rebel	8ff426e53b	fix: redact browser typed text surfaces	2026-06-25 22:02:22 -07:00
Teknium	5b5c79a8ef	feat(kanban): typed block reasons + unblock-loop breaker (#52848 ) * feat(kanban): typed block reasons + unblock-loop breaker Stops the kanban blocked-task loop: a worker blocks a task, a cron unblocks it, the worker re-blocks for the same reason, repeat forever. block_task now takes a typed kind and a persistent block_recurrences counter on the tasks table: - kind=dependency routes to todo (parent-gated, auto-resumed), never the human 'blocked' bucket a cron would keep unblocking. - needs_input/capability/transient/untyped land in blocked; each same-cause re-block after an unblock increments block_recurrences, and at BLOCK_RECURRENCE_LIMIT (default 2) the task routes to triage for a human instead of blocked. - unblock_task no longer resets block_recurrences (the amnesia that let the loop run unbounded); complete_task clears it on success. Wired through the worker kanban_block tool (new kind arg) and the hermes kanban block --kind CLI flag, both reporting where the task actually landed. Docs + 11 new tests; 536 existing kanban tests green. * test(kanban): make second-block notify test use a distinct block cause test_notifier_second_blocked_delivers blocked the same task twice with the same (untyped) reason, which now trips the new unblock-loop breaker and routes the second block to triage instead of blocked — so only one 'blocked' notification fired. The test's actual intent is that TWO distinct block cycles each notify; give the two cycles different kinds (needs_input then capability) so they're genuinely separate blocks. The same-cause loop→triage path is covered by test_kanban_block_kinds.py.	2026-06-25 21:46:58 -07:00
Que0x	b8fc8c908b	fix(approval): fold Windows absolute home paths in dangerous-command detection The detector folds absolute home / Hermes-home prefixes into their canonical ~/ and ~/.hermes/ forms so static patterns catch /home/alice/.bashrc the same way they catch ~/.bashrc (`abd69b81`). On native Windows this fold never fired, so terminal commands writing to shell startup files, ~/.ssh/authorized_keys, or ~/.hermes/config.yaml / .env returned "safe" and skipped the approval prompt — and config.yaml carries the approval policy itself. Two compounding causes: 1. The fold ran after the backslash-escape strip (r\m -> rm), which dissolves the backslash separators in a Windows path (C:\Users\alice\.bashrc -> C:Usersalice...) before the fold could match. It now runs before the strip. 2. The fold only recognized POSIX absolute paths and only the home prefix, leaving multi-segment backslash suffixes (\.ssh\authorized_keys) to be mangled by the strip. Consolidated into _home_prefix_fold_regex / _fold_home_prefixes: match a home prefix with either separator, capture the rest of the path token, and normalize its separators to / so multi-segment patterns match. The degenerate-path guard generalizes count("/") >= 2 to "at least two components below the root" (also rejecting a bare drive root C:\). HOME is consulted directly because Windows' expanduser ignores it; the more specific Hermes home is folded first, longest candidate first, so neither fold clobbers the other. POSIX behavior unchanged; the r\m -> rm anti-obfuscation strip still runs. Adds TestWindowsAbsolutePathFolding, which monkeypatches a Windows-style HOME/HERMES_HOME so the behavior is also exercised on the CI runner.	2026-06-25 17:49:39 -07:00
brooklyn!	ffa3d3c811	Merge pull request #49037 from NousResearch/bb/projects-paradigm feat(desktop): first-class projects — sidebar, coding rail, review pane, and agent project tools	2026-06-25 17:49:05 -05:00
Gille	e7d2f0b93c	fix(windows): suppress console flashes and harden gateway restarts	2026-06-25 14:42:38 -07:00
Brooklyn Nicholson	cb3f8ec03d	fix(tools): isolate per-session worktree cwd	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	4ffdedd369	feat(tools): add project workspace tools	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	e7811345c1	feat(kanban): link tasks to project worktrees	2026-06-25 16:40:26 -05:00
Teknium	c6575df927	feat(moa): expose MoA presets as selectable virtual models (#46081 ) * feat(moa): expose MoA presets as selectable virtual models Reconstructed onto current main (PR #46081's base had diverged with no common ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual provider: each named preset is a selectable model under provider 'moa', and the preset's aggregator is the acting model that answers and calls tools. Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same batch pattern delegate_task uses) — all references dispatched at once, collected when every one finishes, then handed to the aggregator. Output order is preserved, failures and the MoA-recursion guard stay isolated per reference. - Removed the old mixture_of_agents model tool and moa toolset. - Added moa as a virtual provider in the provider/model inventory. - /moa is shortcut behavior over model selection (default preset / named preset / one-shot prompt). - Dashboard + Desktop manage named presets; presets appear in model pickers. - Parallel reference fan-out in agent/moa_loop.py with regression test. * fix(moa): thread moa_config through _run_agent to _run_agent_inner The reconstructed gateway MoA wiring declared moa_config on _run_agent (the profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper never forwarded it — _run_agent_inner had no such parameter, so the runtime hit NameError: name 'moa_config' is not defined on the compression-failure session sync path. Add moa_config to _run_agent_inner's signature and forward it from both wrapper call sites (multiplex and non-multiplex). Caught by tests/gateway/test_compression_failure_session_sync.py on CI shard test(4). 
* fix(moa): classify moa as a virtual provider in the catalog The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so provider_catalog() fell through to the default auth_type="api_key" with no env vars — tripping two catalog invariants: - test_provider_catalog: api_key providers must expose a credential env var - test_provider_parity: every hermes-model provider must be desktop-configurable moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that overlay as an auth_type fallback so the catalog reports moa as virtual (no real credential, no network endpoint). Exempt virtual providers from the desktop parity union check the same way 'custom' is exempt — derived from the catalog, not a hardcoded slug, so future virtual providers are covered too.	2026-06-25 13:52:06 -07:00
kshitij	42bea9e298	Merge pull request #52618 from NousResearch/salvage/14185-todo-coercion fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185)	2026-06-26 02:02:18 +05:30
liuhao1024	f23d077b5f	fix(fuzzy-match): preserve boundary space after whitespace-normalized match The trailing-whitespace expansion in _map_normalized_positions unconditionally consumed whitespace after the matched region — including the word-boundary space that separates the match from the next token. This caused silent file corruption when the fuzzy matcher fell back to the whitespace_normalized strategy. Guard the expansion on the normalized match actually ending with whitespace (i.e. the original had a run of spaces that were collapsed). When the match ends with a non-space character, the first whitespace in the original is a boundary and must not be consumed. Fixes #52491	2026-06-26 01:55:27 +05:30
helix4u	4efec63a34	fix(tools): let session_search match session titles	2026-06-26 01:12:26 +05:30
rob-maron	525ee58b43	krea	2026-06-25 12:38:33 -07:00
Tranquil-Flow	0be10607d9	fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185 ) todo_tool crashed with `AttributeError: 'str' object has no attribute 'get'` when the LLM emitted the `todos` param as a JSON-encoded string instead of an array, or as a list containing non-dict items (observed intermittently on Claude 4.5/4.6/4.7, and after a prior tool-call rejection where the model "self-corrects" by wrapping the list in json.dumps). Three additive guards, no behavior change for well-formed input: - todo_tool(): if `todos` is a str, json.loads it; reject unparseable strings and non-list values with a clear tool_error instead of crashing downstream. - _validate(): non-dict items return a {id:"?", content:"(invalid item)"} placeholder rather than calling .get() on a str/int/None. - _dedupe_by_id(): non-dict items get a synthetic key so _validate handles them. Salvaged from #14785 by @Tranquil-Flow (authorship preserved via cherry-pick). Comprehensive tests: JSON-string coercion (parse / unparseable / non-list / non-string), non-dict list items (str/None/int/mixed), and a well-formed- unchanged regression class — both guards mutation-verified to fail without them. Closes #14185. Supersedes #14187, #22505, #14350 (same fix, less/no test coverage) and #16952 (bundled unrelated scope-creep).	2026-06-25 23:42:42 +05:30
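The guards in 0be10607d9 can be approximated with one small coercion step. `coerce_todos` and its `(value, error)` return shape are hypothetical. The placeholder dict mirrors the one quoted in the commit message.

```python
import json


def coerce_todos(todos):
    """Return (list_of_dicts, None) on success or (None, error_message)."""
    if isinstance(todos, str):
        # Some models wrap the list in json.dumps, so undo that first.
        try:
            todos = json.loads(todos)
        except json.JSONDecodeError:
            return None, "todos must be a list or a JSON-encoded list"
    if not isinstance(todos, list):
        return None, "todos must be a list"
    cleaned = [
        item if isinstance(item, dict) else {"id": "?", "content": "(invalid item)"}
        for item in todos
    ]
    return cleaned, None
```

Well-formed input passes through unchanged, which matches the "no behavior change" claim in the commit.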
kshitij	d682f320b3	Merge pull request #52147 from NousResearch/salvage/29184-mcp-osv-nonblocking fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)	2026-06-25 23:39:44 +05:30
qdaszx	6305ac0e4b	fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184 ) During stdio MCP server startup, _run_stdio (an async method) called the synchronous check_package_for_malware() inline. That makes a blocking urllib HTTPS POST to api.osv.dev whose own timeout doesn't reliably cover a stalled SSL handshake, so an intermittent network issue froze the entire asyncio event loop for up to ~120s — blowing past the TUI/gateway's 15s startup budget and showing "gateway startup timeout". Run the check via asyncio.to_thread (off the loop) AND bound it with asyncio.wait_for(timeout=_OSV_MALWARE_CHECK_TIMEOUT_S=12s). The malware check is fail-open, so on timeout we log and proceed rather than blocking startup. Salvaged from #29190 by @qdaszx (re-applied on current main — the call site moved since the PR was opened), combining the to_thread approach also proposed in #29192 by @ygd58. Two load-bearing tests: event-loop-not-blocked-during- check and timeout-fails-open — both mutation-verified to fail against the old inline blocking call. Closes #29184. Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-06-25 23:30:41 +05:30
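A sketch of the off-loop, bounded, fail-open pattern from 6305ac0e4b. The `check` callable and its boolean return value are assumptions for illustration. Only `asyncio.to_thread` and `asyncio.wait_for` are taken from the commit message.

```python
import asyncio
import time


async def preflight_fail_open(check, package, timeout_s):
    """Run a blocking check in a worker thread, bounded by timeout_s.

    Returns the check's verdict, or False (treated as "not flagged")
    when the check does not finish in time, so startup is never blocked.
    """
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(check, package), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Fail open: the worker thread may still be running in the background.
        return False


def slow_check(package):
    time.sleep(0.3)  # stands in for a stalled HTTPS request
    return True


def fast_check(package):
    return package == "evil-package"
```

Note that `wait_for` only stops waiting for the result. It cannot interrupt the thread, so the blocking call still runs to completion in the background.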
kshitijk4poor	15ee2d6f04	refactor: lightweight sudo count + drop chatty multi-sudo tip Replace _count_real_sudo_invocations (which called _rewrite_real_sudo_invocations and discarded the rewritten string) with a lightweight token scan that reuses the same tokeniser but skips string building. Remove the agent-facing tip about nested sudo in heredocs — the cache-cleared warning is enough.	2026-06-25 23:08:48 +05:30
xxxigm	8278d82e17	fix(terminal): improve sudo -S password delivery and cache invalidation Pipe one password line per sudo invocation in compound commands so a correct password is not rejected on the second `sudo` in `sudo a && sudo b`. Drop the session cache when sudo returns Authentication failed, surface sudo_auth_failed in the tool result, and add hints for interactive sessions.	2026-06-25 23:08:48 +05:30
brooklyn!	da0320bf40	Merge pull request #52285 from NousResearch/bb/verify-ledger feat(agent): record coding verification evidence	2026-06-24 23:07:10 -05:00
Brooklyn Nicholson	fcbdf3c356	feat(agent): record coding verification evidence Record foreground verification commands in a bounded, profile-scoped ledger and mark evidence stale when code edits change the workspace.	2026-06-24 22:35:27 -05:00
Victor Kyriazakos	b693bee100	feat(cron): thread-preferred continuable delivery (open a thread, mirror DM fallback) Continuable cron jobs (attach_to_session / cron.mirror_delivery, default OFF) now prefer a dedicated thread on thread-capable platforms, falling back to origin-DM mirroring where threads don't exist. - Thread-capable (Telegram topics, Discord/Slack threads): open a fresh thread for the job via the shipped adapter.create_handoff_thread, route the brief into it, and seed the thread-keyed session so the user's in-thread reply continues with full context. This is the 'continuable cron opens its own thread' interface. - DM-only (WhatsApp/Signal/SMS): create_handoff_thread returns None -> fall back to mirroring into the origin DM session (existing behaviour). Reuses existing infrastructure end-to-end: no new adapter surface, no provider-chain signature change: - adapter.create_handoff_thread (already implemented per-platform, returns None on unsupported platforms = the fallback signal) - the live SessionStore via adapter._session_store (already set on every adapter), reached without threading a new param through the frozen CronScheduler.start() contract - gateway.mirror.mirror_to_session for the seed/append - existing per-target delivery routing carries the new thread_id for free Mirrors GatewayRunner._process_handoff's open-thread-or-fallback + seed pattern, standalone for the cron delivery path. thread_seeded guards against a double-mirror after seeding. Scoped to the origin target only; fan-out/broadcast targets are never threaded or mirrored. Config docs updated (cron.mirror_delivery) + cronjob tool attach_to_session description reframed around continuable/thread-preferred. Tests: +5 (thread id returned on thread platform; None on DM platform; None without capability/loop; seed creates thread session + mirrors; seed no-op on empty). 22/22 in TestCronDeliveryMirror; 532 cron tests pass (4 failures pre-existing: croniter-not-installed + TZ).	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	98f3c19282	feat(cron): pass origin user_id to delivery mirror (send_message parity) Multi-participant parity with interactive send_message, which passes HERMES_SESSION_USER_ID to gateway.mirror.mirror_to_session so the mirror lands in the exact participant's session. - cronjob_tools._origin_from_env now captures user_id from the session context at job-create time (alongside platform/chat_id/thread_id). - _maybe_mirror_cron_delivery forwards user_id to mirror_to_session. - _deliver_result threads origin.user_id through for the origin target. Effect: in a per-user-isolated group chat (group_sessions_per_user=True, the default), the mirror resolves to the member who scheduled the job instead of conservatively no-op'ing on ambiguous candidates. DMs and shared group/thread sessions are unaffected (single candidate). Default still OFF. Tests: helper forwards user_id; E2E _deliver_result forwards origin user_id. 17/17 in TestCronDeliveryMirror; 527 cron tests pass (4 failures pre-existing: croniter-not-installed + TZ, identical on baseline).	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	1b181724fa	feat(cron): optional mirror of cron delivery into target chat session Adds an opt-in path so a cron job's delivered output is also appended to the TARGET chat's gateway session transcript (as an assistant turn), so a user reply to a recurring delivery (daily brief, reminder) is answered with the delivery in context instead of 'what is that?' amnesia. - Reuses the shipped gateway.mirror.mirror_to_session — the same primitive interactive send_message mirroring already uses. No messaging-toolset change (cron still can't call send_message; this rides delivery). - Gated: per-job attach_to_session overrides global cron.mirror_delivery (config.yaml). Default OFF — historical isolation preserved byte-for-byte. - Mirrors the CLEAN agent output, not the cron header/footer wrapper. - Alternation/cache-safe: append lands at a turn boundary, never mid-loop, never mutates the cached system prompt. Cold-start (no target session) is a silent no-op; mirror errors never fail a successful delivery. - Surfaced on the cronjob tool (attach_to_session) + config schema. Driven by enterprise cron-as-control-plane use case. 10 new tests; full cron + cronjob-tool suites pass (600).	2026-06-24 20:27:05 -07:00
Ben Barclay	c15945655f	fix(terminal): sanitize host/relative cwd OVERRIDE before it reaches docker run -w (#50636) terminal_tool() resolves a per-task cwd override that WINS over config["cwd"]: cwd = overrides.get("cwd") or config["cwd"] config["cwd"] is sanitized for container backends in _get_env_config() (host prefixes /Users/, /home/, C:\\, C:/ and relative paths are replaced with the backend default /root). But the override was applied RAW: it was never run through that guard. The gateway/TUI registers the host launch dir as a cwd override for workspace tracking (tui_gateway/server.py _register_session_cwd -> _terminal_task_cwd -> _session_cwd -> os.getcwd()), so on a container backend a host path leaked straight to `docker run -w <host-path>`: - Windows desktop: -w C:\Users\<user> -> container fails to start (exit 125) - POSIX: -w /home/<user> -> same The ACP adapter translates its override cwd (acp_adapter/session.py _translate_acp_cwd), but the gateway path did neither translation nor sanitization, so the override bypassed the one guard that would have caught it. Fix: extract the host/relative-path predicate into a shared _is_unusable_container_cwd() helper (so the existing _get_env_config() sanitizer and the new guard can't drift), and re-apply it to the resolved cwd at the override-resolution site. Valid in-container override paths (RL/benchmark sandboxes that set cwd to /workspace, /root, ...) are absolute non-host paths and pass through untouched. Tests: unit-pin the predicate (Windows backslash/forwardslash, POSIX home, macOS /Users, relative, valid container paths) AND an E2E call-site pin that drives terminal_tool() with a host-path override registered and asserts the cwd reaching _create_environment is sanitized. Mutation-verified: reverting the call-site guard makes the two host-path E2E tests fail (showing the raw host path leaking) while the valid-/workspace-override test stays green.	2026-06-25 02:33:40 +00:00
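A minimal sketch of the shared-predicate idea in c15945655f: one function decides whether a cwd is unusable inside a container, and it is applied to the resolved value (override or config), not only to the configured one. The function bodies are assumptions written from the commit message; the real `_is_unusable_container_cwd()` may differ, and treating an empty cwd as unusable is a choice made for this example.

```python
from typing import Optional

# Host prefixes and the backend default named in the commit message.
HOST_PREFIXES = ("/Users/", "/home/", "C:\\", "C:/")
CONTAINER_DEFAULT_CWD = "/root"


def is_unusable_container_cwd(cwd: Optional[str]) -> bool:
    """True for host-looking or relative paths (and, by assumption, empty ones)."""
    if not cwd:
        return True
    if cwd.startswith(HOST_PREFIXES):
        return True
    return not cwd.startswith("/")  # relative paths are rejected too


def resolve_container_cwd(override: Optional[str], configured: str) -> str:
    """Pick the override if present, then sanitize the RESOLVED value."""
    cwd = override or configured
    return CONTAINER_DEFAULT_CWD if is_unusable_container_cwd(cwd) else cwd
```

Because both the config path and the override path go through the same predicate, the two cannot drift apart, which is the stated reason for extracting the helper.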
Ben	d1cac0e5ef	feat(gateway): scale-to-zero idle detection + dormant-quiesce (Phase 0) The gateway-side BEHAVIOUR layer that consumes the relay scale-to-zero primitives (gateway-gateway Phase 5): the gateway decides it is idle and drives the relay transport dormant so the platform (Fly autostop:"suspend") can suspend the now-traffic-idle machine, which wakes on the connector's wakeUrl poke (decisions.md Q3=C', D1-D13). - gateway/scale_to_zero.py: pure helpers — scale_to_zero_enabled (the NAS Labs HERMES_SCALE_TO_ZERO stamp, D11/Q8=A), parse_idle_timeout_seconds (config.yaml gateway.scale_to_zero.idle_timeout_minutes, D2), messaging_is_relay_only_or_absent (F6/D1), should_arm (D1/D11/§3.4(1)), is_idle (D2/D3/F7). - gateway/run.py: _last_inbound_at clock stamped on user inbound in _handle_message (F13); the arm-gate + idle predicate + the _scale_to_zero_watcher dormant sequence (mark draining -> adapter go_dormant() -> cooldown), started only when armed. Deliberately NOT the stop path and NOT mark_resume_pending (F12/D13). - tools/process_registry.py: has_any_active() for the bg-work guard (D3/F7). - hermes_cli/config.py: gateway.scale_to_zero.idle_timeout_minutes default 5. Tests: 38 pure-logic + 6 watcher (incl. bg-work regression guard proven RED). Full relay + scale-to-zero suites: 184 passed. The 20 unrelated failures in the broader run are PRE-EXISTING on origin/main (custom-provider/tools tests), confirmed via a pristine baseline worktree.	2026-06-24 18:47:18 -07:00
Ben	cbd6ba1bdd	fix(docker): redirect lazy installs to a durable target so opt-in backends work in the immutable image (#51136) The published Docker image seals the agent venv (root-owned, read-only /opt/hermes) and sets HERMES_DISABLE_LAZY_INSTALLS=1 so a runtime install can't mutate and brick the core. But opt-in backends (Firecrawl web search, Exa, Feishu, ...) deliberately keep their SDKs in tools/lazy_deps.py and out of [all] (pyproject policy 2026-05-12: one quarantined release must not break every install). The two policies collided: the SDK isn't baked in AND can't lazy-install, so the default Firecrawl web_search/web_extract fail out of the box in Docker (#51136), as do Exa (#49445) and Feishu (#50205). Fix the whole class instead of baking in one backend: when HERMES_LAZY_INSTALL_TARGET is set, lazy installs are redirected to a writable dir on the durable /opt/data volume via `pip/uv install --target`, and that dir is APPENDED to the end of sys.path. Because the core venv always wins name collisions, a package installed this way can only ADD new modules; it can never shadow, downgrade, or break a module the core ships. The worst a bad/incompatible backend package can do is fail to import and report itself unavailable; the agent core stays healthy. That structural guarantee is what made it safe to seal the venv, and it is preserved here even with installs re-enabled. - tools/lazy_deps.py: durable-target mode: `--target` install + core-pinned `--constraint` file (shared deps resolve to core's versions, conflicts fail loudly at install time), append-only sys.path activation, ABI/Python-version stamp that wipes the store if an image rebuild bumps the interpreter, and a reworked gate so HERMES_DISABLE_LAZY_INSTALLS=1 redirects (rather than hard-blocks) when a target is set. security.allow_lazy_installs=false still disables installs in every mode. - hermes_bootstrap.py: activate the durable target on sys.path at first import (before any backend imports its SDK) so packages installed on a previous run are importable on this run. - Dockerfile: set HERMES_LAZY_INSTALL_TARGET=/opt/data/lazy-packages. - docker/stage2-hook.sh: seed + chown the dir on the data volume. - tests: real-install E2E proving installs land in the target, import cleanly, don't leak into the sealed venv, and that a core package is never shadowed; ABI-stamp wipe/preserve; gate matrix; Dockerfile/stage2 contract test. Fixes #51136	2026-06-25 09:20:13 +10:00
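The append-only activation described in cbd6ba1bdd relies on Python's import order: `sys.path` entries are searched front to back, so a directory appended at the end can add modules but cannot shadow ones found earlier. A minimal sketch, assuming a hypothetical helper name `activate_lazy_target` (the environment variable name is the one given in the commit message):

```python
import os
import sys
from typing import Optional


def activate_lazy_target(target: Optional[str] = None) -> Optional[str]:
    """Append the lazy-install directory to the END of sys.path.

    Falls back to HERMES_LAZY_INSTALL_TARGET when no path is given.
    Idempotent: a directory already on sys.path is not added twice.
    """
    target = target or os.environ.get("HERMES_LAZY_INSTALL_TARGET")
    if not target:
        return None
    if target not in sys.path:
        sys.path.append(target)  # append, never insert at the front
    return target
```

Had the directory been inserted at index 0 instead, a package installed there could override a core module of the same name, which is exactly the failure mode the commit rules out.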
liuhao1024	dbf0797335	fix(tools): catch mkdtemp OSError in tirith install to prevent unbounded retry and temp-dir leak (#51826) When tempfile.mkdtemp() raises OSError (e.g. disk full), the exception propagated past the try/finally block, so _mark_install_failed() was never called. The 24h backoff marker never engaged, causing unbounded retry on every command -- each attempt leaked a tirith-install-* temp directory, eventually filling /tmp completely. Fix: wrap mkdtemp in its own try/except OSError, returning (None, "no_space") so the caller's normal failure path (including _mark_install_failed) executes. Salvaged from #51831 by @liuhao1024. Closes #51826	2026-06-25 02:13:56 +05:30
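The control flow described in dbf0797335 can be sketched as follows. `fetch_into_tempdir` and `install` are hypothetical names, and `mark_failed` stands in for the project's `_mark_install_failed()`; only the `(None, "no_space")` return value and the `tirith-install-` prefix come from the commit message.

```python
import shutil
import tempfile
from typing import Callable, Optional, Tuple


def fetch_into_tempdir(
    fetch: Callable[[str], str],
    mkdtemp: Callable[..., str] = tempfile.mkdtemp,
) -> Tuple[Optional[str], Optional[str]]:
    """Run `fetch` in a fresh temp dir; return (result, failure_reason)."""
    try:
        # mkdtemp gets its own guard: if it fails, no directory exists yet,
        # so there is nothing to clean up and the error must not escape.
        tmpdir = mkdtemp(prefix="tirith-install-")
    except OSError:
        return None, "no_space"
    try:
        return fetch(tmpdir), None
    finally:
        shutil.rmtree(tmpdir, ignore_errors=True)


def install(
    fetch: Callable[[str], str],
    mark_failed: Callable[[str], None],
    mkdtemp: Callable[..., str] = tempfile.mkdtemp,
) -> Tuple[Optional[str], Optional[str]]:
    """Caller-side failure path: any reported reason engages the backoff."""
    result, reason = fetch_into_tempdir(fetch, mkdtemp)
    if reason is not None:
        mark_failed(reason)
    return result, reason
```

The `mkdtemp` parameter exists only so a failing implementation can be injected when exercising the disk-full path.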
Riyasudeen Farook	1e4df599ec	fix(delegate): strip cronjob toolset from delegated children (#43466) _strip_blocked_tools used a hardcoded set missing 'cronjob'. Children on gateway platforms could inherit the cronjob toolset, scheduling persistent jobs that outlive the delegation despite DELEGATE_BLOCKED_TOOLS. Fix: derive the strip set from DELEGATE_BLOCKED_TOOLS at runtime so the two lists can never drift. Add 'cronjob' to DELEGATE_BLOCKED_TOOLS for documentation consistency. Two regression tests lock the invariant. Salvaged from #43687 by @riyas22. Adapted test to current main (no 'messaging' toolset exists -- send_message is intentionally not registered as an agent tool). Closes #43466	2026-06-25 01:37:25 +05:30
liuhao1024	25e2312230	fix(memory): skip drift guard for add (append-only) action (#42874) The drift guard (introduced for #26045) correctly protects replace/remove from clobbering un-roundtrippable content, but it also fires on the add path. Since add only appends and never overwrites, the guard is unnecessary and causes false positives when prior add() calls in the same session shift the byte count of the on-disk file. Add skip_drift parameter to _reload_target() and pass True from add(). Replace/remove continue to use the drift guard unchanged. Salvaged from #42880 by @liuhao1024. Closes #42874	2026-06-25 00:51:12 +05:30

1856 commits