hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-08 03:01:47 +00:00

Author	SHA1	Message	Date
Teknium	dd2dc2bddf	fix(mcp): forward OAuth auth and bump sse_read_timeout on SSE transport (#21323 ) * fix(mcp): re-raise CancelledError explicitly in MCPServerTask.run On Python 3.11+, `asyncio.CancelledError` inherits from `BaseException` (not `Exception`), so the broad `except Exception as exc:` in `MCPServerTask.run`'s transport loop did NOT catch it. Task cancellation from gateway restart / explicit `task.cancel()` silently escaped past the reconnect logic — the MCP server task died without going through the shutdown/reconnect code paths that check `_shutdown_event`. Add an explicit `except asyncio.CancelledError: raise` before the broad catch so cancellation propagation is self-documenting rather than an accident of exception hierarchy, and future sibling-site work (e.g. distinguishing shutdown-cancel from transport-cancel) has an obvious hook. Behavior on pre-3.8 Pythons where CancelledError WAS an Exception subclass is also corrected: the old path would have caught it and treated it as a connection failure worth retrying. Closes #9930. * fix(mcp): forward OAuth auth and bump sse_read_timeout on SSE transport Two surgical correctness bugs in the SSE branch of MCPServerTask._run_http, distilled from @amiller's PR #5981 that couldn't be cherry-picked wholesale (branch too stale). 1. sse_read_timeout was set to the tool timeout (default 60s). That's the wrong dimension — it governs how long sse_client will wait between events on the SSE stream, not per-call latency. SSE servers routinely hold the stream idle for minutes between events; a 60s read timeout drops the connection after the first slow stretch (Router Teamwork, Supermemory on Cloudflare Workers idle-disconnect at ~60s). Bump to 300s to match the Streamable HTTP path's httpx read timeout. 2. OAuth auth was built via get_manager().get_or_build_provider() but never forwarded to sse_client. SSE MCP servers behind OAuth 2.1 PKCE would silently fail with 401s on every request. 
Keepalive (the other half of #5981) intentionally left for a follow-up — it's a real improvement but a bigger change, and these two are obvious corrections to ship now. Credits to @amiller. Co-authored-by: Andrew Miller <socrates1024@gmail.com> --------- Co-authored-by: Andrew Miller <socrates1024@gmail.com>	2026-05-07 07:08:04 -07:00
Teknium	e0a2b08768	fix(mcp): re-raise CancelledError explicitly in MCPServerTask.run (#21318 ) On Python 3.11+, `asyncio.CancelledError` inherits from `BaseException` (not `Exception`), so the broad `except Exception as exc:` in `MCPServerTask.run`'s transport loop did NOT catch it. Task cancellation from gateway restart / explicit `task.cancel()` silently escaped past the reconnect logic — the MCP server task died without going through the shutdown/reconnect code paths that check `_shutdown_event`. Add an explicit `except asyncio.CancelledError: raise` before the broad catch so cancellation propagation is self-documenting rather than an accident of exception hierarchy, and future sibling-site work (e.g. distinguishing shutdown-cancel from transport-cancel) has an obvious hook. Behavior on pre-3.8 Pythons where CancelledError WAS an Exception subclass is also corrected: the old path would have caught it and treated it as a connection failure worth retrying. Closes #9930.	2026-05-07 07:04:38 -07:00
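The re-raise pattern described in this commit can be sketched as follows. This is a minimal illustration, not the code in `MCPServerTask.run`: `run_with_reconnect`, `connect`, and `max_retries` are hypothetical names. Since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so the explicit clause documents intent and does not change what the broad handler catches.

```python
import asyncio


async def run_with_reconnect(connect, max_retries=3):
    """Hypothetical transport loop: retry ordinary failures, never swallow cancellation."""
    attempts = 0
    while True:
        try:
            return await connect()
        except asyncio.CancelledError:
            # Explicit hook: cancellation always propagates to the caller.
            raise
        except Exception:
            attempts += 1
            if attempts > max_retries:
                raise
            # Yield to the event loop before the next attempt (no real backoff here).
            await asyncio.sleep(0)
```

A caller that cancels the task still observes `CancelledError`, while transient connection errors are retried up to `max_retries` times.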
Teknium	5a3e5b23d2	fix(memory): remove dead allOf schema block at the source PR #21238 introduced top-level `allOf: [{if/then/required}]` blocks in the built-in memory tool's parameters schema as conditional-required hints. Two problems: 1. OpenAI's Codex backend (chatgpt.com/backend-api/codex, gpt-5.x) rejects top-level `allOf`/`anyOf`/`oneOf`/`enum`/`not` outright with a non-retryable 400 — affected every user on openai-codex/gpt-5.x. 2. The `if/then` hints were silently ignored by every other provider (Chat Completions doesn't honour them on function schemas), so they never actually enforced anything anywhere. The runtime handler in `memory_tool()` already validates the per-action required fields and returns actionable error messages, so removing the block changes nothing behaviourally. Paired with the defense-in-depth sanitizer in the previous commit, this closes the bug both at the source (schema no longer emits the forbidden form) and at the wire boundary (sanitizer strips it if anything else re-introduces it). - Rewrites `tests/tools/test_memory_tool_schema.py` to guard against regressing the forbidden-combinator shape instead of asserting it. - Adds AUTHOR_MAP entry for @hrkzogw (author of the sanitizer fix).	2026-05-07 07:03:21 -07:00
Hirokazu Ogawa	3924cb408b	fix: strip Codex-hostile top-level schema combinators	2026-05-07 07:03:21 -07:00
luyao618	e795b7e3ab	fix(delegate): expand composite toolsets before intersection in delegate_task When the parent agent uses a composite toolset like hermes-cli, calling delegate_task with individual toolsets (e.g. web, terminal) resulted in zero tools because the name-based intersection failed: 'web' != 'hermes-cli'. Add _expand_parent_toolsets() which collects all tool names from parent toolsets, then recognises any individual toolset whose tools are a subset of the parent's available tools. This allows delegate_task(toolsets=['web']) to work correctly when the parent has hermes-cli enabled. Fixes #19447	2026-05-07 06:41:42 -07:00
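A rough sketch of the subset-based expansion this commit describes. The `registry` mapping and the tool names below are made up for illustration; only the idea (collect the parent's tools, then recognise any toolset fully covered by them) comes from the commit message.

```python
def expand_parent_toolsets(parent_toolsets, registry):
    """Return the parent toolsets plus every toolset whose tools the parent already has.

    `registry` is a hypothetical mapping of toolset name -> iterable of tool names.
    """
    available = set()
    for name in parent_toolsets:
        available |= set(registry.get(name, ()))

    expanded = set(parent_toolsets)
    for name, tools in registry.items():
        tools = set(tools)
        # A toolset counts as available when all of its tools are in the parent's set.
        if tools and tools <= available:
            expanded.add(name)
    return expanded


registry = {
    "hermes-cli": ["web_search", "web_extract", "terminal"],
    "web": ["web_search", "web_extract"],
    "terminal": ["terminal"],
    "browser": ["browser_navigate"],
}
allowed = expand_parent_toolsets(["hermes-cli"], registry)
requested = {"web"} & allowed
```

With a plain name-based intersection, `{"web"} & {"hermes-cli"}` would be empty; after expansion the requested toolset survives.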
liuhao1024	f9b4b8af34	fix(mcp): include exception type in error messages when str(exc) is empty Some exception classes (e.g. anyio.ClosedResourceError) are raised without a message argument, so str(exc) returns an empty string. The existing error format f'{type(exc).__name__}: {exc}' would produce messages like 'MCP call failed: ClosedResourceError: ' with nothing after the colon. Add _exc_str() helper that falls back to repr(exc) when str(exc) is empty, and apply it to all 6 MCP error formatting sites (5 tool/prompt/resource handlers + 1 sampling handler). Fixes #19417	2026-05-07 06:33:57 -07:00
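The fallback this commit describes is small enough to sketch directly. The helper below is an illustration named `exc_str` (the commit calls its helper `_exc_str`; its exact body is not shown in the log, so this is an assumption about its shape).

```python
def exc_str(exc):
    """Return str(exc), or repr(exc) when the exception carries no message."""
    text = str(exc)
    return text if text else repr(exc)


def format_mcp_error(exc):
    # Hypothetical formatting site, mirroring the f-string quoted in the commit message.
    return f"MCP call failed: {type(exc).__name__}: {exc_str(exc)}"
```

An exception raised without arguments no longer produces a message that ends in a bare colon.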
Alexander Monas	a1f85ef2b9	fix(mcp): retry stale pipe transport failures Treat closed-resource, closed-transport, broken-pipe, and EOF MCP failures as stale session equivalents so the existing reconnect/retry-once path can recover. Add regression coverage for the stale-pipe marker variants. Checks: - python -m py_compile tools/mcp_tool.py tests/tools/test_mcp_tool_session_expired.py - python -m pytest tests/tools/test_mcp_tool_session_expired.py -q -o addopts= - selected secret scan over touched files	2026-05-07 06:32:45 -07:00
Mason James	80548f9a4f	fix(mcp): report configured timeout in MCP call errors Track elapsed wall time in _run_on_mcp_loop, cancel the in-flight future when a timeout expires, and raise a descriptive TimeoutError that includes the elapsed and configured timeout. Add regression coverage for the new timeout diagnostics.	2026-05-07 06:28:11 -07:00
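One way to get the diagnostics this commit describes, sketched under assumptions: `run_on_loop` is a hypothetical stand-in for `_run_on_mcp_loop`, and the error wording is invented. The sketch tracks wall time with `time.monotonic()`, cancels the in-flight future on timeout, and reports both the elapsed and the configured value.

```python
import asyncio
import concurrent.futures
import time


def run_on_loop(loop, coro, timeout):
    """Run `coro` on a loop owned by another thread and wait up to `timeout` seconds."""
    start = time.monotonic()
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Stop the work that is still running on the loop thread.
        future.cancel()
        elapsed = time.monotonic() - start
        raise TimeoutError(
            f"MCP call timed out after {elapsed:.2f}s (configured timeout: {timeout}s)"
        ) from None
```

The sketch does not distinguish a timeout raised inside the coroutine from one raised by the wait itself; a fuller version would need to.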
AJV20	9575bce6ca	fix(mcp): clear stale thread interrupt before MCP discovery Fixes #9930 When an agent session is interrupted (Ctrl+C or gateway timeout), the current thread's interrupt flag is set in _interrupted_threads. asyncio executor threads are pooled and reused across sessions, so a thread that carried an interrupt flag from a prior session will immediately cancel any new asyncio work dispatched to it — including MCP server discovery. Fix: in register_mcp_servers(), temporarily clear the interrupt flag on the current thread before running _discover_all(), then restore it afterward in a finally block so the original interrupt state is not lost.	2026-05-07 06:25:35 -07:00
Kowen Hao	a9c7bdaea6	feat(image-gen): honor image_gen.model from config.yaml in plugin dispatch Image generation plugins were dispatched without a model name, leaving the plugin to pick its default. Users on OpenRouter, ComfyUI, or custom backends had no way to select a specific model through config — they had to fork the plugin or patch the tool. Add _read_configured_image_model() that reads image_gen.model from the active profile's config.yaml and forwards it into _dispatch_to_plugin_provider(). When model is set, the plugin call gains a 'model' kwarg; when unset, the plugin falls back to its own default, so single-model users see no behavior change. Example config: image_gen: provider: openrouter model: flux-pro Tests: all 170 image tool tests pass. The new code path is opt-in via config and no existing test exercises it, so the change is strictly additive.	2026-05-07 06:24:24 -07:00
LeonSGP43	d12be46df8	fix(skills): lock usage telemetry updates	2026-05-07 06:13:37 -07:00
Alan Chen	c2d6b385f1	fix(windows): terminal drain and cwd path conversion for native Windows Two fixes for the local terminal backend on Windows (Git Bash): 1. `_drain()` in base.py: `select.select()` only works on sockets on Windows, not pipe file descriptors. On Windows, use blocking `os.read()` in the daemon thread instead. EOF arrives promptly when bash exits, so this is safe. 2. `_run_bash()` in local.py: When `self.cwd` is updated from `pwd` output, it contains Git Bash-style paths (`/c/Users/...`). `subprocess.Popen(cwd=...)` needs a native Windows path (`C:\Users\...`). Added a conversion before Popen. Without these fixes, all terminal() calls on Windows return empty output (exit code 126), and cwd tracking breaks. Tested on Windows 11 with Git for Windows + Python 3.13. Fixes #14638	2026-05-07 06:11:00 -07:00
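The path conversion in item 2 might look roughly like the following. The function name and regex are illustrative, not taken from `local.py`; the sketch assumes it is only called on Windows, where a leading single-letter segment denotes a drive.

```python
import re

# Matches Git Bash style paths such as /c or /c/Users/me (single-letter drive segment).
_MSYS_DRIVE = re.compile(r"^/([a-zA-Z])(/.*)?$")


def msys_to_windows_path(path):
    """Convert '/c/Users/me' to 'C:\\Users\\me'; leave other paths untouched."""
    match = _MSYS_DRIVE.match(path)
    if not match:
        return path
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return f"{drive}:" + rest.replace("/", "\\")
```

Paths that do not start with a single-letter segment, such as `/usr/bin` or relative paths, pass through unchanged.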
altmazza0-star	5b24c0fa85	fix: require memory schema fields by action	2026-05-07 05:48:17 -07:00
Teknium	ae1f058b3c	feat(curator): add `hermes curator list-archived` command (#21236 ) Lists the skills sitting in ~/.hermes/skills/.archive/ so users have something to pass to `hermes curator restore`. `curator status` already shows counts; this fills the name-discovery gap. Archive layout is flat (`archive_skill` writes to `.archive/<skill>/`), so the directory name IS the skill name — no frontmatter parsing needed. Timestamped collision directories (`<skill>-<ts>`) are listed literally; user can still pass them to `restore`. Reshape of @EvilDrag0n's #20651, simplified: drop the frontmatter rglob + preamble/trailer output + duplicate subcommand registration. Co-authored-by: EvilDrag0n <lxl694522264@gmail.com>	2026-05-07 05:46:51 -07:00
Teknium	0214858ef5	fix(browser): enforce cloud-metadata SSRF floor in hybrid routing (#16234 ) (#21228 ) Cloud metadata endpoints (169.254.169.254 etc.) are now always blocked by browser_navigate regardless of hybrid routing, allow_private_urls, or backend. Bug: commit `42c076d3` (#16136) added hybrid routing that flips auto_local_this_nav=True for private URLs and short-circuits _is_safe_url(). IMDS endpoints are technically private (169.254/16 link-local), so the sidecar happily routed them to a local Chromium, and the agent could read IAM credentials via browser_snapshot. On EC2/GCP/Azure this is a full SSRF-to-credential-theft. Fix: new is_always_blocked_url() in url_safety.py — a narrow floor that checks _BLOCKED_HOSTNAMES, _ALWAYS_BLOCKED_IPS, _ALWAYS_BLOCKED_NETWORKS only. Applied as an independent gate in browser_navigate's pre-nav and post-redirect checks, BEFORE auto_local_this_nav gets a chance to short-circuit. Ordinary private URLs (localhost, 192.168.x, 10.x, .local, CGNAT) still route to the local sidecar as the #16136 feature intends. Secondary fix (reporter's finding): _url_is_private() now explicitly checks 172.16.0.0/12. ipaddress.is_private only covers that range on Python ≥3.11 (bpo-40791), so on 3.10 runtimes those URLs were routed to cloud instead of the local sidecar. No security impact — just a correctness fix for the hybrid-routing feature. Closes #16234.	2026-05-07 05:38:05 -07:00
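A narrow "always blocked" floor of the kind described here could be sketched as below. The set contents are illustrative assumptions (the real `_BLOCKED_HOSTNAMES`, `_ALWAYS_BLOCKED_IPS`, and `_ALWAYS_BLOCKED_NETWORKS` values are not shown in the log), and the sketch does no DNS resolution, so it only catches literal hostnames and IP addresses.

```python
import ipaddress
from urllib.parse import urlparse

# Illustrative values only; the project's actual lists may differ.
ALWAYS_BLOCKED_HOSTNAMES = {"metadata.google.internal"}
ALWAYS_BLOCKED_NETWORKS = [ipaddress.ip_network("169.254.169.254/32")]


def is_always_blocked_url(url):
    """Return True for cloud-metadata targets, regardless of private-URL routing."""
    host = (urlparse(url).hostname or "").lower().rstrip(".")
    if host in ALWAYS_BLOCKED_HOSTNAMES:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal and not a blocked hostname.
        return False
    return any(addr in net for net in ALWAYS_BLOCKED_NETWORKS)
```

The point of keeping this check separate is ordering: it runs before any routing decision that might skip the general private-URL check, so ordinary private addresses still pass while metadata endpoints never do.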
Andrew Ho	12289c2630	feat: add SSE transport support for MCP client Add support for MCP servers using the SSE transport protocol (SseServerTransport) alongside the existing Streamable HTTP and stdio transports. Many MCP servers use SSE (GET /sse + POST /messages/) which was previously unsupported -- the client silently fell back to Streamable HTTP, causing 10s connection timeouts. Changes: - Import mcp.client.sse.sse_client with graceful fallback - Check config.get('transport') == 'sse' in _run_http() to select the SSE transport path with proper timeout handling - Read transport type from config in get_mcp_status() instead of hardcoding 'http' for URL-based servers - Update docstring, example config, and feature list	2026-05-07 05:36:28 -07:00
Teknium	c4a7992317	fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226 ) The MCP SDK discovers OAuth server metadata (token_endpoint, etc.) on demand and keeps it in memory only. Without disk persistence, a restart with valid cached refresh tokens forces the SDK to fall back to the guessed '{server_url}/token' path — which returns 404 on most real providers (Notion, Atlassian, GitHub remote MCP, etc.) and triggers a full browser re-authorization even though the refresh token is fine. Add a .meta.json file next to the existing tokens/client_info files: HERMES_HOME/mcp-tokens/<server>.json -- tokens (existing) HERMES_HOME/mcp-tokens/<server>.client.json -- client info (existing) HERMES_HOME/mcp-tokens/<server>.meta.json -- oauth metadata (new) Changes: - HermesTokenStorage.save_oauth_metadata / load_oauth_metadata / _meta_path — disk layer for the discovered OAuthMetadata. - HermesTokenStorage.remove() now also clears .meta.json so 'hermes mcp remove <name>' and the manager's remove() path clean up fully. - HermesMCPOAuthProvider._initialize cold-restores from disk before the existing pre-flight discovery runs. If disk has metadata we skip the discovery HTTP round-trips entirely. - HermesMCPOAuthProvider._prefetch_oauth_metadata now persists ASM as soon as it's discovered, so even the first pre-flight run seeds disk. - HermesMCPOAuthProvider._persist_oauth_metadata_if_changed() is called at the end of async_auth_flow so metadata discovered via the SDK's lazy 401-branch (not pre-flight) is also saved for next time. Tests cover the storage roundtrip (save/load/missing/corrupt/remove) and the manager provider path (cold-load restore, skip-when-in-memory, persist-on-discover, noop-when-unchanged, end-to-end async_auth_flow). Co-authored-by: nocturnum91 <50326054+nocturnum91@users.noreply.github.com>	2026-05-07 05:35:33 -07:00
leon7609	d34f03c32a	feat(gateway): support [[as_document]] directive for skill media routing Skills that produce large/lossless images (e.g. info-graph, where a rendered JPG is 1-2 MB) currently lose quality in Telegram delivery because `_IMAGE_EXTS` membership routes the file through `send_multiple_images` → `sendMediaGroup`, which Telegram's server re-encodes to JPEG @ 1280px max edge. The original bytes only survive when the file goes through `send_document`, which the dispatch tables in three places (`_process_message_background`, `_deliver_media_from_response`, and the `send_message` tool's telegram path) only reach for files whose extension is NOT in `_IMAGE_EXTS`. This commit adds an `[[as_document]]` directive that mirrors the existing `[[audio_as_voice]]` shape: a skill emits the directive once in its response, and every image-extension MEDIA: file in that response is delivered via `send_document` instead of `send_multiple_images` / `sendPhoto`. The directive is detected at the dispatch sites (which see the raw response) and the directive string is stripped from the user-visible cleaned text in `extract_media` so it never leaks. Granularity is intentionally all-or-nothing per response, matching [[audio_as_voice]]'s scope. Skills that need fine control can split into two responses. Verified the targeted use case: info-graph emits 信息图已生成（...） [[as_document]] MEDIA:/tmp/info-graph-x/infographic.jpg → Telegram receives `infographic.jpg` via sendDocument, original 1MB JPEG bytes preserved, no recompression. Forwarding and download filenames stay clean (`infographic.jpg`). Tests: +3 cases in TestExtractMedia covering directive strip, isolation from voice flag, and coexistence with [[audio_as_voice]]. All 113 pre-existing media/extract/send tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 05:20:10 -07:00
Kailigithub	5bf12eb44a	fix: exclude hidden and archive dirs from _find_skill rglob	2026-05-07 05:15:28 -07:00
liuhao1024	69692039e9	fix(delegate): correct ACP docs — Claude Code CLI has no --acp flag The delegate_task tool schema descriptions referenced 'claude --acp --stdio' as an example, but Claude Code CLI does not support --acp or --stdio flags. The ACP subprocess transport (agent/copilot_acp_client.py) is specifically built for GitHub Copilot CLI ('copilot --acp --stdio'). Changes: - Per-task acp_command example: 'claude' → 'copilot' - Top-level acp_command description: remove 'Claude Code' reference, clarify requirement for ACP-compatible CLI (currently Copilot only) - acp_args description: remove misleading claude-opus-4-6 example Fixes #19055	2026-05-07 05:13:30 -07:00
Brian Su	8b32a9d0f1	feat: add Discord message deletion action	2026-05-07 05:11:09 -07:00
stephen0110	40b51c93a2	fix(kanban): heartbeat tool extends claim TTL, not just last_heartbeat_at The kanban_heartbeat tool called heartbeat_worker but never heartbeat_claim, so a worker that loops the tool while a single tool call blocks the agent for >DEFAULT_CLAIM_TTL_SECONDS still got reclaimed by release_stale_claims. The function name and heartbeat_claim's own docstring imply otherwise: "Workers that know they'll exceed 15 minutes should call this every few minutes to keep ownership." But there was no caller in the worker tool path. Workers couldn't invoke heartbeat_claim themselves either — it isn't exposed as a tool. Fix: _handle_heartbeat now calls heartbeat_claim first, reading HERMES_KANBAN_CLAIM_LOCK from the worker env (the dispatcher pins this in _default_spawn). Falls back to _claimer_id() for locally- driven workers that didn't go through dispatcher spawn. Test: tests/tools/test_kanban_tools.py::test_heartbeat_extends_claim_expires rewinds claim_expires into the past, calls the tool, and asserts the new value is at least now + DEFAULT_CLAIM_TTL_SECONDS // 2. Verified to fail against the unfixed code (claim_expires stays at the rewound value). Closes the root cause underlying the symptom in #21141 (15-min respawns of long-running workers). #21141 separately addresses post-reclaim cleanup; this fixes the upstream "shouldn't have been reclaimed in the first place" half.	2026-05-07 05:05:20 -07:00
ambition0802	7c0766e06a	fix(gateway): translate inbound document host paths to container paths for Docker backend When terminal.backend is docker, inbound documents uploaded via messaging platforms (Telegram, Slack, Discord, Feishu, Email, etc.) are cached at a host path under ~/.hermes/cache/documents, but the container sandbox only sees them at the auto-mounted /root/.hermes/cache/documents path. This PR adds to_agent_visible_cache_path() in tools/credential_files.py (the natural sibling to get_cache_directory_mounts()) and calls it at the document-context-injection site in gateway/run.py so the agent always receives a path it can open directly, matching the mount layout already established by get_cache_directory_mounts() (#4846). Scope: only Docker backend for now; other backends use different mount semantics and are left unchanged until verified. Fixes #18787	2026-05-07 05:02:26 -07:00
Gutslabs	7d36e8346b	fix(security): close TOCTOU window when saving MCP OAuth credentials _write_json (the persistence helper used by HermesTokenStorage for both tokens and client_info) created the temp file via Path.write_text and only chmod'd it to 0o600 afterward. Between create and chmod the file existed on disk with the mode left by the process umask (commonly 0o644, i.e. world-readable), briefly exposing MCP OAuth access/refresh tokens to other local users. Use os.open with O_WRONLY|O_CREAT|O_EXCL and an explicit S_IRUSR|S_IWUSR mode so the file is created atomically at 0o600, plus tighten the parent dir to 0o700 so siblings can't traverse to the creds file. The temp name also gains a per-process random suffix to avoid collisions between concurrent writers and stale leftovers from a crashed prior write. Mirrors the fix shipped for agent/google_oauth.py in #19673. Adds a regression test asserting the resulting file mode is 0o600 and the parent directory is 0o700 (skipped on Windows where POSIX mode bits aren't enforced).	2026-05-07 04:56:13 -07:00
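The creation pattern named in this commit can be sketched with standard-library calls only. `write_json_private` and the temp-file naming scheme are assumptions for illustration, not the project's `_write_json`.

```python
import json
import os
import secrets
import stat
from pathlib import Path


def write_json_private(path, data):
    """Write JSON so the file never exists on disk with a mode looser than 0o600."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    os.chmod(path.parent, 0o700)

    # Per-process random suffix avoids collisions with concurrent or crashed writers.
    tmp = path.with_name(f"{path.name}.{os.getpid()}.{secrets.token_hex(4)}.tmp")
    # O_EXCL fails if the temp file already exists; the mode is applied at creation.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_EXCL, stat.S_IRUSR | stat.S_IWUSR)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        os.replace(tmp, path)
    except BaseException:
        # Do not leave a partially written temp file behind.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Because the mode is passed to `os.open`, there is no window between creation and `chmod`; the umask can only remove bits from 0o600, never add them.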
kshitij	5c906d7026	feat(web): add SearXNG as a native search-only backend Adds SearXNG as a free, self-hosted web search provider. SearXNG is a privacy-respecting metasearch engine that requires no API key — just a running instance and SEARXNG_URL pointing at it. ## What this adds - `tools/web_providers/searxng.py` — `SearXNGSearchProvider` implementing `WebSearchProvider` (search only; no extract capability) - `_is_backend_available("searxng")` — gates on SEARXNG_URL - `_get_backend()` — accepts "searxng" as a configured value; adds it to auto-detect candidates (lower priority than paid services) - `web_search_tool` — dispatches to SearXNG when it is the active backend - `check_web_api_key()` — includes SearXNG in availability check - `OPTIONAL_ENV_VARS["SEARXNG_URL"]` — registered with tools=["web_search"] - `tools_config.py` — SearXNG appears in the `hermes tools` provider picker - `nous_subscription.py` — `direct_searxng` detection, web_active / web_available - `setup.py` — SEARXNG_URL listed in the missing-credential hint - 23 tests covering: is_configured, happy-path search, score sorting, limit, HTTP/request errors, _is_backend_available, _get_backend, check_web_api_key ## Config ```yaml # Use SearXNG for search, any paid provider for extract web: search_backend: "searxng" extract_backend: "firecrawl" # Or: SearXNG as the sole backend (web_extract will use the next available) web: backend: "searxng" ``` SearXNG is search-only — it does not implement WebExtractProvider. Users who only configure SEARXNG_URL get web_search available; web_extract falls back to the next available extract provider (or is unavailable if none). Closes #19198 (Phase 2 Task 4 — SearXNG provider) Ref: #11562 (original SearXNG PR)	2026-05-06 10:05:29 -07:00
kshitij	cd2cbc73b7	refactor(web): per-capability backend selection for search/extract split Introduce the foundation for independently selecting web search and extract backends — enabling future combinations like SearXNG for search + Firecrawl for extract. Architecture: - tools/web_providers/base.py: WebSearchProvider and WebExtractProvider ABCs with normalized result contracts (mirrors CloudBrowserProvider) - tools/web_tools.py: _get_search_backend() and _get_extract_backend() read per-capability config keys, fall through to shared web.backend - hermes_cli/config.py: web.search_backend and web.extract_backend in DEFAULT_CONFIG (empty = inherit from web.backend) Behavioral change: - web_search_tool() now dispatches via _get_search_backend() - web_extract_tool() now dispatches via _get_extract_backend() - When per-capability keys are empty (default), behavior is identical to before — _get_search_backend() falls through to _get_backend() This is purely structural — no new backends are added. SearXNG and other search-only/extract-only providers can now be added as simple drop-in modules in follow-up PRs. 12 new tests, 49 existing tests pass with zero regressions. Ref: #19198	2026-05-06 09:16:25 -07:00
Teknium	a0fedfbb1b	feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709 ) Replaces the per-directory shadow-repo design with a single shared shadow git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated across every working directory the agent has ever touched; a dozen worktrees of the same project cost near-zero in additional disk. Why --- Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/ grow to multi-GB on active machines: 1. Each working directory got its own full shadow git repo — no object dedup across projects or across worktrees of the same project. 2. _prune() was a documented no-op: max_snapshots only limited the /rollback listing. Loose objects accumulated forever. 3. Defaults: enabled=True, auto_prune=False — users paid the disk cost without ever asking for /rollback. Field report on a single workstation: 847 MB across 47 shadow repos, mostly redundant clones of the hermes-agent source tree. Changes ------- - tools/checkpoint_manager.py: full rewrite. Single bare store, per-project refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>), per-project metadata (store/projects/<hash>.json with workdir + created_at + last_touch). On first v2 init, any pre-v2 per-directory shadow repos are auto-migrated into legacy-<timestamp>/ so the new store starts clean. _prune() now actually rewrites the per-project ref to the last max_snapshots commits and runs git gc --prune=now. New _enforce_size_cap() drops oldest commits round-robin across projects when the store exceeds max_total_size_mb. _drop_oversize_from_index() filters any single file larger than max_file_size_mb out of the snapshot. - hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI (status / list / prune / clear / clear-legacy) for managing the store outside a session. - hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20, auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10. 
Tightened DEFAULT_EXCLUDES (added target/, .so/.dylib/.dll, .mp4/.mov, .zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.). - run_agent.py / cli.py / gateway/run.py: thread the new kwargs through AIAgent and the startup auto_prune hooks. - Tests rewritten to match v2 storage while keeping backwards-compat coverage for the pre-v2 prune path (per-directory shadow repos under base/ are still swept correctly for anyone mid-migration). - Docs updated: user-guide/checkpoints-and-rollback.md explains the shared store, new defaults, migration, and the new CLI; reference/cli-commands.md documents 'hermes checkpoints'. E2E validated ------------- - Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/. - Object dedup: two projects with an identical shared.py blob resolve to 7 total objects in the store (v1 would have stored the blob twice). - max_snapshots=3 actually enforced: after 6 commits, list shows 3. - Orphan prune: deleting a project's workdir + 'hermes checkpoints prune --retention-days 0' removes its ref, index, and metadata; GC reclaims the objects. - max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the tracked source code files. - hermes checkpoints {status,prune,clear,clear-legacy} all work from the CLI without an agent running. Breaking / migration -------------------- No in-place data migration — legacy per-directory shadow repos are moved into legacy-<timestamp>/ on first run. Old /rollback history is still accessible by inspecting the archive with git; run 'hermes checkpoints clear-legacy' to reclaim the space when ready. Users relying on /rollback must now set checkpoints.enabled=true (or pass --checkpoints) explicitly.	2026-05-06 05:44:35 -07:00
Kshitij Kapoor	629d8b843d	fix(browser): tighten Lightpanda fallback edge cases	2026-05-06 03:41:21 -07:00
Kshitij Kapoor	3ebdd26449	fix(browser): surface Lightpanda Chrome fallback warnings	2026-05-06 03:23:19 -07:00
kshitijk4poor	395dbcc873	feat(browser): add Lightpanda engine support with automatic Chrome fallback Add Lightpanda as an optional browser engine for local mode. Lightpanda is a headless browser built from scratch in Zig -- faster navigation than Chrome with significantly less memory. One config line to enable: browser: engine: lightpanda New functions in browser_tool.py: - _get_browser_engine() -- config/env reader with validation + caching - _should_inject_engine() -- only inject in local non-cloud mode - _needs_lightpanda_fallback() -- detect empty/failed LP results - _chrome_fallback_screenshot() -- temporary Chrome session for screenshots - Engine injection in _run_browser_command (--engine flag) - browser_vision pre-routes screenshots to Chrome when engine=lightpanda Config: - browser.engine in DEFAULT_CONFIG (auto/lightpanda/chrome) - AGENT_BROWSER_ENGINE in OPTIONAL_ENV_VARS - /browser status shows engine info in local mode Rebased from PR #7144 onto current main. All existing code preserved -- pure additions only (+520/-2). 25 new tests + 81 total browser tests pass (0 failures).	2026-05-06 03:23:19 -07:00
misery-hl	56b4795115	guard kanban worker lifecycle by run id	2026-05-05 15:09:28 -07:00
jani	0df80f4391	docs: align terminal-backend count and naming across docs and code README:24 claimed "Six terminal backends" while tools/environments/ exposes seven top-level backend choices through TERMINAL_ENV: local, docker, ssh, singularity, modal, daytona, vercel_sandbox. Modal additionally has direct and Nous-managed modes selected via terminal.modal_mode (the ManagedModalEnvironment class is a Modal sub-mode, not a separate top-level backend). The same drift appeared in five other doc and code-comment sites with inconsistent counts (six, seven, or implicit) and varying lists. Updated all sites to a consistent seven-backend list in canonical order. The configuration guide also clarifies how Modal's two modes are selected so operators do not search for a non-existent backend: managed_modal value. CONTRIBUTING.md:160 lists six backend filenames in a code tree but does not carry the "Six terminal" prose; left out of scope per cohesion sweep guidance to bundle only identical wording. Files updated: - README.md (line 24, marketing copy) - website/docs/index.md (line 49, landing page) - website/docs/user-guide/configuration.md (line 86, config guide) - tools/environments/__init__.py (lines 3-6, package docstring) - tools/file_operations.py (line 6, module docstring) - environments/README.md (line 43, RL training docs — TERMINAL_ENV list)	2026-05-05 13:44:09 -07:00
beardthelion	a6289927d3	docs(web_tools): correct web_extract summarizer timeout comment The comment at tools/web_tools.py:700-702 stated the runtime default for auxiliary.web_extract.timeout is 360s. The actual runtime default is 30s (_DEFAULT_AUX_TIMEOUT in agent/auxiliary_client.py:3140), used by _get_task_timeout when no auxiliary.web_extract.timeout key is present in config.yaml. The 360s figure is the config template default written by hermes_cli/config.py:697 into freshly-generated config.yaml files. It only takes effect when that key exists in the user's config — not as a fallback. Users on configs that predate commit `20b4060d` (Apr 5, 2026), or who removed the key, fall through to the 30s _DEFAULT_AUX_TIMEOUT runtime default. The comment was introduced in `20b4060d` alongside the template-default bump from 30 to 360. The runtime default in auxiliary_client.py was not changed in that commit and has remained 30s since `839d9d74` (Mar 28, 2026).	2026-05-05 13:24:19 -07:00
LeonSGP43	244bacd0dc	fix(skills): support category-qualified local skill names	2026-05-05 10:15:31 -07:00
sprmn24	db84c1535d	fix(ssh): add scp availability check to preflight validation	2026-05-05 09:57:23 -07:00
Teknium	de9238d37e	feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232 ) Workers completing a kanban task can now claim the ids of cards they created via an optional ``created_cards`` field on ``kanban_complete``. The kernel verifies each id exists and was created by the completing worker's profile; any phantom id blocks the completion with a ``HallucinatedCardsError`` and records a ``completion_blocked_hallucination`` event on the task so the rejected attempt is auditable. Successful completions also get a non-blocking prose-scan pass over their ``summary`` + ``result`` that emits a ``suspected_hallucinated_references`` event for any ``t_<hex>`` reference that doesn't resolve. Closes #20017. Recovery UX (kernel + CLI + dashboard) -------------------------------------- A structural gate alone isn't enough — operators also need to see and act on stuck workers, especially when a profile's model is the root cause. This PR ships the full loop: * ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that releases an active worker claim immediately (unlike ``release_stale_claims`` which only acts after claim_expires has passed). Emits a ``reclaimed`` event with ``manual: True`` payload. * ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` — switch a task to a different profile, optionally reclaiming a stuck running worker in the same call. * ``hermes kanban reclaim <id> [--reason ...]`` and ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]`` CLI subcommands wired through to the same helpers. * ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the dashboard plugin. Dashboard surfacing ------------------- * ⚠ warning badge on cards with active hallucination events. * attention strip at the top of the board listing all flagged tasks; dismissible per session. 
* events callout in the task drawer — hallucination events render with a red left border, amber icon, and phantom ids as styled chips. * recovery section in the task drawer with three actions: Reclaim, Reassign (with profile picker + reclaim-first checkbox), and a copy-to-clipboard hint for ``hermes -p <profile> model`` since profile config lives on disk and can't be edited from the browser. Auto-opens when the task has warnings, collapsed otherwise. Keyed by task id so state doesn't leak between drawers. Active-vs-stale rule: warnings clear when a clean ``completed`` or ``edited`` event supersedes the hallucination, so recovery is never permanently stigmatising — the audit events persist for debugging but the badge goes away once the worker succeeds. Skill updates ------------- * ``skills/devops/kanban-worker/SKILL.md`` documents the ``created_cards`` contract with good/bad examples. * ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering stuck workers" section with the three actions and when to use each. Tests ----- * Kernel gate: verified-cards manifest, phantom rejection + audit event, cross-worker rejection, prose scan positive + negative. * Recovery helpers: reclaim on running task, reclaim on non-running returns False, reassign refuses running without reclaim_first, reassign with reclaim_first succeeds on running. * API endpoints: warnings field present on /board and /tasks/:id, warnings cleared after clean completion, reclaim 200 + 409 paths, reassign 200 + 409 + reclaim_first paths. * CLI smoke: reclaim + reassign subcommands. Live-verified end-to-end on a dashboard with seeded scenarios: attention strip renders, badges land on the right cards, drawer callout shows phantom chips, Reclaim on a running task flips status to ready + emits manual reclaimed event + refreshes the drawer, Reassign swaps the assignee and triggers board refresh. 359/359 kanban-suite tests pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).	
2026-05-05 08:06:55 -07:00
Teknium	7de3c86c5a	feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. 
* feat(i18n): add display.language for static message translation (zh/ja/de/es) Adds a thin-slice i18n layer covering the highest-impact static user-facing messages: the CLI dangerous-command approval prompt and a handful of gateway slash-command replies (restart-drain, goal cleared, approval expired, config read/save errors). Out of scope (stays English): agent responses, log lines, tool outputs, slash-command descriptions, error tracebacks. Infrastructure: - agent/i18n.py: catalog loader, t() helper, language resolution (HERMES_LANGUAGE env var > display.language config > en) - locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language - display.language in DEFAULT_CONFIG (hermes_cli/config.py) Tests: - tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder parity across locales, fallback behavior, env-var override, alias normalization, missing-key graceful degradation. Docs: - website/docs/user-guide/configuration.md: display.language entry plus a short section explaining scope so users don't expect agent responses to translate via this knob.	2026-05-05 08:03:07 -07:00
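The language resolution order named in the commit above (HERMES_LANGUAGE env var > display.language config > en) might look roughly like the sketch below. It is not the code in agent/i18n.py: the function name `resolve_language`, the `SUPPORTED` tuple, and the choice to skip an unsupported value instead of failing are all assumptions made for illustration.

```python
from typing import Mapping

# Locale codes taken from the commit message; the tuple itself is hypothetical.
SUPPORTED = ("en", "zh", "ja", "de", "es")


def resolve_language(env: Mapping[str, str], config: Mapping[str, object]) -> str:
    """Pick a language: env var first, then display.language, then 'en'."""
    display = config.get("display") or {}
    candidates = (
        env.get("HERMES_LANGUAGE"),
        display.get("language") if isinstance(display, dict) else None,
    )
    for candidate in candidates:
        # Assumption: unsupported or empty values are skipped, not errors.
        if isinstance(candidate, str) and candidate.lower() in SUPPORTED:
            return candidate.lower()
    return "en"
```

Passing `os.environ` and the loaded config dict would give the effective language; an unknown code falls through to English in this sketch.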
vominh1919	44cf33449d	fix(mcp): add periodic keepalive to _wait_for_lifecycle_event Sends a lightweight list_tools() probe every 3 minutes during idle periods to prevent TCP connections from going stale behind LB / NAT idle timeouts (commonly 300-600s). When the keepalive fails, the reconnect event fires so the transport rebuilds the session cleanly. Salvages the keepalive portion of @vominh1919's PR #17016. The circuit-breaker half-open recovery from the same PR was independently landed on main via @benbarclay's commit `8cc3cebca` ("fix(mcp): add half-open state to circuit breaker", Apr 21); only the keepalive is salvaged here. Fixes #17003.	2026-05-05 05:47:33 -07:00
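The idle keepalive described in the commit above can be sketched as a wait loop that wakes on a timer, probes, and signals a reconnect when the probe fails. This is only an illustration under stated assumptions: `wait_with_keepalive`, its parameters, and its return values are hypothetical, and `probe` stands in for whatever lightweight call (such as list_tools()) the real session makes.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_with_keepalive(
    shutdown: asyncio.Event,
    reconnect: asyncio.Event,
    probe: Callable[[], Awaitable[object]],
    interval: float = 180.0,  # 3 minutes, per the commit message
) -> str:
    """Block until shutdown is set, probing the connection every `interval` seconds."""
    while True:
        try:
            # Wake early if shutdown is requested; otherwise time out and probe.
            await asyncio.wait_for(shutdown.wait(), timeout=interval)
            return "shutdown"
        except asyncio.TimeoutError:
            pass
        try:
            await probe()
        except Exception:
            # A failed probe means the connection is likely stale:
            # ask the transport loop to rebuild the session.
            reconnect.set()
            return "reconnect"
```

The interval stays below the 300-600s idle timeouts mentioned in the message, so a healthy connection sees traffic before a load balancer or NAT drops it.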
Teknium	b10e38e392	fix(skills): pin protects against deletion only, not edits (#20220 ) Previously, pinning a skill blocked every skill_manage write action (edit, patch, delete, write_file, remove_file). The 'hard fence' design conflated two concerns: 1. Pin as deletion protection — don't let the curator archive or the agent delete a stable skill. 2. Pin as content freeze — don't let the agent rewrite it mid-conversation. In practice (1) is what users pin for: they want a skill to survive curator passes. (2) created friction — agents finding a new pitfall in a pinned skill had to ask the user to unpin, then the agent patches, then the user re-pins. The dance discouraged skill maintenance and pinned skills went stale. This narrows the _pinned_guard to skill_manage(action='delete') only. Patches, edits, and supporting-file writes go through on pinned skills so the agent can keep improving them. The curator's own pinned-skip behavior (agent/curator.py:271 for auto-archive, line 349 for the LLM review prompt) is unchanged — curator still never touches pinned skills. Changes: - tools/skill_manager_tool.py: remove _pinned_guard calls from _edit_skill, _patch_skill, _write_file, _remove_file; keep on _delete_skill. Updated _pinned_guard docstring and error message. - tools/skill_manager_tool.py: updated skill_manage model-facing tool description to reflect the new semantic. - website/docs/user-guide/features/curator.md: updated pinning section. - tests/tools/test_skill_manager_tool.py: flipped refuses-pinned tests for edit/patch/write_file/remove_file into allowed-when-pinned; kept test_delete_refuses_pinned (strengthened assertion to check the 'cannot be deleted' wording). Closes #18354	2026-05-05 05:43:10 -07:00
LeonSGP43	68c1a08ad1	fix(curator): protect hub skills by frontmatter name	2026-05-05 04:55:22 -07:00
Teknium	5168226d60	feat(file_tools): post-write delta lint on write_file + patch, add JSON/YAML/TOML/Python in-process linters (#20191 ) Closes the gap where write_file skipped the post-edit syntax check that patch already ran, so silent file corruption (bad quote escaping, truncated writes, etc.) would persist on disk until a later read. ## Changes tools/file_operations.py: - Add in-process linters for .py, .json, .yaml, .toml (LINTERS_INPROC). Python uses ast.parse, JSON/YAML/TOML use stdlib/PyYAML parsers. Zero subprocess overhead; preferred over shell linters when both apply. - _check_lint() now accepts optional content and routes to in-process linter first. Shell linter (py_compile, node --check, tsc, go vet, rustfmt) remains the fallback for languages without an in-process equivalent. - New _check_lint_delta() implements the post-first/pre-lazy pattern borrowed from Cline and OpenCode: lint post-write state first; only if errors are found AND pre-content was captured does it lint the pre-state and diff. If the pre-existing file had the SAME errors the edit didn't introduce anything new, so the file is reported as 'still broken, pre-existing' with success=False but a message explaining the errors were pre-existing. If the edit introduced genuinely new errors, those are surfaced and pre-existing ones are filtered out. - WriteResult gains a lint field. - write_file() captures pre-content for in-process-lintable extensions and calls _check_lint_delta after a successful write. - patch_replace() switches from _check_lint to _check_lint_delta, reusing the pre-edit content it already has in scope. tools/file_tools.py: - Update write_file schema description to mention the post-write lint. tests/tools/test_file_operations_edge_cases.py: - Update existing brace-path tests to use .js (shell linter) now that .py is in-process. - Add TestCheckLintInproc (9 tests) covering Python/JSON/YAML/TOML in-process linters. 
- Add TestCheckLintDelta (5 tests) covering the post-first/pre-lazy short-circuit, new-file path, and the single-error-parser caveat. ## Performance In-process linters are microseconds per call (ast.parse, json.loads). The hot path (clean write) runs exactly one lint — matches main's cost for patch. Pre-state capture is skipped when the file has no applicable linter. Measured 4.89ms/write average over 100 .py writes including lint. ## Inspiration - Cline's DiffViewProvider.getNewDiagnosticProblems() — filters pre-write diagnostics from post-write diagnostics (src/integrations/editor/DiffViewProvider.ts). - OpenCode's WriteTool — runs lsp.diagnostics() after write and appends errors to tool output (packages/opencode/src/tool/write.ts). - Claude Code's DiagnosticTrackingService — captures baseline via beforeFileEdited() and returns new-diagnostics-only from getNewDiagnostics() (src/services/diagnosticTracking.ts). ## Validation - tests/tools/test_file_operations.py + test_file_operations_edge_cases.py + test_file_tools.py + test_file_tools_live.py + test_file_write_safety.py + test_write_deny.py + test_patch_parser.py + test_file_ops_cwd_tracking.py: 228 passed locally. - Live E2E reproduction of the tips.py corruption incident: broken content written; lint field surfaces 'SyntaxError: invalid syntax. Perhaps you forgot a comma? (line 6, column 5)' — the exact error that would have self-corrected the bug on the next turn.	2026-05-05 04:54:17 -07:00
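The post-first/pre-lazy delta lint described in the commit above can be sketched as follows. This is a simplified illustration, not the code in tools/file_operations.py: `lint_inproc` and `lint_delta` are hypothetical names, only .py and .json are covered, and the result is a plain dict where the real change adds a lint field to WriteResult.

```python
import ast
import json
from typing import Optional


def lint_inproc(path: str, content: str) -> Optional[str]:
    """Return an error string if the content fails to parse, else None."""
    try:
        if path.endswith(".py"):
            ast.parse(content)
        elif path.endswith(".json"):
            json.loads(content)
    except (SyntaxError, ValueError) as exc:  # JSONDecodeError is a ValueError
        return f"{type(exc).__name__}: {exc}"
    return None


def lint_delta(path: str, pre: Optional[str], post: str) -> dict:
    """Lint the post-write state first; lint the pre-state only when needed."""
    post_error = lint_inproc(path, post)
    if post_error is None:
        # Hot path: a clean write costs exactly one lint.
        return {"success": True, "lint": None}
    if pre is not None and lint_inproc(path, pre) == post_error:
        # The file was already broken in the same way before the edit.
        return {"success": False, "lint": f"pre-existing: {post_error}"}
    return {"success": False, "lint": post_error}
```

Because the pre-state is parsed only after the post-state fails, a clean write never pays for a second lint.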
Chris Danis	28f4d6db63	fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}` for date-time params) and `format` keywords. llama.cpp's `json-schema-to-grammar` converter rejects regex escape classes (\\d/\\w/\\s) and most format values, returning HTTP 400 "parse: error parsing grammar: unknown escape at \\d" — the whole request fails. Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these keywords fine and use them as prompting hints. Stripping unconditionally loses useful hints for every cloud user to fix a llama.cpp-only bug. Approach: classify the llama.cpp grammar-parse 400 in the error classifier, and on match do a one-shot in-place strip of pattern/format from `self.tools`, then retry. Follows the existing `thinking_signature` recovery pattern. Cloud users hit zero overhead; llama.cpp users pay one failed request per session. Changes - agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern` + narrow HTTP-400 branch matching "error parsing grammar", "json-schema-to-grammar", or "unable to generate parser ... template". - tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper — reactive, walks schema nodes, skips property names (search_files.pattern survives). Returns strip count for logging. - run_agent.py: new one-shot recovery block in the retry loop. Strips, logs, continues. Falls through to normal retry if nothing to strip. - tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip tests including the property-name preservation and idempotency checks. Co-authored-by: Chris Danis <cdanis@gmail.com>	2026-05-05 04:25:18 -07:00
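The reactive strip described in the commit above, including the rule that property names such as search_files.pattern must survive, can be sketched like this. It reuses the helper name from the commit message but is an independent illustration, not the code in tools/schema_sanitizer.py; treating only string-valued `pattern`/`format` entries as keywords is an assumption.

```python
from typing import Any


def strip_pattern_and_format(node: Any) -> int:
    """Remove `pattern`/`format` keywords in place; return how many were removed."""
    if isinstance(node, list):
        return sum(strip_pattern_and_format(item) for item in node)
    if not isinstance(node, dict):
        return 0
    stripped = 0
    for keyword in ("pattern", "format"):
        # Assumption: a schema keyword carries a string value.
        if isinstance(node.get(keyword), str):
            del node[keyword]
            stripped += 1
    for key, value in node.items():
        if key == "properties" and isinstance(value, dict):
            # Keys under "properties" are parameter names, not keywords:
            # recurse into each property's schema but keep the name itself.
            for sub_schema in value.values():
                stripped += strip_pattern_and_format(sub_schema)
        else:
            stripped += strip_pattern_and_format(value)
    return stripped
```

A second call on the same schema returns 0, which matches the idempotency check the commit's tests mention.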
kshitij	109c3e468c	fix(terminal): guard background process spawn against deleted cwd (#19933) Follow-up to #19928, which fixed the foreground path in _run_bash. The background process spawn in process_registry.py had the same vulnerability: Popen(cwd=session.cwd) and PtyProcess.spawn(cwd=...) would raise FileNotFoundError if the directory was deleted. Apply _resolve_safe_cwd() at session creation time so both the PTY and pipe-mode Popen paths receive a validated cwd.	2026-05-04 15:35:34 -07:00
briandevans	9fa3a093f2	fix(local): test root as ancestor candidate; use real pipe for fake stdout Address Copilot review on PR #17569: 1. _resolve_safe_cwd never tested the filesystem root because the loop exited when `os.path.dirname(parent) == parent`, which is true once `parent == '/'`. Restructure so the root is checked before the self-equal exit. Adds `test_returns_root_when_only_root_exists` — regression-guarded by reverting the loop and watching it fail. 2. The fake `Popen.stdout` was a `MagicMock`; `BaseEnvironment._wait_for_process` calls `proc.stdout.fileno()` then `select.select`/`os.read` against it, which raised `TypeError: fileno() returned a non-integer` (visible as a thread exception in test output) and could in theory read from an unrelated real fd. Hand `fake_popen` a real `os.pipe()` with the write end pre-closed so the drain loop sees EOF immediately. Helper records each fd so the test cleans up after itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:31:47 -07:00
briandevans	9644b8ae67	fix(local): recover when persistent_shell cwd is deleted (#17558 ) When a tool call deletes its own working directory (`cd /tmp/foo && rm -rf /tmp/foo`), the next `subprocess.Popen(args, cwd=self.cwd)` raised `FileNotFoundError: [Errno 2]` before bash even started — every subsequent terminal/file-tool call hit the same wedge until the gateway restarted. Fix in `LocalEnvironment._run_bash`: before handing `self.cwd` to Popen, resolve a safe alternative when the path is gone (walk up to the nearest existing ancestor, falling back to `tempfile.gettempdir()` only as a last resort). Log a warning so the recovery is visible — not silent — and update `self.cwd` so the next call doesn't repeat the message. Defense in depth in `LocalEnvironment._update_cwd`: only adopt the new cwd when it still exists as a directory. `pwd -P` from a deleted cwd can leave a stale value in the marker file; refusing to store a missing path keeps `self.cwd` valid by construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:31:47 -07:00
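The ancestor walk from the two commits above (nearest existing ancestor, root included, temp dir as last resort) can be sketched as follows. This is a minimal illustration rather than the repository's `_resolve_safe_cwd`; the public name `resolve_safe_cwd` and the exact loop shape are assumptions drawn from the commit messages.

```python
import os
import tempfile


def resolve_safe_cwd(cwd: str) -> str:
    """Return cwd if it is a directory, else its nearest existing ancestor."""
    candidate = os.path.abspath(cwd)
    while True:
        # Test the candidate before the self-equal exit so that the
        # filesystem root is also checked as an ancestor.
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(candidate)
        if parent == candidate:
            # Even the root is missing: fall back to the temp directory.
            return tempfile.gettempdir()
        candidate = parent
```

A caller would pass the stored working directory through this function before handing it to `subprocess.Popen(..., cwd=...)`, and log a warning when the returned path differs from the input.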
Yoimex	c050ee6573	fix(file_ops): resolve search_files path/line collision for hyphenated numeric filenames	2026-05-04 12:37:47 -07:00
ClawdIA	64ad7dec0d	fix(file-ops): allow file search in hidden roots	2026-05-04 12:37:09 -07:00
lhysdl	6875471916	fix(tts): update MiniMax API endpoint to v1/text_to_speech MiniMax deprecated the old v1/t2a_v2 endpoint (api.minimax.io) and moved to v1/text_to_speech (api.minimax.chat). The new API: - Uses a flat payload: {model, text, voice_id} instead of nested voice_setting / audio_setting objects - Returns raw audio bytes (Content-Type: audio/mpeg) instead of JSON with hex-encoded audio - Uses model 'speech-01' instead of 'speech-2.8-hd' - Updated default voice_id to 'female-shaonv' for Chinese TTS The implementation detects Content-Type to handle both old and new API responses, maintaining backward compatibility for any users who manually configured the legacy base_url.	2026-05-04 12:36:09 -07:00
Teknium	3db6b9cc87	feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709 ) * feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) Adds a no_agent=True option to the cronjob system. When enabled, the scheduler runs the attached script on schedule and delivers its stdout directly to the job's target — no LLM, no agent loop, no token spend. This is the classic bash-watchdog pattern (memory alert every 5 min, disk alert every 15 min, CI ping) reimplemented as a first-class Hermes primitive instead of a systemd timer + curl + bot token triplet living outside the system. ## What hermes cron create "every 5m" \ --no-agent \ --script memory-watchdog.sh \ --deliver telegram \ --name memory-watchdog Agent tool: cronjob(action='create', schedule='every 5m', script='memory-watchdog.sh', no_agent=True, deliver='telegram') Semantics: - Script stdout (trimmed) → delivered verbatim as the message - Empty stdout → silent tick (no delivery; watchdog pattern) - wakeAgent=false gate → silent tick (same gate LLM jobs use) - Non-zero exit/timeout → delivered as an error alert (broken watchdogs shouldn't fail silently) - No LLM ever invoked; no tokens spent; no provider fallback applied ## Implementation cron/jobs.py * create_job gains no_agent: bool = False * prompt becomes Optional (no_agent jobs don't need one) * Validation: no_agent=True requires a script at create time * Field roundtrips via load_jobs / save_jobs / update_job cron/scheduler.py * run_job: new short-circuit branch at the top that runs the script, wraps its output into the (success, doc, final_response, error) tuple downstream delivery already expects, and returns before any AIAgent import or construction * _run_job_script: picks interpreter by extension — .sh/.bash run under /bin/bash, anything else under sys.executable (Python). Shell support unlocks the bash-watchdog pattern without wrapping scripts in Python. 
Extension is explicit; we deliberately do NOT trust the file's own shebang. Path-containment guard (scripts dir) unchanged. tools/cronjob_tools.py * Schema: new no_agent boolean property with clear trigger guidance * cronjob() accepts no_agent and validates mode-specific shape: - no_agent=True requires script; prompt/skills optional - no_agent=False keeps the existing 'prompt or skill required' rule * update path rejects flipping no_agent=True on a job without a script * _format_job surfaces no_agent in list output * Handler lambda forwards no_agent from tool args hermes_cli/main.py, hermes_cli/cron.py * 'hermes cron create --no-agent' and edit's --no-agent / --agent pair for toggling at CLI parity with the agent tool * Existing --script help text updated to describe both modes * List / create / edit output now shows 'Mode: no-agent (...)' when set ## Tests tests/cron/test_cron_no_agent.py — 18 tests covering: * create_job: no_agent shape, validation, field persistence * update_job: flag roundtrip across reload * cronjob tool: schema validation, update toggling, mode-specific requirements, prompt-relaxation rule * run_job short-circuit: - success path delivers stdout verbatim - empty stdout → SILENT_MARKER (no delivery downstream) - wakeAgent=false gate → silent - script failure → error alert - run_job does NOT import AIAgent (verified via mock) * _run_job_script: - .sh executes via bash (no shebang required) - .bash executes via bash - .py still runs via sys.executable (regression) - path-traversal still blocked (security regression) All 18 new tests pass. 341/342 pre-existing cron tests still pass; the one failure (test_script_empty_output_noted) was already broken on main and is unrelated to this change. ## Docs website/docs/guides/cron-script-only.md — new dedicated guide covering the watchdog pattern, interpreter rules, delivery mapping, worked examples (memory / disk alerts), and the comparison table vs hermes send, regular LLM cron jobs, and OS-level cron. 
website/docs/user-guide/features/cron.md — new 'No-agent mode' section in the cron feature reference, cross-linked to the guide. website/docs/guides/automate-with-cron.md — new tip box pointing users to no-agent mode when they don't need LLM reasoning. ## Compatibility - Existing jobs: unchanged. no_agent defaults to False, existing code paths untouched until the flag is set. - Schema additive only; older jobs.json without the field load fine via .get() with False default. - New CLI flags are opt-in and don't alter existing flag behavior. * fix(cron): lazy-import AIAgent + SessionDB so no_agent ticks pay zero The unconditional `from run_agent import AIAgent` + SessionDB() init at the top of run_job() meant every no_agent tick still paid the full agent module load cost (~300ms + transitive imports + DB open) even though it never touched any of that machinery. Move both to live under the default (LLM) path, after the no_agent short-circuit has returned. Now a no_agent tick's sys.modules stays clean — verified end-to-end: assert 'run_agent' not in sys.modules # before run_job(no_agent_job) assert 'run_agent' not in sys.modules # after The existing mock-based unit test (test_run_job_no_agent_never_invokes_aiagent) kept passing because patch() replaces the class AFTER import; the leak was only visible via real subprocess-style verification. End-to-end demo confirmed: agent calls cronjob(no_agent=True) → script runs → stdout delivered → no LLM machinery loaded. * docs(cron): tighten no_agent tool schema — defaults, silent semantics, pick rule Previous description buried the important bits in one long sentence. Agents could plausibly miss three things an LLM-facing schema should make unmissable: 1. What the default is — now first sentence + JSON Schema `default: false` 2. What 'silent run' actually means for the user — now spelled out: 'nothing is sent to the user and they won't see anything happened' 3. 
When to pick True vs False — now a concrete decision rule with examples on both sides (watchdogs/metrics/pollers → True; summarize/draft/pick/rephrase → False) Also adds explicit 'prompt and skills are ignored when True' since the agent could otherwise still pass them out of habit. No behavior change — schema text only.	2026-05-04 12:31:01 -07:00
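The no_agent semantics listed in the commit above (interpreter chosen by extension, empty stdout as a silent tick, failures surfaced as alerts) can be sketched as two small helpers. These are hypothetical names written for illustration, not the code in cron/scheduler.py; the alert wording is invented, and the wakeAgent=false gate and the scripts-dir containment guard are left out.

```python
import os
import sys
from typing import List, Optional


def interpreter_argv(script_path: str) -> List[str]:
    """Choose the interpreter from the extension; the shebang is ignored."""
    ext = os.path.splitext(script_path)[1]
    if ext in (".sh", ".bash"):
        return ["/bin/bash", script_path]
    # Anything else runs under the current Python interpreter.
    return [sys.executable, script_path]


def message_for(stdout: str, returncode: int) -> Optional[str]:
    """Map a finished script run to the message to deliver, or None for silence."""
    if returncode != 0:
        # A broken watchdog should not fail silently (alert text is hypothetical).
        return f"script failed with exit code {returncode}"
    text = stdout.strip()
    # Empty stdout is a silent tick: nothing is delivered.
    return text or None
```

A scheduler tick would run `interpreter_argv(path)` with `subprocess.run(..., capture_output=True, text=True)` and deliver whatever `message_for` returns, skipping delivery on None.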
ygd58	74c1b946e0	fix(browser): inject --no-sandbox for root and AppArmor userns restrictions On VPS/Docker and some Ubuntu 23.10+ hosts, Chromium refuses to start without --no-sandbox: - uid=0 (root): hard requirement (VPS/Docker deployments) - AppArmor apparmor_restrict_unprivileged_userns=1 (Ubuntu 23.10+): non-root too, under systemd or unprivileged containers Detect both conditions and inject AGENT_BROWSER_CHROME_FLAGS with --no-sandbox --disable-dev-shm-usage when the user hasn't already set the flags themselves. Salvage of #15771 — only the browser_tool.py fix is cherry-picked. The PR's accompanying MCP preset addition (new feature surface) was dropped so the bug fix can land independently. Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-05-04 05:27:23 -07:00

1233 commits