hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-30 19:09:28 +00:00

Author	SHA1	Message	Date
Brooklyn Nicholson	6bd8132bf9	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/gui	2026-05-30 17:23:21 -05:00
Teknium	b1a25404b6	perf(read_file): make compact gutter the only format; drop HERMES_READ_GUTTER (#35532) The compact "<n>|content" gutter from #35368 is now the sole behavior. Removes the HERMES_READ_GUTTER=padded escape hatch and its env lookup — no legacy fixed-width path to maintain. Padding was pure token overhead (~48% more tokens than bare content, ~16% more than compact) with no measured accuracy gain in the original A/B. - file_operations.py: drop env lookup + os import; gutter always f"{i}|{line}" - tests: drop the padded env-override test; compact assertions retained	2026-05-30 14:38:30 -07:00
Brooklyn Nicholson	c83cd38391	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/gui # Conflicts: # tui_gateway/server.py	2026-05-30 13:19:27 -05:00
Sylw3ster	6a72af044c	fix(managed-gateway): keep tool availability scans off the Nous token-refresh path	2026-05-30 07:58:08 -07:00
Teknium	96643b4a52	fix(file-tools): anchor relative-path resolution to absolute base; report resolved path (#35399) Relative paths in write_file/patch could resolve against the agent PROCESS cwd instead of the terminal's working directory. In a git-worktree session with a stale TERMINAL_CWD='.' (a relative base), early edits silently landed in the MAIN checkout, verified there, and reported success — while the agent inspected the worktree and saw nothing, misreading it as the patch tool no-op'ing. - _resolve_base_dir(): resolution base is now ALWAYS absolute. A relative TERMINAL_CWD is anchored to the process cwd once, deterministically, instead of being left to resolve()-time cwd. Live terminal cwd stays authoritative. - write_file/patch pass the resolved absolute path to the shell FileOps layer so the tool layer and shell layer can't disagree about which file is edited. - Responses now report the absolute resolved_path and files_modified, so a wrong-cwd mismatch is visible on the first call. - _path_resolution_warning(): emits a _warning when a relative path resolves OUTSIDE the live terminal cwd (e.g. a worktree session writing into main). Validation: 11 new unit tests + 43 live E2E assertions (worktree routing, mid-session cd, V4A patches, divergence warning, absolute paths, consecutive patches); 466 existing file/path/terminal tests green.	2026-05-30 07:55:36 -07:00
Teknium	92ad7cc62c	fix(browser): recover from CDP DOM-node serialization crash in browser_console (#35385) browser_console(expression="document.body") returned the cryptic CDP error "Object reference chain is too long" instead of a usable result. With returnByValue=true, Chrome deep-serializes the eval result; for a live DOM Node/NodeList/Window that serialization overruns CDP's recursion guard and fails the whole call with a protocol-level error (not a JS exception), which _browser_eval surfaced raw. - browser_supervisor.evaluate_runtime: on that specific error, retry once with returnByValue=false so Chrome returns the node's description string — the same graceful path already used for document.querySelector() results. - browser_tool._browser_eval (CLI subprocess fallback): the subprocess can't retry, so convert the reference-chain error into actionable guidance (extract a primitive / use JSON.stringify) instead of leaking it raw. No expression rewriting — normal evals (1+41 -> 42) are untouched.	2026-05-30 07:31:25 -07:00
teknium1	182739fcda	test(interrupt): assert no leaked tid instead of no-op block Follow-up on the #35309 regression test: the trailing `with _lock: pass` asserted nothing. Replace it with a concrete assertion that _interrupted_threads is empty after the worker exits, directly verifying the leak the fix prevents.	2026-05-30 07:28:11 -07:00
liuhao1024	bede3cf12d	fix(tools): wrap _run_tool cleanup in finally to prevent interrupt state leak When _invoke_tool raises a BaseException (CancelledError, KeyboardInterrupt), the cleanup code at the end of _run_tool was bypassed because it sat outside the except block (which only catches Exception). ThreadPoolExecutor recycles thread IDs, so the leaked tid in _interrupted_threads poisons the next tool scheduled on that thread — it instantly aborts with 'Interrupted'. Move the discard + _set_interrupt(False) into a finally block so cleanup runs regardless of how the worker exits. Fixes #35309	2026-05-30 07:28:11 -07:00
Teknium	ea6eaabd8f	perf(read_file): compact line-number gutter — ~14% fewer tokens per read (#35368) read_file's gutter used a fixed-width zero/space-padded prefix (" 1|content"). The padding is pure token overhead: measured with cl100k on real Hermes source, the padded gutter costs ~48% more tokens than bare content and ~16% more than a compact "<n>|content" gutter, because the leading spaces tokenize into extra tokens on every line. Switched the default to the compact "<n>|content" form. An A/B (Sonnet 4.6 via OpenRouter, 2 passes, 4-task battery, every claim verified against ground truth) showed: - padded : 4/4 PASS both passes - compact : 4/4 PASS both passes ← keeps line-referencing + patch - none : 3/4 PASS both passes ← dropping numbers entirely made the model hand-count lines and answer off-by-one (33 vs 34) So we keep the line numbers (the model genuinely uses them to reference lines) but drop the wasteful padding — capturing ~14% of the read-token cost with zero measured accuracy change. Dropping numbers entirely (the larger 33% saving) is rejected: it regresses line-referencing. patch/fuzzy_match never consumed the gutter (they match old_string text and compute char offsets internally), so editing is unaffected. No downstream parser keys on the fixed-width columns. HERMES_READ_GUTTER=padded restores the legacy format for anyone relying on alignment. Tests: updated the 3 format assertions to the compact gutter; added an env-override test for the legacy padded format. 209 file-tool tests green.	2026-05-30 07:01:22 -07:00
Teknium	5f84c9144a	fix(file-tools): handle UTF-8 BOM in read_file / write_file / patch (#35278) Some Windows editors prepend an invisible UTF-8 BOM (U+FEFF) to text files. We had no awareness of it, so: read_file surfaced a phantom U+FEFF as the first character; patch matches against the true first line could miss; and a write/patch round-trip silently stripped the marker, changing the file's byte signature. Now: - read_file / read_file_raw strip a single leading BOM so the model never sees it (only on the first chunk — the marker lives at byte 0). - patch_replace strips the BOM before fuzzy-matching (so an exact first-line match works) and its post-write verification compares BOM-stripped content. - write_file restores the BOM when the original file had one and the new content doesn't, mirroring the existing line-ending preservation (detect on disk via a cheap `head -c 3` probe or reuse pre_content, re-prepend across the edit). Guards against double-BOM. Mid-content U+FEFF is left alone (it's data there, not a file marker). Tests: TestBomHandling (real LocalEnvironment) — read-strips, raw-read strips, write preserves, no-BOM-when-original-had-none, no-double-BOM, patch round-trip preserves, patch matches first line through a BOM, plus helper unit tests. 208 file-tool tests green.	2026-05-30 06:25:50 -07:00
teknium1	bfc4a26032	fix(tools): point email home-channel error at EMAIL_HOME_ADDRESS The no-home-channel error for send_message derived the env var name generically as <PLATFORM>_HOME_CHANNEL, producing EMAIL_HOME_CHANNEL for the email platform. But gateway/config.py reads EMAIL_HOME_ADDRESS, so a user following the error's guidance would set a variable that is never consulted. Add a per-platform override map so the email hint names the variable actually read; all other platforms keep the generic hint.	2026-05-30 02:39:08 -07:00
liuhao1024	d3724c0be6	fix(tools): recognize email addresses as explicit targets in send_message When using send_message with the email platform, valid email addresses like user@example.com were not recognized as explicit targets by _parse_target_ref(). This caused the function to return (None, None, False), forcing the system into channel-name resolution which has no way to resolve a raw email address, resulting in 'No home channel set for email' errors. Add _EMAIL_TARGET_RE pattern and email platform handler in _parse_target_ref() so email addresses are treated as explicit targets and routed directly without requiring a home target configuration.	2026-05-30 02:39:08 -07:00
teknium1	41decf2c4a	test(mcp): import os and pytest in test_mcp_stability The salvaged grandchild-reaping tests reference os.getpgid/os.killpg and pytest.mark/skip/importorskip directly, but the file only imported asyncio, signal, and unittest.mock. Add the missing imports so collection succeeds on current main.	2026-05-30 02:08:29 -07:00
konsisumer	a29d64e50c	fix(mcp): reap stdio MCP grandchildren via process-group signal The orphan reaper for stdio MCP subprocesses only tracked the direct child PID spawned by ``stdio_client`` (e.g. ``openclaw mcp serve``). When that wrapper itself spawned a helper (``claude mcp serve``) and then exited, the helper reparented to ``systemd --user`` and survived shutdown. The MCP SDK already spawns stdio children with ``start_new_session=True``, so the wrapper is its own pgroup leader and same-pgroup descendants are reachable via ``killpg``. Capture the pgid at spawn time and reap via ``killpg(pgid, sig)`` so reparented grandchildren are reaped alongside the direct child, even after the wrapper itself exits. Falls back to per-pid ``os.kill`` on Windows or when no pgid was recorded. Fixes part 2 (orphan ``claude mcp serve``) of #23799. Part 1 (per-invocation respawn) was confirmed by the reporter to be an environmental artifact, not a code bug.	2026-05-30 02:08:29 -07:00
Teknium	39f6b6e9d2	fix(file-tools): make write_file/patch atomic (temp-file + rename) (#35252 ) * Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' Adds a user-chosen compression boundary to the existing /compress command. /compress here [N] summarizes everything except the most recent N exchanges (default 2), which are preserved verbatim — letting the user pick the compression boundary instead of relying on the automatic token-budget heuristic. Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139, Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20 - hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation guard (shared by CLI and gateway). - cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression; compress only the head, re-append the verbatim tail through the seam guard. - Preserves message-flow role alternation (seam guard merges any illegal user->user / assistant->assistant adjacency). - Reuses the existing _compress_context session-rotation/lock machinery — no changes to the compression core. - Bare /compress (full) and /compress <focus> behavior unchanged. Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved tool-call transcript, degenerate/multimodal seams, real handler path). * fix(file-tools): make write_file/patch atomic (temp-file + rename) write_file streamed content straight into the target via `cat > path`, so a crash, SIGKILL, or truncated pipe mid-write left the file half-written and corrupt. patch_replace routes through write_file, so it shared the flaw. Now writes stream into a temp file in the SAME directory and `mv` it over the target — a real same-filesystem rename, which is atomic on POSIX and on every terminal backend (local/docker/ssh/modal). A failed write leaves the original byte-intact and leaks no temp file. 
The existing file's mode is preserved across the swap (stat + chmod, GNU/BSD), and content still rides stdin so there's no ARG_MAX limit. A trap cleans the temp on any error path. Tests: added TestAtomicWrite (real LocalEnvironment, no mocks) covering inode-change-on-overwrite, mode preservation, failed-write-leaves-original, no-temp-leak, special chars, and patch routing. Updated two mocks in test_file_operations.py that keyed on the literal `cat >` write command to key on the stdin_data behavioral signal instead. 200 file-tool tests green.	2026-05-30 02:07:50 -07:00
teknium1	6a08fd3c3f	test(skills): assert restore via synced[copied], not manifest re-read The hermetic CI env (slice 4/6) redirects HERMES_HOME, so a post-restore _read_manifest() can resolve to an empty/redirected manifest path and return {}. Assert on sync_skills's in-memory return value (synced["copied"]) instead, which is the resilient signal that the skill was re-copied and is no longer in limbo.	2026-05-30 02:05:10 -07:00
teknium1	8ae0802d59	fix(skills): make _rmtree_writable handle read-only directories, not just files The cherry-picked fix's onerror handler chmod'd only the failing path, but unlinking a child requires write permission on its PARENT directory. On a true Nix-store copy (r-xr-xr-x dirs + files) rmtree still failed. Now chmod the parent dir as well before retrying. Also rewrites the regression test: the original asserted the helper FAILS on a read-only dir (documenting the limitation), which is the wrong success criterion. Split into two tests — restore succeeds on a full read-only tree (real Nix case), and manifest is preserved when removal genuinely cannot proceed (monkeypatched).	2026-05-30 02:05:10 -07:00
annguyenNous	83a7d0b601	fix(skills): fix transaction ordering in reset_bundled_skill and handle read-only files in rmtree Two related bugs in tools/skills_sync.py affecting Nix-store and immutable-package installs: #34972 — reset_bundled_skill corrupts manifest on rmtree failure: The function deleted the manifest entry BEFORE attempting rmtree. If rmtree failed (read-only files from Nix store), the function returned early — leaving the skill in a manifest-less limbo state where future syncs silently skip it forever. Fix: reorder steps — attempt rmtree FIRST, only delete manifest entry after rmtree succeeds. If rmtree fails, nothing is changed. #34860 — stale .bak directories after sync: sync_skills() called shutil.rmtree(backup, ignore_errors=True) which silently failed on read-only files, leaving persistent .bak dirs. Fix: add _rmtree_writable() helper that makes files writable via an onerror callback before retrying removal. Used in both sync_skills() backup cleanup and reset_bundled_skill(). Fixes #34972 Fixes #34860	2026-05-30 02:05:10 -07:00
Teknium	b4cf114f68	fix(vision): fail fast on non-retryable image download errors (#35221) _download_image() wrapped every download attempt in a blanket `except Exception` and retried 3x with 2s/4s/8s backoff regardless of cause. A 404/403 image URL would never resolve on retry, so it just burned up to 6s of wall-clock + extra GETs before failing — inflating latency for a deterministic failure (issue #32296, umbrella #35114). Add _is_retryable_download_error(): 4xx client errors (except 429), website-policy PermissionError, and too-large/SSRF ValueError now raise on the first attempt. 429, 5xx, and unclassified network errors stay retryable. Removed the now-unreachable fall-through branch since the loop always returns on success or re-raises on the final/terminal attempt.	2026-05-30 01:40:39 -07:00
Brooklyn Nicholson	b86043834f	Merge origin/main into bb/gui Adopt main's web/ dashboard layout (apps/dashboard removed; web/ restored), keep bb/gui's desktop CLI/update workspace handling, and preserve main's mTLS/URL validation MCP changes. Dashboard backend is aligned to main with only the intended STT provider quarantine/ElevenLabs override reapplied.	2026-05-29 20:40:08 -05:00
Teknium	3a2c03061c	fix(stt,tts): restore mistralai — 2.4.8 is clean, ban lifted (#34841 ) * docs(code-execution): document HERMES_* env narrowing + passthrough workaround The execute_code sandbox-child env scrub (`108397726`, #27303) deliberately dropped the broad HERMES_ prefix passthrough, keeping only an operational 4-var allowlist (HERMES_HOME/PROFILE/CONFIG/ENV). A script that relied on a non-secret HERMES_* var (HERMES_BASE_URL, HERMES_KANBAN_DB, HERMES__WEBHOOK, or a plugin-defined one) now sees it unset in the child. Document the behavior change and the two recovery routes (terminal.env_passthrough in config.yaml, or required_environment_variables in skill frontmatter), plus the debug log line that surfaces the drop for diagnosis. fix(stt,tts): restore mistralai — 2.4.8 is clean, ban lifted PyPI quarantined mistralai on 2026-05-12 after the malicious 2.4.6 release (Mini Shai-Hulud worm). 2.4.6 has since been removed from the registry and clean releases resumed (2.4.7 2026-05-25, 2.4.8 2026-05-28). This rolls back the blanket runtime ban so Voxtral STT + TTS work again, following the restoration checklist the repo left in pyproject.toml. Verified against the real SDK: 2.4.8 keeps the import path the code uses (from mistralai.client import Mistral) and the audio.transcriptions.complete / audio.speech.complete surfaces. Changes: - pyproject.toml: re-add mistral extra pinned to mistralai==2.4.8; left OUT of [all] per the 2026-05-12 lazy-install policy (one quarantined release must not break fresh installs). uv.lock regenerated. - tools/lazy_deps.py: add stt.mistral / tts.mistral entries so the SDK lazy-installs on first use (matches edge / elevenlabs). - tools/transcription_tools.py: restore explicit-provider gate (_HAS_MISTRAL + key) and auto-detect entry (local>groq>openai>mistral>xai); _transcribe_mistral lazy-installs before import. - tools/tts_tool.py: dispatcher routes back to _generate_mistral_tts; _import_mistral_client lazy-installs the SDK. 
- hermes_cli/tools_config.py, hermes_cli/web_server.py: un-hide Mistral from the TTS provider picker and dashboard STT options. - hermes_cli/security_advisories.py: KEEP the shai-hulud-2026-05 advisory (module policy forbids removal) — it is scoped to 2.4.6 only, so it still warns anyone with the poisoned build cached and never fires on 2.4.8. Summary note updated to reflect the un-quarantine. - tests: revert the disabled-behavior assertions added by the ban commit back to routing/positive expectations; add mistral to the lazy-installable-extras-excluded-from-[all] contract. Reported by @SkYNewZ (#34503). Validation: 189 targeted STT/TTS/lazy_deps/metadata tests pass; E2E with the real mistralai 2.4.8 SDK routes both STT and TTS to mistral.	2026-05-29 13:24:12 -07:00
Teknium	b6ed3913d2	feat(skills): categorize tap skills from skills.sh.json grouping sidecar A GitHub tap can ship a repo-root skills.sh.json (the published skills.sh schema) declaring category groupings. The Skills Hub now reads it at index time and uses each grouping title as the skill's category label, instead of the tag-derived guess. Generic: any tap that ships the file gets real categorization — NVIDIA's groupings (Inference AI, Decision Optimization, GPU Development, etc.) flow through automatically. - GitHubSource: _get_skillsh_groupings() fetches+caches the sidecar per repo; _parse_skillsh_groupings() flattens it to {skill_name: title}; _list_skills_in_repo() stamps meta.extra['category']; _meta_to_dict now serializes extra so the category survives the index cache round-trip. - extract-skills.py: prefers extra['category'] over the tag heuristic and exempts sidecar categories from the small-category to Other collapse. - Docs + 12 tests.	2026-05-29 12:24:39 -07:00
Teknium	4de8009ce4	feat(skills): integrate NVIDIA/skills as a trusted skills hub tap NVIDIA/skills is now a default trusted tap in the Hermes Skills Hub — discoverable, browsable, searchable, and auto-updating through the same pipeline that already serves OpenAI, Anthropic, and HuggingFace skills. Rebased onto current main.	2026-05-29 12:24:39 -07:00
alt-glitch	0563ab0652	fix(test): add fal_client.submit stub to surface matrix test The plugin switched from fal_client.subscribe() to submit()+handle.get(). The test mock only had subscribe, causing CI failures.	2026-05-29 22:26:24 +05:30
alt-glitch	b6294ea9f1	test(video_gen): cover gateway decision matrix gaps and 4xx error path - Add test for 4xx ValueError with actionable remediation message - Add test for is_available() returning True via managed gateway - Add test for prefers_gateway overriding direct FAL_KEY - Add test for is_available() via gateway in plugin test file	2026-05-29 22:26:24 +05:30
alt-glitch	d04b3c193e	feat(video_gen): route FAL video gen through managed Nous gateway Wire plugins/video_gen/fal/__init__.py to use the same _ManagedFalSyncClient pattern that image gen already uses. Changes: - Add managed gateway resolution, client caching, and _submit_fal_video_request() that routes between direct FAL_KEY and Nous gateway modes - Update is_available() to return True when either FAL_KEY or the managed gateway is reachable - Update generate() to use submit+get handle pattern instead of fal_client.subscribe() directly - Fix happy-horse endpoint namespace: fal-ai/ → alibaba/ (matches the tool-gateway allowlist from fal-video-gen branch) - Surface actionable error on 4xx gateway rejections Tests: - 4 new tests in test_managed_media_gateways.py (gateway routing, client reuse, direct mode fallback, alibaba namespace) - Updated existing test_fal_plugin.py fixture to use submit/handle pattern and patch _resolve_managed_fal_video_gateway for isolation	2026-05-29 22:26:24 +05:30
teknium1	1c53d39eaa	test: deflake process-registry kill + PTY resize tests Two CI flakes surfaced on PR #34572 (both in files this PR doesn't touch; pre-existing host-dependent flakes): 1. test_process_registry::TestPopenLeakOnSetupFailure — the failure-cleanup tests use a fake proc.pid (8888/9999) and assert proc.kill() runs. But spawn_local's primary cleanup is os.killpg(os.getpgid(pid), SIGKILL), falling back to proc.kill() only on ProcessLookupError/PermissionError/ OSError. When the fake PID happens to exist on a busy host, os.getpgid succeeds, os.killpg fires against an UNRELATED real process group, and proc.kill() is never reached -> flaky AssertionError (and a real risk of SIGKILLing an innocent process group from a unit test). Patch os.getpgid to raise ProcessLookupError so the fallback path runs deterministically and no real killpg is ever issued. 2. test_web_server::test_resize_escape_is_forwarded — the receive loop calls the blocking conn.receive_bytes() with no exception guard. Once the child prints its winsize and exits, the PTY closes; on a missed-marker run the next recv blocks until the 30s pytest-timeout instead of failing fast. Add a try/except break (matching the working sibling tests) and bump the child's pre-read sleep 0.15s -> 0.5s so the resize reliably lands first. Verified: 4/4 pass across 3 consecutive runs; root cause for #1 reproduced (os.getpgid(1) succeeds -> old code skips proc.kill).	2026-05-29 04:22:41 -07:00
briandevans	6e179c44b1	fix(web): ensure plugin discovery before web_*_tool registry lookups Web search/extract dispatch read agent.web_search_registry before plugin discovery had run, so in any process that hadn't imported model_tools.py (subprocess agent runs, delegate children, standalone scripts) the registry was empty: get_provider('firecrawl') returned None and the dispatcher emitted the misleading 'No web extract provider configured' error even with web.extract_backend set and FIRECRAWL_API_KEY exported. Adds an idempotent _ensure_web_plugins_loaded() helper (mirrors tools.browser_tool._ensure_browser_plugins_loaded) and calls it at the top of both the web_search_tool and web_extract_tool dispatch sites before the registry lookup. Fixes #27580. Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>	2026-05-29 04:00:00 -07:00
teknium1	c77a697fa4	refactor(vision): consolidate native fast-path gate into one shared helper The fast-path decision (native routing + provider allowlist OR supports_vision override) lived inline in vision_analyze and was copied into browser_vision. Extract it to _should_use_native_vision_fast_path() so both tools share one source of truth. - vision_tools: gate logic now one helper; vision_analyze calls it in 3 lines - browser_tool: thin envelope decoration over the shared helper, not a copy - browser_vision typed Union[str, Dict] to match its real return shape - tests slimmed to target the override path + text-mode-wins invariant	2026-05-29 03:58:56 -07:00
tillfalko	2402ec5e7b	test: extend test coverage to native image routing	2026-05-29 03:58:56 -07:00
teknium1	3171845479	fix(code-exec): make dropped HERMES_* env vars diagnosable in sandbox scrub Follow-up mitigation for the #27303 env-scrub tightening. Dropping the broad HERMES_ prefix in favor of a 4-var operational allowlist is correct hardening, but a sandbox script that imports a repo module reading a non-allowlisted HERMES_* var at import time would otherwise see it silently unset. _scrub_child_env now emits a one-shot debug log naming the dropped non-secret HERMES_* vars and pointing at the env_passthrough opt-in escape hatch. Secret-shaped vars are never named in the log. Tests: dropped vars are logged + env_passthrough named; no log when nothing is dropped; secret vars excluded from the diagnostic.	2026-05-29 03:44:49 -07:00
firefly	4bdae34771	test(code-exec): regression suite for the approval-bypass cluster Cover context+callback propagation and teardown-clears, a source guard that both RPC threads stay wrapped, the check_execute_code_guard decision matrix (isolated backend, headless-local, cron-deny, gateway approve/deny/timeout/missing-notify, smart mode, session-yolo), the env-scrub allowlist/secret rules, and a behavioral test that execute_code() blocks before spawning on denial. Refs #4146, #27303, #30882, #33057	2026-05-29 03:44:49 -07:00
firefly	1083977261	fix(code-exec): restore approval context in execute_code RPC threads + guard entry Wrap both execute_code RPC threads (local UDS + remote file-RPC) with propagate_context_to_thread so gateway sessions no longer fall into check_dangerous_command's non-interactive auto-approve branch and the CLI approval prompt stays reachable. Add check_execute_code_guard: one-shot fail-closed approval of the whole script in gateway/ask/cron-deny before the child spawns (skips isolated backends; command-string built only past the early returns). Drop the broad HERMES_ env passthrough for an explicit operational allowlist plus DSN/WEBHOOK secret substrings, and update the POSIX-equivalence oracle. Refs #4146, #27303, #30882, #33057	2026-05-29 03:44:49 -07:00
teknium1	7427b9d581	fix(tool-search): scope bridge catalog + dispatch to the session's toolsets Tool Search read its catalog from the global registry (get_tool_definitions with no toolset scope = 'start with everything'), so a restricted-toolset session — subagent, kanban worker, curated gateway session — could: 1. tool_search the entire process registry, not just its granted tools, and 2. tool_call any registered plugin/MCP tool it was never given, because registry.dispatch() has no enabled_tools gate for non-execute_code tools. A scoped session (enabled_toolsets=['mcp-github']) reported total_available=26 and successfully invoked an out-of-scope plugin tool via tool_call. Fix: - handle_function_call gains enabled_toolsets/disabled_toolsets; the bridge dispatch scopes get_tool_definitions to them (also stops polluting the process-global _last_resolved_tool_names with out-of-scope tools, which leaked into execute_code's sandbox-tool fallback). - A defense-in-depth gate rejects any tool_call'd name not in the scoped deferrable catalog. - tool_executor's unwrap (both concurrent + sequential paths) enforces the same scope before dispatch, since it unwraps tool_call -> underlying name and bypasses the bridge branch. New _tool_search_scoped_names() helper, cached per-agent on registry generation + toolset scope. - New scoped_deferrable_names() helper in tool_search.py shared by both sites. Tests: 4 new regression tests in TestRegression_ToolsetScoping (scoped catalog, out-of-scope tool_call rejection, no global pollution, helper).	2026-05-29 02:04:12 -07:00
teknium1	369075dc95	feat(tools): progressive tool disclosure for MCP and plugin tools Adds Tool Search, a structured-tools progressive-disclosure layer that replaces MCP and non-core plugin tools in the model-visible tools array with three bridge tools (tool_search / tool_describe / tool_call) when the deferrable surface would consume more than a configurable percentage of the active model's context window. Core Hermes tools are never deferred. Default mode is 'auto' with a 10% context threshold, so small toolsets pay no overhead. Set tools.tool_search.enabled to 'on' to force or 'off' to disable. Design carefully reflects the OpenClaw production failure modes documented in the openclaw-tool-search-report: - Core tools never defer (toolsets._HERMES_CORE_TOOLS). Addresses the 'tools silently missing from isolated cron turns' regression class (openclaw#84141) by construction: there is no code path that can drop a core tool. - Catalog is stateless across turns — rebuilt from the live tool-defs list on every assembly. No session-keyed Map that can drift out of sync with the registry. - tool_call unwraps the bridge call before any hook fires, so plugin pre/post hooks, guardrails, approval flows, and the activity feed all see the underlying tool name, not the bridge (addresses openclaw#85588 and the verbose-mode complaint on openclaw#79823). - The unwrap happens in both the parallel and sequential paths of agent/tool_executor.py and also in handle_function_call, so direct callers (sandboxed code, eval harnesses) are covered too. - Bridge tools cannot invoke each other (recursion guard) and cannot invoke core tools (those must be called directly). - Tools mode only — no JS-sandbox code-mode. Keeps the surface small. - Token estimation via cheap char/4 heuristic; precision isn't needed for the threshold decision. Files: - tools/tool_search.py — new module (BM25 retrieval, classification, threshold gate, bridge dispatch, unwrap helper). 
- tests/tools/test_tool_search.py — 35 tests including the OpenClaw #84141 regression guard. - model_tools.py — wires assembly into _compute_tool_definitions as the final step, adds skip_tool_search_assembly kwarg so the bridge can see the real catalog, dispatches the three bridge tools. - agent/tool_executor.py — unwraps tool_call in both parallel and sequential parsing loops so checkpointing, guardrails, plugin hooks, and tool-progress callbacks all observe the underlying tool name. - hermes_cli/config.py — DEFAULT_CONFIG['tools']['tool_search'] block. - website/docs/user-guide/features/tool-search.md — user docs. Validation: - 35/35 new tests pass. - Existing tool/registry/model_tools/config/coercion/executor tests (82 + 74 + small adjacents) green. - Live E2E: 20 fake MCP tools registered, get_tool_definitions returns 3 bridges, tool_search returns top 3 hits, tool_describe returns full schema, tool_call dispatches to the real underlying handler and the underlying result is what the model sees. - Reserved-name recursion guard verified live. - Core-tool refusal via tool_call verified live.	2026-05-29 02:04:12 -07:00
teknium1	6bebab4761	fix(security): narrow Bedrock subprocess strip to inference bearer token only Scopes the AWS_SDK subprocess strip down from the full AWS credential chain to just AWS_BEARER_TOKEN_BEDROCK — the only Hermes-managed inference secret (analogous to OPENAI_API_KEY). The general AWS credential chain (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN / AWS_PROFILE / config + role pointers) is intentionally left inheritable. Why: per SECURITY.md §3.2 the local terminal is the user's trusted operator shell. Hard-blocklisting the general chain would (a) regress every user who runs aws/terraform/cdk/boto3 in the agent terminal — not just Bedrock users, since PROVIDER_REGISTRY is iterated unconditionally at import — and (b) be unrecoverable, because env_passthrough.py refuses to re-allow anything in _HERMES_PROVIDER_ENV_BLOCKLIST (GHSA-rhgp-j443-p4rf). The narrow strip closes the reported leak (opencode enumerating the Bedrock catalog off the leaked bearer token) with no capability loss. Keeps zapabob's self-healing auth_type=="aws_sdk" mechanism so any future SDK-cred provider is covered automatically. Tests: bearer token stripped + general chain preserved (no-regression guard), on both the runtime strip path and the blocklist-membership path. Co-authored-by: zapabob <1920071390@campus.ouj.ac.jp>	2026-05-29 01:48:08 -07:00
zapabob	95b5b72404	fix(security): block AWS SDK creds from subprocess env	2026-05-29 01:48:08 -07:00
Ben Barclay	48083211ef	fix(docker): accept PUID/PGID as aliases for HERMES_UID/HERMES_GID (#25872) (#34401) Salvages #25872 by @konsisumer against current main. NAS users (UGOS, Synology, unRAID) expect the LinuxServer.io PUID/PGID convention and bind-mount /opt/data from a host directory owned by their own UID. Without this alias those vars are silently ignored and the s6-setuidgid drop to UID 10000 leaves the runtime unable to read the volume. HERMES_UID/HERMES_GID still take precedence when both are set. The original PR targeted docker/entrypoint.sh, which is now a 27-line deprecation shim under s6-overlay (the May 2026 rework moved all bootstrap logic to docker/stage2-hook.sh, installed as /etc/cont-init.d/01-hermes-setup). Re-applied the same 2-line alias resolution at the equivalent spot in stage2-hook.sh just before the existing UID/GID remap block. Test was retargeted at docker/stage2-hook.sh; docs hunk adapted to current main's wording ("stage2 hook" + s6-setuidgid, not the obsolete "entrypoint drops via gosu") with the NAS bind-mount example preserved verbatim. Test-first regression verification: reverted just docker/stage2-hook.sh to origin/main and re-ran the new tests. Result: FAILED test_stage2_hook_resolves_puid_pgid_aliases FAILED test_puid_pgid_populate_hermes_uid_gid AssertionError: assert ':' == '1000:10' That's the exact bug shape: PUID=1000 PGID=10 silently ignored, HERMES_UID/HERMES_GID stay empty. With the salvage applied, all 4 tests pass. Closes #25872 Co-authored-by: konsisumer <11262660+konsisumer@users.noreply.github.com>	2026-05-29 16:07:15 +10:00
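The precedence rule in the commit above can be sketched in Python as follows. The real change lives in the shell hook `docker/stage2-hook.sh`; the function `resolve_ids` below is a hypothetical illustration of the same rule, not a port of that script.

```python
from typing import Mapping


def resolve_ids(env: Mapping[str, str]) -> str:
    """Return "uid:gid", preferring HERMES_UID/HERMES_GID over PUID/PGID."""
    # An empty or unset HERMES_* value falls through to the PUID/PGID alias.
    uid = env.get("HERMES_UID") or env.get("PUID", "")
    gid = env.get("HERMES_GID") or env.get("PGID", "")
    return f"{uid}:{gid}"


alias_only = resolve_ids({"PUID": "1000", "PGID": "10"})
both_set = resolve_ids({"PUID": "1000", "PGID": "10", "HERMES_UID": "2000", "HERMES_GID": "20"})
nothing_set = resolve_ids({})
```

With no alias handling, the `PUID=1000 PGID=10` case would yield `':'`, which is the failure shape quoted in the commit message.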
wysie	a0fc3df878	fix(browser): rewrite Camofox Docker loopback URLs (#25541) Co-authored-by: Wysie <wysie@users.noreply.github.com>	2026-05-29 15:43:55 +10:00
Teknium	00b8204cf4	fix: restore side-effect imports in test files (test_kanban_tools, test_command_guards) The previous ruff prune commit removed two categories of test-file imports whose value is the side effect of importing them, not their binding: tests/tools/test_kanban_tools.py — 5 sites `import tools.kanban_tools # ensure registered` The import itself runs tools/kanban_tools.py's @registry.register calls; without it, the kanban tool registry is empty and test_kanban_tools_visible_with_env_var asserts {} != {7 kanban tools}. tests/tools/test_command_guards.py — 1 site `import tools.tirith_security # Ensure the module is importable so we can patch it` The comment names the requirement: keep the bare module reference so subsequent mock.patch("tools.tirith_security.<fn>") calls find a registered submodule. CI failure: test (5) shard, tests/tools/test_kanban_tools.py:58 AssertionError: expected {kanban_*}, got set()	2026-05-28 22:26:25 -07:00
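Why an apparently unused import can be load-bearing is easier to see with a toy registry. This is a sketch under stated assumptions: `REGISTRY`, `register`, and the tool name `kanban_list` are hypothetical stand-ins, and the function `simulate_import_of_tool_module` plays the role of executing a module body such as `tools/kanban_tools.py` at import time.

```python
from typing import Callable, Dict

REGISTRY: Dict[str, Callable[[], list]] = {}


def register(name: str) -> Callable[[Callable[[], list]], Callable[[], list]]:
    """Decorator that records a tool in the registry when the definition runs."""
    def decorator(func: Callable[[], list]) -> Callable[[], list]:
        REGISTRY[name] = func
        return func
    return decorator


def simulate_import_of_tool_module() -> None:
    """Stand-in for importing a tool module: running its body registers tools."""
    @register("kanban_list")  # hypothetical tool name
    def kanban_list() -> list:
        return []


before = set(REGISTRY)            # nothing registered until the "import" runs
simulate_import_of_tool_module()
after = set(REGISTRY)             # registration happened as a side effect
```

A linter that only looks at name bindings sees no use of the imported module, so an autofix for unused imports removes exactly the statement that populates the registry.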
kshitijk4poor	66827f8947	chore: prune unused imports and duplicate import redefinitions Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves	2026-05-28 22:26:25 -07:00
Teknium	a4d8f0f62a	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340 ) * fix(codex): surface error code in Responses 'failed' status errors When a Codex Responses turn ends with status=failed, the response carries the failure details under `response.error` as `{code, message, param, ...}`. The previous extractor pulled only `message`, so users seeing a rate-limit failure got a bare "Slow down" string indistinguishable from a generic stream truncation; an internal_error with empty message degraded to a dict dump ("{'code': 'internal_error', 'message': ''}"). Extract a `_format_responses_error()` helper that: - prefixes `code` when both code and message are present (e.g. 'rate_limit_exceeded: Slow down') - falls back to the bare `code` when message is empty - accepts both dict and attribute-style payloads (SDK and JSON-RPC paths) - preserves the prior status-only fallback when no error payload exists Apply the same helper at the sibling site in `codex_app_server_session.run_turn()` so codex-CLI subprocess turn failures get the same treatment. Tests: - 8 new unit tests for `_format_responses_error` covering both shapes, empty/missing fields, non-string fields, and the status-only fallback. - 2 regression tests on `_normalize_codex_response` for failed status with and without a code, asserting the exact RuntimeError message. - All 3603 tests in tests/agent/ pass. Adapted from anomalyco/opencode#28757. * feat(prompt): universal task-completion guidance + local Python toolchain probe Two cross-model failure modes get a single-line answer in the cached system prompt. Both gated by config (default on), both add zero overhead when not needed, both verified via real AIAgent prompt builds. ## What changed `TASK_COMPLETION_GUIDANCE` — short prompt block applied to ALL models. 
Targets two failure modes observed on a real Sarasota real-estate build task: (1) Opus stopped after writing an 85-byte stub and gave a prose response with finish_reason=stop on call #3 of 90; (2) DeepSeek pushed through a PEP-668 wall, then returned fabricated listings instead of admitting the blocker. Both behaviors are model-family-agnostic, so the guidance lives outside the existing tool_use_enforcement gate (~192 tokens, paid once per session via prefix cache).

`tools/env_probe.py`: local Python toolchain probe. Detects python3/pip/uv/PEP-668 state and emits ONE short line in the system prompt when something is non-default. Emits NOTHING when the env is clean (zero token cost for normal users). Skipped entirely for remote terminal backends (docker/modal/ssh), which have their own probe.

Example output on a broken environment (the actual case):

    Python toolchain: python3=3.11.15 (no pip module), python=missing (use python3), pip→python3.12 (mismatch), PEP 668=yes (use venv or uv).

## Config

Both flags live under `agent.` in config.yaml, default True:

    agent:
      task_completion_guidance: true   # universal "finish the job" block
      environment_probe: true          # local Python toolchain hints

Neither addition required a `_config_version` bump: deep-merge fills defaults in for existing user configs.

## Validation

| Test surface | Result |
|---|---|
| tests/tools/test_env_probe.py | 10/10 pass (probe unit) |
| tests/run_agent/test_run_agent.py (new classes) | 8/8 pass (integration) |
| TestToolUseEnforcementConfig | 17/17 pass (no regression) |
| TestBuildSystemPrompt | 9/9 pass (no regression) |
| TestInvalidateSystemPrompt | 2/2 pass (no regression) |
| tests/agent/test_prompt_builder.py | 124/124 pass (no regression) |
| tests/hermes_cli/ | 5662/5662 pass (config defaults) |
| E2E AIAgent build (broken env) | Both blocks present, 2,178 chars |
| E2E AIAgent build (clean env) | 771-char net overhead, env probe silent |

	2026-05-28 22:26:09 -07:00
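The formatting rules listed for the `_format_responses_error()` helper in this commit can be sketched as below. This is an illustration only: the function name `format_responses_error_sketch`, the `_field` helper, and the exact wording of the status-only fallback string are assumptions, since the commit message does not quote the real implementation.

```python
from types import SimpleNamespace
from typing import Any


def _field(payload: Any, name: str) -> str:
    """Read a field from a dict or attribute-style payload; non-strings become ''."""
    if isinstance(payload, dict):
        value = payload.get(name)
    else:
        value = getattr(payload, name, None)
    return value.strip() if isinstance(value, str) else ""


def format_responses_error_sketch(error: Any, status: str = "failed") -> str:
    """Build a readable message from a Responses-style error payload."""
    code = _field(error, "code") if error is not None else ""
    message = _field(error, "message") if error is not None else ""
    if code and message:
        return f"{code}: {message}"   # prefix the code when both are present
    if code:
        return code                   # bare code when the message is empty
    if message:
        return message
    # Hypothetical status-only fallback wording.
    return f"response ended with status={status}"


from_dict = format_responses_error_sketch({"code": "rate_limit_exceeded", "message": "Slow down"})
from_attrs = format_responses_error_sketch(SimpleNamespace(code="internal_error", message=""))
no_payload = format_responses_error_sketch(None)
```

Accepting both mappings and attribute-style objects lets one helper serve the SDK path and the JSON-RPC path mentioned in the commit.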
Ben Barclay	40fa0c1d19	fix(docker): skip credential/skills/cache mounts when source is invalid (#24490) (#34331) Salvages #24490 by @liuhao1024 against current main. The Docker daemon will silently auto-create a directory at the host path of any `-v <host>:<container>` bind mount when the host path doesn't exist. In Docker-in-Docker setups (where the outer host's real credential file isn't visible inside the agent's parent container), this leaves a directory at the credential mount source, and the inner `docker run` then refuses to mount a directory over a file destination with exit 125. Add defensive shape guards to all three mount loops in DockerEnvironment.__init__: * credentials (expected: file): skip + warn on directory or missing * skills (expected: dir): skip + warn when not a directory * cache (expected: dir): skip + warn when not a directory Failed mounts surface as WARN logs rather than crashing the container start. Existing well-formed sources mount unchanged. The original PR's branch was on a pre-container-reuse-rework base (May 12) and conflicted with the post-May-28 driver work (label tagging, container reuse, orphan reaper). Reconstructed the same intent on current main; the three guard blocks slot cleanly into `tools/environments/docker.py` around the existing mount loops. Three new tests pinned in `tests/tools/test_docker_environment.py`: directory-source skip, missing-source skip, valid-file mounts. Test-first regression verification: reverted just the production code to `origin/main` and confirmed the new tests fail with `'deleted_token.json' is contained here: /root/.hermes/...`; the fixed code makes them pass. Full file passes (54/54). Closes #24490 Co-authored-by: liuhao1024 <11816344+liuhao1024@users.noreply.github.com>	2026-05-29 14:09:04 +10:00
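The shape guards described above amount to checking the host path before adding a bind mount. A minimal sketch, assuming a hypothetical helper `usable_mount_source` (the actual guard code in `tools/environments/docker.py` is not shown in the commit message):

```python
import logging
import os
import tempfile

logger = logging.getLogger("mount_guard_sketch")


def usable_mount_source(path: str, expected: str) -> bool:
    """Return True only if the host path exists with the expected shape."""
    if expected == "file":
        ok = os.path.isfile(path)
    elif expected == "dir":
        ok = os.path.isdir(path)
    else:
        raise ValueError(f"unknown mount shape: {expected!r}")
    if not ok:
        # Skip and warn instead of letting `docker run` fail with exit 125.
        logger.warning("skipping mount %s: expected a %s", path, expected)
    return ok


with tempfile.TemporaryDirectory() as tmp:
    token = os.path.join(tmp, "token.json")
    open(token, "w").close()
    results = {
        "valid_file": usable_mount_source(token, "file"),
        "directory_as_file": usable_mount_source(tmp, "file"),
        "missing_file": usable_mount_source(os.path.join(tmp, "absent.json"), "file"),
        "valid_dir": usable_mount_source(tmp, "dir"),
    }
```

A caller would build the `-v` argument only for sources where the helper returns True, so a daemon-created directory at a credential path is skipped rather than mounted over a file destination.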
teknium1	bfecfabd0f	Revert "feat(skills): integrate NVIDIA/skills as a trusted skills hub tap" This reverts commit `9992e32db3`.	2026-05-28 20:39:39 -07:00
liuhao1024	44df52005a	fix(tools): guard Path.home() against PermissionError in has_direct_modal_credentials (#33528) When HOME=/root (Docker containers) and the process runs as unprivileged user (hermes, uid 10000), Path.home() / '.modal.toml' raises PermissionError because /root/ is inaccessible. This crashes the dashboard /api/skills endpoint. Catch PermissionError/OSError and treat as 'no config file'. Env vars still take priority (tested). Fixes #33525	2026-05-29 13:35:39 +10:00
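A hedged sketch of the guard described in the commit above. The function name `has_modal_credentials_sketch` and the injectable `home` parameter are hypothetical, and the environment variable names `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET` are an assumption about which variables the real check reads.

```python
from pathlib import Path
from typing import Callable, Mapping


def has_modal_credentials_sketch(
    env: Mapping[str, str],
    home: Callable[[], Path] = Path.home,
) -> bool:
    """Environment variables win; an unreadable home counts as 'no config file'."""
    # Assumed variable names; they are checked before touching the filesystem.
    if env.get("MODAL_TOKEN_ID") and env.get("MODAL_TOKEN_SECRET"):
        return True
    try:
        return (home() / ".modal.toml").exists()
    except OSError:
        # PermissionError is a subclass of OSError, so one clause covers both.
        return False


def inaccessible_home() -> Path:
    """Simulates HOME=/root for an unprivileged user."""
    raise PermissionError("/root is not readable")


denied = has_modal_credentials_sketch({}, home=inaccessible_home)
from_env = has_modal_credentials_sketch(
    {"MODAL_TOKEN_ID": "id", "MODAL_TOKEN_SECRET": "secret"},
    home=inaccessible_home,
)
```

Passing the home lookup as a parameter is only a testing convenience here; it lets the permission failure be reproduced without a real container.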
Teknium	9992e32db3	feat(skills): integrate NVIDIA/skills as a trusted skills hub tap NVIDIA's verified skills catalog (https://github.com/NVIDIA/skills) ships NVIDIA-signed skills for CUDA-X, AIQ, cuOpt, cuPyNumeric, DeepStream, NeMo, NemoClaw and the Skill Card Generator — each bundle carrying a detached `skill.oms.sig` signature, a governance `skill-card.md`, and `evals/`. The sync pipeline drops any skill missing those artifacts before publishing. Changes: - tools/skills_hub.py: add NVIDIA/skills to GitHubSource.DEFAULT_TAPS so it lights up in `hermes skills browse`, `hermes skills search <q>`, the twice-daily skills-index build, and the docs-site Skills Hub page (https://hermes-agent.nousresearch.com/docs/skills) automatically. - tools/skills_guard.py: add NVIDIA/skills to TRUSTED_REPOS so installs resolve to trust_level="trusted" (looser install policy than community). - website/scripts/extract-skills.py: map the `github` source id to a friendly "NVIDIA" pill label for the docs hub page. - website/src/pages/skills/index.tsx: register the NVIDIA pill (green #76b900) and slot it into SOURCE_ORDER after HuggingFace. - website/docs/user-guide/features/skills.md (+ zh-Hans i18n): document the new default tap and the expanded trusted-repos list. - tests/tools/test_skills_guard.py: assert NVIDIA/skills resolves to "trusted" (including the skills-sh-wrapped form). - tests/tools/test_skills_hub.py: invariant — every TRUSTED_REPOS entry must be reachable via GitHubSource.DEFAULT_TAPS (prevents future trusted repos from being declared but never browseable). Validation: - Live GitHub fetch: `src.fetch('NVIDIA/skills/skills/aiq-deploy')` pulled 17 files including SKILL.md (13 KB), skill-card.md, skill.oms.sig, and the full references/ + evals/ tree. trust_level="trusted". - Live inspect resolved name, description, and trust correctly. - All 193 existing skills_guard + skills_hub tests still pass.	2026-05-28 20:35:13 -07:00
Dusk	c834624f7d	fix(voice): honor PIPEWIRE_REMOTE in PortAudio fallback checks (#33473)	2026-05-29 13:30:17 +10:00
Ben	2f0f03c40d	fix(docker): cleanup_vm() default honors persist mode (don't kill container on session close) Commit 4 made cleanup_vm() default to force_remove=True, which was wrong: cleanup_vm() is called from AIAgent.close() (TUI session close at tui_gateway/server.py:2991, gateway session teardown at gateway/run.py:3569) and from per-turn cleanup (agent/chat_completion_helpers.py:1517). All three are session-lifecycle events that should honor persist mode, not explicit user-initiated teardown. Ben reported the symptom: container shared between multiple TUI sessions (good) but killed as soon as any session closed (bad). With force_remove=True as the default, every `session.close` JSON-RPC tore down the container. The fix is to flip cleanup_vm()'s force_remove default back to False. The kwarg still exists for future explicit-teardown paths (`/reset`-style flows, "destroy my sandbox" commands) that haven't been wired up yet. Two new unit tests pin the behavior: * `test_cleanup_vm_default_honors_persist_mode` — asserts `cleanup_vm(task_id)` does neither docker stop nor docker rm on a persist-mode container (the regression Ben caught). * `test_cleanup_vm_force_remove_tears_down_persist_container` — asserts the kwarg still flows through the runtime-signature-inspection plumbing to the backend's cleanup(). E2E verified against real Docker (in addition to all 17 existing checks): ✓ Default cleanup_vm() leaves persist-mode container running ✓ cleanup_vm(force_remove=True) removed the container Refs #20561	2026-05-29 11:49:54 +10:00
Ben	5c2170a7c6	fix(docker): persist-mode cleanup is no-op; add force_remove kwarg (#20561 ) The first iteration of this PR did docker stop on every cleanup in persist mode (only skipping docker rm). Ben caught this as contradicting the documented "ONE long-lived container shared across sessions" semantics: stopping the container on every Hermes /quit kills any background processes inside (npm watchers, pytest watchers, long-running scripts) — exactly the case persist mode is supposed to protect. This commit splits the cleanup paths cleanly: * Persist mode (default) — cleanup() is a NO-OP for the container. Container stays running, processes survive, next Hermes process attaches via the existing label probe in ~ms instead of waiting for docker start. Resource reclamation happens via the orphan reaper at next startup (2 × lifetime_seconds threshold), which covers the SIGKILL / OOM / abandoned-laptop cases. * Opt-out mode (persist_across_processes=False) — unchanged: docker stop + docker rm -f on cleanup as before. * Explicit teardown — new cleanup(force_remove=True) kwarg overrides persist mode and tears the container down unconditionally. cleanup_vm(task_id) now defaults to force_remove=True since it's the user-driven reset path (called from AIAgent.close(), /reset-style flows, and the idle reaper's per-turn cleanup). The idle reaper in _cleanup_inactive_envs calls env.cleanup() directly with no kwargs, so idle persist-mode envs are no-op'd — the container survives the in-process pop and the next tool call re-probes via labels. No state leak: _container_id is still cleared on the in-process handle. 
E2E verified against real Docker: ✓ Container is still running after cleanup() ✓ Background process (sleep loop) survived cleanup() ✓ Filesystem state preserved across cleanup() ✓ In-process container_id cleared (next __init__ will re-probe) ✓ Background process visible from reused env (no docker start happened) ✓ force_remove=True removed the container even in persist mode ✓ cleanup_vm() removed the container (defaults to force_remove=True) Test changes: * Replaces `test_cleanup_with_persist_only_stops_no_rm` with `test_cleanup_with_persist_is_noop_for_container` — asserts neither stop nor rm runs in persist mode, and the in-process handle is cleared so re-probe works. * Adds `test_cleanup_force_remove_stops_and_rms_even_in_persist_mode` — covers the new kwarg. * Updates `test_cleanup_uses_subprocess_run_not_detached_shell` and `test_wait_for_cleanup_after_cleanup_returns_true` to pass `force_remove=True` so they actually exercise the docker code path (default no-op would trivially pass). cleanup_vm() forwards `force_remove` only to backends whose cleanup() accepts the kwarg (currently just DockerEnvironment) via runtime signature inspection — Modal/Daytona/SSH `cleanup()` signatures are unchanged. Refs #20561	2026-05-29 11:49:54 +10:00
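The runtime signature inspection mentioned at the end of this commit can be sketched with `inspect.signature`. The classes `KwargAwareEnv` and `LegacyEnv` and the function `forward_cleanup` are hypothetical stand-ins for the real backends and for `cleanup_vm()`; `force_remove` is passed explicitly here because its default changed between the two commits above.

```python
import inspect
from typing import Any, List, Optional


class KwargAwareEnv:
    """Stand-in for a backend whose cleanup() accepts force_remove."""
    def __init__(self) -> None:
        self.calls: List[Optional[bool]] = []

    def cleanup(self, force_remove: bool = False) -> None:
        self.calls.append(force_remove)


class LegacyEnv:
    """Stand-in for a backend whose cleanup() signature is unchanged."""
    def __init__(self) -> None:
        self.calls: List[Optional[bool]] = []

    def cleanup(self) -> None:
        self.calls.append(None)


def forward_cleanup(env: Any, force_remove: bool) -> None:
    """Pass force_remove only when the backend's cleanup() declares it."""
    params = inspect.signature(env.cleanup).parameters
    if "force_remove" in params:
        env.cleanup(force_remove=force_remove)
    else:
        env.cleanup()


aware, legacy = KwargAwareEnv(), LegacyEnv()
forward_cleanup(aware, force_remove=True)
forward_cleanup(legacy, force_remove=True)
```

Inspecting the signature at call time means a backend opts in simply by adding the keyword to its own `cleanup()`, with no registry of capable backends to maintain.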
Ben	d77d877665	fix(docker): startup orphan reaper for crashed-process containers The cleanup-fix in the previous commit handles the graceful-exit leak: a Hermes process that runs ``atexit`` will now actually wait on the docker stop/rm worker thread, so containers either survive (persist mode) or are fully removed (opt-out mode) by the time the interpreter exits. But ``atexit`` doesn't fire on SIGKILL, OOM-kill, or terminal-window close. Containers from those exits stay parked with no surviving Python process to reuse or remove them, so they accumulate until the operator intervenes with ``docker rm -f``. The cleanup-fix doesn't help this class — there's no live cleanup() to fix. This commit adds the safety net: a startup orphan reaper that runs once per Hermes process and removes long-Exited hermes-labeled containers that the prior commit couldn't reach. Implementation: * New ``reap_orphan_containers()`` in ``tools/environments/docker.py``. Filters: ``label=hermes-agent=1`` + ``status=exited`` + (optional) ``label=hermes-profile=<current>``. Per-container ``docker inspect`` parses ``State.FinishedAt`` (with nanosecond-precision trimming for Python's microsecond-bound ``fromisoformat``); containers older than the threshold get ``docker rm -f``'d. The ``status=exited`` filter is load-bearing — a running container may belong to a sibling Hermes process whose reuse path will pick it up; killing it would crash the sibling mid-command. Single-container failures are logged and the sweep continues to the next candidate. * New ``_maybe_reap_docker_orphans()`` helper in ``tools/terminal_tool.py``. Wired into ``_create_environment()`` for ``env_type == "docker"``. 
Gated by: - ``terminal.docker_orphan_reaper: true`` (default; opt-out for operators running multiple Hermes processes in the same profile who don't trust the conservative defaults) - ``_docker_orphan_reaper_ran`` module flag with double-checked locking — parallel subagents and RL rollouts don't trigger N concurrent docker ps storms - Age threshold = ``2 × TERMINAL_LIFETIME_SECONDS`` with a 60s floor (so ``TERMINAL_LIFETIME_SECONDS=0`` doesn't race the user's own setup) - Profile scoping — a research profile NEVER reaps the default profile's stragglers - Exception swallow — a janitor failure must never block container creation * New config ``terminal.docker_orphan_reaper`` wired through all four config-bridge sites (cli.py, gateway/run.py, hermes_cli/config.py, tests/conftest.py) and pinned by ``test_docker_orphan_reaper_is_bridged_everywhere``. Coverage: * 9 new unit tests in test_docker_environment.py — happy path, recent- container sparing, profile scoping, unparseable-timestamp safety, docker-ps-failure handling, partial-failure continuation, nanosecond timestamp parsing, zero-value FinishedAt rejection. * 6 new integration tests in test_docker_orphan_reaper_integration.py — once-per-process gate, disable-flag respected, lifetime doubling with 60s floor, current-profile filter wiring, exception swallow. * 1 new bridge-invariant regression test. Closes #20561 (combined with the two prior commits on this branch).	2026-05-29 11:49:54 +10:00
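Two details of the reaper lend themselves to a short sketch: trimming Docker's nanosecond `State.FinishedAt` so `datetime.fromisoformat` can parse it, and the `2 × lifetime` threshold with a 60-second floor. The function names below are hypothetical, and treating any `0001-01-01` timestamp as the zero value is an assumption about how the real code rejects it.

```python
import re
from datetime import datetime, timezone
from typing import Optional

_FRACTION = re.compile(r"\.(\d+)")


def parse_finished_at(raw: str) -> Optional[datetime]:
    """Parse a Docker FinishedAt string; return None if unusable."""
    text = (raw or "").strip()
    if not text or text.startswith("0001-01-01"):
        return None  # zero value: the container never recorded a finish time
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    # Trim or pad the fractional seconds to exactly six digits (microseconds).
    text = _FRACTION.sub(lambda m: "." + m.group(1)[:6].ljust(6, "0"), text, count=1)
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def reap_threshold_seconds(lifetime_seconds: int) -> int:
    """Twice the terminal lifetime, never below 60 seconds."""
    return max(60, 2 * lifetime_seconds)


def is_reapable(raw_finished_at: str, now: datetime, lifetime_seconds: int) -> bool:
    """True when the container exited longer ago than the threshold."""
    finished = parse_finished_at(raw_finished_at)
    if finished is None:
        return False  # unparseable timestamps are spared, not reaped
    return (now - finished).total_seconds() > reap_threshold_seconds(lifetime_seconds)


now = datetime(2026, 5, 29, 2, 0, 0, tzinfo=timezone.utc)
parsed = parse_finished_at("2026-05-29T01:49:54.123456789Z")
```

Sparing containers whose timestamp cannot be parsed matches the conservative stance of the commit: a janitor that is unsure should leave the container alone.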


1040 commits