hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Teknium	0dee92df22	feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269 ) Hardens the context window against Brainworm-class promptware attacks (see #496). Three changes: 1. tools/threat_patterns.py — single source of truth for injection/promptware patterns. Replaces the duplicated pattern lists in prompt_builder.py and memory_tool.py. Adds ~15 new Brainworm/C2 patterns (node registration, heartbeat/beacon, pull tasking, anti-forensic disk avoidance, identity override, known framework names). Three scopes — 'all' (narrow, classic injection), 'context' (adds promptware/role-play, broader detection), 'strict' (adds persistence/SSH-backdoor patterns for user-mediated writes). 2. MemoryStore.load_from_disk() now scans entries at snapshot-build time. Poisoned entries are replaced with [BLOCKED: ...] placeholders in the frozen system-prompt snapshot. Live state keeps the original so the user can still inspect + remove via memory(action=read/remove). Scan is deterministic from disk bytes — prefix-cache invariant holds. 3. make_tool_result_message() wraps results from high-risk tools (web_extract, web_search, browser_, mcp_) in <untrusted_tool_result source="...">...</untrusted_tool_result> delimiters with framing prose telling the model the content is data, not instructions. Architectural defense against indirect injection from poisoned web pages, GitHub issues, MCP responses — does NOT regex-scan tool results (pattern arms race + per-iteration latency). Multimodal content lists pass through unwrapped to preserve adapter compatibility. Pattern philosophy: anchor on C2-specific vocabulary or unambiguous attack behavior, NOT on bossy English. Dropped patterns suggested in #496 that would have tripped legitimate content: standalone 'you are obligated to', 'do not respond immediately', 'you must X' without a C2-verb anchor. Validation: - 257/257 targeted tests pass (test_threat_patterns + test_memory_tool + test_tool_dispatch_helpers + test_prompt_builder) - E2E run with real Brainworm payload: blocked from AGENTS.md context-file path, blocked from MEMORY.md snapshot, wrapped in delimiters when arriving via web_extract. Legitimate 'you must follow conventions' phrasing not flagged. Explicitly NOT in this PR (per #496 discussion): - Per-tool-result regex scanning (pattern arms race) - SessionBehaviorMonitor / polling-loop detection (wrong layer) - Outbound network gating (Docker backend already covers this) - security.context_scanning warn\|block knob (current behavior is always block-with-placeholder — there's no warn mode that makes sense) Closes #496 for Phase 1 + the architectural delimiter piece of Phase 2. Phase 3 stays in tracking issue territory.	2026-05-25 14:52:24 -07:00
EloquentBrush0x	1d73d5facc	fix(tts): prevent double [pause] in xAI auto speech tags for multi-paragraph text _apply_xai_auto_speech_tags runs two independent transformations: 1. paragraph breaks (\n\n) → " [pause] " 2. first-sentence boundary → " [pause] " Both fired unconditionally, so multi-paragraph input produced "Hello world. [pause] [pause] Second paragraph." — an unnatural double pause in the TTS audio. Guard the first-sentence substitution with _XAI_SPEECH_TAG_RE.search(clean): if the paragraph pass already inserted a [pause] tag, skip the first-sentence pass. Single-paragraph behavior is unchanged.	2026-05-25 14:30:06 -07:00
峯岸亮	3b9b9a7ad7	fix(skills): guard uninstall lock paths Validate Skills Hub lock-file install paths at both ends of the lifecycle so a poisoned or malformed lock.json entry cannot drive shutil.rmtree to a location outside SKILLS_DIR: - HubLockFile.record_install rejects empty/'.'/absolute/traversal/ Windows-drive paths at write time, and requires the final path component to match the skill name (shape: '<skill>' or '<category>/<skill>'). - install_from_quarantine resolves its destination through the same validator, catching symlink/junction redirects inside skills/. - uninstall_skill resolves the lock entry through the new validator before rmtree. Refuses anything that resolves to SKILLS_DIR itself (empty/dot paths) or to a target outside SKILLS_DIR (absolute paths, traversal, symlinked dirs in skills/ pointing outward). - 14 focused regression tests covering each rejection class plus a symlink-redirect case. E2E verified: hand-crafted poisoned lock.json entries (absolute path, empty install_path, traversal) all refuse and leave the targeted victim untouched; legitimate uninstall still succeeds. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-25 06:13:36 -07:00
Teknium	8191f663dd	feat(mcp-oauth): accept 'skip' at paste prompt to bypass auth without disabling server (#32069 ) When an MCP server triggers OAuth at startup, the user can now type 'skip' (or 'cancel', 's', 'n', 'no', 'q', 'quit') at the paste prompt + Enter to exit the flow cleanly and continue agent startup without that server. Previously the only ways to bypass an unwanted OAuth prompt were: - Wait the full 5-minute paste timeout - Ctrl+C (also kills the whole reload, may leave half-state) - Edit config.yaml to set 'enabled: false' on the server Skip writes a sentinel to result['error'] which _wait_for_callback maps to OAuthNonInteractiveError('user_skipped'). mcp_tool already classifies that as an auth error in _is_auth_error() and the reconnect loop logs it as 'not retrying automatically' — server stays disconnected for the session, other MCP servers continue normally, no infinite retry burn. The skip message tells users how to re-auth later ('hermes mcp login') or disable persistently ('enabled: false'), so they don't have to remember. 14 new tests covering: case-insensitive skip parsing, all 7 skip tokens, skip not stomping an HTTP-listener win, skip routed to skip path rather than URL-parse path, sentinel mapped to OAuthNonInteractiveError, prompt mentions the skip option.	2026-05-25 05:37:30 -07:00
Teknium	1c3c364287	feat(cli): show live background terminal-process count in status bar (#32061 ) The CLI status bar tracked /background agent tasks (▶ N) but not shell processes spawned via terminal(background=true). Both kinds of work can run concurrently and a user has no in-bar signal for shell processes. Add an independent indicator (⚙ N) sourced from tools.process_registry.process_registry._running. The two indicators render side-by-side when both are active (▶ 1 │ ⚙ 2), hidden when their count is zero. Renders at all four status-bar tiers (text fallback + prompt_toolkit fragments, narrow + wide widths). The narrow <52 tier still drops both for space — unchanged. New ProcessRegistry.count_running() returns len(_running) without acquiring _lock; CPython dict len is atomic and we're polling on every status-bar tick, so lock-free is the right tradeoff.	2026-05-25 05:35:02 -07:00
Teknium	0ff7c09e2f	feat(mcp-oauth): stdin paste-back fallback for headless OAuth flow (#32053 ) When the user runs OAuth on a remote/SSH machine without a port forward, the OAuth provider redirects to http://127.0.0.1:<port>/callback which only the listener on the remote machine can receive — the user's browser on another box just shows a connection error. _wait_for_callback() now races the HTTP listener against a stdin reader on interactive TTYs. The user can copy the URL from the browser's address bar after authorization (which contains code=...&state=...) and paste it back at the prompt. Whichever fills the result dict first wins; the HTTP listener remains the primary path for local sessions and SSH tunnels. Accepts any of: - Full local redirect URL: http://127.0.0.1:N/callback?code=...&state=... - Provider URL after redirect: https://mcp.linear.app/callback?code=...&state=... - Just the query string: ?code=...&state=... or code=...&state=... The paste thread only spawns when _is_interactive() is true, preserving the existing 'no input() in headless runs' invariant — verified by TestWaitForCallbackPasteIntegration.test_paste_prompt_NOT_shown_when_noninteractive. The SSH-session hint in _redirect_handler is updated to surface the paste option as the primary remedy, with ssh -L tunneling as the alternative.	2026-05-25 05:20:05 -07:00
aaronlab	5f20322d23	fix(tts): reject '..' traversal in output_path text_to_speech_tool accepts an explicit output_path. Without a traversal guard, a path containing '..' components (whether prompt-injection- controlled, from a confused skill, or just a buggy caller) could escape its declared base and write the audio to a system location — e.g. `output_path='audio/../../etc/cron.d/x'` lands the file outside the intended audio cache. Reject '..' components in the user-supplied path. Explicit absolute paths are unchanged (the agent legitimately writes audio wherever the user/caller asks); only traversal-style escapes are blocked. The terminal tool can still write anywhere with approval — this just keeps the unattended TTS surface from materializing files via traversal. Regression tests cover: '..' in the middle (audio/../../etc/...), bare '..' prefix, and the negative cases (absolute paths + relative paths without '..' both pass through unchanged). Salvaged from PR #6693 by @aaronlab. The original PR confined output to DEFAULT_OUTPUT_DIR-or-cwd, which broke 9 existing tests that legitimately write to tmp_path locations. The traversal-only check covers the actual threat (path-escape via '..' from prompt injection) without restricting where users can choose to write their audio. The remaining pieces of #6693 (skill_commands rglob symlink rejection, delegate_tool batch prefix display) are dropped: - skill_commands rglob: breaks the documented design supporting ~/.hermes/skills/<name> as a symlink to a checked-out skill elsewhere (see comment at agent/skill_commands.py:73-75) - delegate_tool batch prefix: pure UX, doesn't belong in a security PR Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 05:15:55 -07:00
Peter	95848b1cbc	fix(transcription): reject symlinked audio inputs (#10082 ) * fix(transcription): reject symlinked audio inputs Validation runs before provider selection, so rejecting symbolic-link paths there prevents supported-extension links from being treated as normal audio files. Use os.path.islink to avoid perturbing the existing Path.stat error path and to reject links before resolving targets. Constraint: Keep validation platform-safe and avoid requiring symlink support where unavailable. Rejected: Use Path.is_symlink \| it consumes pathlib stat calls and broke the existing stat error regression. Confidence: high Scope-risk: narrow Directive: Keep path hardening in _validate_audio_file before provider dispatch. Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases -q (5 passed) Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases tests/tools/test_transcription_tools.py::TestTranscribeAudioDispatch::test_invalid_file_short_circuits -q (6 passed) Tested: source venv/bin/activate && python -m compileall tools/transcription_tools.py tests/tools/test_transcription_tools.py Tested: git diff --check Not-tested: Full tests/tools/test_transcription_tools.py under .[dev] only; existing faster_whisper optional dependency tests fail with ModuleNotFoundError. * Keep transcription tests independent of optional whisper install The transcription suite mocks faster-whisper directly, so a minimal test stub keeps the branch verifiable in environments where the optional package is not installed. This preserves the existing mock-based coverage without adding a dependency. Constraint: faster-whisper is an optional local STT dependency and is absent from the current validation environment Rejected: Install faster-whisper just for branch validation \| would add heavyweight environment coupling outside the patch scope Confidence: high Scope-risk: narrow Directive: Keep this as a test-only stub unless production import semantics change Tested: pytest tests/tools/test_transcription_tools.py -q --------- Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:45 -07:00
Peter	ee59ef1946	fix: reject read_file symlinks to blocking devices (#10133 ) * fix: reject read_file symlinks to blocking devices The read_file guard already refused direct device paths such as /dev/zero, but a workspace symlink resolving to one of those devices could still reach the shell-backed read path and hang on wc/head/sed. Keep the literal alias check and add a resolved-path pass so local symlinks to blocked device/fd endpoints are rejected before I/O. Constraint: Preserve literal /dev/stdin handling before terminal-specific realpath resolution Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py tests/tools/test_file_tools.py -q; python -m compileall tools/file_tools.py tests/tools/test_file_read_guards.py; git diff --check Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> * Keep file guard tests off sensitive macOS temp paths The branch now inherits a sensitive-path write guard from upstream main. On macOS, tempfile.mkdtemp() resolves under /private/var/folders, so the new write-path guard fired before the file read dedup assertions could exercise their intended behavior. The tests now create their scratch files inside the worktree temp checkout, outside those system-sensitive prefixes, without changing production behavior. Constraint: Rebased branch must pass the expanded file read guard suite on macOS. Rejected: Loosen the production sensitive-path prefix list \| broader behavior change unrelated to this PR. Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py -q --------- Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:38 -07:00
Dakota Secula-Rosell	b7b8bec800	fix(security): block /proc//environ, cmdline, maps from file read (#4609 ) The read_file tool and terminal cat can access /proc/self/environ to recover all process env vars including secrets stripped by the subprocess blocklist. Output redaction partially mitigates (catches known-format tokens) but misses custom/proprietary key formats, especially when values are printed without their key names. Add /proc//environ, /proc//cmdline, and /proc//maps to the blocked device paths in _is_blocked_device(): - /proc//environ: leaks full process env (API keys, tokens) - /proc//cmdline: leaks command-line args (may contain passwords) - /proc/*/maps: leaks memory layout (ASLR bypass for exploitation) Legitimate /proc reads (cpuinfo, meminfo, uptime, version) remain accessible — the check only blocks per-pid pseudo-files with known sensitive suffixes. Complements PR #4432 (PID namespace isolation for child processes) which prevents children from reading the parent's /proc, but does not prevent the parent process itself from being read via file tools. Partially addresses #4427 Changes: tools/file_tools.py \| +6 tests/tools/test_file_read_guards.py \| +18 -1 Co-authored-by: dsr-restyn <dsr-restyn@users.noreply.github.com>	2026-05-25 05:07:31 -07:00
Rodrigo	4cb3eb03c7	fix(approval): harden YOLO bypass, LLM parsing, auto-approve audit, pipe pattern (#23835 ) * fix(approval): harden YOLO bypass, LLM parsing, auto-approve audit, pipe pattern - BUG-009 (CRITICAL): freeze HERMES_YOLO_MODE at module import via _YOLO_MODE_FROZEN; prevents skills/prompt-injection from calling os.environ["HERMES_YOLO_MODE"]="true" at runtime to bypass all checks - BUG-002 (HIGH): replace substring "APPROVE" in answer with exact answer == "APPROVE" in _smart_approve; prompt already requests exactly one word, substring match was exploitable via verbose LLM responses - BUG-001 (MEDIUM): add logger.warning for every dangerous command that auto-approves in non-interactive non-gateway context; makes silent approvals visible in audit logs without breaking script behavior - BUG-008 (LOW): expand curl/wget pipe pattern to cover \| /bin/bash and \| bash -c variants, not just \| sh / \| bash Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(approval): add missing is_truthy_value import + fix yolo test patches _YOLO_MODE_FROZEN uses is_truthy_value() from utils — import was missing. Tests that set HERMES_YOLO_MODE via monkeypatch.setenv() no longer work because the value is frozen at import time; update them to patch the module-level flag directly via monkeypatch.setattr(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 03:35:33 -07:00
Dennis Vorobyov	3ab7e2aa91	harden(env_passthrough): apply GHSA-rhgp-j443-p4rf filter to config.yaml path (#27794 ) register_env_passthrough() (the skill-declared path) filters out names in _HERMES_PROVIDER_ENV_BLOCKLIST and logs a warning citing GHSA-rhgp-j443-p4rf. _load_config_passthrough() (the config.yaml path) did not. Both feed the same is_env_passthrough() allowlist that local.py and code_execution_tool.py consult before stripping a variable from the child env. A skill that wanted to leak ANTHROPIC_API_KEY or OPENAI_API_KEY into execute_code could no longer self-register the name (the GHSA fix blocks it), but the same outcome was still reachable by asking the operator to add the name to terminal.env_passthrough in config.yaml, or by any in-process actor with write access to ~/.hermes/config.yaml. Apply the same _is_hermes_provider_credential filter inside _load_config_passthrough, mirroring the skill-path warning so operators see the same explanation. Non-Hermes API keys (TENOR_API_KEY, NOTION_TOKEN, etc.) are unaffected since they are not in the blocklist.	2026-05-25 03:35:23 -07:00
waefrebeorn	5faea3f618	fix(file_tools): reject '..' traversal in V4A patch headers V4A patch '* Update File:', '* Add File:', '* Delete File:' headers come from patch CONTENT, not the explicit `path=` argument. That makes them attacker-influenceable through skill content, web extract output, prompt injection, and other surfaces the agent processes. Headers like '* Update File: ../../../etc/shadow' would resolve relative to the agent's cwd; in deployment configurations where that cwd is deep enough to land outside Hermes' protected paths, the write could land somewhere the agent operator did not intend. Reject any V4A header containing a '..' path component before applying the patch. The explicit `path=` argument on patch_tool is UNCHANGED — the agent legitimately uses '..' there (e.g. `patch path='../other_module/x.py'` from a worktree dir is normal cross-module editing). Regression tests: V4A Update header with traversal rejected, V4A Add header with traversal rejected, patch_v4a never invoked when rejection fires. Salvaged from PR #29395 by @waefrebeorn. The original PR added has_traversal_component as a blanket reject on read_file_tool, write_file_tool, patch_tool's explicit path, and search_tool — that would break legitimate agent operation where '..' is normal. Also dropped the over-eager skills_guard pattern additions (pickle.loads/marshal.loads/ctypes.CDLL/importlib at high/critical severity would false-positive on legit data-science and FFI skills). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 01:55:59 -07:00
AdamPlatin123	00bd24e27c	fix(security): expand memory content scanning patterns to parity with skills guard (#9151 ) Expand _MEMORY_THREAT_PATTERNS from 13 to 24 regex patterns and align _INVISIBLE_CHARS with skills_guard.py (10 → 17 characters). Key changes: - Add multi-word bypass prevention (?:\w+\s+)* to injection patterns - Add missing injection patterns: role_pretend, leak_system_prompt, remove_filters, fake_update, translate_execute, html_comment_injection, hidden_div - Add exfiltration patterns: send_to_url, context_exfil - Add persistence patterns: agent_config_mod, hermes_config_mod (both require modification-verb prefix to avoid false positives on mere mentions of config filenames) - Add hardcoded secret detection pattern - Add role_hijack precision fix: require article after "now" to avoid blocking "you are now ready/connected/set up" etc. - Expand invisible unicode set with directional isolates (U+2066-2069) and invisible math operators (U+2062-2064) Test coverage expanded from ~8 to ~30 scan tests including dedicated false-positive regression tests for all precision-sensitive patterns. Known limitations (deferred to follow-up PRs): - prompt_builder.py and cronjob_tools.py still use older pattern sets - No semantic/LLM-based scanning (regex-only approach) - No cross-entry or cross-store analysis	2026-05-25 01:51:53 -07:00
Edward-x	7ebebfbb8d	Harden Skills Guard multi-word prompt patterns (#26852 ) Co-authored-by: openhands <openhands@all-hands.dev>	2026-05-25 01:51:27 -07:00
Jorge Fuenmayor	93660643a6	fix: harden skill trust source matching (#31229 ) Co-authored-by: gaia <gaia@gaia.local>	2026-05-25 01:51:15 -07:00
teknium1	d3ffbc6409	feat(stt): add stt.providers.<name> command-provider registry Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds, SenseVoice, curl pipelines — become an STT backend with zero Python. Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved untouched via the built-in local_command path) and the register_transcription_provider() Python plugin hook also shipped in this PR. Resolution order (mirrors TTS exactly): 1. Built-in (local, local_command, groq, openai, mistral, xai) → native handler. Always wins. 2. stt.providers.<name>: type: command → command-provider runner. 3. Plugin-registered TranscriptionProvider → plugin dispatch. 4. No match → 'No STT provider available'. Files ----- - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained; added _resolve_command_stt_provider_config, _transcribe_command_stt, and local helpers for template rendering, shell-quote context, and process-tree termination. Helpers are documented as mirrors of their tts_tool.py counterparts (kept local to avoid cross-tool private import). Wire-in is one insertion point in transcribe_audio() after the xai elif and before the plugin dispatcher. Plugin dispatcher additionally defensively short-circuits when a same-name command config exists (command-wins-over-plugin invariant). - tests/tools/test_transcription_command_providers.py: 50 new tests covering resolution (builtin precedence, type/command gating, case-insensitive lookup, legacy stt.<name> back-compat), helpers (timeout fallback, format validation, iter, has-any), template rendering (shell-quote contexts, doubled-brace preservation), end-to-end via _transcribe_command_stt (output_path read, stdout fallback, timeout, nonzero exit envelope, model override, language precedence), and dispatcher integration via the real transcribe_audio() including command-wins-over-plugin and builtin-shadow-rejection. - tests/plugins/transcription/check_parity_vs_main.py: extended from 10 to 13 scenarios. New cases: command-provider-installed, command-vs-plugin-same-name (verifies command wins precedence), explicit-openai-with-command-shadow (verifies built-in wins). Adds command_provider dispatch_kind detection via transcript prefix (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished from plugin scenarios even when sharing a provider name. - website/docs/user-guide/features/tts.md: new 'STT custom command providers' section symmetric to the TTS section — example config, placeholder grammar table (input_path / output_path / output_dir / format / language / model), transcript-read-back semantics (file first, then stdout fallback), optional keys table, behavior notes, security note. Updated 'Python plugin providers (STT)' to include the new 'When to pick which (STT)' decision table and updated resolution-order section (now 4 layers instead of 3). Verification ------------ 189/189 STT targeted tests + 50/50 new command-provider tests pass. Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/ 8623/8623 — zero regressions across 14,199 tests. Parity harness: 13 scenarios, 9 OK + 4 expected diffs (no_provider_error → plugin, plugin_unavailable, command_provider × 2). E2E live-verified in an isolated HERMES_HOME with a real .wav file: command: → dispatched to stt.providers.my-fake-cli plugin: → dispatched to registered TranscriptionProvider command-wins-over-plugin: → command provider beats same-name plugin builtin-wins-over-command: → built-in OpenAI handler fires; stt.providers.openai: type: command does NOT hijack it.	2026-05-25 01:41:19 -07:00
kshitijk4poor	2cd952e110	feat(stt): add register_transcription_provider() plugin hook Add an opt-in Python plugin surface for speech-to-text backends, mirroring the TTS hook pattern. New backends (OpenRouter, SenseAudio, Gemini-STT, custom proprietary engines) can be implemented as plugins without modifying tools/transcription_tools.py. Built-ins always win -------------------- The 6 built-in STT providers (local/faster-whisper, local_command, groq, openai, mistral, xai) keep their native handlers. Plugins attempting to register under a built-in name are rejected at registration time with a warning and re-checked defensively at dispatch. Resolution order ---------------- 1. stt.provider matches a built-in → built-in dispatch (unchanged) 2. stt.provider matches a registered plugin → a. if plugin.is_available() returns False → unavailability envelope identifying the plugin (not the generic "No STT provider" message — the user explicitly opted into this plugin) b. otherwise plugin.transcribe() with model + language forwarded from stt.<provider>.{model,language} config 3. No match → legacy "No STT provider available" error (unchanged) Per-provider config namespace ----------------------------- Plugins read their config from stt.<provider> in config.yaml, mirroring how built-ins read stt.openai.model / stt.mistral.model. The dispatcher forwards `model` and `language` from this section. Caller's explicit `model=` argument overrides the config-set model. Files ----- - agent/transcription_provider.py: TranscriptionProvider ABC - agent/transcription_registry.py: register/get/list providers, built-in shadow guard, _reset_for_tests - hermes_cli/plugins.py: register_transcription_provider() on PluginContext - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset, _dispatch_to_plugin_provider() with availability gate, wire-in after xai branch and before "No STT provider" error - tests/agent/test_transcription_registry.py: 27 tests - tests/hermes_cli/test_plugins_transcription_registration.py: 3 tests - tests/tools/test_transcription_plugin_dispatch.py: 28 tests (covering built-in short-circuit, plugin dispatch, exception envelope, non-dict guard, availability gate, language forwarding) - tests/plugins/transcription/check_parity_vs_main.py: 10-scenario subprocess-pinned parity harness vs origin/main - website/docs/user-guide/features/{tts,plugins}.md: docs Behavior parity --------------- 10 scenarios, 8 OK + 2 expected DIFFs: no_provider_error → plugin (plugin-installed scenario) no_provider_error → plugin_unavailable (plugin-installed-unavailable scenario; PR returns cleaner envelope) Zero behavior change for users not opting into a plugin. Issue follow-up to #30398.	2026-05-25 01:41:19 -07:00
Ben Barclay	7e165e843d	Merge pull request #31760 from NousResearch/hermes/hermes-bf5898da feat(docker)!: s6-overlay container supervision (salvage of #30136)	2026-05-25 12:57:51 +10:00
kshitijk4poor	af973e4071	refactor(gateway): migrate Mattermost adapter to bundled plugin Second migration of an existing built-in platform adapter after Discord (PR #30591) — follows the same shape established by IRC / Teams / LINE / Google Chat / SimpleX and the playbook in `references/platform-plugin-migration.md`. Advances the umbrella refactor in #3823. Matches Discord's parity bar — adapter under `plugins/platforms/mattermost/` with the standard `__init__.py` / `adapter.py` / `plugin.yaml` shell, `register(ctx)` entry point, no back-compat shim at the old import path, and full parity for all five hooks Discord uses plus the `apply_yaml_config_fn` hook (mattermost is the second consumer of #25443 after Discord): * `standalone_sender_fn` — out-of-process cron delivery via Mattermost REST API. Picks up the thread_id + media_files capabilities the legacy `_send_mattermost` lacked (parity with Discord's `_standalone_send`). * `setup_fn` — interactive `hermes setup gateway` wizard. * `apply_yaml_config_fn` — translates `config.yaml` `mattermost:` keys (`require_mention`, `free_response_channels`, `allowed_channels`) into `MATTERMOST_` env vars (replaces the hardcoded block in `gateway/config.py`). `is_connected` — declares connection state from `MATTERMOST_TOKEN` + `MATTERMOST_URL`. * `check_fn` — verifies aiohttp is installed and both required env vars are set. * plus `allowed_users_env`, `allow_all_env`, `cron_deliver_env_var`, `max_message_length` (4000 — Mattermost practical limit), `emoji`, `required_env`, `install_hint`. Files ----- * `gateway/platforms/mattermost.py` (873 LOC) → `plugins/platforms/mattermost/adapter.py` (git rename, R071) + appended `register()` block, hook helpers, and `_standalone_send` with media upload + thread_id support. * New `plugins/platforms/mattermost/{__init__.py, plugin.yaml}` with `requires_env` / `optional_env` declarations covering MATTERMOST_URL, MATTERMOST_TOKEN, MATTERMOST_ALLOWED_USERS, MATTERMOST_ALLOW_ALL_USERS, MATTERMOST_HOME_CHANNEL, MATTERMOST_REPLY_MODE, MATTERMOST_REQUIRE_MENTION, MATTERMOST_FREE_RESPONSE_CHANNELS, MATTERMOST_ALLOWED_CHANNELS. * `gateway/config.py`: delete 17-LOC `mattermost_cfg` YAML→env bridge (moved into plugin's `_apply_yaml_config`). * `gateway/run.py::_create_adapter`: delete `Platform.MATTERMOST elif` — replaced by the existing generic plugin-registry-first dispatch. * `tools/send_message_tool.py`: delete `_send_mattermost` (22 LOC) + `Platform.MATTERMOST elif` in `_send_to_platform` — the `else` branch already routes plugin platforms through `_send_via_adapter`, which hits the registry's `standalone_sender_fn`. * `hermes_cli/setup.py`: delete `_setup_mattermost` (44 LOC) — replaced by the plugin's `interactive_setup`. * `hermes_cli/gateway.py`: delete `_PLATFORMS["mattermost"]` dict entry (3 LOC) — plugin's `setup_fn` is dispatched via the plugin path in `_configure_platform`. * Consumer rewrite: 5 test files (test_mattermost.py, test_media_download_retry.py, test_send_multiple_images.py, test_stream_consumer.py, test_ws_auth_retry.py) get `gateway.platforms.mattermost` → `plugins.platforms.mattermost.adapter` with the bulk-rewrite recipe from the platform-plugin-migration playbook. Single `mock.patch` string in test_stream_consumer.py also repointed. * `tests/tools/test_send_message_missing_platforms.py`: thin `(token, extra, chat_id, message)` compat shim around the plugin's `_standalone_send(pconfig, …)` so existing test bodies continue to work without rewriting every signature. Validation ---------- * Plugin discovery: mattermost registers from `plugins/platforms/mattermost/` alongside discord / teams / irc / line / google_chat / simplex. All 9 hooks present (setup_fn, standalone_sender_fn, apply_yaml_config_fn, is_connected, check_fn, allowed_users_env, allow_all_env, cron_deliver_env_var, max_message_length=4000). * Mattermost-touching tests: 62/62 pass (`test_mattermost.py` + `test_send_message_missing_platforms.py`). * Targeted selectors (mattermost or platform_registry or stream_consumer or ws_auth_retry or media_download_retry or send_multiple_images or send_message_tool or platform_connected): 433/433 pass. * Full sweep (`scripts/run_tests.sh tests/gateway/ tests/cron/ tests/tools/test_send_message_tool.py tests/tools/test_send_message_missing_platforms.py tests/integration/`): 6220/6220 pass in 47.8s, 0 failures. * Lint: ruff clean on all touched files. * Git identity verified: kshitijk4poor. * Rename detection: R071 (similarity dropped from a hypothetical R09x by the ~320-line appended register block — ~36% growth over the 873-LoC base, vs Discord's 5101 LoC base which kept R091). Closes part of #3823.	2026-05-24 18:05:33 -07:00
Ben	4b4c36cb61	feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop The s6-overlay migration replaced every runtime use of gosu with s6-setuidgid (in stage2-hook.sh, main-wrapper.sh, per-service run scripts, and cont-init.d hooks), but the gosu binary itself was still being copied into the image from tianon/gosu, and several comments across the repo still pointed to it. Image changes: - Drop the FROM tianon/gosu:1.19-trixie AS gosu_source stage - Drop the COPY --from=gosu_source /gosu /usr/local/bin/ layer - Net: one fewer base-image pull, ~12-15 MB layer eliminated Documentation/comment refresh (no behavior change): - Dockerfile: update root-user rationale comment + cont-init.d comment - docker/main-wrapper.sh: drop "pre-s6 contract (gosu drop)" reference - docker-compose.yml: update UID/GID remap comment - .hadolint.yaml: update DL3002 ignore rationale - website/docs/user-guide/docker.md: privilege-drop helper is s6-setuidgid now - hermes_cli/config.py: docker_run_as_host_user docstring tools/environments/docker.py runs arbitrary user images via the terminal backend, not the bundled Hermes image. It still needs SETUID/ SETGID caps so user images that use gosu/su/s6-setuidgid all work. Renamed the cap-list constant _GOSU_CAP_ARGS → _PRIVDROP_CAP_ARGS and updated comments to list s6-setuidgid alongside the others as examples. The matching test (test_security_args_include_setuid_setgid_for_gosu_drop → test_security_args_include_setuid_setgid_for_privdrop) was renamed and its docstring updated; behavior is unchanged. Verification: - hadolint clean against .hadolint.yaml - shellcheck clean against all docker/ shell scripts - Image rebuilt successfully (sha 1a090924ccea) - Docker harness: 19 passed in 41.87s (every Phase 0 test + Phase 4 per-profile-gateway lifecycle + container-restart reconciliation) - tests/tools/test_docker_environment.py: 23 passed (rename did not break test discovery; pre-existing unrelated mock warning) The plan document (docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md) intentionally retains its historical references to gosu — it describes the pre-s6 entrypoint as background for understanding the migration.	2026-05-24 18:05:33 -07:00
kshitijk4poor	00ec0b617c	feat(tts): add register_tts_provider() plugin hook (closes #30398 ) Adds a `TTSProvider(ABC)` + `register_tts_provider()` extension point to the plugin context API, alongside the existing config-driven `tts.providers.<name>: type: command` registry from PR #17843. This is additive — the command-provider surface stays as the primary way to add a TTS backend. The hook covers cases the shell-template grammar can't reasonably express: - Native Python SDKs without a CLI (Cartesia, Fish Audio, etc.) - Streaming synthesis (chunked Opus → voice-bubble delivery) - Voice metadata API for the `hermes tools` picker - OAuth-refreshing auth flows None of the 10 inline built-in providers (`edge`, `openai`, `elevenlabs`, `minimax`, `gemini`, `mistral`, `xai`, `piper`, `kittentts`, `neutts`) are migrated to plugins. They stay inline. The hook is for new engines that aren't built-in. ## Resolution order The dispatcher's resolution order is the load-bearing invariant: 1. `tts.provider` is a built-in name → built-in dispatch. Always wins. 2. `tts.provider` matches `tts.providers.<name>` with `command:` set → command-provider dispatch (PR #17843). 3. `tts.provider` matches a plugin-registered `TTSProvider` → plugin dispatch (new). 4. No match → falls through to Edge TTS default (legacy behavior). Built-ins-always-win is enforced at THREE layers: - Registry: `register_provider()` rejects shadowing names with a warning. - Dispatcher: `_dispatch_to_plugin_provider()` short-circuits built-in names defensively before consulting the registry. - Picker: `_plugin_tts_providers()` filters built-in shadows out of the `hermes tools` row list defensively. Command-providers-win-over-plugins is enforced at TWO layers: - The caller in `text_to_speech_tool` checks `_resolve_command_provider_config` first. - `_dispatch_to_plugin_provider` re-checks for a same-name command config defensively so a refactor of the caller can't silently break the invariant. ## New files - `agent/tts_provider.py` — `TTSProvider(ABC)` with `synthesize()` (required), `list_voices()`, `list_models()`, `get_setup_schema()`, `stream()`, `voice_compatible` (all optional with sane defaults). Mirrors `agent/image_gen_provider.py` shape. - `agent/tts_registry.py` — `register_provider`/`get_provider`/`list_providers` with `_BUILTIN_NAMES` reject-shadowing invariant. Mirrors `agent/image_gen_registry.py` shape. - `plugins/tts/...` directory ready for community plugins (none shipped). ## Modified files - `hermes_cli/plugins.py` — `register_tts_provider()` method on `PluginContext`. Matches the gating shape of `register_image_gen_provider()` / `register_browser_provider()`. - `tools/tts_tool.py` — `_dispatch_to_plugin_provider()` + `_plugin_provider_is_voice_compatible()` + walrus-elif wiring into the main dispatcher. Built-in elif chain untouched. - `hermes_cli/tools_config.py` — `_plugin_tts_providers()` injects plugin rows into the Text-to-Speech picker category alongside the 10 hardcoded built-in rows. ## Tests - `tests/agent/test_tts_registry.py` — 47 tests covering registration, lookup, ABC contract, helpers, AND a `TestBuiltinSync` regression test that fails if `agent.tts_registry._BUILTIN_NAMES` drifts from `tools.tts_tool.BUILTIN_TTS_PROVIDERS` (kept duplicated due to circular import constraints). - `tests/tools/test_tts_plugin_dispatch.py` — 35 tests covering built-in-always-wins, command-wins-over-plugin, plugin dispatch, exception passthrough, voice_compatible helper. - `tests/hermes_cli/test_tts_picker.py` — 10 tests covering the picker surface, builtin shadowing defense, integration with `_visible_providers`. - `tests/hermes_cli/test_plugins_tts_registration.py` — 3 end-to-end tests via `PluginManager.discover_and_load()`. - `tests/plugins/tts/check_parity_vs_main.py` — 9-scenario subprocess parity harness vs `origin/main`. The only intentional diff is `fallback_edge → plugin` for the `plugin-installed` scenario. ## Verification - 95/95 new tests pass. - 170/170 pre-existing TTS tests (test_tts_command_providers, test_tts_max_text_length, test_tts_speed, etc.) pass unchanged. - Parity harness against `origin/main`: 8 OK + 1 expected DIFF. - E2E smoke: a registered plugin's `synthesize()` is called via `text_to_speech_tool` with the standard JSON envelope returned. - Ruff clean on all touched files. ## Docs - `website/docs/user-guide/features/tts.md` — new "Python plugin providers" section with a decision table (command-provider vs plugin), minimal plugin example, and the optional-hook reference. - `website/docs/user-guide/features/plugins.md` — TTS row updated to mention both surfaces (command-provider primary, plugin for SDK/streaming). Closes #30398	2026-05-24 18:04:54 -07:00
Teknium	3d66787a04	fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452 ) * fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision Fixes #31179. Three coupled fixes so a configured aux vision backend actually serves vision tasks instead of silently routing images to the user's main provider: 1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves to `custom` + `https://api.openai.com/v1`. "openai" was not in PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for manual base_url), so the obvious config name silently failed to build a client. User-supplied base_url is still preserved; only the provider name normalises to `custom` so resolution doesn't hit the PROVIDER_REGISTRY-only path. 2. agent/auxiliary_client.py: the vision auto-detect chain now skips the user's main provider when models.dev reports `supports_vision=False`. Without this guard, a misconfigured aux provider would fall back to `auto`, which happily returned the main-provider client. The caller would then send image content to e.g. api.deepseek.com with model `gpt-4o-mini` and get a cryptic `unknown variant 'image_url', expected 'text'` from the provider's parser. 3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements` now mirrors the runtime fallback chain (explicit provider, then auto), so `vision_analyze` shows up whenever vision is actually serviceable. `browser_vision` gets a new `check_browser_vision_requirements` check_fn that AND-gates browser + vision availability, so it doesn't get advertised to the model when the call would fail at runtime. Reproduction (config from the bug report): model.provider: deepseek model.default: deepseek-v4-pro auxiliary.vision.provider: openai auxiliary.vision.model: gpt-4o-mini Before: resolve_vision_provider_client() returns None for the explicit provider, fallback auto returns the deepseek client with model='gpt-4o-mini', image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze hidden from tool list; browser_vision exposed but fails at call time. After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini. vision_analyze and browser_vision both gate correctly on capability. Tests: tests/agent/test_vision_routing_31179.py covers all three fixes (12 cases including the user's exact scenario, base_url preservation, text-only-main skip, capability-unknown permissive fallback, and tool gating parity). Existing 382 tests across auxiliary/vision/image_routing suites still pass. * test(vision): use exact hostname check to silence CodeQL substring-sanitization alert * fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL The new `logger.debug(...)` added in the previous commit interpolated both `main_provider` and `vision_model` (a public model slug \u2014 not sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic re-flagged it twice because the rule mis-detects multi-value interpolations near tainted-via-config provider strings. Drop the model from the log args (provider alone is enough to diagnose the skip; the same sibling branch a few lines up already logs provider only). Behavior unchanged; CodeQL false positive cleared.	2026-05-24 15:01:28 -07:00
Teknium	5acaeba2bb	fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450 ) When the 'mcp' Python SDK isn't installed, _run_stdio leaked a bare 'NameError: name StdioServerParameters is not defined' because the top-level 'from mcp import ...' fails inside try/except ImportError, leaving the names unbound at module scope. Mirror the _MCP_HTTP_AVAILABLE gate that _run_http already had: raise a clear ImportError with install instructions instead. Fixes #30904	2026-05-24 04:44:59 -07:00
Teknium	d3c167b644	fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290 ) * fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint Adds a soft guard so an agent running under one Hermes profile cannot silently edit a different profile's skills/plugins/cron/memories. Three layers: A. agent/file_safety.classify_cross_profile_target Classifies a write target against the active HERMES_HOME. Returns a {active_profile, target_profile, area, target_path} dict when the path lands in another profile's scoped area. PROFILE_SCOPED_AREAS = (skills, plugins, cron, memories). get_cross_profile_warning() wraps it into a model-facing error string that names both profiles, names the area, and points at the cross_profile=True bypass. Defense-in-depth, NOT a security boundary — the terminal tool runs as the same OS user and can write any of these paths directly. The guard exists to prevent confused-agent corruption, not to stop a determined attacker. SECURITY.md §3.2 (terminal-bypass posture) still applies. Wired into tools/file_tools.write_file_tool and patch_tool with a cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both advertise cross_profile so the model can pass it after explicit user direction. patch_tool extracts target paths from V4A patch bodies before checking (same shape as the existing sensitive-path check). skill_manage is already scoped to the active profile's SKILLS_DIR by construction, so no extra guard wiring is needed there. The D-side error message (below) still names other profiles when the skill exists elsewhere. B. agent/system_prompt One deterministic line near the environment-hints block names the active profile and tells the model not to modify another profile's skills/plugins/cron/memories without explicit direction. Profile name is stable for the lifetime of the AIAgent, so the line is prompt-cache-safe. D. tools/skill_manager_tool._skill_not_found_error Replaces the bare "Skill 'X' not found." with a message that: - names the active profile, - searches OTHER profiles' skills dirs for the same name, - names the profile(s) where the skill exists and the path, - suggests `hermes -p <name>` to switch profiles, or cross_profile=True for an explicit edit. All 5 "not found" sites in skill_manager_tool (edit, patch, delete, write_file, remove_file) now go through the helper. Reference incident (May 2026): a hermes-security profile session edited skills under both ~/.hermes/profiles/hermes-security/skills/ AND ~/.hermes/skills/ (the default profile's skills) without realizing the second path belonged to a different profile. Three of the four skill files needed manual restoration afterward. What this PR does NOT do: * No hard block. The terminal tool can still touch any of these paths with no guard — same posture as the dangerous-command approval flow. SECURITY.md §3.2 applies. * No regex sweep on terminal commands for cross-profile paths. That direction is a Skills-Guard-style arms race (cd + relative paths, base64, etc.) and would false-positive on legitimate cross-profile reads. Filed as a follow-up. * No on-disk path migration. ~/.hermes/skills/ remains the default profile's skills dir; this PR is about telling the agent about that boundary, not changing the layout. Tests: tests/agent/test_file_safety_cross_profile.py (16 tests) - _resolve_active_profile_name covers default/named/failure paths - classify_cross_profile_target covers all four scoped areas, both directions (default → named, named → default, named → named), non-Hermes paths, and root-level config files - get_cross_profile_warning covers in-profile no-op, cross-profile message shape, and the defense-in-depth self-documentation tests/tools/test_cross_profile_guard.py (12 tests) - write_file: in-profile allow, cross-profile block, cross_profile=True bypass, non-Hermes pass-through - patch: replace-mode block, cross_profile=True bypass, V4A patch path extraction - skill_manage: error names the other profile (single + multiple), missing-everywhere falls back to skills_list hint - system prompt: contract-level checks (both branches present, cross_profile=True mentioned, ~/.hermes/profiles/ referenced) All 207 existing tests in file_safety/file_operations/skill_manager still pass. 10 system-prompt tests still pass. E2E verified: the exact incident scenario (security profile editing default's hermes-agent-dev skill) is now blocked with the warning message; cross_profile=True unblocks. * fix(code_execution): add cross_profile to write_file/patch stubs The cross_profile kwarg added to write_file_tool/patch_tool needs to flow through the execute_code sandbox stubs in _TOOL_STUBS so the test_stubs_cover_all_schema_params drift test passes. Without this, scripts running inside execute_code couldn't pass cross_profile=True through hermes_tools.write_file(). Caught by CI on PR #31290.	2026-05-24 00:38:17 -07:00
Teknium	d97c324473	fix(terminal): warn at call time when background=true runs silently (#31289 ) `terminal(background=true)` without `notify_on_complete=true` or `watch_patterns` runs the process SILENTLY — the agent has no way to learn it finished short of calling `process(action='poll')` explicitly. That's correct for genuine long-lived processes (servers, watchers, daemons) but is a footgun for every bounded task (tests, builds, deploys, CI pollers, batch jobs), which is the vast majority of background uses. Hit on May 23, 2026 (PR #31231 incident): agent launched a CI-watch loop with `background=true` only. The poller ran fine, exited green 6 minutes later, agent never noticed. User had to surface 'we are green CI, you can merge.' Memory and skill docs said what to do (poll in background) but not how to receive the result. The `notify_on_complete=true` flag exists and works, but is easy to forget when bg seems sufficient on its own. Two changes here, mutually reinforcing: 1. Runtime nudge: tool result for `background=true` w/o notify or watch_patterns now includes a `hint` field explaining the silent- process failure mode and pointing at the corrective flag. Agent sees it on the same turn and self-corrects without needing the user to surface anything. Cost for legitimate server cases is one ignored read (~50 tokens); cost for forgot-notify cases is prevented blindness (potentially many turns, or a user nudge). False positives << false negatives. 2. Schema/description rewrite: top-level TERMINAL_TOOL_DESCRIPTION and the `background` field description now lead with 'Almost always pair with notify_on_complete=true' instead of presenting it as one of two equally-likely patterns. The two legitimate non-notify shapes (long-lived servers; watch_patterns mid-process signals) are still documented, but as the minority case. Tests cover all four shapes: bg-only emits hint, bg+notify doesn't, bg+watch_patterns doesn't, foreground doesn't. 4 new tests; full suite of background/process tests stays green (160/160 across the relevant 6 test files).	2026-05-23 21:02:14 -07:00
teknium1	7ce6b504a2	fix(process_registry): use taskkill /T /F for tree-kill on Windows The Windows branch of `_terminate_host_pid` early-returned after `os.kill(pid, SIGTERM)` (which Python maps to `TerminateProcess` for the target handle only), leaving descendant processes — e.g. Chromium renderer/GPU/network helpers spawned by an `agent-browser` daemon — running on Windows even after the preceding commit fixed POSIX. The right Windows primitive is `taskkill /PID <pid> /T /F`: `/T` walks the tree, `/F` force-terminates. Same approach `gateway.status.terminate_pid(force=True)` already uses for the gateway's own shutdown path; reuse the same shape here. Why NOT extend the POSIX psutil tree-walk to Windows: 1. Windows doesn't maintain a Unix-style process tree. `psutil. Process.children(recursive=True)` walks PPID links that go stale when intermediate processes exit, so enumeration is best-effort and silently misses orphaned descendants. The whole bug we're fixing is orphaned descendants. 2. `psutil.Process.terminate()` on Windows is `TerminateProcess()` for one handle — same single-PID scope as the existing `os.kill`. The existing comment in `gateway/status.py:: terminate_pid` warns this explicitly: 'os.kill SIGTERM is not equivalent to a tree-killing hard stop' on Windows. 3. Headless Chromium has no GUI window, so the softer `taskkill /T` without `/F` (which sends WM_CLOSE) won't reach it either. `/F` is required. POSIX path is unchanged. The taskkill subprocess uses the same `creationflags=windows_hide_flags()` pattern other Windows shellouts in this codebase use. `FileNotFoundError` / `TimeoutExpired` / `OSError` fall back to bare `os.kill(SIGTERM)` as cheap insurance. Tests cover the Windows branch via the codebase's standard `monkeypatch _IS_WINDOWS` pattern (`references/windows-native- support.md`), plus POSIX tree-walk order, NoSuchProcess swallow, and the OSError fallback path. 7 new tests, all green on Linux CI.	2026-05-23 20:30:29 -07:00
Yuan Li	22f3f5a75a	fix(browser): use process-tree termination for daemon cleanup os.kill(pid, SIGTERM) only signals the parent, leaving Chromium child processes (renderer, GPU, etc.) orphaned. Reuse the existing ProcessRegistry._terminate_host_pid() helper which walks the process tree leaf-up via psutil, terminating children before the parent.	2026-05-23 20:30:29 -07:00
teknium1	4254f7dd17	refactor(skills): slim AST diagnostic to single entry point Trim ~600 LOC off the original contribution while keeping the same operator-facing surface and detection coverage. - Collapse three entry points (file / dir / bundle) into one ast_scan_path(path) that handles both files and directories. - Drop AstFinding dataclass + severity field — replaced with plain (file, line, pattern_id, description) tuples. Severity ordering was display-only for a diagnostic that explicitly disclaims security verdicts, so the field added bookkeeping without earning its place. - Replace Rich-markup formatter with plain text grouped by file. - Drop the 'inspect --ast-deep' surface — same scanner, same output as 'audit --deep', single CLI entry is enough. Operators audit after install; pre-install inspection signal isn't worth the second surface. - Trim test file to the cases that earn their place: bypass payload, syntax error survival, RecursionError survival, false-positive guard (importer lookalike), literal-arg false-positive guard, non-.py ignored, directory recursion + cache-dir skipping, missing-path, getattr/__dict__ detection, formatter empty + populated. Net: tools/skills_ast_audit.py 353 -> 133 LOC, tests/tools/test_skills_ast_audit.py 299 -> 103 LOC, full diff +704/-12 -> +264/-6. No change to tools/skills_guard.py — Skills Guard verdicts remain untouched per SECURITY.md §2.4.	2026-05-23 17:47:26 -07:00
Tranquil-Flow	7255050c99	feat(skills): add opt-in AST deep diagnostics Add opt-in AST diagnostics for skill review without making Skills Guard stricter by default. - Add hermes skills inspect --ast-deep to scan fetched skill bundles before installation - Add hermes skills audit --deep to scan already-installed hub skills - Keep AST analysis in tools/skills_ast_audit.py, separate from tools/skills_guard.py - Label output as diagnostic hints, not security verdicts - Cover dynamic import/access patterns: importlib, __import__(computed), getattr(computed), and __dict__[computed] This follows the maintainer guidance from closed PR #7436: useful AST-level analysis belongs in an opt-in diagnostic path, not in Skills Guard's default heuristic scan.	2026-05-23 17:47:26 -07:00
Teknium	6a8e131a0a	refactor(ntfy): convert built-in adapter to platform plugin ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/ instead of editing 8 core files (gateway/config.py Platform enum, gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py, hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py, tools/send_message_tool.py). All routing goes through gateway/platform_registry via register_platform(): - adapter_factory, check_fn, validate_config, is_connected - env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so gateway status reflects env-only setups without instantiating httpx - standalone_sender_fn handles deliver=ntfy cron jobs when cron runs out-of-process from the gateway - allowed_users_env / allow_all_env hook into _is_user_authorized - cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing - platform_hint surfaces in the system prompt - pii_safe=True (topic names are the only identifier; no PII to redact) Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader so the module lives under plugin_adapter_ntfy in sys.modules and cannot collide with sibling plugin-adapter tests on the same xdist worker. The core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets, etc.) are replaced with plugin-shape tests covering register() metadata, env_enablement_fn output, and standalone_sender_fn behavior. 68 tests pass under scripts/run_tests.sh.	2026-05-23 16:13:01 -07:00
sprmn24	b10f17bf1e	feat(ntfy): add ntfy platform adapter with atomic reconnect, identity fix, and 81 tests	2026-05-23 16:13:01 -07:00
Teknium	7f1b2b4569	fix(approval): pin 'silence is not consent' contract on timeout/deny (#24912 ) (#30879 ) User incident (Slack, 2026-05-13): user walked away mid-conversation, agent requested approval to run `rm -rf .git`, the prompt timed out after the gateway_timeout (default 300s), and the agent removed the .git folder on its own. Corroborated by an independent report from a Telegram user. The underlying code path was correct — `check_all_command_guards` returns `approved=False` with a BLOCKED message on both timeout and explicit deny, and `terminal_tool` surfaces that as `status=blocked` to the agent. The bug is at the model-interface layer: the message "BLOCKED: Command timed out. Do NOT retry this command." reads to some models as "try a different command achieving the same outcome." This commit changes only the model-facing message + the structured return shape: - Timeout message now explicitly names the three evasion paths the agent must avoid: retry, rephrase, AND achieve the same outcome via a different command. Ends with "Silence is not consent." - Explicit deny gets the same shape minus the silence-is-not-consent line (it WAS an explicit deny, not silence). - New structured fields on the return dict: `outcome` ("timeout" or "denied") and `user_consent` (always False on this branch) so plugins, hooks, and audit pipelines don't have to string-parse the message to distinguish the two cases. The mechanism that should already have prevented the original incident — timeout treated as deny, BLOCKED result, post hook fires with `choice="timeout"` — is unchanged. This commit hardens only the agent's reading of the result. Tests: - test_timeout_returns_approved_false_with_no_consent — pins the return shape on the Slack-shaped notify_cb-registered path - test_timeout_message_is_emphatic_against_retry_and_rephrase — pins the exact phrases the message must contain - test_explicit_deny_carries_same_no_consent_shape — same contract on explicit /deny - test_timeout_emits_post_hook_with_timeout_outcome — pins the post_approval_response hook payload so audit plugins can act 329 approval tests passing (4 new + 325 existing). Fixes #24912	2026-05-23 02:59:13 -07:00
Teknium	6855d17753	fix(memory): guard against external drift in MEMORY.md/USER.md (#26045 ) (#30877 ) Reproduction (production, 2026-05-14): two concurrent sessions on the same agent. Session A patches MEMORY.md directly via the patch tool, appending ~8KB of structured content (Vendor Master, Standing Orders, Pin Board) — none of it through the memory tool, so no § delimiters. Session B starts later with stale in-memory state (1 entry, ~331 chars). Session B calls memory(action=replace) on its one known entry. The tool's _read_file parses A's content as a single 8KB 'entry' (no § splits), then replace truncates that entry to B's new 333-byte content. ~8KB of structured content silently destroyed. The atomic-rename write path is fine in isolation. The bug is the implicit contract: the tool assumes MEMORY.md is exclusively a §-delimited list of small entries it wrote, but the v0.13 install runbook itself uses 'cat >> MEMORY.md' for onboarding, the patch tool edits the file directly, and operators do too. Fix: a drift guard in MemoryStore._detect_external_drift that fires on either signal: 1. Re-parse + re-serialize doesn't produce identical bytes (catches oddly-encoded delimiters / partial writes). 2. Any single parsed entry exceeds the store's whole-file char limit. The tool budgets the ENTIRE store against that limit (2200 chars for memory, 1375 for user), so no tool-written entry can legitimately be larger. An entry bigger than the store limit means an external writer dropped free-form content into what the tool will treat as one entry. When drift fires, _reload_target writes a .bak.<ts> snapshot of the on-disk file, then add/replace/remove refuse to flush. The original file stays untouched. The error dict surfaces the .bak path AND a remediation string ('integrate missing entries via memory(add=...) one at a time, then rewrite the file clean') so the model can act on it without escalating to the operator. Tests: - test_replace_refuses_on_drift, test_add_refuses_on_drift, test_remove_refuses_on_drift — all three mutators refuse - test_clean_file_does_not_trigger_drift — false-positive check - test_error_message_points_at_remediation — error string shape - test_drift_guard_also_protects_user_target — USER.md too - test_drift_backup_filename_is_unique_per_invocation — bak.<ts> naming pin 144 memory tests passing (was 137; +7). Fixes #26045	2026-05-23 02:51:29 -07:00
Teknium	6942b1836e	fix(skills_guard): explain why --force is rejected on dangerous verdicts Follow-up to @sprmn24's verdict-logic fix. The previous block-message ended in 'Use --force to override' regardless of verdict — but as of the --force fix above, dangerous community/trusted skills can't be overridden by --force at all. The misleading hint sends users in a loop. Replace it with a specific message that tells them what the documented behavior actually is. Adds two regression tests covering the dangerous-verdict message shape and one that pins the existing --force hint for non-dangerous blocks.	2026-05-23 02:37:30 -07:00
sprmn24	789043b691	fix(security): update tests for verdict and --force changes	2026-05-23 02:37:30 -07:00
sprmn24	0f8215f633	fix(security): correct verdict logic and enforce --force limitation in skills_guard - _determine_verdict() returned 'caution' for medium/low-only findings, causing community skills with harmless patterns (e.g. path traversal notation, unpinned pip install) to be incorrectly blocked. Now returns 'safe' when only medium/low severity findings are present. - should_allow_install() allowed --force to override 'dangerous' verdict, contradicting documented behavior that --force does NOT override dangerous scan results. Added explicit check to prevent force-installing skills with dangerous verdict.	2026-05-23 02:37:30 -07:00
Eugeniusz Gilewski	41d2c758c3	Fix unsafe gateway media path delivery	2026-05-23 01:40:35 -07:00
briandevans	567ea61298	fix(file-safety): block auth.json read via TERMINAL_CWD relative path read_file_tool resolves relative paths against TERMINAL_CWD (or the task's live terminal cwd), but the prior call passed the original unresolved string to get_read_block_error. That function's own resolve() is anchored at the Python process cwd, so when a task's TERMINAL_CWD pointed at HERMES_HOME and the agent issued read_file on the relative path "auth.json", the credential-store denylist was never reached and the file was read normally. Pass the already-resolved absolute path string at the file_tools call site, document the contract on get_read_block_error, and add a read_file_tool-level regression test that pins the relative-path case under TERMINAL_CWD == HERMES_HOME. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
Teknium	3f78d8073c	fix(skills): make content_hash filename-sensitive too (symmetric with bundle_content_hash) PR #6656 added rel_path + \x00 prefixing to ``bundle_content_hash`` so a filename swap between two files in a bundle changes the digest. But it only patched the in-memory side — ``content_hash`` in ``tools/skills_guard.py`` (the on-disk equivalent) still hashed file contents only. These two functions need to stay symmetric: ``check_for_skill_updates`` compares the disk hash of an installed skill against the bundle hash of the upstream copy. With the asymmetric fix, every clean install showed as drifted because the digests no longer matched (2 existing tests in ``test_skills_hub.py`` started failing as soon as the contributor's change landed). Apply the same ``rel_path + \x00 + content`` shape to the disk-side function. Both functions now produce the same digest for the same skill content laid out two ways. Documented the symmetry invariant in the docstring so a future change to either function knows to touch both. Also adds tests/tools/test_pr_6656_regressions.py with 10 regression tests covering all three fixes salvaged in PR #6656: - uninstall_skill path traversal (4 cases: parent segments, absolute paths, symlink escape, legitimate skill) - bundle_content_hash filename swap detection (4 cases: in-memory swap, identity, disk-side swap, bundle↔disk symmetry) - list_pending lock contract (2 cases: source-grep contract, smoke) Also fixes AUTHOR_MAP entry for @aaronlab — their commit email (1115117931@qq.com) maps to "aaronagent" which isn't a real GitHub login, so changelog @mentions would 404.	2026-05-22 19:59:24 -07:00
aaronagent	b82608a6f5	fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths - skills_hub: validate that uninstall_skill's install_path resolves inside SKILLS_DIR before calling shutil.rmtree, preventing recursive deletion of arbitrary directories via poisoned lock.json entries - skills_hub: include file paths (not just contents) in bundle_content_hash so swapping filenames between files changes the hash, strengthening update-detection integrity - pairing: wrap list_pending() in self._lock so _cleanup_expired() file writes don't race with concurrent generate_code()/approve_code() calls Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-22 19:59:24 -07:00
kshitijk4poor	cc8e5ec2af	refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) First migration of an existing built-in platform adapter to the plugin system established by IRC / Teams / LINE / Google Chat. Closes #24325; advances the umbrella refactor in #3823. Matches Teams' shape exactly — adapter under ``plugins/platforms/discord/`` with the standard ``__init__.py`` / ``adapter.py`` / ``plugin.yaml`` shell, ``register(ctx)`` entry point, no back-compat shim at the old import path, and full parity for the four hooks Teams uses plus the ``apply_yaml_config_fn`` hook that landed in #25443 (the Discord plugin is the first consumer of that hook): * ``standalone_sender_fn`` — out-of-process cron delivery via REST API * ``setup_fn`` — interactive ``hermes setup gateway`` wizard * ``apply_yaml_config_fn`` — translate ``config.yaml`` ``discord:`` keys into ``DISCORD_`` env vars (replaces the hardcoded block in ``gateway/config.py``) ``is_connected`` — declares connection state from ``DISCORD_BOT_TOKEN`` * ``check_fn`` — lazy-installs ``discord.py`` on demand * plus ``allowed_users_env``, ``allow_all_env``, ``cron_deliver_env_var``, ``max_message_length``, ``emoji``, ``required_env``, ``install_hint`` * ``gateway/platforms/discord.py`` (5,101 LOC) → ``plugins/platforms/discord/adapter.py`` (git rename, R090). * New ``plugins/platforms/discord/{__init__.py, plugin.yaml}`` with ``requires_env`` / ``optional_env`` declarations. * Append ``register(ctx)`` block + new hook implementations (``_standalone_send``, ``interactive_setup``, ``_apply_yaml_config``, ``_clean_discord_user_ids``, ``_is_connected``, ``_build_adapter``, plus helpers ``_DISCORD_CHANNEL_TYPE_PROBE_CACHE`` etc.) to the adapter. * Replace the ``Platform.DISCORD elif`` branch in ``GatewayRunner._create_adapter()`` (−9 LOC) with a generic post-creation hook (+6 LOC) in the registry path: any plugin adapter that declares a ``gateway_runner`` attribute now gets it auto-injected. Webhook's built-in branch is unchanged (it doesn't go through the registry path). * Move ``_send_discord`` (190 LOC) and helpers (``_DISCORD_CHANNEL_TYPE_PROBE_CACHE``, ``_remember_channel_is_forum``, ``_probe_is_forum_cached``, ``_derive_forum_thread_name``) from ``tools/send_message_tool.py`` into the plugin as ``_standalone_send``. * Wire via ``standalone_sender_fn=_standalone_send`` (Teams pattern; same gap fixed in #21804 for other plugin platforms). * Replace the Discord ``elif`` in ``tools/send_message_tool.py`` ``_send_to_platform`` with a 10-line registry-hook dispatch. * Drop the ``DiscordAdapter`` import and the ``Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH`` ``_MAX_LENGTHS`` entry — the registry's ``max_message_length=2000`` covers it. * Move ``_setup_discord`` and ``_clean_discord_user_ids`` (68 LOC) from ``hermes_cli/setup.py`` into the plugin as ``interactive_setup``. * Wire via ``setup_fn=interactive_setup``. CLI helpers (``prompt``, ``print_info``, etc.) are lazy-imported so the plugin's module-load surface stays minimal. * Remove ``"discord": _s._setup_discord`` from ``hermes_cli/gateway.py::_builtin_setup_fn``. * Remove the entire 32-line ``_PLATFORMS["discord"]`` static dict entry — Discord's setup metadata is now discovered dynamically via ``_all_platforms()`` from the registry entry. * Move the 59-line ``discord_cfg`` YAML→env bridge from ``gateway/config.py::load_gateway_config()`` into the plugin as ``_apply_yaml_config``. Covers ``require_mention``, ``thread_require_mention``, ``free_response_channels``, ``auto_thread``, ``reactions``, ``ignored_channels``, ``allowed_channels``, ``no_thread_channels``, ``allow_mentions.{everyone,roles,users, replied_user}``, and ``reply_to_mode`` (including the YAML 1.1 ``off``-as-False coercion and the ``extra.reply_to_mode`` fallback). * Wire via ``apply_yaml_config_fn=_apply_yaml_config``. * The hook runs BEFORE ``_apply_env_overrides`` and after the generic shared-key loop, exactly as documented in ``website/docs/developer-guide/adding-platform-adapters.md``. * Behavior is preserved exactly — every assignment still uses ``not os.getenv(...)`` guards so env vars take precedence over YAML. All 78 references to the old import path are rewritten — no back-compat shim: * 51 ``from gateway.platforms.discord import X`` → ``from plugins.platforms.discord.adapter import X`` * 5 ``import gateway.platforms.discord as discord_platform`` → ``import plugins.platforms.discord.adapter as discord_platform`` * 1 ``from gateway.platforms import discord as discord_mod`` → ``from plugins.platforms.discord import adapter as discord_mod`` * 21 ``mock.patch("gateway.platforms.discord.X")`` strings → ``mock.patch("plugins.platforms.discord.adapter.X")`` * 1 docstring reference in ``hermes_cli/commands.py`` * 1 import in ``tools/send_message_tool.py`` (now removed entirely) The import-safety test in ``tests/gateway/test_discord_imports.py`` is updated to purge the new canonical module name from ``sys.modules``. 38 files changed, +621 / −473 — net positive due to the YAML hook implementation (89 new LOC in the plugin trading for 59 deleted in core), but every line moved has a clear plugin home now. The git rename is detected at R090 because the adapter gained ~340 LOC of moved-in hook implementations (``_standalone_send`` + ``interactive_setup`` + ``_apply_yaml_config`` + helpers). * All 568 Discord-specific tests pass across 25 ``test_discord_.py`` files plus voice/send/text-batching/reload-skills/stream-consumer/ integration tests. All 147 tests in the YAML-touching subset (``test_discord_reply_mode``, ``test_discord_free_response``, ``test_discord_allowed_channels``, ``test_discord_allowed_mentions``, ``test_discord_channel_controls``, ``test_discord_reactions``, ``test_discord_thread_persistence``, ``test_runtime_footer``) pass — this is the strongest signal that the YAML→env hook behaves identically to the legacy block. * Broader gateway/cron/integration sweep (1297 tests) introduces zero new failures vs ``main``. Pre-existing failures in ``tests/gateway/test_tts_media_routing.py`` and ``tests/e2e/test_platform_commands.py`` reproduce identically on the unchanged ``main`` revision. * Plugin discovery sanity check confirms Discord registers alongside the other four platform plugins: Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams'] These Discord-shaped tendrils in core were deliberately not moved — they are generic platform-registry concerns affecting every platform, not Discord-specific: * ``gateway/config.py:1205`` ``DISCORD_BOT_TOKEN → config.token`` env enablement — same shape Telegram has. The existing ``env_enablement_fn`` registry hook only seeds ``extra``, not ``.token``, so it can't replace this without an adapter refactor to read from ``extra["bot_token"]``. * ``gateway/run.py`` voice-mode hooks (``self.adapters.get(Platform.DISCORD)`` for ``start_voice_mode``/``stop_voice_mode``), role-based auth, ``DISCORD_ALLOW_BOTS`` branch in ``_is_user_authorized``, ``_UPDATE_ALLOWED_PLATFORMS`` frozenset, and the per-platform allowlist maps — generic platform-registry concerns. * ``Platform.DISCORD`` enum literal — stable identifier used as dict keys throughout the codebase; removing it is a separate refactor with no real benefit. * ``tools/discord_tool.py`` and ``tools/environments/local.py`` — first-class agent tools and env-passthrough config, neither is the gateway adapter. Each of these is worth its own scoping issue when the time comes.	2026-05-22 14:21:41 -07:00
0xDevNinja	3ac2125140	refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241	2026-05-22 04:10:45 -07:00
Rodrigo	fbdca64f73	fix(computer-use): skip capture_after when action failed (ok=False) _maybe_follow_capture() issued a follow-up screenshot unconditionally when capture_after=True, even when res.ok=False. The model then received a normal-looking screenshot alongside an error message, and in practice it often ignored ok=False and proceeded as if the action had succeeded. Fix: return _text_response(res) early when res.ok is False so the model receives only the error and can decide how to recover. Tests added: - test_capture_after_skipped_when_action_failed: patches click to return ok=False and asserts no capture call is issued. - test_capture_after_fires_when_action_succeeds: ensures the happy path still triggers the follow-up capture.	2026-05-22 01:19:01 -07:00
Rodrigo	c52cd48e25	fix(computer-use): add set_value to ComputerUseBackend ABC and _NoopBackend stub _dispatch() routes action="set_value" to backend.set_value(), but: - ComputerUseBackend did not declare set_value as @abstractmethod, so subclasses could silently omit it without a TypeError at class load time. - _NoopBackend (the test/CI stub) had no set_value method at all, causing AttributeError in any test that exercises the set_value action path. Fix: - Add set_value as @abstractmethod to ComputerUseBackend in backend.py. - Add a recording stub in _NoopBackend in tool.py. - Add two TestDispatch cases: one verifying the call reaches the backend, one verifying the missing-value guard returns a clean error.	2026-05-22 01:14:15 -07:00
teknium1	372e9a18cd	fixup: log lazy-install errors at debug + AUTHOR_MAP for CipherFrame Co-authored-by: CipherFrame <cipherframe@users.noreply.github.com>	2026-05-21 23:36:18 -07:00
CipherFrame	b5c6d9ac08	fix: wire STT lazy-install into transcription_tools.py The ensure('stt.faster_whisper') lazy-install mechanism was defined in lazy_deps.py but never called from the STT code path. When _HAS_FASTER_WHISPER (a module-level constant) evaluated to False at import time, _get_provider() returned 'none' immediately without attempting installation. On fresh container builds or venv recreations, this meant voice message transcription broke silently until someone manually installed faster-whisper. Add _try_lazy_install_stt() helper that calls ensure() and re-checks dynamically via importlib.util.find_spec. Wire it into all three gates in transcription_tools.py: - _get_provider() explicit 'local' path (line 221) - _get_provider() auto-detect path (line 287) - _transcribe_local() guard (line 405) This ensures the first voice message after any fresh install triggers auto-installation instead of failing permanently until a process restart.	2026-05-21 23:36:18 -07:00
Stark-X	eb51fb6f50	fix(ssh): keep bulk sync extraction scoped to .hermes	2026-05-21 19:17:51 -07:00
Teknium	0e2873a77d	fix(computer_use): build summary once before aux-vision routing branch The cherry-pick of #22891 (max_elements cap) reshuffled _capture_response so summary was assigned inside both the multimodal and AX branches, but #30126's aux-vision routing call (_route_capture_through_aux_vision) fires BEFORE either branch and references the not-yet-bound name. Compute summary once up-front, keep the AX-branch rebuild for the truncation note.	2026-05-21 19:07:32 -07:00
briandevans	280dd4513a	fix(computer-use): address Copilot review on max_elements cap Four findings from Copilot's review on PR #22891, all in the AX elements-array cap added by 22fa1ed: 1. The truncation note ("response truncated to N of M elements") was appended unconditionally — including in the som/vision multimodal path, whose response carries a screenshot rather than an `elements` array. The note described a payload field that wasn't present. Moved the note into the AX-text branch where the array actually appears. 2. `_format_elements(cap.elements)` ran on the full untrimmed list with its own `max_lines=40` cap, so a caller passing `max_elements=10` would see summary lines referencing `#11..#40` even though the JSON `elements` array only held #1..#10. Format on `visible_elements` instead so the summary indices always exist in the response. 3. `_coerce_max_elements` enforced a lower bound but no upper bound, so `max_elements=10_000_000` silently disabled the safeguard and reintroduced the original context-blow-up. Added a hard cap (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values. 4. The schema string said "Default 100" but the property carried no `default` field, and claimed `max_elements` had no effect on som/ vision while the image-missing fallback path can still return an elements array. Added `"default": 100`, `"maximum": 1000`, and clarified the fallback-path wording. Each finding gets a regression test: - test_capture_ax_clamps_oversized_max_elements_to_hard_cap - test_capture_ax_summary_indices_match_returned_elements - test_capture_multimodal_summary_omits_truncation_note - test_schema_max_elements_documents_default_and_upper_bound Verified with `pytest tests/tools/test_computer_use.py` (53 passed, including the 5 new cases). Confirmed each new test fails on the pre-fix code path before applying the production change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 19:07:32 -07:00

1 2 3 4 5 ...

1483 commits