hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
aaronlab	5f20322d23	fix(tts): reject '..' traversal in output_path text_to_speech_tool accepts an explicit output_path. Without a traversal guard, a path containing '..' components (whether prompt-injection-controlled, from a confused skill, or just a buggy caller) could escape its declared base and write the audio to a system location — e.g. `output_path='audio/../../etc/cron.d/x'` lands the file outside the intended audio cache. Reject '..' components in the user-supplied path. Explicit absolute paths are unchanged (the agent legitimately writes audio wherever the user/caller asks); only traversal-style escapes are blocked. The terminal tool can still write anywhere with approval — this just keeps the unattended TTS surface from materializing files via traversal. Regression tests cover: '..' in the middle (audio/../../etc/...), bare '..' prefix, and the negative cases (absolute paths + relative paths without '..' both pass through unchanged). Salvaged from PR #6693 by @aaronlab. The original PR confined output to DEFAULT_OUTPUT_DIR-or-cwd, which broke 9 existing tests that legitimately write to tmp_path locations. The traversal-only check covers the actual threat (path-escape via '..' from prompt injection) without restricting where users can choose to write their audio. The remaining pieces of #6693 (skill_commands rglob symlink rejection, delegate_tool batch prefix display) are dropped: - skill_commands rglob: breaks the documented design supporting ~/.hermes/skills/<name> as a symlink to a checked-out skill elsewhere (see comment at agent/skill_commands.py:73-75) - delegate_tool batch prefix: pure UX, doesn't belong in a security PR Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 05:15:55 -07:00
daimon-nous[bot]	ac5359a3f3	fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) (#32012) * fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) When a stream stalls mid-tool-call (e.g. a large write_file), the partial-stream-stub recovery used finish_reason='stop' which caused the conversation loop to treat the turn as complete, returning only the warning text. When users said 'continue', the model retried the same large tool call, hit the same stale timeout, and looped indefinitely. Changes: - chat_completion_helpers.py: change _stub_finish_reason from 'stop' to 'length' for mid-tool-call partials. The stub still has tool_calls=None so no tool auto-executes — the model gets a fresh API call through the existing length-continuation machinery (bounded to 3 retries). Also attach _dropped_tool_names to the stub for downstream use. - conversation_loop.py: add a third continuation prompt branch for partial-stream-stubs with dropped tool calls. Instead of the generic 'continue where you left off' (which would retry the same large call), tell the model to break the output into smaller tool calls (~8K tokens each) to avoid stream timeouts. - test_partial_stream_finish_reason.py: update existing test from finish_reason='stop' to 'length', add _dropped_tool_names assertion, add new test_dropped_tool_call_uses_chunking_prompt for the 3-way prompt branching. Safety: tool_calls=None is preserved on the stub, so the conversation loop enters the text-continuation branch (line 1513), NOT the tool-call execution branch (line 3246). No tool auto-executes. The model simply gets another API call with targeted guidance.
* refactor: extract constants and continuation prompt helper - Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH) - Extract _get_continuation_prompt() in conversation_loop.py — DRYs the 3-way prompt branching and lets tests import the real function - Trim verbose inline comments in chat_completion_helpers.py - Tests import constants + helper instead of duplicating logic --------- Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-05-25 17:43:10 +05:30
nguyen binh	46d8b5dadf	fix(profile): reject symlinks in distributions (#25292)	2026-05-25 05:07:58 -07:00
nguyen binh	0d55315c36	fix(backup): skip symlinked files in zip archives (#25289)	2026-05-25 05:07:52 -07:00
Teknium	79799c80f5	test(approval): patch _YOLO_MODE_FROZEN directly in test_yolo_overrides_cron_deny The test set HERMES_YOLO_MODE=1 via monkeypatch.setenv, expecting check_dangerous_command() to honor yolo and bypass cron_mode=deny. But tools.approval._YOLO_MODE_FROZEN is intentionally frozen at module import time (security: prevents prompt-injection runtime escalation). When CI imports the module BEFORE the test sets the env, the frozen value stays False and the yolo bypass never activates. Local runs missed this because the conftest leaked a non-empty HERMES_YOLO_MODE into the import-time env. CI's clean-env path exposed the bug deterministically on test (3) / test (4) shards. Fix: patch the module attribute directly via mock.patch.object so the test simulates process-startup-with-yolo regardless of import order. The behavior under test (yolo bypasses cron_mode=deny for non-hardline commands) is unchanged; the security invariant (_YOLO_MODE_FROZEN can't be set at runtime by skills) is preserved. Reproduced locally with: env -i HOME=$HOME PATH=$PATH python3 -m pytest tests/tools/test_cron_approval_mode.py -o 'addopts=' -v Without the fix: 1 failed, 23 passed. With the fix: 24 passed.	2026-05-25 05:07:49 -07:00
Peter	95848b1cbc	fix(transcription): reject symlinked audio inputs (#10082) * fix(transcription): reject symlinked audio inputs Validation runs before provider selection, so rejecting symbolic-link paths there prevents supported-extension links from being treated as normal audio files. Use os.path.islink to avoid perturbing the existing Path.stat error path and to reject links before resolving targets. Constraint: Keep validation platform-safe and avoid requiring symlink support where unavailable. Rejected: Use Path.is_symlink | it consumes pathlib stat calls and broke the existing stat error regression. Confidence: high Scope-risk: narrow Directive: Keep path hardening in _validate_audio_file before provider dispatch. Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases -q (5 passed) Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases tests/tools/test_transcription_tools.py::TestTranscribeAudioDispatch::test_invalid_file_short_circuits -q (6 passed) Tested: source venv/bin/activate && python -m compileall tools/transcription_tools.py tests/tools/test_transcription_tools.py Tested: git diff --check Not-tested: Full tests/tools/test_transcription_tools.py under .[dev] only; existing faster_whisper optional dependency tests fail with ModuleNotFoundError. * Keep transcription tests independent of optional whisper install The transcription suite mocks faster-whisper directly, so a minimal test stub keeps the branch verifiable in environments where the optional package is not installed. This preserves the existing mock-based coverage without adding a dependency.
Constraint: faster-whisper is an optional local STT dependency and is absent from the current validation environment Rejected: Install faster-whisper just for branch validation | would add heavyweight environment coupling outside the patch scope Confidence: high Scope-risk: narrow Directive: Keep this as a test-only stub unless production import semantics change Tested: pytest tests/tools/test_transcription_tools.py -q --------- Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:45 -07:00
Peter	ee59ef1946	fix: reject read_file symlinks to blocking devices (#10133) * fix: reject read_file symlinks to blocking devices The read_file guard already refused direct device paths such as /dev/zero, but a workspace symlink resolving to one of those devices could still reach the shell-backed read path and hang on wc/head/sed. Keep the literal alias check and add a resolved-path pass so local symlinks to blocked device/fd endpoints are rejected before I/O. Constraint: Preserve literal /dev/stdin handling before terminal-specific realpath resolution Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py tests/tools/test_file_tools.py -q; python -m compileall tools/file_tools.py tests/tools/test_file_read_guards.py; git diff --check Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> * Keep file guard tests off sensitive macOS temp paths The branch now inherits a sensitive-path write guard from upstream main. On macOS, tempfile.mkdtemp() resolves under /private/var/folders, so the new write-path guard fired before the file read dedup assertions could exercise their intended behavior. The tests now create their scratch files inside the worktree temp checkout, outside those system-sensitive prefixes, without changing production behavior. Constraint: Rebased branch must pass the expanded file read guard suite on macOS. Rejected: Loosen the production sensitive-path prefix list | broader behavior change unrelated to this PR. Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py -q --------- Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:38 -07:00
Dakota Secula-Rosell	b7b8bec800	fix(security): block /proc/*/environ, cmdline, maps from file read (#4609) The read_file tool and terminal cat can access /proc/self/environ to recover all process env vars including secrets stripped by the subprocess blocklist. Output redaction partially mitigates (catches known-format tokens) but misses custom/proprietary key formats, especially when values are printed without their key names. Add /proc/*/environ, /proc/*/cmdline, and /proc/*/maps to the blocked device paths in _is_blocked_device(): - /proc/*/environ: leaks full process env (API keys, tokens) - /proc/*/cmdline: leaks command-line args (may contain passwords) - /proc/*/maps: leaks memory layout (ASLR bypass for exploitation) Legitimate /proc reads (cpuinfo, meminfo, uptime, version) remain accessible — the check only blocks per-pid pseudo-files with known sensitive suffixes. Complements PR #4432 (PID namespace isolation for child processes) which prevents children from reading the parent's /proc, but does not prevent the parent process itself from being read via file tools. Partially addresses #4427 Changes: tools/file_tools.py | +6 tests/tools/test_file_read_guards.py | +18 -1 Co-authored-by: dsr-restyn <dsr-restyn@users.noreply.github.com>	2026-05-25 05:07:31 -07:00
Teknium	4909dd84c1	chore(release): map 66773372+Tranquil-Flow@users.noreply.github.com to Tranquil-Flow (PR #27518)	2026-05-25 05:07:11 -07:00
Evi Nova	1b12cd5241	fix(cli): bracketed-paste timeout prevents permanent input freeze (#16263) When the terminal drops the ESC[201~ end mark during a bracketed paste (terminal race, torn write, SSH glitch, macOS sleep/wake), prompt_toolkit's Vt100Parser keeps buffering all later input in _paste_buffer forever. From the user's perspective, the CLI appears frozen — the only recovery was closing the tab/session. This patch monkey-patches Vt100Parser.feed() so that bracketed-paste mode flushes buffered content as a normal BracketedPaste event after 2 seconds without an end marker, then restores normal parsing. Includes 8 regression tests covering normal paste, timeout recovery, torn end marks, and edge cases. Surgical reapply of PR #27518. Original branch was many months stale (1193 files / 172k LOC of unrelated reverts); the substantive ~77 LOC patch in cli.py plus the new 157-line test file were reapplied onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:07:11 -07:00
Teknium	8697471419	test(cli): cover KeyboardInterrupt guard around slash command dispatch 4 tests: KBI during slash command does not set _should_exit; truthy return keeps session alive; falsy return still sets exit (legit /exit path); non-KBI exceptions propagate normally.	2026-05-25 05:06:06 -07:00
ygd58	63d6b9e637	fix(cli): catch KeyboardInterrupt during slash commands to prevent session exit A Ctrl+C during a slow slash command (e.g. /skills browse on a large skill tree, /sessions list against a multi-GB SQLite DB) used to unwind past self.process_command() to the outer prompt_toolkit event loop, which killed the entire session — losing all conversation state. Fix: wrap the slash-command dispatch in try/except KeyboardInterrupt so Ctrl+C aborts the command but the prompt loop continues. Other exceptions still propagate so real bugs aren't silently swallowed. Surgical reapply of PR #5189. Original branch was many months stale (3764 files / 1M+ LOC of unrelated reverts); the substantive ~6 LOC change in cli.py was reapplied by hand onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:06:06 -07:00
Teknium	ee7789e547	chore(release): map simo.kiihamaki@gmail.com to SimoKiihamaki (PR #30773)	2026-05-25 05:06:03 -07:00
simokiihamaki	fae815adc2	fix(cli): prevent /reset and /new freeze on Windows by falling back to stdin prompt On Windows (PowerShell/Windows Terminal), the queue-based modal used for destructive slash command confirmations deadlocks because prompt_toolkit's input channel becomes unresponsive when entered from the process_loop daemon thread. Keystrokes never reach the key bindings, so response_queue.get() blocks until the 120-second timeout expires. Fix: fall back to _prompt_text_input (stdin-based) when: 1. sys.platform == 'win32' — Windows console doesn't support the modal reliably 2. Called from non-main thread — key bindings can't fire from daemon threads 3. self._app is not set — existing behavior for tests/non-interactive This mirrors the thread-aware guard from _prompt_text_input (PR #23454). 9 new regression tests covering Windows detection, non-main thread fallback, macOS/Linux modal preservation, and integration with _confirm_destructive_slash. Fixes #30768 Surgical reapply of PR #30773. Original branch was many months stale (911 files / 146k LOC of unrelated reverts); the substantive ~30 LOC change in cli.py plus the new test file were reapplied onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:06:03 -07:00
Tranquil-Flow	b1adb95038	fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern The ChatGPT Codex backend (chatgpt.com/backend-api/codex) has historically silently dropped certain model requests: the connection is accepted but no stream events are emitted and no error is raised. PR #31967 lowered the implicit stale-call default from 300s to 90s so fallbacks kick in faster, but users still see an opaque "No response from provider for 90s (non-streaming, ...)" message that gives no path forward. This patch adds a narrow heuristic — gpt-5.5 family on the Codex backend via codex_responses api_mode — that substitutes the generic timeout message with actionable text naming the gpt-5.4-codex workaround and pointing at #21444 for symptom history. Changes: - run_agent.py — new ``AIAgent._codex_silent_hang_hint(model=...)`` method. Returns ``None`` for any request that does not match all three guards (codex_responses api_mode, openai-codex provider or chatgpt.com Codex base URL, gpt-5.5-family model name with word-boundary regex anchoring to avoid false-positives on e.g. ``gpt-5.50``). - agent/chat_completion_helpers.py — the non-stream stale-call site consults the hint via ``getattr(...)`` so the call site stays robust if the helper is ever removed or stubbed in tests. Hint is appended to both the ``_emit_status`` warning and the ``TimeoutError`` message so the user sees it in their terminal AND it lands in any retry-loop diagnostics. - tests/run_agent/test_codex_silent_hang_hint.py — 10 regression tests covering positive cases (bare gpt-5.5, vendor-prefixed openai/gpt-5.5, gpt-5.5-codex SKU, model=None fallback to self.model) and negative cases (gpt-5.4-codex workaround, gpt-5.50 false-positive guard, non-codex api_mode, non-codex provider, empty/None model, unrelated models on Codex). Does NOT fix the backend-side issue (that's an upstream OpenAI/ChatGPT problem we cannot patch from here). 
Only converts an opaque timeout into text that names the workaround so users do not have to dig through logs or wait for a forum post to learn what to do. Closes #22046	2026-05-25 04:49:22 -07:00
teknium1	4c64638897	chore(release): map liuhao1024 for PR #20778 salvage	2026-05-25 03:40:47 -07:00
liuhao1024	ba3c450914	fix(security): block read_file on project-local .env files get_read_block_error() only blocked internal Hermes cache files but allowed reading project-local secret-bearing environment files (.env, .env.production, .env.local, etc.) through both read_file and ACP fs/read_text_file paths. Add a basename deny set for common secret-bearing .env variants. .env.example remains readable as documentation. Fixes #20734	2026-05-25 03:40:47 -07:00
teknium1	51c913caf7	chore(release): map dusterbloom for PR #25726 salvage	2026-05-25 03:40:47 -07:00
dusterbloom	79fc92e9cb	fix(security): tighten .env file permissions to 0600 at all creation sites .env holds API keys and secrets. Multiple creation sites used `cp` / `touch` / `shutil.copy2` which obey the process umask — commonly 0o022, leaving the file at 0o644 (world-readable). Apply chmod 0o600 explicitly at every site that creates or copies .env. Sites covered: - docker/stage2-hook.sh: after the seed_one '.env' call, applied unconditionally (not just on first-seed) so a host-mounted .env with loose perms gets tightened on every container restart - hermes_cli/doctor.py: 'hermes doctor --fix' touches an empty .env when missing - hermes_cli/profiles.py: 'hermes profile create --clone' copies .env from the source profile; shutil.copy2 preserves source mode, so a source .env at 0o644 was being cloned into 0o644 - setup-hermes.sh: in-tree setup script's cp .env.example .env path, plus the already-exists branch (mirror of install.sh which already chmods 600 unconditionally on line 1442) scripts/install.sh was NOT changed — it already chmod 600's the .env unconditionally after the create/already-exists branches (line 1442). Salvaged from PR #25726 by @dusterbloom. The docker/entrypoint.sh portion of the original PR was dropped because main switched to an s6-overlay shim — the .env creation logic moved to stage2-hook.sh, which is where the chmod now lives. Closes #25497 (subset — install.sh + setup-hermes.sh) and #8448 (subset — install.sh only) as superseded. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 03:40:47 -07:00
Rodrigo	4cb3eb03c7	fix(approval): harden YOLO bypass, LLM parsing, auto-approve audit, pipe pattern (#23835) * fix(approval): harden YOLO bypass, LLM parsing, auto-approve audit, pipe pattern - BUG-009 (CRITICAL): freeze HERMES_YOLO_MODE at module import via _YOLO_MODE_FROZEN; prevents skills/prompt-injection from calling os.environ["HERMES_YOLO_MODE"]="true" at runtime to bypass all checks - BUG-002 (HIGH): replace substring "APPROVE" in answer with exact answer == "APPROVE" in _smart_approve; prompt already requests exactly one word, substring match was exploitable via verbose LLM responses - BUG-001 (MEDIUM): add logger.warning for every dangerous command that auto-approves in non-interactive non-gateway context; makes silent approvals visible in audit logs without breaking script behavior - BUG-008 (LOW): expand curl/wget pipe pattern to cover | /bin/bash and | bash -c variants, not just | sh / | bash Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(approval): add missing is_truthy_value import + fix yolo test patches _YOLO_MODE_FROZEN uses is_truthy_value() from utils — import was missing. Tests that set HERMES_YOLO_MODE via monkeypatch.setenv() no longer work because the value is frozen at import time; update them to patch the module-level flag directly via monkeypatch.setattr(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 03:35:33 -07:00
Dennis Vorobyov	3ab7e2aa91	harden(env_passthrough): apply GHSA-rhgp-j443-p4rf filter to config.yaml path (#27794) register_env_passthrough() (the skill-declared path) filters out names in _HERMES_PROVIDER_ENV_BLOCKLIST and logs a warning citing GHSA-rhgp-j443-p4rf. _load_config_passthrough() (the config.yaml path) did not. Both feed the same is_env_passthrough() allowlist that local.py and code_execution_tool.py consult before stripping a variable from the child env. A skill that wanted to leak ANTHROPIC_API_KEY or OPENAI_API_KEY into execute_code could no longer self-register the name (the GHSA fix blocks it), but the same outcome was still reachable by asking the operator to add the name to terminal.env_passthrough in config.yaml, or by any in-process actor with write access to ~/.hermes/config.yaml. Apply the same _is_hermes_provider_credential filter inside _load_config_passthrough, mirroring the skill-path warning so operators see the same explanation. Non-Hermes API keys (TENOR_API_KEY, NOTION_TOKEN, etc.) are unaffected since they are not in the blocklist.	2026-05-25 03:35:23 -07:00
Teknium	0219b0408a	perf(cli): cut hermes startup 63% — flip head-to-head vs codex (#31968) * perf(bitwarden): persist secret-fetch cache across CLI invocations Every `hermes` invocation paid a ~380ms tax for `bws secret list` to Bitwarden Secrets Manager because the existing cache was in-process only. Back-to-back `hermes chat -q`, gateway-spawned agents, and cron-launched runs all re-fetched. Adds a disk-persisted L2 cache at `<hermes_home>/cache/bws_cache.json` (mode 0600, never contains the access token — only the SHA-256 fingerprint prefix). Same TTL as the in-process cache. Read on miss, write on bws success, ignored on key mismatch / corruption / expiry. Measured on a startup profile: load_hermes_dotenv() cold: 372ms → warm (disk cache hit): 20ms End-to-end `hermes --version` cold→warm: 666ms → ~295ms. In a hermes-vs-codex benchmark across 11 single- and multi-turn tasks (framework overhead = wall − llm − tool_exec, median over 3 trials), cohort: before / after / saved: single-turn (median): 2.96s / 2.31s / -0.65s; multi-turn (5-turn): 9.40s / 8.95s / -0.45s (≈0.3s/turn). Hermes now wins head-to-head on 6/11 tasks vs codex (was 4/11 before). The remaining ~0.6s single-turn delta is mostly Python's own import cost in hermes_cli.main, which is a separate optimization. * perf(cli): lazy-load model catalog + dedupe config.yaml reads at startup Two import-time wins on top of the bws disk-cache fix: 1. Lazy-load `hermes_cli.models._PROVIDER_MODELS` via PEP 562 module-level `__getattr__`. The catalog is ~55ms of work that was eagerly imported on every CLI invocation (line 4557 `if not _is_termux_startup_environment(): from hermes_cli.models import _PROVIDER_MODELS`). Audit showed every internal call site already does its own function-local import; only test code reads `hermes_cli.main._PROVIDER_MODELS` as a module attribute, and __getattr__ keeps that working transparently.
First access triggers the import once and caches the result on the module via `globals()[name] = ...`, so subsequent reads are dict lookups. 2. Dedupe the double config.yaml read in the top-of-module bootstrap. Previously: one raw yaml.safe_load for the `security.redact_secrets` bridge, then a separate full `load_config()` (with deep-merge) for `network.force_ipv4`. Both keys come from the same file. Merged into one raw yaml load. Combined with the bws cache fix in the previous commit: hermes --version wall time: original (cold): 666 ms after bws fix (warm): 295 ms after lazy-load + dedupe: 228 ms (-67 ms additional, -66% from original) Tests: - tests/hermes_cli/test_api_key_providers.py: 173/173 pass (lazy __getattr__ correctly handles `from hermes_cli.main import _PROVIDER_MODELS`) - tests/test_ipv4_preference.py + tests/hermes_cli/test_redact_config_bridge.py + tests/agent/test_redact.py: 93/93 pass (dedupe preserves both bridges) - tests/test_bitwarden_secrets.py + env_loader tests: 49/49 pass	2026-05-25 03:06:39 -07:00
teknium1	c0169496d0	chore(release): map jfuenmayor + Jiahui-Gu + YLChen-007 + AdamPlatin123 + waefrebeorn for S11 cluster salvage	2026-05-25 01:55:59 -07:00
waefrebeorn	5faea3f618	fix(file_tools): reject '..' traversal in V4A patch headers V4A patch '* Update File:', '* Add File:', '* Delete File:' headers come from patch CONTENT, not the explicit `path=` argument. That makes them attacker-influenceable through skill content, web extract output, prompt injection, and other surfaces the agent processes. Headers like '* Update File: ../../../etc/shadow' would resolve relative to the agent's cwd; in deployment configurations where that cwd is deep enough to land outside Hermes' protected paths, the write could land somewhere the agent operator did not intend. Reject any V4A header containing a '..' path component before applying the patch. The explicit `path=` argument on patch_tool is UNCHANGED — the agent legitimately uses '..' there (e.g. `patch path='../other_module/x.py'` from a worktree dir is normal cross-module editing). Regression tests: V4A Update header with traversal rejected, V4A Add header with traversal rejected, patch_v4a never invoked when rejection fires. Salvaged from PR #29395 by @waefrebeorn. The original PR added has_traversal_component as a blanket reject on read_file_tool, write_file_tool, patch_tool's explicit path, and search_tool — that would break legitimate agent operation where '..' is normal. Also dropped the over-eager skills_guard pattern additions (pickle.loads/marshal.loads/ctypes.CDLL/importlib at high/critical severity would false-positive on legit data-science and FFI skills). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 01:55:59 -07:00
AdamPlatin123	00bd24e27c	fix(security): expand memory content scanning patterns to parity with skills guard (#9151) Expand _MEMORY_THREAT_PATTERNS from 13 to 24 regex patterns and align _INVISIBLE_CHARS with skills_guard.py (10 → 17 characters). Key changes: - Add multi-word bypass prevention (?:\w+\s+)* to injection patterns - Add missing injection patterns: role_pretend, leak_system_prompt, remove_filters, fake_update, translate_execute, html_comment_injection, hidden_div - Add exfiltration patterns: send_to_url, context_exfil - Add persistence patterns: agent_config_mod, hermes_config_mod (both require modification-verb prefix to avoid false positives on mere mentions of config filenames) - Add hardcoded secret detection pattern - Add role_hijack precision fix: require article after "now" to avoid blocking "you are now ready/connected/set up" etc. - Expand invisible unicode set with directional isolates (U+2066-2069) and invisible math operators (U+2062-2064) Test coverage expanded from ~8 to ~30 scan tests including dedicated false-positive regression tests for all precision-sensitive patterns. Known limitations (deferred to follow-up PRs): - prompt_builder.py and cronjob_tools.py still use older pattern sets - No semantic/LLM-based scanning (regex-only approach) - No cross-entry or cross-store analysis	2026-05-25 01:51:53 -07:00
Edward-x	7ebebfbb8d	Harden Skills Guard multi-word prompt patterns (#26852) Co-authored-by: openhands <openhands@all-hands.dev>	2026-05-25 01:51:27 -07:00
JiahuiGu	0a2ee71ccc	fix(skill): guard pickle.loads in darwinian-evolver show_snapshot with explicit flag (#29276) show_snapshot.py unpickled a user-supplied path unconditionally. pickle.loads is equivalent to arbitrary code execution, so a snapshot from an untrusted source = RCE. Require an explicit --i-trust-this-file acknowledgement before calling pickle.loads, and emit a stderr warning when proceeding. Co-authored-by: Jiahui-Gu <jiahuigu@users.noreply.github.com>	2026-05-25 01:51:21 -07:00
Jorge Fuenmayor	93660643a6	fix: harden skill trust source matching (#31229) Co-authored-by: gaia <gaia@gaia.local>	2026-05-25 01:51:15 -07:00
Kasun Athaudahetti	2d422720b5	fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults Codex / Responses-API requests had three latent timeout bugs that combined into the long silent hangs reported on #21444: 1. The non-stream stale-call detector estimated context tokens from ``api_kwargs["messages"]`` only. Codex / Responses-API payloads carry their conversational load in ``input`` (with ``instructions`` and ``tools``), so every Codex turn logged ``context=~0 tokens`` and the detector never applied its >50k / >100k tier bumps. 2. ``providers.<id>.request_timeout_seconds`` was silently dropped on the main Codex path. The chat_completions path and the auxiliary Codex adapter both forwarded it; the main path skipped it through three places (``build_api_kwargs``, ``ResponsesApiTransport.build_kwargs``, ``_preflight_codex_api_kwargs``). 3. The streaming stale detector had the same payload-shape bug for ``codex_responses`` requests, which route through the non-streaming detector (it's the path that emits the user-facing "No response from provider for 300s (non-streaming, ...)" warning that reporters keep pasting). This commit: - Adds ``estimate_request_context_tokens`` in ``chat_completion_helpers``, used by both the non-stream and stream detectors. Handles ``messages`` (Chat Completions), ``input + instructions + tools`` (Responses API), bare lists, and an unknown-dict fallback. - Forwards ``timeout`` through ``ResponsesApiTransport.build_kwargs`` and ``_preflight_codex_api_kwargs`` (with guards against zero/negative/inf/bool values), and wires ``_resolved_api_call_timeout()`` into the Codex branch of ``build_api_kwargs``. - Lowers the implicit non-stream stale defaults so fallback providers kick in faster when upstream stalls: * base 300s -> 90s * >50k 450s -> 150s * >100k 600s -> 240s These only apply when the user has not set ``providers.<id>.stale_timeout_seconds`` or ``HERMES_API_CALL_STALE_TIMEOUT``. Explicit config still wins. 
- Adds regression tests for the estimator shapes, the new defaults, the context-tier scaling, transport timeout pass-through, and preflight timeout pass-through / rejection of invalid values. Closes #21444 Supersedes #21652 #24126 #31855 Co-authored-by: Hoang V. Pham <26063003+hehehe0803@users.noreply.github.com>	2026-05-25 01:47:55 -07:00
Teknium	76135b329d	docs(i18n): translate all docs into Simplified Chinese (zh-Hans) (#31942) Translates the full English docs corpus (335 files) into Simplified Chinese under website/i18n/zh-Hans/. Combined with PR #31895 (cross-locale link fix), the 简体中文 locale toggle now serves a complete Chinese site with working cross-page navigation. Pipeline: - Claude Sonnet 4.6 via OpenRouter, 8-way concurrent - Preserves frontmatter keys, code blocks, MDX/JSX, link URLs, brand names, and technical jargon (prompt/token/hook/MCP/ACP/etc.) - Translates only frontmatter title/description and prose - Two largest files (configuration.md 93KB, research-paper-writing.md 107KB) retried with 64K max_tokens after initial fence-drift - 3 manual post-fixes for MDX edge cases the model didn't escape: < in optional-skills-catalog table, double-quotes in an alt= tag, and a bare URL adjacent to a full-width period Cost: ~$30 total (Sonnet 4.6 input $3/M + output $15/M). Verified `npm run build` succeeds for both en and zh-Hans locales, no double-prefixed /docs/zh-Hans/docs/ URLs in rendered output, all in-page navigation resolves correctly. Translations are machine-generated and may need human review on specific pages — but they're an enormous improvement over the previous state (3 zh-Hans pages out of 335).	2026-05-25 01:47:38 -07:00
Teknium	ffe11c14ec	test(cli): cover quiet-mode resume status lines routed to stderr 4 tests: session-not-found in quiet mode -> stderr; in full mode -> stdout (unchanged); resumed banner in quiet mode -> stderr; has-no-messages in quiet mode -> stderr.	2026-05-25 01:47:12 -07:00
Michel Belleau	25295e7ac9	fix(cli): redirect resume status lines to stderr in quiet mode (#11793) When 'hermes chat --quiet --resume <id> -q "..."' is used, three status messages were written to stdout via ChatConsole / _cprint: - '↻ Resumed session <id> (N user messages, M total messages)' - 'Session <id> found but has no messages. Starting fresh.' - 'Session not found: <id>' / usage hint This polluted the machine-readable stdout that automation wrappers capture with $(...), making it impossible to cleanly separate the agent's answer from the resume banner. Fix: detect quiet mode via tool_progress_mode == 'off' and route the three resume status messages to stderr (as plain text, matching the existing stderr convention for session_id). Interactive mode is unchanged — it still uses the Rich-rendered path through ChatConsole. Surgical reapply of PR #11868. Original branch was stale against current main; reapplied onto current cli.py by hand with original authorship preserved via --author.	2026-05-25 01:47:12 -07:00
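The quiet-mode routing from 25295e7ac9 might look something like the sketch below. `emit_resume_status` and its `render` parameter are hypothetical names; only the `tool_progress_mode == 'off'` check and the stdout/stderr split are taken from the commit message.

```python
import sys
from typing import Callable


def emit_resume_status(
    message: str,
    tool_progress_mode: str,
    render: Callable[[str], None] = print,
) -> None:
    """Send a resume status line to stderr in quiet mode, else render normally.

    `render` stands in for the Rich-backed console path used interactively
    (an assumption for this sketch; the default simply prints to stdout).
    """
    if tool_progress_mode == "off":
        # Quiet mode: keep stdout clean for wrappers that capture it with $(...).
        print(message, file=sys.stderr)
    else:
        render(message)
```

A wrapper that captures stdout then receives only the agent's answer, while the resume banner stays visible on the terminal through stderr.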
Teknium	11c40d6a42	test+polish(compression): pin anti-thrash gate and gateway session_id persistence Follow-up to @someaka's fix. Polish: - Drop the redundant `_preflight_tokens >= threshold_tokens` clause. `should_compress(tokens)` already short-circuits when tokens < threshold, so the explicit comparison was dead code on the True branch. Tests: - Preflight: pin that should_compress() is called (anti-thrash has a vote). Mocks should_compress to return False even with tokens past the raw threshold and asserts no compression runs — exact bug shape from #29335. - Gateway: AST scan of gateway/run.py asserts every `session_entry.session_id = ...` assignment is followed by a `session_store._save()` call within the same block. Three sites mutate the session_id after compression; all three must persist or the next turn loads the pre-compression transcript and re-loops. Empirically verified the test catches the bug (drops the new _save() line → red). AUTHOR_MAP: - Map ed@bebop.crew -> someaka so the salvaged commit resolves to @someaka in release notes.	2026-05-25 01:44:46 -07:00
Radical Edward	3914089d52	fix(compression): 3-line fix for infinite compression loop (#29335) Three compounding root causes: A) run_conversation() result dict missing session_id — gateway's dead-code guard at gateway/run.py:8700 never triggers B) preflight compression bypasses should_compress() anti-thrashing — re-triggers every turn when tool schemas dominate token budget C) gateway updates session_entry.session_id in memory but doesn't persist via session_store._save() Fixes: #29335	2026-05-25 01:44:46 -07:00
Teknium	222a3a9c19	test(cli): cover exit resume hint -p flag across profiles 5 tests: default/custom profiles emit no -p; named profile emits -p <name> on both --resume and -c hints; lookup failure falls back gracefully.	2026-05-25 01:41:54 -07:00
CK iRonin.IT	2a2cef4ac7	fix: include -p profile flag in exit resume hint Session IDs are profile-constrained, so the resume hint needs to include the active profile for multi-profile users. Without this, copying the hint from a non-default profile fails to resume the correct session. Before: hermes --resume 20260414_063228_c1240e After: hermes --resume 20260414_063228_c1240e -p dev Also includes -p on the resume-by-title hint. Skipped for 'default' and 'custom' profiles (no -p needed). Surgical reapply of PR #9652. Original branch was stale against current main (~6 months); reapplied onto current cli.py by hand with original authorship preserved.	2026-05-25 01:41:54 -07:00
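A minimal sketch of the hint construction described in 2a2cef4ac7, assuming a hypothetical `build_resume_hint` helper; the real change lives in cli.py and may be structured differently.

```python
from typing import Optional


def build_resume_hint(session_id: str, profile: Optional[str]) -> str:
    """Build the exit-time resume hint, adding -p only for named profiles."""
    hint = f"hermes --resume {session_id}"
    # 'default' and 'custom' need no -p flag, per the commit message.
    if profile and profile not in ("default", "custom"):
        hint += f" -p {profile}"
    return hint
```

For example, `build_resume_hint("20260414_063228_c1240e", "dev")` yields the "After" form quoted in the commit message.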
teknium1	d3ffbc6409	feat(stt): add stt.providers.<name> command-provider registry Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds, SenseVoice, curl pipelines — become an STT backend with zero Python. Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved untouched via the built-in local_command path) and the register_transcription_provider() Python plugin hook also shipped in this PR. Resolution order (mirrors TTS exactly): 1. Built-in (local, local_command, groq, openai, mistral, xai) → native handler. Always wins. 2. stt.providers.<name>: type: command → command-provider runner. 3. Plugin-registered TranscriptionProvider → plugin dispatch. 4. No match → 'No STT provider available'. Files ----- - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained; added _resolve_command_stt_provider_config, _transcribe_command_stt, and local helpers for template rendering, shell-quote context, and process-tree termination. Helpers are documented as mirrors of their tts_tool.py counterparts (kept local to avoid cross-tool private import). Wire-in is one insertion point in transcribe_audio() after the xai elif and before the plugin dispatcher. Plugin dispatcher additionally defensively short-circuits when a same-name command config exists (command-wins-over-plugin invariant). 
- tests/tools/test_transcription_command_providers.py: 50 new tests covering resolution (builtin precedence, type/command gating, case-insensitive lookup, legacy stt.<name> back-compat), helpers (timeout fallback, format validation, iter, has-any), template rendering (shell-quote contexts, doubled-brace preservation), end-to-end via _transcribe_command_stt (output_path read, stdout fallback, timeout, nonzero exit envelope, model override, language precedence), and dispatcher integration via the real transcribe_audio() including command-wins-over-plugin and builtin-shadow-rejection. - tests/plugins/transcription/check_parity_vs_main.py: extended from 10 to 13 scenarios. New cases: command-provider-installed, command-vs-plugin-same-name (verifies command wins precedence), explicit-openai-with-command-shadow (verifies built-in wins). Adds command_provider dispatch_kind detection via transcript prefix (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished from plugin scenarios even when sharing a provider name. - website/docs/user-guide/features/tts.md: new 'STT custom command providers' section symmetric to the TTS section — example config, placeholder grammar table (input_path / output_path / output_dir / format / language / model), transcript-read-back semantics (file first, then stdout fallback), optional keys table, behavior notes, security note. Updated 'Python plugin providers (STT)' to include the new 'When to pick which (STT)' decision table and updated resolution-order section (now 4 layers instead of 3). Verification ------------ 189/189 STT targeted tests + 50/50 new command-provider tests pass. Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/ 8623/8623 — zero regressions across 14,199 tests. Parity harness: 13 scenarios, 9 OK + 4 expected diffs (no_provider_error → plugin, plugin_unavailable, command_provider × 2). 
E2E live-verified in an isolated HERMES_HOME with a real .wav file: command: → dispatched to stt.providers.my-fake-cli plugin: → dispatched to registered TranscriptionProvider command-wins-over-plugin: → command provider beats same-name plugin builtin-wins-over-command: → built-in OpenAI handler fires; stt.providers.openai: type: command does NOT hijack it.	2026-05-25 01:41:19 -07:00
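The four-layer resolution order from d3ffbc6409 can be illustrated with a small dispatcher sketch. `resolve_stt_dispatch` and its return labels are hypothetical, and the real dispatcher in tools/transcription_tools.py does considerably more (availability gates, error envelopes); only the precedence and the built-in names come from the commit messages.

```python
from typing import Any, Dict, Iterable

# Built-in names as listed in the commit message.
BUILTIN_STT_PROVIDERS = frozenset(
    {"local", "local_command", "groq", "openai", "mistral", "xai"}
)


def resolve_stt_dispatch(
    name: str,
    command_providers: Dict[str, Any],
    plugin_providers: Iterable[str],
) -> str:
    """Return which layer would handle `name`: builtin, command, plugin, or none."""
    key = name.lower()  # lookup is case-insensitive
    if key in BUILTIN_STT_PROVIDERS:
        return "builtin"  # 1. built-ins always win
    commands = {k.lower(): v for k, v in command_providers.items()}
    cfg = commands.get(key)
    if isinstance(cfg, dict) and cfg.get("type") == "command" and cfg.get("command"):
        return "command"  # 2. stt.providers.<name> with type: command
    if key in {p.lower() for p in plugin_providers}:
        return "plugin"  # 3. plugin-registered provider
    return "none"  # 4. 'No STT provider available'
```

Checking built-ins first is what keeps a `stt.providers.openai: type: command` entry from hijacking the native OpenAI handler.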
kshitijk4poor	2cd952e110	feat(stt): add register_transcription_provider() plugin hook Add an opt-in Python plugin surface for speech-to-text backends, mirroring the TTS hook pattern. New backends (OpenRouter, SenseAudio, Gemini-STT, custom proprietary engines) can be implemented as plugins without modifying tools/transcription_tools.py. Built-ins always win -------------------- The 6 built-in STT providers (local/faster-whisper, local_command, groq, openai, mistral, xai) keep their native handlers. Plugins attempting to register under a built-in name are rejected at registration time with a warning and re-checked defensively at dispatch. Resolution order ---------------- 1. stt.provider matches a built-in → built-in dispatch (unchanged) 2. stt.provider matches a registered plugin → a. if plugin.is_available() returns False → unavailability envelope identifying the plugin (not the generic "No STT provider" message — the user explicitly opted into this plugin) b. otherwise plugin.transcribe() with model + language forwarded from stt.<provider>.{model,language} config 3. No match → legacy "No STT provider available" error (unchanged) Per-provider config namespace ----------------------------- Plugins read their config from stt.<provider> in config.yaml, mirroring how built-ins read stt.openai.model / stt.mistral.model. The dispatcher forwards `model` and `language` from this section. Caller's explicit `model=` argument overrides the config-set model. 
Files ----- - agent/transcription_provider.py: TranscriptionProvider ABC - agent/transcription_registry.py: register/get/list providers, built-in shadow guard, _reset_for_tests - hermes_cli/plugins.py: register_transcription_provider() on PluginContext - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset, _dispatch_to_plugin_provider() with availability gate, wire-in after xai branch and before "No STT provider" error - tests/agent/test_transcription_registry.py: 27 tests - tests/hermes_cli/test_plugins_transcription_registration.py: 3 tests - tests/tools/test_transcription_plugin_dispatch.py: 28 tests (covering built-in short-circuit, plugin dispatch, exception envelope, non-dict guard, availability gate, language forwarding) - tests/plugins/transcription/check_parity_vs_main.py: 10-scenario subprocess-pinned parity harness vs origin/main - website/docs/user-guide/features/{tts,plugins}.md: docs Behavior parity --------------- 10 scenarios, 8 OK + 2 expected DIFFs: no_provider_error → plugin (plugin-installed scenario) no_provider_error → plugin_unavailable (plugin-installed-unavailable scenario; PR returns cleaner envelope) Zero behavior change for users not opting into a plugin. Issue follow-up to #30398.	2026-05-25 01:41:19 -07:00
Teknium	2e0ac31a72	chore(release): map claw@openclaw.ai to wanwan2qq (PR #10215)	2026-05-25 01:33:32 -07:00
Teknium	4fbdf0e893	test(cli,gateway): cover bracket-stripping and gateway session-ID lookup - CLI: bracketed/quoted target resolves; mismatched single bracket passes through unchanged. - Gateway: bracketed session ID resolves; bare untitled session ID resolves via get_session() fallback.	2026-05-25 01:33:32 -07:00
Claw Assistant	1c7a783c42	fix(cli,gateway): strip outer brackets/quotes from /resume args + accept session IDs in gateway The /resume usage hint shows '<session_id_or_title>' which a few users have typed verbatim, including the angle brackets. Strip outer <>, [], "", and '' from the argument before lookup so '/resume <abc123>' works the same as '/resume abc123'. Mirrors the new bracket-stripping in the CLI handler. Also let the gateway resolve a bare session ID. Previously the gateway only called resolve_session_by_title, so '/resume <session_id>' always returned 'Session not found' even for valid IDs. Try get_session() first, fall back to title resolution second. Surgical reapply of PR #10215 (branch was based on a many-months-old main and reverted ~3100 unrelated files; original commit by claw@openclaw.ai preserved via --author).	2026-05-25 01:33:32 -07:00
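The bracket-stripping and ID-then-title lookup from 1c7a783c42 could be sketched as below. Both function names are hypothetical, and `get_session` / `resolve_by_title` are passed in as plain callables standing in for the gateway's real lookups.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Opening delimiter -> required closing delimiter.
_OUTER_PAIRS = {"<": ">", "[": "]", '"': '"', "'": "'"}


def strip_outer_delimiters(arg: str) -> str:
    """Strip one matching outer pair of <>, [], "" or '' from a /resume argument."""
    arg = arg.strip()
    if len(arg) >= 2 and _OUTER_PAIRS.get(arg[0]) == arg[-1]:
        return arg[1:-1].strip()
    return arg  # mismatched or missing delimiters pass through unchanged


def resolve_resume_target(
    arg: str,
    get_session: Callable[[str], Optional[T]],
    resolve_by_title: Callable[[str], Optional[T]],
) -> Optional[T]:
    """Try the argument as a session ID first, then fall back to title lookup."""
    target = strip_outer_delimiters(arg)
    return get_session(target) or resolve_by_title(target)
```

Only a matching pair is removed, so an argument such as `<abc123` with a single stray bracket is looked up as typed.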
Teknium	920b350e57	test(auth): align copilot-remove test with borrowed-credential policy (#31416) PR #31416 (avoid persisting borrowed credential secrets) added sanitize_borrowed_credential_payload, which strips access_token from any auth.json pool entry whose (provider, source) isn't in the _PERSISTABLE_PROVIDER_SOURCES allowlist. (copilot, gh_cli) is borrowed (not in the allowlist), so the test fixture's pre-seeded access_token now gets stripped at load_pool() time, leaving the pool empty. resolve_target('1') then fails with 'No credential #1. Provider: copilot.' Fix: align the test with the new contract. At runtime, copilot tokens are hydrated by resolve_copilot_token() — mock that path so the pool gets an entry the test can remove. The behavior under test (suppression of gh_cli + env variants on remove) is unchanged. CI repro on origin/main HEAD; reproduced locally with stock checkout.	2026-05-25 01:23:31 -07:00
Teknium	9c77a0c3ce	fix(plugins): widen masked secret prompt to plugin setup wizards Extend PR #31716 to plugin setup paths that were also using bare getpass.getpass(): hindsight (4 sites), honcho, simplex, line. Same mechanical swap onto hermes_cli.secret_prompt.masked_secret_prompt.	2026-05-25 01:20:33 -07:00
helix4u	ec4d6f1823	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
Glen Workman	d952b377aa	fix: add cron API provenance logging (#24889) Co-authored-by: sgtworkman <178342791+sgtworkman@users.noreply.github.com>	2026-05-25 01:15:56 -07:00
teknium1	92d91365e7	chore(release): map zapabob for PR #29826 salvage	2026-05-25 01:15:24 -07:00
zapabob	2c3ca475c0	fix(cron): reject id mutation + validate output paths under OUTPUT_DIR Two defense-in-depth fixes on cron output path handling: 1. cron/jobs.py:update_job() rejects mutation of the immutable 'id' field (raises ValueError). Dashboard PUT /api/cron/jobs/{id} converts this to HTTP 400. Without this, an attacker who can reach the update endpoint could rename a job's id to '../escape' and move its output directory outside OUTPUT_DIR. 2. cron/jobs.py:_job_output_dir() validates job IDs before composing paths: rejects '.', '..', '/', '\\', absolute paths, and Windows drive prefixes. Used by save_job_output() and remove_job() so legacy unsafe IDs (from before this guard) fail closed rather than half-applying a shutil.rmtree or output write outside the sandbox. Tests: - update_job rejects {'id': '../escape'} without renaming - remove_job(legacy '../escape' id) raises ValueError without deleting files outside OUTPUT_DIR or removing the job from the store - save_job_output rejects '..', './escape', 'nested/escape', absolute paths - dashboard PUT /api/cron/jobs/{id} with {'id': '../escape'} returns 400, job list unchanged Salvaged from PR #29826 by @zapabob. Simplified implementation: - Dropped a 23-line _validate_job_output_id() helper using Path.parts semantics. The inline check (path separators + dot-components + is_absolute) is shorter and behaviorally identical. - Dropped the secondary OUTPUT_DIR.resolve()/relative_to() check — redundant once we reject any path separator at the input boundary. - Dropped the _docs/2026-05-21_cron-output-path-hardening_codex.md planning artifact (we don't check planning docs into the repo). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 01:15:24 -07:00
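The inline job-ID check described in 2c3ca475c0 (path separators, dot-components, absolute paths, Windows drive prefixes) might be sketched like this. `job_output_dir` here is a stand-in for the private `_job_output_dir()` and takes the output directory as a parameter, which is an assumption made to keep the example self-contained.

```python
from pathlib import Path, PureWindowsPath


def job_output_dir(output_dir: Path, job_id: str) -> Path:
    """Compose a job's output directory, failing closed on unsafe IDs."""
    unsafe = (
        not job_id
        or job_id in (".", "..")
        or "/" in job_id
        or "\\" in job_id
        or Path(job_id).is_absolute()
        or bool(PureWindowsPath(job_id).drive)  # e.g. 'C:evil'
    )
    if unsafe:
        raise ValueError(f"unsafe cron job id: {job_id!r}")
    return output_dir / job_id
```

Because any path separator is rejected at the input boundary, a valid ID can only name a direct child of the output directory, which is why the commit drops the extra resolve()/relative_to() check.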
teknium1	0c3e34e298	chore(release): map Schrotti77 for PR #25786 salvage	2026-05-25 01:09:54 -07:00
Schrotti77	9863a07af6	fix(cron): layer agent.disabled_toolsets onto cron baseline (#25752) The bug: cron/scheduler.py:_resolve_cron_enabled_toolsets returns an LLM-supplied per-job enabled_toolsets verbatim. The disabled_toolsets passed to AIAgent was a hardcoded [cronjob, messaging, clarify] that ignored agent.disabled_toolsets from config.yaml. An LLM could call cronjob(action='add', enabled_toolsets=['terminal','file'], prompt='...') and the cron-spawned agent would receive terminal+file even when the operator had globally disabled them. Fix: new _resolve_cron_disabled_toolsets() helper that ALWAYS layers agent.disabled_toolsets on top of the cron baseline. AIAgent's disabled_toolsets takes precedence over enabled_toolsets, so this stops the bypass regardless of what the per-job override contains. This is the disabled-side fix. Three concurrent PRs (#25842, #25815, #25780) proposed intersection-side variants on _resolve_cron_enabled_toolsets; this fix is more robust because it stops the leak at the precedence boundary AIAgent itself enforces, not at a layer above. Regression test reproduces the issue's PoC exactly: config.yaml has agent.disabled_toolsets=[terminal,file]; cron job has enabled_toolsets=[web,terminal,file]; assertion: AIAgent receives disabled_toolsets containing terminal AND file. Salvaged from PR #25786 by @Schrotti77. Simplified the implementation: dropped a 23-line _normalize_toolset_list() helper (handled str/tuple/set/garbage input shapes) in favor of the existing convention (agent_cfg.get('disabled_toolsets') or []) used elsewhere in the codebase. YAML always parses these as lists; the elaborate normalizer was theatre for shapes we never produce. Closes #25752 Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 01:09:54 -07:00
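The layering in 9863a07af6 can be sketched as follows. The function names and the `effective_toolsets` helper are illustrative assumptions; the baseline list, the `agent_cfg.get('disabled_toolsets') or []` convention, and the disabled-over-enabled precedence are taken from the commit message.

```python
from typing import Any, Dict, List

# Cron baseline named in the commit message.
CRON_BASELINE_DISABLED = ["cronjob", "messaging", "clarify"]


def resolve_cron_disabled_toolsets(agent_cfg: Dict[str, Any]) -> List[str]:
    """Always layer the operator's disabled toolsets on top of the cron baseline."""
    merged = list(CRON_BASELINE_DISABLED)
    for name in agent_cfg.get("disabled_toolsets") or []:
        if name not in merged:
            merged.append(name)
    return merged


def effective_toolsets(enabled: List[str], disabled: List[str]) -> List[str]:
    """Model the precedence rule: disabled wins over enabled (hypothetical helper)."""
    return [name for name in enabled if name not in disabled]
```

With `agent.disabled_toolsets=[terminal,file]` and a per-job `enabled_toolsets=[web,terminal,file]`, only `web` survives, matching the PoC in the regression test.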
teknium1	a6b0414ea0	feat(providers): extend openai-api with live /v1/models fetch + gpt-5.5-pro Follow-up on top of @jacevys' PR #21437 cherry-pick: - _provider_model_ids() now also matches normalized == 'openai-api' for the live /v1/models fetch path, so users see the full catalog instead of just the curated list. - Add gpt-5.5-pro and gpt-5.3-codex to the curated list for parity with the existing 'openai' table (used as fallback when /v1/models fails). - Add scripts/release.py AUTHOR_MAP entry for jacevys so CI doesn't block the salvage PR.	2026-05-25 00:59:53 -07:00

9515 commits