hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-31 06:51:29 +00:00

Author	SHA1	Message	Date
Teknium	46c1ae8b24	fix(tests): four pre-existing flakes from the security cluster merge (#32072 ) All four failures were broken by the security cluster (#10082 / #10133 / #4609 / symlink-reject batch) merging on May 25. They were red on origin/main HEAD when #32042 and #32061 ran, gating PRs that touched unrelated code. 1) tests/hermes_cli/test_update_zip_symlink_reject.py test_update_via_zip_accepts_normal_member called the real _update_via_zip without sandboxing PROJECT_ROOT — so the function's shutil.copytree() actually copied the fake README from the test ZIP over the real repo's README.md, which then made test_readme_mentions_powershell_installer fail in any test run that happened to pick this test up earlier. Mock PROJECT_ROOT to an isolated tmp_path / install_dir, stub subprocess so pip/uv reinstall doesn't actually run, and assert the fake README lands in the sandbox (not the real tree). 2) tests/tools/test_windows_native_support.py test_readme_mentions_powershell_installer was the victim of (1) — nothing wrong with the test itself, the fix in (1) clears it. 3) tests/tools/test_file_read_guards.py test_proc_fd_other_not_blocked called _is_blocked_device('/proc/self/fd/3') expecting False. But _is_blocked_device runs realpath() and on pytest xdist workers fd 3 happens to be dup'd to /dev/urandom (because the worker subprocess inherits open fds from pytest's collection pipe machinery). Switch to the lower-level _is_blocked_device_path which is the path-pattern check the test actually means to exercise; realpath-resolution coverage already lives in test_symlink_to_blocked_device_is_blocked. 4) tests/tools/test_transcription_tools.py Module installed a faster_whisper stub via sys.modules without setting __spec__, then later @pytest.mark.skipif called importlib.util.find_spec('faster_whisper') which raises 'ValueError: __spec__ is None' for modules with a None spec attr. Set __spec__ on the stub to a real ModuleSpec. Validation: 195/195 green across the 4 affected files.	2026-05-25 05:50:29 -07:00
Peter	ee59ef1946	fix: reject read_file symlinks to blocking devices (#10133 ) * fix: reject read_file symlinks to blocking devices The read_file guard already refused direct device paths such as /dev/zero, but a workspace symlink resolving to one of those devices could still reach the shell-backed read path and hang on wc/head/sed. Keep the literal alias check and add a resolved-path pass so local symlinks to blocked device/fd endpoints are rejected before I/O. Constraint: Preserve literal /dev/stdin handling before terminal-specific realpath resolution Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py tests/tools/test_file_tools.py -q; python -m compileall tools/file_tools.py tests/tools/test_file_read_guards.py; git diff --check Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> * Keep file guard tests off sensitive macOS temp paths The branch now inherits a sensitive-path write guard from upstream main. On macOS, tempfile.mkdtemp() resolves under /private/var/folders, so the new write-path guard fired before the file read dedup assertions could exercise their intended behavior. The tests now create their scratch files inside the worktree temp checkout, outside those system-sensitive prefixes, without changing production behavior. Constraint: Rebased branch must pass the expanded file read guard suite on macOS. Rejected: Loosen the production sensitive-path prefix list \| broader behavior change unrelated to this PR. Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py -q --------- Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:38 -07:00
Dakota Secula-Rosell	b7b8bec800	fix(security): block /proc//environ, cmdline, maps from file read (#4609 ) The read_file tool and terminal cat can access /proc/self/environ to recover all process env vars including secrets stripped by the subprocess blocklist. Output redaction partially mitigates (catches known-format tokens) but misses custom/proprietary key formats, especially when values are printed without their key names. Add /proc//environ, /proc//cmdline, and /proc//maps to the blocked device paths in _is_blocked_device(): - /proc//environ: leaks full process env (API keys, tokens) - /proc//cmdline: leaks command-line args (may contain passwords) - /proc/*/maps: leaks memory layout (ASLR bypass for exploitation) Legitimate /proc reads (cpuinfo, meminfo, uptime, version) remain accessible — the check only blocks per-pid pseudo-files with known sensitive suffixes. Complements PR #4432 (PID namespace isolation for child processes) which prevents children from reading the parent's /proc, but does not prevent the parent process itself from being read via file tools. Partially addresses #4427 Changes: tools/file_tools.py \| +6 tests/tools/test_file_read_guards.py \| +18 -1 Co-authored-by: dsr-restyn <dsr-restyn@users.noreply.github.com>	2026-05-25 05:07:31 -07:00
Teknium	a32d07529c	fix(file-tools): escalate to BLOCKED on repeated read_file dedup stubs (#16382 ) read_file's dedup path returned a lightweight stub on re-reads of an unchanged file, then returned early — so the consecutive-read loop guard (hard block at count>=4) at the bottom of read_file_tool never ran for stub-looped calls. Weaker tool-following models (local Qwen3.6 variants in the reported case) ignore the passive 'refer to earlier result' hint and hammer the same read_file call until iteration budget runs out. Track per-key stub returns in task_data['dedup_hits'] and, on the second stub for the same (path, offset, limit), return a hard BLOCKED error mirroring the wording the real-read path already uses. A real read, an intervening non-read tool call (notify_other_tool_call), or reset_file_dedup (on context compression) all clear the counter so the guard never stays engaged longer than the actual loop. Closes #15759	2026-04-27 00:17:26 -07:00
Teknium	ced8f44cd2	fix(file-tools): broaden dedup-status write guard to cover small wrappers The write_file guard added in #16223 used strict equality against the internal dedup status message. In practice, the model sometimes prepends a short note or appends a trailing comment before calling write_file, which slipped past the strict check. Broaden the heuristic: reject writes whose stripped content equals the status message OR contains it and is <=2x its length. Short, status-dominated writes are always corruption; legitimate docs that quote the message verbatim are always much longer. Adds two tests: one for the small-wrapper corruption shape, one confirming large legitimate files that quote the status still write.	2026-04-26 19:05:36 -07:00
helix4u	977d5f56c9	fix(file-tools): keep read dedup status out of file content	2026-04-26 19:05:36 -07:00
voidborne-d	a32b325d06	fix(tools): invalidate read_file dedup cache on write_file and patch write_file_tool and patch_tool both call _update_read_timestamp to refresh the staleness tracker after writing, but they never invalidate the dedup cache entries for the written path. The dedup cache keys are (resolved_path, offset, limit) → mtime tuples populated by read_file_tool. On filesystems where a read and write land in the same mtime second (or when mtime granularity is 1s), the cached and current mtime are equal, so the dedup check incorrectly returns a 'File unchanged since last read' stub — even though the file was just overwritten. The agent then sees stale content (or a stale 'File not found' error) and enters expensive error-recovery loops, burning API calls. Fix: add _invalidate_dedup_for_path(filepath, task_id) that removes all dedup entries whose resolved path matches the written file. Called from _update_read_timestamp so both write_file_tool and patch_tool benefit automatically. Scoped to the writing task_id — other tasks' caches are not affected. 6 regression tests added covering: - read→write→read within same mtime second (core #13144 scenario) - invalidation across all offset/limit combinations - isolation: writing file A does not invalidate file B's cache - isolation: writing in task A does not invalidate task B's cache - _invalidate_dedup_for_path safety on missing task / empty dedup All 25 tests pass (19 existing + 6 new). Fixes #13144	2026-04-26 19:05:36 -07:00
Teknium	8d023e43ed	refactor: remove dead code — 1,784 lines across 77 files (#9180 ) Deep scan with vulture, pyflakes, and manual cross-referencing identified: - 41 dead functions/methods (zero callers in production) - 7 production-dead functions (only test callers, tests deleted) - 5 dead constants/variables - ~35 unused imports across agent/, hermes_cli/, tools/, gateway/ Categories of dead code removed: - Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection, rebuild_lookups, clear_session_context, get_logs_dir, clear_session - Unused API surface: search_models_dev, get_pricing, skills_categories, get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list - Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob - Stale debug helpers: get_debug_session_info copies in 4 tool files (centralized version in debug_helpers.py already exists) - Dead gateway methods: send_emote, send_notice (matrix), send_reaction (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history (matrix), _start_typing_indicator (signal), parse_feishu_post_content - Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION, FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR - Unused UI code: _interactive_provider_selection, _interactive_model_selection (superseded by prompt_toolkit picker) Test suite verified: 609 tests covering affected files all pass. Tests for removed functions deleted. Tests using removed utilities (clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.	2026-04-13 16:32:04 -07:00
Teknium	e3f8347be3	feat(file_tools): harden read_file with size guard, dedup, and device blocking (#4315 ) * feat(file_tools): harden read_file with size guard, dedup, and device blocking Three improvements to read_file_tool to reduce wasted context tokens and prevent process hangs: 1. Character-count guard: reads that produce more than 100K characters (≈25-35K tokens across tokenisers) are rejected with an error that tells the model to use offset+limit for a smaller range. The effective cap is min(file_size, 100K) so small files that happen to have long lines aren't over-penalised. Large truncated files also get a hint nudging toward targeted reads. 2. File-read deduplication: when the same (path, offset, limit) is read a second time and the file hasn't been modified (mtime unchanged), return a lightweight stub instead of re-sending the full content. Writes and patches naturally change mtime, so post-edit reads always return fresh content. The dedup cache is cleared on context compression — after compression the original read content is summarised away, so the model needs the full content again. 3. Device path blocking: paths like /dev/zero, /dev/random, /dev/stdin etc. are rejected before any I/O to prevent process hangs from infinite-output or blocking-input devices. Tests: 17 new tests covering all three features plus the dedup-reset- on-compression integration. All 52 file-read tests pass (35 existing + 17 new). Full tool suite (2124 tests) passes with 0 failures. * feat: make file_read_max_chars configurable, add docs Add file_read_max_chars to DEFAULT_CONFIG (default 100K). read_file_tool reads this on first call and caches for the process lifetime. Users on large-context models can raise it; users on small local models can lower it. Also adds a 'File Read Safety' section to the configuration docs explaining the char limit, dedup behavior, and example values.	2026-03-31 12:53:19 -07:00

9 commits