hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

History

ethernet 48be2e0e4d test: use subprocesses for each test file (#29016 ) * ci(tests): install ripgrep from prebuilt tarball instead of apt apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu runners (the apt-get update against archive.ubuntu.com is the slow part; ripgrep itself is small). Switching to the upstream musl binary tarball cuts the step to a few seconds. - Pinned to ripgrep 15.1.0 with sha256 verification (same hash as published in the releases sha256 sidecar file). - Drops the `rg` binary into /usr/local/bin so it is on PATH for every subsequent step without GITHUB_PATH manipulation. - Applied to both the test and e2e jobs in tests.yml. * fix(cli): compile syntax check to tempdir, not source __pycache__ `_validate_critical_files_syntax` runs `py_compile.compile()` on each critical bootstrap file after a successful `git pull`. The default `py_compile` writes the resulting `.pyc` next to the source under `__pycache__/`, which causes two real problems: 1. Parallel test workers walking the same source tree (e.g. running the suite under per-file process isolation) can race against each other on the `__pycache__` write — manifests as flaky 'directory not empty' errors during teardown. 2. In production, the post-pull syntax check leaves a `.pyc` behind that the next interpreter run might pick up — fine when the interpreter version matches, sketchy if it doesn't. Fix: write the compiled output to a `tempfile.TemporaryDirectory()` that's discarded on function exit. We only care about the compile-or-not signal, not the artifact. * test(runner): per-file process isolation, drop manual state reset + xdist Replace fragile manual _reset_module_state test fixtures with robust per-file subprocess isolation. Each test file runs in a fresh `python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist, no custom pytest plugin, no shared worker state. Key changes: * scripts/run_tests_parallel.py — new runner: discovers test files, runs N in parallel via ThreadPoolExecutor, captures stdout per file, treats exit code 5 (no tests collected) as pass, kills all children on exit. Change from cpu_count to cpu_count2. The runner is I/O-bound (waiting on subprocess.communicate() from pytest children) The parent process does almost no CPU work, so 2x oversubscription keeps more pipes full. When a file fails, immediately show the last 30 lines of pytest output (stack traces + FAILED summary) plus a ready-to-copy repro command: python -m pytest tests/agent/test_auxiliary_client.py scripts/run_tests.sh — delegates to run_tests_parallel.py * .github/workflows/tests.yml — test step: python scripts/run_tests_parallel.py * pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts * tests/conftest.py — remove ~200 lines of manual state-reset fixtures * AGENTS.md — update Testing section for per-file design * test(runner): speed gateway test antipattern scan up * fix(test): web search provider plugin test missing xai * fix(tests): make 14 test files pass under per-file subprocess isolation Tests that relied on cross-file state pollution from xdist workers fail when run in isolation (per-file subprocess model). Root causes and fixes: Tool registry not populated: - test_video_generation_tool_surface_matrix: add discover_builtin_tools() - test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures registering all 8 bundled web providers, reset after each test - test_website_policy: same provider registration pattern - test_web_tools_tavily: same pattern across 3 dispatch test classes - Also add is_safe_url/check_website_access mocks where SSRF check blocks example.com (DNS resolution fails in isolated envs) Stale check_fn cache: - test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache() in both kanban guidance tests (prior test cached False for kanban_show) - test_discord_tool: cache invalidation in setup/teardown - test_homeassistant_tool: invalidate_check_fn_cache() before registry queries Module-level state pollution: - test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache - test_skill_commands: set_session_vars() instead of patch.dict(os.environ) (ContextVar takes precedence over os.environ) - test_dm_topics: overwrite sys.modules + separate telegram.constants mock + force-reimport of gateway.platforms.telegram - test_terminal_tool_requirements: removed duplicate class declaration, autouse _clear_caches fixture * change(tests): run_tests.sh explicitly includes env vars instead of manually dropping some vars, now we just only include some * fix(tests): 5 more isolation/NixOS fixes - test_approval_plugin_hooks: isolate HERMES_HOME so real user's command_allowlist doesn't short-circuit the approval path - test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum (feature not merged on this branch) - test_write_deny: test systemd prefix against tmp_path instead of /etc/systemd which resolves to /nix/store on NixOS - test_pty_bridge: use shutil.which('cat') instead of /bin/cat (doesn't exist on NixOS) - profiles.py: rmtree onexc handler chmod's parent dirs too, fixing profile deletion when copytree preserved read-only modes from nix store * fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client * fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor * fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test * fix: address PR #29016 review feedback - Remove tracked .pytest-cache/ artifact and add to .gitignore - Fix stale 'xdist worker' comment in conftest.py - Deduplicate web provider registration into tests/tools/conftest.py shared helper (register_all_web_providers), replacing 8 copy-pasted blocks across 6 test files - Update PR description: remove stale recovered-test-files claim, fix worker count to match code (cpu_count2) fix: eliminate race in stale-cache achievements test The background scan thread could complete and overwrite _SNAPSHOT_CACHE before evaluate_all() returned the stale data — only 10 fake sessions made the scan finish instantly. Added scan_delay param to _FakeSessionDB and set it to 2s in the stale-cache test so the background thread can't win the race.		2026-05-21 16:40:04 +05:30
..
lib	feat: lazy bootstrap node	2026-04-16 10:47:37 -05:00
tests	fix(install.ps1): trim completion banner + strip em-dash in test	2026-05-16 22:55:12 -07:00
whatsapp-bridge	chore(deps): bump protobufjs in /scripts/whatsapp-bridge (#28889 )	2026-05-20 15:25:32 -04:00
benchmark_browser_eval.py	perf(browser): route browser_console eval through supervisor's persistent CDP WS (180x faster) (#23226 )	2026-05-10 07:37:55 -07:00
build_model_catalog.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
build_skills_index.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
check-windows-footguns.py	fix(scripts): fix UnicodeEncodeError in footgun checker on Windows	2026-05-16 23:05:27 -07:00
contributor_audit.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
discord-voice-doctor.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
hermes-gateway	fix: prevent systemd restart storm on gateway connection failure	2026-03-21 09:26:39 -07:00
install.cmd	docs(windows): avoid piping installer directly into iex	2026-05-18 20:05:47 -07:00
install.ps1	fix(install.ps1): pin PortableGit instead of hitting rate-limited GitHub API (#28943 )	2026-05-19 14:38:34 -07:00
install.sh	docs(windows): avoid piping installer directly into iex	2026-05-18 20:05:47 -07:00
install_psutil_android.py	fix(install): also patch psutil on Termux fresh-install path	2026-05-09 17:53:15 -07:00
keystroke_diagnostic.py	docs: add Windows-Specific Quirks section to hermes-agent skill + keystroke diagnostic	2026-05-08 14:27:40 -07:00
kill_modal.sh	refactor: replace swe-rex with native Modal SDK for Modal backend (#3538 )	2026-03-28 11:21:44 -07:00
lint_diff.py	feat(ci): add typecheck (warnings only in CI)	2026-05-06 10:58:12 -04:00
profile-tui.py	Merge remote-tracking branch 'origin/main' into fix/bundle-size	2026-05-11 16:01:04 -04:00
release.py	chore: trim verbose comments/docstrings, add AUTHOR_MAP entry	2026-05-21 12:49:21 +05:30
run_tests.sh	test: use subprocesses for each test file (#29016 )	2026-05-21 16:40:04 +05:30
run_tests_parallel.py	test: use subprocesses for each test file (#29016 )	2026-05-21 16:40:04 +05:30
sample_and_compress.py	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 )	2026-04-07 10:25:31 -07:00
setup_open_webui.sh	fix(install): use resolved python variable in setup_open_webui.sh	2026-05-16 22:54:22 -07:00