hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00


ethernet 48be2e0e4d test: use subprocesses for each test file (#29016)	2026-05-21 16:40:04 +05:30

* ci(tests): install ripgrep from prebuilt tarball instead of apt

  apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu runners (the apt-get update against archive.ubuntu.com is the slow part; ripgrep itself is small). Switching to the upstream musl binary tarball cuts the step to a few seconds.

  - Pinned to ripgrep 15.1.0 with sha256 verification (same hash as published in the releases sha256 sidecar file).
  - Drops the `rg` binary into /usr/local/bin so it is on PATH for every subsequent step without GITHUB_PATH manipulation.
  - Applied to both the test and e2e jobs in tests.yml.

* fix(cli): compile syntax check to tempdir, not source __pycache__

  `_validate_critical_files_syntax` runs `py_compile.compile()` on each critical bootstrap file after a successful `git pull`. The default `py_compile` writes the resulting `.pyc` next to the source under `__pycache__/`, which causes two real problems:

  1. Parallel test workers walking the same source tree (e.g. running the suite under per-file process isolation) can race against each other on the `__pycache__` write — manifests as flaky 'directory not empty' errors during teardown.
  2. In production, the post-pull syntax check leaves a `.pyc` behind that the next interpreter run might pick up — fine when the interpreter version matches, sketchy if it doesn't.

  Fix: write the compiled output to a `tempfile.TemporaryDirectory()` that's discarded on function exit. We only care about the compile-or-not signal, not the artifact.

* test(runner): per-file process isolation, drop manual state reset + xdist

  Replace fragile manual _reset_module_state test fixtures with robust per-file subprocess isolation. Each test file runs in a fresh `python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist, no custom pytest plugin, no shared worker state.

  Key changes:

  * scripts/run_tests_parallel.py — new runner: discovers test files, runs N in parallel via ThreadPoolExecutor, captures stdout per file, treats exit code 5 (no tests collected) as pass, kills all children on exit.

    Change from cpu_count to cpu_count*2. The runner is I/O-bound (waiting on subprocess.communicate() from pytest children). The parent process does almost no CPU work, so 2x oversubscription keeps more pipes full.

    When a file fails, immediately show the last 30 lines of pytest output (stack traces + FAILED summary) plus a ready-to-copy repro command:

      python -m pytest tests/agent/test_auxiliary_client.py

  * scripts/run_tests.sh — delegates to run_tests_parallel.py
  * .github/workflows/tests.yml — test step: python scripts/run_tests_parallel.py
  * pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts
  * tests/conftest.py — remove ~200 lines of manual state-reset fixtures
  * AGENTS.md — update Testing section for per-file design

* test(runner): speed gateway test antipattern scan up

* fix(test): web search provider plugin test missing xai

* fix(tests): make 14 test files pass under per-file subprocess isolation

  Tests that relied on cross-file state pollution from xdist workers fail when run in isolation (per-file subprocess model).

  Root causes and fixes:

  Tool registry not populated:
  - test_video_generation_tool_surface_matrix: add discover_builtin_tools()
  - test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures registering all 8 bundled web providers, reset after each test
  - test_website_policy: same provider registration pattern
  - test_web_tools_tavily: same pattern across 3 dispatch test classes
  - Also add is_safe_url/check_website_access mocks where SSRF check blocks example.com (DNS resolution fails in isolated envs)

  Stale check_fn cache:
  - test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache() in both kanban guidance tests (prior test cached False for kanban_show)
  - test_discord_tool: cache invalidation in setup/teardown
  - test_homeassistant_tool: invalidate_check_fn_cache() before registry queries

  Module-level state pollution:
  - test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache
  - test_skill_commands: set_session_vars() instead of patch.dict(os.environ) (ContextVar takes precedence over os.environ)
  - test_dm_topics: overwrite sys.modules + separate telegram.constants mock + force-reimport of gateway.platforms.telegram
  - test_terminal_tool_requirements: removed duplicate class declaration, autouse _clear_caches fixture

* change(tests): run_tests.sh explicitly includes env vars instead of manually dropping some vars, now we just only include some

* fix(tests): 5 more isolation/NixOS fixes

  - test_approval_plugin_hooks: isolate HERMES_HOME so real user's command_allowlist doesn't short-circuit the approval path
  - test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum (feature not merged on this branch)
  - test_write_deny: test systemd prefix against tmp_path instead of /etc/systemd which resolves to /nix/store on NixOS
  - test_pty_bridge: use shutil.which('cat') instead of /bin/cat (doesn't exist on NixOS)
  - profiles.py: rmtree onexc handler chmod's parent dirs too, fixing profile deletion when copytree preserved read-only modes from nix store

* fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client

* fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor

* fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test

* fix: address PR #29016 review feedback

  - Remove tracked .pytest-cache/ artifact and add to .gitignore
  - Fix stale 'xdist worker' comment in conftest.py
  - Deduplicate web provider registration into tests/tools/conftest.py shared helper (register_all_web_providers), replacing 8 copy-pasted blocks across 6 test files
  - Update PR description: remove stale recovered-test-files claim, fix worker count to match code (cpu_count*2)

* fix: eliminate race in stale-cache achievements test

  The background scan thread could complete and overwrite _SNAPSHOT_CACHE before evaluate_all() returned the stale data — only 10 fake sessions made the scan finish instantly. Added scan_delay param to _FakeSessionDB and set it to 2s in the stale-cache test so the background thread can't win the race.
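
The per-file isolation design described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the contents of scripts/run_tests_parallel.py: the names `is_pass`, `run_one`, `run_all`, and the `runner` parameter are hypothetical, the `test_*.py` discovery pattern is an assumption, and the "kills all children on exit" behaviour is omitted.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

NO_TESTS_COLLECTED = 5  # pytest's exit code when a file collects no tests


def is_pass(returncode):
    """Treat a clean run and 'no tests collected' as success."""
    return returncode in (0, NO_TESTS_COLLECTED)


def run_one(path, tail_lines=30):
    """Run one test file in a fresh interpreter (hypothetical helper).

    Returns (path, passed, tail), where tail is the last lines of output.
    """
    proc = subprocess.run(
        [sys.executable, "-m", "pytest", str(path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    tail = "\n".join(proc.stdout.splitlines()[-tail_lines:])
    return path, is_pass(proc.returncode), tail


def run_all(test_root, workers=None, runner=run_one):
    """Run every test file under test_root in parallel; return failed paths."""
    # Assumption: test files are named test_*.py.
    files = sorted(Path(test_root).rglob("test_*.py"))
    # The parent only waits on child pipes, so oversubscribe 2x.
    workers = workers or (os.cpu_count() or 1) * 2
    failures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for path, passed, tail in pool.map(runner, files):
            if not passed:
                failures.append(path)
                print(tail)
                print(f"repro: python -m pytest {path}")
    return failures
```

Calling `run_all("tests")` would launch one `python -m pytest <file>` child per test file. Threads are enough here because each worker spends its time blocked on a subprocess rather than executing Python code.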
proxy	feat(proxy): add xai upstream adapter for Grok via OAuth	2026-05-18 20:09:32 -07:00
__init__.py	chore: release v0.14.0 (2026.5.16) (#26862)	2026-05-16 02:58:57 -07:00
_parser.py	fix: add dashboard to CLI help epilogue and Docker CI smoke test	2026-05-07 06:16:23 -07:00
_subprocess_compat.py	feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags	2026-05-08 14:27:40 -07:00
auth.py	fix(security): guard os.chmod(parent) against / and top-level dirs	2026-05-20 22:56:55 -07:00
auth_commands.py	feat(cli): wire --manual-paste into `hermes auth add` and `hermes model`	2026-05-18 20:10:52 -07:00
azure_detect.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
backup.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
banner.py	refactor: DRY cleanup from code review	2026-05-15 14:45:43 -07:00
browser_connect.py	feat: auto-launch Chromium-family browser for CDP	2026-05-19 22:34:05 -07:00
bundles.py	feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)	2026-05-18 21:38:05 -07:00
callbacks.py	fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902)	2026-04-14 16:11:37 -07:00
checkpoints.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
claw.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
cli_output.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
clipboard.py	fix(clipboard): only read PNG signature bytes, not entire file	2026-05-13 22:54:21 -07:00
codex_models.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
codex_runtime_plugin_migration.py	fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml	2026-05-15 02:31:30 -07:00
codex_runtime_switch.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
colors.py	feat: respect NO_COLOR env var and TERM=dumb (#4079)	2026-03-30 17:07:21 -07:00
commands.py	fix(gateway): reorder telegram menu priority — everyday commands first	2026-05-20 19:14:21 -07:00
completion.py	test(cli): strengthen zsh completion regression coverage	2026-05-13 09:34:15 -07:00
config.py	feat(sessions): opt-in per-session JSON snapshot writer	2026-05-20 11:44:10 -07:00
copilot_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
cron.py	feat: add cron job profile support	2026-05-18 17:39:50 +00:00
curator.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
curses_ui.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
debug.py	fix(debug): redact log content at upload time in hermes debug share	2026-05-03 11:42:20 -07:00
default_soul.py	fix: reset default SOUL.md to baseline identity text (#3159)	2026-03-26 01:34:27 -07:00
dep_ensure.py	feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (#27845)	2026-05-18 16:34:24 +05:30
dingtalk_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
doctor.py	feat(doctor): surface xAI model retirement in hermes doctor	2026-05-20 09:18:23 -07:00
dump.py	refactor(env): use shared Hermes dotenv loader	2026-05-05 10:13:13 -07:00
env_loader.py	feat(cross-platform): psutil for PID/process management + Windows footgun checker	2026-05-08 14:27:40 -07:00
fallback_cmd.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
gateway.py	fix(gateway): harden Windows gateway install lifecycle	2026-05-19 11:23:15 -07:00
gateway_windows.py	fix(gateway): harden Windows gateway install lifecycle	2026-05-19 11:23:15 -07:00
goals.py	feat: inject current time into goal judge prompt	2026-05-16 23:05:27 -07:00
hooks.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
inventory.py	refactor(inventory): extract shared ConfigContext + build_models_payload	2026-05-13 22:31:11 -07:00
kanban.py	feat(kanban): add scheduled status for delayed follow-ups	2026-05-18 21:39:03 -07:00
kanban_db.py	fix(gateway): harden kanban and provider cleanup races	2026-05-20 14:31:22 -07:00
kanban_decompose.py	fix: assign single-task kanban decompositions	2026-05-18 20:26:02 -07:00
kanban_diagnostics.py	fix(kanban): honor severity thresholds in diagnostics	2026-05-18 20:47:01 -07:00
kanban_specify.py	fix(cli): make kanban specify max_tokens configurable	2026-05-18 20:15:20 -07:00
kanban_swarm.py	feat(cli): add kanban swarm topology helper	2026-05-18 21:10:12 -07:00
logs.py	feat: component-separated logging with session context and filtering (#7991)	2026-04-11 17:23:36 -07:00
main.py	test: use subprocesses for each test file (#29016)	2026-05-21 16:40:04 +05:30
mcp_config.py	fix(mcp): pre-compile env-var regex and unify interpolation	2026-05-15 01:43:54 -07:00
memory_setup.py	fix: restrict .env file permissions to 0600	2026-05-14 07:59:38 -07:00
migrate.py	feat(cli): hermes migrate xai [--apply] [--no-backup]	2026-05-20 09:18:23 -07:00
model_catalog.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_normalize.py	fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802)	2026-05-06 09:08:33 -07:00
model_switch.py	fix(model-switch): mark bare custom provider as current	2026-05-19 10:57:35 -07:00
models.py	fix: detect gh-copilot deprecation and improve GitHub Models 413 errors (#10648)	2026-05-16 02:24:48 -07:00
nous_subscription.py	feat(web): add SearXNG as a native search-only backend	2026-05-06 10:05:29 -07:00
oneshot.py	fix(oneshot): pass fallback_providers from profile config to AIAgent	2026-05-18 20:37:23 -07:00
pairing.py	fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325)	2026-05-07 07:18:21 -07:00
platforms.py	feat: complete plugin platform parity — all 12 integration points	2026-04-29 21:56:51 -07:00
plugins.py	feat(browser): add BrowserProvider ABC mirroring web_search_provider template	2026-05-17 04:04:15 -07:00
plugins_cmd.py	test(plugins): cover _discover_all_plugins recursion + cross-link loader	2026-05-16 17:15:19 -07:00
profile_describer.py	feat(kanban): orchestrator-driven auto-decomposition on triage (#27572)	2026-05-17 13:54:12 -07:00
profile_distribution.py	feat(profile): shareable profile distributions via git (#20831)	2026-05-08 10:04:32 -07:00
profiles.py	test: use subprocesses for each test file (#29016)	2026-05-21 16:40:04 +05:30
providers.py	fix: add default base_url_override for ollama-cloud provider	2026-05-18 14:31:37 -07:00
pt_input_extras.py	fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777)	2026-05-09 12:48:14 -07:00
pty_bridge.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
relaunch.py	fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch	2026-05-08 14:27:40 -07:00
runtime_provider.py	fix(security): derive <VENDOR>_API_KEY from host as final credential fallback	2026-05-20 22:12:09 -07:00
security_advisories.py	feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)	2026-05-12 01:02:25 -07:00
send_cmd.py	fix(review): address Copilot follow-up on sanitizer and file decode errors	2026-05-16 23:00:58 -05:00
session_recap.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
setup.py	fix(cli): preserve setup config picker writes	2026-05-19 14:23:19 -07:00
skills_config.py	refactor(config): migrate remaining 33 cfg_get call sites (#17311)	2026-04-29 04:03:03 -07:00
skills_hub.py	fix(skills-hub): fix dedup in browse_skills() programmatic API	2026-05-20 15:04:01 -07:00
skin_engine.py	fix(tui): improve charizard completion menu contrast	2026-05-18 20:05:23 -07:00
slack_cli.py	fix(slack): enable writable app home DMs in manifest	2026-05-08 17:01:12 -07:00
status.py	feat(status): show xAI OAuth login state in hermes status	2026-05-17 11:35:57 -07:00
stdio.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
timeouts.py	perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866)	2026-05-19 14:25:10 -07:00
tips.py	feat: auto-launch Chromium-family browser for CDP	2026-05-19 22:34:05 -07:00
tools_config.py	feat(tools): mirror image_gen plugin-injection in Browser Automation picker	2026-05-17 04:04:15 -07:00
uninstall.py	docs(windows): avoid piping installer directly into iex	2026-05-18 20:05:47 -07:00
vercel_auth.py	feat: add Vercel Sandbox backend	2026-04-29 07:22:33 -07:00
voice.py	fix(tui): restore voice push-to-talk parity (#20897)	2026-05-06 15:49:59 -07:00
web_server.py	fix(dashboard): use browser scrollback for chat wheel	2026-05-19 00:07:33 -07:00
webhook.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
xai_retirement.py	fix(xai): align migrate retirement map with docs	2026-05-20 09:18:23 -07:00