hermes-agent/tests/hermes_cli
Teknium d404849351
test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577)
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh

Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.

## Layer 1 — hermetic conftest.py (tests/conftest.py)

Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.

Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.

Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.
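
A minimal sketch of the matching rule described above, not the actual
conftest.py code. The suffixes and names are the ones listed in this
message; the source lists end in "etc.", so the sets here are
illustrative and incomplete. The provider BASE_URL overrides are left
out because they are not named here. `should_unset` and `scrubbed_env`
are hypothetical helper names.

```python
# Suffixes copied from the commit message; the real list is longer ("etc.").
CREDENTIAL_SUFFIXES = (
    "_API_KEY", "_TOKEN", "_SECRET", "_PASSWORD",
    "_CREDENTIALS", "_ACCESS_KEY", "_PRIVATE_KEY",
)
# Credential names that do not fit the suffix pattern (only those named above).
EXPLICIT_NAMES = frozenset({"AWS_ACCESS_KEY_ID", "FAL_KEY", "GH_TOKEN"})
# Behavioral variables named above; HERMES_SESSION_* is treated as a prefix.
BEHAVIORAL_NAMES = frozenset({"HERMES_YOLO_MODE", "HERMES_QUIET"})
BEHAVIORAL_PREFIXES = ("HERMES_SESSION_",)


def should_unset(name: str) -> bool:
    """Return True if an environment variable name looks credential-shaped
    or is one of the behavioral HERMES_* variables (hypothetical helper)."""
    return (
        name.endswith(CREDENTIAL_SUFFIXES)
        or name in EXPLICIT_NAMES
        or name in BEHAVIORAL_NAMES
        or name.startswith(BEHAVIORAL_PREFIXES)
    )


def scrubbed_env(env: dict) -> dict:
    """Return a copy of env without the variables selected by should_unset."""
    return {key: value for key, value in env.items() if not should_unset(key)}
```

In an autouse fixture the same predicate could drive
`monkeypatch.delenv(name, raising=False)` for each matching name, so
that the original values come back after every test.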

Also:
  - Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
    code reading ~/.hermes/* directly can't touch the real dir.
  - Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
    match CI's deterministic runtime.
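
The pinning step could look roughly like the sketch below. It is an
assumption about shape, not the fixture's real code; `PINNED` and
`pin_runtime` are hypothetical names. Note that PYTHONHASHSEED is read
when an interpreter starts, so setting it inside a running test process
only affects subprocesses spawned afterwards.

```python
import os
import time

# Values taken from the commit message.
PINNED = {
    "TZ": "UTC",
    "LANG": "C.UTF-8",
    "LC_ALL": "C.UTF-8",
    "PYTHONHASHSEED": "0",
}


def pin_runtime(env=None):
    """Write the pinned values into env (defaults to os.environ)."""
    target = os.environ if env is None else env
    target.update(PINNED)
    # time.tzset() exists only on Unix; it makes the running process
    # re-read TZ. It is skipped when a plain dict was passed in.
    if target is os.environ and hasattr(time, "tzset"):
        time.tzset()
    return target
```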

The old _isolate_hermes_home fixture name is preserved as an alias so
any test that requests it explicitly still works.

## Layer 2 — scripts/run_tests.sh canonical runner

'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
  - Unsets all credential env vars (belt-and-suspenders for callers
    who bypass conftest — e.g. IDE integrations)
  - Pins TZ/LANG/PYTHONHASHSEED
  - Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
    a 20-core workstation surfaces test-ordering flakes CI will never
    see, causing the infamous 'passes in CI, fails locally' drift)
  - Finds the venv in .venv, venv, or main checkout's venv
  - Passes through arbitrary pytest args

Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.
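
The runner itself is a shell script; the Python sketch below only
illustrates two of the steps listed above (venv lookup and argument
pass-through). The lookup order follows the list; the directory names
checked inside the main checkout and the `bin/python` layout are
assumptions, and both function names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def find_venv_python(repo_root: Path,
                     main_checkout: Optional[Path] = None) -> Optional[Path]:
    """Return the first venv interpreter found, or None.

    Checks .venv, then venv, then (assumed) the same names under the
    main checkout, e.g. when running from a git worktree.
    """
    candidates = [repo_root / ".venv", repo_root / "venv"]
    if main_checkout is not None:
        candidates += [main_checkout / ".venv", main_checkout / "venv"]
    for venv in candidates:
        python = venv / "bin" / "python"  # POSIX venv layout assumed
        if python.exists():
            return python
    return None


def build_command(python: Path, extra_args: List[str]) -> List[str]:
    """Build a pytest invocation with 4 xdist workers plus caller args."""
    return [str(python), "-m", "pytest", "-n", "4", *extra_args]
```

Fixing the worker count at 4 rather than using `-n auto` is the point of
the design: the number of workers changes how tests are distributed, so
a fixed count keeps local test distribution close to what CI sees.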

## Remove 3 module-level dotenv stubs that broke test isolation

tests/hermes_cli/test_arcee_provider.py, test_xiaomi_provider.py, and
test_api_key_providers.py each had a module-level:

    if 'dotenv' not in sys.modules:
        fake_dotenv = types.ModuleType('dotenv')
        fake_dotenv.load_dotenv = lambda *a, **kw: None
        sys.modules['dotenv'] = fake_dotenv

This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.

dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.
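
For illustration only: the fix in this commit is removal, but the
sketch below shows why the stub leaked and what a scoped alternative
would look like if a stub were ever genuinely needed. It uses a
made-up module name (`hypothetical_dep`) rather than dotenv so it runs
anywhere; both function names are hypothetical.

```python
import sys
import types
from contextlib import contextmanager

MODULE = "hypothetical_dep"  # stand-in for the stubbed dependency


def _make_fake():
    fake = types.ModuleType(MODULE)
    fake.load = lambda *a, **kw: None  # no-op, like the removed stub
    return fake


def install_stub_without_teardown():
    """The leaky pattern: the fake stays in sys.modules for the rest of
    the process, so later importers in the same worker receive it."""
    if MODULE not in sys.modules:
        sys.modules[MODULE] = _make_fake()


@contextmanager
def scoped_stub():
    """Install the fake, then restore the previous sys.modules state."""
    missing = object()
    previous = sys.modules.get(MODULE, missing)
    sys.modules[MODULE] = _make_fake()
    try:
        yield sys.modules[MODULE]
    finally:
        if previous is missing:
            sys.modules.pop(MODULE, None)
        else:
            sys.modules[MODULE] = previous
```

Inside a pytest test, `monkeypatch.setitem(sys.modules, name, fake)`
gives the same restore-on-teardown behavior without a hand-written
context manager.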

## Validation

- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
  failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
  tests/test_hermes_logging.py combined: 123 passed (the caplog
  regression tests from PR #11453 still pass)
- Local full run no longer shows the F/E clusters in the 0-55% range
  that were present before the conftest hardening

## Background

See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.

* fix(conftest): don't redirect HOME — it broke CI subprocesses

PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.

Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.

Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
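
A sketch of the difference between the two callsite styles. The real
`get_hermes_home()` is not shown in this commit; the version below is a
hypothetical stand-in that assumes it honors the HERMES_HOME variable
and falls back to ~/.hermes.

```python
import os
from pathlib import Path


def get_hermes_home() -> Path:
    """Hypothetical stand-in: prefer HERMES_HOME, else ~/.hermes."""
    override = os.environ.get("HERMES_HOME")
    return Path(override) if override else Path.home() / ".hermes"


def config_path_isolated() -> Path:
    # Follows the HERMES_HOME redirect set up by the test fixture.
    return get_hermes_home() / "config.yaml"


def config_path_leaky() -> Path:
    # Ignores HERMES_HOME, so a test would touch the real ~/.hermes.
    return Path.home() / ".hermes" / "config.yaml"
```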
2026-04-17 06:09:09 -07:00
__init__.py test: reorganize test structure and add missing unit tests 2026-02-26 03:20:08 +03:00
test_anthropic_oauth_flow.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_anthropic_provider_persistence.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_api_key_providers.py test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577) 2026-04-17 06:09:09 -07:00
test_arcee_provider.py test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577) 2026-04-17 06:09:09 -07:00
test_argparse_flag_propagation.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_atomic_json_write.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_atomic_yaml_write.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_auth_codex_provider.py fix: write refreshed Codex tokens back to ~/.codex/auth.json (#8277) 2026-04-12 02:05:20 -07:00
test_auth_commands.py fix(auth): codex auth remove no longer silently undone by auto-import (#11485) 2026-04-17 04:10:17 -07:00
test_auth_nous_provider.py fix(nous): respect 'Skip (keep current)' after OAuth login (#11476) 2026-04-17 00:52:42 -07:00
test_auth_provider_gate.py fix: resolve CI test failures — add missing functions, fix stale tests (#9483) 2026-04-14 01:43:45 -07:00
test_auth_qwen_provider.py feat(qwen): add Qwen OAuth provider with portal request support 2026-04-08 13:46:30 -07:00
test_backup.py feat: fix SQLite safety in hermes backup + add --quick snapshots + /snapshot command (#8971) 2026-04-13 04:46:13 -07:00
test_banner.py fix(banner): normalize toolset labels and use skin colors 2026-03-18 03:22:58 -07:00
test_banner_git_state.py fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974) 2026-04-07 17:59:42 -07:00
test_banner_skills.py fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897) 2026-03-18 03:17:37 -07:00
test_chat_skills_flag.py fix(termux): add local image chat route 2026-04-09 16:24:53 -07:00
test_claw.py fix: unify OpenClaw detection, add isatty guard, fix print_warning import 2026-04-12 16:40:37 -07:00
test_clear_stale_base_url.py fix: warn and clear stale OPENAI_BASE_URL on provider switch (#5161) 2026-04-11 01:52:58 -07:00
test_cmd_update.py fix(update): skip config migration prompts in non-interactive sessions (#3584) 2026-03-28 14:26:32 -07:00
test_coalesce_session_args.py fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume 2026-03-09 21:36:29 -07:00
test_codex_cli_model_picker.py fix: openai-codex and anthropic not appearing in /model picker for external credentials (#8224) 2026-04-12 00:33:42 -07:00
test_codex_models.py fix(model): normalize native provider-prefixed model ids 2026-04-10 05:52:45 -07:00
test_commands.py fix: remove 'q' alias from /quit so /queue's 'q' alias works (#10467) (#10538) 2026-04-15 15:04:01 -07:00
test_completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
test_config.py feat(discord): add channel_prompts config 2026-04-15 16:31:28 -07:00
test_config_env_expansion.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_config_validation.py feat: config structure validation — detect malformed YAML at startup (#5426) 2026-04-05 23:31:20 -07:00
test_container_aware_cli.py fix(gateway): harden Docker/container gateway pathway 2026-04-12 16:36:11 -07:00
test_copilot_auth.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_cron.py feat: add multi-skill cron editing and docs 2026-03-14 19:18:10 -07:00
test_custom_provider_model_switch.py fix: three provider-related bugs (#8161, #8181, #8147) (#8243) 2026-04-12 01:44:18 -07:00
test_debug.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_deprecated_cwd_warning.py fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029) 2026-04-16 06:48:33 -07:00
test_dingtalk_auth.py test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure 2026-04-17 05:08:07 -07:00
test_doctor.py fix(doctor): skip health check for OpenCode Go (no shared /models endpoint) 2026-04-15 15:05:32 -07:00
test_doctor_command_install.py feat(doctor): add Command Installation check for hermes bin symlink 2026-04-14 23:13:11 -07:00
test_env_loader.py fix(config): reload .env over stale shell overrides 2026-03-15 06:46:28 -07:00
test_env_sanitize_on_load.py fix: follow-up for salvaged PR #8939 2026-04-13 04:35:37 -07:00
test_gateway.py fix: add all_profiles param + narrow exception handling 2026-04-11 14:44:29 -07:00
test_gateway_linger.py fix(termux): disable gateway service flows on android 2026-04-09 16:24:53 -07:00
test_gateway_runtime_health.py fix(gateway): harden Telegram polling conflict handling 2026-03-14 12:11:23 -07:00
test_gateway_service.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_gateway_wsl.py feat(gateway): WSL-aware gateway with smart systemd detection (#7510) 2026-04-10 21:15:47 -07:00
test_gemini_provider.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_launcher.py fix: use argparse entrypoint in top-level launcher (#3874) 2026-03-29 21:54:36 -07:00
test_logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
test_managed_installs.py chore: prepare Hermes for Homebrew packaging (#4099) 2026-03-30 17:34:43 -07:00
test_mcp_config.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_tools_config.py feat: interactive MCP tool configuration in hermes tools (#1694) 2026-03-17 03:48:44 -07:00
test_memory_reset.py fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029) 2026-04-16 06:48:33 -07:00
test_model_normalize.py fix(copilot): normalize vendor-prefixed and dash-notation model IDs (#6879) (#11561) 2026-04-17 04:19:36 -07:00
test_model_provider_persistence.py fix(setup): validate base URL input in hermes model flow (#8264) 2026-04-12 01:51:57 -07:00
test_model_switch_copilot_api_mode.py fix: recompute Copilot api_mode after model switch 2026-04-16 01:16:14 -07:00
test_model_switch_custom_providers.py test: add regression tests for custom_providers multi-model dedup and grouping 2026-04-13 16:41:30 -07:00
test_model_switch_opencode_anthropic.py fix(opencode): strip /v1 from base_url on mid-session /model switch to Anthropic-routed models (#11286) 2026-04-16 19:41:41 -07:00
test_model_switch_variant_tags.py fix(models): preserve OpenRouter variant tags (:free, :extended, :fast) during model switch (#6383) 2026-04-08 19:58:16 -07:00
test_model_validation.py fix(models): add glm-5.1 to opencode-go catalogs 2026-04-16 16:49:22 -07:00
test_models.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_non_ascii_credential.py fix: detect and strip non-ASCII characters from API keys (#6843) 2026-04-14 20:20:31 -07:00
test_nous_hermes_non_agentic.py fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models 2026-04-13 04:33:52 -07:00
test_nous_subscription.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_ollama_cloud_auth.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_ollama_cloud_provider.py fix: wire up Ollama Cloud dynamic model discovery in /model TUI picker 2026-04-16 07:17:45 -07:00
test_opencode_go_in_model_list.py fix(models): add glm-5.1 to opencode-go catalogs 2026-04-16 16:49:22 -07:00
test_overlay_slug_resolution.py fix: resolve overlay provider slug mismatch in /model picker (#7373) 2026-04-10 14:46:57 -07:00
test_path_completion.py feat(cli): add file path autocomplete in the input prompt (#1545) 2026-03-16 06:07:45 -07:00
test_placeholder_usage.py fix: cover remaining config placeholder help text 2026-03-14 10:35:14 -07:00
test_plugin_cli_registration.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_plugins.py fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453) 2026-04-17 00:20:40 -07:00
test_plugins_cmd.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_profile_export_credentials.py fix: also exclude .env from default profile exports 2026-04-01 11:20:33 -07:00
test_profiles.py fix: improve profile creation UX — seed SOUL.md + credential warning (#8553) 2026-04-12 12:22:34 -07:00
test_reasoning_effort_menu.py fix: normalize reasoning effort ordering in UI 2026-04-09 14:20:16 -07:00
test_runtime_provider_resolution.py fix(config): restore custom providers after v11→v12 migration 2026-04-13 10:50:52 -07:00
test_session_browse.py feat: interactive session browser with search filtering (#718) 2026-03-08 17:42:50 -07:00
test_sessions_delete.py fix(cli): handle EOFError in sessions delete/prune confirmation prompts (#3101) 2026-03-25 18:06:04 -07:00
test_set_config_value.py fix(cli): allow empty strings and falsy values in config set 2026-03-31 11:41:12 -07:00
test_setup.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_setup_hermes_script.py fix(termux): make setup-hermes use android path 2026-04-09 16:24:53 -07:00
test_setup_matrix_e2ee.py docs(matrix): update all references from matrix-nio to mautrix 2026-04-10 21:15:59 -07:00
test_setup_model_provider.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_setup_noninteractive.py Harden setup provider flows 2026-04-10 02:57:39 -07:00
test_setup_openclaw_migration.py fix: OpenClaw migration now shows dry-run preview before executing (#6769) 2026-04-09 12:15:06 -07:00
test_setup_prompt_menus.py fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373) 2026-04-16 20:43:41 -07:00
test_skills_config.py fix: respect per-platform disabled skills in Telegram menu and gateway dispatch (#4799) 2026-04-03 10:10:53 -07:00
test_skills_hub.py fix(cli): add ChatConsole.status for /skills search 2026-04-11 15:38:43 -07:00
test_skills_install_flags.py fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647) 2026-03-17 01:59:07 -07:00
test_skills_skip_confirm.py fix(skills): cache-aware /skills install and uninstall in TUI (#3586) 2026-03-28 14:32:23 -07:00
test_skills_subparser.py fix(cli): resolve duplicate 'skills' subparser crash on Python 3.11+ 2026-03-11 00:50:39 -07:00
test_skin_engine.py fix(cli): handle null/non-dict display config in skin initialization 2026-04-16 06:35:31 -07:00
test_status.py fix(termux): improve status and install UX 2026-04-09 16:24:53 -07:00
test_status_model_provider.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_subparser_routing_fallback.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_subprocess_timeouts.py fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009) 2026-03-30 11:17:15 -07:00
test_terminal_menu_fallbacks.py Harden setup provider flows 2026-04-10 02:57:39 -07:00
test_tips.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_tool_token_estimation.py fix(tests): resolve 10 CI failures across hooks, tiktoken, plugins (#3848) 2026-03-29 20:05:59 -07:00
test_tools_config.py feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) 2026-04-16 20:19:53 -07:00
test_tools_disable_enable.py fix: MCP toolset resolution for runtime and config (#3252) 2026-03-26 13:39:41 -07:00
test_update_autostash.py fix(update): always reset on stash conflict — never leave conflict markers (#7010) 2026-04-10 00:32:20 -07:00
test_update_check.py fix: profile paths broken in Docker — profiles go to /root/.hermes instead of mounted volume (#7170) 2026-04-10 05:53:10 -07:00
test_update_gateway_restart.py fix: write update exit code before gateway restart (cgroup kill race) (#8288) 2026-04-12 02:33:21 -07:00
test_user_providers_model_switch.py fix(model): Support providers: dict for custom endpoints in /model 2026-04-13 05:16:21 -07:00
test_web_server.py dashboard: show GATEWAY_HEALTH_URL instead of PID for remote gateways 2026-04-16 16:48:14 -07:00
test_webhook_cli.py feat(webhook): hermes webhook CLI + skill for event-driven subscriptions (#3578) 2026-03-28 14:33:35 -07:00
test_xiaomi_provider.py test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577) 2026-04-17 06:09:09 -07:00