hermes-agent/tests/hermes_cli
Teknium 404640a2b7
feat(goals): /goal checklist + /subgoal user controls (#23456)
* feat(goals): /goal checklist + /subgoal user controls

Two-phase judge for /goal — Phase A decomposes the goal into a detailed
checklist on first turn; Phase B evaluates each pending item harshly
against the agent's most recent response. The goal completes only when
every item is in a terminal status (completed or impossible). Adds
/subgoal so the user can append, complete, mark impossible, undo,
remove, or clear items the judge missed or got wrong.

Mechanics:
- GoalState gains `checklist` and `decomposed` fields, both backwards
  compatible (old state_meta rows load unchanged).
- Phase A: aux call writes a harsh, exhaustive checklist; biased toward
  more items not fewer. Falls through to legacy freeform judge when
  decompose fails.
- Phase B: judge gets the checklist + last-response snippet + path to
  a per-session conversation dump at <HERMES_HOME>/goals/<sid>.json.
  A bounded read_file tool (max 5 calls per turn, restricted to that
  one file) lets the judge inspect history when the snippet is
  ambiguous. Stickiness in code: terminal items are frozen, only the
  user can revert via /subgoal undo.
- Continuation prompt shows checklist progress when non-empty;
  reverts to old prompt when empty.
- Status line shows M/N done counts.

CLI + gateway + TUI gateway all pass the agent reference into
evaluate_after_turn so the dump can be written. Gateway-side
/subgoal is allowed mid-run since it only modifies the checklist
the judge consults at turn boundaries.

Tests: 24 new cases — backcompat round-trip, Phase A decompose,
Phase B updates + new_items + stickiness, user override flows,
conversation dump (incl. unsafe-sid sanitization), judge read_file
restriction. Existing freeform-mode tests updated to patch the
renamed `judge_goal_freeform` and skip Phase A explicitly.

* fix(goals): off-by-one in judge index, message-list plumbing, prompt tuning

Three live-test findings from running /goal end-to-end against
gemini-3-flash-preview as the judge:

1. Off-by-one bug — the judge sees the checklist rendered with 1-based
   indices ('1. [ ] foo, 2. [ ] bar') but the apply layer indexed
   state.checklist as 0-based. Result: every judge update landed on
   the wrong item, evidence got attached to neighbouring rows, and
   the genuine 'first pending' item (usually #1) never got marked.
   Fix: convert 1 → 0 in _parse_evaluate_response. Also tightened the
   user prompt to call out the 1-based scheme explicitly. New tests
   cover the parser conversion + an end-to-end fake-judge round-trip.

2. Conversation dump never happened — _extract_agent_messages tried
   common AIAgent attribute names (.messages, .conversation_history,
   etc.) but AIAgent doesn't expose the message list as an instance
   attribute; it lives inside run_conversation()'s scope. Result: the
   judge's read_file tool always saw history_path=unavailable. Fix:
   added an explicit messages= kwarg to evaluate_after_turn that all
   three call sites (CLI, gateway, TUI gateway) now pass directly.
   Agent-attribute extraction kept as back-compat fallback.

3. Prompt was too harsh on simple goals. The original 'be HARSH,
   default to leaving items pending' wording made the judge refuse
   to mark 'file exists' completed even after the agent ran ls,
   test -f, os.path.isfile, and find — burning the entire 8-turn
   budget on a fizzbuzz task. Softened to 'strict but not absurd'
   with explicit guidance on what counts as evidence and a directive
   not to require re-proving items already established earlier.

Re-tested live with the same fizzbuzz goal: now terminates in 2
turns with all 8 checklist items correctly attributed to their
own evidence. /subgoal user-action flow (add / complete / undo /
impossible) verified live as well.
2026-05-10 16:56:51 -07:00
..
__init__.py test: reorganize test structure and add missing unit tests 2026-02-26 03:20:08 +03:00
conftest.py fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees 2026-05-05 04:13:12 -07:00
test_ai_gateway_models.py refactor(ai-gateway): single source of truth for model catalog (#13304) 2026-04-20 22:21:21 -07:00
test_anthropic_model_flow_stale_oauth.py fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain 2026-04-24 07:14:00 -07:00
test_anthropic_oauth_flow.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_anthropic_provider_persistence.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_api_key_providers.py chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 2026-04-29 21:56:51 -07:00
test_apply_model_switch_result_context.py fix(cli): /model picker honors provider-specific context caps (#16030) 2026-04-26 05:43:31 -07:00
test_apply_profile_override.py fix(profiles): honour active_profile when HERMES_HOME points to hermes root 2026-05-09 11:10:53 -07:00
test_arcee_provider.py feat(providers): add tencent-tokenhub provider support 2026-04-28 03:45:52 -07:00
test_argparse_flag_propagation.py feat: shell hooks — wire shell scripts as Hermes hook callbacks 2026-04-20 20:53:51 -07:00
test_at_context_completion_filter.py fix(tui): @folder: only yields directories, @file: only yields files 2026-04-21 14:31:48 -05:00
test_atomic_json_write.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_atomic_yaml_write.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_auth_codex_provider.py fix(model): let Codex setup reuse or reauthenticate 2026-04-24 04:53:32 -07:00
test_auth_commands.py fix(auth): make provider config writes atomic 2026-04-30 20:39:41 -07:00
test_auth_nous_provider.py test(auth): assert Nous refresh rotation payload 2026-05-08 04:17:42 -07:00
test_auth_profile_fallback.py fix(auth): fall back to global-root auth.json for providers missing in profile 2026-05-06 13:29:54 -07:00
test_auth_provider_gate.py fix: resolve CI test failures — add missing functions, fix stale tests (#9483) 2026-04-14 01:43:45 -07:00
test_auth_qwen_provider.py feat(qwen): add Qwen OAuth provider with portal request support 2026-04-08 13:46:30 -07:00
test_auth_ssl_macos.py fix(auth): honor SSL CA env vars across httpx + requests callsites 2026-04-24 03:00:33 -07:00
test_auth_toctou_file_modes.py test: migrate stale os.kill monkeypatches to gateway.status._pid_exists 2026-05-08 14:27:40 -07:00
test_aux_config.py fix(aux): add session_search extra_body and concurrency controls 2026-04-20 00:47:39 -07:00
test_azure_detect.py feat(azure-foundry): auto-detect transport, models, context length 2026-04-25 18:48:43 -07:00
test_backup.py fix(backup): floor pre-update backup_keep to 1 so the new backup survives 2026-05-04 05:07:13 -07:00
test_banner.py feat(banner): hyperlink startup banner title to latest GitHub release (#14945) 2026-04-23 23:28:34 -07:00
test_banner_git_state.py fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974) 2026-04-07 17:59:42 -07:00
test_banner_skills.py fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897) 2026-03-18 03:17:37 -07:00
test_bedrock_model_picker.py fix(model): avoid bedrock credential probe in provider picker 2026-05-03 00:32:55 -07:00
test_chat_skills_flag.py fix(termux): add local image chat route 2026-04-09 16:24:53 -07:00
test_claw.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_clear_stale_base_url.py fix: warn and clear stale OPENAI_BASE_URL on provider switch (#5161) 2026-04-11 01:52:58 -07:00
test_cmd_update.py fix(update): use termux-all uv fallback path on Termux 2026-05-09 17:53:15 -07:00
test_coalesce_session_args.py fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume 2026-03-09 21:36:29 -07:00
test_codex_cli_model_picker.py test(codex-spark): add live-API regression and make picker test deterministic 2026-05-09 23:17:25 -07:00
test_codex_models.py test(codex-spark): add live-API regression and make picker test deterministic 2026-05-09 23:17:25 -07:00
test_commands.py feat: add Telegram DM topic-mode sessions 2026-05-04 12:07:17 -07:00
test_completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
test_config.py fix(cli): prevent .env sanitizer from splitting GLM_API_KEY by LM_API_KEY suffix 2026-04-28 22:22:45 -07:00
test_config_drift.py feat(delegate): orchestrator role and configurable spawn depth (default flat) 2026-04-21 14:23:45 -07:00
test_config_env_expansion.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_config_env_refs.py fix(config): preserve env refs when save_config rewrites config (#11892) 2026-04-17 19:03:26 -07:00
test_config_validation.py fix(config): accept fallback_model list (chain) in validator + save 2026-04-28 01:40:25 -07:00
test_container_aware_cli.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_copilot_auth.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_copilot_catalog_oauth_fallback.py fix(copilot): require successful exchange when walking credential_pool catalog tokens 2026-04-28 01:18:09 -07:00
test_copilot_context.py fix(copilot): wire live /models max_prompt_tokens into context-window resolver 2026-04-24 05:09:08 -07:00
test_copilot_in_model_list.py fix(model): repair Discord Copilot /model flow 2026-04-24 03:33:29 -07:00
test_copilot_token_exchange.py fix(copilot): exchange raw GitHub token for Copilot API JWT 2026-04-24 05:09:08 -07:00
test_cron.py feat(skills): consolidate find-nearby into maps as a single location skill 2026-04-19 05:19:22 -07:00
test_curator_archive_prune.py feat(curator): add archive and prune subcommands (#20200) 2026-05-05 05:15:54 -07:00
test_curator_recent_run_notice.py feat(curator): show rename map in user-visible summary (#22910) 2026-05-09 18:43:40 -07:00
test_curator_run.py fix(curator): make manual runs synchronous 2026-05-07 05:27:47 -07:00
test_curator_status.py fix(curator): make manual runs synchronous 2026-05-07 05:27:47 -07:00
test_custom_provider_context_length.py fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844) 2026-04-25 18:47:53 -07:00
test_custom_provider_model_switch.py fix(cli): omit empty api_mode when probing custom models 2026-05-04 02:46:41 -07:00
test_dashboard_browser_safe_imports.py Merge upstream/main and address Copilot review feedback 2026-04-30 06:43:22 -04:00
test_dashboard_lifecycle_flags.py feat(dashboard): add --stop and --status flags (#17840) 2026-04-30 02:30:20 -07:00
test_dashboard_profiles_nav_label.py fix(dashboard): keep profiles list resilient 2026-04-29 01:39:52 -04:00
test_debug.py feat(security): enable secret redaction by default (#17691, #20785) (#21193) 2026-05-07 05:10:33 -07:00
test_deprecated_cwd_warning.py fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029) 2026-04-16 06:48:33 -07:00
test_destructive_slash_confirm_gate.py feat: confirm prompt for destructive slash commands (#4069) (#22687) 2026-05-09 11:04:46 -07:00
test_detect_api_mode_for_url.py fix: restrict provider URL detection to exact hostname matches 2026-04-20 22:14:29 -07:00
test_determine_api_mode_hostname.py fix: extend hostname-match provider detection across remaining call sites 2026-04-20 22:14:29 -07:00
test_dingtalk_auth.py test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure 2026-04-17 05:08:07 -07:00
test_discord_skill_clamp_warning.py test: add tests for cmd_key preservation through name clamping 2026-05-03 03:25:45 -07:00
test_doctor.py feat: add termux doctor fallback guidance for blocked extras 2026-05-07 13:04:08 -07:00
test_doctor_command_install.py feat(doctor): add Command Installation check for hermes bin symlink 2026-04-14 23:13:11 -07:00
test_doctor_dedicated_provider_skip.py fix(doctor): skip pluggable provider profiles when a dedicated check exists (#22346) 2026-05-09 13:36:33 -07:00
test_env_loader.py clarify placeholder telegram credential in tests 2026-05-04 15:31:15 -04:00
test_env_sanitize_on_load.py clarify placeholder telegram credential in tests 2026-05-04 15:31:15 -04:00
test_fallback_cmd.py feat(cli): add 'hermes fallback' command to manage fallback providers (#16052) 2026-04-26 06:19:04 -07:00
test_gateway.py test(gateway): stub /proc unavailability in find_gateway_pids fallback test 2026-05-09 17:54:17 -07:00
test_gateway_linger.py fix(termux): disable gateway service flows on android 2026-04-09 16:24:53 -07:00
test_gateway_proc_fallback.py fix(gateway): detect gateway process via /proc in Docker without procps 2026-05-09 17:54:17 -07:00
test_gateway_runtime_health.py fix(gateway): harden Telegram polling conflict handling 2026-03-14 12:11:23 -07:00
test_gateway_service.py fix(test_gateway): stop run_gateway() tests from rewriting the dev's installed systemd unit (#22900) 2026-05-09 17:54:09 -07:00
test_gateway_wsl.py feat(gateway): WSL-aware gateway with smart systemd detection (#7510) 2026-04-10 21:15:47 -07:00
test_gemini_free_tier_setup_block.py feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100) 2026-04-24 04:46:17 -07:00
test_gemini_provider.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_gmi_provider.py refactor(gmi): move User-Agent to profile.default_headers 2026-05-08 03:22:11 -07:00
test_goals.py feat(goals): /goal checklist + /subgoal user controls (#23456) 2026-05-10 16:56:51 -07:00
test_hooks_cli.py feat: shell hooks — wire shell scripts as Hermes hook callbacks 2026-04-20 20:53:51 -07:00
test_ignore_user_config_flags.py refactor(cli): derive relaunch flag table from argparse introspection 2026-04-29 20:33:29 -07:00
test_image_gen_picker.py fix(image-gen): persist plugin provider on reconfigure 2026-04-23 01:56:09 -07:00
test_kanban_boards.py fix(kanban): ignore stale current board pointers 2026-05-05 04:34:45 -07:00
test_kanban_cli.py fix(kanban): /kanban slash command emits argparse garbage instead of help 2026-05-09 22:49:29 -07:00
test_kanban_core_functionality.py fix(tools): clarify kanban_complete phantom-card retry guidance 2026-05-10 16:14:43 -07:00
test_kanban_db.py fix(kanban): extend stale claim instead of killing live worker 2026-05-10 15:23:04 -07:00
test_kanban_diagnostics.py fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410) 2026-05-05 13:55:37 -07:00
test_kanban_notify.py test(kanban): cover redeliver-on-cycle + flip stale unsub-on-abnormal-event tests 2026-05-10 14:27:59 -07:00
test_kanban_specify.py feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435) 2026-05-07 13:04:41 -07:00
test_kanban_specify_db.py feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435) 2026-05-07 13:04:41 -07:00
test_launcher.py fix: use argparse entrypoint in top-level launcher (#3874) 2026-03-29 21:54:36 -07:00
test_list_picker_providers.py fix(gateway): preserve model picker current context 2026-05-06 03:50:59 -07:00
test_logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
test_managed_installs.py chore: prepare Hermes for Homebrew packaging (#4099) 2026-03-30 17:34:43 -07:00
test_mcp_add_command_dest.py fix(mcp): give 'mcp add --command' a distinct argparse dest 2026-05-07 05:17:03 -07:00
test_mcp_config.py fix(mcp): give 'mcp add --command' a distinct argparse dest 2026-05-07 05:17:03 -07:00
test_mcp_reload_confirm_gate.py feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation 2026-04-29 21:56:47 -07:00
test_mcp_tools_config.py feat: interactive MCP tool configuration in hermes tools (#1694) 2026-03-17 03:48:44 -07:00
test_memory_reset.py fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029) 2026-04-16 06:48:33 -07:00
test_model_catalog.py feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033) 2026-04-26 05:46:43 -07:00
test_model_normalize.py fix(model-normalize): pass DeepSeek V-series IDs through instead of folding to deepseek-chat 2026-04-24 05:24:54 -07:00
test_model_picker_viewport.py refactor(cli): align model picker viewport with PR #11260 vocabulary 2026-04-17 06:33:21 -07:00
test_model_provider_persistence.py test: remove 50 stale/broken tests to unblock CI (#22098) 2026-05-08 14:55:40 -07:00
test_model_switch_context_display.py fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844) 2026-04-25 18:47:53 -07:00
test_model_switch_copilot_api_mode.py fix: recompute Copilot api_mode after model switch 2026-04-16 01:16:14 -07:00
test_model_switch_custom_providers.py fix(model_switch): live model discovery for custom_providers in /model picker 2026-05-07 05:21:26 -07:00
test_model_switch_opencode_anthropic.py fix(opencode): derive api_mode from target model, not stale config default (#15106) 2026-04-24 04:58:46 -07:00
test_model_switch_variant_tags.py fix(models): preserve OpenRouter variant tags (:free, :extended, :fast) during model switch (#6383) 2026-04-08 19:58:16 -07:00
test_model_validation.py test: remove 50 stale/broken tests to unblock CI (#22098) 2026-05-08 14:55:40 -07:00
test_models.py fix(tui): resolve startup model aliases statically 2026-04-25 14:13:02 -05:00
test_models_dev_preferred_merge.py feat(/model): merge models.dev entries for lesser-loved providers (#14221) 2026-04-22 17:33:42 -07:00
test_non_ascii_credential.py fix(env_loader): warn when non-ASCII stripped from credential env vars (#13300) 2026-04-20 22:14:03 -07:00
test_nous_hermes_non_agentic.py fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models 2026-04-13 04:33:52 -07:00
test_nous_subscription.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
test_ollama_cloud_auth.py fix(opencode): derive api_mode from target model, not stale config default (#15106) 2026-04-24 04:58:46 -07:00
test_ollama_cloud_provider.py fix(models): strip :cloud/-cloud suffix from models.dev Ollama Cloud IDs 2026-05-04 12:38:15 -07:00
test_openai_codex_model_validation_fallback.py fix(codex-spark): defensive 128k entry in DEFAULT_CONTEXT_LENGTHS + clarify validation test docstring 2026-05-09 23:17:25 -07:00
test_opencode_go_flat_namespace.py fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802) 2026-05-06 09:08:33 -07:00
test_opencode_go_in_model_list.py feat(/model): merge models.dev entries for lesser-loved providers (#14221) 2026-04-22 17:33:42 -07:00
test_opencode_go_validation_fallback.py fix(/model): accept provider switches when /models is unreachable 2026-04-21 05:19:43 -07:00
test_overlay_slug_resolution.py fix(model_picker): detect mapped-provider auth-store credentials 2026-04-24 05:20:05 -07:00
test_path_completion.py feat(cli): add file path autocomplete in the input prompt (#1545) 2026-03-16 06:07:45 -07:00
test_pin_kanban_board_env.py test(kanban): isolate HERMES_KANBAN_BOARD writes in pin-env tests 2026-05-05 04:37:47 -07:00
test_placeholder_usage.py fix: cover remaining config placeholder help text 2026-03-14 10:35:14 -07:00
test_plugin_cli_registration.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_plugin_scanner_recursion.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
test_plugins.py feat(plugins): HERMES_PLUGINS_DEBUG=1 surfaces plugin discovery logs (#22684) 2026-05-09 11:07:12 -07:00
test_plugins_cmd.py fix(plugins): resolve Git binary for installs under minimal PATH 2026-05-09 11:10:04 -07:00
test_post_setup_gating.py fix(tools): install cua-driver when Computer Use is enabled via 'hermes tools' (#22765) 2026-05-09 13:02:25 -07:00
test_profile_distribution.py feat(profile): shareable profile distributions via git (#20831) 2026-05-08 10:04:32 -07:00
test_profile_export_credentials.py fix: also exclude .env from default profile exports 2026-04-01 11:20:33 -07:00
test_profiles.py fix(profiles): exclude infrastructure artifacts when cloning with --clone-all 2026-05-09 04:10:35 -07:00
test_prompt_api_key.py fix(setup): offer Keep/Replace/Clear when API key already exists 2026-05-05 04:08:11 -07:00
test_provider_config_validation.py fix(config): add request_timeout_seconds and stale_timeout_seconds to provider _KNOWN_KEYS 2026-04-28 01:28:25 -07:00
test_pty_bridge.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_reasoning_effort_menu.py fix: normalize reasoning effort ordering in UI 2026-04-09 14:20:16 -07:00
test_redact_config_bridge.py feat(security): enable secret redaction by default (#17691, #20785) (#21193) 2026-05-07 05:10:33 -07:00
test_regression_16767.py test(cli): regression coverage for user-provider routing fix (#16767) 2026-04-28 01:47:20 -07:00
test_relaunch.py fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch 2026-05-08 14:27:40 -07:00
test_resolve_last_session.py fix(cli): tighten MRU lookup and session DB cleanup 2026-04-27 08:52:12 -07:00
test_runtime_provider_resolution.py fix(fallback): let custom_providers shadow built-in aliases 2026-04-30 20:18:44 -07:00
test_session_browse.py fix(sessions): /save lands under $HERMES_HOME, widen browse+TUI picker, force-refresh ollama-cloud on setup (#16296) 2026-04-26 18:49:48 -07:00
test_session_handoff.py feat(session): make /handoff actually transfer the session live 2026-05-10 13:06:25 -07:00
test_sessions_delete.py test(sessions): wire sessions_dir through auto-prune + file-cleanup regression tests 2026-04-26 18:31:07 -07:00
test_set_config_value.py fix(config): preserve YAML lists in hermes config set (#17876) 2026-04-30 04:32:17 -07:00
test_setup.py fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack() 2026-05-04 01:37:18 -07:00
test_setup_agent_settings.py fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761) 2026-05-02 02:08:06 -07:00
test_setup_hermes_script.py fix(termux): make setup-hermes use android path 2026-04-09 16:24:53 -07:00
test_setup_irc.py feat(plugins): bundled platform plugins auto-load by default 2026-04-29 21:56:51 -07:00
test_setup_matrix_e2ee.py docs(matrix): update all references from matrix-nio to mautrix 2026-04-10 21:15:59 -07:00
test_setup_model_provider.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_setup_noninteractive.py feat(setup): auto-reconfigure on existing installs (#15879) 2026-04-25 22:02:02 -07:00
test_setup_ollama_cloud_force_refresh.py fix(sessions): /save lands under $HERMES_HOME, widen browse+TUI picker, force-refresh ollama-cloud on setup (#16296) 2026-04-26 18:49:48 -07:00
test_setup_openclaw_migration.py feat(plugins): bundled platform plugins auto-load by default 2026-04-29 21:56:51 -07:00
test_setup_prompt_menus.py fix(cli): sanitize bracketed paste markers during setup 2026-05-05 06:12:42 -07:00
test_setup_reconfigure.py feat(setup): auto-reconfigure on existing installs (#15879) 2026-04-25 22:02:02 -07:00
test_skills_config.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_skills_hub.py feat(skills): install skills from a direct HTTP(S) URL (#16323) 2026-04-26 20:57:10 -07:00
test_skills_install_flags.py fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647) 2026-03-17 01:59:07 -07:00
test_skills_skip_confirm.py fix(skills): cache-aware /skills install and uninstall in TUI (#3586) 2026-03-28 14:32:23 -07:00
test_skills_subparser.py fix(cli): resolve duplicate 'skills' subparser crash on Python 3.11+ 2026-03-11 00:50:39 -07:00
test_skin_engine.py fix(tui): restore macOS copy behavior and theme polish (#17131) 2026-04-28 18:47:14 -05:00
test_slack_cli.py fix(slack): enable writable app home DMs in manifest 2026-05-08 17:01:12 -07:00
test_spotify_auth.py fix(auth): keep Spotify logout from resetting model config 2026-05-07 05:53:14 -07:00
test_startup_plugin_gating.py perf(cli): skip eager plugin discovery on known built-in subcommands (#22120) 2026-05-08 16:07:23 -07:00
test_status.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_status_model_provider.py feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
test_subparser_routing_fallback.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_subprocess_timeouts.py fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009) 2026-03-30 11:17:15 -07:00
test_suppress_eio_on_interrupt.py fix(cli): guard logger.debug in signal handler (#13710 regression) (#20673) 2026-05-06 03:55:47 -07:00
test_teams_pipeline_plugin_cli.py feat(teams-pipeline): add plugin runtime and operator cli 2026-05-08 11:18:14 -07:00
test_tencent_tokenhub_provider.py fix(model-metadata): align hy3-preview static fallback + delete change-detector test (#22805) 2026-05-09 13:37:19 -07:00
test_terminal_menu_fallbacks.py Harden setup provider flows 2026-04-10 02:57:39 -07:00
test_timeouts.py fix(config): add stale timeout settings 2026-04-20 00:52:50 -07:00
test_tips.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_tool_token_estimation.py fix(tests): resolve 10 CI failures across hooks, tiktoken, plugins (#3848) 2026-03-29 20:05:59 -07:00
test_tools_config.py fix(cli): expand composite toolset when mixed with configurables in platform_toolsets 2026-05-09 11:08:05 -07:00
test_tools_disable_enable.py fix: MCP toolset resolution for runtime and config (#3252) 2026-03-26 13:39:41 -07:00
test_tui_npm_install.py fix(tui): tolerate npm's peer-flag drop in lockfile comparison 2026-05-04 14:13:38 +10:00
test_tui_resume_flow.py fix: make session search initialize session db 2026-05-09 14:36:58 -07:00
test_update_autostash.py fix(update): use termux-all uv fallback path on Termux 2026-05-09 17:53:15 -07:00
test_update_check.py test: remove 8 flaky tests that fail under parallel xdist scheduling (#12784) 2026-04-19 19:38:02 -07:00
test_update_config_clears_custom_fields.py fix(anthropic): complete third-party Anthropic-compatible provider support (#12846) 2026-04-19 22:43:09 -07:00
test_update_gateway_restart.py fix(update): bypass systemd RestartSec after graceful drain (#22101) 2026-05-08 16:11:07 -07:00
test_update_hangup_protection.py fix(update): survive mid-update terminal disconnect (#11960) 2026-04-17 21:29:24 -07:00
test_update_stale_dashboard.py fix(tests): make test_update_stale_dashboard immune to hermes_cli.main reload (#17881) 2026-04-30 02:46:56 -07:00
test_update_yes_flag.py test: remove 50 stale/broken tests to unblock CI (#22098) 2026-05-08 14:55:40 -07:00
test_user_providers_model_switch.py test(model_switch): cover private user_providers override 2026-04-30 19:44:26 -07:00
test_voice_wrapper.py fix(tui): restore voice push-to-talk parity (#20897) 2026-05-06 15:49:59 -07:00
test_web_server.py test(security): broaden plugin API auth coverage + correct stale docstring 2026-05-10 07:04:18 -07:00
test_web_server_host_header.py fix(web_server,whatsapp-bridge): validate Host header against bound interface (#13530) 2026-04-21 06:26:35 -07:00
test_web_ui_build.py fix(cli): check hermes_cli/web_dist/ not web/dist/ for build staleness 2026-04-26 18:43:57 -07:00
test_webhook_cli.py feat(webhook): hermes webhook CLI + skill for event-driven subscriptions (#3578) 2026-03-28 14:33:35 -07:00
test_xiaomi_provider.py feat(providers): add tencent-tokenhub provider support 2026-04-28 03:45:52 -07:00