hermes-agent/tests
Teknium 3a9bc9d88a
fix(model picker): unify /model and hermes model lists, add disk cache (#33867)
* fix(model picker): unify /model and `hermes model` model lists, add disk cache

The /model slash picker and `hermes model` were drifting apart. /model
read the raw static `OPENROUTER_MODELS` list (31 entries, including 5
that fail at runtime — no tool-call support or absent from live catalog),
while `hermes model` ran the same list through the live OpenRouter
/v1/models tool-support filter and showed 26 valid entries. Same problem
existed for every other authed provider: /model used curated static
lists, `hermes model` used live /v1/models.

Unifies both surfaces on `provider_model_ids()` and adds a generic
disk-cached wrapper so the picker stays snappy.

Changes
- hermes_cli/models.py: new `cached_provider_model_ids()` —
  ~/.hermes/provider_models_cache.json, 1h TTL, per-provider entries
  keyed by credential fingerprint (env vars + OAuth file mtimes).
  Stale-data-beats-no-data on transient failures. Pair with
  `clear_provider_models_cache(provider=None)`.
- hermes_cli/models.py: `provider_model_ids("nous")` now falls back
  to the docs-hosted manifest (not the in-repo snapshot) when the live
  Portal /models call fails — preserves the model_catalog regression
  guarantee while still going through the unified pathway.
- hermes_cli/model_switch.py: `list_authenticated_providers` routes
  sections 1, 2, and 2b through `cached_provider_model_ids(slug)` with
  curated fallback when the live fetcher comes up empty.
- hermes_cli/model_switch.py: `parse_model_flags` extended to a
  4-tuple, parses `--refresh`.
- cli.py / gateway/run.py / tui_gateway/server.py: updated unpacking;
  CLI + gateway wire `--refresh` to `clear_provider_models_cache()`.
- hermes_cli/main.py: `hermes model --refresh` argparse flag.
- hermes_cli/commands.py: `/model` args_hint advertises `--refresh`.
- tests/hermes_cli/test_inventory.py: refresh stale comment.

Live PTY parity verification
- /model → OpenRouter row: `(26 models)` (was 31, with broken entries)
- `hermes model` → OpenRouter: 26 models (unchanged)
- The 5 dropped entries: `pareto-code` (no tool-call support),
  `gemini-3-pro-image-preview` (no tool-call support),
  `elephant-alpha`, `hy3-preview:free`, `ring-2.6-1t:free` (gone
  from OpenRouter's live catalog).

Live PTY timing
- First /model open, empty cache: 4624 ms (full network round trip
  across every authed provider)
- Second /model open, warm cache: 51 ms (90× faster)
- `/model --refresh` clears the disk cache and re-fetches.

Cache schema (~/.hermes/provider_models_cache.json, ~3 KB):
  { "anthropic": {"fp": "<sha256:16>", "at": 1748..., "models": [...]},
    ... }

Targeted tests: tests/hermes_cli/ + gateway model tests + tui_gateway —
5855/5855 pass.

* fix(model picker): use blake2b for cache fingerprint to silence CodeQL

py/weak-sensitive-data-hashing flagged the sha256 call in
_credential_fingerprint() as a high-severity alert because the input
includes env var values whose names contain *_API_KEY / *_TOKEN.

The hash is used solely as a cache-bust identity — never reversed, never
stored, collisions are harmless (worst case: cache miss → live re-fetch).
blake2b serves the same purpose and isn't flagged by this rule.

Functional behavior identical: 16-hex-char digest, cache hit/miss logic
unchanged. Live re-verified — 26 OpenRouter models, warm-cache 78ms.
2026-05-28 11:33:16 -07:00
..
acp test(acp): drop flaky runtime_calls[-1] tail-position assertion 2026-05-24 23:23:12 -07:00
acp_adapter feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
agent fix(redact): pass web URLs through unchanged (#34029) 2026-05-28 11:32:39 -07:00
cli test(auth): update entitlement CI expectations 2026-05-28 00:19:31 -07:00
cron test(ci): harden two flaky tests against CI noise (#33675) 2026-05-27 23:15:41 -07:00
docker fix(docker): bake build-time git SHA into the image 2026-05-28 15:14:05 +10:00
e2e refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) 2026-05-22 14:21:41 -07:00
fakes
gateway fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022) 2026-05-28 11:32:36 -07:00
hermes_cli fix(model picker): unify /model and hermes model lists, add disk cache (#33867) 2026-05-28 11:33:16 -07:00
hermes_state feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
honcho_plugin fix(honcho): align peer-card read and write paths 2026-05-27 10:49:33 -07:00
integration chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
openviking_plugin
plugins chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
providers feat(openrouter): pass session_id in extra_body for sticky routing 2026-05-28 08:52:19 -07:00
run_agent fix(agent): fallback immediately on provider content-policy blocks (#33883) 2026-05-28 07:28:24 -07:00
scripts
skills fix(skills): add timeout to Google OAuth urlopen calls 2026-05-19 00:11:44 -07:00
stress docs: align kanban readiness docs and smoke tests 2026-05-18 21:07:03 -07:00
tools fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022) 2026-05-28 11:32:36 -07:00
tui_gateway chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
website
__init__.py
conftest.py test(dashboard-auth): strip HERMES_DASHBOARD_OAUTH_* env vars in hermetic fixture 2026-05-27 02:12:27 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_bitwarden_secrets.py perf(cli): cut hermes startup 63% — flip head-to-head vs codex (#31968) 2026-05-25 03:06:39 -07:00
test_cli_file_drop.py
test_cli_manual_compress.py fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) 2026-05-18 21:43:59 -07:00
test_cli_skin_integration.py
test_ctx_halving_fix.py
test_docker_home_override_scripts.py fix(docker): align HOME for dashboard and s6 gateway services (#33481) 2026-05-28 13:42:27 +10:00
test_empty_model_fallback.py
test_env_loader_secret_sources.py fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271) 2026-05-25 15:18:55 -07:00
test_evidence_store.py
test_gateway_streaming_nested_config.py
test_get_tool_definitions_cache_isolation.py
test_hermes_bootstrap.py
test_hermes_constants.py fix(security): guard os.chmod(parent) against / and top-level dirs 2026-05-20 22:56:55 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py fix(tests): catch up 25 stale tests after recent merges (#28626) 2026-05-19 01:28:32 -07:00
test_hermes_state.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_hermes_state_wal_fallback.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_honcho_client_config.py
test_honcho_session_context.py fix(honcho): align user context peer perspective 2026-05-27 10:49:33 -07:00
test_install_sh_browser_install.py
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py test(install): harden uv-python-path regression test against future drift 2026-05-27 13:55:51 -07:00
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py
test_install_sh_termux_network_prereqs.py
test_ipv4_preference.py
test_lazy_session_regressions.py
test_lint_config.py
test_live_system_guard_self_test.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py fix(minimax-oauth): refresh short-lived access tokens per request (#30619) 2026-05-22 15:16:15 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py
test_model_tools_async_bridge.py
test_ollama_num_ctx.py
test_package_json_lazy_deps.py fix(update): make Camofox lazy-installed instead of eager (#27055) 2026-05-16 12:15:45 -07:00
test_packaging_metadata.py
test_plugin_skills.py
test_process_loop_event_loop_warning.py
test_project_metadata.py remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
test_retry_utils.py
test_run_tests_parallel.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_sanitize_tool_error.py security: sanitize tool error strings before injecting into model context (#26823) 2026-05-16 00:57:39 -07:00
test_sql_injection.py
test_subprocess_home_isolation.py fix: avoid process-wide cron profile home mutation 2026-05-18 17:39:50 +00:00
test_termux_all_extra_compat.py
test_timezone.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py
test_tui_gateway_server.py feat: add TUI session orchestrator 2026-05-26 20:51:59 -07:00
test_utils_truthy_values.py
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py
test_yuanbao_proto.py