mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
`fetch_models_dev()` is on the hot path of every `AIAgent.__init__`
(via `context_compressor → get_model_context_length`). The previous
policy was "always try network first, only fall back to disk if
network fails," so every fresh `hermes chat` / `hermes gateway` /
batch / cron process paid 250-500 ms re-fetching a 2 MB JSON registry
that was already on disk from earlier runs.
Add a stage 2 between in-mem and network: if
`models_dev_cache.json` exists and its mtime is younger than the
existing `_MODELS_DEV_CACHE_TTL` (1 hour, same TTL the in-mem cache
already uses), load from disk and skip the network call.
The in-mem TTL is anchored to the disk file's age, so a 50-min-old
cache stays in-memory for only 10 more minutes — no surprise
extension of staleness window.
Invariants preserved:
- `force_refresh=True` still always hits the network and only falls
back to disk on failure (`hermes config refresh` semantics).
- Missing disk cache → fall through to network (first-ever run).
- Stale disk cache (mtime > TTL) → fall through to network.
- Negative file age (clock skew) → fall through to network.
- Network failure → existing stage-4 stale-disk fallback unchanged.
Measured impact (3-run medians, 9950X3D, fresh process per run):
fetch_models_dev cold: 256 → 17 ms (-93%)
hermes chat -q wall: 4.00 → 3.73 s (-7% median)
3.99 → 3.60 s (-10% min)
The chat-end-to-end win is bounded below by API latency variance, but
the fetch_models_dev microbenchmark is the cleanest signal: 239 ms
shaved off every fresh-process agent construction.
Win compounds with the previous perf PRs:
#22681 google_chat lazy-load
#22766 doctor parallel + IMDS off
#22790 gateway.platforms PEP 562
Tests: all 30 `tests/agent/test_models_dev.py` pass (added 4 new ones
covering the new disk-cache-first path, force_refresh override, stale
disk fallback, and missing-disk-cache fall-through). Full `tests/agent/`
suite: 2560 passed, 0 failed.
|
||
|---|---|---|
| .. | ||
| acp | ||
| acp_adapter | ||
| agent | ||
| cli | ||
| cron | ||
| e2e | ||
| environments/benchmarks | ||
| fakes | ||
| gateway | ||
| hermes_cli | ||
| hermes_state | ||
| honcho_plugin | ||
| integration | ||
| openviking_plugin | ||
| plugins | ||
| providers | ||
| run_agent | ||
| skills | ||
| stress | ||
| tools | ||
| tui_gateway | ||
| website | ||
| __init__.py | ||
| conftest.py | ||
| run_interrupt_test.py | ||
| test_account_usage.py | ||
| test_atomic_replace_symlinks.py | ||
| test_base_url_hostname.py | ||
| test_batch_runner_checkpoint.py | ||
| test_cli_file_drop.py | ||
| test_cli_manual_compress.py | ||
| test_cli_skin_integration.py | ||
| test_ctx_halving_fix.py | ||
| test_empty_model_fallback.py | ||
| test_evidence_store.py | ||
| test_get_tool_definitions_cache_isolation.py | ||
| test_hermes_bootstrap.py | ||
| test_hermes_constants.py | ||
| test_hermes_home_profile_warning.py | ||
| test_hermes_logging.py | ||
| test_hermes_state.py | ||
| test_hermes_state_wal_fallback.py | ||
| test_honcho_client_config.py | ||
| test_install_sh_pythonpath_sanitization.py | ||
| test_install_sh_setup_wizard_tty_probe.py | ||
| test_install_sh_termux_network_prereqs.py | ||
| test_ipv4_preference.py | ||
| test_lazy_session_regressions.py | ||
| test_lint_config.py | ||
| test_mcp_serve.py | ||
| test_mini_swe_runner.py | ||
| test_minimax_model_validation.py | ||
| test_minimax_oauth.py | ||
| test_minisweagent_path.py | ||
| test_model_picker_scroll.py | ||
| test_model_tools.py | ||
| test_model_tools_async_bridge.py | ||
| test_ollama_num_ctx.py | ||
| test_packaging_metadata.py | ||
| test_plugin_skills.py | ||
| test_process_loop_event_loop_warning.py | ||
| test_project_metadata.py | ||
| test_retry_utils.py | ||
| test_sql_injection.py | ||
| test_subprocess_home_isolation.py | ||
| test_termux_all_extra_compat.py | ||
| test_timezone.py | ||
| test_toolset_distributions.py | ||
| test_toolsets.py | ||
| test_trajectory_compressor.py | ||
| test_trajectory_compressor_async.py | ||
| test_transform_llm_output_hook.py | ||
| test_transform_tool_result_hook.py | ||
| test_tui_gateway_server.py | ||
| test_utils_truthy_values.py | ||
| test_yuanbao_integration.py | ||
| test_yuanbao_markdown.py | ||
| test_yuanbao_pipeline.py | ||
| test_yuanbao_proto.py | ||