hermes-agent/hermes_cli
Teknium 88643a1ba9
feat: overhaul context length detection with models.dev and provider-aware resolution (#2158)
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.

Key changes:

- New agent/models_dev.py: Fetches and caches the models.dev registry
  (3800+ models across 100+ providers with per-provider context windows).
  In-memory cache (1hr TTL) + disk cache for cold starts.

- Rewritten get_model_context_length() resolution chain:
  0. Config override (model.context_length)
  1. Custom providers per-model context_length
  2. Persistent disk cache
  3. Endpoint /models (local servers)
  4. Anthropic /v1/models API (max_input_tokens, API-key only)
  5. OpenRouter live API (existing, unchanged)
  6. Nous suffix-match via OpenRouter (dot/dash normalization)
  7. models.dev registry lookup (provider-aware)
  8. Thin hardcoded defaults (broad family patterns)
  9. 128K fallback (was 2M)

- Provider-aware context: same model now correctly resolves to different
  context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
  128K on GitHub Copilot). Provider name flows through ContextCompressor.

- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
  models.dev replaces the per-model hardcoding.

- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
  to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.

- hermes model: prompts for context_length when configuring custom
  endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
  per-model config.

- custom_providers schema extended with optional models dict for
  per-model context_length (backward compatible).

- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
  OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
  normalization. Handles all 15 current Nous models.

- Anthropic direct: queries /v1/models for max_input_tokens. Only works
  with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
  to models.dev for OAuth users.

Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md

Co-authored-by: Test <test@test.com>
2026-03-20 06:04:33 -07:00
..
__init__.py feat: integrate GitHub Copilot providers across Hermes 2026-03-17 23:40:22 -07:00
auth.py fix: resolve MiniMax 401 auth error by defaulting to anthropic_messages (#2103) 2026-03-19 17:47:05 -07:00
banner.py fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051) 2026-03-19 06:01:16 -07:00
callbacks.py refactor(cli): implement approval locking mechanism to serialize concurrent requests 2026-03-13 23:59:18 -07:00
checklist.py fix: skip hanging tests + add global test timeout 2026-03-12 01:23:28 -07:00
claw.py fix(claw): warn when API keys are skipped during OpenClaw migration (#1580) 2026-03-17 02:10:36 -07:00
clipboard.py fix: clean up empty file after failed wl-paste clipboard extraction 2026-03-11 02:56:19 -07:00
codex_models.py fix: add codex forward-compat model listing 2026-03-13 21:34:01 -07:00
colors.py Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection" 2026-03-17 10:04:53 -07:00
commands.py fix(gateway): replace bare text approval with /approve and /deny commands (#2002) 2026-03-18 16:58:20 -07:00
config.py fix(honcho): read HONCHO_BASE_URL for local/self-hosted instances 2026-03-20 04:36:06 -07:00
copilot_auth.py feat: proper Copilot auth with OAuth device code flow and token validation 2026-03-18 03:25:58 -07:00
cron.py docs: clarify gateway service scopes (#1378) 2026-03-14 21:17:41 -07:00
curses_ui.py refactor: extract shared curses checklist, fix skill discovery perf 2026-03-11 03:06:15 -07:00
default_soul.py feat: seed a default global SOUL.md 2026-03-14 08:05:30 -07:00
doctor.py feat: add Kilo Code (kilocode) as first-class inference provider (#1666) 2026-03-17 02:40:34 -07:00
env_loader.py fix(config): reload .env over stale shell overrides 2026-03-15 06:46:28 -07:00
gateway.py fix(gateway): detect script-style gateway processes for --replace 2026-03-18 03:12:59 -07:00
main.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
models.py fix: /model command — bare provider names, custom endpoint display 2026-03-19 12:06:48 -07:00
pairing.py Cleanup time! 2026-02-20 23:23:32 -08:00
plugins.py feat: first-class plugin architecture (#1555) 2026-03-16 07:17:36 -07:00
runtime_provider.py fix(openai): route api.openai.com to Responses API for GPT-5.x 2026-03-20 05:09:41 -07:00
setup.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
skills_config.py fix: wire email platform into toolset mappings + add documentation 2026-03-11 06:34:32 -07:00
skills_hub.py fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647) 2026-03-17 01:59:07 -07:00
skin_engine.py Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection" 2026-03-17 10:04:53 -07:00
status.py feat(web): add Tavily as web search/extract/crawl backend (#1731) 2026-03-17 04:28:03 -07:00
tools_config.py feat(web): add Tavily as web search/extract/crawl backend (#1731) 2026-03-17 04:28:03 -07:00
uninstall.py feat(gateway): scope systemd service name to HERMES_HOME 2026-03-16 04:42:46 -07:00