hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-26 17:38:36 +00:00

Teknium d76fa7fc37 fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)

* fix: detect context length for custom model endpoints via fuzzy matching + config override

Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because:

1. Endpoint metadata lookup used exact match only; model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name match even though only one model was loaded
3. No user escape hatch to manually set context length

Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions
- Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init

Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases).

* fix: auto-detect local model name and context length for local servers

Cherry-picked from PR #2043 by sudoingX.

- Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init

Co-authored-by: sudo <sudoingx@users.noreply.github.com>

* fix: revert accidental summary_target_tokens change + add docs for context_length config

- Revert summary_target_tokens from 2500 back to 500 (accidental change during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>

2026-03-19 06:01:16 -07:00
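The resolution order described in the commit message above (config override first, then endpoint metadata with fuzzy matching, then the 2M fallback) can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the function names `match_endpoint_model` and `resolve_context_length`, the `available` mapping of model name to context length, and the treatment of a zero or None override as "not set" are all assumptions made for the example.

```python
from typing import Dict, Optional

# The 2M-token fallback named in the commit message.
DEFAULT_PROBE_CONTEXT = 2_000_000


def _normalize(name: str) -> str:
    """Lowercase a model name and drop a trailing .gguf suffix (hypothetical helper)."""
    name = name.lower()
    if name.endswith(".gguf"):
        name = name[: -len(".gguf")]
    return name


def match_endpoint_model(requested: str, available: Dict[str, int]) -> Optional[str]:
    """Return the key of `available` that matches `requested`, or None.

    `available` stands in for what an endpoint's /v1/models might report,
    mapped to a context length (assumed shape, for illustration only).
    """
    if not available:
        return None
    if requested in available:
        return requested  # exact match
    if len(available) == 1:
        # Single-model server: use the only loaded model regardless of name.
        return next(iter(available))
    want = _normalize(requested)
    for name in available:
        have = _normalize(name)
        # Multi-model server: substring match in both directions.
        if want and have and (want in have or have in want):
            return name
    return None


def resolve_context_length(
    requested: str,
    available: Dict[str, int],
    config_context_length: Optional[int] = None,
) -> int:
    """Apply the priority order: config override, endpoint metadata, fallback."""
    # Assumption: None and 0 both mean "no override configured".
    if config_context_length:
        return config_context_length
    match = match_endpoint_model(requested, available)
    if match is not None:
        return available[match]
    return DEFAULT_PROBE_CONTEXT
```

In this sketch the pair from the commit message, 'qwen3.5:9b' and 'Qwen3.5-9B-Q4_K_M.gguf', resolves only through the single-model rule, because neither normalized name is a substring of the other; on a multi-model server a user would fall back to setting model.context_length in config.yaml.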
__init__.py	feat: integrate GitHub Copilot providers across Hermes	2026-03-17 23:40:22 -07:00
auth.py	feat: proper Copilot auth with OAuth device code flow and token validation	2026-03-18 03:25:58 -07:00
banner.py	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)	2026-03-19 06:01:16 -07:00
callbacks.py	refactor(cli): implement approval locking mechanism to serialize concurrent requests	2026-03-13 23:59:18 -07:00
checklist.py	fix: skip hanging tests + add global test timeout	2026-03-12 01:23:28 -07:00
claw.py	fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)	2026-03-17 02:10:36 -07:00
clipboard.py	fix: clean up empty file after failed wl-paste clipboard extraction	2026-03-11 02:56:19 -07:00
codex_models.py	fix: add codex forward-compat model listing	2026-03-13 21:34:01 -07:00
colors.py	Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection"	2026-03-17 10:04:53 -07:00
commands.py	fix(gateway): replace bare text approval with /approve and /deny commands (#2002)	2026-03-18 16:58:20 -07:00
config.py	feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756)	2026-03-17 10:44:37 -07:00
copilot_auth.py	feat: proper Copilot auth with OAuth device code flow and token validation	2026-03-18 03:25:58 -07:00
cron.py	docs: clarify gateway service scopes (#1378)	2026-03-14 21:17:41 -07:00
curses_ui.py	refactor: extract shared curses checklist, fix skill discovery perf	2026-03-11 03:06:15 -07:00
default_soul.py	feat: seed a default global SOUL.md	2026-03-14 08:05:30 -07:00
doctor.py	feat: add Kilo Code (kilocode) as first-class inference provider (#1666)	2026-03-17 02:40:34 -07:00
env_loader.py	fix(config): reload .env over stale shell overrides	2026-03-15 06:46:28 -07:00
gateway.py	fix(gateway): detect script-style gateway processes for --replace	2026-03-18 03:12:59 -07:00
main.py	feat: proper Copilot auth with OAuth device code flow and token validation	2026-03-18 03:25:58 -07:00
models.py	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
pairing.py	Cleanup time!	2026-02-20 23:23:32 -08:00
plugins.py	feat: first-class plugin architecture (#1555)	2026-03-16 07:17:36 -07:00
runtime_provider.py	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)	2026-03-19 06:01:16 -07:00
setup.py	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
skills_config.py	fix: wire email platform into toolset mappings + add documentation	2026-03-11 06:34:32 -07:00
skills_hub.py	fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647)	2026-03-17 01:59:07 -07:00
skin_engine.py	Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection"	2026-03-17 10:04:53 -07:00
status.py	feat(web): add Tavily as web search/extract/crawl backend (#1731)	2026-03-17 04:28:03 -07:00
tools_config.py	feat(web): add Tavily as web search/extract/crawl backend (#1731)	2026-03-17 04:28:03 -07:00
uninstall.py	feat(gateway): scope systemd service name to HERMES_HOME	2026-03-16 04:42:46 -07:00