hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-28 18:19:28 +00:00

Teknium d76fa7fc37 fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)	2026-03-19 06:01:16 -07:00

* fix: detect context length for custom model endpoints via fuzzy matching + config override

Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because:

1. Endpoint metadata lookup used exact match only, so model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name match even though only one model was loaded
3. No user escape hatch to manually set context length

Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions
- Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init

Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases).

* fix: auto-detect local model name and context length for local servers

Cherry-picked from PR #2043 by sudoingX.

- Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init

Co-authored-by: sudo <sudoingx@users.noreply.github.com>

* fix: revert accidental summary_target_tokens change + add docs for context_length config

- Revert summary_target_tokens from 2500 back to 500 (accidental change during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
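
The lookup order described in the commit message (config override first, then endpoint metadata with fuzzy matching, then the 2M fallback) could look roughly like the sketch below. This is not the code from model_metadata.py. The names `match_endpoint_model`, `resolve_context_length`, and `DEFAULT_PROBE_CONTEXT` are hypothetical, the case-insensitive comparison is an assumption, and treating a zero or None override as "not set" is a guess at what the zero/None edge-case tests check.

```python
from typing import Optional

# The 2M-token fallback mentioned in the commit message (constant name is hypothetical).
DEFAULT_PROBE_CONTEXT = 2_000_000


def match_endpoint_model(requested: str, available: list[str]) -> Optional[str]:
    """Pick the model reported by the endpoint that corresponds to `requested`."""
    if not available:
        return None
    if requested in available:
        return requested  # exact match, the pre-fix behaviour
    if len(available) == 1:
        # Single-model server: use the only loaded model regardless of its name.
        return available[0]
    wanted = requested.lower()  # assumption: comparison ignores case
    for name in available:
        candidate = name.lower()
        # Multi-model server: substring match in both directions.
        if wanted in candidate or candidate in wanted:
            return name
    return None


def resolve_context_length(
    requested: str,
    endpoint_models: dict[str, int],
    config_context_length: Optional[int] = None,
) -> int:
    """Resolve a context length: config override, then endpoint metadata, then fallback."""
    if config_context_length:
        # Assumption: None and 0 both mean "no override configured".
        return config_context_length
    match = match_endpoint_model(requested, list(endpoint_models))
    if match is not None:
        return endpoint_models[match]
    return DEFAULT_PROBE_CONTEXT
```

With a single loaded model, `resolve_context_length("qwen3.5:9b", {"Qwen3.5-9B-Q4_K_M.gguf": 32768})` returns the endpoint's value even though neither name is a substring of the other; passing `config_context_length=8192` would take priority over it.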
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
anthropic_adapter.py	fix(anthropic): tool_choice 'none' still allowed tool calls	2026-03-17 04:02:49 -07:00
auxiliary_client.py	fix: respect config.yaml model.base_url for Anthropic provider (#1948) (#1998)	2026-03-18 16:51:24 -07:00
context_compressor.py	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)	2026-03-19 06:01:16 -07:00
copilot_acp_client.py	feat: integrate GitHub Copilot providers across Hermes	2026-03-17 23:40:22 -07:00
display.py	feat(tools): centralize tool emoji metadata in registry + skin integration	2026-03-15 20:21:21 -07:00
insights.py	feat: add route-aware pricing estimates (#1695)	2026-03-17 03:44:44 -07:00
model_metadata.py	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)	2026-03-19 06:01:16 -07:00
prompt_builder.py	feat: use SOUL.md as primary agent identity instead of hardcoded default (#1922)	2026-03-18 04:11:20 -07:00
prompt_caching.py	fix(cache_control) treat empty text like None to avoid anthropic api cache_control error	2026-03-13 18:08:46 -07:00
redact.py	feat: secure skill env setup on load (core #688)	2026-03-13 03:14:04 -07:00
skill_commands.py	fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897)	2026-03-18 03:17:37 -07:00
smart_model_routing.py	feat: integrate GitHub Copilot providers across Hermes	2026-03-17 23:40:22 -07:00
title_generator.py	feat: auto-generate session titles after first exchange	2026-03-17 04:14:40 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	feat: use endpoint metadata for custom model context and pricing (#1906)	2026-03-18 03:04:07 -07:00