hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00


Rob Moen 0dd373ec43 fix(context): honor model.context_length for Ollama num_ctx and all display paths

When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config, causing OOM on smaller GPUs like the P100 (16GB).

Root cause: two separate context values existed independently:
- context_compressor.context_length = config value (e.g. 65536) ✓
- _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config

Changes:

1. Cap Ollama num_ctx to config context_length (run_agent.py)
   When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix: it prevents Ollama from allocating more VRAM than the user budgeted.

2. Pass config_context_length through all secondary call sites
   Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback:
   - cli.py: @-reference expansion and /model switch display
   - gateway/run.py: @-reference expansion and /model switch display
   - tui_gateway/server.py: @-reference expansion
   - hermes_cli/model_switch.py: resolve_display_context_length()

3. Normalize root-level context_length in config (hermes_cli/config.py)
   _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored.

4. Fix misleading comments (agent/model_metadata.py)
   DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated.

Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).
2026-04-30 04:31:23 -07:00
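
The capping rule (change 1) and the root-level key migration (change 3) described in the commit message above can be illustrated with a minimal sketch. This is not the repository's code: the function names `resolve_ollama_num_ctx` and `normalize_root_context_length`, their signatures, and the handling of edge cases (nothing auto-detected, a value already present under `model:`, a non-mapping `model` entry) are assumptions made for illustration only.

```python
from typing import Any, Dict, Optional


def resolve_ollama_num_ctx(
    detected_num_ctx: Optional[int],
    config_context_length: Optional[int],
    explicit_num_ctx: Optional[int] = None,
) -> Optional[int]:
    """Choose the num_ctx to request from Ollama (hypothetical helper)."""
    # An explicit ollama_num_ctx override always wins.
    if explicit_num_ctx is not None:
        return explicit_num_ctx
    # No configured budget: keep the value auto-detected from GGUF metadata.
    if config_context_length is None:
        return detected_num_ctx
    # Assumption: with nothing detected, fall back to the configured budget.
    if detected_num_ctx is None:
        return config_context_length
    # Core rule: never request more context than the user budgeted.
    return min(detected_num_ctx, config_context_length)


def normalize_root_context_length(config: Dict[str, Any]) -> Dict[str, Any]:
    """Move a root-level context_length under the model section (hypothetical)."""
    result = dict(config)  # shallow copy so the caller's dict is untouched
    model = result.get("model")
    # Assumption: leave the config alone if there is nothing to migrate or
    # if "model" is present but is not a mapping.
    if "context_length" not in result or not isinstance(model, (dict, type(None))):
        return result
    value = result.pop("context_length")
    model = dict(model) if model is not None else {}
    # Assumption: a value already set under model: takes precedence.
    model.setdefault("context_length", value)
    result["model"] = model
    return result


if __name__ == "__main__":
    # Values taken from the commit message: 256000 detected, 65536 configured.
    print(resolve_ollama_num_ctx(256000, 65536))
    print(normalize_root_context_length({"context_length": 65536}))
```

With the example values from the commit message, the first call yields the configured 65536 rather than the detected 256000, and the second call produces a config whose `model` section carries `context_length`.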
acp	fix(acp): drop dead message_id kwarg from replay chunks	2026-04-30 02:45:54 -07:00
agent	fix(transport): omit thinking_config for Gemma on the gemini provider (#17426)	2026-04-30 04:29:04 -07:00
cli	fix(context): honor model.context_length for Ollama num_ctx and all display paths	2026-04-30 04:31:23 -07:00
cron	fix(cron): surface agent run_conversation failure flags as job failure	2026-04-30 03:27:37 -07:00
e2e	fix(gateway): coerce plaintext "restart gateway" DMs to /restart	2026-04-28 01:40:28 -07:00
environments/benchmarks
fakes
gateway	fix(gateway): enforce auth check in busy-session path to prevent unauthorized injection (#17775)	2026-04-30 04:29:15 -07:00
hermes_cli	fix: handle gateway Ctrl+C shutdown cleanly	2026-04-30 03:29:57 -07:00
hermes_state	fix(resume): redirect --resume to the descendant that actually holds the messages	2026-04-24 03:04:42 -07:00
honcho_plugin	feat(honcho): explain why when honcho_profile returns an empty card	2026-04-27 12:37:33 -07:00
integration
openviking_plugin	fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints	2026-04-30 02:35:29 -07:00
plugins	fix: stabilize CI — TS widen, sys.modules restore, WS subscriber race (#17836)	2026-04-30 01:34:08 -07:00
run_agent	fix(aux): remove hardcoded Codex fallback model, drop Codex from auto chain (#17765)	2026-04-29 23:23:50 -07:00
skills	test(openclaw-migration): cover alias reverse-lookup for real OpenClaw schema	2026-04-28 04:58:13 -07:00
tools	feat(gateway/signal): add support for multiple images sending	2026-04-30 04:28:08 -07:00
tui_gateway	fix(tui): responsive /compress with live progress + CLI-parity feedback (#17661)	2026-04-29 18:01:18 -07:00
website	fix(website): auto-wrap ASCII-art code blocks in generated skill pages (#16497)	2026-04-27 03:38:39 -07:00
__init__.py
conftest.py	fix(ci): stabilize main test suite regressions (#17660)	2026-04-29 23:18:55 -07:00
run_interrupt_test.py
test_account_usage.py	feat(account-usage): add per-provider account limits module	2026-04-21 01:56:35 -07:00
test_atomic_replace_symlinks.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
test_base_url_hostname.py	security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522)	2026-04-21 06:06:16 -07:00
test_batch_runner_checkpoint.py	test: regression coverage for checkpoint dedup and inf/nan coercion	2026-04-24 14:32:21 -07:00
test_cli_file_drop.py	fix(tui): improve macOS paste and shortcut parity	2026-04-21 08:00:00 -07:00
test_cli_manual_compress.py	test(cli): regression test for manual /compress system_message	2026-04-28 05:21:49 -07:00
test_cli_skin_integration.py	fix(ci): stabilize main test suite regressions (#17660)	2026-04-29 23:18:55 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_hermes_constants.py
test_hermes_logging.py	fix(logging): attach gateway log after cli init	2026-04-26 19:01:26 -07:00
test_hermes_state.py	fix(state): repair FTS5 delete trigger and add v11 migration for tool-call indexing	2026-04-28 01:33:00 -07:00
test_honcho_client_config.py
test_install_sh_setup_wizard_tty_probe.py	fix(install): widen /dev/tty open-probe to sibling gates (#16746)	2026-04-28 06:45:55 -07:00
test_ipv4_preference.py
test_mcp_serve.py
test_mini_swe_runner.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_minimax_model_validation.py	fix(models): validate MiniMax models against static catalog (#12611, #12460, #12399, #12547)	2026-04-19 22:44:47 -07:00
test_minimax_oauth.py	test(cli): cover minimax-oauth resolution, refresh, menu wiring	2026-04-29 09:53:42 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py	fix(plugins): stop firing pre_tool_call hook twice per tool execution (#17611)	2026-04-29 12:43:39 -07:00
test_model_tools_async_bridge.py	fix(model_tools): cancel coroutine on timeout so worker thread exits + log full traceback	2026-04-29 05:00:40 -07:00
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py
test_project_metadata.py	build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)	2026-04-17 13:31:53 -07:00
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_timezone.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)	2026-04-17 14:21:22 -07:00
test_toolset_distributions.py
test_toolsets.py	feat(discord): split discord_server into discord + discord_admin tools	2026-04-25 04:50:14 -07:00
test_trajectory_compressor.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_trajectory_compressor_async.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_transform_tool_result_hook.py	test: stop testing mutable data — convert change-detectors to invariants (#13363)	2026-04-20 23:20:33 -07:00
test_tui_gateway_server.py	fix(ci): stabilize main test suite regressions (#17660)	2026-04-29 23:18:55 -07:00
test_utils_truthy_values.py
test_yuanbao_integration.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_markdown.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_pipeline.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_proto.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00