mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-03 02:11:48 +00:00
When a user sets `model.context_length` in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's `num_ctx` parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config, causing OOM on smaller GPUs like the P100 (16GB).

Root cause: two separate context values existed independently:

- context_compressor.context_length = config value (e.g. 65536) ✓
- _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config

Changes:

1. Cap Ollama num_ctx to the config context_length (run_agent.py)

   When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix: it prevents Ollama from allocating more VRAM than the user budgeted.

2. Pass config_context_length through all secondary call sites

   Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback:

   - cli.py: @-reference expansion and /model switch display
   - gateway/run.py: @-reference expansion and /model switch display
   - tui_gateway/server.py: @-reference expansion
   - hermes_cli/model_switch.py: resolve_display_context_length()

3. Normalize root-level context_length in config (hermes_cli/config.py)

   _normalize_root_model_keys() now migrates a root-level context_length into the model section, matching the existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` previously had it silently ignored.

4. Fix misleading comments (agent/model_metadata.py)

   DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated.

Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).