hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-09 08:21:50 +00:00

Teknium 6f11ff53ad fix(anthropic): use model-native output limits instead of hardcoded 16K (#3426)	2026-03-27 13:02:52 -07:00

The Anthropic adapter defaulted to max_tokens=16384 when no explicit value was configured. This severely limits thinking-enabled models, where thinking tokens count toward max_tokens:

- Claude Opus 4.6 supports 128K output but was capped at 16K
- Claude Sonnet 4.6 supports 64K output but was capped at 16K

With extended thinking (adaptive or budget-based), the model could exhaust the entire 16K on reasoning, leaving zero tokens for the actual response. This caused two user-visible errors:

- 'Response truncated (finish_reason=length)': thinking consumed most tokens
- 'Response only contains think block with no content': thinking consumed all

Fix: add _ANTHROPIC_OUTPUT_LIMITS lookup table (sourced from Anthropic docs and Cline's model catalog) and use the model's actual output limit as the default. Unknown future models default to 128K (the current maximum).

Also adds context_length clamping: if the user configured a smaller context window (e.g. custom endpoint), max_tokens is clamped to context_length - 1 to avoid exceeding the window.

Closes #2706
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
anthropic_adapter.py	fix(anthropic): use model-native output limits instead of hardcoded 16K (#3426)	2026-03-27 13:02:52 -07:00
auxiliary_client.py	fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398)	2026-03-27 09:45:25 -07:00
context_compressor.py	chore: remove ~100 unused imports across 55 files (#3016)	2026-03-25 15:02:03 -07:00
context_references.py	fix(context): restrict @ references to safe workspace paths (#2601)	2026-03-23 06:40:05 -07:00
copilot_acp_client.py	fix(acp): preserve leading whitespace in streaming chunks	2026-03-20 09:38:13 -07:00
display.py	fix(gateway): silence background agent terminal output (#3297)	2026-03-26 17:40:31 -07:00
insights.py	chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119)	2026-03-25 19:47:58 -07:00
model_metadata.py	feat: add Hugging Face as a first-class inference provider (#3419)	2026-03-27 12:41:59 -07:00
models_dev.py	fix: 6 bugs in model metadata, reasoning detection, and delegate tool	2026-03-20 08:52:37 -07:00
prompt_builder.py	perf(ttft): cache skills prompt with shared skill_utils module (salvage #3366) (#3421)	2026-03-27 10:54:02 -07:00
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
redact.py	fix(redact): safely handle non-string inputs	2026-03-21 16:55:02 -07:00
skill_commands.py	fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897)	2026-03-18 03:17:37 -07:00
skill_utils.py	perf(ttft): cache skills prompt with shared skill_utils module (salvage #3366) (#3421)	2026-03-27 10:54:02 -07:00
smart_model_routing.py	feat: integrate GitHub Copilot providers across Hermes	2026-03-17 23:40:22 -07:00
title_generator.py	feat: auto-generate session titles after first exchange	2026-03-17 04:14:40 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	fix: status bar shows 26K instead of 260K for token counts with trailing zeros (#3024)	2026-03-25 12:45:58 -07:00
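
The default-limit lookup and context-window clamping described in the latest commit message (6f11ff53ad) can be sketched roughly as follows. This is a minimal illustration, not the code in anthropic_adapter.py: only the table name _ANTHROPIC_OUTPUT_LIMITS, the 128K/64K figures, the 128K fallback, and the "context_length - 1" rule come from the commit message. The model ID strings, the exact token counts (128_000 and 64_000), the substring matching, the function name resolve_max_tokens, and the choice to clamp an explicitly configured value as well are all assumptions made for the example.

```python
from typing import Optional

# Per-model output limits. Table name and the 128K/64K figures come from the
# commit message; the key strings and exact token counts are assumptions.
_ANTHROPIC_OUTPUT_LIMITS = {
    "claude-opus-4-6": 128_000,
    "claude-sonnet-4-6": 64_000,
}

# Fallback for unknown future models ("128K, the current maximum").
_DEFAULT_OUTPUT_LIMIT = 128_000


def resolve_max_tokens(
    model: str,
    configured: Optional[int] = None,
    context_length: Optional[int] = None,
) -> int:
    """Hypothetical helper: pick max_tokens for a request."""
    if configured is not None:
        # An explicit user setting takes priority over the table.
        max_tokens = configured
    else:
        # Otherwise use the model's native limit, or the fallback.
        max_tokens = _DEFAULT_OUTPUT_LIMIT
        for key, limit in _ANTHROPIC_OUTPUT_LIMITS.items():
            if key in model.lower():  # substring match is an assumption
                max_tokens = limit
                break

    # Clamp to a smaller configured context window (e.g. a custom endpoint)
    # so the output budget never reaches the window size.
    if context_length is not None and max_tokens >= context_length:
        max_tokens = max(context_length - 1, 1)

    return max_tokens
```

With no explicit setting, a Sonnet 4.6 model would get its 64K limit instead of the old 16384 default, and the same call with context_length=32768 would be clamped to 32767.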