hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-31 19:16:29 +00:00

History

kshitij c14b3b5880 fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) (#12144 ) * fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) The prior override only matched the literal model name "kimi-for-coding", but Moonshot's coding endpoint is hit with real model IDs such as `kimi-k2.5`, `kimi-k2-turbo-preview`, `kimi-k2-thinking`, etc. Those requests bypassed the override and kept the caller's temperature, so Moonshot returns HTTP 400 "invalid temperature: only 0.6 is allowed for this model" (or 1.0 for thinking variants). Match the whole kimi-k2.* family: * kimi-k2-thinking / kimi-k2-thinking-turbo -> 1.0 (thinking mode) * all other kimi-k2.* -> 0.6 (non-thinking / instant mode) Also accept an optional vendor prefix (e.g. `moonshotai/kimi-k2.5`) so aggregator routings are covered. * refactor(kimi): whitelist-match kimi coding models instead of prefix Addresses review feedback on PR #12144. - Replace `startswith("kimi-k2")` with explicit frozensets sourced from Moonshot's kimi-for-coding model list. The prefix match would have also clamped `kimi-k2-instruct` / `kimi-k2-instruct-0905`, which are the separate non-coding K2 family with variable temperature (recommended 0.6 but not enforced — see huggingface.co/moonshotai/Kimi-K2-Instruct). - Confirmed via platform.kimi.ai docs that all five coding models (k2.5, k2-turbo-preview, k2-0905-preview, k2-thinking, k2-thinking-turbo) share the fixed-temperature lock, so the preview-model mapping is no longer an assumption. - Drop the fragile `"thinking" in bare` substring test for a set lookup. - Log a debug line on each override so operators can see when Hermes silently rewrites temperature. - Update class docstring. Extend the negative test to parametrize over kimi-k2-instruct, Kimi-K2-Instruct-0905, and a hypothetical future kimi-k2-experimental name — all must keep the caller's temperature.		2026-04-18 09:35:51 -07:00
..
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
anthropic_adapter.py	fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models	2026-04-16 12:00:56 -07:00
auxiliary_client.py	fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) (#12144 )	2026-04-18 09:35:51 -07:00
bedrock_adapter.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
context_compressor.py	fix(context_compressor): always keep last user message in tail to prevent active-task loss	2026-04-16 07:45:31 -07:00
context_engine.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
context_references.py	fix(agent): preserve quoted @file references with spaces	2026-04-10 13:05:01 -07:00
copilot_acp_client.py	fix: handle httpx.Timeout object in CopilotACPClient (#11058 )	2026-04-16 12:05:11 -07:00
credential_pool.py	fix(auth): restore --label for hermes auth add nous --type oauth	2026-04-17 19:13:40 -07:00
display.py	fix: remove context pressure warnings entirely (#11039 )	2026-04-16 06:44:23 -07:00
error_classifier.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
gemini_cloudcode_adapter.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833 )	2026-04-17 15:34:12 -07:00
google_code_assist.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833 )	2026-04-17 15:34:12 -07:00
google_oauth.py	feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270 )	2026-04-16 16:49:00 -07:00
insights.py	fix(insights): hide cache read/write and cost metrics from display (#11477 )	2026-04-17 01:02:06 -07:00
manual_compression_feedback.py	fix(gateway): make manual compression feedback truthful	2026-04-10 21:16:53 -07:00
memory_manager.py	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619 )	2026-04-15 19:12:19 -07:00
memory_provider.py	refactor(memory): drop on_session_reset — commit-only is enough	2026-04-15 11:28:45 -07:00
model_metadata.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833 )	2026-04-17 15:34:12 -07:00
models_dev.py	feat: add Ollama Cloud as built-in provider	2026-04-16 02:22:09 -07:00
nous_rate_guard.py	fix: Nous Portal rate limit guard — prevent retry amplification (#10568 )	2026-04-15 16:31:48 -07:00
prompt_builder.py	fix(skills): use frontmatter name in skills index instead of directory name	2026-04-17 18:56:37 -07:00
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
redact.py	fix(security): add JWT token and Discord mention redaction (#10547 )	2026-04-15 16:08:52 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
skill_commands.py	fix: use absolute skill_dir for external skills (#10313 ) (#10587 )	2026-04-15 17:22:55 -07:00
skill_utils.py	feat(plugins): namespaced skill registration for plugin skill bundles	2026-04-14 10:42:58 -07:00
smart_model_routing.py	fix: UTF-8 config encoding, pairing hint, credential_pool key, header normalization (#7174 )	2026-04-10 05:33:48 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
title_generator.py	fix: title_generator no longer logs as 'compression' task	2026-04-12 04:17:18 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00