mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-09 08:21:50 +00:00
fix(ollama): set default_max_tokens for custom/Ollama provider
The custom/Ollama provider profile had no default_max_tokens, so no max_tokens was sent on requests and Ollama fell back to its internal num_predict=128 — truncating responses after a few tokens with finish_reason='length' (#39281, e.g. gemma4). max_tokens resolution is ephemeral > user model.max_tokens > profile default, so this is only a floor used when the user hasn't set their own cap. Set it to 65536 (matching the qwen-oauth tier) rather than a conservative value, since users can always override per-model. Fixes #39281
This commit is contained in:
parent
ab0a6270c3
commit
09ec26c66a
1 changed files with 5 additions and 0 deletions
|
|
@ -63,6 +63,11 @@ custom = CustomProfile(
|
|||
),
|
||||
env_vars=(), # No fixed key — custom endpoint
|
||||
base_url="", # User-configured
|
||||
# Without this, no max_tokens is sent and Ollama falls back to its internal
|
||||
# num_predict=128, truncating responses after a few tokens (#39281). This is
|
||||
# only a floor used when the user hasn't set model.max_tokens — they can
|
||||
# override per-model — so we set it generously rather than lowballing it.
|
||||
default_max_tokens=65536,
|
||||
)
|
||||
|
||||
register_provider(custom)
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue