hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

helix4u ba9964ff0d fix(custom): pass custom provider extra body	2026-05-21 07:48:53 -07:00

Allow custom OpenAI-compatible providers declared under `custom_providers:` to set provider-specific `extra_body` fields and have Hermes merge them into chat-completions requests when the matching custom endpoint is active.

This is a manual per-provider override rather than a model-name heuristic. OpenAI-compatible Gemma thinking support is real, but the on-wire payload shape is backend-specific: some servers want top-level `enable_thinking`, while vLLM Gemma and NIM-style endpoints expect `chat_template_kwargs`. A per-provider override is safer than picking one assumed payload.

Example config:

```yaml
custom_providers:
  - name: gemma-local
    base_url: http://localhost:8080/v1
    model: google/gemma-4-31b-it
    extra_body:
      enable_thinking: true
      reasoning_effort: high
```

For vLLM Gemma or NIM-style endpoints, use the nested shape those servers expect:

```yaml
extra_body:
  chat_template_kwargs:
    enable_thinking: true
```

Changes:

- `hermes_cli/config.py`: preserve `extra_body` in normalized `custom_providers:` entries and allow it in the validated field set.
- `hermes_cli/runtime_provider.py`: propagate custom-provider `extra_body` as `request_overrides.extra_body` for named custom runtime resolution, including credential-pool paths.
- `agent/agent_init.py`: at agent init, locate the matching custom-provider entry by `base_url` (+ optional model) and merge its `extra_body` into `AIAgent.request_overrides`, with caller-provided overrides winning on conflicting top-level keys.
- `plugins/model-providers/custom/__init__.py`: keep existing CustomProfile behavior (Ollama `num_ctx`, `think=False` when reasoning disabled); user-configured `extra_body` flows through `request_overrides`.
- `website/docs/integrations/providers.md`: document the explicit `extra_body` override and the vLLM/Gemma `chat_template_kwargs` variant.
- Tests cover config normalization, runtime propagation, model matching, trailing-slash equivalence, fallback when no `model` field is set, and caller-override merging precedence.

Verified end-to-end against `CustomProfile` via `ChatCompletionsTransport`: configured `extra_body` reaches `kwargs.extra_body` on the wire request, and coexists with profile-generated entries (Ollama `num_ctx`, `think=False`) without clobber.

Salvaged from #29022 onto current `main`. Cosmetic typing edit in `plugins/model-providers/custom/__init__.py` and a stale-base docs revert in `providers.md` were dropped during cherry-pick.

Closes #29022
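
The matching and merge behavior described in this commit message can be illustrated with a minimal Python sketch. This is not the code in `agent/agent_init.py`: the helper names `find_custom_provider` and `merge_extra_body` are hypothetical, and the exact fallback rule (prefer an exact `model` match, otherwise use an entry with the same `base_url` and no `model` field) is an assumption inferred from the test list above. The merge is shallow, so caller-provided values win only on conflicting top-level keys.

```python
from typing import Any, Optional


def _normalize_url(url: str) -> str:
    # Treat "http://host/v1" and "http://host/v1/" as the same endpoint.
    return url.strip().rstrip("/")


def find_custom_provider(
    providers: list[dict[str, Any]],
    base_url: str,
    model: Optional[str] = None,
) -> Optional[dict[str, Any]]:
    """Hypothetical helper: pick the custom_providers entry for an endpoint.

    Assumed rule: an entry whose base_url and model both match wins;
    otherwise the first entry with a matching base_url and no `model`
    field is used as a fallback.
    """
    target = _normalize_url(base_url)
    fallback = None
    for entry in providers:
        if _normalize_url(entry.get("base_url", "")) != target:
            continue
        entry_model = entry.get("model")
        if model is not None and entry_model == model:
            return entry
        if entry_model is None and fallback is None:
            fallback = entry
    return fallback


def merge_extra_body(
    request_overrides: Optional[dict[str, Any]],
    provider_extra_body: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Hypothetical helper: shallow-merge provider extra_body into overrides.

    Provider values act as defaults; caller-provided extra_body keys
    replace them when the same top-level key appears in both.
    """
    merged = dict(request_overrides or {})
    caller_extra = merged.get("extra_body") or {}
    merged["extra_body"] = {**(provider_extra_body or {}), **caller_extra}
    return merged


# Example data mirroring the config in the commit message.
providers = [
    {
        "name": "gemma-local",
        "base_url": "http://localhost:8080/v1/",
        "model": "google/gemma-4-31b-it",
        "extra_body": {"enable_thinking": True, "reasoning_effort": "high"},
    }
]

entry = find_custom_provider(
    providers, "http://localhost:8080/v1", "google/gemma-4-31b-it"
)
overrides = merge_extra_body(
    {"extra_body": {"reasoning_effort": "low"}},
    entry["extra_body"] if entry else None,
)
print(overrides)
```

In this sketch the trailing slash on the configured `base_url` does not prevent a match, and the caller's `reasoning_effort` replaces the configured one while `enable_thinking` is carried through from the provider entry.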
..
__init__.py	feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path	2026-05-05 13:40:01 -07:00
test_e2e_wiring.py	feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path	2026-05-05 13:40:01 -07:00
test_plugin_discovery.py	fix(ci): stabilize shared test state after 21012	2026-05-14 14:28:14 -07:00
test_profile_wiring.py	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)	2026-05-12 20:49:20 -07:00
test_provider_profiles.py	feat(nvidia): add NIM billing origin header	2026-05-15 14:06:51 -07:00
test_transport_parity.py	fix(custom): pass custom provider extra body	2026-05-21 07:48:53 -07:00