hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-09 08:21:50 +00:00


Teknium ec46f5912e fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730)

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop (long-context-tier 429, 413 payload-too-large, and context-overflow) called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected, since it never enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True).

* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints

Two distinct failures hit users on the gemini provider with only Google AI Studio keys set.

1. Truncation loop: build_gemini_request() only set maxOutputTokens when max_tokens was non-None. Hermes passes None to mean "unlimited", but Gemini's native generateContent does NOT treat an absent maxOutputTokens as full budget; it applies a low internal default and stops early with finishReason=MAX_TOKENS, truncating tool calls. The agent then retries 3x and refuses the incomplete call. Now default to the published 65,535 ceiling (shared by all current Gemini text models) when max_tokens=None.

2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles profile extra_body (Nous portal 'tags', reasoning, provider prefs) and sends it via the OpenAI client to whatever base_url is resolved. When a profile that emits extra_body (e.g. Nous) is active but the endpoint is a native Gemini base_url (typical when only Google creds exist and a fallback/aux call lands on Gemini), Google rejects the unknown 'tags' field with a non-retryable 400. Strip all non-thinking_config extra_body keys when the resolved endpoint is native Gemini.

Verified E2E against real transport code: tags stripped on native Gemini, preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535 on None, explicit values respected.

2026-06-05 03:53:59 -07:00
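
The two Gemini fixes described in the commit message can be illustrated with a minimal sketch. This is not the code from chat_completions.py or build_gemini_request(). The helper names (generation_config, is_native_gemini, strip_extra_body), the constant name, and the host/path heuristic for telling a native Gemini endpoint from the /openai compat endpoint are all assumptions made for illustration. Only the 65535 default and the "keep only thinking_config" rule come from the commit message.

```python
from typing import Any, Optional
from urllib.parse import urlparse

# Ceiling quoted in the commit message; the constant name is hypothetical.
GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65535


def generation_config(max_tokens: Optional[int]) -> dict[str, Any]:
    """Always emit maxOutputTokens; None falls back to the ceiling."""
    if max_tokens is None:
        return {"maxOutputTokens": GEMINI_DEFAULT_MAX_OUTPUT_TOKENS}
    return {"maxOutputTokens": max_tokens}


def is_native_gemini(base_url: str) -> bool:
    """Assumed heuristic: Google's generativelanguage host, but not the
    OpenAI-compatible path ending in /openai."""
    parsed = urlparse(base_url)
    on_google_host = parsed.hostname == "generativelanguage.googleapis.com"
    is_compat_path = parsed.path.rstrip("/").endswith("/openai")
    return on_google_host and not is_compat_path


def strip_extra_body(extra_body: dict[str, Any], base_url: str) -> dict[str, Any]:
    """On native Gemini keep only thinking_config; elsewhere pass through."""
    if not is_native_gemini(base_url):
        return dict(extra_body)
    return {k: v for k, v in extra_body.items() if k == "thinking_config"}
```

With these helpers, a profile's 'tags' entry would be dropped for a native Gemini base_url and left intact for any other endpoint, and a None max_tokens would always produce an explicit maxOutputTokens in the request.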
..
__init__.py	feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path	2026-05-05 13:40:01 -07:00
anthropic.py	fix(agent): only strip mcp_ prefix for OAuth-injected tools (GH-25255)	2026-05-24 15:27:45 -07:00
base.py	feat: add transport ABC + AnthropicTransport wired to all paths	2026-04-21 01:27:01 -07:00
bedrock.py	feat: add BedrockTransport + wire all Bedrock transport paths	2026-04-21 20:58:37 -07:00
chat_completions.py	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730)	2026-06-05 03:53:59 -07:00
codex.py	fix(codex): omit tools key from Codex Responses kwargs when no tools registered	2026-05-27 11:46:17 -07:00
codex_app_server.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
codex_app_server_session.py	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340)	2026-05-28 22:26:09 -07:00
codex_event_projector.py	feat(codex-runtime): optional codex app-server runtime for OpenAI/Codex models (#24182)	2026-05-13 17:18:15 -07:00
hermes_tools_mcp_server.py	docs(hermes_tools_mcp_server): align scope docstring with EXPOSED_TOOLS (#26603)	2026-05-15 14:44:27 -07:00
types.py	fix(transports): use PEP 604 annotation for ToolCall.extra_content	2026-05-09 02:25:37 -07:00