hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-13 09:01:54 +00:00

Author	SHA1	Message	Date
Teknium	ec46f5912e	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints Two distinct failures hit users on the gemini provider with only Google AI Studio keys set. 1. Truncation loop: build_gemini_request() only set maxOutputTokens when max_tokens was non-None. Hermes passes None to mean "unlimited", but Gemini's native generateContent does NOT treat an absent maxOutputTokens as full budget — it applies a low internal default and stops early with finishReason=MAX_TOKENS, truncating tool calls. The agent then retries 3x and refuses the incomplete call. Now default to the published 65,535 ceiling (shared by all current Gemini text models) when max_tokens=None. 2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles profile extra_body (Nous portal 'tags', reasoning, provider prefs) and sends it via the OpenAI client to whatever base_url is resolved. When a profile that emits extra_body (e.g. Nous) is active but the endpoint is a native Gemini base_url — typical when only Google creds exist and a fallback/aux call lands on Gemini — Google rejects the unknown 'tags' field with a non-retryable 400. Strip all non-thinking_config extra_body keys when the resolved endpoint is native Gemini. Verified E2E against real transport code: tags stripped on native Gemini, preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535 on None, explicit values respected.	2026-06-05 03:53:59 -07:00
Teknium	ba44a3d256	fix(gemini): fail fast on missing API key + surface it in hermes dump (#15133 ) Two small fixes triggered by a support report where the user saw a cryptic 'HTTP 400 - Error 400 (Bad Request)!!1' (Google's GFE HTML error page, not a real API error) on every gemini-2.5-pro request. The underlying cause was an empty GOOGLE_API_KEY / GEMINI_API_KEY, but nothing in our output made that diagnosable: 1. hermes_cli/dump.py: the api_keys section enumerated 23 providers but omitted Google entirely, so users had no way to verify from 'hermes dump' whether the key was set. Added GOOGLE_API_KEY and GEMINI_API_KEY rows. 2. agent/gemini_native_adapter.py: GeminiNativeClient.__init__ accepted an empty/whitespace api_key and stamped it into the x-goog-api-key header, which made Google's frontend return a generic HTML 400 long before the request reached the Generative Language backend. Now we raise RuntimeError at construction with an actionable message pointing at GOOGLE_API_KEY/GEMINI_API_KEY and aistudio.google.com. Added a regression test that covers '', ' ', and None.	2026-04-24 05:35:17 -07:00
helix4u	8155ebd7c4	fix(gemini): sanitize tool schemas for Google providers	2026-04-20 00:26:18 -07:00
kshitijk4poor	d393104bad	fix(gemini): tighten native routing and streaming replay - only use the native adapter for the canonical Gemini native endpoint - keep custom and /openai base URLs on the OpenAI-compatible path - preserve Hermes keepalive transport injection for native Gemini clients - stabilize streaming tool-call replay across repeated SSE events - add follow-up tests for base_url precedence, async streaming, and duplicate tool-call chunks	2026-04-19 12:40:08 -07:00
kshitijk4poor	3dea497b20	feat(providers): route gemini through the native AI Studio API - add a native Gemini adapter over generateContent/streamGenerateContent - switch the built-in gemini provider off the OpenAI-compatible endpoint - preserve thought signatures and native functionResponse replay - route auxiliary Gemini clients through the same adapter - add focused unit coverage plus native-provider integration checks	2026-04-19 12:40:08 -07:00

5 commits