* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
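
A minimal sketch of the guard described above, not the actual agent-loop code. The error-class labels, `TerminalError`, and `overflow_guard` are hypothetical names chosen for illustration; only the `compaction_disabled` flag and the suggested user actions come from this change.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the three provider-overflow classes named above.
OVERFLOW_CLASSES = frozenset(
    {"long_context_tier_429", "payload_too_large_413", "context_overflow"}
)


@dataclass
class TerminalError:
    """Illustrative stand-in for the terminal error surfaced to the user."""
    message: str
    compaction_disabled: bool = True


def overflow_guard(error_class: str, compression_enabled: bool) -> Optional[TerminalError]:
    """Return a terminal error when recovery must not compress, else None."""
    if compression_enabled or error_class not in OVERFLOW_CLASSES:
        # Caller continues with its normal recovery path.
        return None
    return TerminalError(
        message=(
            "Context overflow, and auto-compaction is disabled. "
            "Run /compress manually, start a /new session, switch to a "
            "larger-context model, or reduce attachments."
        )
    )
```

Checking the setting once, before dispatching to any of the three recovery paths, means a future overflow class only needs to be added to one set.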
Tests: new TestOverflowWithCompactionDisabled (413 and 400 overflow errors
do not compress when disabled; a control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints
Two distinct failures hit users on the gemini provider with only Google
AI Studio keys set.
1. Truncation loop: build_gemini_request() only set maxOutputTokens when
max_tokens was non-None. Hermes passes None to mean "unlimited", but
Gemini's native generateContent does NOT treat an absent maxOutputTokens
as full budget — it applies a low internal default and stops early with
finishReason=MAX_TOKENS, truncating tool calls. The agent then retries
three times and rejects the incomplete call. Now default maxOutputTokens
to the published 65,535 ceiling (shared by all current Gemini text models)
when max_tokens is None.
2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles
profile extra_body (Nous portal 'tags', reasoning, provider prefs) and
sends it via the OpenAI client to whatever base_url is resolved. When a
profile that emits extra_body (e.g. Nous) is active but the endpoint is a
native Gemini base_url — typical when only Google creds exist and a
fallback/aux call lands on Gemini — Google rejects the unknown 'tags'
field with a non-retryable 400. Strip every extra_body key except
thinking_config when the resolved endpoint is native Gemini.
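
The extra_body filtering in item 2 might look roughly like the sketch below. The helper names and the endpoint-detection heuristic (Google host without a trailing `/openai` path) are assumptions for illustration, not the transport's real code; `https://portal.example/v1` is a placeholder URL.

```python
from typing import Any, Dict
from urllib.parse import urlparse

# Only this key survives on a native Gemini endpoint (per the change above).
GEMINI_ALLOWED_EXTRA_BODY_KEYS = frozenset({"thinking_config"})


def is_native_gemini(base_url: str) -> bool:
    """Assumed heuristic: Google host, and not the /openai compat path."""
    parsed = urlparse(base_url)
    host = parsed.hostname or ""
    path = parsed.path.rstrip("/")
    return host == "generativelanguage.googleapis.com" and not path.endswith("/openai")


def filter_extra_body(extra_body: Dict[str, Any], base_url: str) -> Dict[str, Any]:
    """Drop profile-specific keys that a native Gemini endpoint would reject."""
    if not is_native_gemini(base_url):
        # Nous and the /openai compat endpoint keep the full extra_body.
        return dict(extra_body)
    return {
        key: value
        for key, value in extra_body.items()
        if key in GEMINI_ALLOWED_EXTRA_BODY_KEYS
    }
```

An allow-list is used here rather than a deny-list so that any new profile field is dropped by default instead of triggering another 400.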
Verified E2E against real transport code: tags stripped on native Gemini,
preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535
on None, explicit values respected.
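
The maxOutputTokens behaviour verified above can be sketched as follows. `generation_config` and the constant name are hypothetical and stand in for the relevant part of build_gemini_request(); the 65535 value is the one stated in this change.

```python
from typing import Any, Dict, Optional

# Ceiling used as the default when the caller passes max_tokens=None.
GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65535


def generation_config(max_tokens: Optional[int]) -> Dict[str, Any]:
    """Always emit maxOutputTokens so Gemini never applies its low default."""
    if max_tokens is None:
        # None means "unlimited" to the caller, so request the full budget.
        return {"maxOutputTokens": GEMINI_DEFAULT_MAX_OUTPUT_TOKENS}
    # Explicit values are passed through unchanged.
    return {"maxOutputTokens": max_tokens}
```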