hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-24 16:54:43 +00:00

Teknium ec46f5912e fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 )		2026-06-05 03:53:59 -07:00

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop (long-context-tier 429, 413 payload-too-large, and context-overflow) called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected: it never enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True).

* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints

Two distinct failures hit users on the gemini provider with only Google AI Studio keys set.

1. Truncation loop: build_gemini_request() only set maxOutputTokens when max_tokens was non-None. Hermes passes None to mean "unlimited", but Gemini's native generateContent does NOT treat an absent maxOutputTokens as full budget; it applies a low internal default and stops early with finishReason=MAX_TOKENS, truncating tool calls. The agent then retries 3x and refuses the incomplete call. Now default to the published 65,535 ceiling (shared by all current Gemini text models) when max_tokens=None.

2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles profile extra_body (Nous portal 'tags', reasoning, provider prefs) and sends it via the OpenAI client to whatever base_url is resolved. When a profile that emits extra_body (e.g. Nous) is active but the endpoint is a native Gemini base_url (typical when only Google creds exist and a fallback/aux call lands on Gemini), Google rejects the unknown 'tags' field with a non-retryable 400. Strip all non-thinking_config extra_body keys when the resolved endpoint is native Gemini.

Verified E2E against real transport code: tags stripped on native Gemini, preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535 on None, explicit values respected.
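
The two Gemini fixes described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in gemini_native_adapter.py or the transports package: the names resolve_max_output_tokens, strip_extra_body_for_gemini, GEMINI_DEFAULT_MAX_OUTPUT_TOKENS, and the is_native_gemini flag are hypothetical, and how the real code detects a native Gemini endpoint is not shown in the commit message.

```python
from typing import Any, Dict, Optional

# Ceiling quoted in the commit message ("65,535"); the constant name is hypothetical.
GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65535


def resolve_max_output_tokens(max_tokens: Optional[int]) -> int:
    """Always return an explicit maxOutputTokens value.

    None means "unlimited" to the caller, so it maps to the ceiling
    instead of being omitted from the request.
    """
    if max_tokens is None:
        return GEMINI_DEFAULT_MAX_OUTPUT_TOKENS
    return max_tokens


def strip_extra_body_for_gemini(
    extra_body: Dict[str, Any], is_native_gemini: bool
) -> Dict[str, Any]:
    """Drop every extra_body key except thinking_config on native Gemini.

    Other endpoints get an unmodified copy. The is_native_gemini flag is
    an assumption; the caller is expected to decide it from the base_url.
    """
    if not is_native_gemini:
        return dict(extra_body)
    return {k: v for k, v in extra_body.items() if k == "thinking_config"}


if __name__ == "__main__":
    print(resolve_max_output_tokens(None))
    print(strip_extra_body_for_gemini({"tags": ["demo"], "thinking_config": {}}, True))
```

Returning a new dict in both branches keeps the caller's profile extra_body untouched, so the same profile can still send 'tags' to endpoints that accept them.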
lsp	fix(lsp): handle Windows .cmd shims in LSP process spawn	2026-05-30 02:08:36 -07:00
secret_sources	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
transports	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 )	2026-06-05 03:53:59 -07:00
__init__.py	fix(agent): preload jiter native parser	2026-05-28 00:20:11 -07:00
account_usage.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
agent_init.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
agent_runtime_helpers.py	perf(observability): gate tool-hook emit on has_hook; slim per-tool footprint	2026-06-03 06:36:46 -07:00
anthropic_adapter.py	fix(anthropic): demote dead thinking signature when orphan-strip mutates the latest turn	2026-05-31 06:14:34 -07:00
async_utils.py	fix(async): close unscheduled coroutines in all threadsafe bridges (#26584 )	2026-05-15 14:00:01 -07:00
auxiliary_client.py	fix(minimax): align default_aux_model with M3 frontier on minimax + minimax-cn	2026-06-04 05:53:35 -07:00
azure_identity_adapter.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
background_review.py	test: cover ci-unblocker production regressions	2026-05-27 22:14:53 -07:00
bedrock_adapter.py	chore: remove dead code — 28 unused functions/classes across 16 files	2026-05-29 04:22:27 -07:00
browser_provider.py	fix(browser): self-review pass — dead-import, log levels, future-proofing	2026-05-17 04:04:15 -07:00
browser_registry.py	style: restore PEP8 blank-line separation after dead-code removal	2026-05-29 04:22:27 -07:00
chat_completion_helpers.py	fix: strip extra_content from tool_calls for strict APIs (Fireworks, Mistral)	2026-06-03 16:42:52 -07:00
codex_responses_adapter.py	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340 )	2026-05-28 22:26:09 -07:00
codex_runtime.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
context_compressor.py	fix(compressor): strip stale handoff prefix on resume; reconcile #26290+#32787 (#35344 )	2026-05-30 07:29:21 -07:00
context_engine.py	fix(compression): avoid repeat preflight compaction from rough estimates	2026-05-29 19:05:03 -07:00
context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
conversation_compression.py	fix(vision): guard image pixel dimensions, not just bytes (#37677 )	2026-06-04 06:16:45 -07:00
conversation_loop.py	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 )	2026-06-05 03:53:59 -07:00
copilot_acp_client.py	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes	2026-05-19 00:12:41 -07:00
credential_persistence.py	fix: avoid persisting borrowed credential secrets (#31416 )	2026-05-25 00:32:08 -07:00
credential_pool.py	fix(auth): align Codex OAuth persistence paths (#37517 )	2026-06-02 12:19:44 -05:00
credential_sources.py	docs(auth): replace stale 'hermes login' references with 'hermes auth add'	2026-05-26 15:41:11 -07:00
curator.py	feat(curator): prune built-in skills after inactivity + track usage for all skills (#36701 )	2026-06-01 02:07:32 -07:00
curator_backup.py	feat(curator): prune built-in skills after inactivity + track usage for all skills (#36701 )	2026-06-01 02:07:32 -07:00
display.py	chore(web): remove web_crawl tool + provider crawl plumbing (#33824 )	2026-05-28 04:52:42 -07:00
error_classifier.py	fix(vision): cap pixel dimensions proactively at embed time + declare Pillow	2026-06-04 06:16:45 -07:00
file_safety.py	fix(file-safety): extend sandbox-mirror guard to cover inner-container path (#32049 ) (#32407 )	2026-06-02 14:03:37 +10:00
gemini_cloudcode_adapter.py	fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks	2026-05-14 08:03:56 -07:00
gemini_native_adapter.py	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 )	2026-06-05 03:53:59 -07:00
gemini_schema.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
google_code_assist.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
google_oauth.py	fix(auth): don't launch a text-mode browser inside the terminal for OAuth (#34479 )	2026-05-29 01:23:06 -07:00
i18n.py	fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383 )	2026-06-03 12:00:27 -07:00
image_gen_provider.py	fix(image_gen): cache xAI ephemeral URL responses to disk (#26942 ) (#31759 )	2026-05-24 18:10:47 -07:00
image_gen_registry.py	fix(plugins): filter resolution by is_available() in web + image_gen registries	2026-05-13 22:31:28 -07:00
image_routing.py	feat(kanban): attach images referenced in task bodies to worker vision (#34210 )	2026-05-28 17:50:42 -07:00
insights.py	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
iteration_budget.py	refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget	2026-05-16 17:59:32 -07:00
jiter_preload.py	fix(agent): preload jiter native parser	2026-05-28 00:20:11 -07:00
lmstudio_reasoning.py	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
manual_compression_feedback.py	fix(compression): include system prompt + tool schemas in token estimates (#18265 )	2026-04-30 23:03:54 -07:00
markdown_tables.py	fix(cli): vertical fallback for markdown tables wider than terminal (#23948 )	2026-05-11 16:49:13 -07:00
memory_manager.py	fix(memory): only forward rewound kwarg when set	2026-06-01 01:22:38 -07:00
memory_provider.py	feat(memory): add rewound kwarg to on_session_switch hook	2026-06-01 01:22:38 -07:00
message_sanitization.py	revert: drop cumulative-resend tool-arg heuristic from shared streaming path (#35718 ) (#35860 )	2026-05-31 06:14:32 -07:00
model_metadata.py	fix(local): recognize unqualified hostnames as local endpoints (#9248 )	2026-06-05 10:18:10 +10:00
models_dev.py	remove Vercel AI Gateway and Vercel Sandbox (#33067 )	2026-05-27 00:43:32 -07:00
moonshot_schema.py	Add Hermes desktop app (#20059 )	2026-05-31 17:46:56 -05:00
nous_rate_guard.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
onboarding.py	docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507 )	2026-04-29 08:08:36 -07:00
plugin_llm.py	feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194 )	2026-05-10 07:09:28 -07:00
portal_tags.py	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779 )	2026-05-12 20:49:20 -07:00
process_bootstrap.py	refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget	2026-05-16 17:59:32 -07:00
prompt_builder.py	refactor(skills): clean up bundled skill set + add environments: relevance gate (#39028 )	2026-06-04 06:11:22 -07:00
prompt_caching.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778 )	2026-05-12 20:46:04 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
redact.py	fix: remove Discord mention redaction from secret scrubber	2026-05-30 20:48:41 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
runtime_cwd.py	fix(desktop): stabilize project folder sessions (#37586 )	2026-06-02 20:23:09 +00:00
shell_hooks.py	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes	2026-05-19 00:12:41 -07:00
skill_bundles.py	feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373 )	2026-05-18 21:38:05 -07:00
skill_commands.py	refactor(skills): clean up bundled skill set + add environments: relevance gate (#39028 )	2026-06-04 06:11:22 -07:00
skill_preprocessing.py	fix: treat inline-shell timeout guard as timeout	2026-05-18 19:36:04 -07:00
skill_utils.py	refactor(skills): clean up bundled skill set + add environments: relevance gate (#39028 )	2026-06-04 06:11:22 -07:00
stream_diag.py	feat(agent): buffer retry/fallback status, surface only on terminal failure (#33816 )	2026-05-28 04:53:27 -07:00
subdirectory_hints.py	fix(subdirectory_hints): prevent loading AGENTS.md outside workspace	2026-05-25 23:17:33 -07:00
system_prompt.py	refactor(prompt): route context-file cwd through runtime_cwd resolver	2026-06-01 16:55:04 -07:00
think_scrubber.py	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924 ) (#20184 )	2026-05-05 04:33:38 -07:00
title_generator.py	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
tool_dispatch_helpers.py	feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269 )	2026-05-25 14:52:24 -07:00
tool_executor.py	perf(observability): gate tool-hook emit on has_hook; slim per-tool footprint	2026-06-03 06:36:46 -07:00
tool_guardrails.py	fix: add recovery hints to loop guard warnings	2026-05-19 00:12:12 -07:00
tool_result_classification.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
transcription_provider.py	feat(stt): add register_transcription_provider() plugin hook	2026-05-25 01:41:19 -07:00
transcription_registry.py	feat(stt): add register_transcription_provider() plugin hook	2026-05-25 01:41:19 -07:00
tts_provider.py	feat(tts): add register_tts_provider() plugin hook (closes #30398 )	2026-05-24 18:04:54 -07:00
tts_registry.py	feat(tts): add register_tts_provider() plugin hook (closes #30398 )	2026-05-24 18:04:54 -07:00
usage_pricing.py	feat: add claude-opus-4.8 and claude-opus-4.8-fast (#34003 )	2026-05-28 10:31:59 -07:00
video_gen_provider.py	feat(video_gen): unified video_generate tool with pluggable provider backends (#25126 )	2026-05-13 16:39:41 -07:00
video_gen_registry.py	feat(video_gen): unified video_generate tool with pluggable provider backends (#25126 )	2026-05-13 16:39:41 -07:00
web_search_provider.py	chore(web): remove web_crawl tool + provider crawl plumbing (#33824 )	2026-05-28 04:52:42 -07:00
web_search_registry.py	chore(web): remove web_crawl tool + provider crawl plumbing (#33824 )	2026-05-28 04:52:42 -07:00