hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

History

Teknium ff9752410a feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 ) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` \| `backend` \| `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().		2026-04-21 21:30:10 -07:00
..
transports	feat: add BedrockTransport + wire all Bedrock transport paths	2026-04-21 20:58:37 -07:00
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
account_usage.py	feat(account-usage): add per-provider account limits module	2026-04-21 01:56:35 -07:00
anthropic_adapter.py	fix(kimi): don't send Anthropic thinking to api.kimi.com/coding (#13826 )	2026-04-21 21:19:14 -07:00
auxiliary_client.py	feat(aux): use Portal /api/nous/recommended-models for auxiliary models	2026-04-21 20:35:16 -07:00
bedrock_adapter.py	feat: native AWS Bedrock provider via Converse API	2026-04-15 16:17:17 -07:00
codex_responses_adapter.py	refactor: extract codex_responses logic into dedicated adapter	2026-04-20 11:53:17 -07:00
context_compressor.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
context_engine.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
copilot_acp_client.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
credential_pool.py	fix(auth): unify credential source removal — every source sticks (#13427 )	2026-04-21 01:52:49 -07:00
credential_sources.py	fix(auth): unify credential source removal — every source sticks (#13427 )	2026-04-21 01:52:49 -07:00
display.py	fix(display): render <missing old_text> in memory previews instead of empty quotes (#12852 )	2026-04-19 22:45:47 -07:00
error_classifier.py	fix(error_classifier): handle dict-typed message fields without crashing	2026-04-20 02:40:20 -07:00
file_safety.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
gemini_cloudcode_adapter.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
gemini_native_adapter.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
gemini_schema.py	fix(gemini): sanitize tool schemas for Google providers	2026-04-20 00:26:18 -07:00
google_code_assist.py	fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833 )	2026-04-17 15:34:12 -07:00
google_oauth.py	feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270 )	2026-04-16 16:49:00 -07:00
image_gen_provider.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 )	2026-04-21 21:30:10 -07:00
image_gen_registry.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 )	2026-04-21 21:30:10 -07:00
insights.py	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
manual_compression_feedback.py	fix(gateway): make manual compression feedback truthful	2026-04-10 21:16:53 -07:00
memory_manager.py	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619 )	2026-04-15 19:12:19 -07:00
memory_provider.py	refactor(memory): drop on_session_reset — commit-only is enough	2026-04-15 11:28:45 -07:00
model_metadata.py	test: stop testing mutable data — convert change-detectors to invariants (#13363 )	2026-04-20 23:20:33 -07:00
models_dev.py	fix(gemini): hide stale and low-TPM Google models	2026-04-18 12:52:01 -07:00
nous_rate_guard.py	fix: Nous Portal rate limit guard — prevent retry amplification (#10568 )	2026-04-15 16:31:48 -07:00
prompt_builder.py	fix(prompt): tell CLI agents not to emit MEDIA:/path tags (#13766 )	2026-04-21 19:36:05 -07:00
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
redact.py	feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148 )	2026-04-20 11:49:54 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
shell_hooks.py	feat: shell hooks — wire shell scripts as Hermes hook callbacks	2026-04-20 20:53:51 -07:00
skill_commands.py	feat(skills+terminal): make bundled skill scripts runnable out of the box (#13384 )	2026-04-21 00:39:19 -07:00
skill_utils.py	feat(plugins): namespaced skill registration for plugin skill bundles	2026-04-14 10:42:58 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
title_generator.py	fix: title_generator no longer logs as 'compression' task	2026-04-12 04:17:18 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	fix: sweep remaining provider-URL substring checks across codebase	2026-04-20 22:14:29 -07:00