hermes-agent/hermes_cli
Teknium ff9752410a
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)
* feat(plugins): pluggable image_gen backends + OpenAI provider

Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:

- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
  Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
  incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
  `image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
  `plugin.yaml` of their own).

Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.

FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.

- 41 unit tests (scanner recursion, kind parsing, gate logic,
  registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
  `_handle_image_generate` routes to OpenAI when configured

* fix(image_gen/openai): don't send response_format to gpt-image-*

The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.

* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog

gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.

Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.

* feat(image_gen/openai): expose gpt-image-2 as three quality tiers

Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:

  gpt-image-2-low     ~15s   fast iteration
  gpt-image-2-medium  ~40s   default
  gpt-image-2-high    ~2min  highest fidelity

Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.

Config:
  image_gen.openai.model: gpt-image-2-high
  # or
  image_gen.model: gpt-image-2-low
  # or env var for scripts/tests
  OPENAI_IMAGE_MODEL=gpt-image-2-medium

Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.

* feat(tools_config): plugin image_gen providers inject themselves into picker

'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.

Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
  (name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
  every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
  Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
  writes image_gen.provider and routes to the plugin's list_models()
  catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
  FAL key when any plugin provider reports is_available().

FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.

Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.

397 tests pass across plugins/, tools_config, registry, and picker.

* fix(image_gen): close final gaps for plugin-backend parity with FAL

Two small places that still hardcoded FAL:

- hermes_cli/setup.py status line: an OpenAI-only setup showed
  'Image Generation: missing FAL_KEY'. Now probes plugin providers
  and reports '(OpenAI)' when one is_available() — or falls back to
  'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.

- image_generate tool schema description: said 'using FAL.ai, default
  FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
  user-configured' — and notes the 'image' field can be a URL or an
  absolute path, which the gateway delivers either way via
  extract_local_files().
2026-04-21 21:30:10 -07:00
..
__init__.py chore: release v0.10.0 (2026.4.16) (#11209) 2026-04-16 12:53:06 -07:00
auth.py remove Nous Portal free-model allowlist 2026-04-21 20:35:16 -07:00
auth_commands.py fix(auth): unify credential source removal — every source sticks (#13427) 2026-04-21 01:52:49 -07:00
backup.py fix(backup): handle files with pre-1980 timestamps 2026-04-20 00:47:40 -07:00
banner.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
callbacks.py fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902) 2026-04-14 16:11:37 -07:00
claw.py fix: unify OpenClaw detection, add isatty guard, fix print_warning import 2026-04-12 16:40:37 -07:00
cli_output.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
codex_models.py fix: remove codex spark model support 2026-04-20 04:51:44 -07:00
colors.py feat: respect NO_COLOR env var and TERM=dumb (#4079) 2026-03-30 17:07:21 -07:00
commands.py fix(tui): @folder: only yields directories, @file: only yields files 2026-04-21 14:31:48 -05:00
completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
config.py fix(tts): use per-provider input-character caps instead of global 4000 (#13743) 2026-04-21 17:49:39 -07:00
copilot_auth.py fix(copilot): resolve GHE token poisoning when GITHUB_TOKEN is set 2026-04-13 05:12:36 -07:00
cron.py feat(cron): track delivery failures in job status (#6042) 2026-04-07 22:49:01 -07:00
curses_ui.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
debug.py fix: two process leaks (agent-browser daemons, paste.rs sleepers) (#11843) 2026-04-17 18:46:30 -07:00
default_soul.py fix: reset default SOUL.md to baseline identity text (#3159) 2026-03-26 01:34:27 -07:00
dingtalk_auth.py test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure 2026-04-17 05:08:07 -07:00
doctor.py fix(kimi): reconcile sk-kimi- routing with Anthropic SDK URL semantics 2026-04-21 19:48:39 -07:00
dump.py refactor: remove smart_model_routing feature (#12732) 2026-04-19 18:12:55 -07:00
env_loader.py fix(env_loader): warn when non-ASCII stripped from credential env vars (#13300) 2026-04-20 22:14:03 -07:00
gateway.py refactor: remove redundant local imports already available at module level 2026-04-21 00:50:58 -07:00
hooks.py feat: shell hooks — wire shell scripts as Hermes hook callbacks 2026-04-20 20:53:51 -07:00
logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
main.py remove Nous Portal free-model allowlist 2026-04-21 20:35:16 -07:00
mcp_config.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
memory_setup.py fix(memory): discover user-installed memory providers from $HERMES_HOME/plugins/ (#10529) 2026-04-15 14:25:40 -07:00
model_normalize.py fix(copilot): normalize vendor-prefixed and dash-notation model IDs (#6879) (#11561) 2026-04-17 04:19:36 -07:00
model_switch.py fix: Enhance Kimi Coding API mode detection and User-Agent 2026-04-21 19:48:39 -07:00
models.py feat(models): add minimax/minimax-m2.5:free to OpenRouter catalog (#13836) 2026-04-21 21:27:40 -07:00
nous_subscription.py fix(fal): extend whitespace-only FAL_KEY handling to all call sites 2026-04-21 02:04:21 -07:00
pairing.py chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119) 2026-03-25 19:47:58 -07:00
platforms.py feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions 2026-04-14 00:11:49 -07:00
plugins.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
plugins_cmd.py feat(plugins): make all plugins opt-in by default 2026-04-20 04:46:45 -07:00
profiles.py fix(gateway): fix discrepancies in gateway status 2026-04-17 18:58:29 -07:00
providers.py fix: Enhance Kimi Coding API mode detection and User-Agent 2026-04-21 19:48:39 -07:00
runtime_provider.py fix(kimi-coding): add KIMI_CODING_API_KEY fallback + api_mode detection for /coding endpoint 2026-04-21 19:48:39 -07:00
setup.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
skills_config.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
skills_hub.py Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
skin_engine.py Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
status.py fix(gateway): fix discrepancies in gateway status 2026-04-17 18:58:29 -07:00
timeouts.py fix(config): add stale timeout settings 2026-04-20 00:52:50 -07:00
tips.py feat(delegate): orchestrator role and configurable spawn depth (default flat) 2026-04-21 14:23:45 -07:00
tools_config.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
uninstall.py feat(uninstall): offer to remove named profiles when uninstalling from default 2026-04-18 19:18:13 -07:00
web_server.py Merge pull request #13526 from NousResearch/feat/dashboard-action-buttons 2026-04-21 08:40:26 -07:00
webhook.py feat(webhook): direct delivery mode for zero-LLM push notifications (#12473) 2026-04-19 05:18:19 -07:00