feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

* feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-25 00:51:20 +00:00 · 2026-04-21 21:30:10 -07:00 · 2026-04-21 21:30:10 -07:00 · ff9752410a
commit ff9752410a
parent d1acf17773
13 changed files with 2122 additions and 67 deletions
--- a/hermes_cli/tools_config.py
+++ b/hermes_cli/tools_config.py
@ -847,6 +847,51 @@ def _configure_toolset(ts_key: str, config: dict):
        _configure_simple_requirements(ts_key)


+def _plugin_image_gen_providers() -> list[dict]:
+    """Build picker-row dicts from plugin-registered image gen providers.
+
+    Each returned dict looks like a regular ``TOOL_CATEGORIES`` provider
+    row but carries an ``image_gen_plugin_name`` marker so downstream
+    code (config writing, model picker) knows to route through the
+    plugin registry instead of the in-tree FAL backend.
+
+    FAL is skipped — it's already exposed by the hardcoded
+    ``TOOL_CATEGORIES["image_gen"]`` entries. When FAL gets ported to
+    a plugin in a follow-up PR, the hardcoded entries go away and this
+    function surfaces it alongside OpenAI automatically.
+    """
+    try:
+        from agent.image_gen_registry import list_providers
+        from hermes_cli.plugins import _ensure_plugins_discovered
+
+        _ensure_plugins_discovered()
+        providers = list_providers()
+    except Exception:
+        return []
+
+    rows: list[dict] = []
+    for provider in providers:
+        if getattr(provider, "name", None) == "fal":
+            # FAL has its own hardcoded rows today.
+            continue
+        try:
+            schema = provider.get_setup_schema()
+        except Exception:
+            continue
+        if not isinstance(schema, dict):
+            continue
+        rows.append(
+            {
+                "name": schema.get("name", provider.display_name),
+                "badge": schema.get("badge", ""),
+                "tag": schema.get("tag", ""),
+                "env_vars": schema.get("env_vars", []),
+                "image_gen_plugin_name": provider.name,
+            }
+        )
+    return rows
+
+
 def _visible_providers(cat: dict, config: dict) -> list[dict]:
    """Return provider entries visible for the current auth/config state."""
    features = get_nous_subscription_features(config)
@ -857,6 +902,12 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
        if provider.get("requires_nous_auth") and not features.nous_auth_present:
            continue
        visible.append(provider)
+
+    # Inject plugin-registered image_gen backends (OpenAI today, more
+    # later) so the picker lists them alongside FAL / Nous Subscription.
+    if cat.get("name") == "Image Generation":
+        visible.extend(_plugin_image_gen_providers())
+
    return visible


@ -876,7 +927,24 @@ def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
        browser_cfg = config.get("browser", {})
        return not isinstance(browser_cfg, dict) or "cloud_provider" not in browser_cfg
    if ts_key == "image_gen":
-        return not fal_key_is_configured()
+        # Satisfied when the in-tree FAL backend is configured OR any
+        # plugin-registered image gen provider is available.
+        if fal_key_is_configured():
+            return False
+        try:
+            from agent.image_gen_registry import list_providers
+            from hermes_cli.plugins import _ensure_plugins_discovered
+
+            _ensure_plugins_discovered()
+            for provider in list_providers():
+                try:
+                    if provider.is_available():
+                        return False
+                except Exception:
+                    continue
+        except Exception:
+            pass
+        return True

    return not _toolset_has_keys(ts_key, config)

@ -1095,6 +1163,88 @@ def _configure_imagegen_model(backend_name: str, config: dict) -> None:
    _print_success(f"  Model set to: {chosen}")


+def _plugin_image_gen_catalog(plugin_name: str):
+    """Return ``(catalog_dict, default_model_id)`` for a plugin provider.
+
+    ``catalog_dict`` is shaped like the legacy ``FAL_MODELS`` table —
+    ``{model_id: {"display", "speed", "strengths", "price", ...}}`` —
+    so the existing picker code paths work without change. Returns
+    ``({}, None)`` if the provider isn't registered or has no models.
+    """
+    try:
+        from agent.image_gen_registry import get_provider
+        from hermes_cli.plugins import _ensure_plugins_discovered
+
+        _ensure_plugins_discovered()
+        provider = get_provider(plugin_name)
+    except Exception:
+        return {}, None
+    if provider is None:
+        return {}, None
+    try:
+        models = provider.list_models() or []
+        default = provider.default_model()
+    except Exception:
+        return {}, None
+    catalog = {m["id"]: m for m in models if isinstance(m, dict) and "id" in m}
+    return catalog, default
+
+
+def _configure_imagegen_model_for_plugin(plugin_name: str, config: dict) -> None:
+    """Prompt the user to pick a model for a plugin-registered backend.
+
+    Writes selection to ``image_gen.model``. Mirrors
+    :func:`_configure_imagegen_model` but sources its catalog from the
+    plugin registry instead of :data:`IMAGEGEN_BACKENDS`.
+    """
+    catalog, default_model = _plugin_image_gen_catalog(plugin_name)
+    if not catalog:
+        return
+
+    cur_cfg = config.setdefault("image_gen", {})
+    if not isinstance(cur_cfg, dict):
+        cur_cfg = {}
+        config["image_gen"] = cur_cfg
+    current_model = cur_cfg.get("model") or default_model
+    if current_model not in catalog:
+        current_model = default_model
+
+    model_ids = list(catalog.keys())
+    ordered = [current_model] + [m for m in model_ids if m != current_model]
+
+    widths = {
+        "model": max(len(m) for m in model_ids),
+        "speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
+        "strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
+    }
+
+    print()
+    header = (
+        f"  {'Model':<{widths['model']}}  "
+        f"{'Speed':<{widths['speed']}}  "
+        f"{'Strengths':<{widths['strengths']}}  "
+        f"Price"
+    )
+    print(color(header, Colors.CYAN))
+
+    rows = []
+    for mid in ordered:
+        row = _format_imagegen_model_row(mid, catalog[mid], widths)
+        if mid == current_model:
+            row += "  ← currently in use"
+        rows.append(row)
+
+    idx = _prompt_choice(
+        f"  Choose {plugin_name} model:",
+        rows,
+        default=0,
+    )
+
+    chosen = ordered[idx]
+    cur_cfg["model"] = chosen
+    _print_success(f"  Model set to: {chosen}")
+
+
 def _configure_provider(provider: dict, config: dict):
    """Configure a single provider - prompt for API keys and set config."""
    env_vars = provider.get("env_vars", [])
@ -1151,10 +1301,28 @@ def _configure_provider(provider: dict, config: dict):
        _print_success(f"  {provider['name']} - no configuration needed!")
        if managed_feature:
            _print_info("  Requests for this tool will be billed to your Nous subscription.")
+        # Plugin-registered image_gen provider: write image_gen.provider
+        # and route model selection to the plugin's own catalog.
+        plugin_name = provider.get("image_gen_plugin_name")
+        if plugin_name:
+            img_cfg = config.setdefault("image_gen", {})
+            if not isinstance(img_cfg, dict):
+                img_cfg = {}
+                config["image_gen"] = img_cfg
+            img_cfg["provider"] = plugin_name
+            _print_success(f"  image_gen.provider set to: {plugin_name}")
+            _configure_imagegen_model_for_plugin(plugin_name, config)
+            return
        # Imagegen backends prompt for model selection after backend pick.
        backend = provider.get("imagegen_backend")
        if backend:
            _configure_imagegen_model(backend, config)
+            # In-tree FAL is the only non-plugin backend today. Keep
+            # image_gen.provider clear so the dispatch shim falls through
+            # to the legacy FAL path.
+            img_cfg = config.setdefault("image_gen", {})
+            if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
+                img_cfg["provider"] = "fal"
        return

    # Prompt for each required env var
@ -1189,10 +1357,23 @@ def _configure_provider(provider: dict, config: dict):

    if all_configured:
        _print_success(f"  {provider['name']} configured!")
+        plugin_name = provider.get("image_gen_plugin_name")
+        if plugin_name:
+            img_cfg = config.setdefault("image_gen", {})
+            if not isinstance(img_cfg, dict):
+                img_cfg = {}
+                config["image_gen"] = img_cfg
+            img_cfg["provider"] = plugin_name
+            _print_success(f"  image_gen.provider set to: {plugin_name}")
+            _configure_imagegen_model_for_plugin(plugin_name, config)
+            return
        # Imagegen backends prompt for model selection after env vars are in.
        backend = provider.get("imagegen_backend")
        if backend:
            _configure_imagegen_model(backend, config)
+            img_cfg = config.setdefault("image_gen", {})
+            if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
+                img_cfg["provider"] = "fal"


 def _configure_simple_requirements(ts_key: str):