feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

* feat(plugins): pluggable image_gen backends + OpenAI provider

Adds an ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:

- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
  Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
  incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
  `image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
  `plugin.yaml` of their own).
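As a rough illustration of the path-derived keys, a scanner could derive the registry key from the plugin directory's path relative to the plugins root (the `derive_plugin_key` helper below is hypothetical, not the actual scanner code):

```python
from pathlib import Path

def derive_plugin_key(plugin_dir: Path, plugins_root: Path) -> str:
    """Derive a registry key from the plugin's path relative to the
    plugins root: 'plugins/image_gen/openai' -> 'image_gen/openai'.
    Depth-1 plugins keep their bare name ('plugins/foo' -> 'foo')."""
    parts = plugin_dir.relative_to(plugins_root).parts
    return "/".join(parts)

# 'image_gen/openai' and a future 'tts/openai' get distinct keys:
print(derive_plugin_key(Path("plugins/image_gen/openai"), Path("plugins")))
print(derive_plugin_key(Path("plugins/tts/openai"), Path("plugins")))
```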

Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.

FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.
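The dispatch behavior described above amounts to a gate like the following (a minimal sketch; `pick_backend` and its return values are illustrative, not the real shim):

```python
def pick_backend(config: dict) -> str:
    """Route to a plugin backend only when image_gen.provider is
    explicitly set to a non-FAL value; otherwise keep the legacy
    in-tree FAL path untouched."""
    provider = (config.get("image_gen") or {}).get("provider")
    if provider and provider != "fal":
        return f"plugin:{provider}"   # registered plugin backend
    return "fal"                      # legacy in-tree path

assert pick_backend({}) == "fal"
assert pick_backend({"image_gen": {"provider": "fal"}}) == "fal"
assert pick_backend({"image_gen": {"provider": "openai"}}) == "plugin:openai"
```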

- 41 unit tests (scanner recursion, kind parsing, gate logic,
  registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
  `_handle_image_generate` routes to OpenAI when configured

* fix(image_gen/openai): don't send response_format to gpt-image-*

The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.
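A hedged sketch of the resulting payload construction: the parameter is only attached for model families known to accept it (`build_payload` is illustrative; the real helper may be named differently):

```python
def build_payload(model: str, prompt: str, size: str) -> dict:
    """Build an images.generate payload, omitting response_format for
    gpt-image-* models: they return b64_json unconditionally and the
    live API rejects the parameter outright."""
    payload = {"model": model, "prompt": prompt, "size": size}
    if not model.startswith("gpt-image-"):
        payload["response_format"] = "b64_json"
    return payload
```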

* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog

gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — each is slower, lower quality, or
awkward (dall-e-2 supports square sizes only). Trim the catalog down to
a single model.

Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.

* feat(image_gen/openai): expose gpt-image-2 as three quality tiers

Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:

  gpt-image-2-low     ~15s   fast iteration
  gpt-image-2-medium  ~40s   default
  gpt-image-2-high    ~2min  highest fidelity

Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.

Config:
  image_gen.openai.model: gpt-image-2-high
  # or
  image_gen.model: gpt-image-2-low
  # or env var for scripts/tests
  OPENAI_IMAGE_MODEL=gpt-image-2-medium
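The precedence between those three sources could be sketched like this (illustrative only; the real `_resolve_model` also returns per-tier metadata):

```python
import os

DEFAULT_MODEL = "gpt-image-2-medium"
TIERS = {"gpt-image-2-low", "gpt-image-2-medium", "gpt-image-2-high"}

def resolve_model(config: dict) -> str:
    """Illustrative precedence: env var, then image_gen.openai.model,
    then image_gen.model, then the default. Unknown IDs fall back."""
    image_gen = config.get("image_gen") or {}
    for candidate in (
        os.environ.get("OPENAI_IMAGE_MODEL"),
        (image_gen.get("openai") or {}).get("model"),
        image_gen.get("model"),
    ):
        if candidate in TIERS:
            return candidate
    return DEFAULT_MODEL
```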

Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.

* feat(tools_config): plugin image_gen providers inject themselves into picker

'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.

Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
  (name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
  every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
  Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
  writes image_gen.provider and routes to the plugin's list_models()
  catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
  FAL key when any plugin provider reports is_available().
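A minimal sketch of the optional hook and its default, assuming the field names listed above (the class bodies here are illustrative, not the shipped ABC):

```python
class ImageGenProvider:
    """Illustrative base: providers get a usable setup schema for free,
    derived from display_name, and can override it."""
    display_name = "Unnamed"

    def get_setup_schema(self) -> dict:
        return {
            "name": self.display_name,
            "badge": self.display_name,
            "tag": self.display_name.lower().replace(" ", "_"),
            "env_vars": [],
        }

class OpenAIImageGenProvider(ImageGenProvider):
    display_name = "OpenAI"

    def get_setup_schema(self) -> dict:
        # Keep the derived defaults, declare the required key.
        schema = super().get_setup_schema()
        schema["env_vars"] = ["OPENAI_API_KEY"]
        return schema
```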

FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.

Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.

397 tests pass across plugins/, tools_config, registry, and picker.

* fix(image_gen): close final gaps for plugin-backend parity with FAL

Two small places that still hardcoded FAL:

- hermes_cli/setup.py status line: an OpenAI-only setup showed
  'Image Generation: missing FAL_KEY'. Now probes plugin providers
  and reports '(OpenAI)' when one is_available() — or falls back to
  'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.

- image_generate tool schema description: said 'using FAL.ai, default
  FLUX 2 Klein 9B'. Rewrote it to be provider-neutral ('backend and
  model are user-configured') and noted that the 'image' field can be a
  URL or an absolute path; the gateway delivers either via
  extract_local_files().
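The status-line probe might reduce to something like this (a sketch; `status_line` and the message strings are paraphrased from the description above):

```python
def status_line(providers) -> str:
    """Report the first available image_gen provider; otherwise fall
    back to a message naming the missing keys."""
    for p in providers:
        if p.is_available():
            return f"Image Generation: configured ({p.display_name})"
    return "Image Generation: missing FAL_KEY or OPENAI_API_KEY"
```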
Teknium 2026-04-21 21:30:10 -07:00 committed by GitHub
parent d1acf17773
commit ff9752410a
13 changed files with 2122 additions and 67 deletions


@@ -0,0 +1,243 @@
"""Tests for the bundled OpenAI image_gen plugin (gpt-image-2, three tiers)."""
from __future__ import annotations
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
import pytest
import plugins.image_gen.openai as openai_plugin
# 1×1 transparent PNG — valid bytes for save_b64_image()
_PNG_HEX = (
"89504e470d0a1a0a0000000d49484452000000010000000108060000001f15c4"
"890000000d49444154789c6300010000000500010d0a2db40000000049454e44"
"ae426082"
)
def _b64_png() -> str:
import base64
return base64.b64encode(bytes.fromhex(_PNG_HEX)).decode()
def _fake_response(*, b64=None, url=None, revised_prompt=None):
item = SimpleNamespace(b64_json=b64, url=url, revised_prompt=revised_prompt)
return SimpleNamespace(data=[item])
@pytest.fixture(autouse=True)
def _tmp_hermes_home(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
yield tmp_path
@pytest.fixture
def provider(monkeypatch):
monkeypatch.setenv("OPENAI_API_KEY", "test-key")
return openai_plugin.OpenAIImageGenProvider()
def _patched_openai(fake_client: MagicMock):
fake_openai = MagicMock()
fake_openai.OpenAI.return_value = fake_client
return patch.dict("sys.modules", {"openai": fake_openai})
# ── Metadata ────────────────────────────────────────────────────────────────
class TestMetadata:
def test_name(self, provider):
assert provider.name == "openai"
def test_default_model(self, provider):
assert provider.default_model() == "gpt-image-2-medium"
def test_list_models_three_tiers(self, provider):
ids = [m["id"] for m in provider.list_models()]
assert ids == ["gpt-image-2-low", "gpt-image-2-medium", "gpt-image-2-high"]
def test_catalog_entries_have_display_speed_strengths(self, provider):
for entry in provider.list_models():
assert entry["display"].startswith("GPT Image 2")
assert entry["speed"]
assert entry["strengths"]
# ── Availability ────────────────────────────────────────────────────────────
class TestAvailability:
def test_no_api_key_unavailable(self, monkeypatch):
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
assert openai_plugin.OpenAIImageGenProvider().is_available() is False
def test_api_key_set_available(self, monkeypatch):
monkeypatch.setenv("OPENAI_API_KEY", "test")
assert openai_plugin.OpenAIImageGenProvider().is_available() is True
# ── Model resolution ────────────────────────────────────────────────────────
class TestModelResolution:
def test_default_is_medium(self):
model_id, meta = openai_plugin._resolve_model()
assert model_id == "gpt-image-2-medium"
assert meta["quality"] == "medium"
def test_env_var_override(self, monkeypatch):
monkeypatch.setenv("OPENAI_IMAGE_MODEL", "gpt-image-2-high")
model_id, meta = openai_plugin._resolve_model()
assert model_id == "gpt-image-2-high"
assert meta["quality"] == "high"
def test_env_var_unknown_falls_back(self, monkeypatch):
monkeypatch.setenv("OPENAI_IMAGE_MODEL", "bogus-tier")
model_id, _ = openai_plugin._resolve_model()
assert model_id == openai_plugin.DEFAULT_MODEL
def test_config_openai_model(self, tmp_path):
import yaml
(tmp_path / "config.yaml").write_text(
yaml.safe_dump({"image_gen": {"openai": {"model": "gpt-image-2-low"}}})
)
model_id, meta = openai_plugin._resolve_model()
assert model_id == "gpt-image-2-low"
assert meta["quality"] == "low"
def test_config_top_level_model(self, tmp_path):
"""``image_gen.model: gpt-image-2-high`` also works (top-level)."""
import yaml
(tmp_path / "config.yaml").write_text(
yaml.safe_dump({"image_gen": {"model": "gpt-image-2-high"}})
)
model_id, meta = openai_plugin._resolve_model()
assert model_id == "gpt-image-2-high"
assert meta["quality"] == "high"
# ── Generate ────────────────────────────────────────────────────────────────
class TestGenerate:
def test_empty_prompt_rejected(self, provider):
result = provider.generate("", aspect_ratio="square")
assert result["success"] is False
assert result["error_type"] == "invalid_argument"
def test_missing_api_key(self, monkeypatch):
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
result = openai_plugin.OpenAIImageGenProvider().generate("a cat")
assert result["success"] is False
assert result["error_type"] == "auth_required"
def test_b64_saves_to_cache(self, provider, tmp_path):
import base64
png_bytes = bytes.fromhex(_PNG_HEX)
fake_client = MagicMock()
fake_client.images.generate.return_value = _fake_response(b64=_b64_png())
with _patched_openai(fake_client):
result = provider.generate("a cat", aspect_ratio="landscape")
assert result["success"] is True
assert result["model"] == "gpt-image-2-medium"
assert result["aspect_ratio"] == "landscape"
assert result["provider"] == "openai"
assert result["quality"] == "medium"
saved = Path(result["image"])
assert saved.exists()
assert saved.parent == tmp_path / "cache" / "images"
assert saved.read_bytes() == png_bytes
call_kwargs = fake_client.images.generate.call_args.kwargs
# All tiers hit the single underlying API model.
assert call_kwargs["model"] == "gpt-image-2"
assert call_kwargs["quality"] == "medium"
assert call_kwargs["size"] == "1536x1024"
# gpt-image-2 rejects response_format — we must NOT send it.
assert "response_format" not in call_kwargs
@pytest.mark.parametrize("tier,expected_quality", [
("gpt-image-2-low", "low"),
("gpt-image-2-medium", "medium"),
("gpt-image-2-high", "high"),
])
def test_tier_maps_to_quality(self, provider, monkeypatch, tier, expected_quality):
monkeypatch.setenv("OPENAI_IMAGE_MODEL", tier)
fake_client = MagicMock()
fake_client.images.generate.return_value = _fake_response(b64=_b64_png())
with _patched_openai(fake_client):
result = provider.generate("a cat")
assert result["model"] == tier
assert result["quality"] == expected_quality
assert fake_client.images.generate.call_args.kwargs["quality"] == expected_quality
# Always the same underlying API model regardless of tier.
assert fake_client.images.generate.call_args.kwargs["model"] == "gpt-image-2"
@pytest.mark.parametrize("aspect,expected_size", [
("landscape", "1536x1024"),
("square", "1024x1024"),
("portrait", "1024x1536"),
])
def test_aspect_ratio_mapping(self, provider, aspect, expected_size):
fake_client = MagicMock()
fake_client.images.generate.return_value = _fake_response(b64=_b64_png())
with _patched_openai(fake_client):
provider.generate("a cat", aspect_ratio=aspect)
assert fake_client.images.generate.call_args.kwargs["size"] == expected_size
def test_revised_prompt_passed_through(self, provider):
fake_client = MagicMock()
fake_client.images.generate.return_value = _fake_response(
b64=_b64_png(), revised_prompt="A photo of a cat",
)
with _patched_openai(fake_client):
result = provider.generate("a cat")
assert result["revised_prompt"] == "A photo of a cat"
def test_api_error_returns_error_response(self, provider):
fake_client = MagicMock()
fake_client.images.generate.side_effect = RuntimeError("boom")
with _patched_openai(fake_client):
result = provider.generate("a cat")
assert result["success"] is False
assert result["error_type"] == "api_error"
assert "boom" in result["error"]
def test_empty_response_data(self, provider):
fake_client = MagicMock()
fake_client.images.generate.return_value = SimpleNamespace(data=[])
with _patched_openai(fake_client):
result = provider.generate("a cat")
assert result["success"] is False
assert result["error_type"] == "empty_response"
def test_url_fallback_if_api_changes(self, provider):
"""Defensive: if OpenAI ever returns URL instead of b64, pass through."""
fake_client = MagicMock()
fake_client.images.generate.return_value = _fake_response(
b64=None, url="https://example.com/img.png",
)
with _patched_openai(fake_client):
result = provider.generate("a cat")
assert result["success"] is True
assert result["image"] == "https://example.com/img.png"