mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)
* feat(plugins): pluggable image_gen backends + OpenAI provider
Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:
- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
`image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
`plugin.yaml` of their own).
Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.
FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.
- 41 unit tests (scanner recursion, kind parsing, gate logic,
registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
`_handle_image_generate` routes to OpenAI when configured
* fix(image_gen/openai): don't send response_format to gpt-image-*
The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.
* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog
gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.
Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.
* feat(image_gen/openai): expose gpt-image-2 as three quality tiers
Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:
gpt-image-2-low ~15s fast iteration
gpt-image-2-medium ~40s default
gpt-image-2-high ~2min highest fidelity
Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.
Config:
image_gen.openai.model: gpt-image-2-high
# or
image_gen.model: gpt-image-2-low
# or env var for scripts/tests
OPENAI_IMAGE_MODEL=gpt-image-2-medium
Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.
* feat(tools_config): plugin image_gen providers inject themselves into picker
'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.
Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
(name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
writes image_gen.provider and routes to the plugin's list_models()
catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
FAL key when any plugin provider reports is_available().
FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.
Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.
397 tests pass across plugins/, tools_config, registry, and picker.
* fix(image_gen): close final gaps for plugin-backend parity with FAL
Two small places that still hardcoded FAL:
- hermes_cli/setup.py status line: an OpenAI-only setup showed
'Image Generation: missing FAL_KEY'. Now probes plugin providers
and reports '(OpenAI)' when one is_available() — or falls back to
'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.
- image_generate tool schema description: said 'using FAL.ai, default
FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
user-configured' — and notes the 'image' field can be a URL or an
absolute path, which the gateway delivers either way via
extract_local_files().
This commit is contained in:
parent
d1acf17773
commit
ff9752410a
13 changed files with 2122 additions and 67 deletions
|
|
@ -847,6 +847,51 @@ def _configure_toolset(ts_key: str, config: dict):
|
|||
_configure_simple_requirements(ts_key)
|
||||
|
||||
|
||||
def _plugin_image_gen_providers() -> list[dict]:
|
||||
"""Build picker-row dicts from plugin-registered image gen providers.
|
||||
|
||||
Each returned dict looks like a regular ``TOOL_CATEGORIES`` provider
|
||||
row but carries an ``image_gen_plugin_name`` marker so downstream
|
||||
code (config writing, model picker) knows to route through the
|
||||
plugin registry instead of the in-tree FAL backend.
|
||||
|
||||
FAL is skipped — it's already exposed by the hardcoded
|
||||
``TOOL_CATEGORIES["image_gen"]`` entries. When FAL gets ported to
|
||||
a plugin in a follow-up PR, the hardcoded entries go away and this
|
||||
function surfaces it alongside OpenAI automatically.
|
||||
"""
|
||||
try:
|
||||
from agent.image_gen_registry import list_providers
|
||||
from hermes_cli.plugins import _ensure_plugins_discovered
|
||||
|
||||
_ensure_plugins_discovered()
|
||||
providers = list_providers()
|
||||
except Exception:
|
||||
return []
|
||||
|
||||
rows: list[dict] = []
|
||||
for provider in providers:
|
||||
if getattr(provider, "name", None) == "fal":
|
||||
# FAL has its own hardcoded rows today.
|
||||
continue
|
||||
try:
|
||||
schema = provider.get_setup_schema()
|
||||
except Exception:
|
||||
continue
|
||||
if not isinstance(schema, dict):
|
||||
continue
|
||||
rows.append(
|
||||
{
|
||||
"name": schema.get("name", provider.display_name),
|
||||
"badge": schema.get("badge", ""),
|
||||
"tag": schema.get("tag", ""),
|
||||
"env_vars": schema.get("env_vars", []),
|
||||
"image_gen_plugin_name": provider.name,
|
||||
}
|
||||
)
|
||||
return rows
|
||||
|
||||
|
||||
def _visible_providers(cat: dict, config: dict) -> list[dict]:
|
||||
"""Return provider entries visible for the current auth/config state."""
|
||||
features = get_nous_subscription_features(config)
|
||||
|
|
@ -857,6 +902,12 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
|
|||
if provider.get("requires_nous_auth") and not features.nous_auth_present:
|
||||
continue
|
||||
visible.append(provider)
|
||||
|
||||
# Inject plugin-registered image_gen backends (OpenAI today, more
|
||||
# later) so the picker lists them alongside FAL / Nous Subscription.
|
||||
if cat.get("name") == "Image Generation":
|
||||
visible.extend(_plugin_image_gen_providers())
|
||||
|
||||
return visible
|
||||
|
||||
|
||||
|
|
@ -876,7 +927,24 @@ def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
|
|||
browser_cfg = config.get("browser", {})
|
||||
return not isinstance(browser_cfg, dict) or "cloud_provider" not in browser_cfg
|
||||
if ts_key == "image_gen":
|
||||
return not fal_key_is_configured()
|
||||
# Satisfied when the in-tree FAL backend is configured OR any
|
||||
# plugin-registered image gen provider is available.
|
||||
if fal_key_is_configured():
|
||||
return False
|
||||
try:
|
||||
from agent.image_gen_registry import list_providers
|
||||
from hermes_cli.plugins import _ensure_plugins_discovered
|
||||
|
||||
_ensure_plugins_discovered()
|
||||
for provider in list_providers():
|
||||
try:
|
||||
if provider.is_available():
|
||||
return False
|
||||
except Exception:
|
||||
continue
|
||||
except Exception:
|
||||
pass
|
||||
return True
|
||||
|
||||
return not _toolset_has_keys(ts_key, config)
|
||||
|
||||
|
|
@ -1095,6 +1163,88 @@ def _configure_imagegen_model(backend_name: str, config: dict) -> None:
|
|||
_print_success(f" Model set to: {chosen}")
|
||||
|
||||
|
||||
def _plugin_image_gen_catalog(plugin_name: str):
|
||||
"""Return ``(catalog_dict, default_model_id)`` for a plugin provider.
|
||||
|
||||
``catalog_dict`` is shaped like the legacy ``FAL_MODELS`` table —
|
||||
``{model_id: {"display", "speed", "strengths", "price", ...}}`` —
|
||||
so the existing picker code paths work without change. Returns
|
||||
``({}, None)`` if the provider isn't registered or has no models.
|
||||
"""
|
||||
try:
|
||||
from agent.image_gen_registry import get_provider
|
||||
from hermes_cli.plugins import _ensure_plugins_discovered
|
||||
|
||||
_ensure_plugins_discovered()
|
||||
provider = get_provider(plugin_name)
|
||||
except Exception:
|
||||
return {}, None
|
||||
if provider is None:
|
||||
return {}, None
|
||||
try:
|
||||
models = provider.list_models() or []
|
||||
default = provider.default_model()
|
||||
except Exception:
|
||||
return {}, None
|
||||
catalog = {m["id"]: m for m in models if isinstance(m, dict) and "id" in m}
|
||||
return catalog, default
|
||||
|
||||
|
||||
def _configure_imagegen_model_for_plugin(plugin_name: str, config: dict) -> None:
|
||||
"""Prompt the user to pick a model for a plugin-registered backend.
|
||||
|
||||
Writes selection to ``image_gen.model``. Mirrors
|
||||
:func:`_configure_imagegen_model` but sources its catalog from the
|
||||
plugin registry instead of :data:`IMAGEGEN_BACKENDS`.
|
||||
"""
|
||||
catalog, default_model = _plugin_image_gen_catalog(plugin_name)
|
||||
if not catalog:
|
||||
return
|
||||
|
||||
cur_cfg = config.setdefault("image_gen", {})
|
||||
if not isinstance(cur_cfg, dict):
|
||||
cur_cfg = {}
|
||||
config["image_gen"] = cur_cfg
|
||||
current_model = cur_cfg.get("model") or default_model
|
||||
if current_model not in catalog:
|
||||
current_model = default_model
|
||||
|
||||
model_ids = list(catalog.keys())
|
||||
ordered = [current_model] + [m for m in model_ids if m != current_model]
|
||||
|
||||
widths = {
|
||||
"model": max(len(m) for m in model_ids),
|
||||
"speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
|
||||
"strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
|
||||
}
|
||||
|
||||
print()
|
||||
header = (
|
||||
f" {'Model':<{widths['model']}} "
|
||||
f"{'Speed':<{widths['speed']}} "
|
||||
f"{'Strengths':<{widths['strengths']}} "
|
||||
f"Price"
|
||||
)
|
||||
print(color(header, Colors.CYAN))
|
||||
|
||||
rows = []
|
||||
for mid in ordered:
|
||||
row = _format_imagegen_model_row(mid, catalog[mid], widths)
|
||||
if mid == current_model:
|
||||
row += " ← currently in use"
|
||||
rows.append(row)
|
||||
|
||||
idx = _prompt_choice(
|
||||
f" Choose {plugin_name} model:",
|
||||
rows,
|
||||
default=0,
|
||||
)
|
||||
|
||||
chosen = ordered[idx]
|
||||
cur_cfg["model"] = chosen
|
||||
_print_success(f" Model set to: {chosen}")
|
||||
|
||||
|
||||
def _configure_provider(provider: dict, config: dict):
|
||||
"""Configure a single provider - prompt for API keys and set config."""
|
||||
env_vars = provider.get("env_vars", [])
|
||||
|
|
@ -1151,10 +1301,28 @@ def _configure_provider(provider: dict, config: dict):
|
|||
_print_success(f" {provider['name']} - no configuration needed!")
|
||||
if managed_feature:
|
||||
_print_info(" Requests for this tool will be billed to your Nous subscription.")
|
||||
# Plugin-registered image_gen provider: write image_gen.provider
|
||||
# and route model selection to the plugin's own catalog.
|
||||
plugin_name = provider.get("image_gen_plugin_name")
|
||||
if plugin_name:
|
||||
img_cfg = config.setdefault("image_gen", {})
|
||||
if not isinstance(img_cfg, dict):
|
||||
img_cfg = {}
|
||||
config["image_gen"] = img_cfg
|
||||
img_cfg["provider"] = plugin_name
|
||||
_print_success(f" image_gen.provider set to: {plugin_name}")
|
||||
_configure_imagegen_model_for_plugin(plugin_name, config)
|
||||
return
|
||||
# Imagegen backends prompt for model selection after backend pick.
|
||||
backend = provider.get("imagegen_backend")
|
||||
if backend:
|
||||
_configure_imagegen_model(backend, config)
|
||||
# In-tree FAL is the only non-plugin backend today. Keep
|
||||
# image_gen.provider clear so the dispatch shim falls through
|
||||
# to the legacy FAL path.
|
||||
img_cfg = config.setdefault("image_gen", {})
|
||||
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
|
||||
img_cfg["provider"] = "fal"
|
||||
return
|
||||
|
||||
# Prompt for each required env var
|
||||
|
|
@ -1189,10 +1357,23 @@ def _configure_provider(provider: dict, config: dict):
|
|||
|
||||
if all_configured:
|
||||
_print_success(f" {provider['name']} configured!")
|
||||
plugin_name = provider.get("image_gen_plugin_name")
|
||||
if plugin_name:
|
||||
img_cfg = config.setdefault("image_gen", {})
|
||||
if not isinstance(img_cfg, dict):
|
||||
img_cfg = {}
|
||||
config["image_gen"] = img_cfg
|
||||
img_cfg["provider"] = plugin_name
|
||||
_print_success(f" image_gen.provider set to: {plugin_name}")
|
||||
_configure_imagegen_model_for_plugin(plugin_name, config)
|
||||
return
|
||||
# Imagegen backends prompt for model selection after env vars are in.
|
||||
backend = provider.get("imagegen_backend")
|
||||
if backend:
|
||||
_configure_imagegen_model(backend, config)
|
||||
img_cfg = config.setdefault("image_gen", {})
|
||||
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
|
||||
img_cfg["provider"] = "fal"
|
||||
|
||||
|
||||
def _configure_simple_requirements(ts_key: str):
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue