mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)
* feat(plugins): pluggable image_gen backends + OpenAI provider
Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:
- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
`image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
`plugin.yaml` of their own).
Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.
FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.
- 41 unit tests (scanner recursion, kind parsing, gate logic,
registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
`_handle_image_generate` routes to OpenAI when configured
* fix(image_gen/openai): don't send response_format to gpt-image-*
The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.
* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog
gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.
Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.
* feat(image_gen/openai): expose gpt-image-2 as three quality tiers
Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:
gpt-image-2-low ~15s fast iteration
gpt-image-2-medium ~40s default
gpt-image-2-high ~2min highest fidelity
Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.
Config:
image_gen.openai.model: gpt-image-2-high
# or
image_gen.model: gpt-image-2-low
# or env var for scripts/tests
OPENAI_IMAGE_MODEL=gpt-image-2-medium
Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.
* feat(tools_config): plugin image_gen providers inject themselves into picker
'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.
Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
(name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
writes image_gen.provider and routes to the plugin's list_models()
catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
FAL key when any plugin provider reports is_available().
FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.
Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.
397 tests pass across plugins/, tools_config, registry, and picker.
* fix(image_gen): close final gaps for plugin-backend parity with FAL
Two small places that still hardcoded FAL:
- hermes_cli/setup.py status line: an OpenAI-only setup showed
'Image Generation: missing FAL_KEY'. Now probes plugin providers
and reports '(OpenAI)' when one is_available() — or falls back to
'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.
- image_generate tool schema description: said 'using FAL.ai, default
FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
user-configured' — and notes the 'image' field can be a URL or an
absolute path, which the gateway delivers either way via
extract_local_files().
This commit is contained in:
parent
d1acf17773
commit
ff9752410a
13 changed files with 2122 additions and 67 deletions
|
|
@ -133,6 +133,9 @@ def _get_enabled_plugins() -> Optional[set]:
|
|||
# Data classes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive"}
|
||||
|
||||
|
||||
@dataclass
|
||||
class PluginManifest:
|
||||
"""Parsed representation of a plugin.yaml manifest."""
|
||||
|
|
@ -146,6 +149,23 @@ class PluginManifest:
|
|||
provides_hooks: List[str] = field(default_factory=list)
|
||||
source: str = "" # "user", "project", or "entrypoint"
|
||||
path: Optional[str] = None
|
||||
# Plugin kind — see plugins.py module docstring for semantics.
|
||||
# ``standalone`` (default): hooks/tools of its own; opt-in via
|
||||
# ``plugins.enabled``.
|
||||
# ``backend``: pluggable backend for an existing core tool (e.g.
|
||||
# image_gen). Built-in (bundled) backends auto-load;
|
||||
# user-installed still gated by ``plugins.enabled``.
|
||||
# ``exclusive``: category with exactly one active provider (memory).
|
||||
# Selection via ``<category>.provider`` config key; the
|
||||
# category's own discovery system handles loading and the
|
||||
# general scanner skips these.
|
||||
kind: str = "standalone"
|
||||
# Registry key — path-derived, used by ``plugins.enabled``/``disabled``
|
||||
# lookups and by ``hermes plugins list``. For a flat plugin at
|
||||
# ``plugins/disk-cleanup/`` the key is ``disk-cleanup``; for a nested
|
||||
# category plugin at ``plugins/image_gen/openai/`` the key is
|
||||
# ``image_gen/openai``. When empty, falls back to ``name``.
|
||||
key: str = ""
|
||||
|
||||
|
||||
@dataclass
|
||||
|
|
@ -366,6 +386,33 @@ class PluginContext:
|
|||
self.manifest.name, engine.name,
|
||||
)
|
||||
|
||||
# -- image gen provider registration ------------------------------------
|
||||
|
||||
def register_image_gen_provider(self, provider) -> None:
|
||||
"""Register an image generation backend.
|
||||
|
||||
``provider`` must be an instance of
|
||||
:class:`agent.image_gen_provider.ImageGenProvider`. The
|
||||
``provider.name`` attribute is what ``image_gen.provider`` in
|
||||
``config.yaml`` matches against when routing ``image_generate``
|
||||
tool calls.
|
||||
"""
|
||||
from agent.image_gen_provider import ImageGenProvider
|
||||
from agent.image_gen_registry import register_provider
|
||||
|
||||
if not isinstance(provider, ImageGenProvider):
|
||||
logger.warning(
|
||||
"Plugin '%s' tried to register an image_gen provider that does "
|
||||
"not inherit from ImageGenProvider. Ignoring.",
|
||||
self.manifest.name,
|
||||
)
|
||||
return
|
||||
register_provider(provider)
|
||||
logger.info(
|
||||
"Plugin '%s' registered image_gen provider: %s",
|
||||
self.manifest.name, provider.name,
|
||||
)
|
||||
|
||||
# -- hook registration --------------------------------------------------
|
||||
|
||||
def register_hook(self, hook_name: str, callback: Callable) -> None:
|
||||
|
|
@ -465,11 +512,16 @@ class PluginManager:
|
|||
manifests: List[PluginManifest] = []
|
||||
|
||||
# 1. Bundled plugins (<repo>/plugins/<name>/)
|
||||
# Repo-shipped generic plugins live next to hermes_cli/. Memory and
|
||||
# context_engine subdirs are handled by their own discovery paths, so
|
||||
# skip those names here. Bundled plugins are discovered (so they
|
||||
# show up in `hermes plugins`) but only loaded when added to
|
||||
# `plugins.enabled` in config.yaml — opt-in like any other plugin.
|
||||
#
|
||||
# Repo-shipped plugins live next to hermes_cli/. Two layouts are
|
||||
# supported (see ``_scan_directory`` for details):
|
||||
#
|
||||
# - flat: ``plugins/disk-cleanup/plugin.yaml`` (standalone)
|
||||
# - category: ``plugins/image_gen/openai/plugin.yaml`` (backend)
|
||||
#
|
||||
# ``memory/`` and ``context_engine/`` are skipped at the top level —
|
||||
# they have their own discovery systems. Porting those to the
|
||||
# category-namespace ``kind: exclusive`` model is a future PR.
|
||||
repo_plugins = Path(__file__).resolve().parent.parent / "plugins"
|
||||
manifests.extend(
|
||||
self._scan_directory(
|
||||
|
|
@ -492,36 +544,69 @@ class PluginManager:
|
|||
manifests.extend(self._scan_entry_points())
|
||||
|
||||
# Load each manifest (skip user-disabled plugins).
|
||||
# Later sources override earlier ones on name collision — user plugins
|
||||
# take precedence over bundled, project plugins take precedence over
|
||||
# user. Dedup here so we only load the final winner.
|
||||
# Later sources override earlier ones on key collision — user
|
||||
# plugins take precedence over bundled, project plugins take
|
||||
# precedence over user. Dedup here so we only load the final
|
||||
# winner. Keys are path-derived (``image_gen/openai``,
|
||||
# ``disk-cleanup``) so ``tts/openai`` and ``image_gen/openai``
|
||||
# don't collide even when both manifests say ``name: openai``.
|
||||
disabled = _get_disabled_plugins()
|
||||
enabled = _get_enabled_plugins() # None = opt-in default (nothing enabled)
|
||||
winners: Dict[str, PluginManifest] = {}
|
||||
for manifest in manifests:
|
||||
winners[manifest.name] = manifest
|
||||
winners[manifest.key or manifest.name] = manifest
|
||||
for manifest in winners.values():
|
||||
# Explicit disable always wins.
|
||||
if manifest.name in disabled:
|
||||
lookup_key = manifest.key or manifest.name
|
||||
|
||||
# Explicit disable always wins (matches on key or on legacy
|
||||
# bare name for back-compat with existing user configs).
|
||||
if lookup_key in disabled or manifest.name in disabled:
|
||||
loaded = LoadedPlugin(manifest=manifest, enabled=False)
|
||||
loaded.error = "disabled via config"
|
||||
self._plugins[manifest.name] = loaded
|
||||
logger.debug("Skipping disabled plugin '%s'", manifest.name)
|
||||
self._plugins[lookup_key] = loaded
|
||||
logger.debug("Skipping disabled plugin '%s'", lookup_key)
|
||||
continue
|
||||
# Opt-in gate: plugins must be in the enabled allow-list.
|
||||
# If the allow-list is missing (None), treat as "nothing enabled"
|
||||
# — users have to explicitly enable plugins to load them.
|
||||
# Memory and context_engine providers are excluded from this gate
|
||||
# since they have their own single-select config (memory.provider
|
||||
# / context.engine), not the enabled list.
|
||||
if enabled is None or manifest.name not in enabled:
|
||||
|
||||
# Exclusive plugins (memory providers) have their own
|
||||
# discovery/activation path. The general loader records the
|
||||
# manifest for introspection but does not load the module.
|
||||
if manifest.kind == "exclusive":
|
||||
loaded = LoadedPlugin(manifest=manifest, enabled=False)
|
||||
loaded.error = "not enabled in config (run `hermes plugins enable {}` to activate)".format(
|
||||
manifest.name
|
||||
loaded.error = (
|
||||
"exclusive plugin — activate via <category>.provider config"
|
||||
)
|
||||
self._plugins[manifest.name] = loaded
|
||||
self._plugins[lookup_key] = loaded
|
||||
logger.debug(
|
||||
"Skipping '%s' (not in plugins.enabled)", manifest.name
|
||||
"Skipping '%s' (exclusive, handled by category discovery)",
|
||||
lookup_key,
|
||||
)
|
||||
continue
|
||||
|
||||
# Built-in backends auto-load — they ship with hermes and must
|
||||
# just work. Selection among them (e.g. which image_gen backend
|
||||
# services calls) is driven by ``<category>.provider`` config,
|
||||
# enforced by the tool wrapper.
|
||||
if manifest.kind == "backend" and manifest.source == "bundled":
|
||||
self._load_plugin(manifest)
|
||||
continue
|
||||
|
||||
# Everything else (standalone, user-installed backends,
|
||||
# entry-point plugins) is opt-in via plugins.enabled.
|
||||
# Accept both the path-derived key and the legacy bare name
|
||||
# so existing configs keep working.
|
||||
is_enabled = (
|
||||
enabled is not None
|
||||
and (lookup_key in enabled or manifest.name in enabled)
|
||||
)
|
||||
if not is_enabled:
|
||||
loaded = LoadedPlugin(manifest=manifest, enabled=False)
|
||||
loaded.error = (
|
||||
"not enabled in config (run `hermes plugins enable {}` to activate)"
|
||||
.format(lookup_key)
|
||||
)
|
||||
self._plugins[lookup_key] = loaded
|
||||
logger.debug(
|
||||
"Skipping '%s' (not in plugins.enabled)", lookup_key
|
||||
)
|
||||
continue
|
||||
self._load_plugin(manifest)
|
||||
|
|
@ -545,9 +630,37 @@ class PluginManager:
|
|||
) -> List[PluginManifest]:
|
||||
"""Read ``plugin.yaml`` manifests from subdirectories of *path*.
|
||||
|
||||
*skip_names* is an optional allow-list of names to ignore (used
|
||||
for the bundled scan to exclude ``memory`` / ``context_engine``
|
||||
subdirs that have their own discovery path).
|
||||
Supports two layouts, mixed freely:
|
||||
|
||||
* **Flat** — ``<root>/<plugin-name>/plugin.yaml``. Key is
|
||||
``<plugin-name>`` (e.g. ``disk-cleanup``).
|
||||
* **Category** — ``<root>/<category>/<plugin-name>/plugin.yaml``,
|
||||
where the ``<category>`` directory itself has no ``plugin.yaml``.
|
||||
Key is ``<category>/<plugin-name>`` (e.g. ``image_gen/openai``).
|
||||
Depth is capped at two segments.
|
||||
|
||||
*skip_names* is an optional allow-list of names to ignore at the
|
||||
top level (kept for back-compat; the current call sites no longer
|
||||
pass it now that categories are first-class).
|
||||
"""
|
||||
return self._scan_directory_level(
|
||||
path, source, skip_names=skip_names, prefix="", depth=0
|
||||
)
|
||||
|
||||
def _scan_directory_level(
|
||||
self,
|
||||
path: Path,
|
||||
source: str,
|
||||
*,
|
||||
skip_names: Optional[Set[str]],
|
||||
prefix: str,
|
||||
depth: int,
|
||||
) -> List[PluginManifest]:
|
||||
"""Recursive implementation of :meth:`_scan_directory`.
|
||||
|
||||
``prefix`` is the category path already accumulated ("" at root,
|
||||
"image_gen" one level in). ``depth`` is the recursion depth; we
|
||||
cap at 2 so ``<root>/a/b/c/`` is ignored.
|
||||
"""
|
||||
manifests: List[PluginManifest] = []
|
||||
if not path.is_dir():
|
||||
|
|
@ -556,37 +669,88 @@ class PluginManager:
|
|||
for child in sorted(path.iterdir()):
|
||||
if not child.is_dir():
|
||||
continue
|
||||
if skip_names and child.name in skip_names:
|
||||
if depth == 0 and skip_names and child.name in skip_names:
|
||||
continue
|
||||
manifest_file = child / "plugin.yaml"
|
||||
if not manifest_file.exists():
|
||||
manifest_file = child / "plugin.yml"
|
||||
if not manifest_file.exists():
|
||||
logger.debug("Skipping %s (no plugin.yaml)", child)
|
||||
|
||||
if manifest_file.exists():
|
||||
manifest = self._parse_manifest(
|
||||
manifest_file, child, source, prefix
|
||||
)
|
||||
if manifest is not None:
|
||||
manifests.append(manifest)
|
||||
continue
|
||||
|
||||
try:
|
||||
if yaml is None:
|
||||
logger.warning("PyYAML not installed – cannot load %s", manifest_file)
|
||||
continue
|
||||
data = yaml.safe_load(manifest_file.read_text()) or {}
|
||||
manifest = PluginManifest(
|
||||
name=data.get("name", child.name),
|
||||
version=str(data.get("version", "")),
|
||||
description=data.get("description", ""),
|
||||
author=data.get("author", ""),
|
||||
requires_env=data.get("requires_env", []),
|
||||
provides_tools=data.get("provides_tools", []),
|
||||
provides_hooks=data.get("provides_hooks", []),
|
||||
source=source,
|
||||
path=str(child),
|
||||
# No manifest at this level. If we're still within the depth
|
||||
# cap, treat this directory as a category namespace and recurse
|
||||
# one level in looking for children with manifests.
|
||||
if depth >= 1:
|
||||
logger.debug("Skipping %s (no plugin.yaml, depth cap reached)", child)
|
||||
continue
|
||||
|
||||
sub_prefix = f"{prefix}/{child.name}" if prefix else child.name
|
||||
manifests.extend(
|
||||
self._scan_directory_level(
|
||||
child,
|
||||
source,
|
||||
skip_names=None,
|
||||
prefix=sub_prefix,
|
||||
depth=depth + 1,
|
||||
)
|
||||
manifests.append(manifest)
|
||||
except Exception as exc:
|
||||
logger.warning("Failed to parse %s: %s", manifest_file, exc)
|
||||
)
|
||||
|
||||
return manifests
|
||||
|
||||
def _parse_manifest(
|
||||
self,
|
||||
manifest_file: Path,
|
||||
plugin_dir: Path,
|
||||
source: str,
|
||||
prefix: str,
|
||||
) -> Optional[PluginManifest]:
|
||||
"""Parse a single ``plugin.yaml`` into a :class:`PluginManifest`.
|
||||
|
||||
Returns ``None`` on parse failure (logs a warning).
|
||||
"""
|
||||
try:
|
||||
if yaml is None:
|
||||
logger.warning("PyYAML not installed – cannot load %s", manifest_file)
|
||||
return None
|
||||
data = yaml.safe_load(manifest_file.read_text()) or {}
|
||||
|
||||
name = data.get("name", plugin_dir.name)
|
||||
key = f"{prefix}/{plugin_dir.name}" if prefix else name
|
||||
|
||||
raw_kind = data.get("kind", "standalone")
|
||||
if not isinstance(raw_kind, str):
|
||||
raw_kind = "standalone"
|
||||
kind = raw_kind.strip().lower()
|
||||
if kind not in _VALID_PLUGIN_KINDS:
|
||||
logger.warning(
|
||||
"Plugin %s: unknown kind '%s' (valid: %s); treating as 'standalone'",
|
||||
key, raw_kind, ", ".join(sorted(_VALID_PLUGIN_KINDS)),
|
||||
)
|
||||
kind = "standalone"
|
||||
|
||||
return PluginManifest(
|
||||
name=name,
|
||||
version=str(data.get("version", "")),
|
||||
description=data.get("description", ""),
|
||||
author=data.get("author", ""),
|
||||
requires_env=data.get("requires_env", []),
|
||||
provides_tools=data.get("provides_tools", []),
|
||||
provides_hooks=data.get("provides_hooks", []),
|
||||
source=source,
|
||||
path=str(plugin_dir),
|
||||
kind=kind,
|
||||
key=key,
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("Failed to parse %s: %s", manifest_file, exc)
|
||||
return None
|
||||
|
||||
# -----------------------------------------------------------------------
|
||||
# Entry-point scanning
|
||||
# -----------------------------------------------------------------------
|
||||
|
|
@ -609,6 +773,7 @@ class PluginManager:
|
|||
name=ep.name,
|
||||
source="entrypoint",
|
||||
path=ep.value,
|
||||
key=ep.name,
|
||||
)
|
||||
manifests.append(manifest)
|
||||
except Exception as exc:
|
||||
|
|
@ -670,10 +835,16 @@ class PluginManager:
|
|||
loaded.error = str(exc)
|
||||
logger.warning("Failed to load plugin '%s': %s", manifest.name, exc)
|
||||
|
||||
self._plugins[manifest.name] = loaded
|
||||
self._plugins[manifest.key or manifest.name] = loaded
|
||||
|
||||
def _load_directory_module(self, manifest: PluginManifest) -> types.ModuleType:
|
||||
"""Import a directory-based plugin as ``hermes_plugins.<name>``."""
|
||||
"""Import a directory-based plugin as ``hermes_plugins.<slug>``.
|
||||
|
||||
The module slug is derived from ``manifest.key`` so category-namespaced
|
||||
plugins (``image_gen/openai``) import as
|
||||
``hermes_plugins.image_gen__openai`` without colliding with any
|
||||
future ``tts/openai``.
|
||||
"""
|
||||
plugin_dir = Path(manifest.path) # type: ignore[arg-type]
|
||||
init_file = plugin_dir / "__init__.py"
|
||||
if not init_file.exists():
|
||||
|
|
@ -686,7 +857,9 @@ class PluginManager:
|
|||
ns_pkg.__package__ = _NS_PARENT
|
||||
sys.modules[_NS_PARENT] = ns_pkg
|
||||
|
||||
module_name = f"{_NS_PARENT}.{manifest.name.replace('-', '_')}"
|
||||
key = manifest.key or manifest.name
|
||||
slug = key.replace("/", "__").replace("-", "_")
|
||||
module_name = f"{_NS_PARENT}.{slug}"
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
module_name,
|
||||
init_file,
|
||||
|
|
@ -767,10 +940,12 @@ class PluginManager:
|
|||
def list_plugins(self) -> List[Dict[str, Any]]:
|
||||
"""Return a list of info dicts for all discovered plugins."""
|
||||
result: List[Dict[str, Any]] = []
|
||||
for name, loaded in sorted(self._plugins.items()):
|
||||
for key, loaded in sorted(self._plugins.items()):
|
||||
result.append(
|
||||
{
|
||||
"name": name,
|
||||
"name": loaded.manifest.name,
|
||||
"key": loaded.manifest.key or loaded.manifest.name,
|
||||
"kind": loaded.manifest.kind,
|
||||
"version": loaded.manifest.version,
|
||||
"description": loaded.manifest.description,
|
||||
"source": loaded.manifest.source,
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue