hermes-agent/website/docs/developer-guide
Teknium c02192ff6a
feat(image-gen): add image-to-image / editing to image_generate (#48705)
* feat(image-gen): add image-to-image / editing to image_generate

Brings image generation to parity with video generation: the unified
image_generate tool now edits/transforms a source image (image-to-image)
when given image_url / reference_image_urls, routing to each backend's
edit endpoint, exactly as video_generate routes to image-to-video.

- ImageGenProvider ABC: generate() gains keyword-only image_url +
  reference_image_urls; new capabilities() declares modalities +
  max_reference_images (defaults to text-only, backward compatible).
  success_response gains a modality field; adds normalize_reference_images.
- image_generate tool: schema exposes image_url + reference_image_urls;
  dynamic schema reflects the active model's actual edit capability so the
  agent knows when image_url is honored. Handler + plugin dispatch forward
  the new inputs; legacy/text-only providers get a clear modality_unsupported
  error instead of silently dropping the source image.
- In-tree FAL: 7 models gain edit endpoints (flux-2-klein, flux-2-pro,
  nano-banana-pro, gpt-image-1.5, gpt-image-2, ideogram/v3, qwen-image)
  with per-model edit_supports whitelists + reference caps; routes to the
  /edit endpoint and skips the upscaler for edits.
- Plugins: openai (images.edit, 16 refs), xai (/v1/images/edits via
  grok-imagine-image-quality, JSON body per xAI docs), krea
  (image_style_references, 10 refs). openai-codex stays text-only and
  rejects edits with an actionable error.
- Tests: 15 new (payload, routing, dispatch forwarding, dynamic schema,
  capabilities); updated 2 change-detector/lambda tests for the new schema.
- Docs: image-generation feature page, image-gen provider plugin guide,
  tools reference.

* fix(image-gen): preserve legacy passthrough in fal/krea plugin tests

Two existing plugin tests asserted pre-image-to-image behavior:
- fal: forward image_url/reference_image_urls only when supplied, so a
  text-to-image delegation stays byte-identical (no None kwargs).
- krea: keep dict-shaped image_style_references refs verbatim (the unified
  string refs go through normalize_reference_images; legacy non-string ref
  objects pass through unchanged) — fixes KeyError when callers pass the
  richer Krea ref-object shape.

* fix(image-gen): clearer not-capable message for text-to-image-only models

When a text-to-image-only model (incl. gpt-image-2 on the Codex OAuth path,
which can't do editing through the Responses image_generation tool) gets a
source image, say 'this model is not capable of image-to-image / editing —
provide a text-only prompt' rather than sending the user shopping for other
backends. Applies to the openai-codex guard, the in-tree FAL no-edit-endpoint
error, and the dynamic tool-schema text-only line.
2026-06-18 22:13:07 -07:00
..
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
acp-internals.md feat(acp-registry): switch to uvx distribution, drop npm launcher 2026-05-14 22:27:09 -07:00
adding-platform-adapters.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
adding-providers.md docs: 30-day overhaul — correctness audit, PR coverage, Nous Portal weave, sidebar reorg (#33782) 2026-05-28 02:41:36 -07:00
adding-tools.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
agent-loop.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
architecture.md docs(prompt): align precedence docs with system prompt runtime 2026-05-29 12:06:22 -07:00
browser-supervisor.md docs: 30-day overhaul — correctness audit, PR coverage, Nous Portal weave, sidebar reorg (#33782) 2026-05-28 02:41:36 -07:00
context-compression-and-caching.md feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth (#40957) 2026-06-07 01:40:50 -07:00
context-engine-plugin.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
contributing.md docs: recommend standard installer for development (#46646) 2026-06-15 06:14:57 -07:00
creating-skills.md refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
cron-internals.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
extending-the-cli.md docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) 2026-04-29 20:55:59 -07:00
gateway-internals.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
image-gen-provider-plugin.md feat(image-gen): add image-to-image / editing to image_generate (#48705) 2026-06-18 22:13:07 -07:00
memory-provider-plugin.md feat: expose completed-turn message context to memory providers 2026-05-29 02:16:43 +05:30
model-provider-plugin.md fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models (#43436) 2026-06-10 15:03:01 +05:30
plugin-llm-access.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
programmatic-integration.md feat: add TUI session orchestrator 2026-05-26 20:51:59 -07:00
prompt-assembly.md docs(prompt): document platform_hints config override 2026-06-18 14:28:01 -07:00
provider-runtime.md docs: 30-day overhaul — correctness audit, PR coverage, Nous Portal weave, sidebar reorg (#33782) 2026-05-28 02:41:36 -07:00
session-storage.md docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) 2026-04-29 20:55:59 -07:00
tools-runtime.md remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
trajectory-format.md docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) 2026-04-05 19:45:50 -07:00
video-gen-provider-plugin.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
web-search-provider-plugin.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00