mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-20 10:11:58 +00:00
* feat(image-gen): add image-to-image / editing to image_generate Brings image generation to parity with video generation: the unified image_generate tool now edits/transforms a source image (image-to-image) when given image_url / reference_image_urls, routing to each backend's edit endpoint, exactly as video_generate routes to image-to-video. - ImageGenProvider ABC: generate() gains keyword-only image_url + reference_image_urls; new capabilities() declares modalities + max_reference_images (defaults to text-only, backward compatible). success_response gains a modality field; adds normalize_reference_images. - image_generate tool: schema exposes image_url + reference_image_urls; dynamic schema reflects the active model's actual edit capability so the agent knows when image_url is honored. Handler + plugin dispatch forward the new inputs; legacy/text-only providers get a clear modality_unsupported error instead of silently dropping the source image. - In-tree FAL: 7 models gain edit endpoints (flux-2-klein, flux-2-pro, nano-banana-pro, gpt-image-1.5, gpt-image-2, ideogram/v3, qwen-image) with per-model edit_supports whitelists + reference caps; routes to the /edit endpoint and skips the upscaler for edits. - Plugins: openai (images.edit, 16 refs), xai (/v1/images/edits via grok-imagine-image-quality, JSON body per xAI docs), krea (image_style_references, 10 refs). openai-codex stays text-only and rejects edits with an actionable error. - Tests: 15 new (payload, routing, dispatch forwarding, dynamic schema, capabilities); updated 2 change-detector/lambda tests for the new schema. - Docs: image-generation feature page, image-gen provider plugin guide, tools reference. * fix(image-gen): preserve legacy passthrough in fal/krea plugin tests Two existing plugin tests asserted pre-image-to-image behavior: - fal: forward image_url/reference_image_urls only when supplied, so a text-to-image delegation stays byte-identical (no None kwargs). - krea: keep dict-shaped image_style_references refs verbatim (the unified string refs go through normalize_reference_images; legacy non-string ref objects pass through unchanged) — fixes KeyError when callers pass the richer Krea ref-object shape. * fix(image-gen): clearer not-capable message for text-to-image-only models When a text-to-image-only model (incl. gpt-image-2 on the Codex OAuth path, which can't do editing through the Responses image_generation tool) gets a source image, say 'this model is not capable of image-to-image / editing — provide a text-only prompt' rather than sending the user shopping for other backends. Applies to the openai-codex guard, the in-tree FAL no-edit-endpoint error, and the dynamic tool-schema text-only line. |
||
|---|---|---|
| .. | ||
| _category_.json | ||
| acp.md | ||
| api-server.md | ||
| batch-processing.md | ||
| browser.md | ||
| built-in-plugins.md | ||
| code-execution.md | ||
| codex-app-server-runtime.md | ||
| computer-use.md | ||
| context-files.md | ||
| context-references.md | ||
| credential-pools.md | ||
| cron.md | ||
| curator.md | ||
| delegation.md | ||
| deliverable-mode.md | ||
| extending-the-dashboard.md | ||
| fallback-providers.md | ||
| goals.md | ||
| honcho.md | ||
| hooks.md | ||
| image-generation.md | ||
| kanban-tutorial.md | ||
| kanban-worker-lanes.md | ||
| kanban.md | ||
| lsp.md | ||
| mcp.md | ||
| memory-providers.md | ||
| memory.md | ||
| overview.md | ||
| personality.md | ||
| plugins.md | ||
| provider-routing.md | ||
| skills.md | ||
| skins.md | ||
| spotify.md | ||
| subscription-proxy.md | ||
| tool-gateway.md | ||
| tool-search.md | ||
| tools.md | ||
| tts.md | ||
| vision.md | ||
| voice-mode.md | ||
| web-dashboard.md | ||
| web-search.md | ||
| x-search.md | ||