From f721d2cda9f25fecd782525d8ea1312cfebec879 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Mon, 22 Jun 2026 13:40:42 -0700
Subject: [PATCH] fix(image/video gen): make schema delivery instruction platform-neutral (#51031)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* chore: re-trigger CI (workflows did not dispatch on prior head)

* fix(image/video gen): make schema delivery instruction platform-neutral

The image_generate and video_generate tool schema descriptions hardcoded
a gateway-only delivery instruction ('display it with markdown
![description](url-or-path) and the gateway will deliver it'). That
schema is sent on every platform, so on CLI it directly contradicted the
CLI platform hint ('Do NOT emit MEDIA:/path tags ... state its absolute
path in plain text'), and on messaging platforms it was also wrong about
the mechanism (local file paths are delivered via MEDIA: tags, not
markdown image syntax; markdown ![]() only works for URLs).

The per-platform file-delivery convention is already owned correctly by
the platform hints in prompt_builder.py. The tool schema now just
describes the result shape (URL or absolute path in the image/video
field) and defers 'how to deliver' to the active platform's guidance.

Provider/model injection already works via _build_dynamic_image_schema()
(the 'Active backend: · model: ' line); no change there.
---
 tools/image_generation_tool.py | 12 +++++++-----
 tools/video_generation_tool.py |  8 +++++---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/image_generation_tool.py b/tools/image_generation_tool.py
index 101b000db2a..81c6491f9d9 100644
--- a/tools/image_generation_tool.py
+++ b/tools/image_generation_tool.py
@@ -1184,11 +1184,13 @@ IMAGE_GENERATE_SCHEMA = {
         "`reference_image_urls` for style/composition references; omit both "
         "for text-to-image. The underlying backend (FAL, OpenAI, xAI, etc.) "
         "and model are user-configured and not selectable by the agent. "
-        "Returns either a URL or an absolute file path in the `image` field; "
-        "display it with markdown ![description](url-or-path) and the gateway "
-        "will deliver it. When the active terminal backend has a different "
-        "filesystem, successful local-file results may also include "
-        "`agent_visible_image` for follow-up terminal/file operations."
+        "Returns the result in the `image` field — either a URL or an absolute "
+        "file path. To show it to the user, reference that path/URL in your "
+        "response using the file-delivery convention for the current platform "
+        "(your platform guidance describes how files are delivered here). When "
+        "the active terminal backend has a different filesystem, successful "
+        "local-file results may also include `agent_visible_image` for "
+        "follow-up terminal/file operations."
     ),
     "parameters": {
         "type": "object",
diff --git a/tools/video_generation_tool.py b/tools/video_generation_tool.py
index 2465199f3d1..789ead6a054 100644
--- a/tools/video_generation_tool.py
+++ b/tools/video_generation_tool.py
@@ -419,9 +419,11 @@ _GENERIC_DESCRIPTION = (
     "endpoint. The backend and model family are user-configured via "
     "`hermes tools` → Video Generation; the agent does not pick them. "
     "Long-running generations may take 30 seconds to several minutes — "
-    "the call blocks until the video is ready. Returns either an HTTP "
-    "URL or an absolute file path in the `video` field; display it with "
-    "markdown ![description](url-or-path) and the gateway will deliver it."
+    "the call blocks until the video is ready. Returns the result in the "
+    "`video` field — either an HTTP URL or an absolute file path. To show "
+    "it to the user, reference that path/URL in your response using the "
+    "file-delivery convention for the current platform (your platform "
+    "guidance describes how files are delivered here)."
 )