diff --git a/scripts/release.py b/scripts/release.py index 1712c327309..c16e8341d24 100755 --- a/scripts/release.py +++ b/scripts/release.py @@ -71,6 +71,7 @@ AUTHOR_MAP = { "kyanam.preetham@gmail.com": "pkyanam", "127238744+teknium1@users.noreply.github.com": "teknium1", "147827411+EloquentBrush@users.noreply.github.com": "AhmetArif0", + "97489706+purzbeats@users.noreply.github.com": "purzbeats", "hugosequier@gmail.com": "Hugo-SEQUIER", "128259593+Gutslabs@users.noreply.github.com": "Gutslabs", "50326054+nocturnum91@users.noreply.github.com": "nocturnum91", diff --git a/skills/creative/comfyui/SKILL.md b/skills/creative/comfyui/SKILL.md index 4fbeb603572..e5a8a7c0745 100644 --- a/skills/creative/comfyui/SKILL.md +++ b/skills/creative/comfyui/SKILL.md @@ -1,8 +1,8 @@ --- name: comfyui description: "Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST/WebSocket API for execution." -version: 5.0.0 -author: [kshitijk4poor, alt-glitch] +version: 5.1.0 +author: [kshitijk4poor, alt-glitch, purzbeats] license: MIT platforms: [macos, linux, windows] compatibility: "Requires ComfyUI (local, Comfy Desktop, or Comfy Cloud) and comfy-cli (auto-installed via pipx/uvx by the setup script)." @@ -40,6 +40,12 @@ for workflow execution. - `official-cli.md` — every `comfy ...` command, with flags - `rest-api.md` — REST + WebSocket endpoints (local + cloud), payload schemas - `workflow-format.md` — API-format JSON, common node types, param mapping +- `template-integrity.md` — converting `comfyui-workflow-templates` from + editor format to API format: Reroute bypass, dotted dynamic-input keys + (`values.a`, `resize_type.width`), Cloud quirks (302 redirect, 1 concurrent + free-tier job, 1080p VRAM ceiling), Discord-compatible ffmpeg stitch. + Authored by [@purzbeats](https://github.com/purzbeats). Load this whenever + you're starting from an official template. **Scripts (`scripts/`):** diff --git a/skills/creative/comfyui/references/template-integrity.md b/skills/creative/comfyui/references/template-integrity.md new file mode 100644 index 00000000000..050e3e6b5cf --- /dev/null +++ b/skills/creative/comfyui/references/template-integrity.md @@ -0,0 +1,243 @@ +# ComfyUI Workflow-Template Integrity + +> **Authored by [@purzbeats](https://github.com/purzbeats)** — adapted from +> [purzbeats/hermes-agent-comfyui-helper](https://github.com/purzbeats/hermes-agent-comfyui-helper). +> Use this reference when converting workflows from the official +> `comfyui-workflow-templates` package (editor format) into API format for +> submission via `/api/prompt`. The conversion has subtle gotchas that cause +> hard-to-diagnose validation errors if you don't follow these rules. + +## Background + +The official ComfyUI template package (`comfyui-workflow-templates`, currently +v0.9.69) is installed inside the ComfyUI venv at a path like: + +``` +/.venv/lib/python3.*/site-packages/comfyui_workflow_templates_*/templates/ +``` + +The exact path depends on how ComfyUI was installed (comfy-cli default, +Comfy Desktop, manual venv, etc.). Find it once with: + +```bash +comfy --workspace run-python -c "import comfyui_workflow_templates, pathlib; print(pathlib.Path(comfyui_workflow_templates.__file__).parent / 'templates')" +``` + +Templates ship in **editor format** — `nodes` / `links` arrays inside +`data['definitions']['subgraphs'][0]`. They must be converted to **API +format** (a `node_id -> {class_type, inputs}` mapping) before submission. + +--- + +## RULE #1: Use templates AS CLOSE TO ORIGINAL AS POSSIBLE + +- **Never strip, simplify, or "minimize" nodes** from a template. +- Full template architecture (dual-pass pipelines, LoRA chains, distilled + sigmas, conditioning paths) is intentional — removing any part breaks quality. +- If an image-dependent path exists but the task is text-to-video, **leave + it wired with the bypass toggle enabled** — don't remove the nodes. +- Only change: prompt text, seed, and dimensions (when explicitly requested). + +## RULE #2: Server validation errors are the source of truth + +When a workflow submission fails, the server response looks like: + +```json +{ + "node_errors": { + "238": { + "errors": [{ + "message": "Required input is missing", + "details": "width", + "extra_info": { "input_name": "resize_type.width" } + }] + } + } +} +``` + +**The `extra_info.input_name` field tells you EXACTLY what JSON key the server +wants. Use it literally.** If it says `"values.a"` or `"resize_type.width"`, +those are the actual key names in the JSON object. Do not "simplify" them to +flat names based on assumptions about what the field "should" be called. + +## RULE #3: Don't rebuild from scratch — patch the failing nodes + +Every regeneration from the template reintroduces the same bugs. Instead: + +1. Submit the workflow once. +2. Read the server error details for exact key names. +3. Use targeted patch/fix calls against the workflow file on disk. +4. Resubmit and check if errors resolved. + +--- + +## Reroute nodes: bypass, don't delete + +Most servers (local, Cloud) don't have a `Reroute` node type. When converting +a template: + +1. Find what feeds into the Reroute by looking at links where + `target_id` = the Reroute node ID. +2. Replace all inputs referencing the Reroute with + `[source_node_id, source_slot]`. +3. Delete the Reroute node from the API mapping. + +**Real example — LTX 2.3 t2v template:** + +- Reroute node 255 receives VAE from `CheckpointLoaderSimple 236` slot 2. +- Three nodes reference Reroute 255 for their VAE input: + `LTXVImgToVideoInplace` (230), `LTXVLatentUpsampler` (253), + `VAEDecodeTiled` (251). +- Fix: replace all occurrences of `vae: ["255", 0]` with `vae: ["236", 2]`. +- `CheckpointLoaderSimple` slot 2 = VAE (not slot 0 = MODEL). + +| | | +|---|---| +| ❌ Wrong | `vae: ["236", 0]` → `MODELV mismatch input_type(VAE)` | +| ✅ Correct | `vae: ["236", 2]` | + +--- + +## Dynamic template nodes: dotted key names are correct + +### ComfyMathExpression (COMFY_AUTOGROW_V3) + +```json +{ + "class_type": "ComfyMathExpression", + "inputs": { + "expression": "a/2", + "values.a": ["257", 0] + } +} +``` + +- `values` is a `COMFY_AUTOGROW_V3` template. +- Input names in links are `values.a`, `values.b`, etc. +- **Keep the dotted format as JSON keys.** +- Do NOT convert to `{"values": {"a": ...}}` or flatten to just `"a"`. + +### ResizeImageMaskNode (COMFY_DYNAMICCOMBO_V3) + +```json +{ + "class_type": "ResizeImageMaskNode", + "inputs": { + "input": ["276", 0], + "scale_method": "lanczos", + "resize_type": "scale dimensions", + "resize_type.width": 1920, + "resize_type.height": 1088, + "resize_type.crop": "center" + } +} +``` + +- `resize_type` is a `COMFY_DYNAMICCOMBO_V3`. +- Mode-specific fields: `resize_type.width`, `resize_type.height`, `resize_type.crop`. +- `scale_method` options: `"nearest-exact"`, `"bilinear"`, `"area"`, `"bicubic"`, `"lanczos"`. +- **Keep the dotted format as JSON keys.** +- Do NOT flatten `resize_type.width` to just `"width"`. + +--- + +## Conversion recipe + +1. Load template from the installed package path. +2. Parse `data['definitions']['subgraphs'][0]`. +3. For each node (skip Reroute): + - Resolve linked inputs from `sg['links']` dict. + - Map `widgets_values` to input field names. + - Keep all dotted key names as-is from the template. +4. Bypass Reroute: trace source, replace references. +5. Change only: prompt text, seed values, and user-requested parameters. +6. Add `SaveVideo` terminal node if template uses only `CreateVideo`. +7. Submit → read errors → patch specific nodes → resubmit. + +## What to NEVER change in a template + +| Element | Why | +|---------|-----| +| Node topology | Graph is designed for the specific model | +| Sigmas values | Tuned for the model/sampler combination | +| LoRA/distilled paths | Required for quality, even if they look unused | +| Model parameters (cfg, steps, shifts) | Model-specific | +| Conditioning chains (zero-out, crop guides) | Required for correct conditioning | +| Pass-through wiring | Don't remove nodes, bypass them | + +--- + +## Cloud compatibility (verified May 2025) + +The full LTX 2.3 T2V template (`video_ltx2_3_t2v.json`) runs **without +modification** on Comfy Cloud. + +**Confirmed working on Cloud (all custom nodes available):** +`ComfyMathExpression`, `ResizeImageMaskNode`, `ResizeImagesByLongerEdge`, +`PrimitiveInt`, `PrimitiveStringMultiline`, `PrimitiveBoolean`, `SaveVideo`, +`LTXVCropGuides`, `LTXVImgToVideoInplace`, `LTXVConcatAVLatent`, +`LTXVSeparateAVLatent`, `LTXVLatentUpsampler`, `LTXVAudioVAELoader`, +`LTXVAudioVAEDecode`, `LTXVEmptyLatentAudio`, `LTXVPreprocess`, +`LTXVConditioning`, `ManualSigmas`, `LTXAVTextEncoderLoader`, plus all core +nodes. + +**Cloud vs Local for LTX 2.3 (768x512):** + +- Cloud: ~39s per video (4x faster). +- Local (RTX 5090): ~160s per video. +- `example.png` placeholder works on Cloud for bypassed image-dependent paths. +- Submission format is **identical** between local and Cloud: + `{"prompt": wf, "extra_data": {}}` to `/api/prompt`. +- Free tier = 1 concurrent job. + +**Cloud submission pitfalls:** + +- `/api/object_info/` returns 404 on free tier — can't query node + schemas remotely, but the workflow runs fine anyway. Always probe + `object_info` locally before building workflows. +- Cloud is ~4x faster — prefer Cloud for batch runs unless local is needed + for debugging. +- Cloud `/api/view` returns **302 redirect to signed GCS URL** — use + `curl -s -L` to follow and download. Python `urllib` fails with 401 + (forwards auth headers to GCS CDN). +- `COMFY_CLOUD_API_KEY` is only in the terminal/bash env, not in the Python + sandbox. Use subprocess or terminal scripts for Cloud API calls. +- Cloud free tier processes jobs **sequentially** (1 at a time). Submit all, + then poll history. +- LTX 2.3 at **1920x1080 OOMs locally** (even RTX 5090) — upscaler pass + exceeds VRAM. Prefer Cloud for 1080p; use 1280x720 locally (~90s/video). + +--- + +## FFmpeg stitch settings (Discord-compatible) + +Generated ComfyUI videos often use `yuv444p` pixel format which does NOT work +on Discord. Re-encode with: + +```bash +ffmpeg -y -i input.mp4 \ + -c:v libx264 -profile:v main -preset medium -crf 13 -pix_fmt yuv420p \ + -c:a aac -b:a 192k \ + output_discord.mp4 +``` + +Key settings: + +- `-pix_fmt yuv420p` — **required for Discord**, ComfyUI outputs `yuv444p` by default. +- `-crf 13` — high quality without massive file size (default 23 is too lossy). +- `-profile:v main` — widely compatible. + +For multi-video crossfade stitching, chain `xfade` (video) and `acrossfade` +(audio): + +```bash +ffmpeg -y -i a.mp4 -i b.mp4 -i c.mp4 \ + -filter_complex "[0:v][1:v]xfade=transition=fade:duration=1:offset=3.04[v1];[v1][2:v]xfade=transition=fade:duration=1:offset=6.08[vout];[0:a][1:a]acrossfade=duration=1:c1=tri:c2=tri[a1];[a1][2:a]acrossfade=duration=1:c1=tri:c2=tri[aout]" \ + -map "[vout]" -map "[aout]" \ + -c:v libx264 -profile:v main -crf 13 -pix_fmt yuv420p \ + -c:a aac -b:a 192k \ + output.mp4 +``` + +Offset for xfade #N = `(N+1) × duration - N × overlap`.