diff --git a/skills/creative/baoyu-article-illustrator/PORT_NOTES.md b/skills/creative/baoyu-article-illustrator/PORT_NOTES.md
index cba424387c2..d81dbc9ed83 100644
--- a/skills/creative/baoyu-article-illustrator/PORT_NOTES.md
+++ b/skills/creative/baoyu-article-illustrator/PORT_NOTES.md
@@ -4,7 +4,7 @@ Ported from [JimLiu/baoyu-skills](https://github.com/JimLiu/baoyu-skills) v1.57.

 ## Changes from upstream

-`SKILL.md`, `references/workflow.md`, `references/usage.md`, `references/style-presets.md`, `references/styles.md`, and `references/prompt-construction.md` were adapted. The 21 style files, 4 palette files, and `prompts/system.md` are verbatim copies. The `references/config/` directory was removed entirely.
+`SKILL.md`, `references/workflow.md`, `references/usage.md`, `references/style-presets.md`, `references/styles.md`, `references/prompt-construction.md`, and `prompts/system.md` were adapted. The 23 style files and 4 palette files are verbatim copies. The `references/config/` directory was removed entirely.

 ### Adaptations

@@ -14,19 +14,20 @@ Ported from [JimLiu/baoyu-skills](https://github.com/JimLiu/baoyu-skills) v1.57.
 | Trigger | `/baoyu-article-illustrator` slash command + CLI flags | Natural language skill matching |
 | User config | EXTEND.md (project/user/XDG paths) + first-time-setup | Removed — not part of Hermes infra |
 | User prompts | `AskUserQuestion` (batched, multi-question) | `clarify` tool (one question at a time) |
-| Image generation | `baoyu-imagine` (Bun/TypeScript, multi-provider, accepts `--ref`) | `image_generate` tool (describes references in prompt text) |
+| Image generation | `baoyu-imagine` (Bun/TypeScript, multi-provider, accepts `--ref`, writes to local path) | `image_generate` (returns URL only; agent downloads via `terminal`/`curl`) |
+| Backend selection | User picks provider via CLI flags | Not agent-selectable — `image_generate` uses the user-configured FAL model. Removed hardcoded "nano banana pro" line from `prompts/system.md`. |
+| Reference images | Passed to backend via `--ref`, copied via shell | `vision_analyze` extracts a textual description (binary never touched by `write_file`/`read_file`); description is embedded in prompts. Optional `terminal cp` for a local record. |
 | Platform support | Linux/macOS/Windows/WSL/PowerShell | Linux/macOS only |
-| File operations | Bash commands | Hermes file tools (`write_file`, `read_file`) |
+| File operations | Bash commands | Hermes file tools: `write_file`/`read_file` for text, `terminal` for binaries and URL downloads, `vision_analyze` for reading images |
 | Watermark | Driven by EXTEND.md `watermark.enabled` | Optional — user asks for it per-article |
 | Output directory | EXTEND.md `default_output_dir` (imgs-subdir / same-dir / illustrations-subdir / independent) | Defaults based on input type; user overrides in request |

 ### What was preserved

 - Type × Style × Palette three-dimension framework
-- All style definitions (23 files)
-- All palette definitions (4 files)
-- Core reference files (workflow, prompt-construction, styles, style-presets)
-- `prompts/system.md` (generation prompt template)
+- All style definitions (23 files, verbatim)
+- All palette definitions (4 files, verbatim)
+- Core reference files (workflow, prompt-construction, styles, style-presets) — adapted for Hermes tooling
 - Core principles and workflow structure (analyze → confirm → outline → prompts → generate)
 - Prompt-file-as-reproducibility-record discipline
 - Author, version, homepage attribution
@@ -44,4 +45,4 @@ curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu
 diff <(curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-article-illustrator/references/styles/blueprint.md) references/styles/blueprint.md
 ```

-`references/styles/*`, `references/palettes/*`, and `prompts/system.md` can be overwritten directly. `SKILL.md`, `references/workflow.md`, `references/usage.md`, `references/style-presets.md`, `references/styles.md`, and `references/prompt-construction.md` must be manually merged since they contain Hermes-specific adaptations.
+`references/styles/*` and `references/palettes/*` can be overwritten directly. `SKILL.md`, `references/workflow.md`, `references/usage.md`, `references/style-presets.md`, `references/styles.md`, `references/prompt-construction.md`, and `prompts/system.md` must be manually merged since they contain Hermes-specific adaptations (tool wiring, backend neutrality, removed EXTEND.md references).
diff --git a/skills/creative/baoyu-article-illustrator/SKILL.md b/skills/creative/baoyu-article-illustrator/SKILL.md
index 23771708f5e..6af0c37ee8f 100644
--- a/skills/creative/baoyu-article-illustrator/SKILL.md
+++ b/skills/creative/baoyu-article-illustrator/SKILL.md
@@ -91,11 +91,11 @@ If the user asks for a different layout (e.g., images alongside the article, or

 ### Step 1: Detect Reference Images

-If the user supplies reference images (paths pasted inline, attachments, or a list of files):
+If the user supplies reference images (paths pasted inline, attachments, or a URL):

-1. Copy each reference to `{output-dir}/references/NN-ref-{slug}.{ext}` using `write_file`.
-2. Create a sidecar `NN-ref-{slug}.md` describing the reference.
-3. If the user described a reference but can't provide a file path, extract style/palette verbally and record under `references/extracted-style.md` — do NOT add a `references:` field to prompt frontmatter in that case.
+1. For each reference, call `vision_analyze` with the path/URL and a question asking for style, palette, composition, and subject. Record the returned description in `{output-dir}/references/NN-ref-{slug}.md` via `write_file`.
+2. **Do not** try to copy the binary via `write_file` / `read_file` — those are text-only. If you want a local copy for the record, use `terminal` (`cp "$src" "{output-dir}/references/NN-ref-{slug}.{ext}"`). The skill itself never needs to read the binary; it works off the vision description.
+3. Since `image_generate` doesn't take image inputs, the vision description is what gets embedded in prompts during Step 5.

 Full procedures: [references/workflow.md](references/workflow.md#step-1-detect-reference-images).

@@ -156,11 +156,14 @@ For each illustration:

 ### Step 6: Generate Images

-Use the `image_generate` tool with the assembled prompt from each prompt file.
+For each prompt file:

-- Map aspect ratio to `image_generate` format: `16:9` → `landscape`, `9:16` → `portrait`, `1:1` → `square`. For custom ratios, pick the closest named aspect.
-- Generate sequentially through the outline. On failure, auto-retry once.
-- Save each image to `{output-dir}/NN-{type}-{slug}.png`.
+1. Call `image_generate(prompt=..., aspect_ratio=...)`. `image_generate` returns a JSON result containing an image URL; it does NOT write to disk and does NOT accept an output path.
+2. Map the prompt's `ASPECT` to `image_generate`'s enum: `16:9` → `landscape`, `9:16` → `portrait`, `1:1` → `square`. Custom ratios → nearest named aspect.
+3. Download the returned URL to `{output-dir}/NN-{type}-{slug}.png` via `terminal` (e.g. `curl -sSL -o "{output-dir}/NN-{type}-{slug}.png" "{url}"`).
+4. On generation failure, auto-retry once.
+
+Note: the underlying image-generation backend is user-configured (default: FAL FLUX 2 Klein 9B) and is NOT agent-selectable via `image_generate`. Do not write model names into prompts expecting them to route.

 ### Step 7: Finalize

@@ -199,3 +202,5 @@ Images: X/N generated
 3. **Don't illustrate metaphors literally** — visualize the underlying concept.
 4. **Prompt files are mandatory** — no image generation without a saved prompt file. The file is what lets you regenerate or switch backends later.
 5. **`image_generate` aspect ratios** — the tool supports `landscape`, `portrait`, and `square`. Custom ratios map to the nearest option.
+6. **`image_generate` returns a URL, not a local file** — always download via `terminal` (`curl`) before inserting local image paths into the article.
+7. **No backend selection from the agent** — `image_generate` uses whatever model the user configured (default: FAL FLUX 2 Klein 9B). Don't write `"use to generate this"` into prompts expecting it to route.
diff --git a/skills/creative/baoyu-article-illustrator/prompts/system.md b/skills/creative/baoyu-article-illustrator/prompts/system.md
index 9eaf2a7f512..3320564c4d4 100644
--- a/skills/creative/baoyu-article-illustrator/prompts/system.md
+++ b/skills/creative/baoyu-article-illustrator/prompts/system.md
@@ -29,4 +29,4 @@ Create a cartoon-style infographic illustration following these guidelines:

 ---

-Please use nano banana pro to generate the illustration based on the content provided below:
+Generate the illustration based on the content provided below:
diff --git a/skills/creative/baoyu-article-illustrator/references/workflow.md b/skills/creative/baoyu-article-illustrator/references/workflow.md
index 3eb937593e6..b859b7f3a60 100644
--- a/skills/creative/baoyu-article-illustrator/references/workflow.md
+++ b/skills/creative/baoyu-article-illustrator/references/workflow.md
@@ -2,34 +2,39 @@

 ## Step 1: Detect Reference Images

-Check if the user provided reference images. Handle based on input type:
+If the user provides reference images (local path or URL), the goal is to produce **textual descriptions** that can be embedded in prompts — `image_generate` doesn't accept reference-image inputs, and Hermes' text file tools can't read or write binaries.
+
+**Tool rules**:
+
+| Task | Tool | Notes |
+|------|------|-------|
+| Analyze a reference image | `vision_analyze` | Accepts URL or local path. Ask for style, palette, composition, subject. |
+| Write the text description | `write_file` | Sidecar `.md` files only — never try to `write_file` a PNG/JPG. |
+| (Optional) Keep a local copy of the binary | `terminal` | `cp "$src" "{output-dir}/references/NN-ref-{slug}.{ext}"` — purely for the record; the skill itself doesn't read the binary. |

 | Input Type | Action |
 |------------|--------|
-| Image file path provided | Copy to `{output-dir}/references/` → reference it by description in prompts |
-| Image in conversation (no path) | Ask user (via `clarify`) for a file path or a description |
-| User can't provide path | Extract style/palette verbally → append to prompts (no `references:` frontmatter) |
+| Image file path provided | `vision_analyze` → write sidecar `.md`. Optional `terminal cp` for a local record. |
+| Image URL provided | `vision_analyze` with the URL → write sidecar `.md`. |
+| Image in conversation (no path, no URL) | Ask via `clarify` for a path or URL, or for a verbal description. |
+| User can't provide either | Extract style/palette verbally from the user → write `references/extracted-style.md`. Do NOT add `references:` to prompt frontmatter. |

-**CRITICAL**: Only add a `references:` field to prompt frontmatter if files are ACTUALLY SAVED to the `references/` subdirectory.
+**Procedure** (when a path/URL is available):

-**If user provides a file path**:
-1. Copy to `{output-dir}/references/NN-ref-{slug}.png` using `write_file`
-2. Create description: `{output-dir}/references/NN-ref-{slug}.md`
-3. Verify files exist (via `read_file`) before proceeding
+1. Call `vision_analyze(image_url=..., question="Describe the style, color palette (with hex approximations), composition, and subject so this can be used as a style/palette reference for another illustration.")`.
+2. Write `{output-dir}/references/NN-ref-{slug}.md` via `write_file` with the description.
+3. (Optional) Run `terminal` with `cp` (or `curl -sSL -o ...` for URLs) to keep a local binary copy. Not required by the skill.
+4. Mark the reference in the outline with usage `direct` / `style` / `palette`. In Step 5.1 the description gets appended to the prompt body.

-**If user can't provide a path** (extracted verbally):
-1. Analyze the image visually, extract: colors, style, composition
-2. Create `{output-dir}/references/extracted-style.md` with extracted info
-3. Do NOT add `references:` to prompt frontmatter
-4. Instead, append extracted style/colors directly to prompt text
-
-**Description File Format** (only when file saved):
+**Sidecar File Format**:
 ```yaml
 ---
 ref_id: NN
-filename: NN-ref-{slug}.png
+source: ""
+local_copy: "NN-ref-{slug}.png" # omit if no copy made
+usage_hint: style # direct | style | palette
 ---
-[User's description or auto-generated description]
+[vision_analyze description — colors, style, composition, subject]
 ```

 ---
@@ -80,9 +85,9 @@ Save analysis to `{output-dir}/analysis.md` using `write_file`.
 - Decorative scenes
 - Generic illustrations

-### 2.5 Analyze Reference Images (if saved in Step 1)
+### 2.5 Plan Reference Image Usage (if analyzed in Step 1)

-For each reference image:
+For each reference image (use the `vision_analyze` description from Step 1):

 | Analysis | Description |
 |----------|-------------|
@@ -92,13 +97,13 @@ For each reference image:
 | Style match | Which illustration types/styles align |
 | Usage recommendation | `direct` / `style` / `palette` |

-| Usage | When to Use |
-|-------|-------------|
-| `direct` | Reference matches desired output closely |
-| `style` | Extract visual style characteristics only |
-| `palette` | Extract color scheme only |
+| Usage | When to Use | How it's applied in Step 5.1 |
+|-------|-------------|------------------------------|
+| `direct` | Reference matches desired output closely | Paste the description (composition + subject + style + palette) into the prompt body |
+| `style` | Extract visual style characteristics only | Append style traits to prompt body |
+| `palette` | Extract color scheme only | Append extracted hex colors to prompt body |

-Note: `image_generate` does not accept reference-image inputs. For `direct` usage, describe the reference in the prompt text (composition, subject, palette) rather than passing the file itself.
+Note: `image_generate` does not accept reference-image inputs under any usage type. Everything is mediated through the `vision_analyze` description.

 ---

@@ -255,32 +260,40 @@ For each illustration in the outline:
 8. **Backup rule**: If a prompt file exists, rename to `prompts/NN-{type}-{slug}-backup-YYYYMMDD-HHMMSS.md`

 **CRITICAL - References in Frontmatter**:
-- Only add `references` field if files ACTUALLY EXIST in `{output-dir}/references/` directory
-- If style/palette was extracted verbally (no file), append info to prompt BODY instead
-- Before writing frontmatter, verify the reference file exists
+- Only add `references` field if a sidecar `.md` description exists in `{output-dir}/references/`
+- If style/palette was extracted verbally (no description file), append info to prompt BODY only
+- Before writing frontmatter, confirm the sidecar exists (try `read_file` on the `.md`)

-### 5.1 Process References (if references saved in Step 1)
+### 5.1 Process References (if analyzed in Step 1)

-Since `image_generate` doesn't accept reference-image inputs, convert every reference to a textual description and append it to the prompt body:
+Read the `vision_analyze` description from the sidecar `references/NN-ref-{slug}.md` (via `read_file`) and embed it in the prompt body. `image_generate` never receives the binary.

 | Usage | Action |
 |-------|--------|
-| `direct` | Describe the reference (composition, subject, style, palette) in the prompt body |
-| `style` | Append style traits to prompt: "Style: clean lines, gradient backgrounds..." |
-| `palette` | Append extracted colors to prompt: "Colors: #E8756D coral, #7ECFC0 mint..." |
+| `direct` | Paste the full reference description (composition, subject, style, palette) into the prompt body |
+| `style` | Append only the style traits: "Style: clean lines, gradient backgrounds..." |
+| `palette` | Append only the hex colors: "Colors: #E8756D coral, #7ECFC0 mint..." |

 ---

 ## Step 6: Generate Images

+`image_generate` returns a JSON blob with a URL (`{"success": true, "image": ""}`). It does NOT save a local file, does NOT accept an output path, and does NOT let the agent pick a backend/model. Treat the URL as a temporary artifact and download it explicitly.
+
 For each prompt file:

 1. Read the prompt file (via `read_file`) and extract the assembled prompt
-2. Map the prompt's `ASPECT` to `image_generate`'s format: `16:9` → `landscape`, `9:16` → `portrait`, `1:1` → `square`. Custom ratios → nearest named aspect.
-3. Call `image_generate` with the prompt text
-4. **Backup rule**: If an existing image file is present, rename to `NN-{type}-{slug}-backup-YYYYMMDD-HHMMSS.png` before writing
-5. Save the resulting image to `{output-dir}/NN-{type}-{slug}.png`
-6. On failure, retry once, then log and continue. After each generation, report "Generated X/N".
+2. Map the prompt's `ASPECT` to `image_generate`'s enum: `16:9` → `landscape`, `9:16` → `portrait`, `1:1` → `square`. Custom ratios → nearest named aspect.
+3. Call `image_generate(prompt=, aspect_ratio=)` and extract the `image` URL from the returned JSON.
+4. **Backup rule**: If `{output-dir}/NN-{type}-{slug}.png` already exists, rename it via `terminal` (`mv "{output-dir}/NN-{type}-{slug}.png" "{output-dir}/NN-{type}-{slug}-backup-YYYYMMDD-HHMMSS.png"`) before writing.
+5. Download the URL via `terminal`:
+   ```bash
+   curl -sSL -o "{output-dir}/NN-{type}-{slug}.png" "{image_url}"
+   ```
+   If `curl` is unavailable, fall back to `wget -qO "{output-dir}/NN-{type}-{slug}.png" "{image_url}"`.
+6. Verify the file exists and has non-zero size (`terminal`: `test -s "{path}" && echo ok`).
+7. On generation failure, retry `image_generate` once. On download failure, retry `curl` once with a longer timeout. Then log and continue.
+8. After each generation, report "Generated X/N".

 ---
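
Review note on the new Step 6: "Custom ratios → nearest named aspect" and "extract the `image` URL from the returned JSON" are stated but not pinned down, so here is a minimal Python sketch of one possible reading. It assumes "nearest" means closest in log space to 16:9, 9:16, or 1:1, and that the result has the `{"success": ..., "image": ...}` shape quoted in the diff. The helper names `nearest_named_aspect` and `extract_image_url` are hypothetical and are not part of Hermes or `image_generate`.

```python
import json
import math

# Named aspects from the Step 6 mapping:
# 16:9 -> landscape, 9:16 -> portrait, 1:1 -> square.
NAMED_ASPECTS = {
    "landscape": 16 / 9,
    "portrait": 9 / 16,
    "square": 1.0,
}


def nearest_named_aspect(aspect: str) -> str:
    """Map a 'W:H' string to the closest named aspect (hypothetical helper)."""
    width_text, height_text = aspect.split(":")
    width, height = float(width_text), float(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid aspect ratio: {aspect!r}")
    target = math.log(width / height)
    # Compare in log space so that 2:1 and 1:2 are equally far from square.
    return min(
        NAMED_ASPECTS,
        key=lambda name: abs(math.log(NAMED_ASPECTS[name]) - target),
    )


def extract_image_url(result_json: str) -> str:
    """Pull the URL out of an image_generate-style result (assumed shape)."""
    result = json.loads(result_json)
    if not result.get("success") or not result.get("image"):
        raise RuntimeError("image_generate reported failure or returned no URL")
    return result["image"]
```

With this definition, `3:2` maps to `landscape` and `4:5` maps to `square`. A ratio such as `4:3` sits essentially midway between `square` and `landscape` in log space, so the docs may want to state an explicit tie-break rather than leave it to the agent.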