# Detailed Workflow Procedures

## Step 1: Detect Reference Images

If the user provides reference images (local path or URL), the goal is to produce **textual descriptions** that can be embedded in prompts. `image_generate` doesn't accept reference-image inputs, and Hermes' text file tools can't read or write binaries.

**Tool rules**:

| Task | Tool | Notes |
|------|------|-------|
| Analyze a reference image | `vision_analyze` | Accepts URL or local path. Ask for style, palette, composition, subject. |
| Write the text description | `write_file` | Sidecar `.md` files only. Never try to `write_file` a PNG/JPG. |
| (Optional) Keep a local copy of the binary | `terminal` | `cp "$src" "{output-dir}/references/NN-ref-{slug}.{ext}"`, purely for the record; the skill itself doesn't read the binary. |

| Input Type | Action |
|------------|--------|
| Image file path provided | `vision_analyze` → write sidecar `.md`. Optional `terminal cp` for a local record. |
| Image URL provided | `vision_analyze` with the URL → write sidecar `.md`. |
| Image in conversation (no path, no URL) | Ask via `clarify` for a path or URL, or for a verbal description. |
| User can't provide either | Extract style/palette verbally from the user → write `references/extracted-style.md`. Do NOT add `references:` to prompt frontmatter. |

**Procedure** (when a path/URL is available):

1. Call `vision_analyze(image_url=..., question="Describe the style, color palette (with hex approximations), composition, and subject so this can be used as a style/palette reference for another illustration.")`.
2. Write `{output-dir}/references/NN-ref-{slug}.md` via `write_file` with the description.
3. (Optional) Run `terminal` with `cp` (or `curl -sSL -o ...` for URLs) to keep a local binary copy. Not required by the skill.
4. Mark the reference in the outline with usage `direct` / `style` / `palette`. In Step 5.1 the description gets appended to the prompt body.

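
Steps 2 and 4 of the procedure can be sketched as a small helper that builds the sidecar path and renders its contents before they are handed to `write_file`. This is a minimal sketch, not part of the skill: `slugify`, `sidecar_path`, and `render_sidecar` are hypothetical names, and the two-digit zero padding for `NN` and the slug rule are assumptions. The field names follow the sidecar file format defined in this step.

```python
import json
import re

# Usage values named in the procedure above.
VALID_USAGE = {"direct", "style", "palette"}


def slugify(text: str) -> str:
    """Hypothetical slug rule: lowercase, non-alphanumerics collapsed to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug or "ref"


def sidecar_path(output_dir: str, index: int, slug: str) -> str:
    """Build {output-dir}/references/NN-ref-{slug}.md (NN assumed zero-padded)."""
    return f"{output_dir}/references/{index:02d}-ref-{slug}.md"


def render_sidecar(index, source, description, usage_hint="style", local_copy=None):
    """Render the sidecar text: YAML frontmatter followed by the description."""
    if usage_hint not in VALID_USAGE:
        raise ValueError(f"usage_hint must be one of {sorted(VALID_USAGE)}")
    lines = [
        "---",
        f"ref_id: {index:02d}",
        f"source: {json.dumps(source)}",  # JSON quoting is also valid YAML here
    ]
    if local_copy:  # omitted when no binary copy was made
        lines.append(f"local_copy: {json.dumps(local_copy)}")
    lines.append(f"usage_hint: {usage_hint}")
    lines.append("---")
    lines.append(description.strip())
    return "\n".join(lines) + "\n"
```

The returned string is what an agent would pass to `write_file`; the description argument is the text returned by `vision_analyze`.
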
**Sidecar File Format**:

```yaml
---
ref_id: NN
source: ""
local_copy: "NN-ref-{slug}.png"  # omit if no copy made
usage_hint: style  # direct | style | palette
---
[vision_analyze description: colors, style, composition, subject]
```

---

## Step 2: Analyze

### 2.1 Determine Output Directory

| Input | Output Directory | Source-save path |
|-------|------------------|------------------|
| Article file path | `{article-dir}/imgs/` (default) | None (read article via `read_file`) |
| Pasted content | `illustrations/{topic-slug}/` (cwd) | `source-{slug}.{ext}` (save via `write_file`) |

If the user explicitly asked for a different layout (e.g., images in the article's folder, or an `illustrations/` subdirectory), honor that.

### 2.2 Analyze Content

| Analysis | Description |
|----------|-------------|
| Content type | Technical / Tutorial / Methodology / Narrative |
| Illustration purpose | information / visualization / imagination |
| Core arguments | 2-5 main points to visualize |
| Visual opportunities | Positions where illustrations add value |
| Recommended type | Based on content signals and purpose |
| Recommended density | Based on length and complexity |

Save analysis to `{output-dir}/analysis.md` using `write_file`.

### 2.3 Extract Core Arguments

- Main thesis
- Key concepts the reader needs
- Comparisons/contrasts
- Framework/model proposed

**CRITICAL**: If the article uses metaphors (e.g., "cutting a watermelon with a chainsaw"), do NOT illustrate them literally. Visualize the **underlying concept**.

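
The output-directory rule from 2.1 can be sketched as a small resolver. This is a minimal sketch under stated assumptions: `resolve_output_dir` and its parameters are hypothetical names, paths are treated as POSIX-style, and an explicit user request is modeled as an `override` argument that wins over both defaults.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_output_dir(
    article_path: Optional[str],
    topic_slug: str,
    override: Optional[str] = None,
) -> str:
    """Pick the output directory following the table in 2.1."""
    if override:
        # The user explicitly asked for a different layout: honor it.
        return override
    if article_path:
        # Article file path given: default to {article-dir}/imgs
        return str(PurePosixPath(article_path).parent / "imgs")
    # Pasted content: illustrations/{topic-slug} relative to the cwd
    return f"illustrations/{topic_slug}"
```

For example, `resolve_output_dir("posts/2024/intro.md", "intro")` resolves next to the article, while `resolve_output_dir(None, "ai-agents")` falls back to the `illustrations/` layout.
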
### 2.4 Identify Positions

**Illustrate**:

- Core arguments (REQUIRED)
- Abstract concepts
- Data comparisons
- Processes, workflows

**Do NOT Illustrate**:

- Metaphors literally
- Decorative scenes
- Generic illustrations

### 2.5 Plan Reference Image Usage (if analyzed in Step 1)

For each reference image (use the `vision_analyze` description from Step 1):

| Analysis | Description |
|----------|-------------|
| Visual characteristics | Style, colors, composition |
| Content/subject | What the reference depicts |
| Suitable positions | Which sections match this reference |
| Style match | Which illustration types/styles align |
| Usage recommendation | `direct` / `style` / `palette` |

| Usage | When to Use | How it's applied in Step 5.1 |
|-------|-------------|------------------------------|
| `direct` | Reference matches desired output closely | Paste the description (composition + subject + style + palette) into the prompt body |
| `style` | Extract visual style characteristics only | Append style traits to prompt body |
| `palette` | Extract color scheme only | Append extracted hex colors to prompt body |

Note: `image_generate` does not accept reference-image inputs under any usage type. Everything is mediated through the `vision_analyze` description.

---

## Step 3: Confirm Settings

Use the `clarify` tool. Since `clarify` handles one question at a time, ask the most important question first. Skip any question the user already answered in their request.

### Q1: Preset or Type (highest priority)

Based on the Step 2 content analysis, recommend a preset first (a preset sets both type and style). Look up the "Content Type → Preset Recommendations" table in [style-presets.md](style-presets.md).

- [Recommended preset]: [brief: type + style + why]
- [Alternative preset]: [brief]
- Or choose type manually: infographic / scene / flowchart / comparison / framework / timeline / mixed

**If user picks a preset → skip Q3** (type & style both resolved).

**If user picks a type → Q3 is required.**

### Q2: Density

- minimal (1-2): Core concepts only
- balanced (3-5): Major sections
- per-section: At least 1 per section/chapter (Recommended)
- rich (6+): Comprehensive coverage

### Q3: Style (skip if preset chosen in Q1)

Present Core Styles first:

- [Best compatible core style] (Recommended)
- [Other compatible core style 1]
- [Other compatible core style 2]
- Other (see full Style Gallery)

**Core Styles** (simplified selection):

| Core Style | Maps To | Best For |
|------------|---------|----------|
| `minimal-flat` | notion | General, knowledge sharing, SaaS |
| `sci-fi` | blueprint | AI, frontier tech, system design |
| `hand-drawn` | sketch/warm | Relaxed, reflective, casual |
| `editorial` | editorial | Processes, data, journalism |
| `scene` | warm/watercolor | Narratives, emotional, lifestyle |
| `poster` | screen-print | Opinion, editorial, cultural, cinematic |

Style selection is based on the Type × Style compatibility matrix ([styles.md](styles.md)).

**In Step 5**, read `styles/