* fix(skills/baoyu-comic): require absolute paths for curl -o downloads

  When downloading generated images across several batches of image_generate calls, relying on persistent-shell CWD is unsafe. The terminal tool's shell can rotate (TERMINAL_LIFETIME_SECONDS expiry, a failed cd that leaves the shell somewhere else), and 'curl -fsSL <url> -o relative.png' then silently writes to the wrong directory with no error.

  Update the skill's Step 7 Download step to require absolute -o paths (or workdir= on the terminal tool) and add a matching pitfall entry referencing the Apr 2026 incident where pages 06-09 of a 10-page comic landed at the repo root instead of comic/<slug>/. The agent then spent several turns claiming the files existed where they didn't.

* fix(skills/baoyu-comic): handle clarify timeouts correctly in Step 2

  A clarify timeout returning 'Use your best judgement to make the choice and proceed' is NOT user consent to default the entire Step 2 questionnaire. It is a per-question default only.

  Add guidance at both instruction sites (SKILL.md User Questions section, references/workflow.md Step 2 header) telling the agent to:

  1. Continue asking the remaining questions in the sequence after a timeout; each question is an independent consent point.
  2. Surface every defaulted choice in the next user-visible message so the user can correct it when they return. An unreported default is indistinguishable from never having asked.

  Reported live Apr 2026: agent asked style question via clarify, got a timeout response, and silently defaulted style + narrative focus + audience + review flags in one pass. User only learned style had defaulted to 'ohmsha' after the comic was fully generated.
| name | description | version | author | license | metadata |
|---|---|---|---|---|---|
| baoyu-comic | Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial comic", or "Logicomix-style comic". | 1.56.1 | 宝玉 (JimLiu) | MIT | |
Knowledge Comic Creator
Adapted from baoyu-comic for Hermes Agent's tool ecosystem.
Create original knowledge comics with flexible art style × tone combinations.
When to Use
Trigger this skill when the user asks to create a knowledge/educational comic, biography comic, tutorial comic, or uses terms like "知识漫画", "教育漫画", or "Logicomix-style". The user provides content (text, file path, URL, or topic) and optionally specifies art style, tone, layout, aspect ratio, or language.
Reference Images
Hermes' image_generate tool is prompt-only — it accepts a text prompt and an aspect ratio, and returns an image URL. It does NOT accept reference images. When the user supplies a reference image, use it to extract traits in text that get embedded in every page prompt:
Intake: Accept file paths when the user provides them (or pastes images in conversation).
- File path(s) → copy to refs/NN-ref-{slug}.{ext} alongside the comic output for provenance
- Pasted image with no path → ask the user for the path via clarify, or extract style traits verbally as a text fallback
- No reference → skip this section
Usage modes (per reference):
| Usage | Effect |
|---|---|
| style | Extract style traits (line treatment, texture, mood) and append to every page's prompt body |
| palette | Extract hex colors and append to every page's prompt body |
| scene | Extract scene composition or subject notes and append to the relevant page(s) |
Record in each page's prompt frontmatter when refs exist:
references:
  - ref_id: 01
    filename: 01-ref-scene.png
    usage: style
    traits: "muted earth tones, soft-edged ink wash, low-contrast backgrounds"
Character consistency is driven by text descriptions in characters/characters.md (written in Step 3) that get embedded inline in every page prompt (Step 5). The optional PNG character sheet generated in Step 7.1 is a human-facing review artifact, not an input to image_generate.
Options
Visual Dimensions
| Option | Values | Description |
|---|---|---|
| Art | ligne-claire (default), manga, realistic, ink-brush, chalk, minimalist | Art style / rendering technique |
| Tone | neutral (default), warm, dramatic, romantic, energetic, vintage, action | Mood / atmosphere |
| Layout | standard (default), cinematic, dense, splash, mixed, webtoon, four-panel | Panel arrangement |
| Aspect | 3:4 (default, portrait), 4:3 (landscape), 16:9 (widescreen) | Page aspect ratio |
| Language | auto (default), zh, en, ja, etc. | Output language |
| Refs | File paths | Reference images used for style / palette trait extraction (not passed to the image model). See Reference Images above. |
Partial Workflow Options
| Option | Description |
|---|---|
| Storyboard only | Generate storyboard only, skip prompts and images |
| Prompts only | Generate storyboard + prompts, skip images |
| Images only | Generate images from existing prompts directory |
| Regenerate N | Regenerate specific page(s) only (e.g., 3 or 2,5,8) |
Details: references/partial-workflows.md
Art, Tone & Preset Catalogue
- Art styles (6): ligne-claire, manga, realistic, ink-brush, chalk, minimalist. Full definitions at references/art-styles/<style>.md.
- Tones (7): neutral, warm, dramatic, romantic, energetic, vintage, action. Full definitions at references/tones/<tone>.md.
- Presets (5) with special rules beyond plain art+tone:

  | Preset | Equivalent | Hook |
  |---|---|---|
  | ohmsha | manga + neutral | Visual metaphors, no talking heads, gadget reveals |
  | wuxia | ink-brush + action | Qi effects, combat visuals, atmospheric |
  | shoujo | manga + romantic | Decorative elements, eye details, romantic beats |
  | concept-story | manga + warm | Visual symbol system, growth arc, dialogue+action balance |
  | four-panel | minimalist + neutral + four-panel layout | 起承转合 (setup, development, turn, resolution) structure, B&W + spot color, stick-figure characters |

  Full rules at references/presets/<preset>.md. Load the file when a preset is picked.
- Compatibility matrix and content-signal → preset table live in references/auto-selection.md. Read it before recommending combinations in Step 2.
File Structure
Output directory: comic/{topic-slug}/
- Slug: 2-4 words kebab-case from topic (e.g., alan-turing-bio)
- Conflict: append timestamp (e.g., turing-story-20260118-143052)
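The slug and conflict rules above can be sketched in Python. This is a minimal illustration, not the skill's own implementation: the helper names `make_slug` and `resolve_output_dir`, the choice to keep the first four ASCII words of the topic, and resolving a conflict automatically (rather than asking the user in Step 1.2) are all assumptions.

```python
import re
from datetime import datetime
from pathlib import Path


def make_slug(topic: str, max_words: int = 4) -> str:
    """Hypothetical helper: kebab-case slug from the first few ASCII words."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return "-".join(words[:max_words])


def resolve_output_dir(root: Path, slug: str, now: datetime) -> Path:
    """Return comic/{slug}/, or comic/{slug}-YYYYMMDD-HHMMSS/ on conflict."""
    candidate = root / "comic" / slug
    if candidate.exists():
        # Assumed conflict policy: append a timestamp instead of overwriting.
        candidate = root / "comic" / f"{slug}-{now:%Y%m%d-%H%M%S}"
    return candidate
```

Passing `now` explicitly keeps the function deterministic, so the same call can be replayed when debugging a misplaced output directory.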
Contents:
| File | Description |
|---|---|
| source-{slug}.md | Saved source content (kebab-case slug matches the output directory) |
| analysis.md | Content analysis |
| storyboard.md | Storyboard with panel breakdown |
| characters/characters.md | Character definitions |
| characters/characters.png | Character reference sheet (downloaded from image_generate) |
| prompts/NN-{cover\|page}-[slug].md | Generation prompts |
| NN-{cover\|page}-[slug].png | Generated images (downloaded from image_generate) |
| refs/NN-ref-{slug}.{ext} | User-supplied reference images (optional, for provenance) |
Language Handling
Detection Priority:
- User-specified language (explicit option)
- User's conversation language
- Source content language
Rule: Use user's input language for ALL interactions:
- Storyboard outlines and scene descriptions
- Image generation prompts
- User selection options and confirmations
- Progress updates, questions, errors, summaries
Technical terms remain in English.
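The detection priority above amounts to a first-match fallback chain. A minimal Python sketch, assuming a hypothetical `pick_language` helper and assuming that the option value `auto` means "not specified":

```python
from typing import Optional


def pick_language(
    explicit: Optional[str],
    conversation: Optional[str],
    source: Optional[str],
) -> Optional[str]:
    """Return the first usable language in priority order, or None."""
    for candidate in (explicit, conversation, source):
        # "auto" is treated as "no preference" (an assumption of this sketch).
        if candidate and candidate != "auto":
            return candidate
    return None
```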
Workflow
Progress Checklist
Comic Progress:
- [ ] Step 1: Setup & Analyze
  - [ ] 1.1 Analyze content
  - [ ] 1.2 Check existing directory
- [ ] Step 2: Confirmation - Style & options ⚠️ REQUIRED
- [ ] Step 3: Generate storyboard + characters
- [ ] Step 4: Review outline (conditional)
- [ ] Step 5: Generate prompts
- [ ] Step 6: Review prompts (conditional)
- [ ] Step 7: Generate images
  - [ ] 7.1 Generate character sheet (if needed) → characters/characters.png
  - [ ] 7.2 Generate pages (with character descriptions embedded in prompt)
- [ ] Step 8: Completion report
Flow
Input → Analyze → [Check Existing?] → [Confirm: Style + Reviews] → Storyboard → [Review?] → Prompts → [Review?] → Images → Complete
Step Summary
| Step | Action | Key Output |
|---|---|---|
| 1.1 | Analyze content | analysis.md, source-{slug}.md |
| 1.2 | Check existing directory | Handle conflicts |
| 2 | Confirm style, focus, audience, reviews | User preferences |
| 3 | Generate storyboard + characters | storyboard.md, characters/ |
| 4 | Review outline (if requested) | User approval |
| 5 | Generate prompts | prompts/*.md |
| 6 | Review prompts (if requested) | User approval |
| 7.1 | Generate character sheet (if needed) | characters/characters.png |
| 7.2 | Generate pages | *.png files |
| 8 | Completion report | Summary |
User Questions
Use the clarify tool to confirm options. Since clarify handles one question at a time, ask the most important question first and proceed sequentially. See references/workflow.md for the full Step 2 question set.
Timeout handling (CRITICAL): clarify can return "The user did not provide a response within the time limit. Use your best judgement to make the choice and proceed." — this is NOT user consent to default everything.
- Treat it as a default for that one question only. Continue asking the remaining Step 2 questions in sequence; each question is an independent consent point.
- Surface the default to the user visibly in your next message so they have a chance to correct it, e.g. "Style: defaulted to ohmsha preset (clarify timed out). Say the word to switch." An unreported default is indistinguishable from never having asked.
- Do NOT collapse Step 2 into a single "use all defaults" pass after one timeout. If the user is genuinely absent, they will be equally absent for all five questions, but they can correct visible defaults when they return and cannot correct invisible ones.
Step 7: Image Generation
Use Hermes' built-in image_generate tool for all image rendering. Its schema accepts only prompt and aspect_ratio (landscape | portrait | square); it returns a URL, not a local file. Every generated page or character sheet must therefore be downloaded to the output directory.
Prompt file requirement (hard): write each image's full, final prompt to a standalone file under prompts/ (naming: NN-{type}-[slug].md) BEFORE calling image_generate. The prompt file is the reproducibility record.
Aspect ratio mapping — the storyboard's aspect_ratio field maps to image_generate's format as follows:
| Storyboard ratio | image_generate format |
|---|---|
| 3:4, 9:16, 2:3 | portrait |
| 4:3, 16:9, 3:2 | landscape |
| 1:1 | square |
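The mapping table translates directly into a lookup. A small Python sketch, where the names `ASPECT_TO_FORMAT` and `to_image_generate_format` are hypothetical and rejecting unlisted ratios (instead of guessing) is an assumption:

```python
# Storyboard aspect ratio -> image_generate aspect_ratio value, per the table above.
ASPECT_TO_FORMAT = {
    "3:4": "portrait",
    "9:16": "portrait",
    "2:3": "portrait",
    "4:3": "landscape",
    "16:9": "landscape",
    "3:2": "landscape",
    "1:1": "square",
}


def to_image_generate_format(ratio: str) -> str:
    """Map a storyboard ratio to portrait/landscape/square; fail on unknown input."""
    try:
        return ASPECT_TO_FORMAT[ratio.strip()]
    except KeyError:
        raise ValueError(f"unsupported storyboard aspect ratio: {ratio!r}") from None
```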
Download step — after every image_generate call:
- Read the URL from the tool result
- Fetch the image bytes using an absolute output path, e.g. curl -fsSL "<url>" -o /abs/path/to/comic/<slug>/NN-page-<slug>.png
- Verify the file exists and is non-empty at that exact path before proceeding to the next page
Never rely on shell CWD persistence for -o paths. The terminal tool's persistent-shell CWD can change between batches (session expiry, TERMINAL_LIFETIME_SECONDS, a failed cd that leaves you in the wrong directory). curl -o relative/path.png is a silent footgun: if CWD has drifted, the file lands somewhere else with no error. Always pass a fully-qualified absolute path to -o, or pass workdir=<abs path> to the terminal tool. Incident Apr 2026: pages 06-09 of a 10-page comic landed at the repo root instead of comic/<slug>/ because batch 3 inherited a stale CWD from batch 2 and curl -o 06-page-skills.png wrote to the wrong directory. The agent then spent several turns claiming the files existed where they didn't.
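One way to make the absolute-path rule hard to violate is to refuse relative destinations before curl ever runs. The following Python sketch assumes curl is on PATH; the helper names `build_curl_command`, `verify_download`, and `download` are hypothetical and not part of the skill or the terminal tool.

```python
import subprocess
from pathlib import Path
from typing import List


def build_curl_command(url: str, dest: Path) -> List[str]:
    """Build the curl argv, rejecting any destination that depends on CWD."""
    if not dest.is_absolute():
        raise ValueError(f"refusing relative output path: {dest}")
    return ["curl", "-fsSL", url, "-o", str(dest)]


def verify_download(dest: Path) -> bool:
    """True only if the file exists at that exact path and is non-empty."""
    return dest.is_file() and dest.stat().st_size > 0


def download(url: str, dest: Path) -> Path:
    """Fetch url to dest and fail loudly if the file did not land there."""
    subprocess.run(build_curl_command(url, dest), check=True)
    if not verify_download(dest):
        raise RuntimeError(f"download missing or empty: {dest}")
    return dest
```

Checking the exact destination after each call turns a silent misplacement into an immediate error instead of something discovered several pages later.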
7.1 Character sheet — generate it (to characters/characters.png, aspect landscape) when the comic is multi-page with recurring characters. Skip for simple presets (e.g., four-panel minimalist) or single-page comics. The prompt file at characters/characters.md must exist before invoking image_generate. The rendered PNG is a human-facing review artifact (so the user can visually verify character design) and a reference for later regenerations or manual prompt edits — it does not drive Step 7.2. Page prompts are already written in Step 5 from the text descriptions in characters/characters.md; image_generate cannot accept images as visual input.
7.2 Pages — each page's prompt MUST already be at prompts/NN-{cover|page}-[slug].md before invoking image_generate. Because image_generate is prompt-only, character consistency is enforced by embedding character descriptions (sourced from characters/characters.md) inline in every page prompt during Step 5. The embedding is done uniformly whether or not a PNG sheet is produced in 7.1; the PNG is only a review/regeneration aid.
Backup rule: existing prompts/…md and …png files → rename with -backup-YYYYMMDD-HHMMSS suffix before regenerating.
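The backup rule can be sketched as a rename that runs before any regeneration. This Python example is illustrative: the helper names are hypothetical, and placing the suffix before the file extension (for example `03-page-intro-backup-20260118-143052.png`) is an assumption, since the rule does not say where the suffix goes.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_path(path: Path, now: datetime) -> Path:
    """Name with -backup-YYYYMMDD-HHMMSS inserted before the extension (assumed)."""
    return path.with_name(f"{path.stem}-backup-{now:%Y%m%d-%H%M%S}{path.suffix}")


def backup_if_exists(path: Path, now: datetime) -> Optional[Path]:
    """Rename an existing prompt or image file out of the way; no-op if absent."""
    if not path.exists():
        return None
    target = backup_path(path, now)
    path.rename(target)
    return target
```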
Full step-by-step workflow (analysis, storyboard, review gates, regeneration variants): references/workflow.md.
References
Core Templates:
- analysis-framework.md - Deep content analysis
- character-template.md - Character definition format
- storyboard-template.md - Storyboard structure
- ohmsha-guide.md - Ohmsha manga specifics
Style Definitions:
- references/art-styles/ - Art styles (ligne-claire, manga, realistic, ink-brush, chalk, minimalist)
- references/tones/ - Tones (neutral, warm, dramatic, romantic, energetic, vintage, action)
- references/presets/ - Presets with special rules (ohmsha, wuxia, shoujo, concept-story, four-panel)
- references/layouts/ - Layouts (standard, cinematic, dense, splash, mixed, webtoon, four-panel)
Workflow:
- workflow.md - Full workflow details
- auto-selection.md - Content signal analysis
- partial-workflows.md - Partial workflow options
Page Modification
| Action | Steps |
|---|---|
| Edit | Update prompt file FIRST → regenerate image → download new PNG |
| Add | Create prompt at position → generate with character descriptions embedded → renumber subsequent → update storyboard |
| Delete | Remove files → renumber subsequent → update storyboard |
IMPORTANT: When updating pages, ALWAYS update the prompt file (prompts/NN-{cover|page}-[slug].md) FIRST before regenerating. This ensures changes are documented and reproducible.
Pitfalls
- Image generation: 10-30 seconds per page; auto-retry once on failure
- Always download the URL returned by image_generate to a local PNG: downstream tooling (and the user's review) expects files in the output directory, not ephemeral URLs
- Use absolute paths for curl -o; never rely on persistent-shell CWD across batches. Silent footgun: files land in the wrong directory and a subsequent ls on the intended path shows nothing. See Step 7 "Download step".
- Use stylized alternatives for sensitive public figures
- Step 2 confirmation required - do not skip
- Steps 4/6 conditional - only if user requested in Step 2
- Step 7.1 character sheet - recommended for multi-page comics, optional for simple presets. The PNG is a review/regeneration aid; page prompts (written in Step 5) use the text descriptions in characters/characters.md, not the PNG. image_generate does not accept images as visual input
- Strip secrets: scan source content for API keys, tokens, or credentials before writing any output file