mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Jim Liu 宝玉 83a7a005aa fix(skills): clarify baoyu-comic character sheet role

Page prompts are written in Step 5 from the text descriptions in
characters/characters.md — the PNG sheet generated in Step 7.1
cannot be used to write them. Reposition the PNG as a human-facing
review artifact (and reference for later regenerations / manual
edits), and drop the confusing "Character sheet | Strategy" tables
since the embedding rule is uniform.

2026-04-21 17:50:04 -07:00

5.6 KiB


Port Notes — baoyu-comic

Ported from JimLiu/baoyu-skills v1.56.1.

Changes from upstream

SKILL.md adaptations

Change	Upstream	Hermes
Metadata namespace	`openclaw`	`hermes` (with `tags` + `homepage`)
Trigger	Slash commands / CLI flags	Natural language skill matching
User config	EXTEND.md file (project/user/XDG paths)	Removed — not part of Hermes infra
User prompts	`AskUserQuestion` (batched)	`clarify` tool (one question at a time)
Image generation	baoyu-imagine (Bun/TypeScript, supports `--ref`)	`image_generate` — prompt-only, returns a URL; no reference image input; agent must download the URL to the output directory
PDF assembly	`scripts/merge-to-pdf.ts` (Bun + `pdf-lib`)	Removed — the PDF merge step is out of scope for this port; pages are delivered as PNGs only
Platform support	Linux/macOS/Windows/WSL/PowerShell	Linux/macOS only
File operations	Generic instructions	Hermes file tools (`write_file`, `read_file`)

Structural removals

references/config/ directory (removed entirely):
- first-time-setup.md — blocking first-time setup flow for EXTEND.md
- preferences-schema.md — EXTEND.md YAML schema
- watermark-guide.md — watermark config (tied to EXTEND.md)
scripts/ directory (removed entirely): upstream's merge-to-pdf.ts depended on pdf-lib, which is not declared anywhere in the Hermes repo. Rather than add a new dependency, the port drops PDF assembly and delivers per-page PNGs.
Workflow Step 8 (Merge to PDF) removed from workflow.md; Step 9 (Completion report) renumbered to Step 8.
Workflow Step 1.1 — "Load Preferences (EXTEND.md)" section removed from workflow.md; steps 1.2/1.3 renumbered to 1.1/1.2.
Generic "User Input Tools" and "Image Generation Tools" preambles — SKILL.md no longer lists fallback rules for multiple possible tools; it references clarify and image_generate directly.

Image generation strategy changes

image_generate's schema accepts only prompt and aspect_ratio (landscape | portrait | square). Upstream's reference-image flow (--ref characters.png for character consistency, plus user-supplied refs for style/palette/scene) does not map to this tool, so the workflow was restructured:

The character sheet PNG is still generated for multi-page comics, but it now serves as a human-facing review artifact (for visual verification) and as a reference for later regenerations and manual prompt edits. Page prompts themselves are built from the text descriptions in characters/characters.md (embedded inline during Step 5). image_generate never sees the PNG as a visual input.
User-supplied reference images are reduced to style / palette / scene trait extraction — traits are embedded in the prompt body; the image files themselves are kept only for provenance under refs/.
Page prompts must now embed character descriptions inline (copied from characters/characters.md); this is the only mechanism left to enforce cross-page character consistency.
Download step — after every image_generate call, the returned URL is fetched to disk (e.g., curl -fsSL "<url>" -o <target>.png) and verified before the workflow advances.
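
The download step can be sketched as a pair of small shell helpers. This is a minimal sketch under stated assumptions, not the port's actual implementation: the function names fetch_page and is_png are hypothetical, and "verified" is interpreted here as "the file is non-empty and starts with the 8-byte PNG signature", which these notes do not specify.

```shell
# Hypothetical helpers; the names and the verification rule are assumptions.

# is_png FILE: succeed only if FILE is non-empty and begins with the
# 8-byte PNG signature (89 50 4e 47 0d 0a 1a 0a).
is_png() {
  [ -s "$1" ] || return 1
  png_sig=$(od -An -tx1 -N8 "$1" | tr -d ' \n')
  [ "$png_sig" = "89504e470d0a1a0a" ]
}

# fetch_page URL TARGET: download the URL returned by image_generate
# to TARGET, then check the file before the workflow advances.
fetch_page() {
  curl -fsSL "$1" -o "$2" || return 1
  is_png "$2"
}
```

A call such as fetch_page "<url>" "<target>.png" returns non-zero when either the download or the signature check fails, so the caller can retry or regenerate the page instead of moving on with a broken file.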

SKILL.md reductions

CLI option columns (--art, --tone, --layout, --aspect, --lang, --ref, --storyboard-only, --prompts-only, --images-only, --regenerate) converted to plain-English option descriptions.
Preset files (presets/*.md) and ohmsha-guide.md: `--style X` / `--art X --tone Y` shorthand rewritten to art=X, tone=Y + natural-language references.
partial-workflows.md: per-skill slash command invocations rewritten as user-intent cues; PDF-related outputs removed.
auto-selection.md: priority order dropped the EXTEND.md tier.
analysis-framework.md: language-priority comment updated (user option → conversation → source).

File naming convention

Source content pasted by the user is saved as source-{slug}.md, where {slug} is the kebab-case topic slug used for the output directory. Backups follow the same pattern with a -backup-YYYYMMDD-HHMMSS suffix. SKILL.md and workflow.md now agree on this single convention.
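
The convention can be illustrated with two small shell functions. This is a sketch only: the function names are hypothetical, and placing the backup suffix before the .md extension is an assumption, since the notes give the suffix but not its exact position in the file name.

```shell
# source_name SLUG: file name for source content pasted by the user.
source_name() {
  printf 'source-%s.md\n' "$1"
}

# backup_name SLUG [TIMESTAMP]: backup file name for the same source.
# TIMESTAMP defaults to the current time as YYYYMMDD-HHMMSS.
# Assumption: the -backup-... suffix goes before the .md extension.
backup_name() {
  printf 'source-%s-backup-%s.md\n' "$1" "${2:-$(date +%Y%m%d-%H%M%S)}"
}
```

Passing the timestamp explicitly keeps the helper deterministic; omitting it stamps the backup with the current time.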

What was preserved verbatim

All 6 art-style definitions (references/art-styles/)
All 7 tone definitions (references/tones/)
All 7 layout definitions (references/layouts/)
Core templates: character-template.md, storyboard-template.md, base-prompt.md
Preset bodies (only the first few intro lines adapted; special rules unchanged)
Author, version, homepage attribution

Syncing with upstream

To pull upstream updates:

# Compare versions
curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/SKILL.md | head -5
# Look for the version: line

# Diff a reference file
diff <(curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/references/art-styles/manga.md) \
     references/art-styles/manga.md

Art-style, tone, and layout reference files can usually be overwritten directly (they're upstream-verbatim). SKILL.md, references/workflow.md, references/partial-workflows.md, references/auto-selection.md, references/analysis-framework.md, references/ohmsha-guide.md, and references/presets/*.md must be manually merged since they contain Hermes-specific adaptations.
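
To see which of the upstream-verbatim files have drifted before overwriting them, the per-file diff can be looped over the three reference directories. This is a rough sketch, not a supported script: the function names are hypothetical, it assumes it is run from the skill's root directory, and it assumes every local path mirrors the upstream layout under skills/baoyu-comic/ (as the manga.md example does).

```shell
# Base URL taken from the manual diff example; adjust the branch if needed.
UPSTREAM_BASE="https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic"

# upstream_url RELPATH: raw upstream URL for a path relative to the skill root.
upstream_url() {
  printf '%s/%s\n' "$UPSTREAM_BASE" "$1"
}

# check_verbatim: compare each local art-style, tone, and layout file
# byte-for-byte with its upstream counterpart.
check_verbatim() {
  for ref_dir in references/art-styles references/tones references/layouts; do
    for ref_file in "$ref_dir"/*.md; do
      [ -e "$ref_file" ] || continue   # skip an unmatched glob
      if curl -fsSL "$(upstream_url "$ref_file")" | cmp -s - "$ref_file"; then
        echo "same     $ref_file"
      else
        # Also reached when the upstream fetch fails (e.g. file renamed).
        echo "DIFFERS  $ref_file"
      fi
    done
  done
}
```

Running check_verbatim lists every file as "same" or "DIFFERS"; files reported as differing are candidates for a direct overwrite, while the Hermes-adapted files listed above still need a manual merge.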

If upstream adds a Hermes-compatible PDF merge step (no extra npm deps), restore scripts/, reintroduce Step 8 (Merge to PDF) in workflow.md, and renumber the completion report back to Step 9.
