mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-07 02:51:50 +00:00

James 50aabb9eb2 feat(skill): add hyperframes optional creative skill

Adds an optional creative skill that integrates HyperFrames, an
HTML-based video rendering framework, as a sibling to manim-video.
Complements manim's math-focused animation with motion-graphics,
captioned narration, audio-reactive visuals, shader transitions, and
website-to-video production.

Scope:
- optional-skills/creative/hyperframes/SKILL.md      — entry point
- references/composition.md                          — data-attr schema, timeline contract
- references/cli.md                                  — every npx hyperframes command
- references/gsap.md                                 — GSAP core API for compositions
- references/website-to-video.md                     — 7-step capture-to-video workflow
- references/troubleshooting.md                      — OpenClaw / Chromium 147 fix
- scripts/setup.sh                                   — idempotent one-time setup

OpenClaw / Chromium 147 fix (hyperframes#294):
Pinning hyperframes@>=0.4.2 (commit 4c72ba4 ships the
HeadlessExperimental.beginFrame auto-detect + screenshot fallback).
setup.sh pre-caches chrome-headless-shell so the fast BeginFrame path
is preferred over system Chrome. The PRODUCER_FORCE_SCREENSHOT=true
escape hatch is documented in troubleshooting.md and in SKILL.md
Pitfalls.

Placed under optional-skills/ (not bundled) per CONTRIBUTING.md
guidance for heavyweight deps: requires Node.js >= 22, FFmpeg, and
~300 MB chrome-headless-shell download.

2026-05-04 14:13:17 -07:00

4.8 KiB

Raw Blame History

Website to Video

Capture a website, produce a professional video from it. Use when the user provides a URL and wants a video — social ad, product tour, 30-second promo, etc.

The workflow has 7 steps. Each produces an artifact that gates the next. Do not skip steps — each artifact prevents a downstream failure mode.

Step 1: Capture & Understand

npx hyperframes capture https://example.com --out captured/

Produces:

captured/snapshot.html — self-contained page
captured/screenshot.png — above-the-fold visual
captured/assets/ — logos, hero images, background video (if any)
captured/palette.json — extracted colors (sorted by pixel coverage)
captured/text.md — extracted headings, paragraphs, CTAs
captured/fonts.json — font families and stacks detected in computed styles

Gate: Print a site summary — name, top 3 colors, primary + display fonts, hero asset path, one-sentence vibe. Keep it in your context — don't re-capture.

Step 2: Write DESIGN.md

Small brand reference at the project root. 6 sections, ~90 lines. This is the cheat sheet — not the creative plan.

# DESIGN

## Brand
- Name: Example Co.
- One-line mission: "…"

## Colors
- Background: #0B0F14
- Primary: #00E0A4 (accent, CTA)
- Secondary: #7A8B9B (body text)
- Text: #FFFFFF

## Typography
- Display: "Inter Tight", 700, tight letter-spacing
- Body: "Inter", 400

## Motion
- Mood: precise, technical, confident
- Eases: `power3.out` for entrances, `expo.in` for exits

## Assets
- Logo: `captured/assets/logo.svg`
- Hero image: `captured/assets/hero.png`

## What NOT to Do
- No purple, no pastels, no serif body
- No playful/bubbly eases (`elastic`, `bounce`)
- No drop shadows on text

Gate: DESIGN.md exists in the project directory.

Step 3: Write SCRIPT.md

Narration script. Story backbone. Scene durations come from the narration, not from guessing.

# SCRIPT

## Scene 1 — Hook (0:00–0:04)
"What if your dashboards wrote themselves?"

## Scene 2 — Problem (0:04–0:11)
"Teams spend hours stitching together queries, charts, and callouts — every Monday."

## Scene 3 — Solution (0:11–0:22)
"Example Co. watches your data streams and proposes the dashboard you'd have built — in seconds."

## Scene 4 — CTA (0:22–0:28)
"Try it free at example.com."

Run npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav to generate TTS audio. Note the exact duration — that's the video's duration.

Gate: SCRIPT.md + narration.wav exist and durations match the plan (±0.3s).

Step 4: Storyboard

Text-only scene plan: for each scene, describe the hero frame — what's on screen at the scene's most-visible moment.

# STORYBOARD

## Scene 1 (0:00–0:04) — Hook
Hero frame: giant "WHAT IF YOUR DASHBOARDS WROTE THEMSELVES?" in display font, centered, on near-black. Logo top-left at 40% opacity.
Entrance: each word staggers in, 0.08s apart.
Transition out: flash-through-white into Scene 2.

One paragraph per scene. Do NOT skip this step — it's where you catch narrative gaps before writing HTML.

Gate: STORYBOARD.md exists. Each scene has: hero frame, entrance, transition.

Step 5: Composition

Write index.html scene-by-scene:

Each scene is a <div class="scene scene-N"> positioned absolutely, full-bleed.
Static HTML+CSS for the hero frame first (no GSAP).
Layer the narration <audio> at data-start="0" on a high track index.
Add a transitions component (flash-through-white, liquid-wipe, etc.) between each scene.
THEN add GSAP entrances (gsap.from()), no exits — transitions own the exit.
Register window.__timelines["root"] = tl.

Install transitions as needed:

npx hyperframes add flash-through-white

Step 6: Render

npx hyperframes lint --strict          # must pass
npx hyperframes validate               # WCAG contrast audit
npx hyperframes render --quality draft --output draft.mp4

Watch the draft. Note issues in a REVIEW.md bullet list (scene, timestamp, issue). Fix, re-render.

When happy:

npx hyperframes render --quality high --output final.mp4

Step 7: Deliver

Report file path + duration + file size to the user.
If the user wants a vertical cut, re-render with a 9:16 composition (data-width="1080" data-height="1920") — typically requires a separate index-vertical.html with tighter typography and re-stacked scene layout.

Common Failure Modes

Skipped DESIGN.md → colors drift scene-to-scene; output feels like "AI slides."
Skipped STORYBOARD.md → scenes overlap or hero frames collide with transitions.
Exit animations before transitions → empty frames when the transition fires.
Narration longer than data-duration → audio clips mid-sentence. Update the composition's data-duration to match the TTS output length + 0.5s buffer.

4.8 KiB Raw Blame History Unescape Escape