hermes-agent/website/docs/user-guide/skills/optional/creative/creative-hyperframes.md
Teknium 252d68fd45
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)
2026-05-09 13:19:51 -07:00


title: Hyperframes
sidebar_label: Hyperframes
description: Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions us...

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

Hyperframes

Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.

Skill metadata

Source Optional — install with hermes skills install official/creative/hyperframes
Path optional-skills/creative/hyperframes
Version 1.0.0
Author heygen-com
License Apache-2.0
Platforms linux, macos, windows
Tags creative, video, animation, html, gsap, motion-graphics
Related skills manim-video, meme-generation

Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

HyperFrames

HTML is the source of truth for video. A composition is an HTML file with data-* attributes for timing, a GSAP timeline for animation, and CSS for appearance. The HyperFrames engine captures the page frame-by-frame and encodes to MP4/WebM with FFmpeg.

Complement to manim-video: Use manim-video for mathematical/geometric explainers (equations, 3B1B-style). Use hyperframes for motion-graphics, talking-head with captions, product tours, social overlays, shader transitions, and anything driven by real video/audio media.

When to Use

  • User asks for a rendered video from text, a script, or a website
  • Animated title cards, lower thirds, or typographic intros
  • Captioned narration video (TTS + captions synced to waveform)
  • Audio-reactive visuals (beat sync, spectrum bars, pulsing glow)
  • Scene-to-scene transitions (crossfade, wipe, shader warp, flash-through-white)
  • Social overlays (Instagram/TikTok/YouTube style)
  • Website-to-video pipeline (capture a URL, produce a promo)
  • Any HTML/CSS/JS animation that must render deterministically to a video file

Do not use this skill for:

  • Pure math/equation animation (→ manim-video)
  • Image generation or memes (→ meme-generation, image models)
  • Live video conferencing or streaming

Quick Reference

npx hyperframes init my-video               # scaffold a project
cd my-video
npx hyperframes lint                        # validate before preview/render
npx hyperframes preview                     # live-reload browser preview (port 3002)
npx hyperframes render --output final.mp4   # render to MP4
npx hyperframes doctor                      # diagnose environment issues

Render flags: --quality draft|standard|high · --fps 24|30|60 · --format mp4|webm · --docker (reproducible) · --strict.

Full CLI reference: references/cli.md.

Setup (one-time)

bash "$(dirname "$(find ~/.hermes/skills -path '*/hyperframes/SKILL.md' 2>/dev/null | head -1)")/scripts/setup.sh"

The script:

  1. Verifies Node.js >= 22 and FFmpeg are installed (prints fix instructions if not).
  2. Installs the hyperframes CLI globally (npm install -g hyperframes@>=0.4.2).
  3. Pre-caches chrome-headless-shell via Puppeteer — required for best-quality rendering via Chrome's HeadlessExperimental.beginFrame capture path.
  4. Runs npx hyperframes doctor and reports the result.

See references/troubleshooting.md if setup fails.

Procedure

1. Plan before writing HTML

Before touching code, articulate at a high level:

  • What — narrative arc, key moments, emotional beats
  • Structure — compositions, tracks (video/audio/overlays), durations
  • Visual identity — colors, fonts, motion character (explosive / cinematic / fluid / technical)
  • Hero frame — for each scene, the moment when the most elements are simultaneously visible. This is the static layout you'll build first.

Visual Identity Gate (HARD-GATE). Before writing ANY composition HTML, a visual identity must be defined. Do NOT write compositions with default or generic colors (#333, #3b82f6, Roboto are tells that this step was skipped). Check in order:

  1. DESIGN.md at project root? → Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.

  2. User named a style (e.g. "Swiss Pulse", "dark and techy", "luxury brand")? → Generate a minimal DESIGN.md with ## Style Prompt, ## Colors (3-5 hex with roles), ## Typography (1-2 families), ## What NOT to Do (3-5 anti-patterns).

  3. None of the above? → Ask 3 questions before writing any HTML:

    • Mood? (explosive / cinematic / fluid / technical / chaotic / warm)
    • Light or dark canvas?
    • Any brand colors, fonts, or visual references?

    Then generate a DESIGN.md from the answers. Every composition must trace its palette and typography back to DESIGN.md or explicit user direction.

2. Scaffold

npx hyperframes init my-video --non-interactive

Templates: blank, warm-grain, play-mode, swiss-grid, vignelli, decision-tree, kinetic-type, product-promo, nyt-graph. Pass --example <name> to pick one, --video clip.mp4 or --audio track.mp3 to seed with media.

3. Layout before animation

Write the static HTML+CSS for the hero frame first — no GSAP yet. The .scene-content container must fill the scene (width:100%; height:100%; padding:Npx) with display:flex + gap. Use padding to push content inward — never position: absolute; top: Npx on a content container (content overflows when taller than the remaining space).

Only after the hero frame looks right, add gsap.from() entrances (animate to the CSS position) and gsap.to() exits (animate from it).

See references/composition.md for the full data-attribute schema and composition rules.

4. Animate with GSAP

Every composition must:

  • Register its timeline: window.__timelines["<composition-id>"] = tl
  • Start paused: gsap.timeline({ paused: true }) — the player controls playback
  • Use finite repeat values (no repeat: -1 — breaks the capture engine). Calculate: repeat: Math.ceil(duration / cycleDuration) - 1.
  • Be deterministic — no Math.random(), Date.now(), or wall-clock logic. Use a seeded PRNG if you need pseudo-randomness.
  • Build synchronously — no async/await, setTimeout, or Promises around timeline construction.
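The finite-repeat and determinism rules above can be sketched as two small helpers. This is a minimal illustration, not part of HyperFrames or GSAP: `finiteRepeat` and `mulberry32` are hypothetical helper names, and mulberry32 is simply one common seeded PRNG picked for the example. Any seeded generator would serve.

```javascript
// Hypothetical helpers; not part of the HyperFrames or GSAP APIs.

// GSAP's `repeat` counts extra plays after the first one, so covering
// `duration` seconds with a loop of `cycleDuration` seconds needs
// ceil(duration / cycleDuration) - 1 repeats.
function finiteRepeat(duration, cycleDuration) {
  if (!(duration > 0) || !(cycleDuration > 0)) {
    throw new RangeError("duration and cycleDuration must be positive");
  }
  return Math.ceil(duration / cycleDuration) - 1;
}

// mulberry32: a small seeded PRNG. The same seed always yields the same
// sequence of floats in [0, 1), unlike Math.random().
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// A 1.5 s pulse looped across a 10 s scene: ceil(10 / 1.5) - 1 = 6 repeats.
const pulseRepeat = finiteRepeat(10, 1.5);

// Reproducible "random" offsets: identical on every render of the page.
const rand = mulberry32(42);
const jitter = [rand(), rand(), rand()];
```

In a composition, `pulseRepeat` would go where `repeat: -1` might otherwise be tempting, and `rand()` would replace any `Math.random()` call made while building the timeline.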

See references/gsap.md for the core GSAP API (tweens, eases, stagger, timelines).

5. Transitions between scenes

Multi-scene compositions require transitions. Rules:

  1. Always use a transition between scenes — no jump cuts.
  2. Always use entrance animations on every scene element (gsap.from(...)).
  3. Never use exit animations except on the final scene — the transition IS the exit.
  4. The final scene may fade out.

Use npx hyperframes add <transition-name> to install shader transitions (flash-through-white, liquid-wipe, etc.). Full list: npx hyperframes add --list.

6. Audio, captions, TTS, audio-reactive, highlighting

  • Audio: always use a separate <audio> element; the <video> element stays muted and playsinline.
  • TTS: npx hyperframes tts "Script text" --voice af_nova --output narration.wav. List voices with --list. Voice ID first letter encodes language (a/b=English, e=Spanish, f=French, j=Japanese, z=Mandarin, etc.) — the CLI auto-infers the phonemizer locale; pass --lang only to override. Non-English phonemization requires espeak-ng installed system-wide.
  • Captions: npx hyperframes transcribe narration.wav → word-level transcript. Pick style from the transcript tone (hype / corporate / tutorial / storytelling / social — see the table in references/features.md). Language rule: never use .en whisper models unless the audio is confirmed English — .en translates non-English audio instead of transcribing it. Every caption group MUST have a hard tl.set(el, { opacity: 0, visibility: "hidden" }, group.end) kill after its exit tween — otherwise groups leak visible into later ones.
  • Audio-reactive visuals: pre-extract audio bands (bass / mid / treble) and sample per-frame inside the timeline with a for loop of tl.call(draw, [], f / fps) — a single long tween does NOT react to audio. Map bass → scale (pulse), treble → textShadow/boxShadow (glow), overall amplitude → opacity/y/backgroundColor. Avoid equalizer-bar clichés — let content guide the visual, audio drive its behavior.
  • Marker-style highlighting: highlight, circle, burst, scribble, sketchout effects for text emphasis are deterministic CSS+GSAP — see references/features.md#marker-highlighting. Fully seekable, no animated SVG filters.
  • Scene transitions: every multi-scene composition MUST use transitions (no jump cuts). Pick from CSS primitives (push slide, blur crossfade, zoom through, staggered blocks) or shader transitions (flash-through-white, liquid-wipe, cross-warp-morph, chromatic-split, etc.) via npx hyperframes add. Mood and energy tables live in references/features.md#transitions. Do not mix CSS and shader transitions in the same composition.
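The caption hard-kill and the per-frame audio sampling described above can be sketched as follows. This is an illustration under assumptions, not HyperFrames code: the function names, the `{ start, end }` group shape, the `bands` array (one entry per frame of pre-extracted band values), and the tween values are all hypothetical. The only library calls used are GSAP's timeline methods `fromTo`, `to`, `set`, and `call`, with `tl` passed in by the caller. Unlike the empty parameter list in the bullet above, this sketch passes each frame's band values to `draw` as arguments.

```javascript
// Sketch only. `tl` is expected to be a GSAP timeline created by the caller;
// everything else here (names, data shapes, tween values) is hypothetical.

// Entrance, exit, then a hard kill at group.end so the caption group
// cannot remain visible once the next group starts.
function addCaptionGroup(tl, el, group) {
  tl.fromTo(el, { opacity: 0, y: 20 }, { opacity: 1, y: 0, duration: 0.2 }, group.start);
  tl.to(el, { opacity: 0, duration: 0.15 }, group.end - 0.15);
  tl.set(el, { opacity: 0, visibility: "hidden" }, group.end);
}

// One tl.call per frame, positioned at f / fps, so `draw` receives the
// pre-extracted band values for exactly that frame.
function scheduleAudioFrames(tl, bands, fps, draw) {
  for (let f = 0; f < bands.length; f++) {
    tl.call(draw, [bands[f], f], f / fps);
  }
  return bands.length; // number of scheduled callbacks
}
```

A `draw` callback would typically read values such as `band.bass` and apply them with `gsap.set(...)`. Because every call sits at a fixed timeline position, seeking to any frame produces the same result.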

7. Lint, validate, inspect, preview, render

npx hyperframes lint              # catches missing data-composition-id, overlapping tracks, unregistered timelines
npx hyperframes validate          # WCAG contrast audit at 5 timestamps
npx hyperframes inspect           # visual layout audit — overflow, off-frame elements, occluded text
npx hyperframes preview           # live browser preview
npx hyperframes render --quality draft --output draft.mp4    # fast iteration
npx hyperframes render --quality high --output final.mp4     # final delivery

hyperframes validate samples background pixels behind every text element and warns on contrast ratios below 4.5:1 (or 3:1 for large text). hyperframes inspect is the layout-side companion — runs the page at multiple timestamps and flags issues that a static lint can't see (a caption that wraps past the safe area only at 4.5s, a card that overflows when its title is the longest variant, an element that ends up behind a transition shader). Run inspect especially on compositions with speech bubbles, cards, captions, or tight typography.
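To make the 4.5:1 and 3:1 thresholds concrete, here is a sketch of the WCAG 2.x contrast-ratio formula for two solid colors. It is not the HyperFrames implementation, which samples rendered background pixels; the function names are hypothetical, and only `#rgb` and `#rrggbb` inputs are handled.

```javascript
// Sketch of the WCAG 2.x contrast-ratio formula; names are hypothetical.

// Parse "#rgb" or "#rrggbb" into [r, g, b] with each channel in 0..255.
function parseHex(hex) {
  let h = hex.replace(/^#/, "");
  if (h.length === 3) h = h.split("").map((c) => c + c).join("");
  if (!/^[0-9a-fA-F]{6}$/.test(h)) throw new Error(`bad hex color: ${hex}`);
  return [0, 2, 4].map((i) => parseInt(h.slice(i, i + 2), 16));
}

// Relative luminance: linearize each sRGB channel, then weight and sum.
function relativeLuminance(hex) {
  const [r, g, b] = parseHex(hex).map((v) => {
    const c = v / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 up to 21.
function contrastRatio(fg, bg) {
  const a = relativeLuminance(fg);
  const b = relativeLuminance(bg);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// 4.5:1 for normal text, 3:1 for large text.
function passesContrast(fg, bg, largeText = false) {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}
```

Checking a DESIGN.md palette with `passesContrast(textColor, canvasColor)` before writing any HTML can catch low-contrast pairs earlier than a render-time audit would.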

8. Website-to-video (if the user gives a URL)

Use the 7-step capture-to-video workflow in references/website-to-video.md: capture → DESIGN.md → SCRIPT.md → storyboard → composition → render → deliver.

Pitfalls

  • 'HeadlessExperimental.beginFrame' wasn't found — Chromium 147+ removed this protocol. Ensure you're on hyperframes@>=0.4.2 (auto-detects and falls back to screenshot mode). Escape hatch: export PRODUCER_FORCE_SCREENSHOT=true. See hyperframes#294 and references/troubleshooting.md.
  • System Chrome (not chrome-headless-shell) — renders hang for 120s, then time out. Run npx puppeteer browsers install chrome-headless-shell (setup.sh does this). hyperframes doctor reports which binary will be used.
  • repeat: -1 anywhere — breaks the capture engine. Always compute a finite repeat count.
  • gsap.set() on clip elements that enter later — the element doesn't exist at page load. Use tl.set(selector, vars, timePosition) inside the timeline instead, at or after the clip's data-start.
  • <br> inside content text — a forced break cannot account for the rendered font width, so natural wrapping combined with <br> produces double breaks. Use max-width and let the text wrap on its own. Exception: short display titles where each word is deliberately on its own line.
  • Animating visibility or display — GSAP can't tween these. Use autoAlpha (handles both visibility and opacity).
  • Calling video.play() or audio.play() — the framework owns playback. Never call these yourself.
  • Building timelines async — the capture engine reads window.__timelines synchronously after page load. Never wrap timeline construction in async, setTimeout, or a Promise.
  • Standalone index.html wrapped in <template> — hides all content from the browser. Only sub-compositions loaded via data-composition-src use <template>.
  • Using <video> for audio — always pair a muted <video> with a separate <audio> element.

Verification

Before and after rendering:

  1. Lint + validate + inspect pass: npx hyperframes lint --strict && npx hyperframes validate && npx hyperframes inspect (lint catches structural issues, validate catches contrast, inspect catches visual layout / overflow issues — see troubleshooting.md if warnings appear).
  2. Animation choreography — for new compositions or significant animation changes, run the animation map. npx hyperframes init copies the skill scripts into the project, so the path is project-local:
    node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
      --out <composition-dir>/.hyperframes/anim-map
    
    Outputs a single animation-map.json with per-tween summaries, ASCII Gantt timeline, stagger detection, dead zones (>1s with no animation), element lifecycles, and flags (offscreen, collision, invisible, paced-fast <0.2s, paced-slow >2s). Scan summaries and flags — fix or justify each. Skip on small edits.
  3. File exists + non-zero: ls -lh final.mp4.
  4. Duration matches data-duration: ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 final.mp4.
  5. Visual check: extract a mid-composition frame: ffmpeg -i final.mp4 -ss 00:00:05 -vframes 1 preview.png.
  6. Audio present if expected: ffprobe -v error -show_streams -select_streams a -of default=nw=1:nk=1 final.mp4 | head -1.
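The duration check in step 4 can be scripted. The sketch below is a hypothetical helper that compares the single number printed by the ffprobe command above against the composition's data-duration value. The 0.1 s default tolerance is an assumption made for illustration, not a HyperFrames rule.

```javascript
// Hypothetical helper; the default tolerance of 0.1 s is an assumption.
// `ffprobeOutput` is the text printed by the ffprobe duration command,
// for example "12.033333\n".
function durationMatches(ffprobeOutput, expectedSeconds, toleranceSeconds = 0.1) {
  const actual = Number.parseFloat(ffprobeOutput.trim());
  if (!Number.isFinite(actual)) {
    throw new Error(`could not parse a duration from: ${JSON.stringify(ffprobeOutput)}`);
  }
  return Math.abs(actual - expectedSeconds) <= toleranceSeconds;
}
```

A render script could feed it the captured stdout of ffprobe and fail the run when it returns `false`.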

If hyperframes render fails, run npx hyperframes doctor and attach its output when reporting.

References

  • composition.md — data attributes, timeline contract, non-negotiable rules, typography/asset rules
  • cli.md — every CLI command (init, capture, lint, validate, inspect, preview, render, transcribe, tts, doctor, browser, info, upgrade, benchmark)
  • gsap.md — GSAP core API for HyperFrames (tweens, eases, stagger, timelines, matchMedia)
  • features.md — captions, TTS, audio-reactive, marker highlighting, transitions (load on demand)
  • website-to-video.md — 7-step capture-to-video workflow
  • troubleshooting.md — OpenClaw fix, env vars, common render errors