mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 05:11:26 +00:00
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)
* docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
This commit is contained in:
parent
ea2d66ddc0
commit
252d68fd45
181 changed files with 5498 additions and 122 deletions
|
|
@ -18,6 +18,7 @@ Control Blender directly from Hermes via socket connection to the blender-mcp ad
|
|||
| Path | `optional-skills/creative/blender-mcp` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | alireza78a |
|
||||
| Platforms | linux, macos, windows |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
|
|
|
|||
|
|
@ -19,6 +19,7 @@ Generate flat, minimal light/dark-aware SVG diagrams as standalone HTML files, u
|
|||
| Version | `0.1.0` |
|
||||
| Author | v1k22 (original PR), ported into hermes-agent |
|
||||
| License | MIT |
|
||||
| Platforms | linux, macos, windows |
|
||||
| Tags | `diagrams`, `svg`, `visualization`, `education`, `physics`, `chemistry`, `engineering` |
|
||||
| Related skills | [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), `generative-widgets` |
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,205 @@
|
|||
---
|
||||
title: "Hyperframes"
|
||||
sidebar_label: "Hyperframes"
|
||||
description: "Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions us..."
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Hyperframes
|
||||
|
||||
Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Optional — install with `hermes skills install official/creative/hyperframes` |
|
||||
| Path | `optional-skills/creative/hyperframes` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | heygen-com |
|
||||
| License | Apache-2.0 |
|
||||
| Platforms | linux, macos, windows |
|
||||
| Tags | `creative`, `video`, `animation`, `html`, `gsap`, `motion-graphics` |
|
||||
| Related skills | [`manim-video`](/docs/user-guide/skills/bundled/creative/creative-manim-video), [`meme-generation`](/docs/user-guide/skills/optional/creative/creative-meme-generation) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# HyperFrames
|
||||
|
||||
HTML is the source of truth for video. A composition is an HTML file with `data-*` attributes for timing, a GSAP timeline for animation, and CSS for appearance. The HyperFrames engine captures the page frame-by-frame and encodes to MP4/WebM with FFmpeg.
|
||||
|
||||
**Complement to `manim-video`:** Use `manim-video` for mathematical/geometric explainers (equations, 3B1B-style). Use `hyperframes` for motion-graphics, talking-head with captions, product tours, social overlays, shader transitions, and anything driven by real video/audio media.
|
||||
|
||||
## When to Use
|
||||
|
||||
- User asks for a rendered video from text, a script, or a website
|
||||
- Animated title cards, lower thirds, or typographic intros
|
||||
- Captioned narration video (TTS + captions synced to waveform)
|
||||
- Audio-reactive visuals (beat sync, spectrum bars, pulsing glow)
|
||||
- Scene-to-scene transitions (crossfade, wipe, shader warp, flash-through-white)
|
||||
- Social overlays (Instagram/TikTok/YouTube style)
|
||||
- Website-to-video pipeline (capture a URL, produce a promo)
|
||||
- Any HTML/CSS/JS animation that must render deterministically to a video file
|
||||
|
||||
Do **not** use this skill for:
|
||||
- Pure math/equation animation (→ `manim-video`)
|
||||
- Image generation or memes (→ `meme-generation`, image models)
|
||||
- Live video conferencing or streaming
|
||||
|
||||
## Quick Reference
|
||||
|
||||
```bash
|
||||
npx hyperframes init my-video # scaffold a project
|
||||
cd my-video
|
||||
npx hyperframes lint # validate before preview/render
|
||||
npx hyperframes preview # live-reload browser preview (port 3002)
|
||||
npx hyperframes render --output final.mp4 # render to MP4
|
||||
npx hyperframes doctor # diagnose environment issues
|
||||
```
|
||||
|
||||
Render flags: `--quality draft|standard|high` · `--fps 24|30|60` · `--format mp4|webm` · `--docker` (reproducible) · `--strict`.
|
||||
|
||||
Full CLI reference: [references/cli.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/cli.md).
|
||||
|
||||
## Setup (one-time)
|
||||
|
||||
```bash
|
||||
bash "$(dirname "$(find ~/.hermes/skills -path '*/hyperframes/SKILL.md' 2>/dev/null | head -1)")/scripts/setup.sh"
|
||||
```
|
||||
|
||||
The script:
|
||||
1. Verifies Node.js >= 22 and FFmpeg are installed (prints fix instructions if not).
|
||||
2. Installs the `hyperframes` CLI globally (`npm install -g hyperframes@>=0.4.2`).
|
||||
3. Pre-caches `chrome-headless-shell` via Puppeteer — **required** for best-quality rendering via Chrome's `HeadlessExperimental.beginFrame` capture path.
|
||||
4. Runs `npx hyperframes doctor` and reports the result.
|
||||
|
||||
See [references/troubleshooting.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/troubleshooting.md) if setup fails.
|
||||
|
||||
## Procedure
|
||||
|
||||
### 1. Plan before writing HTML
|
||||
|
||||
Before touching code, articulate at a high level:
|
||||
- **What** — narrative arc, key moments, emotional beats
|
||||
- **Structure** — compositions, tracks (video/audio/overlays), durations
|
||||
- **Visual identity** — colors, fonts, motion character (explosive / cinematic / fluid / technical)
|
||||
- **Hero frame** — for each scene, the moment when the most elements are simultaneously visible. This is the static layout you'll build first.
|
||||
|
||||
**Visual Identity Gate (HARD-GATE).** Before writing ANY composition HTML, a visual identity must be defined. Do NOT write compositions with default or generic colors (`#333`, `#3b82f6`, `Roboto` are tells that this step was skipped). Check in order:
|
||||
|
||||
1. **`DESIGN.md` at project root?** → Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.
|
||||
2. **User named a style** (e.g. "Swiss Pulse", "dark and techy", "luxury brand")? → Generate a minimal `DESIGN.md` with `## Style Prompt`, `## Colors` (3-5 hex with roles), `## Typography` (1-2 families), `## What NOT to Do` (3-5 anti-patterns).
|
||||
3. **None of the above?** → Ask 3 questions before writing any HTML:
|
||||
- Mood? (explosive / cinematic / fluid / technical / chaotic / warm)
|
||||
- Light or dark canvas?
|
||||
- Any brand colors, fonts, or visual references?
|
||||
|
||||
Then generate a `DESIGN.md` from the answers. Every composition must trace its palette and typography back to `DESIGN.md` or explicit user direction.
|
||||
|
||||
### 2. Scaffold
|
||||
|
||||
```bash
|
||||
npx hyperframes init my-video --non-interactive
|
||||
```
|
||||
|
||||
Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`. Pass `--example <name>` to pick one, `--video clip.mp4` or `--audio track.mp3` to seed with media.
|
||||
|
||||
### 3. Layout before animation
|
||||
|
||||
Write the static HTML+CSS for the **hero frame first** — no GSAP yet. The `.scene-content` container must fill the scene (`width:100%; height:100%; padding:Npx`) with `display:flex` + `gap`. Use padding to push content inward — never `position: absolute; top: Npx` on a content container (content overflows when taller than the remaining space).
|
||||
|
||||
Only after the hero frame looks right, add `gsap.from()` entrances (animate **to** the CSS position) and `gsap.to()` exits (animate **from** it).
|
||||
|
||||
See [references/composition.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/composition.md) for the full data-attribute schema and composition rules.
|
||||
|
||||
### 4. Animate with GSAP
|
||||
|
||||
Every composition must:
|
||||
- Register its timeline: `window.__timelines["<composition-id>"] = tl`
|
||||
- Start paused: `gsap.timeline({ paused: true })` — the player controls playback
|
||||
- Use finite `repeat` values (no `repeat: -1` — breaks the capture engine). Calculate: `repeat: Math.ceil(duration / cycleDuration) - 1`.
|
||||
- Be deterministic — no `Math.random()`, `Date.now()`, or wall-clock logic. Use a seeded PRNG if you need pseudo-randomness.
|
||||
- Build synchronously — no `async`/`await`, `setTimeout`, or Promises around timeline construction.
|
||||
|
||||
See [references/gsap.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/gsap.md) for the core GSAP API (tweens, eases, stagger, timelines).
|
||||
|
||||
### 5. Transitions between scenes
|
||||
|
||||
Multi-scene compositions require transitions. Rules:
|
||||
1. **Always use a transition between scenes** — no jump cuts.
|
||||
2. **Always use entrance animations** on every scene element (`gsap.from(...)`).
|
||||
3. **Never use exit animations** except on the final scene — the transition IS the exit.
|
||||
4. The final scene may fade out.
|
||||
|
||||
Use `npx hyperframes add <transition-name>` to install shader transitions (`flash-through-white`, `liquid-wipe`, etc.). Full list: `npx hyperframes add --list`.
|
||||
|
||||
### 6. Audio, captions, TTS, audio-reactive, highlighting
|
||||
|
||||
- **Audio:** always a separate `<audio>` element (video is `muted playsinline`).
|
||||
- **TTS:** `npx hyperframes tts "Script text" --voice af_nova --output narration.wav`. List voices with `--list`. Voice ID first letter encodes language (`a`/`b`=English, `e`=Spanish, `f`=French, `j`=Japanese, `z`=Mandarin, etc.) — the CLI auto-infers the phonemizer locale; pass `--lang` only to override. Non-English phonemization requires `espeak-ng` installed system-wide.
|
||||
- **Captions:** `npx hyperframes transcribe narration.wav` → word-level transcript. Pick style from the transcript tone (hype / corporate / tutorial / storytelling / social — see the table in `references/features.md`). **Language rule:** never use `.en` whisper models unless the audio is confirmed English — `.en` translates non-English audio instead of transcribing it. Every caption group MUST have a hard `tl.set(el, { opacity: 0, visibility: "hidden" }, group.end)` kill after its exit tween — otherwise groups leak visible into later ones.
|
||||
- **Audio-reactive visuals:** pre-extract audio bands (bass / mid / treble) and sample per-frame inside the timeline with a `for` loop of `tl.call(draw, [], f / fps)` — a single long tween does NOT react to audio. Map bass → `scale` (pulse), treble → `textShadow`/`boxShadow` (glow), overall amplitude → `opacity`/`y`/`backgroundColor`. Avoid equalizer-bar clichés — let content guide the visual, audio drive its behavior.
|
||||
- **Marker-style highlighting:** highlight, circle, burst, scribble, sketchout effects for text emphasis are deterministic CSS+GSAP — see `references/features.md#marker-highlighting`. Fully seekable, no animated SVG filters.
|
||||
- **Scene transitions:** every multi-scene composition MUST use transitions (no jump cuts). Pick from CSS primitives (push slide, blur crossfade, zoom through, staggered blocks) or shader transitions (`flash-through-white`, `liquid-wipe`, `cross-warp-morph`, `chromatic-split`, etc.) via `npx hyperframes add`. Mood and energy tables live in `references/features.md#transitions`. Do not mix CSS and shader transitions in the same composition.
|
||||
|
||||
### 7. Lint, validate, inspect, preview, render
|
||||
|
||||
```bash
|
||||
npx hyperframes lint # catches missing data-composition-id, overlapping tracks, unregistered timelines
|
||||
npx hyperframes validate # WCAG contrast audit at 5 timestamps
|
||||
npx hyperframes inspect # visual layout audit — overflow, off-frame elements, occluded text
|
||||
npx hyperframes preview # live browser preview
|
||||
npx hyperframes render --quality draft --output draft.mp4 # fast iteration
|
||||
npx hyperframes render --quality high --output final.mp4 # final delivery
|
||||
```
|
||||
|
||||
`hyperframes validate` samples background pixels behind every text element and warns on contrast ratios below 4.5:1 (or 3:1 for large text). `hyperframes inspect` is the layout-side companion — runs the page at multiple timestamps and flags issues that a static lint can't see (a caption that wraps past the safe area only at 4.5s, a card that overflows when its title is the longest variant, an element that ends up behind a transition shader). Run `inspect` especially on compositions with speech bubbles, cards, captions, or tight typography.
|
||||
|
||||
### 8. Website-to-video (if the user gives a URL)
|
||||
|
||||
Use the 7-step capture-to-video workflow in [references/website-to-video.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/website-to-video.md): capture → DESIGN.md → SCRIPT.md → storyboard → composition → render → deliver.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- **`HeadlessExperimental.beginFrame' wasn't found`** — Chromium 147+ removed this protocol. Ensure you're on `hyperframes@>=0.4.2` (auto-detects and falls back to screenshot mode). Escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294) and [references/troubleshooting.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/troubleshooting.md).
|
||||
- **System Chrome (not `chrome-headless-shell`)** — renders hang for 120s then timeout. Run `npx puppeteer browsers install chrome-headless-shell` (setup.sh does this). `hyperframes doctor` reports which binary will be used.
|
||||
- **`repeat: -1` anywhere** — breaks the capture engine. Always compute a finite repeat count.
|
||||
- **`gsap.set()` on clip elements that enter later** — the element doesn't exist at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline instead, at or after the clip's `data-start`.
|
||||
- **`<br>` inside content text** — forced breaks don't know the rendered font width, so natural wrap + `<br>` double-breaks. Use `max-width` to let text wrap. Exception: short display titles where each word is deliberately on its own line.
|
||||
- **Animating `visibility` or `display`** — GSAP can't tween these. Use `autoAlpha` (handles both visibility and opacity).
|
||||
- **Calling `video.play()` or `audio.play()`** — the framework owns playback. Never call these yourself.
|
||||
- **Building timelines async** — the capture engine reads `window.__timelines` synchronously after page load. Never wrap timeline construction in `async`, `setTimeout`, or a Promise.
|
||||
- **Standalone `index.html` wrapped in `<template>`** — hides all content from the browser. Only **sub-compositions** loaded via `data-composition-src` use `<template>`.
|
||||
- **Using video for audio** — always muted `<video>` + separate `<audio>`.
|
||||
|
||||
## Verification
|
||||
|
||||
Before and after rendering:
|
||||
|
||||
1. **Lint + validate + inspect pass:** `npx hyperframes lint --strict && npx hyperframes validate && npx hyperframes inspect` (lint catches structural issues, validate catches contrast, inspect catches visual layout / overflow issues — see troubleshooting.md if warnings appear).
|
||||
2. **Animation choreography** — for new compositions or significant animation changes, run the animation map. `npx hyperframes init` copies the skill scripts into the project, so the path is project-local:
|
||||
```bash
|
||||
node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
|
||||
--out <composition-dir>/.hyperframes/anim-map
|
||||
```
|
||||
Outputs a single `animation-map.json` with per-tween summaries, ASCII Gantt timeline, stagger detection, dead zones (>1s with no animation), element lifecycles, and flags (`offscreen`, `collision`, `invisible`, `paced-fast` <0.2s, `paced-slow` >2s). Scan summaries and flags — fix or justify each. Skip on small edits.
|
||||
3. **File exists + non-zero:** `ls -lh final.mp4`.
|
||||
4. **Duration matches `data-duration`:** `ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 final.mp4`.
|
||||
5. **Visual check:** extract a mid-composition frame: `ffmpeg -i final.mp4 -ss 00:00:05 -vframes 1 preview.png`.
|
||||
6. **Audio present if expected:** `ffprobe -v error -show_streams -select_streams a -of default=nw=1:nk=1 final.mp4 | head -1`.
|
||||
|
||||
If `hyperframes render` fails, run `npx hyperframes doctor` and attach its output when reporting.
|
||||
|
||||
## References
|
||||
|
||||
- [composition.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/composition.md) — data attributes, timeline contract, non-negotiable rules, typography/asset rules
|
||||
- [cli.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/cli.md) — every CLI command (init, capture, lint, validate, inspect, preview, render, transcribe, tts, doctor, browser, info, upgrade, benchmark)
|
||||
- [gsap.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/gsap.md) — GSAP core API for HyperFrames (tweens, eases, stagger, timelines, matchMedia)
|
||||
- [features.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/features.md) — captions, TTS, audio-reactive, marker highlighting, transitions (load on demand)
|
||||
- [website-to-video.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/website-to-video.md) — 7-step capture-to-video workflow
|
||||
- [troubleshooting.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/hyperframes/references/troubleshooting.md) — OpenClaw fix, env vars, common render errors
|
||||
|
|
@ -0,0 +1,219 @@
|
|||
---
|
||||
title: "Kanban Video Orchestrator — Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban"
|
||||
sidebar_label: "Kanban Video Orchestrator"
|
||||
description: "Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban"
|
||||
---
|
||||
|
||||
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
|
||||
|
||||
# Kanban Video Orchestrator
|
||||
|
||||
Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban. Use when the user wants to make ANY video — narrative film, product/marketing, music video, explainer, ASCII/terminal art, abstract/generative loop, comic, 3D, real-time/installation — and the work warrants decomposition into specialized profiles (writer, designer, animator, renderer, voice, editor, etc.) coordinated through a kanban board. Performs adaptive discovery to scope the brief, designs an appropriate team for the requested style, generates the setup script that creates Hermes profiles + initial kanban task, then helps monitor execution and intervene when tasks stall or fail. Routes scenes to whichever Hermes rendering / audio / design skill fits each beat (`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`, `blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`, `songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and image-to-video as needed.
|
||||
|
||||
## Skill metadata
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Source | Optional — install with `hermes skills install official/creative/kanban-video-orchestrator` |
|
||||
| Path | `optional-skills/creative/kanban-video-orchestrator` |
|
||||
| Version | `1.0.0` |
|
||||
| Author | ['SHL0MS', 'alt-glitch'] |
|
||||
| License | MIT |
|
||||
| Platforms | linux, macos, windows |
|
||||
| Tags | `video`, `kanban`, `multi-agent`, `orchestration`, `production-pipeline` |
|
||||
| Related skills | [`kanban-orchestrator`](/docs/user-guide/skills/bundled/devops/devops-kanban-orchestrator), [`kanban-worker`](/docs/user-guide/skills/bundled/devops/devops-kanban-worker), [`ascii-video`](/docs/user-guide/skills/bundled/creative/creative-ascii-video), [`manim-video`](/docs/user-guide/skills/bundled/creative/creative-manim-video), [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js), [`comfyui`](/docs/user-guide/skills/bundled/creative/creative-comfyui), [`touchdesigner-mcp`](/docs/user-guide/skills/bundled/creative/creative-touchdesigner-mcp), [`blender-mcp`](/docs/user-guide/skills/optional/creative/creative-blender-mcp), [`pixel-art`](/docs/user-guide/skills/bundled/creative/creative-pixel-art), [`ascii-art`](/docs/user-guide/skills/bundled/creative/creative-ascii-art), [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music), [`heartmula`](/docs/user-guide/skills/bundled/media/media-heartmula), [`songsee`](/docs/user-guide/skills/bundled/media/media-songsee), [`spotify`](/docs/user-guide/skills/bundled/media/media-spotify), [`youtube-content`](/docs/user-guide/skills/bundled/media/media-youtube-content), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram), [`concept-diagrams`](/docs/user-guide/skills/optional/creative/creative-concept-diagrams), [`baoyu-comic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-comic), [`baoyu-infographic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-infographic), [`humanizer`](/docs/user-guide/skills/bundled/creative/creative-humanizer), [`gif-search`](/docs/user-guide/skills/bundled/media/media-gif-search), [`meme-generation`](/docs/user-guide/skills/optional/creative/creative-meme-generation) |
|
||||
|
||||
## Reference: full SKILL.md
|
||||
|
||||
:::info
|
||||
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
|
||||
:::
|
||||
|
||||
# Kanban Video Orchestrator
|
||||
|
||||
Wrap any video request — from a 15-second product teaser to a 5-minute narrative
|
||||
short to a music video to an ASCII loop — in a Hermes Kanban pipeline that
|
||||
decomposes the work to specialized agent profiles.
|
||||
|
||||
This skill does **not** render anything itself. It is a meta-pipeline that:
|
||||
|
||||
1. **Scopes** the request through targeted discovery
|
||||
2. **Designs** an appropriate team (which roles, which tools per role) based on the style
|
||||
3. **Generates** a setup script that creates Hermes profiles, project workspace, and the initial kanban task
|
||||
4. **Hands off** to the director profile, which decomposes via the kanban
|
||||
5. **Monitors** execution, helps intervene when tasks stall or fail
|
||||
|
||||
The actual rendering happens inside the kanban once it's running, via whichever
|
||||
existing skills + tools fit the scenes — `ascii-video`, `manim-video`, `p5js`,
|
||||
`comfyui`, `touchdesigner-mcp`, `blender-mcp`, `songwriting-and-ai-music`,
|
||||
`heartmula`, external APIs, or plain Python with PIL + ffmpeg.
|
||||
|
||||
## When NOT to use this skill
|
||||
|
||||
- The video is one continuous procedural project that needs no specialists. Just write the code directly.
|
||||
- The user wants a quick one-shot conversion (e.g. "convert this mp4 to a GIF") — use ffmpeg directly.
|
||||
- The output is a static image, GIF, or audio-only artifact — use the matching specific skill (`ascii-art`, `gifs`, `meme-generation`, `songwriting-and-ai-music`).
|
||||
- The work fits a single existing skill cleanly (e.g. a pure ASCII video — just use `ascii-video`).
|
||||
|
||||
## Workflow
|
||||
|
||||
```
|
||||
DISCOVER → BRIEF → TEAM DESIGN → SETUP → EXECUTE → MONITOR
|
||||
```
|
||||
|
||||
### Step 1 — Discover (ask the right questions)
|
||||
|
||||
The discovery process is **adaptive**: ask only what is actually needed. Always
|
||||
start with three questions to identify the broad shape:
|
||||
|
||||
- **What is the video?** (one-sentence brief)
|
||||
- **How long?** (5-30s teaser / 30-90s short / 90s-3min explainer / 3-10min film / longer)
|
||||
- **What aspect ratio + target platform?** (1:1 / 9:16 / 16:9; X, IG, YouTube, internal, etc.)
|
||||
|
||||
From the answer, classify the style category. The style determines which
|
||||
follow-up questions to ask. **Do not ask all questions at once.** Ask 2-4 at a
|
||||
time, listen, then proceed. Make reasonable assumptions whenever the user
|
||||
implies an answer.
|
||||
|
||||
For complete intake patterns and per-style question banks, see
|
||||
**[references/intake.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/intake.md)**.
|
||||
|
||||
### Step 2 — Brief
|
||||
|
||||
Once enough is known, produce a structured `brief.md` using the template in
|
||||
`assets/brief.md.tmpl`. Stages:
|
||||
|
||||
1. **Concept** — the one-sentence pitch + emotional north star
|
||||
2. **Scope** — duration, aspect, platform, deadline
|
||||
3. **Style** — visual references, brand constraints, tone
|
||||
4. **Scenes** — beat-by-beat breakdown (durations, content, target tool)
|
||||
5. **Audio** — narration / music / SFX / silent (per scene if needed)
|
||||
6. **Deliverables** — file format, resolution, optional alternates (vertical cut, GIF, etc.)
|
||||
|
||||
Show the brief to the user for confirmation before designing the team. **The
|
||||
brief is the contract** — every downstream task references it.
|
||||
|
||||
### Step 3 — Team design
|
||||
|
||||
Pick role archetypes from the library that fit this video. **Compose, don't
|
||||
clone.** Most videos need 4-7 profiles. The director is always present; the
|
||||
rest are picked by what the brief actually requires.
|
||||
|
||||
For the role library and per-style team compositions, see
|
||||
**[references/role-archetypes.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/role-archetypes.md)**.
|
||||
|
||||
For mapping role → which Hermes skills + toolsets it loads, see
|
||||
**[references/tool-matrix.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/tool-matrix.md)**.
|
||||
|
||||
### Step 4 — Setup
|
||||
|
||||
Generate a setup script (`setup.sh`) and run it. The script:
|
||||
|
||||
1. Creates the project workspace (`~/projects/video-pipeline/<slug>/`)
|
||||
2. Copies any provided assets into `taste/`, `audio/`, `assets/`
|
||||
3. Creates each Hermes profile via `hermes profile create --clone`
|
||||
4. Writes per-profile `SOUL.md` (personality + role definition)
|
||||
5. Configures profile YAML (toolsets, always_load skills, cwd)
|
||||
6. Writes `brief.md`, `TEAM.md`, and `taste/` content
|
||||
7. Fires the initial `hermes kanban create` task assigned to the director
|
||||
|
||||
Use `scripts/bootstrap_pipeline.py` to generate setup.sh from a brief +
|
||||
team-design JSON. See **[references/kanban-setup.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/kanban-setup.md)**
|
||||
for the setup script structure, profile config patterns, and the critical
|
||||
"shared workspace" rule.
|
||||
|
||||
### Step 5 — Execute
|
||||
|
||||
Run `setup.sh`. Then provide the user with monitoring commands:
|
||||
|
||||
```bash
|
||||
hermes kanban watch --tenant <project-tenant> # live events
|
||||
hermes kanban list --tenant <project-tenant> # board snapshot
|
||||
hermes dashboard # visual board UI
|
||||
```
|
||||
|
||||
The director profile takes over from here, decomposing the work and routing
|
||||
tasks to specialist profiles via the kanban toolset.
|
||||
|
||||
### Step 6 — Monitor and intervene
|
||||
|
||||
Stay engaged — the kanban runs autonomously but a stuck task or bad output
|
||||
needs human (or AI) judgment.
|
||||
|
||||
Monitoring patterns: poll `kanban list` periodically, inspect any RUNNING task
|
||||
that exceeds its expected duration with `kanban show <id>`, and check
|
||||
heartbeats. When a worker's output fails review, the standard interventions are:
|
||||
|
||||
1. Comment on the worker's task with specific feedback (`kanban_comment`)
|
||||
2. Create a re-run task with the original as parent
|
||||
3. Adjust the brief's scope and let the director re-decompose
|
||||
|
||||
For diagnostic patterns, intervention recipes, and the "task is stuck"
|
||||
playbook, see **[references/monitoring.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/monitoring.md)**.
|
||||
|
||||
## Reference: worked examples
|
||||
|
||||
Six concrete pipelines covering very different video styles — narrative film,
|
||||
product/marketing, music video, math/algorithm explainer, ASCII video, real-time
|
||||
installation — showing how the same workflow yields very different teams and
|
||||
task graphs. See **[references/examples.md](https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/creative/kanban-video-orchestrator/references/examples.md)**.
|
||||
|
||||
## Critical rules
|
||||
|
||||
1. **Discovery before action.** Never start generating a brief or team without
|
||||
asking at least the three baseline questions. A bad brief cascades through
|
||||
the entire pipeline.
|
||||
|
||||
2. **Match the team to the video.** Don't reuse the same 4-profile setup for
|
||||
every job. A music video that doesn't have a beat-analysis profile will
|
||||
misfire. A narrative film that doesn't have a writer profile will produce
|
||||
incoherent scenes. See `references/role-archetypes.md`.
|
||||
|
||||
3. **One workspace per project.** All profiles for a given video share the same
|
||||
`dir:` workspace. Tasks pass artifacts via shared filesystem and structured
|
||||
handoffs. **Every** `kanban_create` call passes
|
||||
`workspace_kind="dir"` + `workspace_path="<absolute project path>"`.
|
||||
|
||||
4. **Tenant every project.** Use a project-specific tenant
|
||||
(`--tenant <project-slug>`). Keeps the dashboard scoped and prevents
|
||||
cross-pollination with other ongoing kanbans.
|
||||
|
||||
5. **Respect existing skills.** When a scene fits an existing skill, the
|
||||
relevant renderer should load that skill via `--skill <name>` on its task
|
||||
or `always_load` in its profile. Do not re-derive what a skill already
|
||||
provides.
|
||||
|
||||
6. **The director never executes.** Even with the full `kanban + terminal +
|
||||
file` toolset, the director's `SOUL.md` rules forbid it from executing
|
||||
work itself. It decomposes and routes only — every concrete task becomes
|
||||
a `hermes kanban create` call to a specialist profile. The
|
||||
`kanban-orchestrator` skill spells this out further.
|
||||
|
||||
7. **Don't over-decompose.** A 30-second product video does NOT need 20 tasks.
|
||||
Aim for the smallest task graph that still parallelizes well and exposes the
|
||||
right human-review gates.
|
||||
|
||||
8. **Verify API keys BEFORE firing.** External APIs (TTS, image-gen,
|
||||
image-to-video) need keys in `~/.hermes/.env` or the user's secret store.
|
||||
A worker that hits a missing-key error wastes a task slot. The setup
|
||||
script's `check_key` helper aborts cleanly if a required key is missing.
|
||||
|
||||
## File map
|
||||
|
||||
```
|
||||
SKILL.md ← this file (workflow + rules)
|
||||
references/
|
||||
intake.md ← discovery question banks per style
|
||||
role-archetypes.md ← role library (writer, designer, animator, …)
|
||||
tool-matrix.md ← skill + toolset mapping per role
|
||||
kanban-setup.md ← setup script structure & profile config
|
||||
monitoring.md ← watch + intervene patterns
|
||||
examples.md ← six worked pipelines
|
||||
assets/
|
||||
brief.md.tmpl ← brief skeleton
|
||||
setup.sh.tmpl ← setup script skeleton
|
||||
soul.md.tmpl ← profile personality skeleton
|
||||
scripts/
|
||||
bootstrap_pipeline.py ← generate setup.sh from brief + team JSON
|
||||
monitor.py ← polling + intervention helpers
|
||||
```
|
||||
|
|
@ -19,6 +19,7 @@ Generate real meme images by picking a template and overlaying text with Pillow.
|
|||
| Version | `2.0.0` |
|
||||
| Author | adanaleycio |
|
||||
| License | MIT |
|
||||
| Platforms | linux, macos, windows |
|
||||
| Tags | `creative`, `memes`, `humor`, `images` |
|
||||
| Related skills | [`ascii-art`](/docs/user-guide/skills/bundled/creative/creative-ascii-art), `generative-widgets` |
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue