mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:01:47 +00:00
rename: video-orchestrator → kanban-video-orchestrator
The kanban prefix makes the skill discoverable alongside `kanban-orchestrator` and `kanban-worker`, and signals up front that this skill drives the kanban plugin rather than being a generic video tool.

Updated:
- directory rename
- SKILL.md frontmatter `name:` and H1
- setup.sh.tmpl header
parent 511add7249
commit 0dd8e3f8d8
12 changed files with 3 additions and 3 deletions
206
optional-skills/creative/kanban-video-orchestrator/SKILL.md
Normal file
|
|
@@ -0,0 +1,206 @@
|
|||
---
|
||||
name: kanban-video-orchestrator
|
||||
description: Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban. Use when the user wants to make ANY video — narrative film, product/marketing, music video, explainer, ASCII/terminal art, abstract/generative loop, comic, 3D, real-time/installation — and the work warrants decomposition into specialized profiles (writer, designer, animator, renderer, voice, editor, etc.) coordinated through a kanban board. Performs adaptive discovery to scope the brief, designs an appropriate team for the requested style, generates the setup script that creates Hermes profiles + initial kanban task, then helps monitor execution and intervene when tasks stall or fail. Routes scenes to whichever Hermes rendering / audio / design skill fits each beat (`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`, `blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`, `songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and image-to-video as needed.
|
||||
version: 1.0.0
|
||||
author: [SHL0MS, alt-glitch]
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [video, kanban, multi-agent, orchestration, production-pipeline]
|
||||
related_skills: [kanban-orchestrator, kanban-worker, ascii-video, manim-video, p5js, comfyui, touchdesigner-mcp, blender-mcp, pixel-art, ascii-art, songwriting-and-ai-music, heartmula, songsee, spotify, youtube-content, claude-design, excalidraw, architecture-diagram, concept-diagrams, baoyu-comic, baoyu-infographic, humanizer, gif-search, meme-generation]
|
||||
credits: |
|
||||
The single-project workspace layout, profile-config patching pattern,
|
||||
SOUL.md-per-profile model, TEAM.md task-graph convention, and
|
||||
`--workspace dir:<path>` discipline are adapted from alt-glitch's
|
||||
original multi-agent video pipeline at
|
||||
https://github.com/NousResearch/kanban-video-pipeline.
|
||||
---
|
||||
|
||||
# Kanban Video Orchestrator
|
||||
|
||||
Wrap any video request — from a 15-second product teaser to a 5-minute narrative
|
||||
short to a music video to an ASCII loop — in a Hermes Kanban pipeline that
|
||||
decomposes the work to specialized agent profiles.
|
||||
|
||||
This skill does **not** render anything itself. It is a meta-pipeline that:
|
||||
|
||||
1. **Scopes** the request through targeted discovery
|
||||
2. **Designs** an appropriate team (which roles, which tools per role) based on the style
|
||||
3. **Generates** a setup script that creates Hermes profiles, project workspace, and the initial kanban task
|
||||
4. **Hands off** to the director profile, which decomposes the work via the kanban
|
||||
5. **Monitors** execution and helps intervene when tasks stall or fail
|
||||
|
||||
The actual rendering happens inside the kanban once it's running, via whichever
|
||||
existing skills + tools fit the scenes — `ascii-video`, `manim-video`, `p5js`,
|
||||
`comfyui`, `touchdesigner-mcp`, `blender-mcp`, `songwriting-and-ai-music`,
|
||||
`heartmula`, external APIs, or plain Python with PIL + ffmpeg.
|
||||
|
||||
## When NOT to use this skill
|
||||
|
||||
- The video is one continuous procedural project that needs no specialists. Just write the code directly.
|
||||
- The user wants a quick one-shot conversion (e.g. "convert this mp4 to a GIF") — use ffmpeg directly.
|
||||
- The output is a static image, GIF, or audio-only artifact — use the matching specific skill (`ascii-art`, `gifs`, `meme-generation`, `songwriting-and-ai-music`).
|
||||
- The work fits a single existing skill cleanly (e.g. a pure ASCII video — just use `ascii-video`).
|
||||
|
||||
## Workflow
|
||||
|
||||
```
|
||||
DISCOVER → BRIEF → TEAM DESIGN → SETUP → EXECUTE → MONITOR
|
||||
```
|
||||
|
||||
### Step 1 — Discover (ask the right questions)
|
||||
|
||||
The discovery process is **adaptive**: ask only what is actually needed. Always
|
||||
start with three questions to identify the broad shape:
|
||||
|
||||
- **What is the video?** (one-sentence brief)
|
||||
- **How long?** (5-30s teaser / 30-90s short / 90s-3min explainer / 3-10min film / longer)
|
||||
- **What aspect ratio + target platform?** (1:1 / 9:16 / 16:9; X, IG, YouTube, internal, etc.)
|
||||
|
||||
From the answers, classify the style category. The style determines which
|
||||
follow-up questions to ask. **Do not ask all questions at once.** Ask 2-4 at a
|
||||
time, listen, then proceed. Make reasonable assumptions whenever the user
|
||||
implies an answer.
|
||||
|
||||
For complete intake patterns and per-style question banks, see
|
||||
**[references/intake.md](references/intake.md)**.
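
As a rough illustration of the classification step, the duration answer can be mapped onto the buckets listed above. This is a minimal sketch, not part of the skill: the function name `duration_bucket` is hypothetical, and treating each upper bound as inclusive (and anything under 5 s as a teaser) is an assumption, since the listed ranges share their endpoints.

```python
def duration_bucket(seconds: float) -> str:
    """Map a duration in seconds to one of the Step 1 duration buckets.

    Assumption: upper bounds are inclusive (30 s is a teaser, 90 s a short).
    """
    if seconds <= 0:
        raise ValueError("duration must be positive")
    if seconds <= 30:
        return "teaser"
    if seconds <= 90:
        return "short"
    if seconds <= 180:      # 90 s to 3 min
        return "explainer"
    if seconds <= 600:      # 3 to 10 min
        return "film"
    return "longer"


for s in (15, 90, 120, 300, 900):
    print(s, duration_bucket(s))
```

The bucket is only a starting point for choosing follow-up questions; the style category still comes from the one-sentence brief.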
|
||||
|
||||
### Step 2 — Brief
|
||||
|
||||
Once enough is known, produce a structured `brief.md` using the template in
|
||||
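`assets/brief.md.tmpl`. Its first six sections (the template ends with a seventh, Constraints, listing required API keys, external dependencies, and source assets):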
|
||||
|
||||
1. **Concept** — the one-sentence pitch + emotional north star
|
||||
2. **Scope** — duration, aspect, platform, deadline
|
||||
3. **Style** — visual references, brand constraints, tone
|
||||
4. **Scenes** — beat-by-beat breakdown (durations, content, target tool)
|
||||
5. **Audio** — narration / music / SFX / silent (per scene if needed)
|
||||
6. **Deliverables** — file format, resolution, optional alternates (vertical cut, GIF, etc.)
|
||||
|
||||
Show the brief to the user for confirmation before designing the team. **The
|
||||
brief is the contract** — every downstream task references it.
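
One sanity check that may be worth running before the brief is confirmed: the scene durations in the Scenes table should add up to the duration stated under Scope. The sketch below is illustrative only; the function names and the half-second tolerance are assumptions, not something the skill defines.

```python
def scene_duration_gap(total_s: float, scene_durations_s: list[float]) -> float:
    """Return planned scene time minus the brief's total duration, in seconds."""
    return sum(scene_durations_s) - total_s


def scenes_fit(total_s: float, scene_durations_s: list[float],
               tolerance_s: float = 0.5) -> bool:
    """True when the scene plan matches the total within the tolerance."""
    return abs(scene_duration_gap(total_s, scene_durations_s)) <= tolerance_s


print(scenes_fit(30, [6, 8, 10, 6]))       # True
print(scene_duration_gap(30, [6, 8, 10]))  # -6
```

A negative gap means the scene plan is shorter than the brief's duration and a beat is probably missing.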
|
||||
|
||||
### Step 3 — Team design
|
||||
|
||||
Pick role archetypes from the library that fit this video. **Compose, don't
|
||||
clone.** Most videos need 4-7 profiles. The director is always present; the
|
||||
rest are picked by what the brief actually requires.
|
||||
|
||||
For the role library and per-style team compositions, see
|
||||
**[references/role-archetypes.md](references/role-archetypes.md)**.
|
||||
|
||||
For mapping role → which Hermes skills + toolsets it loads, see
|
||||
**[references/tool-matrix.md](references/tool-matrix.md)**.
|
||||
|
||||
### Step 4 — Setup
|
||||
|
||||
Generate a setup script (`setup.sh`) and run it. The script:
|
||||
|
||||
1. Creates the project workspace (`~/projects/video-pipeline/<slug>/`)
|
||||
2. Copies any provided assets into `taste/`, `audio/`, `assets/`
|
||||
3. Creates each Hermes profile via `hermes profile create --clone`
|
||||
4. Writes per-profile `SOUL.md` (personality + role definition)
|
||||
5. Configures profile YAML (toolsets, `always_load` skills); the working directory is set per task via `--workspace dir:<path>`, not in the profile
|
||||
6. Writes `brief.md`, `TEAM.md`, and `taste/` content
|
||||
7. Fires the initial `hermes kanban create` task assigned to the director
|
||||
|
||||
Use `scripts/bootstrap_pipeline.py` to generate setup.sh from a brief +
|
||||
team-design JSON. See **[references/kanban-setup.md](references/kanban-setup.md)**
|
||||
for the setup script structure, profile config patterns, and the critical
|
||||
"shared workspace" rule.
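
The templates in `assets/` use `{{NAME}}` placeholders, so generating `setup.sh` comes down to substituting values into a template. The sketch below shows one way that substitution could work; it is not the actual `scripts/bootstrap_pipeline.py`, and `render_template` and the sample values are hypothetical.

```python
import re

# Matches placeholders such as {{SLUG}} or {{SCENE_1_TOOL}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def render_template(template: str, values: dict[str, str]) -> str:
    """Replace every {{NAME}} in the template, failing loudly on gaps."""
    needed = {m.group(1) for m in PLACEHOLDER.finditer(template)}
    missing = sorted(needed - set(values))
    if missing:
        raise KeyError(f"unfilled placeholders: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


header = "# Slug: {{SLUG}}\n# Tenant: {{TENANT}}\n"
print(render_template(header, {"SLUG": "noir-short", "TENANT": "noir-short"}))
```

Failing on an unfilled placeholder, rather than leaving `{{...}}` in the output, keeps a half-rendered script from ever being run.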
|
||||
|
||||
### Step 5 — Execute
|
||||
|
||||
Run `setup.sh`. Then provide the user with monitoring commands:
|
||||
|
||||
```bash
|
||||
hermes kanban watch --tenant <project-tenant> # live events
|
||||
hermes kanban list --tenant <project-tenant> # board snapshot
|
||||
hermes dashboard # visual board UI
|
||||
```
|
||||
|
||||
The director profile takes over from here, decomposing the work and routing
|
||||
tasks to specialist profiles via the kanban toolset.
|
||||
|
||||
### Step 6 — Monitor and intervene
|
||||
|
||||
Stay engaged — the kanban runs autonomously but a stuck task or bad output
|
||||
needs human (or AI) judgment.
|
||||
|
||||
Monitoring patterns: poll `kanban list` periodically, inspect any RUNNING task
|
||||
that exceeds its expected duration with `kanban show <id>`, and check
|
||||
heartbeats. When a worker's output fails review, the standard interventions are:
|
||||
|
||||
1. Comment on the worker's task with specific feedback (`kanban_comment`)
|
||||
2. Create a re-run task with the original as parent
|
||||
3. Adjust the brief's scope and let the director re-decompose
|
||||
|
||||
For diagnostic patterns, intervention recipes, and the "task is stuck"
|
||||
playbook, see **[references/monitoring.md](references/monitoring.md)**.
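
The "RUNNING task that exceeds its expected duration" check can be sketched as follows. This is an assumption-laden illustration: `TaskSnapshot` and its fields are hypothetical, and how the snapshots are obtained from `kanban list` or `kanban show` output is deliberately left out because that format is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TaskSnapshot:
    """Hypothetical view of one kanban task at polling time."""
    task_id: str
    status: str
    started_at: datetime
    expected: timedelta


def stalled(tasks: list[TaskSnapshot], now: datetime,
            grace: float = 1.5) -> list[str]:
    """Return ids of RUNNING tasks older than grace x their expected time."""
    return [
        t.task_id
        for t in tasks
        if t.status == "RUNNING" and now - t.started_at > t.expected * grace
    ]


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
tasks = [
    TaskSnapshot("t1", "RUNNING", now - timedelta(minutes=50), timedelta(minutes=20)),
    TaskSnapshot("t2", "RUNNING", now - timedelta(minutes=10), timedelta(minutes=20)),
    TaskSnapshot("t3", "DONE", now - timedelta(hours=3), timedelta(minutes=20)),
]
print(stalled(tasks, now))  # ['t1']
```

The grace factor avoids flagging tasks that are only slightly over their estimate; anything it returns is a candidate for `kanban show <id>`.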
|
||||
|
||||
## Reference: worked examples
|
||||
|
||||
Six concrete pipelines covering very different video styles — narrative film,
|
||||
product/marketing, music video, math/algorithm explainer, ASCII video, real-time
|
||||
installation — showing how the same workflow yields very different teams and
|
||||
task graphs. See **[references/examples.md](references/examples.md)**.
|
||||
|
||||
## Critical rules
|
||||
|
||||
1. **Discovery before action.** Never start generating a brief or team without
|
||||
asking at least the three baseline questions. A bad brief cascades through
|
||||
the entire pipeline.
|
||||
|
||||
2. **Match the team to the video.** Don't reuse the same 4-profile setup for
|
||||
every job. A music video that doesn't have a beat-analysis profile will
|
||||
misfire. A narrative film that doesn't have a writer profile will produce
|
||||
incoherent scenes. See `references/role-archetypes.md`.
|
||||
|
||||
3. **One workspace per project.** All profiles for a given video share the same
|
||||
`dir:` workspace. Tasks pass artifacts via shared filesystem and structured
|
||||
handoffs. **Every** `kanban_create` call passes
|
||||
`workspace_kind="dir"` + `workspace_path="<absolute project path>"`.
|
||||
|
||||
4. **Tenant every project.** Use a project-specific tenant
|
||||
(`--tenant <project-slug>`). Keeps the dashboard scoped and prevents
|
||||
cross-pollination with other ongoing kanbans.
|
||||
|
||||
5. **Respect existing skills.** When a scene fits an existing skill, the
|
||||
relevant renderer should load that skill via `--skill <name>` on its task
|
||||
or `always_load` in its profile. Do not re-derive what a skill already
|
||||
provides.
|
||||
|
||||
6. **The director never executes.** Even with the full `kanban + terminal +
|
||||
file` toolset, the director's `SOUL.md` rules forbid it from executing
|
||||
work itself. It decomposes and routes only — every concrete task becomes
|
||||
a `hermes kanban create` call to a specialist profile. The
|
||||
`kanban-orchestrator` skill spells this out further.
|
||||
|
||||
7. **Don't over-decompose.** A 30-second product video does NOT need 20 tasks.
|
||||
Aim for the smallest task graph that still parallelizes well and exposes the
|
||||
right human-review gates.
|
||||
|
||||
8. **Verify API keys BEFORE firing.** External APIs (TTS, image-gen,
|
||||
image-to-video) need keys in `~/.hermes/.env` or the user's secret store.
|
||||
A worker that hits a missing-key error wastes a task slot. The setup
|
||||
script's `check_key` helper aborts cleanly if a required key is missing.
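
For illustration, the `.env` half of the key check can be expressed in Python as below. This is a sketch rather than the setup script's implementation: it ignores the Keychain fallback, the function name `missing_keys` and the key names in the example are hypothetical, and treating a whitespace-only value as empty is an assumption.

```python
def missing_keys(env_text: str, required: list[str]) -> list[str]:
    """Return the required keys that have no non-empty VAR=value line."""
    present = set()
    for line in env_text.splitlines():
        name, sep, value = line.partition("=")
        # Only lines of the form NAME=<something> count as set.
        if sep and value.strip():
            present.add(name)
    return [key for key in required if key not in present]


# Hypothetical .env contents and key names.
env_text = "TTS_API_KEY=abc123\nIMAGE_API_KEY=\n# comment\n"
print(missing_keys(env_text, ["TTS_API_KEY", "IMAGE_API_KEY", "VIDEO_API_KEY"]))
# ['IMAGE_API_KEY', 'VIDEO_API_KEY']
```

Running a check like this before any task is created reports every missing key at once instead of one failed worker at a time.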
|
||||
|
||||
## File map
|
||||
|
||||
```
SKILL.md                    ← this file (workflow + rules)
references/
  intake.md                 ← discovery question banks per style
  role-archetypes.md        ← role library (writer, designer, animator, …)
  tool-matrix.md            ← skill + toolset mapping per role
  kanban-setup.md           ← setup script structure & profile config
  monitoring.md             ← watch + intervene patterns
  examples.md               ← six worked pipelines
assets/
  brief.md.tmpl             ← brief skeleton
  setup.sh.tmpl             ← setup script skeleton
  soul.md.tmpl              ← profile personality skeleton
scripts/
  bootstrap_pipeline.py     ← generate setup.sh from brief + team JSON
  monitor.py                ← polling + intervention helpers
```
|
||||
|
|
@@ -0,0 +1,79 @@
|
|||
# Video Brief — {{TITLE}}
|
||||
|
||||
> Slug: `{{SLUG}}` · Tenant: `{{TENANT}}` · Project workspace: `{{WORKSPACE}}`
|
||||
|
||||
## 1. Concept
|
||||
|
||||
**One-line pitch.** {{ONE_LINE_PITCH}}
|
||||
|
||||
**Emotional north star.** {{EMOTIONAL_NORTH_STAR}}
|
||||
*(What should the viewer feel walking away?)*
|
||||
|
||||
## 2. Scope
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| Duration | {{DURATION_S}} seconds |
|
||||
| Aspect ratio | {{ASPECT}} |
|
||||
| Resolution | {{RESOLUTION}} |
|
||||
| Frame rate | {{FPS}} fps |
|
||||
| Target platforms | {{PLATFORMS}} |
|
||||
| Deadline | {{DEADLINE}} |
|
||||
| Quality bar | {{QUALITY_BAR}} *(rough draft / polished / archival)* |
|
||||
|
||||
## 3. Style
|
||||
|
||||
**Visual references.** {{VISUAL_REFS}}
|
||||
|
||||
**Tone.** {{TONE}}
|
||||
|
||||
**Brand constraints.** {{BRAND_CONSTRAINTS}}
|
||||
*(colors, typography, motion language; or "n/a")*
|
||||
|
||||
**Aesthetic rules.**
|
||||
{{AESTHETIC_RULES}}
|
||||
|
||||
## 4. Scenes
|
||||
|
||||
Beat-by-beat breakdown. Each scene gets a row.
|
||||
|
||||
| # | Time | Content | Target tool / skill | Audio | Notes |
|
||||
|---|------|---------|---------------------|-------|-------|
|
||||
| 1 | 0:00–0:0X | {{SCENE_1_CONTENT}} | {{SCENE_1_TOOL}} | {{SCENE_1_AUDIO}} | {{SCENE_1_NOTES}} |
|
||||
| 2 | 0:0X–0:0Y | ... | ... | ... | ... |
|
||||
|
||||
## 5. Audio
|
||||
|
||||
**Approach.** {{AUDIO_APPROACH}}
|
||||
*(narration / music-only / synced to track / silent / mixed)*
|
||||
|
||||
**Voiceover.** {{VO_DETAILS}}
|
||||
*(provider, voice, language, script source — "n/a" if no VO)*
|
||||
|
||||
**Music.** {{MUSIC_DETAILS}}
|
||||
*(provided track path / commission via Suno / commission via heartmula /
|
||||
license-free / "n/a")*
|
||||
|
||||
**SFX.** {{SFX_DETAILS}}
|
||||
*(generated, library, or "n/a")*
|
||||
|
||||
## 6. Deliverables
|
||||
|
||||
| Format | Resolution | Notes |
|
||||
|--------|-----------|-------|
|
||||
| {{PRIMARY_FORMAT}} | {{PRIMARY_RES}} | The main output |
|
||||
| {{ALT_FORMAT_1}} | {{ALT_RES_1}} | {{ALT_NOTES_1}} |
|
||||
|
||||
**Final filename.** `output/final.mp4`
|
||||
*(plus optional `output/final-9x16.mp4`, `output/captions.srt`, etc.)*
|
||||
|
||||
## 7. Constraints
|
||||
|
||||
- API keys required: {{API_KEYS_REQUIRED}}
|
||||
- External dependencies: {{EXT_DEPS}}
|
||||
- Source assets to incorporate: {{SOURCE_ASSETS}}
|
||||
|
||||
---
|
||||
|
||||
**This brief is the contract. The director and every downstream profile read
|
||||
it. If the brief changes, the kanban must be re-fired — don't edit live.**
|
||||
|
|
@@ -0,0 +1,185 @@
|
|||
#!/usr/bin/env bash
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
# Video Pipeline Setup — {{TITLE}}
|
||||
#
|
||||
# Generated by kanban-video-orchestrator skill.
|
||||
#
|
||||
# Slug: {{SLUG}}
|
||||
# Workspace: {{WORKSPACE}}
|
||||
# Tenant: {{TENANT}}
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
set -euo pipefail
|
||||
|
||||
PROJECT_SLUG="{{SLUG}}"
|
||||
WORKSPACE="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
|
||||
TENANT="{{TENANT}}"
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 1. Verify required API keys
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Checking required API keys ═══"
|
||||
|
||||
check_key() {
  local var="$1"
  local kc_account="${2:-hermes}"
  local kc_service="${3:-$1}"
  if grep -q "^${var}=" "$HOME/.hermes/.env" 2>/dev/null && \
     [ -n "$(grep "^${var}=" "$HOME/.hermes/.env" | cut -d= -f2-)" ]; then
    echo " ✓ ${var} (env)"
    return 0
  fi
  if command -v security >/dev/null 2>&1 && \
     security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
    echo " ✓ ${var} (Keychain ${kc_account}/${kc_service})"
    return 0
  fi
  echo " ✗ ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
  return 1
}
|
||||
|
||||
# Customize this list per project — only check keys actually used:
|
||||
{{KEY_CHECKS}}
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 2. Create project workspace
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Creating project workspace ═══"
|
||||
mkdir -p "$WORKSPACE"/{taste,audio/{voiceover,sfx},assets,scenes,checkpoints,tools,output}
|
||||
{{SCENE_DIRS}}
|
||||
echo " ✓ $WORKSPACE"
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 3. Create Hermes profiles
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Creating Hermes profiles ═══"
|
||||
|
||||
{{PROFILE_CREATE_COMMANDS}}
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 4. Configure profiles (toolsets, always_load skills)
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Configuring profiles ═══"
|
||||
|
||||
configure_profile() {
  local profile="$1"
  local toolsets_json="$2"   # JSON array string, e.g. '["kanban","terminal","file"]'
  local skills_json="$3"     # JSON array string, e.g. '["kanban-worker","ascii-video"]'
  # Under `set -e` a failing command aborts the script before a later `$?`
  # test can run, so the failure branch is attached to the python3 call itself.
  if ! python3 - "$profile" "$toolsets_json" "$skills_json" "$WORKSPACE" <<'PY'
"""Patch a Hermes profile config.yaml using PyYAML so we don't depend on the
exact default-config string format. Validates the patch took effect and exits
non-zero if anything's off."""
import json
import os
import sys

try:
    import yaml
except ImportError:
    print("ERROR: PyYAML required. pip install pyyaml", file=sys.stderr)
    sys.exit(1)

profile, toolsets_json, skills_json, workspace = sys.argv[1:5]
toolsets = json.loads(toolsets_json)
skills = json.loads(skills_json)

p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
if not os.path.exists(p):
    print(f" ✗ profile config not found: {p}", file=sys.stderr)
    sys.exit(1)

with open(p, encoding="utf-8") as f:
    cfg = yaml.safe_load(f) or {}

# Apply our changes — only the keys we actually want to set.
cfg["toolsets"] = toolsets
# A plain setdefault would keep an explicit `skills: null`, so replace any
# non-mapping value before writing into it.
if not isinstance(cfg.get("skills"), dict):
    cfg["skills"] = {}
cfg["skills"]["always_load"] = skills

# Note: we do NOT touch cfg["approvals"] — that's a security-sensitive
# setting (manual confirmation of tool calls). Workspace cwd is overridden
# per-task by `--workspace dir:<path>` on `hermes kanban create`, so we
# don't need to mutate cfg["terminal"]["cwd"] either.

with open(p, "w", encoding="utf-8") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)

# Validate by re-reading the file we just wrote.
with open(p, encoding="utf-8") as f:
    after = yaml.safe_load(f)
errors = []
if after.get("toolsets") != toolsets:
    errors.append(f"toolsets mismatch: {after.get('toolsets')!r}")
if after.get("skills", {}).get("always_load") != skills:
    errors.append(f"skills.always_load mismatch: {after.get('skills', {}).get('always_load')!r}")
if errors:
    print(f" ✗ {profile}: " + "; ".join(errors), file=sys.stderr)
    sys.exit(1)
PY
  then
    echo " ✗ failed to configure ${profile}" >&2
    exit 1
  fi
  echo " ✓ ${profile}"
}
|
||||
|
||||
{{PROFILE_CONFIG_COMMANDS}}
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 5. Write SOUL.md per profile
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Writing profile personalities ═══"
|
||||
|
||||
{{SOUL_WRITES}}
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 6. Copy brief, TEAM.md, and any provided assets
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Writing brief + taste ═══"
|
||||
|
||||
cat > "$WORKSPACE/brief.md" <<'BRIEF_EOF'
|
||||
{{BRIEF_CONTENTS}}
|
||||
BRIEF_EOF
|
||||
|
||||
cat > "$WORKSPACE/TEAM.md" <<'TEAM_EOF'
|
||||
{{TEAM_CONTENTS}}
|
||||
TEAM_EOF
|
||||
|
||||
{{TASTE_WRITES}}
|
||||
|
||||
{{ASSET_COPIES}}
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# 7. Fire the initial kanban task
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
echo "═══ Firing initial kanban task ═══"
|
||||
|
||||
hermes kanban create "Direct production of {{TITLE}}" \
  --assignee director \
  --workspace dir:"$WORKSPACE" \
  --tenant "$TENANT" \
  --priority 2 \
  --max-runtime 4h \
  --body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.

Decompose into the team graph defined in TEAM.md.

All child tasks MUST use:
  workspace_kind="dir"
  workspace_path="$WORKSPACE"
  tenant="$TENANT"

Do not execute the work yourself — route every concrete subtask to the
appropriate profile via kanban_create.
EOF
)"
|
||||
|
||||
echo ""
|
||||
echo "═══ Setup complete ═══"
|
||||
echo ""
|
||||
echo "Monitor with:"
|
||||
echo " hermes kanban watch --tenant $TENANT"
|
||||
echo " hermes kanban list --tenant $TENANT"
|
||||
echo " hermes dashboard"
|
||||
echo ""
|
||||
echo "Workspace: $WORKSPACE"
|
||||
|
|
@@ -0,0 +1,38 @@
|
|||
# {{ROLE_NAME}}
|
||||
|
||||
You are the **{{ROLE_NAME}}** for this video production.
|
||||
|
||||
## Project context
|
||||
|
||||
- **Brief:** read `brief.md` in your CWD
|
||||
- **Team graph:** read `TEAM.md` in your CWD
|
||||
- **Style spec:** read `taste/brand-guide.md` and `taste/emotional-dna.md` in
|
||||
your CWD
|
||||
|
||||
## What you do
|
||||
|
||||
{{ROLE_RESPONSIBILITIES}}
|
||||
|
||||
## Inputs you read
|
||||
|
||||
{{INPUTS_READ}}
|
||||
|
||||
## Outputs you produce
|
||||
|
||||
{{OUTPUTS_PRODUCED}}
|
||||
|
||||
## Tools and skills available
|
||||
|
||||
- **Toolsets:** {{TOOLSETS}}
|
||||
- **Skills loaded:** {{SKILLS}}
|
||||
- **External APIs / CLIs:** {{EXTERNAL_TOOLS}}
|
||||
|
||||
## Rules
|
||||
|
||||
{{ROLE_RULES}}
|
||||
|
||||
{{COMMON_RULES}}
|
||||
|
||||
## Common reference commands
|
||||
|
||||
{{COMMON_COMMANDS}}
|
||||
|
|
@@ -0,0 +1,227 @@
|
|||
# Worked Examples
|
||||
|
||||
Six concrete pipelines covering different video styles. Each shows the team
|
||||
composition, task graph, and skill/tool choices the orchestrator would make
|
||||
for that brief. **These are illustrative, not templates** — adapt to the
|
||||
actual brief.
|
||||
|
||||
## Example 1 — Narrative short film (text-to-image → image-to-video → cut)
|
||||
|
||||
**Brief:** A 90-second noir-style short. A detective walks through a rainy
|
||||
city. Voiceover narration. AI-generated visuals.
|
||||
|
||||
**Team:**
|
||||
- `director` — vision, decomposition, approval
|
||||
- `writer` — script + voiceover copy (loads `humanizer` for natural voice)
|
||||
- `storyboarder` — beat-by-beat shot list (loads `excalidraw`)
|
||||
- `image-generator` — generates each shot's still via local ComfyUI workflows
|
||||
(loads `comfyui`)
|
||||
- `image-to-video-generator` — animates each still (Runway/Kling, OR
|
||||
ComfyUI's AnimateDiff/WAN workflows via `comfyui`)
|
||||
- `voice-talent` — narration via ElevenLabs
|
||||
- `audio-mixer` — VO + ambient pad
|
||||
- `editor` — assembly + transitions
|
||||
- `reviewer` — final QA
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 writer script + voiceover.md (parent: T0)
|
||||
T2 storyboarder shot list with framing per beat (parent: T1)
|
||||
T3 image-generator one still per shot (~12 shots) (parent: T2)
|
||||
T4 image-to-video-generator animate each still (parent: T3)
|
||||
T5 voice-talent generate narration audio (parent: T1)
|
||||
T6 audio-mixer mix VO + ambient (parent: T5)
|
||||
T7 editor cut + transitions + audio mux (parents: T4, T6)
|
||||
T8 reviewer final QA (parent: T7)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- Local ComfyUI via `comfyui` skill is preferred over external API for
|
||||
cost/control — but external APIs are fine if ComfyUI isn't installed
|
||||
- `editor` profile is ffmpeg-only, no Hermes skill required beyond
|
||||
`kanban-worker`
|
||||
- Storyboarder produces `storyboard.excalidraw` alongside the markdown
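
To see which tasks in the Example 1 graph can run at the same time, the parent relationships can be fed to a topological sort. The sketch below uses Python's standard `graphlib`; it is an illustration of the dependency structure, not how the kanban dispatches work, and the helper name `waves` is hypothetical.

```python
from graphlib import TopologicalSorter

# Task id -> set of parent task ids, copied from the Example 1 task graph.
graph = {
    "T0": set(),
    "T1": {"T0"},
    "T2": {"T1"},
    "T3": {"T2"},
    "T4": {"T3"},
    "T5": {"T1"},
    "T6": {"T5"},
    "T7": {"T4", "T6"},
    "T8": {"T7"},
}


def waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all parents done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


for wave in waves(graph):
    print(wave)
```

The visual chain (T2 → T3 → T4) and the audio chain (T5 → T6) share waves, which is where this graph parallelizes; everything else is sequential.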
|
||||
|
||||
## Example 2 — Product / marketing teaser
|
||||
|
||||
**Brief:** A 30-second product teaser for a developer tool. Shows code +
|
||||
terminal + UI screen recordings, voiceover, CTA at end. Square 1:1.
|
||||
|
||||
**Team:**
|
||||
- `director`
|
||||
- `copywriter` — taglines, voiceover script, CTA (loads `humanizer`)
|
||||
- `concept-artist` — style frames (loads `claude-design` for UI mockups)
|
||||
- `renderer-motion-graphics` — animated UI sequences (Remotion CLI)
|
||||
- `renderer-ascii` — terminal-style demo scenes (loads `ascii-video`)
|
||||
- `voice-talent` — VO via ElevenLabs
|
||||
- `editor` — assembly + brand-color treatment
|
||||
- `audio-mixer` — VO + light music bed
|
||||
- `captioner` — burned subtitles for muted-autoplay platforms
|
||||
- `masterer` — produces 1:1 + 9:16 + 16:9 variants
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 copywriter copy.md + cta + vo script (parent: T0)
|
||||
T2 concept-artist visual-spec.md + style frames (parent: T1)
|
||||
T3a renderer-motion-graphics scene 1: UI sequence (parent: T2)
|
||||
T3b renderer-ascii scene 2: terminal demo (parent: T2)
|
||||
T3c renderer-motion-graphics scene 3: feature highlight (parent: T2)
|
||||
T3d renderer-motion-graphics scene 4: CTA card (parent: T2)
|
||||
T4 voice-talent narration (parent: T1)
|
||||
T5 audio-mixer VO + music bed (parent: T4)
|
||||
T6 editor cut + transitions (parents: T3*, T5)
|
||||
T7 captioner SRT + burned subtitles (parent: T6)
|
||||
T8 masterer 1:1, 9:16, 16:9 variants (parent: T7)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- Multiple specialized renderers (motion-graphics + ASCII) coexist
|
||||
- Captioner is included because muted autoplay is the norm on social
|
||||
- `claude-design` skill for UI mockups maps directly to the product video idiom
|
||||
|
||||
## Example 3 — Music video (synced to provided track)
|
||||
|
||||
**Brief:** A 3-minute music video for a provided lo-fi hip-hop track. Visuals
|
||||
should pulse with the beat. Generative + ASCII hybrid. Vertical 9:16.
|
||||
|
||||
**Team:**
|
||||
- `director`
|
||||
- `music-supervisor` — analyze track, emit `audio/beats.json` (loads `songsee`)
|
||||
- `storyboarder` — beat-aligned shot list (loads `excalidraw`)
|
||||
- `renderer-ascii` — ASCII scenes synced to bass kicks (loads `ascii-video`)
|
||||
- `renderer-p5js` — generative particle scenes synced to highs (loads `p5js`)
|
||||
- `editor` — beat-cut assembly using `beats.json`
|
||||
- `reviewer` — sync QA
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 music-supervisor analyze track → beats.json + spectrogram (parent: T0)
|
||||
T2 storyboarder shot list aligned to beats (parents: T1, T0)
|
||||
T3a renderer-ascii scene 1: bass-driven ASCII (parent: T2)
|
||||
T3b renderer-p5js scene 2: high-end particle field (parent: T2)
|
||||
... (more scenes)
|
||||
T4 editor cut to beats + mux track (parents: T3*, T1)
|
||||
T5 reviewer sync QA + final approval (parent: T4)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- `music-supervisor` runs FIRST — `beats.json` gates the renderers
|
||||
- `editor` uses `beats.json` directly to align cuts to bass kicks
|
||||
- No voice-talent — music is the audio
|
||||
- Two specialized renderers (`ascii-video` + `p5js`) for visual variety
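
Aligning cuts to beats can be as simple as snapping each planned cut time to the nearest beat. The sketch below assumes the beat data has already been loaded as a sorted list of timestamps in seconds; the real `beats.json` schema is not specified here, and `snap_to_beat` is a hypothetical helper.

```python
from bisect import bisect_left


def snap_to_beat(t: float, beats: list[float]) -> float:
    """Return the beat timestamp closest to t (beats must be sorted)."""
    if not beats:
        raise ValueError("no beats to snap to")
    i = bisect_left(beats, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = beats[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda b: abs(b - t))


beats = [0.0, 0.5, 1.0, 1.5]          # assumed: seconds, ascending
planned_cuts = [0.7, 1.4, 2.0]
print([snap_to_beat(t, beats) for t in planned_cuts])  # [0.5, 1.5, 1.5]
```

Two planned cuts can snap to the same beat, as the last two do here, so a real editor pass would also need to drop or re-space duplicates.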
|
||||
|
||||
## Example 4 — Math/algorithm explainer
|
||||
|
||||
**Brief:** A 2-minute explainer of an algorithm. 3Blue1Brown-style. Animated
|
||||
diagrams, equations, narration. Square 1:1.
|
||||
|
||||
**Team:**
|
||||
- `director`
|
||||
- `writer` — narration script (loads `humanizer`)
|
||||
- `cinematographer` — visual spec (loads `manim-video`)
|
||||
- `renderer-manim` — all animated scenes (loads `manim-video`)
|
||||
- `voice-talent` — narration via ElevenLabs
|
||||
- `editor` — assembly + audio mux
|
||||
- `captioner` — burned subtitles
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 writer script + narration (parent: T0)
|
||||
T2 cinematographer visual spec for all scenes (parent: T1)
|
||||
T3a..T3n renderer-manim scenes 1..N (parent: T2)
|
||||
T4 voice-talent narration audio (parent: T1)
|
||||
T5 editor cut + mux (parents: T3*, T4)
|
||||
T6 captioner SRT + burn (parent: T5)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- `manim-video` skill drives both the cinematographer (visual language) and
|
||||
the renderer (actual scene production)
|
||||
- The `manim-video` skill's reference docs (animation-design-thinking,
|
||||
scene-planning, equations) auto-load when needed via the renderer's pinned skill
|
||||
|
||||
## Example 5 — ASCII video, music-track-only
|
||||
|
||||
**Brief:** A 60-second pure-ASCII video reactive to an existing track. No
|
||||
voiceover, no other tools. Square 1:1.
|
||||
|
||||
**Team:**
|
||||
- `director`
|
||||
- `music-supervisor` — track analysis (loads `songsee`)
|
||||
- `renderer-ascii` — all visuals (loads `ascii-video`)
|
||||
- `editor` — assembly + audio mux
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 music-supervisor analyze track (parent: T0)
|
||||
T2a renderer-ascii scene 1 (parents: T1, T0)
|
||||
T2b renderer-ascii scene 2 (parents: T1, T0)
|
||||
T2c renderer-ascii scene 3 (parents: T1, T0)
|
||||
T3 editor stitch + mux audio (parents: T2*)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- Minimal team (4 profiles) for a focused single-tool project
|
||||
- No reviewer — short experimental piece, director approves directly
|
||||
- All scenes run through one `renderer-ascii` profile because the `ascii-video`
|
||||
skill covers everything
|
||||
|
||||
This example illustrates the rule: **don't over-decompose**. Three scenes
|
||||
through one renderer is fine. Don't spawn three renderer profiles.
|
||||
|
||||
## Example 6 — Real-time / installation art
|
||||
|
||||
**Brief:** A 2-minute audio-reactive visual for a gallery installation. Driven
|
||||
by an audio input feed. TouchDesigner-based. 16:9 4K.
|
||||
|
||||
**Team:**
|
||||
- `director`
|
||||
- `cinematographer` — visual language spec (loads `touchdesigner-mcp`)
|
||||
- `renderer-touchdesigner` — all visuals + record-to-disk
|
||||
(loads `touchdesigner-mcp`)
|
||||
- `audio-mixer` — final loudness pass on the captured audio (optional if
|
||||
pre-mixed source)
|
||||
- `editor` — assemble final clip from TouchDesigner recording
|
||||
- `reviewer` — visual QA
|
||||
|
||||
**Task graph:**
|
||||
```
|
||||
T0 director decompose
|
||||
T1 cinematographer TD operator graph spec (parent: T0)
|
||||
T2 renderer-touchdesigner build TD network + record output (parent: T1)
|
||||
T3 editor trim + audio mux (parent: T2)
|
||||
T4 reviewer final QA (parent: T3)
|
||||
```
|
||||
|
||||
**Key choices:**
|
||||
- `touchdesigner-mcp` controls a running TouchDesigner instance — the
|
||||
cinematographer designs the operator graph, renderer builds it
|
||||
- Output is a recording from the running TD network, not a render-to-frames
|
||||
process; editor mostly just trims
|
||||
|
||||
## Pattern recognition
|
||||
|
||||
When the user describes a video, look for these signals to map to an example:
|
||||
|
||||
- **Plot, characters, scripted dialogue** → Example 1 (narrative)
|
||||
- **Specific product, CTA, brand colors, voiceover** → Example 2 (marketing)
|
||||
- **Track file provided, "synced to music"** → Example 3 (music video)
|
||||
- **"Explain how X works", math/algorithm/concept walkthrough** → Example 4 (manim explainer)
|
||||
- **Terminal aesthetic, ASCII, character grid** → Example 5 (ASCII)
|
||||
- **"Audio-reactive", "real-time", "installation"** → Example 6 (TouchDesigner)
|
||||
- **Comic-style narrative** → use `renderer-comic` (`baoyu-comic` skill)
|
||||
- **Retro game / pixel-art aesthetic** → use `renderer-pixel` (`pixel-art` skill)
|
||||
- **3D scene, photoreal environment** → use `renderer-3d` (`blender-mcp`)
|
||||
- **Generative art, particle system, shader** → use `renderer-p5js` (`p5js`)
|
||||
- **AI-generated photoreal stills + animation** → use `renderer-comfyui`
|
||||
(`comfyui`) for both stills and image-to-video
|
||||
- **"video about how the system works", recursive demo** → composable from
|
||||
any of the above; the recursion is a rendering technique, not a style
|
||||
|
||||
The actual team should be derived from the specific brief — these examples are
|
||||
starting points, not endpoints.
|
||||
|
|
@ -0,0 +1,166 @@
|
|||
# Intake — Discovery Question Banks
|
||||
|
||||
The discovery process is **adaptive**. Always start with three baseline
|
||||
questions to identify the broad style category, then drill into a per-style
|
||||
question bank. Ask 2-4 questions at a time, listen, then proceed. Make
|
||||
reasonable assumptions whenever the user implies an answer.
|
||||
|
||||
## Tier 0 — Baseline (always ask)
|
||||
|
||||
1. **What is the video?** — One-sentence pitch
|
||||
2. **How long?** — Approximate duration
|
||||
3. **Aspect ratio + target platform?** — 16:9 / 9:16 / 1:1 / 4:5; X, IG, YouTube, internal, etc.
|
||||
|
||||
From these answers, classify the style category and pick the relevant Tier 1
|
||||
follow-ups. **Do not** move on to Tier 1 until all three are answered.
|
||||
|
||||
## Style classification
|
||||
|
||||
Map the brief to one of these archetypes (or a hybrid):
|
||||
|
||||
| Archetype | Tells |
|-----------|-------|
| **Narrative film** | Plot, characters, scenes-with-events, dialogue, location |
| **Product / marketing** | A specific product or feature being shown / sold; CTA at end |
| **Music video** | A specific track exists; visuals sync to music |
| **Explainer / educational** | A concept being taught; voiceover-driven |
| **Tutorial / changelog** | Software demo, terminal-heavy, technical |
| **ASCII / terminal art** | Retro terminal aesthetic explicit, character-grid |
| **Abstract / loop** | Generative, no plot, often perfect-loop |
| **Documentary / interview cut** | Real footage, transcription-driven |
| **Real-time / installation** | Audio-reactive, gallery installation, VJ output |
|
||||
|
||||
If ambiguous, **ask** which category fits — don't guess. Hybrids are common
|
||||
(e.g., a product video with a narrative arc); decompose into the dominant
|
||||
mode + secondary modifiers.
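
As a rough illustration of the dominant-mode-plus-modifiers idea, the sketch below scores a brief against a handful of keyword tells and returns a dominant archetype with secondary modifiers. The keyword lists, the archetype keys, and the `classify` function are all hypothetical and condensed from the table above; a tie returns no dominant archetype, which is the cue to ask the user rather than guess.

```python
# Hypothetical keyword tells, condensed from the archetype table above.
ARCHETYPE_TELLS = {
    "narrative": ["plot", "character", "dialogue", "scene"],
    "marketing": ["product", "cta", "brand", "launch"],
    "music-video": ["track", "song", "synced to music", "beat"],
    "explainer": ["explain", "concept", "how it works", "teach"],
    "ascii": ["ascii", "terminal aesthetic", "character grid"],
    "realtime": ["audio-reactive", "installation", "real-time", "vj"],
}


def classify(brief):
    """Return (dominant archetype or None, remaining candidates by score)."""
    text = brief.lower()
    # Count how many tells of each archetype appear as substrings of the brief.
    scores = {
        name: sum(tell in text for tell in tells)
        for name, tells in ARCHETYPE_TELLS.items()
    }
    ranked = sorted((n for n, s in scores.items() if s > 0), key=lambda n: -scores[n])
    if not ranked:
        return None, []            # nothing matched: ask the baseline questions
    if len(ranked) > 1 and scores[ranked[0]] == scores[ranked[1]]:
        return None, ranked        # tie: ambiguous, ask which category fits
    return ranked[0], ranked[1:]   # dominant mode + secondary modifiers
```

Substring matching is deliberately crude; it only narrows which Tier 1 bank to open and does not replace asking.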
|
||||
|
||||
**Recursive / meta** ("a video that shows its own production") is a
|
||||
*rendering technique*, not a separate style — compose it from any of the
|
||||
above by adding a two-pass render step where pass 2 uses pass 1's output as
|
||||
texture inside the final scene.
|
||||
|
||||
## Tier 1 — Per-style follow-ups
|
||||
|
||||
### Narrative film
|
||||
|
||||
- **Setting / world?** — When and where the story takes place
|
||||
- **Characters?** — How many, archetypes, who carries dialogue
|
||||
- **Beat list or full script?** — Has the user written the story or do we draft it
|
||||
- **Dialogue language?** — Spoken lines, on-screen subs only, silent
|
||||
- **Visual generation approach?** — Text-to-image (FAL/Midjourney/Imagen) →
|
||||
image-to-video (Runway/Kling), 3D animation (Blender), 2D animation,
|
||||
procedural, or hybrid
|
||||
- **Voice approach?** — TTS (which voice), recorded VO, no dialogue
|
||||
- **Music / score?** — Commissioned (via `songwriting-and-ai-music` Suno
|
||||
prompts, or local `heartmula`), licensed track provided, silent
|
||||
|
||||
### Product / marketing
|
||||
|
||||
- **Product?** — Name, what it does, key feature being shown
|
||||
- **Target audience?** — Who's watching, what they care about
|
||||
- **CTA?** — Visit URL, install, sign up, etc.
|
||||
- **Tone?** — Serious, playful, technical, premium, edgy
|
||||
- **Brand assets available?** — Logo files, color palette, fonts, existing footage
|
||||
- **Animation style?** — Motion graphics (Remotion / AE-style), screen recording,
|
||||
generative, illustrated
|
||||
- **Voiceover?** — Yes (which voice / language) or text-only
|
||||
- **Music?** — Track provided, license-free needed, custom-composed
|
||||
|
||||
### Music video
|
||||
|
||||
- **Track file?** — Path to the audio (essential — we'll analyze BPM + beats)
|
||||
- **Track length to use?** — Full song or a section
|
||||
- **Genre / energy?** — Tells what visual rhythm and density to use
|
||||
- **Lyric / narrative content?** — Are there lyrics to render on screen,
|
||||
or is it purely visual?
|
||||
- **Visual reference style?** — Existing music videos / artists for reference
|
||||
- **Performer footage?** — None, has clips, will provide
|
||||
- **Visual generation approach?** — Per-beat generative, edit-driven cuts of stock
|
||||
footage, illustrated, hybrid
|
||||
|
||||
### Explainer / educational
|
||||
|
||||
- **What concept is being taught?** — One-sentence concept, key takeaway
|
||||
- **Audience expertise?** — Beginner / intermediate / expert
|
||||
- **Diagram density?** — Heavy math / formulas / code / abstract concepts
|
||||
- **Voiceover?** — TTS / recorded / on-screen text only
|
||||
- **Tool preference?** — `manim-video` (math), `p5js` (generative),
|
||||
Remotion (UI motion graphics), `comfyui` (AI-generated visuals),
|
||||
`ascii-video` (technical/retro), hybrid
|
||||
- **Pacing?** — Fast and dense (3Blue1Brown) or slow and contemplative
|
||||
|
||||
### Tutorial / changelog / software demo
|
||||
|
||||
- **Software being demonstrated?** — Name, what it does
|
||||
- **Demo script?** — Sequence of commands / screens to show
|
||||
- **Terminal-only or with GUI?**
|
||||
- **Voiceover for narration?**
|
||||
- **Diagram support needed?** — Often these benefit from a diagram skill
|
||||
alongside the screen-capture/render step (`excalidraw`,
|
||||
`architecture-diagram`, `concept-diagrams`)
|
||||
|
||||
### ASCII / terminal art
|
||||
|
||||
- **Source material?** — Generative / driven by audio / converting existing
|
||||
video / static image starting point
|
||||
- **Color palette?** — Brand-driven (gold/black/blue), Matrix green, full
|
||||
rainbow, monochrome
|
||||
- **Audio reactivity?** — None / loose mood / tight beat sync / FFT-driven
|
||||
- **Character set?** — ASCII only / Unicode block-drawing / mystic glyphs
|
||||
- **Loop or narrative?** — Perfect loop or one-shot
|
||||
|
||||
### Abstract / loop
|
||||
|
||||
- **Mood / emotion?** — One word that captures the feel
|
||||
- **Motion type?** — Zoom-into-itself, particle drift, wave, geometric, organic
|
||||
- **Loop required?** — Perfect loop (Droste-style) or just satisfying ending
|
||||
- **Audio?** — Silent, ambient pad, beat-synced
|
||||
|
||||
### Documentary / interview cut
|
||||
|
||||
- **Source footage?** — Provided clips, length per clip
|
||||
- **Transcript / subtitles?** — Provided or to be generated
|
||||
- **Story structure?** — Chronological / thematic / arc
|
||||
- **B-roll approach?** — Generated, stock library, none
|
||||
|
||||
### Real-time / installation
|
||||
|
||||
- **Output environment?** — Gallery wall, projector, screen, web embed
|
||||
- **Audio source?** — Live audio input, pre-recorded track, both
|
||||
- **Reactivity tightness?** — Mood-level (loose) vs. tight beat-sync vs. live
|
||||
parameter control
|
||||
- **Tool preference?** — `touchdesigner-mcp` for full TD operator graphs;
|
||||
`p5js` for web-canvas; `comfyui` for generative-AI fed by audio features
|
||||
|
||||
## Tier 2 — Always ask near the end
|
||||
|
||||
- **Brand assets path?** — Where logo / color palette / fonts / music library lives
|
||||
- **Output format requirements?** — Codec preference, target file size, accepted
|
||||
alternates (vertical cut, GIF, audio-only)
|
||||
- **Deadline?** — Affects task `max_runtime_seconds` and acceptable scope
|
||||
- **Quality bar?** — Rough draft for review / polished final / archival
|
||||
- **Existing footage / assets to reuse?** — Anything that should appear, not just inform
|
||||
|
||||
## Reasonable assumption defaults
|
||||
|
||||
When the user under-specifies, fill in these defaults rather than asking:
|
||||
|
||||
| Question | Default |
|----------|---------|
| Frame rate | 30 fps for X / IG; 60 fps for tutorials/explainers; 24 fps for narrative film |
| Resolution | 1080×1080 for square, 1920×1080 for 16:9, 1080×1920 for 9:16 |
| Codec | H.264 / yuv420p, CRF 18 |
| Audio codec | AAC 192 kbps |
| Voice | Provider's mid-range neutral voice unless brand calls for distinctive timbre |
| Music | Silent (require user to specify if music is wanted) |
| Captions | On for explainer/tutorial; off for narrative/abstract unless requested |
| Quality bar | Polished final unless user says draft |
|
||||
|
||||
State the assumption explicitly: *"Assuming 30fps and AAC audio unless you say otherwise — proceed?"*
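
One way to apply these defaults mechanically is sketched below. The brief keys (`kind`, `aspect`, `fps`, and so on), the `fill_defaults` helper, and the 30 fps fallback for an unknown kind are assumptions made for illustration; only the default values themselves come from the table above.

```python
# Values taken from the defaults table; key names are hypothetical.
FPS_BY_KIND = {"social": 30, "tutorial": 60, "explainer": 60, "narrative": 24}
RESOLUTION_BY_ASPECT = {
    "1:1": (1080, 1080),
    "16:9": (1920, 1080),
    "9:16": (1080, 1920),
}


def fill_defaults(brief):
    """Return (completed brief, list of keys that were assumed)."""
    filled = dict(brief)
    defaults = {
        "fps": FPS_BY_KIND.get(brief.get("kind"), 30),  # 30 is an assumed fallback
        "resolution": RESOLUTION_BY_ASPECT.get(brief.get("aspect")),
        "video_codec": "h264/yuv420p crf18",
        "audio_codec": "aac 192k",
        "music": "silent",
        "quality": "polished",
    }
    assumed = []
    for key, value in defaults.items():
        # Never override what the user said; skip defaults we cannot derive.
        if key not in filled and value is not None:
            filled[key] = value
            assumed.append(key)
    return filled, assumed
```

The `assumed` list is what gets read back to the user in the confirmation sentence, so nothing is filled in silently.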
|
||||
|
||||
## Anti-patterns
|
||||
|
||||
- **Asking 10 questions at once.** Maximum 4 per turn.
|
||||
- **Asking for things the brief already implies.** If the user said "music video for my track," do not ask "is there a track?"
|
||||
- **Failing to classify before drilling in.** Tier-1 questions depend on classification; mixing them up wastes turns.
|
||||
- **Treating "make a video" as enough to proceed.** Always get answers to the three baseline questions first.
|
||||
|
|
@ -0,0 +1,276 @@
|
|||
# Kanban Setup — Project Bootstrap & Profile Configuration
|
||||
|
||||
Once the brief is locked and the team is designed, the next step is producing
|
||||
the actual `setup.sh` that creates the project workspace, configures Hermes
|
||||
profiles, and fires the initial kanban task.
|
||||
|
||||
This file documents the patterns. The companion script
|
||||
`scripts/bootstrap_pipeline.py` automates most of it from a structured input
|
||||
JSON.
|
||||
|
||||
> **Credit:** the single-project-workspace layout, profile-config patching
|
||||
> approach, SOUL.md-per-profile convention, and `--workspace dir:<path>` rule
|
||||
> are adapted from alt-glitch's original multi-agent video pipeline:
|
||||
> [NousResearch/kanban-video-pipeline](https://github.com/NousResearch/kanban-video-pipeline).
|
||||
> This skill generalizes those patterns across video styles and replaces the
|
||||
> string-replacement config patcher with a PyYAML-based one.
|
||||
|
||||
## Project workspace structure
|
||||
|
||||
Every video project gets one workspace under `~/projects/video-pipeline/<slug>/`:
|
||||
|
||||
```
|
||||
~/projects/video-pipeline/<slug>/
|
||||
├── brief.md ← the contract; all tasks reference
|
||||
├── TEAM.md ← team composition + task graph (director reads this)
|
||||
├── taste/
|
||||
│ ├── brand-guide.md ← color, typography, motion rules
|
||||
│ ├── emotional-dna.md ← what the piece should FEEL like
|
||||
│ └── style-frames/ ← optional: visual references
|
||||
├── audio/
|
||||
│ ├── track.mp3 ← provided music (if any)
|
||||
│ ├── voiceover/ ← per-line TTS clips
|
||||
│ └── sfx/ ← sound effects
|
||||
├── assets/
|
||||
│ ├── logos/
|
||||
│ ├── fonts/
|
||||
│ └── existing-footage/ ← reusable provided clips
|
||||
├── scenes/
|
||||
│ ├── scene-01/
|
||||
│ │ ├── VISUAL_SPEC.md ← cinematographer's per-scene spec
|
||||
│ │ ├── render.py ← renderer's code (or sketch.html, etc.)
|
||||
│ │ ├── checkpoints/ ← preview frames for QA
|
||||
│ │ └── clip.mp4 ← the deliverable for this scene
|
||||
│ ├── scene-02/...
|
||||
│ └── ...
|
||||
├── checkpoints/ ← global review frames
|
||||
├── tools/ ← optional project-local helpers
|
||||
└── output/
|
||||
├── final.mp4 ← stitched + audio
|
||||
├── final-noaudio.mp4
|
||||
├── final-9x16.mp4 ← optional: vertical alternate
|
||||
└── captions.srt ← optional: subtitle file
|
||||
```
|
||||
|
||||
**The slug** is derived from the brief title: lowercase, hyphen-separated.
|
||||
Example: `q3-product-teaser`, `ascii-mood-loop`, `interview-cut-2026-q1`.
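
A minimal sketch of that slug rule, assuming ASCII titles (the `slugify` name is illustrative and not taken from `bootstrap_pipeline.py`):

```python
import re


def slugify(title):
    """Lowercase the title and join alphanumeric runs with single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")  # drop hyphens left by leading/trailing punctuation
```

Non-ASCII characters are simply dropped here; a project with non-Latin titles would need transliteration first.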
|
||||
|
||||
## The setup.sh script
|
||||
|
||||
The setup script does six things in order:
|
||||
|
||||
1. **Create workspace tree** — all directories above
|
||||
2. **Create profiles** — `hermes profile create <name> --clone`
|
||||
3. **Configure profiles** — patch each profile's
|
||||
`~/.hermes/profiles/<name>/config.yaml` to set `toolsets` and
`skills.always_load` (`terminal.cwd` is left alone; see below)
|
||||
4. **Write SOUL.md per profile** — the personality + role definition
|
||||
5. **Copy any provided assets + write `brief.md`, `TEAM.md`, and `taste/`**
|
||||
6. **Fire the initial kanban task** — `hermes kanban create` assigned to the director
|
||||
|
||||
See `assets/setup.sh.tmpl` for the skeleton.
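
Step 1 can be sketched in a few lines of Python. This is an illustration under assumptions, not the template's actual code: `WORKSPACE_DIRS` and `create_workspace` are hypothetical names, and the directory list is copied from the workspace tree shown earlier.

```python
from pathlib import Path

# Directories from the workspace tree; files such as brief.md are written later.
WORKSPACE_DIRS = [
    "taste/style-frames",
    "audio/voiceover",
    "audio/sfx",
    "assets/logos",
    "assets/fonts",
    "assets/existing-footage",
    "scenes",
    "checkpoints",
    "tools",
    "output",
]


def create_workspace(root, scene_count=0):
    """Create the project tree under root; safe to call repeatedly."""
    root = Path(root)
    for rel in WORKSPACE_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for n in range(1, scene_count + 1):
        # scene-01, scene-02, ... each with its own checkpoints/ folder
        (root / "scenes" / f"scene-{n:02d}" / "checkpoints").mkdir(
            parents=True, exist_ok=True
        )
    return root
```

Using `exist_ok=True` keeps this step idempotent, in line with the re-runnable setup script.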
|
||||
|
||||
### Profile creation pattern
|
||||
|
||||
```bash
|
||||
hermes profile create director --clone 2>/dev/null || true
|
||||
```
|
||||
|
||||
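The `--clone` flag clones from the active profile (preserving model, base
config). The `|| true` makes the script idempotent: re-running won't error if
the profile already exists. Note that `2>/dev/null || true` also hides genuine
failures, so confirm the result afterwards with `hermes profile list`.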
|
||||
|
||||
### Profile config patching
|
||||
|
||||
Each profile has a YAML config at `~/.hermes/profiles/<name>/config.yaml`. The
|
||||
setup script edits exactly two keys:
|
||||
|
||||
1. `toolsets:` — replace the default with the role's required toolsets
|
||||
2. `skills.always_load:` — list the role's must-load skills (may be empty)
|
||||
|
||||
**Do NOT** modify `approvals.mode` (controls user-confirmation of tool calls
|
||||
— a security setting that must stay as the user configured it). **Do NOT**
|
||||
modify `terminal.cwd` — the kanban dispatcher overrides cwd per-task via
|
||||
`--workspace dir:<path>`, so the profile's cwd is irrelevant to the kanban
|
||||
work and changing it could break the user's interactive use of the profile.
|
||||
|
||||
Use **PyYAML**, not string replacement, so the patch is robust against
|
||||
default-config schema drift:
|
||||
|
||||
```bash
configure_profile() {
  local profile="$1"
  local toolsets_json="$2"   # JSON array, e.g. '["kanban","terminal","file"]'
  local skills_json="$3"     # JSON array, e.g. '["kanban-worker","ascii-video"]'
  python3 - "$profile" "$toolsets_json" "$skills_json" <<'PY'
import json, os, sys, yaml
profile, ts_json, sk_json = sys.argv[1:4]
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
with open(p) as f:
    cfg = yaml.safe_load(f) or {}
cfg["toolsets"] = json.loads(ts_json)
cfg.setdefault("skills", {})["always_load"] = json.loads(sk_json)
with open(p, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
PY
}
```
|
||||
|
||||
PyYAML must be installed in the user's Python (it ships with most Hermes
|
||||
installs). If absent: `pip install pyyaml`.
|
||||
|
||||
The setup script should also **validate** the patch by re-reading the file
|
||||
and comparing — see `assets/setup.sh.tmpl` for the validation pattern.
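
The check could look roughly like the sketch below. It is not the template's actual validation code: it works on already-parsed config dicts (for example the result of `yaml.safe_load`), and the helper names `apply_patch`, `unexpected_changes`, and `patch_is_valid` are hypothetical.

```python
import copy


def apply_patch(cfg, toolsets, skills):
    """Return a patched copy; only toolsets and skills.always_load change."""
    patched = copy.deepcopy(cfg)
    patched["toolsets"] = list(toolsets)
    skills_section = patched.get("skills") or {}  # tolerate `skills: null`
    skills_section["always_load"] = list(skills)
    patched["skills"] = skills_section
    return patched


def unexpected_changes(before, after):
    """Top-level keys that differ, other than the two the patch may touch."""
    allowed = {"toolsets", "skills"}
    keys = (set(before) | set(after)) - allowed
    return sorted(k for k in keys if before.get(k) != after.get(k))


def patch_is_valid(before, after, toolsets, skills):
    """True when both keys hold the requested values and nothing else moved."""
    return (
        after.get("toolsets") == list(toolsets)
        and (after.get("skills") or {}).get("always_load") == list(skills)
        and not unexpected_changes(before, after)
    )
```

Comparing against a snapshot taken before the write makes it easy to confirm that `approvals.mode` and `terminal.cwd` were left untouched.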
|
||||
|
||||
### SOUL.md per profile
|
||||
|
||||
Each profile gets a `SOUL.md` at `~/.hermes/profiles/<name>/SOUL.md` that
|
||||
defines its role, voice, and rules. See `assets/soul.md.tmpl` for the
|
||||
template. Customize per role and per project.
|
||||
|
||||
The director's SOUL.md should be the most opinionated — its voice flavors
|
||||
the entire production. **Critical content for the director's SOUL.md:**
|
||||
|
||||
- **Anti-temptation rules:** "Do not execute the work yourself. For every
|
||||
concrete task, create a kanban task and assign it. Decompose, route, comment,
|
||||
approve — that's the whole job." (The `kanban-orchestrator` skill provides
|
||||
the deeper playbook; load it.)
|
||||
- **Decomposition steps:** Read `brief.md`, `TEAM.md`, `taste/`. Use the team
|
||||
graph in `TEAM.md` to fan out tasks.
|
||||
- **The workspace_path rule** (see below).
|
||||
|
||||
Other profiles' SOUL.md is briefer; mostly mechanical: who you are, what you
|
||||
read, what you produce, what skills/tools to use, where to write outputs.
|
||||
Most non-director profiles should `always_load: kanban-worker` for the
|
||||
deeper-than-baseline kanban guidance.
|
||||
|
||||
### Initial kanban task
|
||||
|
||||
The final action of setup.sh is firing the kanban:
|
||||
|
||||
```bash
hermes kanban create "Direct production of <video title>" \
  --assignee director \
  --workspace dir:"$HOME/projects/video-pipeline/${PROJECT_SLUG}" \
  --tenant "${PROJECT_SLUG}" \
  --priority 2 \
  --max-runtime 4h \
  --body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
  workspace_kind="dir"
  workspace_path="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
  tenant="${PROJECT_SLUG}"
EOF
)"
```
|
||||
|
||||
The `--workspace dir:<path>` flag is **critical**: it runs the director in the
shared project directory, and the task body instructs every child task to use
the same one. Skipping it or using `worktree` will isolate profiles and break
artifact sharing.
|
||||
|
||||
## The TEAM.md file
|
||||
|
||||
Alongside `brief.md`, write a `TEAM.md` that the director reads. It documents
|
||||
the team composition + task graph the orchestrator should follow. This
|
||||
removes ambiguity and prevents the director from inventing extra steps.
|
||||
|
||||
Example structure (for an ASCII video with a music supervisor and editor):
|
||||
|
||||
```markdown
|
||||
# Team & Task Graph — <video title>
|
||||
|
||||
## Team
|
||||
|
||||
- `director` (this profile) — vision, decomposition, approval
|
||||
- `cinematographer` — visual spec, quality review (loads `ascii-video`)
|
||||
- `renderer-ascii` — ASCII scenes (loads `ascii-video`)
|
||||
- `music-supervisor` — track analysis (loads `songsee`)
|
||||
- `voice-talent` — narration (uses ElevenLabs API)
|
||||
- `audio-mixer` — final mix (ffmpeg)
|
||||
- `editor` — assembly (ffmpeg)
|
||||
- `reviewer` — final QA gate
|
||||
|
||||
## Task Graph
|
||||
|
||||
T0: this task — decompose
|
||||
│
|
||||
├── T1: cinematographer "Design visual language" (parent: T0)
|
||||
│ │
|
||||
│ ├── T2a: renderer-ascii "Scene 1 — title card" (parent: T1)
|
||||
│ ├── T2b: renderer-ascii "Scene 2 — main beat" (parent: T1)
|
||||
│ ├── T2c: renderer-ascii "Scene 3 — outro" (parent: T1)
|
||||
│
|
||||
├── T3: music-supervisor "Analyze track + emit beats.json" (parent: T0)
|
||||
│
|
||||
├── T4: voice-talent "Generate narration" (parent: T0)
|
||||
│
|
||||
├── T5: audio-mixer "Mix VO + bg music" (parents: T3, T4)
|
||||
│
|
||||
├── T6: editor "Assemble cut + mux audio" (parents: T2*, T5)
|
||||
│
|
||||
└── T7: reviewer "Final QA" (parent: T6)
|
||||
```
|
||||
|
||||
The director turns this into actual `kanban_create` calls.
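
Before fanning out, it can help to sanity-check the graph. The sketch below encodes the example graph above as a dict of task to parents, expands the `T2*` shorthand, and derives a creation order in which parents always precede children. The dict layout and function names are illustrative assumptions; `graphlib` is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter

# task -> parents, transcribed from the example TEAM.md graph
TASKS = {
    "T0": [],
    "T1": ["T0"],
    "T2a": ["T1"],
    "T2b": ["T1"],
    "T2c": ["T1"],
    "T3": ["T0"],
    "T4": ["T0"],
    "T5": ["T3", "T4"],
    "T6": ["T2*", "T5"],
    "T7": ["T6"],
}


def expand_parents(tasks):
    """Resolve 'T2*' style wildcards and reject parents that do not exist."""
    expanded = {}
    for task, parents in tasks.items():
        resolved = []
        for parent in parents:
            if parent.endswith("*"):
                prefix = parent[:-1]
                matches = sorted(
                    t for t in tasks if t.startswith(prefix) and t != task
                )
                if not matches:
                    raise ValueError(f"{task}: no tasks match {parent}")
                resolved.extend(matches)
            elif parent not in tasks:
                raise ValueError(f"{task}: unknown parent {parent}")
            else:
                resolved.append(parent)
        expanded[task] = resolved
    return expanded


def creation_order(tasks):
    """Order in which tasks can be created so parent ids already exist."""
    # TopologicalSorter takes node -> predecessors and raises CycleError on loops.
    return list(TopologicalSorter(expand_parents(tasks)).static_order())
```

A `CycleError` or `ValueError` here points at a miswired `TEAM.md` before any task is created.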
|
||||
|
||||
## API-key prerequisites check
|
||||
|
||||
Before firing the kanban, verify required keys are available. Check both
|
||||
`~/.hermes/.env` and macOS Keychain (if on macOS):
|
||||
|
||||
```bash
check_key() {
  local var="$1"
  local kc_account="$2"
  local kc_service="$3"
  if grep -q "^${var}=" ~/.hermes/.env 2>/dev/null && \
     [ -n "$(grep "^${var}=" ~/.hermes/.env | cut -d= -f2-)" ]; then
    return 0
  fi
  if command -v security >/dev/null 2>&1 && \
     security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
    return 0
  fi
  echo "ERROR: ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
  return 1
}

check_key ELEVENLABS_API_KEY hermes ELEVENLABS_API_KEY || exit 1
check_key OPENROUTER_API_KEY hermes OPENROUTER_API_KEY || exit 1
# ...
```
|
||||
|
||||
If a key is missing, the script aborts with a clear message rather than
|
||||
firing a kanban that will hit credential errors mid-execution.
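
If the check lives in a Python bootstrap script instead, the `.env` half could be sketched as below. The function names are hypothetical, the Keychain lookup is omitted, and unlike the shell version this sketch treats a quoted empty value (`KEY=""`) as missing.

```python
def env_has_key(env_text, var):
    """True if var is assigned a non-empty value in .env-style text."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not assignments
        name, _, value = line.partition("=")
        if name.strip() == var and value.strip().strip("'\"") != "":
            return True
    return False


def missing_keys(env_text, required):
    """Names from required that are absent or empty, in the given order."""
    return [var for var in required if not env_has_key(env_text, var)]
```

Reporting every missing key at once saves the user from fixing them one abort at a time.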
|
||||
|
||||
## Critical rules
|
||||
|
||||
1. **`workspace_kind="dir"` + `workspace_path="<absolute>"` on every kanban_create.** Otherwise profiles can't share artifacts.
|
||||
|
||||
2. **Tenant every task.** `--tenant <project-slug>` keeps the dashboard scoped
|
||||
and prevents cross-pollination with other ongoing kanbans.
|
||||
|
||||
3. **Idempotency keys.** For tasks that should not duplicate on re-run (e.g.,
|
||||
setup creating profiles), use the `idempotency_key` argument or check
|
||||
existence first.
|
||||
|
||||
4. **`max_runtime_seconds` per task.** Renderers that get stuck eat compute.
|
||||
Standard defaults:
|
||||
- Renderer task: 1800s (30min)
|
||||
- Editor task: 600s (10min)
|
||||
- Voice-talent task: 300s (5min)
|
||||
- Image-generator task: 600s (10min)
|
||||
- Image-to-video-generator task: 900s (15min)
|
||||
|
||||
5. **Heartbeats for long renders.** Tasks expected to run >5min should emit
|
||||
`kanban_heartbeat` periodically with progress. Renderers should report
|
||||
frame counts; the editor should report assembly progress.
|
||||
|
||||
6. **The `audio/` and `taste/` dirs are populated BEFORE firing the kanban.**
|
||||
Don't ask the director's pipeline to source these — copy at setup time.
|
||||
|
||||
7. **`brief.md` is read-only after setup.** If the brief changes during
|
||||
execution, that's a significant pivot — re-fire the kanban rather than edit
|
||||
live.
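
Rules 4 and 5 can be captured as a small lookup, sketched here under assumptions: the matching by profile-name prefix, the 600-second fallback for unlisted roles, and the helper names are illustrative; only the per-role values and the 5-minute heartbeat threshold come from the rules above.

```python
# Defaults from rule 4 (seconds).
MAX_RUNTIME_SECONDS = {
    "renderer": 1800,
    "editor": 600,
    "voice-talent": 300,
    "image-generator": 600,
    "image-to-video-generator": 900,
}
HEARTBEAT_THRESHOLD_SECONDS = 300  # rule 5: tasks expected to run >5 min


def max_runtime_for(profile, fallback=600):
    """Look up the cap for a profile; 'renderer-ascii' maps to 'renderer'."""
    matches = [
        role
        for role in MAX_RUNTIME_SECONDS
        if profile == role or profile.startswith(role + "-")
    ]
    if not matches:
        return fallback  # assumed default for roles the list does not cover
    return MAX_RUNTIME_SECONDS[max(matches, key=len)]  # longest match wins


def needs_heartbeat(expected_seconds):
    """True when the task is expected to outlast the heartbeat threshold."""
    return expected_seconds > HEARTBEAT_THRESHOLD_SECONDS
```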
|
||||
|
|
@ -0,0 +1,180 @@
|
|||
# Monitoring — Watch the Pipeline + Intervene
|
||||
|
||||
After `setup.sh` fires the kanban, the work runs autonomously. The role of
|
||||
this skill in the execution phase is to help the user (and the AI overseeing
|
||||
the session) detect problems early and intervene effectively.
|
||||
|
||||
## Live monitoring commands
|
||||
|
||||
```bash
|
||||
# Live event stream — task spawns, status changes, heartbeats, completions
|
||||
hermes kanban watch --tenant <project-slug>
|
||||
|
||||
# Snapshot of the board
|
||||
hermes kanban list --tenant <project-slug>
|
||||
hermes kanban list --tenant <project-slug> --json # machine-readable
|
||||
|
||||
# Per-status counts + oldest-ready age
|
||||
hermes kanban stats --tenant <project-slug>
|
||||
|
||||
# Visual dashboard (browser)
|
||||
hermes dashboard
|
||||
|
||||
# Inspect a specific task (includes comments + events)
|
||||
hermes kanban show <task-id>
|
||||
|
||||
# Follow a single task's event stream
|
||||
hermes kanban tail <task-id>
|
||||
```
|
||||
|
||||
Verify available subcommands with `hermes kanban --help` — the kanban CLI
|
||||
ships with `init / create / list / show / assign / link / unlink / claim /
|
||||
comment / complete / block / unblock / archive / tail / dispatch / watch /
|
||||
stats / heartbeat / log / runs / context / gc`.
|
||||
|
||||
The companion `scripts/monitor.py` polls the kanban via the CLI and surfaces
|
||||
common issues (stuck tasks, missing heartbeats, repeated retries, dependency
|
||||
deadlocks).
|
||||
|
||||
## What to watch for
|
||||
|
||||
### Healthy pipeline indicators
|
||||
|
||||
- Tasks transition `READY → RUNNING → DONE` in roughly the expected order
|
||||
- Renderers emit periodic `kanban_heartbeat` events with progress (e.g. "frame
|
||||
240/720")
|
||||
- Each task's runtime is well under its `max_runtime_seconds` cap
|
||||
- No task accumulates more than 1 retry
|
||||
- Dependency arrows resolve (children unblock as parents complete)
|
||||
|
||||
### Warning signs
|
||||
|
||||
| Symptom | Likely cause | Action |
|---------|--------------|--------|
| Task RUNNING but no heartbeat in 2+ min | Worker stuck, infinite loop, blocked on input | Run `hermes kanban show <id>` and read the worker's last events. The dispatcher SIGTERMs tasks that exceed their `max-runtime`; if you need to stop one earlier, `hermes kanban block <id>` then `hermes kanban archive <id>`, and create a re-run task. |
| Same task retried 2+ times | Reproducible failure (missing key, bad spec, broken tool) | `hermes kanban show <id>` to read failure events. Fix root cause before re-running. |
| RUNNING longer than max_runtime | Task is slow but progressing OR genuinely stuck | Check heartbeats with `hermes kanban tail <id>`. If progressing, the dispatcher will SIGTERM it eventually anyway, so raise `max-runtime` on a re-created task. |
| Child task READY but parents still RUNNING for >2× expected | Cascade slow, dependency miswired | Check the dependency graph. Inspect the parent: sometimes it completed but its handoff fields (summary, metadata) were empty so the child has nothing to consume. |
| New tasks not appearing | Director is hung in decomposition | Inspect director task with `kanban show`. Often a malformed `kanban_create` call. |
| Specialist tasks completing instantly | Decomposition created tasks without bodies | Director didn't pass enough context. Re-create with explicit body content. |
| Tasks created but never picked up | Profile not running, or tenant mismatch, or dispatcher not running | Check `hermes profile list` (profile exists?), `hermes status` (gateway/dispatcher up?), and verify tenant. |
| Specific renderer task fails → review note → renderer redoes → fails again | Brief is asking for the impossible | Pivot the brief, not the renderer. |
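
The first three rows of the table lend themselves to a mechanical check, sketched below. The task fields (`status`, `started_at`, `last_heartbeat_at`, `max_runtime_seconds`, `retries`) are assumed names for illustration; the real shape of `hermes kanban list --json` output is not documented here and may differ.

```python
def diagnose(task, now):
    """Return warning strings for one task; `now` and timestamps are epoch seconds."""
    warnings = []
    if task["status"] == "RUNNING":
        last_beat = task.get("last_heartbeat_at")
        # With no heartbeat yet, measure silence from the task's start time.
        reference = last_beat if last_beat is not None else task["started_at"]
        if now - reference >= 120:
            warnings.append("no heartbeat in 2+ min")
        cap = task.get("max_runtime_seconds")
        if cap is not None and now - task["started_at"] > cap:
            warnings.append("over max_runtime")
    if task.get("retries", 0) >= 2:
        warnings.append("retried 2+ times")
    return warnings
```

A check like this only flags candidates; the action column still calls for reading the task's events before intervening.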
|
||||
|
||||
## Intervention recipes
|
||||
|
||||
### Rejecting bad output
|
||||
|
||||
When a renderer ships a clip that doesn't pass review:
|
||||
|
||||
```bash
# 1. Comment on the renderer's task with specific feedback
hermes kanban comment <renderer-task-id> "Scene 3 looks too sparse; \
increase visual density. Tighten color palette to brand spec."

# 2. Create a re-render task with the original as parent
hermes kanban create "Scene 3: re-render with feedback" \
  --assignee renderer-ascii \
  --parent <renderer-task-id> \
  --workspace dir:"$HOME/projects/video-pipeline/<slug>" \
  --tenant <slug> \
  --skill ascii-video \
  --max-runtime 30m
```
|
||||
|
||||
### Adding a new dependency mid-flight
|
||||
|
||||
When the editor needs an asset that wasn't originally planned (e.g., a captions
|
||||
file):
|
||||
|
||||
```bash
# 1. Create the new task and capture its id
NEW_TASK_ID=$(hermes kanban create "Generate SRT captions from voiceover" \
  --assignee captioner \
  --workspace dir:"$HOME/projects/video-pipeline/<slug>" \
  --tenant <slug> \
  --json | python3 -c "import json,sys;print(json.load(sys.stdin)['id'])")

# 2. Wire it as a parent of the editor's task with `kanban link`
hermes kanban link "$NEW_TASK_ID" <editor-task-id>
```
|
||||
|
||||
`kanban link` takes `parent_id child_id` (parent first). Use `kanban unlink`
|
||||
to remove a dependency.
|
||||
|
||||
### Stopping a worker that's stuck
|
||||
|
||||
The kanban dispatcher will SIGTERM (then SIGKILL) any task that exceeds its
|
||||
`--max-runtime` automatically. To stop one sooner:
|
||||
|
||||
```bash
|
||||
# Mark blocked so the dispatcher leaves it alone, then archive
|
||||
hermes kanban block <task-id>
|
||||
hermes kanban archive <task-id>
|
||||
|
||||
# Diagnose what happened
|
||||
hermes kanban show <task-id> # task body, comments, recent events
|
||||
hermes kanban tail <task-id> # follow the live event stream
|
||||
hermes kanban log <task-id> # worker process log
|
||||
```
|
||||
|
||||
After stopping, decide: fix root cause + re-create the task, or skip and
|
||||
adjust dependent tasks.
|
||||
|
||||
### Pivoting the brief
|
||||
|
||||
If during execution the user wants something fundamentally different:
|
||||
|
||||
1. Stop the active director task and all RUNNING children (`hermes kanban block <id>`, then `hermes kanban archive <id>`)
|
||||
2. Edit `brief.md` and `TEAM.md`
|
||||
3. Re-fire the initial `hermes kanban create` for the director
|
||||
|
||||
Don't try to "edit while running" — the kanban's audit trail makes a clean
|
||||
pivot more legible than mid-stream changes.
|
||||
|
||||
## Periodic check-in script
|
||||
|
||||
A simple polling pattern for hands-off monitoring:
|
||||
|
||||
```bash
while true; do
  clear
  hermes kanban list --tenant <slug>
  echo "---"
  hermes kanban stats --tenant <slug>
  sleep 30
done
```
|
||||
|
||||
For a live event feed, run `hermes kanban watch --tenant <slug>` in a
|
||||
separate terminal — it streams task lifecycle events as they happen.
|
||||
|
||||
For automated intervention (auto-restart stuck tasks, auto-create re-render on
|
||||
review failure), see the `scripts/monitor.py` patterns.
|
||||
|
||||
## When to call it done
|
||||
|
||||
The pipeline is finished when:
|
||||
|
||||
1. All RENDER tasks complete and pass review
|
||||
2. The editor's `output/final.mp4` exists and `ffprobe` confirms expected
|
||||
duration + streams
|
||||
3. The reviewer (if present) has approved
|
||||
4. Optional masterer variants exist
|
||||
|
||||
At this point, present the final.mp4 path to the user along with any review
|
||||
notes. Do NOT delete the workspace — the user may want to iterate on a single
|
||||
scene without re-running the whole pipeline.
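
The `ffprobe` confirmation in step 2 could be scripted roughly as follows. The sketch parses the JSON produced by `ffprobe -v error -print_format json -show_format -show_streams output/final.mp4`; the `check_final` name and the half-second tolerance are assumptions, not part of the skill.

```python
import json


def check_final(ffprobe_json, expected_seconds, tolerance=0.5, require_audio=True):
    """Return a list of problems found in ffprobe's JSON report (empty = OK)."""
    info = json.loads(ffprobe_json)
    problems = []
    kinds = {stream.get("codec_type") for stream in info.get("streams", [])}
    if "video" not in kinds:
        problems.append("no video stream")
    if require_audio and "audio" not in kinds:
        problems.append("no audio stream")
    # ffprobe reports format.duration as a string of seconds.
    duration = float(info.get("format", {}).get("duration", 0.0))
    if abs(duration - expected_seconds) > tolerance:
        problems.append(
            f"duration {duration:.2f}s, expected {expected_seconds:.2f}s"
        )
    return problems
```

For a deliberately silent piece, pass `require_audio=False` so the missing audio stream is not flagged.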
|
||||
|
||||
## Common gotchas
|
||||
|
||||
- **Tenant mismatches.** A task created with the wrong tenant won't appear in
|
||||
monitoring. Always pass `--tenant <slug>` consistently.
|
||||
- **Profile process not running.** Tasks queue indefinitely in READY if no
|
||||
worker for that profile is online. Check `hermes profile list` and start
|
||||
any missing profiles.
|
||||
- **Workspace permissions.** All profiles need read+write to the workspace
|
||||
directory. `chmod -R u+rw <workspace>` if any worker reports permission
|
||||
errors.
|
||||
- **Audio/visual sync.** The editor's clip stitching must match the
|
||||
renderer's actual output durations. Don't hardcode scene durations in
|
||||
the editor — read from the renderer's handoff metadata.
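
For the audio/visual sync point, a minimal sketch of deriving the timeline from handoff metadata instead of hardcoded durations is shown below. The `clip` and `duration_seconds` fields are hypothetical; the actual handoff schema depends on what the renderers write.

```python
def scene_offsets(handoffs):
    """Return (clip, start, end) tuples laid end to end in handoff order."""
    offsets = []
    cursor = 0.0
    for handoff in handoffs:
        duration = float(handoff["duration_seconds"])  # measured by the renderer
        offsets.append((handoff["clip"], cursor, cursor + duration))
        cursor += duration
    return offsets
```

The end of the last tuple is the expected length of the stitched cut, which is also the number to compare against the audio track.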
|
||||
|
|
@ -0,0 +1,298 @@
|
|||
# Role Archetypes
|
||||
|
||||
The library of role archetypes for video production. **Compose a team from this
|
||||
list, don't clone a fixed roster.** Most videos need 4-7 profiles. The director
|
||||
is always present; everything else is conditional on the brief.
|
||||
|
||||
Each role's profile name is by convention `kebab-case` (e.g. `creative-director`,
|
||||
`image-generator`). Multiple instances of the same role get descriptive suffixes
|
||||
when they need different focus (e.g., `renderer-ascii`, `renderer-3d`).
|
||||
|
||||
For toolset + skill mapping per role, see [tool-matrix.md](tool-matrix.md).
|
||||
|
||||
## Always present
|
||||
|
||||
### director
|
||||
|
||||
The vision-holder. Reads the brief and brand guide, decomposes into a task
|
||||
graph, comments to steer creative direction, approves the final cut.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-orchestrator`. The kanban plugin auto-injects baseline
|
||||
orchestration guidance for free; `kanban-orchestrator` is the deeper
|
||||
decomposition playbook. Add `creative-ideation` if the brief is wide-open
|
||||
and needs framing help.
|
||||
- **Personality:** Tied to the brand voice — see `assets/soul.md.tmpl`
|
||||
|
||||
The director has the same toolsets as most workers (kanban, terminal, file),
but its `SOUL.md` rules **forbid** execution. The "decompose, don't execute"
discipline is enforced by personality and the `kanban-orchestrator` skill, not
by missing tools.
|
||||
|
||||
## Pre-production roles
|
||||
|
||||
Pick based on what the brief needs.
|
||||
|
||||
### writer / screenwriter
|
||||
|
||||
Writes scripts, dialogue, voiceover copy, narration. Use for any video with
|
||||
spoken or written words beyond a tagline.
|
||||
|
||||
- **Toolsets:** kanban, file
|
||||
- **Skills:** `kanban-worker`, `humanizer` (post-process to strip AI-tells)
|
||||
- **Outputs:** `script.md`, `narration.md`, `dialogue/scene-NN.md`
|
||||
|
||||
### copywriter
|
||||
|
||||
Like `writer` but specifically for marketing copy: taglines, CTAs, voiceover
|
||||
scripts for product videos.
|
||||
|
||||
- **Toolsets:** kanban, file
|
||||
- **Skills:** `kanban-worker`, `humanizer`
|
||||
- **Outputs:** `copy.md`
|
||||
|
||||
### concept-artist / visual-designer
|
||||
|
||||
Develops the visual identity: mood board, style frames, color palette
|
||||
rationale, typography choices. Produces a `visual-spec.md` that all generators
|
||||
follow. Often produces still reference frames using image-generation APIs or
|
||||
local skills.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker` plus any project-specific design skill —
|
||||
`claude-design` (UI/web), `sketch` (quick mockup variants),
|
||||
`popular-web-designs` (matching known web aesthetic), `pixel-art` (retro),
|
||||
`ascii-art` (terminal/retro), `excalidraw` (hand-drawn frames),
|
||||
`design-md` (text-based design docs)
|
||||
- **Outputs:** `visual-spec.md`, `taste/style-frames/*.png`
|
||||
|
||||
### storyboarder
|
||||
|
||||
Maps the brief to a beat-by-beat shot list with timing. Critical for narrative
|
||||
film and music video. Often pairs with a diagramming tool.
|
||||
|
||||
- **Toolsets:** kanban, file
|
||||
- **Skills:** `kanban-worker` plus a diagram skill — `excalidraw` (sketch),
|
||||
`architecture-diagram` (technical/system), `concept-diagrams` (educational/
|
||||
scientific)
|
||||
- **Outputs:** `storyboard.md` with one row per scene/shot, optional
|
||||
storyboard sketches
|
||||
|
||||
### cinematographer / dp
|
||||
|
||||
Designs the visual language: framing, color, motion, transitions. Reviews
|
||||
generator output for visual consistency. Hands off per-scene `VISUAL_SPEC.md`.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker` plus the visual skill that matches the project
|
||||
(e.g., `ascii-video` for ASCII work, `manim-video` for explainers,
|
||||
`touchdesigner-mcp` for real-time visuals, etc.)
|
||||
- **Outputs:** `scenes/scene-NN/VISUAL_SPEC.md`, review comments on renderer
|
||||
tasks
|
||||
- **Reviews via:** any media-analysis approach (Gemini multimodal, manual
|
||||
inspection of clip thumbnails, ffprobe summaries)
|
||||
|
||||
## Production roles
|
||||
|
||||
### renderer (generic)
|
||||
|
||||
A worker that produces visual content for one or more scenes. Loaded with
|
||||
whichever creative skill fits the scene's style. Multiple renderers can run in
|
||||
parallel, each pinned to a different skill via `always_load` in their profile
|
||||
or `--skill` on the task.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** one creative skill (see specialized variants below)
|
||||
- **Outputs:** `scenes/scene-NN/clip.mp4`
|
||||
|
||||
### Specialized renderer variants
|
||||
|
||||
When scenes need very different tools, create specialized renderer profiles
|
||||
instead of overloading one. Each loads a different creative skill.
|
||||
|
||||
| Variant | Skill | Best for |
|---------|-------|----------|
| `renderer-ascii` | `ascii-video` | Terminal aesthetic, retro pixel, audio-reactive grid, video-to-ASCII conversion |
| `renderer-manim` | `manim-video` | Math, algorithms, 3Blue1Brown-style explainers, equation derivations |
| `renderer-p5js` | `p5js` | Generative art, particles, shaders, organic motion, web-canvas content |
| `renderer-comfyui` | `comfyui` | AI-generated stills + video using local ComfyUI workflows (img-to-img, img-to-video, etc.) |
| `renderer-touchdesigner` | `touchdesigner-mcp` | Real-time, audio-reactive, installation art, VJ-style content |
| `renderer-3d` | `blender-mcp` *(optional)* | 3D modeling, animation, photoreal environments, character animation |
| `renderer-pixel` | `pixel-art` | Retro game aesthetic with era-correct palettes |
| `renderer-comic` | `baoyu-comic` | Knowledge-comic style narrative scenes |
| `renderer-meme` | `meme-generation` *(optional)* | Meme-style stills for satirical/social content |
| `renderer-procedural` | (none; Python with PIL + ffmpeg directly) | Custom procedural content where no skill fits |
| `renderer-video` | (external image-to-video API: Runway / Kling / Luma) | Animating still images in narrative film |
| `renderer-motion-graphics` | (external: Remotion CLI) | Motion graphics, kinetic typography, UI animations |
|
||||
|
||||
For external-API renderers, the profile holds the API client logic; only
|
||||
`kanban-worker` is loaded, plus the terminal toolset and the API key.
|
||||
|
||||
### image-generator
|
||||
|
||||
Specifically for text-to-image generation. Often produces stills that go to
|
||||
`renderer-video` for animation.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`, optionally `comfyui` (drives a local
|
||||
ComfyUI install for image generation)
|
||||
- **External APIs (alternative to local ComfyUI):** FAL, Replicate, OpenAI
|
||||
Images, Midjourney
|
||||
- **Outputs:** `scenes/scene-NN/stills/*.png`
|
||||
|
||||
### image-to-video-generator
|
||||
|
||||
Takes still images and animates them via Runway/Kling/Luma APIs, or via
|
||||
ComfyUI's image-to-video workflows locally. Almost always follows
|
||||
`image-generator` in narrative film pipelines.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`, optionally `comfyui` (for local image-to-video
|
||||
workflows like AnimateDiff or WAN)
|
||||
- **External APIs:** Runway, Kling, Luma, Pika
|
||||
- **Outputs:** `scenes/scene-NN/clip.mp4`
|
||||
|
||||
### music-supervisor
|
||||
|
||||
Sources, analyzes, and prepares the music track. For music videos, also
|
||||
produces a beat/BPM map and key-moment timestamps. Uses `songsee` for
|
||||
spectrograms when the editor or renderer needs a visual reference of the
|
||||
audio's energy.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`, `songsee` (audio visualization), plus one of:
  - `songwriting-and-ai-music`: when commissioning lyrics + Suno prompts
  - `heartmula`: when generating music with the open-source local model
  - `spotify`: when sourcing existing tracks
|
||||
- **Outputs:** `audio/track.mp3`, `audio/beats.json`, optional
|
||||
`audio/track-spectrogram.png`
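
A beat map for a constant-tempo track can be sketched as below. This is only an illustration: the function names and the JSON layout of `audio/beats.json` are hypothetical (the schema is not defined in this document), and real tracks with tempo changes would need actual beat detection rather than arithmetic.

```python
from __future__ import annotations

import json


def beat_times(bpm: float, duration_s: float, offset_s: float = 0.0) -> list[float]:
    """Timestamps (in seconds) of every beat, assuming a constant tempo."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    interval = 60.0 / bpm
    times = []
    n = 0
    while offset_s + n * interval < duration_s:
        times.append(round(offset_s + n * interval, 3))
        n += 1
    return times


def beats_document(bpm: float, duration_s: float) -> str:
    """Serialize a beat map; the key names here are an assumed layout."""
    return json.dumps({"bpm": bpm, "beats": beat_times(bpm, duration_s)})
```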
|
||||
|
||||
### voice-talent / narrator
|
||||
|
||||
Generates voiceover audio. Calls a TTS API directly; no Hermes skill required
|
||||
beyond `kanban-worker`. The user can also supply pre-recorded VO instead of
|
||||
generation.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **External APIs:** ElevenLabs, OpenAI TTS, etc.
|
||||
- **Outputs:** `audio/voiceover/line-NN.mp3`, `audio/voiceover/timeline.mp3`
|
||||
|
||||
### foley / sfx-designer
|
||||
|
||||
Sound effects and ambient design. Often optional unless the brief calls for
|
||||
sound design specifically.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`, `songsee` for audio-feature visualization when
|
||||
designing to a track
|
||||
- **Outputs:** `audio/sfx/*.mp3`
|
||||
|
||||
## Post-production roles
|
||||
|
||||
### editor
|
||||
|
||||
Assembles the final cut from clips. Uses ffmpeg for stitching, fades,
|
||||
transitions. Reviews each clip for pacing and quality before assembly.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **External tools:** ffmpeg, ffprobe
|
||||
- **Outputs:** `output/final.mp4`, `output/final-noaudio.mp4`
|
||||
|
||||
### colorist
|
||||
|
||||
Color grading. Usually optional — if the renderers already produce
|
||||
brand-consistent output and the editor just stitches, the colorist is overkill.
|
||||
Worth including for narrative film with hero shots.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **Outputs:** `output/final-graded.mp4`
|
||||
|
||||
### audio-mixer
|
||||
|
||||
Mixes voiceover + music + SFX into a final audio track. Sets levels, ducks
|
||||
music under VO, normalizes loudness (LUFS).
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **External tools:** ffmpeg with `loudnorm` filter, optional `sox`
|
||||
- **Outputs:** `audio/final-mix.mp3`
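
A loudness-normalization step could look like the sketch below. It builds a single-pass ffmpeg `loudnorm` command; the helper name `loudnorm_cmd` is hypothetical, and the default targets (-16 LUFS integrated, -1.5 dBTP, LRA 11) are assumed example values, not values prescribed by this skill.

```python
from __future__ import annotations


def loudnorm_cmd(src: str, dst: str, target_lufs: float = -16.0,
                 true_peak_db: float = -1.5, lra: float = 11.0) -> list[str]:
    """Build an ffmpeg command that applies single-pass loudnorm to src."""
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={lra}"
    # -y overwrites dst if it already exists
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Ducking music under VO is a separate filter step and is not covered here.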
|
||||
|
||||
### captioner
|
||||
|
||||
Burns subtitles into the video, generates SRT, handles accessibility. Can also
|
||||
generate captions from audio via Whisper.
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **External tools:** Whisper (CLI or API), ffmpeg subtitle filters
|
||||
- **Outputs:** `output/captions.srt`, `output/final-captioned.mp4`
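
Writing `output/captions.srt` from timed text can be sketched as follows. The input shape, a list of `(start_s, end_s, text)` tuples, is an assumption for illustration (for example, derived from a speech-to-text result); `srt_timestamp` and `to_srt` are hypothetical helper names.

```python
from __future__ import annotations


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    h, rem = divmod(ms_total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render numbered SRT cues, separated by blank lines."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)
```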
|
||||
|
||||
### masterer
|
||||
|
||||
Final encode + format variants. Produces deliverables for each platform target
|
||||
(square for IG, vertical for TikTok, full HD for YouTube, etc.).
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **Outputs:** `output/final-1080.mp4`, `output/final-9x16.mp4`, etc.
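
Deriving a platform variant from the master usually means a centered crop to the target aspect ratio. The sketch below computes that crop and formats it as an ffmpeg `crop=w:h:x:y` filter string; `center_crop` and `crop_filter` are hypothetical names, and rounding down to even dimensions is an assumption made for encoder compatibility.

```python
from __future__ import annotations


def center_crop(width: int, height: int,
                aspect_w: int, aspect_h: int) -> tuple[int, int, int, int]:
    """Largest centered crop (w, h, x, y) with the target aspect ratio."""
    if width * aspect_h > height * aspect_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Source is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    crop_w -= crop_w % 2  # round down to even dimensions
    crop_h -= crop_h % 2
    return crop_w, crop_h, (width - crop_w) // 2, (height - crop_h) // 2


def crop_filter(width: int, height: int, aspect_w: int, aspect_h: int) -> str:
    """Format the crop as an ffmpeg video filter argument."""
    w, h, x, y = center_crop(width, height, aspect_w, aspect_h)
    return f"crop={w}:{h}:{x}:{y}"
```

The result would typically be followed by a `scale` filter to reach the platform's delivery resolution.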
|
||||
|
||||
## QA roles
|
||||
|
||||
### reviewer
|
||||
|
||||
A neutral quality gate. Reads the brief, watches the cut, comments
|
||||
specifically on what's off (pacing, sync, brand alignment, technical
|
||||
quality). Distinct from the cinematographer (who reviews visuals during
|
||||
production) and the editor (who reviews for assembly).
|
||||
|
||||
- **Toolsets:** kanban, terminal, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **External tools:** any media-analysis approach (Gemini multimodal,
|
||||
ffprobe, manual frame extraction)
|
||||
- **Outputs:** `review-notes.md`, comments on tasks
|
||||
|
||||
### brand-cop
|
||||
|
||||
Reviews specifically for brand compliance — colors, typography, voice. Use
|
||||
when the brand guidelines are detailed and a generic reviewer might miss
|
||||
violations.
|
||||
|
||||
- **Toolsets:** kanban, file
|
||||
- **Skills:** `kanban-worker`
|
||||
- **Outputs:** comments + `brand-review.md`
|
||||
|
||||
## Composing teams — heuristics
|
||||
|
||||
- **Always:** director + at least one renderer + editor.
|
||||
- **Add writer** if scripted dialogue / narration / on-screen text exceeds a
|
||||
tagline.
|
||||
- **Add storyboarder** if the brief has more than 5 distinct beats and the
|
||||
director hasn't already laid out a beat list.
|
||||
- **Add cinematographer** if multiple renderer instances need consistent
|
||||
visual language. (For a single-tool video, the renderer's own skill spec
|
||||
is enough.)
|
||||
- **Add image-generator + image-to-video-generator pair** for narrative film
|
||||
with photorealistic visuals.
|
||||
- **Add music-supervisor** when music is provided and rhythm matters
|
||||
(music videos always; explainers sometimes).
|
||||
- **Add voice-talent** for any voiceover / narrative dialogue.
|
||||
- **Add audio-mixer** when there are 2+ audio sources (VO + music, music + SFX).
|
||||
- **Add captioner** for accessibility-priority projects (explainer, tutorial,
|
||||
any platform that defaults to muted playback).
|
||||
- **Add reviewer** for high-stakes projects. Skip for quick experimental loops.
|
||||
- **Add masterer** when multiple platform deliverables are needed.
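
The heuristics above can be sketched as a small function. This is an illustration only: `compose_team` and every key of the `brief` dict (`has_script`, `beats`, `renderer_variants`, and so on) are hypothetical names, not a schema defined by this skill.

```python
from __future__ import annotations


def compose_team(brief: dict) -> list[str]:
    """Map brief traits to role names, following the heuristics listed above."""
    team = ["director", "renderer", "editor"]  # always present
    if brief.get("has_script"):
        team.append("writer")
    if brief.get("beats", 0) > 5 and not brief.get("director_has_beat_list"):
        team.append("storyboarder")
    if brief.get("renderer_variants", 1) > 1:
        team.append("cinematographer")
    if brief.get("photoreal_narrative"):
        team += ["image-generator", "image-to-video-generator"]
    if brief.get("rhythm_matters"):
        team.append("music-supervisor")
    if brief.get("voiceover"):
        team.append("voice-talent")
    if brief.get("audio_sources", 0) >= 2:
        team.append("audio-mixer")
    if brief.get("muted_playback") or brief.get("accessibility_priority"):
        team.append("captioner")
    if brief.get("high_stakes"):
        team.append("reviewer")
    if brief.get("platform_targets", 1) > 1:
        team.append("masterer")
    return team
```

In practice the orchestrator applies these rules by judgment during discovery; the function just makes the conditions explicit.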
|
||||
|
||||
## Anti-patterns
|
||||
|
||||
- **One renderer doing everything.** If scenes use very different tools
|
||||
(ASCII + 3D + motion graphics), use specialized renderer variants. The
|
||||
renderer loads ONE creative skill at a time; mixing styles in a single
|
||||
renderer causes thrashing.
|
||||
- **A separate profile per scene.** No. Profiles are per-role, not per-scene.
|
||||
Eight scenes use one or two renderer profiles, not eight.
|
||||
- **A "general" profile that does everything.** Worse than no specialization.
|
||||
The kanban routing breaks down if every task fits every profile.
|
||||
- **No reviewer for important deliverables.** Saves an hour of pipeline time
|
||||
but ships flaws.
|
||||
|
|
@ -0,0 +1,305 @@
|
|||
# Tool Matrix — Skills + Toolsets per Role
|
||||
|
||||
Maps each role archetype to the Hermes skills it should `always_load` and the
|
||||
toolsets it needs. Only references skills that ship in the public hermes-agent
|
||||
repository (under `skills/` or `optional-skills/`). External APIs and CLIs are
|
||||
called from the terminal toolset; they don't appear in `always_load`.
|
||||
|
||||
## Hermes skills relevant to video production
|
||||
|
||||
### Visual / rendering skills (`hermes-agent/skills/creative/`)
|
||||
|
||||
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `ascii-video` | Production pipeline for ASCII art video: generative, audio-reactive, video-to-ASCII | Renderer for ASCII / terminal / retro pixel content; cinematographer for ASCII projects |
| `ascii-art` | Static ASCII art generation | Concept artist for ASCII style frames; secondary tool for ASCII renderer |
| `manim-video` | Manim CE animations: math, algorithms, 3Blue1Brown-style explainers | Renderer for math, algorithm walkthroughs, technical concept explainers |
| `p5js` | p5.js sketches: generative art, shaders, interactive, 3D | Renderer for generative art, particle systems, organic motion, web-canvas content |
| `comfyui` | Generate images, video, audio with ComfyUI workflows (image-to-image, image-to-video, etc.) | image-generator, image-to-video-generator, or general renderer for AI-generated content |
| `touchdesigner-mcp` | Control a running TouchDesigner instance: real-time visuals, audio-reactive installation art, VJ | Renderer for real-time/audio-reactive content; installation art; live performance |
| `blender-mcp` *(optional)* | Control Blender 4.3+ via MCP: 3D modeling, animation, rendering | Renderer for 3D scenes, photoreal environments, character animation |
| `pixel-art` | Pixel art with era palettes (NES, Game Boy, PICO-8) | Renderer for retro game aesthetic; concept artist for pixel-style frames |
| `baoyu-comic` | Knowledge-comic generation (educational, biography, tutorial) | Renderer for comic-style narrative; explainer in panel form |
| `baoyu-infographic` | Infographic generation | Renderer for data-driven explainer scenes |
| `meme-generation` *(optional)* | Generate meme images by overlaying text on templates | Generator for satirical/social content; meme-style stills |
|
||||
|
||||
### Design / pre-production skills (`hermes-agent/skills/creative/`)
|
||||
|
||||
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `claude-design` | Design one-off HTML artifacts (landing, deck, prototype) | Concept artist for product video style frames; storyboarder for UI-heavy content |
| `design-md` | Design markdown docs | Concept artist documenting visual specs |
| `popular-web-designs` | Reference patterns for popular web designs | Concept artist; cinematographer when matching a known UI aesthetic |
| `sketch` | Throwaway HTML mockups (2-3 design variants to compare) | Concept artist exploring directions; storyboarder for UI flows |
| `excalidraw` | Excalidraw-style hand-drawn diagrams | Storyboarder; concept artist for sketch-style frames |
| `architecture-diagram` | Software architecture diagrams | Storyboarder for technical content; explainer scenes about systems |
| `concept-diagrams` *(optional)* | Flat, minimal SVG diagrams (educational visual language; physics, chemistry, math, anatomy, etc.) | Renderer / storyboarder for explainer scenes with clean educational diagrams |
| `pretext` | Mathematical/scientific content authoring | Writer / cinematographer for technical-explainer pretexts |
| `creative-ideation` | Constraint-driven project ideation | Director / cinematographer when the brief is wide-open and needs framing |
| `humanizer` | Strip AI-isms from text, add real voice | Writer / copywriter post-process to avoid AI-tells in scripts and VO copy |
|
||||
|
||||
### Audio / media skills (`hermes-agent/skills/creative/` + `skills/media/`)
|
||||
|
||||
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `songwriting-and-ai-music` | Songwriting craft + Suno prompt patterns | Music supervisor when commissioning a track via Suno |
| `heartmula` | Open-source music generation (Apache-2.0, Suno-like) | Music supervisor generating bespoke tracks without external APIs |
| `songsee` | Spectrograms, mel/chroma/MFCC of audio files | Music supervisor analyzing tracks; foley-designer designing to a beat; editor visualizing a mix |
| `spotify` | Spotify control: play, search, queue, manage playlists | Music supervisor sourcing existing tracks; reference research |
| `youtube-content` | Fetch transcripts + transform to chapters/summaries/posts | Documentary cut, content adaptation, research for explainers |
| `gif-search` | Find existing GIFs | Editor / concept artist sourcing references |
| `gifs` | GIF tooling | Masterer producing GIF deliverables |
|
||||
|
||||
### Kanban infrastructure (`hermes-agent/skills/devops/`)
|
||||
|
||||
| Skill | What it does | When to load |
|-------|--------------|--------------|
| `kanban-orchestrator` | Decomposition playbook + anti-temptation rules for orchestrator profiles | Director only |
| `kanban-worker` | Pitfalls, examples, edge cases for kanban workers (deeper than auto-injected guidance) | Any profile; load when handling tricky multi-step workflows |
|
||||
|
||||
The kanban plugin auto-injects baseline orchestration guidance into every
|
||||
worker's system prompt — the `kanban_create` fan-out pattern, claim/handoff
|
||||
lifecycle, and the "decompose, don't execute" rule for orchestrators.
|
||||
`kanban-orchestrator` and `kanban-worker` are deeper playbooks loaded when a
|
||||
profile needs them.
|
||||
|
||||
## External tools (called from terminal toolset)
|
||||
|
||||
These are **not** Hermes skills but external CLIs / APIs that profiles invoke.
|
||||
They don't appear in `always_load`; instead the role's terminal commands hit
|
||||
them directly.
|
||||
|
||||
| Tool | What it does | Profile that uses it |
|------|--------------|----------------------|
| `ffmpeg` | Video / audio encode, splice, mux | renderer, editor, audio-mixer, masterer |
| `ffprobe` | Inspect media | All media-touching profiles |
| Whisper (CLI or API) | Speech-to-text for captions | captioner |
| Text-to-image API (FAL / Replicate / OpenAI / Midjourney) | Stills generation | image-generator (alternative to local `comfyui`) |
| Image-to-video API (Runway / Kling / Luma / Pika) | Animate stills | image-to-video-generator |
| Text-to-speech API (ElevenLabs / OpenAI TTS / etc.) | Voiceover generation | voice-talent |
| Suno API or web | Track composition (paired with `songwriting-and-ai-music`) | music-supervisor |
| Remotion CLI (`npx remotion render`) | React-based motion graphics | renderer-motion-graphics |
| Manim CE (`manim`) | Math animation render (driven by `manim-video` skill's recipes) | renderer-manim |
| Blender (`blender -b`) | 3D rendering (alternative to `blender-mcp`) | renderer-3d |
| Gemini multimodal / Claude vision | AI review of clips | reviewer, cinematographer, editor |
|
||||
|
||||
## Standard toolset configurations per role
|
||||
|
||||
### director
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-orchestrator
```
|
||||
|
||||
The director's terminal access is conventional but the SOUL.md rules forbid
|
||||
execution. Audit logs catch violations.
|
||||
|
||||
### writer / copywriter
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    - humanizer   # post-process scripts to strip AI-tells
```
|
||||
|
||||
No terminal — writers don't need it.
|
||||
|
||||
### concept-artist
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # plus one or more (style-dependent):
    # - claude-design        (UI / web product video)
    # - sketch               (quick mockup variants)
    # - excalidraw           (hand-drawn frames)
    # - ascii-art            (ASCII style frames)
    # - pixel-art            (retro/game aesthetic)
    # - popular-web-designs  (matching known web aesthetic)
    # - design-md            (text-based design docs)
```
|
||||
|
||||
### storyboarder
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    # one of:
    # - excalidraw            (sketch storyboards)
    # - architecture-diagram  (technical/system content)
    # - concept-diagrams      (educational / scientific content)
```
|
||||
|
||||
### cinematographer
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # the visual skill that matches the project, e.g.:
    # - ascii-video        (ASCII projects)
    # - manim-video        (math/explainer)
    # - p5js               (generative)
    # - comfyui            (AI-generated visuals)
    # - blender-mcp        (3D)
    # - touchdesigner-mcp  (real-time/installation)
```
|
||||
|
||||
### renderer (specialized variants)
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # ONE skill per renderer variant (or empty for external-API renderers):
    # - ascii-video        (renderer-ascii)
    # - manim-video        (renderer-manim)
    # - p5js               (renderer-p5js)
    # - comfyui            (renderer-comfyui, img/video AI gen)
    # - touchdesigner-mcp  (renderer-touchdesigner)
    # - blender-mcp        (renderer-3d)
    # - pixel-art          (renderer-pixel)
    # - baoyu-comic        (renderer-comic)
    # - meme-generation    (renderer-meme)
```
|
||||
|
||||
For external-API renderers (image-to-video-generator using Runway, voice-talent
|
||||
using ElevenLabs, renderer-motion-graphics using Remotion), `always_load` only
|
||||
contains `kanban-worker` — the role's work is API-driven and the API key +
|
||||
terminal commands suffice.
|
||||
|
||||
For multi-skill renderer setups (rare — usually one variant per skill is
|
||||
cleaner) use `--skill <name>` on individual `kanban_create` calls to override
|
||||
which skill loads for that specific task.
|
||||
|
||||
### image-generator / image-to-video-generator / voice-talent
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # for image-generator that drives ComfyUI locally:
    # - comfyui
env_required:
  # populate based on the chosen API:
  - FAL_KEY             # or REPLICATE_API_TOKEN, OPENAI_API_KEY for image-gen
  - RUNWAY_API_KEY      # or KLING_API_KEY, LUMA_API_KEY for image-to-video
  - ELEVENLABS_API_KEY  # or OPENAI_API_KEY for TTS
```
|
||||
|
||||
If the user's setup has ComfyUI installed locally, the `comfyui` skill can
|
||||
replace the external image-gen API entirely (cheaper, more control, supports
|
||||
custom workflows for image-to-video too).
|
||||
|
||||
### music-supervisor
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    - songsee   # spectrograms / audio analysis
    # plus (depending on what the project needs):
    # - songwriting-and-ai-music  (commissioning Suno tracks)
    # - heartmula                 (commissioning open-source local generation)
    # - spotify                   (sourcing existing tracks)
```
|
||||
|
||||
### editor / audio-mixer / captioner / masterer
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
```
|
||||
|
||||
These are mostly ffmpeg-driven; no special skill needed beyond `kanban-worker`.
|
||||
For captioner add Whisper invocation patterns to the SOUL.md.
|
||||
|
||||
### reviewer / brand-cop
|
||||
|
||||
```yaml
toolsets:
  - kanban
  - terminal   # for media inspection
  - file
skills:
  always_load:
    - kanban-worker
env_required:
  - OPENROUTER_API_KEY   # if using Gemini multimodal review
  # or ANTHROPIC_API_KEY if using Claude vision (already required globally)
```
|
||||
|
||||
## API key requirements
|
||||
|
||||
Track these in the project setup. The setup script should verify that each
required key is present in `~/.hermes/.env` (or macOS Keychain) before firing
the initial kanban task.
|
||||
|
||||
| Service | Env var | Used by |
|---------|---------|---------|
| ElevenLabs | `ELEVENLABS_API_KEY` | voice-talent |
| OpenAI | `OPENAI_API_KEY` | image-generator (DALL-E), voice-talent (TTS) |
| OpenRouter | `OPENROUTER_API_KEY` | reviewer, cinematographer, editor (Gemini multimodal review) |
| FAL | `FAL_KEY` | image-generator (FAL flux models) |
| Replicate | `REPLICATE_API_TOKEN` | image-generator (alternate provider) |
| Runway | `RUNWAY_API_KEY` | image-to-video-generator |
| Kling | `KLING_API_KEY` | image-to-video-generator (alternate) |
| Luma | `LUMA_API_KEY` | image-to-video-generator (alternate) |
| Suno | `SUNO_API_KEY` | music-supervisor (paired with `songwriting-and-ai-music`) |
| Spotify | `SPOTIFY_CLIENT_ID` + `SPOTIFY_CLIENT_SECRET` | music-supervisor (paired with `spotify` skill) |
| Anthropic | `ANTHROPIC_API_KEY` | every Hermes profile (Claude) |
|
||||
|
||||
If a key is missing, prompt the user to add it. Storage methods, in order of
|
||||
preference: macOS Keychain → `~/.hermes/.env` → environment variable.
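
A pre-flight key check might be sketched like this. It covers only the `.env` file and process environment; the macOS Keychain lookup is left out. `parse_env` and `missing_keys` are hypothetical helpers, and the simple `KEY=value` parsing is an assumption about how `~/.hermes/.env` is written.

```python
from __future__ import annotations

import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    keys = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        keys[name] = value.strip().strip("'\"")
    return keys


def missing_keys(required: list[str], env_text: str = "",
                 environ: dict | None = None) -> list[str]:
    """Return required keys that are unset or empty in both sources."""
    environ = os.environ if environ is None else environ
    file_keys = parse_env(env_text)
    return [k for k in required if not (file_keys.get(k) or environ.get(k))]
```

A setup script could read the `.env` file's text, call `missing_keys(plan["api_keys_required"], text)`, and stop with a prompt to the user if the list is non-empty.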
|
||||
|
||||
## Skill version pinning
|
||||
|
||||
If a specific skill version is desired, pass it via the per-task
|
||||
`--skill <name>=<version>` flag. The default is whatever's installed.
|
||||
|
||||
## Adding a new skill to the matrix
|
||||
|
||||
When a new Hermes-public video skill ships:
|
||||
|
||||
1. Add a row to the relevant table at the top of this file
|
||||
2. If it warrants a specialized renderer variant, add to `role-archetypes.md`
|
||||
3. Update relevant per-style examples in `examples.md`
|
||||
501
optional-skills/creative/kanban-video-orchestrator/scripts/bootstrap_pipeline.py
Executable file
|
|
@ -0,0 +1,501 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Bootstrap a video production kanban from a structured plan JSON.
|
||||
|
||||
Reads a plan.json describing the team + brief, expands templates from
|
||||
../assets/, and writes a setup.sh that creates Hermes profiles and fires the
|
||||
initial kanban task.
|
||||
|
||||
Profile-config patching, SOUL.md-per-profile, TEAM.md task-graph convention,
|
||||
and the `hermes kanban create --workspace dir:` initial-task pattern are
|
||||
adapted from alt-glitch's NousResearch/kanban-video-pipeline.
|
||||
|
||||
Usage:
|
||||
bootstrap_pipeline.py plan.json [--out setup.sh]
|
||||
|
||||
The plan.json schema is documented inline below — see the `validate_plan`
|
||||
function. A minimal example:
|
||||
|
||||
{
|
||||
"title": "Q3 Product Teaser",
|
||||
"slug": "q3-product-teaser",
|
||||
"tenant": "q3-product-teaser",
|
||||
"duration_s": 30,
|
||||
"aspect": "1:1",
|
||||
"resolution": "1080x1080",
|
||||
"fps": 30,
|
||||
"team": [
|
||||
{
|
||||
"profile": "director",
|
||||
"role": "director",
|
||||
"toolsets": ["kanban", "terminal", "file"],
|
||||
"skills": [],
|
||||
"responsibilities": "...",
|
||||
"inputs": "brief.md, TEAM.md, taste/",
|
||||
"outputs": "kanban tasks for the team"
|
||||
},
|
||||
...
|
||||
],
|
||||
"scenes": [
|
||||
{"n": 1, "time": "0:00-0:08", "content": "...", "tool": "renderer-ascii"},
|
||||
...
|
||||
],
|
||||
"audio": {"approach": "voiceover + music bed", "vo": "ElevenLabs Lily",
|
||||
"music": "license-free", "sfx": "n/a"},
|
||||
"deliverables": [
|
||||
{"format": "mp4", "resolution": "1080x1080", "notes": "primary"}
|
||||
],
|
||||
"api_keys_required": ["ELEVENLABS_API_KEY", "OPENROUTER_API_KEY"],
|
||||
"brief_extra": {
|
||||
"concept_one_liner": "...",
|
||||
"emotional_north_star": "...",
|
||||
"visual_refs": "...",
|
||||
"tone": "...",
|
||||
"brand_constraints": "..."
|
||||
}
|
||||
}
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
ASSETS_DIR = Path(__file__).resolve().parent.parent / "assets"
|
||||
|
||||
|
||||
def load_template(name: str) -> str:
    """Read a template file from the assets directory."""
    return (ASSETS_DIR / name).read_text(encoding="utf-8")
|
||||
|
||||
|
||||
PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
|
||||
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]+$")
|
||||
|
||||
|
||||
def validate_plan(plan: dict) -> list[str]:
    """Return a list of validation error strings; empty list = valid."""
    errors = []
    required_top = ["title", "slug", "tenant", "duration_s", "aspect",
                    "resolution", "fps", "team", "scenes", "audio",
                    "deliverables"]
    for k in required_top:
        if k not in plan:
            errors.append(f"missing required key: {k}")

    if "team" in plan:
        if not isinstance(plan["team"], list) or not plan["team"]:
            errors.append("team must be a non-empty list")
        else:
            roles = [t.get("role") for t in plan["team"]]
            if "director" not in roles:
                errors.append("team must include a director role")
            seen_profiles = set()
            for i, t in enumerate(plan["team"]):
                for k in ["profile", "role", "toolsets", "skills",
                          "responsibilities"]:
                    if k not in t:
                        errors.append(f"team[{i}] missing {k}")
                # Profile name must match Hermes's regex (lowercase
                # alphanumeric + hyphens + underscores, up to 64 chars).
                if "profile" in t:
                    if not PROFILE_NAME_RE.match(t["profile"]):
                        errors.append(
                            f"team[{i}].profile {t['profile']!r} must match "
                            f"[a-z0-9][a-z0-9_-]{{0,63}} per Hermes profile rules"
                        )
                    if t["profile"] in seen_profiles:
                        errors.append(
                            f"team[{i}].profile {t['profile']!r} is duplicated"
                        )
                    seen_profiles.add(t["profile"])
                # Toolsets / skills must be lists, not strings.
                if "toolsets" in t and not isinstance(t["toolsets"], list):
                    errors.append(
                        f"team[{i}].toolsets must be a list of strings"
                    )
                if "skills" in t and not isinstance(t["skills"], list):
                    errors.append(
                        f"team[{i}].skills must be a list of strings"
                    )

    if "slug" in plan:
        if not SLUG_RE.match(plan["slug"]):
            errors.append("slug must be lowercase, hyphenated, "
                          "starting with [a-z0-9]")

    return errors
|
||||
|
||||
|
||||
def render_brief(plan: dict) -> str:
    """Render brief.md from the plan."""
    tmpl = load_template("brief.md.tmpl")
    extra = plan.get("brief_extra", {})

    # Scene table rows
    scene_rows = []
    for s in plan["scenes"]:
        scene_rows.append(
            f"| {s.get('n', '?')} | {s.get('time', '?')} | "
            f"{s.get('content', '')} | {s.get('tool', '')} | "
            f"{s.get('audio', '')} | {s.get('notes', '')} |"
        )
    scene_table = "\n".join(scene_rows) if scene_rows else "_(none yet)_"

    # Deliverable rows
    deliv_rows = []
    for d in plan["deliverables"]:
        deliv_rows.append(
            f"| {d.get('format', '?')} | {d.get('resolution', '?')} | "
            f"{d.get('notes', '')} |"
        )
    deliv_table = "\n".join(deliv_rows) if deliv_rows else "_(none)_"

    # Replacements (applied sequentially, one key at a time)
    replacements = {
        "TITLE": plan["title"],
        "SLUG": plan["slug"],
        "TENANT": plan["tenant"],
        "WORKSPACE": f"~/projects/video-pipeline/{plan['slug']}",
        "ONE_LINE_PITCH": extra.get("concept_one_liner", "_(TBD)_"),
        "EMOTIONAL_NORTH_STAR": extra.get("emotional_north_star", "_(TBD)_"),
        "DURATION_S": str(plan["duration_s"]),
        "ASPECT": plan["aspect"],
        "RESOLUTION": plan["resolution"],
        "FPS": str(plan["fps"]),
        "PLATFORMS": extra.get("platforms", "_(TBD)_"),
        "DEADLINE": extra.get("deadline", "_(none)_"),
        "QUALITY_BAR": extra.get("quality_bar", "polished"),
        "VISUAL_REFS": extra.get("visual_refs", "_(none)_"),
        "TONE": extra.get("tone", "_(TBD)_"),
        "BRAND_CONSTRAINTS": extra.get("brand_constraints", "_(none)_"),
        "AESTHETIC_RULES": extra.get("aesthetic_rules", "_(TBD)_"),
        "AUDIO_APPROACH": plan["audio"].get("approach", "_(TBD)_"),
        "VO_DETAILS": plan["audio"].get("vo", "_(n/a)_"),
        "MUSIC_DETAILS": plan["audio"].get("music", "_(n/a)_"),
        "SFX_DETAILS": plan["audio"].get("sfx", "_(n/a)_"),
        "PRIMARY_FORMAT": plan["deliverables"][0]["format"],
        "PRIMARY_RES": plan["deliverables"][0]["resolution"],
        "ALT_FORMAT_1": (plan["deliverables"][1]["format"]
                         if len(plan["deliverables"]) > 1 else "_(none)_"),
        "ALT_RES_1": (plan["deliverables"][1]["resolution"]
                      if len(plan["deliverables"]) > 1 else ""),
        "ALT_NOTES_1": (plan["deliverables"][1].get("notes", "")
                        if len(plan["deliverables"]) > 1 else ""),
        "API_KEYS_REQUIRED": ", ".join(plan.get("api_keys_required", [])) or "none",
        "EXT_DEPS": extra.get("ext_deps", "ffmpeg, Python 3.11+"),
        "SOURCE_ASSETS": extra.get("source_assets", "_(none)_"),
    }
    out = tmpl
    for k, v in replacements.items():
        out = out.replace("{{" + k + "}}", str(v))

    # Scene table: replace the two placeholder rows in the template.
    # A function replacement inserts the text literally, so backslashes in
    # scene content are not parsed as regex escapes.
    # NOTE: deliv_table is built above but is not substituted anywhere here.
    out = re.sub(
        r"\|\s*1\s*\|\s*0:00–0:0X.+?\n\|\s*2\s*\|.+?\n",
        lambda _m: scene_table + "\n",
        out, flags=re.DOTALL,
    )
    return out


def render_team_md(plan: dict) -> str:
    """Render TEAM.md from the team list + scene → tool mapping."""
    lines = [f"# Team & Task Graph — {plan['title']}", "", "## Team", ""]
    for t in plan["team"]:
        skills = (
            f"loads `{', '.join(t['skills'])}`"
            if t["skills"] else "no skills required"
        )
        lines.append(
            f"- `{t['profile']}` — {t['responsibilities']} ({skills})"
        )
    lines.extend(["", "## Task Graph", "", "```"])

    # Build a simple task graph based on conventions
    profiles_by_role = {t["role"]: t["profile"] for t in plan["team"]}
    director = profiles_by_role.get("director", "director")
    lines.append(f"T0 {director} — decompose")

    next_id = 1
    parents_for_renderer: list[str] = ["T0"]

    if "cinematographer" in profiles_by_role:
        cid = f"T{next_id}"
        lines.append(
            f"{cid:5} {profiles_by_role['cinematographer']} — visual spec for all scenes (parent: T0)"
        )
        parents_for_renderer = [cid]
        next_id += 1

    if "music-supervisor" in profiles_by_role:
        cid = f"T{next_id}"
        lines.append(
            f"{cid:5} {profiles_by_role['music-supervisor']} — track analysis + beats.json (parent: T0)"
        )
        next_id += 1
        ms_id = cid
    else:
        ms_id = None

    # Scenes
    scene_ids = []
    for s in plan["scenes"]:
        cid = f"T{next_id}"
        renderer_profile = s.get("tool") or "renderer"
        # Look up the actual profile name (the scene's tool may name either
        # a role or a profile).
        for t in plan["team"]:
            if t["role"] == renderer_profile or t["profile"] == renderer_profile:
                renderer_profile = t["profile"]
                break
        parents = parents_for_renderer + ([ms_id] if ms_id else [])
        parent_str = ", ".join(parents)
        lines.append(
            f"{cid:5} {renderer_profile} — scene {s.get('n', '?')}: "
            f"{s.get('content', '')[:50]} (parents: {parent_str})"
        )
        scene_ids.append(cid)
        next_id += 1

    # VO + audio mix
    if "voice-talent" in profiles_by_role:
        vo_id = f"T{next_id}"
        lines.append(f"{vo_id:5} {profiles_by_role['voice-talent']} — narration (parent: T0)")
        next_id += 1
    else:
        vo_id = None

    if "audio-mixer" in profiles_by_role:
        am_id = f"T{next_id}"
        # Fall back to T0 so the parent list is never rendered empty when
        # the team has neither a music supervisor nor voice talent.
        am_parents = [p for p in [ms_id, vo_id] if p] or ["T0"]
        lines.append(
            f"{am_id:5} {profiles_by_role['audio-mixer']} — mix audio (parents: {', '.join(am_parents)})"
        )
        next_id += 1
    else:
        am_id = None

    # Editor
    if "editor" in profiles_by_role:
        ed_id = f"T{next_id}"
        ed_parents = scene_ids + [p for p in [am_id, vo_id, ms_id] if p and p not in scene_ids]
        lines.append(
            f"{ed_id:5} {profiles_by_role['editor']} — assemble + mux (parents: {', '.join(ed_parents)})"
        )
        next_id += 1
    else:
        ed_id = None

    # Captioner
    if "captioner" in profiles_by_role and ed_id:
        cap_id = f"T{next_id}"
        lines.append(
            f"{cap_id:5} {profiles_by_role['captioner']} — SRT + burn (parent: {ed_id})"
        )
        next_id += 1
        last = cap_id
    else:
        last = ed_id

    # Reviewer
    if "reviewer" in profiles_by_role and last:
        rv_id = f"T{next_id}"
        lines.append(
            f"{rv_id:5} {profiles_by_role['reviewer']} — final QA (parent: {last})"
        )

    lines.append("```")
    lines.extend([
        "",
        "## Per-task workspace requirement",
        "",
        "All `kanban_create` calls MUST pass:",
        "```",
        'workspace_kind="dir"',
        f'workspace_path="$HOME/projects/video-pipeline/{plan["slug"]}"',
        f'tenant="{plan["tenant"]}"',
        "```",
    ])
    return "\n".join(lines)


def render_setup_sh(plan: dict, brief_md: str, team_md: str) -> str:
    """Render setup.sh from the plan."""
    tmpl = load_template("setup.sh.tmpl")

    # API key checks
    key_checks = []
    for key in plan.get("api_keys_required", []):
        key_checks.append(f'check_key {key} hermes {key} || exit 1')
    key_checks_str = "\n".join(key_checks) if key_checks else "# (no API keys required)"

    # Scene dirs
    scene_dir_lines = []
    for s in plan["scenes"]:
        n = s.get("n", "?")
        # Zero-pad integer scene numbers. The "?" default (or any other
        # non-integer value) is used as-is, because ":02d" would raise on it.
        label = f"{n:02d}" if isinstance(n, int) else str(n)
        scene_dir_lines.append(f'mkdir -p "$WORKSPACE/scenes/scene-{label}"/checkpoints')
    scene_dirs = "\n".join(scene_dir_lines) if scene_dir_lines else ""

    # Profile create
    profile_creates = []
    for t in plan["team"]:
        profile_creates.append(
            f'hermes profile create {t["profile"]} --clone 2>/dev/null || true'
        )

    # Profile config — emit JSON arrays so the bash function can pass them
    # safely through to the Python YAML patcher.
    profile_configs = []
    for t in plan["team"]:
        ts_json = json.dumps(t["toolsets"])
        sk_json = json.dumps(t["skills"])
        # Use single-quoted bash strings; JSON only contains "/[/], no single
        # quotes, so this is safe.
        profile_configs.append(
            f"configure_profile {t['profile']!r} {ts_json!r} {sk_json!r}"
        )

    # SOUL writes — uses heredocs per profile
    soul_writes = []
    for t in plan["team"]:
        soul_writes.append(
            f'cat > "$HOME/.hermes/profiles/{t["profile"]}/SOUL.md" <<\'SOUL_EOF\'\n'
            f"{render_soul_md(t, plan)}\n"
            f"SOUL_EOF\n"
            f'echo " ✓ SOUL.md for {t["profile"]}"'
        )

    # Taste writes (placeholder; real content optional)
    taste_writes = (
        'cat > "$WORKSPACE/taste/brand-guide.md" <<\'TASTE_EOF\'\n'
        '# Brand Guide\n\n'
        '_(Populate with project-specific colors, typography, motion rules)_\n'
        'TASTE_EOF\n'
        'cat > "$WORKSPACE/taste/emotional-dna.md" <<\'DNA_EOF\'\n'
        '# Emotional DNA\n\n'
        '_(What this piece should FEEL like — populate from the brief.)_\n'
        'DNA_EOF'
    )

    # Asset copies — leave empty by default; user fills in
    asset_copies = "# Add cp/rsync commands here for any provided assets"

    out = tmpl
    out = out.replace("{{TITLE}}", plan["title"])
    out = out.replace("{{SLUG}}", plan["slug"])
    out = out.replace("{{TENANT}}", plan["tenant"])
    out = out.replace("{{WORKSPACE}}", f"~/projects/video-pipeline/{plan['slug']}")
    out = out.replace("{{KEY_CHECKS}}", key_checks_str)
    out = out.replace("{{SCENE_DIRS}}", scene_dirs)
    out = out.replace("{{PROFILE_CREATE_COMMANDS}}", "\n".join(profile_creates))
    out = out.replace("{{PROFILE_CONFIG_COMMANDS}}", "\n".join(profile_configs))
    out = out.replace("{{SOUL_WRITES}}", "\n".join(soul_writes))
    out = out.replace("{{BRIEF_CONTENTS}}", brief_md)
    out = out.replace("{{TEAM_CONTENTS}}", team_md)
    out = out.replace("{{TASTE_WRITES}}", taste_writes)
    out = out.replace("{{ASSET_COPIES}}", asset_copies)

    return out


def render_soul_md(team_member: dict, plan: dict) -> str:
    """Render a profile's SOUL.md from a team member dict + plan context."""
    tmpl = load_template("soul.md.tmpl")
    role = team_member["role"]

    common_rules = (
        "- **Read the brief and team graph** before doing anything else.\n"
        "- **Pass `workspace_kind=\"dir\"` and `workspace_path` on every "
        "`kanban_create` call.** This keeps the team in one shared workspace.\n"
        f"- **Use tenant `{plan['tenant']}`** on every kanban call.\n"
        "- **Write outputs to predictable paths.** Other profiles depend on "
        "your filename conventions.\n"
        "- **Emit heartbeats** during long-running work. Renderers should "
        "report frame counts; editors should report assembly progress.\n"
    )

    if role == "director":
        common_rules += (
            "- **Do not execute the work yourself.** For every concrete task, "
            "create a kanban task and assign it to the appropriate profile.\n"
            "- **Decompose, route, comment, approve — that's the whole job.**\n"
            "- **Read TEAM.md** for the canonical task graph. Do not invent "
            "new roles unless the brief truly demands it.\n"
            "- **Load the `kanban-orchestrator` skill** for the deeper "
            "decomposition playbook beyond the auto-injected baseline.\n"
        )

    common_commands = (
        "```bash\n"
        "# Inspect a clip\n"
        "ffprobe -v quiet -show_entries format=duration -show_entries "
        "stream=codec_name,width,height,r_frame_rate <file.mp4>\n"
        "\n"
        "# Extract a frame for QA\n"
        "ffmpeg -y -i <input.mp4> -vf \"select='eq(n,30)'\" -vsync vfr <out.png>\n"
        "```"
    )

    out = tmpl
    out = out.replace("{{ROLE_NAME}}", role)
    out = out.replace("{{ROLE_RESPONSIBILITIES}}", team_member["responsibilities"])
    out = out.replace("{{INPUTS_READ}}", team_member.get("inputs", "_(see brief)_"))
    out = out.replace("{{OUTPUTS_PRODUCED}}", team_member.get("outputs", "_(see brief)_"))
    out = out.replace("{{TOOLSETS}}", ", ".join(team_member["toolsets"]))
    out = out.replace(
        "{{SKILLS}}",
        ", ".join(team_member["skills"]) if team_member["skills"] else "(none)"
    )
    out = out.replace(
        "{{EXTERNAL_TOOLS}}",
        team_member.get("external_tools", "ffmpeg, ffprobe (via terminal)")
    )
    out = out.replace(
        "{{ROLE_RULES}}",
        team_member.get("role_rules", "_(see TEAM.md and brief.md)_")
    )
    out = out.replace("{{COMMON_RULES}}", common_rules)
    out = out.replace("{{COMMON_COMMANDS}}", common_commands)
    return out


def main():
    ap = argparse.ArgumentParser(description=__doc__,
                                 formatter_class=argparse.RawDescriptionHelpFormatter)
    ap.add_argument("plan_json", help="Path to plan.json")
    ap.add_argument("--out", default="setup.sh",
                    help="Output path for setup.sh (default: ./setup.sh)")
    ap.add_argument("--brief-out", default=None,
                    help="Write brief.md alongside (default: skipped)")
    ap.add_argument("--team-out", default=None,
                    help="Write TEAM.md alongside (default: skipped)")
    args = ap.parse_args()

    plan = json.loads(Path(args.plan_json).read_text())
    errors = validate_plan(plan)
    if errors:
        print("Plan validation failed:", file=sys.stderr)
        for e in errors:
            print(f" - {e}", file=sys.stderr)
        sys.exit(2)

    brief = render_brief(plan)
    team = render_team_md(plan)
    setup = render_setup_sh(plan, brief, team)

    Path(args.out).write_text(setup)
    os.chmod(args.out, 0o755)
    print(f"Wrote {args.out}")

    if args.brief_out:
        Path(args.brief_out).write_text(brief)
        print(f"Wrote {args.brief_out}")
    if args.team_out:
        Path(args.team_out).write_text(team)
        print(f"Wrote {args.team_out}")


if __name__ == "__main__":
    main()
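The validation rules in the script above (required top-level keys, a `director` role, the profile-name and slug patterns) can be illustrated with a small plan. The following is a minimal sketch, not the script's own code: `EXAMPLE_PLAN` and `quick_check` are hypothetical names, and every value in the plan (titles, profile names, toolset names) is invented for illustration. Only the two regular expressions and the key lists are copied from the script.

```python
import re

# Patterns and key lists copied from the script above.
PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]+$")
REQUIRED_TOP = ["title", "slug", "tenant", "duration_s", "aspect",
                "resolution", "fps", "team", "scenes", "audio",
                "deliverables"]
REQUIRED_MEMBER = ["profile", "role", "toolsets", "skills", "responsibilities"]

# Hypothetical plan: all values are placeholders, not real Hermes names.
EXAMPLE_PLAN = {
    "title": "Demo Reel",
    "slug": "demo-reel",
    "tenant": "demo-reel",
    "duration_s": 30,
    "aspect": "16:9",
    "resolution": "1920x1080",
    "fps": 30,
    "team": [
        {"profile": "demo-director", "role": "director",
         "toolsets": ["placeholder-toolset"], "skills": [],
         "responsibilities": "Decompose the brief into tasks."},
        {"profile": "demo-renderer", "role": "renderer",
         "toolsets": ["placeholder-toolset"], "skills": [],
         "responsibilities": "Render each scene."},
    ],
    "scenes": [{"n": 1, "time": "0:00-0:10", "content": "Title card",
                "tool": "renderer"}],
    "audio": {"approach": "music only"},
    "deliverables": [{"format": "mp4", "resolution": "1920x1080"}],
}


def quick_check(plan: dict) -> list[str]:
    """Simplified re-statement of the checks; returns a list of problems."""
    problems = [f"missing required key: {k}"
                for k in REQUIRED_TOP if k not in plan]
    if not SLUG_RE.match(str(plan.get("slug", ""))):
        problems.append("bad slug")
    team = plan.get("team") or []
    if "director" not in [m.get("role") for m in team]:
        problems.append("no director")
    for i, m in enumerate(team):
        problems += [f"team[{i}] missing {k}"
                     for k in REQUIRED_MEMBER if k not in m]
        if not PROFILE_NAME_RE.match(str(m.get("profile", ""))):
            problems.append(f"team[{i}] bad profile name")
    return problems


print(quick_check(EXAMPLE_PLAN))                           # prints []
print(quick_check({**EXAMPLE_PLAN, "slug": "Demo Reel"}))  # prints ['bad slug']
```

A plan shaped like this, saved as JSON, is the kind of input the script's `plan_json` argument expects. The real validator reports more detail, such as duplicate profiles and list-type checks.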
195
optional-skills/creative/kanban-video-orchestrator/scripts/monitor.py
Executable file
@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Monitor a running video-production kanban. Polls `hermes kanban list` for a
tenant and surfaces issues: stuck tasks (stale heartbeats), tasks running past
their `max_runtime_s` cap, and repeated retries.

Usage:
    monitor.py --tenant <project-slug> [--interval 30] [--once]

Prints a periodic snapshot to stdout and writes alerts to stderr when issues
are detected. Designed to run alongside the kanban: stop it with Ctrl-C when
you're satisfied, or pass `--once` in a script to print a single snapshot
(exit code 2 if issues were found).

This is best-effort observability. It does not auto-restart tasks; intervention
decisions should remain human/AI-overseen.
"""
from __future__ import annotations

import argparse
import json
import shutil
import subprocess
import sys
import time
from collections import defaultdict
from datetime import datetime, timedelta


def hermes_available() -> bool:
    return shutil.which("hermes") is not None


def kanban_list(tenant: str) -> list[dict]:
    """Returns parsed task rows. Falls back to plain stdout parsing if JSON
    output isn't supported by the installed hermes CLI."""
    try:
        out = subprocess.run(
            ["hermes", "kanban", "list", "--tenant", tenant, "--json"],
            capture_output=True, text=True, check=False,
        )
        if out.returncode == 0 and out.stdout.strip().startswith("["):
            return json.loads(out.stdout)
    except (FileNotFoundError, json.JSONDecodeError):
        pass
    # Fallback: textual parse of `hermes kanban list`
    out = subprocess.run(
        ["hermes", "kanban", "list", "--tenant", tenant],
        capture_output=True, text=True, check=False,
    )
    rows = []
    for line in out.stdout.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "STATUS" in line.upper():
            continue
        parts = line.split()
        if len(parts) >= 4 and parts[0].startswith("t_"):
            rows.append({
                "id": parts[0],
                "status": parts[1] if len(parts) > 1 else "?",
                "assignee": parts[2] if len(parts) > 2 else "?",
                "title": " ".join(parts[3:]) if len(parts) > 3 else "",
                "started_at": None,
                "heartbeat_at": None,
                "max_runtime_s": None,
            })
    return rows


def kanban_show(task_id: str) -> dict | None:
    out = subprocess.run(
        ["hermes", "kanban", "show", task_id, "--json"],
        capture_output=True, text=True, check=False,
    )
    if out.returncode != 0:
        return None
    try:
        return json.loads(out.stdout)
    except json.JSONDecodeError:
        return None


def detect_issues(tasks: list[dict]) -> list[str]:
    """Return a list of issue strings, one per concern."""
    # Compare timezone-aware datetimes throughout. A trailing "Z" means UTC;
    # timestamps without an offset are assumed to be local time.
    now = datetime.now().astimezone()

    def parse_ts(value) -> datetime | None:
        text = str(value)
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        try:
            dt = datetime.fromisoformat(text)
        except ValueError:
            return None
        return dt if dt.tzinfo is not None else dt.astimezone()

    issues: list[str] = []
    running = [t for t in tasks
               if str(t.get("status", "")).lower() == "running"]

    # Stuck tasks: RUNNING with no heartbeat in 2 min
    for t in running:
        hb = t.get("heartbeat_at")
        if not hb:
            continue
        hb_dt = parse_ts(hb)
        if hb_dt is None:
            continue
        if now - hb_dt > timedelta(minutes=2):
            issues.append(
                f"STUCK: {t['id']} ({t.get('assignee', '?')}) — "
                f"no heartbeat in {(now - hb_dt).total_seconds():.0f}s"
            )

    # Tasks exceeding max_runtime
    for t in running:
        started = t.get("started_at")
        max_rt = t.get("max_runtime_s")
        if not started or not max_rt:
            continue
        started_dt = parse_ts(started)
        if started_dt is None:
            continue
        elapsed = (now - started_dt).total_seconds()
        if elapsed > max_rt:
            issues.append(
                f"OVERTIME: {t['id']} ({t.get('assignee', '?')}) — "
                f"running {elapsed:.0f}s, cap was {max_rt}s"
            )

    # Repeated retries
    for t in tasks:
        retries = t.get("retries", 0)
        if retries and retries >= 2:
            issues.append(
                f"FLAPPING: {t['id']} ({t.get('assignee', '?')}) — "
                f"retried {retries}× — fix root cause before next run"
            )

    return issues


def snapshot(tenant: str) -> tuple[list[dict], list[str]]:
    tasks = kanban_list(tenant)
    issues = detect_issues(tasks)
    return tasks, issues


def print_snapshot(tasks: list[dict], issues: list[str]):
    counts = defaultdict(int)
    for t in tasks:
        counts[str(t.get("status", "?")).lower()] += 1

    print(f"\n[{datetime.now().strftime('%H:%M:%S')}] "
          f"Total: {len(tasks)} | "
          + " | ".join(f"{k}: {v}" for k, v in sorted(counts.items())))

    # One marker per known status; anything else is shown as "?".
    markers = {"done": "✓", "running": "▶", "ready": "·", "failed": "✗"}
    for t in tasks:
        bar = markers.get(str(t.get("status", "")).lower(), "?")
        print(f" {bar} {t.get('id', '?'):14} {t.get('assignee', '?'):20} "
              f"{t.get('title', '')[:60]}")

    if issues:
        print("\n ⚠ ISSUES:", file=sys.stderr)
        for i in issues:
            print(f" {i}", file=sys.stderr)


def main():
    ap = argparse.ArgumentParser(description=__doc__,
                                 formatter_class=argparse.RawDescriptionHelpFormatter)
    ap.add_argument("--tenant", required=True,
                    help="Project tenant slug to monitor")
    ap.add_argument("--interval", type=int, default=30,
                    help="Poll interval in seconds (default: 30)")
    ap.add_argument("--once", action="store_true",
                    help="Print one snapshot and exit (no polling loop)")
    args = ap.parse_args()

    if not hermes_available():
        print("ERROR: 'hermes' CLI not found in PATH", file=sys.stderr)
        sys.exit(1)

    if args.once:
        tasks, issues = snapshot(args.tenant)
        print_snapshot(tasks, issues)
        sys.exit(0 if not issues else 2)

    print(f"Monitoring tenant '{args.tenant}' every {args.interval}s. "
          "Ctrl-C to exit.")
    try:
        while True:
            tasks, issues = snapshot(args.tenant)
            print_snapshot(tasks, issues)
            time.sleep(args.interval)
    except KeyboardInterrupt:
        print("\nStopped.")


if __name__ == "__main__":
    main()
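The monitor's "stuck task" rule (a running task with no heartbeat for two minutes) depends on comparing timestamps in the same timezone. The sketch below shows one way to do that comparison with timezone-aware datetimes. `parse_timestamp` and `is_stale` are hypothetical helper names, the sample timestamps are made up, and the ISO 8601 format with an optional trailing `Z` is an assumption about what the kanban reports.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware datetime."""
    # A trailing "Z" means UTC; rewrite it as an explicit offset so that
    # fromisoformat() accepts it on older Python versions too.
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    dt = datetime.fromisoformat(text)
    # Assumption: a timestamp with no offset is in the machine's local time.
    return dt if dt.tzinfo is not None else dt.astimezone()


def is_stale(heartbeat_at: str, now: datetime,
             limit: timedelta = timedelta(minutes=2)) -> bool:
    """True when the last heartbeat is older than `limit`."""
    return now - parse_timestamp(heartbeat_at) > limit


# A fixed "now" keeps the example deterministic.
now = datetime(2026, 5, 8, 3, 5, 0, tzinfo=timezone.utc)
print(is_stale("2026-05-08T03:01:47Z", now))       # True: 3 min 13 s old
print(is_stale("2026-05-08T03:04:30+00:00", now))  # False: 30 s old
```

Both sides of the subtraction are timezone-aware here, so a UTC heartbeat compares correctly even when the monitor runs in another timezone.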