hermes-agent/skills/media/youtube-content/SKILL.md
Teknium b9bac87d5a feat(skills): declare platforms frontmatter for all 79 undeclared built-in skills
Completes the Windows-gating coverage for the built-in skills/ tree. Every
bundled SKILL.md now carries an explicit platforms: declaration so the
loader (agent.skill_utils.skill_matches_platform) can skip-load skills
that don't fit the current OS.

74 skills declared cross-platform (platforms: [linux, macos, windows]):
  Creative (16): ascii-art, ascii-video, architecture-diagram, baoyu-comic,
    baoyu-infographic, claude-design, creative-ideation, design-md,
    excalidraw, humanizer, manim-video, p5js, pixel-art,
    popular-web-designs, pretext, sketch, songwriting-and-ai-music,
    touchdesigner-mcp
  Autonomous agents: claude-code, codex, hermes-agent, opencode
  Data/devops: jupyter-live-kernel, kanban-orchestrator, kanban-worker,
    webhook-subscriptions, dogfood, codebase-inspection
  GitHub: github-auth, github-code-review, github-issues,
    github-pr-workflow, github-repo-management
  Media: gif-search, heartmula, songsee, spotify, youtube-content
  MCP / email / gaming / notes / smart-home: native-mcp, himalaya,
    pokemon-player, obsidian, openhue
  mlops (non-broken): weights-and-biases, huggingface-hub, llama-cpp,
    outlines, segment-anything-model, dspy, trl-fine-tuning
  Productivity: airtable, google-workspace, linear, maps, nano-pdf,
    notion, ocr-and-documents, powerpoint
  Red-teaming / research: godmode, arxiv, blogwatcher, llm-wiki,
    polymarket
  Software-dev: debugging-hermes-tui-commands, hermes-agent-skill-authoring,
    node-inspect-debugger, plan, requesting-code-review, spike,
    subagent-driven-development, systematic-debugging,
    test-driven-development, writing-plans
  Misc: yuanbao

5 skills gated from Windows (platforms: [linux, macos]):
  mlops/inference/vllm (serving-llms-vllm)
    vLLM is officially Linux-only; Windows requires WSL.
  mlops/training/axolotl
    Axolotl's flash-attn + deepspeed + bitsandbytes stack is Linux-first.
  mlops/training/unsloth
    Requires Triton + xformers + flash-attn — Linux only in practice.
  mlops/models/audiocraft (audiocraft-audio-generation)
    torchaudio ffmpeg backend + encodec dependencies are Linux-first.
  mlops/inference/obliteratus
    Research abliteration workflow; relies on Linux-focused pytorch
    kernels and MLX — no first-class Windows path.

Same strict-over-lenient policy as the optional-skills sweep: when the
underlying tool's Windows support is rough, missing, or WSL-only, gate the
skill. Easier to un-gate after verified Windows support lands than to leak
partial support that manifests as mid-task failures.

Combined with prior commits in this branch, every bundled SKILL.md
(skills/ + optional-skills/) now has a platforms: declaration.
2026-05-08 09:23:27 -07:00

3.1 KiB

name description platforms
youtube-content YouTube transcripts to summaries, threads, blogs.
linux
macos
windows

YouTube Content Tool

When to use

Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video. Transforms transcripts into structured content (chapters, summaries, threads, blog posts).

Extract transcripts from YouTube videos and convert them into useful formats.

Setup

pip install youtube-transcript-api

Helper Script

SKILL_DIR is the directory containing this SKILL.md file. The script accepts any standard YouTube URL format, short links (youtu.be), shorts, embeds, live links, or a raw 11-character video ID.

# JSON output with metadata
python3 SKILL_DIR/scripts/fetch_transcript.py "https://youtube.com/watch?v=VIDEO_ID"

# Plain text (good for piping into further processing)
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --text-only

# With timestamps
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --timestamps

# Specific language with fallback chain
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --language tr,en

Output Formats

After fetching the transcript, format it based on what the user asks for:

  • Chapters: Group by topic shifts, output timestamped chapter list
  • Summary: Concise 5-10 sentence overview of the entire video
  • Chapter summaries: Chapters with a short paragraph summary for each
  • Thread: Twitter/X thread format — numbered posts, each under 280 chars
  • Blog post: Full article with title, sections, and key takeaways
  • Quotes: Notable quotes with timestamps

Example — Chapters Output

00:00 Introduction — host opens with the problem statement
03:45 Background — prior work and why existing solutions fall short
12:20 Core method — walkthrough of the proposed approach
24:10 Results — benchmark comparisons and key takeaways
31:55 Q&A — audience questions on scalability and next steps

Workflow

  1. Fetch the transcript using the helper script with --text-only --timestamps.
  2. Validate: confirm the output is non-empty and in the expected language. If empty, retry without --language to get any available transcript. If still empty, tell the user the video likely has transcripts disabled.
  3. Chunk if needed: if the transcript exceeds ~50K characters, split into overlapping chunks (~40K with 2K overlap) and summarize each chunk before merging.
  4. Transform into the requested output format. If the user did not specify a format, default to a summary.
  5. Verify: re-read the transformed output to check for coherence, correct timestamps, and completeness before presenting.

Error Handling

  • Transcript disabled: tell the user; suggest they check if subtitles are available on the video page.
  • Private/unavailable video: relay the error and ask the user to verify the URL.
  • No matching language: retry without --language to fetch any available transcript, then note the actual language to the user.
  • Dependency missing: run pip install youtube-transcript-api and retry.