mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
docs(website): dedicated page per bundled + optional skill (#14929)

Generates a full dedicated Docusaurus page for every one of the 132 skills
(73 bundled + 59 optional) under website/docs/user-guide/skills/{bundled,optional}/<category>/.
Each page carries the skill's description, metadata (version, author, license,
dependencies, platform gating, tags, related skills cross-linked to their own
pages), and the complete SKILL.md body that Hermes loads at runtime.

Previously the two catalog pages just listed skills with a one-line blurb and
no way to see what the skill actually did — users had to go read the source
repo. Now every skill has a browsable, searchable, cross-linked reference in
the docs.

- website/scripts/generate-skill-docs.py — generator that reads skills/ and
  optional-skills/, writes per-skill pages, regenerates both catalog indexes,
  and rewrites the Skills section of sidebars.ts. Handles MDX escaping
  (outside fenced code blocks: curly braces, unsafe HTML-ish tags) and
  rewrites relative references/*.md links to point at the GitHub source.
- website/docs/reference/skills-catalog.md — regenerated; each row links to
  the new dedicated page.
- website/docs/reference/optional-skills-catalog.md — same.
- website/sidebars.ts — Skills section now has Bundled / Optional subtrees
  with one nested category per skill folder.
- .github/workflows/{docs-site-checks,deploy-site}.yml — run the generator
  before docusaurus build so CI stays in sync with the source SKILL.md files.

Build verified locally with `npx docusaurus build`. Only remaining warnings
are pre-existing broken link/anchor issues in unrelated pages.

parent eb93f88e1d
commit 0f6eabb890
139 changed files with 43523 additions and 306 deletions
101
website/docs/user-guide/skills/bundled/media/media-gif-search.md
Normal file
@@ -0,0 +1,101 @@
---
title: "Gif Search — Search and download GIFs from Tenor using curl"
sidebar_label: "Gif Search"
description: "Search and download GIFs from Tenor using curl"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Gif Search

Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.

## Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/media/gif-search` |
| Version | `1.1.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `GIF`, `Media`, `Search`, `Tenor`, `API` |

## Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

# GIF Search (Tenor API)

Search and download GIFs directly via the Tenor API using curl. No extra tools needed beyond `curl` and `jq`.

## Setup

Set your Tenor API key in your environment (add to `~/.hermes/.env`):

```bash
TENOR_API_KEY=your_key_here
```

Get a free API key at https://developers.google.com/tenor/guides/quickstart. The key is created in the Google Cloud Console, is free, and has generous rate limits.

## Prerequisites

- `curl` and `jq` (`curl` ships with macOS and most Linux distributions; `jq` may need to be installed separately)
- `TENOR_API_KEY` environment variable

## Search for GIFs

```bash
# Search and get GIF URLs
curl -s "https://tenor.googleapis.com/v2/search?q=thumbs+up&limit=5&key=${TENOR_API_KEY}" | jq -r '.results[].media_formats.gif.url'

# Get smaller/preview versions
curl -s "https://tenor.googleapis.com/v2/search?q=nice+work&limit=3&key=${TENOR_API_KEY}" | jq -r '.results[].media_formats.tinygif.url'
```

## Download a GIF

```bash
# Search and download the top result
URL=$(curl -s "https://tenor.googleapis.com/v2/search?q=celebration&limit=1&key=${TENOR_API_KEY}" | jq -r '.results[0].media_formats.gif.url')
curl -sL "$URL" -o celebration.gif
```

## Get Full Metadata

```bash
curl -s "https://tenor.googleapis.com/v2/search?q=cat&limit=3&key=${TENOR_API_KEY}" | jq '.results[] | {title: .title, url: .media_formats.gif.url, preview: .media_formats.tinygif.url, dimensions: .media_formats.gif.dims}'
```

## API Parameters

| Parameter | Description |
|-----------|-------------|
| `q` | Search query (URL-encode spaces as `+`) |
| `limit` | Max results (1-50, default 20) |
| `key` | API key (from `$TENOR_API_KEY` env var) |
| `media_filter` | Filter formats: `gif`, `tinygif`, `mp4`, `tinymp4`, `webm` |
| `contentfilter` | Safety: `off`, `low`, `medium`, `high` |
| `locale` | Language: `en_US`, `es`, `fr`, etc. |

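
The parameters above can also be assembled programmatically. The following is a minimal Python sketch, not part of the skill itself: `build_search_url` is a hypothetical helper name, and the only facts taken from this page are the endpoint, the parameter names, and the 1-50 range for `limit`.

```python
from urllib.parse import urlencode

# Endpoint as used in the curl examples on this page.
TENOR_SEARCH_URL = "https://tenor.googleapis.com/v2/search"


def build_search_url(query, api_key, limit=5, **extra):
    """Return a Tenor v2 search URL for `query` (hypothetical helper).

    `extra` passes through optional parameters from the table,
    for example contentfilter="medium" or locale="en_US".
    """
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    params = {"q": query, "limit": limit, "key": api_key, **extra}
    # urlencode applies quote_plus by default: spaces become "+",
    # other reserved characters become %XX escapes.
    return f"{TENOR_SEARCH_URL}?{urlencode(params)}"
```

Calling `build_search_url("thumbs up", api_key)` yields the same query string shape as the first curl example, so the result can be handed to any HTTP client.
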
## Available Media Formats

Each result has multiple formats under `.media_formats`:

| Format | Use case |
|--------|----------|
| `gif` | Full quality GIF |
| `tinygif` | Small preview GIF |
| `mp4` | Video version (smaller file size) |
| `tinymp4` | Small preview video |
| `webm` | WebM video |
| `nanogif` | Tiny thumbnail |

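
Because a given result may not carry every format, it can help to fall back through a preference list. This is a small Python sketch under that assumption; `pick_media_url` and the `sample` object are hypothetical and only mirror the `.media_formats.<name>.url` shape used by the `jq` filters above.

```python
def pick_media_url(result, preferred=("tinygif", "gif")):
    """Return the URL of the first format in `preferred` that the result offers."""
    formats = result.get("media_formats", {})
    for name in preferred:
        entry = formats.get(name)
        if entry and entry.get("url"):
            return entry["url"]
    return None  # none of the preferred formats is present


# Hypothetical, trimmed-down result object for illustration only.
sample = {
    "media_formats": {
        "gif": {"url": "https://example.com/full.gif", "dims": [498, 280]},
        "mp4": {"url": "https://example.com/clip.mp4", "dims": [498, 280]},
    }
}
```

With `sample`, the default preference skips the missing `tinygif` entry and returns the `gif` URL.
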
## Notes

- URL-encode the query: spaces as `+`, special chars as `%XX`
- For sending in chat, `tinygif` URLs are lighter weight
- GIF URLs can be used directly in markdown: `![alt](url)`

188
website/docs/user-guide/skills/bundled/media/media-heartmula.md
Normal file
@@ -0,0 +1,188 @@
---
title: "Heartmula — Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
sidebar_label: "Heartmula"
description: "Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Heartmula

Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support.

## Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/media/heartmula` |
| Version | `1.0.0` |
| Tags | `music`, `audio`, `generation`, `ai`, `heartmula`, `heartcodec`, `lyrics`, `songs` |
| Related skills | `audiocraft` |

## Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

# HeartMuLa - Open-Source Music Generation

## Overview
HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generate music conditioned on lyrics and tags. It is an open-source counterpart to Suno. Includes:
- **HeartMuLa** - Music language model (3B/7B) for generation from lyrics + tags
- **HeartCodec** - 12.5Hz music codec for high-fidelity audio reconstruction
- **HeartTranscriptor** - Whisper-based lyrics transcription
- **HeartCLAP** - Audio-text alignment model

## When to Use
- User wants to generate music/songs from text descriptions
- User wants an open-source Suno alternative
- User wants local/offline music generation
- User asks about HeartMuLa, heartlib, or AI music generation

## Hardware Requirements
- **Minimum**: 8GB VRAM with `--lazy_load true` (loads/unloads models sequentially)
- **Recommended**: 16GB+ VRAM for comfortable single-GPU usage
- **Multi-GPU**: Use `--mula_device cuda:0 --codec_device cuda:1` to split across GPUs
- 3B model with lazy_load peaks at ~6.2GB VRAM

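
One way to turn the bullets above into a flag choice is sketched below in Python. This is an illustration only: `device_flags` is a hypothetical helper, and treating 8GB and 16GB as hard cut-offs is an assumption read off the "Minimum" and "Recommended" lines, not behaviour of heartlib itself.

```python
def device_flags(vram_gb, gpu_count=1):
    """Suggest extra CLI flags for the available hardware (hypothetical helper)."""
    if gpu_count >= 2:
        # Multi-GPU: put the language model and the codec on separate devices.
        # Per-GPU VRAM is not checked in this branch.
        return ["--mula_device", "cuda:0", "--codec_device", "cuda:1"]
    if vram_gb < 8:
        raise ValueError("below the stated 8GB VRAM minimum")
    if vram_gb < 16:
        # Assumed cut-off: below the recommended 16GB, load models sequentially.
        return ["--lazy_load", "true"]
    return []  # enough VRAM to keep both models resident
```

The returned list can be appended to the generation command line.
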
## Installation Steps

### 1. Clone Repository
```bash
cd ~/  # or desired directory
git clone https://github.com/HeartMuLa/heartlib.git
cd heartlib
```

### 2. Create Virtual Environment (Python 3.10 required)
```bash
uv venv --python 3.10 .venv
. .venv/bin/activate
uv pip install -e .
```

### 3. Fix Dependency Compatibility Issues

**IMPORTANT**: As of Feb 2026, the pinned dependencies have conflicts with newer packages. Apply these fixes:

```bash
# Upgrade datasets (old version incompatible with current pyarrow)
uv pip install --upgrade datasets

# Upgrade transformers (needed for huggingface-hub 1.x compatibility)
uv pip install --upgrade transformers
```

### 4. Patch Source Code (Required for transformers 5.x)

**Patch 1 - RoPE cache fix** in `src/heartlib/heartmula/modeling_heartmula.py`:

In the `setup_caches` method of the `HeartMuLa` class, add RoPE reinitialization after the `reset_caches` try/except block and before the `with device:` block:

```python
# Re-initialize RoPE caches that were skipped during meta-device loading
from torchtune.models.llama3_1._position_embeddings import Llama3ScaledRoPE
for module in self.modules():
    if isinstance(module, Llama3ScaledRoPE) and not module.is_cache_built:
        module.rope_init()
        module.to(device)
```

**Why**: `from_pretrained` creates the model on the meta device first; `Llama3ScaledRoPE.rope_init()` skips cache building on meta tensors, and the cache is never rebuilt after the weights are loaded onto the real device.

**Patch 2 - HeartCodec loading fix** in `src/heartlib/pipelines/music_generation.py`:

Add `ignore_mismatched_sizes=True` to ALL `HeartCodec.from_pretrained()` calls (there are 2: the eager load in `__init__` and the lazy load in the `codec` property).

**Why**: VQ codebook `initted` buffers have shape `[1]` in the checkpoint vs `[]` in the model. Same data, just a one-element 1-D tensor vs a 0-d (scalar) tensor. Safe to ignore.

### 5. Download Model Checkpoints
```bash
cd heartlib  # project root
hf download --local-dir './ckpt' 'HeartMuLa/HeartMuLaGen'
hf download --local-dir './ckpt/HeartMuLa-oss-3B' 'HeartMuLa/HeartMuLa-oss-3B-happy-new-year'
hf download --local-dir './ckpt/HeartCodec-oss' 'HeartMuLa/HeartCodec-oss-20260123'
```

All 3 can be downloaded in parallel. Total size is several GB.

## GPU / CUDA

HeartMuLa uses CUDA by default (`--mula_device cuda --codec_device cuda`). No extra setup needed if the user has an NVIDIA GPU with PyTorch CUDA support installed.

- The installed `torch==2.4.1` includes CUDA 12.1 support out of the box
- `torchtune` may report version `0.4.0+cpu` — this is just package metadata, it still uses CUDA via PyTorch
- To verify GPU is being used, look for "CUDA memory" lines in the output (e.g. "CUDA memory before unloading: 6.20 GB")
- **No GPU?** You can run on CPU with `--mula_device cpu --codec_device cpu`, but expect generation to be **extremely slow** (potentially 30-60+ minutes for a single song vs ~4 minutes on GPU). CPU mode also requires significant RAM (~12GB+ free). If the user has no NVIDIA GPU, recommend using a cloud GPU service (Google Colab free tier with T4, Lambda Labs, etc.) or the online demo at https://heartmula.github.io/ instead.

## Usage

### Basic Generation
```bash
cd heartlib
. .venv/bin/activate
python ./examples/run_music_generation.py \
  --model_path=./ckpt \
  --version="3B" \
  --lyrics="./assets/lyrics.txt" \
  --tags="./assets/tags.txt" \
  --save_path="./assets/output.mp3" \
  --lazy_load true
```

### Input Formatting

**Tags** (comma-separated, no spaces):
```
piano,happy,wedding,synthesizer,romantic
```
or
```
rock,energetic,guitar,drums,male-vocal
```

**Lyrics** (use bracketed structural tags):
```
[Intro]

[Verse]
Your lyrics here...

[Chorus]
Chorus lyrics...

[Bridge]
Bridge lyrics...

[Outro]
```

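
The two input files can be produced from structured data. The Python sketch below is illustrative only: `format_tags` and `format_lyrics` are hypothetical helpers, and the output simply follows the comma-separated and bracketed layouts shown above.

```python
def format_tags(tags):
    """Join tags with commas and no surrounding spaces, e.g. 'piano,happy'."""
    cleaned = [t.strip() for t in tags if t.strip()]
    for t in cleaned:
        if "," in t:
            raise ValueError(f"tag contains a comma: {t!r}")
    return ",".join(cleaned)


def format_lyrics(sections):
    """Render (section_name, text) pairs as bracketed blocks separated by blank lines."""
    blocks = []
    for name, text in sections:
        header = f"[{name}]"
        body = text.strip()
        # Sections such as [Intro] or [Outro] may have no lyric lines.
        blocks.append(f"{header}\n{body}" if body else header)
    return "\n\n".join(blocks) + "\n"
```

Writing the two return values to `./assets/tags.txt` and `./assets/lyrics.txt` gives files in the layout the basic generation command expects.
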
### Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--max_audio_length_ms` | 240000 | Max length in ms (240s = 4 min) |
| `--topk` | 50 | Top-k sampling |
| `--temperature` | 1.0 | Sampling temperature |
| `--cfg_scale` | 1.5 | Classifier-free guidance scale |
| `--lazy_load` | false | Load/unload models on demand (saves VRAM) |
| `--mula_dtype` | bfloat16 | Dtype for HeartMuLa (bf16 recommended) |
| `--codec_dtype` | float32 | Dtype for HeartCodec (fp32 recommended for quality) |

### Performance
- RTF (Real-Time Factor) ≈ 1.0 — a 4-minute song takes ~4 minutes to generate
- Output: MP3, 48kHz stereo, 128kbps

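
The arithmetic behind these figures is simple enough to sketch in Python. Both function names are hypothetical, and the default RTF of 1.0 is the approximate value quoted above, so the estimate is a rough guide rather than a guarantee.

```python
def max_audio_length_ms(seconds):
    """Convert a target song length in seconds to a --max_audio_length_ms value."""
    return int(round(seconds * 1000))


def estimated_generation_seconds(audio_seconds, rtf=1.0):
    """Estimate wall-clock generation time: audio duration times the real-time factor."""
    return audio_seconds * rtf
```

For the default 4-minute cap, `max_audio_length_ms(240)` gives the documented `240000`, and at RTF 1.0 the estimated generation time equals the audio length.
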
## Pitfalls
1. **Do NOT use bf16 for HeartCodec** — degrades audio quality. Use fp32 (default).
2. **Tags may be ignored** — known issue (#90). Lyrics tend to dominate; experiment with tag ordering.
3. **Triton not available on macOS** — Linux/CUDA only for GPU acceleration.
4. **RTX 5080 incompatibility** reported in upstream issues.
5. The dependency pin conflicts require the manual upgrades and patches described above.

## Links
- Repo: https://github.com/HeartMuLa/heartlib
- Models: https://huggingface.co/HeartMuLa
- Paper: https://arxiv.org/abs/2601.10547
- License: Apache-2.0

@@ -0,0 +1,97 @@
---
title: "Songsee — Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
sidebar_label: "Songsee"
description: "Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Songsee

Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.

## Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/media/songsee` |
| Version | `1.0.0` |
| Author | community |
| License | MIT |
| Tags | `Audio`, `Visualization`, `Spectrogram`, `Music`, `Analysis` |

## Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

# songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

## Prerequisites

Requires [Go](https://go.dev/doc/install):
```bash
go install github.com/steipete/songsee/cmd/songsee@latest
```

Optional: `ffmpeg` for formats beyond WAV/MP3.

## Quick Start

```bash
# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png
```

## Visualization Types

Use `--viz` with comma-separated values:

| Type | Description |
|------|-------------|
| `spectrogram` | Standard frequency spectrogram |
| `mel` | Mel-scaled spectrogram |
| `chroma` | Pitch class distribution |
| `hpss` | Harmonic/percussive separation |
| `selfsim` | Self-similarity matrix |
| `loudness` | Loudness over time |
| `tempogram` | Tempo estimation |
| `mfcc` | Mel-frequency cepstral coefficients |
| `flux` | Spectral flux (onset detection) |

Multiple `--viz` types render as a grid in a single image.

## Common Flags

| Flag | Description |
|------|-------------|
| `--viz` | Visualization types (comma-separated) |
| `--style` | Color palette: `classic`, `magma`, `inferno`, `viridis`, `gray` |
| `--width` / `--height` | Output image dimensions |
| `--window` / `--hop` | FFT window and hop size |
| `--min-freq` / `--max-freq` | Frequency range filter |
| `--start` / `--duration` | Time slice of the audio |
| `--format` | Output format: `jpg` or `png` |
| `-o` | Output file path |

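
When `songsee` is driven from a script, building the argument list explicitly avoids shell-quoting problems. The Python sketch below is an assumption-laden illustration: `songsee_args` is a hypothetical helper, and it uses only the flags and visualization names listed on this page.

```python
# Visualization names as listed in the "Visualization Types" table.
KNOWN_VIZ = (
    "spectrogram", "mel", "chroma", "hpss", "selfsim",
    "loudness", "tempogram", "mfcc", "flux",
)


def songsee_args(audio_path, output_path=None, viz=(), start=None,
                 duration=None, style=None):
    """Build an argv list for songsee (hypothetical helper)."""
    unknown = [v for v in viz if v not in KNOWN_VIZ]
    if unknown:
        raise ValueError(f"unknown visualization type(s): {unknown}")
    args = ["songsee", audio_path]
    if viz:
        args += ["--viz", ",".join(viz)]  # comma-separated, rendered as one grid
    if start is not None:
        args += ["--start", str(start)]
    if duration is not None:
        args += ["--duration", str(duration)]
    if style:
        args += ["--style", style]
    if output_path:
        args += ["-o", output_path]
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` once `songsee` is on the `PATH`.
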
## Notes

- WAV and MP3 are decoded natively; other formats require `ffmpeg`
- Output images can be inspected with `vision_analyze` for automated audio analysis
- Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines

@@ -0,0 +1,88 @@
---
title: "Youtube Content"
sidebar_label: "Youtube Content"
description: "Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts)"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Youtube Content

Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video.

## Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/media/youtube-content` |

## Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

# YouTube Content Tool

Extract transcripts from YouTube videos and convert them into useful formats.

## Setup

```bash
pip install youtube-transcript-api
```

## Helper Script

`SKILL_DIR` is the directory containing this SKILL.md file. The script accepts any standard YouTube URL format, short links (youtu.be), shorts, embeds, live links, or a raw 11-character video ID.

```bash
# JSON output with metadata
python3 SKILL_DIR/scripts/fetch_transcript.py "https://youtube.com/watch?v=VIDEO_ID"

# Plain text (good for piping into further processing)
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --text-only

# With timestamps
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --timestamps

# Specific language with fallback chain
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --language tr,en
```

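
To make the accepted input forms concrete, here is one way such normalization could look in Python. This is a sketch, not the code of `fetch_transcript.py`: `extract_video_id` is a hypothetical function, and the set of hosts and path prefixes is an assumption based on the forms named above.

```python
import re
from urllib.parse import urlparse, parse_qs

_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value):
    """Return the 11-character video ID from a URL or raw ID, or None."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # already a raw video ID
    parsed = urlparse(value if "://" in value else "https://" + value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]
    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]  # short link: youtu.be/<id>
    elif host in ("youtube.com", "m.youtube.com"):
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            candidate = parts[1]
    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Returning `None` for unrecognized input lets the caller relay a clear "verify the URL" message instead of attempting a fetch.
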
## Output Formats

After fetching the transcript, format it based on what the user asks for:

- **Chapters**: Group by topic shifts, output timestamped chapter list
- **Summary**: Concise 5-10 sentence overview of the entire video
- **Chapter summaries**: Chapters with a short paragraph summary for each
- **Thread**: Twitter/X thread format — numbered posts, each under 280 chars
- **Blog post**: Full article with title, sections, and key takeaways
- **Quotes**: Notable quotes with timestamps

### Example — Chapters Output

```
00:00 Introduction — host opens with the problem statement
03:45 Background — prior work and why existing solutions fall short
12:20 Core method — walkthrough of the proposed approach
24:10 Results — benchmark comparisons and key takeaways
31:55 Q&A — audience questions on scalability and next steps
```

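
The `MM:SS` prefixes in the example can be derived from transcript start times given in seconds. The Python sketch below assumes that input shape; `format_timestamp` and `format_chapters` are hypothetical helpers, and switching to `H:MM:SS` for videos over an hour is a design choice made here, not something the skill specifies.

```python
def format_timestamp(seconds):
    """Format a start time in seconds as MM:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def format_chapters(chapters):
    """Render (start_seconds, title) pairs as a timestamped chapter list."""
    return "\n".join(f"{format_timestamp(start)} {title}" for start, title in chapters)
```

For instance, a chapter starting at 225 seconds is rendered with the prefix `03:45`.
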
## Workflow

1. **Fetch** the transcript using the helper script with `--text-only --timestamps`.
2. **Validate**: confirm the output is non-empty and in the expected language. If empty, retry without `--language` to get any available transcript. If still empty, tell the user the video likely has transcripts disabled.
3. **Chunk if needed**: if the transcript exceeds ~50K characters, split into overlapping chunks (~40K with 2K overlap) and summarize each chunk before merging.
4. **Transform** into the requested output format. If the user did not specify a format, default to a summary.
5. **Verify**: re-read the transformed output to check for coherence, correct timestamps, and completeness before presenting.

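
Step 3 can be sketched as a small Python function. The name `chunk_text` is hypothetical; the default sizes are the approximate figures from the workflow (50K threshold, 40K chunks, 2K overlap), and splitting on raw character offsets rather than sentence boundaries is a simplification.

```python
def chunk_text(text, chunk_size=40_000, overlap=2_000, threshold=50_000):
    """Split long text into overlapping chunks; short text is returned whole."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if len(text) <= threshold:
        return [text]
    step = chunk_size - overlap  # each chunk starts `step` characters after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

Each chunk repeats the last 2K characters of the previous one, which gives the summarizer some context across boundaries before the per-chunk summaries are merged.
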
## Error Handling

- **Transcript disabled**: tell the user; suggest they check if subtitles are available on the video page.
- **Private/unavailable video**: relay the error and ask the user to verify the URL.
- **No matching language**: retry without `--language` to fetch any available transcript, then note the actual language to the user.
- **Dependency missing**: run `pip install youtube-transcript-api` and retry.