hermes-agent/website/docs/user-guide/skills/bundled/media/media-songsee.md
Teknium 0f6eabb890
docs(website): dedicated page per bundled + optional skill (#14929)
Generates a full dedicated Docusaurus page for every one of the 132 skills
(73 bundled + 59 optional) under website/docs/user-guide/skills/{bundled,optional}/<category>/.
Each page carries the skill's description, metadata (version, author, license,
dependencies, platform gating, tags, related skills cross-linked to their own
pages), and the complete SKILL.md body that Hermes loads at runtime.

Previously the two catalog pages just listed skills with a one-line blurb and
no way to see what the skill actually did — users had to go read the source
repo. Now every skill has a browsable, searchable, cross-linked reference in
the docs.

- website/scripts/generate-skill-docs.py — generator that reads skills/ and
  optional-skills/, writes per-skill pages, regenerates both catalog indexes,
  and rewrites the Skills section of sidebars.ts. Handles MDX escaping
  (outside fenced code blocks: curly braces, unsafe HTML-ish tags) and
  rewrites relative references/*.md links to point at the GitHub source.
- website/docs/reference/skills-catalog.md — regenerated; each row links to
  the new dedicated page.
- website/docs/reference/optional-skills-catalog.md — same.
- website/sidebars.ts — Skills section now has Bundled / Optional subtrees
  with one nested category per skill folder.
- .github/workflows/{docs-site-checks,deploy-site}.yml — run the generator
  before docusaurus build so CI stays in sync with the source SKILL.md files.

Build verified locally with `npx docusaurus build`. Only remaining warnings
are pre-existing broken link/anchor issues in unrelated pages.
2026-04-23 22:22:11 -07:00

3 KiB

title sidebar_label description
Songsee — Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc Songsee Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

Songsee

Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.

Skill metadata

Source Bundled (installed by default)
Path skills/media/songsee
Version 1.0.0
Author community
License MIT
Tags Audio, Visualization, Spectrogram, Music, Analysis

Reference: full SKILL.md

:::info The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. :::

songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go:

go install github.com/steipete/songsee/cmd/songsee@latest

Optional: ffmpeg for formats beyond WAV/MP3.

Quick Start

# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png

Visualization Types

Use --viz with comma-separated values:

Type Description
spectrogram Standard frequency spectrogram
mel Mel-scaled spectrogram
chroma Pitch class distribution
hpss Harmonic/percussive separation
selfsim Self-similarity matrix
loudness Loudness over time
tempogram Tempo estimation
mfcc Mel-frequency cepstral coefficients
flux Spectral flux (onset detection)

Multiple --viz types render as a grid in a single image.

Common Flags

Flag Description
--viz Visualization types (comma-separated)
--style Color palette: classic, magma, inferno, viridis, gray
--width / --height Output image dimensions
--window / --hop FFT window and hop size
--min-freq / --max-freq Frequency range filter
--start / --duration Time slice of the audio
--format Output format: jpg or png
-o Output file path

Notes

  • WAV and MP3 are decoded natively; other formats require ffmpeg
  • Output images can be inspected with vision_analyze for automated audio analysis
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines