refactor: reorganize skills into sub-categories

The skills directory was getting disorganized: mlops alone had 40
skills in a flat list, and 12 categories were singletons with just
one skill each.

Code change:
- prompt_builder.py: Support sub-categories in skill scanner.
  skills/mlops/training/axolotl/SKILL.md now shows as category
  'mlops/training' instead of just 'mlops'. Backwards-compatible
  with existing flat structure.
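
  A minimal sketch of the path-to-category mapping described above. This
  is not the actual prompt_builder.py code: the function name
  `skill_category` and the `skills_root` parameter are hypothetical,
  and it assumes the category is every directory between the skills
  root and the skill's own directory.

```python
from pathlib import PurePosixPath


def skill_category(skill_md: str, skills_root: str = "skills") -> str:
    """Derive a category from the path of a SKILL.md file (illustrative only).

    Assumption: the last two path components are the skill directory and
    SKILL.md itself; everything before them is the category.
    """
    rel = PurePosixPath(skill_md).relative_to(skills_root)
    # e.g. rel.parts == ('mlops', 'training', 'axolotl', 'SKILL.md')
    category_parts = rel.parts[:-2]  # drop the skill dir and SKILL.md
    return "/".join(category_parts)


# Nested layout: the sub-category becomes part of the category name.
nested = skill_category("skills/mlops/training/axolotl/SKILL.md")
# Flat layout: a single-level category is returned unchanged.
flat = skill_category("skills/mlops/axolotl/SKILL.md")
print(nested, flat)
```

  Because the category is simply the joined intermediate directories,
  a flat skills/<category>/<skill>/SKILL.md layout yields the same
  single-level name as before, which is one way the backwards
  compatibility could be achieved.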

Split mlops (40 skills) into 7 sub-categories (38 skills; the 2
misplaced skills listed further down move out of mlops):
- mlops/training (12): accelerate, axolotl, flash-attention,
  grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning,
  simpo, slime, torchtitan, trl-fine-tuning, unsloth
- mlops/inference (8): gguf, guidance, instructor, llama-cpp,
  obliteratus, outlines, tensorrt-llm, vllm
- mlops/models (6): audiocraft, clip, llava, segment-anything,
  stable-diffusion, whisper
- mlops/vector-databases (4): chroma, faiss, pinecone, qdrant
- mlops/evaluation (5): huggingface-tokenizers,
  lm-evaluation-harness, nemo-curator, saelens, weights-and-biases
- mlops/cloud (2): lambda-labs, modal
- mlops/research (1): dspy

Merged singleton categories:
- gifs → media (gif-search joins youtube-content)
- music-creation → media (heartmula, songsee)
- diagramming → creative (excalidraw joins ascii-art)
- ocr-and-documents → productivity
- domain → research (domain-intel)
- feeds → research (blogwatcher)
- market-data → research (polymarket)

Fixed misplaced skills:
- mlops/code-review → software-development (not ML-specific)
- mlops/ml-paper-writing → research (academic writing)

Added DESCRIPTION.md files for all new/updated categories.
teknium1 2026-03-09 03:35:53 -07:00
parent d6c710706f
commit 732c66b0f3
217 changed files with 39 additions and 4 deletions

@@ -1,26 +0,0 @@
For citing papers in the ACL Anthology, we provide a single consolidated
BibTeX file containing all of its papers. The bibkeys in these papers are
designed to be semantic in nature: {names}-{year}-{words}, where
- `names` is the concatenated last names of the authors when there is just
  one or two authors, or `lastname-etal` for 3+
- `year` is the four-digit year
- `words` is the first significant word in the title, or more, if necessary,
  to preserve uniqueness

For example, https://aclanthology.org/N04-1035 can be cited as \cite{galley-etal-2004-whats}.

The consolidated file can be downloaded from here:
- https://aclanthology.org/anthology.bib

Unfortunately, as of 2024 or so, this file is now larger than 50 MB, which is Overleaf's
bib file size limit. Consequently, the Anthology shards the file automatically into
49 MB shards.

There are currently (2025) two files:
- https://aclanthology.org/anthology-1.bib
- https://aclanthology.org/anthology-2.bib

You can download these directly in Overleaf via New File -> From External URL,
and then add them to the \bibliography line in acl_latex.tex:
\bibliography{custom,anthology-1,anthology-2}