SAELens Reference Documentation
This directory contains comprehensive reference materials for SAELens.
Contents
- api.md - Complete API reference for SAE, TrainingSAE, and configuration classes
- tutorials.md - Step-by-step tutorials for training and analyzing SAEs
- papers.md - Key research papers on sparse autoencoders
Quick Links
- GitHub Repository: https://github.com/jbloomAus/SAELens
- Neuronpedia: https://neuronpedia.org (browse pre-trained SAE features)
- HuggingFace SAEs: Search for the tag `saelens`
Installation
pip install sae-lens
Requirements: Python 3.10+, transformer-lens>=2.0.0
Basic Usage
from transformer_lens import HookedTransformer
from sae_lens import SAE
# Load model and SAE
model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
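# Older sae-lens releases return a (sae, cfg_dict, sparsity) tuple here;
# newer releases may return only the SAE, so check your installed version.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device="cuda",
)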
# Encode activations to sparse features
tokens = model.to_tokens("Hello world")
_, cache = model.run_with_cache(tokens)
activations = cache["resid_pre", 8]
features = sae.encode(activations) # Sparse feature activations
reconstructed = sae.decode(features) # Reconstructed activations
Key Concepts
Sparse Autoencoders
SAEs decompose dense neural activations into sparse, interpretable features:
- Encoder: Maps d_model → d_sae (typically 4-16x expansion)
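- ReLU/TopK: Activation that zeroes most features; ReLU relies on the L1 penalty for sparsity, while TopK keeps only the k largest activations
- Decoder: Maps d_sae → d_model to reconstruct the original activations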
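The encoder, activation, and decoder steps above can be sketched in plain Python. This is a toy illustration and not the SAELens implementation: the names `ToySAE`, `matvec`, `relu`, and `top_k` are hypothetical, the weights are random rather than trained, and applying TopK after ReLU is one assumed ordering among several used in practice.

```python
import random

def matvec(weights, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def relu(values):
    """Clamp negative entries to zero."""
    return [max(0.0, v) for v in values]

def top_k(values, k):
    """Keep the k largest entries and zero out the rest."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]

class ToySAE:
    """Hypothetical toy SAE with random, untrained weights."""

    def __init__(self, d_model, expansion=4, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.d_sae = d_model * expansion  # e.g. 4-16x wider than d_model
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)]
                      for _ in range(self.d_sae)]
        self.b_enc = [0.0] * self.d_sae
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(self.d_sae)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x, k=None):
        """Map d_model -> d_sae, then apply ReLU (and TopK if k is given)."""
        pre = [p + b for p, b in zip(matvec(self.W_enc, x), self.b_enc)]
        acts = relu(pre)
        return top_k(acts, k) if k is not None else acts

    def decode(self, features):
        """Map d_sae -> d_model to reconstruct the input."""
        return [r + b for r, b in zip(matvec(self.W_dec, features), self.b_dec)]

sae = ToySAE(d_model=8, expansion=4)
x = [0.5, -1.0, 0.25, 0.0, 1.5, -0.75, 0.1, 2.0]
features = sae.encode(x, k=5)
reconstructed = sae.decode(features)
print(len(features), sum(f > 0 for f in features), len(reconstructed))
```

With `k` set, at most `k` features are non-zero per input; without it, sparsity depends entirely on how the weights were trained.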
Training Loss
Loss = MSE(original, reconstructed) + L1_coefficient × L1(features)
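As a worked instance of this formula, the sketch below computes the loss for a single activation vector. The function names are hypothetical, and averaging the squared error over dimensions is an assumption; real training code may sum instead, or scale the L1 term differently.

```python
def mse(original, reconstructed):
    """Mean squared error between two equal-length vectors (assumed: mean, not sum)."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def l1_norm(features):
    """Sum of absolute feature activations."""
    return sum(abs(f) for f in features)

def sae_loss(original, reconstructed, features, l1_coefficient):
    """Reconstruction error plus the weighted sparsity penalty."""
    return mse(original, reconstructed) + l1_coefficient * l1_norm(features)

original = [1.0, 2.0]
reconstructed = [1.0, 1.0]
features = [0.0, 3.0, 0.0, 1.0]

# MSE = (0 + 1) / 2 = 0.5, L1 = 4.0, so loss = 0.5 + 0.5 * 4.0 = 2.5
loss = sae_loss(original, reconstructed, features, l1_coefficient=0.5)
print(loss)
```

Raising `l1_coefficient` pushes training toward fewer active features at the cost of reconstruction quality.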
Key Metrics
- L0: Average number of active (non-zero) features per token (target: 50-200)
- CE Loss Score: Fraction of the original model's cross-entropy loss recovered when activations are replaced by SAE reconstructions (target: 80-95%)
- Dead Features: Features that never activate (target: <5%)
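The three metrics above can be computed roughly as follows. This is a sketch with hypothetical function names: the CE loss score uses a commonly cited definition (loss recovered relative to zero-ablating the activations), which is assumed here rather than taken from this document, and the tiny batch stands in for the many tokens a real evaluation would use.

```python
def l0(feature_batch):
    """Average number of non-zero features per token."""
    counts = [sum(1 for f in feats if f > 0) for feats in feature_batch]
    return sum(counts) / len(feature_batch)

def dead_fraction(feature_batch):
    """Fraction of features that never activate anywhere in the batch."""
    d_sae = len(feature_batch[0])
    fired = [any(feats[i] > 0 for feats in feature_batch) for i in range(d_sae)]
    return 1 - sum(fired) / d_sae

def ce_loss_score(ce_clean, ce_with_sae, ce_zero_ablated):
    """Assumed definition: share of the ablation gap that the SAE recovers."""
    return (ce_zero_ablated - ce_with_sae) / (ce_zero_ablated - ce_clean)

# Three tokens, four features each (illustrative values only)
batch = [
    [0.0, 1.2, 0.0, 0.3],
    [0.0, 0.0, 0.0, 2.0],
    [0.0, 0.5, 0.0, 0.0],
]

print(l0(batch))             # (2 + 1 + 1) / 3 active features per token
print(dead_fraction(batch))  # features 0 and 2 never fire
print(ce_loss_score(ce_clean=3.0, ce_with_sae=3.5, ce_zero_ablated=8.0))
```

A score of 1.0 would mean the SAE reconstruction leaves the model's loss unchanged, and 0.0 would mean it is no better than zeroing the activations.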
Available Pre-trained SAEs
| Release | Model | Description |
|---|---|---|
| `gpt2-small-res-jb` | GPT-2 Small | Residual stream SAEs |
| `gemma-2b-res` | Gemma 2B | Residual stream SAEs |
| Various | Search HuggingFace | Community-trained SAEs |