mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-26 01:01:40 +00:00
feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934)
* feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap

Map active skills to Telegram's slash command menu so users can discover and invoke skills directly. Three changes:

1. Telegram menu now includes active skill commands alongside built-in commands, capped at 100 entries (Telegram Bot API limit). Overflow commands remain callable but hidden from the picker. Logged at startup when cap is hit.
2. New /commands [page] gateway command for paginated browsing of all commands + skills. /help now shows first 10 skill commands and points to /commands for the full list.
3. When a user types a slash command that matches a disabled or uninstalled skill, they get actionable guidance:
   - Disabled: 'Enable it with: hermes skills config'
   - Optional (not installed): 'Install with: hermes skills install official/<path>'

Built on ideas from PR #3921 by @kshitijk4poor.

* chore: move 21 niche skills to optional-skills

Move specialized/niche skills from built-in (skills/) to optional (optional-skills/) to reduce the default skill count. Users can install them with:

    hermes skills install official/<category>/<name>

Moved skills (21):
- mlops: accelerate, chroma, faiss, flash-attention, hermes-atropos-environments, huggingface-tokenizers, instructor, lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning, qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan
- research: domain-intel, duckduckgo-search
- devops: inference-sh cli

Built-in skills: 96 → 75
Optional skills: 22 → 43

* fix: only include repo built-in skills in Telegram menu, not user-installed

User-installed skills (from hub or manually added) stay accessible via /skills and by typing the command directly, but don't get registered in the Telegram slash command picker. Only skills whose SKILL.md is under the repo's skills/ directory are included in the menu.

This keeps the Telegram menu focused on the curated built-in set while user-installed skills remain discoverable through /skills and /commands.
parent 97d6813f51
commit 5ceed021dc
73 changed files with 163 additions and 4 deletions
70
optional-skills/mlops/saelens/references/README.md
Normal file
@@ -0,0 +1,70 @@
# SAELens Reference Documentation

This directory contains comprehensive reference materials for SAELens.

## Contents

- [api.md](api.md) - Complete API reference for SAE, TrainingSAE, and configuration classes
- [tutorials.md](tutorials.md) - Step-by-step tutorials for training and analyzing SAEs
- [papers.md](papers.md) - Key research papers on sparse autoencoders

## Quick Links

- **GitHub Repository**: https://github.com/jbloomAus/SAELens
- **Neuronpedia**: https://neuronpedia.org (browse pre-trained SAE features)
- **HuggingFace SAEs**: Search for tag `saelens`

## Installation

```bash
pip install sae-lens
```

Requirements: Python 3.10+, transformer-lens>=2.0.0

## Basic Usage

```python
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and SAE
model = HookedTransformer.from_pretrained("gpt2-small", device=device)
# The tuple unpacking assumes a SAELens release in which from_pretrained
# returns (sae, cfg_dict, sparsity); check your installed version.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device=device,
)

# Encode activations to sparse features
tokens = model.to_tokens("Hello world")
_, cache = model.run_with_cache(tokens)
activations = cache["resid_pre", 8]  # Residual stream entering block 8

features = sae.encode(activations)    # Sparse feature activations
reconstructed = sae.decode(features)  # Reconstructed activations
```

## Key Concepts

### Sparse Autoencoders
SAEs decompose dense neural activations into sparse, interpretable features:
- **Encoder**: Maps d_model → d_sae (typically 4-16x expansion)
- **ReLU/TopK**: Activation function that enforces sparsity by zeroing most features
- **Decoder**: Maps d_sae → d_model to reconstruct the original activations
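
The three components above can be sketched in plain Python. This is a toy illustration only: `TinySAE`, `matvec`, the random weight initialisation, and the `top_k` argument are hypothetical names chosen for this example, not the SAELens implementation, and no training is performed.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinySAE:
    """Toy sparse autoencoder (hypothetical, untrained): d_model -> d_sae -> d_model."""

    def __init__(self, d_model, expansion=4, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.d_sae = d_model * expansion  # expansion factor, e.g. 4-16x
        self.W_enc = [[rng.gauss(0, 0.5) for _ in range(d_model)]
                      for _ in range(self.d_sae)]
        self.b_enc = [0.0] * self.d_sae
        self.W_dec = [[rng.gauss(0, 0.5) for _ in range(self.d_sae)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x, top_k=None):
        """Encoder followed by ReLU, optionally keeping only the top_k features."""
        pre = [p + b for p, b in zip(matvec(self.W_enc, x), self.b_enc)]
        acts = [max(0.0, p) for p in pre]  # ReLU zeroes negative pre-activations
        if top_k is not None:
            # TopK variant: keep the k largest activations, zero the rest
            order = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)
            keep = set(order[:top_k])
            acts = [a if i in keep else 0.0 for i, a in enumerate(acts)]
        return acts

    def decode(self, features):
        """Decoder: map sparse features back to d_model dimensions."""
        return [r + b for r, b in zip(matvec(self.W_dec, features), self.b_dec)]


sae = TinySAE(d_model=8, expansion=4)
x = [0.1 * i for i in range(8)]
features = sae.encode(x, top_k=5)      # length d_sae = 32, at most 5 non-zero
reconstructed = sae.decode(features)   # length d_model = 8
```

With untrained random weights the reconstruction is meaningless; the sketch only shows the shapes involved and where sparsity is imposed.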

### Training Loss
`Loss = MSE(original, reconstructed) + L1_coefficient × L1(features)`
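
As a worked example of this formula, the sketch below evaluates it on hand-picked toy vectors. The helper names (`mse`, `l1_norm`, `sae_loss`) and the choice to average the squared error over dimensions are assumptions made for illustration; SAELens may normalise the terms differently.

```python
def mse(original, reconstructed):
    """Mean squared error between two equal-length vectors."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def l1_norm(features):
    """Sum of absolute feature activations (the sparsity penalty)."""
    return sum(abs(f) for f in features)


def sae_loss(original, reconstructed, features, l1_coefficient):
    """Reconstruction error plus the weighted L1 sparsity penalty."""
    return mse(original, reconstructed) + l1_coefficient * l1_norm(features)


original = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 2.0, 3.0, 2.0]   # one coordinate is off by 2
features = [0.0, 0.5, 0.0, 1.5]        # two active features

# MSE = 4 / 4 = 1.0, L1 = 2.0, so the loss is about 1.0 + 0.1 * 2.0 = 1.2
loss = sae_loss(original, reconstructed, features, l1_coefficient=0.1)
```

Raising `l1_coefficient` makes active features more expensive, trading reconstruction quality for sparsity.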

### Key Metrics
- **L0**: Average number of features active per token (target: 50-200)
- **CE Loss Score**: Share of the original model's cross-entropy loss that is recovered when activations are replaced by the SAE reconstruction (target: 80-95%)
- **Dead Features**: Features that never activate (target: <5% of features)
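
The metrics above can be computed from a batch of feature activations roughly as follows. The function names are hypothetical, and the CE loss score shown here uses one common definition (loss recovered relative to ablating the activations entirely), which is an assumption rather than a statement of how SAELens computes it.

```python
def l0_per_token(feature_acts):
    """Average number of non-zero features per token (one row per token)."""
    active_counts = [sum(1 for a in row if a > 0) for row in feature_acts]
    return sum(active_counts) / len(feature_acts)


def dead_feature_fraction(feature_acts):
    """Fraction of features that never fire on any token in the batch."""
    n_features = len(feature_acts[0])
    fired = [any(row[j] > 0 for row in feature_acts) for j in range(n_features)]
    return 1 - sum(fired) / n_features


def ce_loss_score(clean_loss, spliced_loss, ablated_loss):
    """Share of cross-entropy loss recovered by the SAE reconstruction.

    clean_loss:   loss of the unmodified model
    spliced_loss: loss with activations replaced by the SAE reconstruction
    ablated_loss: loss with activations ablated (assumed baseline)
    """
    return (ablated_loss - spliced_loss) / (ablated_loss - clean_loss)


# Three tokens, four features; feature index 2 never fires
feature_acts = [
    [0.0, 1.2, 0.0, 0.3],
    [0.0, 0.0, 0.0, 2.0],
    [0.5, 0.0, 0.0, 0.1],
]

l0 = l0_per_token(feature_acts)             # (2 + 1 + 2) / 3
dead = dead_feature_fraction(feature_acts)  # 1 of 4 features is dead
score = ce_loss_score(clean_loss=3.0, spliced_loss=3.2, ablated_loss=5.0)  # about 0.9
```

In practice these statistics are accumulated over many batches, since a feature that is silent on a handful of tokens is not necessarily dead.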

## Available Pre-trained SAEs

| Release | Model | Description |
|---------|-------|-------------|
| `gpt2-small-res-jb` | GPT-2 Small | Residual stream SAEs |
| `gemma-2b-res` | Gemma 2B | Residual stream SAEs |
| Various | Search HuggingFace | Community-trained SAEs |