Mirror of https://github.com/NousResearch/hermes-agent.git, synced 2026-04-27 01:11:40 +00:00
The skills directory was getting disorganized: mlops alone had 40 skills in a flat list, and 12 categories were singletons with just one skill each.

Code change:
- prompt_builder.py: Support sub-categories in the skill scanner. skills/mlops/training/axolotl/SKILL.md now shows as category 'mlops/training' instead of just 'mlops'. Backwards-compatible with the existing flat structure.

Split mlops (40 skills) into 7 sub-categories:
- mlops/training (12): accelerate, axolotl, flash-attention, grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, torchtitan, trl-fine-tuning, unsloth
- mlops/inference (8): gguf, guidance, instructor, llama-cpp, obliteratus, outlines, tensorrt-llm, vllm
- mlops/models (6): audiocraft, clip, llava, segment-anything, stable-diffusion, whisper
- mlops/vector-databases (4): chroma, faiss, pinecone, qdrant
- mlops/evaluation (5): huggingface-tokenizers, lm-evaluation-harness, nemo-curator, saelens, weights-and-biases
- mlops/cloud (2): lambda-labs, modal
- mlops/research (1): dspy

Merged singleton categories:
- gifs → media (gif-search joins youtube-content)
- music-creation → media (heartmula, songsee)
- diagramming → creative (excalidraw joins ascii-art)
- ocr-and-documents → productivity
- domain → research (domain-intel)
- feeds → research (blogwatcher)
- market-data → research (polymarket)

Fixed misplaced skills:
- mlops/code-review → software-development (not ML-specific)
- mlops/ml-paper-writing → research (academic writing)

Added DESCRIPTION.md files for all new/updated categories.
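The sub-category derivation described above can be sketched roughly as follows. This is a hypothetical helper, not the actual prompt_builder.py code; the function name `skill_category` and the `skills_root` parameter are illustrative assumptions. The idea is simply that the category is every path component between the skills root and the skill's own directory, which handles both nested and flat layouts:

```python
from pathlib import Path

def skill_category(skill_md: str, skills_root: str = "skills") -> str:
    """Derive a skill's category from its SKILL.md path (illustrative sketch)."""
    parts = Path(skill_md).parts
    root_idx = parts.index(skills_root)
    # Drop the skill's own directory and the SKILL.md filename;
    # what remains is the (possibly nested) category path.
    return "/".join(parts[root_idx + 1 : -2])

skill_category("skills/mlops/training/axolotl/SKILL.md")  # → "mlops/training"
skill_category("skills/media/gif-search/SKILL.md")        # → "media" (flat layout still works)
```

Because a flat path simply yields a single-component category, this scheme stays backwards-compatible with the pre-existing structure, as the commit notes.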
12 KiB
Unsloth Documentation
- Unsloth Docs: Train your own model with Unsloth, an open-source framework for LLM fine-tuning and reinforcement learning.
- Beginner? Start here!
- Unsloth Requirements: Here are Unsloth's requirements including system and GPU VRAM requirements.
- FAQ + Is Fine-tuning Right For Me?: If you're unsure whether fine-tuning is right for you, see here! Learn about fine-tuning misconceptions, how it compares to RAG, and more.
- Unsloth Notebooks: Explore our catalog of Unsloth notebooks:
- All Our Models
- Install & Update: Learn to install Unsloth locally or online.
- Updating: To update or use an old version of Unsloth, follow the steps below:
- Pip Install: To install Unsloth locally via Pip, follow the steps below:
- Docker: Install Unsloth using our official Docker container
- Windows Installation: See how to install Unsloth on Windows with or without WSL.
- AMD: Fine-tune with Unsloth on AMD GPUs.
- Conda Install: To install Unsloth locally on Conda, follow the steps below:
- Google Colab: To install and run Unsloth on Google Colab, follow the steps below:
- Fine-tuning LLMs Guide: Learn all the basics and best practices of fine-tuning. Beginner-friendly.
- What Model Should I Use?
- Datasets Guide: Learn how to create & prepare a dataset for fine-tuning.
- LoRA Hyperparameters Guide: Optimal LoRA rank, alpha, number of epochs, batch size & gradient accumulation, QLoRA vs LoRA, target modules and more!
- Tutorial: How to Finetune Llama-3 and Use In Ollama: Beginner's Guide for creating a customized personal assistant (like ChatGPT) to run locally on Ollama
- Reinforcement Learning (RL) Guide: Learn all about Reinforcement Learning (RL) and how to train your own DeepSeek-R1 reasoning model with Unsloth using GRPO. A complete guide from beginner to advanced.
- Tutorial: Train your own Reasoning model with GRPO: Beginner's Guide to transforming a model like Llama 3.1 (8B) into a reasoning model by using Unsloth and GRPO.
- Advanced RL Documentation: Advanced settings and documentation for using Unsloth with GRPO.
- Memory Efficient RL
- RL Reward Hacking: Learn what is Reward Hacking in Reinforcement Learning and how to counter it.
- GSPO Reinforcement Learning: Train with GSPO (Group Sequence Policy Optimization) RL in Unsloth.
- Reinforcement Learning - DPO, ORPO & KTO: To use the reward modelling functions for DPO, GRPO, ORPO or KTO with Unsloth, follow the steps below:
- DeepSeek-OCR: How to Run & Fine-tune: Guide on how to run and fine-tune DeepSeek-OCR locally.
- How to Fine-tune LLMs with Unsloth & Docker: Learn how to fine-tune LLMs or do Reinforcement Learning (RL) with Unsloth's Docker image.
- Vision Reinforcement Learning (VLM RL): Train Vision/multimodal models via GRPO and RL with Unsloth!
- gpt-oss Reinforcement Learning
- Tutorial: How to Train gpt-oss with RL: Learn to train OpenAI gpt-oss with GRPO to autonomously beat 2048 locally or on Colab.
- Unsloth Dynamic GGUFs on Aider Polyglot: Performance of Unsloth Dynamic GGUFs on Aider Polyglot Benchmarks
- Qwen3-VL: How to Run & Fine-tune: Learn to fine-tune and run Qwen3-VL locally with Unsloth.
- gpt-oss: How to Run & Fine-tune: Run & fine-tune OpenAI's new open-source models!
- Tutorial: How to Fine-tune gpt-oss: Learn step-by-step how to train OpenAI gpt-oss locally with Unsloth.
- Long Context gpt-oss Training
- GLM-4.6: How to Run Locally: A guide on how to run Z.ai's new GLM-4.6 model on your own local device!
- IBM Granite 4.0: How to run IBM Granite-4.0 with Unsloth GGUFs on llama.cpp, Ollama and how to fine-tune!
- DeepSeek-V3.1: How to Run Locally: A guide on how to run DeepSeek-V3.1 and Terminus on your own local device!
- Qwen3-Coder: How to Run Locally: Run Qwen3-Coder-30B-A3B-Instruct and 480B-A35B locally with Unsloth Dynamic quants.
- Gemma 3: How to Run & Fine-tune: How to run Gemma 3 effectively with our GGUFs on llama.cpp, Ollama, Open WebUI and how to fine-tune with Unsloth!
- Gemma 3n: How to Run & Fine-tune: Run Google's new Gemma 3n locally with Dynamic GGUFs on llama.cpp, Ollama, Open WebUI and fine-tune with Unsloth!
- Qwen3: How to Run & Fine-tune: Learn to run & fine-tune Qwen3 locally with Unsloth + our Dynamic 2.0 quants
- Qwen3-2507: Run Qwen3-30B-A3B-2507 and 235B-A22B Thinking and Instruct versions locally on your device!
- Tutorials: How To Fine-tune & Run LLMs: Learn how to run and fine-tune models for optimal performance 100% locally with Unsloth.
- DeepSeek-R1-0528: How to Run Locally: A guide on how to run DeepSeek-R1-0528 including Qwen3 on your own local device!
- Magistral: How to Run & Fine-tune: Meet Magistral - Mistral's new reasoning models.
- Llama 4: How to Run & Fine-tune: How to run Llama 4 locally using our dynamic GGUFs, which recover accuracy compared to standard quantization.
- Kimi K2: How to Run Locally: Guide on running Kimi K2 and Kimi-K2-Instruct-0905 on your own local device!
- Grok 2: Run xAI's Grok 2 model locally!
- Devstral: How to Run & Fine-tune: Run and fine-tune Mistral Devstral 1.1, including Small-2507 and 2505.
- DeepSeek-V3-0324: How to Run Locally: How to run DeepSeek-V3-0324 locally using our dynamic quants, which recover accuracy.
- DeepSeek-R1: How to Run Locally: A guide on how you can run our 1.58-bit Dynamic Quants for DeepSeek-R1 using llama.cpp.
- DeepSeek-R1 Dynamic 1.58-bit: See performance comparison tables for Unsloth's Dynamic GGUF Quants vs Standard IMatrix Quants.
- QwQ-32B: How to Run effectively: How to run QwQ-32B effectively with our bug fixes and without endless generations + GGUFs.
- Phi-4 Reasoning: How to Run & Fine-tune: Learn to run & fine-tune Phi-4 reasoning models locally with Unsloth + our Dynamic 2.0 quants
- Running & Saving Models: Learn how to save your finetuned model so you can run it in your favorite inference engine.
- Saving to GGUF: Saving models to 16bit for GGUF so you can use it for Ollama, Jan AI, Open WebUI and more!
- Saving to Ollama
- Saving to vLLM for deployment: Saving models to 16bit for vLLM deployment and serving
- Saving to SGLang for deployment: Saving models to 16bit for SGLang for deployment and serving
- Unsloth Inference: Learn how to run your finetuned model with Unsloth's faster inference.
- Troubleshooting Inference: If you're experiencing issues when running or saving your model.
- vLLM Engine Arguments
- LoRA Hot Swapping Guide
- Text-to-Speech (TTS) Fine-tuning: Learn how to fine-tune TTS & STT voice models with Unsloth.
- Unsloth Dynamic 2.0 GGUFs: A big new upgrade to our Dynamic Quants!
- Vision Fine-tuning: Learn how to fine-tune vision/multimodal LLMs with Unsloth
- Fine-tuning LLMs with NVIDIA DGX Spark and Unsloth: Tutorial on how to fine-tune and do reinforcement learning (RL) with OpenAI gpt-oss on NVIDIA DGX Spark.
- Fine-tuning LLMs with Blackwell, RTX 50 series & Unsloth: Learn how to fine-tune LLMs on NVIDIA's Blackwell RTX 50 series and B200 GPUs with our step-by-step guide.
- Multi-GPU Training with Unsloth: Learn how to fine-tune LLMs on multiple GPUs and parallelism with Unsloth.
- Finetuning from Last Checkpoint: Checkpointing allows you to save your finetuning progress so you can pause it and then continue.
- Troubleshooting & FAQs: Tips to solve issues, and frequently asked questions.
- Chat Templates: Learn the fundamentals and customization options of chat templates, including Conversational, ChatML, ShareGPT, Alpaca formats, and more!
- Quantization-Aware Training (QAT): Quantize models to 4-bit with Unsloth and PyTorch to recover accuracy.
- Unsloth Environment Flags: Advanced flags which may help if you see breaking finetunes, or if you want to turn certain features off.
- Continued Pretraining: Also known as continued finetuning. Unsloth allows you to continually pretrain a model so it can learn a new language.
- Unsloth Benchmarks: Unsloth recorded benchmarks on NVIDIA GPUs.
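As a rough companion to the LoRA Hyperparameters Guide entry above, here are two back-of-the-envelope calculations that come up when choosing LoRA rank and batch settings. This is plain Python, not the Unsloth API; the function names `lora_params` and `effective_batch_size` are illustrative, but the arithmetic itself is standard:

```python
# A LoRA adapter on one (d_out × d_in) weight matrix adds two small matrices,
# A (r × d_in) and B (d_out × r), so it trains r·(d_in + d_out) extra parameters.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    return r * (d_in + d_out)

# The batch size the optimizer effectively sees is the per-device batch size
# multiplied by gradient accumulation steps and the number of GPUs.
def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    return per_device * grad_accum * num_gpus

# Example: a rank-16 adapter on a 4096×4096 projection adds 131,072 params.
print(lora_params(4096, 4096, 16))   # 131072
print(effective_batch_size(2, 4))    # 8
```

This is why the hyperparameters guide treats batch size and gradient accumulation together: raising accumulation steps grows the effective batch without growing VRAM use, at the cost of slower steps.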