Add RL training configuration and tools

- Updated `.env.example` to include Tinker and WandB API keys for reinforcement learning training.
- Enhanced `model_tools.py` to clarify configuration options and streamline the RL training process.
- Expanded `README.md` with detailed instructions for setting up RL training using Tinker and WandB.
- Modified `hermes_cli` files to integrate RL training tools and ensure proper configuration checks.
- Improved `rl_training_tool.py` to reflect changes in training parameters and configuration management.
This commit is contained in:
teknium1 2026-02-04 09:36:51 -08:00
parent f018999da9
commit f6574978de
7 changed files with 169 additions and 65 deletions

View file

@ -165,3 +165,21 @@ IMAGE_TOOLS_DEBUG=false
# CONTEXT_COMPRESSION_ENABLED=true # Enable auto-compression (default: true)
# CONTEXT_COMPRESSION_THRESHOLD=0.85 # Compress at 85% of context limit
# CONTEXT_COMPRESSION_MODEL=google/gemini-2.0-flash-001 # Fast model for summaries
# =============================================================================
# RL TRAINING (Tinker + Atropos)
# =============================================================================
# Run reinforcement learning training on language models using the Tinker API.
# Requires the rl-server to be running (from tinker-atropos package).
# Tinker API Key - RL training service
# Get at: https://tinker-console.thinkingmachines.ai/keys
TINKER_API_KEY=
# Weights & Biases API Key - Experiment tracking and metrics
# Get at: https://wandb.ai/authorize
WANDB_API_KEY=
# RL API Server URL (default: http://localhost:8080)
# Change if running the rl-server on a different host/port
# RL_API_URL=http://localhost:8080