hermes-agent/docs/MODAL_BACKEND.md
Jai Suphavadeeprasit 0bc914b00c readme edit
2026-02-06 04:24:39 -05:00

224 lines
5.2 KiB
Markdown

# Modal Backend
Hermes Agent uses [Modal](https://modal.com) for scalable, isolated cloud execution environments. There are two Modal integrations:
1. **Terminal Tool** (`tools/terminal_tool.py`) - For CLI/agent command execution
2. **Atropos Backend** (`atropos/backends/modal_backend.py`) - For batch RL training workloads
---
## Terminal Tool (CLI/Agent)
The terminal tool provides a simple interface for executing commands in Modal sandboxes.
### Configuration
Set environment variables:
```bash
export TERMINAL_ENV=modal
export TERMINAL_MODAL_IMAGE=python:3.11
export TERMINAL_MODAL_APP_NAME=hermes-sandbox
```
Or use a YAML config file (`modal_profiles.yaml`):
```yaml
profiles:
default:
image: python:3.11
cpu: 1.0
memory: 2048
min_pool: 1
max_pool: 5
idle_timeout: 120
gpu:
image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
gpu: T4
memory: 16384
min_pool: 0
max_pool: 2
```
### Features
| Feature | Description |
|---------|-------------|
| **Sandbox Pool** | Pre-warmed sandboxes for low latency |
| **Auto-scaling** | Grows/shrinks pool based on demand |
| **Idle Timeout** | Sandboxes auto-terminate when unused |
| **Profile Selection** | Different configs for different workloads |
| **Credential Injection** | `modal.Secret` integration |
### Usage
```python
from tools.terminal_tool import terminal_tool
# Simple command
output = terminal_tool("echo hello", task_id="my-task")
# With profile selection
output = terminal_tool("python train.py", task_id="training", profile="gpu")
# Cleanup when done
from tools.terminal_tool import cleanup_vm
cleanup_vm("my-task")
```
### Architecture
```
_ModalPoolManager (singleton)
├── "default" pool → [sandbox-0, sandbox-1, ...]
└── "gpu" pool → [sandbox-0, ...]
Each pool:
- Maintains min_pool warm sandboxes
- Scales up to max_pool on demand
- Background thread scales down idle sandboxes
```
---
## Atropos Backend (RL Training)
The Atropos backend is designed for high-throughput batch execution during reinforcement learning training.
### Key Concept: Slot-based Multiplexing
Instead of one sandbox per trajectory, multiple trajectories share sandboxes via **slots**:
```
Sandbox (1 container)
├── Slot 0 → Trajectory A (workspace: /data/slot_0)
├── Slot 1 → Trajectory B (workspace: /data/slot_1)
└── Slot 2 → Trajectory C (workspace: /data/slot_2)
```
**Benefits**:
- Fewer containers = lower cost
- Shared warm-up time
- Better GPU utilization
### Configuration
```python
from atropos.backends.modal_backend import ModalSandboxConfig, ModalToolBackend
config = ModalSandboxConfig(
name="default",
image="python:3.11",
cpu=1.0,
memory=2048,
slots_per_sandbox=10, # 10 trajectories per container
min_sandboxes=1,
max_sandboxes=5,
)
backend = ModalToolBackend(config.with_app_name("my-training"))
```
### Multi-Profile Support
Different trajectory types can request different resources:
```python
backend = ModalToolBackend.with_profiles(
app_name="rl-training",
profiles={
"default": ModalSandboxConfig(
name="default",
cpu=1.0,
memory=2048,
),
"pytorch-gpu": ModalSandboxConfig(
name="pytorch-gpu",
image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime",
gpu="T4",
memory=16384,
),
}
)
# CPU task
slot1 = await backend.acquire("traj-1", profile="default")
# GPU task
slot2 = await backend.acquire("traj-2", profile="pytorch-gpu")
```
### Batched Execution
The key optimization - execute many commands in parallel:
```python
# Acquire slots for multiple trajectories
slots = [await backend.acquire(f"traj-{i}") for i in range(50)]
# Execute batch across all slots in parallel
results = await backend.execute_batch([
(slot, "bash", {"command": "python step.py"})
for slot in slots
])
# Release slots
for slot in slots:
await backend.release(slot)
```
### Architecture
```
ModalToolBackend
└── _ModalMultiProfileManager
├── "default" → _ModalSandboxPool
│ ├── Sandbox 0 (slots 0-9)
│ └── Sandbox 1 (slots 0-9)
└── "pytorch-gpu" → _ModalSandboxPool
└── Sandbox 0 (slots 0-9)
```
---
## Credentials
Inject secrets securely using Modal's secret management:
```bash
# Create secret in Modal dashboard or CLI
modal secret create my-api-key API_KEY=sk-xxx
```
```python
# Reference in config
config = ModalSandboxConfig(
secrets=["my-api-key"], # Modal secret names
env_vars={"DEBUG": "1"}, # Additional env vars
)
```
## Troubleshooting
### "Modal package not installed"
```bash
pip install modal
modal token new # Authenticate
```
### "Sandbox creation failed"
- Check Modal dashboard for quota limits
- Verify image exists and is accessible
- Check secret names are correct
### Shutdown errors
These are harmless warnings during Python interpreter shutdown:
```
[Modal] Error terminating ...: cannot schedule new futures after interpreter shutdown
```
The sandboxes will auto-terminate via Modal's idle_timeout anyway.