singularity working

This commit is contained in:
Shannon Sands 2026-02-06 01:03:59 +00:00
parent 4d619bcd21
commit fd1c3da305
23 changed files with 1444 additions and 38 deletions

# Active Context
## Current Focus
Singularity/Apptainer integration for HPC environments has been **COMPLETED AND TESTED**.
## Recently Completed (Feb 6, 2026)
### Singularity/Apptainer Sandbox Integration - FULLY WORKING
Successfully adapted the Atropos implementation from Docker to Singularity/Apptainer for HPC clusters where Docker cannot run without sudo permissions.
**Files Modified:**
1. `atropos/nomad/client.py` - Added `driver` and `singularity_image` parameters to `create_sandbox_job()`; fixed port detection in `get_job_allocations()` to check both `DynamicPorts` and `ReservedPorts`
2. `atropos/slots/pool.py` - Added `driver` and `singularity_image` to `SlotPoolConfig`
3. `atropos/backends/nomad_backend.py` - Added driver options to `NomadBackendConfig`
4. `atropos/envs/agent_env.py` - Added CLI arguments `--env.driver` and `--env.singularity_image` to `AgentEnvConfig`
**Files Created:**
1. `nomad-singularity.hcl` - Nomad config with raw_exec driver enabled
2. `atropos/atropos-sandbox.sif` - Singularity image (80MB) built from Docker image
3. `test_singularity_job.py` - Test script for Singularity integration
**Key Implementation Details:**
- Uses Nomad's `raw_exec` driver to run `apptainer` commands
- Shell wrapper (`/bin/sh -c`) ensures Nomad environment variables expand correctly
- Binds Nomad allocation directory to `/data` for workspace persistence
- Uses **static ports** (`ReservedPorts`) instead of dynamic ports since raw_exec runs directly on host
- `get_job_allocations()` now checks both `DynamicPorts` (Docker) and `ReservedPorts` (Singularity)
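The details above can be sketched as the kind of `raw_exec` task fragment `create_sandbox_job()` might emit for the Singularity path. This is an illustrative sketch only: `build_singularity_task`, the field subset, and the exact bind/port layout are assumptions, not the actual `atropos/nomad/client.py` code.

```python
def build_singularity_task(sif_path: str, port: int) -> dict:
    """Hypothetical sketch of a Nomad raw_exec task running Apptainer."""
    # /bin/sh -c wrapper so Nomad variables like ${NOMAD_ALLOC_DIR} expand at runtime
    command = (
        f"apptainer exec --bind ${{NOMAD_ALLOC_DIR}}:/data "
        f"{sif_path} python /app/sandbox_server.py --port {port}"
    )
    return {
        "Driver": "raw_exec",
        "Config": {"command": "/bin/sh", "args": ["-c", command]},
        "Resources": {
            # raw_exec runs directly on the host, so the port is static (reserved)
            "Networks": [{"ReservedPorts": [{"Label": "http", "Value": port}]}],
        },
    }

task = build_singularity_task("/path/to/atropos-sandbox.sif", 8080)
```

The Docker path would instead use the `docker` driver with a dynamic port, which is why port detection has to handle both shapes.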
**Test Results (All Passing):**
- Health check: ✅ Server responding with 5 slots
- Bash execution: ✅ Commands execute inside Singularity container
- Write file: ✅ File written to slot workspace
- Read file: ✅ File read back successfully
## Usage
### For Docker (default):
```python
config = SlotPoolConfig(
    driver="docker",
    image="atropos-sandbox:local",
)
```
### For Singularity/Apptainer:
```python
config = SlotPoolConfig(
    driver="singularity",
    singularity_image="/path/to/atropos-sandbox.sif",
)
```
### Nomad Configuration:
```bash
# Start Nomad with Singularity support
nomad agent -dev -config=nomad-singularity.hcl
```
## Next Steps
- Deploy to HPC cluster for production testing
- Consider adding bubblewrap (bwrap) support inside Singularity for additional sandboxing
- Document HPC-specific deployment procedures in skills/mlops/

# Product Context: Hermes-Agent
## Why This Project Exists
Hermes-Agent addresses several key challenges in the AI agent space:
1. **Unified Tool Interface** - Provides a clean, consistent interface for LLMs to use various tools (web, terminal, browser, vision, etc.) without requiring custom integration for each model provider.
2. **Training Data Generation** - Enables efficient generation of high-quality tool-calling trajectories for fine-tuning LLMs, with features like batch processing, checkpointing, and trajectory compression.
3. **Flexible Deployment** - Supports multiple execution environments (local, Docker, Singularity, Modal, SSH) to accommodate different security and isolation requirements.
4. **Developer Experience** - Offers a beautiful, interactive CLI with kawaii-style feedback that makes working with AI agents enjoyable.
## Problems It Solves
### For AI Researchers
- **Data Generation at Scale**: Parallel batch processing with content-based checkpointing for fault tolerance
- **Clean Trajectories**: Trajectory compression to fit token budgets while preserving important information
- **Toolset Distributions**: Probability-based tool selection for varied training data
### For Developers
- **Tool Orchestration**: Logical grouping of tools into toolsets (research, development, debugging, etc.)
- **Session Persistence**: Conversation history and session logging for debugging
- **Multi-Model Support**: Works with any OpenAI-compatible API (OpenRouter, local models, etc.)
### For MLOps
- **Skills System**: On-demand knowledge documents for specific tools/frameworks (Axolotl, vLLM, TRL, etc.)
- **Sandboxed Execution**: Terminal commands can run in isolated environments (Docker, Singularity, Modal)
- **Configurable Backends**: Easy switching between local and cloud execution
## How It Should Work
### User Flow (CLI)
1. User launches `./hermes`
2. Beautiful welcome banner displays with caduceus logo, model info, and available tools
3. User types a natural language request
4. Agent processes request, potentially calling tools with animated feedback
5. Agent responds with results, conversation continues
6. Session is automatically logged for debugging
### User Flow (Batch Processing)
1. User prepares JSONL file with prompts
2. Runs `batch_runner.py` with distribution and worker count
3. System processes prompts in parallel, saves checkpoints
4. Completed trajectories saved to `data/<run_name>/trajectories.jsonl`
5. Optional: compress trajectories with `trajectory_compressor.py`
## User Experience Goals
- **Delightful Interaction**: Kawaii ASCII faces, animated spinners, cute messages
- **Informative Feedback**: Clear progress indication during tool execution
- **Configurable Personalities**: From "helpful" to "pirate" to "Shakespeare"
- **Easy Configuration**: YAML config file + environment variables + CLI flags
- **Graceful Degradation**: Missing tools/APIs don't break the system, just disable features

# Progress
## Completed Features
### ✅ Singularity/Apptainer Sandbox Integration (Feb 6, 2026 - FULLY TESTED)
Adapted the Atropos sandbox environment from Docker to Singularity/Apptainer for HPC clusters.
**What Works:**
- `create_sandbox_job()` supports both `driver="docker"` and `driver="singularity"`
- SlotPoolConfig and NomadBackendConfig propagate driver settings
- Singularity container runs sandbox_server.py via Nomad's raw_exec driver
- All sandbox operations work: bash execution, file read/write
- Nomad environment variables properly expanded via shell wrapper
- **CLI arguments** `--env.driver` and `--env.singularity_image` for AgentEnvConfig
- **Static port binding** for Singularity (ReservedPorts vs DynamicPorts)
- **Port detection** works for both Docker and Singularity allocations
**CLI Usage:**
```bash
python -m atropos.envs.swe_smith_oracle_env process \
    --env.driver singularity \
    --env.singularity_image /path/to/atropos-sandbox.sif
```
**Created Files:**
- `nomad-singularity.hcl` - Nomad config with raw_exec enabled
- `atropos/atropos-sandbox.sif` - 80MB Singularity image
- `test_singularity_job.py` - Integration test script
**Modified Files:**
- `atropos/nomad/client.py` - driver support + ReservedPorts detection
- `atropos/slots/pool.py` - driver config fields
- `atropos/backends/nomad_backend.py` - driver config fields
- `atropos/envs/agent_env.py` - CLI arguments for driver selection
### ✅ Memory Bank Initialized (Feb 5, 2026)
Set up project documentation structure for context persistence.
## In Progress
None currently.
## Known Issues
- `bwrap_available: false` in Singularity containers - bubblewrap sandboxing not available inside the container (kernel namespaces already in use)
- Health check timing - may need longer wait for container startup on slower systems
## What's Left to Build
### HPC Deployment
- [ ] Test on actual HPC cluster with Slurm/PBS integration
- [ ] Document cluster-specific deployment procedures
- [ ] Add support for shared filesystem workspace binding
### Enhanced Sandboxing
- [ ] Investigate alternative sandboxing inside Singularity (seccomp, etc.)
- [ ] Add network isolation options for Singularity
### Documentation
- [ ] Add Singularity deployment to README
- [ ] Create HPC deployment skill in skills/mlops/
## Evolution of Decisions
### Container Runtime Selection
- **Initial**: Docker-only via Nomad docker driver
- **Problem**: HPC clusters don't allow Docker without sudo
- **Solution**: Added Singularity/Apptainer support via raw_exec driver
- **Result**: Both runtimes now supported with same API

# Project Brief: Hermes-Agent
## Overview
Hermes-Agent is an AI agent harness for LLMs with advanced tool-calling capabilities, featuring a flexible toolsets system for organizing and managing tools. Named after Hermes, the Greek messenger god, it serves as a bridge between human intent and AI-powered task execution.
## Core Requirements
### Primary Goals
1. **Interactive CLI Experience** - Beautiful terminal interface with animated feedback, personalities, and session management
2. **Flexible Tool System** - Modular tools organized into logical toolsets for different use cases
3. **Batch Processing** - Process multiple prompts in parallel with checkpointing and statistics
4. **Multi-Backend Support** - Support for local, Docker, Singularity, Modal, and SSH terminal backends
5. **Training Data Generation** - Save conversation trajectories in formats suitable for LLM fine-tuning
### Target Users
- AI researchers generating training data
- Developers needing an AI assistant with tool access
- MLOps practitioners automating workflows
- Anyone needing a powerful CLI-based AI agent
## Scope
### In Scope
- Interactive CLI with rich formatting and kawaii-style feedback
- Web tools (search, extract, crawl via Firecrawl)
- Terminal tools (command execution across multiple backends)
- Browser automation (via agent-browser + Browserbase)
- Vision tools (image analysis)
- Image generation (FLUX via FAL.ai)
- Mixture-of-Agents reasoning
- Skills system for on-demand knowledge
- Batch processing with parallel workers
- Trajectory compression for training
### Out of Scope (Current)
- Proactive suggestions (agent only runs on request)
- Clipboard integration (no local system access)
- Real-time streaming of thinking/reasoning (deferred)
## Success Metrics
- Clean, maintainable tool architecture
- Reliable tool execution with proper error handling
- Efficient context management for long conversations
- High-quality trajectory data for training

# System Patterns: Hermes-Agent
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                          CLI (cli.py)                           │
│  - Rich welcome banner with caduceus                            │
│  - prompt_toolkit for input with history                        │
│  - Kawaii-style feedback and personalities                      │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                     AIAgent (run_agent.py)                      │
│  - Conversation loop with tool calling                          │
│  - KawaiiSpinner for animated feedback                          │
│  - Retry logic with exponential backoff                         │
│  - Session logging to logs/ directory                           │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                  Tool Routing (model_tools.py)                  │
│  - get_tool_definitions() - returns tools for API calls         │
│  - handle_function_call() - dispatches to tool handlers         │
│  - Toolset filtering (enabled/disabled)                         │
└────────────────────────────┬────────────────────────────────────┘
            ┌────────────────┼────────────────┐
            ▼                ▼                ▼
     ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
     │  Web Tools  │  │  Terminal   │  │   Browser   │
     │ (Firecrawl) │  │ (mini-swe)  │  │ (agent-brw) │
     └─────────────┘  └─────────────┘  └─────────────┘
            │                │                │
            └────────────────┼────────────────┘
                             ▼
                     ┌───────────────┐
                     │   Toolsets    │
                     │ (toolsets.py) │
                     │  Composition  │
                     └───────────────┘
```
## Key Design Patterns
### 1. Toolset Composition Pattern
Toolsets can include other toolsets, allowing flexible composition:
```python
TOOLSETS = {
    "web": {"tools": ["web_search", "web_extract"], "includes": []},
    "debugging": {"tools": ["terminal"], "includes": ["web"]},
    "full_stack": {"tools": [], "includes": ["web", "terminal", "vision", "browser"]},
}
```
Resolution is recursive with cycle detection.
### 2. Graceful Degradation Pattern
Each tool module has a `check_*_requirements()` function:
- Tools are only loaded if requirements are met
- Missing API keys disable tools, not crash the system
- Import errors are caught and tools marked unavailable
```python
try:
    from tools.web_tools import web_search_tool, check_firecrawl_api_key
except ModuleNotFoundError:
    web_search_tool = None
    def check_firecrawl_api_key(): return False
```
### 3. Session Isolation Pattern (task_id)
Stateful tools (terminal, browser) use `task_id` to isolate concurrent sessions:
- Each batch worker gets unique task_id
- VMs and browser sessions are tracked per task_id
- Cleanup functions release resources: `cleanup_vm(task_id)`, `cleanup_browser(task_id)`
### 4. Trajectory Format Pattern
Conversations are saved in ShareGPT format for training:
```json
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
```
### 5. Ephemeral System Prompt Pattern
Guide model behavior during data collection without saving to trajectories:
- `ephemeral_system_prompt` influences execution
- Only standard tool-calling system prompt saved to trajectories
- Keeps training data clean
### 6. Retry with Validation Pattern
The agent validates responses before accepting:
- Check tool names against `valid_tool_names` set
- Validate JSON arguments can be parsed
- Check for content after `<think>` blocks
- Roll back to last valid state on persistent failures
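The first two checks can be sketched as a single validator; `validate_tool_call` and its return shape are hypothetical, a simplification of whatever the agent actually does before accepting a response.

```python
import json

def validate_tool_call(name: str, raw_args: str, valid_tool_names: set) -> tuple:
    """Return (ok, error) for one proposed tool call."""
    if name not in valid_tool_names:
        return False, f"unknown tool: {name}"
    try:
        json.loads(raw_args)              # arguments must be parseable JSON
    except json.JSONDecodeError as e:
        return False, f"bad arguments: {e}"
    return True, None
```

On failure, the agent would retry the API call rather than execute the malformed call, rolling back if failures persist.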
## Component Relationships
### AIAgent Class
- Central orchestrator for conversations
- Manages conversation history
- Calls OpenAI-compatible API
- Routes tool calls to handlers
- Provides animated feedback (KawaiiSpinner)
### Tool Modules (tools/*.py)
- Self-contained tool implementations
- Export: handler function + check function + schema
- Return JSON strings (never raw dicts)
- Accept optional `task_id` for stateful tools
### Toolsets System (toolsets.py)
- Defines logical groupings of tools
- Supports composition via `includes`
- `resolve_toolset()` recursively resolves all tools
- `validate_toolset()` checks if name is valid
### Model Tools (model_tools.py)
- Aggregates all tool definitions
- Routes function calls to correct handlers
- Filters tools based on enabled/disabled toolsets
- Bridge between agent and tool implementations
## Critical Implementation Paths
### Tool Execution Flow
1. AIAgent receives tool_calls from API response
2. Validates tool names against `valid_tool_names`
3. Validates JSON arguments can be parsed
4. Calls `handle_function_call()` with tool name, args, task_id
5. `handle_function_call()` routes to appropriate handler
6. Tool executes, returns JSON string
7. Result added to conversation as tool message
8. Loop continues until natural language response
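Steps 4-7 amount to a dispatch loop over validated calls. A minimal sketch, assuming a `handlers` mapping from tool name to handler; `run_tool_calls` is a hypothetical name, not the real routing code in `model_tools.py`.

```python
import json

def run_tool_calls(tool_calls: list, handlers: dict, task_id: str) -> list:
    """Route each validated call to its handler and collect tool messages."""
    messages = []
    for call in tool_calls:
        handler = handlers[call["name"]]
        result = handler(**call["args"], task_id=task_id)  # handlers return JSON strings
        messages.append({"role": "tool", "name": call["name"], "content": result})
    return messages
```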
### Configuration Loading Flow
1. `cli.py` calls `load_cli_config()`
2. Loads `cli-config.yaml`, merges with defaults
3. Sets environment variables for terminal config
4. `AIAgent` reads env vars when initializing terminal tool
5. Terminal tool creates appropriate backend based on `TERMINAL_ENV`

# Technical Context: Hermes-Agent
## Technologies Used
### Core Stack
- **Python 3.11+** - Primary language
- **OpenAI SDK** - For LLM API interactions (OpenAI-compatible)
- **OpenRouter** - Default LLM provider (supports multiple models)
- **Rich** - Terminal formatting and panels
- **prompt_toolkit** - Interactive input with history
- **Fire** - CLI argument parsing
- **PyYAML** - Configuration files
- **python-dotenv** - Environment variable management
### Tool Dependencies
- **Firecrawl** - Web search and extraction (`FIRECRAWL_API_KEY`)
- **mini-swe-agent** - Terminal tool backend (local/docker/singularity/modal/ssh)
- **agent-browser** - Browser automation (npm package)
- **Browserbase** - Cloud browser execution (`BROWSERBASE_API_KEY`)
- **FAL.ai** - Image generation with FLUX (`FAL_KEY`)
- **Nous API** - Vision and MoA tools (`NOUS_API_KEY`)
### Optional Dependencies
- **Modal** - Cloud compute for sandboxed environments
- **Singularity/Apptainer** - Rootless containers (HPC environments)
- **Docker** - Container isolation
## Development Setup
### Quick Start
```bash
# Clone with submodules
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
cd Hermes-Agent
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install -e ./mini-swe-agent
# Install browser tools (optional)
npm install
# Configure environment
cp .env.example .env
# Edit .env with your API keys
```
### Key Configuration Files
- `.env` - API keys and secrets
- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
- `configs/` - Batch run scripts and configuration
### Environment Variables
**Required for Full Functionality:**
- `OPENROUTER_API_KEY` - Primary LLM access
- `FIRECRAWL_API_KEY` - Web tools
- `NOUS_API_KEY` - Vision and reasoning tools
- `FAL_KEY` - Image generation
**Terminal Backend:**
- `TERMINAL_ENV` - Backend type: `local`, `docker`, `singularity`, `modal`, `ssh`
- `TERMINAL_CWD` - Working directory
- `TERMINAL_DOCKER_IMAGE` / `TERMINAL_SINGULARITY_IMAGE` - Container images
- `TERMINAL_SSH_HOST/USER/KEY` - SSH backend config
- `SUDO_PASSWORD` - Optional sudo support
**Browser:**
- `BROWSERBASE_API_KEY` - Browser automation
- `BROWSERBASE_PROJECT_ID` - Browserbase project
## Technical Constraints
1. **Context Window Limits** - Long tool outputs can exhaust context; trajectory compression helps
2. **API Rate Limits** - OpenRouter and tool APIs have rate limits; exponential backoff implemented
3. **Tool Availability** - Tools gracefully degrade if dependencies/keys missing
4. **Async Compatibility** - Some tools are async, handled via `asyncio.run()` in sync context
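The async bridge in point 4 looks roughly like this; `_fetch`/`fetch_tool` are invented names standing in for whatever async tool is being wrapped.

```python
import asyncio

async def _fetch(query: str) -> str:
    await asyncio.sleep(0)        # stand-in for real async I/O
    return f"result for {query}"

def fetch_tool(query: str) -> str:
    """Sync wrapper so an async tool fits the synchronous handler interface."""
    return asyncio.run(_fetch(query))
```

Note that `asyncio.run()` raises if called from inside an already-running event loop, so this pattern only works when the handlers are invoked from synchronous code.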
## Dependency Graph
```
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
                                       ▲
run_agent.py ──────────────────────────┘
cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
batch_runner.py → run_agent.py + toolset_distributions.py
```
## Tool Usage Patterns
### Adding a New Tool
1. Create `tools/your_tool.py` with handler + requirements check
2. Export in `tools/__init__.py`
3. Register in `model_tools.py` (definitions + handler routing)
4. Add to toolset in `toolsets.py`
5. Optionally add to `toolset_distributions.py` for batch processing
### Tool Handler Pattern
```python
import json

def your_tool(param: str, task_id: str | None = None) -> str:
    """Execute tool and return a JSON string result."""
    try:
        result = {"success": True, "data": "..."}
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"error": str(e)}, ensure_ascii=False)
```
All tool handlers MUST return a JSON string, never raw dicts.