mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-03 02:11:48 +00:00
Progress
Completed Features
✅ Modal Backend Integration (Feb 8, 2026 - MERGED & TESTED)
Merged the modal-integration branch and fixed integration issues.
What Works:
- ModalToolBackend implements full ToolBackend interface (start, stop, acquire, release, execute_batch)
- Modal Sandboxes used for long-lived containers (not Functions)
- sandbox.exec() for direct command execution (no HTTP server needed)
- Slot-based multiplexing matching Nomad pattern
- Multi-profile support (ModalSandboxConfig, _ModalMultiProfileManager)
- YAML profile loading (modal_profiles.yaml)
- AgentEnvConfig fields for all Modal settings (--env.modal_*)
- create_tool_backend() supports tool_pool_mode="modal"
- Terminal tool (tools/terminal_tool.py) native Modal integration with pool management
- Named sandbox recovery via Sandbox.from_name()
- Auto-scaling sandbox pool per profile
- Artifact helpers (read, list, archive)
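The interface and slot-based multiplexing listed above can be pictured with a minimal, standard-library-only sketch. Only the method names (start, stop, acquire, release, execute_batch) come from the list. Every signature, the Slot class, and the InMemoryBackend stand-in are hypothetical and are not taken from atropos/backends/modal_backend.py. Commands are echoed back rather than run in a Modal sandbox.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical handle for one workspace inside a shared container."""
    sandbox_id: int
    slot_id: int


class InMemoryBackend:
    """Illustrative stand-in for a ToolBackend; not the real Modal backend."""

    def __init__(self, sandboxes: int = 2, slots_per_sandbox: int = 2) -> None:
        self._sandboxes = sandboxes
        self._slots_per_sandbox = slots_per_sandbox
        self._free: asyncio.Queue[Slot] | None = None

    async def start(self) -> None:
        # Pre-populate the pool: each sandbox contributes several slots.
        self._free = asyncio.Queue()
        for sb in range(self._sandboxes):
            for sl in range(self._slots_per_sandbox):
                self._free.put_nowait(Slot(sb, sl))

    async def stop(self) -> None:
        self._free = None

    async def acquire(self) -> Slot:
        # Waits until some slot is free, so callers share a fixed pool.
        assert self._free is not None, "call start() first"
        return await self._free.get()

    async def release(self, slot: Slot) -> None:
        assert self._free is not None, "call start() first"
        self._free.put_nowait(slot)

    async def execute_batch(self, slot: Slot, commands: list[str]) -> list[str]:
        # A real backend would run these inside the slot's container.
        return [f"[sb{slot.sandbox_id}/slot{slot.slot_id}] {cmd}" for cmd in commands]


async def demo() -> list[str]:
    backend = InMemoryBackend(sandboxes=1, slots_per_sandbox=2)
    await backend.start()
    slot = await backend.acquire()
    try:
        return await backend.execute_batch(slot, ["echo hi", "ls"])
    finally:
        await backend.release(slot)
        await backend.stop()
```

Running asyncio.run(demo()) exercises the full acquire, execute, release cycle. The point of the queue is that many callers can share a small number of containers without creating one container per task.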
CLI Usage:
# Atropos backend
python -m atropos.envs.swe_smith_oracle_env process \
--env.tool_pool_mode modal \
--env.modal_image python:3.11
# Terminal tool
TERMINAL_ENV=modal ./hermes
Files Modified/Created:
- atropos/backends/modal_backend.py - Full implementation (~1200 lines)
- atropos/backends/__init__.py - create_tool_backend() updated
- atropos/envs/agent_env.py - 15 Modal config fields added
- tools/terminal_tool.py - Native Modal sandbox pool
- docs/MODAL_BACKEND.md - Documentation
- modal_profiles.yaml.example - Example profiles
- tests/test_modal_integration.py - Integration tests
- tests/test_modal_stress.py - Stress tests
- tests/test_modal_terminal.py - Terminal tool tests
✅ Singularity/Apptainer Sandbox Integration (Feb 6, 2026 - FULLY TESTED)
Adapted the Atropos sandbox environment from Docker to Singularity/Apptainer for HPC clusters.
What Works:
- create_sandbox_job() supports both driver="docker" and driver="singularity"
- SlotPoolConfig and NomadBackendConfig propagate driver settings
- Singularity container runs sandbox_server.py via Nomad's raw_exec driver
- All sandbox operations work: bash execution, file read/write
- CLI arguments --env.driver and --env.singularity_image for AgentEnvConfig
- Static port binding for Singularity (ReservedPorts vs DynamicPorts)
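The difference between the two drivers, in particular ReservedPorts versus DynamicPorts, can be sketched as a function that builds a Nomad-style task group fragment. This is an assumption-laden illustration, not the real create_sandbox_job(): the function name build_sandbox_task, the "http" port label, the apptainer command line, and the --port flag on sandbox_server.py are all hypothetical. The key names (Driver, Config, Networks, DynamicPorts, ReservedPorts) follow Nomad's JSON job format.

```python
from typing import Any


def build_sandbox_task(driver: str, image: str, port: int = 8080) -> dict[str, Any]:
    """Return a Nomad-style task group fragment for the chosen driver (illustrative only)."""
    if driver == "docker":
        # Docker driver: Nomad picks a free host port and maps it to the container port.
        network = {"DynamicPorts": [{"Label": "http", "To": port}]}
        task = {
            "Driver": "docker",
            "Config": {"image": image, "ports": ["http"]},
        }
    elif driver == "singularity":
        # raw_exec runs directly on the host network, so the port is reserved up front.
        network = {"ReservedPorts": [{"Label": "http", "Value": port}]}
        task = {
            "Driver": "raw_exec",
            "Config": {
                "command": "apptainer",  # assumed binary name; could be "singularity"
                # The --port flag is a hypothetical interface for sandbox_server.py.
                "args": ["exec", image, "python", "sandbox_server.py", "--port", str(port)],
            },
        }
    else:
        raise ValueError(f"unsupported driver: {driver!r}")
    return {"Networks": [network], "Tasks": [{"Name": "sandbox", **task}]}
```

For example, build_sandbox_task("singularity", "sandbox.sif", 9000) yields a raw_exec task with port 9000 reserved, while the docker variant leaves host port selection to Nomad.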
✅ Memory Bank Initialized (Feb 5, 2026)
Set up project documentation structure for context persistence.
In Progress
None currently.
Known Issues
- Modal backend not yet live-tested with actual Modal cloud credentials
- bwrap_available: false in Singularity containers
- Health check timing - may need longer wait for container startup on slower systems
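One possible way to handle the health check timing issue is to poll with a growing delay up to an overall deadline, instead of using a single fixed wait. The sketch below is a generic pattern, not the project's current health check. The function name wait_until_healthy and all of its parameters are hypothetical, and the probe is passed in by the caller so nothing here assumes how the sandbox server reports health.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    initial_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll probe() with exponential backoff until it returns True or time runs out."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        if probe():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

On slower systems, raising timeout_s lengthens the total wait without slowing down the fast path, since the first probes still happen within the first few seconds.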
What's Left to Build
Modal Backend
- Live test with Modal credentials on actual cloud
- Test multi-profile GPU workflows
- Test sandbox recovery after restart
- Integrate with SWE-smith-oracle env for GRPO training loop
- Performance benchmarking vs Nomad backend
HPC Deployment
- Test on actual HPC cluster with Slurm/PBS integration
- Document cluster-specific deployment procedures
Documentation
- Add Singularity deployment to README
- Create HPC deployment skill in skills/mlops/
Evolution of Decisions
Container Runtime Selection
- Initial: Docker-only via Nomad docker driver
- Problem: HPC clusters don't allow Docker without sudo
- Solution: Added Singularity/Apptainer support via raw_exec driver
- Result: Both runtimes now supported with same API
Modal Backend Architecture
- Initial: Stub placeholder raising RuntimeError
- Investigation: Modal Sandboxes vs Functions - chose Sandboxes for long-lived containers
- Design: Direct sandbox.exec() instead of HTTP/sandbox_server.py (simpler, no networking needed)
- Implementation: Merged from modal-integration branch, fixed agent_env.py config fields
- Result: Three backends now supported: Nomad/Docker, Nomad/Singularity, Modal
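The three resulting combinations can be summarized as a small selection function. This is a hypothetical sketch, not the real create_tool_backend(): the names BackendSpec and select_backend, the "nomad" mode string, and the "modal-sandbox" runtime label are assumptions. Only tool_pool_mode="modal" and the driver values "docker" and "singularity" appear in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendSpec:
    """Hypothetical description of a selected backend; not a real atropos type."""
    scheduler: str  # "nomad" or "modal"
    runtime: str    # "docker", "singularity", or "modal-sandbox"


def select_backend(tool_pool_mode: str, driver: str = "docker") -> BackendSpec:
    """Map the two settings onto one of the three supported combinations."""
    if tool_pool_mode == "modal":
        # Modal manages its own containers, so the Nomad driver is ignored.
        return BackendSpec("modal", "modal-sandbox")
    if tool_pool_mode == "nomad":  # assumed mode name for the Nomad backend
        if driver not in ("docker", "singularity"):
            raise ValueError(f"unsupported driver: {driver!r}")
        return BackendSpec("nomad", driver)
    raise ValueError(f"unsupported tool_pool_mode: {tool_pool_mode!r}")
```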