diff --git a/memory-bank/activeContext.md b/memory-bank/activeContext.md
index 7e0fe0ed06..f4ffb138c9 100644
--- a/memory-bank/activeContext.md
+++ b/memory-bank/activeContext.md
@@ -56,27 +56,30 @@ python environments/swe_smith_oracle_env.py process \
    - Backend integration (ModalSandboxConfig.from_agent_env_config, create_tool_backend)
    - `_use_sandbox_backend()` logic (True when modal+backend set, False otherwise)
 
+6. ✅ **End-to-end test with Qwen 3 8B + Modal sandbox** (THIS SESSION)
+   - RunPod endpoint: `0tx0ruuuo4f10c` (Qwen/Qwen3-8B via SGLang)
+   - 5 terminal tool calls executed IN sandbox: `ls`, `git status`, `git log`, `cat parse.py`, `cat tests/`
+   - In-sandbox verification: install deps + pytest → score=0.0 (model inspected the repo but did not attempt a fix)
+   - Full token tracking with logprobs via Phase 2 ManagedServer
+   - Key finding: the Llama-3-8B chat template silently drops the `tools=` param, whereas Qwen 3 fully supports the Hermes format
+
 ### What Still Needs to Be Done
 
-#### End-to-end test with Modal
-The code is implemented and passes all import/integration checks. Needs a live Modal test:
-```bash
-python environments/swe_smith_oracle_env.py process \
-    --env.use_wandb false \
-    --env.total_steps 2 \
-    --env.group_size 1 \
-    --env.max_items 2 \
-    --env.tool_pool_mode modal \
-    --env.modal_image python:3.11 \
-    --env.modal_slots_per_sandbox 10 \
-    --env.modal_min_sandboxes 1
-```
+#### 1. Replace hermes-agent tools backend with sandbox backend globally
+Per Teknium's feedback, `tools/terminal_tool.py`, `tools/file_tools.py`, etc. should be able to use
+the Modal/Nomad sandbox backend not just in atropos envs but also in `batch_runner.py` for scaled
+data generation. This would unify the tool execution path across CLI, batch, and RL environments.
 
-#### Remaining consolidation items (from progress.md)
-- Remove redundant `atropos/agent/` and `atropos/envs/agent_env.py`
-- Clean up redundant `atropos/tools/`
-- Test end-to-end with Tinker trainer (blocked on billing)
-- Test with actual tool calls (model producing tool_calls, not just text)
+#### 2. Clean up redundant code
+- Remove `atropos/agent/` (replaced by `environments/agent_loop.py`)
+- Remove `atropos/envs/agent_env.py` (replaced by `environments/hermes_base_env.py`)
+- Remove `atropos/tools/` (use `model_tools.py` + `tools/` directly)
+
+#### 3. Test with Tinker trainer (blocked on billing)
+Full RL training loop: Tinker API → atropos rollout API → environment → trainer
+
+#### 4. Add more environments
+Teknium mentioned needing "endless-terminals" and "terminalbench 2" envs
 
 ### Architecture Summary