hermes-agent/environments/benchmarks/terminalbench_2
dmahan93 ed27b826c5 feat: add eval_concurrency limit + Docker local config for TBLite
- Add eval_concurrency config field with asyncio.Semaphore
- Add local.yaml config using Docker backend (sandboxed, no cloud costs)
- Register docker_image alongside modal_image for backend flexibility
- Default: 8 parallel tasks for local runs
2026-03-11 06:52:26 -07:00
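The eval_concurrency limit named in the commit message can be illustrated with a minimal sketch. This is not the code in terminalbench2_env.py: the names run_with_limit, fake_task, and make_tasks are hypothetical, and the only details taken from the commit are the use of asyncio.Semaphore and the default of 8 parallel tasks.

```python
import asyncio


async def run_with_limit(task_factories, eval_concurrency=8):
    """Run all tasks, keeping at most `eval_concurrency` in flight at once.

    Hypothetical helper: the real environment may wire the semaphore
    differently. Returns (results, peak) where peak is the highest number
    of tasks observed running simultaneously.
    """
    semaphore = asyncio.Semaphore(eval_concurrency)
    in_flight = 0
    peak = 0

    async def guarded(factory):
        nonlocal in_flight, peak
        # Each task must acquire a slot before starting; the slot is
        # released automatically when the `async with` block exits.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await factory()
            finally:
                in_flight -= 1

    # gather preserves the input order of results.
    results = await asyncio.gather(*(guarded(f) for f in task_factories))
    return results, peak


async def fake_task(i):
    # Stand-in for one benchmark task (assumption: real tasks run a sandbox).
    await asyncio.sleep(0.01)
    return i


def make_tasks(n):
    # Bind i at definition time so each factory returns its own index.
    return [lambda i=i: fake_task(i) for i in range(n)]


if __name__ == "__main__":
    results, peak = asyncio.run(run_with_limit(make_tasks(20)))
    print(len(results), "tasks finished; peak concurrency:", peak)
```

A semaphore bounds how many sandboxes are started at the same time without changing the order in which results are returned, which is presumably why the commit pairs it with a local Docker backend where host resources are the limiting factor.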
__init__.py Add new environments and enhance tool context functionality 2026-02-10 19:39:05 +00:00
default.yaml fix: limit concurrent Modal sandbox creations to avoid deadlocks 2026-03-07 14:02:34 -08:00
run_eval.sh feat: add OpenThoughts-TBLite evaluation script 2026-03-04 12:55:56 +00:00
terminalbench2_env.py feat: add eval_concurrency limit + Docker local config for TBLite 2026-03-11 06:52:26 -07:00