hermes-agent/environments/benchmarks/terminalbench_2
dmahan93 13f5459670 fix: use ManagedServer for vLLM in TBLite eval + local_vllm config
The TBLite eval was bypassing ManagedServer and calling ServerManager
directly, which targets /v1/chat/completions, an endpoint the atropos
vllm_api_server does not expose (it serves /generate only).

Now uses _use_managed_server() to detect vLLM/SGLang backends and
route them through ManagedServer (Phase 2) with the correct tool_parser
and the /generate endpoint. OpenAI-style endpoints fall back to Phase 1.

Also adds local_vllm.yaml config for running against a local vLLM
server with Docker sandboxes.
2026-03-11 06:52:55 -07:00
__init__.py Add new environments and enhance tool context functionality 2026-02-10 19:39:05 +00:00
default.yaml fix: limit concurrent Modal sandbox creations to avoid deadlocks 2026-03-07 14:02:34 -08:00
run_eval.sh feat: add OpenThoughts-TBLite evaluation script 2026-03-04 12:55:56 +00:00
terminalbench2_env.py fix: use ManagedServer for vLLM in TBLite eval + local_vllm config 2026-03-11 06:52:55 -07:00