mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat: add OpenThoughts-TBLite evaluation environment and configuration files
Introduce a new evaluation environment for OpenThoughts-TBLite, including the main evaluation script, a configuration YAML, and README documentation. The environment is a faster alternative to Terminal-Bench 2.0, with 100 difficulty-calibrated tasks for terminal agents. The setup keeps evaluation and configuration simple and extends the benchmarking options available for terminal agents.
parent 3db3d60368
commit 0ea6c34325
4 changed files with 231 additions and 0 deletions
0 environments/benchmarks/tblite/__init__.py Normal file