Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-04-25 00:51:20 +00:00)
Introduced a new evaluation script for the OpenThoughts-TBLite environment that lets users run evaluations with customizable options. The script logs each run and streams output in real time while terminal agents are evaluated. It complements the existing benchmarking tools.

| Name |
|---|
| `__init__.py` |
| `default.yaml` |
| `run_eval.sh` |
| `terminalbench2_env.py` |