hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

teknium1 ee7fde6531	2026-03-04 12:55:56 +00:00

feat: add OpenThoughts-TBLite evaluation script

Introduced a new evaluation script for the OpenThoughts-TBLite environment, enabling users to run evaluations with customizable options. The script includes logging capabilities and real-time output, enhancing the evaluation process for terminal agents. This addition complements the existing benchmarking tools and improves usability for users.

__init__.py	Add new environments and enhance tool context functionality	2026-02-10 19:39:05 +00:00
default.yaml	Enhance TerminalBench 2 configuration and evaluation handling	2026-02-10 22:53:24 +00:00
run_eval.sh	feat: add OpenThoughts-TBLite evaluation script	2026-03-04 12:55:56 +00:00
terminalbench2_env.py	Enhance TerminalBench2 environment with task filtering due to incompat with modal and logging improvements	2026-02-12 05:36:45 +00:00