From 9dd56f0dfb040342b9fb1ec78ff255c5854b4063 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Fri, 26 Jun 2026 11:05:07 -0700
Subject: [PATCH] docs(moa): add HermesBench results to Mixture of Agents page (#53206)

---
 .../docs/user-guide/features/mixture-of-agents.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/website/docs/user-guide/features/mixture-of-agents.md b/website/docs/user-guide/features/mixture-of-agents.md
index 9dbe3e65583..c901a2fe9e7 100644
--- a/website/docs/user-guide/features/mixture-of-agents.md
+++ b/website/docs/user-guide/features/mixture-of-agents.md
@@ -106,6 +106,18 @@ hermes moa configure review # create or update a named preset
 hermes moa delete review
 ```
 
+## Benchmarks
+
+On HermesBench, a two-model MoA preset — `claude-opus-4.8` aggregating over a `gpt-5.5` reference — outscores either model run on its own:
+
+| Model | HermesBench score |
+|---|---|
+| **Opus aggregator (opus-4.8 + gpt-5.5 reference) — MoA** | **0.8202** |
+| `anthropic/claude-opus-4.8` | 0.7607 |
+| `openai/gpt-5.5` | 0.7412 |
+
+The MoA configuration beats its strongest component (opus-4.8) by ~6 points, confirming that aggregating a second perspective lifts quality on hard tasks rather than just averaging the two.
+
 ## Notes
 
 - MoA is no longer listed under `hermes tools`; there is no `moa` toolset to enable.