feat(moa): expose MoA presets as selectable virtual models (#46081)

* feat(moa): expose MoA presets as selectable virtual models

Reconstructed onto current main (PR #46081's base had diverged with no common
ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual
provider: each named preset is a selectable model under provider 'moa', and the
preset's aggregator is the acting model that answers and calls tools.

Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same
batch pattern delegate_task uses) — all references dispatched at once, collected
when every one finishes, then handed to the aggregator. Output order is
preserved, and failures and the MoA-recursion guard stay isolated per reference.
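
A minimal sketch of the fan-out pattern described above, assuming a hypothetical `ask` callable in place of the real model call. The names `fan_out`, `ask`, `demo_ask`, and `MAX_WORKERS` are illustrative and not taken from the commit; the sketch only shows order-preserving collection from a bounded pool with per-reference failure isolation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_WORKERS = 8  # illustrative cap on concurrent reference calls


def fan_out(slots: list[str], ask: Callable[[str], str]) -> list[tuple[str, str]]:
    """Call `ask` for every slot in parallel; results keep the input order."""
    if not slots:
        return []

    def safe_ask(slot: str) -> tuple[str, str]:
        # A failing reference becomes a labelled note instead of an exception,
        # so one bad reference cannot abort the others.
        try:
            return slot, ask(slot)
        except Exception as exc:
            return slot, f"[failed: {exc}]"

    workers = min(MAX_WORKERS, len(slots))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, not completion order.
        return list(pool.map(safe_ask, slots))


def demo_ask(slot: str) -> str:
    # Hypothetical stand-in for a blocking model call.
    if slot == "b":
        raise RuntimeError("boom")
    return slot.upper()


results = fan_out(["a", "b", "c"], demo_ask)
```

Here `results` keeps the order of the input list even though the calls run concurrently, and the failing slot yields a note rather than an exception.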

- Removed the old mixture_of_agents model tool and moa toolset.
- Added moa as a virtual provider in the provider/model inventory.
- /moa is now a shortcut over model selection (default preset / named preset
  / one-shot prompt).
- Dashboard + Desktop manage named presets; presets appear in model pickers.
- Parallel reference fan-out in agent/moa_loop.py with regression test.
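
One way the three `/moa` forms listed above could be told apart, as a hedged sketch. The split of the command line mirrors the `cmd_original.split(None, 1)` call visible in the cli.py hunk; the dispatch rule, the function name `parse_moa_command`, and its parameters are assumptions made for illustration and may not match the real handler.

```python
def parse_moa_command(command: str, presets: set[str], default: str) -> tuple[str, str]:
    """Return (preset, prompt) for a '/moa ...' command line (hypothetical rule)."""
    parts = command.split(None, 1)
    payload = parts[1].strip() if len(parts) > 1 else ""
    if not payload:
        # Bare "/moa": select the default preset.
        return default, ""
    if payload in presets:
        # "/moa <preset>": select that named preset.
        return payload, ""
    # Anything else is treated as a one-shot prompt run with the default preset.
    return default, payload


known = {"fast", "deep"}
bare = parse_moa_command("/moa", known, "fast")
named = parse_moa_command("/moa deep", known, "fast")
one_shot = parse_moa_command("/moa explain this diff", known, "fast")
```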

* fix(moa): thread moa_config through _run_agent to _run_agent_inner

The reconstructed gateway MoA wiring declared moa_config on _run_agent (the
profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper
never forwarded it — _run_agent_inner had no such parameter, so the runtime hit
NameError: name 'moa_config' is not defined on the compression-failure session
sync path. Add moa_config to _run_agent_inner's signature and forward it from
both wrapper call sites (multiplex and non-multiplex). Caught by
tests/gateway/test_compression_failure_session_sync.py on CI shard test(4).
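
The failure mode above can be reproduced in a few lines. This is an illustrative sketch only: `run`, `run_inner`, and `run_inner_broken` are hypothetical stand-ins for the wrapper and inner function, not the gateway code.

```python
from typing import Any, Optional


def run_inner(message: str, moa_config: Optional[dict[str, Any]] = None) -> str:
    # The inner function declares the parameter it reads.
    preset = (moa_config or {}).get("preset", "none")
    return f"{message} [moa={preset}]"


def run(message: str, moa_config: Optional[dict[str, Any]] = None) -> str:
    # The wrapper forwards moa_config explicitly at its call site.
    return run_inner(message, moa_config=moa_config)


def run_inner_broken(message: str) -> str:
    # moa_config is neither a parameter nor a global here, so Python only
    # fails when this line actually executes, raising NameError at runtime.
    return f"{message} [moa={moa_config}]"  # noqa: F821


ok = run("hi", {"preset": "default"})

try:
    run_inner_broken("hi")
    broken_error = ""
except NameError as exc:
    broken_error = str(exc)
```

Because the name is resolved only when the line runs, the bug stays hidden until a test exercises that specific path, which is consistent with it surfacing on a single CI shard.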

* fix(moa): classify moa as a virtual provider in the catalog

The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so
provider_catalog() fell through to the default auth_type="api_key" with no
env vars — tripping two catalog invariants:
  - test_provider_catalog: api_key providers must expose a credential env var
  - test_provider_parity: every hermes-model provider must be desktop-configurable

moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that
overlay as an auth_type fallback so the catalog reports moa as virtual (no real
credential, no network endpoint). Exempt virtual providers from the desktop
parity union check the same way 'custom' is exempt — derived from the catalog,
not a hardcoded slug, so future virtual providers are covered too.
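
A rough sketch of the fallback and the derived exemption described above. The dictionaries `REGISTRY` and `OVERLAYS` and the helpers `auth_type_for`, `catalog`, and `desktop_parity_required` are hypothetical simplifications, not the real `PROVIDER_REGISTRY`, `HERMES_OVERLAYS`, or `provider_catalog()`.

```python
# Hypothetical, simplified stand-ins for the registry and overlay tables.
REGISTRY = {"openai": {"auth_type": "api_key", "env_vars": ["OPENAI_API_KEY"]}}
OVERLAYS = {"moa": {"auth_type": "virtual"}}


def auth_type_for(slug: str) -> str:
    """Prefer the registry entry, then the overlay, then the api_key default."""
    entry = REGISTRY.get(slug) or {}
    if entry.get("auth_type"):
        return entry["auth_type"]
    overlay = OVERLAYS.get(slug) or {}
    return overlay.get("auth_type") or "api_key"


def catalog(slugs: list[str]) -> list[dict[str, str]]:
    return [{"slug": slug, "auth_type": auth_type_for(slug)} for slug in slugs]


def desktop_parity_required(rows: list[dict[str, str]]) -> set[str]:
    # The exemption is derived from the catalog rows rather than a hardcoded
    # slug, so any future virtual provider is skipped automatically.
    exempt = {row["slug"] for row in rows if row["auth_type"] == "virtual"}
    exempt.add("custom")
    return {row["slug"] for row in rows} - exempt


rows = catalog(["openai", "moa", "custom"])
required = desktop_parity_required(rows)
```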
Teknium 2026-06-25 13:52:06 -07:00 committed by GitHub
parent f284d85efa
commit c6575df927
58 changed files with 2264 additions and 765 deletions


@ -74,7 +74,7 @@ _POLISHED_TOOLS = {
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
-"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
+"yb_send_dm", "yb_send_sticker",
}


@ -719,6 +719,15 @@ def init_agent(
print("🔑 Using credentials: Microsoft Entra ID")
elif isinstance(effective_key, str) and len(effective_key) > 12:
print(f"🔑 Using token: {effective_key[:8]}...{effective_key[-4:]}")
elif agent.provider == "moa":
from agent.moa_loop import MoAClient
agent.api_mode = "chat_completions"
agent.client = MoAClient(agent.model or "default")
agent._client_kwargs = {}
agent.api_key = api_key or "moa-virtual-provider"
agent.base_url = base_url or "moa://local"
if not agent.quiet_mode:
print(f"🤖 AI Agent initialized with MoA preset: {agent.model}")
elif agent.api_mode == "bedrock_converse":
# AWS Bedrock — uses boto3 directly, no OpenAI client needed.
# Region is extracted from the base_url or defaults to us-east-1.


@ -502,6 +502,7 @@ def run_conversation(
stream_callback: Optional[callable] = None,
persist_user_message: Optional[str] = None,
persist_user_timestamp: Optional[float] = None,
moa_config: Optional[dict[str, Any]] = None,
) -> Dict[str, Any]:
"""
Run a complete conversation with tool calling until completion.
@ -524,6 +525,19 @@ def run_conversation(
Returns:
Dict: Complete conversation result with final response and message history
"""
if moa_config is None:
try:
from hermes_cli.moa_config import decode_moa_turn
_decoded_message, _decoded_moa_config = decode_moa_turn(user_message)
if _decoded_moa_config is not None:
user_message = _decoded_message
moa_config = _decoded_moa_config
if persist_user_message is None:
persist_user_message = _decoded_message
except Exception:
pass
# ── Per-turn setup (the prologue) ──
# All once-per-turn setup — stdio guarding, retry-counter resets, user
# message sanitization, todo/nudge hydration, system-prompt restore-or-
@ -802,6 +816,29 @@ def run_conversation(
if effective_system:
api_messages = [{"role": "system", "content": effective_system}] + api_messages
if moa_config:
try:
from agent.moa_loop import aggregate_moa_context
_moa_context = aggregate_moa_context(
user_prompt=original_user_message if isinstance(original_user_message, str) else str(original_user_message),
api_messages=api_messages,
reference_models=moa_config.get("reference_models") or [],
aggregator=moa_config.get("aggregator") or {},
temperature=float(moa_config.get("reference_temperature", 0.6) or 0.6),
aggregator_temperature=float(moa_config.get("aggregator_temperature", 0.4) or 0.4),
max_tokens=int(moa_config.get("max_tokens", 4096) or 4096),
)
if _moa_context:
for _msg in reversed(api_messages):
if _msg.get("role") == "user":
_base = _msg.get("content", "")
if isinstance(_base, str):
_msg["content"] = _base + "\n\n" + _moa_context
break
except Exception as _moa_exc:
logger.warning("MoA context aggregation failed: %s", _moa_exc)
# Inject ephemeral prefill messages right after the system prompt
# but before conversation history. Same API-call-time-only pattern.
if agent.prefill_messages:
@ -1123,7 +1160,7 @@ def run_conversation(
# stream. Mirror the ACP exclusion used for Responses
# API upgrade (lines ~1083-1085).
elif (
-agent.provider == "copilot-acp"
+agent.provider in {"copilot-acp", "moa"}
or str(agent.base_url or "").lower().startswith("acp://copilot")
or str(agent.base_url or "").lower().startswith("acp+tcp://")
):


@ -368,7 +368,7 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -
"search_files": "pattern", "browser_navigate": "url",
"browser_click": "ref", "browser_type": "text",
"image_generate": "prompt", "text_to_speech": "text",
-"vision_analyze": "question", "mixture_of_agents": "user_prompt",
+"vision_analyze": "question",
"skill_view": "name", "skills_list": "category",
"cronjob": "action",
"execute_code": "code", "delegate_task": "goal",
@ -1216,8 +1216,6 @@ def get_cute_tool_message(
return _wrap(f"┊ 🔊 speak {_trunc(args.get('text', ''), 30)} {dur}")
if tool_name == "vision_analyze":
return _wrap(f"┊ 👁️ vision {_trunc(args.get('question', ''), 30)} {dur}")
-if tool_name == "mixture_of_agents":
-return _wrap(f"┊ 🧠 reason {_trunc(args.get('user_prompt', ''), 30)} {dur}")
if tool_name == "send_message":
return _wrap(f"┊ 📨 send {args.get('target', '?')}: \"{_trunc(args.get('message', ''), 25)}\" {dur}")
if tool_name == "cronjob":

306
agent/moa_loop.py Normal file

@ -0,0 +1,306 @@
"""Mixture-of-Agents runtime helpers for /moa turns.

The slash command is deliberately not a model tool. It marks one user turn as
MoA-enabled; the normal Hermes agent loop still owns tool calling and turn
termination, while this module gathers reference-model context before each model
iteration.
"""

from __future__ import annotations

import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any

from agent.auxiliary_client import call_llm
from agent.transports import get_transport

logger = logging.getLogger(__name__)

# Upper bound on concurrent reference-model calls. References are independent
# advisory calls (no tools, no inter-dependence), so we fan them out the same
# way delegate_task runs a batch: all in flight at once, results collected when
# every reference finishes. Presets rarely list more than a handful of
# references; this cap just protects against a pathologically large preset
# opening dozens of sockets at once.
_MAX_REFERENCE_WORKERS = 8


def _slot_label(slot: dict[str, str]) -> str:
    return f"{slot.get('provider', '').strip()}:{slot.get('model', '').strip()}"


def _run_reference(
    slot: dict[str, str],
    ref_messages: list[dict[str, Any]],
    *,
    temperature: float,
    max_tokens: int,
) -> tuple[str, str]:
    """Call one reference model and return ``(label, text)``.

    Never raises: a failed reference becomes a labelled note so the aggregator
    can still act with partial context. Designed to run inside a thread pool:
    ``call_llm`` is synchronous/blocking, so threads (not asyncio) are the right
    concurrency primitive, mirroring ``delegate_task``'s batch fan-out.
    """
    label = _slot_label(slot)
    try:
        response = call_llm(
            task="moa_reference",
            provider=slot["provider"],
            model=slot["model"],
            messages=ref_messages,
            temperature=temperature,
            max_tokens=max_tokens,
        )
        return label, _extract_text(response) or "(empty response)"
    except Exception as exc:
        logger.warning("MoA reference model %s failed: %s", label, exc)
        return label, f"[failed: {exc}]"


def _run_references_parallel(
    reference_models: list[dict[str, str]],
    ref_messages: list[dict[str, Any]],
    *,
    temperature: float,
    max_tokens: int,
) -> list[tuple[str, str]]:
    """Fan out all reference models in parallel, returning outputs in order.

    Like ``delegate_task``'s batch mode, every reference is dispatched at once
    and we block until all of them finish before handing the joined results to
    the aggregator. Output order matches ``reference_models`` so the
    ``Reference {idx}`` labelling stays stable. MoA presets that reference
    another MoA preset are skipped here (recursion guard) with a labelled note.
    """
    if not reference_models:
        return []
    results: list[tuple[str, str] | None] = [None] * len(reference_models)
    futures = {}
    workers = min(_MAX_REFERENCE_WORKERS, len(reference_models))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        for idx, slot in enumerate(reference_models):
            if slot.get("provider") == "moa":
                results[idx] = (
                    _slot_label(slot),
                    "[skipped: MoA presets cannot recursively reference MoA]",
                )
                continue
            futures[
                executor.submit(
                    _run_reference,
                    slot,
                    ref_messages,
                    temperature=temperature,
                    max_tokens=max_tokens,
                )
            ] = idx
        # Collect every reference before returning — the aggregator needs the
        # complete set, so there is no early-exit / first-completed path here.
        for future, idx in futures.items():
            results[idx] = future.result()
    return [r for r in results if r is not None]


def _reference_messages(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Build an advisory-safe view of the conversation for reference models.

    Reference calls are advisory: they never call tools and never emit the
    ``tool_calls`` the main model did. Replaying the full transcript verbatim
    (a) re-bills the ~8K-token Hermes system prompt per reference per
    iteration and (b) risks 400s from strict providers (Mistral, Fireworks)
    that reject orphan ``tool`` messages or ``tool_calls`` the reference never
    produced. We keep only the user/assistant *text* turns, dropping the
    system prompt, any ``tool``-role messages, and any ``tool_calls`` payloads.
    """
    trimmed: list[dict[str, Any]] = []
    for msg in messages:
        role = msg.get("role")
        if role not in ("user", "assistant"):
            # Drop system prompt and tool-result messages.
            continue
        content = msg.get("content")
        if not isinstance(content, str):
            # Skip non-text (multimodal/tool-call-only) assistant turns.
            if not content:
                continue
        text = content if isinstance(content, str) else ""
        if role == "assistant" and not text.strip():
            # Assistant turn that was purely tool calls — nothing advisory.
            continue
        trimmed.append({"role": role, "content": text})
    if not trimmed:
        # Degenerate case (e.g. first turn was stripped): fall back to a
        # minimal user turn so the reference still has something to answer.
        for msg in reversed(messages):
            if msg.get("role") == "user" and isinstance(msg.get("content"), str):
                return [{"role": "user", "content": msg["content"]}]
    return trimmed


def _extract_text(response: Any) -> str:
    try:
        transport = get_transport("chat_completions")
        if transport is None:
            raise RuntimeError("chat_completions transport unavailable")
        normalized = transport.normalize_response(response)
        text = (normalized.content or "").strip()
        if text:
            return text
    except Exception:
        pass
    try:
        content = response.choices[0].message.content
        return (content or "").strip()
    except Exception:
        return ""


def aggregate_moa_context(
    *,
    user_prompt: str,
    api_messages: list[dict[str, Any]],
    reference_models: list[dict[str, str]],
    aggregator: dict[str, str],
    temperature: float = 0.6,
    aggregator_temperature: float = 0.4,
    max_tokens: int = 4096,
) -> str:
    """Run configured reference models and synthesize their advice.

    Failures are returned as model-specific notes instead of aborting the normal
    agent loop; the main model can still act with partial context.
    """
    reference_outputs: list[tuple[str, str]] = []
    ref_messages = _reference_messages(api_messages)
    reference_outputs = _run_references_parallel(
        reference_models,
        ref_messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    joined = "\n\n".join(
        f"Reference {idx}{label}:\n{text}"
        for idx, (label, text) in enumerate(reference_outputs, start=1)
    )
    synth_prompt = (
        "You are the aggregator in a Mixture of Agents process. Synthesize the "
        "reference responses into concise, actionable guidance for the main "
        "Hermes agent. Focus on next steps, tool-use strategy, risks, and any "
        "disagreements. Do not answer the user directly unless that is all that "
        "is needed; produce context the main agent should use in its normal loop.\n\n"
        f"Original user prompt:\n{user_prompt}\n\n"
        f"Reference responses:\n{joined}"
    )
    agg_label = _slot_label(aggregator)
    try:
        response = call_llm(
            task="moa_aggregator",
            provider=aggregator["provider"],
            model=aggregator["model"],
            messages=[{"role": "user", "content": synth_prompt}],
            temperature=aggregator_temperature,
            max_tokens=max_tokens,
        )
        synthesis = _extract_text(response)
    except Exception as exc:
        logger.warning("MoA aggregator model %s failed: %s", agg_label, exc)
        synthesis = ""
    if not synthesis:
        synthesis = joined
    return (
        "[Mixture of Agents context — use this as private guidance for the "
        "normal Hermes agent loop. You may call tools, continue reasoning, or "
        "finish normally.]\n"
        f"Aggregator: {agg_label}\n"
        f"References: {', '.join(_slot_label(slot) for slot in reference_models)}\n\n"
        f"{synthesis.strip()}"
    )


class MoAChatCompletions:
    """OpenAI-chat-compatible facade where the aggregator is the acting model."""

    def __init__(self, preset_name: str):
        self.preset_name = preset_name or "default"

    def create(self, **api_kwargs: Any) -> Any:
        from hermes_cli.config import load_config
        from hermes_cli.moa_config import resolve_moa_preset

        preset = resolve_moa_preset(load_config().get("moa") or {}, self.preset_name)
        messages = list(api_kwargs.get("messages") or [])
        reference_models = preset.get("reference_models") or []
        aggregator = preset.get("aggregator") or {}
        max_tokens = int(preset.get("max_tokens", api_kwargs.get("max_tokens") or 4096) or 4096)
        temperature = float(preset.get("reference_temperature", 0.6) or 0.6)
        aggregator_temperature = float(preset.get("aggregator_temperature", api_kwargs.get("temperature") or 0.4) or 0.4)
        # When the preset is disabled, skip the reference fan-out and let the
        # configured aggregator act alone — it is the preset's acting model, so
        # a disabled MoA preset is simply "use the aggregator directly."
        if not preset.get("enabled", True):
            reference_models = []
        reference_outputs: list[tuple[str, str]] = []
        ref_messages = _reference_messages(messages)
        reference_outputs = _run_references_parallel(
            reference_models,
            ref_messages,
            temperature=temperature,
            max_tokens=max_tokens,
        )
        agg_messages = [dict(m) for m in messages]
        if reference_outputs:
            joined = "\n\n".join(
                f"Reference {idx}{label}:\n{text}"
                for idx, (label, text) in enumerate(reference_outputs, start=1)
            )
            guidance = (
                "[Mixture of Agents reference context]\n"
                f"Preset: {self.preset_name}\n"
                f"Aggregator/acting model: {_slot_label(aggregator)}\n"
                f"References: {', '.join(label for label, _ in reference_outputs)}\n\n"
                "Use the reference responses below as private context. You are the aggregator and acting model: "
                "answer the user directly or call tools as needed.\n\n"
                f"{joined}"
            )
            for msg in reversed(agg_messages):
                if msg.get("role") == "user" and isinstance(msg.get("content"), str):
                    msg["content"] = msg["content"] + "\n\n" + guidance
                    break
            else:
                agg_messages.append({"role": "user", "content": guidance})
        if aggregator.get("provider") == "moa":
            raise RuntimeError("MoA aggregator cannot be another MoA preset")
        agg_kwargs = dict(api_kwargs)
        agg_kwargs["messages"] = agg_messages
        agg_kwargs["model"] = aggregator.get("model")
        agg_kwargs["temperature"] = aggregator_temperature
        return call_llm(
            task="moa_aggregator",
            provider=aggregator.get("provider"),
            model=aggregator.get("model"),
            messages=agg_messages,
            temperature=aggregator_temperature,
            max_tokens=agg_kwargs.get("max_tokens"),
            tools=agg_kwargs.get("tools"),
            extra_body=agg_kwargs.get("extra_body"),
        )


class MoAClient:
    def __init__(self, preset_name: str):
        self.chat = type("_MoAChat", (), {})()
        self.chat.completions = MoAChatCompletions(preset_name)
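
The client above only needs to expose the attribute chain `client.chat.completions.create(...)` for callers written against an OpenAI-style client. A minimal, self-contained sketch of that duck-typed facade follows, assuming a hypothetical `EchoCompletions` backend that echoes the last user message instead of calling any model; `FacadeClient` and the returned dictionary shape are illustrative only.

```python
from types import SimpleNamespace
from typing import Any


class EchoCompletions:
    """Hypothetical backend exposing an OpenAI-style create() method."""

    def __init__(self, preset_name: str):
        # An empty name falls back to "default", as in the facade above.
        self.preset_name = preset_name or "default"

    def create(self, **api_kwargs: Any) -> dict[str, Any]:
        messages = list(api_kwargs.get("messages") or [])
        # Find the most recent user message, if any.
        last_user = next(
            (m["content"] for m in reversed(messages) if m.get("role") == "user"),
            "",
        )
        return {"preset": self.preset_name, "echo": last_user}


class FacadeClient:
    def __init__(self, preset_name: str):
        # SimpleNamespace provides the client.chat.completions attribute chain
        # without defining a dedicated class for the intermediate object.
        self.chat = SimpleNamespace(completions=EchoCompletions(preset_name))


client = FacadeClient("")
reply = client.chat.completions.create(messages=[{"role": "user", "content": "hi"}])
```

`types.SimpleNamespace` is used here in place of the `type("_MoAChat", (), {})()` construction; both yield a plain object that accepts arbitrary attributes.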


@ -8,13 +8,15 @@ import {
getAuxiliaryModels,
getGlobalModelInfo,
getGlobalModelOptions,
-getHermesConfigRecord,
+getMoaModels,
getRecommendedDefaultModel,
+saveMoaModels,
+getHermesConfigRecord,
saveHermesConfig,
setEnvVar,
setModelAssignment
} from '@/hermes'
-import type { AuxiliaryModelsResponse, ModelOptionProvider, StaleAuxAssignment } from '@/hermes'
+import type { AuxiliaryModelsResponse, MoaConfigResponse, MoaModelSlot, ModelOptionProvider, StaleAuxAssignment } from '@/hermes'
import { useI18n } from '@/i18n'
import { AlertTriangle, Cpu, Loader2 } from '@/lib/icons'
import { cn } from '@/lib/utils'
@ -115,6 +117,9 @@ export function ModelSettings({ onMainModelChanged }: ModelSettingsProps) {
const [selectedProvider, setSelectedProvider] = useState('')
const [selectedModel, setSelectedModel] = useState('')
const [auxiliary, setAuxiliary] = useState<AuxiliaryModelsResponse | null>(null)
const [moa, setMoa] = useState<MoaConfigResponse | null>(null)
const [selectedMoaPreset, setSelectedMoaPreset] = useState('')
const [newMoaPresetName, setNewMoaPresetName] = useState('')
// Full profile config, kept so the reasoning/speed defaults round-trip
// (read agent.* → write back the whole record) like the generic config page.
const [config, setConfig] = useState<HermesConfigRecord | null>(null)
@ -134,10 +139,11 @@ export function ModelSettings({ onMainModelChanged }: ModelSettingsProps) {
setError('')
try {
-const [modelInfo, modelOptions, auxiliaryModels, cfg] = await Promise.all([
+const [modelInfo, modelOptions, auxiliaryModels, moaModels, cfg] = await Promise.all([
getGlobalModelInfo(),
getGlobalModelOptions(),
getAuxiliaryModels(),
+getMoaModels().catch(() => null),
getHermesConfigRecord()
])
@ -146,6 +152,11 @@ export function ModelSettings({ onMainModelChanged }: ModelSettingsProps) {
setSelectedProvider(prev => prev || modelInfo.provider)
setSelectedModel(prev => prev || modelInfo.model)
setAuxiliary(auxiliaryModels)
setMoa(moaModels)
if (moaModels) {
setSelectedMoaPreset(prev => prev && moaModels.presets[prev] ? prev : moaModels.default_preset)
}
setConfig(cfg)
} catch (err) {
setError(err instanceof Error ? err.message : String(err))
@ -183,6 +194,62 @@ export function ModelSettings({ onMainModelChanged }: ModelSettingsProps) {
[auxDraft.provider, providers]
)
const modelsForProvider = useCallback(
(provider: string) => providers.find(row => row.slug === provider)?.models ?? [],
[providers]
)
const currentMoaPreset = useMemo(() => {
if (!moa) {
return null
}
return moa.presets[selectedMoaPreset] || moa.presets[moa.default_preset] || Object.values(moa.presets)[0] || null
}, [moa, selectedMoaPreset])
const updateMoaPreset = useCallback(
(updater: (preset: NonNullable<typeof currentMoaPreset>) => NonNullable<typeof currentMoaPreset>) => {
setMoa(prev => {
if (!prev || !selectedMoaPreset || !prev.presets[selectedMoaPreset]) {
return prev
}
return {
...prev,
presets: {
...prev.presets,
[selectedMoaPreset]: updater(prev.presets[selectedMoaPreset])
}
}
})
},
[selectedMoaPreset]
)
const updateMoaSlot = useCallback((slot: MoaModelSlot, patch: Partial<MoaModelSlot>): MoaModelSlot => {
const next = { ...slot, ...patch }
if (patch.provider) {
next.model = ''
}
return next
}, [])
const saveMoa = useCallback(async (next: MoaConfigResponse) => {
setApplying(true)
setError('')
try {
const saved = await saveMoaModels(next)
setMoa(saved)
} catch (err) {
setError(err instanceof Error ? err.message : String(err))
} finally {
setApplying(false)
}
}, [])
const auxiliaryTaskLabel = useCallback((key: string) => m.tasks[key]?.label ?? key, [m.tasks])
// Persistent mismatch: any aux slot pinned to a provider different from the
@ -658,6 +725,115 @@ export function ModelSettings({ onMainModelChanged }: ModelSettingsProps) {
})}
</div>
</section>
{moa && currentMoaPreset && (
<section>
<div className="mb-2.5 flex items-center justify-between">
<SectionHeading icon={Cpu} title="Mixture of Agents" />
<Button disabled={applying} onClick={() => void saveMoa(moa)} size="sm" variant="textStrong">
{applying ? m.applying : t.common.save}
</Button>
</div>
<p className="mb-2 text-xs text-muted-foreground">
Configure named presets that appear as models under the Mixture of Agents provider. The aggregator is the acting model.
</p>
<div className="mb-2 flex flex-wrap items-center gap-2">
<Select onValueChange={setSelectedMoaPreset} value={selectedMoaPreset || moa.default_preset}>
<SelectTrigger className={cn('min-w-40', CONTROL_TEXT)}><SelectValue placeholder="Preset" /></SelectTrigger>
<SelectContent>{Object.keys(moa.presets).map(name => <SelectItem key={name} value={name}>{name}</SelectItem>)}</SelectContent>
</Select>
<Button disabled={applying} onClick={() => setMoa(prev => prev && ({ ...prev, default_preset: selectedMoaPreset || prev.default_preset }))} size="sm" variant="text">
Set default
</Button>
<Button
disabled={Object.keys(moa.presets).length <= 1 || applying}
onClick={() => {
setMoa(prev => {
if (!prev || Object.keys(prev.presets).length <= 1) {
return prev
}
const next = { ...prev.presets }
delete next[selectedMoaPreset]
const fallback = Object.keys(next)[0]
return {
...prev,
presets: next,
default_preset: prev.default_preset === selectedMoaPreset ? fallback : prev.default_preset,
active_preset: prev.active_preset === selectedMoaPreset ? '' : prev.active_preset
}
})
setSelectedMoaPreset(Object.keys(moa.presets).find(name => name !== selectedMoaPreset) || '')
}}
size="sm"
variant="ghost"
>
Delete
</Button>
<Input className={cn('w-40', CONTROL_TEXT)} onChange={event => setNewMoaPresetName(event.target.value)} placeholder="new preset" value={newMoaPresetName} />
<Button
disabled={!newMoaPresetName.trim() || !!moa.presets[newMoaPresetName.trim()] || applying}
onClick={() => {
const name = newMoaPresetName.trim()
setMoa(prev => prev && ({
...prev,
presets: { ...prev.presets, [name]: { ...currentMoaPreset, reference_models: [...currentMoaPreset.reference_models] } }
}))
setSelectedMoaPreset(name)
setNewMoaPresetName('')
}}
size="sm"
variant="textStrong"
>
Add preset
</Button>
</div>
<div className="mb-2 text-xs text-muted-foreground">Default: <span className="font-mono">{moa.default_preset}</span></div>
<div className="grid gap-1">
{currentMoaPreset.reference_models.map((slot, index) => (
<ListRow
below={
<div className="mt-2 flex flex-wrap items-center gap-2 pt-1">
<Select onValueChange={value => updateMoaPreset(prev => ({ ...prev, reference_models: prev.reference_models.map((s, i) => i === index ? updateMoaSlot(s, { provider: value }) : s) }))} value={slot.provider}>
<SelectTrigger className={cn('min-w-32', CONTROL_TEXT)}><SelectValue placeholder={m.provider} /></SelectTrigger>
<SelectContent>{providerOptions.map(provider => <SelectItem key={provider.slug || 'none'} value={provider.slug || 'none'}>{provider.name}</SelectItem>)}</SelectContent>
</Select>
<Select onValueChange={value => updateMoaPreset(prev => ({ ...prev, reference_models: prev.reference_models.map((s, i) => i === index ? updateMoaSlot(s, { model: value }) : s) }))} value={slot.model}>
<SelectTrigger className={cn('min-w-48', CONTROL_TEXT)}><SelectValue placeholder={m.model} /></SelectTrigger>
<SelectContent>{modelsForProvider(slot.provider).map(model => <SelectItem key={model} value={model}>{model}</SelectItem>)}</SelectContent>
</Select>
<Button disabled={currentMoaPreset.reference_models.length <= 1 || applying} onClick={() => updateMoaPreset(prev => ({ ...prev, reference_models: prev.reference_models.filter((_, i) => i !== index) }))} size="sm" variant="ghost">
Remove
</Button>
</div>
}
description={<span className="font-mono text-[0.68rem]">{slot.provider} · {slot.model}</span>}
key={`${selectedMoaPreset}-${slot.provider}-${slot.model}-${index}`}
title={`Reference ${index + 1}`}
/>
))}
<Button disabled={applying} onClick={() => updateMoaPreset(prev => ({ ...prev, reference_models: [...prev.reference_models, prev.aggregator] }))} size="sm" variant="textStrong">
Add reference model
</Button>
<ListRow
below={
<div className="mt-2 flex flex-wrap items-center gap-2 pt-1">
<Select onValueChange={value => updateMoaPreset(prev => ({ ...prev, aggregator: updateMoaSlot(prev.aggregator, { provider: value }) }))} value={currentMoaPreset.aggregator.provider}>
<SelectTrigger className={cn('min-w-32', CONTROL_TEXT)}><SelectValue placeholder={m.provider} /></SelectTrigger>
<SelectContent>{providerOptions.map(provider => <SelectItem key={provider.slug || 'none'} value={provider.slug || 'none'}>{provider.name}</SelectItem>)}</SelectContent>
</Select>
<Select onValueChange={value => updateMoaPreset(prev => ({ ...prev, aggregator: updateMoaSlot(prev.aggregator, { model: value }) }))} value={currentMoaPreset.aggregator.model}>
<SelectTrigger className={cn('min-w-48', CONTROL_TEXT)}><SelectValue placeholder={m.model} /></SelectTrigger>
<SelectContent>{modelsForProvider(currentMoaPreset.aggregator.provider).map(model => <SelectItem key={model} value={model}>{model}</SelectItem>)}</SelectContent>
</Select>
</div>
}
description={<span className="font-mono text-[0.68rem]">{currentMoaPreset.aggregator.provider} · {currentMoaPreset.aggregator.model}</span>}
title="Aggregator"
/>
</div>
</section>
)}
</div>
)
}


@ -16,7 +16,7 @@ import {
} from '@/components/ui/dropdown-menu'
import { Skeleton } from '@/components/ui/skeleton'
import type { HermesGateway } from '@/hermes'
-import { getGlobalModelOptions } from '@/hermes'
+import { getGlobalModelOptions, getMoaModels } from '@/hermes'
import { useI18n } from '@/i18n'
import { currentPickerSelection, displayModelName, modelDisplayParts, reasoningEffortLabel } from '@/lib/model-status-label'
import { cn } from '@/lib/utils'
@ -37,7 +37,7 @@ import {
$currentProvider,
$currentReasoningEffort
} from '@/store/session'
-import type { ModelOptionProvider, ModelOptionsResponse } from '@/types/hermes'
+import type { MoaConfigResponse, ModelOptionProvider, ModelOptionsResponse } from '@/types/hermes'
import { ModelEditSubmenu, resolveFastControl } from './model-edit-submenu'
@ -64,6 +64,7 @@ export function ModelMenuPanel({ gateway, onSelectModel, requestGateway }: Model
const [search, setSearch] = useState('')
const [refreshing, setRefreshing] = useState(false)
const queryClient = useQueryClient()
const [activeMoaPreset, setActiveMoaPreset] = useState('')
// Reactive session state is read from the stores here (not drilled in), so
// toggling effort/fast/model re-renders this panel in place without forcing
// the parent to rebuild the menu content (which would close the dropdown).
@ -86,6 +87,11 @@ export function ModelMenuPanel({ gateway, onSelectModel, requestGateway }: Model
}
})
const moaOptions = useQuery({
queryKey: ['moa-presets'],
queryFn: (): Promise<MoaConfigResponse> => getMoaModels()
})
const { model: optionsModel, provider: optionsProvider } = currentPickerSelection(
!!activeSessionId,
{ model: currentModel, provider: currentProvider },
@ -169,6 +175,15 @@ export function ModelMenuPanel({ gateway, onSelectModel, requestGateway }: Model
)
}
const toggleMoaPreset = async (preset: string) => {
if (!activeSessionId) {
return
}
await requestGateway('command.dispatch', { name: 'moa', arg: preset, session_id: activeSessionId })
setActiveMoaPreset(current => (current === preset ? '' : preset))
}
const groups = useMemo(
() => groupModels(providers ?? [], search, { model: optionsModel, provider: optionsProvider }, effectiveVisibleModels),
[providers, search, optionsModel, optionsProvider, effectiveVisibleModels]
@ -302,6 +317,27 @@ export function ModelMenuPanel({ gateway, onSelectModel, requestGateway }: Model
<DropdownMenuSeparator className="mx-0" />
{moaOptions.data && Object.keys(moaOptions.data.presets ?? {}).length > 0 ? (
<>
<DropdownMenuLabel className={dropdownMenuSectionLabel}>MoA presets</DropdownMenuLabel>
{Object.keys(moaOptions.data.presets).map(preset => (
<DropdownMenuItem
className={dropdownMenuRow}
disabled={!activeSessionId}
key={`moa:${preset}`}
onSelect={event => {
event.preventDefault()
void toggleMoaPreset(preset)
}}
>
<span className="min-w-0 flex-1 truncate">MoA: {preset}</span>
{activeMoaPreset === preset ? <Codicon className="ml-auto text-foreground" name="check" size="0.75rem" /> : null}
</DropdownMenuItem>
))}
<DropdownMenuSeparator className="mx-0" />
</>
) : null}
<DropdownMenuItem
className={cn(dropdownMenuRow, 'text-(--ui-text-tertiary)')}
disabled={refreshing}


@ -23,6 +23,7 @@ import type {
MessagingPlatformsResponse,
MessagingPlatformTestResponse,
MessagingPlatformUpdate,
MoaConfigResponse,
ModelAssignmentRequest,
ModelAssignmentResponse,
ModelInfoResponse,
@ -85,6 +86,8 @@ export type {
MessagingPlatformsResponse,
MessagingPlatformTestResponse,
MessagingPlatformUpdate,
MoaConfigResponse,
MoaModelSlot,
ModelAssignmentRequest,
ModelAssignmentResponse,
ModelInfoResponse,
@ -746,6 +749,22 @@ export function getAuxiliaryModels(): Promise<AuxiliaryModelsResponse> {
})
}
export function getMoaModels(): Promise<MoaConfigResponse> {
return window.hermesDesktop.api<MoaConfigResponse>({
...profileScoped(),
path: '/api/model/moa'
})
}
export function saveMoaModels(body: MoaConfigResponse): Promise<MoaConfigResponse & { ok: boolean }> {
return window.hermesDesktop.api<MoaConfigResponse & { ok: boolean }>({
...profileScoped(),
path: '/api/model/moa',
method: 'PUT',
body
})
}
export function setModelAssignment(body: ModelAssignmentRequest): Promise<ModelAssignmentResponse> {
return window.hermesDesktop.api<ModelAssignmentResponse>({
...profileScoped(),


@ -725,6 +725,30 @@ export interface AuxiliaryModelsResponse {
tasks: AuxiliaryTaskAssignment[]
}
export interface MoaModelSlot {
provider: string
model: string
}
export interface MoaConfigResponse {
default_preset: string
active_preset: string
presets: Record<string, {
aggregator: MoaModelSlot
aggregator_temperature: number
enabled: boolean
max_tokens: number
reference_models: MoaModelSlot[]
reference_temperature: number
}>
aggregator: MoaModelSlot
aggregator_temperature: number
enabled: boolean
max_tokens: number
reference_models: MoaModelSlot[]
reference_temperature: number
}
export interface ModelAssignmentRequest {
/** Optional API key for a custom/local endpoint. Persisted to model.api_key
* (where the runtime reads it) for self-hosted endpoints that require auth.


@ -783,7 +783,6 @@ platform_toolsets:
# image_gen - image_generate (requires FAL_KEY)
# skills - skills_list, skill_view
# skills_hub - skill_hub (search/install/manage from online registries — user-driven only)
# moa - mixture_of_agents (requires OPENROUTER_API_KEY)
# todo - todo (in-memory task planning, no deps)
# tts - text_to_speech (Edge TTS free, or ELEVENLABS/OPENAI/MINIMAX/MISTRAL key)
# cronjob - cronjob (create/list/update/pause/resume/run/remove scheduled tasks)
@ -798,7 +797,7 @@ platform_toolsets:
#
# COMPOSITE:
# debugging - terminal + web + file
# safe - web + vision + moa (no terminal access)
# safe - web + vision (no terminal access)
# all - Everything available
#
# web - Web search and content extraction (web_search, web_extract)
@ -809,7 +808,6 @@ platform_toolsets:
# vision - Image analysis (vision_analyze)
# image_gen - Image generation with FLUX (image_generate)
# skills - Load skill documents (skills_list, skill_view)
# moa - Mixture of Agents reasoning (mixture_of_agents)
# todo - Task planning and tracking for multi-step work
# memory - Persistent memory across sessions (personal notes + user profile)
# session_search - Search and recall past conversations (FTS5 + Gemini Flash summarization)
@ -818,7 +816,7 @@ platform_toolsets:
#
# Composite toolsets:
# debugging - terminal + web + file (for troubleshooting)
# safe - web + vision + moa (no terminal access)
# safe - web + vision (no terminal access)
# NOTE: The top-level "toolsets" key is deprecated and ignored.
# Tool configuration is managed per-platform via platform_toolsets above.

58
cli.py

@ -8422,6 +8422,51 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "goal":
self._handle_goal_command(cmd_original)
elif canonical == "moa":
from hermes_cli.moa_config import (
exact_moa_preset_name,
moa_usage,
normalize_moa_config,
resolve_moa_preset,
)
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
moa_cfg = self.config.get("moa") if isinstance(self.config, dict) else {}
normalized = normalize_moa_config(moa_cfg)
matched_preset = exact_moa_preset_name(normalized, payload) if payload else normalized["default_preset"]
if matched_preset:
self.requested_provider = "moa"
self.provider = "moa"
self.model = matched_preset
self.api_key = "moa-virtual-provider"
self.base_url = "moa://local"
self.api_mode = "chat_completions"
self.agent = None
_cprint(f" Model switched to MoA preset: {matched_preset}.")
else:
if not payload:
_cprint(f" {moa_usage()}")
return True
preset = normalized["default_preset"]
self._pending_moa_restore_model = {
"requested_provider": getattr(self, "requested_provider", None),
"provider": getattr(self, "provider", None),
"model": getattr(self, "model", None),
"api_key": getattr(self, "api_key", None),
"base_url": getattr(self, "base_url", None),
"api_mode": getattr(self, "api_mode", None),
}
self.requested_provider = "moa"
self.provider = "moa"
self.model = preset
self.api_key = "moa-virtual-provider"
self.base_url = "moa://local"
self.api_mode = "chat_completions"
self.agent = None
self._pending_moa_disable_after_turn = True
self._pending_agent_seed = payload
_cprint(f" MoA one-shot queued with preset {preset}; previous model will be restored after this turn.")
elif canonical == "subgoal":
self._handle_subgoal_command(cmd_original)
elif canonical == "skin":
@ -11672,6 +11717,10 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
if _srn:
agent_message = _prepend_note_to_message(agent_message, _srn)
self._pending_skills_reload_note = None
_moa_cfg = getattr(self, "_pending_moa_config", None)
self._pending_moa_config = None
try:
result = self.agent.run_conversation(
user_message=agent_message,
@ -11679,7 +11728,16 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
stream_callback=stream_callback,
task_id=self.session_id,
persist_user_message=message if _voice_prefix else None,
moa_config=_moa_cfg,
)
if getattr(self, "_pending_moa_disable_after_turn", False):
_restore = getattr(self, "_pending_moa_restore_model", None) or {}
for _key, _value in _restore.items():
if _value is not None:
setattr(self, _key, _value)
self.agent = None
self._pending_moa_restore_model = None
self._pending_moa_disable_after_turn = False
except Exception as exc:
logging.error("run_conversation raised: %s", exc, exc_info=True)
_summary = getattr(self.agent, '_summarize_api_error', lambda e: str(e)[:300])(exc)
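
The one-shot branch of `/moa` snapshots the current model fields, switches to the MoA preset for a single turn, and restores the snapshot afterwards. A minimal sketch of that save/switch/restore pattern follows. `Session` and `run_moa_one_shot` are hypothetical stand-ins for `HermesCLI`; only the field names and the `moa` values come from the diff. The sketch restores in a `finally` block, which is an assumption here: the diff restores only after `run_conversation` returns without raising.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Attribute names copied from the snapshot dict in the diff.
MODEL_FIELDS = ("requested_provider", "provider", "model", "api_key", "base_url", "api_mode")


@dataclass
class Session:
    """Hypothetical stand-in for HermesCLI; only the field names come from the diff."""
    requested_provider: Optional[str] = "openrouter"
    provider: Optional[str] = "openrouter"
    model: Optional[str] = "example/model"
    api_key: Optional[str] = "example-key"
    base_url: Optional[str] = "https://example.invalid/v1"
    api_mode: Optional[str] = "chat_completions"
    agent: Any = "cached-agent"


def run_moa_one_shot(session: Session, preset: str, run_turn: Callable[[Session], Any]) -> Any:
    """Run one turn on the virtual moa provider, then restore the previous model."""
    snapshot = {name: getattr(session, name, None) for name in MODEL_FIELDS}
    session.requested_provider = "moa"
    session.provider = "moa"
    session.model = preset
    session.api_key = "moa-virtual-provider"
    session.base_url = "moa://local"
    session.api_mode = "chat_completions"
    session.agent = None  # force the agent to be rebuilt for the new model
    try:
        return run_turn(session)
    finally:
        # Assumption: restore even when the turn raises (the diff restores
        # only after a successful run_conversation call).
        for name, value in snapshot.items():
            if value is not None:
                setattr(session, name, value)
        session.agent = None  # rebuild again for the restored model
```

Resetting `agent` to `None` on both sides mirrors the diff, where the cached agent is dropped whenever the model fields change.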

@ -8028,6 +8028,9 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
return await self._handle_goal_command(event)
return "Agent is running — use /goal status / pause / clear / wait mid-run, or /stop before setting a new goal."
if _cmd_def_inner and _cmd_def_inner.name == "moa":
return "Agent is running — wait or /stop first, then run /moa."
# /subgoal is safe mid-run — it only modifies the goal's
# subgoals list, which the judge reads at the next turn
# boundary. No race with the running turn.
@ -8532,6 +8535,50 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
if canonical == "goal":
return await self._handle_goal_command(event)
if canonical == "moa":
from hermes_cli.moa_config import (
exact_moa_preset_name,
moa_usage,
normalize_moa_config,
resolve_moa_preset,
)
from hermes_cli.config import load_config
moa_payload = event.get_command_args().strip()
try:
cfg = load_config()
moa_cfg = normalize_moa_config(cfg.get("moa") if isinstance(cfg, dict) else {})
except Exception:
moa_cfg = normalize_moa_config({})
matched_preset = exact_moa_preset_name(moa_cfg, moa_payload) if moa_payload else moa_cfg["default_preset"]
if matched_preset:
self._session_model_overrides[_quick_key] = {
"provider": "moa",
"model": matched_preset,
"base_url": "moa://local",
"api_key": "moa-virtual-provider",
"api_mode": "chat_completions",
}
self._evict_cached_agent(_quick_key)
return f"Model switched to MoA preset: {matched_preset}."
if not moa_payload:
return moa_usage()
preset = moa_cfg["default_preset"]
try:
event.text = moa_payload
event._moa_restore_override = self._session_model_overrides.get(_quick_key)
self._session_model_overrides[_quick_key] = {
"provider": "moa",
"model": preset,
"base_url": "moa://local",
"api_key": "moa-virtual-provider",
"api_mode": "chat_completions",
}
self._evict_cached_agent(_quick_key)
event._moa_disable_after_turn = True
except Exception:
return "Failed to prepare MoA turn."
if canonical == "subgoal":
return await self._handle_subgoal_command(event)
@ -8741,6 +8788,16 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
try:
_agent_result = await self._handle_message_with_agent(event, source, _quick_key, _run_generation)
if getattr(event, "_moa_disable_after_turn", False):
try:
_restore = getattr(event, "_moa_restore_override", None)
if _restore is None:
self._session_model_overrides.pop(_quick_key, None)
else:
self._session_model_overrides[_quick_key] = _restore
self._evict_cached_agent(_quick_key)
except Exception:
pass
# Goal continuation: after the agent returns a final response
# for this turn, check any standing /goal — the judge will
# either mark it done, pause it (budget), or enqueue a
@ -9866,6 +9923,7 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
run_generation=run_generation,
event_message_id=self._reply_anchor_for_event(event),
channel_prompt=event.channel_prompt,
moa_config=getattr(event, "_moa_config", None),
persist_user_message=persist_user_message,
persist_user_timestamp=persist_user_timestamp,
)
@ -14681,6 +14739,7 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
_interrupt_depth: int = 0,
event_message_id: Optional[str] = None,
channel_prompt: Optional[str] = None,
moa_config: Optional[dict] = None,
persist_user_message: Optional[str] = None,
persist_user_timestamp: Optional[float] = None,
) -> Dict[str, Any]:
@ -14698,7 +14757,8 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
message, context_prompt, history, source, session_id,
session_key=session_key, run_generation=run_generation,
_interrupt_depth=_interrupt_depth, event_message_id=event_message_id,
channel_prompt=channel_prompt, persist_user_message=persist_user_message,
channel_prompt=channel_prompt, moa_config=moa_config,
persist_user_message=persist_user_message,
persist_user_timestamp=persist_user_timestamp,
)
@ -14708,7 +14768,8 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
message, context_prompt, history, source, session_id,
session_key=session_key, run_generation=run_generation,
_interrupt_depth=_interrupt_depth, event_message_id=event_message_id,
channel_prompt=channel_prompt, persist_user_message=persist_user_message,
channel_prompt=channel_prompt, moa_config=moa_config,
persist_user_message=persist_user_message,
persist_user_timestamp=persist_user_timestamp,
)
@ -14739,6 +14800,7 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
_interrupt_depth: int = 0,
event_message_id: Optional[str] = None,
channel_prompt: Optional[str] = None,
moa_config: Optional[dict] = None,
persist_user_message: Optional[str] = None,
persist_user_timestamp: Optional[float] = None,
) -> Dict[str, Any]:
@ -16322,6 +16384,8 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
_conversation_kwargs["persist_user_message"] = _persist_user_message_override
elif observed_group_context:
_conversation_kwargs["persist_user_message"] = message
if moa_config is not None:
_conversation_kwargs["moa_config"] = moa_config
if _persist_user_timestamp_override is not None:
_conversation_kwargs["persist_user_timestamp"] = _persist_user_timestamp_override
result = agent.run_conversation(_api_run_message, **_conversation_kwargs)

@ -109,6 +109,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | draft <text> | show | pause | resume | clear | status | wait <pid> | unwait]"),
CommandDef("moa", "Switch to a Mixture of Agents preset, or run one prompt through the default preset", "Session",
args_hint="[preset-name | prompt]"),
CommandDef("subgoal", "Add or manage extra criteria on the active goal", "Session",
args_hint="[text | remove N | clear]"),
CommandDef("status", "Show session, model, token, and context info", "Session"),
@ -1153,8 +1155,10 @@ _SLACK_PRIORITY_ALIASES = ("btw", "bg")
# "Slack-via-/hermes" decision, not a silent clamp.
# - credits: the billing/top-up surface; reached via /hermes credits on Slack.
# - billing: the terminal-billing surface (buy/auto-reload/limit); /hermes billing.
# - moa: high-cost slash mode, available through /hermes moa to avoid
# displacing existing native Slack slash commands at the 50-command cap.
# - debug: the log/report upload surface; reached via /hermes debug on Slack.
_SLACK_VIA_HERMES_ONLY = frozenset({"credits", "billing", "debug"})
_SLACK_VIA_HERMES_ONLY = frozenset({"credits", "billing", "moa", "debug"})
def _sanitize_slack_name(raw: str) -> str:

@ -1576,6 +1576,22 @@ DEFAULT_CONFIG = {
"timeout": 120,
"extra_body": {},
},
"moa_reference": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 600,
"extra_body": {},
},
"moa_aggregator": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 600,
"extra_body": {},
},
},
"display": {
@ -2054,6 +2070,27 @@ DEFAULT_CONFIG = {
"max_turns": 20,
},
# Mixture of Agents: named presets, each selectable as a model under the
# virtual "moa" provider (and through /moa). The preset's aggregator is the
# acting model; reference models provide analysis before each call.
"moa": {
"default_preset": "default",
"active_preset": "",
"presets": {
"default": {
"reference_models": [
{"provider": "openai-codex", "model": "gpt-5.5"},
{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"},
],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
"reference_temperature": 0.6,
"aggregator_temperature": 0.4,
"max_tokens": 4096,
"enabled": True,
}
},
},
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
@ -2953,7 +2990,7 @@ OPTIONAL_ENV_VARS = {
"prompt": "OpenRouter API key",
"url": "https://openrouter.ai/keys",
"password": True,
"tools": ["vision_analyze", "mixture_of_agents"],
"tools": ["vision_analyze"],
"category": "provider",
"advanced": True,
},
@ -4503,7 +4540,7 @@ _KNOWN_ROOT_KEYS = {
"_config_version", "model", "providers", "fallback_model",
"fallback_providers", "credential_pool_strategies", "toolsets",
"agent", "terminal", "display", "compression", "delegation",
"auxiliary", "custom_providers", "context", "memory", "gateway",
"auxiliary", "moa", "custom_providers", "context", "memory", "gateway",
"sessions", "streaming", "updates", "mcp_servers",
}

@ -163,6 +163,10 @@ def build_models_payload(
refresh=refresh,
)
moa_row = _moa_provider_row(ctx)
if moa_row is not None:
rows = [moa_row] + [r for r in rows if str(r.get("slug", "")).lower() != "moa"]
# --- Deduplicate: remove models from aggregators that overlap with
# user-defined providers. When a local proxy (e.g. litellm-proxy)
# serves a model whose name also appears in an aggregator's curated
@ -209,7 +213,7 @@ def build_models_payload(
row["total_models"] = len(filtered)
if include_unconfigured:
rows = list(rows) + _append_unconfigured_rows(rows, ctx)
rows = list(rows) + [r for r in _append_unconfigured_rows(rows, ctx) if str(r.get("slug", "")).lower() != "moa"]
if picker_hints:
_apply_picker_hints(rows)
if canonical_order:
@ -436,3 +440,28 @@ def _apply_pricing(
# is never blocked from picking a model.
row["free_tier"] = False
row["unavailable_models"] = []
def _moa_provider_row(ctx: ConfigContext) -> dict | None:
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import normalize_moa_config
cfg = normalize_moa_config(load_config().get("moa") or {})
models = list(cfg.get("presets", {}).keys())
if not models:
return None
return {
"slug": "moa",
"name": "Mixture of Agents",
"is_current": (ctx.current_provider or "").lower() == "moa",
"is_user_defined": False,
"models": models,
"total_models": len(models),
"source": "virtual",
"authenticated": True,
"auth_type": "virtual",
"warning": "Aggregator acts as the selected model; references provide analysis before each call.",
}
except Exception:
return None
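
The inventory change builds one virtual provider row from the preset names and puts it ahead of the real providers, dropping any other row that already uses the `moa` slug. A minimal sketch of that logic, assuming plain dicts in place of `ConfigContext` and the loaded config (`moa_provider_row` and `with_moa_first` are illustrative names, not functions from the repository):

```python
from typing import Any, Dict, List, Optional


def moa_provider_row(presets: Dict[str, Any], current_provider: str = "") -> Optional[Dict[str, Any]]:
    """Build the virtual row; each preset name is exposed as a model."""
    models = list(presets.keys())
    if not models:
        return None
    return {
        "slug": "moa",
        "name": "Mixture of Agents",
        "is_current": current_provider.lower() == "moa",
        "models": models,
        "total_models": len(models),
        "source": "virtual",
        "authenticated": True,  # no credential is needed for the virtual provider
        "auth_type": "virtual",
    }


def with_moa_first(rows: List[Dict[str, Any]], moa_row: Optional[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Prepend the virtual row and drop any other row that claims the moa slug."""
    if moa_row is None:
        return list(rows)
    return [moa_row] + [r for r in rows if str(r.get("slug", "")).lower() != "moa"]
```

The slug comparison is case-insensitive, so a stale `MOA` row from another source is replaced rather than duplicated.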

@ -11579,7 +11579,7 @@ _BUILTIN_SUBCOMMANDS = frozenset(
"computer-use",
"config", "cron", "curator", "dashboard", "debug", "doctor",
"dump", "fallback", "gateway", "hooks", "import", "insights",
"gui", "desktop", "kanban", "login", "logout", "logs", "lsp", "mcp", "memory", "migrate",
"gui", "desktop", "kanban", "login", "logout", "logs", "lsp", "mcp", "memory", "migrate", "moa",
"model", "pairing", "pets", "plugins", "portal", "postinstall", "profile", "proxy",
"prompt-size",
"send", "sessions", "setup",
@ -12104,6 +12104,21 @@ def main():
# =========================================================================
build_model_parser(subparsers, cmd_model=cmd_model)
from hermes_cli.moa_cmd import cmd_moa
moa_parser = subparsers.add_parser(
"moa",
help="Configure Mixture of Agents provider/model slots",
description="Configure the provider/model set used by /moa <prompt>.",
)
moa_subparsers = moa_parser.add_subparsers(dest="moa_command")
moa_subparsers.add_parser("list", aliases=["ls"], help="Show current MoA model slots")
moa_configure = moa_subparsers.add_parser("configure", aliases=["config"], help="Interactively pick MoA models")
moa_configure.add_argument("name", nargs="?", help="Preset name to create or update")
moa_delete = moa_subparsers.add_parser("delete", aliases=["rm"], help="Delete a MoA preset")
moa_delete.add_argument("name", help="Preset name to delete")
moa_parser.set_defaults(func=cmd_moa)
# =========================================================================
# fallback command — manage the fallback provider chain
# =========================================================================
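
The `moa` parser registers each subcommand with an alias (`ls`, `config`, `rm`). With `argparse`, the `dest` of a subparsers group receives the name exactly as the user typed it, so a dispatcher has to accept each alias as well as the canonical name. A minimal sketch, assuming a reduced parser that contains only the `moa` subtree (`build_parser` and `canonical_subcommand` are illustrative helpers, not part of the diff):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="hermes")
    subparsers = parser.add_subparsers(dest="command")
    moa = subparsers.add_parser("moa")
    moa_sub = moa.add_subparsers(dest="moa_command")
    moa_sub.add_parser("list", aliases=["ls"])
    configure = moa_sub.add_parser("configure", aliases=["config"])
    configure.add_argument("name", nargs="?")
    delete = moa_sub.add_parser("delete", aliases=["rm"])
    delete.add_argument("name")
    return parser


def canonical_subcommand(args: argparse.Namespace) -> str:
    """Map whatever the user typed back to the canonical subcommand name."""
    aliases = {"ls": "list", "config": "configure", "rm": "delete"}
    sub = getattr(args, "moa_command", None) or "list"  # bare `hermes moa` lists
    return aliases.get(sub, sub)
```

Normalizing once up front keeps the later `if sub == ...` checks from having to repeat every alias.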

135
hermes_cli/moa_cmd.py Normal file

@ -0,0 +1,135 @@
"""CLI helpers for configuring Mixture of Agents."""
from __future__ import annotations
from typing import Any
from hermes_cli.config import load_config, save_config
from hermes_cli.inventory import build_models_payload, load_picker_context
from hermes_cli.moa_config import DEFAULT_MOA_PRESET_NAME, normalize_moa_config
def _prompt_choice(title: str, rows: list[str], default: int = 0) -> int:
try:
from hermes_cli.curses_ui import curses_radiolist
return curses_radiolist(title, rows, selected=default, cancel_returns=default)
except Exception:
for idx, row in enumerate(rows, start=1):
print(f"{idx}. {row}")
raw = input(f"{title} [{default + 1}]: ").strip()
if not raw:
return default
try:
return max(0, min(len(rows) - 1, int(raw) - 1))
except ValueError:
return default
def _model_options() -> list[dict[str, Any]]:
payload = build_models_payload(
load_picker_context(),
include_unconfigured=True,
picker_hints=True,
canonical_order=True,
pricing=True,
capabilities=True,
max_models=200,
)
providers = payload.get("providers") or []
return [p for p in providers if p.get("slug") and p.get("models")]
def _pick_slot(current: dict[str, str] | None = None) -> dict[str, str]:
providers = _model_options()
if not providers:
raise RuntimeError("No configured model providers found. Run `hermes model` first.")
current_provider = (current or {}).get("provider", "")
provider_default = next(
(idx for idx, p in enumerate(providers) if p.get("slug") == current_provider),
0,
)
provider_rows = [f"{p.get('name') or p.get('slug')} ({p.get('slug')})" for p in providers]
provider = providers[_prompt_choice("Select provider", provider_rows, provider_default)]
models = list(provider.get("models") or [])
if not models:
raise RuntimeError(f"Provider {provider.get('slug')} has no selectable models")
current_model = (current or {}).get("model", "")
model_default = models.index(current_model) if current_model in models else 0
model = models[_prompt_choice(f"Select model for {provider.get('slug')}", models, model_default)]
return {"provider": str(provider.get("slug") or ""), "model": str(model)}
def _print_config(config: dict[str, Any]) -> None:
cfg = normalize_moa_config(config.get("moa") if isinstance(config, dict) else {})
print("Mixture of Agents presets")
print(f"Default: {cfg['default_preset']}")
active = cfg.get("active_preset") or "(off)"
print(f"Active in config: {active}")
for name, preset in cfg["presets"].items():
marker = "*" if name == cfg["default_preset"] else " "
print(f"\n{marker} {name}")
print(" Reference models:")
for idx, slot in enumerate(preset["reference_models"], start=1):
print(f" {idx}. {slot['provider']}:{slot['model']}")
agg = preset["aggregator"]
print(f" Aggregator: {agg['provider']}:{agg['model']}")
def cmd_moa(args) -> None:
"""Manage Mixture of Agents model presets."""
cfg = load_config()
sub = getattr(args, "moa_command", None) or "list"
if sub in {"list", "ls"}:
_print_config(cfg)
return
if sub in {"config", "configure"}:
moa = normalize_moa_config(cfg.get("moa") if isinstance(cfg, dict) else {})
preset_name = (getattr(args, "name", None) or moa.get("default_preset") or DEFAULT_MOA_PRESET_NAME).strip()
current = moa["presets"].get(preset_name, moa["presets"][moa["default_preset"]])
print(f"Configure MoA preset: {preset_name}")
print("Pick at least one reference model; choose Done when finished.")
refs: list[dict[str, str]] = []
existing = list(current.get("reference_models") or [])
idx = 0
while True:
base = existing[idx] if idx < len(existing) else None
refs.append(_pick_slot(base))
idx += 1
choice = _prompt_choice("Add another reference model?", ["Add another", "Done"], 1)
if choice == 1:
break
print("Configure aggregator model.")
current = dict(current)
current["reference_models"] = refs
current["aggregator"] = _pick_slot(current.get("aggregator"))
moa["presets"][preset_name] = current
moa.setdefault("default_preset", preset_name)
cfg["moa"] = normalize_moa_config(moa)
save_config(cfg)
print(f"Saved MoA preset: {preset_name}")
_print_config(cfg)
return
if sub in {"delete", "rm"}:
moa = normalize_moa_config(cfg.get("moa") if isinstance(cfg, dict) else {})
preset_name = (getattr(args, "name", None) or "").strip()
if not preset_name:
raise SystemExit("Usage: hermes moa delete <name>")
if preset_name not in moa["presets"]:
raise SystemExit(f"Unknown MoA preset: {preset_name}")
if len(moa["presets"]) <= 1:
raise SystemExit("Cannot delete the only MoA preset")
del moa["presets"][preset_name]
if moa["default_preset"] == preset_name:
moa["default_preset"] = next(iter(moa["presets"]))
if moa.get("active_preset") == preset_name:
moa["active_preset"] = ""
cfg["moa"] = normalize_moa_config(moa)
save_config(cfg)
print(f"Deleted MoA preset: {preset_name}")
return
raise SystemExit(f"Unknown moa subcommand: {sub}")
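
The delete path keeps three invariants: at least one preset always remains, `default_preset` always names an existing preset, and `active_preset` is cleared when it pointed at the deleted preset. A minimal sketch of those rules as a pure function, assuming an already-normalized config dict (`delete_preset` is a hypothetical helper; the command itself mutates the config in place and exits with `SystemExit` instead of raising `KeyError`/`ValueError`):

```python
from typing import Any, Dict


def delete_preset(moa: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return a copy of the MoA config with one preset removed."""
    presets = dict(moa["presets"])
    if name not in presets:
        raise KeyError(name)
    if len(presets) <= 1:
        raise ValueError("cannot delete the only MoA preset")
    del presets[name]

    default = moa["default_preset"]
    if default == name:
        # Fall back to the first remaining preset (dicts keep insertion order).
        default = next(iter(presets))

    active = moa.get("active_preset", "")
    if active == name:
        active = ""  # empty string means no preset is active

    return {**moa, "presets": presets, "default_preset": default, "active_preset": active}
```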

174
hermes_cli/moa_config.py Normal file

@ -0,0 +1,174 @@
"""Mixture-of-Agents configuration and slash-command helpers."""
from __future__ import annotations
import base64
import json
from copy import deepcopy
from typing import Any
MOA_MARKER_PREFIX = "__HERMES_MOA_TURN_V1__"
DEFAULT_MOA_PRESET_NAME = "default"
DEFAULT_MOA_REFERENCE_MODELS: list[dict[str, str]] = [
{"provider": "openai-codex", "model": "gpt-5.5"},
{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"},
]
DEFAULT_MOA_AGGREGATOR: dict[str, str] = {
"provider": "openrouter",
"model": "anthropic/claude-opus-4.8",
}
def _clean_slot(slot: Any) -> dict[str, str] | None:
if not isinstance(slot, dict):
return None
provider = str(slot.get("provider") or "").strip()
model = str(slot.get("model") or "").strip()
if not provider or not model:
return None
return {"provider": provider, "model": model}
def _default_preset() -> dict[str, Any]:
return {
"reference_models": deepcopy(DEFAULT_MOA_REFERENCE_MODELS),
"aggregator": deepcopy(DEFAULT_MOA_AGGREGATOR),
"reference_temperature": 0.6,
"aggregator_temperature": 0.4,
"max_tokens": 4096,
"enabled": True,
}
def _normalize_preset(raw: Any) -> dict[str, Any]:
if not isinstance(raw, dict):
raw = {}
refs = [_clean_slot(item) for item in raw.get("reference_models") or []]
refs = [item for item in refs if item is not None]
if not refs:
refs = deepcopy(DEFAULT_MOA_REFERENCE_MODELS)
aggregator = _clean_slot(raw.get("aggregator")) or deepcopy(DEFAULT_MOA_AGGREGATOR)
return {
"enabled": bool(raw.get("enabled", True)),
"reference_models": refs,
"aggregator": aggregator,
"reference_temperature": float(raw.get("reference_temperature", 0.6) or 0.6),
"aggregator_temperature": float(raw.get("aggregator_temperature", 0.4) or 0.4),
"max_tokens": int(raw.get("max_tokens", 4096) or 4096),
}
def normalize_moa_config(raw: Any) -> dict[str, Any]:
"""Return validated MoA config with named presets.
Backward compatible with the first PR shape where ``moa`` itself contained
``reference_models`` and ``aggregator`` directly.
"""
if not isinstance(raw, dict):
raw = {}
presets_raw = raw.get("presets")
presets: dict[str, dict[str, Any]] = {}
if isinstance(presets_raw, dict):
for name, preset in presets_raw.items():
clean_name = str(name or "").strip()
if clean_name:
presets[clean_name] = _normalize_preset(preset)
# Legacy flat config becomes the default preset.
if not presets:
presets[DEFAULT_MOA_PRESET_NAME] = _normalize_preset(raw)
default_name = str(raw.get("default_preset") or "").strip()
if not default_name or default_name not in presets:
default_name = next(iter(presets), DEFAULT_MOA_PRESET_NAME)
if default_name not in presets:
presets[default_name] = _default_preset()
active_name = str(raw.get("active_preset") or "").strip()
if active_name not in presets:
active_name = ""
active = presets[default_name]
return {
"default_preset": default_name,
"active_preset": active_name,
"presets": presets,
# Compatibility/flattened view for existing dashboard/desktop callers.
"reference_models": deepcopy(active["reference_models"]),
"aggregator": deepcopy(active["aggregator"]),
"reference_temperature": active["reference_temperature"],
"aggregator_temperature": active["aggregator_temperature"],
"max_tokens": active["max_tokens"],
"enabled": active["enabled"],
}
def list_moa_presets(config: Any) -> list[str]:
cfg = normalize_moa_config(config)
return list(cfg["presets"].keys())
def resolve_moa_preset(config: Any, name: str | None = None) -> dict[str, Any]:
cfg = normalize_moa_config(config)
preset_name = str(name or cfg.get("default_preset") or DEFAULT_MOA_PRESET_NAME).strip()
preset = cfg["presets"].get(preset_name)
if preset is None:
raise KeyError(preset_name)
return deepcopy(preset)
def exact_moa_preset_name(config: Any, text: str) -> str | None:
wanted = str(text or "").strip()
if not wanted:
return None
cfg = normalize_moa_config(config)
return wanted if wanted in cfg["presets"] else None
def set_active_moa_preset(config: Any, name: str | None) -> dict[str, Any]:
cfg = normalize_moa_config(config)
clean = str(name or "").strip()
if clean and clean not in cfg["presets"]:
raise KeyError(clean)
cfg["active_preset"] = clean
return cfg
def encode_moa_turn(prompt: str, config: Any = None, preset: str | None = None) -> str:
"""Encode a /moa one-shot turn for frontends that can only send text."""
payload = {
"prompt": str(prompt or ""),
"config": resolve_moa_preset(config or {}, preset),
}
encoded = base64.urlsafe_b64encode(
json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
).decode("ascii")
return f"{MOA_MARKER_PREFIX}{encoded}"
def decode_moa_turn(message: Any) -> tuple[str, dict[str, Any] | None]:
"""Decode a hidden /moa one-shot marker."""
if not isinstance(message, str) or not message.startswith(MOA_MARKER_PREFIX):
return message, None
encoded = message[len(MOA_MARKER_PREFIX):].strip()
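try:
payload = json.loads(base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8"))
except Exception:
return message, None
# Valid JSON that is not an object (e.g. a list) has no .get(); treat it as a plain message.
if not isinstance(payload, dict):
return message, None
prompt = str(payload.get("prompt") or "")
return prompt, _normalize_preset(payload.get("config") or {})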
def build_moa_turn_prompt(user_prompt: str, config: Any = None, preset: str | None = None) -> str:
"""Build the hidden one-shot payload used by TUI/gateway routing."""
return encode_moa_turn(user_prompt, config, preset=preset)
def moa_usage() -> str:
return "Usage: /moa [preset-name | prompt] (bare /moa selects the default preset)"
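
`encode_moa_turn` and `decode_moa_turn` pack a prompt and its preset into one text message: a fixed marker prefix followed by URL-safe base64 of compact JSON. A minimal round-trip sketch of that wire format, assuming simplified helpers (`encode_turn` and `decode_turn` are illustrative; they skip preset resolution and normalization, and the `isinstance` guard is an addition in this sketch):

```python
import base64
import json
from typing import Any, Optional, Tuple

MARKER = "__HERMES_MOA_TURN_V1__"  # same prefix as MOA_MARKER_PREFIX in the diff


def encode_turn(prompt: str, preset: dict) -> str:
    payload = {"prompt": prompt, "config": preset}
    raw = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    return MARKER + base64.urlsafe_b64encode(raw).decode("ascii")


def decode_turn(message: Any) -> Tuple[Any, Optional[dict]]:
    """Return (prompt, config), or (message, None) when there is no valid marker."""
    if not isinstance(message, str) or not message.startswith(MARKER):
        return message, None
    encoded = message[len(MARKER):].strip()
    try:
        payload = json.loads(base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8"))
    except Exception:
        return message, None  # malformed marker: pass the text through untouched
    if not isinstance(payload, dict):
        return message, None  # guard added in this sketch
    return str(payload.get("prompt") or ""), payload.get("config") or {}
```

Because `ensure_ascii=False` keeps non-ASCII characters as UTF-8 bytes before base64 encoding, prompts in any language survive the round trip, and ordinary messages without the marker come back unchanged.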

@ -807,6 +807,7 @@ def switch_model(
resolved_alias = ""
new_model = raw_input.strip()
target_provider = current_provider
resolved_moa_preset = False
# =================================================================
# PATH A: Explicit --provider given
@ -843,6 +844,14 @@ def switch_model(
)
target_provider = pdef.id
if target_provider == "moa" and not new_model:
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import normalize_moa_config
new_model = normalize_moa_config(load_config().get("moa") or {})["default_preset"]
except Exception:
new_model = "default"
# Guard against silent aggregator hops. A vendor name like bare
# "openai" is an alias that resolves to an aggregator ("openrouter").
@ -925,10 +934,28 @@ def switch_model(
# PATH B: No explicit provider — resolve from model input
# =================================================================
else:
# --- Step a: Try alias resolution on current provider ---
alias_result = resolve_alias(raw_input, current_provider)
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import exact_moa_preset_name, normalize_moa_config
if alias_result is not None:
_moa_cfg = normalize_moa_config(load_config().get("moa") or {})
_moa_match = exact_moa_preset_name(_moa_cfg, raw_input)
if _moa_match:
target_provider = "moa"
new_model = _moa_match
resolved_alias = ""
resolved_moa_preset = True
alias_result = None
else:
alias_result = resolve_alias(raw_input, current_provider)
except Exception:
alias_result = resolve_alias(raw_input, current_provider)
# --- Step a: Try alias resolution on current provider ---
if resolved_moa_preset:
pass
elif alias_result is not None:
target_provider, new_model, resolved_alias = alias_result
logger.debug(
"Alias '%s' resolved to %s on %s",
@ -961,7 +988,7 @@ def switch_model(
f"Try specifying the full model name."
),
)
else:
elif not resolved_moa_preset:
# --- Step c: On aggregator, convert vendor:model to vendor/model ---
# Only convert when there's no slash — a slash means the name
# is already in vendor/model format and the colon is a variant

@ -173,6 +173,7 @@ def _xai_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"moa": ["default"],
"nous": [
# Anthropic
"anthropic/claude-opus-4.8",
@ -1003,6 +1004,7 @@ class ProviderEntry(NamedTuple):
CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Everything your agent needs, 300+ models with bundled tool use)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (Pay-per-use API aggregator)"),
ProviderEntry("moa", "Mixture of Agents", "Mixture of Agents (named presets; aggregator acts after reference models)"),
ProviderEntry("novita", "NovitaAI", "NovitaAI (Cloud: Model API, Agent Sandbox, GPU Cloud)"),
ProviderEntry("lmstudio", "LM Studio", "LM Studio (Local desktop app with built-in model server)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models via API key or Claude Code)"),
@ -3663,6 +3665,24 @@ def validate_requested_model(
"message": "Model name cannot be empty.",
}
if normalized == "moa":
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import normalize_moa_config
cfg = normalize_moa_config(load_config().get("moa") or {})
if requested in cfg["presets"]:
return {"accepted": True, "persist": True, "recognized": True, "message": None}
return {
"accepted": False, "persist": False, "recognized": False,
"message": f"MoA preset `{requested}` was not found. Run `hermes moa list`.",
}
except Exception as exc:
return {
"accepted": False, "persist": False, "recognized": False,
"message": f"Could not read MoA presets: {exc}",
}
if any(ch.isspace() for ch in requested):
return {
"accepted": False,

@ -111,16 +111,27 @@ def provider_catalog() -> list[ProviderDescriptor]:
except Exception:
OPTIONAL_ENV_VARS = {}
# Hermes overlays carry auth_type for providers that have no registry/profile
# entry of their own — notably the ``moa`` virtual provider (auth_type
# "virtual"), which has no real credential and no network endpoint.
try:
from hermes_cli.providers import HERMES_OVERLAYS
except Exception:
HERMES_OVERLAYS = {}
out: list[ProviderDescriptor] = []
for order, entry in enumerate(CANONICAL_PROVIDERS):
slug = entry.slug
cfg = PROVIDER_REGISTRY.get(slug)
prof = profiles.get(slug)
overlay = HERMES_OVERLAYS.get(slug)
# auth_type: registry is authoritative; fall back to profile, then api_key.
# auth_type: registry is authoritative; fall back to profile, then the
# Hermes overlay (e.g. moa → "virtual"), then api_key.
auth_type = (
(getattr(cfg, "auth_type", "") if cfg else "")
or (getattr(prof, "auth_type", "") if prof else "")
or (getattr(overlay, "auth_type", "") if overlay else "")
or "api_key"
)
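
The `auth_type` lookup is a first-non-empty chain: registry entry, then provider profile, then Hermes overlay, then the `"api_key"` default. A minimal sketch of that precedence, assuming a hypothetical `Source` dataclass in place of the real registry, profile, and overlay objects:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    """Hypothetical stand-in for a registry entry, profile, or overlay."""
    auth_type: str = ""


def resolve_auth_type(cfg: Optional[Source], prof: Optional[Source], overlay: Optional[Source]) -> str:
    # `or` returns the first truthy operand, so an empty string or a missing
    # source falls through to the next candidate.
    return (
        (getattr(cfg, "auth_type", "") if cfg else "")
        or (getattr(prof, "auth_type", "") if prof else "")
        or (getattr(overlay, "auth_type", "") if overlay else "")
        or "api_key"
    )
```

For `moa`, which has only an overlay, this yields `"virtual"` instead of the `"api_key"` default that the third commit message says tripped the catalog invariants.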

@ -44,6 +44,11 @@ class HermesOverlay:
HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
"moa": HermesOverlay(
transport="openai_chat",
auth_type="virtual",
base_url_override="moa://local",
),
"openrouter": HermesOverlay(
transport="openai_chat",
is_aggregator=True,
@ -355,6 +360,7 @@ ALIASES: Dict[str, str] = {
# not in the catalog.
_LABEL_OVERRIDES: Dict[str, str] = {
"moa": "Mixture of Agents",
"nous": "Nous Portal",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",

@ -1400,6 +1400,16 @@ def resolve_runtime_provider(
"""
requested_provider = resolve_requested_provider(requested)
if requested_provider == "moa":
return {
"provider": "moa",
"api_mode": "chat_completions",
"base_url": "http://127.0.0.1/v1",
"api_key": "moa-virtual-provider",
"source": "moa-virtual-provider",
"requested_provider": requested_provider,
}
# Azure Anthropic short-circuit: when explicitly targeting an Azure endpoint
# with provider="anthropic", bypass _resolve_named_custom_runtime (which would
# return provider="custom" with chat_completions api_mode and no valid key).

@ -408,11 +408,6 @@ def _print_setup_summary(config: dict, hermes_home):
else:
tool_status.append(("Vision (image analysis)", False, "run 'hermes setup' to configure"))
# Mixture of Agents — requires OpenRouter specifically (calls multiple models)
if get_env_value("OPENROUTER_API_KEY"):
tool_status.append(("Mixture of Agents", True, None))
else:
tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))
# Web tools (Exa, Parallel, Firecrawl, or Tavily)
if subscription_features.web.managed_by_nous:

@ -144,7 +144,7 @@ TIPS = [
"The todo tool helps the agent track complex multi-step tasks during a session.",
"session_search performs full-text search across ALL past conversations.",
"The agent automatically saves preferences, corrections, and environment facts to memory.",
"mixture_of_agents routes hard problems through 4 frontier LLMs collaboratively.",
"/moa routes one hard prompt through your configured Mixture of Agents model set.",
"Terminal commands support background mode with notify_on_complete for long-running tasks.",
"Terminal background processes support watch_patterns to alert on specific output lines.",
"The terminal tool supports 6 backends: local, Docker, SSH, Modal, Daytona, and Singularity.",

@ -63,7 +63,6 @@ CONFIGURABLE_TOOLSETS = [
("image_gen", "🎨 Image Generation", "image_generate"),
("video_gen", "🎬 Video Generation", "video_generate (text-to-video + image-to-video)"),
("x_search", "🐦 X (Twitter) Search", "x_search (requires xAI OAuth or XAI_API_KEY)"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
("skills", "📚 Skills", "list, view, manage"),
("todo", "📋 Task Planning", "todo"),
@ -111,7 +110,7 @@ def gui_toolset_label(label: str) -> str:
# `hermes tools` → X (Twitter) Search setup walks users through credential
# setup. The tool's check_fn means the schema still won't appear to the
# model if the credential later goes missing or expires.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen", "x_search"}
_DEFAULT_OFF_TOOLSETS = {"homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen", "x_search"}
def _xai_credentials_present() -> bool:
@ -567,10 +566,9 @@ TOOL_CATEGORIES = {
}
# Simple env-var requirements for toolsets NOT in TOOL_CATEGORIES.
# Used as a fallback for tools like vision/moa that just need an API key.
# Used as a fallback for toolsets like vision that just need an API key.
TOOLSET_ENV_REQUIREMENTS = {
"vision": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
"moa": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
}

@ -831,6 +831,35 @@ class ModelAssignment(BaseModel):
profile: Optional[str] = None
class MoaModelSlot(BaseModel):
provider: str = ""
model: str = ""
class MoaPresetPayload(BaseModel):
reference_models: list[MoaModelSlot] = []
aggregator: MoaModelSlot = MoaModelSlot()
reference_temperature: float = 0.6
aggregator_temperature: float = 0.4
max_tokens: int = 4096
enabled: bool = True
class MoaConfigPayload(BaseModel):
default_preset: str = "default"
active_preset: str = ""
presets: dict[str, MoaPresetPayload] = {}
# Backward-compatible flat payload fields used by older dashboard/desktop
# clients during this PR's transition window.
reference_models: list[MoaModelSlot] = []
aggregator: MoaModelSlot = MoaModelSlot()
reference_temperature: float = 0.6
aggregator_temperature: float = 0.4
max_tokens: int = 4096
enabled: bool = True
profile: Optional[str] = None
def _normalize_main_model_assignment(provider: str, model: str) -> tuple[str, str]:
"""Normalize a main-slot (provider, model) pair before persisting.
@ -3786,6 +3815,66 @@ def get_auxiliary_models(profile: Optional[str] = None):
raise HTTPException(status_code=500, detail="Failed to read auxiliary config")
@app.get("/api/model/moa")
def get_moa_models(profile: Optional[str] = None):
"""Return the configured Mixture-of-Agents provider/model slots."""
try:
from hermes_cli.moa_config import normalize_moa_config
with _profile_scope(profile):
cfg = load_config()
return normalize_moa_config(cfg.get("moa") if isinstance(cfg, dict) else {})
except HTTPException:
raise
except Exception:
_log.exception("GET /api/model/moa failed")
raise HTTPException(status_code=500, detail="Failed to read MoA config")
@app.put("/api/model/moa")
def set_moa_models(body: MoaConfigPayload, profile: Optional[str] = None):
"""Persist the Mixture-of-Agents provider/model slots."""
try:
from hermes_cli.moa_config import normalize_moa_config
with _profile_scope(body.profile or profile):
cfg = load_config()
if body.presets:
raw = {
"default_preset": body.default_preset,
"active_preset": body.active_preset,
"presets": {
name: {
"reference_models": [slot.dict() for slot in preset.reference_models],
"aggregator": preset.aggregator.dict(),
"reference_temperature": preset.reference_temperature,
"aggregator_temperature": preset.aggregator_temperature,
"max_tokens": preset.max_tokens,
"enabled": preset.enabled,
}
for name, preset in body.presets.items()
},
}
else:
raw = {
"reference_models": [slot.dict() for slot in body.reference_models],
"aggregator": body.aggregator.dict(),
"reference_temperature": body.reference_temperature,
"aggregator_temperature": body.aggregator_temperature,
"max_tokens": body.max_tokens,
"enabled": body.enabled,
}
normalized = normalize_moa_config(raw)
cfg["moa"] = normalized
save_config(cfg)
return {"ok": True, **normalized}
except HTTPException:
raise
except Exception:
_log.exception("PUT /api/model/moa failed")
raise HTTPException(status_code=500, detail="Failed to save MoA config")
@app.post("/api/model/set")
async def set_model_assignment(body: ModelAssignment, profile: Optional[str] = None):
"""Assign a model to the main slot or an auxiliary task slot.

@ -225,7 +225,6 @@ _LEGACY_TOOLSET_MAP = {
"web_tools": ["web_search", "web_extract"],
"terminal_tools": ["terminal"],
"vision_tools": ["vision_analyze"],
"moa_tools": ["mixture_of_agents"],
"image_tools": ["image_generate"],
"skills_tools": ["skills_list", "skill_view", "skill_manage"],
"browser_tools": [

@ -3709,6 +3709,8 @@ class AIAgent:
from unittest.mock import Mock
primary_client = self._ensure_primary_openai_client(reason=reason)
if self.provider == "moa":
return primary_client
if isinstance(primary_client, Mock):
return primary_client
with self._openai_client_lock():
@ -5313,6 +5315,7 @@ class AIAgent:
stream_callback: Optional[callable] = None,
persist_user_message: Optional[str] = None,
persist_user_timestamp: Optional[float] = None,
moa_config: Optional[dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Forwarder — see ``agent.conversation_loop.run_conversation``."""
from agent.conversation_loop import run_conversation
@ -5324,7 +5327,8 @@ class AIAgent:
task_id,
stream_callback,
persist_user_message,
persist_user_timestamp,
persist_user_timestamp=persist_user_timestamp,
moa_config=moa_config,
)
def chat(self, message: str, stream_callback: Optional[callable] = None) -> str:

@ -448,7 +448,6 @@ Enable/disable via `hermes tools` (interactive) or `hermes tools enable/disable
| `feishu_drive` | Feishu (Lark) drive tools |
| `yuanbao` | Yuanbao integration tools |
| `rl` | Reinforcement learning tools (off by default) |
| `moa` | Mixture of Agents (off by default) |
Full enumeration lives in `toolsets.py` as the `TOOLSETS` dict; `_HERMES_CORE_TOOLS` is the default bundle most platforms inherit from.

@ -0,0 +1,69 @@
import queue
from unittest.mock import patch
from cli import HermesCLI
from hermes_cli.moa_config import decode_moa_turn
def _make_cli():
cli = HermesCLI.__new__(HermesCLI)
cli.config = {
"moa": {
"default_preset": "default",
"presets": {
"default": {
"reference_models": [{"provider": "openai-codex", "model": "gpt-5.5"}],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
},
"review": {
"reference_models": [{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"}],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
},
},
}
}
cli._pending_input = queue.Queue()
cli._pending_agent_seed = None
cli._pending_moa_config = None
cli._agent_running = False
cli.agent = None
return cli
def test_moa_bare_switches_to_default_preset_model():
cli = _make_cli()
with patch("cli._cprint"):
assert cli.process_command("/moa") is True
assert cli.provider == "moa"
assert cli.requested_provider == "moa"
assert cli.model == "default"
assert cli.agent is None
def test_moa_exact_preset_switches_to_named_preset_model():
cli = _make_cli()
with patch("cli._cprint"):
cli.process_command("/moa review")
assert cli.provider == "moa"
assert cli.model == "review"
assert cli.agent is None
def test_moa_non_preset_is_one_shot_prompt():
cli = _make_cli()
with patch("cli._cprint"):
cli.process_command("/moa inspect the flaky test")
assert cli._pending_agent_seed == "inspect the flaky test"
assert cli._pending_moa_disable_after_turn is True
assert cli.provider == "moa"
assert cli.model == "default"
assert cli._pending_moa_restore_model["provider"] != "moa"
def test_decode_legacy_encoded_moa_turn_still_works():
from hermes_cli.moa_config import build_moa_turn_prompt
encoded = build_moa_turn_prompt("hello", _make_cli().config["moa"], preset="review")
prompt, cfg = decode_moa_turn(encoded)
assert prompt == "hello"
assert cfg["reference_models"] == [{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"}]
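
The three `/moa` behaviours exercised by these tests (bare command, exact preset name, anything else as a one-shot prompt) can be sketched as a small dispatcher. This is an assumption-laden illustration: `exact_preset`, `dispatch_moa`, and the returned dict shape are hypothetical, and the real CLI and TUI handlers may validate arguments differently.

```python
from typing import Any, Dict, Optional


def exact_preset(config: Dict[str, Any], arg: str) -> Optional[str]:
    """Return arg only when it is exactly a preset name (no prefix or fuzzy match)."""
    return arg if arg in config.get("presets", {}) else None


def dispatch_moa(config: Dict[str, Any], arg: str) -> Dict[str, Any]:
    """Decide what a /moa invocation should do, given its raw argument string."""
    arg = arg.strip()
    default = config.get("default_preset", "default")
    if not arg:
        # Bare /moa: switch the session to the default preset.
        return {"action": "switch", "provider": "moa", "model": default}
    name = exact_preset(config, arg)
    if name is not None:
        # Exact preset name: switch the session to that preset.
        return {"action": "switch", "provider": "moa", "model": name}
    # Anything else is treated as a prompt for a single MoA turn.
    return {"action": "one_shot", "provider": "moa", "model": default, "prompt": arg}


config = {"default_preset": "default", "presets": {"default": {}, "review": {}}}
bare = dispatch_moa(config, "")
named = dispatch_moa(config, "review")
one_shot = dispatch_moa(config, "inspect the flaky test")
```

In this sketch the one-shot branch only reports the intent. Restoring the previous model after the turn, which the CLI test checks through `_pending_moa_restore_model`, is left to the caller.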

@ -165,7 +165,9 @@ def test_build_models_payload_returns_expected_shape():
assert set(payload.keys()) == {"providers", "model", "provider"}
assert payload["model"] == "m1"
assert payload["provider"] == "openrouter"
assert payload["providers"] == rows
assert payload["providers"][0]["slug"] == "moa"
assert payload["providers"][0]["models"] == ["default"]
assert payload["providers"][1:] == rows
def test_build_models_payload_does_not_call_provider_model_ids():
@ -586,7 +588,7 @@ def test_aggregator_dedup_no_user_providers_unchanged():
with _list_auth_returning(rows):
payload = build_models_payload(ctx)
or_row = payload["providers"][0]
or_row = next(r for r in payload["providers"] if r["slug"] == "openrouter")
assert len(or_row["models"]) == 2

@ -0,0 +1,97 @@
from hermes_cli.moa_config import (
DEFAULT_MOA_AGGREGATOR,
DEFAULT_MOA_PRESET_NAME,
DEFAULT_MOA_REFERENCE_MODELS,
build_moa_turn_prompt,
decode_moa_turn,
exact_moa_preset_name,
normalize_moa_config,
resolve_moa_preset,
set_active_moa_preset,
)
def test_normalize_moa_config_uses_default_named_preset():
cfg = normalize_moa_config({})
assert cfg["default_preset"] == DEFAULT_MOA_PRESET_NAME
assert list(cfg["presets"]) == [DEFAULT_MOA_PRESET_NAME]
assert cfg["reference_models"] == DEFAULT_MOA_REFERENCE_MODELS
assert cfg["aggregator"] == DEFAULT_MOA_AGGREGATOR
def test_normalize_moa_config_preserves_named_presets():
cfg = normalize_moa_config(
{
"default_preset": "coding",
"presets": {
"coding": {
"reference_models": [{"provider": "openai-codex", "model": "gpt-5.5"}],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
},
"review": {
"reference_models": [{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"}],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
},
},
}
)
assert cfg["default_preset"] == "coding"
assert set(cfg["presets"]) == {"coding", "review"}
assert cfg["reference_models"] == [{"provider": "openai-codex", "model": "gpt-5.5"}]
def test_legacy_flat_config_becomes_default_preset():
cfg = normalize_moa_config(
{
"reference_models": [{"provider": "openai-codex", "model": "gpt-5.5"}],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
}
)
assert cfg["presets"][DEFAULT_MOA_PRESET_NAME]["reference_models"] == [
{"provider": "openai-codex", "model": "gpt-5.5"}
]
def test_exact_preset_matching_is_not_fuzzy():
config = {"presets": {"coding": {}, "review": {}}}
assert exact_moa_preset_name(config, "coding") == "coding"
assert exact_moa_preset_name(config, "cod") is None
assert exact_moa_preset_name(config, "coding please fix this") is None
def test_active_preset_toggle_validation():
config = {"default_preset": "coding", "presets": {"coding": {}, "review": {}}}
active = set_active_moa_preset(config, "review")
assert active["active_preset"] == "review"
inactive = set_active_moa_preset(active, "")
assert inactive["active_preset"] == ""
def test_resolve_moa_preset_returns_requested_model_set():
cfg = normalize_moa_config(
{
"presets": {
"coding": {"reference_models": [{"provider": "openai-codex", "model": "gpt-5.5"}]},
"review": {"reference_models": [{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"}]},
}
}
)
assert resolve_moa_preset(cfg, "review")["reference_models"] == [
{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"}
]
def test_build_moa_turn_prompt_encodes_one_shot_default_preset():
prompt = build_moa_turn_prompt("write a file then inspect it")
decoded_prompt, cfg = decode_moa_turn(prompt)
assert decoded_prompt == "write a file then inspect it"
assert cfg is not None
assert cfg["reference_models"] == DEFAULT_MOA_REFERENCE_MODELS
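
The normalization behaviour these tests pin down (a legacy flat config becomes a single default preset, and the default preset's fields are mirrored at the top level) might look roughly like the sketch below. It is not the actual `normalize_moa_config`: the function name, `DEFAULT_PRESET`, and `PRESET_KEYS` are illustrative assumptions, and filling in built-in default models is deliberately omitted.

```python
from typing import Any, Dict

DEFAULT_PRESET = "default"  # hypothetical; stands in for DEFAULT_MOA_PRESET_NAME
PRESET_KEYS = (
    "reference_models",
    "aggregator",
    "reference_temperature",
    "aggregator_temperature",
    "max_tokens",
    "enabled",
)


def normalize(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Fold flat or preset-shaped MoA config into one preset-shaped dict."""
    raw = raw or {}
    presets = {
        name: dict(body or {})
        for name, body in (raw.get("presets") or {}).items()
    }
    if not presets:
        # Legacy flat shape: move the top-level fields into one default preset.
        presets = {DEFAULT_PRESET: {k: raw[k] for k in PRESET_KEYS if k in raw}}
    default = raw.get("default_preset") or ""
    if default not in presets:
        # Unknown or missing default: fall back to the first preset.
        default = next(iter(presets))
    mirrored = {k: presets[default][k] for k in PRESET_KEYS if k in presets[default]}
    return {
        "default_preset": default,
        "active_preset": raw.get("active_preset") or "",
        "presets": presets,
        **mirrored,  # keep flat fields readable for older clients
    }


codex = [{"provider": "openai-codex", "model": "gpt-5.5"}]
flat = normalize({"reference_models": codex})
named = normalize({"default_preset": "coding",
                   "presets": {"coding": {"reference_models": codex}, "review": {}}})
```

Mirroring the default preset at the top level is what lets older flat-payload readers keep working while newer clients read `presets`.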

@ -24,7 +24,14 @@ HEADERS = {"X-Hermes-Session-Token": _SESSION_TOKEN}
# the model picker's local-endpoint flow, not a fixed credential card. It is in
# the CLI picker's universe but intentionally has no dedicated Providers-tab
# card. Exempt it from the union check.
_EXEMPT = {"custom"}
#
# Virtual providers (auth_type "virtual", e.g. `moa`) are likewise in the CLI
# picker universe but have no real credential and no Providers-tab card — they
# are configured through their own feature UI (MoA presets). Exempt them too,
# derived from the catalog so any future virtual provider is covered without a
# hardcoded slug.
_VIRTUAL = {d.slug for d in provider_catalog() if d.auth_type == "virtual"}
_EXEMPT = {"custom"} | _VIRTUAL
# Providers that legitimately offer BOTH auth methods and so intentionally
# appear on both desktop tabs (an API-key card AND an account sign-in card).

@ -393,6 +393,36 @@ class TestWebServerEndpoints:
assert fields["api_key"]["value"] == ""
assert "secret-value" not in json.dumps(data)
def test_get_moa_models_returns_provider_model_slots(self):
resp = self.client.get("/api/model/moa")
assert resp.status_code == 200
data = resp.json()
assert data["reference_models"]
assert all(set(slot) == {"provider", "model"} for slot in data["reference_models"])
assert set(data["aggregator"]) == {"provider", "model"}
def test_put_moa_models_persists_provider_model_slots(self):
from hermes_cli.config import load_config
payload = {
"reference_models": [
{"provider": "openai-codex", "model": "gpt-5.5"},
{"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"},
],
"aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
"reference_temperature": 0.6,
"aggregator_temperature": 0.4,
"max_tokens": 4096,
"enabled": True,
}
resp = self.client.put("/api/model/moa", json=payload)
assert resp.status_code == 200
assert resp.json()["ok"] is True
cfg = load_config()
assert cfg["moa"]["reference_models"] == payload["reference_models"]
assert cfg["moa"]["aggregator"] == payload["aggregator"]
# ── GET /api/media (remote image display) ───────────────────────────
def test_get_media_serves_image_in_root(self):

@ -0,0 +1,224 @@
from types import SimpleNamespace
from unittest.mock import MagicMock
from run_agent import AIAgent
def _response(content="done", *, tool_calls=None):
message = SimpleNamespace(content=content, tool_calls=tool_calls or [])
choice = SimpleNamespace(message=message, finish_reason="stop")
return SimpleNamespace(choices=[choice], usage=None, model="fake-model")
def test_moa_virtual_provider_aggregator_is_actor(monkeypatch, tmp_path):
home = tmp_path / ".hermes"
home.mkdir()
(home / "config.yaml").write_text(
"""
moa:
default_preset: review
presets:
review:
reference_models:
- provider: openai-codex
model: gpt-5.5
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""".strip(),
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(home))
calls = []
def fake_call_llm(**kwargs):
calls.append(kwargs)
if kwargs["task"] == "moa_reference":
return _response("reference advice")
return _response("aggregator acted")
monkeypatch.setattr("agent.moa_loop.call_llm", fake_call_llm)
agent = AIAgent(
api_key="moa-virtual-provider",
base_url="moa://local",
model="review",
provider="moa",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
enabled_toolsets=["file"],
max_iterations=1,
)
result = agent.run_conversation("solve this")
assert result["final_response"] == "aggregator acted"
assert [(c["task"], c["provider"], c["model"]) for c in calls] == [
("moa_reference", "openai-codex", "gpt-5.5"),
("moa_aggregator", "openrouter", "anthropic/claude-opus-4.8"),
]
assert calls[1]["tools"] is not None
def test_reference_messages_strips_system_and_tool_history():
from agent.moa_loop import _reference_messages
messages = [
{"role": "system", "content": "huge hermes system prompt"},
{"role": "user", "content": "do the thing"},
{
"role": "assistant",
"content": "",
"tool_calls": [{"id": "c1", "function": {"name": "f", "arguments": "{}"}}],
},
{"role": "tool", "tool_call_id": "c1", "content": "tool result"},
{"role": "assistant", "content": "here is my answer"},
]
trimmed = _reference_messages(messages)
# System prompt, tool-call-only assistant turn, and tool result are gone.
assert all(m["role"] in ("user", "assistant") for m in trimmed)
assert all("tool_calls" not in m for m in trimmed)
assert trimmed == [
{"role": "user", "content": "do the thing"},
{"role": "assistant", "content": "here is my answer"},
]
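
The trimming that this test describes (references see only user and assistant text, never the system prompt, tool calls, or tool results) can be approximated as follows. This is a sketch under assumptions, not the body of `agent.moa_loop._reference_messages`; the function name and the rule of dropping turns with empty content are inferred from the assertions above.

```python
from typing import Any, Dict, List


def reference_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep user/assistant turns that carry text; drop everything else."""
    trimmed: List[Dict[str, Any]] = []
    for message in messages:
        if message.get("role") not in ("user", "assistant"):
            continue  # system prompts and tool results are not forwarded
        content = message.get("content")
        if not content:
            continue  # e.g. an assistant turn that only holds tool_calls
        # Rebuild the dict so stray keys such as tool_calls are not carried over.
        trimmed.append({"role": message["role"], "content": content})
    return trimmed


history = [
    {"role": "system", "content": "huge hermes system prompt"},
    {"role": "user", "content": "do the thing"},
    {"role": "assistant", "content": "",
     "tool_calls": [{"id": "c1", "function": {"name": "f", "arguments": "{}"}}]},
    {"role": "tool", "tool_call_id": "c1", "content": "tool result"},
    {"role": "assistant", "content": "here is my answer"},
]
trimmed = reference_messages(history)
```

Keeping the reference input small means the reference models answer the conversation itself, while tool schemas and the full history stay with the aggregator.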
def test_moa_facade_references_get_trimmed_messages(monkeypatch, tmp_path):
home = tmp_path / ".hermes"
home.mkdir()
(home / "config.yaml").write_text(
"""
moa:
default_preset: review
presets:
review:
reference_models:
- provider: openai-codex
model: gpt-5.5
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""".strip(),
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(home))
calls = []
def fake_call_llm(**kwargs):
calls.append(kwargs)
return _response("ok")
monkeypatch.setattr("agent.moa_loop.call_llm", fake_call_llm)
from agent.moa_loop import MoAChatCompletions
facade = MoAChatCompletions("review")
facade.create(
messages=[
{"role": "system", "content": "system prompt"},
{"role": "user", "content": "question"},
{"role": "tool", "tool_call_id": "x", "content": "leftover"},
],
tools=[{"type": "function"}],
)
ref_call = next(c for c in calls if c["task"] == "moa_reference")
# Reference never sees system prompt or tool-role messages.
assert all(m["role"] == "user" for m in ref_call["messages"])
assert ref_call.get("tools") in (None, [])
# Aggregator still receives the original messages + tool schema.
agg_call = next(c for c in calls if c["task"] == "moa_aggregator")
assert agg_call["tools"] is not None
def test_moa_disabled_preset_skips_references(monkeypatch, tmp_path):
home = tmp_path / ".hermes"
home.mkdir()
(home / "config.yaml").write_text(
"""
moa:
default_preset: review
presets:
review:
enabled: false
reference_models:
- provider: openai-codex
model: gpt-5.5
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""".strip(),
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(home))
calls = []
def fake_call_llm(**kwargs):
calls.append(kwargs)
return _response("aggregator only")
monkeypatch.setattr("agent.moa_loop.call_llm", fake_call_llm)
from agent.moa_loop import MoAChatCompletions
facade = MoAChatCompletions("review")
facade.create(messages=[{"role": "user", "content": "question"}], tools=[{"type": "function"}])
tasks = [c["task"] for c in calls]
# No reference fan-out — only the aggregator runs.
assert tasks == ["moa_aggregator"]
# Aggregator gets the unmodified user message (no MoA guidance appended).
agg_call = calls[0]
assert agg_call["messages"][-1]["content"] == "question"
def test_references_run_in_parallel(monkeypatch):
"""References fan out concurrently (delegate-batch semantics), not serially.
Each reference sleeps; wall-time must approximate the slowest single call,
not the sum. Order is preserved and a failing reference is isolated.
"""
import time
from agent import moa_loop
# Force _extract_text down its fallback path (no transport normalize).
monkeypatch.setattr(moa_loop, "get_transport", lambda *_a, **_k: None)
barrier_hits = []
def slow_call_llm(**kwargs):
barrier_hits.append(time.monotonic())
model = kwargs["model"]
if model == "boom":
raise RuntimeError("kaboom")
time.sleep(0.5)
return _response(f"resp-{kwargs['provider']}")
monkeypatch.setattr(moa_loop, "call_llm", slow_call_llm)
refs = [
{"provider": "p1", "model": "ok"},
{"provider": "moa", "model": "preset"}, # recursion guard, not dispatched
{"provider": "p2", "model": "boom"}, # failure isolated
{"provider": "p3", "model": "ok"},
]
start = time.monotonic()
out = moa_loop._run_references_parallel(
refs, [{"role": "user", "content": "hi"}], temperature=0.6, max_tokens=64
)
elapsed = time.monotonic() - start
# Two 0.5s sleeps run concurrently → well under the 1.0s serial floor.
assert elapsed < 0.9, f"references did not run in parallel (took {elapsed:.2f}s)"
# Output order matches input order (stable Reference N labelling).
assert [label for label, _ in out] == ["p1:ok", "moa:preset", "p2:boom", "p3:ok"]
assert "recursively reference MoA" in out[1][1]
assert out[2][1].startswith("[failed:")
assert out[0][1] == "resp-p1"
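
The properties this test checks (concurrent dispatch, stable output order, per-reference failure isolation, and a recursion guard for `moa` references) can be reproduced with a bounded `ThreadPoolExecutor`. The sketch below is illustrative only: `run_references`, its `call` parameter, and the exact placeholder strings are assumptions rather than the contents of `_run_references_parallel`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Ref = Dict[str, str]


def run_references(
    refs: List[Ref],
    call: Callable[[Ref], str],
    max_workers: int = 8,
) -> List[Tuple[str, str]]:
    """Run call(ref) for every reference concurrently; return (label, text) in input order."""

    def one(ref: Ref) -> Tuple[str, str]:
        label = f"{ref['provider']}:{ref['model']}"
        if ref["provider"] == "moa":
            # Recursion guard: a MoA preset is never dispatched as a reference.
            return label, "[skipped: cannot recursively reference MoA]"
        try:
            return label, call(ref)
        except Exception as exc:
            # Isolate the failure so the other references still contribute.
            return label, f"[failed: {exc}]"

    if not refs:
        return []
    workers = max(1, min(max_workers, len(refs)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whatever the finish order.
        return list(pool.map(one, refs))


def fake_call(ref: Ref) -> str:
    """Stand-in for the real LLM call; fails for the model named 'boom'."""
    if ref["model"] == "boom":
        raise RuntimeError("kaboom")
    return f"resp-{ref['provider']}"


refs = [
    {"provider": "p1", "model": "ok"},
    {"provider": "moa", "model": "preset"},
    {"provider": "p2", "model": "boom"},
    {"provider": "p3", "model": "ok"},
]
out = run_references(refs, fake_call)
```

Catching exceptions inside the worker, instead of around `pool.map`, is what keeps one failing reference from discarding the results of the others.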

@ -375,7 +375,7 @@ class TestPreToolCallBlocking:
class TestLegacyToolsetMap:
def test_expected_legacy_names(self):
expected = [
"web_tools", "terminal_tools", "vision_tools", "moa_tools",
"web_tools", "terminal_tools", "vision_tools",
"image_tools", "skills_tools", "browser_tools", "cronjob_tools",
"file_tools", "tts_tools",
]

@ -36,52 +36,6 @@ def _run(coro):
return asyncio.get_event_loop().run_until_complete(coro)
# ── mixture_of_agents_tool — reference model (line 146) ───────────────────
class TestMoAReferenceModelContentNone:
"""tools/mixture_of_agents_tool.py — _query_model()"""
def test_none_content_raises_before_fix(self):
"""Demonstrate that None content from a reasoning model crashes."""
response = _make_response(None)
# Simulate the exact line: response.choices[0].message.content.strip()
with pytest.raises(AttributeError):
response.choices[0].message.content.strip()
def test_none_content_safe_with_or_guard(self):
"""The ``or ""`` guard should convert None to empty string."""
response = _make_response(None)
content = (response.choices[0].message.content or "").strip()
assert content == ""
def test_normal_content_unaffected(self):
"""Regular string content should pass through unchanged."""
response = _make_response(" Hello world ")
content = (response.choices[0].message.content or "").strip()
assert content == "Hello world"
# ── mixture_of_agents_tool — aggregator (line 214) ────────────────────────
class TestMoAAggregatorContentNone:
"""tools/mixture_of_agents_tool.py — _run_aggregator()"""
def test_none_content_raises_before_fix(self):
response = _make_response(None)
with pytest.raises(AttributeError):
response.choices[0].message.content.strip()
def test_none_content_safe_with_or_guard(self):
response = _make_response(None)
content = (response.choices[0].message.content or "").strip()
assert content == ""
# ── web_tools — LLM content processor (line 419) ─────────────────────────
class TestWebToolsProcessorContentNone:
@ -170,14 +124,6 @@ class TestSourceLinesAreGuarded:
with open(os.path.join(base, rel_path)) as f:
return f.read()
def test_mixture_of_agents_reference_model_guarded(self):
src = self._read_file("tools/mixture_of_agents_tool.py")
# The unguarded pattern should NOT exist
assert ".message.content.strip()" not in src, (
"tools/mixture_of_agents_tool.py still has unguarded "
".content.strip() — apply `(... or \"\").strip()` guard"
)
def test_web_tools_guarded(self):
src = self._read_file("tools/web_tools.py")
assert ".message.content.strip()" not in src, (

@ -1,85 +0,0 @@
import importlib
import json
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
moa = importlib.import_module("tools.mixture_of_agents_tool")
def test_moa_defaults_are_well_formed():
# Invariants, not a catalog snapshot: the exact model list churns with
# OpenRouter availability (see PR #6636 where gemini-3-pro-preview was
# removed upstream). What we care about is that the defaults are present
# and valid vendor/model slugs.
assert isinstance(moa.REFERENCE_MODELS, list)
assert len(moa.REFERENCE_MODELS) >= 1
for m in moa.REFERENCE_MODELS:
assert isinstance(m, str) and "/" in m and not m.startswith("/")
assert isinstance(moa.AGGREGATOR_MODEL, str)
assert "/" in moa.AGGREGATOR_MODEL
@pytest.mark.asyncio
async def test_reference_model_retry_warnings_avoid_exc_info_until_terminal_failure(monkeypatch):
fake_client = SimpleNamespace(
chat=SimpleNamespace(
completions=SimpleNamespace(
create=AsyncMock(side_effect=RuntimeError("rate limited"))
)
)
)
warn = MagicMock()
err = MagicMock()
monkeypatch.setattr(moa, "_get_openrouter_client", lambda: fake_client)
monkeypatch.setattr(moa.logger, "warning", warn)
monkeypatch.setattr(moa.logger, "error", err)
model, message, success = await moa._run_reference_model_safe(
"openai/gpt-5.4-pro", "hello", max_retries=2
)
assert model == "openai/gpt-5.4-pro"
assert success is False
assert "failed after 2 attempts" in message
assert warn.call_count == 2
assert all(call.kwargs.get("exc_info") is None for call in warn.call_args_list)
err.assert_called_once()
assert err.call_args.kwargs.get("exc_info") is True
@pytest.mark.asyncio
async def test_moa_top_level_error_logs_single_traceback_on_aggregator_failure(monkeypatch):
monkeypatch.setenv("OPENROUTER_API_KEY", "test-key")
monkeypatch.setattr(
moa,
"_run_reference_model_safe",
AsyncMock(return_value=("anthropic/claude-opus-4.6", "ok", True)),
)
monkeypatch.setattr(
moa,
"_run_aggregator_model",
AsyncMock(side_effect=RuntimeError("aggregator boom")),
)
monkeypatch.setattr(
moa,
"_debug",
SimpleNamespace(log_call=MagicMock(), save=MagicMock(), active=False),
)
err = MagicMock()
monkeypatch.setattr(moa.logger, "error", err)
result = json.loads(
await moa.mixture_of_agents_tool(
"solve this",
reference_models=["anthropic/claude-opus-4.6"],
)
)
assert result["success"] is False
assert "Error in MoA processing" in result["error"]
err.assert_called_once()
assert err.call_args.kwargs.get("exc_info") is True

@ -202,3 +202,77 @@ def test_pending_input_commands_includes_goal(server):
"""Guard: _PENDING_INPUT_COMMANDS must list 'goal' — removing it would
silently re-break the TUI."""
assert "goal" in server._PENDING_INPUT_COMMANDS
# ── command.dispatch /moa ────────────────────────────────────────────
def _write_moa_config(home, text):
cfg_path = home / "config.yaml"
cfg_path.write_text(text)
def test_moa_bare_switches_to_default_preset_model(server, session, hermes_home):
_write_moa_config(hermes_home, """
moa:
default_preset: default
presets:
default:
reference_models:
- provider: openai-codex
model: gpt-5.5
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""")
sid, _, s = session
r = _call(server, "command.dispatch", name="moa", arg="", session_id=sid)
assert r["result"]["type"] == "exec"
assert "Model switched to MoA preset: default" in r["result"]["output"]
assert s["model_override"]["provider"] == "moa"
assert s["model_override"]["model"] == "default"
def test_moa_exact_preset_switches_to_named_preset_model(server, session, hermes_home):
_write_moa_config(hermes_home, """
moa:
default_preset: default
presets:
default: {}
review:
reference_models:
- provider: openrouter
model: deepseek/deepseek-v4-pro
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""")
sid, _, s = session
r = _call(server, "command.dispatch", name="moa", arg="review", session_id=sid)
assert r["result"]["type"] == "exec"
assert s["model_override"]["provider"] == "moa"
assert s["model_override"]["model"] == "review"
def test_moa_non_preset_returns_one_shot_send(server, session, hermes_home):
_write_moa_config(hermes_home, """
moa:
default_preset: default
presets:
default:
reference_models:
- provider: openai-codex
model: gpt-5.5
aggregator:
provider: openrouter
model: anthropic/claude-opus-4.8
""")
sid, _, _ = session
r = _call(server, "command.dispatch", name="moa", arg="inspect this project", session_id=sid)
result = r["result"]
assert result["type"] == "send"
assert result["message"] == "inspect this project"
assert "one-shot" in result["notice"]
def test_pending_input_commands_includes_moa(server):
assert "moa" in server._PENDING_INPUT_COMMANDS

@ -2,7 +2,7 @@
Replaces the identical DEBUG_MODE / _log_debug_call / _save_debug_log /
get_debug_session_info boilerplate previously duplicated across web_tools,
vision_tools, mixture_of_agents_tool, and image_generation_tool.
vision_tools, and image_generation_tool.
Usage in a tool module:

@ -120,7 +120,7 @@ def _get_subagent_approval_callback():
# toolset to request explicitly — the correct mechanism for nested
# delegation is role='orchestrator', which re-adds "delegation" in
# _build_child_agent regardless of this exclusion.
_EXCLUDED_TOOLSET_NAMES = frozenset({"debugging", "safe", "delegation", "moa", "rl"})
_EXCLUDED_TOOLSET_NAMES = frozenset({"debugging", "safe", "delegation", "rl"})
_SUBAGENT_TOOLSETS = sorted(
name
for name, defn in TOOLSETS.items()

@ -1,542 +0,0 @@
#!/usr/bin/env python3
"""
Mixture-of-Agents Tool Module
This module implements the Mixture-of-Agents (MoA) methodology that leverages
the collective strengths of multiple LLMs through a layered architecture to
achieve state-of-the-art performance on complex reasoning tasks.
Based on the research paper: "Mixture-of-Agents Enhances Large Language Model Capabilities"
by Junlin Wang et al. (arXiv:2406.04692v1)
Key Features:
- Multi-layer LLM collaboration for enhanced reasoning
- Parallel processing of reference models for efficiency
- Intelligent aggregation and synthesis of diverse responses
- Specialized for extremely difficult problems requiring intense reasoning
- Optimized for coding, mathematics, and complex analytical tasks
Available Tool:
- mixture_of_agents_tool: Process complex queries using multiple frontier models
Architecture:
1. Reference models generate diverse initial responses in parallel
2. Aggregator model synthesizes responses into a high-quality output
3. Multiple layers can be used for iterative refinement (future enhancement)
Models Used (via OpenRouter):
- Reference Models: claude-opus-4.6, gemini-3-pro-preview, gpt-5.4-pro, deepseek-v3.2
- Aggregator Model: claude-opus-4.6 (highest capability for synthesis)
Configuration:
To customize the MoA setup, modify the configuration constants at the top of this file:
- REFERENCE_MODELS: List of models for generating diverse initial responses
- AGGREGATOR_MODEL: Model used to synthesize the final response
- REFERENCE_TEMPERATURE/AGGREGATOR_TEMPERATURE: Sampling temperatures
- MIN_SUCCESSFUL_REFERENCES: Minimum successful models needed to proceed
Usage:
from mixture_of_agents_tool import mixture_of_agents_tool
import asyncio
# Process a complex query
result = await mixture_of_agents_tool(
user_prompt="Solve this complex mathematical proof..."
)
"""
import json
import logging
import os
import asyncio
import datetime
from typing import Dict, Any, List, Optional
from tools.openrouter_client import get_async_client as _get_openrouter_client, check_api_key as check_openrouter_api_key
from agent.auxiliary_client import extract_content_or_reasoning
from tools.debug_helpers import DebugSession
import sys
logger = logging.getLogger(__name__)
# Configuration for MoA processing
# Reference models - these generate diverse initial responses in parallel.
# Keep this list aligned with current top-tier OpenRouter frontier options.
REFERENCE_MODELS = [
"anthropic/claude-opus-4.6",
"google/gemini-2.5-pro",
"openai/gpt-5.4-pro",
"deepseek/deepseek-v3.2",
]
# Aggregator model - synthesizes reference responses into final output.
# Prefer the strongest synthesis model in the current OpenRouter lineup.
AGGREGATOR_MODEL = "anthropic/claude-opus-4.6"
# Temperature settings optimized for MoA performance
REFERENCE_TEMPERATURE = 0.6 # Balanced creativity for diverse perspectives
AGGREGATOR_TEMPERATURE = 0.4 # Focused synthesis for consistency
# Failure handling configuration
MIN_SUCCESSFUL_REFERENCES = 1 # Minimum successful reference models needed to proceed
# System prompt for the aggregator model (from the research paper)
AGGREGATOR_SYSTEM_PROMPT = """You have been provided with a set of responses from various open-source models to the latest user query. Your task is to synthesize these responses into a single, high-quality response. It is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect. Your response should not simply replicate the given answers but should offer a refined, accurate, and comprehensive reply to the instruction. Ensure your response is well-structured, coherent, and adheres to the highest standards of accuracy and reliability.
Responses from models:"""
_debug = DebugSession("moa_tools", env_var="MOA_TOOLS_DEBUG")
def _construct_aggregator_prompt(system_prompt: str, responses: List[str]) -> str:
"""
Construct the final system prompt for the aggregator including all model responses.
Args:
system_prompt (str): Base system prompt for aggregation
responses (List[str]): List of responses from reference models
Returns:
str: Complete system prompt with enumerated responses
"""
response_text = "\n".join([f"{i+1}. {response}" for i, response in enumerate(responses)])
return f"{system_prompt}\n\n{response_text}"
async def _run_reference_model_safe(
model: str,
user_prompt: str,
temperature: float = REFERENCE_TEMPERATURE,
max_tokens: int = 32000,
max_retries: int = 6
) -> tuple[str, str, bool]:
"""
Run a single reference model with retry logic and graceful failure handling.
Args:
model (str): Model identifier to use
user_prompt (str): The user's query
temperature (float): Sampling temperature for response generation
max_tokens (int): Maximum tokens in response
max_retries (int): Maximum number of retry attempts
Returns:
tuple[str, str, bool]: (model_name, response_content_or_error, success_flag)
"""
for attempt in range(max_retries):
try:
logger.info("Querying %s (attempt %s/%s)", model, attempt + 1, max_retries)
# Build parameters for the API call
api_params = {
"model": model,
"messages": [{"role": "user", "content": user_prompt}],
"max_tokens": max_tokens,
"extra_body": {
"reasoning": {
"enabled": True,
"effort": "xhigh"
}
}
}
# GPT models (especially gpt-4o-mini) don't support custom temperature values
# Only include temperature for non-GPT models
if not model.lower().startswith('gpt-'):
api_params["temperature"] = temperature
response = await _get_openrouter_client().chat.completions.create(**api_params)
content = extract_content_or_reasoning(response)
if not content:
# Reasoning-only response — let the retry loop handle it
logger.warning("%s returned empty content (attempt %s/%s), retrying", model, attempt + 1, max_retries)
if attempt < max_retries - 1:
await asyncio.sleep(min(2 ** (attempt + 1), 60))
continue
logger.info("%s responded (%s characters)", model, len(content))
return model, content, True
except Exception as e:
error_str = str(e)
# Keep retry-path logging concise; full tracebacks are reserved for
# terminal failure paths so long-running MoA retries don't flood logs.
if "invalid" in error_str.lower():
logger.warning("%s invalid request error (attempt %s): %s", model, attempt + 1, error_str)
elif "rate" in error_str.lower() or "limit" in error_str.lower():
logger.warning("%s rate limit error (attempt %s): %s", model, attempt + 1, error_str)
else:
logger.warning("%s unknown error (attempt %s): %s", model, attempt + 1, error_str)
if attempt < max_retries - 1:
# Exponential backoff capped at 60s; with the default 6 attempts the waits are 2s, 4s, 8s, 16s, 32s
sleep_time = min(2 ** (attempt + 1), 60)
logger.info("Retrying in %ss...", sleep_time)
await asyncio.sleep(sleep_time)
else:
error_msg = f"{model} failed after {max_retries} attempts: {error_str}"
logger.error("%s", error_msg, exc_info=True)
return model, error_msg, False
async def _run_aggregator_model(
system_prompt: str,
user_prompt: str,
temperature: float = AGGREGATOR_TEMPERATURE,
max_tokens: int = None
) -> str:
"""
Run the aggregator model to synthesize the final response.
Args:
system_prompt (str): System prompt with all reference responses
user_prompt (str): Original user query
temperature (float): Focused temperature for consistent aggregation
max_tokens (int): Maximum tokens in final response
Returns:
str: Synthesized final response
"""
logger.info("Running aggregator model: %s", AGGREGATOR_MODEL)
# Build parameters for the API call
api_params = {
"model": AGGREGATOR_MODEL,
"messages": [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
],
"max_tokens": max_tokens,
"extra_body": {
"reasoning": {
"enabled": True,
"effort": "xhigh"
}
}
}
# GPT models (especially gpt-4o-mini) don't support custom temperature values
# Only include temperature for non-GPT models
if not AGGREGATOR_MODEL.lower().startswith('gpt-'):
api_params["temperature"] = temperature
response = await _get_openrouter_client().chat.completions.create(**api_params)
content = extract_content_or_reasoning(response)
# Retry once on empty content (reasoning-only response)
if not content:
logger.warning("Aggregator returned empty content, retrying once")
response = await _get_openrouter_client().chat.completions.create(**api_params)
content = extract_content_or_reasoning(response)
logger.info("Aggregation complete (%s characters)", len(content))
return content
async def mixture_of_agents_tool(
user_prompt: str,
reference_models: Optional[List[str]] = None,
aggregator_model: Optional[str] = None
) -> str:
"""
Process a complex query using the Mixture-of-Agents methodology.
This tool leverages multiple frontier language models to collaboratively solve
extremely difficult problems requiring intense reasoning. It's particularly
effective for:
- Complex mathematical proofs and calculations
- Advanced coding problems and algorithm design
- Multi-step analytical reasoning tasks
- Problems requiring diverse domain expertise
- Tasks where single models show limitations
The MoA approach uses a fixed 2-layer architecture:
1. Layer 1: Multiple reference models generate diverse responses in parallel (temp=0.6)
2. Layer 2: Aggregator model synthesizes the best elements into final response (temp=0.4)
Args:
user_prompt (str): The complex query or problem to solve
reference_models (Optional[List[str]]): Custom reference models to use
aggregator_model (Optional[str]): Custom aggregator model to use
Returns:
str: JSON string containing the MoA results with the following structure:
{
"success": bool,
"response": str,
"models_used": {
"reference_models": List[str],
"aggregator_model": str
},
"processing_time": float
}
Raises:
Exception: If MoA processing fails or API key is not set
"""
start_time = datetime.datetime.now()
debug_call_data = {
"parameters": {
"user_prompt": user_prompt[:200] + "..." if len(user_prompt) > 200 else user_prompt,
"reference_models": reference_models or REFERENCE_MODELS,
"aggregator_model": aggregator_model or AGGREGATOR_MODEL,
"reference_temperature": REFERENCE_TEMPERATURE,
"aggregator_temperature": AGGREGATOR_TEMPERATURE,
"min_successful_references": MIN_SUCCESSFUL_REFERENCES
},
"error": None,
"success": False,
"reference_responses_count": 0,
"failed_models_count": 0,
"failed_models": [],
"final_response_length": 0,
"processing_time_seconds": 0,
"models_used": {}
}
try:
logger.info("Starting Mixture-of-Agents processing...")
logger.info("Query: %s", user_prompt[:100])
# Validate API key availability
if not os.getenv("OPENROUTER_API_KEY"):
raise ValueError("OPENROUTER_API_KEY environment variable not set")
# Use provided models or defaults
ref_models = reference_models or REFERENCE_MODELS
agg_model = aggregator_model or AGGREGATOR_MODEL
logger.info("Using %s reference models in 2-layer MoA architecture", len(ref_models))
# Layer 1: Generate diverse responses from reference models (with failure handling)
logger.info("Layer 1: Generating reference responses...")
model_results = await asyncio.gather(*[
_run_reference_model_safe(model, user_prompt, REFERENCE_TEMPERATURE)
for model in ref_models
])
# Separate successful and failed responses
successful_responses = []
failed_models = []
for model_name, content, success in model_results:
if success:
successful_responses.append(content)
else:
failed_models.append(model_name)
successful_count = len(successful_responses)
failed_count = len(failed_models)
logger.info("Reference model results: %s successful, %s failed", successful_count, failed_count)
if failed_models:
logger.warning("Failed models: %s", ', '.join(failed_models))
# Check if we have enough successful responses to proceed
if successful_count < MIN_SUCCESSFUL_REFERENCES:
raise ValueError(f"Insufficient successful reference models ({successful_count}/{len(ref_models)}). Need at least {MIN_SUCCESSFUL_REFERENCES} successful responses.")
debug_call_data["reference_responses_count"] = successful_count
debug_call_data["failed_models_count"] = failed_count
debug_call_data["failed_models"] = failed_models
# Layer 2: Aggregate responses using the aggregator model
logger.info("Layer 2: Synthesizing final response...")
aggregator_system_prompt = _construct_aggregator_prompt(
AGGREGATOR_SYSTEM_PROMPT,
successful_responses
)
final_response = await _run_aggregator_model(
aggregator_system_prompt,
user_prompt,
AGGREGATOR_TEMPERATURE
)
# Calculate processing time
end_time = datetime.datetime.now()
processing_time = (end_time - start_time).total_seconds()
logger.info("MoA processing completed in %.2f seconds", processing_time)
# Prepare successful response (only final aggregated result, minimal fields)
result = {
"success": True,
"response": final_response,
"models_used": {
"reference_models": ref_models,
"aggregator_model": agg_model
}
}
debug_call_data["success"] = True
debug_call_data["final_response_length"] = len(final_response)
debug_call_data["processing_time_seconds"] = processing_time
debug_call_data["models_used"] = result["models_used"]
# Log debug information
_debug.log_call("mixture_of_agents_tool", debug_call_data)
_debug.save()
return json.dumps(result, indent=2, ensure_ascii=False)
except Exception as e:
error_msg = f"Error in MoA processing: {str(e)}"
logger.error("%s", error_msg, exc_info=True)
# Calculate processing time even for errors
end_time = datetime.datetime.now()
processing_time = (end_time - start_time).total_seconds()
# Prepare error response (minimal fields)
result = {
"success": False,
"response": "MoA processing failed. Please try again or use a single model for this query.",
"models_used": {
"reference_models": reference_models or REFERENCE_MODELS,
"aggregator_model": aggregator_model or AGGREGATOR_MODEL
},
"error": error_msg
}
debug_call_data["error"] = error_msg
debug_call_data["processing_time_seconds"] = processing_time
_debug.log_call("mixture_of_agents_tool", debug_call_data)
_debug.save()
return json.dumps(result, indent=2, ensure_ascii=False)
def check_moa_requirements() -> bool:
"""
Check if all requirements for MoA tools are met.
Returns:
bool: True if requirements are met, False otherwise
"""
return check_openrouter_api_key()
def get_moa_configuration() -> Dict[str, Any]:
"""
Get the current MoA configuration settings.
Returns:
Dict[str, Any]: Dictionary containing all configuration parameters
"""
return {
"reference_models": REFERENCE_MODELS,
"aggregator_model": AGGREGATOR_MODEL,
"reference_temperature": REFERENCE_TEMPERATURE,
"aggregator_temperature": AGGREGATOR_TEMPERATURE,
"min_successful_references": MIN_SUCCESSFUL_REFERENCES,
"total_reference_models": len(REFERENCE_MODELS),
"failure_tolerance": f"{len(REFERENCE_MODELS) - MIN_SUCCESSFUL_REFERENCES}/{len(REFERENCE_MODELS)} models can fail"
}
if __name__ == "__main__":
"""
Simple test/demo when run directly
"""
print("🤖 Mixture-of-Agents Tool Module")
print("=" * 50)
# Check if API key is available
api_available = check_openrouter_api_key()
if not api_available:
print("❌ OPENROUTER_API_KEY environment variable not set")
print("Please set your API key: export OPENROUTER_API_KEY='your-key-here'")
print("Get API key at: https://openrouter.ai/")
sys.exit(1)
else:
print("✅ OpenRouter API key found")
print("🛠️ MoA tools ready for use!")
# Show current configuration
config = get_moa_configuration()
print("\n⚙️ Current Configuration:")
print(f" 🤖 Reference models ({len(config['reference_models'])}): {', '.join(config['reference_models'])}")
print(f" 🧠 Aggregator model: {config['aggregator_model']}")
print(f" 🌡️ Reference temperature: {config['reference_temperature']}")
print(f" 🌡️ Aggregator temperature: {config['aggregator_temperature']}")
print(f" 🛡️ Failure tolerance: {config['failure_tolerance']}")
print(f" 📊 Minimum successful models: {config['min_successful_references']}")
# Show debug mode status
if _debug.active:
print(f"\n🐛 Debug mode ENABLED - Session ID: {_debug.session_id}")
print(f" Debug logs will be saved to: ./logs/moa_tools_debug_{_debug.session_id}.json")
else:
print("\n🐛 Debug mode disabled (set MOA_TOOLS_DEBUG=true to enable)")
print("\nBasic usage:")
print(" from mixture_of_agents_tool import mixture_of_agents_tool")
print(" import asyncio")
print("")
print(" async def main():")
print(" result = await mixture_of_agents_tool(")
print(" user_prompt='Solve this complex mathematical proof...'")
print(" )")
print(" print(result)")
print(" asyncio.run(main())")
print("\nBest use cases:")
print(" - Complex mathematical proofs and calculations")
print(" - Advanced coding problems and algorithm design")
print(" - Multi-step analytical reasoning tasks")
print(" - Problems requiring diverse domain expertise")
print(" - Tasks where single models show limitations")
print("\nPerformance characteristics:")
print(" - Higher latency due to multiple model calls")
print(" - Significantly improved quality for complex tasks")
print(" - Parallel processing for efficiency")
print(f" - Optimized temperatures: {REFERENCE_TEMPERATURE} for reference models, {AGGREGATOR_TEMPERATURE} for aggregation")
print(" - Token-efficient: only returns final aggregated response")
print(" - Resilient: continues with partial model failures")
print(" - Configurable: easy to modify models and settings at top of file")
print(" - State-of-the-art results on challenging benchmarks")
print("\nDebug mode:")
print(" # Enable debug logging")
print(" export MOA_TOOLS_DEBUG=true")
print(" # Debug logs capture all MoA processing steps and metrics")
print(" # Logs saved to: ./logs/moa_tools_debug_UUID.json")
# ---------------------------------------------------------------------------
# Registry
# ---------------------------------------------------------------------------
from tools.registry import registry
MOA_SCHEMA = {
"name": "mixture_of_agents",
"description": "Route a hard problem through multiple frontier LLMs collaboratively. Makes 5 API calls (4 reference models + 1 aggregator) with maximum reasoning effort — use sparingly for genuinely difficult problems. Best for: complex math, advanced algorithms, multi-step analytical reasoning, problems benefiting from diverse perspectives.",
"parameters": {
"type": "object",
"properties": {
"user_prompt": {
"type": "string",
"description": "The complex query or problem to solve using multiple AI models. Should be a challenging problem that benefits from diverse perspectives and collaborative reasoning."
}
},
"required": ["user_prompt"]
}
}
registry.register(
name="mixture_of_agents",
toolset="moa",
schema=MOA_SCHEMA,
handler=lambda args, **kw: mixture_of_agents_tool(user_prompt=args.get("user_prompt", "")),
check_fn=check_moa_requirements,
requires_env=["OPENROUTER_API_KEY"],
is_async=True,
emoji="🧠",
)
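For reviewers comparing the removed tool with the new provider-based loop, the two-layer flow of the removed module (parallel reference calls, failed references filtered out, numbered responses appended to the aggregator system prompt) can be sketched without network access. This is a minimal illustration, not the module itself: `fake_model`, `fan_out`, `BASE_PROMPT`, and the model names are hypothetical stand-ins, and the real code adds retries, logging, and the aggregator call.

```python
import asyncio
from typing import List, Tuple

# Shortened stand-in for AGGREGATOR_SYSTEM_PROMPT (assumption for brevity).
BASE_PROMPT = "Responses from models:"


async def fake_model(model: str, prompt: str) -> Tuple[str, str, bool]:
    """Hypothetical reference-model call that reports failure instead of raising."""
    try:
        if model == "broken/model":
            raise RuntimeError("simulated provider error")
        return model, f"{model} answer to: {prompt}", True
    except Exception as exc:
        return model, str(exc), False


def build_aggregator_prompt(base: str, responses: List[str]) -> str:
    """Number the successful responses and append them to the base prompt."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
    return f"{base}\n\n{numbered}"


async def fan_out(models: List[str], prompt: str, min_ok: int = 1) -> str:
    """Layer 1: query all references concurrently, keeping input order."""
    results = await asyncio.gather(*(fake_model(m, prompt) for m in models))
    ok = [content for _, content, success in results if success]
    if len(ok) < min_ok:
        raise ValueError(f"only {len(ok)}/{len(models)} references succeeded")
    # Layer 2 would send this system prompt plus the user prompt to the aggregator.
    return build_aggregator_prompt(BASE_PROMPT, ok)


if __name__ == "__main__":
    print(asyncio.run(fan_out(["a/one", "broken/model", "b/two"], "hi")))
```

Because `asyncio.gather` returns results in argument order, the numbering in the aggregator prompt follows the order of the reference list even when one reference fails.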


@ -36,7 +36,6 @@ DISTRIBUTIONS = {
"image_gen": 100,
"terminal": 100,
"file": 100,
"moa": 100,
"browser": 100
}
},
@ -48,8 +47,7 @@ DISTRIBUTIONS = {
"image_gen": 90, # 90% chance of image generation tools
"vision": 90, # 90% chance of vision tools
"web": 55, # 55% chance of web tools
"terminal": 45,
"moa": 10 # 10% chance of reasoning tools
"terminal": 45
}
},
@ -60,7 +58,6 @@ DISTRIBUTIONS = {
"web": 90, # 90% chance of web tools
"browser": 70, # 70% chance of browser tools for deep research
"vision": 50, # 50% chance of vision tools
"moa": 40, # 40% chance of reasoning tools
"terminal": 10 # 10% chance of terminal tools
}
},
@ -74,8 +71,7 @@ DISTRIBUTIONS = {
"file": 94, # 94% chance of file tools
"vision": 65, # 65% chance of vision tools
"browser": 50, # 50% chance of browser for accessing papers/databases
"image_gen": 15, # 15% chance of image generation tools
"moa": 10 # 10% chance of reasoning tools
"image_gen": 15 # 15% chance of image generation tools
}
},
@ -85,7 +81,6 @@ DISTRIBUTIONS = {
"toolsets": {
"terminal": 80, # 80% chance of terminal tools
"file": 80, # 80% chance of file tools (read, write, patch, search)
"moa": 60, # 60% chance of reasoning tools
"web": 30, # 30% chance of web tools
"vision": 10 # 10% chance of vision tools
}
@ -98,8 +93,7 @@ DISTRIBUTIONS = {
"web": 80,
"browser": 70, # Browser is safe (no local filesystem access)
"vision": 60,
"image_gen": 60,
"moa": 50
"image_gen": 60
}
},
@ -112,7 +106,6 @@ DISTRIBUTIONS = {
"image_gen": 50,
"terminal": 50,
"file": 50,
"moa": 50,
"browser": 50
}
},
@ -156,14 +149,15 @@ DISTRIBUTIONS = {
# Reasoning heavy
"reasoning": {
"description": "Heavy mixture of agents usage with minimal other tools",
"description": "Heavy research/reasoning distribution with minimal other tools",
"toolsets": {
"moa": 90,
"web": 30,
"web": 90,
"file": 60,
"terminal": 20
}
},
# Browser-based web interaction
"browser_use": {
"description": "Full browser-based web interaction with search, vision, and page control",


@ -156,12 +156,6 @@ TOOLSETS = {
"includes": []
},
"moa": {
"description": "Advanced reasoning and problem-solving tools",
"tools": ["mixture_of_agents"],
"includes": []
},
"skills": {
"description": "Access, create, edit, and manage skill documents with specialized instructions and knowledge",
"tools": ["skills_list", "skill_view", "skill_manage"],


@ -8163,6 +8163,12 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
except (TypeError, ValueError):
pass
result = agent.run_conversation(run_message, **run_kwargs)
if "moa_one_shot_restore" in session:
_restore = session.pop("moa_one_shot_restore", None)
if _restore is None:
session.pop("model_override", None)
else:
session["model_override"] = _restore
last_reasoning = None
status_note = None
@ -10223,6 +10229,7 @@ _PENDING_INPUT_COMMANDS: frozenset[str] = frozenset(
"steer",
"plan",
"goal",
"moa",
"undo",
"learn",
}
@ -10495,6 +10502,49 @@ def _(rid, params: dict) -> dict:
from agent.learn_prompt import build_learn_prompt
return _ok(rid, {"type": "send", "message": build_learn_prompt(arg)})
if name == "moa":
try:
from hermes_cli.moa_config import (
build_moa_turn_prompt, exact_moa_preset_name, moa_usage, normalize_moa_config
)
moa_cfg = normalize_moa_config(_load_cfg().get("moa") or {})
matched = exact_moa_preset_name(moa_cfg, arg) if arg else moa_cfg["default_preset"]
if matched:
if not session:
return _err(rid, 4001, "no active session")
session["model_override"] = {
"model": matched,
"provider": "moa",
"base_url": "moa://local",
"api_key": "moa-virtual-provider",
"api_mode": "chat_completions",
}
session["moa_active_preset"] = matched
return _ok(rid, {"type": "exec", "output": f"Model switched to MoA preset: {matched}."})
if not arg:
return _err(rid, 4004, moa_usage())
if not session:
return _err(rid, 4001, "no active session")
preset = moa_cfg["default_preset"]
session["moa_one_shot_restore"] = session.get("model_override")
session["model_override"] = {
"model": preset,
"provider": "moa",
"base_url": "moa://local",
"api_key": "moa-virtual-provider",
"api_mode": "chat_completions",
}
return _ok(
rid,
{
"type": "send",
"notice": f"MoA one-shot queued with preset {preset}; previous model will be restored after this turn.",
"message": arg,
},
)
except Exception as exc:
return _err(rid, 5030, f"moa unavailable: {exc}")
if name == "retry":
if not session:
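The `/moa` branch and the restore step after `run_conversation` together form a small state machine over `session`: no argument selects the default preset, an exact preset name switches persistently, and any other text runs one turn on the default preset before the previous override is put back. The sketch below models only that logic with plain dicts. `handle_moa`, `finish_turn`, and `moa_override` are hypothetical names, and matching a preset by set membership is an assumption about what `exact_moa_preset_name` does.

```python
from typing import Any, Dict, Optional, Set


def moa_override(preset: str) -> Dict[str, str]:
    """Reduced form of the model_override payload (base_url/api_key omitted)."""
    return {"model": preset, "provider": "moa"}


def handle_moa(
    session: Dict[str, Any],
    presets: Set[str],
    default_preset: str,
    arg: Optional[str],
) -> Dict[str, Any]:
    """Mirror the three /moa behaviors: default, named preset, one-shot prompt."""
    if arg:
        matched = arg if arg in presets else None  # assumed exact-match rule
    else:
        matched = default_preset
    if matched:
        # Persistent switch: the preset stays selected for later turns.
        session["model_override"] = moa_override(matched)
        session["moa_active_preset"] = matched
        return {"type": "exec", "preset": matched}
    if not arg:
        return {"type": "error", "reason": "usage"}
    # One-shot: remember the previous override (possibly None) and swap in MoA.
    session["moa_one_shot_restore"] = session.get("model_override")
    session["model_override"] = moa_override(default_preset)
    return {"type": "send", "message": arg}


def finish_turn(session: Dict[str, Any]) -> None:
    """Undo a one-shot override once the turn has completed."""
    if "moa_one_shot_restore" in session:
        restore = session.pop("moa_one_shot_restore")
        if restore is None:
            session.pop("model_override", None)
        else:
            session["model_override"] = restore


if __name__ == "__main__":
    s = {"model_override": {"model": "x", "provider": "openrouter"}}
    print(handle_moa(s, {"fast", "deep"}, "fast", "explain quicksort"))
    finish_turn(s)
    print(s)
```

Checking for the key with `in` rather than testing the stored value is what lets the restore step distinguish "no previous override" (stored `None`) from "no one-shot pending" (key absent).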


@ -76,6 +76,7 @@ const PROFILE_SCOPED_PREFIXES = [
"/api/model/info",
"/api/model/set",
"/api/model/auxiliary",
"/api/model/moa",
"/api/model/options",
];
@ -472,6 +473,13 @@ export const api = {
getModelInfo: () => fetchJSON<ModelInfoResponse>("/api/model/info"),
getModelOptions: () => fetchJSON<ModelOptionsResponse>("/api/model/options"),
getAuxiliaryModels: () => fetchJSON<AuxiliaryModelsResponse>("/api/model/auxiliary"),
getMoaModels: () => fetchJSON<MoaConfigResponse>("/api/model/moa"),
saveMoaModels: (body: MoaConfigResponse) =>
fetchJSON<MoaConfigResponse & { ok: boolean }>("/api/model/moa", {
method: "PUT",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(body),
}),
setModelAssignment: (body: ModelAssignmentRequest) =>
fetchJSON<ModelAssignmentResponse>("/api/model/set", {
method: "POST",
@ -2061,6 +2069,30 @@ export interface AuxiliaryModelsResponse {
main: { provider: string; model: string };
}
export interface MoaModelSlot {
provider: string;
model: string;
}
export interface MoaConfigResponse {
default_preset: string;
active_preset: string;
presets: Record<string, {
reference_models: MoaModelSlot[];
aggregator: MoaModelSlot;
reference_temperature: number;
aggregator_temperature: number;
max_tokens: number;
enabled: boolean;
}>;
reference_models: MoaModelSlot[];
aggregator: MoaModelSlot;
reference_temperature: number;
aggregator_temperature: number;
max_tokens: number;
enabled: boolean;
}
export interface ModelAssignmentRequest {
confirm_expensive_model?: boolean;
scope: "main" | "auxiliary";


@ -16,6 +16,8 @@ import { api } from "@/lib/api";
import type {
AuxiliaryModelsResponse,
AuxiliaryTaskAssignment,
MoaConfigResponse,
MoaModelSlot,
ModelsAnalyticsModelEntry,
ModelsAnalyticsResponse,
} from "@/lib/api";
@ -534,6 +536,10 @@ type PickerTarget =
| { kind: "main" }
| { kind: "aux"; task: string };
type MoaPickerTarget =
| { kind: "reference"; index: number }
| { kind: "aggregator" };
function AuxiliaryTasksModal({
aux,
refreshKey,
@ -687,6 +693,174 @@ function AuxiliaryTasksModal({
);
}
function MoaModelsModal({
config,
refreshKey,
onClose,
onSaved,
}: {
config: MoaConfigResponse;
refreshKey: number;
onClose(): void;
onSaved(next: MoaConfigResponse): void;
}) {
const [draft, setDraft] = useState<MoaConfigResponse>(config);
const [selected, setSelected] = useState(config.default_preset || Object.keys(config.presets)[0] || "default");
const [newName, setNewName] = useState("");
const [picker, setPicker] = useState<MoaPickerTarget | null>(null);
const [busy, setBusy] = useState(false);
const [error, setError] = useState<string | null>(null);
const presetNames = Object.keys(draft.presets || {});
const preset = draft.presets[selected] || draft.presets[presetNames[0]];
const slotLabel = (slot: MoaModelSlot) => `${slot.provider || "(provider)"} · ${slot.model || "(model)"}`;
const updateSelectedPreset = (updater: (preset: MoaConfigResponse["presets"][string]) => MoaConfigResponse["presets"][string]) => {
setDraft((prev) => ({
...prev,
presets: {
...prev.presets,
[selected]: updater(prev.presets[selected]),
},
}));
};
const save = async () => {
setBusy(true);
setError(null);
try {
const saved = await api.saveMoaModels(draft);
onSaved(saved);
onClose();
} catch (e) {
setError(e instanceof Error ? e.message : String(e));
} finally {
setBusy(false);
}
};
const addPreset = () => {
const name = newName.trim();
if (!name || draft.presets[name]) return;
const seed = preset || {
reference_models: draft.reference_models,
aggregator: draft.aggregator,
reference_temperature: draft.reference_temperature,
aggregator_temperature: draft.aggregator_temperature,
max_tokens: draft.max_tokens,
enabled: draft.enabled,
};
setDraft((prev) => ({
...prev,
default_preset: prev.default_preset || name,
presets: { ...prev.presets, [name]: { ...seed, reference_models: [...seed.reference_models] } },
}));
setSelected(name);
setNewName("");
};
const deletePreset = () => {
if (presetNames.length <= 1) return;
const remaining = presetNames.filter((name) => name !== selected);
const nextSelected = remaining[0];
setDraft((prev) => {
const next = { ...prev.presets };
delete next[selected];
return {
...prev,
presets: next,
default_preset: prev.default_preset === selected ? nextSelected : prev.default_preset,
active_preset: prev.active_preset === selected ? "" : prev.active_preset,
};
});
setSelected(nextSelected);
};
if (!preset) return null;
return (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-background/80 p-4 backdrop-blur-sm">
<Card className="max-h-[85vh] w-full max-w-2xl overflow-auto">
<CardHeader>
<CardTitle className="text-sm">Configure Mixture of Agents presets</CardTitle>
</CardHeader>
<CardContent className="space-y-4">
<p className="text-xs text-text-secondary">
Presets appear as models under the Mixture of Agents provider. References produce perspectives; the aggregator is the acting model that answers and calls tools.
</p>
<div className="flex flex-wrap items-center gap-2">
<select
className="border border-border bg-background px-2 py-1 text-xs"
value={selected}
onChange={(event) => setSelected(event.target.value)}
>
{presetNames.map((name) => <option key={name} value={name}>{name}</option>)}
</select>
<Button size="sm" outlined onClick={() => setDraft((prev) => ({ ...prev, default_preset: selected }))}>Set default</Button>
<Button size="sm" ghost disabled={presetNames.length <= 1} onClick={deletePreset}>Delete</Button>
<input
className="border border-border bg-background px-2 py-1 text-xs"
placeholder="new preset name"
value={newName}
onChange={(event) => setNewName(event.target.value)}
/>
<Button size="sm" outlined disabled={!newName.trim() || !!draft.presets[newName.trim()]} onClick={addPreset}>Add preset</Button>
</div>
<div className="text-xs text-text-secondary">
Default: <span className="font-mono">{draft.default_preset}</span>
</div>
<div className="space-y-2">
<div className="text-display text-xs font-medium tracking-wider">Reference models</div>
{preset.reference_models.map((slot, index) => (
<div key={`${selected}-${slot.provider}-${slot.model}-${index}`} className="flex items-center gap-2 border border-border/50 bg-muted/20 px-3 py-2">
<div className="min-w-0 flex-1 truncate font-mono text-xs text-text-secondary">{slotLabel(slot)}</div>
<Button size="sm" outlined onClick={() => setPicker({ kind: "reference", index })}>Change</Button>
<Button size="sm" ghost disabled={preset.reference_models.length <= 1} onClick={() => updateSelectedPreset((prev) => ({ ...prev, reference_models: prev.reference_models.filter((_, i) => i !== index) }))}>Remove</Button>
</div>
))}
<Button size="sm" outlined onClick={() => updateSelectedPreset((prev) => ({ ...prev, reference_models: [...prev.reference_models, prev.aggregator] }))}>Add reference model</Button>
</div>
<div className="space-y-2">
<div className="text-display text-xs font-medium tracking-wider">Aggregator</div>
<div className="flex items-center gap-2 border border-border/50 bg-muted/20 px-3 py-2">
<div className="min-w-0 flex-1 truncate font-mono text-xs text-text-secondary">{slotLabel(preset.aggregator)}</div>
<Button size="sm" outlined onClick={() => setPicker({ kind: "aggregator" })}>Change</Button>
</div>
</div>
{error && <div className="text-xs text-destructive">{error}</div>}
<div className="flex justify-end gap-2 pt-2">
<Button ghost onClick={onClose} disabled={busy}>Cancel</Button>
<Button onClick={save} disabled={busy}>{busy ? "Saving…" : "Save"}</Button>
</div>
</CardContent>
</Card>
{picker && (
<ModelPickerDialog
key={`moa-picker-${refreshKey}-${selected}-${picker.kind}-${picker.kind === "reference" ? picker.index : "agg"}`}
loader={api.getModelOptions}
alwaysGlobal
title="Select MoA Model"
onApply={async ({ provider, model }) => {
updateSelectedPreset((prev) => {
if (picker.kind === "aggregator") return { ...prev, aggregator: { provider, model } };
return {
...prev,
reference_models: prev.reference_models.map((slot, i) => i === picker.index ? { provider, model } : slot),
};
});
}}
onClose={() => setPicker(null)}
/>
)}
</div>
);
}
function ModelSettingsPanel({
aux,
refreshKey,
@ -697,6 +871,8 @@ function ModelSettingsPanel({
onSaved(): void;
}) {
const [auxModalOpen, setAuxModalOpen] = useState(false);
const [moaModalOpen, setMoaModalOpen] = useState(false);
const [moa, setMoa] = useState<MoaConfigResponse | null>(null);
const [picker, setPicker] = useState<PickerTarget | null>(null);
const [pendingReloadModel, setPendingReloadModel] = useState<string | null>(
null,
@ -705,6 +881,10 @@ function ModelSettingsPanel({
const mainProv = aux?.main.provider ?? "";
const mainModel = aux?.main.model ?? "";
useEffect(() => {
api.getMoaModels().then(setMoa).catch(() => setMoa(null));
}, [refreshKey]);
const applyAssignment = async ({
scope,
task,
@ -796,6 +976,31 @@ function ModelSettingsPanel({
</Button>
</div>
<div className="flex min-w-0 flex-col gap-2 bg-muted/20 border border-border/50 px-3 py-2 sm:flex-row sm:items-center sm:justify-between sm:gap-3">
<div className="min-w-0 flex-1">
<div className="flex items-center gap-2 mb-0.5">
<Brain className="h-3 w-3 text-text-tertiary" />
<span className="text-display text-xs font-medium tracking-wider">
Mixture of Agents
</span>
</div>
<div className="text-xs font-mono text-text-secondary truncate">
{moa
? `${moa.reference_models.length} reference${moa.reference_models.length === 1 ? "" : "s"} · ${moa.aggregator.provider}/${shortModelName(moa.aggregator.model)}`
: "not loaded"}
</div>
</div>
<Button
size="sm"
outlined
onClick={() => setMoaModalOpen(true)}
disabled={!moa}
className="shrink-0 self-start text-xs uppercase sm:self-center"
>
Configure
</Button>
</div>
{picker && (
<ModelPickerDialog
key={`picker-${refreshKey}`}
@ -832,6 +1037,17 @@ function ModelSettingsPanel({
model={pendingReloadModel}
onCancel={() => setPendingReloadModel(null)}
/>
{moaModalOpen && moa && (
<MoaModelsModal
config={moa}
refreshKey={refreshKey}
onSaved={(next) => {
setMoa(next);
onSaved();
}}
onClose={() => setMoaModalOpen(false)}
/>
)}
</CardContent>
</Card>
);


@ -39,6 +39,7 @@ hermes [global-options] <command> [subcommand/options]
|---------|---------|
| `hermes chat` | Interactive or one-shot chat with the agent. |
| `hermes model` | Interactively choose the default provider and model. |
| `hermes moa` | Configure named Mixture of Agents presets used by `/moa`. |
| `hermes fallback` | Manage fallback providers tried when the primary model errors. |
| `hermes gateway` | Run or manage the messaging gateway service. |
| `hermes proxy` | Local OpenAI-compatible proxy that attaches OAuth provider credentials. See [Subscription Proxy](../user-guide/features/subscription-proxy.md). |
@ -1119,6 +1120,18 @@ On a fresh install the first scheduled pass is deferred by one full `interval_ho
See [Curator](../user-guide/features/curator.md) for behavior and config.
## `hermes moa`
Configure named Mixture of Agents presets used by the `/moa` slash command.
```bash
hermes moa list
hermes moa configure [name]
hermes moa delete <name>
```
`hermes moa configure` reuses Hermes' provider → model picker for each reference model and the aggregator. Each saved preset then appears as a selectable model under the virtual `moa` provider, with its aggregator as the acting model.
## `hermes fallback`
```bash


@ -8,7 +8,7 @@ description: "Authoritative reference for Hermes built-in tools, grouped by tool
This page documents Hermes' built-in tools, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
**Quick counts (current registry):** ~71 tools — 10 browser tools (core) + 2 CDP-gated browser tools, 4 file tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, 7 Spotify tools (registered by the bundled `spotify` plugin), 5 Yuanbao tools, 9 kanban tools (registered when the kanban dispatcher spawns the agent), 2 Discord tools, and a handful of standalone tools (`memory`, `clarify`, `delegate_task`, `execute_code`, `cronjob`, `session_search`, `skill_view`/`skill_manage`/`skills_list`, `text_to_speech`, `image_generate`, `video_generate`, `vision_analyze`, `video_analyze`, `mixture_of_agents`, `send_message`, `todo`, `computer_use`, `process`).
**Quick counts (current registry):** ~71 tools — 10 browser tools (core) + 2 CDP-gated browser tools, 4 file tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, 7 Spotify tools (registered by the bundled `spotify` plugin), 5 Yuanbao tools, 9 kanban tools (registered when the kanban dispatcher spawns the agent), 2 Discord tools, and a handful of standalone tools (`memory`, `clarify`, `delegate_task`, `execute_code`, `cronjob`, `session_search`, `skill_view`/`skill_manage`/`skills_list`, `text_to_speech`, `image_generate`, `video_generate`, `vision_analyze`, `video_analyze`, `send_message`, `todo`, `computer_use`, `process`).
:::tip MCP Tools
In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with the prefix `mcp_<server>_` (e.g., `mcp_github_create_issue` for the `github` MCP server). See [MCP Integration](/user-guide/features/mcp) for configuration.
@ -144,12 +144,6 @@ Registered when the agent is either (a) spawned by the kanban dispatcher (`HERME
|------|-------------|----------------------|
| `send_message` | Send a message to a connected messaging platform, or list available targets. IMPORTANT: When the user asks to send to a specific channel or person (not just a bare platform name), call send_message(action='list') FIRST to see available tar… | — |
## `moa` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `mixture_of_agents` | Route a hard problem through multiple frontier LLMs collaboratively. Makes 5 API calls (4 reference models + 1 aggregator) with maximum reasoning effort — use sparingly for genuinely difficult problems. Best for: complex math, advanced alg… | OPENROUTER_API_KEY |
## `session_search` toolset
| Tool | Description | Requires environment |


@ -71,7 +71,6 @@ Or in-session:
| `kanban` | `kanban_block`, `kanban_comment`, `kanban_complete`, `kanban_create`, `kanban_heartbeat`, `kanban_link`, `kanban_list`, `kanban_show`, `kanban_unblock` | Multi-agent coordination tools. Registered for dispatcher-spawned task workers (`HERMES_KANBAN_TASK`) and for profiles that explicitly list the `kanban` toolset by name (the `all`/`*` wildcard does **not** enable it). Workers mark tasks done, block, heartbeat, comment, and create/link follow-up tasks; orchestrator profiles additionally get board-routing tools like list/unblock. |
| `memory` | `memory` | Persistent cross-session memory management. |
| `messaging` | `send_message` | Send messages to other platforms (Telegram, Discord, etc.) from within a session. |
| `moa` | `mixture_of_agents` | Multi-model consensus via Mixture of Agents. |
| `safe` | `image_generate`, `vision_analyze`, `web_extract`, `web_search` (via `includes`) | Read-only research + media generation. No file writes, no terminal, no code execution. |
| `search` | `web_search` | Web search only (without extract). |
| `session_search` | `session_search` | Search past conversation sessions. |


@ -552,7 +552,7 @@ cronjob(action="create", name="weekly-news-summary",
prompt="Summarize this week's AI news: ...")
```
When `enabled_toolsets` is set on a job it wins; otherwise the `hermes tools` cron-platform config wins; otherwise Hermes falls back to the built-in defaults. This matters for cost control: carrying `moa`, `browser`, `delegation` into every tiny "fetch news" job bloats the tool-schema prompt on every LLM call.
When `enabled_toolsets` is set on a job it wins; otherwise the `hermes tools` cron-platform config wins; otherwise Hermes falls back to the built-in defaults. This matters for cost control: carrying `browser` and `delegation` into every tiny "fetch news" job bloats the tool-schema prompt on every LLM call.
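The precedence described above can be sketched as a three-level fallback. This is a minimal illustration, not Hermes' actual resolver: the function name `effective_toolsets` and its parameters are hypothetical, and it assumes that "not set" is represented as `None`.

```python
from typing import Optional, Sequence


def effective_toolsets(
    job_enabled: Optional[Sequence[str]],
    platform_config: Optional[Sequence[str]],
    builtin_defaults: Sequence[str],
) -> list[str]:
    """Pick the toolsets for a cron job (hypothetical helper).

    Assumption: None means "not configured" at that level.
    """
    if job_enabled is not None:
        # Per-job enabled_toolsets always wins.
        return list(job_enabled)
    if platform_config is not None:
        # Otherwise use the cron-platform config from `hermes tools`.
        return list(platform_config)
    # Otherwise fall back to the built-in defaults.
    return list(builtin_defaults)
```

Keeping the per-job list short is what limits the tool-schema size sent on each LLM call.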
### Skipping the agent entirely: `wakeAgent`


@ -0,0 +1,115 @@
---
sidebar_position: 7
title: "Mixture of Agents"
description: "Create named MoA presets that appear as selectable models under the Mixture of Agents provider"
---
# Mixture of Agents
Mixture of Agents is a virtual model provider. Each named MoA preset appears as a selectable model under the `moa` provider.
When you select a MoA preset, the preset's aggregator is the acting model. It is the model that writes the assistant response and emits tool calls. Reference models run first and provide analysis for the aggregator to use.
Use MoA when a hard task benefits from multiple model perspectives but still needs Hermes' normal agent loop: tool calls, follow-up iterations, interrupts, transcript persistence, and the same session context as any other message.
## Select a MoA preset as your model
You can select a preset through the normal model picker surfaces:
```bash
/model default --provider moa
/model review --provider moa
```
The Dashboard, TUI, and Desktop model pickers also show a `Mixture of Agents` provider row. Its models are your configured preset names.
## Slash command shortcut
`/moa` is convenience sugar over model selection:
```bash
/moa
```
Switches the current session to the default MoA preset.
```bash
/moa review
```
If `review` exactly matches a preset name, switches the current session to provider `moa`, model `review`.
```bash
/moa design and implement a migration plan for this flaky test cluster
```
If the text does not exactly match a preset name, Hermes treats it as a one-shot prompt. It temporarily switches to the default MoA preset for that turn, sends the prompt, then restores the previous model afterward.
Preset matching is exact on purpose. Hermes does not fuzzy-match preset names, so normal prompts cannot accidentally become model switches.
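The three `/moa` behaviors above can be sketched as a small dispatch function. This is an illustrative sketch only: `MoaAction` and `resolve_moa_command` are hypothetical names, and it assumes that matching is case-sensitive after trimming surrounding whitespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MoaAction:
    kind: str                     # "switch" (persistent) or "one_shot" (single turn)
    preset: str                   # preset used for the turn
    prompt: Optional[str] = None  # only set for one-shot prompts


def resolve_moa_command(arg: str, presets: set[str], default_preset: str) -> MoaAction:
    """Decide what `/moa <arg>` means (hypothetical helper)."""
    text = arg.strip()
    if not text:
        # Bare `/moa`: switch the session to the default preset.
        return MoaAction("switch", default_preset)
    if text in presets:
        # Exact preset name: switch the session to that preset.
        return MoaAction("switch", text)
    # Anything else is a one-shot prompt sent through the default preset;
    # the caller restores the previous model after the turn.
    return MoaAction("one_shot", default_preset, prompt=text)
```

Because the check is a plain set membership test, a prompt such as `review this module` never collides with a preset named `review`.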
## How it works in the agent loop
For each main model call when provider `moa` is selected, Hermes:
1. resolves the selected preset by name;
2. runs the configured reference models without tool schemas (they receive only the conversation's user/assistant text — not the Hermes system prompt or tool-call transcript — so reference calls stay cheap and avoid strict-provider rejections);
3. appends the reference outputs as private context for the aggregator;
4. calls the configured aggregator with the normal Hermes tool schema;
5. treats the aggregator response as the real model response;
6. executes any tools the aggregator calls, exactly as it would for any other model;
7. repeats the same MoA process on the next model iteration over the updated conversation, including tool results.
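The steps above can be sketched as a single function. This is a simplified illustration, not the code in `agent/moa_loop.py`: `call_model`, the `(provider, model)` tuple shape, and the way reference output is appended as an extra message are all assumptions made for the example. It does follow the pattern the commit describes: a bounded `ThreadPoolExecutor` fan-out, preserved output order, and per-reference failure isolation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical shapes: a model is a (provider, model) pair, and call_model
# stands in for whatever client actually performs the request.
Model = tuple[str, str]
CallModel = Callable[[Model, list[dict], bool], str]  # (model, messages, with_tools)


def text_only(messages: list[dict]) -> list[dict]:
    """Keep only user/assistant text; drop system prompt and tool traffic."""
    return [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m.get("role") in ("user", "assistant") and isinstance(m.get("content"), str)
    ]


def run_reference(call_model: CallModel, model: Model, messages: list[dict]) -> str:
    """Run one reference; a failure becomes text instead of aborting the turn."""
    if model[0] == "moa":
        return "[skipped: a MoA preset cannot be used as a reference]"
    try:
        return call_model(model, messages, False)  # no tool schemas
    except Exception as exc:
        return f"[reference failed: {exc}]"


def moa_model_call(preset: dict, messages: list[dict], call_model: CallModel,
                   max_workers: int = 4) -> str:
    refs: list[Model] = preset["reference_models"]
    stripped = text_only(messages)
    workers = max(1, min(max_workers, len(refs)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, even if calls finish out of order.
        outputs = list(pool.map(lambda m: run_reference(call_model, m, stripped), refs))
    notes = "\n\n".join(f"[{p}/{name}]\n{out}" for (p, name), out in zip(refs, outputs))
    # Assumption: reference output is passed to the aggregator as one extra message.
    context = messages + [{"role": "user", "content": "Reference analysis (private):\n" + notes}]
    return call_model(preset["aggregator"], context, True)  # normal tool schema
```

The caller treats the return value as the real model response, so tool execution and the next iteration proceed as they would for any other model.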
Because MoA is selected through the normal model system, it composes automatically with `/goal`, gateway sessions, TUI sessions, and Desktop chat.
## Configure presets
You can configure named MoA presets from:
- Dashboard → Models → Model Settings → Mixture of Agents
- Desktop app → Settings → Model → Mixture of Agents
- `hermes moa configure [name]`
- `config.yaml`
The config stores explicit provider/model pairs, so you can mix providers and use multiple models from the same provider:
```yaml
moa:
  default_preset: default
  presets:
    default:
      reference_models:
        - provider: openai-codex
          model: gpt-5.5
        - provider: openrouter
          model: deepseek/deepseek-v4-pro
      aggregator:
        provider: openrouter
        model: anthropic/claude-opus-4.8
      reference_temperature: 0.6
      aggregator_temperature: 0.4
      max_tokens: 4096
      enabled: true
```
Default preset:
- reference: `openai-codex:gpt-5.5`
- reference: `openrouter:deepseek/deepseek-v4-pro`
- aggregator / acting model: `openrouter:anthropic/claude-opus-4.8`
## Terminal preset management
```bash
hermes moa list
hermes moa configure # update the default preset
hermes moa configure review # create or update a named preset
hermes moa delete review
```
## Notes
- MoA is no longer listed under `hermes tools`; there is no `moa` toolset to enable.
- Setting `enabled: false` on a preset disables the reference fan-out for that preset: the aggregator acts alone, exactly as if you selected it as a plain model. This is the per-preset off switch surfaced in the dashboard and desktop settings.
- A preset's aggregator cannot be another MoA preset. Recursive MoA trees are intentionally blocked.
- A credential failure on one reference model does not abort the turn. Hermes includes the failure in the reference context and continues with the output of the reference models that did return.
- MoA increases model-call count. A single model iteration can involve multiple reference calls plus the aggregator call.
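The notes on `enabled`, recursion, and call count can be made concrete with a small sketch. The helper names `label` and `planned_calls` are hypothetical, the preset is modeled as a plain dict mirroring the `config.yaml` shape, and treating a missing `enabled` key as `true` is an assumption.

```python
def label(model: dict) -> str:
    """Format a provider/model pair as `provider:model`."""
    return f'{model["provider"]}:{model["model"]}'


def planned_calls(preset: dict) -> list[str]:
    """List the model calls one MoA iteration would make (hypothetical helper)."""
    aggregator = preset["aggregator"]
    if aggregator["provider"] == "moa":
        # Recursive MoA trees are blocked.
        raise ValueError("a MoA preset cannot use another MoA preset as its aggregator")
    if not preset.get("enabled", True):  # assumption: missing key means enabled
        # Fan-out disabled: the aggregator acts alone.
        return [label(aggregator)]
    return [label(m) for m in preset["reference_models"]] + [label(aggregator)]


default_preset = {
    "reference_models": [
        {"provider": "openai-codex", "model": "gpt-5.5"},
        {"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"},
    ],
    "aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
    "enabled": True,
}

calls = planned_calls(default_preset)  # two reference calls plus the aggregator
```

With the default preset this yields three calls per model iteration; with `enabled: false` it yields one.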


@ -49,7 +49,7 @@ hermes tools
hermes tools
```
Common toolsets include `web`, `search`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, `messaging`, `spotify`, `discord`, `discord_admin`, `debugging`, and `safe`.
Common toolsets include `web`, `search`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, `messaging`, `spotify`, `discord`, `discord_admin`, `debugging`, and `safe`.
See [Toolsets Reference](/reference/toolsets-reference) for the full set, including platform presets such as `hermes-cli`, `hermes-telegram`, and dynamic MCP toolsets like `mcp-<server>`.


@ -455,7 +455,6 @@ Enable/disable via `hermes tools` (interactive) or `hermes tools enable/disable
| `feishu_drive` | Feishu (Lark) drive tools |
| `yuanbao` | Yuanbao integration tools |
| `rl` | Reinforcement learning tools (off by default) |
| `moa` | Mixture of Agents (off by default) |
Full enumeration lives in `toolsets.py` as the `TOOLSETS` dict; `_HERMES_CORE_TOOLS` is the default bundle most platforms inherit from.


@ -8,7 +8,7 @@ description: "Hermes 内置工具权威参考,按工具集分组"
本页记录 Hermes 的内置工具,按工具集分组。可用性因平台、凭据和已启用的工具集而异。
**当前注册表快速统计:** 约 71 个工具 —— 10 个浏览器工具(核心)+ 2 个 CDP 门控浏览器工具、4 个文件工具、4 个 Home Assistant 工具、2 个终端工具、2 个 Web 工具、5 个 Feishu 工具、7 个 Spotify 工具(由内置 `spotify` 插件注册)、5 个 Yuanbao 工具、9 个 kanban 工具(在 kanban 调度器生成 agent 时注册)、2 个 Discord 工具,以及若干独立工具(`memory`、`clarify`、`delegate_task`、`execute_code`、`cronjob`、`session_search`、`skill_view`/`skill_manage`/`skills_list`、`text_to_speech`、`image_generate`、`video_generate`、`vision_analyze`、`video_analyze`、`mixture_of_agents`、`send_message`、`todo`、`computer_use`、`process`)。
**当前注册表快速统计:** 约 71 个工具 —— 10 个浏览器工具(核心)+ 2 个 CDP 门控浏览器工具、4 个文件工具、4 个 Home Assistant 工具、2 个终端工具、2 个 Web 工具、5 个 Feishu 工具、7 个 Spotify 工具(由内置 `spotify` 插件注册)、5 个 Yuanbao 工具、9 个 kanban 工具(在 kanban 调度器生成 agent 时注册)、2 个 Discord 工具,以及若干独立工具(`memory`、`clarify`、`delegate_task`、`execute_code`、`cronjob`、`session_search`、`skill_view`/`skill_manage`/`skills_list`、`text_to_speech`、`image_generate`、`video_generate`、`vision_analyze`、`video_analyze`、`send_message`、`todo`、`computer_use`、`process`)。
:::tip MCP 工具
除内置工具外,Hermes 还可从 MCP 服务器动态加载工具。MCP 工具以 `mcp_<server>_` 为前缀(例如,`github` MCP 服务器的 `mcp_github_create_issue`)。配置方法见 [MCP 集成](/user-guide/features/mcp)。
@ -143,12 +143,6 @@ description: "Hermes 内置工具权威参考,按工具集分组"
|------|------|----------|
| `send_message` | 向已连接的消息平台发送消息,或列出可用目标。重要:当用户要求发送到特定频道或人员(而非仅平台名称)时,请先调用 `send_message(action='list')` 查看可用目标… | — |
## `moa` 工具集
| 工具 | 描述 | 所需环境 |
|------|------|----------|
| `mixture_of_agents` | 将难题路由给多个前沿 LLM 协作处理。进行 5 次 API 调用(4 个参考模型 + 1 个聚合器),以最大推理力度运行——请谨慎用于真正困难的问题。最适合:复杂数学、高级算法… | OPENROUTER_API_KEY |
## `session_search` 工具集
| 工具 | 描述 | 所需环境 |


@ -70,7 +70,6 @@ hermes tools # curses UI to enable/disable per platfo
| `kanban` | `kanban_block`, `kanban_comment`, `kanban_complete`, `kanban_create`, `kanban_heartbeat`, `kanban_link`, `kanban_list`, `kanban_show`, `kanban_unblock` | 多 agent 协调工具。为调度器生成的任务工作者(`HERMES_KANBAN_TASK`)以及显式启用 `kanban` 工具集的 profile 注册。工作者可标记任务完成、阻塞、心跳、评论以及创建/关联后续任务;编排器 profile 还额外获得看板路由工具,如 list/unblock。 |
| `memory` | `memory` | 持久化跨会话记忆管理。 |
| `messaging` | `send_message` | 在会话中向其他平台(Telegram、Discord 等)发送消息。 |
| `moa` | `mixture_of_agents` | 通过 Mixture of Agents 实现多模型共识。 |
| `safe` | `image_generate`, `vision_analyze`, `web_extract`, `web_search`(通过 `includes`) | 只读研究 + 媒体生成。无文件写入、无终端、无代码执行。 |
| `search` | `web_search` | 仅网页搜索(不含提取)。 |
| `session_search` | `session_search` | 搜索历史会话记录。 |


@ -69,6 +69,7 @@ const sidebars: SidebarsConfig = {
'user-guide/features/honcho',
'user-guide/features/context-files',
'user-guide/features/context-references',
'user-guide/features/mixture-of-agents',
'user-guide/features/personality',
'user-guide/features/skins',
'user-guide/features/plugins',