mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-23 10:42:00 +00:00
docs(config): document auxiliary task fallback_chain
parent 5eb158e317
commit cc30e0b659
3 changed files with 77 additions and 7 deletions
@@ -1006,6 +1006,13 @@ auxiliary:
   # Context compression timeout (separate from compression.* config)
   compression:
     timeout: 120  # seconds; compression summarizes long conversations and needs more time
+    # fallback_chain:  # Optional: providers to try on rate-limit / connectivity failure
+    #   - provider: nous
+    #     model: deepseek/deepseek-chat
+    #   - provider: openrouter
+    #     model: google/gemini-2.5-flash
+    #     base_url: ""
+    #     api_key: ""

   # Auto-generated session titles. Empty language follows the conversation;
   # set e.g. "English" or "Japanese" to pin titles to one language.
@@ -1054,6 +1061,34 @@ Each auxiliary task has a configurable `timeout` (in seconds). Defaults: vision
 Context compression has its own `compression:` block for thresholds and an `auxiliary.compression:` block for model/provider settings; see [Context Compression](#context-compression) above. The primary fallback chain uses a top-level `fallback_providers:` list; see [Fallback Providers](/integrations/providers#fallback-providers). All three follow the same provider/model/base_url pattern.
 :::

+### Per-task fallback chain for auxiliary tasks
+
+Each auxiliary task can optionally define a `fallback_chain`: a list of provider/model entries that Hermes tries when the primary auxiliary provider fails because of rate limits, connectivity issues, or payment restrictions:
+
+```yaml
+auxiliary:
+  compression:
+    provider: openrouter
+    model: openai/gpt-4o-mini
+    fallback_chain:
+      - provider: nous
+        model: deepseek/deepseek-chat
+      - provider: openrouter
+        model: google/gemini-2.5-flash
+```
+
+When the primary auxiliary provider (`openrouter` / `openai/gpt-4o-mini`) returns a rate-limit, connection-timeout, or payment-required error, Hermes walks the `fallback_chain` in order. It skips entries whose provider matches the already-failed provider and tries each remaining entry until one succeeds or the chain is exhausted. If all fallbacks fail, Hermes falls back to the main agent model as a final safety net.
+
+Each entry supports the same three knobs as any auxiliary task config:
+
+| Key | Description |
+|-----|-------------|
+| `provider` | Provider name (`nous`, `openrouter`, `anthropic`, `gemini`, `main`, etc.) |
+| `model` | Model name for that provider |
+| `base_url` | (Optional) Custom OpenAI-compatible endpoint |
+
+`fallback_chain` is available on any auxiliary task: `compression`, `vision`, `web_extract`, `approval`, `skills_hub`, `mcp`, etc.
+
 ### OpenRouter routing & Pareto Code for auxiliary tasks

 When an auxiliary task resolves to OpenRouter (either explicitly or via `provider: "main"` while your main agent is on OpenRouter), the main agent's `provider_routing` and `openrouter.min_coding_score` settings **do not propagate**: by design, each auxiliary task is independent. To set OpenRouter provider preferences or use the [Pareto Code router](/integrations/providers#openrouter-pareto-code-router) for a specific aux task, set them per-task via `extra_body`:
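
The fallback walk documented in the added "Per-task fallback chain for auxiliary tasks" section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Hermes's implementation: `CapacityError`, `Entry`, `run_auxiliary`, `call_provider`, and the main-agent model name are hypothetical, and all rate-limit, timeout, and payment errors are collapsed into one exception type.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class CapacityError(Exception):
    """Hypothetical stand-in for rate-limit, timeout, or payment-required errors."""


@dataclass(frozen=True)
class Entry:
    provider: str
    model: str
    base_url: Optional[str] = None  # optional custom OpenAI-compatible endpoint


def run_auxiliary(
    primary: Entry,
    fallback_chain: List[Entry],
    main_agent: Entry,
    call_provider: Callable[[Entry], str],
) -> str:
    """Try the primary, then the chain in order, then the main agent model."""
    try:
        return call_provider(primary)
    except CapacityError:
        pass  # only capacity-style failures start the walk; other errors propagate

    for entry in fallback_chain:
        if entry.provider == primary.provider:
            continue  # skip entries on the provider that already failed
        try:
            return call_provider(entry)
        except CapacityError:
            continue  # this fallback failed too; move to the next entry

    # Final safety net: the main agent model. If it fails, the error is raised.
    return call_provider(main_agent)


# Usage with the chain from the documented example and a fake provider call.
primary = Entry("openrouter", "openai/gpt-4o-mini")
chain = [
    Entry("nous", "deepseek/deepseek-chat"),
    Entry("openrouter", "google/gemini-2.5-flash"),
]
main_agent = Entry("anthropic", "example-main-model")  # hypothetical main model

attempts: List[str] = []


def fake_call(entry: Entry) -> str:
    attempts.append(f"{entry.provider}:{entry.model}")
    if entry.provider == "openrouter":
        raise CapacityError("rate limited")
    return f"summary from {entry.provider}"


result = run_auxiliary(primary, chain, main_agent, fake_call)
```

Under the skip rule as documented, the second entry in the example chain (`openrouter` / `google/gemini-2.5-flash`) is never attempted when the primary `openrouter` call is the one that failed, so in this sketch only the `nous` entry and the main agent model remain as candidates.
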
@@ -820,6 +820,13 @@ auxiliary:
   # Context compression timeout (separate from the compression.* config)
   compression:
     timeout: 120  # seconds; compression summarizes long conversations and needs more time
+    # fallback_chain:  # Optional: providers to try on rate-limit / connection failure
+    #   - provider: nous
+    #     model: deepseek/deepseek-chat
+    #   - provider: openrouter
+    #     model: google/gemini-2.5-flash
+    #     base_url: ""
+    #     api_key: ""

   # Skills hub: skill matching and search
   skills_hub:
@@ -855,9 +862,37 @@ auxiliary:
 :::

 :::info
-Context compression has its own `compression:` block for thresholds and an `auxiliary.compression:` block for model/provider settings; see [Context Compression](#context-compression) above. The fallback model uses a `fallback_model:` block; see [Fallback Model](/integrations/providers#fallback-model). All three follow the same provider/model/base_url pattern.
+Context compression has its own `compression:` block for thresholds and an `auxiliary.compression:` block for model/provider settings; see [Context Compression](#context-compression) above. The primary fallback chain uses a top-level `fallback_providers:` list; see [Fallback Providers](/integrations/providers#fallback-providers). All three follow the same provider/model/base_url pattern.
 :::

+### Per-task fallback chain for auxiliary tasks
+
+Each auxiliary task can optionally define a `fallback_chain`: a list of provider/model entries that Hermes tries when the primary auxiliary provider fails because of rate limits, network connectivity problems, or payment restrictions:
+
+```yaml
+auxiliary:
+  compression:
+    provider: openrouter
+    model: openai/gpt-4o-mini
+    fallback_chain:
+      - provider: nous
+        model: deepseek/deepseek-chat
+      - provider: openrouter
+        model: google/gemini-2.5-flash
+```
+
+When the primary auxiliary provider (`openrouter` / `openai/gpt-4o-mini`) returns a rate-limit, connection-timeout, or payment-required error, Hermes walks the `fallback_chain` in order. It skips entries whose provider matches the already-failed provider and tries each remaining entry until one succeeds or the chain is exhausted. If all fallbacks fail, Hermes falls back to the main agent model as a final safety net.
+
+Each entry supports the same three knobs as any auxiliary task config:
+
+| Key | Description |
+|-----|-------------|
+| `provider` | Provider name (`nous`, `openrouter`, `anthropic`, `gemini`, `main`, etc.) |
+| `model` | Model name for that provider |
+| `base_url` | (Optional) Custom OpenAI-compatible endpoint |
+
+`fallback_chain` is available on any auxiliary task: `compression`, `vision`, `web_extract`, `approval`, `skills_hub`, `mcp`, etc.
+
 ### OpenRouter routing and Pareto Code for auxiliary tasks

 When an auxiliary task resolves to OpenRouter (either explicitly or via `provider: "main"` while your main agent is on OpenRouter), the main agent's `provider_routing` and `openrouter.min_coding_score` settings **do not propagate**: by design, each auxiliary task is independent. To set OpenRouter provider preferences or use the [Pareto Code router](/integrations/providers#openrouter-pareto-code-router) for a specific auxiliary task, set them per-task via `extra_body`:
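@@ -166,12 +166,12 @@ fallback_model:
 |---------|-------------------|
 | CLI sessions | ✔ |
 | Messaging gateways (Telegram, Discord, etc.) | ✔ |
-| Subagent delegation | ✘ (subagents do not inherit the fallback configuration) |
-| Cron jobs | ✘ (run with a fixed provider) |
+| Subagent delegation | ✔ (subagents inherit the parent agent's fallback chain) |
+| Cron jobs | ✔ (cron agents inherit the configured fallback providers) |
 | Auxiliary tasks (vision, compression, etc.) | ✘ (use their own provider chain; see below) |

 :::tip
-`fallback_model` has no corresponding environment variable; it can only be configured through `config.yaml`. This is intentional: fallback configuration is a deliberate choice and should not be overridden by a stale shell export.
+There are no environment variables for the primary fallback chain; it can only be configured through `config.yaml` or `hermes fallback`. This is intentional: fallback configuration is a deliberate choice and should not be overridden by a stale shell export.
 :::

 ---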
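@@ -362,7 +362,7 @@ auxiliary:

 ## Delegation provider override

-Subagents spawned by `delegate_task` do **not** use the primary fallback model. They can, however, be routed to a different provider:model pair to optimize cost:
+Subagents spawned by `delegate_task` inherit the parent agent's primary fallback chain. You can still route subagents to a different primary provider:model pair for cost optimization:

 ```yaml
 delegation: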
@@ -378,7 +378,7 @@ delegation:

 ## Cron job providers

-Cron jobs run with the provider configured at execution time and do not support a fallback model. To use a different provider for a cron job, configure `provider` and `model` overrides on the cron job itself:
+Cron jobs inherit your configured `fallback_providers` chain (or the legacy `fallback_model`) when the agent is created. To use a different primary provider for a cron job, configure `provider` and `model` overrides on the cron job itself:

 ```python
 cronjob(
@@ -398,7 +398,7 @@ cronjob(

 | Feature | Fallback mechanism | Config location |
 |---------|-------------------|----------------|
-| Main agent model | `fallback_model` (in config.yaml): per-turn failover on error (the primary model is restored each turn) | `fallback_model:` (top level) |
+| Main agent model | `fallback_providers` (in config.yaml): per-turn failover on error (the primary model is restored each turn) | `fallback_providers:` (top-level list) |
 | Auxiliary task (any), auto users | Full auto-detection chain on capacity errors (main agent model first, then the provider chain) | `auxiliary.<task>.provider: auto` |
 | Auxiliary task (any), explicit provider | `fallback_chain` (if set) → main agent model → warn + raise; triggered only on capacity errors | `auxiliary.<task>.fallback_chain` |
 | Vision | Tiered (see above) + internal OpenRouter retry | `auxiliary.vision` |
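
The "per-turn failover" wording in the main agent model row can be read as: every turn starts on the primary model, and a failure only moves the rest of that turn down the `fallback_providers` list. The Python sketch below illustrates that reading only. `ProviderError`, `run_turn`, `call`, and the provider names are hypothetical, and nothing here is taken from Hermes internals.

```python
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error type standing in for a failed provider call."""


def run_turn(
    primary: str,
    fallback_providers: Sequence[str],
    call: Callable[[str], str],
) -> Tuple[str, str]:
    """Run one turn: start from the primary, fail over in list order on error."""
    last_error = None
    for provider in (primary, *fallback_providers):
        try:
            return provider, call(provider)
        except ProviderError as exc:
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers failed for this turn") from last_error


# Simulate two turns: the primary is down in turn 1 and healthy again in turn 2.
down = {"primary"}


def call(provider: str) -> str:
    if provider in down:
        raise ProviderError(provider)
    return f"reply via {provider}"


turn1 = run_turn("primary", ["backup-a", "backup-b"], call)
down.clear()
turn2 = run_turn("primary", ["backup-a", "backup-b"], call)
```

Because `run_turn` keeps no state between calls, the second turn goes back to the primary as soon as it is healthy, which is the "primary model is restored each turn" behavior the row describes.
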