After #19473 landed (enforce_max_runtime reads from task_runs.started_at
rather than tasks.started_at), a regression test added earlier still
only backdated the tasks column. Backdate both so the test is robust
regardless of which column the enforcer reads from.
Widens _verify_created_cards to also accept ids that are children of the
completing task in task_links. Previously we only accepted cards where
created_by matched the completing task's assignee, which was too strict
for legitimate orchestrator flows: a specifier creates a card (so
created_by=specifier, not worker), then a worker picks it up and passes
parents=[current_task] to kanban_create. The explicit link proves the
relationship and should be trusted.
Salvaged from #20022 @LeonSGP43 (full PR superseded by #20232 +
this patch; the linked-children relaxation was the portable
improvement).
Salvage follow-up for PR #20344:
- AUTHOR_MAP entry for rob-maron (required by CI)
- 17 parametrized tests covering _is_arcee_trinity_thinking,
_fixed_temperature_for_model Trinity override, and
_compression_threshold_for_model, including sibling-model negatives
(trinity-large-preview, trinity-mini) and the OpenRouter slug form.
- Add fr.yaml with French translations for approval prompts and gateway messages
- Register 'fr' in SUPPORTED_LANGUAGES
- Add French aliases: french, français, fr-fr, fr-be, fr-ca, fr-ch
- Update locale sync comment in en.yaml
Reduces SSE event rate ~500/turn → ~20/turn via 50ms text-delta batching in
_dispatch(), which eliminates markdown re-render storms on Open WebUI. Also:
- Trim tool_call.arguments in the response.completed event to 100KB
(prevents silent hangs on 848KB+ single-line SSE events).
- Catch-all exception handlers in _write_sse_responses() + _write_sse_chat_completion()
emit a proper error chunk instead of TransferEncodingError from incomplete
chunked encoding when the agent crashes mid-stream.
- MAX_REQUEST_BYTES 1MB → 10MB; pass client_max_size to aiohttp Application to
avoid silent 400s on truncated request bodies for long conversations.
Salvage of #17552 (api_server portion only). The contrib/openwebui-filter/
payload from that PR — Open WebUI Filter Function + benchmark writeup — is
a client-side user-installable add-on and doesn't need to live in the repo;
dropped here. Closes#17537.
Co-authored-by: bogerman1 <93757150+bogerman1@users.noreply.github.com>
Mirrors the pattern already shipping in hindsight-integrations/openclaw:
probe `<api_url>/version` once per process, gate on Hindsight ≥ 0.5.0.
When supported, retains use a stable session-scoped `document_id`
(`session_id`) plus `update_mode='append'` so cross-process retains for
the same session merge into one document instead of producing
N-different-process-stamped duplicates. When unsupported (or probe
fails), fall back to the existing per-process unique
`f"{session_id}-{start_ts}"` document_id with no `update_mode` — the
resume-overwrite fix (#6654) keeps working unchanged on legacy servers.
Closes the dedup half of #20115. The proposed `document_id_strategy`
config knob isn't needed: auto-detection via the same /version probe
the OpenClaw plugin already uses gives the same outcome with no extra
config burden, and the choice is purely a function of what the server
can do.
Plumbing
--------
- Module-level helpers (`_meets_minimum_version`, `_fetch_hindsight_api_version`,
`_check_api_supports_update_mode_append`) cache the result per api_url
so every provider in the process gets one /version round-trip.
- One-time WARN logged when the API is older than 0.5.0, telling the
user to upgrade for cross-session deduplication.
- New instance helper `_resolve_retain_target(fallback_doc_id)` returns
`(document_id, update_mode)` based on cached capability. Wired into
`sync_turn` and the `on_session_switch` flush path.
- For local_embedded mode, the probe URL is taken from the running
client (`client.url`) so we hit the actual daemon port rather than
the configured default.
- `update_mode` is set on the per-item dict; `aretain_batch` already
threads `item['update_mode']` into the API call.
Tests
-----
- `TestUpdateModeAppendCapability` (5 cases): legacy fallback, modern
stable+append, per-url cache, one-time warn, flush-on-switch resolves
against the OLD session.
- Existing `_make_hindsight_provider` factory in the manager-side test
file extended to seed `_mode`/`_api_url`/`_api_key`/`_client` and stub
`_resolve_retain_target` so the bypass-init pattern keeps working.
E2E verified against installed `~/.hermes/hermes-agent`:
- Legacy probe (unreachable host) → `legacy-session-<ts>` doc_id,
no `update_mode`.
- Modern probe (live local_embedded 0.5.6 daemon) → stable
`modern-session` doc_id + `update_mode='append'`.
- `test_hermes_embedded_smoke.py` passes (90s).
The dispatch_once function already accepts a max_spawn parameter but the
gateway was calling it without passing any value, effectively ignoring
the configuration. This change reads kanban.max_spawn from config.yaml
and passes it through, allowing users to limit concurrent kanban tasks.
This prevents resource exhaustion scenarios where kanban dispatcher
spawns too many parallel workers on constrained hardware.
Closes#12954
- Add README.zh-CN.md with complete Simplified Chinese translation
- Add language switcher badge in README.md linking to Chinese version
- Add language switcher badge in README.zh-CN.md linking to English version
Step-by-step guide covering Ollama installation, model selection,
Hermes configuration, speed optimization, and optional gateway bot
setup — all running on local hardware with zero API cost.
Includes hardware requirements, model comparison table with tool-call
support status, context window tuning, GPU offloading tips, fallback
provider setup, troubleshooting, and cost comparison.
The dispatcher's circuit breaker only protected against spawn-side
failures (profile missing, workspace mount error, exec failure).
Workers that successfully spawned but then timed out or crashed
re-queued to ``ready`` with no counter increment, so the next tick
re-spawned them — loops forever until someone noticed. Reported
externally on Twitter (Forbidden Seeds) and confirmed by walking the
kernel: ``enforce_max_runtime`` flipped the task back to ready, emitted
a ``timed_out`` event, and never touched ``spawn_failures``; same for
``detect_crashed_workers``.
Fix: unify the counter across all non-success outcomes.
Schema
------
* ``tasks.spawn_failures`` → ``tasks.consecutive_failures``
* ``tasks.last_spawn_error`` → ``tasks.last_failure_error``
* Migration renames the columns in-place on existing DBs (``ALTER
TABLE RENAME COLUMN`` — SQLite >= 3.25) so historical counter
values are preserved. Row mappers fall through to the legacy names
if both column renames and a migration somehow got out of sync.
Counter lifecycle
-----------------
New helper ``_record_task_failure(conn, task_id, error, *, outcome,
release_claim, end_run, event_payload_extra)`` is the single point
every non-success outcome funnels through:
* ``spawn_failed`` → ``_record_spawn_failure`` (kept as alias)
calls it with ``release_claim=True, end_run=True`` — transitions
running→ready, clears claim, closes run.
* ``timed_out`` → ``enforce_max_runtime`` already does the status
transition + run close + event emission, then calls
``_record_task_failure`` with ``release_claim=False, end_run=False``
just to bump the counter (and trip the breaker if needed).
* ``crashed`` → ``detect_crashed_workers`` same pattern, but the
counter increment runs after the main write_txn closes (SQLite
doesn't nest write transactions).
If the counter hits the breaker threshold (``DEFAULT_FAILURE_LIMIT=5``,
same as before), the task transitions to ``blocked`` with a ``gave_up``
event on top of whatever outcome-specific event was already emitted.
Reset semantics changed: the counter now clears only on successful
``complete_task`` (and operator ``reclaim_task`` — an explicit "I've
looked at this, try again with a fresh budget"). Previously
``_clear_spawn_failures`` ran on every successful spawn, which would
have wiped the counter before a timeout could accumulate past threshold
— exactly the loop this fix prevents.
Diagnostics
-----------
* ``_rule_repeated_spawn_failures`` → ``_rule_repeated_failures``. Now
fires regardless of which outcome is at fault. Classifies the most
recent failure (spawn_failed / timed_out / crashed) from the run
history so the title ("Agent timeout x3", "Agent crash x4", "Agent
spawn x5") and suggested action (``doctor`` for spawn, ``log`` for
timeout/crash) stay outcome-specific without N duplicate rules.
* ``_rule_repeated_crashes`` kept as a narrower early-warning at
threshold 2 (vs 3 for the unified rule), but now suppresses itself
when the unified rule would also fire — avoids double-flagging.
* Diagnostic ``data`` payload now carries
``{consecutive_failures, most_recent_outcome, last_error}`` instead
of spawn-specific keys.
CLI
---
* ``Task.consecutive_failures`` / ``Task.last_failure_error`` are the
public fields now. Existing callers that referenced the old names
get migrated (tests updated in this commit).
* Backward-compat: ``DEFAULT_SPAWN_FAILURE_LIMIT``,
``_clear_spawn_failures``, ``_record_spawn_failure`` stay as aliases.
Tests
-----
* 6 new kernel tests: timeout increments counter, 3 consecutive
timeouts trip the breaker (was the reported gap), crash increments
counter, reclaim clears counter, completion clears counter, spawn
success does NOT clear counter.
* Diagnostic tests: updated ``repeated_spawn_failures`` cases to use
the new kind name and add a timeout-loop test.
* Dashboard API test: spawn_failures column update → consecutive_failures.
389/389 kanban-suite tests pass.
Live verification
-----------------
Seeded 4 tasks in an isolated HERMES_HOME: 3 timeouts, 4 crashes,
2-spawn-failed + 2-timed-out, and a task that had prior failures but
completed successfully. Board correctly shows "!! 3 tasks need
attention" (the successful one has no badge because the counter
reset). Drawer for the timeout-loop task renders "Agent timeout x3"
with most_recent_outcome=timed_out and the "Check logs" suggested
action (not the spawn-flavoured "Verify profile"). The successful
task has zero diagnostics.
Closes the Forbidden-Seeds-reported gap.
Salvage of #11350. Kept:
- Code: add an explicit /voice join Choice in the slash UI (runner accepts both 'join' and 'channel' but only 'channel' was in autocomplete).
- Docs: Server Members Intent is conditional (only needed if DISCORD_ALLOWED_USERS contains usernames); SSRC → user_id mapping uses the voice websocket SPEAKING opcode, not the Members intent.
Dropped from the original PR:
- HERMES_DISCORD_VOICE_PACKET_DUMP — this env var doesn't exist on main (it was in a different PR that isn't merged).
- DISCORD_PROXY docs — already documented on current main.
- DISCORD_ALLOW_MENTION_* docs — already on main.
- "barge-in mode" rewrite — current main actually does pause the listener during TTS (VoiceReceiver.pause() at discord.py:192); there is no barge_in_guard/barge_in_rms on main.
Co-authored-by: Michel Belleau <michel.belleau@malaiwah.com>
Salvage of #11758. The PR's original diff was stale (the Docker Compose section on main has been heavily refactored — dashboard is now an embedded side-process, not a separate service), so the useful bit (API server env var requirements) is applied as a note on the basic `docker run` example.
Co-authored-by: xiangyong <xiangyong@zspace.cn>
Adds a comprehensive guide for connecting Dockerized Hermes to local
inference servers like vLLM and Ollama, covering:
- Docker Compose networking (recommended)
- Standalone Docker run with host.docker.internal / --network host
- Connectivity verification steps
- Ollama-specific example
Closes#12308
AGENTS.md is the AI-assistant entry doc, so its counts get used as ground
truth. Several values had drifted, and the same drift had spread to a few
user-facing surfaces. Fixing all of them in one commit so the count claims
agree and clearly distinguish gateway-core from plugin-shipped platforms.
AGENTS.md:
- run_agent.py "~12k LOC" → "~14k LOC as of 2026-05-03" (actual 14,097)
- cli.py "~11k LOC" → "~12k LOC as of 2026-05-03" (actual 12,043)
- tools/environments/ list now lists all 7 user-selectable terminal backends
in canonical order, matching tools/terminal_tool.py:2214-2215
- gateway/platforms/ list adds yuanbao and wecom_callback; the 19 names
match the user-facing list at website/docs/integrations/index.md
- plugins/ tree now mentions plugins/platforms/ (irc, teams)
- tests/ snapshot "~15k tests across ~700 files as of Apr 2026" →
"~19k tests across ~890 files as of 2026-05-03"
User-facing count claims:
- hermes_cli/tips.py:195 — "19 platforms" → "21 messaging platforms" with
IRC and Microsoft Teams added to the named list
- website/docs/index.md:49 — "6 terminal backends" → "7 terminal backends:
..., Vercel Sandbox" (also corrected by PR #19044; same edit content)
- website/docs/index.md:50 — "15+ platforms from one gateway" → "21+ messaging
platforms (19 in the gateway, plus IRC and Microsoft Teams via plugins)"
- website/docs/integrations/index.md:83-85 — "15+ messaging platforms" → "19+",
added yuanbao to the linked list. The surrounding text scopes it to "configured
through the same gateway subsystem", so plugin platforms (IRC, Teams) are
intentionally not in this list
- website/scripts/generate-llms-txt.py:205 — "15+ platforms" → "21+ messaging
platforms — 19 native to the gateway plus IRC and Microsoft Teams via plugins"
LOC and date stamps follow the existing AGENTS.md "as of <date>" convention
(line 56 already used this pattern). Source of truth for the gateway count is
gateway/config.py:130-148 (PlatformID enum); plugin platforms live in
plugins/platforms/.
Out of scope:
- RELEASE_v0.9.0.md historical "16 platforms" claim (immutable history)
- userStories.json verbatim user quotes
- Programmatic count generation from gateway/config.py + plugin manifests
is a worthwhile build-system change but separate from these content fixes
README:24 claimed "Six terminal backends" while tools/environments/ exposes
seven top-level backend choices through TERMINAL_ENV: local, docker, ssh,
singularity, modal, daytona, vercel_sandbox. Modal additionally has direct
and Nous-managed modes selected via terminal.modal_mode (the
ManagedModalEnvironment class is a Modal sub-mode, not a separate top-level
backend).
The same drift appeared in five other doc and code-comment sites with
inconsistent counts (six, seven, or implicit) and varying lists. Updated
all sites to a consistent seven-backend list in canonical order. The
configuration guide also clarifies how Modal's two modes are selected so
operators do not search for a non-existent backend: managed_modal value.
CONTRIBUTING.md:160 lists six backend filenames in a code tree but does
not carry the "Six terminal" prose; left out of scope per cohesion sweep
guidance to bundle only identical wording.
Files updated:
- README.md (line 24, marketing copy)
- website/docs/index.md (line 49, landing page)
- website/docs/user-guide/configuration.md (line 86, config guide)
- tools/environments/__init__.py (lines 3-6, package docstring)
- tools/file_operations.py (line 6, module docstring)
- environments/README.md (line 43, RL training docs — TERMINAL_ENV list)
* fix(tui): close slash parity gaps with CLI
Route unsupported /skills subcommands through slash.exec, support /new <name>
titles, and handle /redraw natively so TUI behavior matches classic CLI. Also
filter gateway-only commands out of the TUI catalog while keeping /status
discoverable.
* fix(tui): run remaining CLI parity paths natively
Forward chat launch flags into the TUI runtime and handle live-session status
and skill reloads in the gateway process so TUI state no longer depends on the
slash worker's stale CLI instance.
* fix(tui): block stale snapshot restores
Prevent snapshot restore from running through the isolated slash worker because
it mutates disk state without refreshing the live TUI agent.
* chore: uptick
* fix(tui): guard async session title updates
Handle failures from the fire-and-forget session.title RPC so title-setting errors do not surface as unhandled promise rejections while preserving session-scoped messaging.
Three worked recipes for OpenAI-compatible cloud providers, plus the
Copilot HTTP 401 auto-recovery info block and the GMI Cloud row in the
compatible providers table. All three additions were on the original
docs/custom-providers-cookbook branch but its merge base predated 1186
main commits, making the rebase impractical (84k+ line conflict).
Replays just the providers.md additions onto current main.