Surface ready tasks that nobody claims within a threshold (default
30 min) regardless of why. One identity-agnostic signal that catches:
- Operator typo'd the assignee
- Profile was deleted, leaving its tasks stranded
- External worker pool (Codex CLI lane, custom daemon) is down
- Dispatcher misconfigured (wrong board / wrong HERMES_HOME)
Today the dispatcher correctly skips these (no respawn loop, good)
but nothing surfaces the fact that operator-actionable work is
accumulating. The new `stranded_in_ready` rule does that without
requiring a manual lane registry — it reads the most recent ready-
transition event (`created` / `promoted` / `reclaimed` / `unblocked`)
and fires when (now - last_ready_ts) > threshold.
Severity escalates with age: warning at threshold, error at 2x,
critical at 6x. The cli_hint and reassign actions point operators
at the right next step.
Out of scope deliberately:
- Lane registry (#20157 closed) — this signal supersedes it.
- Pushing the diagnostic into messaging gateways — diagnostics
are pull-only via 'hermes kanban diagnostics' for now; gateway
push is a separate UX decision.
Tests: 10 new + 461 existing kanban tests pass. E2E verified end-
to-end via 'hermes kanban diagnostics --json' against a 2h-old
stranded task — surfaces as error severity with correct actions.
Added tests/gateway/test_stream_consumer_draft.py with 11 tests
covering:
- Transport selection: auto+dm-supported -> draft; auto+group -> edit;
explicit edit; explicit draft on unsupported adapter -> edit;
MagicMock adapter -> edit (back-compat for the existing test suite).
- Happy path: DM stream animates draft frames with a single shared
draft_id, then finalizes via a regular adapter.send.
- Group fallback: drafts entirely skipped in non-DM chats.
- Failure fallback: send_draft returning success=False disables drafts
for the rest of the response.
- Draft_id lifecycle: consecutive responses use distinct ids; tool
boundaries bump the id so post-tool text animates fresh below the
tool-progress bubble (the openclaw #32535 leak guard).
- _already_sent contract: drafts must NOT set the flag so the gateway's
fallback final-send still fires (drafts have no message_id).
Updated website/docs/user-guide/messaging/telegram.md with a
'Streaming transport' section explaining auto|draft|edit|off, the
DM-only constraint, and the per-response fallback behaviour.
Closes the architectural-pin part of #19931. Most of what that issue
asked for is already implemented (logs under kanban root, env-pinned
workspace, dispatcher routing of unknown assignees, lifecycle
ownership, structured handoff conventions). What was missing:
1. A written contract integrators can point at when adding a new
worker lane shape, and
2. The "code-changing workers should not auto-promote success to
done" convention.
This commit ships both as docs+convention layered on existing primitives.
No kernel changes — the kanban_complete / kanban_block / kanban_comment
surfaces already support the review-required pattern; we just hadn't
written it down or made it visible to workers.
Changes:
- `agent/prompt_builder.py::KANBAN_GUIDANCE`: append the review-required
exception to step 5 of the lifecycle. Workers get the cue
auto-injected into their system prompt — drop structured metadata
into a kanban_comment first, then end with
kanban_block(reason="review-required: <summary>") instead of
kanban_complete when the work needs review. Total prompt size went
from ~3000 to ~3275 chars; well under the 4096 budget enforced by
test_kanban_guidance_size.
- `skills/devops/kanban-worker/SKILL.md`: add a worked example to the
existing "Good summary + metadata shapes" section between the
Coding-task and Research-task examples. Same shape as the others
(kanban_comment with structured handoff JSON, then kanban_block with
the human-readable reason). Plus a one-line guide on when to use
kanban_complete vs the review-required pattern.
- `website/docs/user-guide/features/kanban-worker-lanes.md` (new): the
integrator-facing contract. Covers the hierarchy, the three things
every lane must provide (assignee, spawn mechanism, lifecycle
terminator), the env vars the dispatcher injects, the
review-required convention, the failure modes the kernel handles
for free, and an explicit "external CLI worker lane" deferred-
pending-concrete-asker section that links to #19931 and #19924.
- `website/sidebars.ts`: link the new page under user-guide/features.
The "specialist worker lanes for external CLI tools (Codex / Claude
Code / OpenCode)" runner is NOT shipped here. The dispatcher's
spawn_fn parameter already supports plugin-shaped extension; the
per-CLI integration work (auth, sandbox policy, exit-code mapping)
needs a concrete asker. The new docs page tells would-be integrators
the contract any such lane must satisfy.
Refs #19931
Adds a Cross-Platform Handoff section to user-guide/sessions.md covering
the CLI flow, per-platform thread behavior (Telegram topics / Discord
threads / Slack message-anchored / no-thread fallback), failure modes,
and the resume-back-to-CLI loop.
Adds the /handoff entry to reference/slash-commands.md and updates the
CLI-only commands note.
The skill enumerated 8 specialist profile names (researcher, analyst,
writer, reviewer, backend-eng, frontend-eng, ops, pm) as "the standard
roster" and told orchestrators to "assume these exist." Almost no real
Hermes setup matches that fleet — single-profile setups, Docker-worker
setups, and curated-team setups all violate it — so following the skill
literally produced cards assigned to non-existent profiles, which the
dispatcher silently failed to spawn (no autocorrect, no fallback, just
sits in `ready` forever).
Changes:
- Drop the standard-specialist-roster table.
- Add a "Profiles are user-configured — not a fixed roster" section at
the top with a Step 0 that prescribes `hermes profile list` (or asking
the user) before fanning out. Cache the result in working memory.
- Rewrite the worked task-graph example with placeholder names
(<profile-A>, <profile-B>, <profile-C>) so the structure is still
teachable but doesn't invite copy-paste of role names that may not
exist.
- Reframe the "If no specialist fits" anti-temptation rule: don't
invent profile names; ask the user.
- Add a "Inventing profile names that doesn't exist" entry to Pitfalls.
- Bump skill version 2.0.0 → 3.0.0 (semantic break: previous behavior
promised a roster the skill no longer enumerates).
- Update website/docs/user-guide/features/kanban.md to drop the
matching "(researcher, writer, analyst, backend-eng, reviewer, ops)"
line and explain the discovery prompt instead.
- Re-run website/scripts/generate-skill-docs.py to refresh the
auto-generated skill page + catalog.
Closes#21131 in spirit — addresses the same hardcoded-names footgun
@yehuosi flagged, with a different shape than their PR (delete the
roster rather than replace each name with placeholder, since the
roster table was the load-bearing footgun and the worked example is
salvageable with placeholder profile names).
Co-authored-by: yehuosi <yehuosi@users.noreply.github.com>
* feat(gateway): per-platform admin/user split for slash commands
Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:
- allow_admin_from — full slash command access
- user_allowed_commands — what non-admins may run
- group_allow_admin_from — same, group/channel scope
- group_user_allowed_commands
When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.
Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.
Adds `/whoami` showing platform, scope, tier, and runnable commands.
Salvage of PR #4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.
Co-authored-by: ReqX <mike@grossmann.at>
* fix(gateway): close running-agent fast-path bypass + add coverage and central docs
The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.
Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.
Also added 18 more dispatch tests covering:
- Running-agent fast-path: blocks non-admin, allows admin, /status
always works
- Alias canonicalization (gate uses canonical name, not user alias)
- Unknown / unregistered commands pass through (don't false-positive)
- DM admin scope-locked when group has its own admin list
- Multi-platform isolation (Discord gated, Telegram unrestricted)
Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.
Co-authored-by: ReqX <mike@grossmann.at>
---------
Co-authored-by: ReqX <mike@grossmann.at>
Adds CDPSupervisor.evaluate_runtime() and wires it into _browser_eval as a
fast path when a supervisor is alive for the current task_id. Replaces the
~180ms agent-browser subprocess fork+exec+Node-startup hop with a ~1ms
Runtime.evaluate over the supervisor's already-connected WebSocket.
Falls through to the existing agent-browser CLI path when no supervisor is
running (e.g. backends without CDP, or before the first browser_navigate
attaches one), so behaviour is unchanged where it can't apply.
JS-side exceptions surface directly without falling through to the
subprocess (the subprocess would just re-raise the same error, slower);
supervisor-side failures (loop down, no session) fall through cleanly.
Benchmark — 30 iterations of `1 + 1` against headless Chrome:
supervisor WS mean= 0.96ms median= 0.91ms
agent-browser subprocess mean=179.35ms median=167.73ms
→ 187x speedup mean
Tests: 14 unit tests (mocked supervisor + response-shape coverage), 5
real-Chrome e2e tests in test_browser_supervisor.py (gated on Chrome
being installed). Browser test suite: 355 passed, 1 skipped.
* feat(plugins): host-owned LLM access via ctx.llm
Plugins can now ask the host to run a one-shot chat or structured
completion against the user's active model and auth, without ever
seeing an OAuth token or API key. Closes the gap where plugins that
needed bounded structured inference (receipts, CRM extraction,
support classification) had to either bring their own provider keys
or register a tool the agent had to call.
New surface on PluginContext:
- ctx.llm.complete(messages, ...)
- ctx.llm.complete_structured(instructions, input, json_schema, ...)
- async siblings ctx.llm.acomplete / acomplete_structured
Backed by the existing auxiliary_client.call_llm pipeline — every
provider, fallback chain, vision routing, and timeout policy Hermes
already supports applies automatically.
Trust gate (fail-closed by default):
- plugins.entries.<id>.llm.allow_model_override
- plugins.entries.<id>.llm.allowed_models (allowlist; '*' = any)
- plugins.entries.<id>.llm.allow_agent_id_override
- plugins.entries.<id>.llm.allow_profile_override
Embedded model@profile shorthand goes through the same gate as
explicit profile=, so it can't bypass the auth-profile policy.
Conflicting explicit and embedded profiles fail closed.
Also lands:
- plugins/plugin-llm-example/ — reference plugin that registers
/receipt-extract, demonstrating image+text structured input,
jsonschema validation, and the trust-gate config.
- website/docs/developer-guide/plugin-llm-access.md — full API docs.
- 45 unit tests covering trust gates, JSON parsing, schema
validation, image encoding, async surface, and config loading.
Validation:
- 2628 tests pass in tests/agent/
- E2E: bundled plugin loaded with isolated HERMES_HOME, slash
command produced parsed JSON via stubbed call_llm
- response_format extra_body wired correctly for both json_object
and json_schema modes
* docs(plugin-llm): rewrite quickstart and framing
The quickstart now uses a meeting-notes-to-tasks example instead of
a receipt extractor, and the page leads with hook-time / gateway
pre-filter / scheduled-job framing rather than the OpenClaw
KB/support/CRM/finance/migration enumeration that the original
upstream PR used. Receipt example moved to a separate worked
example link so the docs page itself doesn't echo any of the
upstream framing.
Also clarifies where ctx.llm fits in the broader plugin surface
(table comparing register_tool / register_platform / register_hook
/ etc.) and what makes this lane different from auxiliary_client
internals.
No code change.
* docs(plugin-llm): reframe as any LLM call, not just structured output
The original draft leaned heavily on complete_structured() and made
the chat lane (complete() / acomplete()) feel like a footnote.
Restructure so:
- The page title and description say 'any LLM call.'
- The lead shows BOTH a plain chat call (error rewriter) AND a
structured call (triage scorer) up top.
- Quick start has two complete plugin examples — /tldr (chat) and
/paste-to-tasks (structured).
- New 'When to use which' table for choosing complete() vs
complete_structured() vs the async siblings.
- Trust-gate sections explicitly note 'all four methods,' and the
request-shaping list calls out chat-only fields (messages) and
structured-only fields (instructions, input, json_schema)
alongside each other.
- The 'Where this fits' section now says 'for any reason,
structured or not.'
The receipt-extractor reference plugin still exists under
plugins/plugin-llm-example/ — but the docs page no longer treats
it as the canonical surface example. It's now described as 'a third
worked example, this time with image input.'
No code change.
* feat(plugin-llm): split provider/model into independent explicit kwargs
The first cut accepted a single 'provider/model' slug on every method
and split it internally. That looked clean but broke under live test:
the model-override path tried to use the slug's vendor prefix as a
literal Hermes provider id, which silently switched the user off
their aggregator (e.g. plugin asks for 'openai/gpt-4o-mini' on a user
who routes through OpenRouter — host attempted to call the 'openai'
provider directly, failed because OPENAI_API_KEY wasn't set).
New shape mirrors the host's main config:
ctx.llm.complete(
messages=[...],
provider='openrouter', # gated, optional
model='openai/gpt-4o-mini', # gated, optional
profile='work', # gated, optional
...
)
Each is independently gated by its own allow_*_override flag.
Granting model-override does NOT auto-grant provider-override.
Allowlists are now per-axis (allowed_providers, allowed_models)
matched literally against whatever string the plugin sends.
Dropped 'model@profile' embedded-suffix shorthand entirely. Hermes
doesn't use that pattern anywhere else; profile= is its own kwarg.
Live E2E (against real OpenRouter via Teknium's config) confirms:
- zero-config call works
- default-deny blocks each override with a helpful error
- model-only override stays on user's active provider (the bug)
- provider+model override switches cleanly
- allowlist refuses non-listed entries
- structured output round-trip parses + schema-validates
Tests: 49 cases (up from 45); all green. Docs updated to match the
new shape, including a 'most plugins never need this section' callout
on the trust-gate config block.
* fix+cleanup(plugin-llm): real attribution, hook-mode coverage, move example out of core
Three integration fixes for the ctx.llm surface:
1. Attribution bug — result.provider and result.model now reflect
what call_llm actually used, not placeholder fallbacks ('auto',
'default'). New _resolve_attribution() helper:
- explicit overrides win (what the call targeted)
- response.model wins for the recorded model (provider
canonicalisation: 'gpt-4o' → 'gpt-4o-2024-08-06' etc.)
- falls back to _read_main_provider() / _read_main_model()
when no override is set, so audit logs reflect the user's
active main provider/model
- 'auto' / 'default' only when EVERYTHING is empty
Live verified: zero-config call now records
provider='openrouter', model='anthropic/claude-4.7-opus-20260416'
instead of provider='auto', model='default'.
2. Hook-mode coverage — TestHookMode confirms ctx.llm.complete
works from inside a registered post_tool_call callback. The
docs page promised hook integration; now there's a test that
exercises the lazy-import path through the real invoke_hook
machinery. Two cases: traceback-rewrite hook with conditional
ctx.llm.complete, and minimal hook regression for the
sync-hook + sync-llm path.
3. Reference plugin moved out of core. plugins/plugin-llm-example/
is gone from hermes-agent — it now lives in the new
NousResearch/hermes-example-plugins companion repo. The docs
page links there. Hermes' bundled plugins should be plugins
users actually run; reference / docs-companion plugins live
externally.
Test count: 56 (up from 49). Wider sweep on tests/hermes_cli/
+ tests/gateway/ + tests/tools/ + tests/agent/ shows 16770
passing; the 12 failures are all pre-existing on origin/main
(verified by stashing this branch's changes and re-running) —
kanban-boards, delegate-task, gateway-restart, tts-routing —
none touch the plugin_llm surface.
* chore(plugins): move all example plugins to companion repo
Reference / docs-companion plugins now live exclusively in
NousResearch/hermes-example-plugins, not bundled with the core repo:
- example-dashboard
- strike-freedom-cockpit
A new fourth example, plugin-llm-async-example, was added to that
repo demonstrating ctx.llm's async surface (acomplete()) with
asyncio.gather() — registers /translate <lang>: <text> which fires
forward translation + sentiment classifier in parallel, then a
back-translation for QA. Live-tested at 2.5s for three real
provider round-trips (would be ~5-6s sequential).
Docs updated:
- developer-guide/plugin-llm-access.md links both sync and async
examples in the Reference section
- user-guide/features/extending-the-dashboard.md repoints both demo
sections to the companion repo with corrected install paths
- user-guide/features/built-in-plugins.md drops the two demo rows
- AGENTS.md notes that example plugins live in the companion repo
Net: hermes-agent's plugins/ directory now contains only plugins
users actually run (memory providers, dashboard tabs that ship real
features, the disk-cleanup hook, platform adapters). All four
demo / reference plugins live externally where they can be cloned
on demand instead of inflating the core install.
* feat(gateway): add LINE Messaging API platform plugin
Adds LINE as a bundled platform plugin under `plugins/platforms/line/`,
synthesized from the strongest pieces of seven open community PRs. The
adapter requires zero core edits — `Platform("line")` is auto-discovered
via the bundled-plugin scan in `gateway/config.py`, and all hooks
(setup, env-enablement, cron delivery, standalone send) are wired
through `register_platform()` kwargs the way IRC and Teams do it.
Highlights merged into one plugin:
- **Reply token preferred, Push fallback.** Try the free reply token
first (single-use, ~60s TTL); fall back to metered Push when the
token is absent, expired, or rejected. (PR #21023)
- **Slow-LLM Template Buttons postback.** When the LLM is still running
past `LINE_SLOW_RESPONSE_THRESHOLD` (default 45s), the adapter burns
the original reply token to send a "Get answer" button bubble. The
user taps it to fetch the cached answer via a fresh reply token —
also free. State machine: PENDING → READY → DELIVERED, ERROR for
cancelled runs (orphan resolves to `LINE_INTERRUPTED_TEXT` after
/stop). Set threshold to 0 to disable. (PR #18153)
- **Three-allowlist gating** — separate user / group / room allowlists
with `LINE_ALLOW_ALL_USERS=true` dev-only escape hatch. (PR #18153)
- **Markdown URL preservation.** Strip bold/italic/code-fence/heading
markers (LINE renders them literally) but keep `[label](url)` →
`label (url)` so URLs stay tappable. (PR #18153)
- **System-message bypass** for `⚡ Interrupting`, `⏳ Queued`, etc. —
busy-acks reach the user as visible bubbles instead of being
swallowed into the postback cache. (PR #18153)
- **Media via public HTTPS URLs.** LINE doesn't accept binary uploads;
images/audio/video must be HTTPS-reachable. The adapter serves
registered tempfiles under `/line/media/<token>/<filename>` from the
same aiohttp app. Allowed-roots traversal guard covers
`tempfile.gettempdir()`, `/tmp` (→ `/private/tmp` on macOS), and
`HERMES_HOME`. `LINE_PUBLIC_URL` overrides URL construction for
setups behind tunnels/proxies. (PR #8398)
- **5-message-per-call batching.** LINE rejects >5 messages per
Reply/Push; smart-chunker caps text at 4500 chars per bubble.
- **Inbound dedup** via `webhookEventId` LRU. (PR #21023)
- **Self-message filter** via `/v2/bot/info` userId lookup. (PR #21023)
- **Loading-animation indicator** wired to LINE's `chat/loading/start`
endpoint, DM-only (LINE rejects it for groups/rooms). (PR #21023)
- **Out-of-process cron delivery** via `_standalone_send`, so
`deliver: line` cron jobs work even when cron runs detached from
the gateway.
- **Webhook hardening** — 1 MiB body cap, constant-time HMAC-SHA256
signature verification, dedup, scoped lock so two profiles can't
bind the same channel.
Validation
----------
- `scripts/run_tests.sh tests/gateway/test_line_plugin.py` →
73 passed in 1.05s
- `scripts/run_tests.sh tests/gateway/test_line_plugin.py
tests/gateway/test_irc_adapter.py
tests/gateway/test_plugin_platform_interface.py
tests/gateway/test_platform_registry.py
tests/gateway/test_config.py` → 193 passed, 7 skipped
- E2E import + register + signature roundtrip + `Platform("line")`
bundled-plugin discovery verified against current `origin/main`.
Closes the seven open LINE PRs (#18153, #16832, #6676, #21023, #14942,
#14988, #8398) by superseding them with a single plugin-form
implementation that takes the best idea from each.
Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>
* docs(platforms): document platform-specific slow-LLM UX pattern
Add a 'Platform-Specific Slow-LLM UX' section to the platform-adapter
developer guide covering the _keep_typing override pattern that LINE
uses for its Template Buttons postback flow.
Three subsections:
- Pattern: subclass _keep_typing to layer mid-flight UX (with code)
- Pattern: subclass send to route through a cache instead of sending
- When this pattern is appropriate (vs. always-Push fallback)
Plus a short pointer in gateway/platforms/ADDING_A_PLATFORM.md so
tree-readers find the prose walkthrough on the docsite.
Filed because the LINE plugin (PR #23197) was the first bundled
adapter to need this pattern — every prior plugin (irc, teams,
google_chat) handles slow responses with the default typing-loop and
a regular send_text. Documenting now while the rationale is fresh.
---------
Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>
web_extract runs returned page content through the web_extract auxiliary
model when pages exceed 5 000 chars (single-pass up to 500k, chunked up
to 2M, refused above that). The user-guide page didn't mention this —
users were surprised that long-page extracts produced summaries instead
of raw markdown, and that those summaries cost main-model tokens by
default.
Adds:
- size-driven behavior table (under 5k / 5k–500k / 500k–2M / over 2M)
- which auxiliary task does the work (auxiliary.web_extract)
- how to route summaries to a cheap model regardless of main
- escape hatch: browser_navigate when you need raw content
- troubleshooting entry for summarization timeouts
These skills require heavy GPU/CUDA stacks or are niche enough that they shouldn't
be active by default. Moved to optional-skills/ where users opt-in via
`hermes skills install official/...`.
Moved:
- mlops/training/axolotl
- mlops/training/trl-fine-tuning
- mlops/training/unsloth
- mlops/inference/outlines
Counts: 91 -> 87 built-in, 72 -> 76 optional.
Auto-regenerated docs (per-skill pages + catalogs) reflect the move.
Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/,
guides/, and integrations/ against the live registries and gateway code.
messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)');
Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such
extra); list the actual deps (google-cloud-pubsub, google-api-python-client,
google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is
silently ignored by the adapter); QQ_STT_BASE_URL is not read directly —
baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline
plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default
SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to
per-profile .env, not 'hermes config set' (same pattern fixed in
api-server.md last round). Also bumped example ports to 8650+ to dodge the
default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646)
collision.
developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for
run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py,
gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway
platform tree updated (qqbot is a sub-package, not qqbot.py; added
yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/
(always active)' was wrong — it's an empty extension point and
_register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-
callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry,
NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama
Cloud, LM Studio, Tencent TokenHub. Fallback section described only the
legacy single-pair model — corrected to the canonical list-form
fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31
alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py
which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that
'git submodule update --init' is required if cloning without
--recurse-submodules.
guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is
positional (not --schedule), the script-only flag is --no-agent (not
--script-only), and there's no --command flag. Replaced with a real example
that creates the script under ~/.hermes/scripts/ and uses the actual flags.
Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work —
the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST
rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently
fails because --region isn't registered on the auth-add argparse spec.
Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for
China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-
table mention with a webhook-subscription pointer; also fixed the dead link
to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed
at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right
knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response
and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in
one quoted string, so cmd.exe gets a single arg instead of the multi-token
command line it needs. Removed the surrounding quotes — argparse nargs='*'
collects each token correctly.
integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist);
actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG
and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 ->
api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai)
refreshed. Fallback section rewritten to lead with the canonical
fallback_providers list form (was leading with the legacy fallback_model
single dict); supported-providers list extended to include azure-foundry,
alibaba-coding-plan, lmstudio.
index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with
integrations/index.md ('19+') and undercounted — bumped to 20+ and added
Weixin/QQ Bot/Yuanbao/Google Chat to the list.
Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 (same as round-1 post-skill-regen baseline). 24 files, +132/-89.
The plumbing for setting OpenRouter provider preferences and the Pareto Code
router on auxiliary tasks already exists — auxiliary.<task>.extra_body is
forwarded verbatim by call_llm() / async_call_llm(). It just wasn't documented,
so users who wanted (e.g.) Pareto Code routing for compression but the strongest
coder for the main agent had no way to discover the escape hatch.
- hermes_cli/config.py: expand the auxiliary section header with a YAML
example showing provider routing plus plugins under extra_body, and an
explicit note that main-agent provider_routing / openrouter.min_coding_score
do NOT propagate to aux calls (each task is independent by design)
- website/docs/user-guide/configuration.md: new 'OpenRouter routing and
Pareto Code for auxiliary tasks' subsection with worked example
- website/docs/integrations/providers.md: cross-link from the Pareto Code
Router section to the aux-side doc
E2E verified that auxiliary.<task>.extra_body reaches the OpenRouter API with
the configured provider routing and plugins blocks intact.
* docs: deep audit — fix stale config keys, missing commands, and registry drift
Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:
hermes_cli/commands.py COMMAND_REGISTRY (slash commands)
hermes_cli/auth.py PROVIDER_REGISTRY (providers)
hermes_cli/config.py DEFAULT_CONFIG (config keys)
toolsets.py TOOLSETS (toolsets)
tools/registry.py get_all_tool_names() (tools)
python -m hermes_cli.main <subcmd> --help (CLI args)
reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
lists to match --help output (status/logout/spotify, login, archive/prune/
list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
add missing 'kanban' and 'video' toolset sections, fix MCP example to use
the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
row, add missing 'kanban' and 'video' toolset rows, drop the stale
'38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
via OpenRouter), refresh the OpenAI model list.
getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
point at the snapshot/quick-snapshot flow; correct config key
'updates.pre_update_backup' (was 'update.backup').
user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
is the real key (not display.runtime_metadata_footer); checkpoints defaults
enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
OpenAI-compatible API server runs inside hermes gateway.
user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
that 'hermes honcho' subcommand is gated on memory.provider=honcho;
reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.
Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).
* docs: round 2 audit fixes + regenerate skill catalogs
Follow-up to the previous commit on this branch:
Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
voice-mode and ACP install commands rewritten — bare 'pip install ...'
doesn't work for curl-installed setups (no pip on PATH, not in repo
dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
'google/gemini-3-flash-preview' (the doc's own claimed default);
actual default is empty (= use main model). Reworded as 'leave empty
(default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
that was missing from the table.
Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
pages and both reference catalogs (skills-catalog.md,
optional-skills-catalog.md). This adds the entries that were genuinely
missing — productivity/teams-meeting-pipeline (bundled),
optional/finance/* (entire category — 7 skills:
3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
merger-model, pptx-author), creative/hyperframes,
creative/kanban-video-orchestrator, devops/watchers,
productivity/shop-app, research/searxng-search,
apple/macos-computer-use — and rewrites every other per-skill page from
the current SKILL.md. Most diffs are tiny (one line of refreshed
metadata).
Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
shells that lag every newly-added skill page (pre-existing pattern).
No regressions on any en/ page.
Returning users who enabled '🖱️ Computer Use (macOS)' via 'hermes tools'
saw '✓ Saved configuration' but no install — cua-driver was never on
PATH and the toolset failed at first use. Two compounding causes:
1. _toolset_needs_configuration_prompt fell through to _toolset_has_keys,
which returned True for any provider with empty env_vars. cua-driver
has no env vars, so the gate skipped _configure_toolset entirely and
_run_post_setup('cua_driver') never ran.
2. No stable CLI entry-point existed for re-running the install when
the picker no-op'd it (e.g. when toggling the toolset off+on inside
one picker session, where 'added' is empty).
Changes:
- hermes_cli/tools_config.py: add _POST_SETUP_INSTALLED registry
mapping post_setup keys to installed-state predicates. The gate
now returns True when any visible provider has a registered
post_setup whose predicate fails. cua_driver is the only opt-in
for now; other post_setup hooks keep their existing behaviour.
- hermes_cli/main.py: add 'hermes computer-use install' and
'hermes computer-use status' as a stable docs target. install
reuses the same _run_post_setup('cua_driver') path that the
picker invokes; status reports whether cua-driver is on PATH.
- tools/computer_use/cua_backend.py: install hint now points users
at 'hermes computer-use install' first.
- website/docs/user-guide/features/computer-use.md: document the
new command as the primary install path.
- website/docs/reference/cli-commands.md: catalog 'hermes
computer-use' alongside 'hermes tools'.
- tests/hermes_cli/test_post_setup_gating.py: regression coverage
for the gate predicate (missing -> setup forced, installed ->
setup skipped, broken predicate -> non-blocking, unregistered
keys -> behaviour unchanged).
Fixes#22737. Reported by @f-trycua.
Windows Terminal captures Alt+Enter at the terminal layer (fullscreen
toggle), so documenting 'Alt+Enter or Ctrl+J' without qualification
leaves stock Windows Terminal users with no working newline key they
can discover from the docs alone.
- Main keybindings row: note Alt+Enter is intercepted on WT and direct
users to Ctrl+Enter / Ctrl+J instead.
- Shift+Enter compatibility table: split 'stock Windows Terminal' from
Windows Terminal Preview 1.25+ (which added Kitty protocol support
and works with the keybinding from this PR once enabled).
- Add AUTHOR_MAP entry for ra2157218@gmail.com -> Abd0r so the salvage
commit passes the email-mapping CI gate.
Closes#5346.
Most terminals send the same byte sequence for `Enter` and `Shift+Enter`
by default, so the application can't tell them apart — this is a terminal
protocol limitation, not something Hermes can paper over. But terminals
that implement the Kitty keyboard protocol (Kitty / foot / WezTerm /
Ghostty by default; iTerm2 / Alacritty / VS Code terminal / Warp once the
protocol is enabled) DO emit a distinct sequence for `Shift+Enter`:
- `\x1b[13;2u` — Kitty / CSI-u, modifier=2
- `\x1b[27;2;13~` — xterm modifyOtherKeys=2
Stock prompt_toolkit doesn't have the CSI-u sequence in its
`ANSI_SEQUENCES` table at all, and it maps the modifyOtherKeys variant to
plain `Keys.ControlM` (Enter) — i.e. it strips the Shift modifier, which
is the bug users actually hit on iTerm2 and friends.
This PR adds `hermes_cli/pt_input_extras.install_shift_enter_alias()`,
called once at CLI startup from `cli.py`, which inserts/overwrites those
sequences in `ANSI_SEQUENCES` so they decode to `(Keys.Escape, Keys.ControlM)`
— the same key tuple `Alt+Enter` produces. The existing Alt+Enter newline
handler (`@kb.add('escape', 'enter')` in `cli.py`) then fires unchanged,
so there is no new keybinding to register and no behavioral change for
terminals that don't emit the distinct sequences.
Files
=====
* `hermes_cli/pt_input_extras.py` — new module hosting the helper. Lives
outside `cli.py` so it's importable in tests without dragging in the
full CLI runtime (which depends on `fire`, `rich`, etc.).
* `cli.py` — calls `install_shift_enter_alias()` once at module import.
Wrapped in try/except so prompt_toolkit version drift can't break CLI
startup.
* `tests/cli/test_cli_shift_enter_newline.py` — 6 tests:
- registration of all three byte sequences
- overwrite of stock prompt_toolkit's broken modifyOtherKeys mapping
- idempotency
- parser equivalence: CSI-u Shift+Enter == Alt+Enter
- parser equivalence: modifyOtherKeys Shift+Enter == Alt+Enter
- plain Enter remains a single key (submit), distinct from the two-key
Alt+Enter / Shift+Enter tuple
* `website/docs/user-guide/cli.md` — keybinding table updated; new
"Shift+Enter compatibility" subsection with a per-terminal status table
noting macOS Terminal / stock Windows Terminal cannot distinguish the
keystroke at the protocol level.
* `website/docs/getting-started/quickstart.md`,
`website/docs/guides/tips.md` — short mention pointing readers at the
full compatibility note in `cli.md`.
Tested
======
pytest tests/cli/test_cli_shift_enter_newline.py # 6 passed
Live-tested by triggering `\x1b[13;2u` against the running Vt100Parser
(see test). Not exercised in a real terminal end-to-end because that
requires a Kitty-protocol-capable host; the test exercises the parser
path that drives the live terminal too.
Adds early-beta framing to every user-facing surface where native Windows
is introduced — landing page install block, Installation page, Windows
(Native) guide, contributor notes, and README. Sets expectations that the
path installs and runs but hasn't been road-tested as broadly as POSIX,
and points users who want maximum stability at WSL2 instead.
Follow-up to #21561 (native Windows support) and #22089 (Windows docs).
New page: website/docs/user-guide/windows-native.md — comprehensive
Windows-native deep dive covering:
- Quick install (irm | iex) and parameterized form
- What the installer does end-to-end (uv, Python 3.11, Node 22,
PortableGit, messaging SDK bootstrap)
- Feature matrix: native Windows vs WSL2 (dashboard /chat is WSL-only)
- How Hermes runs shell commands on Windows (Git Bash resolution,
HERMES_GIT_BASH_PATH override, MinGit layout pitfall)
- UTF-8 console shim (configure_windows_stdio, opt-out via
HERMES_DISABLE_WINDOWS_UTF8)
- Editor handling (notepad default, VSCode/Notepad++/nvim overrides,
why Ctrl-X Ctrl-E used to silently do nothing)
- Ctrl+Enter for newline in the CLI
- Gateway as a Scheduled Task (schtasks + Startup-folder fallback,
pythonw.exe detached spawn, why not a Windows Service)
- Data layout (%LOCALAPPDATA%\hermes vs %USERPROFILE%\.hermes split)
- PATH after install, environment variables, uninstall
- Process management internals (bpo-14484 os.kill(pid, 0) footgun,
_pid_exists primitive, check-windows-footguns.py CI gate)
- 10+ concrete pitfalls with fixes
Also:
- docs/index.md: add inline 'Install' section with both Linux/macOS
curl and Windows irm|iex one-liners right under the hero CTAs.
Updates the quick-links row to include 'native Windows'.
- sidebars.ts: add Windows (Native) entry above Windows (WSL2).
- windows-wsl-quickstart.md: point native-install cross-link at the
new dedicated page (was going to installation.md#windows-native).
- reference/environment-variables.md: document HERMES_GIT_BASH_PATH
and HERMES_DISABLE_WINDOWS_UTF8 (previously undocumented).
Native Windows (with Git for Windows installed) can now run the Hermes CLI
and gateway end-to-end without crashing. install.ps1 already existed and
the Git Bash terminal backend was already wired up — this PR fills the
remaining gaps discovered by auditing every Windows-unsafe primitive
(`signal.SIGKILL`, `os.kill(pid, 0)` probes, bare `fcntl`/`termios`
imports) and by comparing hermes against how Claude Code, OpenCode, Codex,
and Cline handle native Windows.
## What changed
### UTF-8 stdio (new module)
- `hermes_cli/stdio.py` — single `configure_windows_stdio()` entry point.
Flips the console code page to CP_UTF8 (65001), reconfigures
`sys.stdout`/`stderr`/`stdin` to UTF-8, sets `PYTHONIOENCODING` + `PYTHONUTF8`
for subprocesses. No-op on non-Windows. Opt out via `HERMES_DISABLE_WINDOWS_UTF8=1`.
- Called early in `cli.py::main`, `hermes_cli/main.py::main`, and
`gateway/run.py::main` so Unicode banners (box-drawing, geometric
symbols, non-Latin chat text) don't `UnicodeEncodeError` on cp1252
consoles.
### Crash sites fixed
- `hermes_cli/main.py:7970` (hermes update → stuck gateway sweep): raw
`os.kill(pid, _signal.SIGKILL)` → `gateway.status.terminate_pid(pid, force=True)`
which routes through `taskkill /T /F` on Windows.
- `hermes_cli/profiles.py::_stop_gateway_process`: same fix — also
converted SIGTERM path to `terminate_pid()` and widened OSError catch
on the intermediate `os.kill(pid, 0)` probe.
- `hermes_cli/kanban_db.py:2914, 3041`: raw `signal.SIGKILL` →
`getattr(signal, "SIGKILL", signal.SIGTERM)` fallback (matches the
pattern already used in `gateway/status.py`).
### OSError widening on `os.kill(pid, 0)` probes
Windows raises `OSError` (WinError 87) for a gone PID instead of
`ProcessLookupError`. Widened the catch at:
- `gateway/run.py:15101` (`--replace` wait-for-exit loop — without this,
the loop busy-spins the full 10s every Windows gateway start)
- `hermes_cli/gateway.py:228, 460, 940`
- `hermes_cli/profiles.py:777`
- `tools/process_registry.py::_is_host_pid_alive`
- `tools/browser_tool.py:1170, 1206`
### Dashboard PTY graceful degradation
`hermes_cli/pty_bridge.py` depends on `fcntl`/`termios`/`ptyprocess`,
none of which exist on native Windows. Previously a Windows dashboard
would crash on `import hermes_cli.web_server` because of a top-level
import. Now:
- `hermes_cli/web_server.py` wraps the pty_bridge import in
`try/except ImportError` and sets `_PTY_BRIDGE_AVAILABLE=False`.
- The `/api/pty` WebSocket handler returns a friendly "use WSL2 for
this tab" message instead of exploding.
- Every other dashboard feature (sessions, jobs, metrics, config
editor) runs natively on Windows.
### Dependency
- `pyproject.toml`: add `tzdata>=2023.3; sys_platform == 'win32'` so
Python's `zoneinfo` works on Windows (which has no IANA tzdata
shipped with the OS). Credits @sprmn24 (PR #13182).
### Docs
- README.md: removed "Native Windows is not supported"; added
PowerShell one-liner and Git-for-Windows prerequisite note.
- `website/docs/getting-started/installation.md`: new Windows section
with capability matrix (everything native except the dashboard
`/chat` PTY tab, which is WSL2-only).
- `website/docs/user-guide/windows-wsl-quickstart.md`: reframed as
"WSL2 as an alternative to native" rather than "the only way".
- `website/docs/developer-guide/contributing.md`: updated
cross-platform guidance with the `signal.SIGKILL` / `OSError`
rules we enforce now.
- `website/docs/user-guide/features/web-dashboard.md`: acknowledged
native Windows works for everything except the embedded PTY pane.
## Why this shape
Pulled from a survey of how other agent codebases handle native
Windows (Claude Code, OpenCode, Codex, Cline):
- All four treat Git Bash as the canonical shell on Windows, same as
hermes already does in `tools/environments/local.py::_find_bash()`.
- None of them force `SetConsoleOutputCP` — but they don't have to,
Node/Rust write UTF-16 to the Win32 console API. Python does not get
that for free, so we flip CP_UTF8 via ctypes.
- None of them ship PowerShell-as-primary-shell (Claude Code exposes
PS as a secondary tool; scope creep for this PR).
- All of them use `taskkill /T /F` for force-kill on Windows, which
is exactly what `gateway.status.terminate_pid(force=True)` does.
## Non-goals (deliberate scope limits)
- No PowerShell-as-a-second-shell tool — worth designing separately.
- No terminal routing rewrite (#12317, #15461, #19800 cluster) — that's
the hardest design call and needs a separate doc.
- No wholesale `open()` → `open(..., encoding="utf-8")` sweep (Tianworld
cluster) — will do as follow-up if users hit actual breakage; most
modern code already specifies it.
## Validation
- 28 new tests in `tests/tools/test_windows_native_support.py` — all
platform-mocked, pass on Linux CI. Cover:
- `configure_windows_stdio` idempotency, opt-out, env-preservation
- `terminate_pid` taskkill routing, failure → OSError, FileNotFoundError fallback
- `getattr(signal, "SIGKILL", …)` fallback shape
- `_is_host_pid_alive` OSError widening (Windows-gone-PID behavior)
- Source-level checks that all entry points call `configure_windows_stdio`
- pty_bridge import-guard present in `web_server.py`
- README no longer says "not supported"
- 12 pre-existing tests in `tests/tools/test_windows_compat.py` still pass.
- `tests/hermes_cli/` ran fully (3909 passed, 9 failures — all confirmed
pre-existing on main by stash-test).
- `tests/gateway/` ran fully (5021 passed, 1 pre-existing failure).
- `tests/tools/test_process_registry.py` + `test_browser_*` pass.
- Manual smoke: `import hermes_cli.stdio; import gateway.run;
import hermes_cli.web_server` — all clean, `_PTY_BRIDGE_AVAILABLE=True`
on Linux (as expected).
## Files
- New: `hermes_cli/stdio.py`, `tests/tools/test_windows_native_support.py`
- Modified: `cli.py`, `gateway/run.py`, `hermes_cli/main.py`,
`hermes_cli/profiles.py`, `hermes_cli/gateway.py`,
`hermes_cli/kanban_db.py`, `hermes_cli/pty_bridge.py`,
`hermes_cli/web_server.py`, `tools/browser_tool.py`,
`tools/process_registry.py`, `pyproject.toml`, `README.md`, and 4
docs pages.
Credits to everyone whose prior PR work informed these fixes — see
the co-author trailers. All of the PRs listed in
`~/.hermes/plans/windows-support-prs.md` fixing `os.kill` / `signal.SIGKILL`
/ UTF-8 stdio / tzdata / README patterns found the same issues; this PR
consolidates them.
Co-authored-by: Philip D'Souza <9472774+PhilipAD@users.noreply.github.com>
Co-authored-by: Arecanon <42595053+ArecaNon@users.noreply.github.com>
Co-authored-by: XiaoXiao0221 <263113677+XiaoXiao0221@users.noreply.github.com>
Co-authored-by: Lars Hagen <1360677+lars-hagen@users.noreply.github.com>
Co-authored-by: Luan Dias <65574834+luandiasrj@users.noreply.github.com>
Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com>
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: adybag14-cyber <252811164+adybag14-cyber@users.noreply.github.com>
Co-authored-by: Prasanna28Devadiga <54196612+Prasanna28Devadiga@users.noreply.github.com>
Fifth and final slice polish on top of @dlkakbs's docs + skill. Three
things ship here:
1. Subscription renewal cron recipe (the #1 operational footgun).
Microsoft Graph webhook subscriptions expire at 72 hours max and
don't auto-renew. The shipped operator runbook mentioned
`maintain-subscriptions --dry-run` as a "daily or periodic check"
but never told operators how to actually automate it. Without a
scheduled job, any production deployment silently stops ingesting
meetings three days after go-live.
Adds an "Automating subscription renewal (REQUIRED for production)"
section to website/docs/guides/operate-teams-meeting-pipeline.md
with three concrete options and copy-pasteable configs:
- Option 1: Hermes cron (`hermes cron add --schedule "0 */12 * * *"
--script-only --command "hermes teams-pipeline maintain-subscriptions"`)
- Option 2: systemd service + timer (12h cadence, Persistent=true
so missed runs catch up after reboots)
- Option 3: plain crontab with a wrapper that sources .env for
credentials
Go-Live Checklist gains a bolded mandatory item for the schedule
being in place, with a cross-link to the section.
website/docs/user-guide/messaging/teams-meetings.md adds a
`::⚠️::` admonition right after the manual `subscribe`
examples so anyone who creates a subscription manually is told
the same day that it will silently expire in 72 hours.
2. Sidebar wiring. Shela's new docs pages (teams-meetings.md and
operate-teams-meeting-pipeline.md) weren't in website/sidebars.ts,
so they were orphaned URLs — reachable only if someone knew the
path. Wired teams-meetings into Messaging Platforms next to the
existing teams entry, and operate-teams-meeting-pipeline into
Guides & Tutorials next to microsoft-graph-app-registration from
PR #21922. Adjacent placement keeps the related pages discoverable
from each other.
3. SKILL.md rewrite (v1.0.0 → v1.1.0).
The original skill had five Turkish-only trigger phrases, which
works in a Turkish-speaking session but doesn't match English
triggers. Rewrote the skill to:
- Describe triggers by intent instead of exact phrases, with
explicit "works in any language" framing and example phrases
in both English and Turkish.
- Add a Decision Tree section covering the three most common user
asks (missing summary, setup verification, re-run request) and
the specific CLI command sequence for each.
- Add a dedicated "Critical pitfall: Graph subscriptions expire
in 72 hours" section that tells the agent exactly what to do
when a user reports "worked yesterday, nothing today" — the
most common operational failure mode.
- Expand the command reference into three labeled groups (Status
and inspection / Re-running and debugging / Subscription
management) so the agent can reach for the right command
without scanning.
- Add cross-links to all four related docs pages (Azure app
registration, webhook listener setup, full pipeline setup,
operator runbook).
Validation:
- npm run build: all new pages route, anchor to
#automating-subscription-renewal-required-for-production resolves
from both the runbook TOC and the teams-meetings.md admonition.
- scripts/run_tests.sh on the relevant test suites (607 tests): all
pass.
Third docs slice shipped alongside the TeamsSummaryWriter code so
operators can configure outbound summary delivery the moment this
PR lands.
- website/docs/user-guide/messaging/teams.md: new 'Meeting Summary
Delivery (Teams Meeting Pipeline)' section under Features,
explaining that the existing teams adapter handles pipeline
outbound (not a separate adapter surface), with a config-snippet
example for graph and incoming_webhook modes, a mode-choice
trade-off table, and a note that settings are inert when the
teams_pipeline plugin is disabled.
- website/docs/reference/environment-variables.md: new Teams Meeting
Summary Delivery subsection documenting TEAMS_DELIVERY_MODE,
TEAMS_INCOMING_WEBHOOK_URL, TEAMS_GRAPH_ACCESS_TOKEN, TEAMS_TEAM_ID,
TEAMS_CHANNEL_ID, TEAMS_CHAT_ID with cross-link to the Teams setup
page section.
Verified via npm run build: pages route correctly, no new warnings
or errors.
PR #20831 shipped the feature with a terse reference page. This adds a
proper user guide — ~570 lines of what/why/when/how with use-case
walkthroughs, lifecycle coverage from author through installer through
update, and recipe snippets for common workflows.
New page: website/docs/user-guide/profile-distributions.md
Sections:
* What this means — the before/after, side-by-side
* Why git, not tarballs or a custom format
* When to use a distribution (personal, team, community, product) and
when NOT to (local backup, sharing credentials, sharing memories)
* The lifecycle — dedicated walkthroughs for authors (publish in 4 steps)
and installers (install, check, update, remove)
* Use cases: personal sync, team internal bot, community publish,
commercial product, ephemeral ops agent
* Recipes: pin a version, compare installed vs. latest, preserve local
customizations through updates, force clean reinstall, fork-and-customize,
test before pushing
* What is NEVER in a distribution (the user-owned exclude list verbatim)
* Security and trust model — what you are trusting, why cron is not
auto-scheduled, the browser-extension analogy
Cross-linking:
* Added to sidebar under Getting Started, right after user-guide/profiles.
* Existing Profiles page ends with a Sharing profiles as distributions
teaser that links here.
* The Distribution section of the reference page gets an admonition
pointing newcomers here first. The reference stays as a CLI-flag
lookup for people who already know what they want.
Validation:
* ascii-guard lint --exclude-code-blocks docs -> 0 errors.
* All internal links resolve to real pages.
Background macOS desktop control via cua-driver MCP — does NOT steal the
user's cursor or keyboard focus, works with any tool-capable model.
Replaces the Anthropic-native `computer_20251124` approach from the
abandoned #4562 with a generic OpenAI function-calling schema plus SOM
(set-of-mark) captures so Claude, GPT, Gemini, and open models can all
drive the desktop via numbered element indices.
- `tools/computer_use/` package — swappable ComputerUseBackend ABC +
CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary).
- Universal `computer_use` tool with one schema for all providers.
Actions: capture (som/vision/ax), click, double_click, right_click,
middle_click, drag, scroll, type, key, wait, list_apps, focus_app.
- Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style
`content: [text, image_url]` parts) that flows through
handle_function_call into the tool message. Anthropic adapter converts
into native `tool_result` image blocks; OpenAI-compatible providers
get the parts list directly.
- Image eviction in convert_messages_to_anthropic: only the 3 most
recent screenshots carry real image data; older ones become text
placeholders to cap per-turn token cost.
- Context compressor image pruning: old multimodal tool results have
their image parts stripped instead of being skipped.
- Image-aware token estimation: each image counts as a flat 1500 tokens
instead of its base64 char length (~1MB would have registered as
~250K tokens before).
- COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset
is active.
- Session DB persistence strips base64 from multimodal tool messages.
- Trajectory saver normalises multimodal messages to text-only.
- `hermes tools` post-setup installs cua-driver via the upstream script
and prints permission-grant instructions.
- CLI approval callback wired so destructive computer_use actions go
through the same prompt_toolkit approval dialog as terminal commands.
- Hard safety guards at the tool level: blocked type patterns
(curl|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash,
force delete, lock screen, log out).
- Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic)
workflow guide.
- Docs: `user-guide/features/computer-use.md` plus reference catalog
entries.
44 new tests in tests/tools/test_computer_use.py covering schema
shape (universal, not Anthropic-native), dispatch routing, safety
guards, multimodal envelope, Anthropic adapter conversion, screenshot
eviction, context compressor pruning, image-aware token estimation,
run_agent helpers, and universality guarantees.
469/469 pass across tests/tools/test_computer_use.py + the affected
agent/ test suites.
- `model_tools.py` provider-gating: the tool is available to every
provider. Providers without multi-part tool message support will see
text-only tool results (graceful degradation via `text_summary`).
- Anthropic server-side `clear_tool_uses_20250919` — deferred;
client-side eviction + compressor pruning cover the same cost ceiling
without a beta header.
- macOS only. cua-driver uses private SkyLight SPIs
(SLEventPostToPid, SLPSPostEventRecordTo,
_AXObserverAddNotificationAndCheckRemote) that can break on any macOS
update. Pin with HERMES_CUA_DRIVER_VERSION.
- Requires Accessibility + Screen Recording permissions — the post-setup
prints the Settings path.
Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic-
native schema). Credit @0xbyt4 for the original #3816 groundwork whose
context/eviction/token design is preserved here in generic form.
Second docs slice shipped alongside the webhook listener code so users
can actually wire up the endpoint the moment this PR lands.
- website/docs/user-guide/messaging/msgraph-webhook.md: new page
covering what the listener is (change-notification ingress, distinct
from the teams chat adapter), quick-start YAML + env-var config,
full config table, security hardening (clientState + timing-safe
compare, source-IP allowlisting against Microsoft's published egress
ranges, TLS termination at the reverse proxy, response hygiene),
status-code table, troubleshooting, and cross-links to the Azure
app registration guide.
- website/docs/reference/environment-variables.md: new Microsoft
Graph Webhook Listener subsection with MSGRAPH_WEBHOOK_ENABLED,
_PORT, _CLIENT_STATE, _ACCEPTED_RESOURCES, _ALLOWED_SOURCE_CIDRS.
- website/sidebars.ts: wire the new page into Messaging Platforms,
right after the teams chat adapter so the two related pages are
adjacent in the sidebar.
The pipeline runtime / operator CLI / outbound delivery pages still
land with their matching PRs. With this PR merged, an operator can get
the listener running end-to-end, register a Graph subscription
manually, and receive validation handshake plus notification POSTs
against the configured client_state.
Verified via npm run build: new page routes at
/docs/user-guide/messaging/msgraph-webhook, sidebar wires correctly,
no new warnings or errors.
Adds one reserved token to the cron `deliver` field:
- `all` — expand to every platform with a configured home channel
Resolves at fire time, not create time, so a job created before Telegram
was wired up picks it up once `TELEGRAM_HOME_CHANNEL` is set. Composes
with existing targets: `origin,all`, `all,telegram:-100:17`.
Inspired by Vellum Assistant's reminder routing-intent system.
## Changes
- cron/scheduler.py: _expand_routing_tokens + integrate into _resolve_delivery_targets
- tools/cronjob_tools.py: schema description updated
- tests/cron/test_scheduler.py: TestRoutingIntents (5 cases)
- website/docs/user-guide/features/cron.md: docs + table rows
## Validation
- tests/cron/test_scheduler.py -k 'Routing or Deliver' → 57 passed
The kanban specifier landed in #21435 with feature-page docs (the
kanban page itself + the CLI reference table), but three other docs
pages enumerate every auxiliary task slot and were missed:
user-guide/configuration.md Auxiliary Models section —
interactive picker example
+ full auxiliary config
reference YAML block.
user-guide/features/fallback-providers.md
Both 'Auxiliary Tasks' and
'Fallback Reference' tables.
user-guide/features/kanban-tutorial.md
Triage-column bullet now
mentions the ✨ Specify
button + CLI + slash command.
No other docs enumerate the aux task slots (verified with
grep -r 'title_generation\|auxiliary.session_search' website/docs/).
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks
The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.
`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.
Surface:
hermes kanban specify <task_id> # single task
hermes kanban specify --all [--tenant T] # sweep triage column
hermes kanban specify ... --author NAME # audit-comment author
hermes kanban specify ... --json # one JSON line per task
Design choices:
- Parent gating is preserved. specify_triage_task flips to 'todo',
then recompute_ready promotes to 'ready' only when parents are
done — same rule as a normal parent-gated todo.
- No daemon, no background watcher. Every invocation is explicit —
keeps cost predictable and doesn't fight the dispatcher loop.
- Response parse is lenient: strict JSON preferred, markdown-fence
tolerated, raw-body fallback on malformed JSON so the LLM can't
strand a task in triage.
- All failure modes (no aux client, API error, task moved out of
triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
--all continues past individual failures.
Changes:
hermes_cli/kanban_db.py + specify_triage_task()
hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call)
hermes_cli/kanban.py + specify subcommand + _cmd_specify
hermes_cli/config.py + auxiliary.triage_specifier task slot
website/docs/user-guide/features/kanban.md specify + config notes
website/docs/reference/cli-commands.md CLI reference entry
tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests)
tests/hermes_cli/test_kanban_specify.py NEW (20 tests)
Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.
* feat(kanban): wire specifier into dashboard and gateway slash
Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.
Dashboard (plugins/kanban/dashboard/)
- POST /tasks/:id/specify NEW endpoint. Thin wrapper around
kanban_specify.specify_task(). Returns the CLI outcome shape
({ok, task_id, reason, new_title}); ok=false with a human reason
is a 200, not a 4xx, so the UI can render it inline without
treating 'no aux client configured' as a crash.
- Runs sync in FastAPI's threadpool because the LLM call can take
tens of seconds on reasoning models.
- Pins HERMES_KANBAN_BOARD around the specify call so the module's
argless kb.connect() lands on the right board.
- dist/index.js: doSpecify callback threaded through the drawer →
TaskDetail → StatusActions prop chain. ✨ Specify button appears
ONLY when task.status === 'triage' (elsewhere the backend would
reject anyway — hide the button to keep the action row clean).
Busy state (Specifying…) + inline success/error banner under the
button using the response.reason text.
- dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
existing --color vars so themes reskin cleanly.
Gateway slash (/kanban specify)
- Already works via the existing run_slash → build_parser →
kanban_command pipeline. No code change needed — slash commands
inherit the argparse tree automatically. Added coverage:
test_run_slash_specify_end_to_end (create --triage, specify, verify
promotion + retitle) and test_run_slash_specify_help_is_reachable.
Tests
- tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
REST endpoint — happy path, non-triage rejection as ok=false 200,
missing aux client as ok=false 200.
- tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.
Docs
- website/docs/user-guide/features/kanban.md: dashboard action row
description mentions ✨ Specify + all three surfaces. REST table
gains /tasks/:id/specify. Slash examples include /kanban specify.
Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
Follow-up to the previous commit:
- Add _is_loopback_host() helper covering 127.0.0.1, localhost, ::1,
ip6-localhost, ip6-loopback (case-insensitive). Empty/None host is
treated as non-loopback since unset usually means public default bind.
- Fix mixed-indent comment in the safety rail (comment now aligned with
the if-block) and collapse the nested-if into one condition.
- Add TestInsecureNoAuthSafetyRail covering rejection on 0.0.0.0, a LAN
IP, and empty host; allowance on 127.0.0.1/localhost; plus unit-level
parametrized coverage of _is_loopback_host for spellings we can't bind
in the hermetic test env (::1, ip6-localhost, ip6-loopback).
- Pin test_connect_starts_server + test_webhook_deliver_only defaults
to 127.0.0.1 so they keep passing under the new rail.
- Document the behavior in website/docs/user-guide/messaging/webhooks.md.
Adds Google Chat as a new gateway platform, shipped under
plugins/platforms/google_chat/ following the canonical bundled-plugin
pattern (Teams, IRC). Rewired from the original PR #18425 to use the
new env_enablement_fn + cron_deliver_env_var plugin interfaces landed
in the preceding commit, so the adapter touches ZERO core files.
What it does:
- Inbound DM + group messages via Cloud Pub/Sub pull subscription (no
public URL needed), with attachments (PDFs, images, audio, video)
downloaded through an SSRF-guarded Google-host allowlist.
- Outbound text replies with the 'Hermes is thinking…' patch-in-place
pattern — no tombstones.
- Native file attachment delivery via per-user OAuth. Google Chat's
media.upload endpoint rejects service-account auth, so each user
runs /setup-files once in their own DM to grant
chat.messages.create for themselves; the adapter then uploads as
them. Tokens stored per email at
~/.hermes/google_chat_user_tokens/<email>.json.
- Thread isolation: side-threads get isolated sessions, top-level DM
messages share one continuous session. Persistent thread-count
store survives gateway restart.
- Supervisor reconnect with exponential backoff.
- Multi-user out of the box.
How it plugs in (no core edits):
- env_enablement_fn seeds PlatformConfig.extra with project_id,
subscription_name, service_account_json, and the home_channel dict
(which the core hook turns into a HomeChannel dataclass). Reads
GOOGLE_CHAT_PROJECT_ID (falls back to GOOGLE_CLOUD_PROJECT),
GOOGLE_CHAT_SUBSCRIPTION_NAME (falls back to GOOGLE_CHAT_SUBSCRIPTION),
GOOGLE_CHAT_SERVICE_ACCOUNT_JSON (falls back to
GOOGLE_APPLICATION_CREDENTIALS), GOOGLE_CHAT_HOME_CHANNEL.
- cron_deliver_env_var='GOOGLE_CHAT_HOME_CHANNEL' gets cron delivery
for free — cron/scheduler.py consults the platform registry for any
name not in its hardcoded built-in sets.
- plugin.yaml's rich requires_env / optional_env blocks auto-populate
OPTIONAL_ENV_VARS via the new hermes_cli/config.py injector, so
'hermes config' UI surfaces them with description / url / prompt /
password metadata.
- Module-level Platform('google_chat') call in adapter.py triggers the
Platform._missing_() registration so Platform.GOOGLE_CHAT attribute
access works without an enum entry.
Distribution: ships inside the existing hermes-agent package. Users
opt in via 'pip install hermes-agent[google_chat]' and follow the
8-step GCP walkthrough at
website/docs/user-guide/messaging/google_chat.md.
Test coverage: 153 tests in tests/gateway/test_google_chat.py, all
passing. Spans platform registration, env config loading, Pub/Sub
envelope routing, outbound send + chunking + typing patch-in-place,
attachment send paths, SSRF guard, thread/session model,
supervisor reconnect, authorization, per-user OAuth, and the new
plugin-registry cron delivery wiring.
Credit: adapter + OAuth + tests + docs authored by @donramon77
(PR #18425). Rewire onto the new plugin hooks + salvage commit by
Teknium.
Co-Authored-By: Ramón Fernández <112875006+donramon77@users.noreply.github.com>
Follow-up to the previous commit which flipped 'hermes curator run'
default from async to sync. Updates the curator.md feature page and
cli-commands.md reference to show --background as the opt-in async
flag and note that the default now blocks until the LLM pass finishes.
Follow-up to #20958. The worker skill section had the same stale
'hermes skills install devops/kanban-worker' command — kanban-worker
is also bundled, so that command fails with 'Could not fetch from any
source.'
Replace with bundled-skill verification + restore pattern, matching
the orchestrator section. Uses <your-worker-profile> placeholder since
assignees vary (researcher, writer, ops, linguist, reviewer, etc.)
rather than a single fixed 'worker' profile.
Previous version read like internal API docs \u2014 leading with env var tables,
config YAML, and 'precedence' rules before ever explaining the product.
Complete rewrite inverts the structure so readers see value first,
mechanics last.
Structure now:
- Lede: 'One subscription. Every tool built in.' + pitch paragraph
- CTA: subscribe/manage button styled as a real call-to-action
- What's included: emoji-led table with expanded descriptions per tool.
Image gen lists all 9 models by name (FLUX 2 Klein/Pro, Z-Image Turbo,
Nano Banana Pro, GPT Image 1.5/2, Ideogram V3, Recraft V4 Pro, Qwen)
- Why it's here: value bullets \u2014 one bill, one signup, one key, same
quality, bring-your-own anytime
- Get started: two-command flow (hermes model \u2192 hermes status)
- Eligibility: paid-tier note with upgrade link
- Mix and match: three realistic usage patterns
- Using individual image models: ID reference table for power users
- --- separator ---
- Configuration reference (demoted): use_gateway flag, disabling,
self-hosted gateway env vars moved below the fold where they belong
- FAQ: streamlined, removed redundant content
Fact-checked against code:
- 9 FAL models confirmed from tools/image_generation_tool.py FAL_MODELS
- Status section output verified against hermes_cli/status.py
- Portal subscription URL preserved
- Self-hosted env vars (TOOL_GATEWAY_DOMAIN etc.) kept accurate
Verified: docusaurus build SUCCESS, page renders, no new broken links.
Closes the remaining gaps from PR #11562 that weren't covered by the
core SearXNG integration landed in #20823.
- optional-skills/research/searxng-search/ — installable skill with
SKILL.md (curl-based usage, category support, Python example) and
searxng.sh helper script for health checks and instance queries
- website/docs/user-guide/configuration.md — SearXNG added to the
Web Search Backends section (5 backends, backend table, per-capability
split config example, correct search-only note)
- website/docs/reference/environment-variables.md — SEARXNG_URL row
- website/docs/reference/optional-skills-catalog.md — searxng-search entry
The core SearXNG code, OPTIONAL_ENV_VARS, hermes tools picker, and tests
were already on main via #20823. This commit is purely additive docs +
the optional skill scaffold.
Credits from #11562 salvage:
@w4rum — original _searxng_search structure
@nathansdev — tools_config.py integration
@moyomartin — category support and result formatting
@0xMihai — config/env var approach
@nicobailon — skill and documentation structure
@searxng-fan — error handling patterns
@local-first — self-hosted-first philosophy and docs
Same Hermes Teal palette as the default theme, but with baseSize 18px,
lineHeight 1.65, and spacious density so the whole dashboard scales up.
Gives users a one-click bigger-text preset and a copyable reference for
authoring custom YAML themes with their own typography settings.
Two pluggable surfaces were mentioned in the interfaces map without a
real authoring guide behind them:
1. **Image-gen backends** — only had 'See bundled examples' pointers.
Now a full developer-guide/image-gen-provider-plugin.md (270 lines)
mirroring the memory/context/model provider docs:
- How discovery works, directory structure, plugin.yaml
- ImageGenProvider ABC with every overridable method
(name, display_name, is_available, list_models, default_model,
get_setup_schema, generate)
- Full authoring walkthrough with a working MyBackendImageGenProvider
- Response-format reference (success_response / error_response)
- Handling b64 vs URL output (save_b64_image helper)
- User overrides at ~/.hermes/plugins/image_gen/<name>/
- Testing recipe + pip distribution
- Reference examples (openai, openai-codex, xai)
2. **Skill taps** — features/skills.md mentioned the CLI commands but
never explained the repo contract for publishing a tap. Added
'Publishing a custom skill tap' section under Skills Hub covering:
- Repo layout (skills/<name>/SKILL.md by default)
- Minimal working example
- Non-default path configuration (taps.json)
- Installing individual skills without subscribing
- Trust-level handling
- Full tap management CLI + in-session /skills tap commands
Wired into:
- website/sidebars.ts: image-gen-provider-plugin added to Extending group
- website/docs/user-guide/features/plugins.md: pluggable interfaces
table + 'What plugins can do' table now link to the real guides
instead of 'See bundled examples'
- website/docs/guides/build-a-hermes-plugin.md: top info map and
inline sub-sections updated, 'Full guide:' line added to
image-gen block, tap section mentions publishing
Verified: docusaurus build SUCCESS, new page renders at
/docs/developer-guide/image-gen-provider-plugin, anchor
#publishing-a-custom-skill-tap resolves from plugins.md +
build-a-hermes-plugin.md. Pre-existing zh-Hans broken links unchanged.
* feat(skills/linear): add Documents support + Python helper script
The bundled Linear skill (PR #1230) covered issues, projects, teams, and
workflow states via curl. It had no coverage for Linear's Documents API,
so fetching an RFC/doc from a linear.app URL required hand-writing
GraphQL against an underdocumented schema.
Adds:
- Documents section in SKILL.md explaining slugId extraction from URLs,
the contentState (markdown) vs contentState (ProseMirror) split, and
four canonical curl examples (fetch by slugId, fetch by UUID, list
recent, title-search).
- scripts/linear_api.py — stdlib-only Python CLI wrapping the most
common operations (whoami, list-teams, list/get/search/create/update
issues, add-comment, update-status, list/get/search documents, raw
GraphQL passthrough). Zero deps, reads LINEAR_API_KEY from env.
Auth header quirk (personal key takes bare $LINEAR_API_KEY, no Bearer
prefix) is already documented in the skill.
Found during RFC review: the existing skill's lack of document support
forced falling back to the browser (which hit Linear's login wall).
Also fixes a schema gotcha — the Document field is `contentState`, not
`contentData` (which returns 400).
Tested end-to-end against the production API:
python3 linear_api.py whoami
python3 linear_api.py get-document 38359beef67c
Both return expected payloads.
* fix(skills/linear): point LINEAR_API_KEY setup to the correct page
The org-level Settings > API page (/settings/api) only shows OAuth apps
and workspace-member keys. Personal API keys live under Account,
Security, access (/settings/account/security). Update both the setup
link in config.py (shown during hermes setup) and the setup step in
SKILL.md so users land on the page that can create a personal key.
* docs(providers): add model-provider-plugin authoring guide + fix stale refs
New docs:
- website/docs/developer-guide/model-provider-plugin.md — full authoring
guide (directory layout, minimal example, ProviderProfile fields,
overridable hooks, user overrides, api_mode selection, auth types,
testing, pip distribution)
- Wired into website/sidebars.ts under 'Extending'
- Cross-references added in:
- guides/build-a-hermes-plugin.md (tip block)
- developer-guide/adding-providers.md
- developer-guide/provider-runtime.md
User guide:
- user-guide/features/plugins.md: Plugin types table grows from 3 to 4
with 'Model providers' row
Stale comment cleanup (providers/*.py → plugins/model-providers/<name>/):
- hermes_cli/main.py:_is_profile_api_key_provider docstring
- hermes_cli/doctor.py:_build_apikey_providers_list docstring
- hermes_cli/auth.py: PROVIDER_REGISTRY + alias auto-extension comments
- hermes_cli/models.py: CANONICAL_PROVIDERS auto-extension comment
AGENTS.md:
- Project-structure tree: added plugins/model-providers/ row
- New section: 'Model-provider plugins' explaining discovery, override
semantics, PluginManager integration, kind auto-coerce heuristic
Verified: docusaurus build succeeds, new page renders, all 3 cross-links
resolve. 347/347 targeted tests pass (tests/providers/,
tests/hermes_cli/test_plugins.py, tests/hermes_cli/test_runtime_provider_resolution.py,
tests/run_agent/test_provider_parity.py).
* docs(plugins): add 'pluggable interfaces at a glance' maps to plugins.md + build-a-hermes-plugin
Devs landing on either the user-guide plugin page or the build-a-plugin
guide now get an upfront table of every distinct pluggable surface with
a link to the right authoring doc. Previously they'd have to read the
full general-plugin guide to discover that model providers / platforms
/ memory / context engines are separate systems.
user-guide/features/plugins.md:
- New 'Pluggable interfaces — where to go for each' section below the
existing 4-kinds table
- 10 rows covering every register_* surface (tool, hook, slash command,
CLI subcommand, skill, model provider, platform, memory, context
engine, image-gen)
- Explicit note: TTS/STT are NOT plugin-extensible yet — documented
with a pointer to the current config.yaml 'command providers' pattern
and a note that register_tts_provider()/register_stt_provider() may
come later
guides/build-a-hermes-plugin.md:
- New :::info 'Not sure which guide you need?' map at the top so devs
see all pluggable interfaces before investing in this 737-line
general-plugin walkthrough
- Existing bottom :::tip expanded to include platform adapters alongside
model/memory/context plugins
Verified:
- All 8 cross-doc links in the new plugins.md table resolve in a
docusaurus build (SUCCESS, no new broken links)
- TTS link corrected (features/voice → features/tts; latter exists)
- Pre-existing broken links/anchors (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist) are unchanged
* docs(plugins): correct TTS/STT pluggability \u2014 they ARE plugins (command-providers)
Previous commit incorrectly said TTS/STT 'aren't plugin-extensible'. They
are, via the config-driven command-provider pattern \u2014 any CLI that reads
text and writes audio (or vice versa for STT) is automatically a plugin
with zero Python. The tts.md docs cover this extensively and I missed it.
plugins.md:
- TTS row: 'Config-driven (not a Python plugin)', points at
tts.md#custom-command-providers
- STT row: points at tts.md#voice-message-transcription-stt (STT docs
live in tts.md despite the filename)
- Expanded note: TTS/STT use config-driven shell-command templates as
their plugin surface (full tts.providers.<name> registry for TTS;
HERMES_LOCAL_STT_COMMAND escape hatch for STT)
- Any CLI that reads/writes files is automatically a plugin \u2014 no Python
register_* API needed
- Future register_tts_provider()/register_stt_provider() hooks mentioned
as nice-to-have for SDK/streaming cases, not as the primary story
build-a-hermes-plugin.md:
- Same map update: TTS/STT rows explicit, footer note corrected
Verified:
- tts.md anchors (custom-command-providers, voice-message-transcription-stt)
exist and resolve in docusaurus build (SUCCESS, no new broken links)
* docs(plugins): expand pluggable interfaces table with MCP / event hooks / shell hooks / skill taps
Broadened the scope beyond Python register_* hooks. Hermes has MULTIPLE
plugin-style extension surfaces; they're now all in one table instead of
being scattered across feature docs.
Added rows for:
- **MCP servers** — config.yaml mcp_servers.<name> auto-registers external
tools from any MCP server. Huge extensibility surface, previously not
linked from the plugin map.
- **Gateway event hooks** — drop HOOK.yaml + handler.py into
~/.hermes/hooks/<name>/ to fire on gateway:startup, session:*, agent:*,
command:* events. Separate from Python plugin hooks.
- **Shell hooks** — hooks: block in config.yaml runs shell commands on
events (notifications, auditing, etc.).
- **Skill sources (taps)** — hermes skills tap add <repo> to pull in new
skill registries beyond the built-in sources.
Both docs updated:
- user-guide/features/plugins.md: table column renamed to 'How' (mixes
Python API + config-driven + drop-in-dir surfaces accurately)
- guides/build-a-hermes-plugin.md: :::info map at top mirrors the new
surfaces with a forward-link to the consolidated table
Note block rewritten: instead of singling out TTS/STT as the 'different
style' exception, now honestly describes that Hermes deliberately
supports three plugin styles — Python APIs, config-driven commands, and
drop-in manifest directories — and devs should pick the one that fits
their integration.
Not included (considered and rejected):
- Transport layer (register_transport) — internal, not user-facing
- Tool-call parsers — internal, VLLM phase-2 thing
- Cloud browser providers — hardcoded registry, not drop-in yet
- Terminal backends — hardcoded if/elif, not drop-in yet
- Skill sources (the ABC) — hardcoded list, only taps are user-extensible
Verified:
- All 5 new anchors resolve (gateway-event-hooks, shell-hooks, skills-hub,
custom-command-providers, voice-message-transcription-stt)
- Docusaurus build SUCCESS, zero new broken links
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): cover every pluggable surface in both the overview and how-to
Both plugins.md and build-a-hermes-plugin.md now cover every extension
surface end-to-end \u2014 general plugin APIs, specialized plugin types,
config-driven surfaces \u2014 with concrete authoring patterns for each.
plugins.md:
- 'What plugins can do' table grows from 9 rows (general ctx.register_*
only) to 14 rows covering register_platform, register_image_gen_provider,
register_context_engine, MemoryProvider subclass, register_provider
(model). Each row links to its full authoring guide.
- New 'Plugin sub-categories' section under Plugin Discovery explains
how plugins/platforms/, plugins/image_gen/, plugins/memory/,
plugins/context_engine/, plugins/model-providers/ are routed to
different loaders \u2014 PluginManager vs the per-category own-loader
systems.
- Explicit mention of user-override semantics at
~/.hermes/plugins/model-providers/ and ~/.hermes/plugins/memory/.
build-a-hermes-plugin.md:
- New '## Specialized plugin types' section (5 sub-sections):
- Model provider plugins \u2014 ProviderProfile + plugin.yaml example,
auto-wiring summary, link to full guide
- Platform plugins \u2014 BasePlatformAdapter + register_platform() skeleton
- Memory provider plugins \u2014 MemoryProvider subclass example
- Context engine plugins \u2014 ContextEngine subclass example
- Image-generation backends \u2014 ImageGenProvider + kind: backend example
- New '## Non-Python extension surfaces' section (5 sub-sections):
- MCP servers \u2014 config.yaml mcp_servers.<name> example
- Gateway event hooks \u2014 HOOK.yaml + handler.py example
- Shell hooks \u2014 hooks: block in config.yaml example
- Skill sources (taps) \u2014 hermes skills tap add example
- TTS / STT command templates \u2014 tts.providers.<name> with type: command
- Distribute via pip / NixOS promoted from ### to ## (they were orphaned
after the reorganization)
Each specialized / non-Python section has a concrete, copy-pasteable
example plus a 'Full guide:' link to the authoritative doc. Devs arriving
at the build-a-hermes-plugin guide now see every extension surface at
their disposal, not just the general tool/hook/slash-command surface.
Verified:
- Docusaurus build SUCCESS, zero new broken links
- All new cross-links (developer-guide/model-provider-plugin,
adding-platform-adapters, memory-provider-plugin, context-engine-plugin,
user-guide/features/mcp, skills#skills-hub, hooks#gateway-event-hooks,
hooks#shell-hooks, tts#custom-command-providers,
tts#voice-message-transcription-stt) resolve
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
adding-platform-adapters#step-by-step-checklist)
* docs(plugins): fix opt-in inconsistency — not every plugin is gated
The 'Every plugin is disabled by default' statement was wrong. Several
plugin categories intentionally bypass plugins.enabled:
- Bundled platform plugins (IRC, Teams) auto-load so shipped gateway
channels are available out of the box. Activation per channel is via
gateway.platforms.<name>.enabled.
- Bundled backends (plugins/image_gen/*) auto-load so the default
backend 'just works'. Selection via <category>.provider config.
- Memory providers are all discovered; one is active via memory.provider.
- Context engines are all discovered; one is active via context.engine.
- Model providers: all 33 discovered at first get_provider_profile();
user picks via --provider / config.
The plugins.enabled allow-list specifically gates:
- Standalone plugins (general tools/hooks/slash commands)
- User-installed backends
- User-installed platforms (third-party gateway adapters)
- Pip entry-point backends
Which matches the actual code in hermes_cli/plugins.py:737 where the
bundled+backend/platform check bypasses the allow-list.
Rewrote '## Plugins are opt-in' to:
- Retitle to 'Plugins are opt-in (with a few exceptions)'
- Narrow opening claim to 'General plugins and user-installed backends
are disabled by default'
- Added 'What the allow-list does NOT gate' subsection with a full
table of which bypass the gate and how they're activated instead
- Fixed migration section wording (bundled platform/backend plugins
never needed grandfathering)
Verified: docusaurus build SUCCESS, zero new broken links.
Replaces the 22-line stub with a ~320-line guide covering the parts of the
Windows/WSL2 split that specifically affect Hermes users:
- Why WSL2 (and not native Windows)
- Install: distro choice, WSL1→2, systemd via /etc/wsl.conf
- Filesystem boundary: /mnt/c vs \\wsl$, perf/perms/watchers/case,
wslpath/wslview, CRLF + git core.autocrlf, clone-where guidance
- Networking in both directions:
- WSL → Windows services: links to the canonical WSL2 Networking section
in integrations/providers.md (mirrored mode, NAT + host IP, bind addr,
firewall) instead of duplicating
- Windows/LAN → Hermes in WSL: mirrored vs NAT, netsh portproxy one-liner,
firewall rule, webhook tunneling pointer
- Long-running services: systemd gateway + Task Scheduler wsl.exe --exec
'sleep infinity' to keep the VM alive at login
- GPU passthrough: NVIDIA works, AMD/Intel out of matrix
- Common pitfalls: connection refused, /mnt/c slowness, CRLF ^M,
UNC warnings, post-sleep clock drift, mirrored-mode DNS with VPN,
PATH, Defender scanning, VHDX disk reclaim
All internal links use site-absolute /docs/... form (matches the rest of
user-guide/); all seven link targets verified to exist.