website/src/pages/skills/index.tsx imports ../../data/skills.json, but
that file is git-ignored and generated at build time by
website/scripts/extract-skills.py. CI workflows (deploy-site.yml,
docs-site-checks.yml) run the script explicitly before 'npm run build',
so production and PR checks always work — but 'npm run build' on a
contributor's machine fails with:
Module not found: Can't resolve '../../data/skills.json'
because the extraction step was never wired into the npm scripts.
Adds a prebuild/prestart hook that runs extract-skills.py automatically.
If python3 or pyyaml aren't installed locally, writes an empty
skills.json instead of hard-failing — the Skills Hub page renders with
an empty state, the rest of the site builds normally, and CI (which
always has the deps) still generates the full catalog for production.
Fills the three gaps left by the orchestrator/width-depth salvage:
- configuration.md §Delegation: max_concurrent_children, max_spawn_depth,
orchestrator_enabled are now in the canonical config.yaml reference
with a paragraph covering defaults, clamping, role-degradation, and
the 3x3x3=27-leaf cost scaling.
- environment-variables.md: adds DELEGATION_MAX_CONCURRENT_CHILDREN to
the Agent Behavior table.
- features/delegation.md: corrects stale 'default 5, cap 8' wording
(that was from the original PR; the salvage landed on default 3 with
no ceiling and a tool error on excess instead of truncation).
Page prompts are written in Step 5 from the text descriptions in
characters/characters.md — the PNG sheet generated in Step 7.1
cannot be used to write them. Reposition the PNG as a human-facing
review artifact (and reference for later regenerations / manual
edits), and drop the confusing "Character sheet | Strategy" tables
since the embedding rule is uniform.
- Remove PDF merge feature and scripts/ directory (no pdf-lib dep)
- Correct image_generate docs: prompt-only, returns URL; add
curl download step after every call
- Downgrade reference images to text-based trait extraction
(style/palette/scene); character sheet is agent-facing reference
- Unify source file naming on source-{slug}.md across SKILL.md
and workflow.md
Port the upstream baoyu-comic skill to Hermes' tool ecosystem, matching
the earlier baoyu-infographic adaptation:
- metadata namespace openclaw -> hermes (+ tags, homepage)
- drop EXTEND.md preferences system (references/config/ removed,
workflow Step 1.1 removed)
- user prompts via clarify (one question at a time) instead of
AskUserQuestion batches
- image generation via image_generate instead of baoyu-imagine, with
aspect-ratio mapping to landscape/portrait/square
- Windows/PowerShell/WSL shell snippets dropped
- file I/O referenced via Hermes write_file/read_file tools
- CLI-style --flags converted to natural-language options and
user-intent cues (skill matching has no slash command trigger)
Add PORT_NOTES.md documenting the adaptations and a sync procedure.
Art-style/tone/layout reference files are preserved verbatim from
upstream v1.56.1.
A single global MAX_TEXT_LENGTH = 4000 truncated every TTS provider at
4000 chars, causing long inputs to be silently chopped even though the
underlying APIs allow much more:
- OpenAI: 4096
- xAI: 15000
- MiniMax: 10000
- ElevenLabs: 5000 / 10000 / 30000 / 40000 (model-aware)
- Gemini: ~5000
- Edge: ~5000
The schema description also told the model 'Keep under 4000 characters',
which encouraged the agent to self-chunk long briefs into multiple TTS
calls (producing 3 separate audio files instead of one).
New behavior:
- PROVIDER_MAX_TEXT_LENGTH table + ELEVENLABS_MODEL_MAX_TEXT_LENGTH
encode the documented per-provider limits.
- _resolve_max_text_length(provider, cfg) resolves:
1. tts.<provider>.max_text_length user override
2. ElevenLabs model_id lookup
3. provider default
4. 4000 fallback
- text_to_speech_tool() and stream_tts_to_speaker() both call the
resolver; old MAX_TEXT_LENGTH alias kept for back-compat.
- Schema description no longer hardcodes 4000.
Tests: 27 new unit + E2E tests; all 53 existing TTS tests and 253
voice-command/voice-cli tests still pass.
After the prior inline-diff fix, the gateway still prepends a literal
" ┊ review diff" line to inline_diff (it's terminal chrome written by
`_emit_inline_diff`). Wrapping that in a ```diff fence left that header
inside the code block. The agent also often narrates its own edit in a
second fenced diff, so the assistant message ended up stacking two
diff blocks for the same change.
- Strip the leading "┊ review diff" header from queued inline diffs
before fencing.
- Skip appending the fenced diff entirely when the assistant already
wrote its own ```diff (or ```patch) fence.
Keeps the single-surface diff UX even when the agent is chatty.
When tool.complete already carries inline_diff, the assistant message owns the full diff block. Suppress the tool-row summary/detail in that case so the turn shows one detailed diff surface instead of a rich diff plus a duplicated tool-detail payload.
Avoid duplicate diff rendering in #13729 flow. We now skip queued inline diffs that are already present in final assistant text and dedupe repeated queued diffs by exact content.
Follow-up for #13729: segment-level system artifacts still looked detached in real flow.\n\nInstead of appending inline_diff as a standalone segment/system row, queue sanitized diffs during tool.complete and append them as a fenced diff block to the assistant completion text on message.complete. This keeps the diff in the same message flow as the assistant response.
Follow-up on multiline arrow behavior: Up/Down now fall back to queue/history whenever there is no logical line above/below the caret (not only at absolute start/end character positions). This makes Up from the end of the top line cycle history, matching expected readline-ish behavior.
Follow-up on #13724: showing literally every source was too noisy.\n\n now fetches a wider window (, larger limit) and then filters to a curated allowlist of human-facing sources (tui/cli plus chat adapters like telegram/discord/slack/whatsapp/etc). This keeps row #7 fixed (telegram sessions visible in /resume) without surfacing internal source kinds such as tool/acp.
Follow-up on #13729 from blitz screenshot feedback.\n\n- When tool.complete carried inline_diff but no buffered assistant text existed, pending tool rows were still in streamPendingTools, so diff rendered above the tool row section. appendSegmentMessage now emits pending tool rows as a trail segment before appending the diff artifact.\n- Strip ANSI color escapes from inline_diff payloads so we don't render loud red/green terminal palettes in the transcript.
Follow-up on #13726 from blitz feedback: Up/Down history cycling should only trigger when the caret is at the start/end boundary (or the input is empty).\n\nPreviously useInputHandlers intercepted arrows whenever inputBuf was empty, which still stole Up/Down from normal multiline editing. textInput now publishes caret position through inputSelectionStore even with no active selection, and useInputHandlers gates history/queue cycling on those boundaries.
* feat(models): hide OpenRouter models that don't advertise tool support
Port from Kilo-Org/kilocode#9068.
hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.
Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).
Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.
Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.
* feat(delegate): cross-agent file state coordination for concurrent subagents
Prevents mangled edits when concurrent subagents touch the same file
(same process, same filesystem — the mangle scenario from #11215).
Three layers, all opt-out via HERMES_DISABLE_FILE_STATE_GUARD=1:
1. FileStateRegistry (tools/file_state.py) — process-wide singleton
tracking per-agent read stamps and the last writer globally.
check_stale() names the sibling subagent in the warning when a
non-owning agent wrote after this agent's last read.
2. Per-path threading.Lock wrapped around the read-modify-write
region in write_file_tool and patch_tool. Concurrent siblings on
the same path serialize; different paths stay fully parallel.
V4A multi-file patches lock in sorted path order (deadlock-free).
3. Delegate-completion reminder in tools/delegate_tool.py: after a
subagent returns, writes_since(parent, child_start, parent_reads)
appends '[NOTE: subagent modified files the parent previously
read — re-read before editing: ...]' to entry.summary when the
child touched anything the parent had already seen.
Complements (does not replace) the existing path-overlap check in
run_agent._should_parallelize_tool_batch — batch check prevents
same-file parallel dispatch within one agent's turn (cheap prevention,
zero API cost), registry catches cross-subagent and cross-turn
staleness at write time (detection).
Behavior is warning-only, not hard-failing — matches existing project
style. Errors surface naturally: sibling writes often invalidate the
old_string in patch operations, which already errors cleanly.
Tests: tests/tools/test_file_state_registry.py — 16 tests covering
registry state transitions, per-path locking, per-path-not-global
locking, writes_since filtering, kill switch, and end-to-end
integration through the real read_file/write_file/patch handlers.
Reported during TUI v2 blitz retest: code-review diffs from tool.complete
appeared at the top of the current interaction thread, out of sequence
with the agent's messages and tool rows below them.
Root cause — `sys(inline_diff)` appends to `historyItems`, which sits
above the `StreamingAssistant` pane that renders the active turn.
Until the turn closed, the diff visually floated above everything
else happening in the same turn.
Route the diff through `turnController.appendSegmentMessage` instead
so it flushes any pending streaming text first, then lands in the
segment stream beside assistant output and tool calls. On
`message.complete` the segment list is committed to history in emit
order (diff → final text), matching what the gateway sent.
Adds a regression test that exercises tool.complete → message.complete
with an inline_diff payload and asserts both the streaming and final
placement.
Reported during TUI v2 blitz retest: `/history` in the TUI only shows
prompts from non-TUI Hermes runs and can't scroll the window. Root
cause is the slash-worker subprocess: it's a detached HermesCLI that
never sees the TUI's turns, so its `conversation_history` starts empty
and `show_history` surfaces whatever was persisted from earlier CLI
sessions — not what the user just did inside the TUI.
Intercept `/history` as a local slash command so it dumps
`ctx.local.getHistoryItems()` — the TUI's own transcript — routed
through the pager (which scrolls after #13591). Accepts an optional
preview-length argument (default 400 chars per message).
Adds createSlashHandler coverage.
Reported during TUI v2 blitz retest: typing a multi-line message with
shift-Enter and then pressing Up to edit an earlier line swapped the
whole buffer for the previous history entry instead of moving the
cursor up a line. Down then restored the draft → the buffer appeared
to "flip" between the draft and a prior prompt.
`useInputHandlers` cycles history on Up/Down, but textInput only
checked `inputBuf.length` — that only counts lines committed with a
trailing backslash, not shift-Enter newlines inside `input` itself.
Fix: detect logical lines inside the input string and move the cursor
one line up/down preserving column offset (clamp to line end when the
destination is shorter, standard editor behavior). Only fall through
to history cycling when the cursor is already on the first line (Up)
or last line (Down).
Adds unit coverage for the new `lineNav` helper.
Reported during TUI v2 blitz retest: /resume modal only surfaced tui/cli
rows, even though `hermes --tui --resume <id>` with a pasted telegram
session id works fine. The handler double-fetched with explicit
`source="tui"` and `source="cli"` filters and dropped everything else on
the floor.
Drop the filter — list_sessions_rich(source=None) already excludes
child sessions (subagents, compression continuations) via its default,
and users want to resume messenger sessions from inside the TUI.
Adds gateway regression coverage.
- Drop the outer no-op capture group from INLINE_RE and restructure the
source as an ordered list of patterns-with-index-comments so each
alternative is individually greppable. Shift group indices in MdInline
down by one accordingly.
- Inline single-use helpers (parseFence, isFenceClose, isMarkdownFence,
trimBareUrl) and intermediate variables (path, lang, raw, prefix, body,
depth, task body, setext match, etc.).
- Hoist block-level regexes used inside MdImpl (FENCE_CLOSE_RE, SETEXT_RE,
BULLET_RE, TASK_RE, NUMBERED_RE, QUOTE_RE) to top-level consts so
they're compiled once instead of per-line.
- Collapse the duplicate compact-vs-normal blank-line branches into one
if/!compact gap call.
- Move Fence and MdProps types to the bottom per house style.
- Shorten splitTableRow → splitRow and use optional chaining in a few
match sites.
No behavior change; 162/162 tests pass. Net -22 LoC.
The inline markdown regex had `~([^~\s][^~]*?)~` for Pandoc-style subscript
(H~2~O, CO~2~). On models that decorate prose with kaomoji like `thing ~!`
and `cool ~?` — Kimi especially — the opener `~!` paired with the next
stray `~` on the line and dim-formatted everything between them with a
leading `_` character, mangling markdown output.
Tighten the pattern to short alphanumeric-only content (`~[A-Za-z0-9]{1,8}~`)
since real subscript never contains punctuation, spaces, or long runs.
Same tightening applied to stripInlineMarkup so width measurement stays
consistent. Classic CLI was unaffected because it renders these literally.
Three additive conventions inspired by github.com/atomicmemory/llm-wiki-compiler:
- Paragraph-level provenance: `^[raw/articles/source.md]` markers on pages synthesizing 3+ sources, so readers can trace individual claims without re-reading full source files.
- Raw source content hashing: `sha256:` in raw/ frontmatter enables re-ingest drift detection — skip unchanged sources, flag changed ones.
- Optional `confidence` and `contested` frontmatter fields let lint surface weak or disputed claims without re-reading every page's prose.
Lint gains two new checks (quality signals, source drift) and one expanded check (contradictions now surfaces frontmatter-flagged pages).
Also adds a Related Tools section pointing users who want batch/scheduled compilation at llm-wiki-compiler (Obsidian-compatible, works on the same vault).
All additions are opt-in — existing wikis need no migration. Skill version 2.0.0 -> 2.1.0.
Revert two overreaches from #13699 that forced paid Nous vision to
xiaomi/mimo-v2-omni instead of the tier-appropriate gemini-3-flash-preview:
1. Remove "nous": "xiaomi/mimo-v2-omni" from _PROVIDER_VISION_MODELS —
#13696 already routes nous main-provider vision through the strict
backend, and this entry caused any direct resolve_provider_client(
"nous", ...) aggregator-lookup path to pick the wrong model for paid.
2. Drop the 'elif vision' paid override in _try_nous() that forced
mimo-v2-omni on every Nous vision call regardless of tier. Paid
accounts now keep gemini-3-flash-preview for vision as well as text.
Free-tier behavior unchanged: still uses mimo-v2-omni for vision,
mimo-v2-pro for text (check_nous_free_tier() branch).
E2E verified:
paid vision → google/gemini-3-flash-preview
free vision → xiaomi/mimo-v2-omni
paid text → google/gemini-3-flash-preview
free text → xiaomi/mimo-v2-pro
Two unit tests that pin down the threading.local semantics the CLI freeze
fix (#13617 / #13618) relies on:
- main-thread registration must be invisible to child threads (documents
the underlying bug — if this ever starts passing visible, ACP's
GHSA-qg5c-hvr5-hjgr race has returned)
- child-thread registration must be visible from that same thread AND
cleared by the finally block (documents the fix pattern used by
cli.py's run_agent closure and acp_adapter/server.py)
Pairs with the fix in the preceding commit by @Societus.
Two bugs that allow dangerous commands to execute without informed user consent.
TUI (Ink): useInputHandlers consumes the isBlocked return path, but Ink's
EventEmitter delivers keystrokes to ALL registered useInput listeners. The
ApprovalPrompt component receives arrow keys, number keys, and Enter even
though the overlay appears frozen. The user sees no visual feedback, but
keystrokes are processed — allowing blind approval, session-wide auto-approve
(choice "session"), or permanent allowlist writes (choice "always") without
the user knowing.
Discovered while replicating #13618 (TUI approval overlay freezes terminal).
Fix: in useInputHandlers, when overlay.approval/clarify/confirm is active,
only intercept Ctrl+C. All other keys pass through. This makes the overlay
visually responsive so the user can see what they are selecting.
CLI (prompt_toolkit): _callback_tls in terminal_tool.py is threading.local().
set_approval_callback() is called in the main thread during run(), but the
agent executes in a background thread. _get_approval_callback() returns None
in the agent thread, falling back to stdin input() which prompt_toolkit
blocks. The user sees the approval text but cannot respond — the terminal is
unusable until the 60s timeout expires with a default "deny".
Fix: set callbacks inside run_agent() (the thread target), matching the
pattern already used by acp_adapter/server.py. Clear on thread exit to avoid
stale references.
Closes#13618
Two changes:
1. _PROVIDER_VISION_MODELS: add 'nous' -> 'xiaomi/mimo-v2-omni' entry
so the vision auto-detect chain picks the correct multimodal model.
2. resolve_provider_client: detect when the requested model is a vision
model (from _PROVIDER_VISION_MODELS or known vision model names) and
pass vision=True to _try_nous(). Previously, _try_nous() was always
called without vision=True in resolve_provider_client(), causing it to
return the default text model (gemini-3-flash-preview or mimo-v2-pro)
instead of the vision-capable mimo-v2-omni.
The _try_nous() function already handled free-tier vision correctly, but
the resolve_provider_client() path (used by the auto-detect vision chain)
never signaled that a vision task was in progress.
Verified: xiaomi/mimo-v2-omni returns HTTP 200 with image inputs on Nous
inference API. google/gemini-3-flash-preview returns 404 with images.
The 'subagents know nothing' warning and the 'no conversation history'
constraint both said the user provides the goal/context fields. In
practice the LLM parent agent calls delegate_task; the user configures
the feature but doesn't write delegation calls. Rewording to point at
the parent agent matches how the tool actually works.
Adds role='leaf'|'orchestrator' to delegate_task. With max_spawn_depth>=2,
an orchestrator child retains the 'delegation' toolset and can spawn its
own workers; leaf children cannot delegate further (identical to today).
Default posture is flat — max_spawn_depth=1 means a depth-0 parent's
children land at the depth-1 floor and orchestrator role silently
degrades to leaf. Users opt into nested delegation by raising
max_spawn_depth to 2 or 3 in config.yaml.
Also threads acp_command/acp_args through the main agent loop's delegate
dispatch (previously silently dropped in the schema) via a new
_dispatch_delegate_task helper, and adds a DelegateEvent enum with
legacy-string back-compat for gateway/ACP/CLI progress consumers.
Config (hermes_cli/config.py defaults):
delegation.max_concurrent_children: 3 # floor-only, no upper cap
delegation.max_spawn_depth: 1 # 1=flat (default), 2-3 unlock nested
delegation.orchestrator_enabled: true # global kill switch
Salvaged from @pefontana's PR #11215. Overrides vs. the original PR:
concurrency stays at 3 (PR bumped to 5 + cap 8 — we keep the floor only,
no hard ceiling); max_spawn_depth defaults to 1 (PR defaulted to 2 which
silently enabled one level of orchestration for every user).
Co-authored-by: pefontana <fontana.pedro93@gmail.com>
Models frequently emit bare codepoints like U+26A0 (⚠), U+2139 (ℹ),
U+2764 (❤), U+2714 (✔), U+2600 (☀), U+263A (☺) which, per Unicode, have
Emoji_Presentation=No and render as monochrome text-style glyphs in
terminals unless followed by VS16 (U+FE0F). Agent output leaked through
the TUI like `⚠ careful` instead of `⚠️ careful`.
Added `ensureEmojiPresentation` (lib/emoji.ts): scans for the curated
set of text-default codepoints and appends VS16 when the next char is
not already VS16, ZWJ, or a keycap-enclosing mark. Idempotent and
fast-pathed by a Unicode-range regex so ASCII-heavy text is untouched.
Applied once at the top of `Md`'s line parse. Hermes-ink's stringWidth
already accounts for VS16, so cursor/layout stays correct.
PDFs emitted by tools (report generators, document exporters, etc.) now
deliver as native attachments when wrapped in MEDIA: — same as images,
audio, and video.
Bare .pdf paths are intentionally NOT added to extract_local_files(), so
the agent can still reference PDFs in text without auto-sending them.
The prior form of this test asserted on CLI_CONFIG["delegation"] after
importing cli, which only passed by accident of pytest-xdist worker
scheduling. cli._hermes_home is frozen at module import time (cli.py:76),
before the tests/conftest.py autouse HERMES_HOME-isolation fixture can
fire, so CLI_CONFIG ends up populated by deep-merging the contributor's
actual ~/.hermes/config.yaml over the defaults (cli.py:359-366). Any
contributor (like me) who still has the legacy key set in their own
config causes a false failure the moment another test file in the same
xdist worker imports cli at module level.
Asserting on the source of load_cli_config() instead sidesteps all of
that: the test now checks the defaults literal directly and is
independent of user config, HERMES_HOME, import order, and worker
scheduling.
Demonstrated failure mode before this fix:
pytest tests/hermes_cli/test_config_drift.py \
tests/hermes_cli/test_skills_hub.py -o addopts=""
-> FAILED (CLI_CONFIG["delegation"] contained "default_toolsets"
from the user's ~/.hermes/config.yaml)
Part of Initiative 2 / M0.5.