Skills discovery surfaced ~136 of 88k skills in the CLI and gave community
skills no clickable source on the docs page. Three coupled fixes:
CLI browse:
- hermes skills browse capped at 50 because the per-source limit dict had no
'hermes-index' key — when the centralized index is available the router
skips external APIs and serves only the index, so the default-50 fallthrough
silently truncated the whole hub. Add hermes-index: 5000. Browse now loads
5367 (269 pages) instead of 136.
- Add an Identifier column + install/inspect hint to the browse table so users
can act on what they see without a second 'search'.
- Route the TUI browse_skills() helper through parallel_search_sources so it
inherits the same index-aware source-skip (was double-counting); expose
identifier in its output.
Docs Skills Hub page:
- Synthesize a sourceUrl for every community skill (github tree URL, clawhub /
skills.sh / lobehub / browse.sh detail pages), preferring the adapter's
explicit extra.detail_url/source_url/repo_url. Expanded cards now show
'View source' for community skills (was nothing) and keep 'View full
documentation' for built-in/optional. 99% coverage.
- Add a Copy button on the install command.
- Add a loading state instead of flashing '0 skills / No skills found' while
the 45MB catalog fetches.
Category cleanup:
- _guess_category fell back to tags[0] verbatim, producing ~430 junk one-off
categories (version strings, brand names: '0.10.7 Dev', 'Doramagic Crystal').
Now only curated buckets are accepted; unknowns fold into 'Other'. Widen the
tag->category map so common community tags route to real buckets. 430 -> 173
categories, top 20 all meaningful.
Tests: tests/website/test_extract_skills.py covers _source_url synthesis +
precedence and _guess_category curation (13 tests). All 27 skills-hub CLI
tests still pass. Docusaurus build verified; expanded cards confirmed in
browser for both community (View source) and built-in (View full docs).
Salvages 8 distinct fixes from a batch of PRs by @kyssta-exe, reapplied
onto current main (original branches were stale) with a few refinements.
- cron(jobs.py): load_jobs() validates top-level JSON shape — a bare
list auto-repairs into the {"jobs": [...]} dict; scalars/null raise a
clear RuntimeError instead of an uncaught AttributeError that took
down the whole cron subsystem (#37065, closes#36867).
- web(web_server.py): close the per-action log file handle after Popen
so the parent stops leaking one fd per spawned action (#36843).
- web(web_server.py): DELETE /api/env returns 400 for invalid key names
instead of a misleading 500, mirroring PUT /api/env (#36840).
- gateway(gateway.py): read /proc/<pid>/cmdline inside a with-block so
the fd is released immediately instead of relying on GC (#36804).
- web-tools(web_tools.py): include "xai" in check_web_api_key() so a
configured X.AI web backend reports as available (#36802).
- compression(conversation_compression.py): mark the feasibility check
done only after it completes, and default the gate to "not checked"
if the attribute is missing (#36803).
- completion(completion.py): replace `ls` with directory globbing in the
generated bash/zsh/fish profile listers — handles names with spaces
and skips non-directory entries (#36806).
- terminal-tool(terminal_tool.py): drop a duplicate `import threading`
(#36808).
- claw(claw.py): the migrate recommendation now points at the real
`hermes gateway stop` command instead of the non-existent
`hermes stop` (#36795, #36796, closes#36771).
- tests: guard against a leaked HERMES_CRON_SESSION breaking gateway
approval tests — add it to the hermetic conftest unset list (root
cause, protects every test) and pop it in the affected test's
setup_method (#36796).
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
Reworks the content-type preflight so a misconfigured HTTP MCP url (a web-app
root serving HTML) fails in <1s instead of hanging the full 60s connect_timeout
— and does so non-retryably, which neither original PR achieved.
- Allow-list detection (application/json, text/event-stream) instead of a
text/html-only denylist — catches text/plain, application/xml, etc.
- New NonMcpEndpointError(ConnectionError); run() catches it in the same
top-level fast-fail block as InvalidMcpUrlError, so it returns before the
reconnect-backoff loop (truly non-retryable) and the probe runs once, not
on every reconnect.
- Probe runs on its own httpx client OUTSIDE the SDK anyio task group, so the
error propagates as itself rather than wrapped in an ExceptionGroup (the
trap that made the in-SDK event-hook approach a no-op).
- Forwards ssl_verify + client_cert + headers; HEAD->GET fallback on 405/501;
best-effort pass-through on missing content type, non-2xx, and network
errors; skips SSE transport. CancelledError is never swallowed.
- Replaces the malformed test file (which never imported the real method and
failed CI) with 21 tests driving the actual _preflight_content_type against
a real local HTTP server, plus full run() integration verifying <1s
non-retryable failure.
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: uzunkuyruk <egitimviscara@gmail.com>
A misconfigured MCP server URL that returns text/html (e.g. pointing at
a web app root instead of an MCP endpoint) causes the MCP SDK to block
for the full connect_timeout (default 60 s) before surfacing
CancelledError.
Add a lightweight HEAD pre-flight check that detects text/html responses
in ≤5 s and raises ConnectionError with an actionable message. Non-HTML
responses, missing headers, and network errors pass through silently so
the normal MCP handshake proceeds unaffected.
Fixes#36052
* feat(tui): single /model command + unified Sessions overlay
Collapse the redundant `/provider` alias so `/model` is the only name
everywhere (it already drove the same 2-step ModelPicker in the TUI).
Merge the separate `/resume` (cold history browser) and `/sessions` (live
switcher) surfaces into one Sessions overlay reached by `/resume`,
`/sessions`, `/session`, and `/switch`. It pins a "+ new" row at the top
(always visible), lists live sessions with status, and lists resumable
history below — dispatching session.activate for live rows vs resume for
cold ones, with close/delete in place. Fixes `/session` opening an empty
live-only switcher and the hidden new-session affordance.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(tui): address Copilot review on the Sessions overlay
- Track the armed history-delete by session id instead of row index so the
1.5s live-status poll re-indexing rows can't redirect the second `d` to a
different session.
- Re-add the busy-session guard to immediate `/resume <id>` and `/sessions new`
actions (browsing the bare overlay stays allowed) so resuming/switching can't
corrupt an in-flight turn's streaming/busy state.
* fix(tui): guard cold-resume (not live-switch/new) from the Sessions overlay
Copilot flagged that overlay actions bypassed the busy guard. Only cold
resume actually closes the current session, so only it is guarded — both
from the slash path and now from the overlay (appActions.resumeById).
Switching between live sessions and starting a `+ new` live session keep
the current session running in the background, so they stay unguarded:
that concurrency is the orchestrator's whole purpose. Also dropped the
over-broad guard on `/sessions new` for the same reason.
* fix(tui): address Copilot review (history dedup + desktop /provider)
- The 1.5s poll now re-derives the resumable list from the RAW session.list
results (rawHistoryRef) against the current live set, so a session hidden
while live reappears in history once it closes — instead of being lost
until a full reload. Delete also prunes the raw ref.
- Drop the dead `/provider` entry from the desktop PICKER_OWNED_COMMANDS now
that the alias is gone, so the desktop client no longer advertises it.
* fix(tui): surface session.list errors + keep selection stable across polls
- A garbled session.list response now surfaces an error and preserves the
last good raw history, instead of silently blanking the resumable section.
- The 1.5s poll re-anchors the selection to the same row by session id
(live or history) when the live list grows/shrinks, so the highlight no
longer drifts to a different row mid-interaction.
* fix(tui): degrade session.list independently + cover overlay helpers
- Fetch active_list and session.list via Promise.allSettled so a failing
session.list no longer rejects the whole load: live sessions still render
and only the resumable history degrades (with an error).
- Add unit tests for the new helpers (sessionRowKindAt row ordering,
resumableHistory dedupe, sessionsCountLabel, relativeSessionAge).
* test(tui-gateway): assert /provider alias is gone, /model remains
The CI test_complete_slash_includes_provider_alias asserted the removed
`/provider` alias still autocompleted. Flip it to lock in the removal:
`/pro` no longer offers `provider`, and `/mod` still completes `model`.
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Inside an s6 container, `gateway run` redirects to the supervised
gateway and then keeps the CMD process alive as a no-op heartbeat so
/init doesn't start stage-3 shutdown. That heartbeat is
`os.execvp("sleep", ["sleep", "infinity"])`, which does a PATH lookup
for the `sleep` binary. When PATH was empty/truncated/clobbered at that
point — e.g. after user customizations rewrote PATH, or on a minimal
image without `sleep` on PATH — the exec raised FileNotFoundError,
killing the CMD process and causing /init to tear down every service:
the container failed to start (issue #36208, a regression in the s6
image from 2026.5.28).
Wrap the exec in try/except OSError: on success it still replaces the
process with the cheap `sleep` heartbeat (no resident Python
interpreter, and the existing process-tree/recursion contract is
preserved); on failure it falls back to `_block_until_terminated()` —
a SIGTERM handler (clean 128+signum exit on `docker stop`) plus a
signal.pause() loop, which needs no external binary and so can't fail
on PATH state. A threading.Event().wait() fallback covers platforms
without signal.pause().
Keeping execvp as the primary path (rather than replacing it outright)
preserves the `sleep infinity` heartbeat that the docker integration
tests assert (test_gateway_run_supervised.py) and avoids leaving a
full Python interpreter resident for the container's lifetime.
Verified end-to-end on a built image: with execvp forced to fail,
_block_until_terminated() blocks cleanly instead of raising
FileNotFoundError; normal boot still runs the cheap `sleep infinity`
heartbeat; the 6 test_gateway_run_supervised.py integration tests pass.
Salvages the two community fixes for this issue — the fallback design
from #36221 (@Pluviobyte) and the signal.pause() heartbeat from #36267
(@karmeleon) — and adds regression tests for both the normal and
sleep-missing paths.
Co-authored-by: Pluviobyte <Pluviobyte@users.noreply.github.com>
Co-authored-by: karmeleon <karmeleon@users.noreply.github.com>
Closes#36208.
* feat(desktop): session hygiene, archive, media streaming + connecting overlay
Address a batch of desktop feedback:
- Stop leaking empty "Untitled" sessions: the TUI gateway pre-created a DB
row on every session.create (i.e. every launch/draft). Persist the row
lazily on first prompt instead, and hide message-less rows in the sidebar.
- Archive/hide sessions: new `archived` column + set_session_archived, web
API (`?archived=` + PATCH archived), Ctrl/⌘-click and a context-menu item
in the sidebar, and an "Archived Chats" settings panel to restore/delete.
- Videos load via a streaming `hermes-media://` protocol instead of capped,
in-memory data URLs (16 MB limit) — bypasses the cap and supports seeking.
- Background-process completions route to the session that launched them:
the completion event now carries session_key and each poller only consumes
its own.
- Sidebar: "Group by workspace" toggle is always visible; each workspace
group gets a "+" to start a session in that directory; "New agent"/"Agents"
relabeled to "New session"/"Sessions".
- New gateway connecting overlay (ascii decode → fade out) replacing the bare
skeleton/"starting gateway" state.
* fix(desktop): bail connecting overlay on boot error
The shownRef latch kept the connecting overlay mounted behind
BootFailureOverlay after a hard boot failure. Return null on boot.error
so the failure recovery surface fully owns the screen.
* fix(desktop): address Copilot review
- /api/sessions: validate `archived` (400 on unknown) and return `archived`
as a JSON boolean instead of SQLite's 0/1.
- PATCH /api/sessions/{id}: 400 (not a misleading 404) when the body has no
updatable fields; stop conflating a no-op with "not found".
- hermes-media protocol: drop `bypassCSP` — streaming only needs
secure/standard/stream/supportFetchAPI.
- Sidebar workspace header: split the toggle and the "+" into sibling buttons
so we no longer nest interactive elements inside a <button>.
* fix(desktop): address Copilot re-review
- hermes-media protocol: restrict streaming to an audio/video extension
allowlist (415 otherwise) so it can't be used to read arbitrary local files.
- Connecting overlay: use z-[1200] instead of the non-standard z-1200 utility.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
#32049 reports that under terminal.backend: docker, write_file / patch
calls to authoritative profile state (SOUL.md, memories, etc.) land on
the sandbox-local mirror at
``<HERMES_HOME>/profiles/<name>/sandboxes/<backend>/<task>/home/.hermes/...``
— a path the host Hermes process never reads. The tool reports success,
the user sees no behavior change, and on disk two divergent copies of
SOUL.md (or any other profile file) accumulate.
The existing classify_cross_profile_target guard does not catch this:
its parts[2] check sees "sandboxes" and returns None, and the path is
in-profile from the inner-mirror perspective so even a fixed version
would not fire.
Add a parallel sandbox-mirror classifier in agent/file_safety:
* classify_sandbox_mirror_target() detects the
``…/sandboxes/<backend>/<task>/home/.hermes/…`` shape via path parts.
Detection is path-shape only — backend-agnostic, does not require
the file to exist, and works regardless of which HERMES_HOME resolves.
* get_sandbox_mirror_warning() returns a model-facing warning that
names the mirror root and the inner authoritative path the agent
likely meant.
Wire both detectors through tools/file_tools._check_cross_profile_path
so the existing write_file and v4a patch call sites pick up the new
guard with no API change. The bypass kwarg (``cross_profile=True``)
remains shared between the two guards — same "I know what I'm doing"
escape valve after explicit user direction.
This is the defense-in-depth piece of the proposal in #32049 ("any
…/sandboxes/<backend>/…/home/…hermes/… path as sandbox-mirror"). It
catches the host-side speculation case where the agent writes a literal
sandbox-mirror path. The inner-container case (where the bind mount
strips the ``sandboxes/`` prefix from the agent's path view) is out of
scope for this surgical change — that requires either a dispatch-layer
host-side check before the container handoff, or the host-side
``profile_state`` / ``soul`` tool the issue also proposes.
Soft guard, NOT a security boundary — matches the existing
classify_cross_profile_target contract.
Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
Co-authored-by: Ben Barclay <ben@nousresearch.com>
The extract pipeline (extract_media/extract_images/extract_local_files +
directive strips) can reduce a non-empty tool-using response to empty
text_content with no deliverable attachment. The 'if text_content' send
guard then silently skips delivery: a 'response ready' log with no
'Sending response', no error, and the answer never reaches the user.
- A2: snapshot the pre-extract response; when extraction yields empty text
and no image/local/media attachment, deliver the recovered original from
the post-extract_media body (so a spaced MEDIA path can't leak). Applies
on ALL platforms (supersedes the Discord-only #33842 and the unsafe
raw-fallback #29499).
- A3: loud delivery invariant - a non-empty response that produces nothing
deliverable logs response_delivery_dropped at ERROR; every recovery logs
response_delivery_recovered. No silent drop survives.
- Factor a _strip_media_directives helper for the [[...]] strips; MEDIA
stripping stays owned by extract_media, whose grammar handles spaced and
quoted paths.
- Salvaged + de-scoped the #33842 test harness to all platforms; added
unrecoverable-drop and no-leak regression tests.
A streamed preamble ("Let me search...") finalized at a tool boundary
routed through _try_fresh_final, which unconditionally set
_final_response_sent=True even though it is a NON-final segment. The
gateway then reads that flag as "final delivered" and suppresses the
genuine final answer produced on the next API call, so the user silently
gets nothing. Only reproduces with fresh_final_after_seconds > 0.
- _try_fresh_final / _send_or_edit take is_turn_final; the segment-break
call site passes is_turn_final=got_done so only the turn-final answer
marks final-delivered.
- _reset_segment_state clears the final-delivery flags at every tool
boundary as defense-in-depth against any future premature setter.
- Failing-first regression + happy-path no-duplicate test.
Wires the salvaged search helpers into the shared curses menu driver and
turns on type-to-filter for the CLI model pickers (the 100+ model lists
that previously required scrolling).
- Search lives in the shared `_run_curses_menu` driver behind a
`searchable` flag + `search_labels`, so both `curses_radiolist` and
`curses_single_select` get it without per-menu duplication. `/` opens
the filter, BACKSPACE edits, Ctrl+U clears, ESC clears the filter then
cancels. Returned values are always original item indices.
- `_filter_indices` RANKS matches (best-first) via a Python port of the
TS scorer in ui-tui/src/lib/fuzzy.ts and web/src/lib/fuzzy.ts. The port
is byte-identical in score: same per-char bonuses, prefix (+8) and
exact (+20) bonuses, camelCase/word-boundary detection (matching on the
lowercased target, boundary on the original case), and the -len*0.01
length tiebreak — so the CLI, TUI, and WebUI rank results identically.
A cross-language parity test pins the exact scores.
- `_prompt_model_selection` (the canonical picker across the model flows)
and the custom-provider model list pass `searchable=True`.
- Split `_decode_menu_key` out of `read_menu_key` so the search loop can
peek the raw key (catch `/`) before nav decoding.
- ESC during active search now clears the query (restores the full list)
so a no-match filter can't strand the user; printable-key capture is
restricted to ASCII to avoid Latin-1 mojibake.
- Update two setup-menu tests whose mock signatures predate the new
`searchable` kwarg; add ranked-scorer + parity + state-machine tests.
Pure, refactor-independent helpers for type-to-filter search in the
curses single-/radio-select menus: subsequence matching, filtered-index
mapping, cursor reconciliation, scroll clamping, and an active-search
key handler, plus unit tests.
Salvaged from #22758 (the curses event loop was since refactored into a
shared driver on main, so the integration is rebuilt in a follow-up
commit; these pure helpers and their tests carry over unchanged).
Port PR #29365's tool-surface contract test: terminal/file/execute_code
already honor TERMINAL_CWD (out of scope for the resolver cluster). Pinning
the behavior makes the supersession of #29365 airtight and guards against a
future refactor silently regressing the workspace contract.
Cover the two new hardening behaviors that were unpinned: whitespace-only
TERMINAL_CWD falling through to getcwd/None, and OSError from the getcwd
fallback arm propagating to the build_environment_hints try/except guard.
Do not treat lack of application-level SimpleX events as a stale WebSocket. The websockets client already uses protocol ping/pong for connection liveness, so quiet but healthy connections should not be closed by the health monitor.
M3 is 1M context, but pre-catalog builds resolved it via the generic
'minimax' catch-all (204,800) and persisted that to the context-length
cache. Step 1 of get_model_context_length returned the cached value
directly before reaching the 'minimax-m3' (1M) catalog entry, so users
who first probed M3 on an older build were stuck at 204K forever (e.g.
/new in the Telegram gateway showing 'Context: 204K tokens (detected)').
Mirror the existing Kimi/Codex stale-cache guards: when a cached entry
for a minimax-m3 slug is <= 204,800, drop it and re-resolve. M2.x slugs
(correctly 204,800) are untouched since they don't match the M3 name.
os.fchmod is Unix-only; the Windows os module has no fchmod (only
chmod). Passing mode= (e.g. 0o600 when saving the Hindsight config
during `hermes memory setup`) crashed on Windows with:
AttributeError: module 'os' has no attribute 'fchmod'
Guard the fchmod fast-path with hasattr(os, "fchmod"). Skipping it on
Windows is safe: mkstemp already creates the temp file as 0o600, and
the existing post-replace os.chmod(real_path, mode) — already wrapped
in try/except — applies the final mode durably (as far as Windows
honors it).
Adds regression tests: one simulating a Windows os module without
fchmod (must not raise), and one asserting the durable 0o600 mode on
POSIX.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Subway2023's #14639 blocks write_file/patch to ~/.hermes/config.yaml, but
the terminal side was only partially paired: echo>/tee/cp/mv to config.yaml
already tripped the project-config pattern, while `sed -i` and direct edits
slipped through with auto-approve. An unpaired write_file deny is theater per
SECURITY.md — the agent could flip approvals.mode=off via `sed -i` and the
mtime-keyed config cache reloads it mid-session.
config.yaml IS the security policy (approvals.mode/yolo/permanent allowlist
live there), so it warrants real pairing, not a half-door. Add a
_HERMES_CONFIG_PATH fragment mirroring _HERMES_ENV_PATH, fold it into
_SENSITIVE_WRITE_TARGET (covers tee/>/>>/cp/mv), and add sed -i coverage for
both config.yaml and .env. Pins 9 regression tests including no-regression
guards (reads pass, /tmp writes pass).
Co-authored-by: sbw2025 <subw3@mail2.sysu.edu.cn>
A stream that drops mid-response after tokens are delivered (peer-closed
connection, stale-stream reconnect) is converted into a synthetic
finish_reason="length" stub. The conversation loop treated that network
stall as a max-output-tokens truncation: when the dropped content was a
tool call it retried exactly once, then hard-failed with "Response
truncated due to output length limit" — even on large-output models that
never hit any cap (e.g. Opus).
- Tool-call truncation now retries up to 3 times (was 1) with a
progressive max_tokens boost, and is stub-aware: a PARTIAL_STREAM_STUB_ID
stall prints "Stream interrupted mid tool-call — retrying (n/3)" instead
of the false "model hit max output tokens", and the give-up message
distinguishes a network drop from a real truncation.
- Length-continuation retries preserve the original request's output cap
as a floor, so a high provider/model default isn't silently downshifted
to 8K/12K on retry.
- Added _requested_output_cap_from_api_kwargs() helper.
Tests: stub-stall mid-tool-call recovery within 3 retries; continuation
preserves a large provider-default output cap.
Fixes#26425. Salvages the substance of #26427 (cap floor) and #9525
(retry bump), adapted to the post-refactor conversation_loop.py which
handles all three api_modes uniformly.
Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
* feat(dashboard): backend API for MCP, pairing, webhooks, credential pool, memory, gateway lifecycle
Adds REST endpoints so a remote admin can manage these without CLI access:
- MCP servers: list/add/remove/test (config.yaml parity with hermes mcp)
- Pairing: list/approve/revoke/clear-pending messaging codes
- Webhooks: list/subscribe/remove (hot-reloaded JSON store)
- Credential pool: list/add/remove rotation keys (via CredentialPool API)
- Memory provider: status/select/disable/reset
- Gateway lifecycle: start/stop (restart+update already existed)
Secrets redacted on read; usable values only reach the agent at session start.
All endpoints sit behind the existing dashboard auth gate.
* feat(dashboard): backend API for ops + skills hub
- Ops actions (spawned, log-tailed via /api/actions): doctor, security audit,
backup, import, checkpoints prune
- Ops reads (structured JSON): hooks list + allowlist status, checkpoints list
with per-session size
- Skills hub actions (spawned): install / uninstall / update
- Registers new action log files for all spawn-based endpoints
All gated by the existing dashboard auth middleware.
* feat(dashboard): admin pages for MCP, pairing, webhooks, and system ops
Adds four new dashboard pages + nav entries so a remote admin can manage
Hermes without CLI access:
- MCP: list/add/remove/test MCP servers
- Webhooks: list/create/delete subscriptions (one-time secret reveal)
- Pairing: approve/revoke/clear messaging pairing codes
- System: gateway start/stop/restart, memory provider + reset, credential
pool add/remove, ops (doctor/audit/backup/import/skills update) with a
live action-log viewer, checkpoints prune, shell-hooks status
api.ts: client methods + types for all new endpoints.
App.tsx: routes + sidebar nav (plain labels, no i18n key required).
Verified: tsc -b clean, production build succeeds, new pages lint clean,
zero new eslint errors in App.tsx.
* test(dashboard): cover admin API endpoints
20 tests across MCP, credential pool, memory, pairing, webhooks, ops, plus
an auth-gate parametrize that asserts every admin endpoint requires the
session token. Asserts request contract + CLI-config parity, not catalog
values (per the no-change-detector-tests rule).
* docs(dashboard): document MCP, Webhooks, Pairing, and System admin pages
Adds Pages sections for the four new admin tabs and an Admin-endpoints table
to the REST API reference. Updates the page description to reflect the
dashboard's expanded role as a full administration panel.
* feat(install): --no-skills flag for blank-slate default profile
Add an install-time --no-skills flag so the default ~/.hermes profile can
be created with zero bundled skills, matching what
`hermes profile create --no-skills` already does for named profiles.
The flag writes $HERMES_HOME/.no-bundled-skills and skips the install-time
seed. sync_skills() now honors that marker with an early return
(skipped_opt_out=True), so neither the installer, a later `hermes update`,
nor a direct sync re-injects bundled skills into a profile that opted out.
Previously the marker was only checked by seed_profile_skills() (named
profiles); the default profile had no opt-out and `hermes update` would
re-seed it every time.
Tests: TestNoBundledSkillsOptOut covers marker-present (no-op) and
marker-absent (normal seed) paths.
* feat(skills): hermes skills opt-out / opt-in for existing profiles
Adds an interactive counterpart to the install-time --no-skills flag so
an already-installed profile (default or named) can toggle the
.no-bundled-skills marker without reinstalling.
- `hermes skills opt-out` writes the marker (stop future seeding). Safe
by default: nothing on disk is touched.
- `hermes skills opt-out --remove` ALSO deletes already-present bundled
skills, but ONLY ones that are manifest-tracked AND byte-identical to
their origin hash. User-edited bundled skills, hub-installed skills, and
hand-written skills are never removed. Previews + confirms before
deleting (--yes to skip).
- `hermes skills opt-in [--sync]` removes the marker and optionally
re-seeds immediately.
Core logic lives in tools/skills_sync.py (set_bundled_skills_opt_out,
is_bundled_skills_opt_out, remove_pristine_bundled_skills) reusing the
existing manifest origin-hash machinery for the safety check.
Tests: TestOptOutToggleAndRemove covers marker toggle idempotency and
proves user-modified + non-bundled skills survive --remove.
* docs: blank-slate skills — install --no-skills + opt-out/opt-in
- features/skills.md: new 'Starting with a blank slate' section covering
the install flag, profile-create flag, and runtime opt-out/opt-in, with
a safe-by-default note.
- reference/cli-commands.md: document the new skills opt-out / opt-in
subcommands + examples.
- reference/profile-commands.md: fix the marker filename (was .no-skills,
actually .no-bundled-skills) and cross-link the runtime commands.
Validated with a full docusaurus build (exit 0); the three edited pages
compile clean with no new warnings.
Two related changes to the skill curator:
1. Built-in pruning. New curator.prune_builtins config (default on) lets the
curator archive bundled built-in skills after the inactivity period, not
just agent-created ones. A .curator_suppressed list tells the update-time
re-seeder (tools/skills_sync) to leave pruned built-ins archived, so the
prune is durable across `hermes update`. Built-ins are seeded with a
baseline record on first sight, so the inactivity clock starts at upgrade
time -- no mass-prune on the first run. Hub-installed skills are never
pruned regardless of the flag. Restoring a built-in clears its suppression.
2. Usage tracking for all skills. Telemetry (view/use/patch) was wrongly gated
behind curation-eligibility, so built-ins were tracked only when prunable
and hub skills never. Telemetry is observability and is now decoupled from
curation: every skill accrues usage counts regardless of provenance, while
lifecycle mutators (set_state/set_pinned/mark_agent_created) stay
curation-gated. New usage_report() + provenance() expose all skills with an
agent/bundled/hub tag.
Gateway /undo was wired into every platform but still ran the old
single-turn hard-truncate. Now it matches the CLI/TUI: /undo [N] backs
up N user turns (default 1, clamps to oldest), soft-deletes the
truncated rows on disk (active=0, kept for audit, hidden from re-prompts
and search) via SessionDB.rewind_to_message, evicts the cached agent so
the next turn rebuilds from the active-only transcript (the gateway's
equivalent of the CLI's in-place history surgery + memory invalidation),
and echoes the backed-up message text so the user can copy/edit and
resend — platforms have no editable composer to prefill.
- gateway/session.py: SessionStore.rewind_session(session_id, n) wraps
the soft-delete primitive; load_transcript already returns active-only
- gateway/run.py: _handle_undo_command parses [N], calls rewind_session,
evicts the agent, echoes target text; confirm-prompt detail is count-aware
- locales: undo.removed gains {turns}; new undo.invalid_count, all 16 langs
- tests: tests/gateway/test_undo_rewind_session.py (6 cases)
The skill security scanner blocked legitimate community skills on three
intrinsic false-positive patterns:
- read_secrets_file matched `cat > file.env <<` heredocs (writing the
user's own keys into their own local .env), not just `cat file.env`
reads. Exclude output redirections.
- allowed-tools frontmatter is REQUIRED by the agent-skill spec; every
compliant skill declares it. Drop from HIGH privilege_escalation to a
LOW informational finding so it no longer drives the verdict.
- python_os_environ flagged `os.environ.get("CONFIG_VAR")` config reads
as HIGH exfiltration. Exempt non-secret `.get()` reads; add a dedicated
CRITICAL python_environ_get_secret pattern so secret-named reads
(OPENAI_API_KEY etc.) are still caught.
Also: scan_skill() now honors a skill-provided .skillignore / .clawhubignore
(gitignore-style) so dev/docs artifacts shipped in a skill root are excluded
from both structural checks and pattern scanning. SKILL.md is never ignorable.
80 tests pass (64 existing + 16 new).
Two pre-existing failures on main, unrelated to each other:
- test_model_catalog: website/static/api/model-catalog.json was stale vs
_PROVIDER_MODELS — minimax/minimax-m2.7 was renamed to minimax/minimax-m3
without regenerating the committed manifest. Ran scripts/build_model_catalog.py.
- test_gui_command: the macOS relaunchable-signing fixup
(_desktop_macos_relaunchable_fixup) makes two subprocess.run calls (xattr +
codesign) on darwin before launch. The two darwin GUI tests set
sys.platform='darwin' and mock subprocess.run with a 2-element side_effect
(pack + launch), so the fixup's calls drained the iterator -> StopIteration.
Mock out the fixup in those two tests so the subprocess accounting stays
focused on pack/launch.
Extends the existing /undo command from a single in-memory exchange
removal into a full rewind: back up N user turns (default 1), soft-delete
the truncated rows in SessionDB (active=0, kept for audit, hidden from
re-prompts and search), notify memory providers, and prefill the composer
with the backed-up message text for editing — CLI and TUI.
Reuses the SessionDB rewind primitives, the on_session_switch(rewound=True)
memory hook, and the TUI command.dispatch prefill payload from SaguaroDev's
#21910 work, wired to /undo [N] instead of a separate /rewind picker.
- cli.py: undo_last(n, prefill) — in-memory truncate + SQLite soft-delete
+ agent surgery (system-prompt invalidate, flush-index reset) + memory
notify + editable buffer prefill; /undo dispatch parses optional count;
checkpoint-rollback caller passes prefill=False
- tui_gateway/server.py: command.dispatch undo branch (was rewind) parses
count, picks Nth-from-last user turn, clamps to oldest
- commands.py: /undo gains [N] args_hint
- tests: rename + expand TUI suite (multi-turn, clamp, invalid-count)
- release.py: AUTHOR_MAP entry for SaguaroDev
Co-authored-by: SaguaroDev <74339271+SaguaroDev@users.noreply.github.com>
Adds the TUI half of the /rewind feature so the Ink terminal UI gets
the same affordance as the prompt_toolkit CLI.
Python side (tui_gateway/server.py):
- /rewind added to _PENDING_INPUT_COMMANDS so slash.exec rejects it
and the TUI falls through to command.dispatch (the only path with
access to live session state + memory hooks).
- New command.dispatch branch for name == "rewind":
v1 auto-picks the most recent user turn (Claude-Code-style single-
step undo), calls SessionDB.rewind_to_message, refreshes the
in-memory history, fires _memory_manager.on_session_switch with
rewound=True, and returns the new "prefill" payload.
- A dedicated picker overlay (multi-step rewind) is tracked as a
follow-up to #21910.
TS side (ui-tui/src/):
- New "prefill" variant on CommandDispatchResponse + asCommandDispatch
validator. Mirrors "send" but does NOT auto-submit; the client drops
the message into the composer for editing.
- createSlashHandler renders the optional notice via sys() and calls
ctx.composer.setInput(d.message), letting the user edit-and-resubmit
the rewound turn — the core UX promised by the issue.
Tests:
- 7 new tui_gateway tests covering prefill payload shape, in-memory
history truncation, DB soft-delete, memory-provider notification
(rewound=True), busy-session refusal, missing-session error, and
registry placement in _PENDING_INPUT_COMMANDS.
- Extended asCommandDispatch vitest covering the new prefill variant
(with + without notice, and rejection of malformed payloads).
Out of scope for v1 (tracked as #21910 follow-up):
- Dedicated picker overlay in Ink (the multi-step rewind UI). v1 auto-
picks the most recent user turn, matching the most common case.
- Gateway platforms (Telegram, Discord, etc.) — issue scopes v1 to
CLI + TUI only.
Self-review of the code-block masking fix: the cleanup path ran
media_pattern.sub('') over the _mask_protected_spans() copy of the text and
assigned that back to 'cleaned', so whenever a real MEDIA: tag was delivered
(if media: branch), every fenced code block / inline code / blockquote in the
reply was blanked to whitespace in the user-visible text.
Now mask only a length-equal copy of 'cleaned' to locate the real tag spans,
then delete those spans from the unmasked 'cleaned' — masking is a locator,
not a text rewrite. Protected spans survive verbatim. Strengthens the existing
mixed-code test (it only asserted 'Done.' survived, not the code block) and
adds an inline-code-survives regression test. Both fail on the old sub-based
code and pass now.
extract_media() scanned the full response text without distinguishing
live delivery tags from example paths in fenced code blocks, inline code
spans, and blockquotes. This caused false positives where the agent's
explanation of MEDIA: syntax (or tool output containing example paths)
was stripped from user-visible text and the path was added to the media
delivery list.
Added _mask_protected_spans() helper that replaces protected regions
with equal-length whitespace before regex matching, preserving match
offsets. The helper skips backtick-quoted paths in MEDIA: tags to
maintain existing path extraction behavior.
Fixes#35695
Self-review of #34375 fix: the cleanup path ran media_pattern.sub('') over
the JSON-masked copy of the text, which baked the masking spaces into the
user-visible 'cleaned' string — a serialized tool result like
{"old":"MEDIA:/x.png"} came back as {"old":" "}.
Now mask only a length-equal copy of 'cleaned' to locate the real tag spans,
then delete those spans from the unmasked 'cleaned'. Real tags are stripped;
JSON-embedded MEDIA: text reads back verbatim. Masking 'cleaned' (not the
original 'content') keeps offsets valid after the [[audio_as_voice]] /
[[as_document]] directives are removed. Adds two cleaned-text regression tests.
Serialized tool results frequently embed a prior reply's text, e.g.
{"result": "MEDIA:/path/stale.png"}. The bare-path branch of
MEDIA_TAG_CLEANUP_RE matched these and re-delivered stale files (#34375).
Adds BasePlatformAdapter._mask_json_string_media, which blanks (offset-
preserving) only MEDIA:<bare-path> tokens that sit inside a JSON value-
context string (opened by : , { or [). Legitimate tags at line start,
after prose, indented, MEDIA:"quoted" form, and two-line TTS output are
all left untouched.
Reworked from the approach in #34388 (a line-start regex anchor), which
no longer applied to current main and regressed same-line/indented tags.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
When an unauthenticated SPA fetch hit a gated /api/* endpoint (e.g.
GET /api/analytics/models?days=30 fired from ModelsPage on mount or
after a session expiry), the gated middleware stamped the request's
own path into next= on the 401 envelope's login_url. The SPA's global
401 handler in web/src/lib/api.ts full-page-navigated to that URL,
the PKCE cookie carried the encoded /api/* value through the OAuth
round trip to Portal, and /auth/callback's _validate_post_login_target
accepted it as same-origin and redirected the user to the raw JSON
endpoint instead of the dashboard.
Symptom Ben reported: after the OAuth screen he kept landing on
$DOMAIN/api/analytics/models?days=30 (raw JSON) rather than /models.
The bug was deterministic per page — whichever /api/* call ModelsPage,
AnalyticsPage, or SessionsPage fired first owned the redirect race.
Fix: both validators now reject /api/* targets in addition to the
existing /login, /auth/, /api/auth/ exclusions:
- _safe_next_target in middleware.py drops the value before it ever
enters login_url, so the SPA's 401 handler navigates to a bare
/login (which the SPA itself can return-from via its own
sessionStorage["hermes.lastLocation"] fallback that was already
saving the actual browser location).
- _validate_post_login_target in routes.py drops it as second-line
defence at the callback boundary, so a legacy cookie, a regressed
middleware, or an attacker-crafted /auth/login?next=/api/... value
can't smuggle the redirect through. Either layer alone is enough;
pairing them means a regression in one is caught by the other.
The match is anchored: ``decoded == "/api"`` or
``decoded.startswith("/api/")``. SPA route lookalikes like /apidocs
or /api-keys remain valid landing targets — tests pin that.
Test additions in test_dashboard_auth_401_reauth.py:
- TestApi401Envelope: rewrote test_login_url_carries_next_for_deep_
api_path (which asserted the pre-fix behaviour) as
test_login_url_drops_next_for_deep_api_path, plus added the
specific analytics-models repro case from Ben's report.
- TestNextSameOriginValidation: rejects-api-paths + does-not-reject-
api-prefix-lookalikes (covers /apidocs, /api-keys).
- TestAuthCallbackNext: end-to-end test_callback_with_api_next_
lands_at_root drives /auth/login?next=/api/... through to the
callback and asserts the user lands at "/", not the API URL.
- TestValidatePostLoginTarget: new class covering the callback-side
validator directly, including the URL-encoded ``%2Fapi%2F...``
form the PKCE cookie actually carries.
Mutation-tested: reverting both validators causes exactly the 5 new
or rewritten /api/*-related assertions to fail (each fix layer is
independently tested), while the 31 other assertions in the file
remain green. Full tests/hermes_cli/ suite (288 files, 5,938 tests)
passes with the fix applied.
The desktop rename dialog sent PATCH /api/sessions/{id}, but the backend
only defined GET and DELETE for that path — FastAPI returned 405 Method
Not Allowed, surfaced to the user as "Rename failed". Add the PATCH route
backed by SessionDB.set_session_title (handles sanitization, uniqueness,
and clearing the title when empty).
Also fix a misleading notification: any 405 was summarized as an unrelated
"does not support that audio endpoint" message. Make it a generic 405 hint.