* fix: update to version 3 endpoints and adding update and delete tool
* chore: removing the test md file
* fix: prevent circuit breaker on client errors in Mem0 provider
* chore: add telemetry for platform version
* feat: add OSS mode support to Mem0 memory provider
* chore: bump mem0ai dependency to >=2.0.1 in memory plugin
* refactor: enhance dependency checks and embedder config in mem0 backend
* refactor: adjust fact storage message for OSS mode
* refactor: expand user paths, add collection recreation on dimension change for Qdrant
* fix(mem0): make MEM0_USER_ID override gateway-native ids and tag writes with channel
When MEM0_USER_ID was configured (env or mem0.json), the gateway-native id
from kwargs (Telegram numeric id, Discord snowflake, ...) still won, so the
same human ended up under different user_ids per channel and memories never
merged across CLI / Telegram / Slack / Discord. Mirrors openclaw's cfg.userId
pattern: configured override wins, gateway-native id is the fallback.
The legacy "hermes-user" placeholder default written by the setup wizard is
treated as unset to avoid silently bucketing every gateway user together.
Also tag every write with metadata.channel (cli/telegram/discord/...) so the
dashboard can offer per-channel filtered views without coupling identity to
the channel; document the read/write filter asymmetry as intentional
(reads scope to user_id only for cross-agent recall).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: improve Mem0 memory provider backend, pagination, config, and error handling
* refactor: update mem0 telemetry code, docs, and bump version
* fix(mem0): make get_config_schema() return unified schema with mode-aware required flag
Schema always includes api_key field so picker shows "API key / local" for
both modes. In OSS mode api_key.required=False so status won't mislead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: improve mem0 telemetry, add env var key and OSS mode detection
* chore: bump mem0ai lower bound to 2.0.4 (latest SDK release)
* refactor: set telemetry sample rate to 1.0 and update docs for opt‑out
* fix(mem0): resolve 15 correctness, thread-safety, and resource bugs
Thread safety:
- Protect circuit breaker counters with _breaker_lock (race between
prefetch/sync daemon threads and main thread)
- Wrap sync_turn thread creation in _sync_lock; skip if previous sync
is still alive after 5 s join to prevent duplicate memory ingestion
- Guard _schedule_flush timer creation under _queue_lock (TOCTOU race)
- Capture local `backend` reference in prefetch/sync closures so
shutdown() nulling self._backend cannot crash in-flight threads
Correctness:
- Fix bool("false")==True for rerank param; parse string values explicitly
- Guard page/top_k with max(1,...) and move int() inside try blocks
- Fix fact_count=0 always in OSS mode (Memory.add returns list, not dict)
- Fix prefetch() not clearing result when thread still alive after timeout
- Fix atexit.register accumulating on repeated initialize() calls
Backend / setup:
- Handle Qdrant named-vector collections in _recreate_collection_if_dims_changed
(vectors is a dict; .size access raised AttributeError, swallowed silently)
- Wrap QdrantClient and psycopg2 conn/cursor in try/finally to prevent leaks
- Resolve ollama_bin at top of _ensure_ollama; use it for ollama pull
- Fix embedder key lookup when LLM provider has no env_var (e.g. ollama)
Also: remove _telemetry_enabled cache (env var check is cheap), bump
required mem0ai to >=2.0.7, minor README wording fix.
* fix(mem0): fix brittle qdrant path test + add telemetry sample-rate docs
- Replace generator-throw lambda with a proper def in
test_qdrant_path_not_writable; use tmp_path instead of a hardcoded
/nonexistent path so the test is root-safe
- Add MEM0_TELEMETRY_SAMPLE_RATE to memory-providers.md (was only
in the plugin README, not the user-guide docs)
* revert: remove MEM0_TELEMETRY_SAMPLE_RATE from user-guide docs
* refactor: remove telemetry from mem0 plugin and update documentation
* fix(mem0): set stdin=DEVNULL on setup subprocess calls
The TUI stdin guard (scripts/check_subprocess_stdin.py) requires every
subprocess call in plugin code to set stdin= so it can't inherit the
gateway's JSON-RPC stdin fd. Muzzle the docker/ollama calls in the OSS
setup wizard with stdin=subprocess.DEVNULL (none need interactive input).
Also covers the docker-inspect call the linter's regex misses.
---------
Co-authored-by: chaithanyak42 <chaithanya.kumar42a@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Authorization to message the agent is the gate, not the file extension.
Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was
opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass
at all on Telegram/Slack — so an .html (or any non-allowlisted type) was
dropped or hard-rejected before the agent saw it.
Now every authorized upload is cached and surfaced to the agent regardless
of type:
- base.cache_media_bytes(): unknown types cache as octet-stream (or the
caller-supplied MIME) instead of returning None — fixes the chokepoint
that Teams/Telegram-media route through.
- discord/telegram/slack adapters: removed the allowlist reject/skip; any
non-media attachment is typed DOCUMENT and cached. Known types keep their
precise MIME.
- Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text +
code + config + markup) instead of a blind UTF-8 decode, so binary formats
(PDF/zip/docx) with ASCII headers are never inlined.
- gateway/run.py emits the path-pointing context note for every DOCUMENT,
including non text/application MIME types.
- discord.allow_any_attachment is now a documented no-op kept for config
back-compat.
Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types
cache, known types stay precise, PDFs are not inlined.
When `hermes dashboard --host 0.0.0.0` is run interactively with the auth
gate engaged but no DashboardAuthProvider configured, prompt to set up the
bundled username/password provider on the spot (or point at `hermes dashboard
register` for OAuth) instead of only emitting the fail-closed error.
- main.py: `_maybe_setup_dashboard_auth_interactively()` runs before
start_server. No-ops on loopback binds, when a provider is already
registered, or when stdin/stdout isn't a TTY (Docker/s6, CI, piped runs) so
the fail-closed SystemExit stays the backstop for unattended deploys. On the
password path it writes dashboard.basic_auth.{username,password_hash,secret}
to config.yaml (scrypt hash, never plaintext), then force-rediscovers
plugins so the basic provider registers before the gate check.
- web_server.py: fix the fail-closed hint — it told operators to set
`dashboard_auth.basic.username` but the provider reads `dashboard.basic_auth`.
- docs: note the interactive setup under Fail-closed semantics.
No new env vars; reuses the existing dashboard.basic_auth config surface.
Sibling-site follow-up to the AGENTS.md token-lock fix (#50481). Platform
adapters migrated from gateway/platforms/<name>.py to
plugins/platforms/<name>/adapter.py; a handful (signal, weixin, bluebubbles,
qqbot, yuanbao, msgraph_webhook, webhook, api_server) still live in
gateway/platforms/.
- adding-platform-adapters.md: new-adapter creation path + reference-impl table
- gateway-internals.md: rewrite the adapter tree to reflect the actual split
- zh-Hans mirrors of both kept in parity
- scripts/release.py: add TutkuEroglu to AUTHOR_MAP (CI gate)
* feat(providers): remove google-gemini-cli + google-antigravity OAuth providers
Google now actively bans accounts for third-party tools that piggyback on
Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention
sits at a backend layer the ban can extend to the entire Google account
(Gmail/Drive), with a second violation being permanent.
Ref: https://github.com/google-gemini/gemini-cli/discussions/20632
Removes both OAuth inference providers entirely (modules, provider profiles,
auth/runtime/config/models wiring, the /gquota Code Assist quota command,
the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans).
The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against
generativelanguage.googleapis.com) is unaffected and stays fully supported.
* fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed
The antigravity-cli optional skill orchestrates the external `agy` binary as
a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference
through the banned google-antigravity OAuth provider, so it carries none of
the account-ban risk that motivated removing that provider. Restore the skill,
its docs page, the sidebar entry, and the optional-skills catalog row. The
google-antigravity / google-gemini-cli inference providers stay fully removed.
The s6 dashboard entrypoint and docker integration tests relied on
HERMES_DASHBOARD_INSECURE=1 to bring up a 0.0.0.0 dashboard with no auth
provider. With --insecure now a no-op (auth gate mandatory on non-loopback
binds), that path fails closed.
- s6 dashboard/run: drop --insecure derivation; warn that the env is a no-op
and point operators at HERMES_DASHBOARD_BASIC_AUTH_* / OAuth.
- docker tests: supervision tests now register the bundled basic password
provider (HERMES_DASHBOARD_BASIC_AUTH_USERNAME/_PASSWORD) so the gate has a
provider and the dashboard binds. Rewrote the insecure-opt-out test to
assert fail-closed (dashboard does NOT serve) instead of gate-bypass.
- docs (en + zh-Hans): HERMES_DASHBOARD_INSECURE documented as deprecated
no-op; basic-auth is the zero-infra way to authenticate a containerized
public dashboard.
The kanban-worker and kanban-orchestrator bundled skills existed only to
be force-loaded into dispatcher-spawned workers, gated by
environments:[kanban] so they wouldn't leak into normal CLI listings.
That gating was fragile (the leak that #50443 patched) and the
--skills auto-load was already best-effort — most workers ran without it
because the bundled skill isn't present in profile-scoped skills dirs.
Remove the skills entirely and promote their load-bearing content
(workspace kinds, deliverable artifacts, created-card integrity, profile
discovery) into KANBAN_GUIDANCE, which is already injected into every
kanban worker's system prompt. Net result: every worker reliably gets
the guidance, nothing can leak into a CLI/blank-slate session, and the
gating machinery is gone.
- agent/prompt_builder.py: promote the 4 load-bearing rules into KANBAN_GUIDANCE
- hermes_cli/kanban_db.py: drop --skills kanban-worker auto-injection + _kanban_worker_skill_available probe
- hermes_cli/kanban_swarm.py: drop skills=[kanban-orchestrator] on the root card
- hermes_cli/kanban.py: drop kanban-init skill seeding; fix help text
- delete skills/devops/kanban-{worker,orchestrator}
- docs: delete the two skill pages (EN+zh), fix sidebars/catalog/kanban.md/kanban-worker-lanes.md and the video-orchestrator + codex-lane references
- tests: update spawn-argv expectations; re-bound the guidance-size guard
Supersedes the skill-leak half of #50443 (credit @helix4u for flagging the area).
Answers a recurring plugin-author question: how to read the active
profile and drive Hermes from inside a hook callback when ctx._cli_ref
is None (gateway, hermes chat -q, and kanban-spawned worker sessions).
- Adds a 'Act from inside a hook' section to the plugin guide covering
ctx.profile_name and ctx.dispatch_tool as the session-agnostic APIs,
with a kanban_task_blocked example, and notes there is no in-process
slash-command bridge for headless workers (shell out via the terminal
tool instead).
- Adds the three kanban lifecycle hooks to the hook reference table with
their process semantics.
- Pins the contract with a regression test: ctx.dispatch_tool invokes a
tool handler with _cli_ref=None (worker/hook context).
Requested by @Smithangshu on Discord.
hermes -w created the worktree branch from the standalone clone's HEAD, which
lags origin when the clone isn't freshly updated (it's only refreshed by
hermes update, not per session). Every worktree branch then rooted on a stale
base, so the PR diff GitHub computes against current main ballooned with
unrelated changes and the agent had to discover the staleness at push time and
rebase.
_resolve_worktree_base() now fetches and branches from the freshest available
ref: the current branch's upstream if it tracks one (so a deliberate
feature-branch worktree tracks its own remote), else the remote's default
branch (origin/HEAD), else local HEAD as a fail-soft fallback (offline / no
remote / detached). A bogus 'origin/(unknown)' default is guarded, and worktree
creation retries from HEAD if branching off the remote ref fails — so this is
never worse than the old behavior.
Gated by worktree_sync (default true); set worktree_sync: false to keep the
old branch-from-local-HEAD behavior. The resolved base is printed in the
session banner.
This is the follow-up to the #50319 session, where the standalone clone was
213 commits behind origin and the worktree inherited that stale base.
Rich messages are not ready for primetime: current Telegram clients can
render Bot API 10.1 rich messages as blank/unsupported bubbles and make
them hard to copy as plain text, which is worse than the legacy
MarkdownV2 path for command snippets and mobile handoffs. Default the
rich_messages toggle to False so replies stay on the copyable legacy
path; users opt in per bot via platforms.telegram.extra.rich_messages:
true. Updates adapter, gateway config default, example config, English +
zh-Hans docs, and the default/opt-in tests.
The gateway pre-compression hygiene valve force-compressed any session
crossing 400 messages regardless of token usage. On large-context (1M+)
models doing many short, message-dense turns, a healthy session at ~16%
token usage could hit 400 messages and get force-compressed — and the
compression summary's stale Active Task could then bleed into the next
turn.
The valve's actual purpose is to break a death spiral: when API calls
keep disconnecting on an oversized session, no token-usage data arrives,
the token threshold never fires, and the transcript grows unbounded.
It's a count-based floor for that pathological case only. 400 was tuned
for ~200K-context models and is far too low for modern large-context
sessions. Raise the default to 5000 — still well clear of any death
spiral, but no longer firing on legitimate long conversations.
The value remains fully configurable via compression.hygiene_hard_message_limit.
Adds hermes_cli/context_switch_guard.py mirroring the model_cost_guard
pattern. When a user switches models mid-session (Herm TUI picker, CLI,
or /model on Telegram/Discord), the warning surfaces on the existing
ModelSwitchResult.warning_message path used by the expensive-model
guard if the new model's compression threshold is below the current
session size.
Partial fix for #23767 — addresses only the 'user-facing guardrail
when switching from a high-context provider to a substantially
lower-context provider' slice. The other proposed fixes from that
issue (hard preflight token guard, metadata cache invalidation on
switch, compression safety invariant, oversized tool-output handling)
are out of scope for this PR.
- Remove the 'you only log in once per machine' claim from spotify.md
and document the ~6-month refresh token expiry with re-auth instructions
- Add test_client_wraps_invalid_grant_as_spotify_auth_required_error to
confirm SpotifyClient wraps AuthError(code=spotify_refresh_invalid_grant)
into SpotifyAuthRequiredError with a user-facing message
Refs: #28155
When model.context_length is set in config.yaml, it blocks auto-detection
from the server's /v1/models endpoint. The skill incorrectly implied a
hard fallback to 131072. Add the resolution chain and the fix command
(hermes config set model.context_length "") to both the config table
and a new troubleshooting section.
Closes#48835
The bundled himalaya skill and its website docs documented command
syntax that does not match Himalaya CLI v1.2.0.
Verified against pimalaya/himalaya v1.2.0 source:
- message move: MessageMoveCommand declares target_folder BEFORE
envelopes (src/email/message/command/move.rs) -> usage is
'<TARGET> <ID>...', so 'move 42 "Archive"' is wrong; correct is
'move "Archive" 42'.
- message copy: same ordering in copy.rs.
- attachment download: AttachmentDownloadCommand exposes the flag as
'-d, --downloads-dir <PATH>' (src/email/message/attachment/command/
download.rs), not '--dir'.
Fixed in all three surfaces that carried the wrong examples:
- skills/email/himalaya/SKILL.md
- website/docs/.../email-himalaya.md
- website/i18n/zh-Hans/.../email-himalaya.md
* feat(setup): Blank Slate setup mode — minimal agent, opt in to everything
Adds a third first-time setup option alongside Quick Setup and Full Setup.
Blank Slate forces ON only what an agent needs to run — provider & model,
the File Operations toolset, and the Terminal toolset — and turns
everything else OFF, then walks the user through opting each capability
back in.
What it does:
- platform_toolsets.cli = [file, terminal] (explicit, authoritative list)
- agent.disabled_toolsets = every other known toolset (web, browser,
code_execution, vision, memory, delegation, cronjob, skills, image_gen,
kanban, …). Applied last in the resolver, so it overrides the
non-configurable platform-toolset recovery that would otherwise re-add
toolsets like kanban — guaranteeing a true blank slate.
- Optional config features off: compression, memory + user-profile capture,
checkpoints, smart model routing, auto session reset.
- Bundled skills default to NONE (reuses the .no-bundled-skills marker);
offers to seed the full catalog.
- Walks through tools / plugins / MCP / messaging, all opt-in.
Proven end-to-end: with the Blank Slate config, model_tools.get_tool_definitions
emits exactly 6 schemas — patch, process, read_file, search_files, terminal,
write_file. Nothing else reaches the model.
Re-enable later via hermes tools / hermes skills opt-in --sync /
hermes setup agent.
Tests: tests/hermes_cli/test_setup_blank_slate.py (8 tests) pin the writers,
the resolver invariant ({file, terminal}), and the 6-schema end-to-end set.
Docs: getting-started/quickstart.md documents all three setup modes.
* feat(setup): Blank Slate fork — finish minimal, or walk through configs
After applying the minimal baseline (provider/model + file + terminal,
everything else off), Blank Slate now presents a choice instead of always
running the full walkthrough:
1. Start with everything disabled — finish now with the minimal agent.
2. Walk through all configurations — opt in to tools, skills, plugins, MCP,
and messaging.
Provider/model and terminal are still configured first either way (the agent
can't run without them). The finish-now path records the bundled-skill opt-out
so future `hermes update` runs don't re-inject skills. The walkthrough body
moved to a separate _blank_slate_walkthrough() helper.
Tests: TestBlankSlateFork covers both branches (finish-now applies baseline +
skill opt-out and skips the walkthrough; walkthrough path invokes it). Docs
updated to describe the fork.
Sets the Telegram bot's short description (the line under its name) to
"Online" on gateway connect and "Offline" on clean disconnect, gated
behind extra.status_indicator (off by default).
Telegram bots have no presence/online dot — that's a user-account
feature the Bot API doesn't expose for bots. The short description is
the closest available surface, so this gives users a way to tell whether
the gateway is up from the bot's profile.
- New extra.status_indicator flag (+ status_online/status_offline text
overrides), read in __init__ via config.extra — no config-schema change.
- _set_status_indicator() helper: best-effort, swallows API errors so it
never blocks connect/disconnect; truncates to Telegram's 120-char cap.
- Wired Online after _mark_connected(), Offline at top of disconnect()
while the bot HTTP client is still alive.
- 9 unit tests + Telegram docs section.
Requested by @ilTrumpista, cc @Teknium.
Adds a Raft platform adapter as a bundled plugin (plugins/platforms/raft/)
connecting Hermes to Raft as an external agent via a wake-channel bridge.
The adapter starts a loopback HTTP endpoint, spawns 'raft agent bridge' as a
child process, and injects content-free wake hints into the gateway session
pipeline. The agent reads/sends messages through the Raft CLI; the adapter
never touches message bodies or delivery cursors. Activity observer hooks
report tool/LLM/session lifecycle events via a bounded at-most-once queue.
Auto-enables when RAFT_PROFILE is set.
Cherry-picked from PR #47629. Authored by skyzh (@xxchan).
Extend the 'Running Many Gateways at Once' user-guide page with a
'one gateway for all profiles (multiplexing)' section, kept to a single page:
- How to opt in (gateway.multiplex_profiles on the default profile) and when to
prefer it vs one-process-per-profile.
- Every contract change a user sees when the flag is on:
1. secondary-profile 'gateway start' is a hard error (--force escape hatch),
2. HTTP-inbound reached via /p/<profile>/ prefix; secondary profiles must NOT
enable a port-binding platform (webhook/api_server/msgraph_webhook/feishu/
wecom_callback/bluebubbles/sms) — config error at startup,
3. per-credential platforms still need their own token per profile,
4. session keys namespaced agent:<profile>: (default stays agent:main:),
5. single PID/lock + aggregated hermes status, per-profile runtime_status.json.
- What does NOT change: per-profile .env credential isolation (stricter, incl.
MCP/Kanban subprocess env), Kanban, profile-scoped skills/memory/SOUL, routing.
All inert when the flag is off.
Add a ChatGPT-style conversation list beside the embedded TUI on the
dashboard Chat tab so users can swap sessions without leaving the page.
- New ChatSessionList component: lists recent sessions for the active
profile (title/preview, last-active, message count, source), a New chat
button, and a refresh control. Best-effort like ChatSidebar.
- Selecting a row drives /chat?resume=<id>, which ChatPage already treats
as part of the PTY identity, so the terminal respawns resuming that
conversation. Active row is highlighted; New chat clears resume.
- Wired into ChatPage as a dedicated right-side column (desktop) and into
the existing slide-over panel above model/tools (narrow screens).
- i18n: new sessions.newChat key across all locales.
- Read-only switcher by design — delete/rename/export stay on Sessions.
Docs: web-dashboard.md Chat section documents the switcher.
gui.log was registered in hermes_cli/logs.py::LOG_FILES (and surfaced by
`hermes logs gui`) but was never wired into `hermes debug share`. The share
report captured agent/errors/gateway/desktop tails plus full agent/gateway/
desktop logs — but nothing from gui.log, the surface the dashboard, TUI-over-
PTY bridge, and websocket layer (hermes_cli.web_server / pty_bridge /
tui_gateway) actually write to. A user reporting a dashboard or TUI bug shared
zero breadcrumbs from the broken surface.
Wire gui.log through all three share surfaces, matching the existing pattern:
- _capture_default_log_snapshots(): capture the gui snapshot (redacted like the rest)
- collect_debug_report(): add the gui.log summary tail block
- build_debug_share(): pull gui full_text, prepend dump header + redaction banner, add to the upload loop
- run_debug_share() --local branch: same, plus the local print block
- _PRIVACY_NOTICE: name gui.log in both bullets
Redaction is inherited for free — the gui snapshot goes through the same
_capture_log_snapshot(..., redact=redact) path, so secrets are scrubbed in
both the tail and full text (verified E2E: seeded key masked by default,
passes through under --no-redact, raw token never leaks).
Tests: seed gui.log in the fixture, add test_report_includes_gui_log, and bump
the upload-count tripwire 4->5 (test_share_uploads_five_pastes).
* feat(skills): add html-artifact skill, fold in sketch + architecture-diagram + concept-diagrams
Adds a unified `html-artifact` creative skill that produces self-contained,
single-file HTML artifacts — concept explainers, implementation plans,
status/incident reports, code-review walkthroughs, technical + educational
SVG diagrams, multi-variant design comparisons, and throwaway editors that
export their state back to the clipboard. Grounded in Anthropic's
html-effectiveness gallery (MIT); the house style (token block, serif/sans/
mono split, hand-rolled diffs, inline-SVG diagrams, graceful degradation) is
distilled from reading all 20 reference files.
Supersedes and removes three overlapping skills, folding their unique value in:
- sketch -> the fidelity dial (throwaway vs presentation) + the
multi-variant comparison layouts + the browser-vision
verify loop (references/fidelity-and-verify.md)
- architecture-diagram-> the dark "infra" token variant + double-rect masking +
semantic component palette (references/dark-tech.md,
templates/diagram.html infra mode)
- concept-diagrams -> the 9-ramp educational color system + the concept
archetype library (references/concept-archetypes.md,
the light design system in templates/diagram.html)
Structure:
- SKILL.md (description exactly 60 chars), 6 references, 3 templates
- templates verified by headless-Chrome render + vision inspection
- editor export logic (file://-safe clipboard, Promise-normalized) verified in node
Cross-references updated in claude-design (new disambiguation table row drawing
the design-taste vs information-artifact boundary), design-md, pretext, spike,
and kanban-video-orchestrator. Website skill docs + catalogs regenerated;
stale EN/zh-Hans per-skill pages pruned and i18n cross-refs fixed.
Not folded (intentionally orthogonal): excalidraw (.excalidraw JSON), p5js
(generative canvas), claude-design / popular-web-designs / design-md (visual
design taste / brand vocab / token spec).
* feat(skills): ship html-effectiveness gallery as fetched reference examples
Add scripts/fetch-examples.sh (idempotent clone/pull of Anthropic's MIT
html-effectiveness gallery) + references/examples.md mapping each of the 20
example files to a mode so the agent reads the right worked example. The clone
lands in references/examples/ and is gitignored (it's a 384KB upstream repo,
not vendored). SKILL.md workflow + reference list now point at it; falls back to
the distilled pattern references when offline.
* feat(skills): make reading a gallery example a required authoring step
Reading the matching html-effectiveness example is now workflow step 2 (was an
optional aside in step 3): fetch the gallery, read_file the file for your mode,
mirror its structure. Models skip optional steps; the examples are the ground
truth, so consulting one is mandatory. Added an 'Example' column to the
mode->build quick-reference table and a 'don't skip the example' pitfall.
Also dogfooded the skill: read 03-code-review-pr.html and 13-flowchart-diagram.html
raw and reconciled the distilled references against source — aligned diff-row tint
opacity to the source's 0.15 (was 0.18) and added the .ctx/.hunk rows in
house-style.md + base.html so they match 03-code-review-pr.html verbatim.
* docs(skills): explain the consolidation + bundled-vs-optional rationale
The supersession note only stated *what* was folded, not *why* the prune is
sound. Expand SKILL.md's intro into a 'Why this skill exists' section: the three
former skills emitted the same artifact and overlapped, so consolidating removes
which-one-do-I-load ambiguity; and the optional->bundled promotion of
concept-diagrams is footprint-safe because this skill has zero deps (only cost is
the 60-char description; everything else is progressive-disclosure). States the
bundling dividing line explicitly: zero install cost + broadly useful gets
bundled, real install cost (hyperframes: Node+FFmpeg+Chromium) stays optional.
Regenerated website per-skill page to match.
* feat(image-gen): add image-to-image / editing to image_generate
Brings image generation to parity with video generation: the unified
image_generate tool now edits/transforms a source image (image-to-image)
when given image_url / reference_image_urls, routing to each backend's
edit endpoint, exactly as video_generate routes to image-to-video.
- ImageGenProvider ABC: generate() gains keyword-only image_url +
reference_image_urls; new capabilities() declares modalities +
max_reference_images (defaults to text-only, backward compatible).
success_response gains a modality field; adds normalize_reference_images.
- image_generate tool: schema exposes image_url + reference_image_urls;
dynamic schema reflects the active model's actual edit capability so the
agent knows when image_url is honored. Handler + plugin dispatch forward
the new inputs; legacy/text-only providers get a clear modality_unsupported
error instead of silently dropping the source image.
- In-tree FAL: 7 models gain edit endpoints (flux-2-klein, flux-2-pro,
nano-banana-pro, gpt-image-1.5, gpt-image-2, ideogram/v3, qwen-image)
with per-model edit_supports whitelists + reference caps; routes to the
/edit endpoint and skips the upscaler for edits.
- Plugins: openai (images.edit, 16 refs), xai (/v1/images/edits via
grok-imagine-image-quality, JSON body per xAI docs), krea
(image_style_references, 10 refs). openai-codex stays text-only and
rejects edits with an actionable error.
- Tests: 15 new (payload, routing, dispatch forwarding, dynamic schema,
capabilities); updated 2 change-detector/lambda tests for the new schema.
- Docs: image-generation feature page, image-gen provider plugin guide,
tools reference.
* fix(image-gen): preserve legacy passthrough in fal/krea plugin tests
Two existing plugin tests asserted pre-image-to-image behavior:
- fal: forward image_url/reference_image_urls only when supplied, so a
text-to-image delegation stays byte-identical (no None kwargs).
- krea: keep dict-shaped image_style_references refs verbatim (the unified
string refs go through normalize_reference_images; legacy non-string ref
objects pass through unchanged) — fixes KeyError when callers pass the
richer Krea ref-object shape.
* fix(image-gen): clearer not-capable message for text-to-image-only models
When a text-to-image-only model (incl. gpt-image-2 on the Codex OAuth path,
which can't do editing through the Responses image_generation tool) gets a
source image, say 'this model is not capable of image-to-image / editing —
provide a text-only prompt' rather than sending the user shopping for other
backends. Applies to the openai-codex guard, the in-tree FAL no-edit-endpoint
error, and the dynamic tool-schema text-only line.
Adds a 'Customizing platform hints' section to the Prompt Assembly
developer guide covering the append/replace/shorthand shapes, the
defensive fallback, and the cache-stable lifecycle (stable tier,
resolved at build time).
When a worker calls kanban_create from inside a session that has a
persistent delivery channel, the originating session is now subscribed
to the new task's completion/block events automatically. The agent
that dispatched the task gets notified instead of having to poll.
- Gateway sessions (telegram/discord/slack): HERMES_SESSION_PLATFORM +
HERMES_SESSION_CHAT_ID ContextVars, set by the messaging gateway.
- TUI / desktop sessions: HERMES_SESSION_KEY in the subprocess env.
The TUI notification poller keys on platform='tui' + chat_id=<key>.
- CLI / cron / test: no persistent channel, no subscription.
Gated by kanban.auto_subscribe_on_create in config.yaml (default True).
Disable to mirror pre-feature behaviour — users who want explicit
kanban_notify-subscribe calls per task can set it to false. This
config gate addresses the design concern that got PR #19718 reverted
upstream (unconditional implicit auto-subscribe on tool-driven
kanban_create was too aggressive for orchestrator users).
HERMES_SESSION_ID is intentionally not a fallback channel — it is
set by ACP/agent subprocess telemetry for every invocation, not just
TUI, so treating it as a notification target would auto-subscribe
every CLI session and re-introduce the over-eager behaviour.
The kanban_create response now includes a 'subscribed' bool so
orchestrators can react if subscription failed (e.g. by falling
back to explicit kanban_notify-subscribe or to polling).
Includes 6 tests covering the gateway / TUI / CLI / partial-context /
gated / add_notify_sub-failure paths. All 90 tests in
test_kanban_tools.py pass; 509 broader kanban tests pass.
The dashboard MCP catalog only showed name/description/transport and a
non-clickable source. Users couldn't see what an entry connects to or runs
before installing — the exact detail the docs trust model tells them to vet.
- /api/mcp/catalog now returns transport target (url, or command+args),
auth_type, git install source/ref + bootstrap commands, default-enabled
tool hint, and post-install guidance per entry.
- McpPage renders the endpoint URL (http) or command+args (stdio), the git
install source/ref, a collapsible bootstrap-commands list, setup notes,
and the source as a clickable link when it's a URL.
- Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does
not support pip installs; MCP ships with the standard install) and note
the dashboard now surfaces this detail.
- Strengthen the catalog endpoint test to assert the new inspection fields.