Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`):
1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())`
so test fixtures that build the adapter via `object.__new__`
(skipping __init__) don't crash with AttributeError.
See AGENTS.md pitfall #17 — same pattern as gateway.run.
2. New 3-case regression coverage in test_discord_bot_auth_bypass.py:
- role-only config bypasses the gateway 'no allowlists' branch
- roles + users combined still authorizes user-allowlist matches
- the role bypass does NOT leak to other platforms (Telegram, etc.)
3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord
auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from
a previous test in the session can't flip later 'should-reject' tests
into false-pass.
Required because the bare cherry-pick of #9873 only added the adapter-
level role check — it didn't cover the gateway-level _is_user_authorized,
which still rejected role-only setups via the 'no allowlists configured'
branch.
Adds a new DISCORD_ALLOWED_ROLES environment variable that allows filtering
bot interactions by Discord role ID. Uses OR semantics with the existing
DISCORD_ALLOWED_USERS - if a user matches either allowlist, they're permitted.
Changes:
- Parse DISCORD_ALLOWED_ROLES comma-separated role IDs on connect
- Enable members intent when roles are configured (needed for role lookup)
- Update _is_allowed_user() to accept optional author param for direct role check
- Fallback to scanning mutual guilds when author object lacks roles (DMs, voice)
- Fully backwards compatible: no behavior change when env var is unset
Six test cases covering:
- DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security)
- DISCORD_ALLOW_BOTS unset → same as 'none'
- Humans still checked against allowlist even with allow_bots=all
- Bot bypass is Discord-specific — doesn't leak to other platforms
Guards against a regression where the is_bot bypass in _is_user_authorized
gets moved, removed, or accidentally extended to other platforms.
Fixes#4466.
Root cause: two sequential authorization gates both independently rejected
bot messages, making DISCORD_ALLOW_BOTS completely ineffective.
Gate 1 — `discord.py` `on_message`:
_is_allowed_user ran BEFORE the bot filter, so bot senders were dropped
before the DISCORD_ALLOW_BOTS policy was ever evaluated.
Gate 2 — `gateway/run.py` _is_user_authorized:
The gateway-level allowlist check rejected bot IDs with 'Unauthorized
user: <bot_id>' even if they passed Gate 1.
Fix:
gateway/platforms/discord.py — reorder on_message so DISCORD_ALLOW_BOTS
runs BEFORE _is_allowed_user. Bots permitted by the filter skip the
user allowlist; non-bots are still checked.
gateway/session.py — add is_bot: bool = False to SessionSource so the
gateway layer can distinguish bot senders.
gateway/platforms/base.py — expose is_bot parameter in build_source.
gateway/platforms/discord.py _handle_message — set is_bot=True when
building the SessionSource for bot authors.
gateway/run.py _is_user_authorized — when source.is_bot is True AND
DISCORD_ALLOW_BOTS is 'mentions' or 'all', return True early. Platform
filter already validated the message at on_message; don't re-reject.
Behavior matrix:
| Config | Before | After |
| DISCORD_ALLOW_BOTS=none (default) | Blocked | Blocked |
| DISCORD_ALLOW_BOTS=all | Blocked | Allowed |
| DISCORD_ALLOW_BOTS=mentions + @mention | Blocked | Allowed |
| DISCORD_ALLOW_BOTS=mentions, no mention | Blocked | Blocked |
| Human in DISCORD_ALLOWED_USERS | Allowed | Allowed |
| Human NOT in DISCORD_ALLOWED_USERS | Blocked | Blocked |
Co-authored-by: Hermes Maintainer <hermes@nousresearch.com>
Closes#11321, closes#10259.
## Problem
The nested /skill command group (category subcommand groups + skill
subcommands) serialized to ~14KB with the default 75-skill catalog,
exceeding Discord's ~8000-byte per-command registration payload. The
entire tree.sync() rejected with error 50035 — ALL slash commands
including the 27 base commands failed to register.
## Fix
Replace the nested Group layout with a single flat Command:
/skill name:<autocomplete> args:<optional string>
Autocomplete options are fetched dynamically by Discord when the user
types — they do NOT count against the per-command registration budget.
So this single command registers at ~200 bytes regardless of how many
skills exist. Scales to thousands of skills with no size calculations,
no splitting, no hidden skills.
UX improvements:
- Discord live-filters by user's typed prefix against BOTH name and
description, so '/skill pdf' finds 'ocr-and-documents' via its
description. More discoverable than clicking through category menus.
- Unknown skill name → ephemeral error pointing user at autocomplete.
- Stable alphabetical ordering across restarts.
## Why not the other proposed approaches
Three prior PRs tried to fit within the 8KB limit by modifying the
nested layout:
- #10214 (njiangk): truncated all descriptions to 'Run <name>' and
category descriptions to 'Skills'. Works but destroys slash picker UX.
- #11385 (LeonSGP43): 40-char description clamp + iterative
trim-largest-category fallback. Works but HIDES skills the user can
no longer invoke via slash — functional regression.
- #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups.
Preserves all skills but pollutes the slash namespace with 20
top-level commands.
All three work around the symptom. The flat autocomplete design
dissolves the problem — there is no payload-size pressure to manage.
## Tests
tests/gateway/test_discord_slash_commands.py — 5 new test cases replace
the 3 old nested-structure tests:
- flat-not-nested structure assertion
- empty skills → no command registered
- callback dispatches the right cmd_key by name
- unknown name → ephemeral error, no dispatch
- large-catalog regression guard (500 skills) — command payload stays
under 500 bytes regardless
E2E validated against real discord.py 2.7.1:
- Command registers as discord.app_commands.Command (not Group).
- Autocomplete filters by name AND description (verified across several
queries including description-only matches like 'pdf' → OCR skill).
- 500-skill catalog returns max 25 results per autocomplete query
(Discord's hard cap), filtered correctly.
- Choice labels formatted as 'name — description' clamped to 100 chars.
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
* _api_post — network error mapping, errcode-nonzero mapping, success path
* begin_registration — 2-step chain, missing-nonce/device_code/uri
error cases
* wait_for_registration_success — success path, missing-creds guard,
on_waiting callback invocation
* render_qr_to_terminal — returns False when qrcode missing, prints
when available
* Configuration — BASE_URL default + override, SOURCE default
Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.
Also adds meng93 to scripts/release.py AUTHOR_MAP.
- feat: support one-click QR scan to create DingTalk bot and establish connection
- fix(gateway): wrap blocking DingTalkStreamClient.start() with asyncio.to_thread()
- fix(gateway): extract message fields from CallbackMessage payload instead of ChatbotMessage
- fix(gateway): add oapi.dingtalk.com to allowed webhook URL domains
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering
Closes#10308
The Weixin adapter's send() method previously split and delivered the
raw response text without first extracting MEDIA: tags or bare local
file paths. This meant images, documents, and voice files referenced
by the agent were silently dropped in normal (non-streaming,
non-background) conversations.
Changes:
- In WeixinAdapter.send(), call extract_media() and
extract_local_files() before formatting/splitting text.
- Deliver extracted files via send_image_file(), send_document(),
send_voice(), or send_video() prior to sending text chunks.
- Also fix two minor typing issues in gateway/run.py where
extract_media() tuples were not unpacked correctly in background
and /btw task handlers.
Fixes missing media delivery on Weixin personal accounts.
Three open issues — #8242, #6587, #11345 — all trace to the same root
cause: the image / audio / document download paths in
`DiscordAdapter._handle_message` used plain, unauthenticated HTTP to
fetch `att.url`. That broke in three independent ways:
#8242 cdn.discordapp.com attachment URLs increasingly require the
bot session to download; unauthenticated httpx sees 403
Forbidden, image/voice analysis fail silently.
#6587 Some user environments (VPNs, corporate DNS, tunnels) resolve
cdn.discordapp.com to private-looking IPs. Our is_safe_url()
guard correctly blocks them as SSRF risks, but the user
environment is legitimate — image analysis and voice STT die.
#11345 The document download path skipped is_safe_url() entirely —
raw aiohttp.ClientSession.get(att.url) with no SSRF check,
inconsistent with the image/audio branches.
Unified fix: use `discord.Attachment.read()` as the primary download
path on all three branches. `att.read()` routes through discord.py's
own authenticated HTTPClient, so:
- Discord CDN auth is handled (#8242 resolved).
- Our is_safe_url() gate isn't consulted for the attachment path at
all — the bot session handles networking internally (#6587 resolved).
- All three branches now share the same code path, eliminating the
document-path SSRF gap (#11345 resolved).
Falls back to the existing cache_*_from_url helpers (image/audio) or an
SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable
or fails — preserves defense-in-depth for any future payload-schema
drift that could slip a non-CDN URL into att.url.
New helpers on DiscordAdapter:
- _read_attachment_bytes(att) — safe att.read() wrapper
- _cache_discord_image(att, ext) — primary + URL fallback
- _cache_discord_audio(att, ext) — primary + URL fallback
- _cache_discord_document(att, ext) — primary + SSRF-gated aiohttp fallback
Tests:
- tests/gateway/test_discord_attachment_download.py — 12 new cases
covering all three helpers: primary path, fallback on missing
.read(), fallback on validator rejection, SSRF guard on document
fallback, aiohttp fallback happy-path, and an E2E case via
_handle_message confirming cache_image_from_url is never invoked
when att.read() succeeds.
- All 11 existing document-handling tests continue to pass via the
aiohttp fallback path (their SimpleNamespace attachments have no
.read(), which triggers the fallback — now SSRF-gated).
Closes#8242, closes#6587, closes#11345.
The send_message tool's direct-REST QQBot path used "QQBotAccessToken {token}"
which QQ's API rejects with 401. The correct format is "QQBot {token}" — the
gateway adapter at gateway/platforms/qqbot.py uses this format in all 5 header
sites (lines 341, 551, 579, 1068, 1467); this was the one outlier.
Credit to @Quon for surfacing this in #10257 (that PR had unrelated issues in
its media-upload logic and was closed; this salvages the genuine 1-line fix).
When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily
loses its connection, send() now polls is_connected for up to 15s
instead of immediately returning a non-retryable failure. If the
auto-reconnect completes within the window, the message is delivered
normally. On timeout, the SendResult is marked retryable=True so the
base class retry mechanism can attempt re-delivery.
Same treatment applied to _send_media().
Adds 4 async tests covering:
- Successful send after simulated reconnection
- Retryable failure on timeout
- Immediate success when already connected
- _send_media reconnection wait
Fixes#11163
Adds 16 regression tests for the gating logic introduced in the
salvaged commit:
* TestAllowedUsersGate — empty/wildcard/case-insensitive matching,
staff_id vs sender_id, env var CSV population
* TestMentionPatterns — compilation, case-insensitivity, invalid
regex is skipped-not-raised, JSON env var, newline fallback
* TestShouldProcessMessage — DM always accepted, group gating via
require_mention / is_in_at_list / wake-word pattern / free_response_chats
Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks
unmapped emails).
DingTalk was the only messaging platform without group-mention gating or a
per-user allowlist. Slack, Telegram, Discord, WhatsApp, Matrix, and Mattermost
all support these via config.yaml + matching env vars; this change closes the
gap for DingTalk using the same surface:
Config:
platforms.dingtalk.require_mention: bool (env: DINGTALK_REQUIRE_MENTION)
platforms.dingtalk.mention_patterns: list (env: DINGTALK_MENTION_PATTERNS)
platforms.dingtalk.free_response_chats: list (env: DINGTALK_FREE_RESPONSE_CHATS)
platforms.dingtalk.allowed_users: list (env: DINGTALK_ALLOWED_USERS)
Semantics mirror Telegram's implementation:
- DMs are always accepted (subject to allowed_users).
- Group messages are accepted only when the chat is allowlisted, mention is
not required, the bot was @mentioned (dingtalk_stream sets is_in_at_list),
or the text matches a configured regex wake-word.
- allowed_users matches sender_id / sender_staff_id case-insensitively;
a single "*" disables the check.
Rationale: without this, any DingTalk user in a group chat can trigger the
bot, which makes DingTalk less safe to deploy than the other platforms. A
user's config.yaml already accepts require_mention for dingtalk but the value
was silently ignored.
The Copilot API returns HTTP 400 "model_not_supported" when it receives a
model ID it doesn't recognize (vendor-prefixed like
`anthropic/claude-sonnet-4.6` or dash-notation like `claude-sonnet-4-6`).
Two bugs combined to leave both formats unhandled:
1. `_COPILOT_MODEL_ALIASES` in hermes_cli/models.py only covered bare
dot-notation and vendor-prefixed dot-notation. Hermes' default Claude
IDs elsewhere use hyphens (anthropic native format), and users with an
aggregator-style config who switch `model.provider` to `copilot`
inherit `anthropic/claude-X-4.6` — neither case was in the table.
2. The Copilot branch of `normalize_model_for_provider()` only stripped
the vendor prefix when it matched the target provider (`copilot/`) or
was the special-cased `openai/` for openai-codex. Every other vendor
prefix survived to the Copilot request unchanged.
Fix:
- Add dash-notation aliases (`claude-{opus,sonnet,haiku}-4-{5,6}` and the
`anthropic/`-prefixed variants) to the alias table.
- Rewire the Copilot / Copilot-ACP branch of
`normalize_model_for_provider()` to delegate to the existing
`normalize_copilot_model_id()`. That function already does alias
lookups, catalog-aware resolution, and vendor-prefix fallback — it was
being bypassed for the generic normalisation entry point.
Because `switch_model()` already calls `normalize_model_for_provider()`
for every `/model` switch (line 685 in model_switch.py), this single fix
covers the CLI startup path (cli.py), the `/model` slash command path,
and the gateway load-from-config path.
Closes#6879
Credits dsr-restyn (#6743) who independently diagnosed the dash-notation
case; their aliases are folded into this consolidated fix alongside the
vendor-prefix stripping repair.
Follow-up to the reply-reference fix: `_make_discord_adapter` used to return
the raw fetched `Message` as the expected reference, but the adapter now
wraps it via `ref_msg.to_reference(fail_if_not_exists=False)` so Discord
treats a deleted target as 'send without reply chip'. Update the fixture
to return the MessageReference sentinel so the 4 chunk-reference-identity
tests assert against the right object.
No production behavior change; only aligns the stale test fixture.
Follow-up to the reply-reference fix: ensure errors unrelated to the reply
reference (e.g. 50013 Missing Permissions) do NOT trigger the no-reference
retry path and still surface as a failed SendResult. Keeps the wider retry
condition from silently swallowing unrelated API errors.
Proposed in the original issue writeup (#11342) as test case
`test_non_reference_errors_still_propagate`.
* feat(skills): add 'hermes skills reset' to un-stick bundled skills
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.
Adds an escape hatch for this case.
hermes skills reset <name>
Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
re-baselines against the user's current copy. Future 'hermes update'
runs accept upstream changes again. Non-destructive.
hermes skills reset <name> --restore
Also deletes the user's copy and re-copies the bundled version.
Use when you want the pristine upstream skill back.
Also available as /skills reset in chat.
- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
repro, --restore, unknown-skill error, upstream-removed-skill, and
no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
section explaining the origin-hash mechanic + reset usage
* fix(auth): codex auth remove no longer silently undone by auto-import
'hermes auth remove openai-codex' appeared to succeed but the credential
reappeared on the next command. Two compounding bugs:
1. _seed_from_singletons() for openai-codex unconditionally re-imports
tokens from ~/.codex/auth.json whenever the Hermes auth store is
empty (by design — the Codex CLI and Hermes share that file). There
was no suppression check, unlike the claude_code seed path.
2. auth_remove_command's cleanup branch only matched
removed.source == 'device_code' exactly. Entries added via
'hermes auth add openai-codex' have source 'manual:device_code', so
for those the Hermes auth store's providers['openai-codex'] state was
never cleared on remove — the next load_pool() re-seeded straight
from there.
Net effect: there was no way to make a codex removal stick short of
manually editing both ~/.hermes/auth.json and ~/.codex/auth.json before
opening Hermes again.
Fix:
- Add unsuppress_credential_source() helper (mirrors
suppress_credential_source()).
- Gate the openai-codex branch in _seed_from_singletons() with
is_source_suppressed(), matching the claude_code pattern.
- Broaden auth_remove_command's codex match to handle both
'device_code' and 'manual:device_code' (via endswith check), always
call suppress_credential_source(), and print guidance about the
unchanged ~/.codex/auth.json file.
- Clear the suppression marker in auth_add_command's openai-codex
branch so re-linking via 'hermes auth add openai-codex' works.
~/.codex/auth.json is left untouched — that's the Codex CLI's own
credential store, not ours to delete.
Tests cover: unsuppress helper behavior, remove of both source
variants, add clears suppression, seed respects suppression. E2E
verified: remove → load → add → load flow now behaves correctly.
Add TestWeixinSendImageFileParameterName test class with two tests:
- test_send_image_file_uses_image_path_parameter: verifies the correct
parameter name (image_path) is used when gateway calls send_image_file
- test_send_image_file_works_without_optional_params: ensures minimal
params work correctly
This prevents the interface from drifting again as noted by Copilot.
The send_image_file method in WeixinAdapter used 'path' as parameter
name, but BasePlatformAdapter and gateway callers use 'image_path'.
This mismatch caused image sending to fail when called through the
gateway's extract_media path.
Changed parameter name from 'path' to 'image_path' to match the
interface defined in base.py and the calls in gateway/run.py.
discord.py does not apply a default AllowedMentions to the client, so any
reply whose content contains @everyone/@here or a role mention would ping
the whole server — including verbatim echoes of user input or LLM output
that happens to contain those tokens.
Set a safe default on commands.Bot: everyone=False, roles=False,
users=True, replied_user=True. Operators can opt back in via four
DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in
config.yaml. No behavior change for normal user/reply pings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First pass of test-suite reduction to address flaky CI and bloat.
Removed tests that fall into these change-detector patterns:
1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
that call inspect.getsource() on production modules and grep for string
literals. Break on any refactor/rename even when behavior is correct.
2. Platform enum tautologies (every gateway/test_X.py): assertions like
`Platform.X.value == 'x'` duplicated across ~9 adapter test files.
3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
only verify a key exists in a dict. Data-layout tests, not behavior.
4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing
_fallback): tests that do parser.parse_args([...]) then assert args.field.
Tests Python's argparse, not our code.
5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
cmd_X, call plugins_command with matching action, assert mock called.
Tests the if/elif chain, not behavior.
6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
that mock the external API client, call our function, and assert exact
kwargs. Break on refactor even when behavior is preserved.
7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
tests): tests that patch own helper method, then assert it was called.
Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.
Net reduction: 169 tests removed. 38 empty classes cleaned up.
Collected before: 12,522 tests
Collected after: 12,353 tests
The cache-read, cache-write, and total estimated-cost values shown in
/insights (and the per-model Cost column) were unreliable. Hide them from
both terminal and gateway renderings.
The underlying data pipeline is untouched — sessions still store
cache_read_tokens, cache_write_tokens, and estimated_cost_usd; the web
server, /usage command, and status bar are unaffected. Only the
InsightsEngine display layer is trimmed.
Changes:
- format_terminal: drop 'Cache read / Cache write' line, drop 'Est. cost'
from the Total tokens row, drop per-model 'Cost' column, drop the
'* Cost N/A for custom/self-hosted' footnote.
- format_gateway: drop cache breakdown from Tokens line, drop 'Est. cost'
line, drop per-model cost suffix.
- Tests updated to assert these strings are now absent.
* feat(skills): add 'hermes skills reset' to un-stick bundled skills
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.
Adds an escape hatch for this case.
hermes skills reset <name>
Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
re-baselines against the user's current copy. Future 'hermes update'
runs accept upstream changes again. Non-destructive.
hermes skills reset <name> --restore
Also deletes the user's copy and re-copies the bundled version.
Use when you want the pristine upstream skill back.
Also available as /skills reset in chat.
- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
repro, --restore, unknown-skill error, upstream-removed-skill, and
no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
section explaining the origin-hash mechanic + reset usage
* fix(nous): respect 'Skip (keep current)' after OAuth login
When a user already set up on another provider (e.g. OpenRouter) runs
`hermes model` and picks Nous Portal, OAuth succeeds and then a model
picker is shown. If the user picks 'Skip (keep current)', the previous
provider + model should be preserved.
Previously, \_update_config_for_provider was called unconditionally after
login, which flipped config.yaml model.provider to 'nous' while keeping
the old model.default (e.g. anthropic/claude-opus-4.6 from OpenRouter),
leaving the user with a mismatched provider/model pair on the next
request.
Fix: snapshot the prior active_provider before login, and if no model is
selected (Skip, or no models available, or fetch failure), restore the
prior active_provider and leave config.yaml untouched. The Nous OAuth
tokens stay saved so future `hermes model` -> Nous works without
re-authenticating.
Test plan:
- New tests cover Skip path (preserves provider+model, saves creds),
pick-a-model path (switches to nous), and fresh-install Skip path
(active_provider cleared, not stuck as 'nous').
The cherry-picked SDK compat fix (previous commit) wired process() to
parse CallbackMessage.data into a ChatbotMessage, but _extract_text()
was still written against the pre-0.20 payload shape:
* message.text changed from dict {content: ...} → TextContent object.
The old code's str(text) fallback produced 'TextContent(content=...)'
as the agent's input, so every received message came in mangled.
* rich_text moved from message.rich_text (list) to
message.rich_text_content.rich_text_list.
This preserves legacy fallbacks (dict-shaped text, bare rich_text list)
while handling the current SDK layout via hasattr(text, 'content').
Adds regression tests covering:
* webhook domain allowlist (api.*, oapi.*, and hostile lookalikes)
* _IncomingHandler.process is a coroutine function
* _extract_text against TextContent object, dict, rich_text_content,
legacy rich_text, and empty-message cases
Also adds kevinskysunny to scripts/release.py AUTHOR_MAP (release CI
blocks unmapped emails).
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.
Adds an escape hatch for this case.
hermes skills reset <name>
Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
re-baselines against the user's current copy. Future 'hermes update'
runs accept upstream changes again. Non-destructive.
hermes skills reset <name> --restore
Also deletes the user's copy and re-copies the bundled version.
Use when you want the pristine upstream skill back.
Also available as /skills reset in chat.
- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
repro, --restore, unknown-skill error, upstream-removed-skill, and
no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
section explaining the origin-hash mechanic + reset usage
Three tests in tests/test_plugin_skills.py and tests/hermes_cli/test_plugins.py
used caplog.at_level(logging.WARNING) without specifying a logger. When another
test earlier in the same xdist worker touched propagation on tools.skills_tool
or hermes_cli.plugins, caplog would miss the warning and the assertion would
fail intermittently in CI.
These three tests accounted for 15 of the last ~30 Tests workflow failures
(5 each), including the recent main failure on commit 436a7359 (PR #11398).
Fix: pass logger="tools.skills_tool" / logger="hermes_cli.plugins" to
caplog.at_level() so the handler attaches directly to the logger under test
and capture is independent of global propagation state.
Affected tests:
- tests/test_plugin_skills.py::TestSkillViewPluginGuards::test_injection_logged_but_served
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_empty_name_rejected
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_builtin_conflict_rejected
No production code change. Verified passing under xdist (-n 4) alongside
test_hermes_logging.py (the test most likely to poison the logger state).
Extends test_build_event_handler_registers_reaction_and_card_processors
to assert that register_p2_im_chat_access_event_bot_p2p_chat_entered_v1
and register_p2_im_message_recalled_v1 are called when building the
event handler, matching the production registrations.
Also adds Fatty911 to scripts/release.py AUTHOR_MAP for credit on the
salvaged event-handler fix.
* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro
Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.
Recraft V3 → Recraft V4 Pro
ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
Schema: V4 dropped the required `style` enum entirely; defaults
handle taste now. Added `colors` and `background_color`
to supports for brand-palette control. `seed` is not
supported by V4 per the API docs.
Nano Banana → Nano Banana Pro
ID: fal-ai/nano-banana → fal-ai/nano-banana-pro
Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
Schema: Aspect ratio family unchanged. Added `resolution`
(1K/2K/4K, default 1K for billing predictability),
`enable_web_search` (real-time info grounding, +$0.015),
and `limit_generations` (force exactly 1 image).
Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
and reasoning depth improved; slower (~6s → ~8s).
Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.
Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).
* docs: wrap '<1s' in backticks to unblock MDX compilation
Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).
Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.
The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
* feat(mcp-oauth): scaffold MCPOAuthManager
Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).
No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).
* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch
Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.
Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.
* refactor(mcp-oauth): extract build_oauth_auth helpers
Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.
Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.
* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly
_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.
build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:
MCPOAuthManager.get_or_build_provider ->
_build_provider ->
_configure_callback_port
_build_client_metadata
_maybe_preregister_client
_parse_base_url
HermesMCPOAuthProvider(...)
Task 4 of 8.
* feat(mcp): wire OAuth manager + add _reconnect_event
MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).
_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.
shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.
Task 5 of 8.
* feat(mcp): detect auth failures in tool handlers, trigger reconnect
All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:
1. If the manager says recovery is viable (disk has fresh tokens,
or SDK can refresh in-place), signal MCPServerTask._reconnect_event
to tear down and rebuild the MCP session with fresh credentials,
then retry the tool call once.
2. If no recovery path exists, return a structured needs_reauth
JSON error so the model stops hallucinating manual refresh
attempts (the 'let me curl the token endpoint' loop Cthulhu
pasted from Discord).
_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.
Task 6 of 8.
* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'
cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.
New 'hermes mcp login <name>' command:
- Wipes both the on-disk tokens file and the in-memory
MCPOAuthManager cache
- Triggers a fresh OAuth browser flow via the existing probe
path
- Intended target for the needs_reauth error Task 6 returns
to the model
Task 7 of 8.
* test(mcp-oauth): end-to-end integration tests
Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):
1. external_refresh_picked_up_without_restart — Cthulhu's cron
workflow. External process writes fresh tokens to disk;
on the next auth flow the manager's mtime-watch flips
_initialized and the SDK re-reads from storage.
2. handle_401_deduplicates_concurrent_callers — 10 concurrent
handlers for the same failed token fire exactly ONE recovery
attempt (thundering-herd protection).
3. handle_401_returns_false_when_no_provider — defensive path
for unknown servers.
4. invalidate_if_disk_changed_handles_missing_file — pre-auth
state returns False cleanly.
5. provider_is_reused_across_reconnects — cache stickiness so
reconnects preserve the disk-watch baseline mtime.
Task 8 of 8 — consolidation complete.
Mirrors OpenRouter which already lists anthropic/claude-opus-4.7 as
recommended. Surfaces the model in the `hermes model` picker and the
gateway /model flow for Nous Portal users.
Context length (1M) is already covered by the existing claude-opus-4.7
entry in agent/model_metadata.py DEFAULT_CONTEXT_LENGTHS.
* docs: fix ascii-guard border alignment errors
Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:
- architecture.md: outer box is 71 cols but inner-box content lines
and border corners were offset by 1 col, making content-line right
border at col 70/72 while top/bottom border was at col 71. Inner
boxes also had border corners at cols 19/36/53 but content pipes
at cols 20/37/54. Rewrote the diagram with consistent 71-col width
throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
2-space gaps and 15-space trailing padding.
- gateway-internals.md: same class of issue — outer box at 51 cols,
inner content lines varied 52-54 cols. Rewrote with consistent
51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
restructured the bottom-half message flow so it's bare text
(not half-open box cells) matching the intent of the original.
- agent-loop.md line 112-114: box 2 (API thread) content lines had
one extra space pushing the right border to col 46 while the top
and bottom borders of that box sat at col 45. Trimmed one trailing
space from each of the three content lines.
All 123 docs files now pass `npm run lint:diagrams`:
✓ Errors: 0 (warnings: 6, non-fatal)
Pre-existing failures on main — unrelated to any open PR.
* test(setup): accept description kwarg in prompt_choice mock lambdas
setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.
Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.
* test(matrix): update stale audio-caching assertion
test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.
Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.
* test(413): allow multi-pass preflight compression
run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).
test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.
The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.
* docs: drop '...' from gateway diagram, merge side-by-side boxes
ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:
1. gateway-internals.md L33: the '...' suffix after inner box 3's
right border got parsed as 'extra characters after inner-box right
border'. Dropped the '...' — the surrounding prose already conveys
'and more platforms' without needing the visual hint.
2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
boxes of different heights (main thread 7 rows, API thread 5 rows).
Even equalizing heights didn't help — the linter treats the left
box's right border as the end of the diagram. Merged into a single
54-char-wide outer box with both threads labeled as regions inside,
keeping the ▶ arrow to preserve the main→API flow direction.
Previous pass assumed both skills would always be loaded together, so
each description pointed at the other ('use concept-diagrams instead').
That breaks when only one skill is active — the agent reads 'use the
other skill' and there is no other skill.
Now each skill's description and scope section is fully self-contained:
- States what it's best suited for
- Lists subjects where a more specialized skill (if available) would be
a better fit, naming them only as 'consider X if available'
- Explicitly offers itself as a general SVG diagram fallback when no
more specialized skill exists
An agent loading either skill alone gets unambiguous guidance; an
agent with both loaded still gets useful routing via the 'consider X
if available' hints and the related_skills metadata.