Commit graph

641 commits

Author SHA1 Message Date
Teknium
7d888ab49c test(discord): regression guard for DISCORD_ALLOW_BOTS auth bypass
Six test cases covering:
- DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security)
- DISCORD_ALLOW_BOTS unset → same as 'none'
- Humans still checked against allowlist even with allow_bots=all
- Bot bypass is Discord-specific — doesn't leak to other platforms

Guards against a regression where the is_bot bypass in _is_user_authorized
gets moved, removed, or accidentally extended to other platforms.
2026-04-17 05:42:04 -07:00
Teknium
d7fb435e0e
fix(discord): flat /skill command with autocomplete — fits 8KB limit trivially (#11580)
Closes #11321, closes #10259.

## Problem

The nested /skill command group (category subcommand groups + skill
subcommands) serialized to ~14KB with the default 75-skill catalog,
exceeding Discord's ~8000-byte per-command registration payload. The
entire tree.sync() rejected with error 50035 — ALL slash commands
including the 27 base commands failed to register.

## Fix

Replace the nested Group layout with a single flat Command:

    /skill name:<autocomplete> args:<optional string>

Autocomplete options are fetched dynamically by Discord when the user
types — they do NOT count against the per-command registration budget.
So this single command registers at ~200 bytes regardless of how many
skills exist. Scales to thousands of skills with no size calculations,
no splitting, no hidden skills.

UX improvements:
- Discord live-filters by user's typed prefix against BOTH name and
  description, so '/skill pdf' finds 'ocr-and-documents' via its
  description. More discoverable than clicking through category menus.
- Unknown skill name → ephemeral error pointing user at autocomplete.
- Stable alphabetical ordering across restarts.

## Why not the other proposed approaches

Three prior PRs tried to fit within the 8KB limit by modifying the
nested layout:

- #10214 (njiangk): truncated all descriptions to 'Run <name>' and
  category descriptions to 'Skills'. Works but destroys slash picker UX.
- #11385 (LeonSGP43): 40-char description clamp + iterative
  trim-largest-category fallback. Works but HIDES skills the user can
  no longer invoke via slash — functional regression.
- #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups.
  Preserves all skills but pollutes the slash namespace with 20
  top-level commands.

All three work around the symptom. The flat autocomplete design
dissolves the problem — there is no payload-size pressure to manage.

## Tests

tests/gateway/test_discord_slash_commands.py — 5 new test cases replace
the 3 old nested-structure tests:

- flat-not-nested structure assertion
- empty skills → no command registered
- callback dispatches the right cmd_key by name
- unknown name → ephemeral error, no dispatch
- large-catalog regression guard (500 skills) — command payload stays
  under 500 bytes regardless

E2E validated against real discord.py 2.7.1:
- Command registers as discord.app_commands.Command (not Group).
- Autocomplete filters by name AND description (verified across several
  queries including description-only matches like 'pdf' → OCR skill).
- 500-skill catalog returns max 25 results per autocomplete query
  (Discord's hard cap), filtered correctly.
- Choice labels formatted as 'name — description' clamped to 100 chars.
2026-04-17 05:19:14 -07:00
Berny Linville
6ee65b4d61 fix(weixin): preserve native markdown rendering
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering

Closes #10308
2026-04-17 05:01:29 -07:00
Patrick Wang
4ed6e4c1a5 refactor(weixin): drop pilk dependency from voice fallback 2026-04-17 05:01:29 -07:00
Patrick Wang
649f38390c fix: force Weixin voice fallback to file attachments 2026-04-17 05:01:29 -07:00
Patrick Wang
678b69ec1b fix(weixin): use Tencent SILK encoding for voice replies 2026-04-17 05:01:29 -07:00
Teknium
53da34a4fc
fix(discord): route attachment downloads through authenticated bot session (#11568)
Three open issues — #8242, #6587, #11345 — all trace to the same root
cause: the image / audio / document download paths in
`DiscordAdapter._handle_message` used plain, unauthenticated HTTP to
fetch `att.url`. That broke in three independent ways:

  #8242  cdn.discordapp.com attachment URLs increasingly require the
         bot session to download; unauthenticated httpx sees 403
         Forbidden, image/voice analysis fail silently.

  #6587  Some user environments (VPNs, corporate DNS, tunnels) resolve
         cdn.discordapp.com to private-looking IPs. Our is_safe_url()
         guard correctly blocks them as SSRF risks, but the user
         environment is legitimate — image analysis and voice STT die.

  #11345 The document download path skipped is_safe_url() entirely —
         raw aiohttp.ClientSession.get(att.url) with no SSRF check,
         inconsistent with the image/audio branches.

Unified fix: use `discord.Attachment.read()` as the primary download
path on all three branches. `att.read()` routes through discord.py's
own authenticated HTTPClient, so:

  - Discord CDN auth is handled (#8242 resolved).
  - Our is_safe_url() gate isn't consulted for the attachment path at
    all — the bot session handles networking internally (#6587 resolved).
  - All three branches now share the same code path, eliminating the
    document-path SSRF gap (#11345 resolved).

Falls back to the existing cache_*_from_url helpers (image/audio) or an
SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable
or fails — preserves defense-in-depth for any future payload-schema
drift that could slip a non-CDN URL into att.url.

New helpers on DiscordAdapter:
  - _read_attachment_bytes(att)  — safe att.read() wrapper
  - _cache_discord_image(att, ext)     — primary + URL fallback
  - _cache_discord_audio(att, ext)     — primary + URL fallback
  - _cache_discord_document(att, ext)  — primary + SSRF-gated aiohttp fallback

Tests:
  - tests/gateway/test_discord_attachment_download.py — 12 new cases
    covering all three helpers: primary path, fallback on missing
    .read(), fallback on validator rejection, SSRF guard on document
    fallback, aiohttp fallback happy-path, and an E2E case via
    _handle_message confirming cache_image_from_url is never invoked
    when att.read() succeeds.
  - All 11 existing document-handling tests continue to pass via the
    aiohttp fallback path (their SimpleNamespace attachments have no
    .read(), which triggers the fallback — now SSRF-gated).

Closes #8242, closes #6587, closes #11345.
2026-04-17 04:59:03 -07:00
LehaoLin
504e7eb9e5 fix(gateway): wait for reconnection before dropping WebSocket sends
When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily
loses its connection, send() now polls is_connected for up to 15s
instead of immediately returning a non-retryable failure. If the
auto-reconnect completes within the window, the message is delivered
normally. On timeout, the SendResult is marked retryable=True so the
base class retry mechanism can attempt re-delivery.

Same treatment applied to _send_media().

Adds 4 async tests covering:
- Successful send after simulated reconnection
- Retryable failure on timeout
- Immediate success when already connected
- _send_media reconnection wait

Fixes #11163
2026-04-17 04:22:40 -07:00
dieutx
995177d542 fix(gateway): honor QQ_GROUP_ALLOWED_USERS in runner auth 2026-04-17 04:22:40 -07:00
Pedro Gonzalez
590c9964e1 Fix QQ voice attachment SSRF validation 2026-04-17 04:22:40 -07:00
Teknium
aca81ac7bb test(dingtalk): cover require_mention + allowed_users gating
Adds 16 regression tests for the gating logic introduced in the
salvaged commit:

  * TestAllowedUsersGate — empty/wildcard/case-insensitive matching,
    staff_id vs sender_id, env var CSV population
  * TestMentionPatterns — compilation, case-insensitivity, invalid
    regex is skipped-not-raised, JSON env var, newline fallback
  * TestShouldProcessMessage — DM always accepted, group gating via
    require_mention / is_in_at_list / wake-word pattern / free_response_chats

Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks
unmapped emails).
2026-04-17 04:21:49 -07:00
Teknium
eabe14af1c test(discord): update reply_mode fixture for new to_reference() wrapping
Follow-up to the reply-reference fix: `_make_discord_adapter` used to return
the raw fetched `Message` as the expected reference, but the adapter now
wraps it via `ref_msg.to_reference(fail_if_not_exists=False)` so Discord
treats a deleted target as 'send without reply chip'. Update the fixture
to return the MessageReference sentinel so the 4 chunk-reference-identity
tests assert against the right object.

No production behavior change; only aligns the stale test fixture.
2026-04-17 04:17:56 -07:00
Teknium
ef37aa7cce test(discord): add regression guard for non-reference send errors
Follow-up to the reply-reference fix: ensure errors unrelated to the reply
reference (e.g. 50013 Missing Permissions) do NOT trigger the no-reference
retry path and still surface as a failed SendResult. Keeps the wider retry
condition from silently swallowing unrelated API errors.

Proposed in the original issue writeup (#11342) as test case
`test_non_reference_errors_still_propagate`.
2026-04-17 04:17:56 -07:00
LeonSGP43
a448e7a04d fix(discord): drop invalid reply references 2026-04-17 04:17:56 -07:00
Asunfly
7c932c5aa4 fix(dingtalk): close websocket on disconnect 2026-04-17 04:11:30 -07:00
赵晨飞
82969615bb test(weixin): add regression test for send_image_file parameter name
Add TestWeixinSendImageFileParameterName test class with two tests:
- test_send_image_file_uses_image_path_parameter: verifies the correct
  parameter name (image_path) is used when gateway calls send_image_file
- test_send_image_file_works_without_optional_params: ensures minimal
  params work correctly

This prevents the interface from drifting again as noted by Copilot.
2026-04-17 04:09:21 -07:00
Michel Belleau
efa6c9f715 fix(discord): default allowed_mentions to block @everyone and role pings
discord.py does not apply a default AllowedMentions to the client, so any
reply whose content contains @everyone/@here or a role mention would ping
the whole server — including verbatim echoes of user input or LLM output
that happens to contain those tokens.

Set a safe default on commands.Bot: everyone=False, roles=False,
users=True, replied_user=True. Operators can opt back in via four
DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in
config.yaml. No behavior change for normal user/reply pings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 04:08:42 -07:00
Teknium
2367c6ffd5
test: remove 169 change-detector tests across 21 files (#11472)
First pass of test-suite reduction to address flaky CI and bloat.

Removed tests that fall into these change-detector patterns:

1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
   that call inspect.getsource() on production modules and grep for string
   literals. Break on any refactor/rename even when behavior is correct.

2. Platform enum tautologies (every gateway/test_X.py): assertions like
   `Platform.X.value == 'x'` duplicated across ~9 adapter test files.

3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
   only verify a key exists in a dict. Data-layout tests, not behavior.

4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing
   _fallback): tests that do parser.parse_args([...]) then assert args.field.
   Tests Python's argparse, not our code.

5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
   cmd_X, call plugins_command with matching action, assert mock called.
   Tests the if/elif chain, not behavior.

6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
   test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
   that mock the external API client, call our function, and assert exact
   kwargs. Break on refactor even when behavior is preserved.

7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
   tests): tests that patch own helper method, then assert it was called.

Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.

Net reduction: 169 tests removed. 38 empty classes cleaned up.

Collected before: 12,522 tests
Collected after:  12,353 tests
2026-04-17 01:05:09 -07:00
Teknium
3438d274f6 fix(dingtalk): repair _extract_text for dingtalk-stream >= 0.20 SDK shape
The cherry-picked SDK compat fix (previous commit) wired process() to
parse CallbackMessage.data into a ChatbotMessage, but _extract_text()
was still written against the pre-0.20 payload shape:

  * message.text changed from dict {content: ...} → TextContent object.
    The old code's str(text) fallback produced 'TextContent(content=...)'
    as the agent's input, so every received message came in mangled.
  * rich_text moved from message.rich_text (list) to
    message.rich_text_content.rich_text_list.

This preserves legacy fallbacks (dict-shaped text, bare rich_text list)
while handling the current SDK layout via hasattr(text, 'content').

Adds regression tests covering:
  * webhook domain allowlist (api.*, oapi.*, and hostile lookalikes)
  * _IncomingHandler.process is a coroutine function
  * _extract_text against TextContent object, dict, rich_text_content,
    legacy rich_text, and empty-message cases

Also adds kevinskysunny to scripts/release.py AUTHOR_MAP (release CI
blocks unmapped emails).
2026-04-17 00:52:35 -07:00
Teknium
816e3e3774 test(feishu): cover new SDK event handler registrations
Extends test_build_event_handler_registers_reaction_and_card_processors
to assert that register_p2_im_chat_access_event_bot_p2p_chat_entered_v1
and register_p2_im_message_recalled_v1 are called when building the
event handler, matching the production registrations.

Also adds Fatty911 to scripts/release.py AUTHOR_MAP for credit on the
salvaged event-handler fix.
2026-04-16 22:08:11 -07:00
Teknium
24fa055763
fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373)
* docs: fix ascii-guard border alignment errors

Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:

- architecture.md: outer box is 71 cols but inner-box content lines
  and border corners were offset by 1 col, making content-line right
  border at col 70/72 while top/bottom border was at col 71. Inner
  boxes also had border corners at cols 19/36/53 but content pipes
  at cols 20/37/54. Rewrote the diagram with consistent 71-col width
  throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
  2-space gaps and 15-space trailing padding.

- gateway-internals.md: same class of issue — outer box at 51 cols,
  inner content lines varied 52-54 cols. Rewrote with consistent
  51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
  restructured the bottom-half message flow so it's bare text
  (not half-open box cells) matching the intent of the original.

- agent-loop.md line 112-114: box 2 (API thread) content lines had
  one extra space pushing the right border to col 46 while the top
  and bottom borders of that box sat at col 45. Trimmed one trailing
  space from each of the three content lines.

All 123 docs files now pass `npm run lint:diagrams`:
  ✓ Errors: 0  (warnings: 6, non-fatal)

Pre-existing failures on main — unrelated to any open PR.

* test(setup): accept description kwarg in prompt_choice mock lambdas

setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.

Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.

* test(matrix): update stale audio-caching assertion

test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.

Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.

* test(413): allow multi-pass preflight compression

run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).

test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.

The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.

* docs: drop '...' from gateway diagram, merge side-by-side boxes

ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:

1. gateway-internals.md L33: the '...' suffix after inner box 3's
   right border got parsed as 'extra characters after inner-box right
   border'. Dropped the '...' — the surrounding prose already conveys
   'and more platforms' without needing the visual hint.

2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
   boxes of different heights (main thread 7 rows, API thread 5 rows).
   Even equalizing heights didn't help — the linter treats the left
   box's right border as the end of the diagram. Merged into a single
   54-char-wide outer box with both threads labeled as regions inside,
   keeping the ▶ arrow to preserve the main→API flow direction.
2026-04-16 20:43:41 -07:00
Teknium
7af9bf3a54
fix(feishu): queue inbound events when adapter loop not ready (#5499) (#11372)
Inbound Feishu messages arriving during brief windows when the adapter
loop is unavailable (startup/restart transitions, network-flap reconnect)
were silently dropped with a WARNING log. This matches the symptom in
issue #5499 — and users have reported seeing only a subset of their
messages reach the agent.

Fix: queue pending events in a thread-safe list and spawn a single
drainer thread that replays them once the loop becomes ready. Covers
these scenarios:

  * Queue events instead of dropping when loop is None/closed
  * Single drainer handles the full queue (not thread-per-event)
  * Thread-safe with threading.Lock on the queue and schedule flag
  * Handles mid-drain bursts (new events arrive while drainer is working)
  * Handles RuntimeError if loop closes between check and submit
  * Depth cap (1000) prevents unbounded growth during extended outages
  * Drops queue cleanly on disconnect rather than holding forever
  * Safety timeout (120s) prevents infinite retention on broken adapters

Based on the approach proposed in #4789 by milkoor, rewritten for
thread-safety and correctness.

Test plan:
  * 5 new unit tests (TestPendingInboundQueue) — all passing
  * E2E test with real asyncio loop + fake WS thread: 10-event burst
    before loop ready → all 10 delivered in order
  * E2E concurrent burst test: 20 events queued, 20 more arrive during
    drainer dispatch → all 40 delivered, no loss, no duplicates
  * All 111 existing feishu tests pass

Related: #5499, #4789

Co-authored-by: milkoor <milkoor@users.noreply.github.com>
2026-04-16 20:36:59 -07:00
Brooklyn Nicholson
7f1204840d test(tui): fix stale mocks + xdist flakes in TUI test suite
All 61 TUI-related tests green across 3 consecutive xdist runs.

tests/tui_gateway/test_protocol.py:
- rename `get_messages` → `get_messages_as_conversation` on mock DB (method
  was renamed in the real backend, test was still stubbing the old name)
- update tool-message shape expectation: `{role, name, context}` matches
  current `_history_to_messages` output, not the legacy `{role, text}`

tests/hermes_cli/test_tui_resume_flow.py:
- `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes
  setup" before `_launch_tui` was ever reached; 3 tests stubbed
  `_resolve_last_session` + `_launch_tui` but not the gate
- factored a `main_mod` fixture that stubs `_has_any_provider_configured`,
  reused by all three tests

tests/test_tui_gateway_server.py:
- `test_config_set_personality_resets_history_and_returns_info` was flaky
  under xdist because the real `_write_config_key` touches
  `~/.hermes/config.yaml`, racing with any other worker that writes
  config. Stub it in the test.
2026-04-16 19:07:49 -05:00
helix4u
5d7d574779 fix(gateway): let /queue bypass active-session guard 2026-04-16 16:36:40 -07:00
Brooklyn Nicholson
3746c60439 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 18:25:49 -05:00
Siddharth Balyan
d38b73fa57
fix(matrix): E2EE and migration bugfixes (#10860)
* - make buffered streaming
- fix path naming to expand `~` for agent.
- fix stripping of matrix ID to not remove other mentions / localports.

* fix(matrix): register MembershipEventDispatcher for invite auto-join

The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE
events are only dispatched when MembershipEventDispatcher is registered on the
client. Without it, _on_invite is dead code and the bot silently ignores all
room invites.

Closes #10094
Closes #10725
Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz)

* fix(matrix): preserve _joined_rooms reference for CryptoStateStore

connect() reassigned self._joined_rooms = set(...) after initial sync,
orphaning the reference captured by _CryptoStateStore at init time.
find_shared_rooms() returned [] forever, breaking Megolm session rotation
on membership changes.

Mutate in place with clear() + update() so the CryptoStateStore reference
stays valid.

Refs #8174, PR #8215

* fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race

mautrix auto-registers DecryptionDispatcher when client.crypto is set.
The adapter also registered _on_encrypted_event for the same event type.
_on_encrypted_event had zero awaits and won the race to mark event IDs
in the dedup set, causing _on_room_message to drop successfully decrypted
events from DecryptionDispatcher. The retry loop masked this by re-decrypting
every message ~4 seconds later.

Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption;
genuinely undecryptable events are logged by mautrix and retried on next
key exchange.

Refs #8174, PR #8215

* fix(matrix): re-verify device keys after share_keys() upload

Matrix homeservers treat ed25519 identity keys as immutable per device.
share_keys() can return 200 but silently ignore new keys if the device
already exists with different identity keys. The bot would proceed with
shared=True while peers encrypt to the old (unreachable) keys.

Now re-queries the server after share_keys() and fails closed if keys
don't match, with an actionable error message.

Refs #8174, PR #8215

* fix(matrix): encrypt outbound attachments in E2EE rooms

_upload_and_send() uploaded raw bytes and used the 'url' key for all
rooms. In E2EE rooms, media must be encrypted client-side with
encrypt_attachment(), the ciphertext uploaded, and the 'file' key
(with key/iv/hashes) used instead of 'url'.

Now detects encrypted rooms via state_store.is_encrypted() and
branches to the encrypted upload path.

Refs: PR #9822 (charles-brooks)

* fix(matrix): add stop_typing to clear typing indicator after response

The adapter set a 30-second typing timeout but never cleared it.
The base class stop_typing() is a no-op, so the typing indicator
lingered for up to 30 seconds after each response.

Closes #6016
Refs: PR #6020 (r266-tech)

* fix(matrix): cache all media types locally, not just photos/voice

should_cache_locally only covered PHOTO, VOICE, and encrypted media.
Unencrypted audio/video/documents in plaintext rooms were passed as MXC
URLs that require authentication the agent doesn't have, resulting
in 401 errors.

Refs #3487, #3806

* fix(matrix): detect stale OTK conflict on startup and fail closed

When crypto state is wiped but the same device ID is reused, the
homeserver may still hold one-time keys signed with the previous
identity key. Identity key re-upload succeeds but OTK uploads fail
with "already exists" and a signature mismatch. Peers cannot
establish new Olm sessions, so all new messages are undecryptable.

Now proactively flushes OTKs via share_keys() during connect() and
catches the "already exists" error with an actionable log message
telling the operator to purge the device from the homeserver or
generate a fresh device ID.

Also documents the crypto store recovery procedure in the Matrix
setup guide.

Refs #8174

* docs(matrix): improve crypto recovery docs per review

- Put easy path (fresh access token) first, manual purge second
- URL-encode user ID in Synapse admin API example
- Note that device deletion may invalidate the access token
- Add "stop Synapse first" caveat for direct SQLite approach
- Mention the fail-closed startup detection behavior
- Add back-reference from upgrade section to OTK warning

* refactor(matrix): cleanup from code review

- Extract _extract_server_ed25519() and _reverify_keys_after_upload()
  to deduplicate the re-verification block (was copy-pasted in two
  places, three copies of ed25519 key extraction total)
- Remove dead code: _pending_megolm, _retry_pending_decryptions,
  _MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after
  removing _on_encrypted_event
- Remove tautological TestMediaCacheGate (tested its own predicate,
  not production code)
- Remove dead TestMatrixMegolmEventHandling and
  TestMatrixRetryPendingDecryptions (tested removed methods)
- Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator
- Trim comment to just the "why"
2026-04-17 04:03:02 +05:30
Brooklyn Nicholson
842a122964 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 15:37:28 -05:00
asheriif
6c34bf3d00 fix(gateway): fix matrix read receipts 2026-04-16 13:18:12 -07:00
Brooklyn Nicholson
9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
jackjin1997
f5ac025714 fix(gateway): guard pending_event.channel_prompt against None in recursive _run_agent
Initialize next_channel_prompt before the pending_event check and use
getattr with None default, matching the existing pattern for
next_source/next_message/next_message_id. Prevents AttributeError
when pending_event is None (interrupt path).

Cherry-picked from #10953 by @jackjin1997.
2026-04-16 07:45:27 -07:00
Teknium
77bdad5b02
fix(tests): resolve 12 CI failures + 10 errors across 6 root causes (#11040)
Group A (3 tests): 'No LLM provider configured' RuntimeError
- test_user_message_surrogates_sanitized, test_counters_initialized_in_init,
  test_openai_prompt_tokens_unchanged
- Root cause: AIAgent.__init__ now requires base_url alongside api_key to
  skip resolve_provider_client() (which returns None when API keys are
  blanked in CI). Added base_url='http://localhost:1234/v1' to test
  agent construction.

Group B (5 tests): Discord slash command auto-registration
- test_auto_registers_missing_gateway_commands, test_auto_registered_command_*,
  test_register_skill_group_*
- Root cause: xdist workers that loaded a discord mock WITHOUT
  app_commands.Command/Group caused _register_slash_commands() to fail
  silently. Added comprehensive shared discord mock in
  tests/gateway/conftest.py (same pattern as existing telegram mock).

Group C (5 errors): Discord reply mode 'NoneType has no DMChannel'
- All TestReplyToText tests
- Root cause: FakeDMChannel was not a subclass of real discord.DMChannel,
  so isinstance() checks in _handle_message failed when running in full
  suite (real discord installed). Made FakeDMChannel inherit from
  discord.DMChannel when available. Removed fragile monkeypatch approach.

Group D (2 tests): detect_provider_for_model wrong provider
- test_openrouter_slug_match (got 'ai-gateway'), test_bare_name_gets_
  openrouter_slug (got 'copilot')
- Root cause: ai-gateway, copilot, and kilocode are multi-vendor
  aggregators that list other providers' models (OpenRouter-style slugs).
  They were being matched in Step 1 before OpenRouter. Added all three
  to _AGGREGATORS set so they're skipped like nous/openrouter.

Group E (1 test): model_flow_custom StopIteration
- test_model_flow_custom_saves_verified_v1_base_url
- Root cause: 'Display name' prompt was added after the test was written.
  The input iterator had 5 answers but the flow now asks 6 questions.
  Added 6th empty string answer.

Group F (1 test): Telegram proxy env assertion
- test_uses_proxy_env_for_primary_and_fallback_transports
- Root cause: _resolve_proxy_url() now checks TELEGRAM_PROXY first
  (via resolve_proxy_url('TELEGRAM_PROXY')). Test didn't clear this
  env var, allowing potential leakage from other tests in xdist workers.
  Added TELEGRAM_PROXY to the cleanup list.
2026-04-16 06:49:36 -07:00
Teknium
3c42064efc
fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029)
config.yaml terminal.cwd is now the single source of truth for working
directory. MESSAGING_CWD and TERMINAL_CWD in .env are deprecated with a
migration warning.

Changes:

1. config.py: Remove MESSAGING_CWD from OPTIONAL_ENV_VARS (setup wizard
   no longer prompts for it). Add warn_deprecated_cwd_env_vars() that
   prints a migration hint when deprecated env vars are detected.

2. gateway/run.py: Replace all MESSAGING_CWD reads with TERMINAL_CWD
   (which is bridged from config.yaml terminal.cwd). MESSAGING_CWD is
   still accepted as a backward-compat fallback with deprecation warning.
   Config bridge skips cwd placeholder values so they don't clobber
   the resolved TERMINAL_CWD.

3. cli.py: Guard against lazy-import clobbering — when cli.py is
   imported lazily during gateway runtime (via delegate_tool), don't
   let load_cli_config() overwrite an already-resolved TERMINAL_CWD
   with os.getcwd() of the service's working directory. (#10817)

4. hermes_cli/main.py: Add 'hermes memory reset' command with
   --target all/memory/user and --yes flags. Profile-scoped via
   HERMES_HOME.

Migration path for users with .env settings:
  Remove MESSAGING_CWD / TERMINAL_CWD from .env
  Add to config.yaml:
    terminal:
      cwd: /your/project/path

Addresses: #10225, #4672, #10817, #7663
2026-04-16 06:48:33 -07:00
kshitijk4poor
a6142a8e08 fix: follow-up for salvaged PR #10854
- Extract duplicated activity-callback polling into shared
  touch_activity_if_due() helper in tools/environments/base.py
- Use helper from both base.py _wait_for_process and
  code_execution_tool.py local polling loop (DRY)
- Add test assertion that timeout output field contains the
  timeout message and emoji (#10807)
- Add stream_consumer test for tool-boundary fallback scenario
  where continuation is empty but final_text differs from
  visible prefix (#10807)
2026-04-16 06:42:45 -07:00
LeonSGP43
465193b7eb fix(gateway): close temporary agents after one-off tasks
Add shared _cleanup_agent_resources() for temporary gateway AIAgent
instances. Apply cleanup to memory flush, background tasks, /btw,
manual /compress, and session-hygiene auto-compression. Prevents
unclosed aiohttp client session leaks.

Cherry-picked from #10899 by @LeonSGP43. Consolidates #10945 by @Lubrsy706.
Fixes #10865.

Co-authored-by: Lubrsy706 <Lubrsy706@users.noreply.github.com>
2026-04-16 06:31:23 -07:00
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Teknium
fa830a49e0 test: add cancellation handler delivery confirmation tests
5 tests covering the stream_consumer.py cancellation handler fix:
- partial-only (no accumulated) stays False
- best-effort send succeeds → True
- best-effort send fails → stays False (gateway fallback delivers)
- preserves existing True through cancellation
- regression: old code would have promoted partial to final
2026-04-16 05:53:18 -07:00
Dave Tist
35bbc6851b fix(gateway): honor previewed replies in queued follow-ups 2026-04-16 05:53:18 -07:00
Dave Tist
d67e602cc8 fix: only suppress gateway replies after confirmed final stream delivery
(cherry picked from commit 675249085b383fff305cc84b8aeacd6dd20c7b14)
2026-04-16 05:53:18 -07:00
helix4u
bd7e272c1f fix(slack): per-thread sessions for DMs by default
Each top-level Slack DM now gets its own Hermes session, matching the
per-thread behavior channels already have. Previously all top-level DM
messages shared one continuous session because thread_ts was None,
causing context to accumulate across unrelated conversations.

The behavior is controlled by platforms.slack.extra.dm_top_level_threads_as_sessions
in config.yaml (default: true). Set to false to restore legacy behavior.

Based on PR #10789 by helix4u. Changes from original:
- Default flipped to true (was opt-in, now opt-out)
- Removed env var fallback (config.yaml only per project policy)
- Tests updated to cover both default and opt-out paths
2026-04-16 04:22:33 -07:00
Peter Berthelsen
9a9b8cd1e4 fix: keep rapid telegram follow-ups from getting cut off 2026-04-16 02:44:00 -07:00
Teknium
e4cd62d07d
fix(tests): resolve remaining CI failures — commit_memory_session, already_sent, timezone leak, session env (#10785)
Fixes 12 CI test failures:

1. test_cli_new_session (4): _FakeAgent missing commit_memory_session
   attribute added in the memory provider refactoring. Added MagicMock.

2. test_run_progress_topics (1): already_sent detection only checked
   stream consumer flags, missing the response_previewed path from
   interim_assistant_callback. Restructured guard to check both paths.

3. test_timezone (1): HERMES_TIMEZONE leaked into child processes via
   _SAFE_ENV_PREFIXES matching HERMES_*. The code correctly converts
   it to TZ but didn't remove the original. Added child_env.pop().

4. test_session_env (1): contextvars baseline captured from a different
   context couldn't be restored after clear. Changed assertion to verify
   the test's value was removed rather than comparing to a fragile baseline.

5. test_discord_slash_commands (5): already fixed on current main.
2026-04-16 02:26:14 -07:00
helix4u
8021a735c2 fix(gateway): preserve notify context in executor threads
Gateway executor work now inherits the active session contextvars via
copy_context() so background process watchers retain the correct
platform/chat/user/session metadata for routing completion events back
to the originating chat.

Cherry-picked from #10647 by @helix4u with:
- Use asyncio.get_running_loop() instead of deprecated get_event_loop()
- Strip trailing whitespace
- Add *args forwarding test
- Add exception propagation test
2026-04-16 02:05:59 -07:00
Markus Corazzione
0cf7d570e2 fix(telegram): restore typing indicator and thread routing for forum General topic
In Telegram forum-enabled groups, the General topic does not include
message_thread_id in incoming messages (it is None). This caused:
1. Messages in General losing thread context — replies went to wrong place
2. Typing indicator failing because thread_id=1 was rejected by Telegram

Fix: synthesize thread_id="1" for forum groups when message_thread_id
is None, then handle it correctly per operation:
- send: omit message_thread_id (Telegram rejects thread_id=1 for sends)
- typing: pass thread_id=1, retry without it on "thread not found"

Also centralizes thread_id extraction into _metadata_thread_id() across
all send methods (send, send_voice, send_image, send_document, send_video,
send_animation, send_photo), replacing ~10 duplicate patterns.

Salvaged from PR #7892 by @corazzione.
Closes #7877, closes #7519.
2026-04-15 22:35:19 -07:00
leeyang1990
c5acc6edb6 feat(telegram): add dedicated TELEGRAM_PROXY env var and config.yaml proxy_url support
Pass platform_env_var="TELEGRAM_PROXY" to resolve_proxy_url() in both
telegram.py (main connect) and telegram_network.py (fallback transport),
so a Telegram-specific proxy takes priority over the generic HTTPS_PROXY.

Also bridge telegram.proxy_url from config.yaml to the TELEGRAM_PROXY
env var (env var takes precedence if both are set), add OPTIONAL_ENV_VARS
entry, docs, and tests.

Composite salvage of four community PRs:
- Core approach (both call sites): #9414 by @leeyang1990
- config.yaml bridging + docs: #6530 by @WhiteWorld
- Naming convention: #9074 by @brantzh6
- Earlier proxy work: #7786 by @ten-ltw

Closes #9414, closes #9074, closes #7786, closes #6530

Co-authored-by: WhiteWorld <WhiteWorld@users.noreply.github.com>
Co-authored-by: brantzh6 <brantzh6@users.noreply.github.com>
Co-authored-by: ten-ltw <ten-ltw@users.noreply.github.com>
2026-04-15 22:13:11 -07:00
kshitijk4poor
ff5bf0d6c8 fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation
Salvaged from PR #10643 by kshitijk4poor, updated for current main.

Root causes fixed:
1. Telegram xdist mock pollution — new tests/gateway/conftest.py with shared
   mock that runs at collection time (prevents ChatType=None caching)
2. VIRTUAL_ENV env var leak — monkeypatch.delenv in _detect_venv_dir tests
3. Copilot base_url missing — add fallback in _resolve_runtime_from_pool_entry
4. Stale vision model assertion — zai now uses glm-5v-turbo
5. Reasoning item id intentionally stripped — assert 'id' not in (store=False)
6. Context length warning unreachable — pass base_url to AIAgent in test
7. Kimi provider label updated — 'Kimi / Kimi Coding Plan' matches models.py
8. Google Workspace calendar tests — rewritten for current production code,
   properly mock subprocess on api_module, removed stale +agenda assertions
9. Credential pool auto-seeding — mock _select_pool_entry / _resolve_auto /
   _import_codex_cli_tokens to prevent real credentials from leaking into tests
2026-04-15 22:05:21 -07:00
Kovyrin Family Claw
00ff9a26cd Fix Telegram link preview suppression for bot sends 2026-04-15 17:54:43 -07:00
Kovyrin Family Claw
aea3499e56 feat(telegram): add config option to disable link previews 2026-04-15 17:54:43 -07:00
Roque
92a23479c0 fix(model-switch): normalize Unicode dashes from Telegram/iOS input
Telegram on iOS auto-converts double hyphens (--) to em dashes (—)
or en dashes (–) via autocorrect. This breaks /model flag parsing
since parse_model_flags() only recognizes literal '--provider' and
'--global'.

When the flag isn't parsed, the entire string (e.g. 'glm-5.1 —provider zai')
gets treated as the model name and fails with 'Model names cannot
contain spaces.'

Fix: normalize Unicode dashes (U+2012-U+2015) to '--' when they
appear before flag keywords (provider, global), before flag extraction.

The existing test suite in test_model_switch_provider_routing.py
already covers all four dash variants — this commit adds the code
that makes them pass.
2026-04-15 17:54:16 -07:00
Xowiek
21cd3a3fc0 fix(profile): use existing get_active_profile_name() for /profile command
Replace inline Path.home() / '.hermes' / 'profiles' detection in both CLI
and gateway /profile handlers with the existing get_active_profile_name()
from hermes_cli.profiles — which already handles custom-root deployments,
standard profiles, and Docker layouts.

Fixes /profile incorrectly reporting 'default' when HERMES_HOME points to
a custom-root profile path like /opt/data/profiles/coder.

Based on PR #10484 by Xowiek.
2026-04-15 17:52:03 -07:00
Xowiek
77435c4f13 fix(gateway): use profile-aware Hermes paths in runtime hints 2026-04-15 17:52:03 -07:00