Commit graph

1760 commits

Author SHA1 Message Date
Hinotoi-agent
3bace071bf fix(state): restrict sensitive store file permissions
response_store.db (api server) holds conversation history including tool
payloads, prompts, and results. webhook_subscriptions.json holds per-route
HMAC secrets. Under a permissive umask (e.g. 0o022, default on most
distros) both files were created mode 0o644 — readable by other local
users on shared boxes.

- gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM
  sidecars to 0o600 after __init__, then trusts the inode. (Original
  contributor patch chmod'd after every _commit() — wasteful on a hot
  api_server path; chmod-on-create is sufficient since SQLite preserves
  mode bits across writes.)

- hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp
  (which itself creates the file with 0o600), chmods the temp before the
  atomic rename, and re-asserts 0o600 on the destination so an existing
  permissive file from before this fix gets narrowed.

Tests cover (a) creation under permissive umask leaves 0o600 and (b) an
existing 0o644 webhook_subscriptions.json gets narrowed on next save.
Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply
on Windows.

Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py
side from chmod-on-every-commit to chmod-on-create.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-05-24 04:55:18 -07:00
m0n3r0
f378f00bfb fix(feishu): validate verification token before reflecting url_verification challenge
When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote
could previously prove endpoint control by sending a url_verification
payload with any attacker-controlled challenge string — the handler
reflected the challenge BEFORE running the token check.

Move the verification_token check ahead of the url_verification echo so
the challenge response is gated on a valid token. Add a regression test
covering the wrong-token case. Also fix the stale
test_connect_webhook_mode_starts_local_server fixture to set
FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret).

Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder
and its regression test; dropped the host-conditional weakening of the
#30746 secret guard (we want webhook secrets required regardless of
bind host, not only on 0.0.0.0/::).

Docs updated to call out the gating.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-05-24 04:51:19 -07:00
teknium1
15aa6884a2 fix(webhook): use 403 not 500 for missing-secret rejection
Operator misconfiguration is a client/setup error, not an internal server
exception. 403 "forbidden" more accurately reflects "this route refuses
to authenticate" than 500 "internal server error" — the latter triggers
incident alerting on operator monitoring and conflates real bugs with
config drift.

Follow-up tweak to PR #29629 by @m0n3r0.
2026-05-24 04:47:45 -07:00
m0n3r0
dbf73e90fa fix: fail closed for webhook routes without secrets
Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.
2026-05-24 04:47:45 -07:00
BaxBit
bbf02c3224
fix(gateway): validate Svix webhook signatures (#30200) 2026-05-24 04:45:13 -07:00
teknium1
b9f533af0a test(gateway): regression for plugin-transformed response after streaming
Adds a test that fails without the gateway fix, exercising the
response_transformed=True branch in _finalize_response: a streamed
response whose final text was modified by a transform_llm_output
plugin hook must be edit_message'd in place (not duplicate-sent),
with already_sent=True so the normal final-send is skipped.

Also drops two minor leftovers from the salvaged PR #29119:

  * accumulated_text property on GatewayStreamConsumer (unused)
  * duplicate _response_transformed=False inside the hook try block
2026-05-24 04:31:13 -07:00
kenyonxu
5cb21e3fb5 fix(gateway): edit streamed message instead of sending duplicate when response_transformed
When a transform_llm_output hook appends content after streaming, the previous
fix skipped the final-send suppression which caused the full response to be
sent as a NEW message (duplicate). Instead, edit the existing streamed message
in-place to append the transformed content, then set already_sent=True.

Added stream_consumer.message_id and .accumulated_text public properties.
2026-05-24 04:31:13 -07:00
kenyonxu
a4ceead796 fix(gateway): propagate response_transformed flag through run_sync return dict
run_sync() cherry-picks fields from the run_conversation result dict into
a new response dict for the gateway. response_transformed was missing from
the cherry-pick list, so the gateway always saw it as False and suppressed
the final send even though a transform_llm_output hook had modified the content.
2026-05-24 04:31:13 -07:00
kenyonxu
8edeebe6d7 fix: propagate response_transformed flag — plugin hook output survives streaming suppression
When a transform_llm_output hook modifies final_response after streaming,
the gateway was silently discarding the transformed content because
streamed=True / content_delivered=True triggered the final-send
suppression. Three changes:

1. conversation_loop: set `_response_transformed=True` when a
   transform_llm_output hook returns a non-empty string, and expose it
   as `response_transformed` in the result dict.

2. gateway/run: skip the final-send suppression when
   `response_transformed` is True — the transformed response must
   reach the client even if streaming already sent the original text.

3. acp_adapter/server: remove `not streamed_message` guard so
   final_response is always delivered (ACP path fixed separately).
2026-05-24 04:31:13 -07:00
Teknium
197f63f454
fix(feishu): require webhook auth secret and honor config extras (#30746) 2026-05-24 04:27:28 -07:00
Teknium
bdb97b8573
fix(feishu): enforce auth and chat binding for approval buttons (#30744) 2026-05-24 04:27:17 -07:00
Teknium
485292ac7d
fix(feishu): authorize interactive exec approval callbacks (#30739) 2026-05-24 04:26:57 -07:00
Teknium
4ca77f1059
Harden msgraph webhook auth requirements (#30169) 2026-05-24 04:25:20 -07:00
Teknium
3e78e353d7
fix(qqbot): authorize approval button interactions by session owner (#30737) 2026-05-24 04:25:12 -07:00
Teknium
c3caca6584
fix(gateway): remove discord role allowlist auth bypass (#30742) 2026-05-24 04:24:49 -07:00
AhmetArif0
5848174374 fix(wecom): guard flush task against cancel-delivery race to prevent message loss
When asyncio.sleep() fires just before Task.cancel() is called, CPython
sets _must_cancel=True but cannot cancel the already-completed sleep
future, so CancelledError is delivered at the next await (handle_message)
rather than at the sleep.  By that point the superseded task has already
popped the merged event from _pending_text_batches, so the superseding
task sees an empty batch and silently drops the message.

Fix: add a synchronous task-registry check between the sleep and the pop.
No await between the check and the pop means no other coroutine can
interleave, so the guard is race-free.
2026-05-24 01:33:40 -07:00
Teknium
1bed4e8eed fix(gateway): drop text snippet from debounce debug log (CodeQL)
CodeQL py/clear-text-logging-sensitive-data flagged the candidate-accept
debug log including event.text[:60]. Log text_len instead — sufficient for
debugging burst behavior without surfacing message contents.

Co-authored-by: Paulo Nascimento <pnascimento9596@gmail.com>
2026-05-24 01:31:45 -07:00
Paulo Nascimento
7abd62719b gateway: debounce queued text follow-ups 2026-05-24 01:31:45 -07:00
AhmetArif0
21db250034 fix(wecom-callback): retry send with fresh token on errcode 40001/42001
When WeCom returns errcode=40001 (invalid credential) or 42001 (token
expired), send() was returning a failure without evicting the bad token
from _access_tokens. All subsequent sends then kept using the same
invalid cached token until its TTL naturally expired (~7200s).

Fix: on the first token-rejection errcode, evict the cache entry and
retry once with a freshly fetched token. Non-token errcodes fail
immediately as before. If the refreshed token also fails, the error
is returned without looping further.

Adds four regression tests covering: successful retry on 40001,
successful retry on 42001, no retry on unrelated errcode, and clean
failure when the refresh does not help.
2026-05-24 01:30:47 -07:00
AhmetArif0
39b8d1d313 fix(dingtalk): finalize open streaming cards before disconnect
AI Card "tool progress" cards created with finalize=False were left in
streaming state on DingTalk's UI after a gateway restart because
disconnect() called _streaming_cards.clear() without first closing
them via _close_streaming_siblings.

Move the finalization loop before self._http_client.aclose() so the
HTTP client is still available when the finalize requests are sent.
Adds a regression test that asserts the HTTP client is alive during
finalization.
2026-05-23 20:48:56 -07:00
Teknium
e42fcc5625
fix(provider): make config.yaml model.provider the single source of truth (#31222)
Policy: if it ain't a secret it goes in config.yaml. HERMES_INFERENCE_PROVIDER
was leaking behavioral config into the .env surface, including from the gateway,
which bypassed config.yaml entirely.

Behavior:
- gateway/run.py: drop HERMES_INFERENCE_PROVIDER read in _resolve_runtime_agent_kwargs.
  Gateway now flows through resolve_runtime_provider() with no `requested` override,
  which reads model.provider from config.yaml first.

Docs/UX (strip env var from user-facing surface):
- --provider help text no longer mentions the env var
- cli-config.yaml.example same
- reference/environment-variables.md: remove HERMES_INFERENCE_PROVIDER row and
  the cross-reference from HERMES_INFERENCE_MODEL
- reference/cli-commands.md: blank the env-var column for --provider
- guides/xai-grok-oauth.md, guides/minimax-oauth.md: replace
  HERMES_INFERENCE_PROVIDER=x hermes invocations with config.yaml / --provider
- developer-guide/adding-providers.md, model-provider-plugin.md: reframe

Internal mechanism (kept as-is):
- hermes_cli/main.py writes HERMES_INFERENCE_PROVIDER into the TUI subprocess env
- tui_gateway/server.py reads it on TUI startup
- resolve_requested_provider() / oneshot.py / cli.py still fall through to the
  env var as a last-resort behind config.yaml, which is what makes the TUI
  parent->child handoff work
This stays. We just stop documenting it as a user knob.

Tests: tests/gateway/test_auth_fallback.py — simplify mock to fail on first
call, succeed on second; drop monkeypatch.setenv lines that no longer matter.

Supersedes #31064 (closed with credit to @novax635 who surfaced the underlying
issue but proposed aligning gateway *to* the env var rather than removing it).
2026-05-23 18:18:41 -07:00
Edison
e752c9454e feat(plugins): add register_auxiliary_task() to PluginContext API
Auxiliary LLM tasks (vision, compression, web_extract, etc.) currently
require modifications to core files for any plugin that needs its own
task slot — specifically the _AUX_TASKS list in hermes_cli/main.py and
the hardcoded env-var bridging dict in gateway/run.py. This violates
the 'plugins must not modify core files' rule and forces every memory
or context plugin that wants its own auxiliary task to either fork
core or open a coupled core+plugin PR.

This change adds a generic plugin surface for auxiliary task
registration:

    ctx.register_auxiliary_task(
        key='memory_retain_filter',
        display_name='Memory retain filter',
        description='hindsight pre-retain dedup/extract',
        defaults={'timeout': 30, 'extra_body': {'reasoning_effort': 'low'}},
    )

After registration, the task automatically:

  - Appears in 'hermes model → Configure auxiliary models' picker via
    a new _all_aux_tasks() merge of built-in + plugin tasks
  - Has its provider/model/base_url/api_key bridged from config.yaml
    to AUXILIARY_<KEY_UPPER>_* env vars at gateway startup
    (gateway/run.py now uses a dynamic bridged-keys set instead of
    a hardcoded per-task dict)
  - Gets plugin-declared defaults (timeout, extra_body, etc.) layered
    underneath user config so unconfigured plugin tasks still work
    (agent/auxiliary_client._get_auxiliary_task_config)
  - Resets to auto via 'Reset all to auto' alongside built-ins

Validation:

  - Rejects shadowing of built-in keys (vision, compression, etc.)
  - Rejects invalid key shapes (must match [A-Za-z0-9_]+)
  - Rejects cross-plugin collisions (clear error)
  - Allows same-plugin re-registration (idempotent updates)

Plugin discovery failures (rare) fall back gracefully — the aux
config UI still shows built-in tasks if get_plugin_auxiliary_tasks()
raises, and gateway env-var bridging keeps working for built-ins.

Built-in tasks remain hardcoded in _AUX_TASKS for stability — they're
the baseline UX, and DEFAULT_CONFIG already ships their defaults.
Plugin tasks layer on top.

Tests: 15 new tests in test_plugin_auxiliary_tasks.py covering API
validation, manager state lifecycle, helper sort order, _all_aux_tasks
merge semantics, _reset_aux_to_auto inclusion of plugin tasks, and
default-layering in auxiliary_client.

Updates the gateway-bridge code-parity test (test_auxiliary_config_bridge)
to assert the new dynamic shape rather than the hardcoded literal env
var names which no longer appear post-refactor.

Motivation: this unblocks PR #20262 (hindsight smart retain pipeline)
and similar plugins that need a dedicated aux task slot. The change
is non-breaking — built-in env vars (AUXILIARY_VISION_PROVIDER, etc.)
keep working since they're produced by the same f-string template
that built the hardcoded names.
2026-05-23 17:49:47 -07:00
novax635
86871ee25a fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches 2026-05-23 17:46:55 -07:00
Glucksberg
9451087aab fix(telegram): preserve observed group slash commands 2026-05-23 16:26:28 -07:00
Teknium
6a8e131a0a refactor(ntfy): convert built-in adapter to platform plugin
ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/
instead of editing 8 core files (gateway/config.py Platform enum,
gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py,
hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py,
tools/send_message_tool.py).

All routing goes through gateway/platform_registry via register_platform():
- adapter_factory, check_fn, validate_config, is_connected
- env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so
  gateway status reflects env-only setups without instantiating httpx
- standalone_sender_fn handles deliver=ntfy cron jobs when cron runs
  out-of-process from the gateway
- allowed_users_env / allow_all_env hook into _is_user_authorized
- cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing
- platform_hint surfaces in the system prompt
- pii_safe=True (topic names are the only identifier; no PII to redact)

Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader
so the module lives under plugin_adapter_ntfy in sys.modules and cannot
collide with sibling plugin-adapter tests on the same xdist worker. The
core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets,
etc.) are replaced with plugin-shape tests covering register() metadata,
env_enablement_fn output, and standalone_sender_fn behavior.

68 tests pass under scripts/run_tests.sh.
2026-05-23 16:13:01 -07:00
sprmn24
b10f17bf1e feat(ntfy): add ntfy platform adapter with atomic reconnect, identity fix, and 81 tests 2026-05-23 16:13:01 -07:00
QuenVix
7245bc77eb fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
Teknium
9acf949e34
feat(telegram): edit status messages in place instead of appending (#30864)
Closes #30045. Based on @qike-ms's PR #30141.

Telegram status callbacks (lifecycle, compression, context-pressure)
used to append a fresh bubble on every emit. Now adapter tracks
{(chat_id, status_key) -> message_id}; first call sends, subsequent
calls edit. Failed edits drop the cache entry and fall through to a
fresh send.

- gateway/platforms/telegram.py: send_or_update_status() (+34 LOC)
- gateway/run.py: route _status_callback_sync through it when the
  adapter supports it; plain adapter.send() otherwise (+15 LOC)
- 5 tests covering first send / edit-in-place / edit-failure fallback
  / distinct key & chat isolation
2026-05-23 02:42:10 -07:00
Zyrixtrex
61ac118724 fix(webhook): enforce INSECURE_NO_AUTH safety rail on dynamic route reloads 2026-05-23 02:39:12 -07:00
walli
60b0a0e006 fix(qqbot): fix SILK magic byte detection slice length
_guess_ext_from_data: data[:5] == b"#!SILK" -> data[:6] (6-byte string)
_looks_like_silk: data[:4] == b"#!SILK" -> data[:6]

The previous slices were too short to ever match the 6-byte "#!SILK"
literal, relying entirely on the "#!SILK_V3" (9-byte) and 0x02! (2-byte)
fallback paths for SILK format detection.
2026-05-23 02:27:17 -07:00
walli
0e7448d63a fix(qqbot): use original attachment filename for cached files
Add original_name parameter to _download_and_cache, preferring the
attachment metadata filename over the CDN URL path basename. Previously
files were cached with meaningless QQ CDN hash names (e.g.
qqdownload_...oadftnv5), causing ugly filenames when sent back to users.

Aligns with qqbot-agent-sdk's AttachmentDownloader.download_document.
2026-05-23 02:27:17 -07:00
walli
a54f5afc70 fix(qqbot): handle op 7/9 and expand fatal close code set
1. Handle op 7 (Server Reconnect): close WS to trigger reconnect loop
   while preserving session for Resume
2. Handle op 9 (Invalid Session): check d value to determine if session
   is resumable; clear session only when not resumable
3. Remove 4009 from session-clearing set (connection timeout is resumable)
4. Expand fatal close codes: 4001/4002/4010-4014 now stop reconnect
   immediately instead of retrying uselessly
5. Add unit tests
2026-05-23 02:27:17 -07:00
walli
bbd77d165c fix(qqbot): add INTERACTION intent and expose video/file cached paths
1. Add INTERACTION intent bit (1<<26) to _send_identify, fixing approval
   button clicks not being received (INTERACTION_CREATE events were never
   dispatched by the gateway)
2. Include local cached path in video/file attachment descriptions so the
   LLM can reference files for re-sending to users
3. Add unit tests (TestIdentifyIntents, TestProcessAttachmentsPathExposure)
2026-05-23 02:27:17 -07:00
teknium1
66d81f9e14 fix(gateway): don't swallow expansion errors in runtime config helper
A bare except in _load_gateway_runtime_config would silently return the
unexpanded dict on any _expand_env_vars failure — masking the very bug
this helper exists to fix. Drop it; let the caller see real errors.
2026-05-23 02:27:08 -07:00
QuenVix
2362cc4688 fix(gateway): enforce env variable template expansion on runtime config loaders 2026-05-23 02:27:08 -07:00
QuenVix
d21ac579e9 fix(gateway): honor key_env in auth-failure fallback resolution 2026-05-23 02:25:53 -07:00
QuenVix
52a368fa72 fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips 2026-05-23 01:46:34 -07:00
Eugeniusz Gilewski
41d2c758c3 Fix unsafe gateway media path delivery 2026-05-23 01:40:35 -07:00
Markus
4a91e36495 fix(gateway): separate observed Telegram group context 2026-05-23 01:33:42 -07:00
aaronagent
b82608a6f5 fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths
- skills_hub: validate that uninstall_skill's install_path resolves
  inside SKILLS_DIR before calling shutil.rmtree, preventing recursive
  deletion of arbitrary directories via poisoned lock.json entries
- skills_hub: include file paths (not just contents) in
  bundle_content_hash so swapping filenames between files changes the
  hash, strengthening update-detection integrity
- pairing: wrap list_pending() in self._lock so _cleanup_expired() file
  writes don't race with concurrent generate_code()/approve_code() calls

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-05-22 19:59:24 -07:00
kshitijk4poor
cc8e5ec2af refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity)
First migration of an existing built-in platform adapter to the plugin
system established by IRC / Teams / LINE / Google Chat. Closes #24325;
advances the umbrella refactor in #3823.

Matches Teams' shape exactly — adapter under ``plugins/platforms/discord/``
with the standard ``__init__.py`` / ``adapter.py`` / ``plugin.yaml``
shell, ``register(ctx)`` entry point, **no back-compat shim** at the old
import path, and full parity for the four hooks Teams uses plus the
``apply_yaml_config_fn`` hook that landed in #25443 (the Discord plugin
is the first consumer of that hook):

* ``standalone_sender_fn`` — out-of-process cron delivery via REST API
* ``setup_fn`` — interactive ``hermes setup gateway`` wizard
* ``apply_yaml_config_fn`` — translate ``config.yaml`` ``discord:`` keys
  into ``DISCORD_*`` env vars (replaces the hardcoded block in
  ``gateway/config.py``)
* ``is_connected`` — declares connection state from ``DISCORD_BOT_TOKEN``
* ``check_fn`` — lazy-installs ``discord.py`` on demand
* plus ``allowed_users_env``, ``allow_all_env``, ``cron_deliver_env_var``,
  ``max_message_length``, ``emoji``, ``required_env``, ``install_hint``

* ``gateway/platforms/discord.py`` (5,101 LOC) →
  ``plugins/platforms/discord/adapter.py`` (git rename, R090).
* New ``plugins/platforms/discord/{__init__.py, plugin.yaml}`` with
  ``requires_env`` / ``optional_env`` declarations.
* Append ``register(ctx)`` block + new hook implementations
  (``_standalone_send``, ``interactive_setup``, ``_apply_yaml_config``,
  ``_clean_discord_user_ids``, ``_is_connected``, ``_build_adapter``,
  plus helpers ``_DISCORD_CHANNEL_TYPE_PROBE_CACHE`` etc.) to the
  adapter.

* Replace the ``Platform.DISCORD elif`` branch in
  ``GatewayRunner._create_adapter()`` (−9 LOC) with a generic post-creation
  hook (+6 LOC) in the registry path: any plugin adapter that declares a
  ``gateway_runner`` attribute now gets it auto-injected. Webhook's
  built-in branch is unchanged (it doesn't go through the registry path).

* Move ``_send_discord`` (190 LOC) and helpers
  (``_DISCORD_CHANNEL_TYPE_PROBE_CACHE``, ``_remember_channel_is_forum``,
  ``_probe_is_forum_cached``, ``_derive_forum_thread_name``) from
  ``tools/send_message_tool.py`` into the plugin as ``_standalone_send``.
* Wire via ``standalone_sender_fn=_standalone_send`` (Teams pattern; same
  gap fixed in #21804 for other plugin platforms).
* Replace the Discord ``elif`` in ``tools/send_message_tool.py``
  ``_send_to_platform`` with a 10-line registry-hook dispatch.
* Drop the ``DiscordAdapter`` import and the
  ``Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH`` ``_MAX_LENGTHS``
  entry — the registry's ``max_message_length=2000`` covers it.

* Move ``_setup_discord`` and ``_clean_discord_user_ids`` (68 LOC) from
  ``hermes_cli/setup.py`` into the plugin as ``interactive_setup``.
* Wire via ``setup_fn=interactive_setup``.  CLI helpers (``prompt``,
  ``print_info``, etc.) are lazy-imported so the plugin's module-load
  surface stays minimal.
* Remove ``"discord": _s._setup_discord`` from
  ``hermes_cli/gateway.py::_builtin_setup_fn``.
* Remove the entire 32-line ``_PLATFORMS["discord"]`` static dict entry —
  Discord's setup metadata is now discovered dynamically via
  ``_all_platforms()`` from the registry entry.

* Move the 59-line ``discord_cfg`` YAML→env bridge from
  ``gateway/config.py::load_gateway_config()`` into the plugin as
  ``_apply_yaml_config``.  Covers ``require_mention``,
  ``thread_require_mention``, ``free_response_channels``, ``auto_thread``,
  ``reactions``, ``ignored_channels``, ``allowed_channels``,
  ``no_thread_channels``, ``allow_mentions.{everyone,roles,users,
  replied_user}``, and ``reply_to_mode`` (including the YAML 1.1
  ``off``-as-False coercion and the ``extra.reply_to_mode`` fallback).
* Wire via ``apply_yaml_config_fn=_apply_yaml_config``.
* The hook runs BEFORE ``_apply_env_overrides`` and after the generic
  shared-key loop, exactly as documented in
  ``website/docs/developer-guide/adding-platform-adapters.md``.
* Behavior is preserved exactly — every assignment still uses
  ``not os.getenv(...)`` guards so env vars take precedence over YAML.

All 78 references to the old import path are rewritten — no back-compat
shim:

* 51 ``from gateway.platforms.discord import X`` →
  ``from plugins.platforms.discord.adapter import X``
* 5 ``import gateway.platforms.discord as discord_platform`` →
  ``import plugins.platforms.discord.adapter as discord_platform``
* 1 ``from gateway.platforms import discord as discord_mod`` →
  ``from plugins.platforms.discord import adapter as discord_mod``
* 21 ``mock.patch("gateway.platforms.discord.X")`` strings →
  ``mock.patch("plugins.platforms.discord.adapter.X")``
* 1 docstring reference in ``hermes_cli/commands.py``
* 1 import in ``tools/send_message_tool.py`` (now removed entirely)

The import-safety test in ``tests/gateway/test_discord_imports.py`` is
updated to purge the new canonical module name from ``sys.modules``.

**38 files changed, +621 / −473** — net positive due to the YAML hook
implementation (89 new LOC in the plugin trading for 59 deleted in core),
but every line moved has a clear plugin home now.  The git rename is
detected at R090 because the adapter gained ~340 LOC of moved-in hook
implementations (``_standalone_send`` + ``interactive_setup`` +
``_apply_yaml_config`` + helpers).

* All 568 Discord-specific tests pass across 25 ``test_discord_*.py``
  files plus voice/send/text-batching/reload-skills/stream-consumer/
  integration tests.
* All 147 tests in the YAML-touching subset
  (``test_discord_reply_mode``, ``test_discord_free_response``,
  ``test_discord_allowed_channels``, ``test_discord_allowed_mentions``,
  ``test_discord_channel_controls``, ``test_discord_reactions``,
  ``test_discord_thread_persistence``, ``test_runtime_footer``) pass —
  this is the strongest signal that the YAML→env hook behaves
  identically to the legacy block.
* Broader gateway/cron/integration sweep (1297 tests) introduces zero
  new failures vs ``main``.  Pre-existing failures in
  ``tests/gateway/test_tts_media_routing.py`` and
  ``tests/e2e/test_platform_commands.py`` reproduce identically on the
  unchanged ``main`` revision.
* Plugin discovery sanity check confirms Discord registers alongside the
  other four platform plugins:

    Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams']

These Discord-shaped tendrils in core were **deliberately not moved** —
they are generic platform-registry concerns affecting every platform,
not Discord-specific:

* ``gateway/config.py:1205`` ``DISCORD_BOT_TOKEN → config.token`` env
  enablement — same shape Telegram has.  The existing
  ``env_enablement_fn`` registry hook only seeds ``extra``, not
  ``.token``, so it can't replace this without an adapter refactor to
  read from ``extra["bot_token"]``.
* ``gateway/run.py`` voice-mode hooks
  (``self.adapters.get(Platform.DISCORD)`` for
  ``start_voice_mode``/``stop_voice_mode``), role-based auth,
  ``DISCORD_ALLOW_BOTS`` branch in ``_is_user_authorized``,
  ``_UPDATE_ALLOWED_PLATFORMS`` frozenset, and the per-platform
  allowlist maps — generic platform-registry concerns.
* ``Platform.DISCORD`` enum literal — stable identifier used as dict
  keys throughout the codebase; removing it is a separate refactor with
  no real benefit.
* ``tools/discord_tool.py`` and ``tools/environments/local.py`` —
  first-class agent tools and env-passthrough config, neither is the
  gateway adapter.

Each of these is worth its own scoping issue when the time comes.
2026-05-22 14:21:41 -07:00
Teknium
82c2035823 fix(pairing): handle legacy plaintext pending entries during upgrade
When an existing install upgrades to the hashed-pending schema, its
on-disk pending.json still has the old {code: entry} format with no
hash/salt fields. The original PR #8056 assumed every entry had both
fields and would have KeyErrored in approve_code, list_pending, and
_cleanup_expired.

Guard each consumer:
  - approve_code: skip entries that are not a dict, lack salt/hash,
    or have a non-hex salt. Legacy entries simply fail to match.
  - list_pending: tolerate missing 'hash' (show "legacy" placeholder)
    and non-numeric created_at (skip the row).
  - _cleanup_expired: treat malformed/legacy entries as expired so
    they get pruned on the next call rather than wedging the file.

Regression tests cover all three consumers plus a mixed-malformed
case.
2026-05-22 04:11:49 -07:00
Tom Qiao
2e509422ef fix(security): hash gateway pairing codes instead of storing plaintext
Pairing codes were stored as plaintext keys in JSON files. Now uses
sha256 + random salt hashing with constant-time comparison.

Fixes #8036

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 04:11:49 -07:00
memosr
9c90b3a597 fix(security): validate secret in _reload_dynamic_routes to prevent HMAC bypass 2026-05-22 03:45:21 -07:00
Teknium
3fde8c153d
fix(skills): prune dependency/venv dirs from all skill scanners (#30042)
* fix(skills): skip dependency dirs in skill scan

* fix(skills): widen sibling rglob scanners to use shared exclusion set

Follow-up to PR #29968. The contributor's PR widened EXCLUDED_SKILL_DIRS
in the canonical walker (iter_skill_index_files), which fixes the
user-visible discovery path. This commit sweeps the ~12 other
rglob('SKILL.md') sites that did their own ad-hoc filtering — most only
checked .git/.hub, some had no filter at all — so dependency dirs
(.venv, node_modules, site-packages, etc.) cannot leak ghost skills
through the secondary paths.

Adds agent.skill_utils.is_excluded_skill_path(path) helper. Migrates
all 13 sites to use it. Removes 3 hardcoded duplicate filter sets.

Sites touched:
  agent/curator_backup.py        - skill backup file count
  gateway/run.py                 - disabled-skill response (2 sites)
  hermes_cli/dump.py             - skill count in env dump
  hermes_cli/profile_describer.py- profile description (2 sites)
  hermes_cli/profile_distribution.py - profile install count
  hermes_cli/profiles.py         - profile skill count
  hermes_cli/skills_hub.py       - category detection
  tools/skill_manager_tool.py    - skill name lookup (already used set, now uses helper)
  tools/skill_usage.py           - usage tracking + skill dir lookup (2 sites)
  tools/skills_hub.py            - optional skills find + scan (2 sites)
  tools/skills_sync.py           - bundled skills sync

E2E verified with the exact reported shape
(bring/scripts/.venv/.../typer/.agents/skills/typer/SKILL.md): no
sibling site picks up the ghost skill, all five legit-skill counts
still return 1.

* chore(infographic): retro-pop-grid bento for PR #30042 skill-scanner sweep

---------

Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>
2026-05-21 14:18:02 -07:00
EloquentBrush0x
6c26727bb3 fix(gateway): extend observe+attribution to location and media handlers
_handle_location_message and _handle_media_message were skipped when the
observe-unmentioned-group-messages feature landed (a9db0e2c7). Both handlers
now:

1. Check _should_observe_unmentioned_group_message on the skipped path and
   call _observe_unmentioned_group_message so group chatter is stored as
   shared session context even when the bot is not addressed.

2. Call _apply_telegram_group_observe_attribution on the triggered path so
   the dispatched event uses the shared (user_id=None) group session instead
   of the per-user session, letting the model see previously observed context.
   For stickers the attribution is applied after _handle_sticker completes
   (which overwrites event.text with the vision description); for all other
   media types it is applied once after caption cleaning.

Four new tests cover the observe and attribution paths for both handlers.
2026-05-20 23:52:18 -07:00
Omar B
be0728cacc fix: handle Discord typing indicator 429 gracefully
The typing indicator loop (send_typing) ran every 8s and died on any
exception, including Discord 429 rate limits.  Once a 429 killed the
loop, the indicator never restarted — and the raw exception bounce
could cascade into broader gateway instability.

Changes:
- Bump sleep interval from 8s to 12s (typing light lasts ~10s)
- On 429: extract retry_after, log a warning, sleep the backoff,
  and continue the loop
- On non-rate-limit errors: log debug and return (unchanged
  behaviour)
2026-05-20 23:27:38 -07:00
Markus
a9db0e2c74 Observe unmentioned Telegram group messages 2026-05-20 22:55:31 -07:00
Teknium
31a0100104 feat(state.db): persist platform_message_id; restore yuanbao exact-id recall
PR #29211 dropped JSONL gateway transcripts and noted that the platform's
own `message_id` field (used by Yuanbao's recall guard to redact a
message by exact platform id) was no longer preserved — falling back to
content-match.  That fallback works for the common case but redacts the
wrong row when two messages share text (or fails to match when content
is post-processed).

Restore exact-id matching by giving state.db a column for it:

- New `platform_message_id TEXT` column on the messages table
  (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler
  on existing DBs, no version-gated migration block needed)
- Partial index `idx_messages_platform_msg_id` on
  (session_id, platform_message_id) to keep recall's point-lookup cheap
  even on large sessions
- `append_message()` and `replace_messages()` accept the new value:
  the gateway-facing `append_to_transcript` in `gateway/session.py`
  forwards either `message["platform_message_id"]` or the legacy
  `message["message_id"]` key (yuanbao's existing convention)
- `get_messages_as_conversation()` surfaces the column back on the
  message dict as `message_id` so platform code reads the same shape
  it used to read from JSONL
- Yuanbao `_patch_transcript`: restore branch A1 (exact id match)
  ahead of A2 (content match) ahead of B (system-note).  Both branches
  log which one fired so operators can tell from gateway.log whether
  recall hit the canonical path or had to fall back.

Tests:
- New low-level round-trip tests in `test_hermes_state.py` for both
  `append_message` and `replace_messages` paths
- The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert
  the new contract: branch A1 (id match) works against DB-only
  transcripts, and branch A2 (content match) still recovers rows that
  were observed without a platform id (e.g. agent-processed @bot
  messages where run.py doesn't carry msg_id through)
2026-05-20 13:00:57 -07:00
yoniebans
0cc1a1d2d9 refactor(yuanbao): drop dead branch A1 message_id loop + pin missing fixture
PR #29211 review findings:

1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write
   to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures.

2. yuanbao recall branch A1 (message_id exact match) was structurally dead
   once load_transcript() became DB-only — state.db never preserves the
   platform message_id. Removed the dead loop, consolidated to a single
   content-match branch (renamed 'A: content match'). Branch B (system
   note) unchanged. Updated the test name + docstring to reflect this.

Note: self._lock is no longer taken in append_to_transcript (was guarding
the JSONL file append). SQLite append_message handles its own concurrency
via WAL mode, so this is safe; flagging for awareness.
2026-05-20 13:00:57 -07:00