The Matrix adapter's _upload_file fell back to sending
"(file not found: {file_path})" directly into the room — the same
host-path leak class fixed for the base adapter and Slack in the
previous commit. Replace it with a friendly notice, log the path at
WARN for operators, and preserve any caller-supplied caption.
The base BasePlatformAdapter implementations of send_voice, send_video,
send_document, and send_image_file forwarded their *_path argument
verbatim into the chat text (e.g. "🎬 Video: /home/.../hermes/cache/...").
Telegram, Discord, and Slack adapters all fall back to those base methods
when their native send raises — so a rejected video on Telegram surfaced
the host filesystem layout to the user instead of a useful message.
Replace the path-echo with a friendly notice, log the path for operator
diagnostics, and keep the user-supplied caption intact. The Slack adapter
had three identical sites that fell through to the same path-echo on its
own native upload failures; fix those too. send_document still surfaces
the caller-provided file_name (or the basename derived from it) since
that is the user-facing filename, not a host path.
Add regression tests asserting the *_path argument never appears in the
fallback content while caption text and explicit file_name still do.
Follow-up to the salvaged #8008 fix:
- Sibling-site fix: _evaluate_slash_authorization gated DISCORD_ALLOWED_CHANNELS /
DISCORD_IGNORED_CHANNELS on numeric IDs only, so name/#name config that now works
for on_message still silently failed for slash-command interactions. Refactor the
channel-key helper to _discord_channel_keys_from_channel(channel, parent) and reuse
it at the interaction gate. Fail-closed on missing channel id is preserved.
- The contributor's hardcoded 8s flush deadline could be hard-cancelled mid-flush:
_teardown_adapter already wraps cancel_background_tasks() in the per-adapter
disconnect budget (HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT, default 5s). The flush
deadline now derives from that budget with headroom so it always completes inside it.
- AUTHOR_MAP: map cypher@augmentl.com -> Nickperillo for CI.
- Tests: slash-auth name/#name allow + name ignore matching.
Two related fixes to the Discord gateway adapter:
1. Channel name matching (free-response, allowed, ignored, no-thread channels)
Previously these config values only matched against numeric channel IDs.
If a user configured free_response_channels: cypher (by name), the adapter
would silently ignore it because it only intersected against channel_ids.
Now the adapter builds a channel_keys set that includes the channel ID,
channel name, and #channel-name form, and checks all three for each gate.
2. Flush pending text-batch tasks before shutdown
The Discord adapter uses _pending_text_batch_tasks (its own dict) for
merging rapid successive message chunks. These tasks were NOT added to
self._background_tasks (the base class list), so the base
cancel_background_tasks() never awaited them on restart/shutdown.
This caused a race: in-flight response deliveries were cancelled before
Discord had a chance to send them, resulting in silent dropped messages
visible to users as tool-log-only replies with no text body.
Fix: override cancel_background_tasks() in DiscordAdapter to await all
pending text-batch tasks (8s deadline) before delegating to the base class.
Add durable public-URL output and URL-based chaining to xAI Grok Imagine:
- Store generated media on files-cdn with permanent public HTTPS URLs
(public_url: true, no expiry by default).
- Chain by URL: generate -> edit -> extend each take a prior result's
public HTTPS URL (or a data URI / local file for inputs).
- Add provider-specific xai_video_edit and xai_video_extend tools.
- Image generation: public-URL/storage output, multi-reference edits,
and ~/ local-path support for image edits.
Credentials use xAI Grok device-code OAuth (separate PR).
A Slack user/legacy token (xoxp-...) makes auth.test resolve to the
installing human's member ID with no bot_id, so the adapter binds its
identity (_bot_user_id / _team_bot_user_ids) to that human. Every
"is this the bot?" check then misfires: that person's <@...> mentions
wake the bot and are stripped as the bot's own mention, so the agent is
genuinely told it was @mentioned and replies to messages merely
addressed to that human (symptom: bot responds to "@trevor ..." and
insists it was explicitly mentioned).
There is no runtime API error to catch — a user token still
sends/receives — so the only detectable moment is connect time. Add a
warning-only nudge (_warn_if_not_bot_token) alongside the existing
group-DM scope nudge: when auth.test resolves a user_id but no bot_id,
log that the token is a user token and to use the xoxb-... Bot User
OAuth Token. Warning-only: does not block a working-but-misconfigured
install. Fires once per workspace per process.
The self-hosted OIDC dashboard provider was public-client + PKCE only, with
two `# TODO(confidential-client)` seams. Authentik and Keycloak commonly
default a new OIDC client to *confidential*, whose token endpoint rejects an
unauthenticated exchange (`invalid_client`) — so a self-hoster who accepts
their IDP's default could not complete dashboard login without manually
flipping the client to public.
Add optional confidential-client support:
- New optional `client_secret` (env `HERMES_DASHBOARD_OIDC_CLIENT_SECRET`,
or `dashboard.oauth.self_hosted.client_secret`; env-wins-config, empty
treated as unset). It is a credential, so docs steer operators to the
`.env` file; config.yaml is supported only for precedence symmetry.
- `_token_endpoint_auth()` selects `client_secret_basic` (HTTP Basic header)
vs `client_secret_post` (form body) from the IDP's advertised
`token_endpoint_auth_methods_supported`, defaulting to basic (the OIDC
default) when absent. Applied to complete_login, refresh_session, and
revoke_session (RFC 7009 §2.1).
- PKCE is sent in BOTH modes — the secret is client authentication layered
on top, never a replacement (OAuth 2.1 / RFC 9700 keep PKCE mandatory).
- Basic header url-encodes client_id/secret before base64 per RFC 6749
§2.3.1, so reserved chars (`:`, `@`, space) round-trip correctly.
Non-breaking: with no secret configured the provider is a pure public PKCE
client, byte-identical to prior behaviour (no Authorization header, no
client_secret in the body). The secret is never logged — register() reports
only a `confidential=<bool>` flag.
Tests: 16 new cases covering basic/post selection, default-when-absent,
public-unchanged contract, PKCE-preserved, reserved-char url-encoding,
blank-secret-is-public, refresh + revoke auth, no-secret-in-logs, and
env/config register wiring. Full dashboard-auth suite (nous provider,
middleware, gate, cookies, WS, 401-reauth, status endpoint) — 396 tests —
green, proving no existing auth path regressed.
The self-hosted OIDC dashboard login rejected any http:// redirect_uri
whose host was not localhost/127.0.0.1, surfacing "redirect_uri may only use http:// for localhost/127.0.0.1" before reaching the IDP. This broke self-hosted dashboards reached over plain HTTP (including LAN IPs, internal hostnames, and reverse proxies that terminate TLS upstream).
#38827 already dropped this check from the nous provider, but the generic self-hosted provider copied the old localhost-only
branch and reintroduced the bug for HERMES_DASHBOARD_OIDC_ISSUER setups.
The IDP's own allowlist is authoritative on which redirect_uris are
permitted; this client-side _validate_redirect_uri is only a fast-fail for
obvious operator error and should not second-guess valid http:// deployments.
Fix: drop the localhost-only branch on the http scheme. Validation now enforces only that the scheme is http(s) and the path ends with
/auth/callback. Updated the docstring to explain the relaxed contract,
and added test_allows_http_with_arbitrary_host covering an internal
hostname and a LAN IP alongside the existing localhost case.
Salvages the two still-valid hardenings from #5381 onto the relocated
plugin adapters (the discord/feishu/whatsapp adapters moved to
plugins/platforms/ since the PR was opened, and 4 of its 6 hunks are
already on main or superseded).
- feishu: rate limiter now denies untracked keys when the tracking table
is at capacity after pruning stale entries (was: allow through without
tracking). At-capacity-with-all-fresh-entries only happens under abuse,
so allowing untracked requests let an attacker who flooded the table
bypass the limiter entirely. Already-tracked keys and post-prune room
are unaffected.
- whatsapp: absolute file paths handed back by the Baileys bridge are now
validated to resolve inside a known media cache dir before being
attached. A compromised/buggy bridge could otherwise return an
arbitrary path (e.g. /etc/passwd) that would be sent verbatim to the
model. Guard resolves symlinks and accepts both the canonical
cache/<kind> and legacy <kind>_cache layouts.
Follow-up to the group-DM manifest fix. The manifest change only helps
NEW installs; existing apps keep their old (mpim-less) scopes until the
admin reinstalls. Since a missing message.mpim event delivers nothing
(no runtime API error to catch), detect stale installs at connect time
from the auth.test x-oauth-scopes header and log an actionable reinstall
nudge when im:history is granted but mpim:history is not. Also promote
message.mpim from Recommended to Required in the docs event tables so the
default setup path can't drop it.
The supermemory and mem0 memory providers shipped third-party SDKs
(supermemory / mem0ai) that are not core dependencies, but — unlike the
honcho and hindsight providers — they imported those SDKs directly with
no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist
entry. On the published Docker image the agent venv is sealed
(HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a
writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight
route through ensure() and install fine there; supermemory/mem0 never
called it, so their SDK was never installed on a hosted instance and the
provider silently reported itself unavailable even with the API key set.
Fixes:
- Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist
(tools/lazy_deps.py), pinned to current PyPI releases.
- Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint
(_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend),
mirroring honcho's wrapped try/except shape.
- Drop the SDK-import gate from supermemory's is_available() — it was a
chicken-and-egg trap (provider never loaded on a sealed venv, so
ensure() never ran). Now key-presence only, like honcho/mem0.
- Add matching pyproject extras [supermemory]/[mem0]; update the
lazy-covered-extras contract test (excluded from [all] by policy).
Tests prove each path fails without the fix and the real sealed-venv
durable-target gate accepts both features.
The WeCom callback endpoint (internet-facing, 0.0.0.0) parsed untrusted
request bodies before signature verification. defusedxml already guards
the entity-expansion class on main, but there was no cap on raw body
size, so an unauthenticated POST could still force unbounded read work
pre-auth.
Set client_max_size=64KB on the aiohttp app (413 at the framework layer)
plus an explicit length guard in _handle_callback as defense in depth.
WeCom callbacks are small encrypted XML envelopes — media is delivered
out-of-band via MediaId, never inline — so 64KB is ample for legitimate
traffic. Adds tests for oversized (413) and normal-sized (not 413) bodies.
Salvaged from #10192 by @memosr (body-size limit half; defusedxml half
already superseded on main).
Two platform-security hardenings:
- Matrix: _on_invite now checks the inviter against the existing
allow-list (_allowed_user_ids / GATEWAY_ALLOW_ALL_USERS) before
auto-joining. Without this any federated Matrix user could invite
the bot into arbitrary rooms, exposing its presence and metadata.
The message and reaction paths already enforce this allow-list; the
invite path bypassed it.
- Mattermost: _api_get / _api_post / _api_put reject any path
containing '..'. WebSocket-event values (channel_id, post_id,
file_id) are interpolated directly into API paths, so a malicious or
compromised server could craft traversal payloads to make the bot
issue authenticated requests to arbitrary endpoints with its bearer
token.
The configurable-E2EE-passphrase change from the original PR is dropped:
the matrix adapter was rewritten onto mautrix and the passphrase-protected
key-export file no longer exists.
Removed/unauthorized Telegram users could inject prompt content before the
per-user auth gate fired. The adapter ran `_should_process_message`,
`_build_message_event`, and text/photo batching — and dispatched to the
runner — before `_is_user_authorized()` (gateway/authz_mixin.py) rejected
the sender. Unmentioned group chatter from a removed user was also
persisted into the session transcript via `_observe_unmentioned_group_message`,
leaking into the agent's observed context independent of dispatch.
Add `_is_user_authorized_from_message()` as an intake prefilter that runs
in `_handle_text_message`, `_handle_command`, `_handle_location_message`,
and `_handle_media_message` BEFORE batching, event construction, and the
unmentioned-group observe branch. It reuses the runner's
`_is_user_authorized()` with a correctly-shaped SessionSource (group vs
forum vs dm, real chat_id for TELEGRAM_GROUP_ALLOWED_* allowlists),
falls back to env allowlists, and only rejects when an allowlist actually
exists — unknown DMs with no allowlist still reach the pairing flow.
Channel posts authorize via `sender_chat` identity when `from_user` is
absent.
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: Carlos Manuel Cejas <carlosmcejas@gmail.com>
main (cb982ad99) wired windows_hide_flags() into the auxiliary git/gh/wmic/
bash/powershell/taskkill legs but left two it didn't reach, plus the Electron
backend-launch leg it explicitly deferred. Cover them the same way:
- apps/desktop/electron/main.cjs: getNoConsoleVenvPython resolves the BASE
pythonw.exe instead of the venv Scripts\pythonw.exe shim, which re-execs a
console python.exe and flashes a conhost the desktop backend can't suppress.
Both backend creators put the venv site-packages on PYTHONPATH so imports
still resolve under the base interpreter. (main's commit said this Electron
leg "needs a Windows-tested change of its own".)
- tools/tts_tool.py, tools/transcription_tools.py, plugins/platforms/discord:
ffmpeg conversions (voice notes / TTS / STT) via windows_hide_flags().
- plugins/platforms/whatsapp: netstat + taskkill bridge-port cleanup via
windows_hide_flags().
All no-ops on POSIX. Tests assert the base-pythonw preference and the ffmpeg
legs pass CREATE_NO_WINDOW.
The merged CLOSE-WAIT heartbeat (#52744) only probes get_me(), which uses the
general request path and stays healthy while PTB's getUpdates consumer is
silently wedged (updater.running=True but the long-poll task is stuck, observed
on WSL2). DMs then queue in the Bot API and never reach handlers (#42909).
Augment the existing _polling_heartbeat_loop to also probe
get_webhook_info().pending_update_count. After two consecutive probes that see a
non-draining queue while the updater claims to be running, escalate into the
existing _handle_polling_network_error recovery ladder — no new restart
machinery. No-ops in webhook mode, when the updater is not running, or when a
reconnect is already in flight.
Credit to @gazzumatteo, whose PR #42959 identified the pending_update_count
signal as the missing liveness probe. This reuses the existing heartbeat +
recovery path rather than adding a parallel watchdog.
Fixes#42909.
HolographicMemoryProvider.shutdown() dropped its MemoryStore reference
without calling the existing MemoryStore.close(). Since the connection is
opened check_same_thread=False (one per session), its fd was released by
refcount/GC at a non-deterministic time on a non-deterministic thread,
churning a DB fd through the kernel free pool on every session teardown.
Call close() so the fd is released deterministically.
Reported by @alfranli123 (#44037), who pinpointed the exact code location.
Note: the report's TLS-fd-recycle corruption attribution could not be
reproduced from the code — dropping a sqlite connection flushes valid
SQLite pages via the VFS, never TLS framing, and the provider is at most a
releaser of DB fds, not a TLS-flushing socket owner. This change is correct
resource hygiene that removes per-session fd churn regardless.
Rebase onto plugins/platforms/matrix/adapter.py (code moved from
gateway/platforms/matrix.py). Same logic: _on_invite checks is_direct
on invite events and calls _record_dm_room to persist in m.direct
account data.
Fixes#44679
* fix(telegram): clear send_path_degraded on successful reconnect
_send_path_degraded was cleared only in _verify_polling_after_reconnect,
60s after reconnect and only if scheduled. A clean start_polling() reconnect
left the flag stuck True, short-circuiting send() and blocking all outbound
messages until the deferred probe ran (or forever if it never did).
Clear the flag the moment start_polling() succeeds — that is the recovery
signal. The deferred probe remains a defensive re-check that re-enters the
reconnect ladder (re-setting the flag) if it detects a silent wedge.
Fixes#35205.
* docs: add infographic for #35205 telegram send-path fix
Telegram polling entered a self-inflicted ~31s loop of 409 Conflict ->
retry -> resume -> Conflict. The error_callback PTB invokes synchronously
inside its internal network_retry_loop only scheduled our async recovery
task (loop.create_task) and returned, so PTB kept polling getUpdates on its
own while our handler concurrently ran stop -> sleep -> start_polling. The
two polling sessions overlapped and Telegram returned a fresh 409.
Fix: in the conflict branch of the error_callback, synchronously set PTB's
private polling stop_event before scheduling recovery. PTB's loop exits on
its next tick (it races that event in do_action), so our handler owns
polling alone. The handler's await updater.stop() drains the task and PTB
clears the event, so the subsequent start_polling() builds a fresh event
and is not poisoned.
Keeps the existing reconnect ladder intact (option B) — fixes only the
race. Defensive: probes mangled + unmangled stop_event spellings and no-ops
(prior behaviour) if neither exists; never flips _running, which would make
the handler skip stop() and leave the loop wedged.
The Discord adapter could enter a silent zombie state after a network
outage / proxy stall: the process is alive, _client looks open, but the
underlying socket is dead. discord.py's WebSocket reconnect never sees a
RST through a wedged proxy/NAT, so client.start() spins forever without
exiting — which means the bot-task done callback (which only fires on
task completion) never trips either. The bot stays "offline" in Discord
until a manual `hermes gateway restart`. Reported offline for 13-17h.
Adds an out-of-band REST liveness probe in DiscordAdapter. Every
`discord.liveness_interval_seconds` (default 60s) the adapter issues a
cheap fetch_user(bot_id) — the same REST path as message delivery, so it
fails when the proxy/NAT is wedged. After
`discord.liveness_failure_threshold` consecutive failures (default 3) the
probe closes the wedged client and surfaces a retryable fatal error,
which trips the gateway's existing _platform_reconnect_watcher and
rebuilds the adapter. Operators disable it by setting either knob to 0.
Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets
policy; _apply_yaml_config bridges it to internal env vars the adapter
reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern.
Co-authored-by: Hermes Agent <agent@nousresearch.com>
When a Telegram attachment download/cache fails (typically a transient
httpx.ConnectError to Telegram's CDN), the except handler logged a warning
and fell through to handle_message() with empty media and no text — the user
thought the file was delivered, the agent saw a content-less turn with no
signal an attachment was attempted, and the only record was a buried log line.
Adds _surface_media_cache_failure(): replies to the user in Telegram so they
know to retry, and appends an agent-visible notice to event.text via the
existing _append_observed_note channel so the agent knows an attachment was
attempted and failed. No new event fields (structured-event refactor is out
of scope per #23045). Wired into all five cache-failure sites — photo, voice,
audio, video, document — since they shared the identical silent fall-through.
Bug 1 from #23045 (unsupported types routed as fake user messages) no longer
exists on main: the document handler now accepts any file type, so there is no
rejection branch to fix.
Closes#23045
DISCORD_ALLOWED_USERS="*" now means "allow everyone", matching the
SIGNAL_ALLOWED_USERS / DISCORD_ALLOWED_CHANNELS wildcard convention and
the value `claw migrate` emits. Previously _is_allowed_user did exact
ID matching only, so "*" matched no user and blocked every non-self
sender — a P1 with no workaround.
Three sites, all required for the fix to hold at runtime:
- _is_allowed_user: short-circuit when "*" is in the allowlist.
- connect(): exclude "*" from the intents.members trigger so the
wildcard does not request the privileged Server Members intent
(which can block the bot from coming online).
- _resolve_allowed_usernames: preserve "*" verbatim; otherwise it lands
in the username-resolution bucket, matches no member, and is silently
dropped from the set and env var on the first on_ready — quietly
undoing the fix.
Slash auth delegates to _is_allowed_user (auto-covered); component auth
already honors "*" on main.
Follow-up to #53791 addressing review feedback: the footgun checker treated
capture_output=/stdout=/stderr=/check_output as proof a subprocess can't pop a
Windows console. That invariant is false — stream redirection controls where a
child's output goes, not whether a console is allocated. From a console-less
parent (Desktop/Electron, pythonw.exe, detached gateway/cron) a console-subsystem
child still flashes a window even when fully captured.
- check-windows-footguns.py: capture/redirect/check_output is no longer a blanket
safe-pass. Added _WINDOWS_FLASHING_PROGRAMS (git/gh/npm/node/python/uv/ffmpeg/
docker/powershell/…); calls to those are flagged even when captured. Non-flashing
programs keep the capture exemption (no 271-site noise). _subprocess_compat.run/
popen calls are inherently safe (wrapper injects CREATE_NO_WINDOW).
- Routed the 35 genuine flashing git/gh/npm/uv/ffmpeg/docker spawns through the
_subprocess_compat.run/popen chokepoint (Brooklyn's wrapper from #53810) — the
durable fix, not per-site annotations. cmd.exe /c start stays # ok (intentional).
- Updated tests + CONTRIBUTING.md rule #17 to the corrected invariant.
* fix(windows): stop terminal-window popups from background spawns
Native-Windows desktop/gateway users saw cmd/conhost windows flash on
gateway restart, image paste, the dashboard Projects tree, voice notes,
and ~5 min after closing the app (detached cron). Two root causes:
- Console-subsystem exes (taskkill, schtasks, wmic, netstat, tasklist,
agent-browser, git, ffmpeg, powershell, git-bash) spawned via raw
subprocess allocate a fresh console when the launching process has
none (pythonw desktop backend / detached gateway) - even with output
captured.
- uv venv pythonw shims re-exec console python.exe, so Python children
get a console regardless of how they're launched.
Fixes:
- Single hidden-spawn primitive (_subprocess_compat.run/.popen) that ORs
CREATE_NO_WINDOW on Windows, no-op on POSIX. Route every Hermes-owned
console-exe spawn through it.
- FreeConsole() catch-all in hermes_bootstrap: any Python child that
exclusively owns an auto-allocated console detaches it at startup
(GetConsoleProcessList()==1 gate leaves shared interactive consoles
untouched).
- Replace PowerShell/wmic gateway PID scans with in-process psutil.
- Skip schtasks queries on non-interactive desktop restarts.
- Prefer native agent-browser .exe over .cmd shims.
- Guard test bans raw subprocess spawns of the Windows-only console
tools repo-wide so the popup class can't regress.
* fix(windows): scope FreeConsole to background entry points; fix merge fallout
Console detach review (per #53810 feedback): GetConsoleProcessList()==1 can't
tell a uv pythonw->python phantom console apart from a user opening the
interactive CLI/TUI in its own fresh console (double-click, shortcut, ConPTY) —
both report a single attached process with a tty. Running FreeConsole() in the
import-time bootstrap therefore risked detaching a legitimately-interactive
terminal.
- Extract FreeConsole into explicit hermes_bootstrap.detach_orphan_console();
remove it from apply_windows_utf8_bootstrap() (import side effect).
- Call it only from known background mains: gateway run, dashboard backend
(start_server, what the desktop spawns), cron standalone, tui_gateway entry,
slash worker. Interactive CLI/TUI never calls it.
- Behavior-contract tests: frees only when solo owner, leaves shared console,
no-op without console / on POSIX, and asserts it's not an import side effect.
Merge fallout from origin/main (#53791):
- local.py: 3-way merge left a dangling **_popen_kwargs (NameError crashing
every terminal init). _subprocess_compat.popen already hides the window, so
drop it.
- discord adapter: merge stacked an undefined windows_hide_flags() onto the
primitive call; drop the redundant arg.
- test_gateway: scan now goes psutil-first (zero spawn); rewrite the
case-variant test to drive that production path.
* test(claw): mock _subprocess_compat.run seam for Windows process scan
claw.py's Windows tasklist/powershell scan routes through the hidden-spawn
primitive; the tests still patched claw_mod.subprocess, so on win32 the mock
was never hit and real spawns returned nothing. Patch the actual seam.
* fix(windows): stop subprocess console-window popups + add CI guard
The single biggest source of Windows 'terminal popup' bug reports was bare
subprocess.run/Popen calls spawning a console window. The compat helpers
(windows_hide_flags / windows_detach_popen_kwargs) already existed but the
footgun checker had no rule to stop new bare calls from reintroducing the flash.
- scripts/check-windows-footguns.py: new AST-based rule flagging subprocess
calls that can create a new console — output-redirection-aware (capture/
redirect/check_output exempt) and POSIX-only-program-aware (launchctl/
systemctl/brew/etc. exempt). Comprehensive on real popups, no annotation
burden on calls that can't flash.
- Swept all genuine window-spawning sites through windows_hide_flags()/
windows_detach_popen_kwargs(); marked intentionally-visible launches
(editor/terminal/foreground re-exec) with '# windows-footgun: ok'.
- tests/scripts/test_windows_footgun_subprocess_rule.py: behavior-contract
tests + full-repo cleanliness invariant.
- CONTRIBUTING.md: documents the rule + the helper pattern.
* test: accept creationflags kwarg in psutil_android fake_subprocess_run
The Windows no-window sweep added creationflags=windows_hide_flags() to
install_psutil_android.py's subprocess.run call; the test's fake stub had a
fixed (cmd) signature and raised TypeError on the new kwarg.
send_message with MEDIA:/path to a WhatsApp target previously dropped the
attachment: the WhatsApp branch never passed media_files, the plugin's
_standalone_send accepted the param but only POSTed text, and WhatsApp was
absent from the media-supported platform list.
- send_message_tool: add a Platform.WHATSAPP media block (mirrors Feishu) that
routes media_files through the whatsapp plugin's standalone_sender_fn, and
add whatsapp to the supported-media list strings.
- whatsapp adapter: _standalone_send now sends text first (skipped when the
chunk is media-only), then uploads each file via the bridge /send-media
endpoint with a mediaType derived from extension/is_voice/force_document, so
images/videos/voice arrive as native bubbles instead of documents.
- _bridge_media_type classifier maps ext -> image|video|audio|document.
Closes#19105 (remaining send_message gap). Other items in the report
(inbound video paths, image_generate auto-deliver, history dedup, native
gateway bubbles) already landed on main.
Add an explicit _closing guard to both owned executors so the
recreate-on-shutdown path only recovers from an *external* teardown of
the loop default — never resurrects a pool the gateway/adapter itself
stopped. _shutdown_*executor() sets the flag; _get_*executor() raises if
closing; feishu connect() re-arms on reconnect. Updates the gateway
recreate test to assert the refusal contract and adds feishu coverage.
Feishu SDK calls ran on asyncio's shared default executor, so a torn-down
default executor wedged every send with 'Executor shutdown has been called'
and left the gateway a zombie (#10849). The adapter now owns a
ThreadPoolExecutor recreated on demand if shut down, mirroring the
gateway-owned executor change. Routes all 17 self._client SDK calls through
_run_blocking; shuts the pool down on disconnect.
TELEGRAM_HOME_CHANNEL set to an @username (not a numeric chat ID) crashed
all webhook/cron->Telegram home-channel delivery with 'ValueError: invalid
literal for int()'. The Telegram Bot API accepts both a numeric chat_id and
an @username string; Hermes was force-coercing every chat_id with int().
Add normalize_telegram_chat_id() (returns int for numeric values, passes
@username strings through) and apply it at the Bot API send/edit sites in
the Telegram adapter and the send_message tool. Username targets are now
recognized as explicit targets in _parse_target_ref.
Reapplies the approach from #13274 (season179), whose branch predated the
gateway/platforms/telegram.py -> plugins/platforms/telegram/adapter.py
relocation. Dupes: #13535 (Tranquil-Flow), #37572 (chewkaah).
Co-authored-by: season179 <season.saw@gmail.com>
Add post_setup() and get_status_config() to the Supermemory memory
provider so `hermes memory setup` and `hermes memory status` print a
one-line connection summary (container, profile fact count,
auto_recall/auto_capture). Point API-key onboarding at the Hermes
connect URL (app.supermemory.ai/integrations?connect=hermes).
Salvage of #52988. Two fixes folded in:
- Test isolation: the new probe/status tests mocked _SupermemoryClient
but not the __import__("supermemory") guard inside
_probe_supermemory_connection, so they passed only where the optional
supermemory package was installed and failed on a clean checkout / CI
(the PR shipped with red CI). Added _stub_supermemory_importable()
mirroring the existing test_is_available_false_when_import_missing
pattern; the suite now passes with supermemory absent.
- post_setup: `if api_key and api_key not in os.environ` checked whether
the key's *value* named an env var (always false in practice). Fixed to
compare the value: `os.environ.get("SUPERMEMORY_API_KEY") != api_key`.
Verified: 38/38 in test_supermemory_provider.py and the full
tests/plugins/memory/ suite green with supermemory not installed.
Closes#52988
Populate `reply_to_message_id`, `reply_to_text`, and
`reply_to_is_own_message` on reaction events so the gateway injects
`[Replying to your previous message: "..."]` when the agent receives
a tapback.
The sidecar now extracts a capped text preview from the hydrated
reaction target (plain text and mixed group messages; null for
attachment/voice-only targets), emitting it as `targetText` in the
NDJSON reaction payload. The Python adapter reads this field and sets
the reply correlation fields on the `MessageEvent`.
v8 made `richlink` outbound-only; inbound rich links now arrive as
plain `text`. Remove the `getBalloonBundleId`/`toRichlinkMessage`
branches from the iMessage mapper patch and update the fixture,
lockfile, and README accordingly.
Update the Photon platform plugin's Node.js sidecar from spectrum-ts
3.1.0 to 7.0.0, which splits the SDK into scoped `@spectrum-ts/*`
packages with `spectrum-ts` as the umbrella re-export.
- Bump exact pin in package.json/package-lock.json to 7.0.0
- Update mixed-attachments patch script to target the new
`@spectrum-ts/imessage/dist/index.js` path and tab-indented output
- Rewrite test fixture to match v7.x mapper shape (tab-indented,
`const ... = async` declarations, single-line builder calls) and
point at `@spectrum-ts/imessage/dist/index.js`
- Update README upgrade guide to document the v5 package split and
the postinstall patch validation step
- Update comments in cli.py and index.mjs to reference v5/v7 changes
The self-hosted OIDC provider fetched the discovery document with a bare
httpx.get(). httpx defaults to follow_redirects=False (unlike curl -L or
the requests library), so when an IDP answers GET
/.well-known/openid-configuration with a 3xx — Authentik canonicalises the
.well-known path, and any IDP behind a reverse proxy doing an http→https
upgrade redirects too — the bare redirect (empty body) tripped the
status != 200 guard and raised 'OIDC discovery returned 302', which
routes.py maps to the provider_unreachable audit event and a 503. The
browser surfaced 'Auth provider self-hosted unreachable'.
The user's smoking gun (curl -o writing zero bytes from inside the
container) is exactly a redirect with no body — the same wall the code hit.
Add follow_redirects=True to the discovery GET only. It's safe: the
issuer-pin check and _require_https_or_loopback still validate the resolved
document and every endpoint, so a redirect can't smuggle in a bad issuer or
a cleartext endpoint. The token/revocation POSTs deliberately keep the
no-follow default (they carry an auth code / refresh token and the endpoint
is already the canonical absolute URL).
Existing discovery tests mocked httpx.get with a canned 200 and never
exercised a real 3xx. Add a regression test that runs a real loopback
server returning a 302 on the .well-known path — fails without the fix
(ProviderError: discovery returned 302), passes with it.
* Return None instead of erroring on drain login failure
* Fix login on drain
* Remove login for drained endpoints flow and clean the code
* chore: drop unrelated credits changes from this PR
* Remove extra comments that were not really necessary
The PR's original refactor commit only replaced the primitives (regex,
is_table_row, split_markdown_table_row) with shared imports but left the
verbatim-copied renderer (_render_table_block_for_telegram) and driver
(_wrap_markdown_tables) in place. Both are logic-identical to the shared
convert_table_to_bullets in gateway/platforms/helpers.py.
Replace both with a direct import alias. _TABLE_SEPARATOR_RE is still
imported separately because it's used by the rich-message routing logic
(lines 1024, 1044) to detect whether content contains tables.
Found by 3-agent parallel code-reuse review.
Replace local _TABLE_SEPARATOR_RE, _is_table_row, and
_split_markdown_table_row with imports from the shared module.
Telegram-specific rendering stays local.
Co-authored-by: Yashiel Sookdeo <yashiel@skyner.co.za>
Discord does not render GFM pipe tables — raw pipe characters display
as garbage text. format_message now rewrites tables into bold-heading +
bullet groups using the shared helpers.
Fixes#21168
Co-authored-by: Yashiel Sookdeo <yashiel@skyner.co.za>
preexec_fn=os.setsid runs Python code in the forked child before exec,
which is unsafe in multi-threaded processes (CPython docs). When the
Desktop gateway loads native libraries (onnxruntime, BLAS, provider SDKs)
with active thread pools, the fork can SIGSEGV before the child execs.
Replace all preexec_fn usage with start_new_session=True, which provides
the same setsid/process-group semantics without running Python in the
fork. This is already the pattern used throughout hermes_cli/gateway.py
and hermes_cli/_subprocess_compat.py.
Fixes#46789