hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-25 11:02:03 +00:00

Author	SHA1	Message	Date
teknium	e9cd8c5bf3	fix(delivery): drop env-var knob, flag all chunking adapters Follow-up to ScotterMonk's cron-truncation fix: - Remove HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var. Behavioral config belongs in config.yaml, not a new HERMES_* env var (.env is secrets only). The actual bug is fixed entirely by the adapter-aware skip; the configurable cap was unneeded scope. MAX_PLATFORM_OUTPUT is a constant again, collapsing the max_output=0 disable branch and the audit-vs-truncation threshold divergence. - Flag the remaining verified-chunking adapters (slack, matrix, feishu, mattermost, teams, whatsapp, whatsapp_cloud, weixin, bluebubbles, yuanbao) with splits_long_messages=True so the fix covers the whole bug class, not just Discord/Telegram. Each verified to chunk in its own send() via truncate_message(). - SMS deliberately left False: it chunks for normal replies but a multi-segment cron blast is cost-bearing; the 4000-cap + file save is the safer default there. - Update tests: drop the two env-override tests, add a test asserting a save failure during truncation (non-chunking) propagates.	2026-06-22 05:41:22 -07:00
ScotterMonk	86e4521cb1	fix(delivery): make cron output truncation configurable + adapter-aware Gateway-level truncation (MAX_PLATFORM_OUTPUT=4000) was pre-empting adapter-side message splitting. Discord and Telegram both chunk long content natively in their send() via truncate_message(), but the delivery router truncated to 3800 chars + footer before the adapter ever saw the full payload — so long cron output was cut short instead of being delivered as multiple messages (issue #50126). Changes: - HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var makes the cap configurable (default 4000, backward compatible). Set to 0 to disable truncation. - TRUNCATED_VISIBLE (3800) removed — visible portion now derived dynamically from max_output minus the actual footer length. - New BasePlatformAdapter.splits_long_messages capability flag (default False). Adapters that chunk in send() set True; delivery skips truncation for them but still saves full output to disk as audit. - Flagged Discord and Telegram (both verified to chunk in send()). Fixes #50126	2026-06-22 05:41:22 -07:00
kshitij	1f28b1a9b9	fix(gateway): redact credentials from approval prompts before sending to clients (#48456) (#50767) Tirith redacts its own findings, but the approval-request callbacks built the operator prompt from the RAW command string, so a credential-shaped value Tirith flagged was sent verbatim to clients, undoing the redaction one layer up. Two egress transports carried the leak; both are fixed via a shared module-level seam _redact_approval_command() (redact_sensitive_text force=True): 1. chat platforms — _approval_notify_sync (gateway/run.py): redact before both the button path (send_exec_approval) and the plain-text /approve fallback. 2. SSE/API stream — _approval_notify (gateway/platforms/api_server.py): redact event['command'] before it is enqueued to API/desktop clients. (whole-bug-class: sibling call path on a separate transport.) force=True so the prompt — a hard secret-egress boundary — honors redaction even when security.redact_secrets is off. Clean commands pass through unchanged. Tests bind the seam (synthetic credential-format fixtures, force-when-disabled) AND assert BOTH callbacks ASSIGN the redacted result before the send/enqueue sink, via an AST contract that rejects a discarded-result call. All mutation-checked.	2026-06-22 11:39:45 +00:00
teknium1	4314d451ca	fix(gateway): accept any inbound file type across all messaging platforms Authorization to message the agent is the gate, not the file extension. Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass at all on Telegram/Slack — so an .html (or any non-allowlisted type) was dropped or hard-rejected before the agent saw it. Now every authorized upload is cached and surfaced to the agent regardless of type: - base.cache_media_bytes(): unknown types cache as octet-stream (or the caller-supplied MIME) instead of returning None — fixes the chokepoint that Teams/Telegram-media route through. - discord/telegram/slack adapters: removed the allowlist reject/skip; any non-media attachment is typed DOCUMENT and cached. Known types keep their precise MIME. - Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text + code + config + markup) instead of a blind UTF-8 decode, so binary formats (PDF/zip/docx) with ASCII headers are never inlined. - gateway/run.py emits the path-pointing context note for every DOCUMENT, including non text/application MIME types. - discord.allow_any_attachment is now a documented no-op kept for config back-compat. Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types cache, known types stay precise, PDFs are not inlined.	2026-06-21 22:43:45 -07:00
teknium1	7726ce3040	fix(security): close hermes-0day MCP-persistence attack surface Remove the dashboard --insecure auth-bypass, add an MCP persistence guard + IOC blocklist, and raise the API-server key entropy floor. Driven by the June 2026 hermes-0day campaign (r/hermesagent, live 854.media instance): scanners find exposed Hermes dashboards/API servers, drive the root agent to plant a 'command: bash' MCP entry that appends an attacker SSH key to authorized_keys, which cron + startup then re-execute every tick. - dashboard: --insecure no longer disables the auth gate. should_require_auth returns True for every non-loopback bind; a public bind ALWAYS requires an auth provider (bundled password provider or OAuth). --insecure kept as a warned no-op for backward compat. Fail-closed error now points at the password provider, not at --insecure. - mcp_security: validate_mcp_server_entry now also rejects shell payloads that write to OS persistence surfaces (authorized_keys/.ssh/pam.d/sudoers/cron/ rc files) and hard-rejects a hermes-0day IOC blocklist (attacker SSH key + source IPs) anywhere in command/args/env. Runs at save AND spawn time. - api_server: raise network-bind API_SERVER_KEY entropy floor 8->16 chars; warn when a network-accessible API server runs an unsandboxed local backend.	2026-06-21 19:05:27 -07:00
Teknium	c0409a87ff	feat(gateway): typed send-error classification (SendResult.error_kind) (#50342) Add a platform-neutral send-failure vocabulary so consumers can branch on a typed category instead of substring-matching the raw provider message. - base.py: SEND_ERROR_KINDS + classify_send_error() (too_long / bad_format / forbidden / not_found / rate_limited / transient / unknown), and an optional SendResult.error_kind field (defaults None — fully backward compatible). - telegram.py: populate error_kind on send() failures; message_too_long keeps its existing error token plus error_kind='too_long'. Purely additive: no behavioral change to the existing degrade-and-deliver paths (MarkdownV2->plain-text fallback, overflow split, retry classification all untouched). 22 new tests + 210 adapter regression tests green.	2026-06-21 12:34:22 -07:00
Teknium	7a131f7f40	fix(api-server): stop silently promising async delivery on stateless HTTP path (#50319) * fix(api-server): stop silently promising async delivery on stateless HTTP path terminal(notify_on_complete=True / watch_patterns) and delegate_task(background=True) silently no-op'd on the API server / WebUI path (#10760): the watcher / detached child registered, but every API-server route (OpenAI-spec /v1/chat/completions and /v1/responses, plus the proprietary /v1/runs SSE stream) tears down its channel when the turn ends, and APIServerAdapter.send() is a no-op stub. A completion that fires after the response closed had nowhere to go — from the agent side, indistinguishable from a hang. There is no spec-compliant surface to wake the agent later on a stateless HTTP client, so make the no-op honest instead of silent: - Add a per-adapter capability flag supports_async_delivery (default True; APIServerAdapter = False), propagated into a HERMES_SESSION_ASYNC_DELIVERY contextvar via async_delivery_supported(). Toggle on the adapter, not a hardcoded platform string — a future stateless adapter is correct-by-default. - terminal: when delivery is unsupported, skip watcher registration, force notify_on_complete off, and return a notify_unsupported note telling the agent to process(action='poll'). - delegate_task: when delivery is unsupported, fall back to SYNCHRONOUS execution (work runs and returns in the same response) with a note, instead of handing out a handle that never resolves. CLI (in-process completion_queue) and the real gateway platforms are unchanged. Fixes #10760 * refactor(api-server): route session binding through a single no-delivery chokepoint Add APIServerAdapter._bind_api_server_session() and route both agent-entry paths (_run_agent for /v1/chat/completions + /v1/responses, and the /v1/runs _run_sync path). 
The helper hardwires platform="api_server" and async_delivery=False with no async_delivery parameter to pass, so a future route added to the API server physically cannot reintroduce the silent no-op (#10760) by forgetting to mark the channel as non-delivering. The binding stays request-scoped (cleared per turn), so a session resumed later on a delivering interface (CLI / gateway platform) re-binds fresh and is NOT blocked — the no-delivery decision tracks the interface handling the current turn, never the session.	2026-06-21 12:15:14 -07:00
sgaofen	93ea9b04af	fix(gateway): cap inbound media download size to prevent memory exhaustion Inbound image/audio/video payloads were buffered fully into process memory before being written to the cache, with no size limit. A large upload (Discord Nitro allows 500 MB) or a remote media URL in an inbound message pointing at a huge file could spike RAM and OOM-kill the gateway. Enforce a configurable cap in the shared cache helpers (gateway/platforms/base.py) so the protection holds across every platform adapter, not one: - cache_image/audio/video_from_bytes reject oversized payloads before writing (video was the gap in the original report — now covered). - cache_image/audio_from_url stream the body, rejecting on an oversized Content-Length header and re-checking the running total per chunk so an absent/lying header can't smuggle an unbounded body past the cap. - Discord's _read_attachment_bytes checks att.size up front, so an oversized attachment is rejected before any bytes are pulled into memory. Configurable via gateway.max_inbound_media_bytes in config.yaml (default 128 MiB; 0 disables). No new env var — non-secret config lives in config.yaml. Salvaged and extended from @sgaofen's PR #13341 (the original report and the shared-helper approach). Reapplied onto current main (Discord adapter has since moved to plugins/platforms/discord/), the configurable knob moved from an env var to config.yaml, and the video cache helper added. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-21 11:56:46 -07:00
Teknium	e499d69e3e	feat(api-server): configurable concurrent-run cap to prevent DoS (#50007) The OpenAI-compatible API server only enforced a hardcoded cap of 10 concurrent runs on /v1/runs, leaving /v1/chat/completions and /v1/responses unbounded — a request flood could exhaust CPU, memory, and upstream LLM quota (#7483). - Add gateway.api_server.max_concurrent_runs (config.yaml, default 10, 0 disables). No env var. - Shared concurrency gate across all three agent-serving endpoints, counting both the chat/responses in-flight counter and the /v1/runs stream set. Returns OpenAI-style 429 + Retry-After when at the cap. - Remove the dead hardcoded _MAX_CONCURRENT_RUNS class attribute. Closes #7483.	2026-06-21 07:26:03 -07:00
kshitijk4poor	b577f25100	refactor(gateway): dedupe drain-timeout resolution + share active_agents parse Follow-up cleanups on top of the busy/idle readout (PR #50103): - web_server.py /api/status reused the single drain-timeout resolver hermes_cli.gateway._get_restart_drain_timeout() (HERMES_RESTART_DRAIN_TIMEOUT env -> agent.restart_drain_timeout config -> default) instead of inlining a third hand-rolled copy of that precedence chain. Also fixes a subtle divergence: the inline copy used os.environ.get() so a set-but-empty env var was treated as a value rather than falling through to config; the shared resolver .strip()s and falls through correctly. - Added gateway.status.parse_active_agents() and routed BOTH HTTP surfaces (/api/status and /health/detailed) through it, so the exposed active_agents field is consistently clamped non-negative. Previously /api/status clamped while /health/detailed exposed the raw file value, diverging on a corrupt count. - Added TestParseActiveAgents covering the shared coercion contract.	2026-06-21 17:22:52 +05:30
Ben	0ee75469d7	feat(dashboard): surface gateway busy/drainable on /api/status Give an external consumer (NAS) a trustworthy, always-reachable busy/idle readout it can poll before a disruptive lifecycle action (restart, migrate, stop, auto-update). The dashboard /api/status is the only HTTP surface guaranteed up on a hosted agent regardless of which gateway platforms are enabled, and it already reads gateway_state.json. Add to /api/status (additive, non-breaking): - active_agents — in-flight gateway-turn count (now refreshed per-turn by the companion gateway-side commit) - gateway_busy — running AND active_agents > 0 - gateway_drainable — running and live (a valid begin-drain target) - restart_drain_timeout — resolved seconds, so the consumer can size its poll deadline without out-of-band knowledge (env HERMES_RESTART_DRAIN_TIMEOUT → config agent.restart_drain_timeout → default) The busy/drainable contract is defined once in gateway.status (derive_gateway_busy / derive_gateway_drainable) and consumed by both /api/status and /health/detailed so the two surfaces can never disagree. Liveness keys off gateway_running (a live PID/health probe), NEVER gateway_updated_at — a healthy idle gateway never advances that timestamp. All derived fields degrade to safe falsy values when the gateway is down or the status file is absent/corrupt (never a spurious "busy" that would wedge the consumer). active_sessions (the 5-min DB recency heuristic the SPA reads) is left exactly as-is — new signal, new fields. Tests (behaviour contracts, not snapshots): the pure derivation contract across every running/state/count/liveness combination; /api/status integration for busy, idle-drainable, draining, down, stale-busy-file, corrupt-count, and timeout surfacing; and /health/detailed parity.	2026-06-21 17:22:52 +05:30
Zheng Tao	491579fa05	fix(whatsapp): resolve bridge dir with HERMES_HOME mirror in Docker In Docker the install tree (/opt/hermes) is read-only, so npm install for the WhatsApp bridge fails with EACCES. Add resolve_whatsapp_bridge_dir() in whatsapp_common.py: when the install dir is read-only, mirror the bridge source into a writable HERMES_HOME location and use that. Both the adapter and the 'hermes whatsapp' CLI resolve through the shared helper so the install and runtime paths agree. Fixes #49561	2026-06-20 17:05:27 -07:00
Teknium	5600105478	refactor(gateway): migrate slack/dingtalk/whatsapp/matrix/feishu/telegram/wecom/email/sms adapters to bundled plugins Salvage of PR #41284 onto current main. Relocates the last 9 inline messaging adapters (+ satellites: telegram_network, feishu_comment/_rules/meeting_invite, wecom_crypto, wecom_callback) from gateway/platforms/ into self-contained bundled plugins under plugins/platforms/<x>/, discovered via the platform registry. Strips the per-platform core touchpoints from gateway/run.py, gateway/config.py, hermes_cli/gateway.py, hermes_cli/setup.py, and tools/send_message_tool.py. Carries forward the migration fixes (explicit enabled:false honored, get_connected_platforms forces discovery, plugin is_connected via gateway.get_env_value, logs --component gateway matches plugins.platforms.*, matrix hidden on Windows). Additionally ports config keys main added since the PR base: the matrix plugin's _apply_yaml_config now also covers allowed_users, ignore_user_patterns, process_notices, and session_scope (the inline gateway/config.py matrix block gained these in the 1340 commits the PR sat open; they would otherwise have been silently dropped on deletion).	2026-06-20 10:26:45 -07:00
kshitijk4poor	26d9a3c710	fix(signal): FIFO-evict the quote-detection timestamp cache `_sent_message_timestamps` (the reply-to-own-message quote cache) used a `set` evicted with `set.pop()`, which removes an ARBITRARY element — so once more than the cap (500) outbound timestamps are tracked, a still-recent timestamp could be dropped while older ones survive, missing a genuine reply-to-own-message. Convert it to an OrderedDict with FIFO (oldest-first) eviction, mirroring the recently-hardened echo ring (#31250). This closes the same bug class on the sibling cache. Adds a regression test asserting oldest-first eviction + MRU promotion.	2026-06-20 21:00:46 +05:30
w31rdm4ch1nZ	332f88f6a6	fix(signal): harden recently-sent echo ring with LRU + TTL	2026-06-20 20:50:52 +05:30
kshitijk4poor	32a97a20af	fix(signal): strip self-mention in all groups, not just require_mention Review follow-up on the salvaged self-mention strip (#31217): the original only stripped the bot's rendered @<number>/@<uuid> self-mention inside the `require_mention=true` branch, so groups with require_mention=false still leaked it into the agent text. Hoist the strip to run for every group message (fixing the whole bug class), and collapse the doubled space a mid-sentence removal leaves while preserving intentional newlines.	2026-06-20 16:27:28 +05:30
Kailigithub	40b6ac9ac7	fix(signal): send explicit stop-typing RPC when cancelling indicator	2026-06-20 16:23:41 +05:30
Rick Ratmansky	96b10327b6	fix(signal): strip bot self-mention from group messages before agent dispatch	2026-06-20 16:23:41 +05:30
lkz-de	96db7c6883	fix(signal): preserve quoted reply context Carry Signal quote metadata through gateway events so replies to assistant messages include the quoted context without personalizing comments.	2026-06-20 15:16:53 +05:30
kshitij	ff50a88617	Merge pull request #49558 from NousResearch/salvage/env-var-guards-48735	2026-06-20 15:11:54 +05:30
kshitijk4poor	a7dd98c860	fix(env): guard remaining malformed int/float env var casts with utils helpers Widen the env_float() guard from #48735 across the whole bug class: a non-numeric value (e.g. a stale .env "HERMES_API_TIMEOUT=abc" or a typo'd port) raised an unhandled ValueError and crashed adapter/agent init. Converts 22 genuinely-unguarded first-party int/float(os.getenv()) sites to the canonical utils.env_int / utils.env_float helpers (the established house pattern), instead of duplicating per-module helpers or inline try/except: - gateway/config.py: WECOM_CALLBACK_PORT, BLUEBUBBLES_WEBHOOK_PORT - gateway/platforms/email.py: EMAIL_IMAP/SMTP_PORT, EMAIL_POLL_INTERVAL - gateway/platforms/feishu.py: dedup cache + text/media batch settings - gateway/platforms/wecom.py, discord/adapter.py: text batch delays - gateway/platforms/telegram.py: media batch delay, TELEGRAM_WEBHOOK_PORT - gateway/platforms/whatsapp.py: WHATSAPP_NPM_INSTALL_TIMEOUT - hermes_cli/auth.py: CODEX/XAI refresh timeouts - agent/chat_completion_helpers.py: API/stream read/stale timeouts - run_agent.py, agent/auxiliary_client.py: API + nous timeouts Sites already guarded by try/except or local helpers are left untouched. The HERMES_MAX_ITERATIONS sites are already guarded on main via _current_max_iterations(), so they are not included.	2026-06-20 14:54:36 +05:30
kshitijk4poor	abafba0762	refactor(signal): correct STT-fallback comment, type the markdown wrapper, make AAC test portable Review follow-up on the salvaged AAC + markdown changes: - Fix an inaccurate comment claiming the STT layer has a sniff-and-remux fallback (verified: no such fallback exists; the ffmpeg-absent path caches raw ADTS and STT may reject it). - Type the _markdown_to_signal wrapper as tuple[str, list[str]] to match the shared helper instead of a bare tuple. - Replace the hardcoded /home/pi/... test fixture with a runtime-generated ADTS AAC sample so the remux round-trip actually runs in CI (skips only when ffmpeg is absent) instead of always-skipping.	2026-06-20 14:24:29 +05:30
jasnoorgill	da34fca2bb	fix(signal): detect ADTS AAC voice notes and remux to MP4 Android Signal delivers voice notes as raw ADTS AAC frames, which share the `0xFF 0xFx` sync word with MPEG-1/2 Layer 3 (MP3). The `_guess_extension` byte-signature test in gateway/platforms/signal.py was matching both, so ADTS AAC was being misclassified as MP3 — saved to disk with the wrong extension and rejected by every major STT API (Groq, OpenAI) because their server-side format sniffers inspect the actual codec, not the file extension. Two changes: 1. Tighten the MP3 vs ADTS disambiguator. ADTS packs `ID`, `layer`, and `protection_absent` into bits 3-0 of byte 1, where `ID=0` and `layer=00` for AAC. Real MP3 has `ID=1` and `layer` in {01, 10, 11}. The mask `0xF6` against target `0xF0` cleanly separates them. 2. Remux raw ADTS AAC to MP4 container at the cache step via `ffmpeg -c:a copy`. Single demux/remux, no re-encode, no quality loss, sub-100ms on a Pi 5. The cached file is a normal `.m4a` that all major STT providers accept. ffmpeg is a transitive dependency of many other Hermes features (TTS, video skills) so this isn't a new install requirement; the remux degrades gracefully to a no-op if ffmpeg is missing. The new helper `_remux_aac_to_m4a` is unit-tested with a real Android voice note from the audio cache that originally triggered the bug, plus synthetic ADTS frames for the byte-level disambiguator and garbage-input graceful failure. Closes the gap that broke transcription for any Android Signal user sending voice messages to Hermes.	2026-06-20 13:48:05 +05:30
lkz-de	905820b59f	fix(signal): share markdown formatting across send paths Route Signal send paths through shared markdown formatting helpers and render markdown bullets consistently as Unicode bullets. Add coverage for Signal formatting and send_message integration.	2026-06-20 13:47:14 +05:30
joaomarcos	3a6c171e9e	fix(gateway): log signal transport response and bubble cron live adapter errors	2026-06-19 16:59:38 -07:00
joaomarcos	5649b8649a	Fix silent delivery failures in Signal live adapter (#49260)	2026-06-19 16:59:38 -07:00
kshitijk4poor	d4e7dd609d	refactor(windows): tidy managed-node resolver helpers Behavior-preserving cleanups on the managed-node resolver: - Hoist _candidate_node_command_names() out of the inner dir loop in find_hermes_node_executable (computed once, not per directory). - Drop redundant os.environ.copy() at the two with_hermes_node_path(os.environ.copy()) sites — the helper already copies os.environ when called with no argument (verified env-equivalent). - Add reciprocal keep-in-sync comments between iter_hermes_node_dirs() (hermes_constants.py) and hermesManagedNodePathEntries() (electron main.cjs), which mirror the same platform-ordering rule across the Python/Node boundary.	2026-06-20 02:12:16 +05:30
helix4u	7a7b56d498	fix(windows): prefer managed node for whatsapp and desktop	2026-06-20 02:00:37 +05:30
Teknium	26e76a75e5	feat(telegram): opt-in Online/Offline bot status indicator (#49134) Sets the Telegram bot's short description (the line under its name) to "Online" on gateway connect and "Offline" on clean disconnect, gated behind extra.status_indicator (off by default). Telegram bots have no presence/online dot — that's a user-account feature the Bot API doesn't expose for bots. The short description is the closest available surface, so this gives users a way to tell whether the gateway is up from the bot's profile. - New extra.status_indicator flag (+ status_online/status_offline text overrides), read in __init__ via config.extra — no config-schema change. - _set_status_indicator() helper: best-effort, swallows API errors so it never blocks connect/disconnect; truncates to Telegram's 120-char cap. - Wired Online after _mark_connected(), Offline at top of disconnect() while the bot HTTP client is still alive. - 9 unit tests + Telegram docs section. Requested by @ilTrumpista, cc @Teknium.	2026-06-19 11:38:39 -07:00
teknium1	a58287afcb	Merge remote-tracking branch 'origin/main' into pr48275-rebase # Conflicts: # cron/scheduler.py	2026-06-19 07:40:29 -07:00
Ben Barclay	f35abb122a	feat(gateway): multiplex phase 1 — HTTP-inbound /p/<profile>/ routing (webhook) Serve webhook inbound for multiple profiles off the one shared listener via a URL prefix, with no second port bound. - SessionSource gains a 'profile' field (round-trips through to_dict/from_dict; omitted when unset so existing serialization is unchanged). It carries which profile an inbound message was routed to. - WebhookAdapter registers /p/{profile}/webhooks/{route_name} alongside the existing /webhooks/{route_name}. _resolve_request_profile validates the prefix against profiles_to_serve(): None when absent or multiplexing is off (ignored, handled as default — no spurious 404), the profile name when valid, _PROFILE_REJECTED (→ 404) when the profile isn't served. The resolved profile is stamped onto the SessionSource. - session-key namespacing and the per-turn home/credential scope now prefer source.profile: SessionStore._resolve_profile_for_key(source), _session_key_for_source fallback, and _resolve_profile_home_for_source all honor it (→ the agent turn resolves that profile's config/skills/credentials via the Phase 2 _profile_runtime_scope). Constraint: routing inbound needs no per-profile platform credential, but the agent still needs the routed profile's provider key — delivered by Phase 2's secret scope. api_server (OpenAI-compatible surface) profile routing is a focused follow-on; its source-construction path differs from webhook's. Tests: SessionSource.profile round-trip + namespace drive; _resolve_request_profile accept/reject/ignore matrix.	2026-06-19 07:34:15 -07:00
infinitycrew39	460b1e50e5	fix(gateway): refresh max_turns before resolving runtime budget	2026-06-19 06:31:13 -07:00
Ben	b75757d4aa	feat(cron): wire on_jobs_changed, cron.chronos config, docs + agent↔NAS contract Phase 4F (F.1 + F.2 + F.3, agent side). F.4 is the operator-run live smoke (needs a NAS deployment); recorded in the PR, not code. F.1 — on_jobs_changed wiring: - cron/scheduler.py: _notify_provider_jobs_changed() — resolve the active provider, call on_jobs_changed(), swallow errors. Lives in scheduler.py (not jobs.py) so the store stays free of provider imports (no import cycle). - Wired at the consumer surfaces AFTER a successful mutation: the cronjob model tool (tools/cronjob_tools.py, create/update/remove/pause/resume) — which the `hermes cron` CLI also routes through — and the REST handlers (gateway/platforms/api_server.py, same five). Built-in's no-op default = zero behavior change on the default path. Sleeping-agent direct jobs.json writes (no tool/CLI/REST) are covered by reconcile-on-wake in start(). F.2 — config: cron.chronos.{portal_url,callback_url,expected_audience, nas_jwks_url}. All non-secret; the agent holds no scheduler creds and the outbound provision call reuses the existing Nous token (no token key). Additive deep-merge key, no version literal. F.3 — docs: - docs/chronos-managed-cron-contract.md: authoritative agent↔NAS wire contract (the three agent-cron endpoints + inbound /api/cron/fire + the 3-hop trust model + at-most-once/re-arm semantics). This is what the NAS-side agent builds against. - cron-internals.md: "Managed cron (Chronos) for scale-to-zero" section. - cli-commands.md: cron.provider accepts chronos + the cron.chronos.* keys. - User docs name no scheduler vendor (QStash is a NAS-internal detail). INVARIANT re-verified: zero qstash/upstash hits across plugins/cron, gateway, hermes_cli, tools, website/docs (the one remaining repo hit is an unrelated Context7 MCP comment in tools/mcp_tool.py). Tests: test_jobs_changed_notify (5) — notify calls provider hook, swallows errors, built-in harmless, tool create/remove notify. 
Full cron + chronos + webhook + config + api_server_jobs suites green (504 in the cron+chronos+webhook run).	2026-06-18 15:11:32 +10:00
Ben	3fc7b624d8	feat(cron,gateway): NAS-JWT fire verifier + /api/cron/fire webhook (Chronos) Phase 4E (E.1 + E.2). The inbound side of Chronos: NAS POSTs the agent when a one-shot fires; the agent verifies a NAS-minted JWT and runs the job. E.1 — plugins/cron/chronos/verify.py: - verify_nas_fire_token(token, expected_audience, jwks_or_key, issuer): verifies signature against the NAS JWKS (RS/ES family; symmetric rejected), aud == this agent, exp/nbf, iss, and purpose == "cron_fire" (so a general agent JWT can't be replayed against the fire endpoint). Returns claims or None; never raises. Crypto delegated to PyJWT[crypto] (already a declared dep) — no hand-rolled JWT, no new dependency. No key configured → refuse (never unsigned-decode a security boundary). - get_fire_verifier(): pluggable indirection so the DQ-4 escape hatch (direct per-job cron-key) can swap in with no handler change. E.2 — gateway/platforms/api_server.py: - POST /api/cron/fire (registered only when _CRON_AVAILABLE). Authenticated by the NAS-JWT via get_fire_verifier() — NOT API_SERVER_KEY (NAS holds no API key; this is the only inbound that triggers remote job execution, so it gets its own purpose-scoped check). Verifier args come from cron.chronos.* config. 401 on bad/missing/forged token. 400 on missing job_id. On success: 202 + fire_due runs in the background (so a long agent turn never trips NAS's HTTP timeout); the store CAS claim inside fire_due de-dupes a scheduler retry. Tests: - test_chronos_verify (11): REAL RS256 signing — valid→claims, wrong-aud, missing/wrong purpose, expired, wrong-iss, tampered-signature (attacker key), no-key-refuse, empty-token, JWKS-URL key resolution, get_fire_verifier. - test_cron_fire_webhook (5): valid→202+fire, invalid→401+no-fire, missing token→401, missing job_id→400, and fire path does NOT require API_SERVER_KEY. api_server regression suites (214) green. 
E.3 (NAS endpoints) is a separate cross-repo PR; the wire contract lands next (docs/chronos-managed-cron-contract.md).	2026-06-18 14:46:33 +10:00
Sierra (Hermes Agent)	01ae9b853e	fix(telegram): resolve replies to rich (sendRichMessage) messages Telegram does not echo a sendRichMessage's content back in reply_to_message (.text/.caption empty, .api_kwargs None), so replies to rich sends (briefings, the gateway's own rich finals) arrived with no quotable text and the [Replying to: ...] injection was skipped. Remember message_id -> text at send time in a best-effort JSON index (gateway/rich_sent_store.py), and recover it on inbound when text and caption are both empty. Best-effort and no-throw throughout: any failure degrades to prior behavior and never breaks a send or message. Salvaged from #47375 by @x1erra. Dropped the cross-platform run.py reply-prefix rewrite (out of scope; bloated every reply on every platform) and scrubbed a docstring reference to an out-of-repo script. Kept the inbound reply_to logging enrichment used to verify the fix.	2026-06-16 13:04:20 -07:00
Wolfram Ravenwolf	16fc717091	fix(mattermost): harden delivery hygiene PROBLEM: Mattermost threads can become invalid or enormous, exposing two failure modes: internal scratch/reasoning/commentary displays could leak into persistent Mattermost threads via global display toggles, while rejected threaded user-visible replies could disappear unless every failed send fell back flat. A broad flat fallback would pollute channels with tool/status/progress noise. SOLUTION: Require explicit Mattermost platform opt-in for scratch displays, keep using the existing notify=True metadata marker for user-visible final text/media/file replies, and allow the Mattermost plugin adapter to flat-fallback only notify-worthy sends whose threaded POST failure looks like a broken root/thread. Keep tool/status/progress and other non-notify sends thread-strict. Add regression tests for display opt-in, notify-only broken-thread fallback, generic API failure suppression, and stream notify metadata. Verification: tests/gateway/test_mattermost.py tests/gateway/test_stream_consumer.py tests/gateway/test_stream_consumer_thread_routing.py tests/gateway/test_stream_consumer_fresh_final.py tests/gateway/test_stream_consumer_draft.py; tests/gateway/test_session_api.py tests/gateway/test_status_command.py tests/gateway/test_resume_command.py tests/hermes_cli/test_commands.py; py_compile touched gateway files; git diff --check. Session: Mattermost thread 6qg8e9dd1pd9pkhi74xyaa1mry, 2026-06-01.	2026-06-16 06:34:54 -07:00
Rory Evans	e65d74bc6f	fix(gateway): accept `metadata` kwarg in WhatsApp/email send_image `BasePlatformAdapter.send_multiple_images` passes `metadata=metadata` to `send_image` / `send_image_file` / `send_animation` on every send. The WhatsApp and email `send_image` overrides stopped their signature at `reply_to`, so any image delivered as a URL (the common case — image-gen backends return URLs) raised: TypeError: send_image() got an unexpected keyword argument "metadata" and the image silently failed to send. Their sibling overrides (`send_image_file` / `send_video` / `send_voice` / `send_document`) already absorb it via **kwargs, which is why only plain image-URL sends broke. - whatsapp/email `send_image`: accept `metadata` (matches the base signature); WhatsApp forwards it to the super() text fallback. - Add `tests/gateway/test_media_metadata_contract.py`: asserts WhatsApp + email accept it, plus a best-effort sweep over every adapter so the next slip fails at test time instead of in production. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 06:23:53 -07:00
Teknium	a6364bfa08	fix(telegram): edit streamed previews in place as rich (Bot API 10.1) (#46890) Streamed Telegram replies that finalize through editMessageText were converted to MarkdownV2, which has no table syntax and rewrites pipe tables into bullet lists; users saw a table while streaming that collapsed to a list at the last moment. Finalize now edits the existing preview IN PLACE via Bot API 10.1's editMessageText rich_message parameter when the content has constructs the legacy path degrades (tables, task lists, <details>, block math). No fresh send + delete, so no duplicate-preview flicker, which is the reason #46206 reverted the fresh-final re-send path. prefers_fresh_final_streaming stays False; the in-place edit replaces it. - _needs_rich_rendering(): rich reserved for table/task-list/details/math (adapted from #45995, @YonganZhang); plain replies stay on MarkdownV2. - _try_edit_rich(): editMessageText + rich_message via do_api_request, mirroring _try_send_rich's fallback/latch/transient contract. - edit_message finalize tries rich in place before the 4,096 overflow pre-flight (rich cap is 32,768), falling back to legacy on rejection. - rich_messages default flipped back to True (DEFAULT_CONFIG + adapter). - docs (en + zh-Hans) + cli-config example updated to default-on. Closes the root cause behind #45911 / #46009.	2026-06-16 05:26:04 -07:00
Teknium	a1f51feb72	fix(telegram): avoid rich final duplicate previews (#46206)	2026-06-14 11:13:38 -07:00
Teknium	efbe1635dd	fix(gateway): include replied-to media attachments (#46107)	2026-06-14 04:51:50 -07:00
Teknium	9459057d7f	fix(telegram): guard rich details math crash (#46102)	2026-06-14 04:22:22 -07:00
Teknium	cf7d5932f8	fix(email): make IPv4 SMTP fallback use supported sockets	2026-06-14 04:16:26 -07:00
liuhao1024	04d4471d79	fix(email): use SMTP_SSL for port 465 and fall back to IPv4 on timeout Port 465 expects implicit TLS (SMTP_SSL) from the first byte. The email adapter always used SMTP() + starttls(), which is correct for port 587 but hangs/fails on port 465 providers (e.g., Swiss ISPs). Additionally, when the SMTP host has AAAA DNS records but IPv6 is unreachable, socket.create_connection() tries IPv6 first and hangs until timeout. Add an IPv4 fallback via AF_INET socket. Extract _connect_smtp() helper to consolidate the 4 duplicate SMTP connection sites into a single method with correct protocol selection and IPv6 fallback logic.	2026-06-14 04:16:26 -07:00
Teknium	5105c3651a	perf(api-server): normalize chat content linearly (#46079)	2026-06-14 03:25:49 -07:00
Teknium	afc8615509	perf(webhook): prune request caches incrementally (#46065)	2026-06-14 02:40:54 -07:00
Justin Sunseri	12682d96b9	feat(telegram): restore rich messages opt-out Salvages PR #45840's client-compatibility opt-out while keeping rich messages enabled by default via telegram.extra.rich_messages: true.	2026-06-13 21:45:49 -07:00
ITheEqualizer	57c2a55be4	fix(telegram): harden rich message fallback handling Carry forward focused follow-ups from PR #45741: treat PTB's raw Bot API 10.1 response shapes safely, recognize real missing-endpoint errors, preserve link preview settings on rich sends, and lock the rich limit to Telegram's character-based cap.	2026-06-13 14:34:53 -07:00
ITheEqualizer	7c0605bf22	fix(telegram): preserve rich formatting on stream final	2026-06-13 13:44:45 -07:00
Que0x	fc46354580	fix(security): fail closed when an own-policy gateway adapter has no allowlist Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default dm_policy/group_policy to "open", which forwards every sender. The gateway's adapter-trust shortcut in _is_user_authorized blanket-trusted those platforms when no env allowlist was set, so an operator who enabled one with only credentials authorized the entire external network -- the fail-open SECURITY.md section 2.6 forbids ("an allowlist is required for every enabled network-exposed adapter"). Trust the adapter only when its effective policy for the chat type is an actual "allowlist" restriction (the case #34515 was protecting). "open"/"pairing"/anything else falls through to default-deny, where {PLATFORM}_ALLOW_ALL_USERS / GATEWAY_ALLOW_ALL_USERS and the pairing flow remain the explicit opt-ins.	2026-06-13 07:18:54 -07:00
Clayton Chew	f82cb48120	fix(platform): add .xls, .doc, .ppt to SUPPORTED_DOCUMENT_TYPES Old Office formats (.xls, .doc, .ppt) were missing from the SUPPORTED_DOCUMENT_TYPES dict in gateway/platforms/base.py while their newer counterparts (.xlsx, .docx, .pptx) were included. Sending an .xls file via Telegram triggers 'Unsupported document type' and the file is silently dropped instead of being cached and forwarded to the agent. Add the three legacy MIME types so these files are handled the same way as their modern equivalents.	2026-06-13 07:18:37 -07:00
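
Note on 04d4471d79 (SMTP_SSL for port 465): the protocol selection that commit describes can be sketched roughly as below. This is a minimal illustration using only the standard library, not the repository's actual _connect_smtp() helper; the names smtp_mode_for_port and connect_smtp are hypothetical, and the IPv4 fallback mentioned in the commit is deliberately not shown.

```python
import smtplib
import ssl


def smtp_mode_for_port(port: int) -> str:
    """Pick the TLS mode for an SMTP port (hypothetical helper).

    Port 465 expects implicit TLS from the first byte; other ports
    (typically 587) start in plaintext and upgrade via STARTTLS.
    """
    return "implicit_tls" if port == 465 else "starttls"


def connect_smtp(host: str, port: int, timeout: float = 30.0) -> smtplib.SMTP:
    """Open an SMTP connection with the mode chosen above.

    Assumption: a default-verifying TLS context is acceptable for the
    provider. The caller is responsible for login() and quit().
    """
    context = ssl.create_default_context()
    if smtp_mode_for_port(port) == "implicit_tls":
        # SMTP_SSL wraps the socket in TLS before any SMTP traffic.
        return smtplib.SMTP_SSL(host, port, timeout=timeout, context=context)
    # Plain connection first, then upgrade in-band.
    server = smtplib.SMTP(host, port, timeout=timeout)
    server.starttls(context=context)
    return server
```

Keeping the mode decision in a pure function means it can be checked without a network connection, while connect_smtp stays a thin wrapper around smtplib.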


1107 commits
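
Note on fc46354580 (fail closed without an allowlist): the rule that commit describes, trusting an own-policy adapter only when its effective policy is an actual "allowlist" and otherwise falling through to default-deny, might look something like the sketch below. The function name and parameters are assumptions made for illustration; this is not the gateway's real _is_user_authorized.

```python
from typing import Optional


def is_sender_authorized(
    effective_policy: Optional[str],
    allow_all_users: bool = False,
    sender_is_paired: bool = False,
) -> bool:
    """Decide whether to forward a sender (hypothetical sketch).

    effective_policy: the adapter's dm_policy or group_policy for this
        chat type, e.g. "allowlist", "open", "pairing", or None.
    allow_all_users: stands in for an explicit operator opt-in such as
        {PLATFORM}_ALLOW_ALL_USERS or GATEWAY_ALLOW_ALL_USERS.
    sender_is_paired: stands in for a completed pairing flow.
    """
    # Only a real allowlist restriction means the adapter has already
    # filtered senders, so the gateway can trust what it forwards.
    if effective_policy == "allowlist":
        return True
    # "open", "pairing", unknown, or missing policy: default-deny
    # unless the operator opted in explicitly.
    return allow_all_users or sender_is_paired
```

For example, is_sender_authorized("open") is False, so an adapter enabled with credentials alone forwards nobody until an allowlist or an explicit opt-in is configured.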