Task 2.0b: the concrete shared-bearer-secret auth provider, the FIRST consumer
of the generic token-auth capability (Task 2.0a). Implements decisions.md Q-A.
plugins/dashboard_auth/drain/ (bundled, discovered like dashboard_auth/basic):
- DrainSecretProvider: non-interactive provider, supports_token=True. Verifies
an inbound Authorization bearer token against a per-agent shared secret with
hmac.compare_digest (constant-time, no timing oracle) and, on a match,
vouches for the caller as the "drain-control" principal scoped to "drain".
The five interactive ABC methods raise NotImplementedError; verify_session
returns None (stacks harmlessly in the cookie-verify loop).
- assess_secret_strength(): fail-closed entropy gate. Rejects secrets shorter
than 43 url-safe-b64 chars (~256 bits), with < 16 distinct characters, or
below 128 bits Shannon entropy — so a weak/structured/repeated secret can
never be silently accepted. Enforced both at register() (friendly skip
reason) and in __init__ (raises — defence in depth).
- register(ctx): no-op + skip reason when HERMES_DASHBOARD_DRAIN_SECRET is
unset; rejects a weak secret fail-closed (drain endpoint stays gated). On a
strong secret, registers the provider AND opts /api/gateway/drain into the
generic token-auth seam via register_token_route().
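
A minimal sketch of the gate and the constant-time check described above. The thresholds (43 chars, 16 distinct characters, 128 bits) come from this message; the helper `shannon_bits`, the dict-shaped principal, and the exact rejection strings are illustrative assumptions, not the plugin's real code.

```python
import hmac
import math
from collections import Counter
from typing import Optional

MIN_SECRET_CHARS = 43       # ~256 bits of url-safe base64
MIN_DISTINCT_CHARS = 16
MIN_ENTROPY_BITS = 128.0


def shannon_bits(secret: str) -> float:
    """Empirical Shannon entropy of the whole string, in bits (hypothetical helper)."""
    if not secret:
        return 0.0
    n = len(secret)
    per_char = -sum((c / n) * math.log2(c / n) for c in Counter(secret).values())
    return per_char * n


def assess_secret_strength(secret: str, min_chars: int = MIN_SECRET_CHARS) -> Optional[str]:
    """Return None when the secret is acceptable, else a rejection reason."""
    if len(secret) < min_chars:
        return f"secret shorter than {min_chars} characters"
    if len(set(secret)) < MIN_DISTINCT_CHARS:
        return f"secret has fewer than {MIN_DISTINCT_CHARS} distinct characters"
    if shannon_bits(secret) < MIN_ENTROPY_BITS:
        return f"secret below {MIN_ENTROPY_BITS:.0f} bits of Shannon entropy"
    return None


def verify_token(presented: str, secret: str, scope: str = "drain") -> Optional[dict]:
    """Constant-time comparison; a match vouches for the scoped principal."""
    if not presented:
        return None
    # Encode to bytes so compare_digest also accepts non-ASCII input.
    if hmac.compare_digest(presented.encode(), secret.encode()):
        return {"principal": "drain-control", "scope": scope}
    return None
```

A strong secret could be generated with `secrets.token_urlsafe(32)`, which yields 43 url-safe characters.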
Config: the secret is a CREDENTIAL → carried via HERMES_DASHBOARD_DRAIN_SECRET
(per-agent, provisioned by NAS at deploy). Behavioural knobs only
(dashboard.drain_auth.{scope,min_secret_chars}) live in config.yaml — added to
DEFAULT_CONFIG with the .env-is-for-secrets rationale documented inline.
Tests: tests/plugins/dashboard_auth/test_drain_provider.py — entropy gate
(strong pass; empty/short/repeated/few-distinct/custom-min reject), verify_token
(match → scoped principal, wrong/empty → None, custom scope), protocol
compliance, interactive-methods-raise, and register() (skip-no-secret,
fail-closed-weak-secret, strong-env-secret registers + route opt-in, config
scope + min_secret_chars). 21 new tests; drain + token-auth suites 44 passed.
Verified the plugin is discovered as dashboard_auth/drain alongside basic/nous.
Intentionally deferred:
- The begin/cancel-drain endpoint handler itself — Task 2.1.
- The dashboard→gateway control channel — Task 2.2.
Build status: dashboard-auth + drain-plugin suites green.
After a prolonged outage the in-process network-error ladder escalates to
fatal and GatewayRunner._platform_reconnect_watcher rebuilds a fresh adapter
that reconnects through the bootstrap path. That path called
start_polling(drop_pending_updates=True), discarding every update Telegram
queued during the outage — all messages sent while the bot was down were
silently lost. The in-process ladder and 409-conflict handler already passed
drop_pending_updates=False; only bootstrap did not distinguish a cold first
boot from a reconnect.
Thread an is_reconnect signal from the watcher through
_connect_adapter_with_timeout into adapter.connect(). The base
BasePlatformAdapter.connect() gains a keyword-only is_reconnect=False so every
adapter inherits a tolerant signature (no per-platform breakage when the
runner forwards the kwarg). Telegram translates is_reconnect into
drop_pending_updates=not is_reconnect on both the polling and webhook bootstrap
calls. Cold boot still drops the stale queue; a watcher reconnect preserves it.
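
The signal threading can be sketched as follows. The class `TelegramLikeAdapter` and its `_start_polling` helper are stand-ins invented for this illustration; only the keyword-only `is_reconnect=False` default and the `drop_pending_updates=not is_reconnect` translation are taken from the description above.

```python
import asyncio
from typing import Optional


class BasePlatformAdapter:
    async def connect(self, *, is_reconnect: bool = False) -> bool:
        # Tolerant default: adapters that ignore the flag still accept it.
        return True


class TelegramLikeAdapter(BasePlatformAdapter):
    def __init__(self) -> None:
        self.last_drop_pending: Optional[bool] = None

    async def _start_polling(self, *, drop_pending_updates: bool) -> None:
        # Stand-in for the real bootstrap call; records what it was asked to do.
        self.last_drop_pending = drop_pending_updates

    async def connect(self, *, is_reconnect: bool = False) -> bool:
        # Cold boot drops the stale queue; a watcher reconnect preserves it.
        await self._start_polling(drop_pending_updates=not is_reconnect)
        return True
```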
Fixes #46621.
Co-authored-by: annguyenNous <annguyen@nousresearch.com>
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
Co-authored-by: Kewe63 <Kewe63@users.noreply.github.com>
The email adapter authorized senders entirely off the From: header, which is
attacker-controlled and unauthenticated by IMAP. An attacker could forge
From: an-allowlisted-address and pass both the adapter's EMAIL_ALLOWED_USERS
pre-filter and the gateway's allowlist authz (both key on the same spoofable
sender_addr), getting unauthorized commands executed by the agent.
Verify the From: domain against the trusted Authentication-Results header the
receiving mail server stamps (SPF/DKIM/DMARC) before trusting it for
authorization. Enforced only when an allowlist is in effect and allow-all is
off — fail-closed. Operators whose server does not stamp the header can opt
out via platforms.email.require_authenticated_sender: false (or
EMAIL_TRUST_FROM_HEADER=true).
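
A simplified sketch of the check, under stated assumptions: it only accepts a `dmarc=pass` result whose `header.from` equals the From: domain, and only from a header stamped by the operator's own server (`trusted_authserv_id`, a hypothetical parameter). The function name is illustrative, and the real adapter may also weigh SPF and DKIM results. It also assumes the receiving server strips forged Authentication-Results headers carrying its own identifier.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def sender_is_authenticated(raw_message: str, trusted_authserv_id: str) -> bool:
    """Fail closed unless a trusted header reports dmarc=pass for the From: domain."""
    msg = message_from_string(raw_message)
    _, addr = parseaddr(msg.get("From", ""))
    if "@" not in addr:
        return False
    from_domain = addr.rpartition("@")[2].lower()

    for header in msg.get_all("Authentication-Results", []):
        value = " ".join(str(header).split())  # unfold continuation lines
        authserv_id, _, results = value.partition(";")
        if authserv_id.strip().lower() != trusted_authserv_id.lower():
            continue  # stamped by someone else: do not trust it
        match = re.search(
            r"\bdmarc=pass\b[^;]*\bheader\.from=([^\s;]+)", results, re.IGNORECASE
        )
        if match and match.group(1).lower() == from_domain:
            return True
    return False
```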
CI shard test_telegram_conflict.py timed out (140s) because the new
_polling_heartbeat_loop, started by connect(), busy-spun under those
tests: they monkeypatch asyncio.sleep to instant and pass a bot double
with no get_me(), so the probe raised AttributeError (swallowed) and the
loop re-entered immediately with no real pacing, starving the event loop.
Guard the loop to return when bot.get_me is not callable — a real PTB Bot
always exposes it, so this only triggers on a torn-down app or a test
double, where there is nothing to probe. Also cancel the heartbeat task in
the conflict tests that call connect() without disconnect(), matching the
production disconnect() teardown.
Verified: test_telegram_conflict.py now runs in ~4.5s; the 22
heartbeat/reconnect tests still pass; E2E confirms a hanging get_me still
fires the reconnect ladder while a missing get_me exits without spinning.
When a Telegram long-poll TCP socket enters CLOSE-WAIT (remote sent FIN
but httpx hasn't noticed), epoll still reports it readable so no
exception is raised. PTB's error_callback never fires, the reconnect
ladder never engages, and the gateway silently stops receiving messages
while the process stays alive — until a manual systemctl restart.
The existing recovery only covers two cases: error_callback-driven
reconnects (which require an exception PTB never gets) and a one-shot
_verify_polling_after_reconnect probe (which runs only right after an
explicit reconnect). A socket that wedges during steady-state operation
is never detected.
Add _polling_heartbeat_loop: a background asyncio.Task started in
connect() (polling mode only) that probes get_me() every 90s on the
general request pool (not the getUpdates pool, so healthy long-polls are
never interrupted). On asyncio.TimeoutError/OSError it hands off to the
existing _handle_polling_network_error ladder; other errors are
swallowed. disconnect() cancels and awaits the task. Worst-case
detection window ~105s.
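
One way to read the loop, as a standalone sketch rather than the adapter's actual method. The 90s interval is from the text; the 15s probe timeout is an assumption chosen so that 90s + 15s matches the stated ~105s window, and the `get_me` callable guard reflects the follow-up fix for test doubles.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def polling_heartbeat_loop(
    bot: Any,
    on_network_error: Callable[[BaseException], Awaitable[None]],
    *,
    interval: float = 90.0,
    probe_timeout: float = 15.0,  # assumed value, not confirmed by the source
) -> None:
    while True:
        get_me = getattr(bot, "get_me", None)
        if not callable(get_me):
            return  # torn-down app or test double: nothing to probe
        await asyncio.sleep(interval)
        try:
            await asyncio.wait_for(get_me(), timeout=probe_timeout)
        except (asyncio.TimeoutError, OSError) as exc:
            # Wedged or broken socket: hand off to the reconnect ladder.
            await on_network_error(exc)
        except Exception:
            # Other probe errors are swallowed; cancellation is a
            # BaseException and still propagates to the caller.
            pass
```

The task would be created in connect() and cancelled, then awaited, in disconnect().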
Complementary to #51541 (general-pool keepalive limits / fd leak) — that
recycles idle pooled connections; this detects a wedged active read.
Fixes #48495
Co-authored-by: agt-user <267614622+agt-user@users.noreply.github.com>
Pipe-only markdown tables now use sendRichMessage even when rich_messages
is off, and resumed DM-topic sends route via direct_messages_topic_id
without requiring a reply anchor. Rich finalize edits forward topic kwargs.
atomic_yaml_write used default yaml.dump which emits indentless
sequences (list items at column 0), while atomic_roundtrip_yaml_update
(ruamel.yaml) emits 2-space-indented sequences. Cross-path writes to
the same config.yaml toggled indentation on every save, eventually
producing a mixed-indent file that js-yaml rejects with 'bad indentation
of a mapping entry', silently dropping custom_providers and breaking
model switching.
Add IndentDumper SafeDumper subclass that forces indentless=False,
route atomic_yaml_write through it. Route tui_gateway._save_cfg and
the Telegram adapter's config writer through atomic_yaml_write so all
paths emit the same 2-indent layout.
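
The dumper follows a well-known PyYAML recipe; a sketch of it is shown here. The wrapper `dump_config` and its keyword choices are illustrative assumptions, not the signature of atomic_yaml_write.

```python
import yaml


class IndentDumper(yaml.SafeDumper):
    """SafeDumper that always indents block sequences under their parent key."""

    def increase_indent(self, flow=False, indentless=False):
        # Ignore the emitter's request for an indentless sequence.
        return super().increase_indent(flow, False)


def dump_config(data: dict) -> str:
    # Hypothetical helper: shows the dumper in use with readable UTF-8 output.
    return yaml.dump(
        data,
        Dumper=IndentDumper,
        default_flow_style=False,
        sort_keys=False,
        allow_unicode=True,
    )
```

With the default dumper the same data would be written with list items at column 0, which is the layout that alternated with ruamel's output.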
Salvaged from #32034 by @xxxigm. Adapted to current main which already
has allow_unicode=True (from #51356) but was missing IndentDumper.
Closes #31999
The quality-first default (OpenAI image via OpenRouter) is slow, and a full
hatch fans out ~8 rows with up to 3 retries each (300s/call) across 2 parallel
waves, so the absolute backend worst case is ~30 min. The old ceilings fired
mid-run:
- per-image HTTP call: 180s -> 300s (a single cold row can exceed 3 min)
- drafts RPC: 240s -> 420s (single wave, no retries — 7 min is ample)
- hatch RPC: 420s -> 1hr (sits above the ~30 min backend worst case)
The hatch ceiling is intentionally well above the realistic max so the frontend
never throws "request timed out" before the backend has exhausted its own
retries. The background-resumable notification path remains the real UX safety
net — the user can close the modal and get pinged on completion.
OpenRouter/Nous image gen now runs a quality-first model chain by default:
attempt the highest-fidelity OpenAI image model first, then fall back to
Gemini 3 Pro Image when it's access-gated/unavailable/times out. An explicit
OPENROUTER_IMAGE_MODEL / config model override pins one model with no fallback.
Atlas validation rejects malformed model output instead of shipping it: adds a
per-state collapse guard (a single sliver/fragment row no longer passes because
other rows are healthy), on top of the existing postage-stamp + multi-pose
checks.
Desktop: pet-gen native notifications are now "global" (not tied to a chat
session), so a background generation started from the command center fires an
OS notification when the user is away even with no active session. Adds a
neutral "This can take up to 5 minutes." banner on step 1, and lets the
provider picker auto-size.
Tests updated/added for the OpenRouter fallback chain, the collapse guard, and
the global notification path.
When Telegram's sendRichMessage returns a FloodWait/RetryAfter error,
_try_send_rich() now extracts the server-provided retry_after value and
propagates it through SendResult.retry_after. The base _send_with_retry()
layer honors this value instead of using its default short exponential
backoff (~2s, ~4s), preventing the retry budget from being exhausted
against a server that demands a 25-37s wait.
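
A rough sketch of how a retry layer can prefer the server-provided wait. `SendResult.retry_after` is named in the text; the function signature, the injectable `sleep`, and the retry count are assumptions made for the example.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class SendResult:
    success: bool
    error: Optional[str] = None
    retry_after: Optional[float] = None  # seconds requested by the server


async def send_with_retry(
    send_once: Callable[[], Awaitable[SendResult]],
    *,
    max_retries: int = 2,
    base_delay: float = 2.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> SendResult:
    result = await send_once()
    for attempt in range(max_retries):
        if result.success:
            break
        if result.retry_after is not None:
            delay = result.retry_after            # honor the server's demand
        else:
            delay = base_delay * (2 ** attempt)   # default backoff: ~2s, ~4s
        await sleep(delay)
        result = await send_once()
    return result
```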
Salvaged from #46774 by @liuhao1024. Telegram adapter path moved from
gateway/platforms/telegram.py to plugins/platforms/telegram/adapter.py
since the original PR.
Closes #46762
A single ddgs (DuckDuckGo) search could hang indefinitely and block the
shared agent loop — and therefore every platform (CLI, Telegram, Matrix...).
The DDGS constructor's timeout only bounds individual HTTP requests; ddgs's
multi-engine retry loop has no overall cap, so a slow/rate-limited response
could spin for 20+ minutes with no output and no error.
Run the synchronous ddgs call in a single-worker ThreadPoolExecutor and cap
it with future.result(timeout=_SEARCH_TIMEOUT_SECS) (30s). On timeout, return a
clear failure ("DuckDuckGo search timed out ... try a different provider")
instead of blocking; the pool is shut down with cancel_futures so a hung
worker is never awaited.
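
The capping pattern, sketched as a generic helper. `run_with_timeout` and the result dict shape are hypothetical; `_SEARCH_TIMEOUT_SECS` and the single-worker pool with `cancel_futures` follow the description above. Note that `cancel_futures` requires Python 3.9+, and that a hung worker thread cannot be killed, only abandoned.

```python
import concurrent.futures
import time

_SEARCH_TIMEOUT_SECS = 30


def run_with_timeout(fn, *args, timeout: float = _SEARCH_TIMEOUT_SECS) -> dict:
    """Run a blocking call in a worker thread and stop waiting after `timeout`."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return {"success": True, "results": future.result(timeout=timeout)}
    except concurrent.futures.TimeoutError:
        return {
            "success": False,
            "error": f"search timed out after {timeout}s; try a different provider",
        }
    finally:
        # Do not wait for a hung worker; drop anything still queued.
        pool.shutdown(wait=False, cancel_futures=True)
```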
Salvaged from #37422 by @uzunkuyruk (authorship preserved). Re-applied on
current main (the PR's provider.py base had diverged). Added a load-bearing
timeout regression test (the original PR only updated the fake's constructor
and had no timeout-behavior test) — mutation-verified to fail without the cap.
Closes #36776.
Reference-grounded image provider over the OpenRouter-compatible
chat-completions image protocol (Gemini Flash Image et al.). Nous Portal
proxies OpenRouter, so one provider serves both — giving pet generation a
reference-capable backend beyond OpenAI gpt-image.
Salvage of PR #48927 by @ehz0ah, which consolidates OpenViking recall
work from #41706 (@huangxun375-stack), #33260, #49975, and #32444.
Replaces stale background post-turn prefetch warming with synchronous
current-query recall. The old queue_prefetch warmed the PREVIOUS user
message while turn-start recall consumed the CURRENT one, so injected
context was always about the wrong topic.
Changes:
- prefetch() now does session-aware /api/v1/search/search with the
current query, falls back to /api/v1/search/find on failure
- Contract-safe payloads: limit, score_threshold, context_type,
session_id — no top_k, no search-body mode, no target_uri
- L2 content reads for items with level=2 or empty abstracts, capped
at full_read_limit (default 2)
- Local ranking (score + query-token overlap + leaf boost), dedup,
score threshold, and injected-char budget
- queue_prefetch() is now a no-op (background warming removed)
- Additive batched viking_read: uris param accepts up to 3 URIs
- Per-request timeout support on _VikingClient.get/post/delete
- Removes stale _prefetch_result/_prefetch_thread/_prefetch_generation
state and _invalidate_prefetch_state()
- Strengthened system_prompt_block guidance
Salvage follow-up fixes:
- Expose all 8 recall config knobs in get_config_schema() (PR #48927
had removed them; #41706 correctly exposed them). Env vars remain
as an internal mechanism but are now visible in the setup wizard.
- Lower default timeout 8s→4s, request_timeout 6s→3s, full_read_limit
3→2 to reduce per-turn blocking latency.
Co-authored-by: Hao Zhe <haozhe4547@gmail.com>
Co-authored-by: Eurekaxun <eurekaxun@163.com>
* docs: stop recommending pip install hermes-agent; point to install script
The install script is the only supported install path (it provisions a
managed, isolated uv environment). Replace bare `pip install hermes-agent`
primary-install recommendations with the curl install script, and rewrite
optional-extra snippets (`pip install "hermes-agent[X]"`) to the managed-env
form `cd ~/.hermes/hermes-agent && uv pip install -e ".[X]"` that matches the
installer and the English quickstart.
Covers English docs + zh-Hans mirrors, the achievements plugin README, and
realigns the zh-Hans quickstart to the English Desktop-installer-first layout
(dropping its stale "Method A — pip (simplest)" section).
* docs: drop pip as a supported install/update method
Removes the 'pip installs' supported-method sections from updating.md and
cli-commands.md (EN + zh-Hans): the curl install script is the only supported
way to install/update the Hermes CLI. The _cmd_update_pip pip/pipx branches
remain in code as an undocumented safety net for users who already have such an
install, but the docs no longer advertise pip as a path.
Also normalizes a bare `pip install -e '.[acp]'` to the managed-env form.
Leaves python-library.md untouched: importing AIAgent as a library dependency
into your own project is a distinct use case where pip is correct.
Most Matrix clients auto-set a room name when creating a DM (e.g.
"Alice & Bot" from participant display names), so the old
`is_direct and not has_explicit_name` heuristic classified virtually
all client-created DM rooms as "room", forcing require_mention gating
in legitimate one-on-one DMs.
member_count is now the primary DM signal: <=2 members means the room
is necessarily a 1:1 conversation, regardless of m.direct or an explicit
name. A room that grew to 3+ members but is still listed in a stale m.direct is
classified as a room (conflict flag set). Falls back to the
m.direct + name heuristic when the count is unavailable.
Also hardens _get_room_member_count with a joined_members API fallback
when the cache-backed state_store is empty.
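
The decision order can be summarized in a small function. `classify_room` and its tuple return are illustrative assumptions; the branch logic mirrors the rules stated above.

```python
from typing import Optional, Tuple


def classify_room(
    member_count: Optional[int],
    is_direct: bool,
    has_explicit_name: bool,
) -> Tuple[str, bool]:
    """Return (kind, conflict) where kind is "dm" or "room"."""
    if member_count is not None:
        if member_count <= 2:
            # Necessarily 1:1, whatever m.direct or the room name says.
            return "dm", False
        # 3+ members: a room; flag a conflict if m.direct is stale.
        return "room", is_direct
    # Count unavailable: fall back to the old m.direct + name heuristic.
    if is_direct and not has_explicit_name:
        return "dm", False
    return "room", False
```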
Salvaged from #48554 by @justemu onto the current plugin adapter path
(gateway/platforms/matrix.py -> plugins/platforms/matrix/adapter.py).
Fixes #48551
Component button interactions (approve/deny, slash confirm, model
picker, clarify) were not checking the pairing store for authorization.
Users approved via `hermes pairing approve` could send messages and use
slash commands (which go through the gateway authz_mixin), but button
clicks were rejected because `_component_check_auth` only checked
env-var allowlists (DISCORD_ALLOWED_USERS, GATEWAY_ALLOW_ALL_USERS,
etc.) and not the pairing store.
This was a regression from commit f6f363662 which intentionally made
component auth fail-closed when no allowlist is set (security fix for
GHSA-mc26-p6fw-7pp6), but did not account for pairing-based auth.
Fix: add a `PairingStore.is_approved("discord", uid)` check to
`_component_check_auth`, mirroring `authz_mixin._check_authorization`.
The pairing store check runs after all allowlist checks, preserving the
fail-closed behavior for non-paired, non-allowed users.
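
A sketch of the check order only, with hypothetical parameter names; the real `_component_check_auth` reads env-var allowlists and the pairing store directly. The `is_approved` callable stands in for `PairingStore.is_approved`.

```python
from typing import Callable, Iterable


def component_check_auth(
    uid: str,
    *,
    allow_all: bool,
    allowed_users: Iterable[str],
    is_approved: Callable[[str, str], bool],
) -> bool:
    if allow_all:
        return True
    if uid in set(allowed_users):
        return True
    # Pairing check runs last; non-paired, non-allowed users stay rejected.
    return bool(is_approved("discord", uid))
```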
Fixes #50627
The 30-slot default could not fit Hermes's ~50 built-in commands, so
every skill command (and 20 built-ins) was silently dropped from the
Telegram `/` menu by default; they only worked when typed manually.
Raising the default to 60 keeps all built-ins plus common skill commands
visible out of the box while staying under Telegram's ~4KB payload limit.
Users can still tune it via platforms.telegram.extra.command_menu.
Adds a configurable Telegram BotCommand menu cap and priority list via
platforms.telegram.extra.command_menu (max_commands clamped 1..100;
priority_mode prepend|append|replace). Default cap stays 30; hidden
commands remain invokable when typed and /commands lists the full set.
Salvaged from PR #42021. Cherry-picked onto current main; the original
edited gateway/platforms/telegram.py, now relocated to
plugins/platforms/telegram/adapter.py.
atomic_yaml_write (and two sibling config writers) called yaml.dump
without allow_unicode=True. The default personalities shipped in cli.py
contain emoji/kaomoji, so PyYAML escaped astral-plane chars as 8-digit
\UXXXXXXXX sequences inside multi-line double-quoted strings wrapped
with \ line-continuations. Stricter/non-PyYAML parsers, editors, and
hand-edits break that structure into unclosed quotes, failing the whole
config parse -> silent fallback to defaults -> custom_providers lost.
Add allow_unicode=True to the canonical writer plus tui_gateway/server.py
and the telegram adapter's atomic config write so config is written as
readable UTF-8 with no escape/fold artifacts.
Fixes #51356
spectrum-ts routes stream telemetry through @photon-ai/otel's createLogger,
which sends severity>=ERROR to console.error and WARN/INFO to console.log.
The two lines the health monitor keys off land on different channels:
log.error("stream persistently failing") -> console.error (caught), but
log.warn("stream interrupted; reconnecting") -> console.log (was missed).
The original interception patched console.error only, so the recovering->
degraded escalation counter never saw the interrupt bursts that are the
primary silent-inbound symptom. Verified live against spectrum-ts 3.1.0 +
@photon-ai/otel: 3 real log.warn('stream interrupted') calls now escalate
to degraded -> process.exit(75) -> adapter reconnect.
Adds a shared classifyStreamLog() fed by both console.error and console.log,
plus a regression test asserting both channels are intercepted.
When _auto_create_thread() creates a thread from a user message via
message.create_thread(), Discord fires a second MESSAGE_CREATE event
for the 'thread starter message'. That starter message carries
message.id == thread.id and may arrive with type=default instead of
type=21 (thread_starter_message), so the existing type filter in
on_message does not catch it — triggering a second call into
_handle_message and thus a second agent run and response.
Fix: after _auto_create_thread succeeds and returns a thread, pre-seed
the dedup cache with str(thread.id) via self._dedup.is_duplicate().
The dedup cache is the same TTL-based MessageDeduplicator that already
guards against Discord RESUME event replays. Calling is_duplicate()
marks the ID as seen; when the duplicate thread-starter MESSAGE_CREATE
arrives, on_message's guard returns True and the event is dropped.
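
To make the mark-on-first-call behaviour concrete, here is a minimal TTL deduplicator sketch. It is not the gateway's MessageDeduplicator; the TTL default and the injectable clock are assumptions for the example.

```python
import time
from typing import Callable, Dict


class MessageDeduplicator:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, message_id: str) -> bool:
        """Return True if the ID was seen within the TTL; otherwise mark it seen."""
        now = self._clock()
        # Drop expired entries so the cache stays bounded.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self._ttl}
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        return False
```

Pre-seeding is then a single call such as `dedup.is_duplicate(str(thread.id))` whose return value is ignored.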
This is a minimal, targeted fix:
- No new state: reuses the existing _dedup instance
- No timing/race: the pre-seed happens synchronously inside the async
_handle_message, before the thread-starter event can be dispatched
- Scoped: only fires when auto-threading is enabled AND thread creation
succeeds (thread object is not None)
Also adds tests in tests/gateway/test_discord_double_dispatch.py
covering the pre-seed behaviour, failure modes (thread creation fails,
auto-thread disabled), and dedup cache integrity.
Closes #51057
PTB's HTTPXRequest builds its httpx.AsyncClient with
`limits = httpx.Limits(max_connections=connection_pool_size)` and no
keepalive tuning, so httpx's default keepalive_expiry=5.0 applies. Behind
an HTTP proxy (Cloudflare Warp etc.) a peer-initiated FIN can sit in
CLOSE_WAIT longer than that, leaking fds in the general request pool
(_request[1], which routes bot.send_message/set_my_commands) — the pool
_drain_polling_connections never resets. Telegram was the lone holdout
adapter not using the shared #18451 CLOSE_WAIT helper.
Wire gateway.platforms._http_client_limits.platform_httpx_limits() into
the httpx client across ALL THREE request-construction branches —
fallback-transport, proxy, and plain — via httpx_kwargs["limits"], which
PTB spreads last into its client kwargs so our tuned limits win. PTB's
connection_pool_size (max_connections) is preserved; only keepalive
behaviour is tightened (max_keepalive_connections + keepalive_expiry<5.0).
The fix is macOS-import-safe: no Linux-only socket TCP_KEEPIDLE/INTVL/CNT
constants at module scope (unlike the broken candidate which crashed on
import on the reporter's OS), and it patches the actual proxy path the
repro hits rather than TelegramFallbackTransport, which the proxy repro
never instantiates.
Adds a mutation-survivable behavior-contract test asserting every
HTTPXRequest built by connect() receives httpx_kwargs["limits"] with
keepalive_expiry < httpx's 5.0 default, across both the proxy and plain
branches. Reverting the limits wiring fails the test.
Co-authored-by: indigokarasu <mx.indigo.karasu@gmail.com>
Map Hermes xhigh→max to unlock DeepSeek V4's 'Max thinking' tier
through Ollama Cloud's OpenAI-compatible /v1/chat/completions endpoint.
low/medium/high pass through unchanged; disabled/none suppress
reasoning entirely.
Empirically confirmed: reasoning_effort:max produces ~2.5× more
thinking tokens than high on deepseek-v4-pro:cloud (1576 vs 642).
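
The mapping is small enough to show in full. This is a sketch: the function name is hypothetical, and returning None to mean "omit reasoning from the request" (including for unknown levels) is an assumption about how suppression is expressed.

```python
from typing import Optional


def map_reasoning_effort(effort: Optional[str]) -> Optional[str]:
    """Translate a Hermes effort level to the value sent as reasoning_effort."""
    if effort is None:
        return None
    level = effort.strip().lower()
    if level in {"disabled", "none"}:
        return None          # suppress reasoning entirely
    if level == "xhigh":
        return "max"         # unlocks the 'Max thinking' tier
    if level in {"low", "medium", "high"}:
        return level         # pass through unchanged
    return None              # unknown level: assumed to be treated as unset
```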
Follow-up to the salvaged voice-clip fix: the rerouted video/mp4 branch
used {".m4a": "audio/mp4"}.get(ext, "audio/mp4"), whose sole key's value
equals the default, so it always returned "audio/mp4" regardless of the
cached extension (dead lookup + a throwaway dict per inbound voice clip).
Replace it with a module-level _SLACK_EXT_TO_AUDIO_MIME map so the reported
media_type matches the bytes we cached (e.g. a clip cached as .wav now
reports audio/wav instead of audio/mp4). STT routing already keys on the
audio/ prefix + cached filename extension, so behavior is unchanged; this
just removes the dead construct and keeps the reported mimetype coherent.
Slack in-app voice clips ("record a clip") arrive as MP4/AAC containers
(mimetype audio/mp4, filename audio_message*.mp4), and Slack sometimes
labels them video/mp4. The inbound audio handler derived the cache
extension from the mimetype and fell back to ".ogg" for anything not in
{.ogg,.mp3,.wav,.webm,.m4a} — so audio/mp4 voice messages were cached as
.ogg. OpenAI STT (whisper-1, gpt-4o-transcribe) sniffs the container from
the FILENAME extension, so it received MP4 bytes named .ogg and rejected
them. WhatsApp .ogg and uploaded .m4a worked only because their extension
happened to match the bytes.
Fix:
- _resolve_slack_audio_ext(): pick the cache extension from the real
filename first, then a mimetype map (audio/mp4 -> .m4a), defaulting to
.m4a — never the bogus .ogg fallback. Mirrors the video branch and the
audio map already in gateway/platforms/bluebubbles.py.
- _is_slack_voice_clip(): detect audio-only clips mislabeled video/mp4
via the slack_audio subtype / audio_message* filename, and route them
through the audio path (cached as audio, reported as audio/*) so they
reach STT instead of video understanding. Genuine videos (and
slack_video screen recordings) are left on the video path.
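
The filename-first resolution order might look like the sketch below. The extension set and the mimetype map entries beyond `audio/mp4 -> .m4a` are assumptions, as is the function name without its leading underscore.

```python
import os
from typing import Optional

# Assumed set of extensions accepted straight from the filename.
_AUDIO_EXTS = {".ogg", ".mp3", ".wav", ".webm", ".m4a", ".mp4"}

# Assumed mimetype map; only audio/mp4 -> .m4a is stated in the text.
_MIME_TO_EXT = {
    "audio/mp4": ".m4a",
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
    "audio/wav": ".wav",
    "audio/webm": ".webm",
}


def resolve_slack_audio_ext(filename: Optional[str], mimetype: Optional[str]) -> str:
    """Pick a cache extension that matches the bytes: filename, then mimetype, then .m4a."""
    ext = os.path.splitext(filename or "")[1].lower()
    if ext in _AUDIO_EXTS:
        return ext
    mime = (mimetype or "").split(";")[0].strip().lower()
    return _MIME_TO_EXT.get(mime, ".m4a")
```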
Verified end-to-end against a real audio-only MP4: old path cached it as
.ogg (ffprobe shows MP4 bytes -> container mismatch -> OpenAI rejects);
new path caches it as .mp4 (extension matches bytes -> accepted).
Adds inbound-audio tests (previously none): helper unit tests plus
_handle_slack_message E2E coverage for audio/mp4, video/mp4-mislabeled
voice clips, and a real video staying on the video path. Confirmed the
two voice-message tests fail without the fix (mutation check).
* feat(memory): OAuth token storage and refresh for the Honcho provider
* feat(memory): refresh the Honcho OAuth token in the client and session
* feat(memory): zero-CLI loopback OAuth authorization flow
* feat(memory): generic memory-provider OAuth connect endpoints
* feat(desktop): memory-provider OAuth connect link
* feat(memory): CLI OAuth sign-in with source-tagged authorize links
* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link
* fix(memory): profile-scope the memory-provider OAuth endpoints
* refactor(desktop): generic memory-provider OAuth client functions
* docs(memory): trim OAuth module docstrings to the invariants
* docs(memory): document OAuth connect as an optional auth method
* fix(memory): send home-relative display path to consent, not the absolute path
* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read
* fix(memory): log OAuth refresh failures at warning, not debug
* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken
* test(memory): cover the desktop Connect launcher, status, and provider dispatch
* fix(desktop): keep the memory-provider dropdown one size regardless of connect state
* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched
* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router
* refactor(desktop): import MemoryConnect directly, drop the single-export barrel
* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard
* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky
* test(honcho): isolate auth-method prompt from deployment-shape wizard tests
main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.
* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)
README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.
* fix(honcho): default the CLI peer prompt to the OAuth consent name
The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.
* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)
resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.
* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup
status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.
* docs(honcho): drop 'shared pool' wording from unified observation mode help
* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation
The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.
* refactor(honcho): one OAuth client (hermes-agent) for all surfaces
Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.
* fix(honcho): per-session resolves to session_id, never remapped by title
Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
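
The precedence in the last entry (stable identifiers before labels) can be sketched as a first-non-empty lookup. The keyword names are hypothetical, and placing the cwd map ahead of the title is an assumption based on the order in which they are listed.

```python
from typing import Optional


def resolve_session_name(
    *,
    gateway_chat_key: Optional[str] = None,
    session_id: Optional[str] = None,
    cwd_mapped_name: Optional[str] = None,
    title: Optional[str] = None,
) -> Optional[str]:
    # Stable ids first, so a (possibly auto-generated) title never remaps them.
    for candidate in (gateway_chat_key, session_id, cwd_mapped_name, title):
        if candidate:
            return candidate
    return None
```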
Discord enforces a hard 100-command limit per app and rejects an upsert that would push the live total over 100 (error 30032), which silently breaks ALL slash commands. The sync deleted obsolete commands AFTER creating new ones, so an app already at the cap momentarily exceeded it and the whole sync failed.
Reorder: delete no-longer-desired commands up front, then create/update. Removes the now-redundant trailing delete loop. Adapts @infinitycrew39 PR #50890 to current main (the original adapter diff no longer applied after the platform refactor); test commit cherry-picked with authorship preserved.
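
Why the order matters can be shown with a toy planner. `plan_sync` and `peak_count` are hypothetical helpers working on name-to-spec dicts; they do not call the Discord API.

```python
from typing import Dict, List, Tuple

Plan = List[Tuple[str, str]]


def plan_sync(live: Dict[str, dict], desired: Dict[str, dict]) -> Plan:
    """Deletes first, then creates/updates, so the live total never overshoots."""
    to_delete = sorted(name for name in live if name not in desired)
    to_upsert = sorted(name for name, spec in desired.items() if live.get(name) != spec)
    return [("delete", n) for n in to_delete] + [("upsert", n) for n in to_upsert]


def peak_count(live: Dict[str, dict], plan: Plan) -> int:
    """Highest number of registered commands reached while applying the plan."""
    names = set(live)
    peak = len(names)
    for op, name in plan:
        if op == "delete":
            names.discard(name)
        else:
            names.add(name)
        peak = max(peak, len(names))
    return peak
```

For an app already at 100 commands that swaps one command for another, the delete-first plan peaks at 100, while the reversed plan would briefly reach 101.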
Both fallback sites that currently log "Thread X not found,
retrying without message_thread_id" now also drop the
``telegram_dm_topic_bindings`` row keyed on
``(chat_id, thread_id)``:
* The streaming send loop (``send`` body) — fires on the
second failure, after the same-thread one-shot retry confirms
the thread really is gone (the first attempt is left alone
because Bot API has been observed to return a transient
"Thread not found" that recovers on immediate retry).
* The control-message helper ``_send_message_with_thread_fallback``
(approval prompts, model picker, update prompts) — single-shot
retry, prune unconditionally on the BadRequest match.
Without this prune, a user who deletes a Telegram DM topic in
the client keeps getting their next inbound message recovered
back to the dead thread by
``_recover_telegram_topic_thread_id`` in ``gateway/run.py``,
which walks the per-user binding list newest-first and treats
the deleted thread as authoritative. The reproduction in the
bug report is exactly this: tool progress, approvals, activity
messages and replies all land in the wrong place until the user
manually runs DELETE on state.db.
Cleanup is best-effort — we log at INFO when it succeeds, swallow
any exception from the SessionDB call, and the user-facing send
proceeds either way.
Refs #31501
The Slack docs document `slack.mention_patterns` as custom wake words that
trigger the bot alongside `@mention`, and the config layer bridges the key into
the Slack adapter's `config.extra` — but the adapter never read it. With
`require_mention` on, a channel message containing a configured wake word (and
no literal `<@BOTUID>`) was silently ignored. Every other adapter that
documents `mention_patterns` (Telegram, DingTalk, Mattermost, WhatsApp,
BlueBubbles, Photon) implements it; Slack was the odd one out.
Add `_slack_mention_patterns()` (compiled, cached; reads `slack.mention_patterns`
as a list/string or `SLACK_MENTION_PATTERNS` as a JSON/CSV/newline list, invalid
regexes warned and skipped) and `_slack_message_matches_mention_patterns()`,
mirroring the existing adapters. Channel mention detection now also triggers on
a wake-word match, so the documented field works as described.
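
A sketch of the parsing and matching described above. The function names drop the `_slack_` prefix to signal that this is not the adapter's code; case-insensitive matching and the comma/newline splitting rule are assumptions (a plain CSV split would break regexes that contain commas, which is one reason a JSON list is the safer input).

```python
import json
import logging
import re
from typing import Iterable, List, Pattern, Union

logger = logging.getLogger(__name__)


def parse_pattern_list(raw: Union[None, str, Iterable[str]]) -> List[str]:
    """Accept a list, a JSON list string, or a CSV/newline-separated string."""
    if raw is None:
        return []
    if isinstance(raw, str):
        text = raw.strip()
        items: Iterable = re.split(r"[,\n]", text)
        if text.startswith("["):
            try:
                loaded = json.loads(text)
                if isinstance(loaded, list):
                    items = loaded
            except json.JSONDecodeError:
                pass  # not valid JSON: keep the CSV/newline split
    else:
        items = raw
    return [str(item).strip() for item in items if str(item).strip()]


def compile_mention_patterns(raw) -> List[Pattern[str]]:
    compiled = []
    for pattern in parse_pattern_list(raw):
        try:
            compiled.append(re.compile(pattern, re.IGNORECASE))
        except re.error as exc:
            logger.warning("Skipping invalid mention pattern %r: %s", pattern, exc)
    return compiled


def message_matches_mention_patterns(text: str, patterns: List[Pattern[str]]) -> bool:
    return any(p.search(text or "") for p in patterns)
```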
Adds tests for pattern compilation (list/string/env/invalid-regex) and for the
channel-trigger gating with a wake word under require_mention.
Mirror built-in memory writes to external providers only after the native memory tool succeeds and is not staged for approval. Keep OpenViking's built-in memory mirroring add-only, since Hermes native memory entries do not yet have stable OpenViking file URIs for replace/remove.
Add a narrow viking_forget tool for exact user memory file deletion and document the current OpenViking write/delete behavior.
Follow-up to ScotterMonk's cron-truncation fix:
- Remove HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var. Behavioral config
belongs in config.yaml, not a new HERMES_* env var (.env is secrets
only). The actual bug is fixed entirely by the adapter-aware skip; the
configurable cap was unneeded scope. MAX_PLATFORM_OUTPUT is a constant
again, collapsing the max_output=0 disable branch and the
audit-vs-truncation threshold divergence.
- Flag the remaining verified-chunking adapters (slack, matrix, feishu,
mattermost, teams, whatsapp, whatsapp_cloud, weixin, bluebubbles,
yuanbao) with splits_long_messages=True so the fix covers the whole
bug class, not just Discord/Telegram. Each verified to chunk in its
own send() via truncate_message().
- SMS deliberately left False: it chunks for normal replies but a
multi-segment cron blast is cost-bearing; the 4000-cap + file save is
the safer default there.
- Update tests: drop the two env-override tests, add a test asserting a
save failure during truncation (non-chunking) propagates.
Gateway-level truncation (MAX_PLATFORM_OUTPUT=4000) was pre-empting
adapter-side message splitting. Discord and Telegram both chunk long
content natively in their send() via truncate_message(), but the
delivery router truncated to 3800 chars + footer before the adapter
ever saw the full payload — so long cron output was cut short instead
of being delivered as multiple messages (issue #50126).
Changes:
- HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var makes the cap configurable
(default 4000, backward compatible). Set to 0 to disable truncation.
- TRUNCATED_VISIBLE (3800) removed — visible portion now derived
dynamically from max_output minus the actual footer length.
- New BasePlatformAdapter.splits_long_messages capability flag (default
False). Adapters that chunk in send() set True; delivery skips
truncation for them but still saves full output to disk as audit.
- Flagged Discord and Telegram (both verified to chunk in send()).
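
The capability-flag check and the footer-derived visible length can be sketched roughly as follows. This is an illustration under assumptions: `prepare_for_delivery`, the adapter class names and the footer wording are hypothetical, and saving the full output to disk is left to the caller:

```python
MAX_PLATFORM_OUTPUT = 4000  # default cap named in the commit message


class BaseAdapter:
    # Adapters that chunk long content in send() override this with True.
    splits_long_messages = False


class ChunkingAdapter(BaseAdapter):
    splits_long_messages = True


def prepare_for_delivery(adapter, content, saved_path, max_output=MAX_PLATFORM_OUTPUT):
    """Return the text handed to the adapter's send()."""
    if max_output <= 0:
        return content  # a cap of 0 disables truncation
    if adapter.splits_long_messages or len(content) <= max_output:
        return content  # the adapter chunks natively, or nothing to cut
    # Hypothetical footer text; the visible part is whatever room remains.
    footer = f"\n\n[Output truncated. Full output saved to {saved_path}]"
    visible = max(0, max_output - len(footer))
    return content[:visible] + footer
```

Deriving `visible` from the footer length keeps the delivered message at or under the cap even when the saved path is long, which a fixed 3800-character constant cannot guarantee.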
Fixes #50126
* fix: update to version 3 endpoints and add update and delete tools
* chore: remove the test md file
* fix: prevent circuit breaker on client errors in Mem0 provider
* chore: add telemetry for platform version
* feat: add OSS mode support to Mem0 memory provider
* chore: bump mem0ai dependency to >=2.0.1 in memory plugin
* refactor: enhance dependency checks and embedder config in mem0 backend
* refactor: adjust fact storage message for OSS mode
* refactor: expand user paths, add collection recreation on dimension change for Qdrant
* fix(mem0): make MEM0_USER_ID override gateway-native ids and tag writes with channel
When MEM0_USER_ID was configured (env or mem0.json), the gateway-native id
from kwargs (Telegram numeric id, Discord snowflake, ...) still won, so the
same human ended up under different user_ids per channel and memories never
merged across CLI / Telegram / Slack / Discord. Mirrors openclaw's cfg.userId
pattern: configured override wins, gateway-native id is the fallback.
The legacy "hermes-user" placeholder default written by the setup wizard is
treated as unset to avoid silently bucketing every gateway user together.
Also tag every write with metadata.channel (cli/telegram/discord/...) so the
dashboard can offer per-channel filtered views without coupling identity to
the channel; document the read/write filter asymmetry as intentional
(reads scope to user_id only for cross-agent recall).
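
The precedence rule can be illustrated with a short sketch. The function names are hypothetical, and returning `None` when neither id is available is an assumption made here, not something the change specifies:

```python
LEGACY_PLACEHOLDER = "hermes-user"  # default once written by the setup wizard


def resolve_user_id(configured, gateway_native):
    """Configured override wins; the gateway-native id is the fallback."""
    configured = (configured or "").strip()
    if configured and configured != LEGACY_PLACEHOLDER:
        return configured
    # The legacy placeholder is treated as unset, so it never buckets
    # every gateway user under one shared id.
    return gateway_native or None


def write_metadata(channel):
    """Tag a write with its channel without folding it into the identity."""
    return {"channel": channel}
```

With this ordering, `resolve_user_id("alice", "12345")` yields `"alice"` on every channel, while `resolve_user_id("hermes-user", "12345")` falls back to the per-channel id.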
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: improve Mem0 memory provider backend, pagination, config, and error handling
* refactor: update mem0 telemetry code, docs, and bump version
* fix(mem0): make get_config_schema() return unified schema with mode-aware required flag
Schema always includes api_key field so picker shows "API key / local" for
both modes. In OSS mode api_key.required=False so status won't mislead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: improve mem0 telemetry, add env var key and OSS mode detection
* chore: bump mem0ai lower bound to 2.0.4 (latest SDK release)
* refactor: set telemetry sample rate to 1.0 and update docs for opt-out
* fix(mem0): resolve 15 correctness, thread-safety, and resource bugs
Thread safety:
- Protect circuit breaker counters with _breaker_lock (race between
prefetch/sync daemon threads and main thread)
- Wrap sync_turn thread creation in _sync_lock; skip if previous sync
is still alive after 5 s join to prevent duplicate memory ingestion
- Guard _schedule_flush timer creation under _queue_lock (TOCTOU race)
- Capture local `backend` reference in prefetch/sync closures so
shutdown() nulling self._backend cannot crash in-flight threads
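
As a rough illustration of the first item, a failure counter guarded by a lock might look like the sketch below. The class name, method names and threshold are hypothetical; only the idea of protecting the counters with a lock comes from the change:

```python
import threading


class CircuitBreaker:
    """Failure counter shared by the main thread and daemon threads."""

    def __init__(self, threshold=5):
        self._lock = threading.Lock()
        self._failures = 0
        self._threshold = threshold

    def record_failure(self):
        # The read-modify-write happens under the lock, so concurrent
        # callers cannot lose an increment.
        with self._lock:
            self._failures += 1

    def record_success(self):
        with self._lock:
            self._failures = 0

    def is_open(self):
        with self._lock:
            return self._failures >= self._threshold
```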
Correctness:
- Fix bool("false")==True for rerank param; parse string values explicitly
- Guard page/top_k with max(1,...) and move int() inside try blocks
- Fix fact_count=0 always in OSS mode (Memory.add returns list, not dict)
- Fix prefetch() not clearing result when thread still alive after timeout
- Fix atexit.register accumulating on repeated initialize() calls
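
The first two items come down to explicit parsing. A minimal sketch, with hypothetical helper names and an assumed set of accepted truthy and falsy strings:

```python
def parse_bool(value, default=False):
    """Parse a flag without the bool("false") == True pitfall."""
    if isinstance(value, bool):
        return value
    if value is None:
        return default
    if isinstance(value, str):
        text = value.strip().lower()
        if text in {"1", "true", "yes", "on"}:
            return True
        if text in {"0", "false", "no", "off", ""}:
            return False
        return default  # unrecognized strings fall back to the default
    return bool(value)


def parse_positive_int(value, default):
    """Convert inside the try block and clamp the result to at least 1."""
    try:
        return max(1, int(value))
    except (TypeError, ValueError):
        return default
```

Any non-empty string is truthy in Python, which is why `bool("false")` evaluates to `True` and a string-valued `rerank` setting has to be compared against known spellings instead.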
Backend / setup:
- Handle Qdrant named-vector collections in _recreate_collection_if_dims_changed
(vectors is a dict; .size access raised AttributeError, swallowed silently)
- Wrap QdrantClient and psycopg2 conn/cursor in try/finally to prevent leaks
- Resolve ollama_bin at top of _ensure_ollama; use it for ollama pull
- Fix embedder key lookup when LLM provider has no env_var (e.g. ollama)
Also: remove _telemetry_enabled cache (env var check is cheap), bump
required mem0ai to >=2.0.7, minor README wording fix.
* fix(mem0): fix brittle qdrant path test + add telemetry sample-rate docs
- Replace generator-throw lambda with a proper def in
test_qdrant_path_not_writable; use tmp_path instead of a hardcoded
/nonexistent path so the test is root-safe
- Add MEM0_TELEMETRY_SAMPLE_RATE to memory-providers.md (was only
in the plugin README, not the user-guide docs)
* revert: remove MEM0_TELEMETRY_SAMPLE_RATE from user-guide docs
* refactor: remove telemetry from mem0 plugin and update documentation
* fix(mem0): set stdin=DEVNULL on setup subprocess calls
The TUI stdin guard (scripts/check_subprocess_stdin.py) requires every
subprocess call in plugin code to set stdin= so it can't inherit the
gateway's JSON-RPC stdin fd. Muzzle the docker/ollama calls in the OSS
setup wizard with stdin=subprocess.DEVNULL (none need interactive input).
Also covers the docker-inspect call the linter's regex misses.
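
For illustration, this is what detaching a child process from the parent's stdin looks like with the standard library. The wrapper name is hypothetical, and the child command is a stand-in for the wizard's docker/ollama calls:

```python
import subprocess
import sys


def run_without_stdin(cmd):
    """Run a command whose stdin is the null device, not the parent's fd."""
    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # the child sees immediate EOF
        capture_output=True,
        text=True,
        check=False,
    )


# The child reads all of stdin and reports what it got: an empty string.
result = run_without_stdin(
    [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]
)
```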
---------
Co-authored-by: chaithanyak42 <chaithanya.kumar42a@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defense-in-depth for the dashboard plugin auto-import path. The web server
auto-imports and mounts the Python backend (dashboard/manifest.json -> api file)
of plugins found in ~/.hermes/plugins/ (user) and ./.hermes/plugins/ (project),
not just bundled plugins. So any plugin that reaches one of those dirs gets
arbitrary Python executed on the next dashboard start.
NOTE ON THREAT MODEL: #43719's originally-documented delivery chain (a public
--insecure dashboard + open API used to git clone a malicious repo into
~/.hermes/plugins/) is ALREADY mitigated on main — since the June 2026
hermes-0day hardening, a non-loopback bind ALWAYS requires an auth provider and
--insecure no longer bypasses the auth gate. This change is therefore NOT
closing that (now-authenticated) network path; it removes the residual
'arbitrary code executes merely because a plugin is on disk' hazard, which still
applies when a plugin arrives by other means: a socially-engineered git clone,
a supply-chain drop, an authenticated-but-malicious actor, or a future
regression in the auth gate. Untrusted on-disk code should not auto-execute.
Restrict dashboard backend Python auto-import to BUNDLED plugins only. User and
project plugins may still extend the dashboard UI via static JS/CSS, but their
api Python file is never auto-imported. Two layers: _discover_dashboard_plugins
scrubs api/_api_file for user/project sources (and bundled wins name conflicts
so a non-bundled plugin cannot shadow a trusted backend route);
_mount_plugin_api_routes re-refuses user/project at mount time. Tightens the
prior GHSA-5qr3-c538-wm9j / #29156 hardening (bundled+user) to bundled-only.
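
The two layers can be sketched as below. This is an illustration, not the actual `_discover_dashboard_plugins` or `_mount_plugin_api_routes` code: the `PluginEntry` type, the function names and the tie-break between two non-bundled plugins of the same name are all assumptions:

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PluginEntry:
    name: str
    source: str  # "bundled", "user", or "project"
    api_file: Optional[str] = None


def discover(entries):
    """Layer 1: scrub api_file for non-bundled sources; bundled wins by name."""
    by_name = {}
    for entry in entries:
        if entry.source != "bundled":
            entry = replace(entry, api_file=None)
        existing = by_name.get(entry.name)
        if existing is not None and existing.source == "bundled":
            continue  # a trusted backend route cannot be shadowed
        if existing is not None and entry.source != "bundled":
            continue  # keep the first non-bundled entry (assumption)
        by_name[entry.name] = entry
    return by_name


def may_mount_api(entry):
    """Layer 2: refuse again at mount time, independent of discovery."""
    return entry.source == "bundled" and entry.api_file is not None
```

Checking the source a second time at mount means a bug or bypass in discovery alone does not lead to a user or project backend being imported.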
Salvaged from #44472 (@egilewski) onto current main.
Follow-up to the accept-any-file-type change. The observe-unmentioned and
replied-media paths relied on cache_media_bytes() returning None for
unsupported document types to emit an 'unsupported, not cached' note. Now
that any file type is always cached, those docs are cached and surfaced with
a path-pointing note — consistent with the main document path. The
remaining cached-is-None branch is image-validation-failure only; its note
is reworded accordingly. Updates the group-gating test to the new contract.
Authorization to message the agent is the gate, not the file extension.
Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was
opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass
at all on Telegram/Slack — so an .html (or any non-allowlisted type) was
dropped or hard-rejected before the agent saw it.
Now every authorized upload is cached and surfaced to the agent regardless
of type:
- base.cache_media_bytes(): unknown types cache as octet-stream (or the
caller-supplied MIME) instead of returning None — fixes the chokepoint
that Teams/Telegram-media route through.
- discord/telegram/slack adapters: removed the allowlist reject/skip; any
non-media attachment is typed DOCUMENT and cached. Known types keep their
precise MIME.
- Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text +
code + config + markup) instead of a blind UTF-8 decode, so binary formats
(PDF/zip/docx) with ASCII headers are never inlined.
- gateway/run.py emits the path-pointing context note for every DOCUMENT,
including non text/application MIME types.
- discord.allow_any_attachment is now a documented no-op kept for config
back-compat.
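
A minimal sketch of the two decisions described in the list above: which MIME type a cached file gets, and whether its text is inlined. The helper names and the contents of both tables are hypothetical; only the behaviour (octet-stream fallback, extension-gated inlining) follows the description:

```python
from pathlib import Path

# Hypothetical subsets standing in for the real tables.
KNOWN_DOCUMENT_TYPES = {".pdf": "application/pdf", ".txt": "text/plain"}
TEXT_INJECT_EXTENSIONS = {".txt", ".md", ".py", ".json", ".yaml", ".html"}


def resolve_mime(filename, supplied_mime=None):
    """Known types keep their precise MIME; unknown types are still cached."""
    ext = Path(filename).suffix.lower()
    if ext in KNOWN_DOCUMENT_TYPES:
        return KNOWN_DOCUMENT_TYPES[ext]
    return supplied_mime or "application/octet-stream"


def should_inline_text(filename):
    """Gate inlining on the extension rather than on a UTF-8 decode attempt."""
    return Path(filename).suffix.lower() in TEXT_INJECT_EXTENSIONS
```

Gating on the extension means a PDF whose header happens to decode as ASCII is cached and pointed to by path, but never pasted into the prompt.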
Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types
cache, known types stay precise, PDFs are not inlined.