hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-27 11:22:03 +00:00

Author	SHA1	Message	Date
Teknium	7a65800fed	fix(cache): content-address prompt_cache_key so recurring cron jobs reuse the warm prefix (#52295) Recurring cron jobs were prompt-cache-cold on every fire. session_id is built as cron_<job_id>_<timestamp>, and the Codex/Responses transport used session_id directly as prompt_cache_key — so the timestamp changed the cache key on every run and the static prefix (agent identity + tool schemas) was re-paid each tick. Derive prompt_cache_key from a SHA-256 of the static prefix (instructions + sorted tool schemas) instead. Repeated fires of the same job share one content-addressed key (pck_<hash>) and reuse the warm prefix within the provider's cache TTL. The key changes exactly when the prefix changes — edit the job's prompt or toolset and it re-keys; leave it alone and it stays stable. session_id is left untouched for transcript isolation, log correlation, and the Codex/xAI session-scope routing headers (session_id, x-client-request-id, x-grok-conv-id) — those are the per-fire identity, not the cache key. Only the prompt_cache_key body field (standard OpenAI/Codex path and the xAI extra_body field) is content-addressed. Closes #51395. Co-authored-by: spiky02plateau <spiky02plateau@users.noreply.github.com> Co-authored-by: JoaoMarcos44 <JoaoMarcos44@users.noreply.github.com>	2026-06-24 21:46:30 -07:00
Ben Barclay	72ae163250	fix(relay): authorize relay-delivered events by delivery, not source.platform (#52306) * fix(relay): authorize relay-delivered events by delivery, not source.platform The #52190 upstream-authz fix keyed _is_user_authorized off source.platform via _adapter_authorization_is_upstream(source.platform). But a relay message inbound carries the UNDERLYING platform (source.platform == discord/telegram/...), NOT Platform.RELAY, because ws_transport._event_from_wire maps the connector's wire payload (platform="discord") straight onto SessionSource for session-keying and egress. The relay adapter is registered only under Platform.RELAY, so adapters.get(Platform.DISCORD) misses, the trusted-upstream branch is skipped, and the user hits the env-allowlist default-deny: WARNING gateway.run: Unauthorized user: <id> (<name>) on discord (Live staging bug: alpha tester linked successfully, then every follow-up DM was silently dropped.) Fix: the authentic trust signal is that the event was delivered over the per-instance-authenticated relay WS, not which platform it underlies. Add a wire-INVISIBLE SessionSource.delivered_via_upstream_relay flag, stamped by the relay transport in _event_from_wire, and authorize on it. The flag is excluded from to_dict/from_dict so a peer can neither forge it across the wire nor have it restored from persistence. The existing adapter-flag check is retained for events whose source.platform IS Platform.RELAY (interaction-passthrough). A direct Discord event on a multiplexing gateway (direct + relay adapters) is unmarked and still default-denies. * fix(relay): use identity check on delivery marker to avoid MagicMock fail-open A MagicMock() source (used by test_signal.py and other gateway tests) auto-vivifies source.delivered_via_upstream_relay as a truthy Mock, which a bare truthiness check would treat as authorized — flipping test_signal_in_allowlist_maps from False to True.
The marker is a real bool on SessionSource, so check 'is True' explicitly: refuses to authorize any non-bool stand-in, defensive against accidental fail-open.	2026-06-25 14:21:09 +10:00
brooklyn!	0c442fa1d3	Merge pull request #52303 from NousResearch/bb/pets-gen-qa feat(pets): quality-first OpenRouter chain, stronger atlas gates, global pet-gen notifications	2026-06-24 23:16:40 -05:00
Brooklyn Nicholson	e92b5c6af8	feat(pets): quality-first OpenRouter model chain + stronger atlas gates + global pet-gen notifications OpenRouter/Nous image gen now runs a quality-first model chain by default: attempt the highest-fidelity OpenAI image model first, then fall back to Gemini 3 Pro Image when it's access-gated/unavailable/times out. An explicit OPENROUTER_IMAGE_MODEL / config model override pins one model with no fallback. Atlas validation rejects malformed model output instead of shipping it: adds a per-state collapse guard (a single sliver/fragment row no longer passes because other rows are healthy), on top of the existing postage-stamp + multi-pose checks. Desktop: pet-gen native notifications are now "global" (not tied to a chat session), so a background generation started from the command center fires an OS notification when the user is away even with no active session. Adds a neutral "This can take up to 5 minutes." banner on step 1, and lets the provider picker auto-size. Tests updated/added for the OpenRouter fallback chain, the collapse guard, and the global notification path.	2026-06-24 23:11:21 -05:00
brooklyn!	380d660cab	Merge pull request #52297 from NousResearch/bb/ad-hoc-verify Support ad-hoc verification scripts	2026-06-24 23:10:15 -05:00
brooklyn!	d473e5d07a	Merge pull request #52296 from NousResearch/bb/verify-stop-loop Add verification stop loop	2026-06-24 23:10:03 -05:00
brooklyn!	1512bad0bc	Merge pull request #52286 from NousResearch/bb/verify-status feat(gateway): expose coding verification status	2026-06-24 23:09:45 -05:00
brooklyn!	da0320bf40	Merge pull request #52285 from NousResearch/bb/verify-ledger feat(agent): record coding verification evidence	2026-06-24 23:07:10 -05:00
Brooklyn Nicholson	a5a2edd451	feat(agent): recognize focused ad-hoc verification scripts Allow focused temporary scripts to satisfy verification when no canonical suite is detected, while keeping suite evidence distinct from ad-hoc proof.	2026-06-24 23:03:45 -05:00
Brooklyn Nicholson	2f1a47b90e	feat(agent): require verification before finishing edits Make verification closure the default coding behavior after landed file edits while keeping bounded retries and config/env switches for users who need to disable it.	2026-06-24 23:02:48 -05:00
Brooklyn Nicholson	7ef0f360d0	feat(gateway): expose coding verification status Add a read-only gateway RPC for querying the passive verification ledger without running checks from the UI surface.	2026-06-24 22:36:03 -05:00
Brooklyn Nicholson	f0beb6f617	test(agent): cover verification evidence ledger Exercise command classification, session scoping, stale edits, bounded retention, and natural expiry for recorded verification evidence.	2026-06-24 22:35:27 -05:00
Victor Kyriazakos	b177d4ee48	fix(cron): mirror continuable cron as a labelled user turn (alternation-safe) Addresses review on #51077 (kxee). The continuable-cron mirror reused gateway.mirror.mirror_to_session, which writes role=assistant — re-introducing the exact alternation violation #2313 (`37a997945`) deliberately removed: a cron brief landing as assistant after the agent's last turn yields assistant->assistant, which breaks strict-alternation providers (OpenAI/OpenRouter) per issue #2221. The mirror/mirror_source metadata is also dropped at the SQLite boundary, so the [Delivered from cron] label is lost on replay. This is an intentional, opt-in (default OFF) reversal of #2313's 'cron output does not belong in interactive history' for the reply-to-cron use case — gated behind cron.mirror_delivery / attach_to_session. Fixes: - mirror_to_session gains a role param (default 'assistant' — interactive send_message mirror unchanged, it IS the agent speaking). Cron paths pass role='user' with a '[Cron delivery: <task>]' prefix so the brief collapses via repair_message_sequence's consecutive-user merge on every provider, and stays distinguishable on replay despite the metadata drop. - thread_seeded: defer seeding + the flag until delivery into the new thread actually succeeds. Previously set pre-delivery, so an open-succeeds / deliver-fails case both stranded a seeded-but-unseen brief AND suppressed the DM-fallback mirror. - seed mirror now passes user_id='system:cron' to resolve the exact thread-keyed session row it just created. - dedupe the duplicate BasePlatformAdapter import in _deliver_result. - trim oversized docstrings to non-obvious WHY (AGENTS.md). - docs: document cron.mirror_delivery / attach_to_session in website/docs/user-guide/features/cron.md. - test: assert the cron mirror writes role='user' with the label prefix. 204 cron+mirror tests pass.	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	b693bee100	feat(cron): thread-preferred continuable delivery (open a thread, mirror DM fallback) Continuable cron jobs (attach_to_session / cron.mirror_delivery, default OFF) now prefer a dedicated thread on thread-capable platforms, falling back to origin-DM mirroring where threads don't exist. - Thread-capable (Telegram topics, Discord/Slack threads): open a fresh thread for the job via the shipped adapter.create_handoff_thread, route the brief into it, and seed the thread-keyed session so the user's in-thread reply continues with full context. This is the 'continuable cron opens its own thread' interface. - DM-only (WhatsApp/Signal/SMS): create_handoff_thread returns None -> fall back to mirroring into the origin DM session (existing behaviour). Reuses existing infrastructure end-to-end — no new adapter surface, no provider-chain signature change: - adapter.create_handoff_thread (already implemented per-platform, returns None on unsupported platforms = the fallback signal) - the live SessionStore via adapter._session_store (already set on every adapter), reached without threading a new param through the frozen CronScheduler.start() contract - gateway.mirror.mirror_to_session for the seed/append - existing per-target delivery routing carries the new thread_id for free Mirrors GatewayRunner._process_handoff's open-thread-or-fallback + seed pattern, standalone for the cron delivery path. thread_seeded guards against a double-mirror after seeding. Scoped to the origin target only; fan-out/broadcast targets are never threaded or mirrored. Config docs updated (cron.mirror_delivery) + cronjob tool attach_to_session description reframed around continuable/thread-preferred. Tests: +5 (thread id returned on thread platform; None on DM platform; None without capability/loop; seed creates thread session + mirrors; seed no-op on empty). 22/22 in TestCronDeliveryMirror; 532 cron tests pass (4 failures pre-existing: croniter-not-installed + TZ).	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	98f3c19282	feat(cron): pass origin user_id to delivery mirror (send_message parity) Multi-participant parity with interactive send_message, which passes HERMES_SESSION_USER_ID to gateway.mirror.mirror_to_session so the mirror lands in the exact participant's session. - cronjob_tools._origin_from_env now captures user_id from the session context at job-create time (alongside platform/chat_id/thread_id). - _maybe_mirror_cron_delivery forwards user_id to mirror_to_session. - _deliver_result threads origin.user_id through for the origin target. Effect: in a per-user-isolated group chat (group_sessions_per_user=True, the default), the mirror resolves to the member who scheduled the job instead of conservatively no-op'ing on ambiguous candidates. DMs and shared group/thread sessions are unaffected (single candidate). Default still OFF. Tests: helper forwards user_id; E2E _deliver_result forwards origin user_id. 17/17 in TestCronDeliveryMirror; 527 cron tests pass (4 failures pre-existing: croniter-not-installed + TZ, identical on baseline).	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	c06ceb3232	refactor(cron): scope delivery mirror to the origin conversation The cron->session mirror now fires ONLY for the delivery target that equals the job's origin (platform+chat_id[+thread_id]). A job created from a live gateway chat stamps that chat as origin, and that session is guaranteed to exist (it is the conversation the user scheduled the job in). Fan-out / broadcast / home-channel-fallback targets are never mirrored: they are not a continuation of a conversation and may have no session at all. This makes the prior 'cold-start session seeding' concern a non-case by construction: when the mirror semantically applies the session exists; when none exists the target was never the origin, so we no-op. Adds _target_matches_origin() + origin-scoping tests (exact match, other-chat/other-platform/no-origin rejection, thread scoping, fan-out mirrors only the origin target).	2026-06-24 20:27:05 -07:00
Victor Kyriazakos	1b181724fa	feat(cron): optional mirror of cron delivery into target chat session Adds an opt-in path so a cron job's delivered output is also appended to the TARGET chat's gateway session transcript (as an assistant turn), so a user reply to a recurring delivery (daily brief, reminder) is answered with the delivery in context instead of 'what is that?' amnesia. - Reuses the shipped gateway.mirror.mirror_to_session — the same primitive interactive send_message mirroring already uses. No messaging-toolset change (cron still can't call send_message; this rides delivery). - Gated: per-job attach_to_session overrides global cron.mirror_delivery (config.yaml). Default OFF — historical isolation preserved byte-for-byte. - Mirrors the CLEAN agent output, not the cron header/footer wrapper. - Alternation/cache-safe: append lands at a turn boundary, never mid-loop, never mutates the cached system prompt. Cold-start (no target session) is a silent no-op; mirror errors never fail a successful delivery. - Surfaced on the cronjob tool (attach_to_session) + config schema. Driven by enterprise cron-as-control-plane use case. 10 new tests; full cron + cronjob-tool suites pass (600).	2026-06-24 20:27:05 -07:00
Ben	0c3f197cff	fix(relay): re-attach DM author user_id on outbound for connector egress A DM reply carries no guild_id, so the connector's egress guard cannot resolve the owning tenant from metadata.guild_id and declines the send with "discord egress declined: target not routed to an onboarded tenant" — the bug behind "the bot never replies in DMs". Guild replies are unaffected (they carry guild_id), which is why the guild path worked end-to-end while DMs looked broken. The connector now resolves a DM reply's tenant from the recipient's author binding (gateway-gateway #67, resolveByUser keyed on metadata.user_id) — the outbound counterpart to inbound Phase 7a author-first resolution. But it needs the recipient user_id ON the outbound action, and the adapter only re-attached guild_id (_capture_scope/_with_scope), no-op for DMs (the docstring even said so). This extends the adapter's inbound-scope capture: for a DM (no guild_id) remember chat_id -> the authentic author user_id we observed, and re-attach it as metadata.user_id on outbound. Guild capture is unchanged and wins when present; user_id is the DM-only fallback. The id is the one the connector observed inbound (never gateway-asserted), so the trust invariant holds. +4 unit tests (DM reply re-attaches user_id + no guild_id; unknown chat invents nothing; explicit user_id preserved; guild reply never carries user_id). Proved load-bearing (reverting the re-attach fails the DM test). 144 relay tests pass, ruff clean. Pairs with gateway-gateway #67 (the connector-side resolver). Together they close the DM-reply egress gap end-to-end.	2026-06-25 12:43:54 +10:00
Ben Barclay	c15945655f	fix(terminal): sanitize host/relative cwd OVERRIDE before it reaches docker run -w (#50636 ) terminal_tool() resolves a per-task cwd override that WINS over config["cwd"]: cwd = overrides.get("cwd") or config["cwd"] config["cwd"] is sanitized for container backends in _get_env_config() (host prefixes /Users//home//C:\\/C:/ and relative paths are replaced with the backend default /root). But the override was applied RAW — it was never run through that guard. The gateway/TUI registers the host launch dir as a cwd override for workspace tracking (tui_gateway/server.py _register_session_cwd -> _terminal_task_cwd -> _session_cwd -> os.getcwd()), so on a container backend a host path leaked straight to `docker run -w <host-path>`: - Windows desktop: -w C:\Users\<user> -> container fails to start (exit 125) - POSIX: -w /home/<user> -> same The ACP adapter translates its override cwd (acp_adapter/session.py _translate_acp_cwd), but the gateway path did neither translation nor sanitization, so the override bypassed the one guard that would have caught it. Fix: extract the host/relative-path predicate into a shared _is_unusable_container_cwd() helper (so the existing _get_env_config() sanitizer and the new guard can't drift), and re-apply it to the resolved cwd at the override-resolution site. Valid in-container override paths (RL/benchmark sandboxes that set cwd to /workspace, /root, ...) are absolute non-host paths and pass through untouched. Tests: unit-pin the predicate (Windows backslash/forwardslash, POSIX home, macOS /Users, relative, valid container paths) AND an E2E call-site pin that drives terminal_tool() with a host-path override registered and asserts the cwd reaching _create_environment is sanitized. Mutation-verified: reverting the call-site guard makes the two host-path E2E tests fail (showing the raw host path leaking) while the valid-/workspace-override test stays green.	2026-06-25 02:33:40 +00:00
Teknium	411faf08bd	fix(soul): installers seed the real default persona, upgrade legacy empty templates (#52246) The desktop bootstrap (and curl/PowerShell/docker installs) seeded ~/.hermes/SOUL.md with a comment-only scaffold that contained no persona text. That shadowed the runtime default (_ensure_default_soul_md -> DEFAULT_SOUL_MD), since seeding is guarded by 'if SOUL.md doesn't exist'. Result: every fresh installer install got the empty template instead of the documented Hermes persona; desktop just made it visible in onboarding. - install.sh / install.ps1 / docker/SOUL.md now write DEFAULT_SOUL_MD. - _ensure_default_soul_md() upgrades a SOUL.md still matching the known legacy scaffold in place; customized files (any deviation, incl. a persona appended below the comment) are never touched. - Detection normalizes CRLF/BOM so Windows-installer drift still matches.	2026-06-24 18:56:26 -07:00
Teknium	a4fa1481e2	fix(tui): route /learn through command.dispatch so the prompt fires (#52232) The Desktop GUI (tui_gateway) slash worker subprocess has no reader for the CLI's _pending_input queue. /learn's CLI handler prints the ack and puts the built prompt onto that queue, so in the TUI the prompt was silently dropped — ack shown, no LLM turn, no skill created (#51829). command.dispatch already handles 'learn' correctly (returns {type: send, message: build_learn_prompt(arg)}), but 'learn' was missing from _PENDING_INPUT_COMMANDS, so slash.exec fell through to the worker instead of routing to command.dispatch. Add it to the frozenset, matching the existing goal/queue/steer/plan pattern.	2026-06-24 18:48:50 -07:00
Ben	d1cac0e5ef	feat(gateway): scale-to-zero idle detection + dormant-quiesce (Phase 0) The gateway-side BEHAVIOUR layer that consumes the relay scale-to-zero primitives (gateway-gateway Phase 5): the gateway decides it is idle and drives the relay transport dormant so the platform (Fly autostop:"suspend") can suspend the now-traffic-idle machine, which wakes on the connector's wakeUrl poke (decisions.md Q3=C', D1-D13). - gateway/scale_to_zero.py: pure helpers — scale_to_zero_enabled (the NAS Labs HERMES_SCALE_TO_ZERO stamp, D11/Q8=A), parse_idle_timeout_seconds (config.yaml gateway.scale_to_zero.idle_timeout_minutes, D2), messaging_is_relay_only_or_absent (F6/D1), should_arm (D1/D11/§3.4(1)), is_idle (D2/D3/F7). - gateway/run.py: _last_inbound_at clock stamped on user inbound in _handle_message (F13); the arm-gate + idle predicate + the _scale_to_zero_watcher dormant sequence (mark draining -> adapter go_dormant() -> cooldown), started only when armed. Deliberately NOT the stop path and NOT mark_resume_pending (F12/D13). - tools/process_registry.py: has_any_active() for the bg-work guard (D3/F7). - hermes_cli/config.py: gateway.scale_to_zero.idle_timeout_minutes default 5. Tests: 38 pure-logic + 6 watcher (incl. bg-work regression guard proven RED). Full relay + scale-to-zero suites: 184 passed. The 20 unrelated failures in the broader run are PRE-EXISTING on origin/main (custom-provider/tools tests), confirmed via a pristine baseline worktree.	2026-06-24 18:47:18 -07:00
Ben	96af4bec30	feat(relay): add go_dormant() transport mode for scale-to-zero (0.E0) Net-new WebSocketRelayTransport.go_dormant() + RelayAdapter.go_dormant() — the third transport mode the scale-to-zero behaviour layer needs, distinct from both disconnect() and an unexpected close (decisions.md D12/F14): - disconnect() sets _closing=True and CANCELS the reconnect supervisor (terminal "shutting down for good") -> a suspended machine never re-dials on wake, stranding its buffered backlog. - an unexpected close re-dials IMMEDIATELY -> the socket never stays down, so the platform proxy never suspends the machine. go_dormant(): going_idle->ack (reuse go_idle), then close the socket WITHOUT setting _closing, so the reader's fall-through still arms the reconnect supervisor (wake path stays live) but on the longer _dormant_redial_s cadence so it doesn't fight the platform suspend window. A successful re-dial clears _dormant. Honors the §3.4 wake->reconnect->drain contract. Tests: 6 new in test_relay_going_idle.py incl. the F14 regression guard (routing dormancy through disconnect() fails exactly the 4 wake-path tests). Full relay suite 140 passed.	2026-06-24 18:47:18 -07:00
helix4u	17beb55e3c	fix(telegram): gate rich draft previews separately	2026-06-24 18:11:14 -07:00
brooklyn!	7157b213f5	Merge pull request #47959 from NousResearch/bb/pets-gen Pet generation: frame-perfect hatch flow, backend picker, CPU-safe chroma, and CI-hardening	2026-06-24 19:41:34 -05:00
Brooklyn Nicholson	a05a9b0e07	test(delegate): harden heartbeat in-tool stale timing assertion Stabilize the long-running-tool heartbeat test by patching stale thresholds inside the test and asserting the heartbeat exceeds the idle ceiling, which preserves intent while removing scheduler-sensitive assumptions that flake in CI.	2026-06-24 19:33:40 -05:00
brooklyn!	b649cdee4a	Merge pull request #52203 from NousResearch/bb/update-drain-announce fix(update): announce gateway drain waits so desktop updates don't look hung	2026-06-24 19:28:44 -05:00
Ben	538c419d2e	fix(gateway): scope dashboard liveness fallback to the profile PR #52151 hardened the runtime-status liveness check to trust a readable live process command line over stale gateway_state.json argv, so a recycled PID now owned by an s6 supervisor no longer counts as a running gateway. That fix is correct but incomplete for the reported symptom: the web dashboard showed a named profile's gateway green while `hermes -p <name> gateway status` showed it stopped. Two further issues: 1. Cross-profile PID reuse. In per-profile Docker supervision, one profile's stale `gateway_state.json` can record a PID the OS later recycled onto a DIFFERENT profile's live gateway. That PID's command line still `looks_like_gateway`, so the dead profile was reported running. The recorded argv has its `-p <name>` selector stripped in-process by `_apply_profile_override`, so it cannot disambiguate; the live `/proc` cmdline still carries it. `get_runtime_status_running_pid` now accepts an `expected_home` and validates the live command line belongs to THAT profile (mirroring `hermes_cli.gateway._matches_current_profile`, the logic the CLI scan path already uses — which is why the CLI was correct). `_check_gateway_running` passes the enumerated profile dir. 2. The existing regression test `test_gateway_running_check_falls_back_to_runtime_state` used the live pytest PID with a gateway-shaped record; once the live cmdline became authoritative it no longer looked like a gateway. Updated to mock the live cmdline to the real separate-process scenario it describes. The active-profile path (`get_running_pid`) is intentionally left unscoped: it is lock-verified and any live gateway cmdline is acceptable there. Multiplex mode is unaffected — `running` state is only ever written to a gateway's own home, never a secondary served profile's.
Adds coverage for: cross-profile PID reuse (named + default), matching profile cmdline (`-p`, `--profile`, explicit HERMES_HOME=), the bare default gateway, and the unreadable-cmdline cross-platform fallback. Each new cross-profile assertion fails without the profile scope and passes with it. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-06-25 10:25:54 +10:00
helix4u	f1617a7ebb	fix(gateway): validate runtime status pid command line	2026-06-25 10:25:54 +10:00
AIalliAI	463bf2be25	fix(update): announce gateway drain waits so desktop updates don't look hung On macOS, the desktop updater's stage 1 (hermes update --gateway) ends by restarting running gateways. launchd_restart() SIGTERMs the gateway and silently waits up to agent.restart_drain_timeout (default 180s) for the drain; the manual profile-gateway loop waits its drain budget per gateway the same way. Neither path prints anything before the wait, so the desktop updater's live output goes dead for minutes right after '✓ Update complete!' — users read it as a hung update and force-kill their gateway processes to make it move (#44515). The systemd branch already announces its drain ('draining (up to Ns)...'); launchd and the manual loop did not. Print the stop/drain (with PID and budget) before the wait in both paths, mirroring the systemd branch, and assert the message in the existing launchd drain test. Fixes #44515	2026-06-24 19:12:44 -05:00
Brooklyn Nicholson	1fe013ee16	feat(pets): polish generate flow and reduce hatch CPU pressure Ship the final pet-generation UX polish (provider picker behavior, step-2 cancel flow, banner integration, and visual consistency) and make saturated-chroma background removal C-op driven so hatch processing no longer hammers the machine during long runs.	2026-06-24 19:08:06 -05:00
Ben	d335164833	fix(relay): authorize relay inbound via connector-enforced upstream authz A hosted instance fronted by the Team Gateway connector dropped EVERY relay message as "Unauthorized user" and the agent never replied — despite the message routing correctly through the connector to the instance. Root cause: gateway authorization (_is_user_authorized) had no notion of upstream-enforced authz. Platform.RELAY matches no {PLATFORM}_ALLOWED_USERS allowlist and isn't in the HA/WEBHOOK always-authorized set, so a relay user with no env allowlist configured hit the default-deny ("No user allowlists configured. All unauthorized users will be denied."). The message was received, then silently denied before reaching the agent. This is incorrect for relay: the connector authenticates the gateway's WS with a per-instance secret and performs owner-only author-binding resolution BEFORE delivering. A message only reaches this gateway because the connector resolved it to THIS instance's bound user (user_instance_binding), keyed on the author id the connector OBSERVED off the event — never a gateway claim. The authorization decision is already made by a trusted, authenticated upstream; there is no local RELAY_ALLOWED_USERS allowlist to consult, and default-denying for its absence is the bug. Fix: add a generic BasePlatformAdapter.authorization_is_upstream capability (default False) that the relay adapter overrides to True, plus a dedicated trusted branch in _is_user_authorized that honors it. This is delegation to a trusted upstream, NOT a fail-open: it fires only for an adapter that explicitly declares the flag; every direct network-exposed adapter leaves it False and the env-allowlist default-deny (SECURITY.md §2.6) is unchanged. Distinct from enforces_own_access_policy, which mirrors a LOCAL config-driven allowlist — this delegates to an authenticated upstream's decision. 
Tests: behavior contract that the base defaults False, the relay adapter declares True, a relay user (group + DM) is authorized with no env allowlist, and crucially a non-upstream adapter with no allowlist still default-denies (guards against the fix becoming a blanket fail-open). 6 new tests; relay + authz + config-policy suites green (134 + 90). Found via live staging debug of the Discord self-serve onboarding flow.	2026-06-25 10:06:21 +10:00
Ben	41b9b7e719	test(lazy-deps): make durable-target tests network-free CI test shard has no PyPI egress: the real 'pip install packaging==20.9' in test_core_package_is_not_shadowed failed (the pypi.org reachability probe passed but the actual install didn't), failing slice 2/6. - Prove the anti-shadow invariant deterministically: synthesize a fake 'packaging' in the durable target with a sentinel and assert the import still resolves to the core copy (TestCoreNeverShadowed). No network. - Cover the install wire offline: stub subprocess and assert --target + --constraint are built in durable mode and absent in venv-scoped mode (TestInstallArgConstruction). - Gate the genuine PyPI install behind HERMES_RUN_NETWORK_TESTS=1 (opt-in, skipped in CI) instead of a flaky reachability probe that doesn't predict install success.	2026-06-25 09:20:13 +10:00
Ben	cbd6ba1bdd	fix(docker): redirect lazy installs to a durable target so opt-in backends work in the immutable image (#51136) The published Docker image seals the agent venv (root-owned, read-only /opt/hermes) and sets HERMES_DISABLE_LAZY_INSTALLS=1 so a runtime install can't mutate and brick the core. But opt-in backends (Firecrawl web search, Exa, Feishu, ...) deliberately keep their SDKs in tools/lazy_deps.py and out of [all] (pyproject policy 2026-05-12: one quarantined release must not break every install). The two policies collided: the SDK isn't baked in AND can't lazy-install, so the default Firecrawl web_search/web_extract fail out of the box in Docker (#51136), as do Exa (#49445) and Feishu (#50205). Fix the whole class instead of baking in one backend: when HERMES_LAZY_INSTALL_TARGET is set, lazy installs are redirected to a writable dir on the durable /opt/data volume via `pip/uv install --target`, and that dir is APPENDED to the end of sys.path. Because the core venv always wins name collisions, a package installed this way can only ADD new modules — it can never shadow, downgrade, or break a module the core ships. The worst a bad/incompatible backend package can do is fail to import and report itself unavailable; the agent core stays healthy. That structural guarantee is what made it safe to seal the venv, and it is preserved here even with installs re-enabled. - tools/lazy_deps.py: durable-target mode — `--target` install + core-pinned `--constraint` file (shared deps resolve to core's versions, conflicts fail loudly at install time), append-only sys.path activation, ABI/Python-version stamp that wipes the store if an image rebuild bumps the interpreter, and a reworked gate so HERMES_DISABLE_LAZY_INSTALLS=1 redirects (rather than hard-blocks) when a target is set. security.allow_lazy_installs=false still disables installs in every mode.
- hermes_bootstrap.py: activate the durable target on sys.path at first import (before any backend imports its SDK) so packages installed on a previous run are importable on this run. - Dockerfile: set HERMES_LAZY_INSTALL_TARGET=/opt/data/lazy-packages. - docker/stage2-hook.sh: seed + chown the dir on the data volume. - tests: real-install E2E proving installs land in the target, import cleanly, don't leak into the sealed venv, and that a core package is never shadowed; ABI-stamp wipe/preserve; gate matrix; Dockerfile/stage2 contract test. Fixes #51136	2026-06-25 09:20:13 +10:00
liuhao1024	404b06ac4f	fix(gateway): honor server retry_after in _send_with_retry for Telegram flood control (#46762) When Telegram's sendRichMessage returns a FloodWait/RetryAfter error, _try_send_rich() now extracts the server-provided retry_after value and propagates it through SendResult.retry_after. The base _send_with_retry() layer honors this value instead of using its default short exponential backoff (~2s, ~4s), preventing the retry budget from being exhausted against a server that demands a 25-37s wait. Salvaged from #46774 by @liuhao1024. Telegram adapter path moved from gateway/platforms/telegram.py to plugins/platforms/telegram/adapter.py since the original PR. Closes #46762	2026-06-25 02:43:47 +05:30
kshitij	cedbb4cfa2	Merge pull request #52140 from NousResearch/salvage/47707-tool-schema-validation fix(agent): validate context/memory tool schemas before wrapping (#47707)	2026-06-25 02:36:19 +05:30
kshitij	085096fd59	Merge pull request #52135 from NousResearch/salvage/51826-tirith-mkdtemp-oerror fix(tools): catch mkdtemp OSError in tirith install (#51826)	2026-06-25 02:35:27 +05:30
Bartok9	710cd48fb1	fix(agent): validate context/memory tool schemas before wrapping Closes #47707 Context engines and memory providers expose tool schemas via get_tool_schemas(). agent_init.py wrapped each as {"type":"function","function":_schema} without validating that _schema carries a top-level name. A provider returning an entry already in OpenAI tool form ({"type":"function","function":{...}}) was then double-wrapped into a tool whose function has no name. Strict providers (e.g. DeepSeek) reject the entire request with HTTP 400 'tools[N].function: missing field name', so one malformed schema silently disables the whole toolset and breaks every turn. The schema was also never added to valid_tool_names, so even lenient providers could not call it. Add a shared normalize_tool_schema() helper that unwraps an already-wrapped entry and returns None for anything lacking a resolvable string name. Wire it into the agent_init context-engine loop and all three memory_manager surfaces (inject_memory_provider_tools, add_provider routing index, get_all_tool_schemas), so a single bad plugin schema is skipped with a warning instead of poisoning the request. Verification: 209 targeted agent/memory tests pass (incl. 9 new). New tests assert the unwrap + skip-nameless behavior and fail without the fix.	2026-06-25 02:17:29 +05:30
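The unwrap-and-skip behavior described above can be sketched as follows. The name `normalize_tool_schema` comes from the commit message, but its body here is a guess at the described behavior, and `wrap_tools` is a hypothetical caller added for illustration.

```python
from typing import Any, List, Optional


def normalize_tool_schema(schema: Any) -> Optional[dict]:
    """Return a bare function schema, or None if it has no usable name."""
    if not isinstance(schema, dict):
        return None
    # Unwrap an entry that is already in OpenAI tool form so it is not
    # wrapped a second time by the caller.
    if schema.get("type") == "function" and isinstance(schema.get("function"), dict):
        schema = schema["function"]
    name = schema.get("name")
    if not isinstance(name, str) or not name:
        return None
    return schema


def wrap_tools(raw_schemas: List[Any]) -> List[dict]:
    """Hypothetical caller: wrap valid schemas, skip malformed ones."""
    tools = []
    for raw in raw_schemas:
        schema = normalize_tool_schema(raw)
        if schema is None:
            continue  # one bad plugin schema must not poison the request
        tools.append({"type": "function", "function": schema})
    return tools
```

Skipping a nameless entry costs one tool; sending it costs the whole request on providers that validate strictly.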
liuhao1024	dbf0797335	fix(tools): catch mkdtemp OSError in tirith install to prevent unbounded retry and temp-dir leak (#51826) When tempfile.mkdtemp() raises OSError (e.g. disk full), the exception propagated past the try/finally block, so _mark_install_failed() was never called. The 24h backoff marker never engaged, causing unbounded retry on every command -- each attempt leaked a tirith-install-* temp directory, eventually filling /tmp completely. Fix: wrap mkdtemp in its own try/except OSError, returning (None, "no_space") so the caller's normal failure path (including _mark_install_failed) executes. Salvaged from #51831 by @liuhao1024. Closes #51826	2026-06-25 02:13:56 +05:30
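The shape of that fix, with the temp-dir creation guarded separately so the failure path always runs, might look like this. Both function names and the injectable `mkdtemp`/`mark_failed` parameters are hypothetical; only the `(None, "no_space")` return value and the `tirith-install-` prefix are taken from the commit message.

```python
import tempfile
from typing import Callable, Optional, Tuple


def make_install_dir(
    mkdtemp: Callable[..., str] = tempfile.mkdtemp,
) -> Tuple[Optional[str], Optional[str]]:
    """Create the temp dir, turning OSError into an error code."""
    try:
        return mkdtemp(prefix="tirith-install-"), None
    except OSError:
        # e.g. disk full: report it instead of letting it propagate
        return None, "no_space"


def install(
    mark_failed: Callable[[str], None],
    mkdtemp: Callable[..., str] = tempfile.mkdtemp,
) -> Optional[str]:
    """Hypothetical caller: the backoff marker is set on every failure."""
    workdir, err = make_install_dir(mkdtemp)
    if err is not None:
        mark_failed(err)  # engages the retry backoff
        return None
    return workdir
```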
liuhao1024	8d1f6debfd	fix(agent): deepcopy plugin context engine to prevent parent corruption on delegate_task (#42449) When delegate_task spawns a child agent with a different model/provider, the child's init_agent loaded the plugin context-engine GLOBAL singleton by reference (`_selected_engine = _candidate`) and then called update_model() on it with the child's (smaller) context_length. Because parent and child shared the same object, this mutated the PARENT's compressor: e.g. DeepSeek 1M ctx silently dropped to 204800 and the compression threshold from 200K to 40K after any delegate_task with a different model. Deepcopy the singleton before assigning/mutating it (agent_init.py) so the child gets its own instance and the parent's compressor is untouched. Salvaged from #42452 by @liuhao1024 (authorship preserved). Added a source-pin regression test that fails if the production line reverts to the bare alias, plus an end-to-end test driving get_plugin_context_engine() and a StubEngine.update_model(); the original PR's tests exercised copy.deepcopy in isolation but did not guard the actual agent_init code path. Closes #42449. Supersedes #42469, #42474 (same one-line fix, no test).	2026-06-25 02:13:26 +05:30
kshitij	77d2b50751	Merge pull request #52118 from NousResearch/salvage/36776-ddgs-timeout fix(ddgs): bound DuckDuckGo search with a wall-clock timeout (#36776)	2026-06-25 01:56:26 +05:30
kshitij	4d589b1e13	Merge pull request #52121 from NousResearch/salvage/43466-strip-cronjob-toolset fix(delegate): strip cronjob toolset from delegated children (#43466)	2026-06-25 01:54:37 +05:30
uzunkuyruk	489b85ee1e	fix(ddgs): bound DuckDuckGo search with a wall-clock timeout (#36776) A single ddgs (DuckDuckGo) search could hang indefinitely and block the shared agent loop, and therefore every platform (CLI, Telegram, Matrix...). The DDGS constructor's timeout only bounds individual HTTP requests; ddgs's multi-engine retry loop has no overall cap, so a slow/rate-limited response could spin for 20+ minutes with no output and no error. Run the synchronous ddgs call in a single-worker ThreadPoolExecutor and cap it with future.result(timeout=_SEARCH_TIMEOUT_SECS=30). On timeout, return a clear failure ("DuckDuckGo search timed out ... try a different provider") instead of blocking; the pool is shut down with cancel_futures so a hung worker is never awaited. Salvaged from #37422 by @uzunkuyruk (authorship preserved). Re-applied on current main (the PR's provider.py base had diverged). Added a load-bearing timeout regression test (the original PR only updated the fake's constructor and had no timeout-behavior test), mutation-verified to fail without the cap. Closes #36776.	2026-06-25 01:45:06 +05:30
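The wall-clock cap described above follows a standard `concurrent.futures` pattern, sketched here. The helper name `run_with_timeout` and the result dictionary shape are assumptions for illustration; the real provider code may differ.

```python
import concurrent.futures
from typing import Any, Callable, Dict


def run_with_timeout(fn: Callable[..., Any], timeout_secs: float, *args: Any) -> Dict[str, Any]:
    """Run a blocking call in one worker thread with an overall time cap."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_secs)}
    except concurrent.futures.TimeoutError:
        return {"ok": False, "error": "search timed out; try a different provider"}
    finally:
        # wait=False returns immediately instead of joining a hung worker;
        # cancel_futures (Python 3.9+) drops anything still queued.
        pool.shutdown(wait=False, cancel_futures=True)
```

Note that Python cannot kill a running thread: the timed-out worker keeps running in the background until its call returns. The cap only guarantees that the caller stops waiting.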
Riyasudeen Farook	1e4df599ec	fix(delegate): strip cronjob toolset from delegated children (#43466) _strip_blocked_tools used a hardcoded set missing 'cronjob'. Children on gateway platforms could inherit the cronjob toolset, scheduling persistent jobs that outlive the delegation despite DELEGATE_BLOCKED_TOOLS. Fix: derive the strip set from DELEGATE_BLOCKED_TOOLS at runtime so the two lists can never drift. Add 'cronjob' to DELEGATE_BLOCKED_TOOLS for documentation consistency. Two regression tests lock the invariant. Salvaged from #43687 by @riyas22. Adapted test to current main (no 'messaging' toolset exists -- send_message is intentionally not registered as an agent tool). Closes #43466	2026-06-25 01:37:25 +05:30
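Deriving the strip set from a single constant, as the commit above describes, can be sketched as below. The contents of `DELEGATE_BLOCKED_TOOLS` other than `"cronjob"` are placeholders, and `strip_blocked_tools` is a simplified stand-in for the real `_strip_blocked_tools`.

```python
from typing import Iterable, List

# Single source of truth. Entries other than "cronjob" are placeholders
# chosen for this example, not the project's actual list.
DELEGATE_BLOCKED_TOOLS = frozenset({"delegate_task", "cronjob"})


def strip_blocked_tools(toolsets: Iterable[str]) -> List[str]:
    """Drop every toolset named in DELEGATE_BLOCKED_TOOLS.

    The constant is read at call time, so there is no second hardcoded
    set that could drift out of sync with it.
    """
    return [name for name in toolsets if name not in DELEGATE_BLOCKED_TOOLS]
```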
kshitij	7a79a4447c	Merge pull request #52116 from NousResearch/fix/46994-session-load-bool-iterable fix(gateway): skip non-dict entries in session loading (#46994)	2026-06-25 01:33:36 +05:30
kshitij	8f0a12ce09	Merge pull request #52114 from NousResearch/salvage/27405-preflight-fewbig fix(agent): trigger preflight compression on few-but-huge sessions (#27405)	2026-06-25 01:27:07 +05:30
kshitijk4poor	9c994377ed	fix(gateway): skip non-dict entries in session loading (#46994) Corrupted sessions.json entries (e.g. a bare bool where a dict is expected) caused a TypeError on `'origin' in data`, which escaped the (ValueError, KeyError) inner except and aborted loading ALL remaining sessions, not just the corrupted one. Two-layer fix: - Loop level: isinstance(entry_data, dict) guard before from_dict - from_dict: isinstance(data['origin'], dict) instead of bare truthiness - Added TypeError to the inner except as defense-in-depth Closes #46994	2026-06-25 01:26:13 +05:30
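The layered guards listed above can be illustrated with a small loader. This is a sketch under assumptions: `load_sessions`, `session_from_dict`, and the `id`/`origin` record shape are invented for the example and do not mirror the gateway's real session classes.

```python
from typing import Any, Callable, Dict


def session_from_dict(data: dict) -> dict:
    """Hypothetical from_dict: only accept `origin` when it is a dict."""
    origin = data.get("origin")
    return {
        "id": data["id"],  # KeyError if the entry lacks an id
        "origin": origin if isinstance(origin, dict) else None,
    }


def load_sessions(
    raw: Dict[str, Any],
    from_dict: Callable[[dict], dict] = session_from_dict,
) -> Dict[str, dict]:
    """Load every valid entry; a corrupted one is skipped, not fatal."""
    sessions = {}
    for key, entry_data in raw.items():
        if not isinstance(entry_data, dict):
            continue  # loop-level guard: e.g. a bare bool
        try:
            sessions[key] = from_dict(entry_data)
        except (ValueError, KeyError, TypeError):
            continue  # defense in depth for malformed dicts
    return sessions
```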
texhy	aacc6bb0a8	fix(agent): trigger preflight compression on few-but-huge sessions (#27405) The preflight-compression gate only ran the (expensive) token estimate when the message COUNT exceeded protect_first_n + protect_last_n + 1. A session with a handful of very large messages never tripped the count condition, so compression was never attempted and the turn eventually hit a hard context-overflow error. Add _should_run_preflight_estimate() with OR semantics: run the estimate when either the message count exceeds the protected ranges (the historical gate) OR a cheap char-based estimate already crosses the configured threshold. The downstream estimate_request_tokens_rough() stays authoritative; this is only a hint that decides whether to pay for the full estimate. Salvaged from #27435 by @texhy (authorship preserved). Re-applied on current main: the preflight gate moved from conversation_loop.py to turn_context.py since the PR was opened, so the helper + gate are placed there; the test imports the real MINIMUM_CONTEXT_LENGTH instead of a hardcoded literal. Closes #27405.	2026-06-25 01:20:23 +05:30
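The OR-gate described above might be sketched like this. The function signature, the message shape (dicts with a `content` key), and the 4-characters-per-token heuristic are all assumptions made for the example; the commit message does not state how the cheap estimate is computed.

```python
from typing import Any, Dict, List


def should_run_preflight_estimate(
    messages: List[Dict[str, Any]],
    protect_first_n: int,
    protect_last_n: int,
    threshold_tokens: int,
    chars_per_token: int = 4,  # assumed rough heuristic
) -> bool:
    """Decide whether the expensive token estimate is worth running."""
    # Historical gate: enough messages to compress beyond the protected ranges.
    if len(messages) > protect_first_n + protect_last_n + 1:
        return True
    # New gate: few messages, but a cheap size check says they are huge.
    total_chars = sum(len(str(m.get("content", ""))) for m in messages)
    return total_chars // chars_per_token >= threshold_tokens
```

The cheap check only decides whether to pay for the full estimate; it may over-trigger, which is harmless because the authoritative estimate still makes the final call.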
kshitijk4poor	e0272cfef2	Revert "fix(compression): make minimum context floor configurable (#31600)" This reverts commit `cae1ee44a7`.	2026-06-25 01:04:44 +05:30
kshitij	59acaa972f	Merge pull request #52053 from NousResearch/salvage/31600-minimum-context-length-configurable fix(compression): make minimum context floor configurable (#31600)	2026-06-25 01:02:52 +05:30

6177 commits