hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-27 11:22:03 +00:00

Author	SHA1	Message	Date
xxxigm	4aeaba6922	test(desktop): cover undefined/null attachment holes in ref helpers Regression for the refText crash: attachmentDisplayText and optimisticAttachmentRef must return null (not throw) when handed an undefined/null attachment hole, so the submit path can't reproduce "Cannot read properties of undefined (reading 'refText')".	2026-06-24 18:22:01 -07:00
xxxigm	7e2db0a140	fix(desktop): stop refText crash on undefined composer attachment holes A session switch or draft restore can leave undefined/null holes in the composer attachments array. AttachmentList was guarded against this in #49624, but the sibling submit path was not: submitPromptText maps the same array through attachmentDisplayText/optimisticAttachmentRef and buildContextText (a.kind / a.label / a.refText), so a hole threw "Cannot read properties of undefined (reading 'refText')" — an uncaught renderer error that blanks the chat pane and shows "Desktop app link offline". Close the whole bug class: - attachmentDisplayText / optimisticAttachmentRef no-op on a falsy attachment (shared chokepoint, also protects thread.tsx drop handler). - submitPromptText filters falsy entries from the source array, and buildContextText filters its (possibly post-sync) input before reading fields.	2026-06-24 18:22:01 -07:00
helix4u	17beb55e3c	fix(telegram): gate rich draft previews separately	2026-06-24 18:11:14 -07:00
Gille	284be6cc24	Merge pull request #52210 from helix4u/fix/desktop-update-progress-visibility fix(desktop): surface update progress lines	2026-06-24 19:45:05 -05:00
brooklyn!	7157b213f5	Merge pull request #47959 from NousResearch/bb/pets-gen Pet generation: frame-perfect hatch flow, backend picker, CPU-safe chroma, and CI-hardening	2026-06-24 19:41:34 -05:00
brooklyn!	153ad79524	Merge pull request #52201 from NousResearch/bb/desktop-shallow-update-count fix(desktop): don't report a bogus update count for a shallow checkout	2026-06-24 19:34:02 -05:00
Brooklyn Nicholson	a05a9b0e07	test(delegate): harden heartbeat in-tool stale timing assertion Stabilize the long-running-tool heartbeat test by patching stale thresholds inside the test and asserting the heartbeat exceeds the idle ceiling, which preserves intent while removing scheduler-sensitive assumptions that flake in CI.	2026-06-24 19:33:40 -05:00
Brooklyn Nicholson	2ea94c6c45	fix(pets): make inline generate cancel discard draft flow Wire the sparkle generate button's cancel action to the same discard/reset path as step-2 cancel so abort semantics are consistent and always return to step 1 while retaining the prompt input.	2026-06-24 19:33:33 -05:00
brooklyn!	d635a6d507	Merge pull request #52208 from NousResearch/bb/desktop-update-steps fix(desktop): stop the update overlay looking frozen while it works	2026-06-24 19:29:02 -05:00
brooklyn!	42e14d1089	Merge pull request #52205 from NousResearch/bb/desktop-restart-profile fix(desktop): route gateway restart / status / update to the active profile	2026-06-24 19:28:53 -05:00
brooklyn!	b649cdee4a	Merge pull request #52203 from NousResearch/bb/update-drain-announce fix(update): announce gateway drain waits so desktop updates don't look hung	2026-06-24 19:28:44 -05:00
Ben	538c419d2e	fix(gateway): scope dashboard liveness fallback to the profile PR #52151 hardened the runtime-status liveness check to trust a readable live process command line over stale gateway_state.json argv, so a recycled PID now owned by an s6 supervisor no longer counts as a running gateway. That fix is correct but incomplete for the reported symptom: the web dashboard showed a named profile's gateway green while `hermes -p <name> gateway status` showed it stopped. Two further issues: 1. Cross-profile PID reuse. In per-profile Docker supervision, one profile's stale `gateway_state.json` can record a PID the OS later recycled onto a DIFFERENT profile's live gateway. That PID's command line still `looks_like_gateway`, so the dead profile was reported running. The recorded argv has its `-p <name>` selector stripped in-process by `_apply_profile_override`, so it cannot disambiguate; the live `/proc` cmdline still carries it. `get_runtime_status_running_pid` now accepts an `expected_home` and validates the live command line belongs to THAT profile (mirroring `hermes_cli.gateway._matches_current_profile`, the logic the CLI scan path already uses — which is why the CLI was correct). `_check_gateway_running` passes the enumerated profile dir. 2. The existing regression test `test_gateway_running_check_falls_back_to_runtime_state` used the live pytest PID with a gateway-shaped record; once the live cmdline became authoritative it no longer looked like a gateway. Updated to mock the live cmdline to the real separate-process scenario it describes. The active-profile path (`get_running_pid`) is intentionally left unscoped: it is lock-verified and any live gateway cmdline is acceptable there. Multiplex mode is unaffected — `running` state is only ever written to a gateway's own home, never a secondary served profile's.
Adds coverage for: cross-profile PID reuse (named + default), matching profile cmdline (`-p`, `--profile`, explicit HERMES_HOME=), the bare default gateway, and the unreadable-cmdline cross-platform fallback. Each new cross-profile assertion fails without the profile scope and passes with it. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-06-25 10:25:54 +10:00
helix4u	f1617a7ebb	fix(gateway): validate runtime status pid command line	2026-06-25 10:25:54 +10:00
Brooklyn Nicholson	592c462e3c	refine(pets): preserve user-requested tone in generation prompts Remove cute/chibi-biased wording from base draft variations and explicitly preserve the requested mood across base and row prompts so scary, eerie, or other non-cute concepts are honored while keeping sprite constraints.	2026-06-24 19:22:00 -05:00
Brooklyn Nicholson	9a4600c5fb	fix(desktop): stop the update overlay looking frozen while it works Two ways the update overlay read as stuck even though the update was streaming progress underneath: - In-app (macOS/Linux) UpdatesOverlay: runStreamedUpdate forwards every stdout line as a progress event with percent: null, and ingestProgress wrote that straight through — clobbering the milestone percents (10/60) so the bar fell back to indeterminate on every log line. Keep the last percent when a line carries null. - Staged install/update overlay: the bar is completedCount / totalCount, which counts only finished stages, so a long first stage pinned it at "0 of 2" / 0% until the stage ended. Count the running stage as half a unit so the bar advances during the stage (the per-stage spinner already shows which step is live). Both are display-only; no stage/event semantics change. (The Windows hermes-setup Tauri progress UI in apps/bootstrap-installer has the same counter-only-on-completion logic — parity follow-up.)	2026-06-24 19:20:38 -05:00
Brooklyn Nicholson	65b13e9dbc	fix(desktop): route gateway restart / status / update to the active profile restartGateway, getActionStatus, getStatus, updateHermes and checkHermesUpdate all hit window.hermesDesktop.api WITHOUT spreading profileScoped() — unlike their siblings (getModelInfo, setModelAssignment, grantComputerUsePermissions). _apiProfile tracks the active gateway profile, and the Electron proxy uses request.profile to pick which pooled / remote backend serves the call. So for a multi-profile or global-remote user, the System-panel "Restart gateway" (and its status poll, plus Update / status reads) targeted the primary/default backend instead of the one they're on: the restart hit the wrong gateway and the poll never saw the action → it looked like restart silently failed. Single-profile users are unaffected (profileScoped() returns {} when no profile is active). Add ...profileScoped() to the five backend-action helpers so they follow the active profile like the rest of the API surface.	2026-06-24 19:16:26 -05:00
AIalliAI	463bf2be25	fix(update): announce gateway drain waits so desktop updates don't look hung On macOS, the desktop updater's stage 1 (hermes update --gateway) ends by restarting running gateways. launchd_restart() SIGTERMs the gateway and silently waits up to agent.restart_drain_timeout (default 180s) for the drain; the manual profile-gateway loop waits its drain budget per gateway the same way. Neither path prints anything before the wait, so the desktop updater's live output goes dead for minutes right after '✓ Update complete!' — users read it as a hung update and force-kill their gateway processes to make it move (#44515). The systemd branch already announces its drain ('draining (up to Ns)...'); launchd and the manual loop did not. Print the stop/drain (with PID and budget) before the wait in both paths, mirroring the systemd branch, and assert the message in the existing launchd drain test. Fixes #44515	2026-06-24 19:12:44 -05:00
briandevans	cb6edbf448	fix(desktop): skip the rev-list count when it is discarded anyway checkUpdates() ran `git rev-list HEAD..origin/<branch> --count` unconditionally in the parallel probe batch, even on the shallow + no-merge-base path where resolveBehindCount() ignores the result and falls back to a SHA compare. In the #51922 failure mode that count walks the entire remote ancestry (thousands of commits), so the work was pure latency on every update check for the exact case the fix targets. Split the probes into two phases: resolve --is-shallow-repository and merge-base first, then run rev-list --count only when shouldCountCommits says the number is meaningful (full clone, or shallow-with-merge-base). The shallow/no-merge-base SHA fallback is preserved unchanged.	2026-06-24 19:12:09 -05:00
briandevans	a6485bddb8	fix(desktop): don't report a bogus update count for a shallow checkout The desktop installer clones with `--depth 1`, so a public install's local history often shares no merge-base with the freshly fetched origin tip. In that state `git rev-list HEAD..origin/<branch> --count` enumerates the entire remote ancestry and returns a meaningless huge number, surfacing as e.g. "v0.17.0 (+12104)" in the update indicator (#51922). The official-SSH branch of checkUpdates() already sidesteps this by reporting a binary up-to-date check (`behind: currentSha === targetSha ? 0 : 1`), and hermes_cli/banner.py guards the identical class for the CLI banner. The passive desktop count path was the one place the shallow guard was missing. Detect shallow + no-merge-base up front and fall back to the same SHA-based binary check; full clones (developers / Docker dev images) keep the exact count path unchanged. The resolution logic lives in a pure update-count.cjs helper so it is unit-testable without booting Electron.	2026-06-24 19:12:09 -05:00
Brooklyn Nicholson	1fe013ee16	feat(pets): polish generate flow and reduce hatch CPU pressure Ship the final pet-generation UX polish (provider picker behavior, step-2 cancel flow, banner integration, and visual consistency) and make saturated-chroma background removal C-op driven so hatch processing no longer hammers the machine during long runs.	2026-06-24 19:08:06 -05:00
Ben	d335164833	fix(relay): authorize relay inbound via connector-enforced upstream authz A hosted instance fronted by the Team Gateway connector dropped EVERY relay message as "Unauthorized user" and the agent never replied — despite the message routing correctly through the connector to the instance. Root cause: gateway authorization (_is_user_authorized) had no notion of upstream-enforced authz. Platform.RELAY matches no {PLATFORM}_ALLOWED_USERS allowlist and isn't in the HA/WEBHOOK always-authorized set, so a relay user with no env allowlist configured hit the default-deny ("No user allowlists configured. All unauthorized users will be denied."). The message was received, then silently denied before reaching the agent. This is incorrect for relay: the connector authenticates the gateway's WS with a per-instance secret and performs owner-only author-binding resolution BEFORE delivering. A message only reaches this gateway because the connector resolved it to THIS instance's bound user (user_instance_binding), keyed on the author id the connector OBSERVED off the event — never a gateway claim. The authorization decision is already made by a trusted, authenticated upstream; there is no local RELAY_ALLOWED_USERS allowlist to consult, and default-denying for its absence is the bug. Fix: add a generic BasePlatformAdapter.authorization_is_upstream capability (default False) that the relay adapter overrides to True, plus a dedicated trusted branch in _is_user_authorized that honors it. This is delegation to a trusted upstream, NOT a fail-open: it fires only for an adapter that explicitly declares the flag; every direct network-exposed adapter leaves it False and the env-allowlist default-deny (SECURITY.md §2.6) is unchanged. Distinct from enforces_own_access_policy, which mirrors a LOCAL config-driven allowlist — this delegates to an authenticated upstream's decision. 
Tests: behavior contract that the base defaults False, the relay adapter declares True, a relay user (group + DM) is authorized with no env allowlist, and crucially a non-upstream adapter with no allowlist still default-denies (guards against the fix becoming a blanket fail-open). 6 new tests; relay + authz + config-policy suites green (134 + 90). Found via live staging debug of the Discord self-serve onboarding flow.	2026-06-25 10:06:21 +10:00
brooklyn!	a378b1e980	Merge pull request #52192 from NousResearch/bb/session-loop-guard fix(desktop): let the session watchdog heal a stuck "looping" turn	2026-06-24 19:03:43 -05:00
brooklyn!	4127332f15	Merge pull request #52189 from NousResearch/bb/desktop-offline fix(desktop): give the gateway reconnect loop an escape hatch	2026-06-24 19:03:34 -05:00
brooklyn!	70650e82a3	Merge pull request #52187 from NousResearch/bb/desktop-voice fix(desktop): wire Ctrl+B voice, declutter voice settings, stop endless TTS hang	2026-06-24 19:03:25 -05:00
brooklyn!	9a94865552	Merge pull request #52183 from NousResearch/bb/desktop-agents-status fix(desktop): make Agents indicator match the Spawn-tree panel	2026-06-24 19:03:11 -05:00
Brooklyn Nicholson	93192059c9	fix(desktop): let the session watchdog heal a stuck "looping" turn The 8-minute stream-silence watchdog only removed a stuck session from $workingSessionIds (the sidebar dot). The composer's busy state lives in the session-state cache and was never cleared, so a hung or looping turn that never delivered its terminal event — including an old session re-opened while the backend still reports it "running" — stayed wedged on "Thinking" / Stop indefinitely. Have the watchdog notify subscribers when it force-clears a session, and subscribe from the session-state cache to also drop that session's busy/awaiting/needsInput flags. updateSessionState re-syncs $busy when the healed session is the one on screen, so the composer recovers instead of spinning forever. Frontend-only safety net; doesn't touch the turn lifecycle. The backend root (a stale in-memory session["running"] surviving a dead turn thread and re-arming busy on every resume) is a separate follow-up.	2026-06-24 18:36:17 -05:00
Brooklyn Nicholson	2a75c4a8cb	fix(desktop): give the gateway reconnect loop an escape hatch When a remote gateway dropped after a healthy boot (internet loss, sleep/wake, VPS restart), use-gateway-boot retried with backoff forever and never surfaced an error. The renderer sat behind the fullscreen CONNECTING overlay with gatewayState non-open and boot.error null — no way to reach Settings, sign in again, or switch to a local gateway. To the user the app was simply broken on connection loss. Raise a recoverable boot error once the reconnect loop crosses RECONNECT_ESCALATE_AFTER (6 attempts, ≈45s), so the BootFailureOverlay (Retry / Sign in / Use local gateway) replaces the dead-end CONNECTING screen. The loop keeps retrying underneath; the next successful reconnect (or a manual/wake-driven one) clears the error and dismisses the overlay. This implements the contract already specified — but never wired up — in use-gateway-boot.test.tsx (desktop vitest isn't in CI, so the failing "FIX:" specs went unnoticed). All 4 hook tests + the 3 connecting-overlay tests pass.	2026-06-24 18:32:29 -05:00
Brooklyn Nicholson	8d1706ae5c	fix(desktop): wire Ctrl+B voice, declutter voice settings, stop endless TTS hang Three voice-mode papercuts in the desktop app: 1. Ctrl+B did nothing. The docs + `voice.record_key` advertise Ctrl+B to talk, but the desktop never bound it (only ⌘B = sidebar existed). Add a rebindable `composer.voice` action that toggles the voice conversation, defaulting to ⌃B on macOS (distinct from ⌘B; off-macOS `ctrl` folds to the sidebar chord, so it ships unbound there to avoid stealing it). The global keybind reaches the composer through a new focus-bus event. 2. The Voice settings page rendered every provider's options at once (~30 fields). Filter to the selected TTS/STT provider's sub-fields; STT provider fields hide when STT is off. Picking "edge" now shows just the Edge voice, making it obvious voice chat also needs STT enabled. 3. Voice mode could hang "speaking" forever. Free Edge TTS sometimes returns audio that never fires `playing`/`ended`/`error`, so the playback promise never settled. Add a stall watchdog (rearmed on each progress tick, so long speech is never cut off) that rejects a stuck stream, letting the loop recover with a clear error.	2026-06-24 18:26:14 -05:00
Ben	41b9b7e719	test(lazy-deps): make durable-target tests network-free CI test shard has no PyPI egress: the real 'pip install packaging==20.9' in test_core_package_is_not_shadowed failed (the pypi.org reachability probe passed but the actual install didn't), failing slice 2/6. - Prove the anti-shadow invariant deterministically: synthesize a fake 'packaging' in the durable target with a sentinel and assert the import still resolves to the core copy (TestCoreNeverShadowed). No network. - Cover the install wire offline: stub subprocess and assert --target + --constraint are built in durable mode and absent in venv-scoped mode (TestInstallArgConstruction). - Gate the genuine PyPI install behind HERMES_RUN_NETWORK_TESTS=1 (opt-in, skipped in CI) instead of a flaky reachability probe that doesn't predict install success.	2026-06-25 09:20:13 +10:00
Ben	cbd6ba1bdd	fix(docker): redirect lazy installs to a durable target so opt-in backends work in the immutable image (#51136) The published Docker image seals the agent venv (root-owned, read-only /opt/hermes) and sets HERMES_DISABLE_LAZY_INSTALLS=1 so a runtime install can't mutate and brick the core. But opt-in backends (Firecrawl web search, Exa, Feishu, ...) deliberately keep their SDKs in tools/lazy_deps.py and out of [all] (pyproject policy 2026-05-12: one quarantined release must not break every install). The two policies collided: the SDK isn't baked in AND can't lazy-install, so the default Firecrawl web_search/web_extract fail out of the box in Docker (#51136), as do Exa (#49445) and Feishu (#50205). Fix the whole class instead of baking in one backend: when HERMES_LAZY_INSTALL_TARGET is set, lazy installs are redirected to a writable dir on the durable /opt/data volume via `pip/uv install --target`, and that dir is APPENDED to the end of sys.path. Because the core venv always wins name collisions, a package installed this way can only ADD new modules — it can never shadow, downgrade, or break a module the core ships. The worst a bad/incompatible backend package can do is fail to import and report itself unavailable; the agent core stays healthy. That structural guarantee is what made it safe to seal the venv, and it is preserved here even with installs re-enabled. - tools/lazy_deps.py: durable-target mode — `--target` install + core-pinned `--constraint` file (shared deps resolve to core's versions, conflicts fail loudly at install time), append-only sys.path activation, ABI/Python-version stamp that wipes the store if an image rebuild bumps the interpreter, and a reworked gate so HERMES_DISABLE_LAZY_INSTALLS=1 redirects (rather than hard-blocks) when a target is set. security.allow_lazy_installs=false still disables installs in every mode.
- hermes_bootstrap.py: activate the durable target on sys.path at first import (before any backend imports its SDK) so packages installed on a previous run are importable on this run. - Dockerfile: set HERMES_LAZY_INSTALL_TARGET=/opt/data/lazy-packages. - docker/stage2-hook.sh: seed + chown the dir on the data volume. - tests: real-install E2E proving installs land in the target, import cleanly, don't leak into the sealed venv, and that a core package is never shadowed; ABI-stamp wipe/preserve; gate matrix; Dockerfile/stage2 contract test. Fixes #51136	2026-06-25 09:20:13 +10:00
Brooklyn Nicholson	a268dfff0a	fix(desktop): make Agents indicator match the Spawn-tree panel The status-bar "Agents" item conflated three unrelated signals — running subagents (aggregated across all sessions), in-flight session turns, and failed background system actions (gateway restarts, toolset installs, computer-use grants via $desktopActionTasks/preview restart) — yet clicking it opens AgentsView, which renders only subagents. A failed gateway restart therefore showed "Agents (1 Failed)" over an empty "No live subagents" tree. AgentsView also filtered to the active session, so a subagent running in a background session showed "Agents N running" with nothing in the tree (the desync reported in #49808). Unify the scope both surfaces speak: - AgentsView aggregates subagents across every session (salvages #49819). - The indicator's running/failed counts come from subagents only (aggregated), never background system actions — those keep their own surfaces in settings / command center. So "Agents (N …)" now always points at a populated Spawn tree. Supersedes #49819. Fixes #49808.	2026-06-24 18:16:14 -05:00
liuhao1024	404b06ac4f	fix(gateway): honor server retry_after in _send_with_retry for Telegram flood control (#46762) When Telegram's sendRichMessage returns a FloodWait/RetryAfter error, _try_send_rich() now extracts the server-provided retry_after value and propagates it through SendResult.retry_after. The base _send_with_retry() layer honors this value instead of using its default short exponential backoff (~2s, ~4s), preventing the retry budget from being exhausted against a server that demands a 25-37s wait. Salvaged from #46774 by @liuhao1024. Telegram adapter path moved from gateway/platforms/telegram.py to plugins/platforms/telegram/adapter.py since the original PR. Closes #46762	2026-06-25 02:43:47 +05:30
kshitij	cedbb4cfa2	Merge pull request #52140 from NousResearch/salvage/47707-tool-schema-validation fix(agent): validate context/memory tool schemas before wrapping (#47707)	2026-06-25 02:36:19 +05:30
kshitij	085096fd59	Merge pull request #52135 from NousResearch/salvage/51826-tirith-mkdtemp-oerror fix(tools): catch mkdtemp OSError in tirith install (#51826)	2026-06-25 02:35:27 +05:30
kshitij	7d2c1f3f84	Merge pull request #52134 from NousResearch/salvage/42449-deepcopy-ctx-engine fix(agent): deepcopy plugin context engine to prevent parent corruption on delegate_task (#42449)	2026-06-25 02:28:37 +05:30
Bartok9	710cd48fb1	fix(agent): validate context/memory tool schemas before wrapping Closes #47707 Context engines and memory providers expose tool schemas via get_tool_schemas(). agent_init.py wrapped each as {"type":"function","function":_schema} without validating that _schema carries a top-level name. A provider returning an entry already in OpenAI tool form ({"type":"function","function":{...}}) was then double-wrapped into a tool whose function has no name. Strict providers (e.g. DeepSeek) reject the entire request with HTTP 400 'tools[N].function: missing field name', so one malformed schema silently disables the whole toolset and breaks every turn. The schema was also never added to valid_tool_names, so even lenient providers could not call it. Add a shared normalize_tool_schema() helper that unwraps an already-wrapped entry and returns None for anything lacking a resolvable string name. Wire it into the agent_init context-engine loop and all three memory_manager surfaces (inject_memory_provider_tools, add_provider routing index, get_all_tool_schemas), so a single bad plugin schema is skipped with a warning instead of poisoning the request. Verification: 209 targeted agent/memory tests pass (incl. 9 new). New tests assert the unwrap + skip-nameless behavior and fail without the fix.	2026-06-25 02:17:29 +05:30
liuhao1024	dbf0797335	fix(tools): catch mkdtemp OSError in tirith install to prevent unbounded retry and temp-dir leak (#51826) When tempfile.mkdtemp() raises OSError (e.g. disk full), the exception propagated past the try/finally block, so _mark_install_failed() was never called. The 24h backoff marker never engaged, causing unbounded retry on every command -- each attempt leaked a tirith-install-* temp directory, eventually filling /tmp completely. Fix: wrap mkdtemp in its own try/except OSError, returning (None, "no_space") so the caller's normal failure path (including _mark_install_failed) executes. Salvaged from #51831 by @liuhao1024. Closes #51826	2026-06-25 02:13:56 +05:30
liuhao1024	8d1f6debfd	fix(agent): deepcopy plugin context engine to prevent parent corruption on delegate_task (#42449) When delegate_task spawns a child agent with a different model/provider, the child's init_agent loaded the plugin context-engine GLOBAL singleton by reference (`_selected_engine = _candidate`) and then called update_model() on it with the child's (smaller) context_length. Because parent and child shared the same object, this mutated the PARENT's compressor: e.g. DeepSeek 1M ctx silently dropped to 204800 and the compression threshold from 200K to 40K after any delegate_task with a different model. Deepcopy the singleton before assigning/mutating it (agent_init.py) so the child gets its own instance and the parent's compressor is untouched. Salvaged from #42452 by @liuhao1024 (authorship preserved). Added a source-pin regression test that fails if the production line reverts to the bare alias, plus an end-to-end test driving get_plugin_context_engine() and a StubEngine.update_model() — the original PR's tests exercised copy.deepcopy in isolation but did not guard the actual agent_init code path. Closes #42449. Supersedes #42469, #42474 (same one-line fix, no test).	2026-06-25 02:13:26 +05:30
kshitij	77d2b50751	Merge pull request #52118 from NousResearch/salvage/36776-ddgs-timeout fix(ddgs): bound DuckDuckGo search with a wall-clock timeout (#36776)	2026-06-25 01:56:26 +05:30
kshitij	4d589b1e13	Merge pull request #52121 from NousResearch/salvage/43466-strip-cronjob-toolset fix(delegate): strip cronjob toolset from delegated children (#43466)	2026-06-25 01:54:37 +05:30
uzunkuyruk	489b85ee1e	fix(ddgs): bound DuckDuckGo search with a wall-clock timeout (#36776) A single ddgs (DuckDuckGo) search could hang indefinitely and block the shared agent loop — and therefore every platform (CLI, Telegram, Matrix...). The DDGS constructor's timeout only bounds individual HTTP requests; ddgs's multi-engine retry loop has no overall cap, so a slow/rate-limited response could spin for 20+ minutes with no output and no error. Run the synchronous ddgs call in a single-worker ThreadPoolExecutor and cap it with future.result(timeout=_SEARCH_TIMEOUT_SECS=30). On timeout, return a clear failure ("DuckDuckGo search timed out ... try a different provider") instead of blocking; the pool is shut down with cancel_futures so a hung worker is never awaited. Salvaged from #37422 by @uzunkuyruk (authorship preserved). Re-applied on current main (the PR's provider.py base had diverged). Added a load-bearing timeout regression test (the original PR only updated the fake's constructor and had no timeout-behavior test) — mutation-verified to fail without the cap. Closes #36776.	2026-06-25 01:45:06 +05:30
kshitijk4poor	e25b56fc64	chore: AUTHOR_MAP entry for riyas22 (PR #43687 salvage)	2026-06-25 01:39:11 +05:30
Riyasudeen Farook	1e4df599ec	fix(delegate): strip cronjob toolset from delegated children (#43466) _strip_blocked_tools used a hardcoded set missing 'cronjob'. Children on gateway platforms could inherit the cronjob toolset, scheduling persistent jobs that outlive the delegation despite DELEGATE_BLOCKED_TOOLS. Fix: derive the strip set from DELEGATE_BLOCKED_TOOLS at runtime so the two lists can never drift. Add 'cronjob' to DELEGATE_BLOCKED_TOOLS for documentation consistency. Two regression tests lock the invariant. Salvaged from #43687 by @riyas22. Adapted test to current main (no 'messaging' toolset exists -- send_message is intentionally not registered as an agent tool). Closes #43466	2026-06-25 01:37:25 +05:30
kshitij	7a79a4447c	Merge pull request #52116 from NousResearch/fix/46994-session-load-bool-iterable fix(gateway): skip non-dict entries in session loading (#46994)	2026-06-25 01:33:36 +05:30
kshitij	8f0a12ce09	Merge pull request #52114 from NousResearch/salvage/27405-preflight-fewbig fix(agent): trigger preflight compression on few-but-huge sessions (#27405)	2026-06-25 01:27:07 +05:30
kshitijk4poor	9c994377ed	fix(gateway): skip non-dict entries in session loading (#46994) Corrupted sessions.json entries (e.g. a bare bool where a dict is expected) caused a TypeError on `'origin' in data`, which escaped the (ValueError, KeyError) inner except and aborted loading ALL remaining sessions, not just the corrupted one. Two-layer fix: - Loop level: isinstance(entry_data, dict) guard before from_dict - from_dict: isinstance(data['origin'], dict) instead of bare truthiness - Added TypeError to the inner except as defense-in-depth Closes #46994	2026-06-25 01:26:13 +05:30
texhy	aacc6bb0a8	fix(agent): trigger preflight compression on few-but-huge sessions (#27405) The preflight-compression gate only ran the (expensive) token estimate when the message COUNT exceeded protect_first_n + protect_last_n + 1. A session with a handful of very large messages never tripped the count condition, so compression was never attempted and the turn eventually hit a hard context-overflow error. Add _should_run_preflight_estimate() with OR semantics: run the estimate when either the message count exceeds the protected ranges (the historical gate) OR a cheap char-based estimate already crosses the configured threshold. The downstream estimate_request_tokens_rough() stays authoritative — this is only a hint that decides whether to pay for the full estimate. Salvaged from #27435 by @texhy (authorship preserved). Re-applied on current main: the preflight gate moved from conversation_loop.py to turn_context.py since the PR was opened, so the helper + gate are placed there; the test imports the real MINIMUM_CONTEXT_LENGTH instead of a hardcoded literal. Closes #27405.	2026-06-25 01:20:23 +05:30
kshitij	ed1fdb5b61	Merge pull request #52112 from NousResearch/revert/52053-minimum-context-floor revert(plugins): revert minimum context floor configurable (#52053)	2026-06-25 01:11:53 +05:30
kshitijk4poor	e0272cfef2	Revert "fix(compression): make minimum context floor configurable (#31600)" This reverts commit `cae1ee44a7`.	2026-06-25 01:04:44 +05:30
kshitij	59acaa972f	Merge pull request #52053 from NousResearch/salvage/31600-minimum-context-length-configurable fix(compression): make minimum context floor configurable (#31600)	2026-06-25 01:02:52 +05:30
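
The wall-clock cap described in commit 489b85ee1e (run the blocking search in a single-worker pool, bound it with `future.result(timeout=...)`, and shut the pool down without waiting) might look roughly like the sketch below. Only `_SEARCH_TIMEOUT_SECS`, the 30-second value, and the use of `cancel_futures` come from the commit message. The function name `run_with_wall_clock_cap`, the result dictionary shape, and the stand-in search callables are assumptions made for illustration and are not the project's code.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_SEARCH_TIMEOUT_SECS = 30  # value quoted in the commit message


def run_with_wall_clock_cap(search_fn, query, timeout=_SEARCH_TIMEOUT_SECS):
    """Run a blocking search call, giving up after `timeout` seconds overall.

    `search_fn` is a hypothetical stand-in for the synchronous ddgs call.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(search_fn, query)
    try:
        return {"ok": True, "results": future.result(timeout=timeout)}
    except FutureTimeout:
        # Report a clear failure instead of blocking the caller.
        return {
            "ok": False,
            "error": "DuckDuckGo search timed out; try a different provider",
        }
    finally:
        # wait=False returns immediately even if the worker is still stuck;
        # cancel_futures=True (Python 3.9+) drops anything still queued.
        pool.shutdown(wait=False, cancel_futures=True)


if __name__ == "__main__":
    def slow_search(query):
        time.sleep(0.5)  # simulates a retry loop that overruns the cap
        return []

    print(run_with_wall_clock_cap(slow_search, "hermes agent", timeout=0.1))
```

One caveat of this approach: a timed-out worker thread is abandoned, not killed, so it keeps running in the background until its call returns.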

12819 commits
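
The schema normalization described in commit 710cd48fb1 (unwrap an entry that is already in OpenAI tool form, and skip anything without a resolvable string name) can be sketched as follows. The helper name `normalize_tool_schema` is taken from the commit message, but its body here is a guess at the described behaviour, and `wrap_tools` is a hypothetical caller added only to show the single-wrap step.

```python
from typing import Any, Optional


def normalize_tool_schema(schema: Any) -> Optional[dict]:
    """Return a bare function schema with a string name, or None to skip it.

    Assumed behaviour, reconstructed from the commit message only.
    """
    if not isinstance(schema, dict):
        return None
    # Unwrap an entry that a provider already returned in OpenAI tool form.
    if schema.get("type") == "function" and isinstance(schema.get("function"), dict):
        schema = schema["function"]
    name = schema.get("name")
    if not isinstance(name, str) or not name:
        return None
    return schema


def wrap_tools(schemas):
    """Wrap each valid schema exactly once; silently skip malformed ones."""
    tools = []
    for raw in schemas:
        fn = normalize_tool_schema(raw)
        if fn is None:
            continue  # one bad plugin schema no longer poisons the request
        tools.append({"type": "function", "function": fn})
    return tools
```

With this shape, a provider that returns a pre-wrapped entry yields the same tool as one that returns a bare schema, and a nameless entry is dropped before it can reach a strict provider.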