hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
fayenix	d6c53dcdcb	fix(gateway): stop per-turn agent-cache eviction from model + message_id signature churn Two independent bugs evicted the cached gateway AIAgent on every turn, preventing the prompt cache from ever warming: 1. Model normalization mismatch: the post-run fallback-eviction check compared _agent.model (stripped in AIAgent.__init__) against the raw _resolve_gateway_model() config string. For vendor-prefixed config on native providers (e.g. 'deepseek/deepseek-v4-pro' vs 'deepseek-v4-pro') this was always unequal, so the agent was evicted after every successful run. Normalize _cfg_model the same way (skip aggregators). 2. Discord triggering message_id leaked into the cached system prompt via build_session_context_prompt()'s Discord IDs block. message_id changes every turn, so the agent-cache signature (computed from the ephemeral prompt) changed every Discord turn -> rebuild every message. The id is now injected per-turn into the user message (where per-turn content belongs and does not touch the cache signature); the cached IDs block carries a static pointer to it, preserving reply/react/pin via the discord tools. Adapted from #28846. Bug #1 fix is the contributor's; bug #2 reworked to be non-destructive (keeps the triggering-id capability instead of deleting it). Redundant auto-reset eviction (already on main via #9893/#48031) and the wrong-premise reset_context_note plumbing from the original PR were dropped. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-06-30 04:22:41 -07:00
Zane Ding	ac380050ea	fix(credential-pool): distinguish OpenRouter upstream 429s from account 429s OpenRouter returns 429 in two shapes: an account-level throttle on the user's key, and an upstream-provider throttle (DeepSeek/Anthropic/etc. rate-limiting OpenRouter's aggregate traffic). The classifier treated both identically and rotated/exhausted OPENROUTER_API_KEY on every 429 — burning the key for ~24min and silently disabling auxiliary features (compression, summarization, vision) on an upstream throttle where the key was healthy. Add a FailoverReason.upstream_rate_limit classified from OpenRouter's unambiguous wrapper message "Provider returned error" (the same signal the metadata-raw parser already trusts). Recovery skips credential rotation and defers to the fallback chain to switch models instead. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-06-30 03:57:14 -07:00
Teknium	abca77615a	chore(release): map Jeffgithub0029 author email for #28558 salvage	2026-06-30 03:51:08 -07:00
teknium1	c510f48680	chore(release): add jasonQin6 to AUTHOR_MAP for PR #15093 salvage	2026-06-30 03:42:25 -07:00
teknium1	2ae9e222f0	chore: AUTHOR_MAP entry for PR #27123 salvage (jimmyjohansson84)	2026-06-30 03:42:20 -07:00
teknium1	ea95fdd6d7	chore(release): add nikshepsvn to AUTHOR_MAP for PR #27426 salvage	2026-06-30 03:41:46 -07:00
Kong	6d6702ef50	fix(whatsapp-bridge): clarify FIFO outbound-id tracker semantics Rename LRU/refresh wording to match Set insertion-order eviction and reject non-positive maxSize at construction time.	2026-06-30 03:41:43 -07:00
Keira Voss	db52ad0f07	fix(whatsapp): gate owner-typed forwards on customer chatId allowlist The opt-in WHATSAPP_FORWARD_OWNER_MESSAGES path in bot mode marks fromMe inbound messages as fromOwner: true and forwards them to the Python adapter so plugins can detect "owner just typed in this chat" and trigger handover / sliding TTL flows. The previous implementation bypassed the allowlist for that path: the existing allowlist gate at the bottom of the dispatch loop is guarded by !msg.key.fromMe, so any chat the operator happened to reply to was forwarded — even ones not on WHATSAPP_ALLOWED_USERS. Concretely, on a deployment with a single allowlisted customer, an owner reply in any other chat would still wake Hermes and let the gateway-policy plugin's owner-implicit branch create a stray handover row keyed by the non-allowlisted chatId. Fix: extract the bot-mode fromMe gate into a small pure helper (`owner_message_gate.js`) that returns one of {drop_echo, drop_disabled, drop_allowlist, forward_owner, pass} so the new allowlist branch can be unit-tested without spinning up Baileys. The check runs against the customer chatId (not senderId, which is the owner's own number/LID and won't be on the allowlist by construction). matchesAllowedUser already short-circuits true on an empty allowlist or "*", so deployments without an allowlist see no behavior change. Self-chat mode is untouched — its existing isSelfChat pin is the correct guard there. Tests: scripts/whatsapp-bridge/owner_message_gate.test.mjs covers echo drop, disabled drop, the new allowlist drop, the forward path, the open-allowlist short-circuit, and the precedence of echo/disabled checks over the allowlist check (so logs stay honest).	2026-06-30 03:41:43 -07:00
keiravoss94	84f350efe0	feat(whatsapp): opt-in forwarding of owner-typed messages in bot mode In `WHATSAPP_MODE=bot` the bridge currently drops every fromMe inbound message — they are all assumed to be echoes of our own /send calls. That makes it impossible for plugins / agents to detect when a human owner has typed directly into a customer chat from the same WhatsApp Business account (e.g. via a linked phone or WhatsApp Web). This adds an opt-in `WHATSAPP_FORWARD_OWNER_MESSAGES` env var. When true, the bridge classifies fromMe inbound by looking up `key.id` in a bounded LRU of recently-sent message IDs (the existing 50-entry echo suppressor, bumped to 512 and extracted to a testable `outbound_ids.js` helper). Hits in the LRU are still dropped (echoes); misses are forwarded to the Python adapter with `fromOwner: true`. The Python adapter lifts that flag onto `MessageEvent.metadata["whatsapp_from_owner"]`. `metadata` is a new free-form dict on the event so future per-platform signals don't each need their own field. Default behaviour is unchanged: with the env flag unset, bot mode still drops every fromMe message exactly as before. Use cases for downstream consumers: - Implicit handover activation when the owner replies manually - Sliding TTL on owner activity (keep an active session alive while the owner is engaged) - Audit trails of owner interventions - Analytics on human-vs-bot reply ratios Heuristic limitation (documented in code): the LRU is in-memory. After a bridge restart, in-flight delivery receipts of pre-restart sends will briefly look like owner-typed for a few seconds until the set is repopulated. Persisting isn't worth the disk churn — downstream consumers should treat the flag as best-effort. Tests: - tests/gateway/test_whatsapp_from_owner.py (new): adapter sets the metadata flag iff the bridge payload has `fromOwner: true`; absent otherwise. - scripts/whatsapp-bridge/outbound_ids.test.mjs (new): LRU bounds, eviction order, falsy-id handling. Backwards compatibility: with the env flag unset, every code path is identical to before. No existing deployment is affected.	2026-06-30 03:41:43 -07:00
teknium1	3ecc58a8da	chore: map trevorgordon981 in AUTHOR_MAP for #50590 co-authorship	2026-06-30 03:27:41 -07:00
teknium1	bf2dc18f84	test+chore: real-path regression test for #15157 model_extra guard + AUTHOR_MAP Adds tests/agent/test_model_extra_type_guard.py exercising the real ChatCompletionsTransport.normalize_response path with string/list/None/dict model_extra; adds the AUTHOR_MAP entry for the contributor.	2026-06-30 03:27:12 -07:00
Tao Yan	b8ebe32866	fix(agent): flatten multi-part user_message in codex intermediate-ack detector Vision requests routed through the OpenAI-compat API server forward the raw multi-part content list ([{type:"text"}, {type:"image_url"}, ...]) straight through as user_message. The codex intermediate-ack detector flattened it with (user_message or "").strip(), so a truthy list survived and .strip() raised AttributeError — killing any Codex-routed vision turn that took the require_workspace path. Route through the existing _summarize_user_message_for_log helper (which already backs the logging/banner previews on main), and widen the param type hint from str to Any to match how the function is actually called. The two logging-preview sites the original PR also touched were fixed independently on main by the conversation-loop refactor. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 03:20:11 -07:00
Markus Phan	cd9f5cc671	fix(delegate): route subagent progress lines through _safe_print for ACP stdio delegate_task's per-task completion display emitted lines like "✓ [1/3] Research done (17.92s)" via a bare print(). Under ACP (and any headless JSON-RPC stdio host where AIAgent routes human output to stderr via a custom _print_fn), these landed on stdout and corrupted the protocol frame stream, surfacing as "Failed to parse JSON message: ✓ [3/3] …" in the ACP adapter. Add _emit_parent_console() which prefers parent_agent._safe_print (the same hook AIAgent uses for every other user-facing print) and falls back to print() only when no router is wired up or it raises. CLI behavior is unchanged. The PR's other fix (preset toolset expansion) is already covered on main by _expand_parent_toolsets(), so only the stdio-safe printing change is salvaged here.	2026-06-30 03:16:22 -07:00
teknium1	db880186f2	chore(release): add AUTHOR_MAP entries for #51841 and #54287 salvage	2026-06-30 03:11:13 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
MarioYounger	3b2bb30c5d	fix(security): harden heredoc approval, NFKC homograph fold, env-var filter Three independent security-scanner hardenings, re-homed onto the current shared threat-pattern architecture (tools/threat_patterns.py): - approval.py: add bash/sh/zsh/ksh heredoc to DANGEROUS_PATTERNS. The existing heredoc pattern only covered python/perl/ruby/node, so `bash <<'EOF' ... EOF` ran arbitrary shell — including exfil pipelines whose inner commands don't individually match a pattern — with no prompt. - threat_patterns.py: apply unicodedata.normalize("NFKC", ...) before pattern matching so full-width / compatibility homographs (e.g. `ｃａｔ ~/.hermes/.env`) are folded to ASCII and no longer bypass the keyword scanners. Invisible-char detection still runs on the raw content first (NFKC can strip those codepoints). - code_execution_tool.py: add CREDS/BEARER/APIKEY to _SECRET_SUBSTRINGS so vars like HERMES_LLM_CREDS, API_BEARER, MY_APIKEY are scrubbed from the sandbox env. PASS was intentionally dropped from the original proposal — it false-positives on BYPASS_CACHE / COMPASS_DIR / PASSENGER_HOST while PASSWORD/PASSWD already cover the credential cases. The original PR also proposed a 'synonym' injection pattern block (overlook/forget/set aside/bypass/discard + developer-mode); dropped here because it false-positives on ordinary AGENTS.md/SOUL.md prose ("don't forget to follow the rules", "run in developer mode"), exactly the bossy-English class threat_patterns.py is documented to avoid. Salvaged from #9028. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 02:59:46 -07:00
teknium1	b6045170bb	fix(discord): extend channel-name matching to slash-command auth; clamp flush deadline to disconnect budget Follow-up to the salvaged #8008 fix: - Sibling-site fix: _evaluate_slash_authorization gated DISCORD_ALLOWED_CHANNELS / DISCORD_IGNORED_CHANNELS on numeric IDs only, so name/#name config that now works for on_message still silently failed for slash-command interactions. Refactor the channel-key helper to _discord_channel_keys_from_channel(channel, parent) and reuse it at the interaction gate. Fail-closed on missing channel id is preserved. - The contributor's hardcoded 8s flush deadline could be hard-cancelled mid-flush: _teardown_adapter already wraps cancel_background_tasks() in the per-adapter disconnect budget (HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT, default 5s). The flush deadline now derives from that budget with headroom so it always completes inside it. - AUTHOR_MAP: map cypher@augmentl.com -> Nickperillo for CI. - Tests: slash-auth name/#name allow + name ignore matching.	2026-06-30 02:48:42 -07:00
ethernet	808ba82125	feat(ci): add CI timing report	2026-06-29 19:07:00 -07:00
teknium1	10c9eafde2	chore(attribution): map mango001@126.com -> max-chen for salvaged #51194	2026-06-29 02:35:57 -07:00
teknium1	2f5950a83a	chore(release): add telos-oc to AUTHOR_MAP for PR #14353 salvage	2026-06-29 02:25:48 -07:00
teknium1	0b733a8418	test(gateway): pin auto-reset cached-agent eviction (#10710) Relocate marco0158's eviction into the dedicated auto-reset cleanup block (single source of truth for dropping session-scoped transient state) and add an AST invariant pinning _evict_cached_agent into that block. Add AUTHOR_MAP entry for marco0158.	2026-06-28 22:35:17 -07:00
Teknium	9cf9d3a28f	chore(release): add AUTHOR_MAP entry for PR #53295 salvage	2026-06-28 20:46:44 -07:00
LIC99	dda3268d09	fix(approvals): warn and default to manual on unknown approvals.mode _normalize_approval_mode() previously accepted any string, so an unknown value like 'auto' fell through every downstream mode check (off/smart) and silently behaved like manual with no signal. Validate against the known modes (manual/smart/off), emit a warning for anything else, and default to manual to match the config default and the rest of the function. Bug 1 from the original PR (/approve & /deny bypassing the running-agent guard) already landed on main independently, so only the mode-validation fix is salvaged here. Fixes #4261 Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-28 19:04:18 -07:00
Teknium	11183e8332	fix(profiles): validate custom alias names to prevent path traversal `hermes profile alias <profile> --name <custom>` accepted arbitrary strings and used them verbatim as a filename under ~/.local/bin. Because normalize_profile_name only lowercases/strips (no regex gate), a value like `../../.bashrc` escaped the wrapper directory and clobbered arbitrary user-writable files. remove_wrapper_script had the same sink. Add validate_alias_name (reusing the profile-id regex, which forbids `/`, `.`, and `..`) and wire it into check_alias_collision, create_wrapper_script, remove_wrapper_script, and the CLI alias action so the rejection surfaces a clear "Invalid alias name" error instead of silently writing or unlinking outside the wrapper dir. Co-authored-by: Gutslabs <gutslabsxyz@gmail.com> Co-authored-by: Xowiek <xowiekk@gmail.com>	2026-06-28 18:53:33 -07:00
Teknium	9860d93f2a	fix(terminal): require approval for host-bound Docker commands (#54483) * fix(terminal): require approval for host-bound Docker commands The Docker terminal backend blanket-skips dangerous-command approval on the assumption that the container is isolated from the host. That holds only when nothing is bind-mounted in. Once a host path is exposed (via TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE or a host-path entry in TERMINAL_DOCKER_VOLUMES), a command like `rm -rf /workspace` reaches real host files but is still auto-approved. Detect host bind mounts and route those sessions through the normal approval flow. Isolated Docker keeps the fast path. The same gating is applied to the execute_code guard, which had the identical blanket skip. Co-authored-by: Hermes Agent <agent@nousresearch.com> * chore: add AUTHOR_MAP entry for PR #6436 salvage (Kolektori) * test: accept has_host_access kwarg in _check_all_guards mocks The host-bound Docker approval fix adds a has_host_access kwarg to the _check_all_guards wrapper. Six pre-existing tests monkeypatch it with a fixed (command, env_type) / (cmd, env) lambda signature, which now raises TypeError when terminal_tool passes the new kwarg. Widen those mock signatures to accept **kwargs. --------- Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com> Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-29 11:35:41 +10:00
HexLab98	95994bbc56	fix(windows): repair missing hermes.exe after pip install (#52931) On Windows, uv pip install -e . can register hermes.exe in package metadata while the launcher never lands on disk. Detect missing [project.scripts] shims and reinstall entry points under the existing quarantine path in hermes update and install.ps1.	2026-06-28 17:01:31 -05:00
teknium1	c648ecdca5	fix(telegram): reject unauthorized users before event construction (#40863) Removed/unauthorized Telegram users could inject prompt content before the per-user auth gate fired. The adapter ran `_should_process_message`, `_build_message_event`, and text/photo batching — and dispatched to the runner — before `_is_user_authorized()` (gateway/authz_mixin.py) rejected the sender. Unmentioned group chatter from a removed user was also persisted into the session transcript via `_observe_unmentioned_group_message`, leaking into the agent's observed context independent of dispatch. Add `_is_user_authorized_from_message()` as an intake prefilter that runs in `_handle_text_message`, `_handle_command`, `_handle_location_message`, and `_handle_media_message` BEFORE batching, event construction, and the unmentioned-group observe branch. It reuses the runner's `_is_user_authorized()` with a correctly-shaped SessionSource (group vs forum vs dm, real chat_id for TELEGRAM_GROUP_ALLOWED_* allowlists), falls back to env allowlists, and only rejects when an allowlist actually exists — unknown DMs with no allowlist still reach the pairing flow. Channel posts authorize via `sender_chat` identity when `from_user` is absent. Co-authored-by: liuhao1024 <sunsky.lau@gmail.com> Co-authored-by: Carlos Manuel Cejas <carlosmcejas@gmail.com>	2026-06-28 14:25:15 -07:00
teknium1	f25f235722	chore: map salvaged PR #49845 author email for AUTHOR_MAP	2026-06-28 04:47:39 -07:00
Brad Hallett	376d021fee	fix(desktop): force app exit after update/uninstall handoff on macOS On macOS app.quit() closes windows but window-all-closed deliberately keeps the process alive (Dock convention). Every detached hand-off (update swap, relaunch, Windows bootstrap recovery, uninstall cleanup) waits for the desktop PID to exit before replacing/removing the bundle — so the process never dying means the script spins its full PID-wait and the user sees a blank app, or an uninstall that appears to do nothing. Add a module-level isQuittingForHandoff flag, set before every hand-off app.quit(); window-all-closed then quits on all platforms when it's set. Covers all five hand-off sites including the Linux relaunch path.	2026-06-28 04:30:14 -07:00
kshitijk4poor	546193aa6d	fix(install): time-box desktop + node-deps installs so a stalled download self-heals (#39219) The desktop install step ran npm ci / npm run pack with no wall-clock cap, and the sibling browser-tools / TUI / agent-browser dependency installs had the same gap. The Electron binary (~150MB) is fetched from GitHub during the pack; on a throttled or region-blocked link that download can stall rather than fail — npm never errors and never exits, so the installer sits on "Build desktop app" (step 9/11) indefinitely with only harmless 'npm warn deprecated' lines visible. The existing self-heal escalation (cache purge -> dist restore -> npmmirror fallback) only fires when pack returns non-zero, so a stall bypassed it. - run_with_timeout (generalized from run_browser_install_with_timeout): GNU timeout --foreground -k 10 (Ctrl+C-aware, #35166) / gtimeout for external commands, else a pure-shell process-group watchdog so stock macOS (neither binary present) is protected. Shell functions (_desktop_pack) always take the pure-shell path — the timeout binary can't exec a function. Integer-normalized budget + a boundary recheck so a command finishing in the final poll second isn't mislabeled 124. The internal wait is guarded so set -e can't abort mid-function before the real exit code is computed. - Wrap the desktop npm ci/install (sharing ONE budget via a computed deadline so a stall can't cost 2x DESKTOP_BUILD_TIMEOUT) + all three _desktop_pack attempts (DESKTOP_BUILD_TIMEOUT, default 900s), and the browser-tools / TUI / agent-browser registry installs (NODE_DEPS_TIMEOUT, default 600s). A stall now converts to a bounded non-zero exit that feeds the existing mirror self-heal instead of hanging the whole install.	2026-06-28 02:47:47 -07:00
Teknium	b508d4296e	test(ci): raise per-file timeout 140s → 300s to stop false timeouts (#54143) * test(ci): raise per-file timeout 140s to 300s to stop false timeouts The per-file parallel runner caps each test-file subprocess at a flat wall-clock budget. Combined with per-test subprocess isolation (a fresh Python process per test), a large-collection file pays N x (interpreter startup + import) of overhead before any test logic runs. That overhead dilates under load on shared CI runners, so a file that finishes in ~100s on a quiet box can blow the old 140s cap purely from scheduling jitter, surfacing as a false 'no tests ran' timeout (rc=124) with zero actual test failures. Raise the default to 300s (5 min). The Docker build matrix jobs already take 7-10 min, so this headroom costs nothing on total CI wall time while still bounding a genuinely hung file. * docs: add infographic for CI per-file timeout bump	2026-06-28 02:41:07 -07:00
teknium1	fe89ce0694	chore(release): map Cossackx in AUTHOR_MAP for #52528 salvage	2026-06-28 02:40:37 -07:00
teknium1	7c9cdad9fd	test(cli): cover Windows self-lock recovery guard + cmd-quote its hint Add two tests for the self-lock guard in _recover_from_interrupted_install: one asserting it clears the marker and skips install when hermes.exe is a process ancestor (breaking the #52378/#45542 loop), one asserting it falls through to a normal recovery install when the shim is NOT an ancestor. The guard's manual-recovery hint runs only inside the Windows branch, so quote it for cmd.exe (cd /d, double-quoted paths) — the cross-platform fallback hint at the end of the function is left POSIX-correct. Map Icather in scripts/release.py AUTHOR_MAP for the salvage.	2026-06-28 02:40:37 -07:00
teknium1	dddaea0c98	chore(release): map yungchentang author for #53622 salvage	2026-06-28 02:34:17 -07:00
teknium1	86ec979f66	chore(release): map PRATHAMESH75 author for #37550 salvage	2026-06-28 02:05:50 -07:00
Coy Geek	d7a1052424	fix(env-passthrough): fail closed when provider blocklist import fails When tools.environments.local can't be imported (partial install, import-time error), _is_hermes_provider_credential() returned False — fail-open. A skill could then register a Hermes provider credential (ANTHROPIC_API_KEY, etc.) as env passthrough; _scrub_child_env lets passthrough vars bypass the secret-substring net (rule 1), so the operator's real key would land in the execute_code child. Reopens the GHSA-rhgp-j443-p4rf bypass. Fail closed instead: on import failure, treat the name as a protected provider credential and refuse passthrough. Regression test exercises the full register -> scrub path under a simulated import failure. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-28 02:05:43 -07:00
teknium1	58c36b1798	fix(api-server): widen error redaction to cron-endpoint + SSE sites Follow-up to the salvaged #37733 fix. The contributor centralized redaction at _openai_error and the chat/responses failure paths, which covers the OpenAI-compatible envelopes transitively. Two sibling classes crossed the same authenticated HTTP boundary unredacted: - 8x cron-management endpoints returning {"error": str(e)} on 500 - the session-chat SSE error event ({"message": str(exc)}) Route both through the same _redact_api_error_text(force=True) helper. Add AUTHOR_MAP entry for coygeek and a TestRedactApiErrorText guard covering mask/force/limit/passthrough behavior.	2026-06-28 02:05:38 -07:00
teknium1	c0b4a3438a	fix(install): scope Playwright override to too-new apt releases + keep step interruptible Follow-up on #54032 for #35166: - Gate the PLAYWRIGHT_HOST_PLATFORM_OVERRIDE retry on the host being an apt release newer than Playwright recognizes (Ubuntu >24.04 / Debian >13) via playwright_host_unrecognized(), instead of retrying on ANY install failure. A network/disk/permission failure on a supported host now surfaces unchanged rather than getting a mismatched-glibc build forced onto it. - detect_os() now captures DISTRO_VERSION from os-release. - Fold in the interruptibility fix (was PR #35304, self-closed): wrap the download in 'timeout --foreground -k 10' (probed, with plain-timeout fallback) so a terminal Ctrl+C reaches the child and a wedged download is force-killed after the deadline. - Add behavioral tests that source the helpers and assert the retry fires only on Ubuntu 26.04 / Debian 14, not on supported hosts, non-apt distros, native-success, operator-pinned override, or unsupported arch.	2026-06-28 02:05:18 -07:00
kshitijk4poor	a28fe788a6	fix(install): retry Playwright install with platform override on unrecognized host (#35166) On apt releases newer than the bundled Playwright recognizes (Ubuntu 26.04, Debian 14, and future distros), 'npx playwright install --with-deps chromium' hangs uninterruptibly at 'Installing Playwright Chromium with system dependencies' because Playwright's resolver maps the host to a platform with no download build (#35166). Wrap every installer Playwright call in run_playwright_install(), which tries the native install first and, only if it fails or times out, retries once with PLAYWRIGHT_HOST_PLATFORM_OVERRIDE pinned to the newest known build (ubuntu24.04-<arch>). This is the escape hatch Playwright's maintainers bless for unrecognized platforms (microsoft/playwright#33434). Try-native-first (not a hardcoded distro/version table) is deliberate: - Self-correcting — when Playwright already supports the host (e.g. Ubuntu 26.04 on Playwright >=1.61) the first attempt succeeds and the override is never applied, so we never force a mismatched-glibc build onto a release Playwright handles correctly (microsoft/playwright#35114). - Zero-maintenance — new distro releases work the moment Playwright adds them. - Covers Debian 14+ and future releases, not just Ubuntu 26.04. An operator-set PLAYWRIGHT_HOST_PLATFORM_OVERRIDE is always respected (applied to the first attempt; retry skipped). Non-x64/arm64 arches have no fallback build and skip the retry. Refs #35166	2026-06-28 02:05:18 -07:00
teknium1	578e3989d4	fix(agent): route content-filter stream stalls to fallback chain (#32421) When a provider's output-layer safety filter (MiniMax "output new_sensitive (1027)", Azure content_filter, etc.) kills a streaming response after deltas were already sent, interruptible_streaming_api_call swallows the raw error into a finish_reason=length partial-stream stub. The conversation loop then burned 3 continuation retries against the SAME primary — re-hitting the content-deterministic filter every time — and gave up with "Response remained truncated after 3 continuation attempts", never consulting fallback_providers. Builds on @595650661's classifier change (cherry-picked) so error_classifier recognizes the filter; then: - chat_completion_helpers: run the swallowed error through error_classifier at the stub-creation point and stamp _content_filter_terminated on the stub (single source of truth — no parallel pattern list). - conversation_loop: read the tag and activate the fallback chain BEFORE burning any continuation retries; roll partial content back to the last clean turn and re-issue against the new provider (restart_with_rebuilt_messages). Plain network stalls are unaffected (only content_policy_blocked is tagged). Credits #32479 (@sweetcornna) and #33845 (@Tranquil-Flow) which fixed the same issue via the stub-tag and loop-escalation approaches respectively. Live E2E confirmed: before, _try_activate_fallback called 0x; after, fallback fires on the first stub and the fallback provider completes the turn.	2026-06-28 01:15:21 -07:00
teknium1	cb9f855c2b	test(whatsapp-bridge): drop structural send-queue integration test The .integration.test.mjs greps bridge.js source text for the queue wiring — a change-detector that breaks on any benign refactor of the same code. The behavioral unit test (bridge.sendqueue.test.mjs) already covers FIFO ordering, error isolation, timeout propagation, and single-consumer concurrency, which is the contract that matters.	2026-06-28 01:10:14 -07:00
Tranquil-Flow	c393a8e55f	fix(whatsapp-bridge): serialize sendMessage to prevent cross-chat contamination (#33360) Concurrent sock.sendMessage() calls on a single Baileys socket can cause the WhatsApp protocol-level routing to misdeliver messages — responses intended for one chat appear in another. Add a promise-based send queue that serialises all sendMessage() calls across concurrent HTTP /send, /edit, and /send-media handlers so only one send is in-flight at a time. Includes unit tests for queue ordering, error isolation, timeout propagation, and single-consumer concurrency semantics, plus an integration check that the queue is wired into sendWithTimeout.	2026-06-28 01:10:14 -07:00
teknium1	2e1b48ed31	chore: map kurlyk local email → skabartem for PR #32867 salvage	2026-06-28 01:08:04 -07:00
Teknium	2523917680	fix(tests): bare pytest flags pass through run_tests.sh without a '--' separator (#54008) The parallel runner only forwarded pytest args after a literal '--', so a bare 'scripts/run_tests.sh tests/foo.py -q' (or -v/-x/-k/--tb=long) errored out with 'unrecognized arguments'. This contradicted the docstring's promise that common pytest flags pass through, and forced a retry on every run that used pytest muscle-memory. Now any token starting with '-' that isn't one of the runner's own options (-j/--jobs, --paths, --slice, --file-timeout, --generate-slices, --files, --include-integration) is routed to each per-file pytest invocation automatically. Value-taking flags given space-separated (-k expr, -m mark, -p plugin, -o name=val, etc.) keep their value instead of having it stolen by positional-path discovery. The explicit '--' separator still works and stacks with bare flags. - scripts/run_tests_parallel.py: argv splitter routes bare unknown flags to pytest; value-flag lookahead; updated docstring. - scripts/run_tests.sh: usage comment reflects bare-flag passthrough. - tests/test_run_tests_parallel.py: 4 behavior-contract tests (bare -q runs, -k keeps its value/filters, '--' still works, positional path stays a root).	2026-06-27 22:43:26 -07:00
Rafael Millan	54ea059919	fix: fall back to no-sandbox for desktop launch on restricted Linux hosts	2026-06-27 22:16:20 -07:00
teknium1	97640fd9ad	fix(desktop): reserve WCO width on plain Linux + author map The plain-Linux overlay re-enable (#53185) left nativeOverlayWidth() at 0 for plain Linux, so the native min/max/close buttons painted on top of the app's right-edge titlebar tools. Reserve the fallback width everywhere the WCO overlay is painted (Windows, WSLg, plain Linux); macOS still reserves 0 since it uses traffic lights.	2026-06-27 22:05:33 -07:00
teknium1	c72d68715f	chore(release): map salvaged contributor emails for #49129 and #51488	2026-06-27 21:23:25 -07:00
teknium1	2e7e600eaa	chore(release): map HexLab98 author for PR #53863 salvage	2026-06-27 21:22:49 -07:00
Jack Maloney	f0de4c6a47	fix(pool): re-select from credential pool on primary runtime restore _restore_primary_runtime restored the construction-time api_key snapshot and never consulted the credential pool. After the pool rotated away from a revoked/exhausted entry mid-session, every new turn restored the dead key, re-failed instantly, burned the remaining entries, and fell through to cross-provider fallback. After restoring the snapshot, re-select the pool's current best entry and swap the live credential in via _swap_credential (which already rebuilds the OpenAI/Anthropic client, reapplies base-url headers, and carries the #33163 base_url / OAuth-detection fixes). Falls back to the snapshot key when the pool is absent, empty, or the entry has no usable key. Salvaged from #25206 onto current main: the original targeted the pre-refactor monolithic method in run_agent.py; the logic now lives in agent/agent_runtime_helpers.py and is collapsed onto _swap_credential instead of re-inlining the client rebuild. Fixes #25205	2026-06-27 20:04:45 -07:00
teknium1	926a1b915d	fix(tools): suppress transient check_fn flakes so subagents keep file/terminal tools A flaky external probe in a tool's check_fn (e.g. check_terminal_requirements running `docker version` with a 5s timeout, momentarily timing out under load) would return False for a single get_tool_definitions() call. Because file tools delegate their check_fn to the terminal check, that one flake silently stripped read_file/write_file/patch/search_files AND terminal from whatever agent was being constructed at that instant — most visibly a delegate_task subagent, which then reported "Tool read_file does not exist". This explains both the intermittent (~80% success) user-session failures and the deterministic cron failures in #21658 / #5304. The existing _check_fn TTL cache made this worse: it cached the transient False for the full 30s window, poisoning every subagent spawned in that span. Fix: remember the last time each check_fn returned True; when a fresh probe fails within a short grace window of that success, treat it as a flake — serve the last-good True and do NOT cache the failure (so the next call re-probes). A failure with no recent success, or past the grace window, is honored normally so a backend that genuinely went down stops advertising its tools. Probe failures now log at WARNING regardless of quiet mode, making the previously-silent tool loss diagnosable in subagent (quiet) sessions. Co-authored-by: Stuart Horner <5261694+djstunami@users.noreply.github.com>	2026-06-27 19:29:00 -07:00
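
The flake-suppression behavior described in commit 926a1b915d can be sketched roughly as follows. This is an illustrative reconstruction from the commit message only, not the code in the repository: the class name `GracefulCheckCache`, its method names, the injectable clock, and the 60-second grace value are all assumptions made for the example. Only the 30-second TTL and the WARNING-level logging come from the message.

```python
import logging
import time
from typing import Callable, Dict, Tuple

logger = logging.getLogger("check_fn_sketch")


class GracefulCheckCache:
    """Hypothetical TTL cache for tool check functions with a flake grace window."""

    def __init__(
        self,
        ttl: float = 30.0,    # cache window named in the commit message
        grace: float = 60.0,  # ASSUMED value; the message only says "short grace window"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl = ttl
        self.grace = grace
        self.clock = clock
        self._cached: Dict[str, Tuple[bool, float]] = {}  # name -> (result, cached_at)
        self._last_ok: Dict[str, float] = {}              # name -> time of last True

    def check(self, name: str, probe: Callable[[], bool]) -> bool:
        now = self.clock()
        hit = self._cached.get(name)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]  # still inside the TTL window, skip the probe

        try:
            ok = bool(probe())
        except Exception:
            ok = False

        if ok:
            self._last_ok[name] = now
            self._cached[name] = (True, now)
            return True

        # Probe failed: always log at WARNING so the loss is diagnosable.
        logger.warning("check_fn probe failed for %s", name)
        last_ok = self._last_ok.get(name)
        if last_ok is not None and now - last_ok <= self.grace:
            # Recent success: treat as a flake, serve the last-good True,
            # and do not cache the failure so the next call re-probes.
            self._cached.pop(name, None)
            return True

        # No recent success (or past the grace window): honor the failure.
        self._cached[name] = (False, now)
        return False
```

With a fake clock, a single failed probe shortly after a success still reports the tool as available and triggers a fresh probe on the next call, while a failure with no recent success is cached as False for the TTL window.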

1344 commits