Discord threads where the bot has already participated bypass mention gating by default, but the backfill check was still tied to the mention-needed condition. That meant follow-up thread messages could trigger a response without providing recent thread history to the session.
Run history backfill for thread messages whenever backfill is enabled, while keeping DMs skipped and channel mention backfill behavior unchanged. Add a regression test for a known thread follow-up without an explicit mention.
Fixes#33666
Co-authored-by: Cursor <cursoragent@cursor.com>
The reaper hoist in the prior commit adds an extra
`asyncio.to_thread(_kb.reap_worker_zombies)` call at the top of every
dispatcher tick (before the per-board work). The existing
`test_gateway_dispatcher_disables_corrupt_board_without_traceback`
mocks `to_thread` with a 4-call cap that previously matched 2 full
dispatch ticks. With the reaper hoist each tick is now 3
`to_thread` calls instead of 2, so the cap is raised to 6 to preserve
the same number of dispatch ticks. The `connect == 5` assertion is
unchanged.
Also add the contributor's `steveonjava@gmail.com` to AUTHOR_MAP
alongside `steve@steveonjava.com` so contributor-audit passes for
both identities used across the salvaged commits.
Salvage follow-up for PR #32857.
Reaper now runs at the top of every dispatcher tick regardless of per-board connect() failures. Previously the reaper sat inside dispatch_once after the kanban_db.connect() call — any EIO during connect would skip reaping for that tick, accumulating zombie workers and stale claim_lock rows.
Also: reap_worker_zombies now returns the list of reaped pids (the dispatcher logs them) and a test indentation fix.
Squashes three sibling commits from PR #32301 into one logical change for batch review.
Pre-requisite for PR #32020 salvage (auth: global auth.json fallback
in _load_provider_state). Contributor_audit strict mode fails if any
commit author email on main is unmapped.
Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com>
Three additions on top of @Nami4D's salvage:
1. Gate the preflight slash-enum strip on the model name pattern
(grok-* / x-ai/grok-*). The original PR stripped slash-containing
enum values from every codex_responses request, but native Codex
(OpenAI) and GitHub Models DO accept slash enums — stripping them
there would silently degrade tool-schema constraints. xAI is the
only Responses-API surface that rejects the shape.
2. Resolve the merge conflict in agent/transports/codex.py by
preserving both the timeout-forwarding block that landed on main
between the PR's branch point and now AND the new service_tier
strip. Behavioural intent of both is preserved.
3. Six new tests in tests/agent/transports/test_codex_transport.py
covering:
- TestCodexTransportXaiServiceTierStrip (3 tests): xAI strips
service_tier from request_overrides; non-xAI codex_responses
and GitHub Models both KEEP service_tier (regression guards
so the strip stays xAI-only).
- TestPreflightSlashEnumStrip (3 tests): Grok and aggregator-
prefixed Grok model names both trigger the safety-net strip;
non-Grok models preserve slash enums as a regression guard
against the strip becoming too broad.
51/51 in tests/agent/transports/test_codex_transport.py.
Co-authored-by: Nami4D <hello@nami4d.tech>
Alibaba's latest flagship Qwen model is released but not yet present in the
DashScope (alibaba) or Alibaba Coding Plan curated catalogs. Add it so it
shows up in the /model picker and setup wizard for those providers.
OpenCode Go routing for qwen3.7-max already landed via #32780 (commit 2fc77c53f).
OpenRouter + Nous catalog entries already landed via #32809 (commit ccd3d04fc).
This salvage picks up the remaining alibaba / alibaba-coding-plan entries from
#32806 — the AI Gateway entry is dropped because Vercel AI Gateway was removed
in #33067.
#33016 added GET /v1/skills + /v1/toolsets on the API server; the
capability flag introduced in this branch was placeholder-False. Flip
to True so capability probers see the truth.
* fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144)
When an OpenAI-compatible Responses API surface accepts an initial
request but later rejects the replayed `codex_reasoning_items`
encrypted blob with HTTP 400 `invalid_encrypted_content`, the
session previously got stuck retrying the same poisoned payload.
Recovery: classify the error as a dedicated FailoverReason, and on the
first hit disable encrypted reasoning replay for the rest of the
session, strip cached items from message history, and retry once.
Changes:
* error_classifier: add FailoverReason.invalid_encrypted_content
branch in _classify_400 (before context_overflow so the messages
that mention 'encrypted content … could not be verified' don't trip
context heuristics), in _classify_by_error_code, and extend
_extract_error_code to peek inside wrapped JSON in error.message and
ignore the bare '400' as a code.
* agent_init: initialize `_codex_reasoning_replay_enabled = True` on
every agent.
* run_agent: add AIAgent._disable_codex_reasoning_replay() helper
that flips the flag and pops cached items.
* codex_responses_adapter: thread a `replay_encrypted_reasoning`
kwarg through _chat_messages_to_responses_input so that when the
flag is False we don't replay codex_reasoning_items.
* transports/codex.py: read `replay_encrypted_reasoning` from params,
thread it into the adapter, and gate the
`include=['reasoning.encrypted_content']` request hint on it.
* chat_completion_helpers: pass the agent's replay flag through to
the transport.
* conversation_loop: in the retry loop, add an
invalid_encrypted_content recovery branch that fires once per
session, only when api_mode == codex_responses, only when replay is
still enabled, and only when at least one assistant message in
history actually carries cached reasoning items (otherwise the 400
has nothing to do with our cache and the normal retry path handles
it).
Tests:
* test_error_classifier: new wrapped-JSON _extract_error_code case;
new TestClassifyApiError cases proving the 400 is retryable with
no fallback, that the broad message match doesn't catch a generic
'parsed' message, and that the error code match is
case-insensitive.
* test_run_agent_codex_responses: end-to-end test of the recovery
branch firing once and disabling replay, plus a sibling test that
proves the branch does *not* fire (and the flag stays True) when
history has no cached reasoning items.
Salvages PR #10144 onto the post-refactor module layout
(error_classifier / codex_responses_adapter / transports/codex /
conversation_loop / agent_init) since the original diff was written
against the pre-refactor monolithic run_agent.py.
* chore(release): map victorGPT in AUTHOR_MAP for #10144 salvage
---------
Co-authored-by: victorGPT <wuxuebin1993@gmail.com>
When the gateway processes /reload-mcp, it reconnects MCP servers and
updates the global _servers registry, but cached AIAgent instances in
_agent_cache keep the tools list they were built with. The user had to
also run /new (discarding conversation history) before the agent could
see the new tools — even though /reload-mcp had succeeded.
This patch refreshes each cached agent's .tools and .valid_tool_names
in _execute_mcp_reload after discovery returns, so existing sessions
pick up new MCP tools on their next turn. The slash-confirm gate in
_handle_reload_mcp_command already obtains user consent for the
implied prompt-cache invalidation before this code runs.
Mirrors the equivalent behaviour the CLI already does in cli.py
_reload_mcp. Per-agent enabled_toolsets and disabled_toolsets are
preserved so an agent that was scoped to a subset of toolsets does
not silently gain disabled tools after the reload.
Original diagnosis + initial implementation in #23812 from @fujinice.
The auto-reload watcher half of that PR is intentionally dropped —
users want /reload-mcp to remain explicit.
Co-authored-by: fujinice <45688690+fujinice@users.noreply.github.com>
Pre-salvage prep for the must-have security cluster (#32103, #32155).
#32103 author commit uses dearmayo@localhost; PR opener is ffr31mr —
same pattern as the existing holynn-q localhost mapping.
Adds an optional autonomous-ai-agents skill that delegates coding tasks
to the OpenHands CLI (https://github.com/All-Hands-AI/OpenHands). Sits
alongside claude-code / codex / opencode and is the model-agnostic
option in that family — any LiteLLM-supported provider works.
This is a ground-truth rewrite of #19325 by @xzessmedia (Tim Koepsel).
The original PR's SKILL.md was drafted by the OpenHands agent itself and
hallucinated several flags that don't exist in the real CLI (\`--model\`,
\`--max-iterations\`, \`--workspace\`, \`--sandbox docker\`), pointed at
the wrong PyPI package (\`openhands-ai\`, which is the legacy V0 SDK),
and claimed native Windows support that the upstream docs explicitly
disclaim. Rather than cherry-pick and rewrite half the lines under
contributor authorship, the SKILL.md was rebuilt against a verified
install (\`uv tool install openhands --python 3.12\`) and a real
end-to-end \`--headless --json\` run against openrouter/openai/gpt-4o-mini.
Authorship credited via the \`author:\` frontmatter field and an
AUTHOR_MAP entry in scripts/release.py.
Changes:
- optional-skills/autonomous-ai-agents/openhands/SKILL.md (new)
- website/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-openhands.md (auto-gen)
- website/docs/reference/optional-skills-catalog.md (one new row)
- website/sidebars.ts (one new entry under Optional → Autonomous AI Agents)
- scripts/release.py (AUTHOR_MAP entry for xzessmedia)
Pitfalls documented in the SKILL came from running the tool, not from
the upstream README: LiteLLM bedrock/sagemaker stderr noise on every
invocation, banner spam (\`OPENHANDS_SUPPRESS_BANNER=1\` required),
\`--override-with-envs\` mandatory or the CLI ignores LLM_* env vars
entirely, the dashed-vs-undashed Conversation ID footgun for \`--resume\`,
LiteLLM model-slug double-prefix when going through OpenRouter.
The ChatGPT Codex backend (chatgpt.com/backend-api/codex) has historically
silently dropped certain model requests: the connection is accepted but no
stream events are emitted and no error is raised. PR #31967 lowered the
implicit stale-call default from 300s to 90s so fallbacks kick in faster,
but users still see an opaque "No response from provider for 90s
(non-streaming, ...)" message that gives no path forward.
This patch adds a narrow heuristic — gpt-5.5 family on the Codex backend
via codex_responses api_mode — that substitutes the generic timeout
message with actionable text naming the gpt-5.4-codex workaround and
pointing at #21444 for symptom history.
Changes:
- run_agent.py — new ``AIAgent._codex_silent_hang_hint(model=...)`` method.
Returns ``None`` for any request that does not match all three guards
(codex_responses api_mode, openai-codex provider or chatgpt.com Codex
base URL, gpt-5.5-family model name with word-boundary regex anchoring
to avoid false-positives on e.g. ``gpt-5.50``).
- agent/chat_completion_helpers.py — the non-stream stale-call site
consults the hint via ``getattr(...)`` so the call site stays robust
if the helper is ever removed or stubbed in tests. Hint is appended to
both the ``_emit_status`` warning and the ``TimeoutError`` message so
the user sees it in their terminal AND it lands in any retry-loop
diagnostics.
- tests/run_agent/test_codex_silent_hang_hint.py — 10 regression tests
covering positive cases (bare gpt-5.5, vendor-prefixed openai/gpt-5.5,
gpt-5.5-codex SKU, model=None fallback to self.model) and negative
cases (gpt-5.4-codex workaround, gpt-5.50 false-positive guard,
non-codex api_mode, non-codex provider, empty/None model, unrelated
models on Codex).
Does NOT fix the backend-side issue (that's an upstream OpenAI/ChatGPT
problem we cannot patch from here). Only converts an opaque timeout into
text that names the workaround so users do not have to dig through logs
or wait for a forum post to learn what to do.
Closes#22046
Codex / Responses-API requests had three latent timeout bugs that combined
into the long silent hangs reported on #21444:
1. The non-stream stale-call detector estimated context tokens from
``api_kwargs["messages"]`` only. Codex / Responses-API payloads carry
their conversational load in ``input`` (with ``instructions`` and
``tools``), so every Codex turn logged ``context=~0 tokens`` and the
detector never applied its >50k / >100k tier bumps.
2. ``providers.<id>.request_timeout_seconds`` was silently dropped on the
main Codex path. The chat_completions path and the auxiliary Codex
adapter both forwarded it; the main path skipped it through three
places (``build_api_kwargs``, ``ResponsesApiTransport.build_kwargs``,
``_preflight_codex_api_kwargs``).
3. The streaming stale detector had the same payload-shape bug for
``codex_responses`` requests, which route through the non-streaming
detector (it's the path that emits the user-facing
"No response from provider for 300s (non-streaming, ...)" warning that
reporters keep pasting).
This commit:
- Adds ``estimate_request_context_tokens`` in ``chat_completion_helpers``,
used by both the non-stream and stream detectors. Handles ``messages``
(Chat Completions), ``input + instructions + tools`` (Responses API),
bare lists, and an unknown-dict fallback.
- Forwards ``timeout`` through ``ResponsesApiTransport.build_kwargs``
and ``_preflight_codex_api_kwargs`` (with guards against
zero/negative/inf/bool values), and wires
``_resolved_api_call_timeout()`` into the Codex branch of
``build_api_kwargs``.
- Lowers the implicit non-stream stale defaults so fallback providers
kick in faster when upstream stalls:
* base 300s -> 90s
* >50k 450s -> 150s
* >100k 600s -> 240s
These only apply when the user has *not* set
``providers.<id>.stale_timeout_seconds`` or
``HERMES_API_CALL_STALE_TIMEOUT``. Explicit config still wins.
- Adds regression tests for the estimator shapes, the new defaults, the
context-tier scaling, transport timeout pass-through, and preflight
timeout pass-through / rejection of invalid values.
Closes#21444
Supersedes #21652#24126#31855
Co-authored-by: Hoang V. Pham <26063003+hehehe0803@users.noreply.github.com>
Follow-up to @someaka's fix.
Polish:
- Drop the redundant `_preflight_tokens >= threshold_tokens` clause.
`should_compress(tokens)` already short-circuits when tokens < threshold,
so the explicit comparison was dead code on the True branch.
Tests:
- Preflight: pin that should_compress() is called (anti-thrash has a vote).
Mocks should_compress to return False even with tokens past the raw
threshold and asserts no compression runs — exact bug shape from #29335.
- Gateway: AST scan of gateway/run.py asserts every
`session_entry.session_id = ...` assignment is followed by a
`session_store._save()` call within the same block. Three sites mutate
the session_id after compression; all three must persist or the next
turn loads the pre-compression transcript and re-loops. Empirically
verified the test catches the bug (drops the new _save() line → red).
AUTHOR_MAP:
- Map ed@bebop.crew -> someaka so the salvaged commit resolves to
@someaka in release notes.
Follow-up on top of @jacevys' PR #21437 cherry-pick:
- _provider_model_ids() now also matches normalized == 'openai-api' for
the live /v1/models fetch path, so users see the full catalog instead
of just the curated list.
- Add gpt-5.5-pro and gpt-5.3-codex to the curated list for parity with
the existing 'openai' table (used as fallback when /v1/models fails).
- Add scripts/release.py AUTHOR_MAP entry for jacevys so CI doesn't
block the salvage PR.
Follow-up to @Strontvod's fix.
Tests:
- Five new tests in test_update_concurrent_quarantine.py cover the parent-
chain exclusion: the .exe launcher is excluded, an unrelated sibling
hermes.exe is still reported, multi-level ancestry is fully excluded,
PID cycles in the parent chain don't hang, and a partially-stubbed
psutil (no Process attribute) degrades gracefully instead of crashing.
- New _fake_psutil_with_parent_chain helper builds a fuller stand-in
(Process / NoSuchProcess / AccessDenied + process_iter) than the
process_iter-only SimpleNamespace the older tests use.
Hardening:
- Broaden the except in the parent-walk to bare Exception. The original
fix listed (NoSuchProcess, AccessDenied, ValueError), but those names
are evaluated lazily during exception matching — if psutil is a partial
stub without the attribute, the exception handler itself raises
AttributeError that escapes. The function is documented as 'never raises'
(the surrounding update flow depends on it), so the broader catch keeps
the contract regardless of how the dependency is shaped.
AUTHOR_MAP:
- Map schepers.zander1@gmail.com -> Strontvod so the salvaged commit
resolves to @Strontvod in the release notes.
All 18 detect_concurrent + quarantine tests pass.