feat(delegation): background fan-out — parallel subagents, one consolidated return (#49734)

* feat(delegation): single-task delegate_task always runs in the background

The model no longer decides whether a subagent runs in the background — a
single-task delegate_task from the top-level agent is now always dispatched
async, so the parent turn returns immediately and the subagent's result
re-enters the conversation when it finishes.

- run_agent._dispatch_delegate_task (the live model path) forces
  background=True for top-level single-task calls; the schema-level
  `background` param is ignored.
- A batch (tasks with >1 item) stays synchronous (fan-out can't go async).
- A delegation from an orchestrator subagent (depth > 0) stays synchronous —
  it needs its workers' results within its own turn.
- The function-level default is unchanged, so direct Python callers/tests keep
  the historical synchronous behavior.
- On async-pool capacity rejection, single-task now falls through to a
  synchronous run instead of erroring (the child stays attached for interrupt
  propagation; detach happens only on a successful dispatch).
- Schema `background` param marked deprecated/ignored; tool description
  updated to state the always-background single-task rule.

* feat(delegation): all delegate_task fan-out runs in the background

Extend the always-background behavior to the full fan-out. A batch is now
dispatched as N independent async subagents (one handle each), instead of
running synchronously. Single task and batch both return immediately; each
subagent's result re-enters the conversation as its own message when it
finishes.

- delegate_task: when background is set, loop over ALL built children and
  dispatch each via dispatch_async_delegation; return a combined handle block
  (count + per-task delegation_ids). Children the async pool rejects (at
  capacity) run synchronously inline and are reported alongside the dispatched
  handles, so nothing is silently dropped.
- run_agent._dispatch_delegate_task + registry handler: force background for
  any top-level model delegation (single OR batch); orchestrator subagents
  (depth > 0) still run synchronously since they need workers' results within
  their own turn.
- Removed the v1 'batch async not supported' rejection.
- Tool description updated: BOTH MODES RUN IN THE BACKGROUND.
- Tests updated to assert batch fan-out dispatches each task async (verified
  E2E: 3-task batch -> 3 independent completion-queue events).

* fix(delegation): background fan-out joins and returns one consolidated block

Correct the fan-out semantics: a backgrounded batch is dispatched as ONE
async unit (one handle, one async-pool slot), not N independent dispatches.
The unit runs all children in parallel, waits on every one, and emits a
SINGLE completion event carrying the consolidated per-task results. The chat
is never blocked; when all subagents finish, their full summaries re-enter
the conversation together as one message.

- async_delegation.dispatch_async_delegation_batch + _finalize_batch: a batch
  occupies one slot; its runner returns the combined {results:[...]} dict and
  one event with the full results list is pushed to the completion queue.
- delegate_tool: extract the sync execution+aggregation into
  _execute_and_aggregate(); background dispatches it via the batch unit and
  returns one handle; on pool-capacity rejection it runs the batch inline.
- process_registry._format_async_delegation: render a consolidated multi-task
  block (TASK i/N + per-task summary) when the event carries is_batch/results.
- Tests updated; E2E verified: 3-task batch -> immediate return -> one combined
  completion block with all three summaries.
This commit is contained in:
Teknium 2026-06-20 11:27:12 -07:00 committed by GitHub
parent 680732c104
commit ea8a8b4af8
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 719 additions and 331 deletions

View file

@ -2103,18 +2103,12 @@ def delegate_task(
# Normalise the top-level role once; per-task overrides re-normalise.
top_role = _normalize_role(role)
# Async (background) delegation is single-task only in v1. A batch carries
# fan-out semantics (N handles, partial completion) that double the state
# model — reject early with a clear message rather than silently running
# the batch synchronously.
# Background (async) delegation now applies to BOTH single tasks and
# batches. A batch simply becomes N independent async dispatches: each
# child runs on the daemon executor and re-enters the conversation via
# the completion queue on its own, carrying its own handle. There's no
# combined "wait for all" — fan-out is exactly N background subagents.
background = is_truthy_value(background, default=False) if background is not None else False
if background and tasks and isinstance(tasks, list) and len(tasks) > 1:
return tool_error(
"background=true is single-task only. Dispatch one background "
"subagent per delegate_task call (each returns its own handle and "
"re-enters the conversation independently), or run the batch "
"synchronously with background=false."
)
# Depth limit — configurable via delegation.max_spawn_depth,
# default 2 for parity with the original MAX_DEPTH constant.
@ -2250,150 +2244,101 @@ def delegate_task(
# Authoritative restore: reset global to parent's tool names after all children built
_model_tools._last_resolved_tool_names = _parent_tool_names
if n_tasks == 1:
# Single task -- run directly (no thread pool overhead)
_i, _t, child = children[0]
def _execute_and_aggregate() -> dict:
"""Run all built children (1 or N), join on them, aggregate results,
fire subagent_stop hooks + cost rollup, and return the combined result
dict. Used by BOTH the synchronous path and the background runner. In
the background case this whole function runs on the daemon executor, so
the parent turn isn't blocked — but the batch still JOINS on itself
here (all children must finish) before producing ONE consolidated
results block. That is the contract: fan-out runs in the background,
waits on each other, and returns together.
"""
if n_tasks == 1:
# Single task -- run directly (no thread pool overhead)
_i, _t, child = children[0]
result = _run_single_child(_i, _t["goal"], child, parent_agent)
results.append(result)
else:
# Batch -- run in parallel with per-task progress lines
completed_count = 0
spinner_ref = getattr(parent_agent, "_delegate_spinner", None)
# ----- Async / background dispatch -----
# When background=true, hand the already-built child to the async
# delegation registry and return a handle immediately. The child runs
# on a daemon executor; its result re-enters the conversation as a
# fresh turn via process_registry.completion_queue (see
# tools/async_delegation.py). Batch async is intentionally NOT
# supported in v1 — the rejection is handled before we get here.
if background:
from tools.async_delegation import dispatch_async_delegation
from tools.approval import get_current_session_key
with ThreadPoolExecutor(max_workers=max_children) as executor:
futures = {}
for i, t, child in children:
future = executor.submit(
_run_single_child,
task_index=i,
goal=t["goal"],
child=child,
parent_agent=parent_agent,
)
futures[future] = i
# Capture the gateway routing key on THIS (parent) thread — the
# daemon worker won't carry the session contextvar.
_session_key = get_current_session_key(default="")
# Poll futures with interrupt checking. as_completed() blocks
# until ALL futures finish — if a child agent gets stuck,
# the parent blocks forever even after interrupt propagation.
# Instead, use wait() with a short timeout so we can bail
# when the parent is interrupted.
# Map task_index -> child agent, so fabricated entries for
# still-pending futures can carry the correct _delegate_role.
_child_by_index = {i: child for (i, _, child) in children}
# Detach the child from the parent's interrupt-propagation list.
# _build_child_agent registered it there (correct for sync
# children, which block the parent's turn), but a BACKGROUND
# child must survive parent-turn interrupts (Ctrl+C, mid-turn
# steering), cache evicts (release_clients), and session close
# (/new) — otherwise the detached subagent dies with whatever
# the parent was doing when it was dispatched. Its lifecycle is
# owned by the async-delegation registry (interrupt_fn below),
# and _run_single_child's finally block closes its resources
# when it finishes.
if hasattr(parent_agent, "_active_children"):
try:
_ac_lock = getattr(parent_agent, "_active_children_lock", None)
if _ac_lock:
with _ac_lock:
parent_agent._active_children.remove(child)
else:
parent_agent._active_children.remove(child)
except ValueError:
pass
def _async_runner(_child=child, _goal=_t["goal"]):
return _run_single_child(0, _goal, _child, parent_agent)
def _async_interrupt(_child=child):
try:
if hasattr(_child, "interrupt"):
_child.interrupt("Async delegation cancelled")
elif hasattr(_child, "_interrupt_requested"):
_child._interrupt_requested = True
except Exception:
pass
dispatch = dispatch_async_delegation(
goal=_t["goal"],
context=_t.get("context"),
toolsets=_t.get("toolsets") or toolsets,
role=_normalize_role(_t.get("role") or top_role),
model=creds["model"],
session_key=_session_key,
runner=_async_runner,
interrupt_fn=_async_interrupt,
max_async_children=_get_max_async_children(),
)
if dispatch.get("status") == "dispatched":
return json.dumps(
{
"status": "dispatched",
"delegation_id": dispatch["delegation_id"],
"goal": _t["goal"],
"mode": "background",
"note": (
"Subagent is running in the background. You and the "
"user can keep working; the full task source and "
"result will re-enter the conversation as a new "
"message when it finishes. Do not wait or poll — "
"just continue."
),
},
ensure_ascii=False,
)
# Rejected (at capacity or schedule failure) — surface as a tool
# error so the model can fall back to synchronous delegation.
return tool_error(
dispatch.get("error", "Async delegation could not be scheduled.")
)
result = _run_single_child(0, _t["goal"], child, parent_agent)
results.append(result)
else:
# Batch -- run in parallel with per-task progress lines
completed_count = 0
spinner_ref = getattr(parent_agent, "_delegate_spinner", None)
with ThreadPoolExecutor(max_workers=max_children) as executor:
futures = {}
for i, t, child in children:
future = executor.submit(
_run_single_child,
task_index=i,
goal=t["goal"],
child=child,
parent_agent=parent_agent,
)
futures[future] = i
# Poll futures with interrupt checking. as_completed() blocks
# until ALL futures finish — if a child agent gets stuck,
# the parent blocks forever even after interrupt propagation.
# Instead, use wait() with a short timeout so we can bail
# when the parent is interrupted.
# Map task_index -> child agent, so fabricated entries for
# still-pending futures can carry the correct _delegate_role.
_child_by_index = {i: child for (i, _, child) in children}
pending = set(futures.keys())
while pending:
if getattr(parent_agent, "_interrupt_requested", False) is True:
# Parent interrupted — collect whatever finished and
# abandon the rest. Children already received the
# interrupt signal; we just can't wait forever.
for f in pending:
idx = futures[f]
if f.done():
try:
entry = f.result()
except Exception as exc:
pending = set(futures.keys())
while pending:
if getattr(parent_agent, "_interrupt_requested", False) is True:
# Parent interrupted — collect whatever finished and
# abandon the rest. Children already received the
# interrupt signal; we just can't wait forever.
for f in pending:
idx = futures[f]
if f.done():
try:
entry = f.result()
except Exception as exc:
entry = {
"task_index": idx,
"status": "error",
"summary": None,
"error": str(exc),
"api_calls": 0,
"duration_seconds": 0,
"_child_role": getattr(
_child_by_index.get(idx), "_delegate_role", None
),
}
else:
entry = {
"task_index": idx,
"status": "error",
"status": "interrupted",
"summary": None,
"error": str(exc),
"error": "Parent agent interrupted — child did not finish in time",
"api_calls": 0,
"duration_seconds": 0,
"_child_role": getattr(
_child_by_index.get(idx), "_delegate_role", None
),
}
else:
results.append(entry)
completed_count += 1
break
from concurrent.futures import wait as _cf_wait, FIRST_COMPLETED
done, pending = _cf_wait(
pending, timeout=0.5, return_when=FIRST_COMPLETED
)
for future in done:
try:
entry = future.result()
except Exception as exc:
idx = futures[future]
entry = {
"task_index": idx,
"status": "interrupted",
"status": "error",
"summary": None,
"error": "Parent agent interrupted — child did not finish in time",
"error": str(exc),
"api_calls": 0,
"duration_seconds": 0,
"_child_role": getattr(
@ -2402,165 +2347,229 @@ def delegate_task(
}
results.append(entry)
completed_count += 1
break
from concurrent.futures import wait as _cf_wait, FIRST_COMPLETED
done, pending = _cf_wait(
pending, timeout=0.5, return_when=FIRST_COMPLETED
)
for future in done:
try:
entry = future.result()
except Exception as exc:
idx = futures[future]
entry = {
"task_index": idx,
"status": "error",
"summary": None,
"error": str(exc),
"api_calls": 0,
"duration_seconds": 0,
"_child_role": getattr(
_child_by_index.get(idx), "_delegate_role", None
),
}
results.append(entry)
completed_count += 1
# Print per-task completion line above the spinner
idx = entry["task_index"]
label = (
task_labels[idx] if idx < len(task_labels) else f"Task {idx}"
)
dur = entry.get("duration_seconds", 0)
status = entry.get("status", "?")
icon = "" if status == "completed" else ""
remaining = n_tasks - completed_count
completion_line = f"{icon} [{idx+1}/{n_tasks}] {label} ({dur}s)"
if spinner_ref:
try:
spinner_ref.print_above(completion_line)
except Exception:
# Print per-task completion line above the spinner
idx = entry["task_index"]
label = (
task_labels[idx] if idx < len(task_labels) else f"Task {idx}"
)
dur = entry.get("duration_seconds", 0)
status = entry.get("status", "?")
icon = "" if status == "completed" else ""
remaining = n_tasks - completed_count
completion_line = f"{icon} [{idx+1}/{n_tasks}] {label} ({dur}s)"
if spinner_ref:
try:
spinner_ref.print_above(completion_line)
except Exception:
print(f" {completion_line}")
else:
print(f" {completion_line}")
else:
print(f" {completion_line}")
# Update spinner text to show remaining count
if spinner_ref and remaining > 0:
try:
spinner_ref.update_text(
f"🔀 {remaining} task{'s' if remaining != 1 else ''} remaining"
)
except Exception as e:
logger.debug("Spinner update_text failed: %s", e)
# Update spinner text to show remaining count
if spinner_ref and remaining > 0:
try:
spinner_ref.update_text(
f"🔀 {remaining} task{'s' if remaining != 1 else ''} remaining"
)
except Exception as e:
logger.debug("Spinner update_text failed: %s", e)
# Sort by task_index so results match input order
results.sort(key=lambda r: r["task_index"])
# Sort by task_index so results match input order
results.sort(key=lambda r: r["task_index"])
# Notify parent's memory provider of delegation outcomes
if (
parent_agent
and hasattr(parent_agent, "_memory_manager")
and parent_agent._memory_manager
):
for entry in results:
try:
_task_goal = (
task_list[entry["task_index"]]["goal"]
if entry["task_index"] < len(task_list)
else ""
)
parent_agent._memory_manager.on_delegation(
task=_task_goal,
result=entry.get("summary", "") or "",
child_session_id=(
getattr(children[entry["task_index"]][2], "session_id", "")
if entry["task_index"] < len(children)
# Notify parent's memory provider of delegation outcomes
if (
parent_agent
and hasattr(parent_agent, "_memory_manager")
and parent_agent._memory_manager
):
for entry in results:
try:
_task_goal = (
task_list[entry["task_index"]]["goal"]
if entry["task_index"] < len(task_list)
else ""
),
)
parent_agent._memory_manager.on_delegation(
task=_task_goal,
result=entry.get("summary", "") or "",
child_session_id=(
getattr(children[entry["task_index"]][2], "session_id", "")
if entry["task_index"] < len(children)
else ""
),
)
except Exception:
pass
# Fire subagent_stop hooks once per child, serialised on the parent thread.
# This keeps Python-plugin and shell-hook callbacks off of the worker threads
# that ran the children, so hook authors don't need to reason about
# concurrent invocation. Role was captured into the entry dict in
# _run_single_child (or the fabricated-entry branches above) before the
# child was closed.
_parent_session_id = getattr(parent_agent, "session_id", None)
try:
from hermes_cli.plugins import invoke_hook as _invoke_hook
except Exception:
_invoke_hook = None
# Aggregate child spend here so the parent's footer/UI reflect the true
# cost of a subagent-heavy turn. Port of Kilo-Org/kilocode#9448. Each
# child's cost was captured in _run_single_child before its AIAgent was
# closed; we fold them into the parent in one pass alongside the
# subagent_stop hook loop so we don't walk `results` twice.
_children_cost_total = 0.0
for entry in results:
child_role = entry.pop("_child_role", None)
child_cost = entry.pop("_child_cost_usd", 0.0)
try:
if child_cost:
_children_cost_total += float(child_cost)
except (TypeError, ValueError):
pass
if _invoke_hook is None:
continue
try:
_child_index = entry.get("task_index", -1)
_child_agent = (
children[_child_index][2]
if isinstance(_child_index, int) and 0 <= _child_index < len(children)
else None
)
_invoke_hook(
"subagent_stop",
parent_session_id=_parent_session_id,
parent_turn_id=getattr(parent_agent, "_current_turn_id", "") or "",
child_session_id=getattr(_child_agent, "session_id", None),
child_role=child_role,
child_summary=entry.get("summary"),
child_status=entry.get("status"),
duration_ms=int((entry.get("duration_seconds") or 0) * 1000),
)
except Exception:
pass
logger.debug("subagent_stop hook invocation failed", exc_info=True)
# Fire subagent_stop hooks once per child, serialised on the parent thread.
# This keeps Python-plugin and shell-hook callbacks off of the worker threads
# that ran the children, so hook authors don't need to reason about
# concurrent invocation. Role was captured into the entry dict in
# _run_single_child (or the fabricated-entry branches above) before the
# child was closed.
_parent_session_id = getattr(parent_agent, "session_id", None)
try:
from hermes_cli.plugins import invoke_hook as _invoke_hook
except Exception:
_invoke_hook = None
# Aggregate child spend here so the parent's footer/UI reflect the true
# cost of a subagent-heavy turn. Port of Kilo-Org/kilocode#9448. Each
# child's cost was captured in _run_single_child before its AIAgent was
# closed; we fold them into the parent in one pass alongside the
# subagent_stop hook loop so we don't walk `results` twice.
_children_cost_total = 0.0
for entry in results:
child_role = entry.pop("_child_role", None)
child_cost = entry.pop("_child_cost_usd", 0.0)
try:
if child_cost:
_children_cost_total += float(child_cost)
except (TypeError, ValueError):
pass
if _invoke_hook is None:
continue
try:
_child_index = entry.get("task_index", -1)
_child_agent = (
children[_child_index][2]
if isinstance(_child_index, int) and 0 <= _child_index < len(children)
else None
)
_invoke_hook(
"subagent_stop",
parent_session_id=_parent_session_id,
parent_turn_id=getattr(parent_agent, "_current_turn_id", "") or "",
child_session_id=getattr(_child_agent, "session_id", None),
child_role=child_role,
child_summary=entry.get("summary"),
child_status=entry.get("status"),
duration_ms=int((entry.get("duration_seconds") or 0) * 1000),
)
except Exception:
logger.debug("subagent_stop hook invocation failed", exc_info=True)
# Fold the aggregated child cost into the parent's session total. This is
# additive — each delegate_task call contributes its own children — so
# nested orchestrator→worker trees roll up naturally: each layer's own
# delegate_task() folds its direct children in, and when the orchestrator
# itself finishes, its parent folds the orchestrator's now-inflated total
# on top. Degrades silently if the parent lacks the counter (older test
# fixtures, etc.).
if _children_cost_total > 0.0:
try:
current = float(getattr(parent_agent, "session_estimated_cost_usd", 0.0) or 0.0)
parent_agent.session_estimated_cost_usd = current + _children_cost_total
# Upgrade the cost_source so the UI doesn't label a partially-real
# total as "none" when the parent itself hadn't billed any calls
# yet (rare but possible when the parent's only action this turn
# was delegate_task).
if getattr(parent_agent, "session_cost_source", "none") in {None, "", "none"}:
parent_agent.session_cost_source = "subagent"
if getattr(parent_agent, "session_cost_status", "unknown") in {None, "", "unknown"}:
parent_agent.session_cost_status = "estimated"
except Exception:
logger.debug("Subagent cost rollup failed", exc_info=True)
# Fold the aggregated child cost into the parent's session total. This is
# additive — each delegate_task call contributes its own children — so
# nested orchestrator→worker trees roll up naturally: each layer's own
# delegate_task() folds its direct children in, and when the orchestrator
# itself finishes, its parent folds the orchestrator's now-inflated total
# on top. Degrades silently if the parent lacks the counter (older test
# fixtures, etc.).
if _children_cost_total > 0.0:
try:
current = float(getattr(parent_agent, "session_estimated_cost_usd", 0.0) or 0.0)
parent_agent.session_estimated_cost_usd = current + _children_cost_total
# Upgrade the cost_source so the UI doesn't label a partially-real
# total as "none" when the parent itself hadn't billed any calls
# yet (rare but possible when the parent's only action this turn
# was delegate_task).
if getattr(parent_agent, "session_cost_source", "none") in {None, "", "none"}:
parent_agent.session_cost_source = "subagent"
if getattr(parent_agent, "session_cost_status", "unknown") in {None, "", "unknown"}:
parent_agent.session_cost_status = "estimated"
except Exception:
logger.debug("Subagent cost rollup failed", exc_info=True)
total_duration = round(time.monotonic() - overall_start, 2)
total_duration = round(time.monotonic() - overall_start, 2)
return json.dumps(
{
return {
"results": results,
"total_duration_seconds": total_duration,
},
ensure_ascii=False,
)
}
# ----- Background dispatch: run the WHOLE batch as one async unit -----
# When background is true, the entire fan-out runs on the daemon executor
# via a single async delegation. _execute_and_aggregate() joins on every
# child and produces ONE consolidated results block, which re-enters the
# conversation as a single message when ALL children finish. The chat is
# not blocked in the meantime. This is the contract: dispatch N subagents,
# keep chatting, get the combined summaries back together at the end.
if background:
from tools.async_delegation import dispatch_async_delegation_batch
from tools.approval import get_current_session_key
_session_key = get_current_session_key(default="")
_child_agents = [c for (_, _, c) in children]
# Detach every child from the parent's interrupt-propagation list — the
# batch's lifecycle is owned by the async registry now, not the parent
# turn. _build_child_agent attached them (correct for sync runs).
if hasattr(parent_agent, "_active_children"):
_ac_lock = getattr(parent_agent, "_active_children_lock", None)
for _c in _child_agents:
try:
if _ac_lock:
with _ac_lock:
parent_agent._active_children.remove(_c)
else:
parent_agent._active_children.remove(_c)
except ValueError:
pass
def _batch_runner():
return _execute_and_aggregate()
def _batch_interrupt():
for _c in _child_agents:
try:
if hasattr(_c, "interrupt"):
_c.interrupt("Async delegation cancelled")
elif hasattr(_c, "_interrupt_requested"):
_c._interrupt_requested = True
except Exception:
pass
_goals = [t["goal"] for t in task_list]
dispatch = dispatch_async_delegation_batch(
goals=_goals,
context=context,
toolsets=toolsets,
role=top_role,
model=creds["model"],
session_key=_session_key,
runner=_batch_runner,
interrupt_fn=_batch_interrupt,
max_async_children=_get_max_async_children(),
)
if dispatch.get("status") == "dispatched":
n = len(_goals)
note = (
"Subagent is running in the background. You and the user can "
"keep working; its full result re-enters the conversation as a "
"new message when it finishes. Do not wait or poll — just "
"continue."
if n == 1 else
f"{n} subagents are running in parallel in the background. You "
f"and the user can keep working; they wait on each other and "
f"their consolidated results re-enter the conversation as a "
f"single message once ALL of them finish. Do not wait or poll "
f"— just continue."
)
payload = {
"status": "dispatched",
"mode": "background",
"count": n,
"delegation_id": dispatch["delegation_id"],
"goals": _goals,
"note": note,
}
return json.dumps(payload, ensure_ascii=False)
# Pool at capacity / schedule failure — children are still attached
# (we detach above only on the parent list, but the async unit was
# never accepted, so re-attaching isn't needed: we just run inline).
logger.info(
"delegate_task: async pool at capacity (%s); running the whole "
"batch synchronously instead.",
dispatch.get("error", "rejected"),
)
return json.dumps(_execute_and_aggregate(), ensure_ascii=False)
# ----- Synchronous path -----
return json.dumps(_execute_and_aggregate(), ensure_ascii=False)
def _resolve_child_credential_pool(
@ -2842,11 +2851,16 @@ def _build_top_level_description() -> str:
"Only the final summary is returned -- intermediate tool results "
"never enter your context window.\n\n"
"TWO MODES (one of 'goal' or 'tasks' is required):\n"
"1. Single task: provide 'goal' (+ optional context, toolsets)\n"
"1. Single task: provide 'goal' (+ optional context, toolsets).\n"
f"2. Batch (parallel): provide 'tasks' array with up to {max_children} "
f"items concurrently for this user (configured via "
f"delegation.max_concurrent_children in config.yaml). "
f"All run in parallel and results are returned together. {nesting_clause}\n\n"
f"delegation.max_concurrent_children in config.yaml). {nesting_clause}\n\n"
"BOTH MODES RUN IN THE BACKGROUND. delegate_task returns immediately — "
"you and the user keep working, and each subagent's full result "
"re-enters the conversation as its own new message when it finishes. A "
"batch is just N independent background subagents (N handles, each "
"completes on its own). Do NOT wait or poll; just continue with other "
"work after dispatching.\n\n"
"WHEN TO USE delegate_task:\n"
"- Reasoning-heavy subtasks (debugging, code review, research synthesis)\n"
"- Tasks that would flood your context with intermediate data\n"
@ -2857,11 +2871,10 @@ def _build_top_level_description() -> str:
"- Tasks needing user interaction -> subagents cannot use clarify\n"
"- Durable long-running work that must outlive the current turn -> "
"use cronjob (action='create') or terminal(background=True, "
"notify_on_complete=True) instead. delegate_task runs SYNCHRONOUSLY "
"inside the parent turn: if the parent is interrupted (user sends a "
"new message, /stop, /new) the child is cancelled with status="
"'interrupted' and its work is discarded. Children cannot continue "
"in the background.\n\n"
"notify_on_complete=True) instead. Background delegations are NOT "
"durable: if the parent session is closed (/new) or the process exits "
"before a subagent finishes, that subagent's work is discarded, and "
"/stop cancels every running background subagent.\n\n"
"IMPORTANT:\n"
"- Subagents have NO memory of your conversation. Pass all relevant "
"info (file paths, error messages, constraints) via the 'context' field.\n"
@ -3059,19 +3072,13 @@ DELEGATE_TASK_SCHEMA = {
"background": {
"type": "boolean",
"description": (
"Run the subagent asynchronously in the BACKGROUND "
"instead of blocking this turn. When true, delegate_task "
"returns immediately with a delegation_id; you and the "
"user keep working while the subagent runs, and its full "
"result re-enters the conversation as a new message when "
"it finishes (similar to terminal background=true + "
"notify_on_complete). The re-injected message includes the "
"original goal/context so you can act on it even after "
"moving on. Single-task only — cannot be combined with the "
"'tasks' batch array. Use for long-running independent work "
"the user shouldn't have to wait on (research, builds, "
"multi-step investigations). Do NOT poll or wait after "
"dispatching — just continue; the result will come to you."
"DEPRECATED / IGNORED. Single-task delegations always run "
"in the background automatically — you do not need to (and "
"cannot) opt in or out. The result re-enters the "
"conversation as a new message when the subagent finishes; "
"just continue working in the meantime. Setting this has no "
"effect; the parameter remains only for backward "
"compatibility."
),
},
"acp_command": {
@ -3105,6 +3112,23 @@ DELEGATE_TASK_SCHEMA = {
# --- Registry ---
from tools.registry import registry, tool_error
def _model_background_value(args: dict, parent_agent=None) -> bool:
"""Background flag for the MODEL-facing dispatch path (registry fallback).
Delegations from the top-level agent always run in the background the
model does not choose. This applies to both a single task and a fan-out
batch (each task becomes its own independent background subagent). The one
exception is a delegation from an orchestrator subagent (depth > 0), which
needs its workers' results within its own turn. The live path is
``run_agent._dispatch_delegate_task``; this lambda mirrors it for the rare
case the intercept is bypassed. Direct Python callers of ``delegate_task``
keep the historical synchronous default.
"""
is_subagent = getattr(parent_agent, "_delegate_depth", 0) > 0
return not is_subagent
registry.register(
name="delegate_task",
toolset="delegation",
@ -3118,7 +3142,7 @@ registry.register(
acp_command=args.get("acp_command"),
acp_args=args.get("acp_args"),
role=args.get("role"),
background=args.get("background"),
background=_model_background_value(args, kw.get("parent_agent")),
parent_agent=kw.get("parent_agent"),
),
check_fn=check_delegate_requirements,