feat(goals): /goal wait <pid> — park the loop on a background process (#50503)

* feat(goals): add /goal wait <pid> barrier to park the loop on a background process

The /goal loop re-pokes the agent every turn via the post-turn judge. When a
goal is gated on a long-running background process (CI poller, build, test
matrix, deploy) that produces nothing to judge yet, this spins the agent into
'is it done?' busy-work and burns the turn budget.

/goal wait <pid> [reason] parks the loop: while the PID is alive, the judge is
skipped, no turn is consumed, no continuation fires, and /goal status shows a
parked indicator. The barrier auto-clears the moment the process exits (the
agent's notify_on_complete watcher is the natural wake signal), then the next
turn resumes normal judging. /goal unwait clears it manually;
pause/resume/clear drop it; a dead/stale PID can never wedge the loop.

Wired across CLI, gateway, and the mid-run command guard for parity. Barrier
persists in SessionDB.state_meta (survives /resume); GoalState gains
backward-compatible waiting_on_pid/waiting_reason/waiting_since fields.
12 new tests; docs updated.

* fix(goals): use gateway.status._pid_exists for liveness, not os.kill(pid,0)

The Windows-footguns CI guard flagged os.kill(pid, 0) in _pid_alive. On
Windows that call is not a no-op: it routes to CTRL_C_EVENT and hard-kills the
target's console process group (bpo-14484). Delegate to the canonical
footgun-safe gateway.status._pid_exists (psutil + ctypes/POSIX fallback)
instead, with a direct-psutil last resort.

* feat(goals): judge-driven auto-wait: the loop parks itself, no manual /goal wait

Makes the wait barrier automatic. Every turn the judge is shown the agent's
live background processes (pid, command, uptime, output tail from the
process_registry) alongside the goal + response, and can return a new 'wait'
verdict instead of continue:

  {"verdict":"wait","wait_on_pid":N}      → park until that process exits
  {"verdict":"wait","wait_for_seconds":N} → park until the deadline passes

evaluate_after_turn acts on the directive (sets the barrier, parks the loop)
so the agent isn't re-poked into busy-work while CI/builds/deploys run. Adds a
time-based waiting_until barrier alongside the pid barrier; both auto-clear
and can never wedge the loop. Drivers (CLI, gateway, tui_gateway) feed the
live registry in via gather_background_processes(). Manual /goal wait stays as
an override. Judge verdict contract widened to
(verdict, reason, parse_failed, wait_directive); legacy {"done":bool} shape
still accepted.

* test(goals): update kanban _fake_judge to the 4-tuple judge contract

CI test(3) caught it: test_kanban_goal_mode's _fake_judge still returned the
3-tuple (verdict, reason, parse_failed), but the kanban loop now unpacks the
4-tuple (+ wait_directive). Update the fake to return None for the directive
and accept the background_processes kwarg.

* feat(goals): trigger-based wait: park on a process's own signal, not just exit

Addresses two gaps in the judge-driven wait:
(1) the judge could only express 'wait until PID exits' or 'wait N seconds',
    so a long-lived watcher/server that fires a trigger MID-RUN (and may never
    exit) couldn't be waited on;
(2) the process's own watch_patterns/notify_on_complete trigger was invisible
    to the judge.

Adds a session-based barrier (waiting_on_session) that releases on the
process's OWN trigger via process_registry.is_session_waiting(): the session
exits, OR (if started with watch_patterns) its pattern matches, even while the
process keeps running. list_sessions() now surfaces session_id +
watch_patterns/watch_hit/notify_on_complete so the judge sees the trigger and
is told to prefer wait_on_session for trigger processes. Judge verdict gains a
{wait_on_session} directive (preferred over pid). Backward-compatible
GoalState field; pid + time barriers unchanged.

Tests: TestSessionTriggerBarrier (release on mid-run pattern match while
alive, release on exit, unknown-session, full park→trigger→resume, parse,
validation, backcompat load). 105 goal-surface + 85 process_registry tests
green.
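
Reviewer note: the three barrier kinds described above (session trigger, pid
exit, time deadline) can be pictured with the minimal sketch below. It is an
illustration only, not the code in this commit: `Barrier` and `still_parked`
are hypothetical names, and the liveness checks are passed in as plain
callables standing in for `_session_waiting` and `_pid_alive`. The check order
mirrors the one used by the status line in the diff (session, then pid, then
deadline).

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Barrier:
    """Hypothetical stand-in for the waiting_* fields on GoalState."""
    waiting_on_pid: Optional[int] = None
    waiting_on_session: Optional[str] = None
    waiting_until: float = 0.0


def still_parked(
    barrier: Barrier,
    pid_alive: Callable[[int], bool],
    session_waiting: Callable[[str], bool],
    now: Optional[float] = None,
) -> bool:
    """Return True while any barrier still holds the loop."""
    now = time.time() if now is None else now
    # Session barrier: held until the session exits or its trigger fires.
    if barrier.waiting_on_session and session_waiting(barrier.waiting_on_session):
        return True
    # Pid barrier: held only while the process is alive.
    if barrier.waiting_on_pid and pid_alive(barrier.waiting_on_pid):
        return True
    # Time barrier: held until the wall-clock deadline passes.
    return bool(barrier.waiting_until) and now < barrier.waiting_until
```

Because a dead pid, a fired trigger, or a passed deadline each make the
corresponding check false, a stale barrier in this sketch releases on its own
rather than holding the loop indefinitely.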
2026-06-23 10:42:00 +00:00 · 2026-06-22 06:27:29 -07:00 · 2026-06-22 06:27:29 -07:00 · ff85af3fc7
commit ff85af3fc7
parent d4fa2db1c5
13 changed files with 1139 additions and 104 deletions
--- a/hermes_cli/cli_commands_mixin.py
+++ b/hermes_cli/cli_commands_mixin.py
@ -1821,6 +1821,38 @@ class CLICommandsMixin:
                _cprint(f"  {_DIM}No active goal.{_RST}")
            return

+        # /goal wait <pid> [reason] — park the loop on a background process so
+        # it stops re-poking the agent every turn while it waits on CI / a
+        # build / a long job. The barrier auto-clears when the PID exits.
+        if lower == "wait" or lower.startswith("wait "):
+            wait_arg = arg[len("wait"):].strip()
+            if not wait_arg:
+                _cprint("  Usage: /goal wait <pid> [reason]")
+                return
+            wtokens = wait_arg.split(None, 1)
+            try:
+                pid = int(wtokens[0])
+            except ValueError:
+                _cprint("  /goal wait: <pid> must be an integer process id.")
+                return
+            reason = wtokens[1].strip() if len(wtokens) > 1 else ""
+            try:
+                mgr.wait_on(pid, reason=reason)
+            except (RuntimeError, ValueError) as exc:
+                _cprint(f"  /goal wait: {exc}")
+                return
+            rtxt = f" ({reason})" if reason else ""
+            _cprint(f"  ⏳ Goal parked on pid {pid}{rtxt}. Loop pauses until it exits.")
+            return
+
+        # /goal unwait — drop the wait barrier and resume normal looping.
+        if lower == "unwait":
+            if mgr.stop_waiting():
+                _cprint("  ▶ Wait barrier cleared — goal loop resumes.")
+            else:
+                _cprint(f"  {_DIM}No wait barrier set.{_RST}")
+            return
+
        # Otherwise treat the arg as the goal text.
        try:
            state = mgr.set(arg)
--- a/hermes_cli/commands.py
+++ b/hermes_cli/commands.py
@ -108,7 +108,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
    CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
               args_hint="<prompt>"),
    CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
-               args_hint="[text | pause | resume | clear | status]"),
+               args_hint="[text | pause | resume | clear | status | wait <pid> | unwait]"),
    CommandDef("subgoal", "Add or manage extra criteria on the active goal", "Session",
               args_hint="[text | remove N | clear]"),
    CommandDef("status", "Show session, model, token, and context info", "Session"),
--- a/hermes_cli/goals.py
+++ b/hermes_cli/goals.py
@ -94,25 +94,59 @@ CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE = (

 JUDGE_SYSTEM_PROMPT = (
    "You are a strict judge evaluating whether an autonomous agent has "
-    "achieved a user's stated goal. You receive the goal text and the "
-    "agent's most recent response. Your only job is to decide whether "
-    "the goal is fully satisfied based on that response.\n\n"
-    "A goal is DONE only when:\n"
+    "achieved a user's stated goal. You receive the goal text, the agent's "
+    "most recent response, and — when present — a list of background "
+    "processes the agent has running. Decide one of three verdicts.\n\n"
+    "DONE — the goal is fully satisfied:\n"
    "- The response explicitly confirms the goal was completed, OR\n"
    "- The response clearly shows the final deliverable was produced, OR\n"
    "- The response explains the goal is unachievable / blocked / needs "
    "user input (treat this as DONE with reason describing the block).\n\n"
-    "Otherwise the goal is NOT done — CONTINUE.\n\n"
-    "Reply ONLY with a single JSON object on one line:\n"
-    '{\"done\": <true|false>, \"reason\": \"<one-sentence rationale>\"}'
+    "WAIT — the goal is NOT done, but the next step is to wait for async "
+    "work to finish rather than act again. Choose this ONLY when the agent's "
+    "progress is genuinely gated on something running on its own:\n"
+    "- A background process listed below is still running AND the response "
+    "shows the agent is waiting on its result (e.g. a CI poller, build, "
+    "test run, deploy). If the process has a session id, return it in "
+    "``wait_on_session`` — that releases when the process exits OR its "
+    "watch_patterns trigger fires (use this for a long-lived watcher that "
+    "signals mid-run and may never exit). Otherwise return its pid in "
+    "``wait_on_pid`` (releases on exit only).\n"
+    "- The agent says it is rate-limited / backing off / must wait a fixed "
+    "period — return seconds in ``wait_for_seconds``.\n"
+    "Picking WAIT parks the loop without burning a turn; it resumes "
+    "automatically on process exit, trigger match, or timeout. Do NOT pick WAIT "
+    "just because work remains — only when re-poking now would be pure "
+    "busy-work because the agent can't progress until the async thing "
+    "finishes.\n\n"
+    "CONTINUE — not done, and there is a concrete next step the agent can "
+    "take right now. This is the default when in doubt.\n\n"
+    "Reply ONLY with a single JSON object on one line. Shapes:\n"
+    '{"verdict": "done", "reason": "<one sentence>"}\n'
+    '{"verdict": "continue", "reason": "<one sentence>"}\n'
+    '{"verdict": "wait", "wait_on_session": "<id>", "reason": "<one sentence>"}\n'
+    '{"verdict": "wait", "wait_on_pid": <int>, "reason": "<one sentence>"}\n'
+    '{"verdict": "wait", "wait_for_seconds": <int>, "reason": "<one sentence>"}\n'
+    "The legacy shape {\"done\": <true|false>, \"reason\": \"...\"} is still "
+    "accepted (true=done, false=continue)."
+)
+
+
+# Rendered into the judge prompt when the agent has background processes
+# running. Gives the judge the context it needs to decide WAIT vs CONTINUE
+# (and which pid to wait on) without it having to probe anything itself.
+JUDGE_BACKGROUND_BLOCK_TEMPLATE = (
+    "Background processes the agent currently has running (it may be waiting "
+    "on one of these):\n{background_lines}\n\n"
 )


 JUDGE_USER_PROMPT_TEMPLATE = (
    "Goal:\n{goal}\n\n"
    "Agent's most recent response:\n{response}\n\n"
+    "{background_block}"
    "Current time: {current_time}\n\n"
-    "Is the goal satisfied?"
+    "Is the goal satisfied — done, continue, or wait?"
 )

 # Used when the user has added /subgoal criteria. The judge must
@ -122,6 +156,7 @@ JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE = (
    "Additional criteria the user added mid-loop (all must also be "
    "satisfied for the goal to be DONE):\n{subgoals_block}\n\n"
    "Agent's most recent response:\n{response}\n\n"
+    "{background_block}"
    "Current time: {current_time}\n\n"
    "Decision: For each numbered criterion above, find concrete "
    "evidence in the agent's response that the criterion is "
@ -129,7 +164,8 @@ JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE = (
    "met' or 'implying it was done' — require specific evidence (a "
    "file contents excerpt, an output line, a command result). If "
    "ANY criterion lacks specific evidence in the response, the goal "
-    "is NOT done — return CONTINUE.\n\n"
+    "is NOT done — return CONTINUE (or WAIT if blocked on a listed "
+    "background process).\n\n"
    "Is the goal AND every additional criterion satisfied?"
 )

@ -159,6 +195,30 @@ class GoalState:
    # them into the verdict. Backwards-compatible: defaults to empty so
    # old state_meta rows load unchanged.
    subgoals: List[str] = field(default_factory=list)
+    # Wait barrier: when the agent is blocked on long-running async work
+    # (CI poller, build, test run, deploy, rate-limit cooldown) the goal loop
+    # PARKS instead of being re-poked every turn into busy-work. Three barrier
+    # kinds, set automatically by the judge (which now sees the live
+    # background-process list and can return a ``wait`` verdict) or manually
+    # via ``/goal wait``:
+    #   • ``waiting_on_pid`` — park until that process exits.
+    #   • ``waiting_on_session`` — park until that process_registry session's
+    #     OWN trigger fires: it exits, OR (if it has watch_patterns) its
+    #     pattern matches. Covers long-lived watchers/servers that signal
+    #     mid-run via a trigger and may never exit. Preferred over raw pid
+    #     when the agent set up a watch_patterns/notify_on_complete process.
+    #   • ``waiting_until``  — park until this wall-clock epoch (time backoff).
+    # While ANY is active, ``evaluate_after_turn`` short-circuits to
+    # should_continue=False without burning a turn or calling the judge. The
+    # barrier auto-clears when the pid exits / the trigger fires / the deadline
+    # passes, then the next turn resumes normal judging. Cleared by that,
+    # ``/goal unwait``, pause, resume, or clear. Backwards-compatible: old
+    # state_meta rows load with no barrier.
+    waiting_on_pid: Optional[int] = None
+    waiting_on_session: Optional[str] = None
+    waiting_until: float = 0.0
+    waiting_reason: Optional[str] = None
+    waiting_since: float = 0.0

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)
@ -182,6 +242,11 @@ class GoalState:
            paused_reason=data.get("paused_reason"),
            consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
            subgoals=subgoals,
+            waiting_on_pid=(int(data["waiting_on_pid"]) if data.get("waiting_on_pid") else None),
+            waiting_on_session=(str(data["waiting_on_session"]) if data.get("waiting_on_session") else None),
+            waiting_until=float(data.get("waiting_until", 0.0) or 0.0),
+            waiting_reason=data.get("waiting_reason"),
+            waiting_since=float(data.get("waiting_since", 0.0) or 0.0),
        )

    # --- subgoals helpers -------------------------------------------------
@ -330,6 +395,52 @@ def _truncate(text: str, limit: int) -> str:
    return text[:limit] + "… [truncated]"


+def _pid_alive(pid: int) -> bool:
+    """Return True if a process with ``pid`` is currently alive.
+
+    Delegates to ``gateway.status._pid_exists`` — the canonical,
+    cross-platform, footgun-safe liveness check (psutil with a ctypes /
+    POSIX fallback). Critically this avoids ``os.kill(pid, 0)``, which on
+    Windows is NOT a no-op: it routes to ``CTRL_C_EVENT`` and hard-kills the
+    target's console process group (bpo-14484). Any error resolves to False
+    (treat unknown as dead) so a stale barrier never wedges the loop — the
+    worst case is the goal resumes one turn early, which is safe.
+    """
+    if not pid or pid <= 0:
+        return False
+    try:
+        from gateway.status import _pid_exists
+
+        return bool(_pid_exists(int(pid)))
+    except Exception:
+        pass
+    # Last-resort fallback if gateway.status is unavailable: psutil directly.
+    try:
+        import psutil  # type: ignore
+
+        return bool(psutil.pid_exists(int(pid)))
+    except Exception:
+        return False
+
+
+def _session_waiting(session_id: str) -> bool:
+    """Whether a goal parked on a process_registry session should stay parked.
+
+    Delegates to ``process_registry.is_session_waiting`` — True while the
+    session is running and (if it has watch_patterns) its trigger hasn't fired.
+    Fail-safe: any import/registry error yields False (don't wait) so a stale
+    barrier can never wedge the loop.
+    """
+    if not session_id:
+        return False
+    try:
+        from tools.process_registry import process_registry
+
+        return bool(process_registry.is_session_waiting(session_id))
+    except Exception:
+        return False
+
+
 _JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)


@ -357,17 +468,25 @@ def _goal_judge_max_tokens() -> int:
    return DEFAULT_JUDGE_MAX_TOKENS


-def _parse_judge_response(raw: str) -> Tuple[bool, str, bool]:
-    """Parse the judge's reply. Fail-open to ``(False, "<reason>", parse_failed)``.
+def _parse_judge_response(raw: str) -> Tuple[str, str, bool, Optional[Dict[str, Any]]]:
+    """Parse the judge's reply. Fail-open on unusable output.

-    Returns ``(done, reason, parse_failed)``. ``parse_failed`` is True when the
-    judge returned output that couldn't be interpreted as the expected JSON
-    verdict (empty body, prose, malformed JSON). Callers use that flag to
-    auto-pause after N consecutive parse failures so a weak judge model
-    doesn't silently burn the turn budget.
+    Returns ``(verdict, reason, parse_failed, wait_directive)`` where:
+      - ``verdict`` is ``"done"``, ``"continue"``, or ``"wait"``.
+      - ``parse_failed`` is True when the judge returned output that couldn't
+        be interpreted as the expected JSON verdict (empty body, prose,
+        malformed JSON). Callers use it to auto-pause after N consecutive
+        parse failures so a weak judge model doesn't silently burn the budget.
+      - ``wait_directive`` is set only for ``verdict == "wait"``: a dict with
+        ``{"session_id": str}``, ``{"pid": int}``, or ``{"seconds": int}``
+        (first usable, in that order); ``None`` otherwise. A wait verdict with
+        no usable target is downgraded to ``continue`` (can't park on nothing).
+
+    Accepts both the new ``{"verdict": ...}`` shape and the legacy
+    ``{"done": <bool>}`` shape.
    """
    if not raw:
-        return False, "judge returned empty response", True
+        return "continue", "judge returned empty response", True, None

    text = raw.strip()

@ -393,17 +512,103 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str, bool]:
                data = None

    if not isinstance(data, dict):
-        return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}", True
+        return "continue", f"judge reply was not JSON: {_truncate(raw, 200)!r}", True, None

-    done_val = data.get("done")
-    if isinstance(done_val, str):
-        done = done_val.strip().lower() in {"true", "yes", "1", "done"}
+    reason = str(data.get("reason") or "").strip() or "no reason provided"
+
+    # Determine verdict — prefer the explicit "verdict" field, fall back to
+    # the legacy "done" boolean.
+    verdict_raw = data.get("verdict")
+    if isinstance(verdict_raw, str):
+        verdict = verdict_raw.strip().lower()
    else:
-        done = bool(done_val)
-    reason = str(data.get("reason") or "").strip()
-    if not reason:
-        reason = "no reason provided"
-    return done, reason, False
+        done_val = data.get("done")
+        if isinstance(done_val, str):
+            done = done_val.strip().lower() in {"true", "yes", "1", "done"}
+        else:
+            done = bool(done_val)
+        verdict = "done" if done else "continue"
+
+    if verdict not in {"done", "continue", "wait"}:
+        verdict = "continue"
+
+    if verdict != "wait":
+        return verdict, reason, False, None
+
+    # Wait verdict: extract a concrete directive (session id, pid, or
+    # seconds). Accept a few key spellings the model might emit.
+    def _first_int(*keys: str) -> Optional[int]:
+        for k in keys:
+            v = data.get(k)
+            if v is None:
+                continue
+            try:
+                iv = int(v)
+                if iv > 0:
+                    return iv
+            except (TypeError, ValueError):
+                continue
+        return None
+
+    # Prefer a session-id directive (releases on the process's own trigger —
+    # exit OR watch-pattern match), then pid (exit only), then seconds.
+    sess = data.get("wait_on_session") or data.get("session_id") or data.get("wait_session")
+    if isinstance(sess, str) and sess.strip():
+        return "wait", reason, False, {"session_id": sess.strip()}
+    pid = _first_int("wait_on_pid", "pid", "wait_pid")
+    if pid is not None:
+        return "wait", reason, False, {"pid": pid}
+    seconds = _first_int("wait_for_seconds", "seconds", "wait_seconds")
+    if seconds is not None:
+        return "wait", reason, False, {"seconds": seconds}
+    # Wait with no usable target — can't park on nothing; treat as continue.
+    return "continue", f"{reason} (wait verdict had no target — continuing)", False, None
+
+
+def _render_background_block(background_processes: Optional[List[Dict[str, Any]]]) -> str:
+    """Render the live background-process list for the judge prompt.
+
+    Each entry is a ``process_registry.list_sessions()`` dict. Only RUNNING
+    processes are worth showing (an exited one is nothing to wait on). Returns
+    an empty string when there's nothing running, so the judge prompt is
+    byte-identical to the no-background case (no behavior change for the
+    common path).
+    """
+    if not background_processes:
+        return ""
+    lines: List[str] = []
+    for p in background_processes:
+        if not isinstance(p, dict):
+            continue
+        if p.get("status") == "exited":
+            continue
+        pid = p.get("pid")
+        if not pid:
+            continue
+        cmd = _truncate(str(p.get("command") or "").replace("\n", " ").strip(), 120)
+        uptime = p.get("uptime_seconds")
+        tail = _truncate(str(p.get("output_preview") or "").replace("\n", " ").strip(), 120)
+        sid = p.get("session_id")
+        line = f"- pid {pid}"
+        if sid:
+            line += f" / session {sid}"
+        line += f": {cmd}"
+        if uptime is not None:
+            line += f" (running {uptime}s)"
+        # Surface the process's own trigger so the judge can wait on a
+        # mid-run signal (watch-pattern) or completion, not just exit.
+        wps = p.get("watch_patterns")
+        if wps:
+            hit = " [already matched]" if p.get("watch_hit") else ""
+            line += f" | watch_patterns={wps}{hit}"
+        elif p.get("notify_on_complete"):
+            line += " | notify_on_complete"
+        if tail:
+            line += f" | recent output: {tail}"
+        lines.append(line)
+    if not lines:
+        return ""
+    return JUDGE_BACKGROUND_BLOCK_TEMPLATE.format(background_lines="\n".join(lines))


 def judge_goal(
@ -412,11 +617,14 @@ def judge_goal(
    *,
    timeout: float = DEFAULT_JUDGE_TIMEOUT,
    subgoals: Optional[List[str]] = None,
-) -> Tuple[str, str, bool]:
+    background_processes: Optional[List[Dict[str, Any]]] = None,
+) -> Tuple[str, str, bool, Optional[Dict[str, Any]]]:
    """Ask the auxiliary model whether the goal is satisfied.

-    Returns ``(verdict, reason, parse_failed)`` where verdict is ``"done"``,
-    ``"continue"``, or ``"skipped"`` (when the judge couldn't be reached).
+    Returns ``(verdict, reason, parse_failed, wait_directive)`` where verdict
+    is ``"done"``, ``"continue"``, ``"wait"``, or ``"skipped"`` (when the
+    judge couldn't be reached). ``wait_directive`` is set only for ``"wait"``
+    (``{"session_id": str}``, ``{"pid": int}``, or ``{"seconds": int}``); ``None`` otherwise.

    ``parse_failed`` is True only when the judge call succeeded but its output
    was unusable (empty or non-JSON). API/transport errors return False — they
@ -425,37 +633,39 @@ def judge_goal(
    ``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).

    ``subgoals`` is an optional list of user-added criteria (from
-    ``/subgoal``) that the judge must also factor into its DONE/CONTINUE
-    decision. When non-empty the prompt switches to the with-subgoals
-    template; otherwise behavior is identical to the original judge.
+    ``/subgoal``) factored into the verdict. ``background_processes`` is the
+    live ``process_registry.list_sessions()`` snapshot; when the agent is
+    waiting on one (a CI poller, build, etc.) the judge can return a ``wait``
+    verdict naming its session id or pid, parking the loop instead of re-poking.

-    This is deliberately fail-open: any error returns ``("continue", "...", False)``
+    This is deliberately fail-open: any error returns ``("continue", ..., False, None)``
    so a broken judge doesn't wedge progress — the turn budget and the
    consecutive-parse-failures auto-pause are the backstops.
    """
    if not goal.strip():
-        return "skipped", "empty goal", False
+        return "skipped", "empty goal", False, None
    if not last_response.strip():
        # No substantive reply this turn — almost certainly not done yet.
-        return "continue", "empty response (nothing to evaluate)", False
+        return "continue", "empty response (nothing to evaluate)", False, None

    try:
        from agent.auxiliary_client import get_auxiliary_extra_body, get_text_auxiliary_client
    except Exception as exc:
        logger.debug("goal judge: auxiliary client import failed: %s", exc)
-        return "continue", "auxiliary client unavailable", False
+        return "continue", "auxiliary client unavailable", False, None

    try:
        client, model = get_text_auxiliary_client("goal_judge")
    except Exception as exc:
        logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
-        return "continue", "auxiliary client unavailable", False
+        return "continue", "auxiliary client unavailable", False, None

    if client is None or not model:
-        return "continue", "no auxiliary client configured", False
+        return "continue", "no auxiliary client configured", False, None

    # Build the prompt — pick the with-subgoals variant when applicable.
    clean_subgoals = [s.strip() for s in (subgoals or []) if s and s.strip()]
+    background_block = _render_background_block(background_processes)
    current_time = datetime.now(tz=timezone.utc).astimezone().strftime("%Y-%m-%d %H:%M:%S %Z")
    if clean_subgoals:
        subgoals_block = "\n".join(
@ -465,12 +675,14 @@ def judge_goal(
            goal=_truncate(goal, 2000),
            subgoals_block=_truncate(subgoals_block, 2000),
            response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
+            background_block=background_block,
            current_time=current_time,
        )
    else:
        prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
            goal=_truncate(goal, 2000),
            response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
+            background_block=background_block,
            current_time=current_time,
        )

@ -488,17 +700,40 @@ def judge_goal(
        )
    except Exception as exc:
        logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
-        return "continue", f"judge error: {type(exc).__name__}", False
+        return "continue", f"judge error: {type(exc).__name__}", False, None

    try:
        raw = resp.choices[0].message.content or ""
    except Exception:
        raw = ""

-    done, reason, parse_failed = _parse_judge_response(raw)
-    verdict = "done" if done else "continue"
-    logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
-    return verdict, reason, parse_failed
+    verdict, reason, parse_failed, wait_directive = _parse_judge_response(raw)
+    logger.info(
+        "goal judge: verdict=%s reason=%s%s",
+        verdict, _truncate(reason, 120),
+        f" wait={wait_directive}" if wait_directive else "",
+    )
+    return verdict, reason, parse_failed, wait_directive
+
+
+def gather_background_processes(task_id: Optional[str] = None) -> List[Dict[str, Any]]:
+    """Return the live background-process snapshot for the goal judge.
+
+    Thin, fail-safe wrapper over ``process_registry.list_sessions(task_id)``.
+    Returns only RUNNING processes (an exited one is nothing to wait on) and
+    never raises — any import/registry failure yields ``[]`` so the goal loop
+    degrades to its pre-wait-barrier behavior (judge just won't see processes).
+    The drivers (CLI + gateway) call this and pass the result into
+    ``GoalManager.evaluate_after_turn(background_processes=...)``.
+    """
+    try:
+        from tools.process_registry import process_registry
+
+        sessions = process_registry.list_sessions(task_id=task_id) or []
+    except Exception as exc:
+        logger.debug("gather_background_processes failed: %s", exc)
+        return []
+    return [s for s in sessions if isinstance(s, dict) and s.get("status") != "exited"]


 # ──────────────────────────────────────────────────────────────────────
@ -547,6 +782,16 @@ class GoalManager:
        turns = f"{s.turns_used}/{s.max_turns} turns"
        sub = f", {len(s.subgoals)} subgoal{'s' if len(s.subgoals) != 1 else ''}" if s.subgoals else ""
        if s.status == "active":
+            if s.waiting_on_session and _session_waiting(s.waiting_on_session):
+                wr = s.waiting_reason or f"session {s.waiting_on_session}"
+                return f"⏳ Goal (parked on {wr}, {turns}{sub}): {s.goal}"
+            if s.waiting_on_pid and _pid_alive(s.waiting_on_pid):
+                wr = s.waiting_reason or f"pid {s.waiting_on_pid}"
+                return f"⏳ Goal (parked on {wr}, {turns}{sub}): {s.goal}"
+            if s.waiting_until and time.time() < s.waiting_until:
+                remaining = int(s.waiting_until - time.time())
+                wr = s.waiting_reason or f"{remaining}s"
+                return f"⏳ Goal (parked {remaining}s — {wr}, {turns}{sub}): {s.goal}"
            return f"⊙ Goal (active, {turns}{sub}): {s.goal}"
        if s.status == "paused":
            extra = f" — {s.paused_reason}" if s.paused_reason else ""
@ -578,6 +823,12 @@ class GoalManager:
            return None
        self._state.status = "paused"
        self._state.paused_reason = reason
+        # A wait barrier is meaningless once paused — drop it.
+        self._state.waiting_on_pid = None
+        self._state.waiting_on_session = None
+        self._state.waiting_until = 0.0
+        self._state.waiting_reason = None
+        self._state.waiting_since = 0.0
        save_goal(self.session_id, self._state)
        return self._state

@ -586,6 +837,12 @@ class GoalManager:
            return None
        self._state.status = "active"
        self._state.paused_reason = None
+        # Resuming starts fresh — clear any stale barrier.
+        self._state.waiting_on_pid = None
+        self._state.waiting_on_session = None
+        self._state.waiting_until = 0.0
+        self._state.waiting_reason = None
+        self._state.waiting_since = 0.0
        if reset_budget:
            self._state.turns_used = 0
        save_goal(self.session_id, self._state)
@ -653,6 +910,123 @@ class GoalManager:
            return "(no subgoals — use /subgoal <text> to add criteria)"
        return self._state.render_subgoals_block()

+    # --- /goal wait barrier -------------------------------------------
+
+    def wait_on(self, pid: int, reason: str = "") -> GoalState:
+        """Park the goal loop on a background process PID.
+
+        While the PID is alive, ``evaluate_after_turn`` returns
+        ``should_continue=False`` without burning a turn or calling the
+        judge — the loop quiesces instead of re-poking the agent into busy
+        work. The barrier auto-clears when the process exits. Requires an
+        active goal. For a process with a watch_patterns/notify_on_complete
+        trigger, prefer ``wait_on_session`` so a mid-run trigger (not just
+        exit) releases the barrier.
+        """
+        if self._state is None or self._state.status != "active":
+            raise RuntimeError("no active goal to park")
+        pid = int(pid)
+        if pid <= 0:
+            raise ValueError("pid must be a positive integer")
+        self._state.waiting_on_pid = pid
+        self._state.waiting_on_session = None
+        self._state.waiting_until = 0.0
+        self._state.waiting_reason = (reason or "").strip() or None
+        self._state.waiting_since = time.time()
+        save_goal(self.session_id, self._state)
+        return self._state
+
+    def wait_on_session(self, session_id: str, reason: str = "") -> GoalState:
+        """Park the goal loop on a process_registry session's OWN trigger.
+
+        Unlike ``wait_on`` (which releases only on PID exit), this releases
+        when the session's trigger fires: it exits, OR — if it was started
+        with ``watch_patterns`` — its pattern matches. This is the right
+        barrier for a long-lived watcher/server/poller that signals mid-run
+        and may never exit. Requires an active goal.
+        """
+        if self._state is None or self._state.status != "active":
+            raise RuntimeError("no active goal to park")
+        session_id = str(session_id or "").strip()
+        if not session_id:
+            raise ValueError("session_id must be a non-empty string")
+        self._state.waiting_on_session = session_id
+        self._state.waiting_on_pid = None
+        self._state.waiting_until = 0.0
+        self._state.waiting_reason = (reason or "").strip() or None
+        self._state.waiting_since = time.time()
+        save_goal(self.session_id, self._state)
+        return self._state
+
+    def wait_for_seconds(self, seconds: int, reason: str = "") -> GoalState:
+        """Park the goal loop until ``seconds`` from now have elapsed.
+
+        Time-based counterpart to ``wait_on`` — for backoff / cooldown waits
+        where there's no process to track (e.g. the agent is rate-limited).
+        The barrier auto-clears once the deadline passes. Requires an active
+        goal.
+        """
+        if self._state is None or self._state.status != "active":
+            raise RuntimeError("no active goal to park")
+        seconds = int(seconds)
+        if seconds <= 0:
+            raise ValueError("seconds must be a positive integer")
+        self._state.waiting_on_pid = None
+        self._state.waiting_on_session = None
+        self._state.waiting_until = time.time() + seconds
+        self._state.waiting_reason = (reason or "").strip() or None
+        self._state.waiting_since = time.time()
+        save_goal(self.session_id, self._state)
+        return self._state
+
+    def stop_waiting(self) -> bool:
+        """Clear any active wait barrier (pid / session / time). Returns True
+        if one was cleared."""
+        if self._state is None:
+            return False
+        if (
+            self._state.waiting_on_pid is None
+            and self._state.waiting_on_session is None
+            and not self._state.waiting_until
+        ):
+            return False
+        self._state.waiting_on_pid = None
+        self._state.waiting_on_session = None
+        self._state.waiting_until = 0.0
+        self._state.waiting_reason = None
+        self._state.waiting_since = 0.0
+        save_goal(self.session_id, self._state)
+        return True
+
+    def is_waiting(self) -> bool:
+        """True iff a barrier is set AND not yet satisfied.
+
+        Session barrier: active until the process exits or its watch-pattern
+        trigger fires. Pid barrier: active while the process is alive. Time
+        barrier: active until the deadline passes. Side effect: a satisfied
+        barrier is cleared here (lazy auto-clear) so the next evaluation
+        resumes normal judging.
+        """
+        s = self._state
+        if s is None:
+            return False
+        if s.waiting_on_session is not None:
+            if _session_waiting(s.waiting_on_session):
+                return True
+            self.stop_waiting()  # session exited or trigger fired
+            return False
+        if s.waiting_on_pid is not None:
+            if _pid_alive(s.waiting_on_pid):
+                return True
+            self.stop_waiting()  # process gone
+            return False
+        if s.waiting_until:
+            if time.time() < s.waiting_until:
+                return True
+            self.stop_waiting()  # deadline passed
+            return False
+        return False
+
    # --- the main entry point called after every turn -----------------

    def evaluate_after_turn(
@@ -660,6 +1034,7 @@ class GoalManager:
        last_response: str,
        *,
        user_initiated: bool = True,
+        background_processes: Optional[List[Dict[str, Any]]] = None,
    ) -> Dict[str, Any]:
        """Run the judge and update state. Return a decision dict.

@@ -667,11 +1042,16 @@ class GoalManager:
        continuation prompt we fed ourselves (False). Both increment
        ``turns_used`` because both consume model budget.

+        ``background_processes`` is the live ``process_registry.list_sessions()``
+        snapshot for this session. It's handed to the judge so it can decide
+        to WAIT on an in-flight process (CI poller, build, ...) instead of
+        re-poking the agent — the automatic counterpart to ``/goal wait``.
+
        Decision keys:
          - ``status``: current goal status after update
          - ``should_continue``: bool — caller should fire another turn
          - ``continuation_prompt``: str or None
-          - ``verdict``: "done" | "continue" | "skipped" | "inactive"
+          - ``verdict``: "done" | "continue" | "wait" | "waiting" | "skipped" | "inactive"
          - ``reason``: str
          - ``message``: user-visible one-liner to print/send
        """
@@ -686,12 +1066,36 @@ class GoalManager:
                "message": "",
            }

+        # Wait barrier: if the loop is parked (on a session trigger, a live
+        # pid, or a deadline that hasn't passed), quiesce: do NOT burn a turn
+        # or call the judge. Resumes automatically once the barrier clears.
+        if self.is_waiting():
+            if state.waiting_on_session is not None:
+                tgt = f"session {state.waiting_on_session}"
+            elif state.waiting_on_pid is not None:
+                tgt = f"pid {state.waiting_on_pid}"
+            else:
+                remaining = max(0, int(state.waiting_until - time.time()))
+                tgt = f"{remaining}s remaining"
+            reason = state.waiting_reason or tgt
+            return {
+                "status": "active",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "waiting",
+                "reason": reason,
+                "message": f"⏳ Goal parked — waiting on {tgt}" + (f": {state.waiting_reason}" if state.waiting_reason else ""),
+            }
+
        # Count the turn that just finished.
        state.turns_used += 1
        state.last_turn_at = time.time()

-        verdict, reason, parse_failed = judge_goal(
-            state.goal, last_response, subgoals=state.subgoals or None
+        verdict, reason, parse_failed, wait_directive = judge_goal(
+            state.goal,
+            last_response,
+            subgoals=state.subgoals or None,
+            background_processes=background_processes,
        )
        state.last_verdict = verdict
        state.last_reason = reason
@@ -704,6 +1108,31 @@ class GoalManager:
        else:
            state.consecutive_parse_failures = 0

+        # WAIT verdict: the judge decided the agent is blocked on async work
+        # and re-poking now would be busy-work. Set the barrier and park:
+        # the turn we just counted stands (the judge call happened), but no
+        # continuation fires. The loop resumes automatically when the session
+        # trigger fires, the pid exits, or the deadline passes (the next
+        # evaluate_after_turn falls through is_waiting() once it clears).
+        if verdict == "wait" and wait_directive:
+            if wait_directive.get("session_id"):
+                self.wait_on_session(str(wait_directive["session_id"]), reason=reason)
+                tgt = f"session {wait_directive['session_id']}"
+            elif wait_directive.get("pid"):
+                self.wait_on(int(wait_directive["pid"]), reason=reason)
+                tgt = f"pid {wait_directive['pid']}"
+            else:
+                self.wait_for_seconds(int(wait_directive["seconds"]), reason=reason)
+                tgt = f"{wait_directive['seconds']}s"
+            return {
+                "status": "active",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "wait",
+                "reason": reason,
+                "message": f"⏳ Goal parked (judge) — waiting on {tgt}: {reason}",
+            }
+
        if verdict == "done":
            state.status = "done"
            save_goal(self.session_id, state)
@@ -889,7 +1318,12 @@ def run_kanban_goal_loop(
            return {"outcome": "stopped", "turns_used": turns_used, "reason": f"status={status}"}

        # Still open — judge whether the latest response satisfies the card.
-        verdict, reason, _parse_failed = judge_goal(goal_text, last_response)
+        # The kanban worker loop has no wait-barrier concept (workers finish
+        # via kanban_complete / kanban_block, not by parking), so a WAIT
+        # verdict is treated as CONTINUE here.
+        verdict, reason, _parse_failed, _wait = judge_goal(goal_text, last_response)
+        if verdict == "wait":
+            verdict = "continue"
        _log(f"kanban goal loop: turn {turns_used}/{max_turns} verdict={verdict} reason={_truncate(reason, 120)}")

        if verdict == "done":