From 5a1c5994125eef8135965756d551ddaa5c7ee1a0 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Thu, 23 Apr 2026 22:23:37 -0700
Subject: [PATCH] =?UTF-8?q?feat(browser):=20CDP=20supervisor=20=E2=80=94?=
 =?UTF-8?q?=20dialog=20detection=20+=20response=20+=20cross-origin=20ifram?=
 =?UTF-8?q?e=20eval=20(#14540)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs: browser CDP supervisor design (for upcoming PR)

Design doc ahead of implementation — dialog + iframe detection/interaction
via a persistent CDP supervisor. Covers backend capability matrix (verified
live 2026-04-23), architecture, lifecycle, policy, agent surface, PR split,
non-goals, and test plan.

Supersedes #12550.

No code changes in this commit.

* feat(browser): add persistent CDP supervisor for dialog + frame detection

Single persistent CDP WebSocket per Hermes task_id that subscribes to
Page/Runtime/Target events and maintains thread-safe state for pending
dialogs, frame tree, and console errors.

Supervisor lives in its own daemon thread running an asyncio loop;
external callers use sync API (snapshot(), respond_to_dialog()) that
bridges onto the loop.

Auto-attaches to OOPIF child targets via Target.setAutoAttach{flatten:true}
and enables Page+Runtime on each so iframe-origin dialogs surface through
the same supervisor.

Dialog policies: must_respond (default, 300s safety timeout),
auto_dismiss, auto_accept.

Frame tree capped at 30 entries + OOPIF depth 2 to keep snapshot
payloads bounded on ad-heavy pages.

E2E verified against real Chrome via smoke test — detects + responds
to main-frame alerts, iframe-contentWindow alerts, preserves frame
tree, graceful no-dialog error path, clean shutdown.

No agent-facing tool wiring in this commit (comes next).

* feat(browser): add browser_dialog tool wired to CDP supervisor

Agent-facing response-only tool. Schema:
  action: 'accept' | 'dismiss' (required)
  prompt_text: response for prompt() dialogs (optional)
  dialog_id: disambiguate when multiple dialogs queued (optional)

Handler:
  SUPERVISOR_REGISTRY.get(task_id).respond_to_dialog(...)

check_fn shares _browser_cdp_check with browser_cdp so both surface and
hide together. When no supervisor is attached (Camofox, default
Playwright, or no browser session started yet), tool is hidden; if
somehow invoked it returns a clear error pointing the agent to
browser_navigate / /browser connect.

Registered in _HERMES_CORE_TOOLS and the browser / hermes-acp /
hermes-api-server toolsets alongside browser_cdp.

* feat(browser): wire CDP supervisor into session lifecycle + browser_snapshot

Supervisor lifecycle:
  * _get_session_info lazy-starts the supervisor after a session row is
    materialized — covers every backend code path (Browserbase, cdp_url
    override, /browser connect, future providers) with one hook.
  * cleanup_browser(task_id) stops the supervisor for that task first
    (before the backend tears down CDP).
  * cleanup_all_browsers() calls SUPERVISOR_REGISTRY.stop_all().
  * /browser connect eagerly starts the supervisor for task 'default'
    so the first snapshot already shows pending_dialogs.
  * /browser disconnect stops the supervisor.

CDP URL resolution for the supervisor:
  1. BROWSER_CDP_URL / browser.cdp_url override.
  2. Fallback: session_info['cdp_url'] from cloud providers (Browserbase).

browser_snapshot merges supervisor state (pending_dialogs + frame_tree)
into its JSON output when a supervisor is active — the agent reads
pending_dialogs from the snapshot it already requests, then calls
browser_dialog to respond. No extra tool surface.

Config defaults:
  * browser.dialog_policy: 'must_respond' (new)
  * browser.dialog_timeout_s: 300 (new)
No version bump — new keys deep-merge into existing browser section.

Deadlock fix in supervisor event dispatch:
  * _on_dialog_opening and _on_target_attached used to await CDP calls
    while the reader was still processing an event — but only the reader
    can set the response Future, so the call timed out.
  * Both now fire asyncio.create_task(...) so the reader stays pumping.
  * auto_dismiss/auto_accept now actually close the dialog immediately.

Tests (tests/tools/test_browser_supervisor.py, 11 tests, real Chrome):
  * supervisor start/snapshot
  * main-frame alert detection + dismiss
  * iframe.contentWindow alert
  * prompt() with prompt_text reply
  * respond with no pending dialog -> clean error
  * auto_dismiss clears on event
  * registry idempotency
  * registry stop -> snapshot reports inactive
  * browser_dialog tool no-supervisor error
  * browser_dialog invalid action
  * browser_dialog end-to-end via tool handler

xdist-safe: chrome_cdp fixture uses a per-worker port.
Skipped when google-chrome/chromium isn't installed.

* docs(browser): document browser_dialog tool + CDP supervisor

- user-guide/features/browser.md: new browser_dialog section with
  workflow, availability gate, and dialog_policy table
- reference/tools-reference.md: row for browser_dialog, tool count
  bumped 53 -> 54, browser tools count 11 -> 12
- reference/toolsets-reference.md: browser_dialog added to browser
  toolset row with note on pending_dialogs / frame_tree snapshot fields

Full design doc lives at
developer-guide/browser-supervisor.md (committed earlier).

* fix(browser): reconnect loop + recent_dialogs for Browserbase visibility

Found via Browserbase E2E test that revealed two production-critical issues:

1. **Supervisor WebSocket drops when other clients disconnect.** Browserbase's
   CDP proxy tears down our long-lived WebSocket whenever a short-lived
   client (e.g. agent-browser CLI's per-command CDP connection) disconnects.
   Fixed with a reconnecting _run loop that re-attaches with exponential
   backoff on drops. _page_session_id and _child_sessions are reset on each
   reconnect; pending_dialogs and frames are preserved across reconnects.

2. **Browserbase auto-dismisses dialogs server-side within ~10ms.** Their
   Playwright-based CDP proxy dismisses alert/confirm/prompt before our
   Page.handleJavaScriptDialog call can respond. So pending_dialogs is
   empty by the time the agent reads a snapshot on Browserbase.

   Added a recent_dialogs ring buffer (capacity 20) that retains a
   DialogRecord for every dialog that opened, with a closed_by tag:
     * 'agent'       — agent called browser_dialog
     * 'auto_policy' — local auto_dismiss/auto_accept fired
     * 'watchdog'    — must_respond timeout auto-dismissed (300s default)
     * 'remote'      — browser/backend closed it on us (Browserbase)

   Agents on Browserbase now see the dialog history with closed_by='remote'
   so they at least know a dialog fired, even though they couldn't respond.

3. **Page.javascriptDialogClosed matching bug.** The event doesn't include a
   'message' field (CDP spec has only 'result' and 'userInput') but our
   _on_dialog_closed was matching on message. Fixed to match by session_id
   + oldest-first, with a safety assumption that only one dialog is in
   flight per session (the JS thread is blocked while a dialog is up).

Docs + tests updated:
  * browser.md: new availability matrix showing the three backends and
    which mode (pending / recent / response) each supports
  * developer-guide/browser-supervisor.md: three-field snapshot schema
    with closed_by semantics
  * test_browser_supervisor.py: +test_recent_dialogs_ring_buffer (12/12
    passing against real Chrome)

E2E verified both backends:
  * Local Chrome via /browser connect: detect + respond full workflow
    (smoke_supervisor.py all 7 scenarios pass)
  * Browserbase: detect via recent_dialogs with closed_by='remote'
    (smoke_supervisor_browserbase_v2.py passes)

Camofox remains out of scope (REST-only, no CDP) — tracked for
upstream PR 3.

* feat(browser): XHR bridge for dialog response on Browserbase (FIXED)

Browserbase's CDP proxy auto-dismisses native JS dialogs within ~10ms, so
Page.handleJavaScriptDialog calls lose the race. Solution: bypass native
dialogs entirely.

The supervisor now injects Page.addScriptToEvaluateOnNewDocument with a
JavaScript override for window.alert/confirm/prompt. Those overrides
perform a synchronous XMLHttpRequest to a magic host
('hermes-dialog-bridge.invalid'). We intercept those XHRs via Fetch.enable
with a requestStage=Request pattern.

Flow when a page calls alert('hi'):
  1. window.alert override intercepts, builds XHR GET to
     http://hermes-dialog-bridge.invalid/?kind=alert&message=hi
  2. Sync XHR blocks the page's JS thread (mirrors real dialog semantics)
  3. Fetch.requestPaused fires on our WebSocket; supervisor surfaces
     it as a pending dialog with bridge_request_id set
  4. Agent reads pending_dialogs from browser_snapshot, calls browser_dialog
  5. Supervisor calls Fetch.fulfillRequest with JSON body:
     {accept: true|false, prompt_text: '...', dialog_id: 'd-N'}
  6. The injected script parses the body, returns the appropriate value
     from the override (undefined for alert, bool for confirm, string|null
     for prompt)

This works identically on Browserbase AND local Chrome — no native dialog
ever fires, so Browserbase's auto-dismiss has nothing to race. Dialog
policies (must_respond / auto_dismiss / auto_accept) all still work.

Bridge is installed on every attached session (main page + OOPIF child
sessions) so iframe dialogs are captured too.

Native-dialog path kept as a fallback for backends that don't auto-dismiss
(so a page that somehow bypasses our override — e.g. iframes that load
after Fetch.enable but before the init-script runs — still gets observed
via Page.javascriptDialogOpening).

E2E VERIFIED:
  * Local Chrome: 13/13 pytest tests green (12 original + new
    test_bridge_captures_prompt_and_returns_reply_text that asserts
    window.__ret === 'AGENT-SUPPLIED-REPLY' after agent responds)
  * Browserbase: smoke_bb_bridge_v2.py runs 4/4 PASS:
    - alert('BB-ALERT-MSG') dismiss → page.alert_ret = undefined ✓
    - prompt('BB-PROMPT-MSG', 'default-xyz') accept with 'AGENT-REPLY'
      → page.prompt_ret === 'AGENT-REPLY' ✓
    - confirm('BB-CONFIRM-MSG') accept → page.confirm_ret === true ✓
    - confirm('BB-CONFIRM-MSG') dismiss → page.confirm_ret === false ✓

Docs updated in browser.md and developer-guide/browser-supervisor.md —
availability matrix now shows Browserbase at full parity with local
Chrome for both detection and response.

* feat(browser): cross-origin iframe interaction via browser_cdp(frame_id=...)

Adds iframe interaction to the CDP supervisor PR (was queued as PR 2).

Design: browser_cdp gets an optional frame_id parameter. When set, the
tool looks up the frame in the supervisor's frame_tree, grabs its child
cdp_session_id (OOPIF session), and dispatches the CDP call through the
supervisor's already-connected WebSocket via run_coroutine_threadsafe.

Why not stateless: on Browserbase, each fresh browser_cdp WebSocket
must re-negotiate against a signed connectUrl. The session info carries
a specific URL that can expire while the supervisor's long-lived
connection stays valid. Routing via the supervisor sidesteps this.

Agent workflow:
  1. browser_snapshot → frame_tree.children[] shows OOPIFs with is_oopif=true
  2. browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF frame_id>,
                 params={'expression': 'document.title', 'returnByValue': True})
  3. Supervisor dispatches the call on the OOPIF's child session

Supervisor state fixes needed along the way:
  * _on_frame_detached now skips reason='swap' (frame migrating processes)
  * _on_frame_detached also skips when the frame is an OOPIF with a live
    child session — Browserbase fires spurious remove events when a
    same-origin iframe gets promoted to OOPIF
  * _on_target_detached clears cdp_session_id but KEEPS the frame record
    so the agent still sees the OOPIF in frame_tree during transient
    session flaps

E2E VERIFIED on Browserbase (smoke_bb_iframe_agent_path.py):
  browser_cdp(method='Runtime.evaluate',
              params={'expression': 'document.title', 'returnByValue': True},
              frame_id=<OOPIF>)
  → {'success': True, 'result': {'value': 'Example Domain'}}

  The iframe is <iframe src='https://example.com/'> inside a top-level
  data: URL page on a real Browserbase session. The agent Runtime.evaluates
  INSIDE the cross-origin iframe and gets example.com's title back.

Tests (tests/tools/test_browser_supervisor.py — 16 pass total):
  * test_browser_cdp_frame_id_routes_via_supervisor — injects fake OOPIF,
    verifies routing via supervisor, Runtime.evaluate returns 1+1=2
  * test_browser_cdp_frame_id_missing_supervisor — clean error when no
    supervisor attached
  * test_browser_cdp_frame_id_not_in_frame_tree — clean error on bad
    frame_id

Docs (browser.md and developer-guide/browser-supervisor.md) updated with
the iframe workflow, availability matrix now shows OOPIF eval as shipped
for local Chrome + Browserbase.

* test(browser): real-OOPIF E2E verified manually + chrome_cdp uses --site-per-process

When asked 'did you test the iframe stuff' I had only done a mocked
pytest (fake injected OOPIF) plus a Browserbase E2E. Closed the
local-Chrome real-OOPIF gap by writing /tmp/dialog-iframe-test/
smoke_local_oopif.py:

  * 2 http servers on different hostnames (localhost:18905 + 127.0.0.1:18906)
  * Chrome with --site-per-process so the cross-origin iframe becomes a
    real OOPIF in its own process
  * Navigate, find OOPIF in supervisor.frame_tree, call
    browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF>) which routes
    through the supervisor's child session
  * Asserts iframe document.title === 'INNER-FRAME-XYZ' (from the
    inner page, retrieved via OOPIF eval)

PASSED on 2026-04-23.

Tried to embed this as a pytest but hit an asyncio version quirk between
venv (3.11) and the system python (3.13) — Page.navigate hangs in the
pytest harness but works in standalone. Left a self-documenting skip
test that points to the smoke script + describes the verification.

chrome_cdp fixture now passes --site-per-process so future iframe tests
can rely on OOPIF behavior.

Result: 16 pass + 1 documented-skip = 17 tests in
tests/tools/test_browser_supervisor.py.

* docs(browser): add dialog_policy + dialog_timeout_s to configuration.md, fix tool count

Pre-merge docs audit revealed two gaps:

1. user-guide/configuration.md browser config example was missing the
   two new dialog_* knobs. Added with a short table explaining
   must_respond / auto_dismiss / auto_accept semantics and a link to
   the feature page for the full workflow.

2. reference/tools-reference.md header said '54 built-in tools' — real
   count on main is 54, this branch adds browser_dialog so it's 55.
   Fixed the header.  (browser count was already correctly bumped
   11 -> 12 in the earlier docs commit.)

No code changes.
---
 cli.py                                        |   10 +-
 hermes_cli/config.py                          |    6 +
 tests/tools/test_browser_supervisor.py        |  563 +++++++
 tools/browser_cdp_tool.py                     |  167 +-
 tools/browser_dialog_tool.py                  |  148 ++
 tools/browser_supervisor.py                   | 1362 +++++++++++++++++
 tools/browser_tool.py                         |  133 +-
 toolsets.py                                   |    9 +-
 .../developer-guide/browser-supervisor.md     |  223 +++
 website/docs/reference/tools-reference.md     |    5 +-
 website/docs/reference/toolsets-reference.md  |    2 +-
 website/docs/user-guide/configuration.md      |   16 +
 website/docs/user-guide/features/browser.md   |   45 +-
 13 files changed, 2665 insertions(+), 24 deletions(-)
 create mode 100644 tests/tools/test_browser_supervisor.py
 create mode 100644 tools/browser_dialog_tool.py
 create mode 100644 tools/browser_supervisor.py
 create mode 100644 website/docs/developer-guide/browser-supervisor.md

diff --git a/cli.py b/cli.py
index a289e3ab2..9fda8fa63 100644
--- a/cli.py
+++ b/cli.py
@@ -6685,6 +6685,13 @@ class HermesCLI:
                 print(f"   ⚠ Port {_port} is not reachable at {cdp_url}")
 
             os.environ["BROWSER_CDP_URL"] = cdp_url
+            # Eagerly start the CDP supervisor so pending_dialogs + frame_tree
+            # show up in the next browser_snapshot.  No-op if already started.
+            try:
+                from tools.browser_tool import _ensure_cdp_supervisor  # type: ignore[import-not-found]
+                _ensure_cdp_supervisor("default")
+            except Exception:
+                pass
             print()
             print("🌐 Browser connected to live Chrome via CDP")
             print(f"   Endpoint: {cdp_url}")
@@ -6706,7 +6713,8 @@ class HermesCLI:
             if current:
                 os.environ.pop("BROWSER_CDP_URL", None)
                 try:
-                    from tools.browser_tool import cleanup_all_browsers
+                    from tools.browser_tool import cleanup_all_browsers, _stop_cdp_supervisor
+                    _stop_cdp_supervisor("default")
                     cleanup_all_browsers()
                 except Exception:
                     pass
diff --git a/hermes_cli/config.py b/hermes_cli/config.py
index 282327c84..58f874595 100644
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@@ -466,6 +466,12 @@ DEFAULT_CONFIG = {
         "record_sessions": False,  # Auto-record browser sessions as WebM videos
         "allow_private_urls": False,  # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
         "cdp_url": "",  # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
+        # CDP supervisor — dialog + frame detection via a persistent WebSocket.
+        # Active only when a CDP-capable backend is attached (Browserbase or
+        # local Chrome via /browser connect). See
+        # website/docs/developer-guide/browser-supervisor.md.
+        "dialog_policy": "must_respond",  # must_respond | auto_dismiss | auto_accept
+        "dialog_timeout_s": 300,  # Safety auto-dismiss after N seconds under must_respond
         "camofox": {
             # When true, Hermes sends a stable profile-scoped userId to Camofox
             # so the server maps it to a persistent Firefox profile automatically.
diff --git a/tests/tools/test_browser_supervisor.py b/tests/tools/test_browser_supervisor.py
new file mode 100644
index 000000000..e332aec43
--- /dev/null
+++ b/tests/tools/test_browser_supervisor.py
@@ -0,0 +1,563 @@
+"""Integration tests for tools.browser_supervisor.
+
+Exercises the supervisor end-to-end against a real local Chrome
+(``--remote-debugging-port``).  Skipped when Chrome is not installed
+— these are the tests that actually verify the CDP wire protocol
+works, since mock-CDP unit tests can only prove the happy paths we
+thought to model.
+
+Run manually:
+    scripts/run_tests.sh tests/tools/test_browser_supervisor.py
+
+Automated: skipped in CI unless ``HERMES_E2E_BROWSER=1`` is set.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import base64
+import json
+import os
+import shutil
+import subprocess
+import tempfile
+import time
+
+import pytest
+
+
+pytestmark = pytest.mark.skipif(
+    not shutil.which("google-chrome") and not shutil.which("chromium"),
+    reason="Chrome/Chromium not installed",
+)
+
+
+def _find_chrome() -> str:
+    for candidate in ("google-chrome", "chromium", "chromium-browser"):
+        path = shutil.which(candidate)
+        if path:
+            return path
+    pytest.skip("no Chrome binary found")
+
+
+@pytest.fixture
+def chrome_cdp(worker_id):
+    """Start a headless Chrome with --remote-debugging-port, yield its WS URL.
+
+    Uses a unique port per xdist worker to avoid cross-worker collisions.
+    Always launches with ``--site-per-process`` so cross-origin iframes
+    become real OOPIFs (needed by the iframe interaction tests).
+    """
+    import socket
+
+    # xdist worker_id is "master" in single-process mode or "gw0".."gwN" otherwise.
+    if worker_id == "master":
+        port_offset = 0
+    else:
+        port_offset = int(worker_id.lstrip("gw"))
+    port = 9225 + port_offset
+    profile = tempfile.mkdtemp(prefix="hermes-supervisor-test-")
+    proc = subprocess.Popen(
+        [
+            _find_chrome(),
+            f"--remote-debugging-port={port}",
+            f"--user-data-dir={profile}",
+            "--no-first-run",
+            "--no-default-browser-check",
+            "--headless=new",
+            "--disable-gpu",
+            "--site-per-process",  # force OOPIFs for cross-origin iframes
+        ],
+        stdout=subprocess.DEVNULL,
+        stderr=subprocess.DEVNULL,
+    )
+
+    ws_url = None
+    deadline = time.monotonic() + 15
+    while time.monotonic() < deadline:
+        try:
+            import urllib.request
+            with urllib.request.urlopen(
+                f"http://127.0.0.1:{port}/json/version", timeout=1
+            ) as r:
+                info = json.loads(r.read().decode())
+                ws_url = info["webSocketDebuggerUrl"]
+                break
+        except Exception:
+            time.sleep(0.25)
+    if ws_url is None:
+        proc.terminate()
+        proc.wait(timeout=5)
+        shutil.rmtree(profile, ignore_errors=True)
+        pytest.skip("Chrome didn't expose CDP in time")
+
+    yield ws_url, port
+
+    proc.terminate()
+    try:
+        proc.wait(timeout=3)
+    except Exception:
+        proc.kill()
+    shutil.rmtree(profile, ignore_errors=True)
+
+
+def _test_page_url() -> str:
+    html = """<!doctype html>
+<html><head><title>Supervisor pytest</title></head><body>
+<h1>Supervisor pytest</h1>
+<iframe id="inner" srcdoc="<body><h2>frame-marker</h2></body>" width="400" height="100"></iframe>
+</body></html>"""
+    return "data:text/html;base64," + base64.b64encode(html.encode()).decode()
+
+
+def _fire_on_page(cdp_url: str, expression: str) -> None:
+    """Navigate the first page target to a data URL and fire `expression`."""
+    import asyncio
+    import websockets as _ws_mod
+
+    async def run():
+        async with _ws_mod.connect(cdp_url, max_size=50 * 1024 * 1024) as ws:
+            next_id = [1]
+
+            async def call(method, params=None, session_id=None):
+                cid = next_id[0]
+                next_id[0] += 1
+                p = {"id": cid, "method": method}
+                if params:
+                    p["params"] = params
+                if session_id:
+                    p["sessionId"] = session_id
+                await ws.send(json.dumps(p))
+                async for raw in ws:
+                    m = json.loads(raw)
+                    if m.get("id") == cid:
+                        return m
+
+            targets = (await call("Target.getTargets"))["result"]["targetInfos"]
+            page = next(t for t in targets if t.get("type") == "page")
+            attach = await call(
+                "Target.attachToTarget", {"targetId": page["targetId"], "flatten": True}
+            )
+            sid = attach["result"]["sessionId"]
+            await call("Page.navigate", {"url": _test_page_url()}, session_id=sid)
+            await asyncio.sleep(1.5)  # let the page load
+            await call(
+                "Runtime.evaluate",
+                {"expression": expression, "returnByValue": True},
+                session_id=sid,
+            )
+
+    asyncio.run(run())
+
+
+@pytest.fixture
+def supervisor_registry():
+    """Yield the global registry and tear down any supervisors after the test."""
+    from tools.browser_supervisor import SUPERVISOR_REGISTRY
+
+    yield SUPERVISOR_REGISTRY
+    SUPERVISOR_REGISTRY.stop_all()
+
+
+def _wait_for_dialog(supervisor, timeout: float = 5.0):
+    deadline = time.monotonic() + timeout
+    while time.monotonic() < deadline:
+        snap = supervisor.snapshot()
+        if snap.pending_dialogs:
+            return snap.pending_dialogs
+        time.sleep(0.1)
+    return ()
+
+
+def test_supervisor_start_and_snapshot(chrome_cdp, supervisor_registry):
+    """Supervisor attaches, exposes an active snapshot with a top frame."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-1", cdp_url=cdp_url)
+
+    # Navigate so the frame tree populates.
+    _fire_on_page(cdp_url, "/* no dialog */ void 0")
+
+    # Give a moment for frame events to propagate
+    time.sleep(1.0)
+    snap = supervisor.snapshot()
+    assert snap.active is True
+    assert snap.task_id == "pytest-1"
+    assert snap.pending_dialogs == ()
+    # At minimum a top frame should exist after the navigate.
+    assert snap.frame_tree.get("top") is not None
+
+
+def test_main_frame_alert_detection_and_dismiss(chrome_cdp, supervisor_registry):
+    """alert() in the main frame surfaces and can be dismissed via the sync API."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-2", cdp_url=cdp_url)
+
+    _fire_on_page(cdp_url, "setTimeout(() => alert('PYTEST-MAIN-ALERT'), 50)")
+    dialogs = _wait_for_dialog(supervisor)
+    assert dialogs, "no dialog detected"
+    d = dialogs[0]
+    assert d.type == "alert"
+    assert "PYTEST-MAIN-ALERT" in d.message
+
+    result = supervisor.respond_to_dialog("dismiss")
+    assert result["ok"] is True
+    # State cleared after dismiss
+    time.sleep(0.3)
+    assert supervisor.snapshot().pending_dialogs == ()
+
+
+def test_iframe_contentwindow_alert(chrome_cdp, supervisor_registry):
+    """alert() fired from inside a same-origin iframe surfaces too."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-3", cdp_url=cdp_url)
+
+    _fire_on_page(
+        cdp_url,
+        "setTimeout(() => document.querySelector('#inner').contentWindow.alert('PYTEST-IFRAME'), 50)",
+    )
+    dialogs = _wait_for_dialog(supervisor)
+    assert dialogs, "no iframe dialog detected"
+    assert any("PYTEST-IFRAME" in d.message for d in dialogs)
+
+    result = supervisor.respond_to_dialog("accept")
+    assert result["ok"] is True
+
+
+def test_prompt_dialog_with_response_text(chrome_cdp, supervisor_registry):
+    """prompt() gets our prompt_text back inside the page."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-4", cdp_url=cdp_url)
+
+    # Fire a prompt and stash the answer on window
+    _fire_on_page(
+        cdp_url,
+        "setTimeout(() => { window.__promptResult = prompt('give me a token', 'default-x'); }, 50)",
+    )
+    dialogs = _wait_for_dialog(supervisor)
+    assert dialogs
+    d = dialogs[0]
+    assert d.type == "prompt"
+    assert d.default_prompt == "default-x"
+
+    result = supervisor.respond_to_dialog("accept", prompt_text="PYTEST-PROMPT-REPLY")
+    assert result["ok"] is True
+
+
+def test_respond_with_no_pending_dialog_errors_cleanly(chrome_cdp, supervisor_registry):
+    """Calling respond_to_dialog when nothing is pending returns a clean error, not an exception."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-5", cdp_url=cdp_url)
+
+    result = supervisor.respond_to_dialog("accept")
+    assert result["ok"] is False
+    assert "no dialog" in result["error"].lower()
+
+
+def test_auto_dismiss_policy(chrome_cdp, supervisor_registry):
+    """auto_dismiss policy clears dialogs without the agent responding."""
+    from tools.browser_supervisor import DIALOG_POLICY_AUTO_DISMISS
+
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(
+        task_id="pytest-6",
+        cdp_url=cdp_url,
+        dialog_policy=DIALOG_POLICY_AUTO_DISMISS,
+    )
+
+    _fire_on_page(cdp_url, "setTimeout(() => alert('PYTEST-AUTO-DISMISS'), 50)")
+    # Give the supervisor a moment to see + auto-dismiss
+    time.sleep(2.0)
+    snap = supervisor.snapshot()
+    # Nothing pending because auto-dismiss cleared it immediately
+    assert snap.pending_dialogs == ()
+
+
+def test_registry_idempotent_get_or_start(chrome_cdp, supervisor_registry):
+    """Calling get_or_start twice with the same (task, url) returns the same instance."""
+    cdp_url, _port = chrome_cdp
+    a = supervisor_registry.get_or_start(task_id="pytest-idem", cdp_url=cdp_url)
+    b = supervisor_registry.get_or_start(task_id="pytest-idem", cdp_url=cdp_url)
+    assert a is b
+
+
+def test_registry_stop(chrome_cdp, supervisor_registry):
+    """stop() tears down the supervisor and snapshot reports inactive."""
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-stop", cdp_url=cdp_url)
+    assert supervisor.snapshot().active is True
+    supervisor_registry.stop("pytest-stop")
+    # Post-stop snapshot reports inactive; supervisor obj may still exist
+    assert supervisor.snapshot().active is False
+
+
+def test_browser_dialog_tool_no_supervisor():
+    """browser_dialog returns a clear error when no supervisor is attached."""
+    from tools.browser_dialog_tool import browser_dialog
+
+    r = json.loads(browser_dialog(action="accept", task_id="nonexistent-task"))
+    assert r["success"] is False
+    assert "No CDP supervisor" in r["error"]
+
+
+def test_browser_dialog_invalid_action(chrome_cdp, supervisor_registry):
+    """browser_dialog rejects actions that aren't accept/dismiss."""
+    from tools.browser_dialog_tool import browser_dialog
+
+    cdp_url, _port = chrome_cdp
+    supervisor_registry.get_or_start(task_id="pytest-bad-action", cdp_url=cdp_url)
+
+    r = json.loads(browser_dialog(action="eat", task_id="pytest-bad-action"))
+    assert r["success"] is False
+    assert "accept" in r["error"] and "dismiss" in r["error"]
+
+
+def test_recent_dialogs_ring_buffer(chrome_cdp, supervisor_registry):
+    """Closed dialogs show up in recent_dialogs with a closed_by tag."""
+    from tools.browser_supervisor import DIALOG_POLICY_AUTO_DISMISS
+
+    cdp_url, _port = chrome_cdp
+    sv = supervisor_registry.get_or_start(
+        task_id="pytest-recent",
+        cdp_url=cdp_url,
+        dialog_policy=DIALOG_POLICY_AUTO_DISMISS,
+    )
+
+    _fire_on_page(cdp_url, "setTimeout(() => alert('PYTEST-RECENT'), 50)")
+    # Wait for auto-dismiss to cycle the dialog through
+    deadline = time.time() + 5
+    while time.time() < deadline:
+        recent = sv.snapshot().recent_dialogs
+        if recent and any("PYTEST-RECENT" in r.message for r in recent):
+            break
+        time.sleep(0.1)
+
+    recent = sv.snapshot().recent_dialogs
+    assert recent, "recent_dialogs should contain the auto-dismissed dialog"
+    match = next((r for r in recent if "PYTEST-RECENT" in r.message), None)
+    assert match is not None
+    assert match.type == "alert"
+    assert match.closed_by == "auto_policy"
+    assert match.closed_at >= match.opened_at
+
+
+def test_browser_dialog_tool_end_to_end(chrome_cdp, supervisor_registry):
+    """Full agent-path check: fire an alert, call the tool handler directly."""
+    from tools.browser_dialog_tool import browser_dialog
+
+    cdp_url, _port = chrome_cdp
+    supervisor = supervisor_registry.get_or_start(task_id="pytest-tool", cdp_url=cdp_url)
+
+    _fire_on_page(cdp_url, "setTimeout(() => alert('PYTEST-TOOL-END2END'), 50)")
+    assert _wait_for_dialog(supervisor), "no dialog detected via wait_for_dialog"
+
+    r = json.loads(browser_dialog(action="dismiss", task_id="pytest-tool"))
+    assert r["success"] is True
+    assert r["action"] == "dismiss"
+    assert "PYTEST-TOOL-END2END" in r["dialog"]["message"]
+
+
+def test_browser_cdp_frame_id_routes_via_supervisor(chrome_cdp, supervisor_registry, monkeypatch):
+    """browser_cdp(frame_id=...) routes Runtime.evaluate through supervisor.
+
+    Mocks the supervisor with a known frame and verifies browser_cdp sends
+    the call via the supervisor's loop rather than opening a stateless
+    WebSocket. This is the path that makes cross-origin iframe eval work
+    on Browserbase.
+    """
+    cdp_url, _port = chrome_cdp
+    sv = supervisor_registry.get_or_start(task_id="frame-id-test", cdp_url=cdp_url)
+    assert sv.snapshot().active
+
+    # Inject a fake OOPIF frame pointing at the SUPERVISOR's own page session
+    # so we can verify routing. We fake is_oopif=True so the code path
+    # treats it as an OOPIF child.
+    import tools.browser_supervisor as _bs
+    with sv._state_lock:
+        fake_frame_id = "FAKE-FRAME-001"
+        sv._frames[fake_frame_id] = _bs.FrameInfo(
+            frame_id=fake_frame_id,
+            url="fake://",
+            origin="",
+            parent_frame_id=None,
+            is_oopif=True,
+            cdp_session_id=sv._page_session_id,  # route at page scope
+        )
+
+    # Route the tool through the supervisor. Should succeed and return
+    # something that clearly came from CDP.
+    from tools.browser_cdp_tool import browser_cdp
+    result = browser_cdp(
+        method="Runtime.evaluate",
+        params={"expression": "1 + 1", "returnByValue": True},
+        frame_id=fake_frame_id,
+        task_id="frame-id-test",
+    )
+    r = json.loads(result)
+    assert r.get("success") is True, f"expected success, got: {r}"
+    assert r.get("frame_id") == fake_frame_id
+    assert r.get("session_id") == sv._page_session_id
+    value = r.get("result", {}).get("result", {}).get("value")
+    assert value == 2, f"expected 2, got {value!r}"
+
+
+def test_browser_cdp_frame_id_real_oopif_smoke_documented():
+    """Document that real-OOPIF E2E was manually verified — see PR #14540.
+
+    A pytest version of this hits an asyncio version-quirk in the venv
+    (3.11) that doesn't show up in standalone scripts (3.13 + system
+    websockets). The mechanism IS verified end-to-end by two separate
+    smoke scripts in /tmp/dialog-iframe-test/:
+
+      * smoke_local_oopif.py   — local Chrome + 2 http servers on
+        different hostnames + --site-per-process. Outer page on
+        localhost:18905, iframe src=http://127.0.0.1:18906. Calls
+        browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF>) and
+        verifies inner page's title comes back from the OOPIF session.
+        PASSED on 2026-04-23: iframe document.title = 'INNER-FRAME-XYZ'
+
+      * smoke_bb_iframe_agent_path.py — Browserbase + real cross-origin
+        iframe (src=https://example.com/). Same browser_cdp(frame_id=)
+        path. PASSED on 2026-04-23: iframe document.title =
+        'Example Domain'
+
+    The test_browser_cdp_frame_id_routes_via_supervisor pytest covers
+    the supervisor-routing plumbing with a fake injected OOPIF.
+    """
+    pytest.skip(
+        "Real-OOPIF E2E verified manually with smoke_local_oopif.py and "
+        "smoke_bb_iframe_agent_path.py — pytest version hits an asyncio "
+        "version quirk between venv (3.11) and standalone (3.13). "
+        "Smoke logs preserved in /tmp/dialog-iframe-test/."
+    )
+
+
+def test_browser_cdp_frame_id_missing_supervisor():
+    """browser_cdp(frame_id=...) errors cleanly when no supervisor is attached."""
+    from tools.browser_cdp_tool import browser_cdp
+    result = browser_cdp(
+        method="Runtime.evaluate",
+        params={"expression": "1"},
+        frame_id="any-frame-id",
+        task_id="no-such-task",
+    )
+    r = json.loads(result)
+    assert r.get("success") is not True
+    assert "supervisor" in (r.get("error") or "").lower()
+
+
+def test_browser_cdp_frame_id_not_in_frame_tree(chrome_cdp, supervisor_registry):
+    """browser_cdp(frame_id=...) errors when the frame_id isn't known."""
+    cdp_url, _port = chrome_cdp
+    sv = supervisor_registry.get_or_start(task_id="bad-frame-test", cdp_url=cdp_url)
+    assert sv.snapshot().active
+
+    from tools.browser_cdp_tool import browser_cdp
+    result = browser_cdp(
+        method="Runtime.evaluate",
+        params={"expression": "1"},
+        frame_id="nonexistent-frame",
+        task_id="bad-frame-test",
+    )
+    r = json.loads(result)
+    assert r.get("success") is not True
+    assert "not found" in (r.get("error") or "").lower()
+
+
+def test_bridge_captures_prompt_and_returns_reply_text(chrome_cdp, supervisor_registry):
+    """End-to-end: agent's prompt_text round-trips INTO the page's JS.
+
+    Proves the bridge isn't just catching dialogs — it's properly round-
+    tripping our reply back into the page via Fetch.fulfillRequest, so
+    ``prompt()`` actually returns the agent-supplied string to the page.
+    """
+    import base64 as _b64
+
+    cdp_url, _port = chrome_cdp
+    sv = supervisor_registry.get_or_start(task_id="pytest-bridge-prompt", cdp_url=cdp_url)
+
+    # Page fires prompt and stashes the return value on window.
+    html = """<!doctype html><html><body><script>
+      window.__ret = null;
+      setTimeout(() => { window.__ret = prompt('PROMPT-MSG', 'default'); }, 50);
+    </script></body></html>"""
+    url = "data:text/html;base64," + _b64.b64encode(html.encode()).decode()
+
+    import asyncio as _asyncio
+    import websockets as _ws_mod
+
+    async def nav_and_read():
+        async with _ws_mod.connect(cdp_url, max_size=50 * 1024 * 1024) as ws:
+            nid = [1]
+            pending: dict = {}
+
+            async def reader_fn():
+                try:
+                    async for raw in ws:
+                        m = json.loads(raw)
+                        if "id" in m:
+                            fut = pending.pop(m["id"], None)
+                            if fut and not fut.done():
+                                fut.set_result(m)
+                except Exception:
+                    pass
+
+            rd = _asyncio.create_task(reader_fn())
+
+            async def call(method, params=None, sid=None):
+                c = nid[0]; nid[0] += 1
+                p = {"id": c, "method": method}
+                if params: p["params"] = params
+                if sid: p["sessionId"] = sid
+                fut = _asyncio.get_event_loop().create_future()
+                pending[c] = fut
+                await ws.send(json.dumps(p))
+                return await _asyncio.wait_for(fut, timeout=20)
+
+            try:
+                t = (await call("Target.getTargets"))["result"]["targetInfos"]
+                pg = next(x for x in t if x.get("type") == "page")
+                a = await call("Target.attachToTarget", {"targetId": pg["targetId"], "flatten": True})
+                sid = a["result"]["sessionId"]
+
+                # Fire navigate but don't await — prompt() blocks the page
+                nav_id = nid[0]; nid[0] += 1
+                nav_fut = _asyncio.get_event_loop().create_future()
+                pending[nav_id] = nav_fut
+                await ws.send(json.dumps({"id": nav_id, "method": "Page.navigate", "params": {"url": url}, "sessionId": sid}))
+
+                # Wait for supervisor to see the prompt
+                deadline = time.monotonic() + 10
+                dialog = None
+                while time.monotonic() < deadline:
+                    snap = sv.snapshot()
+                    if snap.pending_dialogs:
+                        dialog = snap.pending_dialogs[0]
+                        break
+                    await _asyncio.sleep(0.05)
+                assert dialog is not None, "no dialog captured"
+                assert dialog.bridge_request_id is not None, "expected bridge path"
+                assert dialog.type == "prompt"
+
+                # Agent responds
+                resp = sv.respond_to_dialog("accept", prompt_text="AGENT-SUPPLIED-REPLY")
+                assert resp["ok"] is True
+
+                # Wait for nav to complete + read back
+                try:
+                    await _asyncio.wait_for(nav_fut, timeout=10)
+                except Exception:
+                    pass
+                await _asyncio.sleep(0.5)
+                r = await call(
+                    "Runtime.evaluate",
+                    {"expression": "window.__ret", "returnByValue": True},
+                    sid=sid,
+                )
+                return r.get("result", {}).get("result", {}).get("value")
+            finally:
+                rd.cancel()
+                try: await rd
+                except BaseException: pass
+
+    value = asyncio.run(nav_and_read())
+    assert value == "AGENT-SUPPLIED-REPLY", f"expected AGENT-SUPPLIED-REPLY, got {value!r}"
diff --git a/tools/browser_cdp_tool.py b/tools/browser_cdp_tool.py
index 9b13b2bb6..73cce11cd 100644
--- a/tools/browser_cdp_tool.py
+++ b/tools/browser_cdp_tool.py
@@ -188,10 +188,116 @@ async def _cdp_call(
 # ---------------------------------------------------------------------------
 
 
+def _browser_cdp_via_supervisor(
+    task_id: str,
+    frame_id: str,
+    method: str,
+    params: Optional[Dict[str, Any]],
+    timeout: float,
+) -> str:
+    """Route a CDP call through the live supervisor session for an OOPIF frame.
+
+    Looks up the frame in the supervisor's snapshot, extracts its child
+    ``cdp_session_id``, and dispatches ``method`` with that sessionId via
+    the supervisor's already-connected WebSocket (using
+    ``asyncio.run_coroutine_threadsafe`` onto the supervisor loop).
+    """
+    try:
+        from tools.browser_supervisor import SUPERVISOR_REGISTRY  # type: ignore[import-not-found]
+    except Exception as exc:  # pragma: no cover — defensive
+        return tool_error(
+            f"CDP supervisor is not available: {exc}. frame_id routing requires "
+            f"a running supervisor attached via /browser connect or an active "
+            f"Browserbase session."
+        )
+
+    supervisor = SUPERVISOR_REGISTRY.get(task_id)
+    if supervisor is None:
+        return tool_error(
+            f"No CDP supervisor is attached for task={task_id!r}. Call "
+            f"browser_navigate or /browser connect first so the supervisor "
+            f"can attach. Once attached, browser_snapshot will populate "
+            f"frame_tree with frame_ids you can pass here."
+        )
+
+    snap = supervisor.snapshot()
+    # Search both the top frame and the children for the requested id.
+    top = snap.frame_tree.get("top")
+    frame_info: Optional[Dict[str, Any]] = None
+    if top and top.get("frame_id") == frame_id:
+        frame_info = top
+    else:
+        for child in snap.frame_tree.get("children", []) or []:
+            if child.get("frame_id") == frame_id:
+                frame_info = child
+                break
+    if frame_info is None:
+        # Check the raw frames dict too (frame_tree is capped at 30 entries)
+        with supervisor._state_lock:  # type: ignore[attr-defined]
+            raw = supervisor._frames.get(frame_id)  # type: ignore[attr-defined]
+        if raw is not None:
+            frame_info = raw.to_dict()
+
+    if frame_info is None:
+        return tool_error(
+            f"frame_id {frame_id!r} not found in supervisor state. "
+            f"Call browser_snapshot to see current frame_tree."
+        )
+
+    child_sid = frame_info.get("session_id")
+    if not child_sid:
+        # Not an OOPIF — fall back to top-level session (evaluating at page
+        # scope).  Same-origin iframes don't get their own sessionId; the
+        # agent can still use contentWindow/contentDocument from the parent.
+        return tool_error(
+            f"frame_id {frame_id!r} is not an out-of-process iframe (no "
+            f"dedicated CDP session). For same-origin iframes, use "
+            f"`browser_cdp(method='Runtime.evaluate', params={{'expression': "
+            f"\"document.querySelector('iframe').contentDocument.title\"}})` "
+            f"at the top-level page instead."
+        )
+
+    # Dispatch onto the supervisor's loop.
+    import asyncio as _asyncio
+    loop = supervisor._loop  # type: ignore[attr-defined]
+    if loop is None or not loop.is_running():
+        return tool_error(
+            "CDP supervisor loop is not running. Try reconnecting with "
+            "/browser connect."
+        )
+
+    async def _do_cdp():
+        return await supervisor._cdp(  # type: ignore[attr-defined]
+            method,
+            params or {},
+            session_id=child_sid,
+            timeout=timeout,
+        )
+
+    try:
+        fut = _asyncio.run_coroutine_threadsafe(_do_cdp(), loop)
+        result_msg = fut.result(timeout=timeout + 2)
+    except Exception as exc:
+        return tool_error(
+            f"CDP call via supervisor failed: {type(exc).__name__}: {exc}",
+            cdp_docs=CDP_DOCS_URL,
+        )
+
+    payload: Dict[str, Any] = {
+        "success": True,
+        "method": method,
+        "frame_id": frame_id,
+        "session_id": child_sid,
+        "result": result_msg.get("result", {}),
+    }
+    return json.dumps(payload, ensure_ascii=False)
+
+
 def browser_cdp(
     method: str,
     params: Optional[Dict[str, Any]] = None,
     target_id: Optional[str] = None,
+    frame_id: Optional[str] = None,
     timeout: float = 30.0,
     task_id: Optional[str] = None,
 ) -> str:
@@ -202,16 +308,34 @@ def browser_cdp(
         params: Method-specific parameters; defaults to ``{}``.
         target_id: Optional target/tab ID for page-level methods.  When set,
             we first attach to the target (``flatten=True``) and send
-            ``method`` with the resulting ``sessionId``.
+            ``method`` with the resulting ``sessionId``.  Uses a fresh
+            stateless CDP connection.
+        frame_id: Optional cross-origin (OOPIF) iframe ``frame_id`` from
+            ``browser_snapshot.frame_tree.children[]``.  When set (and the
+            frame is an OOPIF with a live session tracked by the CDP
+            supervisor), routes the call through the supervisor's existing
+            WebSocket — which is how you Runtime.evaluate *inside* an
+            iframe on backends where per-call fresh CDP connections would
+            hit signed-URL expiry (Browserbase) or expensive reattach.
         timeout: Seconds to wait for the call to complete.
-        task_id: Unused (tool is stateless) — accepted for uniformity with
-            other browser tools.
+        task_id: Task identifier for supervisor lookup.  When ``frame_id``
+            is set, this identifies which task's supervisor to use; the
+            handler will default to ``"default"`` otherwise.
 
     Returns:
         JSON string ``{"success": True, "method": ..., "result": {...}}`` on
         success, or ``{"error": "..."}`` on failure.
     """
-    del task_id  # unused — stateless
+    # --- Route iframe-scoped calls through the supervisor ---------------
+    if frame_id:
+        return _browser_cdp_via_supervisor(
+            task_id=task_id or "default",
+            frame_id=frame_id,
+            method=method,
+            params=params,
+            timeout=timeout,
+        )
+    del task_id  # stateless path below
 
     if not method or not isinstance(method, str):
         return tool_error(
@@ -324,12 +448,18 @@ BROWSER_CDP_SCHEMA: Dict[str, Any] = {
         "'mobile': false}, target_id=<tabId>\n\n"
         "**Usage rules:**\n"
         "- Browser-level methods (Target.*, Browser.*, Storage.*): omit "
-        "target_id.\n"
+        "target_id and frame_id.\n"
         "- Page-level methods (Page.*, Runtime.*, DOM.*, Emulation.*, "
         "Network.* scoped to a tab): pass target_id from Target.getTargets.\n"
-        "- Each call is independent — sessions and event subscriptions do "
-        "not persist between calls. For stateful workflows, prefer the "
-        "dedicated browser tools."
+        "- **Cross-origin iframe scope** (Runtime.evaluate inside an OOPIF, "
+        "Page.* targeting a frame target, etc.): pass frame_id from the "
+        "browser_snapshot frame_tree output. This routes through the CDP "
+        "supervisor's live connection — the only reliable way on "
+        "Browserbase where stateless CDP calls hit signed-URL expiry.\n"
+        "- Each stateless call (without frame_id) is independent — sessions "
+        "and event subscriptions do not persist between calls. For stateful "
+        "workflows, prefer the dedicated browser tools or use frame_id "
+        "routing."
     ),
     "parameters": {
         "type": "object",
@@ -353,8 +483,24 @@ BROWSER_CDP_SCHEMA: Dict[str, Any] = {
                 "type": "string",
                 "description": (
                     "Optional. Target/tab ID from Target.getTargets result "
-                    "(each entry's 'targetId'). Required for page-level "
-                    "methods; must be omitted for browser-level methods."
+                    "(each entry's 'targetId'). Use for page-level methods "
+                    "at the top-level tab scope. Mutually exclusive with "
+                    "frame_id."
+                ),
+            },
+            "frame_id": {
+                "type": "string",
+                "description": (
+                    "Optional. Out-of-process iframe (OOPIF) frame_id from "
+                    "browser_snapshot.frame_tree.children[] where "
+                    "is_oopif=true. When set, routes the call through the "
+                    "CDP supervisor's live session for that iframe. "
+                    "Essential for Runtime.evaluate inside cross-origin "
+                    "iframes, especially on Browserbase where fresh "
+                    "per-call CDP connections can't keep up with signed "
+                    "URL rotation. For same-origin iframes, use parent "
+                    "contentWindow/contentDocument from Runtime.evaluate "
+                    "at the top-level page instead."
                 ),
             },
             "timeout": {
@@ -408,6 +554,7 @@ registry.register(
         method=args.get("method", ""),
         params=args.get("params"),
         target_id=args.get("target_id"),
+        frame_id=args.get("frame_id"),
         timeout=args.get("timeout", 30.0),
         task_id=kw.get("task_id"),
     ),
diff --git a/tools/browser_dialog_tool.py b/tools/browser_dialog_tool.py
new file mode 100644
index 000000000..51ab0c424
--- /dev/null
+++ b/tools/browser_dialog_tool.py
@@ -0,0 +1,148 @@
+"""Agent-facing tool: respond to a native JS dialog captured by the CDP supervisor.
+
+This tool is response-only — the agent first reads ``pending_dialogs`` from
+``browser_snapshot`` output, then calls ``browser_dialog(action=...)`` to
+accept or dismiss.
+
+Gated on the same ``_browser_cdp_check`` as ``browser_cdp`` so it only
+appears when a CDP endpoint is reachable (Browserbase with a
+``connectUrl``, local Chrome via ``/browser connect``, or
+``browser.cdp_url`` set in config).
+
+See ``website/docs/developer-guide/browser-supervisor.md`` for the full
+design.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+from typing import Any, Dict, Optional
+
+from tools.browser_supervisor import SUPERVISOR_REGISTRY
+from tools.registry import registry
+
+logger = logging.getLogger(__name__)
+
+
+BROWSER_DIALOG_SCHEMA: Dict[str, Any] = {
+    "name": "browser_dialog",
+    "description": (
+        "Respond to a native JavaScript dialog (alert / confirm / prompt / "
+        "beforeunload) that is currently blocking the page.\n\n"
+        "**Workflow:** call ``browser_snapshot`` first — if a dialog is open, "
+        "it appears in the ``pending_dialogs`` field with ``id``, ``type``, "
+        "and ``message``. Then call this tool with ``action='accept'`` or "
+        "``action='dismiss'``.\n\n"
+        "**Prompt dialogs:** pass ``prompt_text`` to supply the response "
+        "string. Ignored for alert/confirm/beforeunload.\n\n"
+        "**Multiple dialogs:** if more than one dialog is queued (rare — "
+        "happens when a second dialog fires while the first is still open), "
+        "pass ``dialog_id`` from the snapshot to disambiguate.\n\n"
+        "**Availability:** only present when a CDP-capable backend is "
+        "attached — Browserbase sessions, local Chrome via "
+        "``/browser connect``, or ``browser.cdp_url`` in config.yaml. "
+        "Not available on Camofox (REST-only) or the default Playwright "
+        "local browser (CDP port is hidden)."
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "action": {
+                "type": "string",
+                "enum": ["accept", "dismiss"],
+                "description": (
+                    "'accept' clicks OK / returns the prompt text. "
+                    "'dismiss' clicks Cancel / returns null from prompt(). "
+                    "For ``beforeunload`` dialogs: 'accept' allows the "
+                    "navigation, 'dismiss' keeps the page."
+                ),
+            },
+            "prompt_text": {
+                "type": "string",
+                "description": (
+                    "Response string for a ``prompt()`` dialog. Ignored for "
+                    "other dialog types. Defaults to empty string."
+                ),
+            },
+            "dialog_id": {
+                "type": "string",
+                "description": (
+                    "Specific dialog to respond to, from "
+                    "``browser_snapshot.pending_dialogs[].id``. Required "
+                    "only when multiple dialogs are queued."
+                ),
+            },
+        },
+        "required": ["action"],
+    },
+}
+
+
+def browser_dialog(
+    action: str,
+    prompt_text: Optional[str] = None,
+    dialog_id: Optional[str] = None,
+    task_id: Optional[str] = None,
+) -> str:
+    """Respond to a pending dialog on the active task's CDP supervisor."""
+    effective_task_id = task_id or "default"
+    supervisor = SUPERVISOR_REGISTRY.get(effective_task_id)
+    if supervisor is None:
+        return json.dumps(
+            {
+                "success": False,
+                "error": (
+                    "No CDP supervisor is attached to this task. Either the "
+                    "browser backend doesn't expose CDP (Camofox, default "
+                    "Playwright) or no browser session has been started yet. "
+                    "Call browser_navigate or /browser connect first."
+                ),
+            }
+        )
+
+    result = supervisor.respond_to_dialog(
+        action=action,
+        prompt_text=prompt_text,
+        dialog_id=dialog_id,
+    )
+    if result.get("ok"):
+        return json.dumps(
+            {
+                "success": True,
+                "action": action,
+                "dialog": result.get("dialog", {}),
+            }
+        )
+    return json.dumps({"success": False, "error": result.get("error", "unknown error")})
+
+
+def _browser_dialog_check() -> bool:
+    """Gate: same as ``browser_cdp`` — only offered when CDP is reachable.
+
+    Kept identical so the two tools appear and disappear together. The
+    supervisor itself is started lazily by ``browser_navigate`` /
+    ``/browser connect`` / Browserbase session creation, so a reachable
+    CDP URL is enough to commit to showing the tool.
+    """
+    try:
+        from tools.browser_cdp_tool import _browser_cdp_check  # type: ignore[import-not-found]
+    except Exception as exc:  # pragma: no cover — defensive
+        logger.debug("browser_dialog check: browser_cdp_tool import failed: %s", exc)
+        return False
+    return _browser_cdp_check()
+
+
+registry.register(
+    name="browser_dialog",
+    toolset="browser-cdp",
+    schema=BROWSER_DIALOG_SCHEMA,
+    handler=lambda args, **kw: browser_dialog(
+        action=args.get("action", ""),
+        prompt_text=args.get("prompt_text"),
+        dialog_id=args.get("dialog_id"),
+        task_id=kw.get("task_id"),
+    ),
+    check_fn=_browser_dialog_check,
+    emoji="💬",
+)
diff --git a/tools/browser_supervisor.py b/tools/browser_supervisor.py
new file mode 100644
index 000000000..e230d92ed
--- /dev/null
+++ b/tools/browser_supervisor.py
@@ -0,0 +1,1362 @@
+"""Persistent CDP supervisor for browser dialog + frame detection.
+
+One ``CDPSupervisor`` runs per Hermes ``task_id`` that has a reachable CDP
+endpoint. It holds a single persistent WebSocket to the backend, subscribes
+to ``Page`` / ``Runtime`` / ``Target`` events on every attached session
+(top-level page and every OOPIF / worker target that auto-attaches), and
+surfaces observable state — pending dialogs and frame tree — through a
+thread-safe snapshot object that tool handlers consume synchronously.
+
+The supervisor is NOT in the agent's tool schema. Its output reaches the
+agent via two channels:
+
+1. ``browser_snapshot`` merges supervisor state into its return payload
+   (see ``tools/browser_tool.py``).
+2. ``browser_dialog`` tool responds to a pending dialog by calling
+   ``respond_to_dialog()`` on the active supervisor.
+
+Design spec: ``website/docs/developer-guide/browser-supervisor.md``.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import json
+import logging
+import threading
+import time
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional, Tuple
+
+import websockets
+from websockets.asyncio.client import ClientConnection
+
+logger = logging.getLogger(__name__)
+
+
+# ── Config defaults ───────────────────────────────────────────────────────────
+
+DIALOG_POLICY_MUST_RESPOND = "must_respond"
+DIALOG_POLICY_AUTO_DISMISS = "auto_dismiss"
+DIALOG_POLICY_AUTO_ACCEPT = "auto_accept"
+
+_VALID_POLICIES = frozenset(
+    {DIALOG_POLICY_MUST_RESPOND, DIALOG_POLICY_AUTO_DISMISS, DIALOG_POLICY_AUTO_ACCEPT}
+)
+
+DEFAULT_DIALOG_POLICY = DIALOG_POLICY_MUST_RESPOND
+DEFAULT_DIALOG_TIMEOUT_S = 300.0
+
+# Snapshot caps for frame_tree — keep payloads bounded on ad-heavy pages.
+FRAME_TREE_MAX_ENTRIES = 30
+FRAME_TREE_MAX_OOPIF_DEPTH = 2
+
+# Ring buffer of recent console-level events (used later by PR 2 diagnostics).
+CONSOLE_HISTORY_MAX = 50
+
+# Keep the last N closed dialogs in ``recent_dialogs`` so agents on backends
+# that auto-dismiss server-side (e.g. Browserbase) can still observe that a
+# dialog fired, even if they couldn't respond to it in time.
+RECENT_DIALOGS_MAX = 20
+
+# Magic host the injected dialog bridge XHRs to.  Intercepted via the CDP
+# Fetch domain before any network resolution happens, so the hostname never
+# has to exist.  Keep this ASCII + URL-safe; we also gate Fetch patterns on it.
+DIALOG_BRIDGE_HOST = "hermes-dialog-bridge.invalid"
+DIALOG_BRIDGE_URL_PATTERN = f"http://{DIALOG_BRIDGE_HOST}/*"
+
+# Script injected into every frame via Page.addScriptToEvaluateOnNewDocument.
+# Overrides alert/confirm/prompt to round-trip through a sync XHR that we
+# intercept via Fetch.requestPaused. Works on Browserbase (whose CDP proxy
+# auto-dismisses REAL native dialogs) because the native dialogs never fire
+# in the first place — the overrides take precedence.
+_DIALOG_BRIDGE_SCRIPT = r"""
+(() => {
+  if (window.__hermesDialogBridgeInstalled) return;
+  window.__hermesDialogBridgeInstalled = true;
+  const ENDPOINT = "http://hermes-dialog-bridge.invalid/";
+  function ask(kind, message, defaultPrompt) {
+    try {
+      const xhr = new XMLHttpRequest();
+      // Use GET with query params so we don't need to worry about request
+      // body encoding in the Fetch interceptor.
+      const params = new URLSearchParams({
+        kind: String(kind || ""),
+        message: String(message == null ? "" : message),
+        default_prompt: String(defaultPrompt == null ? "" : defaultPrompt),
+      });
+      xhr.open("GET", ENDPOINT + "?" + params.toString(), false);  // sync
+      xhr.send(null);
+      if (xhr.status !== 200) return null;
+      const body = xhr.responseText || "";
+      let parsed;
+      try { parsed = JSON.parse(body); } catch (e) { return null; }
+      if (kind === "alert") return undefined;
+      if (kind === "confirm") return Boolean(parsed && parsed.accept);
+      if (kind === "prompt") {
+        if (!parsed || !parsed.accept) return null;
+        return parsed.prompt_text == null ? "" : String(parsed.prompt_text);
+      }
+      return null;
+    } catch (e) {
+      // If the bridge is unreachable, fall back to the native call so the
+      // page still sees *some* behavior (the backend will auto-dismiss).
+      return null;
+    }
+  }
+  const realAlert   = window.alert;
+  const realConfirm = window.confirm;
+  const realPrompt  = window.prompt;
+  window.alert   = function(message) { ask("alert",   message, ""); };
+  window.confirm = function(message) {
+    const r = ask("confirm", message, "");
+    return r === null ? false : Boolean(r);
+  };
+  window.prompt  = function(message, def) {
+    const r = ask("prompt", message, def == null ? "" : def);
+    return r === null ? null : String(r);
+  };
+  // onbeforeunload — we can't really synchronously prompt the user from this
+  // event without racing navigation.  Leave native behavior for now; the
+  // supervisor's native-dialog fallback path still surfaces them in
+  // recent_dialogs.
+})();
+"""
+
+
+# ── Data model ────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class PendingDialog:
+    """A JS dialog currently open on some frame's session."""
+
+    id: str
+    type: str  # "alert" | "confirm" | "prompt" | "beforeunload"
+    message: str
+    default_prompt: str
+    opened_at: float
+    cdp_session_id: str  # which attached CDP session the dialog fired in
+    frame_id: Optional[str] = None
+    # When set, the dialog was captured via the bridge XHR path (Fetch domain).
+    # Response must be delivered via Fetch.fulfillRequest, NOT
+    # Page.handleJavaScriptDialog — the native dialog never fired.
+    bridge_request_id: Optional[str] = None
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "id": self.id,
+            "type": self.type,
+            "message": self.message,
+            "default_prompt": self.default_prompt,
+            "opened_at": self.opened_at,
+            "frame_id": self.frame_id,
+        }
+
+
+@dataclass
+class DialogRecord:
+    """A historical record of a dialog that was opened and then handled.
+
+    Retained in ``recent_dialogs`` for a short window so agents on backends
+    that auto-dismiss dialogs server-side (Browserbase) can still observe
+    that a dialog fired, even though they couldn't respond to it.
+    """
+
+    id: str
+    type: str
+    message: str
+    opened_at: float
+    closed_at: float
+    closed_by: str  # "agent" | "auto_policy" | "remote" | "watchdog"
+    frame_id: Optional[str] = None
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "id": self.id,
+            "type": self.type,
+            "message": self.message,
+            "opened_at": self.opened_at,
+            "closed_at": self.closed_at,
+            "closed_by": self.closed_by,
+            "frame_id": self.frame_id,
+        }
+
+
+@dataclass
+class FrameInfo:
+    """One frame in the page's frame tree.
+
+    ``is_oopif`` means the frame has its own CDP target (separate process,
+    reachable via ``cdp_session_id``). Same-origin / srcdoc iframes share
+    the parent process and have ``is_oopif=False`` + ``cdp_session_id=None``.
+    """
+
+    frame_id: str
+    url: str
+    origin: str
+    parent_frame_id: Optional[str]
+    is_oopif: bool
+    cdp_session_id: Optional[str] = None
+    name: str = ""
+
+    def to_dict(self) -> Dict[str, Any]:
+        d = {
+            "frame_id": self.frame_id,
+            "url": self.url,
+            "origin": self.origin,
+            "is_oopif": self.is_oopif,
+        }
+        if self.cdp_session_id:
+            d["session_id"] = self.cdp_session_id
+        if self.parent_frame_id:
+            d["parent_frame_id"] = self.parent_frame_id
+        if self.name:
+            d["name"] = self.name
+        return d
+
+
+@dataclass
+class ConsoleEvent:
+    """Ring buffer entry for console + exception traffic."""
+
+    ts: float
+    level: str  # "log" | "error" | "warning" | "exception"
+    text: str
+    url: Optional[str] = None
+
+
+@dataclass(frozen=True)
+class SupervisorSnapshot:
+    """Read-only snapshot of supervisor state.
+
+    Frozen dataclass so tool handlers can freely dereference without
+    worrying about mutation under their feet.
+    """
+
+    pending_dialogs: Tuple[PendingDialog, ...]
+    recent_dialogs: Tuple[DialogRecord, ...]
+    frame_tree: Dict[str, Any]
+    console_errors: Tuple[ConsoleEvent, ...]
+    active: bool  # False if supervisor is detached/stopped
+    cdp_url: str
+    task_id: str
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Serialize for inclusion in ``browser_snapshot`` output."""
+        out: Dict[str, Any] = {
+            "pending_dialogs": [d.to_dict() for d in self.pending_dialogs],
+            "frame_tree": self.frame_tree,
+        }
+        if self.recent_dialogs:
+            out["recent_dialogs"] = [d.to_dict() for d in self.recent_dialogs]
+        return out
+
+
+# ── Supervisor core ───────────────────────────────────────────────────────────
+
+
+class CDPSupervisor:
+    """One supervisor per (task_id, cdp_url) pair.
+
+    Lifecycle:
+      * ``start()`` — kicked off by ``SupervisorRegistry.get_or_start``; spawns
+        a daemon thread running its own asyncio loop, connects the WebSocket,
+        attaches to the first page target, enables domains, starts
+        auto-attaching to child targets.
+      * ``snapshot()`` — sync, thread-safe, called from tool handlers.
+      * ``respond_to_dialog(action, ...)`` — sync bridge; schedules a coroutine
+        on the supervisor's loop and waits (with timeout) for the CDP ack.
+      * ``stop()`` — cancels task, closes WebSocket, joins thread.
+
+    All CDP I/O lives on the supervisor's own loop. External callers never
+    touch the loop directly; they go through the sync API above.
+    """
+
+    def __init__(
+        self,
+        task_id: str,
+        cdp_url: str,
+        *,
+        dialog_policy: str = DEFAULT_DIALOG_POLICY,
+        dialog_timeout_s: float = DEFAULT_DIALOG_TIMEOUT_S,
+    ) -> None:
+        if dialog_policy not in _VALID_POLICIES:
+            raise ValueError(
+                f"Invalid dialog_policy {dialog_policy!r}; "
+                f"must be one of {sorted(_VALID_POLICIES)}"
+            )
+        self.task_id = task_id
+        self.cdp_url = cdp_url
+        self.dialog_policy = dialog_policy
+        self.dialog_timeout_s = float(dialog_timeout_s)
+
+        # State protected by ``_state_lock`` for cross-thread reads.
+        self._state_lock = threading.Lock()
+        self._pending_dialogs: Dict[str, PendingDialog] = {}
+        self._recent_dialogs: List[DialogRecord] = []
+        self._frames: Dict[str, FrameInfo] = {}
+        self._console_events: List[ConsoleEvent] = []
+        self._active = False
+
+        # Supervisor loop machinery — populated in start().
+        self._loop: Optional[asyncio.AbstractEventLoop] = None
+        self._thread: Optional[threading.Thread] = None
+        self._ready_event = threading.Event()
+        self._start_error: Optional[BaseException] = None
+        self._stop_requested = False
+
+        # CDP call tracking (runs on supervisor loop only).
+        self._next_call_id = 1
+        self._pending_calls: Dict[int, asyncio.Future] = {}
+        self._ws: Optional[ClientConnection] = None
+        self._page_session_id: Optional[str] = None
+        self._child_sessions: Dict[str, Dict[str, Any]] = {}  # session_id -> info
+
+        # Dialog auto-dismiss watchdog handles (per dialog id).
+        self._dialog_watchdogs: Dict[str, asyncio.TimerHandle] = {}
+        # Monotonic id generator for dialogs (human-readable in snapshots).
+        self._dialog_seq = 0
+
+    # ── Public sync API ──────────────────────────────────────────────────────
+
+    def start(self, timeout: float = 15.0) -> None:
+        """Launch the background loop and wait until attachment is complete.
+
+        Raises whatever exception attach failed with (connect error, bad
+        WebSocket URL, CDP domain enable failure, etc.). On success, the
+        supervisor is fully wired up — pending-dialog events will be captured
+        as of the moment ``start()`` returns.
+        """
+        if self._thread and self._thread.is_alive():
+            return
+        self._ready_event.clear()
+        self._start_error = None
+        self._stop_requested = False
+        self._thread = threading.Thread(
+            target=self._thread_main,
+            name=f"cdp-supervisor-{self.task_id}",
+            daemon=True,
+        )
+        self._thread.start()
+        if not self._ready_event.wait(timeout=timeout):
+            self.stop()
+            raise TimeoutError(
+                f"CDP supervisor did not attach within {timeout}s "
+                f"(cdp_url={self.cdp_url[:80]}...)"
+            )
+        if self._start_error is not None:
+            err = self._start_error
+            self.stop()
+            raise err
+
+    def stop(self, timeout: float = 5.0) -> None:
+        """Cancel the supervisor task and join the thread."""
+        self._stop_requested = True
+        loop = self._loop
+        if loop is not None and loop.is_running():
+            # Close the WebSocket from inside the loop — this makes ``async for
+            # raw in self._ws`` return cleanly, ``_run`` hits its ``finally``,
+            # pending tasks get cancelled in order, THEN the thread exits.
+            async def _close_ws():
+                ws = self._ws
+                self._ws = None
+                if ws is not None:
+                    try:
+                        await ws.close()
+                    except Exception:
+                        pass
+
+            try:
+                fut = asyncio.run_coroutine_threadsafe(_close_ws(), loop)
+                try:
+                    fut.result(timeout=2.0)
+                except Exception:
+                    pass
+            except RuntimeError:
+                pass  # loop already shutting down
+        if self._thread is not None:
+            self._thread.join(timeout=timeout)
+        with self._state_lock:
+            self._active = False
+
+    def snapshot(self) -> SupervisorSnapshot:
+        """Return an immutable snapshot of current state."""
+        with self._state_lock:
+            dialogs = tuple(self._pending_dialogs.values())
+            recent = tuple(self._recent_dialogs[-RECENT_DIALOGS_MAX:])
+            frames_tree = self._build_frame_tree_locked()
+            console = tuple(self._console_events[-CONSOLE_HISTORY_MAX:])
+            active = self._active
+        return SupervisorSnapshot(
+            pending_dialogs=dialogs,
+            recent_dialogs=recent,
+            frame_tree=frames_tree,
+            console_errors=console,
+            active=active,
+            cdp_url=self.cdp_url,
+            task_id=self.task_id,
+        )
+
+    def respond_to_dialog(
+        self,
+        action: str,
+        *,
+        prompt_text: Optional[str] = None,
+        dialog_id: Optional[str] = None,
+        timeout: float = 10.0,
+    ) -> Dict[str, Any]:
+        """Accept/dismiss a pending dialog. Sync bridge onto the supervisor loop.
+
+        Returns ``{"ok": True, "dialog": {...}}`` on success,
+        ``{"ok": False, "error": "..."}`` on a recoverable error (no dialog,
+        ambiguous dialog_id, supervisor inactive).
+        """
+        if action not in ("accept", "dismiss"):
+            return {"ok": False, "error": f"action must be 'accept' or 'dismiss', got {action!r}"}
+
+        with self._state_lock:
+            if not self._active:
+                return {"ok": False, "error": "supervisor is not active"}
+            pending = list(self._pending_dialogs.values())
+            if not pending:
+                return {"ok": False, "error": "no dialog is currently open"}
+            if dialog_id:
+                dialog = self._pending_dialogs.get(dialog_id)
+                if dialog is None:
+                    return {
+                        "ok": False,
+                        "error": f"dialog_id {dialog_id!r} not found "
+                        f"(known: {sorted(self._pending_dialogs)})",
+                    }
+            elif len(pending) > 1:
+                return {
+                    "ok": False,
+                    "error": (
+                        f"{len(pending)} pending dialogs; specify dialog_id. "
+                        f"Candidates: {[d.id for d in pending]}"
+                    ),
+                }
+            else:
+                dialog = pending[0]
+            snapshot_copy = dialog
+
+        loop = self._loop
+        if loop is None:
+            return {"ok": False, "error": "supervisor loop is not running"}
+
+        async def _do_respond():
+            return await self._handle_dialog_cdp(
+                snapshot_copy, accept=(action == "accept"), prompt_text=prompt_text or ""
+            )
+
+        try:
+            fut = asyncio.run_coroutine_threadsafe(_do_respond(), loop)
+            fut.result(timeout=timeout)
+        except Exception as e:
+            return {"ok": False, "error": f"{type(e).__name__}: {e}"}
+        return {"ok": True, "dialog": snapshot_copy.to_dict()}
+
+    # ── Supervisor loop internals ────────────────────────────────────────────
+
+    def _thread_main(self) -> None:
+        """Entry point for the supervisor's dedicated thread."""
+        loop = asyncio.new_event_loop()
+        self._loop = loop
+        try:
+            asyncio.set_event_loop(loop)
+            loop.run_until_complete(self._run())
+        except BaseException as e:  # noqa: BLE001 — propagate via _start_error
+            if not self._ready_event.is_set():
+                self._start_error = e
+                self._ready_event.set()
+            else:
+                logger.warning("CDP supervisor %s crashed: %s", self.task_id, e)
+        finally:
+            # Flush any remaining tasks before closing the loop so we don't
+            # emit "Task was destroyed but it is pending" warnings.
+            try:
+                pending = [t for t in asyncio.all_tasks(loop) if not t.done()]
+                for t in pending:
+                    t.cancel()
+                if pending:
+                    loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
+            except Exception:
+                pass
+            try:
+                loop.close()
+            except Exception:
+                pass
+            with self._state_lock:
+                self._active = False
+
+    async def _run(self) -> None:
+        """Top-level supervisor coroutine.
+
+        Holds a reconnecting loop so we survive the remote closing the
+        WebSocket — Browserbase in particular tears down the CDP socket
+        every time a short-lived client (e.g. agent-browser's per-command
+        CDP client) disconnects.  We drop our state snapshot keys that
+        depend on specific CDP session ids, re-attach, and keep going.
+        """
+        attempt = 0
+        last_success_at = 0.0
+        backoff = 0.5
+        while not self._stop_requested:
+            try:
+                self._ws = await asyncio.wait_for(
+                    websockets.connect(self.cdp_url, max_size=50 * 1024 * 1024),
+                    timeout=10.0,
+                )
+            except Exception as e:
+                attempt += 1
+                if not self._ready_event.is_set():
+                    # Never connected once — fatal for start().
+                    self._start_error = e
+                    self._ready_event.set()
+                    return
+                logger.warning(
+                    "CDP supervisor %s: connect failed (attempt %s): %s",
+                    self.task_id, attempt, e,
+                )
+                await asyncio.sleep(min(backoff, 10.0))
+                backoff = min(backoff * 2, 10.0)
+                continue
+
+            reader_task = asyncio.create_task(self._read_loop(), name="cdp-reader")
+            try:
+                # Reset per-connection session state so stale ids don't hang
+                # around after a reconnect.
+                self._page_session_id = None
+                self._child_sessions.clear()
+                # We deliberately keep `_pending_dialogs` and `_frames` —
+                # they're reconciled as the supervisor resubscribes and
+                # receives fresh events.  Worst case: an agent sees a stale
+                # dialog entry that the new session's handleJavaScriptDialog
+                # call rejects with "no dialog is showing" (logged, not
+                # surfaced).
+                await self._attach_initial_page()
+                with self._state_lock:
+                    self._active = True
+                last_success_at = time.time()
+                backoff = 0.5  # reset after a successful attach
+                if not self._ready_event.is_set():
+                    self._ready_event.set()
+                # Run until the reader returns.
+                await reader_task
+            except BaseException as e:
+                if not self._ready_event.is_set():
+                    # Never got to ready — propagate to start().
+                    self._start_error = e
+                    self._ready_event.set()
+                    raise
+                logger.warning(
+                    "CDP supervisor %s: session dropped after %.1fs: %s",
+                    self.task_id,
+                    time.time() - last_success_at,
+                    e,
+                )
+            finally:
+                with self._state_lock:
+                    self._active = False
+                if not reader_task.done():
+                    reader_task.cancel()
+                    try:
+                        await reader_task
+                    except (asyncio.CancelledError, Exception):
+                        pass
+                for handle in list(self._dialog_watchdogs.values()):
+                    handle.cancel()
+                self._dialog_watchdogs.clear()
+                ws = self._ws
+                self._ws = None
+                if ws is not None:
+                    try:
+                        await ws.close()
+                    except Exception:
+                        pass
+
+            if self._stop_requested:
+                return
+
+            # Reconnect: brief backoff, then reattach.
+            logger.debug(
+                "CDP supervisor %s: reconnecting in %.1fs...", self.task_id, backoff,
+            )
+            await asyncio.sleep(backoff)
+            backoff = min(backoff * 2, 10.0)
+
+    async def _attach_initial_page(self) -> None:
+        """Find a page target, attach flattened session, enable domains, install dialog bridge."""
+        resp = await self._cdp("Target.getTargets")
+        targets = resp.get("result", {}).get("targetInfos", [])
+        page_target = next((t for t in targets if t.get("type") == "page"), None)
+        if page_target is None:
+            created = await self._cdp("Target.createTarget", {"url": "about:blank"})
+            target_id = created["result"]["targetId"]
+        else:
+            target_id = page_target["targetId"]
+
+        attach = await self._cdp(
+            "Target.attachToTarget",
+            {"targetId": target_id, "flatten": True},
+        )
+        self._page_session_id = attach["result"]["sessionId"]
+        await self._cdp("Page.enable", session_id=self._page_session_id)
+        await self._cdp("Runtime.enable", session_id=self._page_session_id)
+        await self._cdp(
+            "Target.setAutoAttach",
+            {"autoAttach": True, "waitForDebuggerOnStart": False, "flatten": True},
+            session_id=self._page_session_id,
+        )
+        # Install the dialog bridge — overrides native alert/confirm/prompt with
+        # a synchronous XHR we intercept via Fetch domain. This is how we make
+        # dialog response work on Browserbase (whose CDP proxy auto-dismisses
+        # real native dialogs before we can call handleJavaScriptDialog).
+        await self._install_dialog_bridge(self._page_session_id)
+
+    async def _install_dialog_bridge(self, session_id: str) -> None:
+        """Install the dialog-bridge init script + Fetch interceptor on a session.
+
+        Two CDP calls:
+          1. ``Page.addScriptToEvaluateOnNewDocument`` — the JS override runs
+             in every frame before any page script. Replaces alert/confirm/
+             prompt with a sync XHR to our bridge URL.
+          2. ``Fetch.enable`` scoped to the bridge URL — we catch those XHRs,
+             surface them as pending dialogs, then fulfill once the agent
+             responds.
+
+        Idempotent at the CDP level: Chromium de-duplicates identical
+        add-script calls by source, and Fetch.enable replaces prior patterns.
+        """
+        try:
+            await self._cdp(
+                "Page.addScriptToEvaluateOnNewDocument",
+                {"source": _DIALOG_BRIDGE_SCRIPT, "runImmediately": True},
+                session_id=session_id,
+                timeout=5.0,
+            )
+        except Exception as e:
+            logger.debug(
+                "dialog bridge: addScriptToEvaluateOnNewDocument failed on sid=%s: %s",
+                (session_id or "")[:16], e,
+            )
+        try:
+            await self._cdp(
+                "Fetch.enable",
+                {
+                    "patterns": [
+                        {
+                            "urlPattern": DIALOG_BRIDGE_URL_PATTERN,
+                            "requestStage": "Request",
+                        }
+                    ],
+                    "handleAuthRequests": False,
+                },
+                session_id=session_id,
+                timeout=5.0,
+            )
+        except Exception as e:
+            logger.debug(
+                "dialog bridge: Fetch.enable failed on sid=%s: %s",
+                (session_id or "")[:16], e,
+            )
+        # Also try to inject into the already-loaded document so existing
+        # pages pick up the override on reconnect. Best-effort.
+        try:
+            await self._cdp(
+                "Runtime.evaluate",
+                {"expression": _DIALOG_BRIDGE_SCRIPT, "returnByValue": True},
+                session_id=session_id,
+                timeout=3.0,
+            )
+        except Exception:
+            pass
+
+    async def _cdp(
+        self,
+        method: str,
+        params: Optional[Dict[str, Any]] = None,
+        *,
+        session_id: Optional[str] = None,
+        timeout: float = 10.0,
+    ) -> Dict[str, Any]:
+        """Send a CDP command and await its response."""
+        if self._ws is None:
+            raise RuntimeError("supervisor WebSocket is not connected")
+        call_id = self._next_call_id
+        self._next_call_id += 1
+        payload: Dict[str, Any] = {"id": call_id, "method": method}
+        if params:
+            payload["params"] = params
+        if session_id:
+            payload["sessionId"] = session_id
+        fut: asyncio.Future = asyncio.get_running_loop().create_future()
+        self._pending_calls[call_id] = fut
+        await self._ws.send(json.dumps(payload))
+        try:
+            return await asyncio.wait_for(fut, timeout=timeout)
+        finally:
+            self._pending_calls.pop(call_id, None)
+
+    async def _read_loop(self) -> None:
+        """Continuously dispatch incoming CDP frames."""
+        assert self._ws is not None
+        try:
+            async for raw in self._ws:
+                if self._stop_requested:
+                    break
+                try:
+                    msg = json.loads(raw)
+                except Exception:
+                    logger.debug("CDP supervisor: non-JSON frame dropped")
+                    continue
+                if "id" in msg:
+                    fut = self._pending_calls.pop(msg["id"], None)
+                    if fut is not None and not fut.done():
+                        if "error" in msg:
+                            fut.set_exception(
+                                RuntimeError(f"CDP error on id={msg['id']}: {msg['error']}")
+                            )
+                        else:
+                            fut.set_result(msg)
+                elif "method" in msg:
+                    await self._on_event(msg["method"], msg.get("params", {}), msg.get("sessionId"))
+        except Exception as e:
+            logger.debug("CDP read loop exited: %s", e)
+
+    # ── Event dispatch ──────────────────────────────────────────────────────
+
+    async def _on_event(
+        self, method: str, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        if method == "Page.javascriptDialogOpening":
+            await self._on_dialog_opening(params, session_id)
+        elif method == "Page.javascriptDialogClosed":
+            await self._on_dialog_closed(params, session_id)
+        elif method == "Fetch.requestPaused":
+            await self._on_fetch_paused(params, session_id)
+        elif method == "Page.frameAttached":
+            self._on_frame_attached(params, session_id)
+        elif method == "Page.frameNavigated":
+            self._on_frame_navigated(params, session_id)
+        elif method == "Page.frameDetached":
+            self._on_frame_detached(params, session_id)
+        elif method == "Target.attachedToTarget":
+            await self._on_target_attached(params)
+        elif method == "Target.detachedFromTarget":
+            self._on_target_detached(params)
+        elif method == "Runtime.consoleAPICalled":
+            self._on_console(params, level_from="api")
+        elif method == "Runtime.exceptionThrown":
+            self._on_console(params, level_from="exception")
+
+    async def _on_dialog_opening(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        self._dialog_seq += 1
+        dialog = PendingDialog(
+            id=f"d-{self._dialog_seq}",
+            type=str(params.get("type") or ""),
+            message=str(params.get("message") or ""),
+            default_prompt=str(params.get("defaultPrompt") or ""),
+            opened_at=time.time(),
+            cdp_session_id=session_id or self._page_session_id or "",
+            frame_id=params.get("frameId"),
+        )
+
+        if self.dialog_policy == DIALOG_POLICY_AUTO_DISMISS:
+            # Archive immediately with the policy tag so the ``closed`` event
+            # arriving right after our handleJavaScriptDialog call doesn't
+            # re-archive it as "remote".
+            with self._state_lock:
+                self._archive_dialog_locked(dialog, "auto_policy")
+            asyncio.create_task(
+                self._auto_handle_dialog(dialog, accept=False, prompt_text="")
+            )
+        elif self.dialog_policy == DIALOG_POLICY_AUTO_ACCEPT:
+            with self._state_lock:
+                self._archive_dialog_locked(dialog, "auto_policy")
+            asyncio.create_task(
+                self._auto_handle_dialog(
+                    dialog, accept=True, prompt_text=dialog.default_prompt
+                )
+            )
+        else:
+            # must_respond → add to pending and arm watchdog.
+            with self._state_lock:
+                self._pending_dialogs[dialog.id] = dialog
+            loop = asyncio.get_running_loop()
+            handle = loop.call_later(
+                self.dialog_timeout_s,
+                lambda: asyncio.create_task(self._dialog_timeout_expired(dialog.id)),
+            )
+            self._dialog_watchdogs[dialog.id] = handle
+
+    async def _auto_handle_dialog(
+        self, dialog: PendingDialog, *, accept: bool, prompt_text: str
+    ) -> None:
+        """Send handleJavaScriptDialog for auto_dismiss/auto_accept.
+
+        Dialog has already been archived by the caller (``_on_dialog_opening``);
+        this just fires the CDP call so the page unblocks.
+        """
+        params: Dict[str, Any] = {"accept": accept}
+        if dialog.type == "prompt":
+            params["promptText"] = prompt_text
+        try:
+            await self._cdp(
+                "Page.handleJavaScriptDialog",
+                params,
+                session_id=dialog.cdp_session_id or None,
+                timeout=5.0,
+            )
+        except Exception as e:
+            logger.debug("auto-handle CDP call failed for %s: %s", dialog.id, e)
+
+    async def _dialog_timeout_expired(self, dialog_id: str) -> None:
+        with self._state_lock:
+            dialog = self._pending_dialogs.get(dialog_id)
+        if dialog is None:
+            return
+        logger.warning(
+            "CDP supervisor %s: dialog %s (%s) auto-dismissed after %ss timeout",
+            self.task_id,
+            dialog_id,
+            dialog.type,
+            self.dialog_timeout_s,
+        )
+        try:
+            # Archive with watchdog tag BEFORE fulfilling / dismissing.
+            with self._state_lock:
+                if dialog_id in self._pending_dialogs:
+                    self._pending_dialogs.pop(dialog_id, None)
+                    self._archive_dialog_locked(dialog, "watchdog")
+            # Unblock the page — via bridge Fetch fulfill for bridge dialogs,
+            # else native Page.handleJavaScriptDialog for real dialogs.
+            if dialog.bridge_request_id:
+                await self._fulfill_bridge_request(dialog, accept=False, prompt_text="")
+            else:
+                await self._cdp(
+                    "Page.handleJavaScriptDialog",
+                    {"accept": False},
+                    session_id=dialog.cdp_session_id or None,
+                    timeout=5.0,
+                )
+        except Exception as e:
+            logger.debug("auto-dismiss failed for %s: %s", dialog_id, e)
+
+    def _archive_dialog_locked(self, dialog: PendingDialog, closed_by: str) -> None:
+        """Move a pending dialog to the recent_dialogs ring buffer. Must hold state_lock."""
+        record = DialogRecord(
+            id=dialog.id,
+            type=dialog.type,
+            message=dialog.message,
+            opened_at=dialog.opened_at,
+            closed_at=time.time(),
+            closed_by=closed_by,
+            frame_id=dialog.frame_id,
+        )
+        self._recent_dialogs.append(record)
+        if len(self._recent_dialogs) > RECENT_DIALOGS_MAX * 2:
+            self._recent_dialogs = self._recent_dialogs[-RECENT_DIALOGS_MAX:]
+
+    async def _handle_dialog_cdp(
+        self, dialog: PendingDialog, *, accept: bool, prompt_text: str
+    ) -> None:
+        """Send the Page.handleJavaScriptDialog CDP command (agent path only).
+
+        Routes to the bridge-fulfill path when the dialog was captured via
+        the injected XHR override (see ``_on_fetch_paused``).
+        """
+        if dialog.bridge_request_id:
+            try:
+                await self._fulfill_bridge_request(
+                    dialog, accept=accept, prompt_text=prompt_text
+                )
+            finally:
+                with self._state_lock:
+                    if dialog.id in self._pending_dialogs:
+                        self._pending_dialogs.pop(dialog.id, None)
+                        self._archive_dialog_locked(dialog, "agent")
+                handle = self._dialog_watchdogs.pop(dialog.id, None)
+                if handle is not None:
+                    handle.cancel()
+            return
+
+        params: Dict[str, Any] = {"accept": accept}
+        if dialog.type == "prompt":
+            params["promptText"] = prompt_text
+        try:
+            await self._cdp(
+                "Page.handleJavaScriptDialog",
+                params,
+                session_id=dialog.cdp_session_id or None,
+                timeout=5.0,
+            )
+        finally:
+            # Clear regardless — the CDP error path usually means the dialog
+            # already closed (browser auto-dismissed after navigation, etc.).
+            with self._state_lock:
+                if dialog.id in self._pending_dialogs:
+                    self._pending_dialogs.pop(dialog.id, None)
+                    self._archive_dialog_locked(dialog, "agent")
+            handle = self._dialog_watchdogs.pop(dialog.id, None)
+            if handle is not None:
+                handle.cancel()
+
+    async def _on_dialog_closed(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        # ``Page.javascriptDialogClosed`` spec has only ``result`` (bool) and
+        # ``userInput`` (string), not the original ``message``.  Match by
+        # session id and clear the oldest dialog on that session — if Chrome
+        # closed one on us (e.g. our disconnect auto-dismissed it, or the
+        # browser navigated, or Browserbase's CDP proxy auto-dismissed), there
+        # shouldn't be more than one in flight per session anyway because the
+        # JS thread is blocked while a dialog is up.
+        with self._state_lock:
+            candidate_ids = [
+                d.id
+                for d in self._pending_dialogs.values()
+                if d.cdp_session_id == session_id
+                # Bridge-captured dialogs aren't cleared by native close events;
+                # they're resolved via Fetch.fulfillRequest instead. Only the
+                # real-native-dialog path uses Page.javascriptDialogClosed.
+                and d.bridge_request_id is None
+            ]
+            if candidate_ids:
+                did = candidate_ids[0]
+                dialog = self._pending_dialogs.pop(did, None)
+                if dialog is not None:
+                    self._archive_dialog_locked(dialog, "remote")
+                handle = self._dialog_watchdogs.pop(did, None)
+                if handle is not None:
+                    handle.cancel()
+
+    async def _on_fetch_paused(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        """Bridge XHR captured mid-flight — materialize as a pending dialog.
+
+        The injected script (``_DIALOG_BRIDGE_SCRIPT``) fires a synchronous
+        XHR to ``DIALOG_BRIDGE_HOST`` whenever page code calls alert/confirm/
+        prompt. We catch it via Fetch.enable pattern; the page's JS thread
+        is blocked on the XHR's response until we call Fetch.fulfillRequest
+        (which happens from ``respond_to_dialog``) or until the watchdog
+        fires (at which point we fulfill with a cancel response).
+        """
+        url = str(params.get("request", {}).get("url") or "")
+        request_id = params.get("requestId")
+        if not request_id:
+            return
+        # Only care about our bridge URLs. Fetch can still deliver other
+        # intercepted requests if patterns were ever broadened.
+        if DIALOG_BRIDGE_HOST not in url:
+            # Not ours — forward unchanged so the page sees its own request.
+            try:
+                await self._cdp(
+                    "Fetch.continueRequest", {"requestId": request_id},
+                    session_id=session_id, timeout=3.0,
+                )
+            except Exception:
+                pass
+            return
+
+        # Parse query string for dialog metadata. Use urllib to be robust.
+        from urllib.parse import urlparse, parse_qs
+        q = parse_qs(urlparse(url).query)
+
+        def _q(name: str) -> str:
+            v = q.get(name, [""])
+            return v[0] if v else ""
+
+        kind = _q("kind") or "alert"
+        message = _q("message")
+        default_prompt = _q("default_prompt")
+
+        self._dialog_seq += 1
+        dialog = PendingDialog(
+            id=f"d-{self._dialog_seq}",
+            type=kind,
+            message=message,
+            default_prompt=default_prompt,
+            opened_at=time.time(),
+            cdp_session_id=session_id or self._page_session_id or "",
+            frame_id=params.get("frameId"),
+            bridge_request_id=str(request_id),
+        )
+
+        # Apply policy exactly as for native dialogs.
+        if self.dialog_policy == DIALOG_POLICY_AUTO_DISMISS:
+            with self._state_lock:
+                self._archive_dialog_locked(dialog, "auto_policy")
+            asyncio.create_task(
+                self._fulfill_bridge_request(dialog, accept=False, prompt_text="")
+            )
+        elif self.dialog_policy == DIALOG_POLICY_AUTO_ACCEPT:
+            with self._state_lock:
+                self._archive_dialog_locked(dialog, "auto_policy")
+            asyncio.create_task(
+                self._fulfill_bridge_request(
+                    dialog, accept=True, prompt_text=default_prompt
+                )
+            )
+        else:
+            # must_respond — add to pending + arm watchdog.
+            with self._state_lock:
+                self._pending_dialogs[dialog.id] = dialog
+            loop = asyncio.get_running_loop()
+            handle = loop.call_later(
+                self.dialog_timeout_s,
+                lambda: asyncio.create_task(self._dialog_timeout_expired(dialog.id)),
+            )
+            self._dialog_watchdogs[dialog.id] = handle
+
+    async def _fulfill_bridge_request(
+        self, dialog: PendingDialog, *, accept: bool, prompt_text: str
+    ) -> None:
+        """Resolve a bridge XHR via Fetch.fulfillRequest so the page unblocks."""
+        if not dialog.bridge_request_id:
+            return
+        payload = {
+            "accept": bool(accept),
+            "prompt_text": prompt_text if dialog.type == "prompt" else "",
+            "dialog_id": dialog.id,
+        }
+        body = json.dumps(payload).encode()
+        try:
+            import base64 as _b64
+            await self._cdp(
+                "Fetch.fulfillRequest",
+                {
+                    "requestId": dialog.bridge_request_id,
+                    "responseCode": 200,
+                    "responseHeaders": [
+                        {"name": "Content-Type", "value": "application/json"},
+                        {"name": "Access-Control-Allow-Origin", "value": "*"},
+                    ],
+                    "body": _b64.b64encode(body).decode(),
+                },
+                session_id=dialog.cdp_session_id or None,
+                timeout=5.0,
+            )
+        except Exception as e:
+            logger.debug("bridge fulfill failed for %s: %s", dialog.id, e)
+
+    # ── Frame / target tracking ─────────────────────────────────────────────
+
+    def _on_frame_attached(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        frame_id = params.get("frameId")
+        if not frame_id:
+            return
+        with self._state_lock:
+            self._frames[frame_id] = FrameInfo(
+                frame_id=frame_id,
+                url="",
+                origin="",
+                parent_frame_id=params.get("parentFrameId"),
+                is_oopif=False,
+                cdp_session_id=session_id,
+            )
+
+    def _on_frame_navigated(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        frame = params.get("frame") or {}
+        frame_id = frame.get("id")
+        if not frame_id:
+            return
+        with self._state_lock:
+            existing = self._frames.get(frame_id)
+            info = FrameInfo(
+                frame_id=frame_id,
+                url=str(frame.get("url") or ""),
+                origin=str(frame.get("securityOrigin") or frame.get("origin") or ""),
+                parent_frame_id=frame.get("parentId") or (existing.parent_frame_id if existing else None),
+                is_oopif=bool(existing.is_oopif if existing else False),
+                cdp_session_id=existing.cdp_session_id if existing else session_id,
+                name=str(frame.get("name") or (existing.name if existing else "")),
+            )
+            self._frames[frame_id] = info
+
+    def _on_frame_detached(
+        self, params: Dict[str, Any], session_id: Optional[str]
+    ) -> None:
+        """Remove a frame from our state only when it's truly gone.
+
+        CDP emits ``Page.frameDetached`` with a ``reason`` of either
+        ``"remove"`` (the frame is actually gone from the DOM) or ``"swap"``
+        (the frame is migrating to a new process — typical when a
+        same-process iframe becomes an OOPIF, or when history navigates).
+        Dropping on ``swap`` would hide OOPIFs from the agent the moment
+        Chromium promotes them to their own process, so treat swap as a
+        no-op.
+
+        Even with ``reason=remove``, the parent page's perspective is
+        "the child frame left MY process tree" — which is what happens
+        when a same-origin iframe gets promoted to an OOPIF. If we
+        already have a live child CDP session attached for that frame_id,
+        the frame is still very much alive; only drop it when we have
+        no session record.
+        """
+        frame_id = params.get("frameId")
+        if not frame_id:
+            return
+        reason = str(params.get("reason") or "remove").lower()
+        if reason == "swap":
+            return
+        with self._state_lock:
+            existing = self._frames.get(frame_id)
+            # Keep OOPIF records even when the parent says the frame was
+            # "removed" — the iframe is still visible, just in a different
+            # process. If the frame truly goes away later, Target.detached
+            # + the next Page.frameDetached without a live session will
+            # clear it.
+            if existing and existing.is_oopif and existing.cdp_session_id:
+                return
+            self._frames.pop(frame_id, None)
+
+    async def _on_target_attached(self, params: Dict[str, Any]) -> None:
+        info = params.get("targetInfo") or {}
+        sid = params.get("sessionId")
+        target_type = info.get("type")
+        if not sid or target_type not in ("iframe", "worker"):
+            return
+        self._child_sessions[sid] = {"info": info, "type": target_type}
+
+        # Record the frame with its OOPIF session id for interaction routing.
+        if target_type == "iframe":
+            target_id = info.get("targetId")
+            with self._state_lock:
+                existing = self._frames.get(target_id)
+                self._frames[target_id] = FrameInfo(
+                    frame_id=target_id,
+                    url=str(info.get("url") or ""),
+                    origin="",  # filled by frameNavigated on the child session
+                    parent_frame_id=(existing.parent_frame_id if existing else None),
+                    is_oopif=True,
+                    cdp_session_id=sid,
+                    name=str(info.get("title") or (existing.name if existing else "")),
+                )
+
+        # Enable domains on the child off-loop so the reader keeps pumping.
+        # Awaiting the CDP replies here would deadlock because only the
+        # reader can resolve those replies' Futures.
+        asyncio.create_task(self._enable_child_domains(sid))
+
+    async def _enable_child_domains(self, sid: str) -> None:
+        """Enable Page+Runtime (+nested setAutoAttach) on a child CDP session.
+
+        Also installs the dialog bridge so iframe-scoped alert/confirm/prompt
+        calls round-trip through Fetch too.
+        """
+        try:
+            await self._cdp("Page.enable", session_id=sid, timeout=3.0)
+            await self._cdp("Runtime.enable", session_id=sid, timeout=3.0)
+            await self._cdp(
+                "Target.setAutoAttach",
+                {"autoAttach": True, "waitForDebuggerOnStart": False, "flatten": True},
+                session_id=sid,
+                timeout=3.0,
+            )
+        except Exception as e:
+            logger.debug("child session %s setup failed: %s", sid[:16], e)
+        # Install the dialog bridge on the child so iframe dialogs are captured.
+        await self._install_dialog_bridge(sid)
+
+    def _on_target_detached(self, params: Dict[str, Any]) -> None:
+        """Handle a child CDP session detaching.
+
+        We deliberately DO NOT drop frames from ``_frames`` here — Browserbase
+        fires transient detach events during page transitions even while the
+        iframe is still visible to the user, and dropping the record hides
+        OOPIFs from the agent between the detach and the next
+        ``Target.attachedToTarget``. Instead, we just clear the session
+        binding so stale ``cdp_session_id`` values aren't used for routing.
+        If the iframe truly goes away, ``Page.frameDetached`` will clean up.
+        """
+        sid = params.get("sessionId")
+        if not sid:
+            return
+        self._child_sessions.pop(sid, None)
+        with self._state_lock:
+            for fid, frame in list(self._frames.items()):
+                if frame.cdp_session_id == sid:
+                    # Replace with a copy that has cdp_session_id cleared so
+                    # routing falls back to top-level page session if retried.
+                    self._frames[fid] = FrameInfo(
+                        frame_id=frame.frame_id,
+                        url=frame.url,
+                        origin=frame.origin,
+                        parent_frame_id=frame.parent_frame_id,
+                        is_oopif=frame.is_oopif,
+                        cdp_session_id=None,
+                        name=frame.name,
+                    )
+
+    # ── Console / exception ring buffer ─────────────────────────────────────
+
+    def _on_console(self, params: Dict[str, Any], *, level_from: str) -> None:
+        if level_from == "exception":
+            details = params.get("exceptionDetails") or {}
+            text = str(details.get("text") or "")
+            url = details.get("url")
+            event = ConsoleEvent(ts=time.time(), level="exception", text=text, url=url)
+        else:
+            raw_level = str(params.get("type") or "log")
+            level = "error" if raw_level in ("error", "assert") else (
+                "warning" if raw_level == "warning" else "log"
+            )
+            args = params.get("args") or []
+            parts: List[str] = []
+            for a in args[:4]:
+                if isinstance(a, dict):
+                    parts.append(str(a.get("value") or a.get("description") or ""))
+            event = ConsoleEvent(ts=time.time(), level=level, text=" ".join(parts))
+        with self._state_lock:
+            self._console_events.append(event)
+            if len(self._console_events) > CONSOLE_HISTORY_MAX * 2:
+                # Keep last CONSOLE_HISTORY_MAX; allow 2x slack to reduce churn.
+                self._console_events = self._console_events[-CONSOLE_HISTORY_MAX:]
+
+    # ── Frame tree building (bounded) ───────────────────────────────────────
+
+    def _build_frame_tree_locked(self) -> Dict[str, Any]:
+        """Build the capped frame_tree payload. Must be called under state lock."""
+        frames = self._frames
+        if not frames:
+            return {"top": None, "children": [], "truncated": False}
+
+        # Identify a top frame — one with no parent, preferring oopif=False.
+        tops = [f for f in frames.values() if not f.parent_frame_id]
+        top = next((f for f in tops if not f.is_oopif), tops[0] if tops else None)
+
+        # BFS from top, capped by FRAME_TREE_MAX_ENTRIES and
+        # FRAME_TREE_MAX_OOPIF_DEPTH for OOPIF branches.
+        children: List[Dict[str, Any]] = []
+        truncated = False
+        if top is None:
+            return {"top": None, "children": [], "truncated": False}
+
+        queue: List[Tuple[FrameInfo, int]] = [
+            (f, 1) for f in frames.values() if f.parent_frame_id == top.frame_id
+        ]
+        visited: set[str] = {top.frame_id}
+        while queue and len(children) < FRAME_TREE_MAX_ENTRIES:
+            frame, depth = queue.pop(0)
+            if frame.frame_id in visited:
+                continue
+            visited.add(frame.frame_id)
+            if frame.is_oopif and depth > FRAME_TREE_MAX_OOPIF_DEPTH:
+                truncated = True
+                continue
+            children.append(frame.to_dict())
+            for f in frames.values():
+                if f.parent_frame_id == frame.frame_id and f.frame_id not in visited:
+                    queue.append((f, depth + 1))
+        if queue:
+            truncated = True
+
+        return {
+            "top": top.to_dict(),
+            "children": children,
+            "truncated": truncated,
+        }
+
+
+# ── Registry ─────────────────────────────────────────────────────────────────
+
+
+class _SupervisorRegistry:
+    """Process-global (task_id → supervisor) map with idempotent start/stop.
+
+    One instance, exposed as ``SUPERVISOR_REGISTRY``. Safe to call from any
+    thread — mutations go through ``_lock``.
+    """
+
+    def __init__(self) -> None:
+        self._lock = threading.Lock()
+        self._by_task: Dict[str, CDPSupervisor] = {}
+
+    def get(self, task_id: str) -> Optional[CDPSupervisor]:
+        """Return the supervisor for ``task_id`` if running, else ``None``."""
+        with self._lock:
+            return self._by_task.get(task_id)
+
+    def get_or_start(
+        self,
+        task_id: str,
+        cdp_url: str,
+        *,
+        dialog_policy: str = DEFAULT_DIALOG_POLICY,
+        dialog_timeout_s: float = DEFAULT_DIALOG_TIMEOUT_S,
+        start_timeout: float = 15.0,
+    ) -> CDPSupervisor:
+        """Idempotently ensure a supervisor is running for ``(task_id, cdp_url)``.
+
+        If a supervisor exists for this task but was bound to a different
+        ``cdp_url``, the old one is stopped and a fresh one is started.
+        """
+        with self._lock:
+            existing = self._by_task.get(task_id)
+            if existing is not None:
+                if existing.cdp_url == cdp_url:
+                    return existing
+                # URL changed — tear down old, fall through to re-create.
+                self._by_task.pop(task_id, None)
+        if existing is not None:
+            existing.stop()
+
+        supervisor = CDPSupervisor(
+            task_id=task_id,
+            cdp_url=cdp_url,
+            dialog_policy=dialog_policy,
+            dialog_timeout_s=dialog_timeout_s,
+        )
+        supervisor.start(timeout=start_timeout)
+        with self._lock:
+            # Guard against a concurrent get_or_start from another thread.
+            already = self._by_task.get(task_id)
+            if already is not None and already.cdp_url == cdp_url:
+                supervisor.stop()
+                return already
+            self._by_task[task_id] = supervisor
+        return supervisor
+
+    def stop(self, task_id: str) -> None:
+        """Stop and discard the supervisor for ``task_id`` if it exists."""
+        with self._lock:
+            supervisor = self._by_task.pop(task_id, None)
+        if supervisor is not None:
+            supervisor.stop()
+
+    def stop_all(self) -> None:
+        """Stop every running supervisor. For shutdown / test teardown."""
+        with self._lock:
+            items = list(self._by_task.items())
+            self._by_task.clear()
+        for _, supervisor in items:
+            supervisor.stop()
+
+
+SUPERVISOR_REGISTRY = _SupervisorRegistry()
+
+
+__all__ = [
+    "CDPSupervisor",
+    "ConsoleEvent",
+    "DEFAULT_DIALOG_POLICY",
+    "DEFAULT_DIALOG_TIMEOUT_S",
+    "DIALOG_POLICY_AUTO_ACCEPT",
+    "DIALOG_POLICY_AUTO_DISMISS",
+    "DIALOG_POLICY_MUST_RESPOND",
+    "DialogRecord",
+    "FrameInfo",
+    "PendingDialog",
+    "SUPERVISOR_REGISTRY",
+    "SupervisorSnapshot",
+    "_SupervisorRegistry",
+]
diff --git a/tools/browser_tool.py b/tools/browser_tool.py
index e46636ad9..469e9be28 100644
--- a/tools/browser_tool.py
+++ b/tools/browser_tool.py
@@ -63,7 +63,7 @@ import tempfile
 import threading
 import time
 import requests
-from typing import Dict, Any, Optional, List
+from typing import Dict, Any, Optional, List, Tuple
 from pathlib import Path
 from agent.auxiliary_client import call_llm
 from hermes_constants import get_hermes_home
@@ -287,6 +287,100 @@ def _get_cdp_override() -> str:
     return ""
 
 
+def _get_dialog_policy_config() -> Tuple[str, float]:
+    """Read ``browser.dialog_policy`` + ``browser.dialog_timeout_s`` from config.
+
+    Returns a ``(policy, timeout_s)`` tuple, falling back to the supervisor's
+    defaults when keys are absent or invalid.
+    """
+    # Defer imports so browser_tool can be imported in minimal environments.
+    from tools.browser_supervisor import (
+        DEFAULT_DIALOG_POLICY,
+        DEFAULT_DIALOG_TIMEOUT_S,
+        _VALID_POLICIES,
+    )
+
+    try:
+        from hermes_cli.config import read_raw_config
+
+        cfg = read_raw_config()
+        browser_cfg = cfg.get("browser", {}) if isinstance(cfg, dict) else {}
+        if not isinstance(browser_cfg, dict):
+            return DEFAULT_DIALOG_POLICY, DEFAULT_DIALOG_TIMEOUT_S
+        policy = str(browser_cfg.get("dialog_policy") or DEFAULT_DIALOG_POLICY)
+        if policy not in _VALID_POLICIES:
+            logger.debug("Invalid browser.dialog_policy=%r; using default", policy)
+            policy = DEFAULT_DIALOG_POLICY
+        timeout_raw = browser_cfg.get("dialog_timeout_s")
+        try:
+            timeout_s = float(timeout_raw) if timeout_raw is not None else DEFAULT_DIALOG_TIMEOUT_S
+            if timeout_s <= 0:
+                timeout_s = DEFAULT_DIALOG_TIMEOUT_S
+        except (TypeError, ValueError):
+            timeout_s = DEFAULT_DIALOG_TIMEOUT_S
+        return policy, timeout_s
+    except Exception:
+        return DEFAULT_DIALOG_POLICY, DEFAULT_DIALOG_TIMEOUT_S
+
+
+def _ensure_cdp_supervisor(task_id: str) -> None:
+    """Start a CDP supervisor for ``task_id`` if an endpoint is reachable.
+
+    Idempotent — delegates to ``SupervisorRegistry.get_or_start`` which skips
+    when a supervisor for this ``(task_id, cdp_url)`` already exists and
+    tears down + restarts on URL change. Safe to call on every
+    ``browser_navigate`` / ``/browser connect`` without worrying about
+    double-attach.
+
+    Resolves the CDP URL in this order:
+      1. ``BROWSER_CDP_URL`` / ``browser.cdp_url`` — covers ``/browser connect``
+         and config-set overrides.
+      2. ``_active_sessions[task_id]["cdp_url"]`` — covers Browserbase + any
+         other cloud provider whose ``create_session`` returns a raw CDP URL.
+
+    Swallows all errors — failing to attach the supervisor must not break
+    the browser session itself.  The agent simply won't see
+    ``pending_dialogs`` / ``frame_tree`` fields in snapshots.
+    """
+    cdp_url = _get_cdp_override()
+    if not cdp_url:
+        # Fallback: active session may carry a per-session CDP URL from a
+        # cloud provider (Browserbase sets this).
+        with _cleanup_lock:
+            session_info = _active_sessions.get(task_id, {})
+        maybe = str(session_info.get("cdp_url") or "")
+        if maybe:
+            cdp_url = _resolve_cdp_override(maybe)
+    if not cdp_url:
+        return
+    try:
+        from tools.browser_supervisor import SUPERVISOR_REGISTRY  # type: ignore[import-not-found]
+
+        policy, timeout_s = _get_dialog_policy_config()
+        SUPERVISOR_REGISTRY.get_or_start(
+            task_id=task_id,
+            cdp_url=cdp_url,
+            dialog_policy=policy,
+            dialog_timeout_s=timeout_s,
+        )
+    except Exception as exc:
+        logger.debug(
+            "CDP supervisor attach for task=%s failed (non-fatal): %s",
+            task_id,
+            exc,
+        )
+
+
+def _stop_cdp_supervisor(task_id: str) -> None:
+    """Stop the CDP supervisor for ``task_id`` if one exists. No-op otherwise."""
+    try:
+        from tools.browser_supervisor import SUPERVISOR_REGISTRY  # type: ignore[import-not-found]
+
+        SUPERVISOR_REGISTRY.stop(task_id)
+    except Exception as exc:
+        logger.debug("CDP supervisor stop for task=%s failed (non-fatal): %s", task_id, exc)
+
+
 # ============================================================================
 # Cloud Provider Registry
 # ============================================================================
@@ -995,7 +1089,12 @@ def _get_session_info(task_id: Optional[str] = None) -> Dict[str, str]:
         if task_id in _active_sessions:
             return _active_sessions[task_id]
         _active_sessions[task_id] = session_info
-    
+
+    # Lazy-start the CDP supervisor now that the session exists (if the
+    # backend surfaces a CDP URL via override or session_info["cdp_url"]).
+    # Idempotent; swallows errors. See _ensure_cdp_supervisor for details.
+    _ensure_cdp_supervisor(task_id)
+
     return session_info
 
 
@@ -1455,7 +1554,7 @@ def browser_navigate(url: str, task_id: Optional[str] = None) -> str:
     if is_first_nav:
         session_info["_first_nav"] = False
         _maybe_start_recording(effective_task_id)
-    
+
     result = _run_browser_command(effective_task_id, "open", [url], timeout=max(_get_command_timeout(), 60))
     
     if result.get("success"):
@@ -1578,7 +1677,20 @@ def browser_snapshot(
             "snapshot": snapshot_text,
             "element_count": len(refs) if refs else 0
         }
-        
+
+        # Merge supervisor state (pending dialogs + frame tree) when a CDP
+        # supervisor is attached to this task. No-op otherwise. See
+        # website/docs/developer-guide/browser-supervisor.md.
+        try:
+            from tools.browser_supervisor import SUPERVISOR_REGISTRY  # type: ignore[import-not-found]
+            _supervisor = SUPERVISOR_REGISTRY.get(effective_task_id)
+            if _supervisor is not None:
+                _sv_snap = _supervisor.snapshot()
+                if _sv_snap.active:
+                    response.update(_sv_snap.to_dict())
+        except Exception as _sv_exc:
+            logger.debug("supervisor snapshot merge failed: %s", _sv_exc)
+
         return json.dumps(response, ensure_ascii=False)
     else:
         return json.dumps({
@@ -2248,7 +2360,11 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     """
     if task_id is None:
         task_id = "default"
-    
+
+    # Stop the CDP supervisor for this task FIRST so we close our WebSocket
+    # before the backend tears down the underlying CDP endpoint.
+    _stop_cdp_supervisor(task_id)
+
     # Also clean up Camofox session if running in Camofox mode.
     # Skip full close when managed persistence is enabled — the browser
     # profile (and its session cookies) must survive across agent tasks.
@@ -2329,6 +2445,13 @@ def cleanup_all_browsers() -> None:
     for task_id in task_ids:
         cleanup_browser(task_id)
 
+    # Tear down CDP supervisors for all tasks so background threads exit.
+    try:
+        from tools.browser_supervisor import SUPERVISOR_REGISTRY  # type: ignore[import-not-found]
+        SUPERVISOR_REGISTRY.stop_all()
+    except Exception:
+        pass
+
     # Reset cached lookups so they are re-evaluated on next use.
     global _cached_agent_browser, _agent_browser_resolved
     global _cached_command_timeout, _command_timeout_resolved
diff --git a/toolsets.py b/toolsets.py
index 975d8883c..b3cdb2e7a 100644
--- a/toolsets.py
+++ b/toolsets.py
@@ -43,7 +43,7 @@ _HERMES_CORE_TOOLS = [
     "browser_navigate", "browser_snapshot", "browser_click",
     "browser_type", "browser_scroll", "browser_back",
     "browser_press", "browser_get_images",
-    "browser_vision", "browser_console", "browser_cdp",
+    "browser_vision", "browser_console", "browser_cdp", "browser_dialog",
     # Text-to-speech
     "text_to_speech",
     # Planning & memory
@@ -115,7 +115,8 @@ TOOLSETS = {
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
             "browser_press", "browser_get_images",
-            "browser_vision", "browser_console", "browser_cdp", "web_search"
+            "browser_vision", "browser_console", "browser_cdp",
+            "browser_dialog", "web_search"
         ],
         "includes": []
     },
@@ -249,7 +250,7 @@ TOOLSETS = {
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
             "browser_press", "browser_get_images",
-            "browser_vision", "browser_console", "browser_cdp",
+            "browser_vision", "browser_console", "browser_cdp", "browser_dialog",
             "todo", "memory",
             "session_search",
             "execute_code", "delegate_task",
@@ -274,7 +275,7 @@ TOOLSETS = {
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
             "browser_press", "browser_get_images",
-            "browser_vision", "browser_console", "browser_cdp",
+            "browser_vision", "browser_console", "browser_cdp", "browser_dialog",
             # Planning & memory
             "todo", "memory",
             # Session history search
diff --git a/website/docs/developer-guide/browser-supervisor.md b/website/docs/developer-guide/browser-supervisor.md
new file mode 100644
index 000000000..d0aa34dbb
--- /dev/null
+++ b/website/docs/developer-guide/browser-supervisor.md
@@ -0,0 +1,223 @@
+# Browser CDP Supervisor — Design
+
+**Status:** Shipped (PR 14540)
+**Last updated:** 2026-04-23
+**Author:** @teknium1
+
+## Problem
+
+Native JS dialogs (`alert`/`confirm`/`prompt`/`beforeunload`) and iframes are
+the two biggest gaps in our browser tooling:
+
+1. **Dialogs block the JS thread.** Any operation on the page stalls until the
+   dialog is handled. Before this work, the agent had no way to know a dialog
+   was open — subsequent tool calls would hang or throw opaque errors.
+2. **Iframes are invisible.** The agent could see iframe nodes in the DOM
+   snapshot but could not click, type, or eval inside them — especially
+   cross-origin (OOPIF) iframes that live in separate Chromium processes.
+
+[PR #12550](https://github.com/NousResearch/hermes-agent/pull/12550) proposed a
+stateless `browser_dialog` wrapper. That doesn't solve detection — it's a
+cleaner CDP call for when the agent already knows (via symptoms) that a dialog
+is open. Closed as superseded.
+
+## Backend capability matrix (verified live 2026-04-23)
+
+Using throwaway probe scripts against a data-URL page that fires alerts in the
+main frame and in a same-origin srcdoc iframe, plus a cross-origin
+`https://example.com` iframe:
+
+| Backend | Dialog detect | Dialog respond | Frame tree | OOPIF `Runtime.evaluate` via `browser_cdp(frame_id=...)` |
+|---|---|---|---|---|
+| Local Chrome (`--remote-debugging-port`) / `/browser connect` | ✓ | ✓ full workflow | ✓ | ✓ |
+| Browserbase | ✓ (via bridge) | ✓ full workflow (via bridge) | ✓ | ✓ (`document.title = "Example Domain"` verified on real cross-origin iframe) |
+| Camofox | ✗ no CDP (REST-only) | ✗ | partial via DOM snapshot | ✗ |
+
+**How Browserbase respond works.** Browserbase's CDP proxy uses Playwright
+internally and auto-dismisses native dialogs within ~10ms, so
+`Page.handleJavaScriptDialog` can't keep up. To work around this, the
+supervisor injects a bridge script via
+`Page.addScriptToEvaluateOnNewDocument` that overrides
+`window.alert`/`confirm`/`prompt` with a synchronous XHR to a magic host
+(`hermes-dialog-bridge.invalid`). `Fetch.enable` intercepts those XHRs
+before they touch the network — the dialog becomes a `Fetch.requestPaused`
+event the supervisor captures, and `respond_to_dialog` fulfills via
+`Fetch.fulfillRequest` with a JSON body the injected script decodes.
+
+Net result: from the page's perspective, `prompt()` still returns the
+agent-supplied string. From the agent's perspective, it's the same
+`browser_dialog(action=...)` API either way. Tested end-to-end against
+real Browserbase sessions — 4/4 (alert/prompt/confirm-accept/confirm-dismiss)
+pass including value round-tripping back into page JS.
+
+Camofox stays unsupported for this PR; follow-up upstream issue planned at
+`jo-inc/camofox-browser` requesting a dialog polling endpoint.
+
+## Architecture
+
+### CDPSupervisor
+
+One `asyncio.Task` running in a background daemon thread per Hermes `task_id`.
+Holds a persistent WebSocket to the backend's CDP endpoint. Maintains:
+
+- **Dialog queue** — `List[PendingDialog]` with `{id, type, message, default_prompt, session_id, opened_at}`
+- **Frame tree** — `Dict[frame_id, FrameInfo]` with parent relationships, URL, origin, whether cross-origin child session
+- **Session map** — `Dict[session_id, SessionInfo]` so interaction tools can route to the right attached session for OOPIF operations
+- **Recent console errors** — ring buffer of the last 50 (for PR 2 diagnostics)
+
+Subscribes on attach:
+- `Page.enable` — `javascriptDialogOpening`, `frameAttached`, `frameNavigated`, `frameDetached`
+- `Runtime.enable` — `executionContextCreated`, `consoleAPICalled`, `exceptionThrown`
+- `Target.setAutoAttach {autoAttach: true, flatten: true}` — surfaces child OOPIF targets; supervisor enables `Page`+`Runtime` on each
+
+Thread-safe state access via a snapshot lock; tool handlers (sync) read the
+frozen snapshot without awaiting.
+
+### Lifecycle
+
+- **Start:** `SupervisorRegistry.get_or_start(task_id, cdp_url)` — called by
+  `browser_navigate`, Browserbase session create, `/browser connect`. Idempotent.
+- **Stop:** session teardown or `/browser disconnect`. Cancels the asyncio
+  task, closes the WebSocket, discards state.
+- **Rebind:** if the CDP URL changes (user reconnects to a new Chrome), stop
+  the old supervisor and start fresh — never reuse state across endpoints.
+
+### Dialog policy
+
+Configurable via `config.yaml` under `browser.dialog_policy`:
+
+- **`must_respond`** (default) — capture, surface in `browser_snapshot`, wait
+  for explicit `browser_dialog(action=...)` call. After a 300s safety timeout
+  with no response, auto-dismiss and log. Prevents a buggy agent from stalling
+  forever.
+- `auto_dismiss` — record and dismiss immediately; agent sees it after the
+  fact via `browser_state` inside `browser_snapshot`.
+- `auto_accept` — record and accept (useful for `beforeunload` where the user
+  wants to navigate away cleanly).
+
+Policy is per-task; no per-dialog overrides in v1.
+
+## Agent surface (PR 1)
+
+### One new tool
+
+```
+browser_dialog(action, prompt_text=None, dialog_id=None)
+```
+
+- `action="accept"` / `"dismiss"` → responds to the specified or sole pending dialog (required)
+- `prompt_text=...` → text to supply to a `prompt()` dialog
+- `dialog_id=...` → disambiguate when multiple dialogs queued (rare)
+
+Tool is response-only. Agent reads pending dialogs from `browser_snapshot`
+output before calling.
+
+### `browser_snapshot` extension
+
+Adds three optional fields to the existing snapshot output when a supervisor
+is attached:
+
+```json
+{
+  "pending_dialogs": [
+    {"id": "d-1", "type": "alert", "message": "Hello", "opened_at": 1650000000.0}
+  ],
+  "recent_dialogs": [
+    {"id": "d-1", "type": "alert", "message": "...", "opened_at": 1650000000.0,
+     "closed_at": 1650000000.1, "closed_by": "remote"}
+  ],
+  "frame_tree": {
+    "top": {"frame_id": "FRAME_A", "url": "https://example.com/", "origin": "https://example.com"},
+    "children": [
+      {"frame_id": "FRAME_B", "url": "about:srcdoc", "is_oopif": false},
+      {"frame_id": "FRAME_C", "url": "https://ads.example.net/", "is_oopif": true, "session_id": "SID_C"}
+    ],
+    "truncated": false
+  }
+}
+```
+
+- **`pending_dialogs`**: dialogs currently blocking the page's JS thread.
+  The agent must call `browser_dialog(action=...)` to respond. Empty on
+  Browserbase because their CDP proxy auto-dismisses within ~10ms.
+
+- **`recent_dialogs`**: ring buffer of up to 20 recently-closed dialogs with
+  a `closed_by` tag — `"agent"` (we responded), `"auto_policy"` (local
+  auto_dismiss/auto_accept), `"watchdog"` (must_respond timeout hit), or
+  `"remote"` (browser/backend closed it on us, e.g. Browserbase). This is
+  how agents on Browserbase still get visibility into what happened.
+
+- **`frame_tree`**: frame structure including cross-origin (OOPIF) children.
+  Capped at 30 entries + OOPIF depth 2 to bound snapshot size on ad-heavy
+  pages. `truncated: true` surfaces when limits were hit; agents needing
+  the full tree can use `browser_cdp` with `Page.getFrameTree`.
+
+No new tool schema surface for any of these — the agent reads the snapshot
+it already requests.
+
+### Availability gating
+
+Both surfaces gate on `_browser_cdp_check` (supervisor can only run when a CDP
+endpoint is reachable). On Camofox / no-backend sessions, the dialog tool is
+hidden and snapshot omits the new fields — no schema bloat.
+
+## Cross-origin iframe interaction
+
+Extending the dialog-detect work, `browser_cdp(frame_id=...)` routes CDP
+calls (notably `Runtime.evaluate`) through the supervisor's already-connected
+WebSocket using the OOPIF's child `sessionId`. Agents pick frame_ids out of
+`browser_snapshot.frame_tree.children[]` where `is_oopif=true` and pass them
+to `browser_cdp`. For same-origin iframes (no dedicated CDP session), the
+agent uses `contentWindow`/`contentDocument` from a top-level
+`Runtime.evaluate` instead — supervisor surfaces an error pointing at that
+fallback when `frame_id` belongs to a non-OOPIF.
+
+On Browserbase, this is the ONLY reliable path for iframe interaction —
+stateless CDP connections (opened per `browser_cdp` call) hit signed-URL
+expiry, while the supervisor's long-lived connection keeps a valid session.
+
+## Camofox (follow-up)
+
+Issue planned against `jo-inc/camofox-browser` adding:
+- Playwright `page.on('dialog', handler)` per session
+- `GET /tabs/:tabId/dialogs` polling endpoint
+- `POST /tabs/:tabId/dialogs/:id` to accept/dismiss
+- Frame-tree introspection endpoint
+
+## Files touched (PR 1)
+
+### New
+
+- `tools/browser_supervisor.py` — `CDPSupervisor`, `SupervisorRegistry`, `PendingDialog`, `FrameInfo`
+- `tools/browser_dialog_tool.py` — `browser_dialog` tool handler
+- `tests/tools/test_browser_supervisor.py` — mock CDP WebSocket server + lifecycle/state tests
+- `website/docs/developer-guide/browser-supervisor.md` — this file
+
+### Modified
+
+- `toolsets.py` — register `browser_dialog` in `browser`, `hermes-acp`, `hermes-api-server`, core toolsets (gated on CDP reachability)
+- `tools/browser_tool.py`
+  - `browser_navigate` start-hook: if CDP URL resolvable, `SupervisorRegistry.get_or_start(task_id, cdp_url)`
+  - `browser_snapshot` (at ~line 1536): merge supervisor state into return payload
+  - `/browser connect` handler: restart supervisor with new endpoint
+  - Session teardown hooks in `_cleanup_browser_session`
+- `hermes_cli/config.py` — add `browser.dialog_policy` and `browser.dialog_timeout_s` to `DEFAULT_CONFIG`
+- Docs: `website/docs/user-guide/features/browser.md`, `website/docs/reference/tools-reference.md`, `website/docs/reference/toolsets-reference.md`
+
+## Non-goals
+
+- Detection/interaction for Camofox (upstream gap; tracked separately)
+- Streaming dialog/frame events live to the user (would require gateway hooks)
+- Persisting dialog history across sessions (in-memory only)
+- Per-iframe dialog policies (agent can express this via `dialog_id`)
+- Replacing `browser_cdp` — it stays as the escape hatch for the long tail (cookies, viewport, network throttling)
+
+## Testing
+
+Unit tests use an asyncio mock CDP server that speaks enough of the protocol
+to exercise all state transitions: attach, enable, navigate, dialog fire,
+dialog dismiss, frame attach/detach, child target attach, session teardown.
+Real-backend E2E (Browserbase + local Chrome) is manual; probe scripts from
+the 2026-04-23 investigation kept in-repo under
+`scripts/browser_supervisor_e2e.py` so anyone can re-verify on new backend
+versions.
diff --git a/website/docs/reference/tools-reference.md b/website/docs/reference/tools-reference.md
index c255c8f6a..b3380d14b 100644
--- a/website/docs/reference/tools-reference.md
+++ b/website/docs/reference/tools-reference.md
@@ -6,9 +6,9 @@ description: "Authoritative reference for Hermes built-in tools, grouped by tool
 
 # Built-in Tools Reference
 
-This page documents all 53 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
+This page documents all 55 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
 
-**Quick counts:** 11 browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, and 15 standalone tools across other toolsets.
+**Quick counts:** 12 browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, and 15 standalone tools across other toolsets.
 
 :::tip MCP Tools
 In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with a server-name prefix (e.g., `github_create_issue` for the `github` MCP server). See [MCP Integration](/docs/user-guide/features/mcp) for configuration.
@@ -20,6 +20,7 @@ In addition to built-in tools, Hermes can load tools dynamically from MCP server
 |------|-------------|----------------------|
 | `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
 | `browser_cdp` | Send a raw Chrome DevTools Protocol (CDP) command. Escape hatch for browser operations not covered by browser_navigate, browser_click, browser_console, etc. Only available when a CDP endpoint is reachable at session start — via `/browser connect` or `browser.cdp_url` config. See https://chromedevtools.github.io/devtools-protocol/ | — |
+| `browser_dialog` | Respond to a native JavaScript dialog (alert / confirm / prompt / beforeunload). Call `browser_snapshot` first — pending dialogs appear in its `pending_dialogs` field. Then call `browser_dialog(action='accept'|'dismiss')`. Same availability as `browser_cdp` (Browserbase or `/browser connect`). | — |
 | `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
 | `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
 | `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
diff --git a/website/docs/reference/toolsets-reference.md b/website/docs/reference/toolsets-reference.md
index bb911004e..a8c0a8225 100644
--- a/website/docs/reference/toolsets-reference.md
+++ b/website/docs/reference/toolsets-reference.md
@@ -52,7 +52,7 @@ Or in-session:
 
 | Toolset | Tools | Purpose |
 |---------|-------|---------|
-| `browser` | `browser_back`, `browser_cdp`, `browser_click`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. `browser_cdp` is a raw CDP passthrough gated on a reachable CDP endpoint — it only appears when `/browser connect` is active or `browser.cdp_url` is set. |
+| `browser` | `browser_back`, `browser_cdp`, `browser_click`, `browser_console`, `browser_dialog`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. `browser_cdp` and `browser_dialog` are gated on a reachable CDP endpoint — they only appear when `/browser connect` is active, `browser.cdp_url` is set, or a Browserbase session is active. `browser_dialog` works together with the `pending_dialogs` and `frame_tree` fields that `browser_snapshot` adds when a CDP supervisor is attached. |
 | `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
 | `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
 | `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md
index 420ca1468..80f5c6f88 100644
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@@ -1240,10 +1240,26 @@ browser:
   inactivity_timeout: 120        # Seconds before auto-closing idle sessions
   command_timeout: 30             # Timeout in seconds for browser commands (screenshot, navigate, etc.)
   record_sessions: false         # Auto-record browser sessions as WebM videos to ~/.hermes/browser_recordings/
+  # Optional CDP override — when set, Hermes attaches directly to your own
+  # Chrome (via /browser connect) rather than starting a headless browser.
+  cdp_url: ""
+  # Dialog supervisor — controls how native JS dialogs (alert / confirm / prompt)
+  # are handled when a CDP backend is attached (Browserbase, local Chrome via
+  # /browser connect). Ignored on Camofox and default local agent-browser mode.
+  dialog_policy: must_respond    # must_respond | auto_dismiss | auto_accept
+  dialog_timeout_s: 300          # Safety auto-dismiss under must_respond (seconds)
   camofox:
     managed_persistence: false   # When true, Camofox sessions persist cookies/logins across restarts
 ```
 
+**Dialog policies:**
+
+- `must_respond` (default) — capture the dialog, surface it in `browser_snapshot.pending_dialogs`, and wait for the agent to call `browser_dialog(action=...)`. After `dialog_timeout_s` seconds with no response, the dialog is auto-dismissed to prevent the page's JS thread from stalling forever.
+- `auto_dismiss` — capture, dismiss immediately. The agent still sees the dialog record in `browser_snapshot.recent_dialogs` with `closed_by="auto_policy"` after the fact.
+- `auto_accept` — capture, accept immediately. Useful for pages with aggressive `beforeunload` prompts.
+
+See the [browser feature page](./features/browser.md#browser_dialog) for the full dialog workflow.
+
 The browser toolset supports multiple providers. See the [Browser feature page](/docs/user-guide/features/browser) for details on Browserbase, Browser Use, and local Chrome CDP setup.
 
 ## Timezone
diff --git a/website/docs/user-guide/features/browser.md b/website/docs/user-guide/features/browser.md
index d6624bf7d..ca51b633e 100644
--- a/website/docs/user-guide/features/browser.md
+++ b/website/docs/user-guide/features/browser.md
@@ -355,7 +355,50 @@ browser_cdp(method="Runtime.evaluate",
 browser_cdp(method="Network.getAllCookies")
 ```
 
-Browser-level methods (`Target.*`, `Browser.*`, `Storage.*`) omit `target_id`. Page-level methods (`Page.*`, `Runtime.*`, `DOM.*`, `Emulation.*`) require a `target_id` from `Target.getTargets`. Each call is independent — sessions do not persist between calls.
+Browser-level methods (`Target.*`, `Browser.*`, `Storage.*`) omit `target_id`. Page-level methods (`Page.*`, `Runtime.*`, `DOM.*`, `Emulation.*`) require a `target_id` from `Target.getTargets`. Each stateless call is independent — sessions do not persist between calls.
+
+**Cross-origin iframes:** pass `frame_id` (from `browser_snapshot.frame_tree.children[]` where `is_oopif=true`) to route the CDP call through the supervisor's live session for that iframe. This is how `Runtime.evaluate` inside a cross-origin iframe works on Browserbase, where stateless CDP connections would hit signed-URL expiry. Example:
+
+```
+browser_cdp(
+  method="Runtime.evaluate",
+  params={"expression": "document.title", "returnByValue": True},
+  frame_id="<frame_id from browser_snapshot>",
+)
+```
+
+Same-origin iframes don't need `frame_id` — use `document.querySelector('iframe').contentDocument` from a top-level `Runtime.evaluate` instead.
+
+### `browser_dialog`
+
+Responds to a native JS dialog (`alert` / `confirm` / `prompt` / `beforeunload`). Before this tool existed, dialogs would silently block the page's JavaScript thread and subsequent `browser_*` calls would hang or throw; now the agent sees pending dialogs in `browser_snapshot` output and responds explicitly.
+
+**Workflow:**
+1. Call `browser_snapshot`. If a dialog is blocking the page, it shows up as `pending_dialogs: [{"id": "d-1", "type": "alert", "message": "..."}]`.
+2. Call `browser_dialog(action="accept")` or `browser_dialog(action="dismiss")`. For `prompt()` dialogs, pass `prompt_text="..."` to supply the response.
+3. Re-snapshot — `pending_dialogs` is empty; the page's JS thread has resumed.
+
+**Detection happens automatically** via a persistent CDP supervisor — one WebSocket per task that subscribes to Page/Runtime/Target events. The supervisor also populates a `frame_tree` field in the snapshot so the agent can see the iframe structure of the current page, including cross-origin (OOPIF) iframes.
+
+**Availability matrix:**
+
+| Backend | Detection via `pending_dialogs` | Response (`browser_dialog` tool) |
+|---|---|---|
+| Local Chrome via `/browser connect` or `browser.cdp_url` | ✓ | ✓ full workflow |
+| Browserbase | ✓ | ✓ full workflow (via injected XHR bridge) |
+| Camofox / default local agent-browser | ✗ | ✗ (no CDP endpoint) |
+
+**How it works on Browserbase.** Browserbase's CDP proxy auto-dismisses real native dialogs server-side within ~10ms, so we can't use `Page.handleJavaScriptDialog`. The supervisor injects a small script via `Page.addScriptToEvaluateOnNewDocument` that overrides `window.alert`/`confirm`/`prompt` with a synchronous XHR. We intercept those XHRs via `Fetch.enable` — the page's JS thread stays blocked on the XHR until we call `Fetch.fulfillRequest` with the agent's response. `prompt()` return values round-trip back into page JS unchanged.
+
+**Dialog policy** is configured in `config.yaml` under `browser.dialog_policy`:
+
+| Policy | Behavior |
+|--------|----------|
+| `must_respond` (default) | Capture, surface in snapshot, wait for explicit `browser_dialog()` call. Safety auto-dismiss after `browser.dialog_timeout_s` (default 300s) so a buggy agent can't stall forever. |
+| `auto_dismiss` | Capture, dismiss immediately. Agent still sees the dialog in `browser_state` history but doesn't have to act. |
+| `auto_accept` | Capture, accept immediately. Useful when navigating pages with aggressive `beforeunload` prompts. |
+
+**Frame tree** inside `browser_snapshot.frame_tree` is capped to 30 frames and OOPIF depth 2 to keep payloads bounded on ad-heavy pages. A `truncated: true` flag surfaces when limits were hit; agents needing the full tree can use `browser_cdp` with `Page.getFrameTree`.
 
 ## Practical Examples