feat(browser): add browser_cdp raw DevTools Protocol passthrough (#12369)

Agents can now send arbitrary CDP commands to the browser. The tool is
gated on a reachable CDP endpoint at session start: it only appears in
the toolset when BROWSER_CDP_URL is set (from '/browser connect') or
'browser.cdp_url' is configured in config.yaml. Backends that don't
currently expose CDP to the Python side (Camofox, the default local
agent-browser, and cloud providers whose per-session cdp_url is not
yet surfaced) do not see the tool at all.
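
The gate described above can be sketched as a small predicate. This is a minimal illustration, not the code from this commit: the function names `resolve_cdp_url` and `browser_cdp_check`, the dict-shaped config, and the precedence of the environment variable over config.yaml are all assumptions. It also only checks that a URL is configured, not that the endpoint is reachable.

```python
from typing import Any, Mapping, Optional


def resolve_cdp_url(env: Mapping[str, str], config: Mapping[str, Any]) -> Optional[str]:
    """Return the configured CDP URL, or None when nothing is set.

    Hypothetical helper: the env var is checked first, then the
    'browser.cdp_url' key of an already-parsed config.yaml mapping.
    """
    browser_cfg = config.get("browser") or {}
    url = env.get("BROWSER_CDP_URL") or browser_cfg.get("cdp_url")
    url = (url or "").strip()
    return url or None


def browser_cdp_check(env: Mapping[str, str], config: Mapping[str, Any]) -> bool:
    """Hypothetical check_fn: expose browser_cdp only when a CDP URL is known."""
    return resolve_cdp_url(env, config) is not None
```

With this shape, unsetting BROWSER_CDP_URL makes the predicate return False again, which matches the "hidden again after unset" behavior covered by the gate tests.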

Tool schema description links to the CDP method reference at
https://chromedevtools.github.io/devtools-protocol/ so the agent can
web_extract specific method docs on demand.

Each call is stateless. Browser-level methods (Target.*, Browser.*,
Storage.*) omit target_id. Page-level methods attach to the target
with flatten=true and dispatch the method on the returned sessionId.
The tool returns a clean error when the endpoint becomes unreachable
mid-session or the URL isn't a WebSocket URL.
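
The dispatch flow above can be sketched without a real browser by injecting the transport. This is an assumption-laden illustration, not the commit's implementation: `call_cdp`, `attach_message`, `command_message`, `validate_ws_url`, and the `send` callable (one JSON message in, one response dict out) are hypothetical names. The CDP pieces themselves, `Target.attachToTarget` with `flatten: true` and a top-level `sessionId` on subsequent messages, follow the public protocol.

```python
import itertools
from typing import Any, Callable, Dict, Optional
from urllib.parse import urlparse

Message = Dict[str, Any]


def validate_ws_url(url: str) -> str:
    """Reject endpoints that are not ws:// or wss:// URLs."""
    if urlparse(url).scheme not in ("ws", "wss"):
        raise ValueError(f"CDP endpoint must be a WebSocket URL, got: {url!r}")
    return url


def attach_message(msg_id: int, target_id: str) -> Message:
    """Build the attach request; flatten=True selects flat session mode."""
    return {
        "id": msg_id,
        "method": "Target.attachToTarget",
        "params": {"targetId": target_id, "flatten": True},
    }


def command_message(msg_id: int, method: str, params: Optional[dict] = None,
                    session_id: Optional[str] = None) -> Message:
    """Build a command; page-level commands carry a top-level sessionId."""
    msg: Message = {"id": msg_id, "method": method, "params": params or {}}
    if session_id is not None:
        msg["sessionId"] = session_id
    return msg


def call_cdp(send: Callable[[Message], Message], method: str,
             params: Optional[dict] = None,
             target_id: Optional[str] = None) -> Message:
    """One stateless call: attach first only when a target_id is given."""
    ids = itertools.count(1)
    if target_id is None:
        # Browser-level method: sent directly on the browser connection.
        return send(command_message(next(ids), method, params))
    attached = send(attach_message(next(ids), target_id))
    session_id = attached["result"]["sessionId"]
    return send(command_message(next(ids), method, params, session_id))
```

Because nothing is cached between calls, every page-level call pays for one extra attach round trip; that is the trade-off of keeping the tool stateless.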

Tests: 19 unit (mock CDP server + gate checks) + E2E against real
headless Chrome (Target.getTargets, Browser.getVersion,
Runtime.evaluate with target_id, Page.navigate + re-eval, bogus
method, bogus target_id, missing endpoint) + E2E of the check_fn
gate (tool hidden without CDP URL, visible with it, hidden again
after unset).
Teknium 2026-04-19 00:03:10 -07:00 committed by GitHub
parent d66414a844
commit ce410521b3
6 changed files with 862 additions and 7 deletions

@@ -43,7 +43,7 @@ _HERMES_CORE_TOOLS = [
 "browser_navigate", "browser_snapshot", "browser_click",
 "browser_type", "browser_scroll", "browser_back",
 "browser_press", "browser_get_images",
-"browser_vision", "browser_console",
+"browser_vision", "browser_console", "browser_cdp",
 # Text-to-speech
 "text_to_speech",
 # Planning & memory
@@ -115,7 +115,7 @@ TOOLSETS = {
 "browser_navigate", "browser_snapshot", "browser_click",
 "browser_type", "browser_scroll", "browser_back",
 "browser_press", "browser_get_images",
-"browser_vision", "browser_console", "web_search"
+"browser_vision", "browser_console", "browser_cdp", "web_search"
 ],
 "includes": []
 },
@@ -249,7 +249,7 @@ TOOLSETS = {
 "browser_navigate", "browser_snapshot", "browser_click",
 "browser_type", "browser_scroll", "browser_back",
 "browser_press", "browser_get_images",
-"browser_vision", "browser_console",
+"browser_vision", "browser_console", "browser_cdp",
 "todo", "memory",
 "session_search",
 "execute_code", "delegate_task",
@@ -274,7 +274,7 @@ TOOLSETS = {
 "browser_navigate", "browser_snapshot", "browser_click",
 "browser_type", "browser_scroll", "browser_back",
 "browser_press", "browser_get_images",
-"browser_vision", "browser_console",
+"browser_vision", "browser_console", "browser_cdp",
 # Planning & memory
 "todo", "memory",
 # Session history search