mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
perf(browser): route browser_console eval through supervisor's persistent CDP WS (180x faster) (#23226)
Adds CDPSupervisor.evaluate_runtime() and wires it into _browser_eval as a fast path when a supervisor is alive for the current task_id. Replaces the ~180ms agent-browser subprocess fork+exec+Node-startup hop with a ~1ms Runtime.evaluate over the supervisor's already-connected WebSocket. Falls through to the existing agent-browser CLI path when no supervisor is running (e.g. backends without CDP, or before the first browser_navigate attaches one), so behaviour is unchanged where it can't apply. JS-side exceptions surface directly without falling through to the subprocess (the subprocess would just re-raise the same error, slower); supervisor-side failures (loop down, no session) fall through cleanly. Benchmark — 30 iterations of `1 + 1` against headless Chrome: supervisor WS mean= 0.96ms median= 0.91ms agent-browser subprocess mean=179.35ms median=167.73ms → 187x speedup mean Tests: 14 unit tests (mocked supervisor + response-shape coverage), 5 real-Chrome e2e tests in test_browser_supervisor.py (gated on Chrome being installed). Browser test suite: 355 passed, 1 skipped.
This commit is contained in:
parent
08c5b35a73
commit
d4b26df897
6 changed files with 717 additions and 0 deletions
|
|
@ -423,6 +423,15 @@ Check the browser console for any JavaScript errors
|
|||
|
||||
Use `clear=True` to clear the console after reading, so subsequent calls only show new messages.
|
||||
|
||||
`browser_console` also evaluates JavaScript when called with an `expression` argument — same shape as DevTools console, the result comes back parsed (JSON-serialized objects become dicts; primitive values stay primitive).
|
||||
|
||||
```
|
||||
browser_console(expression="document.querySelector('h1').textContent")
|
||||
browser_console(expression="JSON.stringify(performance.timing)")
|
||||
```
|
||||
|
||||
When a CDP supervisor is active for the current session (typical for any session that's run `browser_navigate` against a CDP-capable backend), evaluation runs over the supervisor's persistent WebSocket — no subprocess startup cost. Falls through to the standard agent-browser CLI path otherwise. Behaviour is identical either way; only latency changes.
|
||||
|
||||
### `browser_cdp`
|
||||
|
||||
Raw Chrome DevTools Protocol passthrough — the escape hatch for browser operations not covered by the other tools. Use for native dialog handling, iframe-scoped evaluation, cookie/network control, or any CDP verb the agent needs.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue