From 340d2b6de08a94fa7335b6be8dd973be000e259a Mon Sep 17 00:00:00 2001 From: Teknium <127238744+teknium1@users.noreply.github.com> Date: Tue, 19 May 2026 22:28:26 -0700 Subject: [PATCH 01/29] docs(xai-oauth): note X Premium+ also unlocks Grok OAuth (#29055) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The xAI Grok OAuth page only mentioned SuperGrok subscribers. An X Premium+ subscription on the X account you sign in with also unlocks Grok access via accounts.x.ai (xAI links the X subscription status to the xAI session automatically — see https://docs.x.ai/grok/faq). Updates the OAuth page title, prereqs, and overview table, plus the provider/configuration/x-search docs that reference the OAuth flow. --- website/docs/guides/xai-grok-oauth.md | 24 +++++++++++--------- website/docs/integrations/providers.md | 2 +- website/docs/user-guide/configuration.md | 4 ++-- website/docs/user-guide/features/x-search.md | 2 +- 4 files changed, 17 insertions(+), 15 deletions(-) diff --git a/website/docs/guides/xai-grok-oauth.md b/website/docs/guides/xai-grok-oauth.md index 6719a14a90b..df313e9afa7 100644 --- a/website/docs/guides/xai-grok-oauth.md +++ b/website/docs/guides/xai-grok-oauth.md @@ -1,12 +1,14 @@ --- sidebar_position: 16 -title: "xAI Grok OAuth (SuperGrok Subscription)" -description: "Sign in with your SuperGrok subscription to use Grok models in Hermes Agent — no API key required" +title: "xAI Grok OAuth (SuperGrok / X Premium+)" +description: "Sign in with your SuperGrok or X Premium+ subscription to use Grok models in Hermes Agent — no API key required" --- -# xAI Grok OAuth (SuperGrok Subscription) +# xAI Grok OAuth (SuperGrok / X Premium+) -Hermes Agent supports xAI Grok through a browser-based OAuth login flow against [accounts.x.ai](https://accounts.x.ai), using your existing **SuperGrok subscription**. No `XAI_API_KEY` is required — log in once and Hermes automatically refreshes your session in the background. +Hermes Agent supports xAI Grok through a browser-based OAuth login flow against [accounts.x.ai](https://accounts.x.ai), using either a **SuperGrok subscription** ([grok.com](https://x.ai/grok)) or an **X Premium+ subscription** (linked X account). No `XAI_API_KEY` is required — log in once and Hermes automatically refreshes your session in the background. + +When you sign in with an X account that has Premium+, xAI automatically links the subscription status to your xAI session, so the OAuth flow works the same as it does for direct SuperGrok subscribers. The transport reuses the `codex_responses` adapter (xAI exposes a Responses-style endpoint), so reasoning, tool-calling, streaming, and prompt caching work without any adapter changes. @@ -17,20 +19,20 @@ The same OAuth bearer token is also reused by every direct-to-xAI surface in Her | Item | Value | |------|-------| | Provider ID | `xai-oauth` | -| Display name | xAI Grok OAuth (SuperGrok Subscription) | +| Display name | xAI Grok OAuth (SuperGrok / X Premium+) | | Auth type | Browser OAuth 2.0 PKCE (loopback callback) | | Transport | xAI Responses API (`codex_responses`) | | Default model | `grok-4.3` | | Endpoint | `https://api.x.ai/v1` | | Auth server | `https://accounts.x.ai` | | Requires env var | No (`XAI_API_KEY` is **not** used for this provider) | -| Subscription | [SuperGrok](https://x.ai/grok) — see note below | +| Subscription | [SuperGrok](https://x.ai/grok) or [X Premium+](https://x.com/i/premium_sign_up) — see note below | ## Prerequisites - Python 3.9+ - Hermes Agent installed -- An active SuperGrok subscription on your xAI account +- An active **SuperGrok** subscription on your xAI account, **or** an **X Premium+** subscription on the X account you sign in with (xAI links the subscription automatically) - A browser available on the local machine (or use `--no-browser` for remote sessions) :::warning xAI may restrict OAuth API access by tier @@ -42,7 +44,7 @@ xAI's backend enforces its own allowlist on the OAuth API surface and has been s ```bash # Launch the provider and model picker hermes model -# → Select "xAI Grok OAuth (SuperGrok Subscription)" from the provider list +# → Select "xAI Grok OAuth (SuperGrok / X Premium+)" from the provider list # → Hermes opens your browser to accounts.x.ai # → Approve access in the browser # → Pick a model (grok-4.3 is at the top) @@ -111,7 +113,7 @@ The `◆ Auth Providers` section will show the current state of every provider, ```bash hermes model -# → Select "xAI Grok OAuth (SuperGrok Subscription)" +# → Select "xAI Grok OAuth (SuperGrok / X Premium+)" # → Pick from the model list (grok-4.3 is pinned to the top) ``` @@ -155,7 +157,7 @@ hermes tools # → Text-to-Speech → "xAI TTS" # → Image Generation → "xAI Grok Imagine (image)" # → Video Generation → "xAI Grok Imagine" -# → X (Twitter) Search → "xAI Grok OAuth (SuperGrok Subscription)" +# → X (Twitter) Search → "xAI Grok OAuth (SuperGrok / X Premium+)" ``` If OAuth tokens are already stored, the picker confirms it and skips the credential prompt. If neither OAuth nor `XAI_API_KEY` is set, the picker offers a 3-choice menu: OAuth login, paste API key, or skip. @@ -165,7 +167,7 @@ The `video_gen` toolset is disabled by default. Enable it in `hermes tools` → ::: :::note X search auto-enables when xAI credentials are present -The `x_search` toolset auto-enables whenever xAI credentials (a SuperGrok OAuth token or `XAI_API_KEY`) are configured. Disable explicitly via `hermes tools` → `🐦 X (Twitter) Search` (press space) if you don't want this. The tool routes through xAI's built-in `x_search` Responses API — it works with **either** your SuperGrok OAuth login or a paid `XAI_API_KEY`, and prefers OAuth when both are configured (uses your subscription quota instead of API spend). The tool schema is hidden from the model when no xAI credentials are configured, regardless of whether the toolset is enabled. +The `x_search` toolset auto-enables whenever xAI credentials (a SuperGrok / X Premium+ OAuth token or `XAI_API_KEY`) are configured. Disable explicitly via `hermes tools` → `🐦 X (Twitter) Search` (press space) if you don't want this. The tool routes through xAI's built-in `x_search` Responses API — it works with **either** your SuperGrok / X Premium+ OAuth login or a paid `XAI_API_KEY`, and prefers OAuth when both are configured (uses your subscription quota instead of API spend). The tool schema is hidden from the model when no xAI credentials are configured, regardless of whether the toolset is enabled. ::: ### Models diff --git a/website/docs/integrations/providers.md b/website/docs/integrations/providers.md index 345d50868ae..6969bcc7e60 100644 --- a/website/docs/integrations/providers.md +++ b/website/docs/integrations/providers.md @@ -347,7 +347,7 @@ When using the Z.AI / GLM provider, Hermes automatically probes multiple endpoin xAI is wired through the Responses API (`codex_responses` transport) for automatic reasoning support on Grok 4 models — no `reasoning_effort` parameter needed, the server reasons by default. Set `XAI_API_KEY` in `~/.hermes/.env` and pick xAI in `hermes model`, or drop `grok` as a shortcut into `/model grok-4-1-fast-reasoning`. -SuperGrok subscribers can sign in with browser OAuth instead of using an API key — pick **xAI Grok OAuth (SuperGrok Subscription)** in `hermes model`, or run `hermes auth add xai-oauth`. The same OAuth bearer token is automatically reused by direct-to-xAI tools (TTS, image gen, video gen, transcription). See the [xAI Grok OAuth guide](../guides/xai-grok-oauth.md) for the full flow — and if Hermes runs on a remote host, also see [OAuth over SSH / Remote Hosts](../guides/oauth-over-ssh.md) for the required `ssh -L` tunnel. +SuperGrok and X Premium+ subscribers can sign in with browser OAuth instead of using an API key — pick **xAI Grok OAuth (SuperGrok Subscription)** in `hermes model`, or run `hermes auth add xai-oauth`. The same OAuth bearer token is automatically reused by direct-to-xAI tools (TTS, image gen, video gen, transcription). See the [xAI Grok OAuth guide](../guides/xai-grok-oauth.md) for the full flow — and if Hermes runs on a remote host, also see [OAuth over SSH / Remote Hosts](../guides/oauth-over-ssh.md) for the required `ssh -L` tunnel. When using xAI as a provider (any base URL containing `x.ai`), Hermes automatically enables prompt caching by sending the `x-grok-conv-id` header with every API request. This routes requests to the same server within a conversation session, allowing xAI's infrastructure to reuse cached system prompts and conversation history. diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md index 40f924aa209..61fde178737 100644 --- a/website/docs/user-guide/configuration.md +++ b/website/docs/user-guide/configuration.md @@ -836,7 +836,7 @@ Available providers for auxiliary tasks: `auto`, `main`, plus any provider in th ::: :::tip xAI Grok OAuth -`xai-oauth` logs in via browser OAuth for SuperGrok subscribers (no API key needed). Run `hermes model` and select **xAI Grok OAuth (SuperGrok Subscription)** to authenticate. The same OAuth token is reused for every direct-to-xAI surface (chat, auxiliary tasks, TTS, image gen, video gen, transcription). See the [xAI Grok OAuth guide](../guides/xai-grok-oauth.md), and if Hermes is on a remote host see [OAuth over SSH / Remote Hosts](../guides/oauth-over-ssh.md). +`xai-oauth` logs in via browser OAuth for SuperGrok and X Premium+ subscribers (no API key needed). Run `hermes model` and select **xAI Grok OAuth (SuperGrok Subscription)** to authenticate. The same OAuth token is reused for every direct-to-xAI surface (chat, auxiliary tasks, TTS, image gen, video gen, transcription). See the [xAI Grok OAuth guide](../guides/xai-grok-oauth.md), and if Hermes is on a remote host see [OAuth over SSH / Remote Hosts](../guides/oauth-over-ssh.md). ::: :::warning `"main"` is for auxiliary tasks only @@ -962,7 +962,7 @@ These options apply to **auxiliary task configs** (`auxiliary:`, `compression:`, | `"nous"` | Force Nous Portal | `hermes auth` | | `"codex"` | Force Codex OAuth (ChatGPT account). Supports vision (gpt-5.3-codex). | `hermes model` → Codex | | `"minimax-oauth"` | Force MiniMax OAuth (browser login, no API key). Uses MiniMax-M2.7-highspeed for auxiliary tasks. | `hermes model` → MiniMax (OAuth) | -| `"xai-oauth"` | Force xAI Grok OAuth (browser login for SuperGrok subscribers, no API key). Same OAuth token covers chat, TTS, image, video, and transcription. | `hermes model` → xAI Grok OAuth (SuperGrok Subscription) | +| `"xai-oauth"` | Force xAI Grok OAuth (browser login for SuperGrok or X Premium+ subscribers, no API key). Same OAuth token covers chat, TTS, image, video, and transcription. | `hermes model` → xAI Grok OAuth (SuperGrok Subscription) | | `"main"` | Use your active custom/main endpoint. This can come from `OPENAI_BASE_URL` + `OPENAI_API_KEY` or from a custom endpoint saved via `hermes model` / `config.yaml`. Works with OpenAI, local models, or any OpenAI-compatible API. **Auxiliary tasks only — not valid for `model.provider`.** | Custom endpoint credentials + base URL | Direct API-key providers from the main provider catalog also work here when you want side tasks to bypass your default router. `gmi` is valid once `GMI_API_KEY` is configured: diff --git a/website/docs/user-guide/features/x-search.md b/website/docs/user-guide/features/x-search.md index f241c808f17..49479fbf6f2 100644 --- a/website/docs/user-guide/features/x-search.md +++ b/website/docs/user-guide/features/x-search.md @@ -17,7 +17,7 @@ The `x_search` tool lets the agent search X (Twitter) posts, profiles, and threa | Credential | Source | Setup | |------------|--------|-------| -| **SuperGrok OAuth** (preferred) | Browser login at `accounts.x.ai`, refreshed automatically | `hermes auth add xai-oauth` — see [xAI Grok OAuth (SuperGrok Subscription)](../../guides/xai-grok-oauth.md) | +| **SuperGrok / X Premium+ OAuth** (preferred) | Browser login at `accounts.x.ai`, refreshed automatically | `hermes auth add xai-oauth` — see [xAI Grok OAuth (SuperGrok / X Premium+)](../../guides/xai-grok-oauth.md) | | **`XAI_API_KEY`** | Paid xAI API key | Set in `~/.hermes/.env` | Both hit the same endpoint with the same payload — the only difference is the bearer token. **When both are configured, SuperGrok OAuth wins** so x_search runs against your subscription quota instead of paid API spend. From 697d38a3f4bb1bfbea3d75e31ffb3fa1dd221798 Mon Sep 17 00:00:00 2001 From: H-Ali13381 Date: Thu, 7 May 2026 21:40:05 -0400 Subject: [PATCH 02/29] feat: auto-launch Chromium-family browser for CDP Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable. Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support. Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing. --- cli.py | 88 +++++------ hermes_cli/browser_connect.py | 149 ++++++++++++++---- hermes_cli/commands.py | 2 +- hermes_cli/tips.py | 4 +- tests/cli/test_cli_browser_connect.py | 142 ++++++++++++++++- tests/test_tui_gateway_server.py | 12 +- tools/browser_camofox.py | 2 +- tools/browser_cdp_tool.py | 21 +-- tools/browser_dialog_tool.py | 4 +- tui_gateway/server.py | 16 +- .../createGatewayEventHandler.test.ts | 4 +- .../src/__tests__/createSlashHandler.test.ts | 10 +- ui-tui/src/app/slash/commands/ops.ts | 4 +- .../developer-guide/browser-supervisor.md | 6 +- website/docs/integrations/index.md | 2 +- website/docs/reference/slash-commands.md | 2 +- website/docs/user-guide/configuration.md | 8 +- website/docs/user-guide/features/browser.md | 44 ++++-- website/docs/user-guide/features/overview.md | 2 +- 19 files changed, 373 insertions(+), 149 deletions(-) diff --git a/cli.py b/cli.py index a700f0ddded..033a60077f0 100644 --- a/cli.py +++ b/cli.py @@ -105,6 +105,7 @@ _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧ from hermes_constants import get_hermes_home, display_hermes_home from hermes_cli.browser_connect import ( DEFAULT_BROWSER_CDP_URL, + is_browser_debug_ready, manual_chrome_debug_command, try_launch_chrome_debug, ) @@ -8454,10 +8455,10 @@ class HermesCLI: @staticmethod def _try_launch_chrome_debug(port: int, system: str) -> bool: - """Try to launch Chrome/Chromium with remote debugging enabled. + """Try to launch a Chromium-family browser with remote debugging enabled. Uses a dedicated user-data-dir so the debug instance doesn't conflict - with an already-running Chrome using the default profile. + with an already-running browser using the default profile. Returns True if a launch command was executed (doesn't guarantee success). """ @@ -8502,7 +8503,7 @@ class HermesCLI: ) def _handle_browser_command(self, cmd: str): - """Handle /browser connect|disconnect|status — manage live Chrome CDP connection.""" + """Handle /browser connect|disconnect|status — manage live Chromium-family CDP connection.""" import platform as _plat parts = cmd.strip().split(None, 1) @@ -8556,56 +8557,42 @@ class HermesCLI: print() - # Check if Chrome is already listening on the debug port - import socket - _already_open = False - try: - s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) - s.settimeout(1) - s.connect((_host, _port)) - s.close() - _already_open = True - except (OSError, socket.timeout): - pass + # Check if a Chromium-family browser is already serving CDP on the debug port + _already_open = is_browser_debug_ready(cdp_url, timeout=1.0) if _already_open: - print(f" ✓ Chrome is already listening on port {_port}") + print(f" ✓ Chromium-family browser is already listening on port {_port}") elif cdp_url == _DEFAULT_CDP: - # Try to auto-launch Chrome with remote debugging - print(" Chrome isn't running with remote debugging — attempting to launch...") + # Try to auto-launch a Chromium-family browser with remote debugging + print(" Chromium-family browser isn't running with remote debugging — attempting to launch...") _launched = self._try_launch_chrome_debug(_port, _plat.system()) if _launched: - # Wait for the port to come up + # Wait for the DevTools discovery endpoint to come up for _wait in range(10): - try: - s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) - s.settimeout(1) - s.connect((_host, _port)) - s.close() + if is_browser_debug_ready(cdp_url, timeout=1.0): _already_open = True break - except (OSError, socket.timeout): - time.sleep(0.5) + time.sleep(0.5) if _already_open: - print(f" ✓ Chrome launched and listening on port {_port}") + print(f" ✓ Chromium-family browser launched and listening on port {_port}") else: - print(f" ⚠ Chrome launched but port {_port} isn't responding yet") + print(f" ⚠ Browser launched but port {_port} isn't responding yet") print(" Try again in a few seconds — the debug instance may still be starting") else: - print(" ⚠ Could not auto-launch Chrome") + print(" ⚠ Could not auto-launch a Chromium-family browser") sys_name = _plat.system() chrome_cmd = manual_chrome_debug_command(_port, sys_name) if chrome_cmd: - print(f" Launch Chrome manually:") + print(f" Launch a Chromium-family browser manually:") print(f" {chrome_cmd}") else: - print(" No Chrome/Chromium executable found in this environment") + print(" No supported Chromium-family browser executable found in this environment") else: print(f" ⚠ Port {_port} is not reachable at {cdp_url}") if not _already_open: print() - print("Browser not connected — start Chrome with remote debugging and retry /browser connect") + print("Browser not connected — start a Chromium-family browser with remote debugging and retry /browser connect") print() return @@ -8618,20 +8605,23 @@ class HermesCLI: except Exception: pass print() - print("🌐 Browser connected to live Chrome via CDP") + print("🌐 Browser connected to live Chromium-family browser via CDP") print(f" Endpoint: {cdp_url}") print() - # Inject context message so the model knows + # Inject context message so the model knows this slash command + # intentionally makes the dev/debug CDP browser available for use. if hasattr(self, '_pending_input'): self._pending_input.put( - "[System note: The user has connected your browser tools to their live Chrome browser " - "via Chrome DevTools Protocol. Your browser_navigate, browser_snapshot, browser_click, " - "and other browser tools now control their real browser — including any pages they have " - "open, logged-in sessions, and cookies. They likely opened specific sites or logged into " - "services before connecting. Please await their instruction before attempting to operate " - "the browser. When you do act, be mindful that your actions affect their real browser — " - "don't close tabs or navigate away from pages without asking.]" + "[System note: The user invoked /browser connect and connected your browser tools to " + "a Chromium-family dev/debug browser via Chrome DevTools Protocol. " + "Your browser_navigate, browser_snapshot, browser_click, and other browser tools now " + "control that CDP browser. The command itself is a signal that using browser tools for " + "their current browser-related request is expected; do not wait for separate permission " + "just because CDP is connected. This is typically a Hermes-managed isolated debug " + "profile, not the user's main everyday browser. It is still user-visible and may contain " + "pages, logged-in sessions, or cookies in that debug profile, so avoid destructive actions, " + "closing tabs, or navigating away unless the user's task calls for it.]" ) elif sub == "disconnect": @@ -8644,24 +8634,24 @@ class HermesCLI: except Exception: pass print() - print("🌐 Browser disconnected from live Chrome") + print("🌐 Browser disconnected from live Chromium-family browser") print(" Browser tools reverted to default mode (local headless or cloud provider)") print() if hasattr(self, '_pending_input'): self._pending_input.put( - "[System note: The user has disconnected the browser tools from their live Chrome. " + "[System note: The user has disconnected the browser tools from their live Chromium-family browser. " "Browser tools are back to default mode (headless local browser or cloud provider).]" ) else: print() - print("Browser is not connected to live Chrome (already using default mode)") + print("Browser is not connected to a live Chromium-family browser (already using default mode)") print() elif sub == "status": print() if current: - print("🌐 Browser: connected to live Chrome via CDP") + print("🌐 Browser: connected to live Chromium-family browser via CDP") print(f" Endpoint: {current}") _port = 9222 @@ -8677,7 +8667,7 @@ class HermesCLI: s.close() print(" Status: ✓ reachable") except (OSError, Exception): - print(" Status: ⚠ not reachable (Chrome may not be running)") + print(" Status: ⚠ not reachable (browser may not be running)") else: try: from tools.browser_tool import _get_cloud_provider @@ -8697,13 +8687,13 @@ class HermesCLI: if engine == "lightpanda": print("🌐 Browser: local Lightpanda (agent-browser --engine lightpanda)") print(" ⚡ Lightpanda: faster navigation, no screenshot support") - print(" Automatic Chrome fallback for screenshots and failed commands") + print(" Automatic Chromium fallback for screenshots and failed commands") elif engine == "chrome": - print("🌐 Browser: local headless Chrome (agent-browser --engine chrome)") + print("🌐 Browser: local headless Chromium (agent-browser --engine chrome)") else: print("🌐 Browser: local headless Chromium (agent-browser)") print() - print(" /browser connect — connect to your live Chrome") + print(" /browser connect — connect to your live Chromium-family browser") print(" /browser disconnect — revert to default") print() @@ -8711,7 +8701,7 @@ class HermesCLI: print() print("Usage: /browser connect|disconnect|status") print() - print(" connect Connect browser tools to your live Chrome session") + print(" connect Connect browser tools to your live Chromium-family browser session") print(" disconnect Revert to default browser backend") print(" status Show current browser mode") print() diff --git a/hermes_cli/browser_connect.py b/hermes_cli/browser_connect.py index 89c9d2c6521..7ed4f2e4da4 100644 --- a/hermes_cli/browser_connect.py +++ b/hermes_cli/browser_connect.py @@ -1,4 +1,4 @@ -"""Shared helpers for attaching Hermes to a local Chrome CDP port.""" +"""Shared helpers for attaching Hermes to a local Chromium-family CDP port.""" from __future__ import annotations @@ -21,23 +21,53 @@ _DARWIN_APPS = ( "/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge", ) -_WINDOWS_INSTALL_PARTS = ( - ("Google", "Chrome", "Application", "chrome.exe"), - ("Chromium", "Application", "chrome.exe"), - ("Chromium", "Application", "chromium.exe"), - ("BraveSoftware", "Brave-Browser", "Application", "brave.exe"), - ("Microsoft", "Edge", "Application", "msedge.exe"), +_WINDOWS_BROWSER_GROUPS = ( + (("chrome.exe", "chrome"), (("Google", "Chrome", "Application", "chrome.exe"),)), + ( + ("chromium.exe", "chromium"), + (("Chromium", "Application", "chrome.exe"), ("Chromium", "Application", "chromium.exe")), + ), + (("brave.exe", "brave"), (("BraveSoftware", "Brave-Browser", "Application", "brave.exe"),)), + (("msedge.exe", "msedge"), (("Microsoft", "Edge", "Application", "msedge.exe"),)), ) -_LINUX_BIN_NAMES = ( - "google-chrome", "google-chrome-stable", "chromium-browser", - "chromium", "brave-browser", "microsoft-edge", +_WINDOWS_BIN_NAMES = tuple(name for names, _ in _WINDOWS_BROWSER_GROUPS for name in names) +_WINDOWS_INSTALL_PARTS = tuple(parts for _, group in _WINDOWS_BROWSER_GROUPS for parts in group) + +_LINUX_BROWSER_GROUPS = ( + ( + ("google-chrome", "google-chrome-stable"), + ("/opt/google/chrome/chrome", "/usr/bin/google-chrome", "/usr/bin/google-chrome-stable"), + ), + ( + ("chromium-browser", "chromium"), + ("/usr/bin/chromium-browser", "/usr/bin/chromium"), + ), + ( + ("brave-browser", "brave-browser-stable", "brave"), + ( + "/usr/bin/brave-browser", + "/usr/bin/brave-browser-stable", + "/usr/bin/brave", + "/snap/bin/brave", + "/opt/brave.com/brave/brave-browser", + "/opt/brave.com/brave/brave", + "/opt/brave-bin/brave", + ), + ), + ( + ("microsoft-edge", "microsoft-edge-stable", "msedge"), + ( + "/usr/bin/microsoft-edge", + "/usr/bin/microsoft-edge-stable", + "/opt/microsoft/msedge/microsoft-edge", + "/opt/microsoft/msedge/msedge", + ), + ), ) -_WINDOWS_BIN_NAMES = ( - "chrome.exe", "msedge.exe", "brave.exe", "chromium.exe", - "chrome", "msedge", "brave", "chromium", -) +_LINUX_BIN_NAMES = tuple(name for names, _ in _LINUX_BROWSER_GROUPS for name in names) +_LINUX_INSTALL_PATHS = tuple(path for _, paths in _LINUX_BROWSER_GROUPS for path in paths) def get_chrome_debug_candidates(system: str) -> list[str]: @@ -53,10 +83,14 @@ def get_chrome_debug_candidates(system: str) -> list[str]: candidates.append(path) seen.add(normalized) - def add_install_paths(bases: tuple[str | None, ...]) -> None: - for base in filter(None, bases): - for parts in _WINDOWS_INSTALL_PARTS: - add(os.path.join(base, *parts)) + def add_windows_install_paths( + bases: tuple[str | None, ...], + install_groups: tuple[tuple[tuple[str, ...], tuple[tuple[str, ...], ...]], ...], + ) -> None: + for _, group in install_groups: + for base in filter(None, bases): + for parts in group: + add(os.path.join(base, *parts)) if system == "Darwin": for app in _DARWIN_APPS: @@ -64,18 +98,25 @@ def get_chrome_debug_candidates(system: str) -> list[str]: return candidates if system == "Windows": - for name in _WINDOWS_BIN_NAMES: - add(shutil.which(name)) - add_install_paths(( + install_bases = ( os.environ.get("ProgramFiles"), os.environ.get("ProgramFiles(x86)"), os.environ.get("LOCALAPPDATA"), - )) + ) + for names, install_parts in _WINDOWS_BROWSER_GROUPS: + for name in names: + add(shutil.which(name)) + for base in filter(None, install_bases): + for parts in install_parts: + add(os.path.join(base, *parts)) return candidates - for name in _LINUX_BIN_NAMES: - add(shutil.which(name)) - add_install_paths(("/mnt/c/Program Files", "/mnt/c/Program Files (x86)")) + for names, paths in _LINUX_BROWSER_GROUPS: + for name in names: + add(shutil.which(name)) + for path in paths: + add(path) + add_windows_install_paths(("/mnt/c/Program Files", "/mnt/c/Program Files (x86)"), _WINDOWS_BROWSER_GROUPS) return candidates @@ -92,6 +133,42 @@ def _chrome_debug_args(port: int) -> list[str]: ] +def is_browser_debug_ready(url: str, timeout: float = 1.0) -> bool: + """Return True when ``url`` exposes a reachable Chrome DevTools endpoint.""" + import socket + import urllib.request + from urllib.parse import urlparse + + parsed = urlparse(url if "://" in url else f"http://{url}") + try: + port = parsed.port or (443 if parsed.scheme in {"https", "wss"} else 80) + except ValueError: + return False + + if parsed.scheme in {"ws", "wss"} and parsed.path.startswith("/devtools/browser/"): + if not parsed.hostname: + return False + try: + with socket.create_connection((parsed.hostname, port), timeout=timeout): + return True + except OSError: + return False + + scheme = {"ws": "http", "wss": "https"}.get(parsed.scheme, parsed.scheme) + if scheme not in {"http", "https"} or not parsed.netloc: + return False + + root = f"{scheme}://{parsed.netloc}".rstrip("/") + for probe in (f"{root}/json/version", f"{root}/json"): + try: + with urllib.request.urlopen(probe, timeout=timeout) as resp: + if 200 <= getattr(resp, "status", 200) < 300: + return True + except Exception: + continue + return False + + def manual_chrome_debug_command(port: int = DEFAULT_BROWSER_CDP_PORT, system: str | None = None) -> str | None: system = system or platform.system() candidates = get_chrome_debug_candidates(system) @@ -126,13 +203,15 @@ def try_launch_chrome_debug(port: int = DEFAULT_BROWSER_CDP_PORT, system: str | return False os.makedirs(chrome_debug_data_dir(), exist_ok=True) - try: - subprocess.Popen( - [candidates[0], *_chrome_debug_args(port)], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - **_detach_kwargs(system), - ) - return True - except Exception: - return False + for candidate in candidates: + try: + subprocess.Popen( + [candidate, *_chrome_debug_args(port)], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + **_detach_kwargs(system), + ) + return True + except Exception: + continue + return False diff --git a/hermes_cli/commands.py b/hermes_cli/commands.py index 9fc0472f512..03e3df81b9b 100644 --- a/hermes_cli/commands.py +++ b/hermes_cli/commands.py @@ -187,7 +187,7 @@ COMMAND_REGISTRY: list[CommandDef] = [ aliases=("reload_mcp",)), CommandDef("reload-skills", "Re-scan ~/.hermes/skills/ for newly installed or removed skills", "Tools & Skills", aliases=("reload_skills",)), - CommandDef("browser", "Connect browser tools to your live Chrome via CDP", "Tools & Skills", + CommandDef("browser", "Connect browser tools to your live Chromium-family browser via CDP", "Tools & Skills", cli_only=True, args_hint="[connect|disconnect|status]", subcommands=("connect", "disconnect", "status")), CommandDef("plugins", "List installed plugins and their status", diff --git a/hermes_cli/tips.py b/hermes_cli/tips.py index 060c441a150..2871cc4af8f 100644 --- a/hermes_cli/tips.py +++ b/hermes_cli/tips.py @@ -31,7 +31,7 @@ TIPS = [ "/skin changes the CLI theme — try ares, mono, slate, poseidon, or charizard.", "/statusbar toggles a persistent bar showing model, tokens, context fill %, cost, and duration.", "/tools disable browser temporarily removes browser tools for the current session.", - "/browser connect attaches browser tools to your running Chrome instance via CDP.", + "/browser connect attaches browser tools to your running Chromium-family browser via CDP.", "/plugins lists installed plugins and their status.", "/cron manages scheduled tasks — set up recurring prompts with delivery to any platform.", "/reload-mcp hot-reloads MCP server configuration without restarting.", @@ -300,7 +300,7 @@ TIPS = [ "Container mode: place .container-mode in HERMES_HOME and the host CLI auto-execs into the container.", "Ctrl+C has 5 priority tiers: cancel recording → cancel prompts → cancel picker → interrupt agent → exit.", "Every interrupt during an agent run is logged to ~/.hermes/interrupt_debug.log with timestamps.", - "BROWSER_CDP_URL connects browser tools to any running Chrome — accepts WebSocket, HTTP, or host:port.", + "BROWSER_CDP_URL connects browser tools to any running Chromium-family browser — accepts WebSocket, HTTP, or host:port.", "BROWSERBASE_ADVANCED_STEALTH=true enables advanced anti-detection with custom Chromium (Scale Plan).", "The CLI auto-switches to compact mode in terminals narrower than 80 columns.", "Quick commands support two types: exec (run shell command directly) and alias (redirect to another command).", diff --git a/tests/cli/test_cli_browser_connect.py b/tests/cli/test_cli_browser_connect.py index cf9471d5843..09d009ca751 100644 --- a/tests/cli/test_cli_browser_connect.py +++ b/tests/cli/test_cli_browser_connect.py @@ -1,11 +1,18 @@ """Tests for CLI browser CDP auto-launch helpers.""" +from contextlib import redirect_stdout +from io import StringIO import os +from queue import Queue import subprocess from unittest.mock import patch from cli import HermesCLI -from hermes_cli.browser_connect import manual_chrome_debug_command +from hermes_cli.browser_connect import ( + get_chrome_debug_candidates, + is_browser_debug_ready, + manual_chrome_debug_command, +) def _assert_chrome_debug_cmd(cmd, expected_chrome, expected_port): @@ -19,7 +26,35 @@ def _assert_chrome_debug_cmd(cmd, expected_chrome, expected_port): assert "chrome-debug" in user_data_args[0] +class _FakeResponse: + status = 200 + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc, tb): + return False + + class TestChromeDebugLaunch: + def test_browser_debug_ready_requires_http_cdp_endpoint(self): + requested = [] + + def fake_urlopen(url, timeout): + requested.append(url) + if url.endswith("/json/version"): + return _FakeResponse() + raise OSError("unexpected probe") + + with patch("urllib.request.urlopen", side_effect=fake_urlopen): + assert is_browser_debug_ready("http://127.0.0.1:9222", timeout=0.1) is True + + assert requested == ["http://127.0.0.1:9222/json/version"] + + def test_browser_debug_ready_rejects_non_cdp_listener(self): + with patch("urllib.request.urlopen", side_effect=OSError("not cdp")): + assert is_browser_debug_ready("http://127.0.0.1:9222", timeout=0.1) is False + def test_windows_launch_uses_browser_found_on_path(self): captured = {} @@ -72,6 +107,86 @@ class TestChromeDebugLaunch: assert command is not None assert command.startswith("/usr/bin/chromium --remote-debugging-port=9222") + def test_linux_candidates_prefer_chrome_before_brave_when_both_exist(self): + chrome = "/usr/bin/google-chrome" + brave = "/usr/bin/brave-browser" + + def fake_which(name): + return {"google-chrome": chrome, "brave-browser": brave}.get(name) + + with patch("hermes_cli.browser_connect.shutil.which", side_effect=fake_which), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path in {chrome, brave}): + candidates = get_chrome_debug_candidates("Linux") + command = manual_chrome_debug_command(9222, "Linux") + + assert candidates[:2] == [chrome, brave] + assert command is not None + assert command.startswith(f"{chrome} --remote-debugging-port=9222") + + def test_linux_candidates_prefer_chrome_install_path_before_brave_on_path(self): + chrome = "/opt/google/chrome/chrome" + brave = "/usr/bin/brave-browser" + + with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: brave if name == "brave-browser" else None), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path in {chrome, brave}): + candidates = get_chrome_debug_candidates("Linux") + + assert candidates[:2] == [chrome, brave] + + def test_windows_candidates_prefer_chrome_install_path_before_brave_on_path(self, monkeypatch): + program_files = r"C:\Program Files" + chrome = os.path.join(program_files, "Google", "Chrome", "Application", "chrome.exe") + brave = r"C:\Brave\brave.exe" + + monkeypatch.setenv("ProgramFiles", program_files) + monkeypatch.delenv("ProgramFiles(x86)", raising=False) + monkeypatch.delenv("LOCALAPPDATA", raising=False) + + with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: brave if name == "brave.exe" else None), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path in {chrome, brave}): + candidates = get_chrome_debug_candidates("Windows") + + assert candidates[:2] == [chrome, brave] + + def test_linux_candidates_include_arch_brave_install_path(self): + brave = "/opt/brave-bin/brave" + + with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == brave): + candidates = get_chrome_debug_candidates("Linux") + command = manual_chrome_debug_command(9222, "Linux") + + assert candidates == [brave] + assert command is not None + assert command.startswith(f"{brave} --remote-debugging-port=9222") + + def test_linux_candidates_include_official_brave_and_edge_stable_paths(self): + brave = "/usr/bin/brave-browser-stable" + edge = "/usr/bin/microsoft-edge-stable" + + with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path in {brave, edge}): + candidates = get_chrome_debug_candidates("Linux") + + assert candidates == [brave, edge] + + def test_launch_tries_next_browser_when_first_candidate_fails(self): + brave = "/usr/bin/brave-browser" + chrome = "/usr/bin/google-chrome" + attempts = [] + + def fake_popen(cmd, **kwargs): + attempts.append(cmd[0]) + if cmd[0] == brave: + raise OSError("broken brave install") + return object() + + with patch("hermes_cli.browser_connect.get_chrome_debug_candidates", return_value=[brave, chrome]), \ + patch("subprocess.Popen", side_effect=fake_popen): + assert HermesCLI._try_launch_chrome_debug(9222, "Linux") is True + + assert attempts == [brave, chrome] + def test_manual_command_uses_wsl_windows_chrome_when_available(self): chrome = "/mnt/c/Program Files/Google/Chrome/Application/chrome.exe" @@ -99,3 +214,28 @@ class TestChromeDebugLaunch: with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \ patch("hermes_cli.browser_connect.os.path.isfile", return_value=False): assert manual_chrome_debug_command(9222, "Linux") is None + + def test_connect_context_note_allows_expected_browser_use(self, monkeypatch): + """`/browser connect` is an instruction to use the CDP browser. + + The queued context note must not tell the model to wait for a second + permission step or imply that the attached browser is the user's main + everyday Chrome profile. + """ + cli = HermesCLI.__new__(HermesCLI) + cli._pending_input = Queue() + monkeypatch.delenv("BROWSER_CDP_URL", raising=False) + + with patch("cli.is_browser_debug_ready", return_value=True), \ + patch("tools.browser_tool.cleanup_all_browsers"), \ + patch("tools.browser_tool._ensure_cdp_supervisor"), \ + redirect_stdout(StringIO()): + cli._handle_browser_command("/browser connect") + + note = cli._pending_input.get_nowait() + assert "Chromium-family" in note + assert "dev/debug" in note + assert "using browser tools for their current browser-related request is expected" in note + assert "live Chrome browser" not in note + assert "real browser" not in note + assert "Please await their instruction" not in note diff --git a/tests/test_tui_gateway_server.py b/tests/test_tui_gateway_server.py index 43d0643c779..fe8e189091c 100644 --- a/tests/test_tui_gateway_server.py +++ b/tests/test_tui_gateway_server.py @@ -3914,7 +3914,7 @@ def test_browser_manage_connect_sets_env_and_cleans_twice(monkeypatch): assert resp["result"]["connected"] is True assert resp["result"]["url"] == "http://127.0.0.1:9222" - assert resp["result"]["messages"] == ["Chrome is already listening on port 9222"] + assert resp["result"]["messages"] == ["Chromium-family browser is already listening on port 9222"] assert os.environ.get("BROWSER_CDP_URL") == "http://127.0.0.1:9222" # First cleanup runs against the OLD env (none here), second against the NEW. assert cleanup_calls == ["", "http://127.0.0.1:9222"] @@ -3934,7 +3934,7 @@ def test_browser_manage_connect_defaults_to_loopback(monkeypatch): assert resp["result"]["connected"] is True assert resp["result"]["url"] == "http://127.0.0.1:9222" - assert resp["result"]["messages"] == ["Chrome is already listening on port 9222"] + assert resp["result"]["messages"] == ["Chromium-family browser is already listening on port 9222"] assert urls[0] == "http://127.0.0.1:9222/json/version" @@ -3977,10 +3977,10 @@ def test_browser_manage_connect_default_local_reports_launch_hint(monkeypatch): assert resp["result"]["url"] == "http://127.0.0.1:9222" assert ( resp["result"]["messages"][0] - == "Chrome isn't running with remote debugging — attempting to launch..." + == "Chromium-family browser isn't running with remote debugging — attempting to launch..." ) assert any( - "No Chrome/Chromium executable was found" in line + "No supported Chromium-family browser executable was found" in line for line in resp["result"]["messages"] ) assert any( @@ -4107,8 +4107,8 @@ def test_browser_manage_connect_default_local_retries_after_launch(monkeypatch): assert resp["result"]["connected"] is True assert resp["result"]["url"] == "http://127.0.0.1:9222" assert resp["result"]["messages"] == [ - "Chrome isn't running with remote debugging — attempting to launch...", - "Chrome launched and listening on port 9222", + "Chromium-family browser isn't running with remote debugging — attempting to launch...", + "Chromium-family browser launched and listening on port 9222", ] assert os.environ["BROWSER_CDP_URL"] == "http://127.0.0.1:9222" diff --git a/tools/browser_camofox.py b/tools/browser_camofox.py index 071f1a2164b..45bf885def6 100644 --- a/tools/browser_camofox.py +++ b/tools/browser_camofox.py @@ -56,7 +56,7 @@ def get_camofox_url() -> str: def is_camofox_mode() -> bool: """True when Camofox backend is configured and no CDP override is active. - When the user has explicitly connected to a live Chrome instance via + When the user has explicitly connected to a live Chromium-family browser via ``/browser connect`` (which sets ``BROWSER_CDP_URL``), the CDP connection takes priority over Camofox so the browser tools operate on the real browser instead of being silently routed to the Camofox backend. diff --git a/tools/browser_cdp_tool.py b/tools/browser_cdp_tool.py index f10a1541923..e2aae88308f 100644 --- a/tools/browser_cdp_tool.py +++ b/tools/browser_cdp_tool.py @@ -358,8 +358,9 @@ def browser_cdp( if not endpoint: return tool_error( "No CDP endpoint is available. Run '/browser connect' to attach " - "to a running Chrome, or set 'browser.cdp_url' in config.yaml. " - "The Camofox backend is REST-only and does not expose CDP.", + "to a running Chrome, Brave, Chromium, or Edge browser, or set " + "'browser.cdp_url' in config.yaml. The Camofox backend is REST-only " + "and does not expose CDP.", cdp_docs=CDP_DOCS_URL, ) @@ -367,8 +368,8 @@ def browser_cdp( return tool_error( f"CDP endpoint is not a WebSocket URL: {endpoint!r}. " "Expected ws://... or wss://... — the /browser connect " - "resolver should have rewritten this. Check that Chrome is " - "actually listening on the debug port." + "resolver should have rewritten this. Check that a Chromium-family " + "browser is actually listening on the debug port." ) call_params: Dict[str, Any] = params or {} @@ -431,12 +432,12 @@ BROWSER_CDP_SCHEMA: Dict[str, Any] = { "browser operations not covered by browser_navigate, browser_click, " "browser_console, etc.\n\n" "**Requires a reachable CDP endpoint.** Available when the user has " - "run '/browser connect' to attach to a running Chrome, or when " - "'browser.cdp_url' is set in config.yaml. Not currently wired up for " - "cloud backends (Browserbase, Browser Use, Firecrawl) — those expose " - "CDP per session but live-session routing is a follow-up. Camofox is " - "REST-only and will never support CDP. If the tool is in your toolset " - "at all, a CDP endpoint is already reachable.\n\n" + "run '/browser connect' to attach to a running Chrome, Brave, Chromium, " + "or Edge browser, or when 'browser.cdp_url' is set in config.yaml. " + "Not currently wired up for cloud backends (Browserbase, Browser Use, " + "Firecrawl) — those expose CDP per session but live-session routing is " + "a follow-up. Camofox is REST-only and will never support CDP. If the " + "tool is in your toolset at all, a CDP endpoint is already reachable.\n\n" f"**CDP method reference:** {CDP_DOCS_URL} — use web_extract on a " "method's URL (e.g. '/tot/Page/#method-handleJavaScriptDialog') " "to look up parameters and return shape.\n\n" diff --git a/tools/browser_dialog_tool.py b/tools/browser_dialog_tool.py index 51ab0c4241e..e37337b9bba 100644 --- a/tools/browser_dialog_tool.py +++ b/tools/browser_dialog_tool.py @@ -6,7 +6,7 @@ accept or dismiss. Gated on the same ``_browser_cdp_check`` as ``browser_cdp`` so it only appears when a CDP endpoint is reachable (Browserbase with a -``connectUrl``, local Chrome via ``/browser connect``, or +``connectUrl``, local Chromium-family browser via ``/browser connect``, or ``browser.cdp_url`` set in config). See ``website/docs/developer-guide/browser-supervisor.md`` for the full @@ -40,7 +40,7 @@ BROWSER_DIALOG_SCHEMA: Dict[str, Any] = { "happens when a second dialog fires while the first is still open), " "pass ``dialog_id`` from the snapshot to disambiguate.\n\n" "**Availability:** only present when a CDP-capable backend is " - "attached — Browserbase sessions, local Chrome via " + "attached — Browserbase sessions, local Chromium-family browser via " "``/browser connect``, or ``browser.cdp_url`` in config.yaml. " "Not available on Camofox (REST-only) or the default Playwright " "local browser (CDP port is hidden)." diff --git a/tui_gateway/server.py b/tui_gateway/server.py index 7b9f286380c..71a5d6f9417 100644 --- a/tui_gateway/server.py +++ b/tui_gateway/server.py @@ -6087,17 +6087,17 @@ def _failure_messages(url: str, port: int, system: str) -> list[str]: command = manual_chrome_debug_command(port, system) hint = ( - ["Start Chrome with remote debugging, then retry /browser connect:", command] + ["Start a Chromium-family browser with remote debugging, then retry /browser connect:", command] if command else [ - "No Chrome/Chromium executable was found in this environment.", - f"Install one or start Chrome with --remote-debugging-port={port}, then retry /browser connect.", + "No supported Chromium-family browser executable was found in this environment.", + f"Install one or start a Chromium-family browser with --remote-debugging-port={port}, then retry /browser connect.", ] ) return [ - f"Chrome is not reachable at {url}.", + f"Browser CDP is not reachable at {url}.", *hint, - "Browser not connected — start Chrome with remote debugging and retry /browser connect", + "Browser not connected — start a Chromium-family browser with remote debugging and retry /browser connect", ] @@ -6183,7 +6183,7 @@ def _browser_connect(rid, params: dict) -> dict: from hermes_cli.browser_connect import try_launch_chrome_debug announce( - "Chrome isn't running with remote debugging — attempting to launch..." + "Chromium-family browser isn't running with remote debugging — attempting to launch..." ) if try_launch_chrome_debug(port, system): @@ -6194,7 +6194,7 @@ def _browser_connect(rid, params: dict) -> dict: break if ok: - announce(f"Chrome launched and listening on port {port}") + announce(f"Chromium-family browser launched and listening on port {port}") else: for line in _failure_messages(url, port, system)[1:]: announce(line, level="error") @@ -6204,7 +6204,7 @@ def _browser_connect(rid, params: dict) -> dict: elif not ok: return _err(rid, 5031, f"could not reach browser CDP at {url}") elif _is_default_local_cdp(parsed): - announce(f"Chrome is already listening on port {port}") + announce(f"Chromium-family browser is already listening on port {port}") normalized = _normalize_cdp_url(parsed) diff --git a/ui-tui/src/__tests__/createGatewayEventHandler.test.ts b/ui-tui/src/__tests__/createGatewayEventHandler.test.ts index 0c7ec3b06ad..417b8c41b93 100644 --- a/ui-tui/src/__tests__/createGatewayEventHandler.test.ts +++ b/ui-tui/src/__tests__/createGatewayEventHandler.test.ts @@ -379,11 +379,11 @@ describe('createGatewayEventHandler', () => { const handler = createGatewayEventHandler(ctx) handler({ - payload: { message: 'Chrome launched and listening on port 9222' }, + payload: { message: 'Chromium-family browser launched and listening on port 9222' }, type: 'browser.progress' } as any) - expect(ctx.system.sys).toHaveBeenCalledWith('Chrome launched and listening on port 9222') + expect(ctx.system.sys).toHaveBeenCalledWith('Chromium-family browser launched and listening on port 9222') }) it('annotates gateway.start_timeout with stderr tail lines so users can diagnose without /logs', () => { diff --git a/ui-tui/src/__tests__/createSlashHandler.test.ts b/ui-tui/src/__tests__/createSlashHandler.test.ts index 8c1cee8bfd9..952f34fc38b 100644 --- a/ui-tui/src/__tests__/createSlashHandler.test.ts +++ b/ui-tui/src/__tests__/createSlashHandler.test.ts @@ -387,8 +387,8 @@ describe('createSlashHandler', () => { Promise.resolve({ connected: false, messages: [ - "Chrome isn't running with remote debugging — attempting to launch...", - 'Browser not connected — start Chrome with remote debugging and retry /browser connect' + "Chromium-family browser isn't running with remote debugging — attempting to launch...", + 'Browser not connected — start a Chromium-family browser with remote debugging and retry /browser connect' ], url: 'http://127.0.0.1:9222' }) @@ -397,14 +397,14 @@ describe('createSlashHandler', () => { const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } }) expect(createSlashHandler(ctx)('/browser connect')).toBe(true) - expect(ctx.transcript.sys).toHaveBeenCalledWith('checking Chrome remote debugging at http://127.0.0.1:9222...') + expect(ctx.transcript.sys).toHaveBeenCalledWith('checking Chromium-family browser remote debugging at http://127.0.0.1:9222...') await vi.waitFor(() => { expect(ctx.transcript.sys).toHaveBeenCalledWith( - "Chrome isn't running with remote debugging — attempting to launch..." + "Chromium-family browser isn't running with remote debugging — attempting to launch..." ) expect(ctx.transcript.sys).toHaveBeenCalledWith( - 'Browser not connected — start Chrome with remote debugging and retry /browser connect' + 'Browser not connected — start a Chromium-family browser with remote debugging and retry /browser connect' ) expect(ctx.transcript.sys).not.toHaveBeenCalledWith('browser connect failed') }) diff --git a/ui-tui/src/app/slash/commands/ops.ts b/ui-tui/src/app/slash/commands/ops.ts index d8f6522dc00..791a96c1d3b 100644 --- a/ui-tui/src/app/slash/commands/ops.ts +++ b/ui-tui/src/app/slash/commands/ops.ts @@ -155,7 +155,7 @@ export const opsCommands: SlashCommand[] = [ const url = action === 'connect' ? rest.join(' ').trim() || 'http://127.0.0.1:9222' : undefined if (url) { - ctx.transcript.sys(`checking Chrome remote debugging at ${url}...`) + ctx.transcript.sys(`checking Chromium-family browser remote debugging at ${url}...`) } ctx.gateway @@ -181,7 +181,7 @@ export const opsCommands: SlashCommand[] = [ } if (r.connected) { - ctx.transcript.sys('Browser connected to live Chrome via CDP') + ctx.transcript.sys('Browser connected to live Chromium-family browser via CDP') ctx.transcript.sys(`Endpoint: ${r.url || '(url unavailable)'}`) ctx.transcript.sys('next browser tool call will use this CDP endpoint') } diff --git a/website/docs/developer-guide/browser-supervisor.md b/website/docs/developer-guide/browser-supervisor.md index ba26d579bbb..8b56cf6bda8 100644 --- a/website/docs/developer-guide/browser-supervisor.md +++ b/website/docs/developer-guide/browser-supervisor.md @@ -217,6 +217,6 @@ Issue planned against `jo-inc/camofox-browser` adding: Unit tests use an asyncio mock CDP server that speaks enough of the protocol to exercise all state transitions: attach, enable, navigate, dialog fire, dialog dismiss, frame attach/detach, child target attach, session teardown. -Real-backend E2E (Browserbase + local Chrome) is manual — exercise via -`/browser connect` to a live Chrome and run the dialog/frame test cases -described above. +Real-backend E2E (Browserbase + local Chromium-family browser) is manual — exercise via +`/browser connect` to a live Chromium-family browser and run the dialog/frame +test cases described above. diff --git a/website/docs/integrations/index.md b/website/docs/integrations/index.md index d80a61abd8c..0b7ec938c17 100644 --- a/website/docs/integrations/index.md +++ b/website/docs/integrations/index.md @@ -46,7 +46,7 @@ Hermes includes full browser automation with multiple backend options for naviga - **Browserbase** — Managed cloud browsers with anti-bot tooling, CAPTCHA solving, and residential proxies - **Browser Use** — Alternative cloud browser provider -- **Local Chrome via CDP** — Connect to your running Chrome instance using `/browser connect` +- **Local Chromium-family CDP** — Connect to your running Chrome, Brave, Chromium, or Edge browser using `/browser connect` - **Local Chromium** — Headless local browser via the `agent-browser` CLI See [Browser Automation](/docs/user-guide/features/browser) for setup and usage. diff --git a/website/docs/reference/slash-commands.md b/website/docs/reference/slash-commands.md index d545c8242bd..3239aa43117 100644 --- a/website/docs/reference/slash-commands.md +++ b/website/docs/reference/slash-commands.md @@ -85,7 +85,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in |---------|-------------| | `/tools [list\|disable\|enable] [name...]` | Manage tools: list available tools, or disable/enable specific tools for the current session. Disabling a tool removes it from the agent's toolset and triggers a session reset. | | `/toolsets` | List available toolsets | -| `/browser [connect\|disconnect\|status]` | Manage local Chrome CDP connection. `connect` attaches browser tools to a running Chrome instance (default: `ws://localhost:9222`). `disconnect` detaches. `status` shows current connection. Auto-launches Chrome if no debugger is detected. | +| `/browser [connect\|disconnect\|status]` | Manage a local Chromium-family CDP connection. `connect` attaches browser tools to a running Chrome, Brave, Chromium, or Edge instance (default: `http://127.0.0.1:9222`). `disconnect` detaches. `status` shows current connection. Auto-launches a supported Chromium-family browser if no debugger is detected. | | `/skills` | Search, install, inspect, or manage skills from online registries | | `/cron` | Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove) | | `/curator` | Background skill maintenance — `status`, `run`, `pin`, `archive`. See [Curator](/docs/user-guide/features/curator). | diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md index 61fde178737..ad63ed84c09 100644 --- a/website/docs/user-guide/configuration.md +++ b/website/docs/user-guide/configuration.md @@ -1505,11 +1505,11 @@ browser: command_timeout: 30 # Timeout in seconds for browser commands (screenshot, navigate, etc.) record_sessions: false # Auto-record browser sessions as WebM videos to ~/.hermes/browser_recordings/ # Optional CDP override — when set, Hermes attaches directly to your own - # Chrome (via /browser connect) rather than starting a headless browser. + # Chromium-family browser (via /browser connect) rather than starting a headless browser. cdp_url: "" # Dialog supervisor — controls how native JS dialogs (alert / confirm / prompt) - # are handled when a CDP backend is attached (Browserbase, local Chrome via - # /browser connect). Ignored on Camofox and default local agent-browser mode. + # are handled when a CDP backend is attached (Browserbase, local Chromium-family + # browser via /browser connect). Ignored on Camofox and default local agent-browser mode. dialog_policy: must_respond # must_respond | auto_dismiss | auto_accept dialog_timeout_s: 300 # Safety auto-dismiss under must_respond (seconds) camofox: @@ -1527,7 +1527,7 @@ browser: See the [browser feature page](./features/browser.md#browser_dialog) for the full dialog workflow. -The browser toolset supports multiple providers. See the [Browser feature page](/docs/user-guide/features/browser) for details on Browserbase, Browser Use, and local Chrome CDP setup. +The browser toolset supports multiple providers. See the [Browser feature page](/docs/user-guide/features/browser) for details on Browserbase, Browser Use, and local Chromium-family CDP setup. ## Timezone diff --git a/website/docs/user-guide/features/browser.md b/website/docs/user-guide/features/browser.md index 1da4a8f2a36..296572f22f9 100644 --- a/website/docs/user-guide/features/browser.md +++ b/website/docs/user-guide/features/browser.md @@ -1,6 +1,6 @@ --- title: Browser Automation -description: Control browsers with multiple providers, local Chrome via CDP, or cloud browsers for web interaction, form filling, scraping, and more. +description: Control browsers with multiple providers, local Chromium-family browsers via CDP, or cloud browsers for web interaction, form filling, scraping, and more. sidebar_label: Browser sidebar_position: 5 --- @@ -13,7 +13,7 @@ Hermes Agent includes a full browser automation toolset with multiple backend op - **Browser Use cloud mode** via [Browser Use](https://browser-use.com) as an alternative cloud browser provider - **Firecrawl cloud mode** via [Firecrawl](https://firecrawl.dev) for cloud browsers with built-in scraping - **Camofox local mode** via [Camofox](https://github.com/jo-inc/camofox-browser) for local anti-detection browsing (Firefox-based fingerprint spoofing) -- **Local Chrome via CDP** — connect browser tools to your own Chrome instance using `/browser connect` +- **Local Chromium-family CDP** — connect browser tools to your own Chrome, Brave, Chromium, or Edge instance using `/browser connect` - **Local browser mode** via the `agent-browser` CLI and a local Chromium installation In all modes, the agent can navigate websites, interact with page elements, fill forms, and extract information. @@ -25,7 +25,7 @@ Pages are represented as **accessibility trees** (text-based snapshots), making Key capabilities: - **Multi-provider cloud execution** — Browserbase, Browser Use, or Firecrawl — no local browser needed -- **Local Chrome integration** — attach to your running Chrome via CDP for hands-on browsing +- **Local Chromium-family integration** — attach to your running Chrome, Brave, Chromium, or Edge browser via CDP for hands-on browsing - **Built-in stealth** — random fingerprints, CAPTCHA solving, residential proxies (Browserbase) - **Session isolation** — each task gets its own browser session - **Automatic cleanup** — inactive sessions are closed after a timeout @@ -285,9 +285,9 @@ Adoption only fires until `tab_id` is populated for the session. If the external When Camofox runs in headed mode (with a visible browser window), it exposes a VNC port in its health check response. Hermes automatically discovers this and includes the VNC URL in navigation responses, so the agent can share a link for you to watch the browser live. -### Local Chrome via CDP (`/browser connect`) +### Local Chromium-family browser via CDP (`/browser connect`) -Instead of a cloud provider, you can attach Hermes browser tools to your own running Chrome instance via the Chrome DevTools Protocol (CDP). This is useful when you want to see what the agent is doing in real-time, interact with pages that require your own cookies/sessions, or avoid cloud browser costs. +Instead of a cloud provider, you can attach Hermes browser tools to your own running Chrome, Brave, Chromium, or Edge instance via the Chrome DevTools Protocol (CDP). This is useful when you want to see what the agent is doing in real-time, interact with pages that require your own cookies/sessions, or avoid cloud browser costs. :::note `/browser connect` is an **interactive-CLI slash command** — it is not dispatched by the gateway. If you try to run it inside a WebUI, Telegram, Discord, or other gateway chat, the message will be sent to the agent as plain text and the command will not execute. Start Hermes from the terminal (`hermes` or `hermes chat`) and issue `/browser connect` there. @@ -296,26 +296,40 @@ Instead of a cloud provider, you can attach Hermes browser tools to your own run In the CLI, use: ``` -/browser connect # Connect to Chrome at ws://localhost:9222 +/browser connect # Auto-launch/connect to a local Chromium-family browser at http://127.0.0.1:9222 /browser connect ws://host:port # Connect to a specific CDP endpoint -/browser status # Check current connection -/browser disconnect # Detach and return to cloud/local mode +/browser status # Check current connection +/browser disconnect # Detach and return to cloud/local mode ``` -If Chrome isn't already running with remote debugging, Hermes will attempt to auto-launch it with `--remote-debugging-port=9222`. +If a browser isn't already running with remote debugging, Hermes will attempt to auto-launch a supported Chromium-family browser with `--remote-debugging-port=9222`. Detection includes Brave, Google Chrome, Chromium, and Microsoft Edge, with common Linux install paths such as `/opt/brave-bin/brave` and `/snap/bin/brave`. :::tip -To start Chrome manually with CDP enabled, use a dedicated user-data-dir so the debug port actually comes up even if Chrome is already running with your normal profile: +To start a Chromium-family browser manually with CDP enabled, use a dedicated user-data-dir so the debug port actually comes up even if the browser is already running with your normal profile: ```bash -# Linux +# Linux — Brave +brave-browser \ + --remote-debugging-port=9222 \ + --user-data-dir=$HOME/.hermes/chrome-debug \ + --no-first-run \ + --no-default-browser-check & + +# Linux — Google Chrome google-chrome \ --remote-debugging-port=9222 \ --user-data-dir=$HOME/.hermes/chrome-debug \ --no-first-run \ --no-default-browser-check & -# macOS +# macOS — Brave +"/Applications/Brave Browser.app/Contents/MacOS/Brave Browser" \ + --remote-debugging-port=9222 \ + --user-data-dir="$HOME/.hermes/chrome-debug" \ + --no-first-run \ + --no-default-browser-check & + +# macOS — Google Chrome "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \ --remote-debugging-port=9222 \ --user-data-dir="$HOME/.hermes/chrome-debug" \ @@ -325,10 +339,10 @@ google-chrome \ Then launch the Hermes CLI and run `/browser connect`. -**Why `--user-data-dir`?** Without it, launching Chrome while a regular Chrome instance is already running typically opens a new window on the existing process — and that existing process was not started with `--remote-debugging-port`, so port 9222 never opens. A dedicated user-data-dir forces a fresh Chrome process where the debug port actually listens. `--no-first-run --no-default-browser-check` skips the first-launch wizard for the fresh profile. +**Why `--user-data-dir`?** Without it, launching a Chromium-family browser while a regular instance is already running typically opens a new window on the existing process — and that existing process was not started with `--remote-debugging-port`, so port 9222 never opens. A dedicated user-data-dir forces a fresh browser process where the debug port actually listens. `--no-first-run --no-default-browser-check` skips the first-launch wizard for the fresh profile. ::: -When connected via CDP, all browser tools (`browser_navigate`, `browser_click`, etc.) operate on your live Chrome instance instead of spinning up a cloud session. +When connected via CDP, all browser tools (`browser_navigate`, `browser_click`, etc.) operate on your live browser instance instead of spinning up a cloud session. ### WSL2 + Windows Chrome: prefer MCP over `/browser connect` @@ -489,7 +503,7 @@ When a CDP supervisor is active for the current session (typical for any session Raw Chrome DevTools Protocol passthrough — the escape hatch for browser operations not covered by the other tools. Use for native dialog handling, iframe-scoped evaluation, cookie/network control, or any CDP verb the agent needs. -**Only available when a CDP endpoint is reachable at session start** — meaning `/browser connect` has attached to a running Chrome, or `browser.cdp_url` is set in `config.yaml`. The default local agent-browser mode, Camofox, and cloud providers (Browserbase, Browser Use, Firecrawl) do not currently expose CDP to this tool — cloud providers have per-session CDP URLs but live-session routing is a follow-up. +**Only available when a CDP endpoint is reachable at session start** — meaning `/browser connect` has attached to a running Chrome, Brave, Chromium, or Edge browser, or `browser.cdp_url` is set in `config.yaml`. The default local agent-browser mode, Camofox, and cloud providers (Browserbase, Browser Use, Firecrawl) do not currently expose CDP to this tool — cloud providers have per-session CDP URLs but live-session routing is a follow-up. **CDP method reference:** https://chromedevtools.github.io/devtools-protocol/ — the agent can `web_extract` a specific method's page to look up parameters and return shape. diff --git a/website/docs/user-guide/features/overview.md b/website/docs/user-guide/features/overview.md index c00d57148ce..5ad06641540 100644 --- a/website/docs/user-guide/features/overview.md +++ b/website/docs/user-guide/features/overview.md @@ -28,7 +28,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch ## Media & Web - **[Voice Mode](voice-mode.md)** — Full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels. -- **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information. +- **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome/Brave/Chromium/Edge via CDP, or local Chromium. Navigate websites, fill forms, and extract information. - **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model. - **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Nine models supported (FLUX 2 Klein/Pro, GPT-Image 1.5/2, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`. - **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with ten native provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, Mistral Voxtral, Google Gemini, xAI, NeuTTS, KittenTTS, and Piper — plus custom command providers for any local TTS CLI. From 6a6766fb896043ab3757a30713f1248b6054e0a6 Mon Sep 17 00:00:00 2001 From: H-Ali13381 Date: Mon, 18 May 2026 23:33:42 -0400 Subject: [PATCH 03/29] test(cli): cover Brave binary CDP launch detection --- tests/cli/test_cli_browser_connect.py | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tests/cli/test_cli_browser_connect.py b/tests/cli/test_cli_browser_connect.py index 09d009ca751..b4523b3778d 100644 --- a/tests/cli/test_cli_browser_connect.py +++ b/tests/cli/test_cli_browser_connect.py @@ -160,6 +160,18 @@ class TestChromeDebugLaunch: assert command is not None assert command.startswith(f"{brave} --remote-debugging-port=9222") + def test_linux_candidates_include_brave_binary_name(self): + brave = "/usr/bin/brave" + + with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: brave if name == "brave" else None), \ + patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == brave): + candidates = get_chrome_debug_candidates("Linux") + command = manual_chrome_debug_command(9222, "Linux") + + assert candidates == [brave] + assert command is not None + assert command.startswith(f"{brave} --remote-debugging-port=9222") + def test_linux_candidates_include_official_brave_and_edge_stable_paths(self): brave = "/usr/bin/brave-browser-stable" edge = "/usr/bin/microsoft-edge-stable" From 5e743559e0157df42e0f640cd06d736e898370d0 Mon Sep 17 00:00:00 2001 From: brooklyn! Date: Wed, 20 May 2026 01:46:40 -0500 Subject: [PATCH 04/29] fix(lint): skip per-file shell linter when LSP will handle the file (#29054) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(lint): skip per-file shell linter when LSP will handle the file `_check_lint` ran `npx tsc --noEmit FILE.ts` after every `.ts`/`.tsx` edit. `tsc` ignores `tsconfig.json` when given an explicit file argument (documented quirk) and defaults to no-lib / ES5, so every ES2015+ stdlib reference reports as missing: - `Cannot find global value 'Promise'` - `Cannot find name 'Map' / 'Set' / 'ReadonlySet' / 'Iterable'` - `Property 'isFinite' does not exist on type 'NumberConstructor'` - `Module 'phaser' can only be default-imported using esModuleInterop` - `import.meta is only allowed when --module is es2020+` On real TypeScript projects this floods the `lint` field on WriteResult / PatchResult with up to 25K tokens of false positives per edit. The delta filter in `_check_lint_delta` is supposed to mask them, but a tiny edit shifts line numbers and every phantom resurfaces as "introduced by this edit". The result is a 1MB+ phantom-error dump on every patch that eats the agent's context budget. Same shape for `.go` (`go vet` outside a module) and `.rs` (`rustfmt --check` outside a Cargo project). PR #24168 added an LSP tier on top of this — real `tsserver` / `gopls` / `rust-analyzer` diagnostics surface in the separate `lsp_diagnostics` field. But the broken shell linter kept running underneath, so the phantom-error dump kept happening even when LSP was giving us a clean authoritative signal. This change short-circuits the shell linter for the structurally-broken extensions (`.ts`, `.tsx`, `.go`, `.rs`) when an LSP server is active and claims the file via `LSPService.enabled_for(path)`. The LSP tier runs as before and carries the real diagnostics in `lsp_diagnostics`. Other shell linters (`py_compile`, `node --check`) keep running unconditionally — they're fast, file-local, and correct. Default behavior (LSP disabled, LSP misconfigured, remote backend, file outside a workspace) is unchanged — the existing fallback paths trigger when `_lsp_will_handle` returns False, so users who haven't opted into LSP get the same shell-linter behavior they had before. Drive-by: `.tsx` was missing from the `LINTERS` table entirely, so TS React files got no post-edit syntax check at all. Added it for symmetry; in practice it now hits the LSP-skip path. Tests: - `tests/agent/lsp/test_shell_linter_lsp_skip.py` — 14 tests covering: * skip happens for each redundant extension when LSP claims the file (asserted by patching `_exec` to raise on any shell-linter call) * shell linter still runs when LSP is inactive (regression guard) * `.py` / `.js` continue to run unconditionally even with LSP active * `_lsp_will_handle` is exception-safe: returns False on None service, remote backend, or `enabled_for` raising * `.tsx` is in both `LINTERS` and `_SHELL_LINTER_LSP_REDUNDANT` - All pre-existing tests in `tests/agent/lsp/` and `tests/tools/test_file_operations*.py` still pass (233/233). * fix(lint): address Copilot review on #29054 Two fixes from copilot-pull-request-reviewer on PR #29054: 1. `.tsx` regression with LSP disabled (https://github.com/NousResearch/hermes-agent/pull/29054#discussion_r3271017282) The first revision added `.tsx` to the `LINTERS` table so that TypeScript React files would hit the LSP skip path. Side effect: when LSP is *disabled* (the default), `.tsx` edits would suddenly run `npx tsc --noEmit FILE.tsx` and inherit the same phantom-error dump this PR is supposed to fix. Pre-PR behavior was implicit `skipped` (no `LINTERS` entry); restore that. - Remove `.tsx` from `LINTERS`. - Remove `.tsx` from `_SHELL_LINTER_LSP_REDUNDANT` (the skip path is unreachable without a `LINTERS` entry — falls through to `ext not in LINTERS` first). - When LSP IS enabled, `.tsx` is still covered by the LSP tier via `_maybe_lsp_diagnostics` (typescript-language-server's `extensions` tuple includes `.tsx`), so the diagnostics still surface — just on the `lsp_diagnostics` channel, not `lint`. - Update test_shell_linter_lsp_skip.py to reflect this contract (drop `.tsx` from the parametrize lists; add `test_tsx_stays_out_of_linters_table_for_default_compatibility` and `test_tsx_default_check_lint_returns_skipped`). 2. V4A patches dropped `WriteResult.lsp_diagnostics` (https://github.com/NousResearch/hermes-agent/pull/29054#discussion_r3271017295) `tools/patch_parser.py::apply_v4a_operations` calls `file_ops.write_file()` per operation, then calls `_check_lint()` directly afterwards — but never propagates `WriteResult.lsp_diagnostics` to the `PatchResult`. The shell-linter skip introduced in this PR makes the gap visible: a `.ts` / `.go` / `.rs` V4A patch with LSP active would return `lint = {f: {skipped: True}}` and zero diagnostics from any channel. - `_apply_add` and `_apply_update` now return `Tuple[bool, str, Optional[str]]` where the third element is `WriteResult.lsp_diagnostics` (or `None` on failure / no diags). - `_apply_delete` and `_apply_move` stay 2-tuples — they don't produce diagnostics, no write goes through `write_file`. - `apply_v4a_operations` accumulates per-file diagnostics blocks and surfaces a combined block on `PatchResult.lsp_diagnostics`. Each block already carries its `` header from `LSPService.report_for_file`, so concatenation preserves per-file attribution. Tests added (`test_patch_parser.py::TestV4ALspDiagnosticsPropagation`): - ADD op: `WriteResult.lsp_diagnostics` flows to `PatchResult` - UPDATE op: same - No diagnostics → `PatchResult.lsp_diagnostics is None` (not "") - Multi-file patch: combined block contains every per-file block Verification: - Targeted test scope: 257/257 pass (tests/agent/lsp/, tests/tools/test_file_operations*.py, tests/tools/test_patch_parser.py) - Wider sweep: 5400 pass; 11 failures all pre-existing on origin/main (file_staleness / file_read_guards / file_state_registry — unrelated macOS /var/folders tmp-path sensitivity issues, confirmed by re-running on a clean origin/main checkout) * docs(test): align shell-linter LSP skip docstring with .tsx behavior Copilot review feedback (review #4324947616, comment #3271049036): the test module docstring still listed .tsx alongside .ts/.go/.rs in the skip contract, but .tsx is now intentionally NOT in LINTERS or _SHELL_LINTER_LSP_REDUNDANT. Updated the bullet list to drop .tsx from the skip contract and added a paragraph documenting why .tsx is left out (preserves pre-PR implicit-skip behavior for LSP-disabled users; LSP coverage still happens via _maybe_lsp_diagnostics). * test(lsp): drop unused tmp_path from _make_fops helper Copilot review #3271069484: the helper accepted tmp_path but never used it. Callers still need tmp_path themselves for the file they're asserting against, so we just drop the helper's parameter. --- tests/agent/lsp/test_shell_linter_lsp_skip.py | 210 ++++++++++++++++++ tests/tools/test_patch_parser.py | 138 ++++++++++++ tools/file_operations.py | 85 +++++++ tools/patch_parser.py | 52 ++++- 4 files changed, 474 insertions(+), 11 deletions(-) create mode 100644 tests/agent/lsp/test_shell_linter_lsp_skip.py diff --git a/tests/agent/lsp/test_shell_linter_lsp_skip.py b/tests/agent/lsp/test_shell_linter_lsp_skip.py new file mode 100644 index 00000000000..a101fa9e1be --- /dev/null +++ b/tests/agent/lsp/test_shell_linter_lsp_skip.py @@ -0,0 +1,210 @@ +"""Skip the per-file shell linter when LSP will handle the same file. + +The per-file ``npx tsc --noEmit FILE.ts`` shell linter cannot see +``tsconfig.json`` (a documented ``tsc`` quirk: explicit file args bypass +the project config), so it defaults to no-lib / ES5 and floods the +agent's lint field with phantom "Cannot find 'Promise' / 'Map' / 'Set' / +'ReadonlySet' / 'Iterable' / 'imul' / …" errors on every edit — up to +25K tokens per patch. The LSP tier (``tsserver`` via +typescript-language-server) reads tsconfig correctly and surfaces real +diagnostics in the ``lsp_diagnostics`` field of the WriteResult / +PatchResult. + +These tests pin the contract: + + - When LSP is active AND ``enabled_for(path)`` for a ``.ts`` / ``.go`` + / ``.rs`` file, ``_check_lint`` returns ``skipped`` without invoking + the shell linter at all. + - When LSP is inactive or disabled-for-path, the shell linter runs + exactly as before (regression guard for the default config). + - The skip only applies to extensions in + ``_SHELL_LINTER_LSP_REDUNDANT`` — Python ``py_compile`` and + ``node --check`` keep running unconditionally because they're fast, + file-local, and correct. + - ``.tsx`` is intentionally NOT in either ``LINTERS`` or + ``_SHELL_LINTER_LSP_REDUNDANT``: it had no ``LINTERS`` entry + pre-PR (so it was already implicitly ``skipped`` via the + ``ext not in LINTERS`` branch) and adding one would have inherited + ``.ts``'s broken ``tsc --noEmit FILE`` invocation for LSP-disabled + users. When LSP IS enabled, ``.tsx`` is still covered by + typescript-language-server via ``_maybe_lsp_diagnostics`` — the + diagnostics show up on ``lsp_diagnostics``, not ``lint``. +""" +from __future__ import annotations + +from unittest.mock import MagicMock, patch + +import pytest + + +def _make_fops(): + from tools.environments.local import LocalEnvironment + from tools.file_operations import ShellFileOperations + return ShellFileOperations(LocalEnvironment()) + + +@pytest.mark.parametrize("ext", [".ts", ".go", ".rs"]) +def test_shell_linter_skipped_when_lsp_will_handle(ext, tmp_path): + """When LSP is active and enabled_for(path), shell linter is skipped. + + The shell linter's _exec must NOT be called — that's the whole + point. We assert by patching ``_exec`` to raise, so any accidental + invocation surfaces as a test failure. + """ + fops = _make_fops() + src = tmp_path / f"bad{ext}" + src.write_text("intentionally invalid content\n") + + def _exec_must_not_run(*args, **kwargs): # pragma: no cover + raise AssertionError( + "shell linter was invoked despite LSP claiming the file" + ) + + with patch.object(fops, "_lsp_will_handle", return_value=True), \ + patch.object(fops, "_exec", side_effect=_exec_must_not_run), \ + patch.object(fops, "_has_command", return_value=True): + result = fops._check_lint(str(src)) + + assert result.skipped is True + assert "LSP" in (result.message or "") + + +@pytest.mark.parametrize("ext", [".ts", ".go", ".rs"]) +def test_shell_linter_runs_when_lsp_inactive(ext, tmp_path): + """When LSP is inactive (default config, no service, remote backend, ...), + the shell linter runs as before — no behavior change.""" + fops = _make_fops() + src = tmp_path / f"clean{ext}" + src.write_text("// content\n") + + fake_result = MagicMock() + fake_result.exit_code = 0 + fake_result.stdout = "" + + with patch.object(fops, "_lsp_will_handle", return_value=False), \ + patch.object(fops, "_exec", return_value=fake_result) as exec_mock, \ + patch.object(fops, "_has_command", return_value=True): + result = fops._check_lint(str(src)) + + # _exec must have been called — proving the shell linter ran. + assert exec_mock.called, "shell linter did NOT run when LSP was inactive" + assert result.success is True + + +@pytest.mark.parametrize("ext", [".py", ".js"]) +def test_lsp_does_not_skip_non_redundant_extensions(ext, tmp_path): + """``py_compile`` and ``node --check`` keep running even when an LSP + server (pyright/pylsp/typescript-language-server-for-JS) is active — + they're fast, file-local, and correct, so there's no upside to + suppressing them. + """ + fops = _make_fops() + src = tmp_path / f"clean{ext}" + src.write_text("# valid\n" if ext == ".py" else "// valid\n") + + fake_result = MagicMock() + fake_result.exit_code = 0 + fake_result.stdout = "" + + # Even with LSP claiming the file, the shell linter must still run + # for these extensions. + with patch.object(fops, "_lsp_will_handle", return_value=True), \ + patch.object(fops, "_exec", return_value=fake_result) as exec_mock, \ + patch.object(fops, "_has_command", return_value=True): + fops._check_lint(str(src)) + + assert exec_mock.called, ( + f"shell linter for {ext} did not run despite being in the " + "'always-run' set (py_compile / node --check)" + ) + + +def test_lsp_will_handle_returns_false_when_service_is_none(tmp_path): + """``_lsp_will_handle`` must return False when the LSP service hasn't + been initialized — otherwise we'd accidentally skip the shell linter + on systems where LSP isn't configured at all.""" + fops = _make_fops() + src = tmp_path / "foo.ts" + src.write_text("const x = 1\n") + + with patch.object(fops, "_lsp_local_only", return_value=True), \ + patch("agent.lsp.get_service", return_value=None): + assert fops._lsp_will_handle(str(src)) is False + + +def test_lsp_will_handle_returns_false_on_remote_backend(tmp_path): + """LSP servers run on the host process — remote backends (Docker, + SSH, Modal, …) keep files inside the sandbox where the host LSP + can't reach them. ``_lsp_will_handle`` must short-circuit before + calling into the service in that case.""" + fops = _make_fops() + src = tmp_path / "foo.ts" + src.write_text("const x = 1\n") + + with patch.object(fops, "_lsp_local_only", return_value=False), \ + patch("agent.lsp.get_service") as get_service_mock: + result = fops._lsp_will_handle(str(src)) + + assert result is False + # Importantly: we never even consulted the service. + assert not get_service_mock.called + + +def test_lsp_will_handle_swallows_enabled_for_exception(tmp_path): + """A flaky LSP service must never break the shell-linter fallback — + if ``enabled_for`` raises, we treat the file as "not handled" so the + shell linter still runs.""" + fops = _make_fops() + src = tmp_path / "foo.ts" + src.write_text("const x = 1\n") + + fake_svc = MagicMock() + fake_svc.enabled_for.side_effect = RuntimeError("server crashed") + + with patch.object(fops, "_lsp_local_only", return_value=True), \ + patch("agent.lsp.get_service", return_value=fake_svc): + assert fops._lsp_will_handle(str(src)) is False + + +def test_tsx_stays_out_of_linters_table_for_default_compatibility(): + """Regression: keep ``.tsx`` out of ``LINTERS`` so users with LSP + DISABLED don't suddenly get the broken ``npx tsc --noEmit FILE.tsx`` + invocation that ``.ts`` historically used to get. + + Pre-PR behavior: ``.tsx`` had no entry in ``LINTERS``, so it fell + through to ``ext not in LINTERS`` → ``LintResult(skipped=True, + message="No linter for .tsx files")``. This PR preserves that for + the default config. + + When LSP IS enabled, ``.tsx`` is still covered by the LSP tier via + ``_maybe_lsp_diagnostics`` (typescript-language-server claims + ``.tsx`` in its extensions list) — the diagnostics show up in the + ``lsp_diagnostics`` field, not the ``lint`` field. + """ + from tools.file_operations import LINTERS, _SHELL_LINTER_LSP_REDUNDANT + + assert ".tsx" not in LINTERS + assert ".tsx" not in _SHELL_LINTER_LSP_REDUNDANT + + +def test_tsx_default_check_lint_returns_skipped(tmp_path): + """End-to-end: ``.tsx`` files get ``LintResult(skipped=True)`` from + ``_check_lint`` regardless of LSP status — this is the no-regression + contract that addresses Copilot review #3271017282.""" + fops = _make_fops() + src = tmp_path / "foo.tsx" + src.write_text("export const X = () =>
\n") + + # Even with LSP claiming the file, no shell linter runs for .tsx + # because there's no LINTERS entry — the ``ext not in LINTERS`` + # branch fires before the LSP short-circuit is consulted. + with patch.object(fops, "_lsp_will_handle", return_value=True), \ + patch.object(fops, "_exec") as exec_mock: + result = fops._check_lint(str(src)) + + assert result.skipped is True + assert not exec_mock.called, "no shell linter should run for .tsx" + + +if __name__ == "__main__": # pragma: no cover + pytest.main([__file__, "-v"]) diff --git a/tests/tools/test_patch_parser.py b/tests/tools/test_patch_parser.py index 8c4a0c80a37..79077a84a16 100644 --- a/tests/tools/test_patch_parser.py +++ b/tests/tools/test_patch_parser.py @@ -509,3 +509,141 @@ class TestParseErrorSignalling: ops, err = parse_v4a_patch(patch) assert err is None assert len(ops) == 1 + + +class TestV4ALspDiagnosticsPropagation: + """V4A patches must surface ``WriteResult.lsp_diagnostics`` from the + underlying ``write_file`` calls on ``PatchResult.lsp_diagnostics``. + + Without explicit propagation the LSP tier's output gets silently + dropped on the V4A code path — see Copilot review #3271017295 on + PR #29054. The shell-linter LSP skip introduced by that PR makes + this gap visible: a ``.ts`` / ``.go`` / ``.rs`` V4A patch with LSP + active would otherwise return ``lint = {f: {skipped: True, ...}}`` + and zero diagnostics from any channel. + """ + + def _build_ops_writing(self, path: str, content: str): + """Build a single ADD operation that writes ``content`` to ``path``.""" + # Use the V4A parser so we don't have to construct PatchOperation + # / Hunk / Line objects by hand. + lines = "\n".join(f"+{line}" for line in content.splitlines()) + patch_text = ( + "*** Begin Patch\n" + f"*** Add File: {path}\n" + f"{lines}\n" + "*** End Patch" + ) + ops, err = parse_v4a_patch(patch_text) + assert err is None, err + return ops + + def test_lsp_diagnostics_propagated_from_write_file_on_add(self): + """ADD op: ``WriteResult.lsp_diagnostics`` flows through to + ``PatchResult.lsp_diagnostics``.""" + ops = self._build_ops_writing("foo.ts", "const x: number = 1\n") + + diag_block = ( + "\n" + "ERROR [1:7] some diagnostic\n" + "" + ) + + class FakeFileOps: + def write_file(self, path, content): + return SimpleNamespace(error=None, lsp_diagnostics=diag_block) + + def _check_lint(self, path): + return SimpleNamespace(to_dict=lambda: {"skipped": True}) + + result = apply_v4a_operations(ops, FakeFileOps()) + + assert result.success is True + assert result.lsp_diagnostics == diag_block + + def test_lsp_diagnostics_propagated_from_write_file_on_update(self): + """UPDATE op: ``WriteResult.lsp_diagnostics`` flows through to + ``PatchResult.lsp_diagnostics``.""" + patch_text = ( + "*** Begin Patch\n" + "*** Update File: bar.ts\n" + "-old\n" + "+new\n" + "*** End Patch" + ) + ops, err = parse_v4a_patch(patch_text) + assert err is None + + diag_block = ( + "\n" + "ERROR [3:1] something\n" + "" + ) + + class FakeFileOps: + def read_file_raw(self, path): + return SimpleNamespace(content="ctx\nold\nctx\n", error=None) + + def write_file(self, path, content): + return SimpleNamespace(error=None, lsp_diagnostics=diag_block) + + def _check_lint(self, path): + return SimpleNamespace(to_dict=lambda: {"skipped": True}) + + result = apply_v4a_operations(ops, FakeFileOps()) + + assert result.success is True + assert result.lsp_diagnostics == diag_block + + def test_lsp_diagnostics_none_when_no_blocks_emitted(self): + """When no underlying ``write_file`` produced diagnostics, the + aggregated field stays ``None`` (so it doesn't get serialized + as an empty string in ``PatchResult.to_dict``).""" + ops = self._build_ops_writing("foo.py", "x = 1\n") + + class FakeFileOps: + def write_file(self, path, content): + # lsp_diagnostics omitted entirely (older WriteResult shape). + return SimpleNamespace(error=None) + + def _check_lint(self, path): + return SimpleNamespace(to_dict=lambda: {"success": True}) + + result = apply_v4a_operations(ops, FakeFileOps()) + + assert result.success is True + assert result.lsp_diagnostics is None + + def test_lsp_diagnostics_combined_across_multiple_files(self): + """When several files in one V4A patch produce diagnostics, + each block appears in the combined output so per-file attribution + is preserved.""" + patch_text = ( + "*** Begin Patch\n" + "*** Add File: a.ts\n" + "+const a = 1\n" + "*** Add File: b.ts\n" + "+const b = 2\n" + "*** End Patch" + ) + ops, err = parse_v4a_patch(patch_text) + assert err is None + + per_file = { + "a.ts": "\nERR a\n", + "b.ts": "\nERR b\n", + } + + class FakeFileOps: + def write_file(self, path, content): + return SimpleNamespace(error=None, lsp_diagnostics=per_file[path]) + + def _check_lint(self, path): + return SimpleNamespace(to_dict=lambda: {"skipped": True}) + + result = apply_v4a_operations(ops, FakeFileOps()) + + assert result.success is True + assert result.lsp_diagnostics is not None + assert per_file["a.ts"] in result.lsp_diagnostics + assert per_file["b.ts"] in result.lsp_diagnostics diff --git a/tools/file_operations.py b/tools/file_operations.py index 13d9314b912..c25dc332cb0 100644 --- a/tools/file_operations.py +++ b/tools/file_operations.py @@ -326,6 +326,44 @@ LINTERS = { '.rs': 'rustfmt --check {file} 2>&1', } +# Extensions where the per-file shell linter is structurally weaker than +# a real LSP server AND produces phantom errors on real-world projects: +# +# - ``.ts``: ``tsc --noEmit FILE.ts`` ignores ``tsconfig.json`` and +# defaults to no-lib / ES5, so every ES2015+ stdlib reference +# (``Promise``, ``Map``, ``Set``, ``ReadonlySet``, ``Iterable``, +# ``Math.imul``, ``Number.isFinite``, etc.) reports as missing. This +# floods the agent's lint field with 20K+ tokens of false positives on +# every edit. No supported tsc flag fixes the single-file invocation; +# the canonical replacement is ``tsserver`` via LSP, which respects +# tsconfig and gives true diagnostics. +# +# ``.tsx`` is intentionally NOT in ``LINTERS`` (and therefore not +# here): it has no shell linter entry, so it falls through to the +# ``ext not in LINTERS`` skip case unchanged. Pre-PR behavior: +# ``.tsx`` was implicitly ``skipped``. Keeping it that way means +# ``.tsx`` edits with LSP disabled get no per-file syntax check +# (same as before this PR) instead of the broken ``tsc`` invocation +# that ``.ts`` used to get. When LSP is enabled, ``.tsx`` is covered +# by the LSP tier via ``_maybe_lsp_diagnostics`` exactly as ``.ts``. +# +# - ``.go``: ``go vet FILE.go`` fails outside a module / GOPATH with +# "cannot find package" — already partially handled by +# ``_LINTER_UNUSABLE_PATTERNS`` but only when the package error is the +# ONLY output; mixed real+phantom output still leaks through. +# ``gopls`` is the canonical replacement. +# +# - ``.rs``: ``rustfmt --check FILE.rs`` is style, not type-checking, and +# rejects non-Cargo project files. ``rust-analyzer`` is the canonical +# replacement. +# +# When the LSP service is configured AND ``enabled_for(path)`` for this +# extension's file, ``_check_lint`` skips the shell linter for these +# extensions — the ``lsp_diagnostics`` channel carries the real signal. +# Everything else in ``LINTERS`` (Python ``py_compile``, ``node --check``) +# is fast, file-local, and correct, so it runs unconditionally. +_SHELL_LINTER_LSP_REDUNDANT = frozenset({'.ts', '.go', '.rs'}) + # Patterns that indicate the linter base command exists on PATH but # couldn't actually run — e.g. ``npx tsc`` when tsc isn't installed in @@ -1169,6 +1207,19 @@ class ShellFileOperations(FileOperations): if ext not in LINTERS: return LintResult(skipped=True, message=f"No linter for {ext} files") + # If a real LSP server is active and claims this file, skip the + # shell linter for extensions whose per-file shell invocation is + # structurally weaker / floods phantom errors. See + # ``_SHELL_LINTER_LSP_REDUNDANT`` above for the rationale per ext. + # The LSP tier runs separately via ``_maybe_lsp_diagnostics`` and + # carries the real diagnostics in ``lsp_diagnostics`` on the + # WriteResult / PatchResult. + if ext in _SHELL_LINTER_LSP_REDUNDANT and self._lsp_will_handle(path): + return LintResult( + skipped=True, + message=f"LSP server handles {ext} — shell linter skipped", + ) + linter_cmd = LINTERS[ext] # Extract the base command (first word) base_cmd = linter_cmd.split()[0] @@ -1332,6 +1383,40 @@ class ShellFileOperations(FileOperations): return True return False + def _lsp_will_handle(self, path: str) -> bool: + """Return True iff the LSP service is active AND will lint this file. + + Stronger than :meth:`_lsp_handles_extension` — that one only checks + the static server registry. This one additionally requires the + LSP service to be configured/enabled and the file to pass + :meth:`agent.lsp.manager.LSPService.enabled_for` (which gates on + workspace detection, disabled-server set, and the broken-pair + short-circuit). + + Used by :meth:`_check_lint` to decide whether to skip the per-file + shell linter for extensions in ``_SHELL_LINTER_LSP_REDUNDANT``. + + Best-effort: any failure path returns False so the shell linter + runs as before — never suppress lint based on an LSP probe that + couldn't actually answer the question. + """ + if not self._lsp_local_only(): + return False + try: + from agent.lsp import get_service + except Exception: # noqa: BLE001 + return False + try: + svc = get_service() + except Exception: # noqa: BLE001 + return False + if svc is None: + return False + try: + return bool(svc.enabled_for(path)) + except Exception: # noqa: BLE001 + return False + def _snapshot_lsp_baseline(self, path: str) -> None: """Capture pre-edit LSP diagnostics so the post-write delta is correct. diff --git a/tools/patch_parser.py b/tools/patch_parser.py index dacc6e855c3..e16cb446ee0 100644 --- a/tools/patch_parser.py +++ b/tools/patch_parser.py @@ -363,6 +363,12 @@ def apply_v4a_operations(operations: List[PatchOperation], files_created = [] files_deleted = [] all_diffs = [] + # Per-file LSP diagnostics blocks captured from underlying write_file + # calls. V4A bypasses the WriteResult / PatchResult plumbing that + # write_file and patch_replace use, so without explicit propagation + # the LSP tier's output gets silently dropped — see + # ``PatchResult.lsp_diagnostics`` aggregation below. + lsp_blocks: List[str] = [] errors = [] for op in operations: @@ -372,6 +378,8 @@ def apply_v4a_operations(operations: List[PatchOperation], if result[0]: files_created.append(op.file_path) all_diffs.append(result[1]) + if result[2]: + lsp_blocks.append(result[2]) else: errors.append(f"Failed to add {op.file_path}: {result[1]}") @@ -396,6 +404,8 @@ def apply_v4a_operations(operations: List[PatchOperation], if result[0]: files_modified.append(op.file_path) all_diffs.append(result[1]) + if result[2]: + lsp_blocks.append(result[2]) else: errors.append(f"Failed to update {op.file_path}: {result[1]}") @@ -411,6 +421,13 @@ def apply_v4a_operations(operations: List[PatchOperation], combined_diff = '\n'.join(all_diffs) + # Combine per-file LSP diagnostics blocks. Each block already has + # the ```` header from + # ``LSPService.report_for_file`` so concatenation is safe — the + # agent (and any downstream parsers) can still attribute each + # diagnostic to its file. + combined_lsp = "\n\n".join(lsp_blocks) if lsp_blocks else None + if errors: return PatchResult( success=False, @@ -419,6 +436,7 @@ def apply_v4a_operations(operations: List[PatchOperation], files_created=files_created, files_deleted=files_deleted, lint=lint_results if lint_results else None, + lsp_diagnostics=combined_lsp, error="Apply phase failed (state may be inconsistent — run `git diff` to assess):\n" + "\n".join(f" • {e}" for e in errors), ) @@ -430,11 +448,19 @@ def apply_v4a_operations(operations: List[PatchOperation], files_created=files_created, files_deleted=files_deleted, lint=lint_results if lint_results else None, + lsp_diagnostics=combined_lsp, ) -def _apply_add(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: - """Apply an add file operation.""" +def _apply_add(op: PatchOperation, file_ops: Any) -> Tuple[bool, str, Optional[str]]: + """Apply an add file operation. + + Returns ``(success, diff_or_error, lsp_diagnostics)``. The third + element carries the formatted ```` block from + :class:`WriteResult.lsp_diagnostics` so V4A patches can surface + semantic diagnostics from the LSP layer — without this, the LSP + tier would silently swallow them on the V4A code path. + """ # Extract content from hunks (all + lines) content_lines = [] for hunk in op.hunks: @@ -446,12 +472,12 @@ def _apply_add(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: result = file_ops.write_file(op.file_path, content) if result.error: - return False, result.error + return False, result.error, None diff = f"--- /dev/null\n+++ b/{op.file_path}\n" diff += '\n'.join(f"+{line}" for line in content_lines) - return True, diff + return True, diff, getattr(result, "lsp_diagnostics", None) def _apply_delete(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: @@ -485,8 +511,12 @@ def _apply_move(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: return True, diff -def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: - """Apply an update file operation.""" +def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str, Optional[str]]: + """Apply an update file operation. + + Returns ``(success, diff_or_error, lsp_diagnostics)`` — see + :func:`_apply_add` for the rationale on the third element. + """ # Deferred import: breaks the patch_parser ↔ fuzzy_match circular dependency from tools.fuzzy_match import fuzzy_find_and_replace @@ -494,7 +524,7 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: read_result = file_ops.read_file_raw(op.file_path) if read_result.error: - return False, f"Cannot read file: {read_result.error}" + return False, f"Cannot read file: {read_result.error}", None current_content = read_result.content @@ -549,7 +579,7 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: err_msg += format_no_match_hint(error, 0, search_pattern, new_content) except Exception: pass - return False, err_msg + return False, err_msg, None else: # Addition-only hunk (no context or removed lines). # Insert at the location indicated by the context hint, or at end of file. @@ -563,7 +593,7 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: return False, ( f"Addition-only hunk: context hint '{hunk.context_hint}' is ambiguous " f"({occurrences} occurrences) — provide a more unique hint" - ) + ), None else: hint_pos = new_content.find(hunk.context_hint) # Insert after the line containing the context hint @@ -578,7 +608,7 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: # Write new content write_result = file_ops.write_file(op.file_path, new_content) if write_result.error: - return False, write_result.error + return False, write_result.error, None # Generate diff diff_lines = difflib.unified_diff( @@ -589,4 +619,4 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]: ) diff = ''.join(diff_lines) - return True, diff + return True, diff, getattr(write_result, "lsp_diagnostics", None) From 258965663c46d406a324356274f529586f7ce22c Mon Sep 17 00:00:00 2001 From: Savanne Kham Date: Tue, 19 May 2026 23:58:55 +0200 Subject: [PATCH 05/29] fix(chat_completions): strip tool_name from messages for strict providers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 'tool_name' key on role=tool messages is an internal Hermes field (stored in the messages.tool_name SQLite column for FTS indexing) that is not part of the OpenAI Chat Completions schema. Strict OpenAI-compatible providers — notably Moonshot AI (Kimi) — reject it with HTTP 400: Error from provider: Extra inputs are not permitted, field: 'messages[N].tool_name', value: 'execute_code' Add 'tool_name' to the sanitize block in ChatCompletionsTransport.convert_messages alongside the existing Codex Responses API fields (codex_reasoning_items, codex_message_items) so it is popped before the request is sent. Reproducer: hermes chat --model kimi-k2.6 > list the top 5 Hacker News stories -> assistant emits tool_call(execute_code) -> tool result message gets tool_name='execute_code' -> next turn's payload includes messages[N].tool_name -> 400 Permissive backends (MiniMax, OpenRouter on most routes) ignore the extra field and were masking the bug. --- agent/transports/chat_completions.py | 7 ++++++- .../agent/transports/test_chat_completions.py | 20 +++++++++++++++++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/agent/transports/chat_completions.py b/agent/transports/chat_completions.py index 7edb69e42c7..ed5d8b0ba43 100644 --- a/agent/transports/chat_completions.py +++ b/agent/transports/chat_completions.py @@ -122,7 +122,11 @@ class ChatCompletionsTransport(ProviderTransport): for msg in messages: if not isinstance(msg, dict): continue - if "codex_reasoning_items" in msg or "codex_message_items" in msg: + if ( + "codex_reasoning_items" in msg + or "codex_message_items" in msg + or "tool_name" in msg + ): needs_sanitize = True break tool_calls = msg.get("tool_calls") @@ -145,6 +149,7 @@ class ChatCompletionsTransport(ProviderTransport): continue msg.pop("codex_reasoning_items", None) msg.pop("codex_message_items", None) + msg.pop("tool_name", None) tool_calls = msg.get("tool_calls") if isinstance(tool_calls, list): for tc in tool_calls: diff --git a/tests/agent/transports/test_chat_completions.py b/tests/agent/transports/test_chat_completions.py index 7ed0d4da634..2e7b9da2f8d 100644 --- a/tests/agent/transports/test_chat_completions.py +++ b/tests/agent/transports/test_chat_completions.py @@ -46,6 +46,26 @@ class TestChatCompletionsBasic: assert "codex_reasoning_items" in msgs[0] assert "codex_message_items" in msgs[0] + def test_convert_messages_strips_tool_name(self, transport): + """Internal `tool_name` (used for FTS indexing in the SQLite store) is + not part of the OpenAI Chat Completions schema. Strict providers like + Moonshot/Kimi reject it with HTTP 400 'Extra inputs are not permitted'. + """ + msgs = [ + {"role": "user", "content": "hi"}, + {"role": "assistant", "content": None, + "tool_calls": [{"id": "call_1", "type": "function", + "function": {"name": "execute_code", "arguments": "{}"}}]}, + {"role": "tool", "tool_call_id": "call_1", "tool_name": "execute_code", + "content": "result"}, + ] + result = transport.convert_messages(msgs) + assert "tool_name" not in result[2] + assert result[2]["content"] == "result" + assert result[2]["tool_call_id"] == "call_1" + # Original list untouched (deepcopy-on-demand) + assert msgs[2]["tool_name"] == "execute_code" + class TestChatCompletionsBuildKwargs: From 42c42884113d8fe2567f66266fa486be9fcd040e Mon Sep 17 00:00:00 2001 From: teknium1 <127238744+teknium1@users.noreply.github.com> Date: Tue, 19 May 2026 23:09:12 -0700 Subject: [PATCH 06/29] fix(chat_completions): broaden tool_name strip docstring + AUTHOR_MAP Salvage follow-up to PR #28958 (savanne-kham): - convert_messages() docstring now explicitly documents the tool_name strip alongside Codex fields, names which providers reject it (Fireworks, Moonshot/Kimi), and why permissive providers (OpenRouter, MiniMax) masked the bug. - AUTHOR_MAP entry for savanne.kham@protonmail.com -> savanne-kham. --- agent/transports/chat_completions.py | 18 ++++++++++++++---- scripts/release.py | 1 + 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/agent/transports/chat_completions.py b/agent/transports/chat_completions.py index ed5d8b0ba43..fa36301bd81 100644 --- a/agent/transports/chat_completions.py +++ b/agent/transports/chat_completions.py @@ -112,11 +112,21 @@ class ChatCompletionsTransport(ProviderTransport): def convert_messages( self, messages: list[dict[str, Any]], **kwargs ) -> list[dict[str, Any]]: - """Messages are already in OpenAI format — sanitize Codex leaks only. + """Messages are already in OpenAI format — strip internal fields + that strict chat-completions providers reject with HTTP 400/422. - Strips Codex Responses API fields (``codex_reasoning_items`` / - ``codex_message_items`` on the message, ``call_id``/``response_item_id`` - on tool_calls) that strict chat-completions providers reject with 400/422. + Strips: + + - Codex Responses API fields: ``codex_reasoning_items`` / + ``codex_message_items`` on the message, ``call_id`` / + ``response_item_id`` on ``tool_calls`` entries. + - ``tool_name`` on tool-result messages — written by + ``make_tool_result_message()`` for the SQLite FTS index, but not + part of the Chat Completions schema. Strict providers (Fireworks, + Moonshot/Kimi) reject any payload containing it with + ``Extra inputs are not permitted, field: 'messages[N].tool_name'``. + Permissive providers (OpenRouter, MiniMax) silently ignore the + field, which masked the bug for months. """ needs_sanitize = False for msg in messages: diff --git a/scripts/release.py b/scripts/release.py index 718b1079a0c..6658513f9ca 100755 --- a/scripts/release.py +++ b/scripts/release.py @@ -372,6 +372,7 @@ AUTHOR_MAP = { "bloodcarter@gmail.com": "bloodcarter", "scott@scotttrinh.com": "scotttrinh", "quocanh261997@gmail.com": "quocanh261997", + "savanne.kham@protonmail.com": "savanne-kham", # PR #28958 salvage (strip tool_name for strict providers) # contributors (from noreply pattern) "david.vv@icloud.com": "davidvv", "wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243", From edb2d910577bfdc64792e5d3bb28e842e7f9042e Mon Sep 17 00:00:00 2001 From: Austin Pickett Date: Wed, 20 May 2026 08:00:17 -0400 Subject: [PATCH 07/29] feat(web): migrate dashboard checkboxes to @nous-research/ui + DS polish (#28814) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(web): migrate dashboard checkboxes to @nous-research/ui + DS polish Replaces the hand-rolled shadcn-style `Checkbox` in `web/src/components/ui/` with the Nous DS `Checkbox` (Radix-backed) from `@nous-research/ui`, bumps the DS to 0.14.2, and picks up two regressions surfaced by the bump. Checkbox migration - bump `@nous-research/ui` 0.14.0 → ^0.14.2 and remove `web/src/components/ui/checkbox.tsx` - migrate `ProfilesPage` and `ModelPickerDialog` to the DS Checkbox API (`onCheckedChange`, paired `
)}
diff --git a/web/src/components/ui/checkbox.tsx b/web/src/components/ui/checkbox.tsx deleted file mode 100644 index fa9f0098a00..00000000000 --- a/web/src/components/ui/checkbox.tsx +++ /dev/null @@ -1,61 +0,0 @@ -import { cn } from "@/lib/utils"; -import { Check } from "lucide-react"; - -interface CheckboxProps - extends Omit, "type"> { - label?: React.ReactNode; -} - -export function Checkbox({ - className, - label, - id, - checked, - defaultChecked, - ...props -}: CheckboxProps) { - // Support both controlled (checked prop) and uncontrolled (defaultChecked) usage. - // For visual rendering, prefer `checked` if provided; otherwise fall back to defaultChecked. - const isChecked = checked ?? defaultChecked ?? false; - - return ( - - ); -} diff --git a/web/src/index.css b/web/src/index.css index e9818174e02..854c528cddf 100644 --- a/web/src/index.css +++ b/web/src/index.css @@ -1,4 +1,11 @@ @import 'tailwindcss'; +/* `fonts.css` must come BEFORE `globals.css`: as of @nous-research/ui 0.14.x, + `globals.css` only declares the `--font-*` CSS variables (Collapse, Rules + Compressed/Expanded, Mondwest). The `@font-face` registrations live in + `fonts.css`, so without this import the DS variables resolve to font + families the browser never loads and components fall back to a system + stack (Tabs, Segmented, Typography, Buttons, etc. all look unstyled). */ +@import '@nous-research/ui/styles/fonts.css'; @import '@nous-research/ui/styles/globals.css'; /* Scan the published design-system bundle so its utility classes survive diff --git a/web/src/pages/AnalyticsPage.tsx b/web/src/pages/AnalyticsPage.tsx index c97af1deed2..492b79ce924 100644 --- a/web/src/pages/AnalyticsPage.tsx +++ b/web/src/pages/AnalyticsPage.tsx @@ -439,7 +439,7 @@ export default function AnalyticsPage() { ); setEnd( showTokens === false ? null : ( -
+
{PERIODS.map((p) => ( , @@ -266,7 +280,9 @@ export default function ProfilesPage() {
e.target === e.currentTarget && setCreateModalOpen(false)} + onClick={(e) => + e.target === e.currentTarget && setCreateModalOpen(false) + } role="dialog" aria-modal="true" aria-labelledby="create-profile-title" @@ -313,12 +329,22 @@ export default function ProfilesPage() {

- setCloneFromDefault(e.target.checked)} - label={t.profiles.cloneFromDefault} - /> +
+ + setCloneFromDefault(checked === true) + } + /> + + +