docs: 30-day overhaul — correctness audit, PR coverage, Nous Portal weave, sidebar reorg (#33782)

* docs(audit): correctness pass across getting-started, reference, features, messaging, developer-guide, guides, integrations, user-guide

* docs: add PR coverage for last 30d + Nous Portal weave + nav reorg + build fixes

- Add docs for top user-visible PRs that shipped without docs (api-server session control, kanban features, telegram pin/edit, provider client tag, xAI retired-model migration, cron name lookup, --branch update flag, etc.)
- Apply Nous Portal weave across 23 pages (tasteful one-liners on getting-started/learning-path, configuration, overview, vision, x-search, credential-pools, provider-routing, cron, codex-runtime, profiles, docker, messaging/index, multiple guides, plus FAQ + index promotion)
- Reorganize sidebar: split Messaging into Popular/M365/Chinese/Other, Reference into Command/Configuration/Tools-Skills sub-categories, add orphan developer-guide pages (web-search-provider-plugin, browser-supervisor), move features from Integrations back to Features, fold lone spotify into Media & Web.
- Regenerate skill stubs + catalogs (kanban-codex-lane, hermes-s6-container-supervision, web-pentest)
- Fix broken anchor links (security/cron, configuration/fallback, telegram large-files, adding-platform-adapters step-by-step)
2026-06-05 07:41:39 +00:00 · 2026-05-28 02:41:36 -07:00 · 2026-05-28 02:41:36 -07:00 · 8b6beaab5f
commit 8b6beaab5f
parent c7f7783e5c
142 changed files with 1840 additions and 483 deletions
--- a/website/docs/developer-guide/adding-platform-adapters.md
+++ b/website/docs/developer-guide/adding-platform-adapters.md
@@ -9,7 +9,7 @@ This guide covers adding a new messaging platform to the Hermes gateway. A platf
 :::tip
 There are two ways to add a platform:
 - **Plugin** (recommended for community/third-party): Drop a plugin directory into `~/.hermes/plugins/` — zero core code changes needed. See [Plugin Path](#plugin-path-recommended) below.
-- **Built-in**: Modify 20+ files across code, config, and docs. Use the [Built-in Checklist](#step-by-step-checklist) below.
+- **Built-in**: Modify 20+ files across code, config, and docs. Use the [Built-in Checklist](#step-by-step-checklist-built-in-path) below.
 :::

 ## Architecture Overview
--- a/website/docs/developer-guide/adding-providers.md
+++ b/website/docs/developer-guide/adding-providers.md
@@ -321,12 +321,12 @@ At minimum, touch the tests that guard provider wiring.

 Common places:

-- `tests/test_runtime_provider_resolution.py`
-- `tests/test_cli_provider_resolution.py`
-- `tests/test_cli_model_command.py`
-- `tests/test_setup_model_selection.py`
-- `tests/test_provider_parity.py`
-- `tests/test_run_agent.py`
+- `tests/hermes_cli/test_runtime_provider_resolution.py`
+- `tests/cli/test_cli_provider_resolution.py`
+- `tests/hermes_cli/test_model_switch_custom_providers.py` (and adjacent `tests/hermes_cli/test_model_switch_*.py`)
+- `tests/hermes_cli/test_setup_model_provider.py`
+- `tests/run_agent/test_provider_parity.py`
+- `tests/run_agent/test_run_agent.py`
 - `tests/test_<provider>_adapter.py` for a native provider

 For docs-only examples, the exact file set may differ. The point is to cover:
@@ -342,7 +342,7 @@ Run tests with xdist disabled:

 ```bash
 source venv/bin/activate
-python -m pytest tests/test_runtime_provider_resolution.py tests/test_cli_provider_resolution.py tests/test_cli_model_command.py tests/test_setup_model_selection.py -n0 -q
+python -m pytest tests/hermes_cli/test_runtime_provider_resolution.py tests/cli/test_cli_provider_resolution.py tests/hermes_cli/test_setup_model_provider.py tests/run_agent/test_provider_parity.py -n0 -q
 ```

 For deeper changes, run the full suite before pushing:
--- a/website/docs/developer-guide/agent-loop.md
+++ b/website/docs/developer-guide/agent-loop.md
@@ -6,7 +6,7 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb

 # Agent Loop Internals

-The core orchestration engine is `run_agent.py`'s `AIAgent` class — a large file (15k+ lines) that handles everything from prompt assembly to tool dispatch to provider failover.
+The core orchestration engine is `run_agent.py`'s `AIAgent` class — a large file (~4,400 lines) that handles everything from prompt assembly to tool dispatch to provider failover.

 ## Core Responsibilities

--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@@ -40,7 +40,7 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
           ▼                                    ▼
 ┌───────────────────┐              ┌──────────────────────┐
 │ Session Storage   │              │ Tool Backends         │
-│ (SQLite + FTS5)   │              │ Terminal (7 backends) │
+│ (SQLite + FTS5)   │              │ Terminal (6 backends) │
 │ hermes_state.py   │              │ Browser (5 backends)  │
 │ gateway/session.py│              │ Web (4 backends)      │
 └───────────────────┘              │ MCP (dynamic)         │
@@ -130,7 +130,7 @@ hermes-agent/
 ├── skills/                   # Bundled skills (always available)
 ├── optional-skills/          # Official optional skills (install explicitly)
 ├── website/                  # Docusaurus documentation site
-└── tests/                    # Pytest suite (~3,000+ tests)
+└── tests/                    # Pytest suite (~25,000 tests across ~1,250 files)
 ```

 ## Data Flow
--- a/website/docs/developer-guide/browser-supervisor.md
+++ b/website/docs/developer-guide/browser-supervisor.md
@@ -1,57 +1,49 @@
-# Browser CDP Supervisor — Design
+---
+sidebar_position: 18
+title: "Browser CDP Supervisor"
+description: "How Hermes detects and responds to native JS dialogs and interacts with cross-origin iframes via a persistent CDP connection."
+---

-**Status:** Shipped (PR 14540)
-**Last updated:** 2026-04-23
-**Author:** @teknium1
+# Browser CDP Supervisor

-## Problem
+The CDP supervisor closes two long-standing gaps in Hermes' browser tooling:

-Native JS dialogs (`alert`/`confirm`/`prompt`/`beforeunload`) and iframes are
-the two biggest gaps in our browser tooling:
+1. **Native JS dialogs** (`alert`/`confirm`/`prompt`/`beforeunload`) block the
+   page's JS thread. Without supervision, the agent has no way to know a
+   dialog is open — subsequent tool calls hang or throw opaque errors.
+2. **Cross-origin iframes (OOPIFs)** are invisible to top-level
+   `Runtime.evaluate`. The agent can see iframe nodes in the DOM snapshot but
+   can't click, type, or eval inside them without a CDP session attached to
+   the child target.

-1. **Dialogs block the JS thread.** Any operation on the page stalls until the
-   dialog is handled. Before this work, the agent had no way to know a dialog
-   was open — subsequent tool calls would hang or throw opaque errors.
-2. **Iframes are invisible.** The agent could see iframe nodes in the DOM
-   snapshot but could not click, type, or eval inside them — especially
-   cross-origin (OOPIF) iframes that live in separate Chromium processes.
+The supervisor solves both by holding a persistent WebSocket to the backend's
+CDP endpoint per browser task, surfacing pending dialogs and frame structure
+into `browser_snapshot`, and exposing a `browser_dialog` tool for explicit
+responses.

-[PR #12550](https://github.com/NousResearch/hermes-agent/pull/12550) proposed a
-stateless `browser_dialog` wrapper. That doesn't solve detection — it's a
-cleaner CDP call for when the agent already knows (via symptoms) that a dialog
-is open. Closed as superseded.
-
-## Backend capability matrix (verified live 2026-04-23)
-
-Using throwaway probe scripts against a data-URL page that fires alerts in the
-main frame and in a same-origin srcdoc iframe, plus a cross-origin
-`https://example.com` iframe:
+## Backend support

 | Backend | Dialog detect | Dialog respond | Frame tree | OOPIF `Runtime.evaluate` via `browser_cdp(frame_id=...)` |
 |---|---|---|---|---|
 | Local Chrome (`--remote-debugging-port`) / `/browser connect` | ✓ | ✓ full workflow | ✓ | ✓ |
-| Browserbase | ✓ (via bridge) | ✓ full workflow (via bridge) | ✓ | ✓ (`document.title = "Example Domain"` verified on real cross-origin iframe) |
+| Browserbase | ✓ (via bridge) | ✓ full workflow (via bridge) | ✓ | ✓ |
 | Camofox | ✗ no CDP (REST-only) | ✗ | partial via DOM snapshot | ✗ |

-**How Browserbase respond works.** Browserbase's CDP proxy uses Playwright
-internally and auto-dismisses native dialogs within ~10ms, so
-`Page.handleJavaScriptDialog` can't keep up. To work around this, the
-supervisor injects a bridge script via
+**Browserbase quirk.** Browserbase's CDP proxy uses Playwright internally and
+auto-dismisses native dialogs within ~10ms, so `Page.handleJavaScriptDialog`
+can't keep up. The supervisor injects a bridge script via
 `Page.addScriptToEvaluateOnNewDocument` that overrides
 `window.alert`/`confirm`/`prompt` with a synchronous XHR to a magic host
-(`hermes-dialog-bridge.invalid`). `Fetch.enable` intercepts those XHRs
-before they touch the network — the dialog becomes a `Fetch.requestPaused`
-event the supervisor captures, and `respond_to_dialog` fulfills via
+(`hermes-dialog-bridge.invalid`). `Fetch.enable` intercepts those XHRs before
+they touch the network — the dialog becomes a `Fetch.requestPaused` event the
+supervisor captures, and `respond_to_dialog` fulfills via
 `Fetch.fulfillRequest` with a JSON body the injected script decodes.

-Net result: from the page's perspective, `prompt()` still returns the
-agent-supplied string. From the agent's perspective, it's the same
-`browser_dialog(action=...)` API either way. Tested end-to-end against
-real Browserbase sessions — 4/4 (alert/prompt/confirm-accept/confirm-dismiss)
-pass including value round-tripping back into page JS.
+From the page's perspective, `prompt()` still returns the agent-supplied
+string. From the agent's perspective, it's the same `browser_dialog(action=...)`
+API either way.

-Camofox stays unsupported for this PR; follow-up upstream issue planned at
-`jo-inc/camofox-browser` requesting a dialog polling endpoint.
+Camofox is unsupported — no CDP surface, REST-only.

 ## Architecture

@@ -63,9 +55,10 @@ Holds a persistent WebSocket to the backend's CDP endpoint. Maintains:
 - **Dialog queue** — `List[PendingDialog]` with `{id, type, message, default_prompt, session_id, opened_at}`
 - **Frame tree** — `Dict[frame_id, FrameInfo]` with parent relationships, URL, origin, whether cross-origin child session
 - **Session map** — `Dict[session_id, SessionInfo]` so interaction tools can route to the right attached session for OOPIF operations
-- **Recent console errors** — ring buffer of the last 50 (for PR 2 diagnostics)
+- **Recent console errors** — ring buffer of the last 50 for diagnostics

 Subscribes on attach:
+
 - `Page.enable` — `javascriptDialogOpening`, `frameAttached`, `frameNavigated`, `frameDetached`
 - `Runtime.enable` — `executionContextCreated`, `consoleAPICalled`, `exceptionThrown`
 - `Target.setAutoAttach {autoAttach: true, flatten: true}` — surfaces child OOPIF targets; supervisor enables `Page`+`Runtime` on each
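As an illustration of the state listed in this hunk, the dialog queue and the console-error ring buffer might be modeled as below. This is a minimal sketch, not the code in `tools/browser_supervisor.py`: the `PendingDialog` field names come from the doc, but the field types, the `SupervisorState` container, and its `record_console_error` method are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class PendingDialog:
    # Field names follow the doc; the types are assumptions.
    id: str
    type: str  # "alert", "confirm", "prompt", or "beforeunload"
    message: str
    default_prompt: Optional[str]
    session_id: Optional[str]
    opened_at: float


@dataclass
class SupervisorState:
    # Hypothetical container, for illustration only.
    dialogs: List[PendingDialog] = field(default_factory=list)
    # A deque with maxlen=50 silently drops the oldest entry once it is full.
    console_errors: Deque[str] = field(default_factory=lambda: deque(maxlen=50))

    def record_console_error(self, text: str) -> None:
        self.console_errors.append(text)


state = SupervisorState()
for i in range(60):
    state.record_console_error(f"error {i}")
```

After 60 appends the buffer holds only the 50 most recent entries, which is the bounded-memory behavior a "last 50" ring buffer implies.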
@@ -76,11 +69,13 @@ frozen snapshot without awaiting.
 ### Lifecycle

 - **Start:** `SupervisorRegistry.get_or_start(task_id, cdp_url)` — called by
-  `browser_navigate`, Browserbase session create, `/browser connect`. Idempotent.
+  `browser_navigate`, Browserbase session create, `/browser connect`.
+  Idempotent.
 - **Stop:** session teardown or `/browser disconnect`. Cancels the asyncio
  task, closes the WebSocket, discards state.
-- **Rebind:** if the CDP URL changes (user reconnects to a new Chrome), stop
-  the old supervisor and start fresh — never reuse state across endpoints.
+- **Rebind:** if the CDP URL changes (user reconnects to a new Chrome), the
+  old supervisor is stopped and a fresh one started — state is never reused
+  across endpoints.

 ### Dialog policy

@@ -92,14 +87,14 @@ Configurable via `config.yaml` under `browser.dialog_policy`:
  forever.
 - `auto_dismiss` — record and dismiss immediately; agent sees it after the
  fact via `browser_state` inside `browser_snapshot`.
-- `auto_accept` — record and accept (useful for `beforeunload` where the user
-  wants to navigate away cleanly).
+- `auto_accept` — record and accept (useful for `beforeunload` where the
+  workflow wants to navigate away cleanly).

-Policy is per-task; no per-dialog overrides in v1.
+Policy is per-task; no per-dialog overrides.

-## Agent surface (PR 1)
+## Agent surface

-### One new tool
+### `browser_dialog` tool

 ```
 browser_dialog(action, prompt_text=None, dialog_id=None)
@@ -107,9 +102,9 @@ browser_dialog(action, prompt_text=None, dialog_id=None)

 - `action="accept"` / `"dismiss"` → responds to the specified or sole pending dialog (required)
 - `prompt_text=...` → text to supply to a `prompt()` dialog
-- `dialog_id=...` → disambiguate when multiple dialogs queued (rare)
+- `dialog_id=...` → disambiguate when multiple dialogs are queued (rare)

-Tool is response-only. Agent reads pending dialogs from `browser_snapshot`
+Tool is response-only. The agent reads pending dialogs from `browser_snapshot`
 output before calling.

 ### `browser_snapshot` extension
@@ -137,72 +132,52 @@ is attached:
 }
 ```

-- **`pending_dialogs`**: dialogs currently blocking the page's JS thread.
+- **`pending_dialogs`** — dialogs currently blocking the page's JS thread.
  The agent must call `browser_dialog(action=...)` to respond. Empty on
  Browserbase because their CDP proxy auto-dismisses within ~10ms.

-- **`recent_dialogs`**: ring buffer of up to 20 recently-closed dialogs with
-  a `closed_by` tag — `"agent"` (we responded), `"auto_policy"` (local
+- **`recent_dialogs`** — ring buffer of up to 20 recently-closed dialogs with
+  a `closed_by` tag: `"agent"` (we responded), `"auto_policy"` (local
  auto_dismiss/auto_accept), `"watchdog"` (must_respond timeout hit), or
  `"remote"` (browser/backend closed it on us, e.g. Browserbase). This is
  how agents on Browserbase still get visibility into what happened.

-- **`frame_tree`**: frame structure including cross-origin (OOPIF) children.
+- **`frame_tree`** — frame structure including cross-origin (OOPIF) children.
  Capped at 30 entries + OOPIF depth 2 to bound snapshot size on ad-heavy
  pages. `truncated: true` surfaces when limits were hit; agents needing
  the full tree can use `browser_cdp` with `Page.getFrameTree`.

-No new tool schema surface for any of these — the agent reads the snapshot
-it already requests.
+No new tool schema surface for any of these — the agent reads the snapshot it
+already requests.

 ### Availability gating

 Both surfaces gate on `_browser_cdp_check` (supervisor can only run when a CDP
 endpoint is reachable). On Camofox / no-backend sessions, the dialog tool is
-hidden and snapshot omits the new fields — no schema bloat.
+hidden and the snapshot omits the new fields — no schema bloat.

 ## Cross-origin iframe interaction

-Extending the dialog-detect work, `browser_cdp(frame_id=...)` routes CDP
-calls (notably `Runtime.evaluate`) through the supervisor's already-connected
-WebSocket using the OOPIF's child `sessionId`. Agents pick frame_ids out of
+`browser_cdp(frame_id=...)` routes CDP calls (notably `Runtime.evaluate`)
+through the supervisor's already-connected WebSocket using the OOPIF's child
+`sessionId`. Agents pick frame_ids out of
 `browser_snapshot.frame_tree.children[]` where `is_oopif=true` and pass them
 to `browser_cdp`. For same-origin iframes (no dedicated CDP session), the
 agent uses `contentWindow`/`contentDocument` from a top-level
-`Runtime.evaluate` instead — supervisor surfaces an error pointing at that
+`Runtime.evaluate` instead — the supervisor surfaces an error pointing at that
 fallback when `frame_id` belongs to a non-OOPIF.

-On Browserbase, this is the ONLY reliable path for iframe interaction —
+On Browserbase, this is the only reliable path for iframe interaction —
 stateless CDP connections (opened per `browser_cdp` call) hit signed-URL
 expiry, while the supervisor's long-lived connection keeps a valid session.

-## Camofox (follow-up)
-
-Issue planned against `jo-inc/camofox-browser` adding:
-- Playwright `page.on('dialog', handler)` per session
-- `GET /tabs/:tabId/dialogs` polling endpoint
-- `POST /tabs/:tabId/dialogs/:id` to accept/dismiss
-- Frame-tree introspection endpoint
-
-## Files touched (PR 1)
-
-### New
+## File layout

 - `tools/browser_supervisor.py` — `CDPSupervisor`, `SupervisorRegistry`, `PendingDialog`, `FrameInfo`
 - `tools/browser_dialog_tool.py` — `browser_dialog` tool handler
-- `tests/tools/test_browser_supervisor.py` — mock CDP WebSocket server + lifecycle/state tests
-- `website/docs/developer-guide/browser-supervisor.md` — this file
-
-### Modified
-
-- `toolsets.py` — register `browser_dialog` in `browser`, `hermes-acp`, `hermes-api-server`, core toolsets (gated on CDP reachability)
-- `tools/browser_tool.py`
-  - `browser_navigate` start-hook: if CDP URL resolvable, `SupervisorRegistry.get_or_start(task_id, cdp_url)`
-  - `browser_snapshot` (at ~line 1536): merge supervisor state into return payload
-  - `/browser connect` handler: restart supervisor with new endpoint
-  - Session teardown hooks in `_cleanup_browser_session`
-- `hermes_cli/config.py` — add `browser.dialog_policy` and `browser.dialog_timeout_s` to `DEFAULT_CONFIG`
-- Docs: `website/docs/user-guide/features/browser.md`, `website/docs/reference/tools-reference.md`, `website/docs/reference/toolsets-reference.md`
+- `tools/browser_tool.py` — `browser_navigate` start-hook, `browser_snapshot` merge, `/browser connect` reattach, `_cleanup_browser_session` teardown
+- `toolsets.py` — registers `browser_dialog` in `browser`, `hermes-acp`, `hermes-api-server`, and core toolsets (gated on CDP reachability)
+- `hermes_cli/config.py` — `browser.dialog_policy` and `browser.dialog_timeout_s` defaults

 ## Non-goals

@@ -214,9 +189,10 @@ Issue planned against `jo-inc/camofox-browser` adding:

 ## Testing

-Unit tests use an asyncio mock CDP server that speaks enough of the protocol
-to exercise all state transitions: attach, enable, navigate, dialog fire,
-dialog dismiss, frame attach/detach, child target attach, session teardown.
-Real-backend E2E (Browserbase + local Chromium-family browser) is manual — exercise via
-`/browser connect` to a live Chromium-family browser and run the dialog/frame
-test cases described above.
+Unit tests (`tests/tools/test_browser_supervisor.py`) use an asyncio mock CDP
+server that speaks enough of the protocol to exercise all state transitions:
+attach, enable, navigate, dialog fire, dialog dismiss, frame attach/detach,
+child target attach, session teardown. Real-backend E2E (Browserbase + local
+Chromium-family browser) is manual — exercise via `/browser connect` to a
+live Chromium-family browser and run the dialog/frame test cases described
+above.
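Returning to the Browserbase quirk described in this file's diff: once the bridge XHR surfaces as a `Fetch.requestPaused` event, the reply is sent with `Fetch.fulfillRequest`, whose `body` parameter is base64-encoded when passed over JSON. The sketch below shows one way the parameters could be assembled. The helper name `build_fulfill_params` and the JSON keys (`accept`, `prompt_text`) are assumptions; the doc only says that the body is JSON the injected script decodes.

```python
import base64
import json
from typing import Optional


def build_fulfill_params(
    request_id: str, accept: bool, prompt_text: Optional[str] = None
) -> dict:
    # The payload keys are assumptions; the real bridge script defines its own shape.
    payload = json.dumps({"accept": accept, "prompt_text": prompt_text})
    return {
        "requestId": request_id,
        "responseCode": 200,
        "responseHeaders": [{"name": "Content-Type", "value": "application/json"}],
        # CDP expects the response body as a base64 string when sent over JSON.
        "body": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
    }


params = build_fulfill_params("req-1", accept=True, prompt_text="hello")
decoded = json.loads(base64.b64decode(params["body"]))
```

The resulting dict would be sent as the `params` of a `Fetch.fulfillRequest` call on the supervisor's WebSocket, using the `requestId` taken from the paused request.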
--- a/website/docs/developer-guide/memory-provider-plugin.md
+++ b/website/docs/developer-guide/memory-provider-plugin.md
@@ -61,7 +61,7 @@ class MyMemoryProvider(MemoryProvider):
 | `is_available()` | Agent init, before activation | **Yes** — no network calls |
 | `initialize(session_id, **kwargs)` | Agent startup | **Yes** |
 | `get_tool_schemas()` | After init, for tool injection | **Yes** |
-| `handle_tool_call(name, args)` | When agent uses your tools | **Yes** (if you have tools) |
+| `handle_tool_call(tool_name, args, **kwargs)` | When agent uses your tools | **Yes** (if you have tools) |

 ### Config

@@ -75,9 +75,9 @@ class MyMemoryProvider(MemoryProvider):
 | Method | When Called | Use Case |
 |--------|-----------|----------|
 | `system_prompt_block()` | System prompt assembly | Static provider info |
-| `prefetch(query)` | Before each API call | Return recalled context |
+| `prefetch(query, *, session_id="")` | Before each API call | Return recalled context |
 | `queue_prefetch(query)` | After each turn | Pre-warm for next turn |
-| `sync_turn(user, assistant)` | After each completed turn | Persist conversation |
+| `sync_turn(user, assistant, *, session_id="")` | After each completed turn | Persist conversation |
 | `on_session_end(messages)` | Conversation ends | Final extraction/flush |
 | `on_pre_compress(messages)` | Before context compression | Save insights before discard |
 | `on_memory_write(action, target, content)` | Built-in memory writes | Mirror to your backend |
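To make the updated signatures in these tables concrete, here is a minimal sketch of a provider that accepts the new `**kwargs` and keyword-only `session_id` arguments. `MemoryProviderSketch` is a hypothetical stand-in, not the real `MemoryProvider` base class, and the return types and the in-memory storage are assumptions made for the example.

```python
from typing import Any, Dict, List


class MemoryProviderSketch:
    """Hypothetical stand-in; a real plugin would subclass MemoryProvider."""

    def __init__(self) -> None:
        # Turns are stored per session so recall never crosses sessions.
        self._turns: Dict[str, List[str]] = {}

    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs: Any) -> str:
        # Accepting **kwargs tolerates extra context passed by the caller.
        return f"{tool_name} called with {sorted(args)}"

    def prefetch(self, query: str, *, session_id: str = "") -> str:
        # Return stored turns for this session that mention the query.
        hits = [t for t in self._turns.get(session_id, []) if query in t]
        return "\n".join(hits)

    def sync_turn(self, user: str, assistant: str, *, session_id: str = "") -> None:
        self._turns.setdefault(session_id, []).append(f"{user} -> {assistant}")


provider = MemoryProviderSketch()
provider.sync_turn("favorite color?", "blue", session_id="s1")
recalled = provider.prefetch("color", session_id="s1")
```

Because `session_id` is keyword-only with a default of `""`, callers that omit it still work, while callers that pass it get per-session isolation.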
@@ -182,7 +182,7 @@ data_dir = Path("~/.hermes/my-provider").expanduser()

 ## Testing

-See `tests/agent/test_memory_plugin_e2e.py` for the complete E2E testing pattern using a real SQLite provider.
+See `tests/agent/test_memory_provider.py` and adjacent memory tests (`tests/agent/test_memory_session_switch.py`, `tests/agent/test_memory_user_id.py`, `tests/run_agent/test_memory_provider_init.py`) for end-to-end patterns.

 ```python
 from agent.memory_manager import MemoryManager
--- a/website/docs/developer-guide/provider-runtime.md
+++ b/website/docs/developer-guide/provider-runtime.md
@@ -193,7 +193,11 @@ Cron jobs **do** support fallback: `run_job()` reads `fallback_providers` (or le

 ### Test coverage

-See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.
+Fallback behavior is exercised across several suites:
+
+- `tests/run_agent/test_fallback_credential_isolation.py` — credential isolation between primary and fallback
+- `tests/hermes_cli/test_fallback_cmd.py` — the `/fallback` CLI command
+- `tests/gateway/test_fallback_eviction.py` — gateway eviction of failed providers

 ## Related docs

--- a/website/docs/developer-guide/web-search-provider-plugin.md
+++ b/website/docs/developer-guide/web-search-provider-plugin.md
@@ -6,7 +6,7 @@ description: "How to build a web-search/extract/crawl backend plugin for Hermes

 # Building a Web Search Provider Plugin

-Web-search provider plugins register a backend that services `web_search`, `web_extract`, and (optionally) deep-crawl tool calls. Built-in providers — Firecrawl, SearXNG, Tavily, Exa, Parallel, Brave Search (free tier), and DDGS — all ship as plugins under `plugins/web/<name>/`. You can add a new one, or override a bundled one, by dropping a directory next to them.
+Web-search provider plugins register a backend that services `web_search`, `web_extract`, and (optionally) deep-crawl tool calls. Built-in providers — Firecrawl, SearXNG, Tavily, Exa, Parallel, Brave Search (free tier), xAI, and DDGS — all ship as plugins under `plugins/web/<name>/`. You can add a new one, or override a bundled one, by dropping a directory next to them.

 :::tip
 Web search is one of several **backend plugins** Hermes supports. The others (with their own ABCs) are [Image Generation Provider Plugins](/developer-guide/image-gen-provider-plugin), [Video Generation Provider Plugins](/developer-guide/video-gen-provider-plugin), [Memory Provider Plugins](/developer-guide/memory-provider-plugin), [Context Engine Plugins](/developer-guide/context-engine-plugin), and [Model Provider Plugins](/developer-guide/model-provider-plugin). General tool/hook/CLI plugins live in [Build a Hermes Plugin](/guides/build-a-hermes-plugin).