mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

teknium1 a8bf414f4a feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill

New browser capabilities and a built-in skill for agent-driven web QA.

## New tool: browser_console

Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.

## Enhanced tool: browser_vision(annotate=True)

New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.

## Config: browser.record_sessions

Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default

## Built-in skill: dogfood

Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
   (Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence

Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template

## Tests

21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation

Addresses #315.

2026-03-08 21:28:12 -07:00

6.2 KiB


name: dogfood
description: Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports
version: 1.0.0
metadata: hermes

Dogfood: Systematic Web Application QA Testing

Overview

This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.

Prerequisites

- The browser toolset must be available (browser_navigate, browser_snapshot, browser_click, browser_type, browser_vision, browser_console, browser_scroll, browser_back, browser_press, browser_close)
- A target URL and testing scope from the user

Inputs

The user provides:

- Target URL: the entry point for testing
- Scope: which areas or features to focus on (or "full site" for comprehensive testing)
- Output directory (optional): where to save screenshots and the report (default: ./dogfood-output)

Workflow

Follow this 5-phase systematic workflow:

Phase 1: Plan

Create the output directory structure:

{output_dir}/
├── screenshots/       # Evidence screenshots
└── report.md          # Final report (generated in Phase 5)
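
The layout above can be created with a few lines of Python. This is a minimal sketch, assuming the default ./dogfood-output location; `prepare_output_dir` is a hypothetical helper name used for illustration and is not part of the skill or of hermes-agent.

```python
from pathlib import Path


def prepare_output_dir(output_dir: str = "./dogfood-output") -> Path:
    """Create {output_dir}/screenshots/ and return the path reserved for report.md.

    Hypothetical helper for illustration; the skill only prescribes the layout.
    """
    root = Path(output_dir)
    # parents=True also creates the root directory; exist_ok=True makes reruns safe.
    (root / "screenshots").mkdir(parents=True, exist_ok=True)
    # report.md is not written here; it is generated in Phase 5.
    return root / "report.md"
```

Returning the report path without creating the file keeps Phase 1 free of side effects beyond the directories themselves.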

Identify the testing scope based on user input.
Build a rough sitemap by planning which pages and features to test:
- Landing/home page
- Navigation links (header, footer, sidebar)
- Key user flows (sign up, login, search, checkout, etc.)
- Forms and interactive elements
- Edge cases (empty states, error pages, 404s)

Phase 2: Explore

For each page or feature in your plan:

Navigate to the page:

browser_navigate(url="https://example.com/page")

Take a snapshot to understand the DOM structure:
```
browser_snapshot()
```
Check the console for JavaScript errors:
```
browser_console(clear=true)
```
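Do this after every navigation and after every significant interaction; silent JS errors are high-value findings. Passing clear=true resets the console and error buffers, so the next call should report only messages logged since this one.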
Take an annotated screenshot to visually assess the page and identify interactive elements:
```
browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)
```
The annotate=true flag overlays numbered [N] labels on interactive elements. Each [N] maps to ref @eN for subsequent browser commands.
Test interactive elements systematically:
- Click buttons and links: browser_click(ref="@eN")
- Fill forms: browser_type(ref="@eN", text="test input")
- Test keyboard navigation: browser_press(key="Tab"), browser_press(key="Enter")
- Scroll through content: browser_scroll(direction="down")
- Test form validation with invalid inputs
- Test empty submissions
After each interaction, check for:
- Console errors: browser_console()
- Visual changes: browser_vision(question="What changed after the interaction?")
- Expected vs actual behavior
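
The [N] to @eN mapping used in the annotated-screenshot step can be written down as a tiny helper. This is only a sketch of that stated rule; `ref_for_label` is a hypothetical name, and it assumes labels are non-negative integers written either as "[7]" or as a bare "7".

```python
import re


def ref_for_label(label: str) -> str:
    """Convert an annotation label such as "[7]" into the element ref "@e7".

    Hypothetical helper: it encodes only the "[N] maps to @eN" rule stated
    in this skill, nothing about how the browser tools resolve refs.
    """
    match = re.fullmatch(r"\[(\d+)\]|(\d+)", label.strip())
    if match is None:
        raise ValueError(f"not an annotation label: {label!r}")
    number = match.group(1) or match.group(2)
    return f"@e{int(number)}"
```

The resulting string is what you would pass as the ref argument, for example browser_click(ref="@e7").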

Phase 3: Collect Evidence

For every issue found:

Take a screenshot showing the issue:
```
browser_vision(question="Capture and describe the issue visible on this page", annotate=false)
```
Save the screenshot_path from the response — you will reference it in the report.
Record the details:
- URL where the issue occurs
- Steps to reproduce
- Expected behavior
- Actual behavior
- Console errors (if any)
- Screenshot path
Classify the issue using the issue taxonomy (see references/issue-taxonomy.md):
- Severity: Critical / High / Medium / Low
- Category: Functional / Visual / Accessibility / Console / UX / Content
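
One way to keep these details consistent across issues is a small record type. The sketch below is illustrative only: the class and field names are assumptions, the `title` field is added because the report needs one, and the authoritative severity and category definitions remain those in references/issue-taxonomy.md.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Category(Enum):
    FUNCTIONAL = "Functional"
    VISUAL = "Visual"
    ACCESSIBILITY = "Accessibility"
    CONSOLE = "Console"
    UX = "UX"
    CONTENT = "Content"


@dataclass
class Issue:
    """One finding, holding the details Phase 3 asks you to record."""

    title: str  # assumption: a short title, needed later for the report
    url: str
    steps_to_reproduce: list[str]
    expected: str
    actual: str
    screenshot_path: str  # the screenshot_path returned by browser_vision
    severity: Severity
    category: Category
    console_errors: list[str] = field(default_factory=list)
```

Using enums rather than free-form strings means a misspelled severity fails immediately instead of surfacing as a stray bucket in the summary counts.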

Phase 4: Categorize

Review all collected issues.
De-duplicate — merge issues that are the same bug manifesting in different places.
Assign final severity and category to each issue.
Sort by severity (Critical first, then High, Medium, Low).
Count issues by severity and category for the executive summary.
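
The de-duplicate, sort, and count steps above can be sketched as follows. This is a simplified illustration, not the skill's prescribed method: it assumes each issue is a dict with "title", "severity", "category", and "urls" keys, and it treats two issues as the same bug when their titles match case-insensitively, whereas a real pass would rely on your judgment of the root cause.

```python
from collections import Counter

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def categorize(issues):
    """De-duplicate issues, sort them by severity, and count them.

    Assumption: issues with the same case-folded title are the same bug.
    Returns (ordered_issues, counts_by_severity, counts_by_category).
    """
    merged = {}
    for issue in issues:
        key = issue["title"].casefold()
        if key in merged:
            # Same bug seen elsewhere: keep one entry, collect the extra URLs.
            for url in issue["urls"]:
                if url not in merged[key]["urls"]:
                    merged[key]["urls"].append(url)
        else:
            # Copy so the caller's dicts and lists are not mutated.
            merged[key] = {**issue, "urls": list(issue["urls"])}

    # sorted() is stable, so issues of equal severity keep discovery order.
    ordered = sorted(
        merged.values(), key=lambda i: SEVERITY_ORDER.index(i["severity"])
    )
    by_severity = Counter(i["severity"] for i in ordered)
    by_category = Counter(i["category"] for i in ordered)
    return ordered, by_severity, by_category
```

The two counters feed directly into the executive summary in Phase 5.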

Phase 5: Report

Generate the final report using the template at templates/dogfood-report-template.md.

The report must include:

1. Executive summary with total issue count, breakdown by severity, and testing scope
2. Per-issue sections with:
   - Issue number and title
   - Severity and category badges
   - URL where observed
   - Description of the issue
   - Steps to reproduce
   - Expected vs actual behavior
   - Screenshot references (use MEDIA:<screenshot_path> for inline images)
   - Console errors if relevant
3. Summary table of all issues
4. Testing notes: what was tested, what was not, any blockers

Save the report to {output_dir}/report.md.
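
As a rough illustration of a per-issue section, the sketch below renders one issue as markdown. The heading style, field labels, and dict keys are assumptions made for this example; the actual structure should come from templates/dogfood-report-template.md. Only the MEDIA:<screenshot_path> convention is taken from this skill.

```python
def render_issue_section(number, issue):
    """Render one per-issue report section as a markdown string.

    The layout here is an assumption for illustration, not the template's.
    """
    lines = [
        f"### Issue {number}: {issue['title']}",
        f"Severity: {issue['severity']} | Category: {issue['category']}",
        f"URL: {issue['url']}",
        "",
        issue["description"],
        "",
        "Steps to reproduce:",
    ]
    lines += [f"{n}. {step}" for n, step in enumerate(issue["steps"], start=1)]
    lines += [
        "",
        f"Expected: {issue['expected']}",
        f"Actual: {issue['actual']}",
        "",
        # MEDIA:<screenshot_path> is the inline-image convention named in this skill.
        f"MEDIA:{issue['screenshot_path']}",
    ]
    # Console errors are optional, so the block is emitted only when present.
    if issue.get("console_errors"):
        lines += ["", "Console errors:"]
        lines += [f"- {error}" for error in issue["console_errors"]]
    return "\n".join(lines)
```

Joining the rendered sections with blank lines, after the executive summary, would give the body that is saved to {output_dir}/report.md.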

Tools Reference

Tool	Purpose
`browser_navigate`	Go to a URL
`browser_snapshot`	Get DOM text snapshot (accessibility tree)
`browser_click`	Click an element by ref (`@eN`) or text
`browser_type`	Type into an input field
`browser_scroll`	Scroll up/down on the page
`browser_back`	Go back in browser history
`browser_press`	Press a keyboard key
`browser_vision`	Screenshot + AI analysis; use `annotate=true` for element labels
`browser_console`	Get JS console output and errors
`browser_close`	Close the browser session

Tips

- Always check browser_console() after navigating and after significant interactions. Silent JS errors are among the most valuable findings.
- Use annotate=true with browser_vision when you need to reason about interactive element positions or when the snapshot refs are unclear.
- Test with both valid and invalid inputs; form validation bugs are common.
- Scroll through long pages; content below the fold may have rendering issues.
- Test navigation flows by clicking through multi-step processes end-to-end.
- Check responsive behavior by noting any layout issues visible in screenshots.
- Don't forget edge cases: empty states, very long text, special characters, rapid clicking.
- When reporting screenshots to the user, include MEDIA:<screenshot_path> so they can see the evidence inline.
