* refactor: remove browser_close tool — auto-cleanup handles it
The browser_close tool was called after only 9% of navigations (13/144
across 66 sessions), and always redundantly — cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.
Removing it saves one tool schema slot in every browser-enabled API call.
Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.
Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references
* fix: repeat browser_scroll 5x per call for meaningful page movement
Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.
Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.
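The repeat-and-break behaviour can be sketched as follows; `backend_scroll` is a stand-in for whichever backend's single-step scroll call is in play, assumed to return a truthy value on success:

```python
REPEATS = 5  # most backends move ~100px per step; 5x ≈ half a viewport

def scroll(backend_scroll, direction: str = "down") -> int:
    """Repeat a single-step scroll so one tool call moves meaningfully.

    `backend_scroll` is any callable taking a direction and returning
    True on success (hypothetical interface, not the real backend API).
    Returns the number of steps actually performed.
    """
    done = 0
    for _ in range(REPEATS):
        if not backend_scroll(direction):
            break  # break early on error, as on the agent-browser path
        done += 1
    return done
```

Because the loop sits above the backend interface, no per-backend scroll-amount configuration is needed.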
* feat: auto-return compact snapshot from browser_navigate
Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.
The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.
Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.
Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.
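A rough sketch of the navigate-then-best-effort-snapshot flow; `goto` and `snapshot` are placeholders for the real backend calls, not the actual API:

```python
def navigate(url: str, goto, snapshot) -> dict:
    """Navigate, then attach a compact snapshot on a best-effort basis.

    `goto` performs the navigation (its errors still propagate);
    `snapshot` produces the compact accessibility tree. Both are
    hypothetical stand-ins for the backend calls.
    """
    goto(url)
    result = {"url": url, "status": "ok"}
    try:
        # The snapshot is a bonus, not a requirement: navigation
        # succeeds even if snapshotting fails for any reason.
        result["snapshot"] = snapshot()
    except Exception:
        result["snapshot"] = None
    return result
```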
* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params
Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:
Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object
Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
If provider is omitted, the current main provider is pinned at creation
time so the job stays stable even if the user changes their default.
Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading
Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.
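The provider-pinning rule can be illustrated roughly like this (names are assumptions; the real validation lives in the cronjob backend):

```python
def resolve_model(model_arg, current_provider: str) -> dict:
    """Normalize the consolidated 'model' parameter at job creation.

    If provider is omitted, pin the current main provider so the job
    stays stable even if the user later changes their default.
    Sketch only: a bare string is also accepted as a model name.
    """
    if isinstance(model_arg, str):
        model_arg = {"model": model_arg}
    return {
        "model": model_arg["model"],
        "provider": model_arg.get("provider") or current_provider,
    }
```

Pinning at creation time, rather than resolving at run time, is what keeps a job's behaviour from silently shifting under a changed default.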
* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools
MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.
Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
| name | description | version |
|---|---|---|
| dogfood | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports | 1.0.0 |
# Dogfood: Systematic Web Application QA Testing

## Overview

This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.
## Prerequisites

- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`)
- A target URL and testing scope from the user
## Inputs

The user provides:

- Target URL — the entry point for testing
- Scope — what areas/features to focus on (or "full site" for comprehensive testing)
- Output directory (optional) — where to save screenshots and the report (default: `./dogfood-output`)
## Workflow

Follow this 5-phase systematic workflow:

### Phase 1: Plan

1. Create the output directory structure:

   ```
   {output_dir}/
   ├── screenshots/   # Evidence screenshots
   └── report.md      # Final report (generated in Phase 5)
   ```

2. Identify the testing scope based on user input.
3. Build a rough sitemap by planning which pages and features to test:
   - Landing/home page
   - Navigation links (header, footer, sidebar)
   - Key user flows (sign up, login, search, checkout, etc.)
   - Forms and interactive elements
   - Edge cases (empty states, error pages, 404s)
### Phase 2: Explore

For each page or feature in your plan:

1. Navigate to the page: `browser_navigate(url="https://example.com/page")`
2. Take a snapshot to understand the DOM structure: `browser_snapshot()`
3. Check the console for JavaScript errors: `browser_console(clear=true)`. Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings.
4. Take an annotated screenshot to visually assess the page and identify interactive elements: `browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)`. The `annotate=true` flag overlays numbered `[N]` labels on interactive elements. Each `[N]` maps to ref `@eN` for subsequent browser commands.
5. Test interactive elements systematically:
   - Click buttons and links: `browser_click(ref="@eN")`
   - Fill forms: `browser_type(ref="@eN", text="test input")`
   - Test keyboard navigation: `browser_press(key="Tab")`, `browser_press(key="Enter")`
   - Scroll through content: `browser_scroll(direction="down")`
   - Test form validation with invalid inputs
   - Test empty submissions
6. After each interaction, check for:
   - Console errors: `browser_console()`
   - Visual changes: `browser_vision(question="What changed after the interaction?")`
   - Expected vs actual behavior
### Phase 3: Collect Evidence

For every issue found:

1. Take a screenshot showing the issue: `browser_vision(question="Capture and describe the issue visible on this page", annotate=false)`. Save the `screenshot_path` from the response — you will reference it in the report.
2. Record the details:
   - URL where the issue occurs
   - Steps to reproduce
   - Expected behavior
   - Actual behavior
   - Console errors (if any)
   - Screenshot path
3. Classify the issue using the issue taxonomy (see `references/issue-taxonomy.md`):
   - Severity: Critical / High / Medium / Low
   - Category: Functional / Visual / Accessibility / Console / UX / Content
### Phase 4: Categorize

1. Review all collected issues.
2. De-duplicate — merge issues that are the same bug manifesting in different places.
3. Assign final severity and category to each issue.
4. Sort by severity (Critical first, then High, Medium, Low).
5. Count issues by severity and category for the executive summary.
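The de-duplicate/sort/count pass above can be sketched in Python; issue dicts carrying `title`, `category`, and `severity` keys are an assumption of this sketch, not a fixed schema:

```python
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def categorize(issues: list[dict]) -> tuple[list[dict], dict]:
    """De-duplicate by (title, category), sort by severity, count.

    Keeps the first occurrence of each duplicate, orders Critical
    first, and tallies per-severity counts for the executive summary.
    """
    unique: dict = {}
    for issue in issues:
        unique.setdefault((issue["title"], issue["category"]), issue)
    ordered = sorted(unique.values(),
                     key=lambda i: SEVERITY_ORDER[i["severity"]])
    counts: dict = {}
    for issue in ordered:
        counts[issue["severity"]] = counts.get(issue["severity"], 0) + 1
    return ordered, counts
```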
### Phase 5: Report

Generate the final report using the template at `templates/dogfood-report-template.md`.

The report must include:

- Executive summary with total issue count, breakdown by severity, and testing scope
- Per-issue sections with:
  - Issue number and title
  - Severity and category badges
  - URL where observed
  - Description of the issue
  - Steps to reproduce
  - Expected vs actual behavior
  - Screenshot references (use `MEDIA:<screenshot_path>` for inline images)
  - Console errors if relevant
- Summary table of all issues
- Testing notes — what was tested, what was not, any blockers

Save the report to `{output_dir}/report.md`.
## Tools Reference

| Tool | Purpose |
|---|---|
| `browser_navigate` | Go to a URL |
| `browser_snapshot` | Get DOM text snapshot (accessibility tree) |
| `browser_click` | Click an element by ref (`@eN`) or text |
| `browser_type` | Type into an input field |
| `browser_scroll` | Scroll up/down on the page |
| `browser_back` | Go back in browser history |
| `browser_press` | Press a keyboard key |
| `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
| `browser_console` | Get JS console output and errors |
## Tips

- Always check `browser_console()` after navigating and after significant interactions. Silent JS errors are among the most valuable findings.
- Use `annotate=true` with `browser_vision` when you need to reason about interactive element positions or when the snapshot refs are unclear.
- Test with both valid and invalid inputs — form validation bugs are common.
- Scroll through long pages — content below the fold may have rendering issues.
- Test navigation flows — click through multi-step processes end-to-end.
- Check responsive behavior by noting any layout issues visible in screenshots.
- Don't forget edge cases: empty states, very long text, special characters, rapid clicking.
- When reporting screenshots to the user, include `MEDIA:<screenshot_path>` so they can see the evidence inline.