mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-14 04:02:26 +00:00

Teknium b9bac87d5a feat(skills): declare platforms frontmatter for all 79 undeclared built-in skills

Completes the Windows-gating coverage for the built-in skills/ tree. Every
bundled SKILL.md now carries an explicit platforms: declaration so the
loader (agent.skill_utils.skill_matches_platform) can skip-load skills
that don't fit the current OS.

74 skills declared cross-platform (platforms: [linux, macos, windows]):
  Creative (16): ascii-art, ascii-video, architecture-diagram, baoyu-comic,
    baoyu-infographic, claude-design, creative-ideation, design-md,
    excalidraw, humanizer, manim-video, p5js, pixel-art,
    popular-web-designs, pretext, sketch, songwriting-and-ai-music,
    touchdesigner-mcp
  Autonomous agents: claude-code, codex, hermes-agent, opencode
  Data/devops: jupyter-live-kernel, kanban-orchestrator, kanban-worker,
    webhook-subscriptions, dogfood, codebase-inspection
  GitHub: github-auth, github-code-review, github-issues,
    github-pr-workflow, github-repo-management
  Media: gif-search, heartmula, songsee, spotify, youtube-content
  MCP / email / gaming / notes / smart-home: native-mcp, himalaya,
    pokemon-player, obsidian, openhue
  mlops (non-broken): weights-and-biases, huggingface-hub, llama-cpp,
    outlines, segment-anything-model, dspy, trl-fine-tuning
  Productivity: airtable, google-workspace, linear, maps, nano-pdf,
    notion, ocr-and-documents, powerpoint
  Red-teaming / research: godmode, arxiv, blogwatcher, llm-wiki,
    polymarket
  Software-dev: debugging-hermes-tui-commands, hermes-agent-skill-authoring,
    node-inspect-debugger, plan, requesting-code-review, spike,
    subagent-driven-development, systematic-debugging,
    test-driven-development, writing-plans
  Misc: yuanbao

5 skills gated from Windows (platforms: [linux, macos]):
  mlops/inference/vllm (serving-llms-vllm)
    vLLM is officially Linux-only; Windows requires WSL.
  mlops/training/axolotl
    Axolotl's flash-attn + deepspeed + bitsandbytes stack is Linux-first.
  mlops/training/unsloth
    Requires Triton + xformers + flash-attn — Linux only in practice.
  mlops/models/audiocraft (audiocraft-audio-generation)
    torchaudio ffmpeg backend + encodec dependencies are Linux-first.
  mlops/inference/obliteratus
    Research abliteration workflow; relies on Linux-focused pytorch
    kernels and MLX — no first-class Windows path.

Same strict-over-lenient policy as the optional-skills sweep: when the
underlying tool's Windows support is rough, missing, or WSL-only, gate the
skill. Easier to un-gate after verified Windows support lands than to leak
partial support that manifests as mid-task failures.

Combined with prior commits in this branch, every bundled SKILL.md
(skills/ + optional-skills/) now has a platforms: declaration.

2026-05-08 09:23:27 -07:00

6.1 KiB

name	description	version	platforms	metadata
dogfood	Exploratory QA of web apps: find bugs, evidence, reports.	1.0.0	linux, macos, windows	hermes

Dogfood: Systematic Web Application QA Testing

Overview

This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.

Prerequisites

Browser toolset must be available (browser_navigate, browser_snapshot, browser_click, browser_type, browser_vision, browser_console, browser_scroll, browser_back, browser_press)
A target URL and testing scope from the user

Inputs

The user provides:

Target URL — the entry point for testing
Scope — what areas/features to focus on (or "full site" for comprehensive testing)
Output directory (optional) — where to save screenshots and the report (default: ./dogfood-output)

Workflow

Follow this 5-phase systematic workflow:

Phase 1: Plan

Create the output directory structure:

{output_dir}/
├── screenshots/       # Evidence screenshots
└── report.md          # Final report (generated in Phase 5)

Identify the testing scope based on user input.
Build a rough sitemap by planning which pages and features to test:
- Landing/home page
- Navigation links (header, footer, sidebar)
- Key user flows (sign up, login, search, checkout, etc.)
- Forms and interactive elements
- Edge cases (empty states, error pages, 404s)
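The directory layout from Phase 1 can be created ahead of time. The following is a minimal sketch, assuming the default `./dogfood-output` location given under Inputs; the helper name `prepare_output_dir` is hypothetical and not part of the skill.

```python
from pathlib import Path


def prepare_output_dir(output_dir: str = "./dogfood-output") -> Path:
    """Create {output_dir}/screenshots/ and return the path reserved for report.md.

    Hypothetical helper: the report file itself is not created here,
    because the skill generates it in Phase 5.
    """
    root = Path(output_dir)
    # parents=True also creates output_dir; exist_ok=True makes reruns safe.
    (root / "screenshots").mkdir(parents=True, exist_ok=True)
    return root / "report.md"
```

Calling `prepare_output_dir()` twice is harmless, which helps when a test session is resumed against the same output directory.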

Phase 2: Explore

For each page or feature in your plan:

Navigate to the page:

browser_navigate(url="https://example.com/page")

Take a snapshot to understand the DOM structure:
```
browser_snapshot()
```
Check the console for JavaScript errors:
```
browser_console(clear=true)
```
Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings.
Take an annotated screenshot to visually assess the page and identify interactive elements:
```
browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)
```
The annotate=true flag overlays numbered [N] labels on interactive elements. Each [N] maps to ref @eN for subsequent browser commands.
Test interactive elements systematically:
- Click buttons and links: browser_click(ref="@eN")
- Fill forms: browser_type(ref="@eN", text="test input")
- Test keyboard navigation: browser_press(key="Tab"), browser_press(key="Enter")
- Scroll through content: browser_scroll(direction="down")
- Test form validation with invalid inputs
- Test empty submissions
After each interaction, check for:
- Console errors: browser_console()
- Visual changes: browser_vision(question="What changed after the interaction?")
- Expected vs actual behavior
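The `[N]` to `@eN` mapping described for annotated screenshots is mechanical, so it can be illustrated in a few lines. This is only a sketch: `labels_to_refs` is a hypothetical helper, and it assumes the labels appear in text exactly as `[N]` with a decimal number.

```python
import re


def labels_to_refs(text: str) -> list[str]:
    """Turn every [N] label found in text into the matching @eN ref, in order."""
    # Assumption: labels look like "[3]" or "[12]"; anything else is ignored.
    return [f"@e{n}" for n in re.findall(r"\[(\d+)\]", text)]
```

For example, a vision answer such as "Submit button is [3], search box is [12]" would yield the refs `@e3` and `@e12`, which can then be passed as the `ref` argument of `browser_click` or `browser_type`.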

Phase 3: Collect Evidence

For every issue found:

Take a screenshot showing the issue:
```
browser_vision(question="Capture and describe the issue visible on this page", annotate=false)
```
Save the screenshot_path from the response — you will reference it in the report.
Record the details:
- URL where the issue occurs
- Steps to reproduce
- Expected behavior
- Actual behavior
- Console errors (if any)
- Screenshot path
Classify the issue using the issue taxonomy (see references/issue-taxonomy.md):
- Severity: Critical / High / Medium / Low
- Category: Functional / Visual / Accessibility / Console / UX / Content
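One way to keep the Phase 3 details consistent across issues is a small record type. The sketch below is illustrative only: the class and field names are assumptions, and the severity and category values are copied from the list above rather than from references/issue-taxonomy.md, which may define more.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Category(Enum):
    FUNCTIONAL = "Functional"
    VISUAL = "Visual"
    ACCESSIBILITY = "Accessibility"
    CONSOLE = "Console"
    UX = "UX"
    CONTENT = "Content"


@dataclass
class Issue:
    """One finding, holding the details Phase 3 asks you to record."""
    title: str
    url: str
    steps_to_reproduce: list[str]
    expected: str
    actual: str
    severity: Severity
    category: Category
    console_errors: list[str] = field(default_factory=list)  # empty if none
    screenshot_path: Optional[str] = None  # screenshot_path from browser_vision
```

Using enums rather than free-form strings means a typo such as "Hihg" fails immediately instead of surfacing later in the report counts.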

Phase 4: Categorize

Review all collected issues.
De-duplicate — merge issues that are the same bug manifesting in different places.
Assign final severity and category to each issue.
Sort by severity (Critical first, then High, Medium, Low).
Count issues by severity and category for the executive summary.
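The Phase 4 steps can be sketched as a single pass over the collected issues. This is a rough illustration under stated assumptions: issues are plain dicts with `title`, `category`, `severity`, and `urls` keys (hypothetical names), and two issues count as duplicates when their normalized title and category match. Real de-duplication usually needs human judgment.

```python
from collections import Counter

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def categorize(issues):
    """De-duplicate, sort by severity, and count issues for the summary."""
    merged = {}
    for issue in issues:
        # Assumed duplicate key: same title (case/whitespace-insensitive) and category.
        key = (issue["title"].strip().lower(), issue["category"])
        if key not in merged:
            merged[key] = {**issue, "urls": list(issue["urls"])}
            continue
        kept = merged[key]
        # Same bug seen elsewhere: collect the extra URLs on the kept issue.
        for url in issue["urls"]:
            if url not in kept["urls"]:
                kept["urls"].append(url)
        # Keep the more severe of the two ratings.
        if SEVERITY_ORDER.index(issue["severity"]) < SEVERITY_ORDER.index(kept["severity"]):
            kept["severity"] = issue["severity"]
    ordered = sorted(merged.values(), key=lambda i: SEVERITY_ORDER.index(i["severity"]))
    by_severity = Counter(i["severity"] for i in ordered)
    by_category = Counter(i["category"] for i in ordered)
    return ordered, by_severity, by_category
```

The two counters feed directly into the executive summary, and the ordered list gives the per-issue sections their final numbering.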

Phase 5: Report

Generate the final report using the template at templates/dogfood-report-template.md.

The report must include:

Executive summary with total issue count, breakdown by severity, and testing scope
Per-issue sections with:
- Issue number and title
- Severity and category badges
- URL where observed
- Description of the issue
- Steps to reproduce
- Expected vs actual behavior
- Screenshot references (use MEDIA:<screenshot_path> for inline images)
- Console errors if relevant
Summary table of all issues
Testing notes — what was tested, what was not, any blockers

Save the report to {output_dir}/report.md.
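To make the report requirements concrete, here is a minimal rendering sketch. It does not reproduce templates/dogfood-report-template.md, whose contents are not shown here; the headings, the dict keys, and the reading of `MEDIA:<screenshot_path>` as the literal prefix `MEDIA:` followed by the path are all assumptions.

```python
def render_report(scope, issues):
    """Render a bare-bones Markdown report from already-sorted issue dicts."""
    lines = [
        "# Dogfood Report",
        "",
        f"Scope: {scope}",
        f"Total issues: {len(issues)}",
        "",
    ]
    for number, issue in enumerate(issues, start=1):
        lines.append(f"## Issue {number}: {issue['title']}")
        lines.append(f"Severity: {issue['severity']} | Category: {issue['category']}")
        lines.append(f"URL: {issue['url']}")
        # Assumed syntax: MEDIA: followed directly by the saved screenshot path.
        if issue.get("screenshot_path"):
            lines.append(f"MEDIA:{issue['screenshot_path']}")
        lines.append("")
    return "\n".join(lines)
```

The returned string would then be written to `{output_dir}/report.md`. A full report also needs the steps to reproduce, expected vs actual behavior, the summary table, and testing notes listed above.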

Tools Reference

Tool	Purpose
`browser_navigate`	Go to a URL
`browser_snapshot`	Get DOM text snapshot (accessibility tree)
`browser_click`	Click an element by ref (`@eN`) or text
`browser_type`	Type into an input field
`browser_scroll`	Scroll up/down on the page
`browser_back`	Go back in browser history
`browser_press`	Press a keyboard key
`browser_vision`	Screenshot + AI analysis; use `annotate=true` for element labels
`browser_console`	Get JS console output and errors

Tips

Always check browser_console() after navigating and after significant interactions. Silent JS errors are among the most valuable findings.
Use annotate=true with browser_vision when you need to reason about interactive element positions or when the snapshot refs are unclear.
Test with both valid and invalid inputs — form validation bugs are common.
Scroll through long pages — content below the fold may have rendering issues.
Test navigation flows — click through multi-step processes end-to-end.
Check responsive behavior: note any layout issues visible in screenshots, such as overflowing, overlapping, or clipped elements.
Don't forget edge cases: empty states, very long text, special characters, rapid clicking.
When reporting screenshots to the user, include MEDIA:<screenshot_path> so they can see the evidence inline.
