hermes-agent/website/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood.md

---
title: "Dogfood — Exploratory QA of web apps: find bugs, evidence, reports"
sidebar_label: Dogfood
description: "Exploratory QA of web apps: find bugs, evidence, reports"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Dogfood

Exploratory QA of web apps: find bugs, evidence, reports.

## Skill metadata

| Field | Value |
| --- | --- |
| **Source** | Bundled (installed by default) |
| **Path** | `skills/dogfood` |
| **Version** | 1.0.0 |
| **Tags** | qa, testing, browser, web, dogfood |

## Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

## Dogfood: Systematic Web Application QA Testing

### Overview

This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.

### Prerequisites

- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`)
- A target URL and testing scope from the user

### Inputs

The user provides:

1. **Target URL** — the entry point for testing
2. **Scope** — what areas/features to focus on (or "full site" for comprehensive testing)
3. **Output directory** (optional) — where to save screenshots and the report (default: `./dogfood-output`)

### Workflow

Follow this 5-phase systematic workflow:

#### Phase 1: Plan

1. Create the output directory structure:

   ```
   {output_dir}/
   ├── screenshots/       # Evidence screenshots
   └── report.md          # Final report (generated in Phase 5)
   ```

2. Identify the testing scope based on user input.
3. Build a rough sitemap by planning which pages and features to test:
   - Landing/home page
   - Navigation links (header, footer, sidebar)
   - Key user flows (sign up, login, search, checkout, etc.)
   - Forms and interactive elements
   - Edge cases (empty states, error pages, 404s)
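As a sketch, the directory setup in step 1 can be automated with the standard library. `prepare_output_dir` is a hypothetical helper name, not part of the skill; the demo runs against a throwaway temp directory so it has no side effects:

```python
import tempfile
from pathlib import Path

def prepare_output_dir(output_dir: str = "./dogfood-output") -> Path:
    """Create the {output_dir}/screenshots layout; report.md is written in Phase 5."""
    root = Path(output_dir)
    (root / "screenshots").mkdir(parents=True, exist_ok=True)
    return root

# Demo against a throwaway directory
demo_root = prepare_output_dir(tempfile.mkdtemp() + "/dogfood-output")
assert (demo_root / "screenshots").is_dir()
```

`exist_ok=True` makes the call idempotent, so re-running the skill against the same output directory is safe.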

#### Phase 2: Explore

For each page or feature in your plan:

1. Navigate to the page:

   ```
   browser_navigate(url="https://example.com/page")
   ```

2. Take a snapshot to understand the DOM structure:

   ```
   browser_snapshot()
   ```

3. Check the console for JavaScript errors:

   ```
   browser_console(clear=true)
   ```

   Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings.

4. Take an annotated screenshot to visually assess the page and identify interactive elements:

   ```
   browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)
   ```

   The `annotate=true` flag overlays numbered `[N]` labels on interactive elements. Each `[N]` maps to ref `@eN` for subsequent browser commands.

5. Test interactive elements systematically:
   - Click buttons and links: `browser_click(ref="@eN")`
   - Fill forms: `browser_type(ref="@eN", text="test input")`
   - Test keyboard navigation: `browser_press(key="Tab")`, `browser_press(key="Enter")`
   - Scroll through content: `browser_scroll(direction="down")`
   - Test form validation with invalid inputs
   - Test empty submissions

6. After each interaction, check for:
   - Console errors: `browser_console()`
   - Visual changes: `browser_vision(question="What changed after the interaction?")`
   - Expected vs. actual behavior

#### Phase 3: Collect Evidence

For every issue found:

1. Take a screenshot showing the issue:

   ```
   browser_vision(question="Capture and describe the issue visible on this page", annotate=false)
   ```

   Save the `screenshot_path` from the response — you will reference it in the report.

2. Record the details:
   - URL where the issue occurs
   - Steps to reproduce
   - Expected behavior
   - Actual behavior
   - Console errors (if any)
   - Screenshot path

3. Classify the issue using the issue taxonomy (see `references/issue-taxonomy.md`):
   - Severity: Critical / High / Medium / Low
   - Category: Functional / Visual / Accessibility / Console / UX / Content
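One way to keep these details consistent across findings is a small record type. This is an illustrative Python sketch — the skill itself prescribes no schema, and the `Issue` class and its field names are hypothetical:

```python
from dataclasses import dataclass, field

SEVERITIES = ("Critical", "High", "Medium", "Low")
CATEGORIES = ("Functional", "Visual", "Accessibility", "Console", "UX", "Content")

@dataclass
class Issue:
    title: str
    url: str
    steps_to_reproduce: list[str]
    expected: str
    actual: str
    severity: str = "Medium"
    category: str = "Functional"
    console_errors: list[str] = field(default_factory=list)
    screenshot_path: str = ""

    def __post_init__(self) -> None:
        # Reject labels outside the taxonomy so Phase 4 sorting stays reliable
        if self.severity not in SEVERITIES or self.category not in CATEGORIES:
            raise ValueError("unknown severity or category")

bug = Issue(
    title="Login form accepts empty submission",
    url="https://example.com/login",
    steps_to_reproduce=["Open /login", "Click Submit with no input"],
    expected="Validation error shown",
    actual="Page reloads silently",
    severity="High",
)
```

Validating severity and category at record-creation time means a typo surfaces immediately, not during report generation.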

#### Phase 4: Categorize

  1. Review all collected issues.
  2. De-duplicate — merge issues that are the same bug manifesting in different places.
  3. Assign final severity and category to each issue.
  4. Sort by severity (Critical first, then High, Medium, Low).
  5. Count issues by severity and category for the executive summary.
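Steps 4 and 5 can be sketched in plain Python. The severity ordering and sample issues here are illustrative, assuming issues were collected as dicts:

```python
from collections import Counter

SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

issues = [
    {"title": "JS TypeError on load", "severity": "High", "category": "Console"},
    {"title": "Footer link 404s", "severity": "Low", "category": "Functional"},
    {"title": "Checkout button dead", "severity": "Critical", "category": "Functional"},
]

# Sort Critical first, then High, Medium, Low
issues.sort(key=lambda i: SEVERITY_ORDER[i["severity"]])

# Counts for the executive summary
by_severity = Counter(i["severity"] for i in issues)
by_category = Counter(i["category"] for i in issues)
```

Because `SEVERITY_ORDER` maps labels to integers, `list.sort` needs no custom comparator, and the two `Counter` objects feed directly into the summary breakdown.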

#### Phase 5: Report

Generate the final report using the template at `templates/dogfood-report-template.md`.

The report must include:

1. Executive summary with total issue count, breakdown by severity, and testing scope
2. Per-issue sections with:
   - Issue number and title
   - Severity and category badges
   - URL where observed
   - Description of the issue
   - Steps to reproduce
   - Expected vs. actual behavior
   - Screenshot references (use `MEDIA:<screenshot_path>` for inline images)
   - Console errors if relevant
3. Summary table of all issues
4. Testing notes — what was tested, what was not, any blockers

Save the report to `{output_dir}/report.md`.
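A minimal sketch of the rendering step, assuming issues were collected as dicts; `render_report` is a hypothetical helper, and the bundled template at `templates/dogfood-report-template.md` is richer than this:

```python
def render_report(scope: str, issues: list[dict]) -> str:
    """Render a bare-bones report.md with the required sections."""
    lines = [
        "# Dogfood QA Report",
        "",
        f"**Scope:** {scope}",
        f"**Total issues:** {len(issues)}",
        "",
    ]
    for n, issue in enumerate(issues, start=1):
        lines += [
            f"## Issue {n}: {issue['title']}",
            f"- Severity: {issue['severity']} | Category: {issue['category']}",
            f"- URL: {issue['url']}",
            f"- Expected: {issue['expected']}",
            f"- Actual: {issue['actual']}",
        ]
        # Inline evidence via the MEDIA: convention described above
        if issue.get("screenshot_path"):
            lines.append(f"- Evidence: MEDIA:{issue['screenshot_path']}")
        lines.append("")
    return "\n".join(lines)

report = render_report("login flow", [{
    "title": "Empty login accepted",
    "severity": "High",
    "category": "Functional",
    "url": "https://example.com/login",
    "expected": "Validation error",
    "actual": "Silent reload",
    "screenshot_path": "screenshots/issue-1.png",
}])
```

The resulting string is what gets written to `{output_dir}/report.md`.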

### Tools Reference

| Tool | Purpose |
| --- | --- |
| `browser_navigate` | Go to a URL |
| `browser_snapshot` | Get DOM text snapshot (accessibility tree) |
| `browser_click` | Click an element by ref (`@eN`) or text |
| `browser_type` | Type into an input field |
| `browser_scroll` | Scroll up/down on the page |
| `browser_back` | Go back in browser history |
| `browser_press` | Press a keyboard key |
| `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
| `browser_console` | Get JS console output and errors |

### Tips

- Always check `browser_console()` after navigating and after significant interactions. Silent JS errors are among the most valuable findings.
- Use `annotate=true` with `browser_vision` when you need to reason about interactive element positions or when the snapshot refs are unclear.
- Test with both valid and invalid inputs — form validation bugs are common.
- Scroll through long pages — content below the fold may have rendering issues.
- Test navigation flows — click through multi-step processes end-to-end.
- Check responsive behavior by noting any layout issues visible in screenshots.
- Don't forget edge cases: empty states, very long text, special characters, rapid clicking.
- When reporting screenshots to the user, include `MEDIA:<screenshot_path>` so they can see the evidence inline.
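The edge-case tip can be made concrete with a reusable grab-bag of adversarial form inputs. This list is illustrative, not something the skill ships:

```python
# Illustrative adversarial inputs to feed into browser_type(ref="@eN", text=...);
# check browser_console() after submitting each one.
EDGE_CASE_INPUTS = [
    "",                           # empty submission
    "   ",                        # whitespace only
    "a" * 10_000,                 # very long text
    "<script>alert(1)</script>",  # HTML/JS injection attempt
    "'; DROP TABLE users;--",     # SQL-ish payload
    "naïve café 名前 😀",         # non-ASCII and emoji
    "-1",                         # numeric boundary for number fields
]
```

Any input that produces a console error, an unstyled error page, or a silent failure is worth recording as a Phase 3 finding.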