hermes-agent/tests
teknium1 a8bf414f4a feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill
New browser capabilities and a built-in skill for agent-driven web QA.

## New tool: browser_console

Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.

## Enhanced tool: browser_vision(annotate=True)

New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.

## Config: browser.record_sessions

Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default

## Built-in skill: dogfood

Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
   (Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence

Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template

## Tests

21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation

Addresses #315.
2026-03-08 21:28:12 -07:00
..
agent refactor: remove redundant 'openai' auxiliary provider, clean up docs 2026-03-08 18:50:26 -07:00
cron Merge PR #309: fix(timezone): timezone-aware now() for prompt, cron, and execute_code 2026-03-07 00:04:41 -08:00
fakes test: add HA integration tests with fake in-process server 2026-02-28 14:28:04 +03:00
gateway feat: add Signal messenger gateway platform (#405) 2026-03-08 20:20:35 -07:00
hermes_cli Merge pull request #732 from NousResearch/hermes/hermes-2cb83eed 2026-03-08 20:10:32 -07:00
honcho_integration Merge PR #193: add unit tests for 5 security/logic-critical modules (batch 4) 2026-03-04 19:35:01 -08:00
integration chore: remove all NOUS_API_KEY references 2026-03-08 17:45:38 -07:00
tools feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill 2026-03-08 21:28:12 -07:00
__init__.py A bit of restructuring for simplicity and organization 2025-10-01 23:29:25 +00:00
conftest.py fix(tests): isolate HERMES_HOME in tests and adjust log directory for debug session 2026-03-02 04:34:21 -08:00
test_413_compression.py fix: preflight context compression + error handler ordering for model switches 2026-03-04 14:42:41 -08:00
test_api_key_providers.py fix: add Kimi Code API support (api.kimi.com/coding/v1) 2026-03-07 21:00:12 -05:00
test_atomic_json_write.py refactor: extract atomic_json_write helper, add 24 checkpoint tests 2026-03-06 05:50:12 -08:00
test_auth_codex_provider.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_auth_nous_provider.py Fix nous refresh token rotation failure in case where api key mint/retrieval fails 2026-03-02 17:18:15 +11:00
test_auxiliary_config_bridge.py fix: harden auxiliary model config — gateway bridge, vision safety, tests 2026-03-08 18:06:47 -07:00
test_batch_runner_checkpoint.py refactor: extract atomic_json_write helper, add 24 checkpoint tests 2026-03-06 05:50:12 -08:00
test_cli_init.py chore: remove dead module stubs from test_cli_init.py 2026-03-08 05:35:02 -07:00
test_cli_model_command.py fix: improve /model user feedback + update docs 2026-03-08 06:13:12 -07:00
test_cli_provider_resolution.py fix: trust user-selected models with OpenAI Codex provider 2026-03-08 18:29:09 -07:00
test_codex_execution_paths.py fix: update mock agent signature to accept task_id after PR #419 2026-03-05 01:41:50 -08:00
test_codex_models.py Merge pull request #735 from NousResearch/hermes/hermes-f8d56335 2026-03-08 18:30:27 -07:00
test_external_credential_detection.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_flush_memories_codex.py refactor(cli): Finalize OpenAI Codex Integration with OAuth 2026-02-28 21:47:51 -08:00
test_hermes_state.py fix: add title validation — sanitize, length limit, control char stripping 2026-03-08 15:54:51 -07:00
test_honcho_client_config.py fix(honcho): auto-enable when API key is present 2026-03-01 03:12:37 -05:00
test_insights.py fix: deep review — prefix matching, tool_calls extraction, query perf, serialization 2026-03-06 14:50:57 -08:00
test_model_tools.py test: strengthen assertions across 3 more test files (batch 2) 2026-03-05 18:46:30 -08:00
test_provider_parity.py feat: default reasoning effort from xhigh to medium 2026-03-07 10:14:19 -08:00
test_resume_display.py feat: display previous messages when resuming a session in CLI 2026-03-08 17:45:45 -07:00
test_run_agent.py fix: allow Anthropic API URLs as custom OpenAI-compatible endpoints 2026-03-07 23:36:35 -08:00
test_run_agent_codex_responses.py refactor(cli): Finalize OpenAI Codex Integration with OAuth 2026-02-28 21:47:51 -08:00
test_runtime_provider_resolution.py fix: custom endpoint no longer leaks OPENROUTER_API_KEY (#560) 2026-03-06 17:16:14 -08:00
test_timezone.py fix(timezone): add timezone-aware clock across agent, cron, and execute_code 2026-03-03 18:23:40 +05:30
test_toolset_distributions.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_toolsets.py test: add unit tests for 8 untested modules 2026-02-26 13:27:58 +03:00
test_trajectory_compressor.py test: add 25 unit tests for trajectory_compressor 2026-02-28 21:28:28 +03:00
test_worktree.py fix: wire worktree flag into hermes CLI entry point + docs + tests 2026-03-07 21:05:40 -08:00