hermes-agent/website/docs/reference/toolsets-reference.md
Teknium 4b8272f549
feat(browser): add browser_dialog for native JS dialog handling
Ergonomic wrapper over CDP's Page.handleJavaScriptDialog that accepts
or dismisses alert/confirm/prompt/beforeunload dialogs blocking a page.
Unsticks pages whose JS thread is frozen by an unhandled dialog —
symptom is that browser_snapshot, browser_console, browser_click etc.
start hanging or erroring.

- action='accept'|'dismiss' required; prompt_text optional for prompt()
- target_id auto-resolves when exactly one page tab is open; with
  multiple page tabs, errors with the tab list so the agent picks one
- Shares browser_cdp's check_fn gate — only appears when CDP is
  reachable (/browser connect or browser.cdp_url in config). Hidden
  otherwise so backends that can't use it don't see it.
- Safe as a probe: CDP returns a clean 'No dialog is showing' error
  when nothing's pending, which we pass through verbatim

Dialog detection (knowing a dialog is open without being told) is NOT
included — it requires persistent CDP subscriptions per session, a
larger architectural change. Documented as a follow-up; agents infer
from symptoms and use this tool to recover.

Tests: 11 new unit tests against mock CDP server covering the wrapper
(action validation, auto-resolve with 0/1/multiple page targets,
explicit target_id accept/dismiss flow, prompt_text passthrough, shared
gate with browser_cdp, registry dispatch). E2E probe case against real
headless Chrome passes. Positive-case real-Chrome E2E is blocked by
Chromium's headless auto-dismiss behavior when no persistent listener
is attached — unit tests exercise the exact CDP protocol we send, so
the handling path is protocol-verified; headful real-browser usage
(the actual /browser connect case) keeps dialogs alive via the Chrome
UI.
2026-04-19 05:20:51 -07:00

| sidebar_position | title | description |
|---|---|---|
| 4 | Toolsets Reference | Reference for Hermes core, composite, platform, and dynamic toolsets |

Toolsets Reference

Toolsets are named bundles of tools that control what the agent can do. They're the primary mechanism for configuring tool availability per platform, per session, or per task.

How Toolsets Work

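Every tool belongs to at least one toolset, and a few belong to several (web_search, for example, is listed under browser, search, and web). When you enable a toolset, all tools in that bundle become available to the agent. Built-in toolsets come in three kinds: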

  • Core — A single logical group of related tools (e.g., file bundles read_file, write_file, patch, search_files)
  • Composite — Combines multiple core toolsets for a common scenario (e.g., debugging bundles file, terminal, and web tools)
  • Platform — A complete tool configuration for a specific deployment context (e.g., hermes-cli is the default for interactive CLI sessions)
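As a rough mental model, enabling several toolsets amounts to taking the union of their tool lists. The sketch below is illustrative only: the TOOLSETS dictionary and the enabled_tools function are hypothetical names rather than Hermes internals, and only the tool lists themselves are taken from the Core Toolsets table.

```python
# Hypothetical registry: toolset name -> tools it bundles.
# The tool lists are copied from the Core Toolsets table; the structure is assumed.
TOOLSETS = {
    "file": ["patch", "read_file", "search_files", "write_file"],
    "terminal": ["process", "terminal"],
    "web": ["web_extract", "web_search"],
}


def enabled_tools(toolset_names):
    """Return the sorted union of tools from every enabled toolset."""
    tools = set()
    for name in toolset_names:
        if name not in TOOLSETS:
            raise KeyError(f"unknown toolset: {name}")
        tools.update(TOOLSETS[name])
    return sorted(tools)


print(enabled_tools(["web", "file"]))
```

Using a set means a tool that appears in more than one enabled toolset is only exposed once.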

Configuring Toolsets

Per-session (CLI)

hermes chat --toolsets web,file,terminal
hermes chat --toolsets debugging        # composite — expands to file + terminal + web
hermes chat --toolsets all              # everything

Per-platform (config.yaml)

toolsets:
  - hermes-cli          # default for CLI
  # - hermes-telegram   # override for Telegram gateway

Interactive management

hermes tools                            # curses UI to enable/disable per platform

Or in-session:

/tools list
/tools disable browser
/tools enable rl

Core Toolsets

| Toolset | Tools | Purpose |
|---|---|---|
| browser | browser_back, browser_cdp, browser_click, browser_console, browser_dialog, browser_get_images, browser_navigate, browser_press, browser_scroll, browser_snapshot, browser_type, browser_vision, web_search | Full browser automation. Includes web_search as a fallback for quick lookups. browser_cdp and browser_dialog share a gate on a reachable CDP endpoint: both appear only when /browser connect is active or browser.cdp_url is set. |
| clarify | clarify | Ask the user a question when the agent needs clarification. |
| code_execution | execute_code | Run Python scripts that call Hermes tools programmatically. |
| cronjob | cronjob | Schedule and manage recurring tasks. |
| delegation | delegate_task | Spawn isolated subagent instances for parallel work. |
| feishu_doc | feishu_doc_read | Read Feishu/Lark document content. Used by the Feishu document-comment intelligent-reply handler. |
| feishu_drive | feishu_drive_add_comment, feishu_drive_list_comments, feishu_drive_list_comment_replies, feishu_drive_reply_comment | Feishu/Lark drive comment operations. Scoped to the comment agent; not exposed on hermes-cli or other messaging toolsets. |
| file | patch, read_file, search_files, write_file | File reading, writing, searching, and editing. |
| homeassistant | ha_call_service, ha_get_state, ha_list_entities, ha_list_services | Smart home control via Home Assistant. Only available when HASS_TOKEN is set. |
| image_gen | image_generate | Text-to-image generation via FAL.ai. |
| memory | memory | Persistent cross-session memory management. |
| messaging | send_message | Send messages to other platforms (Telegram, Discord, etc.) from within a session. |
| moa | mixture_of_agents | Multi-model consensus via Mixture of Agents. |
| rl | rl_check_status, rl_edit_config, rl_get_current_config, rl_get_results, rl_list_environments, rl_list_runs, rl_select_environment, rl_start_training, rl_stop_training, rl_test_inference | RL training environment management (Atropos). |
| search | web_search | Web search only (without extract). |
| session_search | session_search | Search past conversation sessions. |
| skills | skill_manage, skill_view, skills_list | Skill CRUD and browsing. |
| terminal | process, terminal | Shell command execution and background process management. |
| todo | todo | Task list management within a session. |
| tts | text_to_speech | Text-to-speech audio generation. |
| vision | vision_analyze | Image analysis via vision-capable models. |
| web | web_extract, web_search | Web search and page content extraction. |
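Two rows in the table describe availability gates: the homeassistant tools require HASS_TOKEN, and browser_cdp and browser_dialog require a CDP endpoint. The following is a simplified sketch of that idea, not the actual Hermes gate code. The function names are hypothetical, and "a CDP URL is configured" stands in for the real reachability check.

```python
import os

# Tools that the table says share the CDP gate.
CDP_GATED_TOOLS = {"browser_cdp", "browser_dialog"}


def homeassistant_available(env=os.environ):
    """Assumed gate for the homeassistant toolset: HASS_TOKEN must be non-empty."""
    return bool(env.get("HASS_TOKEN"))


def cdp_available(cdp_url):
    """Assumed shared gate: treats any configured CDP URL as reachable."""
    return bool(cdp_url)


def visible_browser_tools(all_tools, cdp_url):
    """Hide CDP-gated tools unless the shared gate passes; keep the rest."""
    return [
        tool for tool in all_tools
        if tool not in CDP_GATED_TOOLS or cdp_available(cdp_url)
    ]


tools = ["browser_click", "browser_cdp", "browser_dialog"]
print(visible_browser_tools(tools, None))
print(visible_browser_tools(tools, "ws://localhost:9222"))
```

Because both gated tools consult the same check, they appear and disappear together, which matches the behavior the browser row describes.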

Composite Toolsets

These expand to multiple core toolsets, providing a convenient shorthand for common scenarios:

| Toolset | Expands to | Use case |
|---|---|---|
| debugging | web + file + process, terminal (via includes); effectively patch, process, read_file, search_files, terminal, web_extract, web_search, write_file | Debug sessions: file access, terminal, and web research without browser or delegation overhead. |
| safe | image_generate, vision_analyze, web_extract, web_search | Read-only research and media generation. No file writes, no terminal access, no code execution. Good for untrusted or constrained environments. |
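The debugging row mentions expansion via includes. One way to picture that expansion is a recursive walk that merges a toolset's own tools with those of every toolset it includes. The "tools"/"includes" shape and the split of debugging into direct tools versus included toolsets are assumptions made for illustration; only the final flattened list is confirmed by the table.

```python
# Assumed definition shape: each toolset has direct "tools" and "includes".
TOOLSETS = {
    "file": {"tools": ["patch", "read_file", "search_files", "write_file"], "includes": []},
    "web": {"tools": ["web_extract", "web_search"], "includes": []},
    # Assumption: process and terminal are listed directly, web and file are included.
    "debugging": {"tools": ["process", "terminal"], "includes": ["web", "file"]},
}


def resolve(name, seen=None):
    """Recursively expand a toolset into a flat, sorted list of tool names."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against include cycles
        return []
    seen.add(name)
    spec = TOOLSETS[name]
    tools = set(spec["tools"])
    for child in spec["includes"]:
        tools.update(resolve(child, seen))
    return sorted(tools)


print(resolve("debugging"))
```

The result matches the "effectively" list in the debugging row: patch, process, read_file, search_files, terminal, web_extract, web_search, write_file.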

Platform Toolsets

Platform toolsets define the complete tool configuration for a deployment target. Most messaging platforms use the same set as hermes-cli:

| Toolset | Differences from hermes-cli |
|---|---|
| hermes-cli | Full toolset: all 36 core tools including clarify. The default for interactive CLI sessions. |
| hermes-acp | Drops clarify, cronjob, image_generate, send_message, text_to_speech, and the homeassistant tools. Focused on coding tasks in an IDE context. |
| hermes-api-server | Drops clarify, send_message, and text_to_speech. Keeps everything else, which makes it suitable for programmatic access where user interaction isn't possible. |
| hermes-telegram | Same as hermes-cli. |
| hermes-discord | Same as hermes-cli. |
| hermes-slack | Same as hermes-cli. |
| hermes-whatsapp | Same as hermes-cli. |
| hermes-signal | Same as hermes-cli. |
| hermes-matrix | Same as hermes-cli. |
| hermes-mattermost | Same as hermes-cli. |
| hermes-email | Same as hermes-cli. |
| hermes-sms | Same as hermes-cli. |
| hermes-bluebubbles | Same as hermes-cli. |
| hermes-dingtalk | Same as hermes-cli. |
| hermes-feishu | Same as hermes-cli. Note: the feishu_doc / feishu_drive toolsets are used only by the document-comment handler, not by the regular Feishu chat adapter. |
| hermes-qqbot | Same as hermes-cli. |
| hermes-wecom | Same as hermes-cli. |
| hermes-wecom-callback | Same as hermes-cli. |
| hermes-weixin | Same as hermes-cli. |
| hermes-homeassistant | Same as hermes-cli, plus the homeassistant toolset always on. |
| hermes-webhook | Same as hermes-cli. |
| hermes-gateway | Internal gateway orchestrator toolset: the broadest possible tool set, used when the gateway needs to accept any message source. |
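The "Differences from hermes-cli" column can be read as set arithmetic against the hermes-cli tool list. The sketch below illustrates this for hermes-api-server. HERMES_CLI is a deliberately shortened, hypothetical stand-in for the full list, and derive_platform is not a Hermes function.

```python
# Hypothetical, abbreviated stand-in for the hermes-cli tool list.
HERMES_CLI = {
    "clarify", "read_file", "send_message",
    "terminal", "text_to_speech", "web_search",
}

# Tools the table says hermes-api-server drops.
API_SERVER_DROPS = {"clarify", "send_message", "text_to_speech"}


def derive_platform(base, dropped):
    """Return the base tool set minus the tools a platform drops."""
    return base - dropped


print(sorted(derive_platform(HERMES_CLI, API_SERVER_DROPS)))
```

The dropped tools are the ones that need a live user or an outbound messaging channel, which is consistent with the table's note about programmatic access.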

Dynamic Toolsets

MCP server toolsets

Each configured MCP server generates an mcp-<server> toolset at runtime. For example, if you configure a github MCP server, an mcp-github toolset is created containing all tools that server exposes.

# config.yaml
mcp_servers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]

This creates an mcp-github toolset you can reference in --toolsets or platform configs.
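The naming rule can be sketched as a simple mapping from the keys under mcp_servers to toolset names. This assumes the config has already been parsed into a dictionary; mcp_toolset_names is a hypothetical helper, not part of Hermes.

```python
def mcp_toolset_names(config):
    """Derive mcp-<server> toolset names from a parsed config mapping."""
    servers = config.get("mcp_servers") or {}
    return [f"mcp-{server}" for server in servers]


# Parsed equivalent of the config.yaml snippet above.
config = {
    "mcp_servers": {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
        }
    }
}

print(mcp_toolset_names(config))
```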

Plugin toolsets

Plugins can register their own toolsets via ctx.register_tool() during plugin initialization. These appear alongside built-in toolsets and can be enabled/disabled the same way.

Custom toolsets

Define custom toolsets in config.yaml to create project-specific bundles:

toolsets:
  - hermes-cli
custom_toolsets:
  data-science:
    - file
    - terminal
    - code_execution
    - web
    - vision

Wildcards

  • all or * — expands to every registered toolset (built-in + dynamic + plugin)
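Wildcard expansion might look roughly like the following. This is a sketch under the assumption that the requested names arrive as a list and that the registry can enumerate every toolset name; expand_wildcards is a hypothetical helper.

```python
def expand_wildcards(requested, registered):
    """Replace 'all' or '*' with every registered toolset, de-duplicating in order."""
    expanded = []
    for name in requested:
        names = registered if name in ("all", "*") else [name]
        for candidate in names:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Built-in, dynamic (MCP), and plugin toolsets are all just registered names here.
registered = ["file", "web", "mcp-github"]
print(expand_wildcards(["*"], registered))
print(expand_wildcards(["web", "all"], registered))
```

Explicitly named toolsets keep their position, and the wildcard only appends names that are not already present.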

Relationship to hermes tools

The hermes tools command provides a curses-based UI for toggling individual tools on or off per platform. This operates at the tool level (finer than toolsets) and persists to config.yaml. Disabled tools are filtered out even if their toolset is enabled.
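The precedence described here, where a tool-level disable wins over an enabled toolset, can be sketched as a final filter applied after toolsets are expanded. The effective_tools function and the idea of passing disabled tools as a set are assumptions for illustration, not the actual config.yaml schema.

```python
def effective_tools(toolset_tools, disabled_tools):
    """Drop individually disabled tools from what the enabled toolsets provide."""
    return [tool for tool in toolset_tools if tool not in disabled_tools]


# The file toolset is enabled, but write_file has been switched off individually.
file_tools = ["patch", "read_file", "search_files", "write_file"]
print(effective_tools(file_tools, {"write_file"}))
```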

See also: Tools Reference for the complete list of individual tools and their parameters.