refactor: remove browser_close tool — auto-cleanup handles it (#5792)

* refactor: remove browser_close tool — auto-cleanup handles it

The browser_close tool was called after only 9% of browser navigations
(13 of 144, across 66 sessions), and always redundantly: cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.

Removing it saves one tool schema slot in every browser-enabled API call.

Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.
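
A minimal sketch of the cleanup idea described above, assuming two hypothetical in-memory registries (`active_sessions` and `camofox_sessions`) as stand-ins for the real session dicts. The names and return value are illustrative only, not the project's actual API:

```python
import logging
from typing import Dict, List, Optional

logger = logging.getLogger(__name__)

# Hypothetical stand-ins for the two separate session registries.
active_sessions: Dict[str, dict] = {}
camofox_sessions: Dict[str, dict] = {}


def cleanup_task(task_id: Optional[str] = None, camofox_mode: bool = False) -> List[str]:
    """Release every browser session owned by task_id; return what was closed."""
    task_id = task_id or "default"
    closed: List[str] = []
    if camofox_mode:
        try:
            # Camofox sessions live in their own dict, so they need their own pass.
            if camofox_sessions.pop(task_id, None) is not None:
                closed.append("camofox")
        except Exception as exc:  # best-effort: cleanup must not raise
            logger.debug("Camofox cleanup for task %s: %s", task_id, exc)
    if active_sessions.pop(task_id, None) is not None:
        closed.append("default-backend")
    return closed
```

Calling `cleanup_task` a second time for the same task is a no-op, which is what makes an explicit close tool redundant once end-of-task cleanup always runs.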

Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
  camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
  acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references

* fix: repeat browser_scroll 5x per call for meaningful page movement

Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.

Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.
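
The repeat-and-break-early pattern can be sketched independently of any backend. Here `scroll_once` is a hypothetical callable standing in for a backend's single-step scroll, and the result dicts only mirror the shape shown in the diff:

```python
from typing import Callable, Dict, List

SCROLL_REPEATS = 5  # repeat count taken from the commit message


def scroll_repeated(
    scroll_once: Callable[[str], Dict],
    direction: str,
    repeats: int = SCROLL_REPEATS,
) -> Dict:
    """Call a backend's single-step scroll several times, stopping on failure."""
    if direction not in ("up", "down"):
        return {"success": False,
                "error": f"Invalid direction '{direction}'. Use 'up' or 'down'."}
    for _ in range(repeats):
        result = scroll_once(direction)
        if not result.get("success"):
            # Break early: report the first failure instead of continuing.
            return {"success": False,
                    "error": result.get("error", f"Failed to scroll {direction}")}
    return {"success": True, "scrolled": direction}


# Usage with a fake backend that records each call.
calls: List[str] = []


def fake_scroll(direction: str) -> Dict:
    calls.append(direction)
    return {"success": True}


outcome = scroll_repeated(fake_scroll, "down")
```

Because the wrapper only depends on the single-step call, no backend needs its own scroll-distance setting.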

* feat: auto-return compact snapshot from browser_navigate

Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.

The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.

Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.
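
As an illustration of the "bonus, not a requirement" behaviour, the following sketch wraps a navigation result with a best-effort snapshot. The `navigate` and `take_snapshot` callables and the `max_chars` cut-off are assumptions for the example; the real code truncates through its own helper rather than a plain slice:

```python
import json
from typing import Callable, Dict


def navigate_with_snapshot(
    navigate: Callable[[str], Dict],
    take_snapshot: Callable[[], Dict],
    url: str,
    max_chars: int = 8000,  # assumed threshold, for illustration only
) -> str:
    """Navigate, then attach a compact snapshot if one can be taken."""
    response = dict(navigate(url))
    if not response.get("success"):
        return json.dumps(response)
    try:
        snap = take_snapshot()
        text = snap.get("snapshot", "")
        if len(text) > max_chars:
            text = text[:max_chars]  # plain truncation stands in for summarizing
        response["snapshot"] = text
        response["element_count"] = len(snap.get("refs") or {})
    except Exception:
        pass  # navigation already succeeded; the snapshot is optional
    return json.dumps(response)
```

A caller can check for the `snapshot` key and fall back to a separate snapshot call only when it is missing.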

Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.

* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params

Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:

Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object

Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
  If provider is omitted, the current main provider is pinned at creation
  time so the job stays stable even if the user changes their default.

Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading

Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.
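
The provider-pinning rule for the consolidated 'model' object can be sketched as a pure function. In this sketch the current default provider is passed in as `default_provider` instead of being read from config, which is an assumption made to keep the example self-contained:

```python
from typing import Any, Dict, Optional, Tuple


def resolve_model_override(
    model_obj: Optional[Dict[str, Any]],
    default_provider: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider, model); pin default_provider when none is given."""
    if not isinstance(model_obj, dict):
        return (None, None)
    model = (model_obj.get("model") or "").strip() or None
    provider = (model_obj.get("provider") or "").strip() or None
    if model and not provider:
        # Pin the provider at creation time so later default changes
        # do not alter how the stored job resolves its model.
        provider = default_provider
    return (provider, model)


pinned = resolve_model_override({"model": "claude-sonnet-4"}, "anthropic")
explicit = resolve_model_override(
    {"model": "anthropic/claude-sonnet-4", "provider": "openrouter"}, "anthropic"
)
```

Here `pinned` carries the default provider, while `explicit` keeps the provider the caller supplied.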

* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools

MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.

Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
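
The opt-in behaviour can be pictured with a small resolver. The tool lists and the `resolve_tools` function below are hypothetical and much simpler than the real toolset machinery; they only show that an opt-in toolset contributes tools when it is explicitly enabled:

```python
from typing import Dict, Iterable, List

# Hypothetical, heavily trimmed tool lists for illustration.
CORE_TOOLS: List[str] = ["read_file", "web_search", "browser_navigate"]
OPT_IN_TOOLSETS: Dict[str, List[str]] = {"moa": ["mixture_of_agents"]}


def resolve_tools(enabled_toolsets: Iterable[str]) -> List[str]:
    """Core tools are always present; opt-in toolsets only when enabled."""
    enabled = set(enabled_toolsets)
    tools = list(CORE_TOOLS)
    for name, members in OPT_IN_TOOLSETS.items():
        if name in enabled:
            tools.extend(members)
    return tools
```

Keeping the opt-in tool out of the core list is what prevents it from appearing merely because an API key happens to be set.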
Teknium 2026-04-07 03:28:44 -07:00 committed by GitHub
parent cafdfd3654
commit 8b861b77c1
GPG key ID: B5690EEEBB952194
15 changed files with 136 additions and 142 deletions

View file

@@ -39,7 +39,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
     "browser_scroll": "execute",
     "browser_press": "execute",
     "browser_back": "execute",
-    "browser_close": "execute",
     "browser_get_images": "read",
     # Agent internals
     "delegate_task": "execute",

View file

@@ -890,8 +890,6 @@ def get_cute_tool_message(
         return _wrap(f"┊ ◀️ back {dur}")
     if tool_name == "browser_press":
         return _wrap(f"┊ ⌨️ press {args.get('key', '?')} {dur}")
-    if tool_name == "browser_close":
-        return _wrap(f"┊ 🚪 close browser {dur}")
     if tool_name == "browser_get_images":
         return _wrap(f"┊ 🖼️ images extracting {dur}")
     if tool_name == "browser_vision":

View file

@@ -744,7 +744,6 @@ def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -
         "browser_type",
         "browser_scroll",
         "browser_console",
-        "browser_close",
         "browser_press",
         "browser_get_images",
         "browser_vision",

View file

@@ -539,7 +539,7 @@ platform_toolsets:
 # terminal - terminal, process
 # file - read_file, write_file, patch, search
 # browser - browser_navigate, browser_snapshot, browser_click, browser_type,
-# browser_scroll, browser_back, browser_press, browser_close,
+# browser_scroll, browser_back, browser_press,
 # browser_get_images, browser_vision (requires BROWSERBASE_API_KEY)
 # vision - vision_analyze (requires OPENROUTER_API_KEY)
 # image_gen - image_generate (requires FAL_KEY)

View file

@@ -211,7 +211,7 @@ _LEGACY_TOOLSET_MAP = {
     "browser_tools": [
         "browser_navigate", "browser_snapshot", "browser_click",
         "browser_type", "browser_scroll", "browser_back",
-        "browser_press", "browser_close", "browser_get_images",
+        "browser_press", "browser_get_images",
         "browser_vision", "browser_console"
     ],
     "cronjob_tools": ["cronjob"],

View file

@@ -16,7 +16,7 @@ This skill guides you through systematic exploratory QA testing of web applicati
 ## Prerequisites
-- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`, `browser_close`)
+- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`)
 - A target URL and testing scope from the user
 ## Inputs
@@ -148,7 +148,6 @@ Save the report to `{output_dir}/report.md`.
 | `browser_press` | Press a keyboard key |
 | `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
 | `browser_console` | Get JS console output and errors |
-| `browser_close` | Close the browser session |
 ## Tips

View file

@@ -39,7 +39,7 @@ class TestHermesApiServerToolset:
         tools = resolve_toolset("hermes-api-server")
         for tool in ["browser_navigate", "browser_snapshot", "browser_click",
                      "browser_type", "browser_scroll", "browser_back",
-                     "browser_press", "browser_close"]:
+                     "browser_press"]:
             assert tool in tools, f"Missing browser tool: {tool}"
     def test_toolset_includes_homeassistant_tools(self):

View file

@@ -65,18 +65,6 @@ class TestBrowserCleanup:
         mock_stop.assert_called_once_with("task-1")
         mock_run.assert_called_once_with("task-1", "close", [], timeout=10)
-    def test_browser_close_delegates_to_cleanup_browser(self):
-        import json
-        browser_tool = self.browser_tool
-        browser_tool._active_sessions["task-2"] = {"session_name": "sess-2"}
-        with patch("tools.browser_tool.cleanup_browser") as mock_cleanup:
-            result = json.loads(browser_tool.browser_close("task-2"))
-        assert result == {"success": True, "closed": True}
-        mock_cleanup.assert_called_once_with("task-2")
     def test_emergency_cleanup_clears_all_tracking_state(self):
         browser_tool = self.browser_tool
         browser_tool._cleanup_done = False

View file

@@ -240,6 +240,25 @@ def camofox_navigate(url: str, task_id: Optional[str] = None) -> str:
                 "Browser is visible via VNC. "
                 "Share this link with the user so they can watch the browser live."
             )
+        # Auto-take a compact snapshot so the model can act immediately
+        try:
+            snap_data = _get(
+                f"/tabs/{session['tab_id']}/snapshot",
+                params={"userId": session["user_id"]},
+            )
+            snapshot_text = snap_data.get("snapshot", "")
+            from tools.browser_tool import (
+                SNAPSHOT_SUMMARIZE_THRESHOLD,
+                _truncate_snapshot,
+            )
+            if len(snapshot_text) > SNAPSHOT_SUMMARIZE_THRESHOLD:
+                snapshot_text = _truncate_snapshot(snapshot_text)
+            result["snapshot"] = snapshot_text
+            result["element_count"] = snap_data.get("refsCount", 0)
+        except Exception:
+            pass  # Navigation succeeded; snapshot is a bonus
         return json.dumps(result)
     except requests.HTTPError as e:
         return json.dumps({"success": False, "error": f"Navigation failed: {e}"})

View file

@@ -518,7 +518,7 @@ atexit.register(_stop_browser_cleanup_thread)
 BROWSER_TOOL_SCHEMAS = [
     {
         "name": "browser_navigate",
-        "description": "Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need to interact with a page (click, fill forms, dynamic content).",
+        "description": "Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need to interact with a page (click, fill forms, dynamic content). Returns a compact page snapshot with interactive elements and ref IDs — no need to call browser_snapshot separately after navigating.",
         "parameters": {
             "type": "object",
             "properties": {
@@ -532,7 +532,7 @@ BROWSER_TOOL_SCHEMAS = [
     },
     {
         "name": "browser_snapshot",
-        "description": "Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: complete page content. Snapshots over 8000 chars are truncated or LLM-summarized. Requires browser_navigate first.",
+        "description": "Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: complete page content. Snapshots over 8000 chars are truncated or LLM-summarized. Requires browser_navigate first. Note: browser_navigate already returns a compact snapshot — use this to refresh after interactions that change the page, or with full=true for complete content.",
         "parameters": {
             "type": "object",
             "properties": {
@@ -615,15 +615,7 @@ BROWSER_TOOL_SCHEMAS = [
             "required": ["key"]
         }
     },
-    {
-        "name": "browser_close",
-        "description": "Close the browser session and release resources. Call this when done with browser tasks to free up Browserbase session quota.",
-        "parameters": {
-            "type": "object",
-            "properties": {},
-            "required": []
-        }
-    },
     {
         "name": "browser_get_images",
         "description": "Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first.",
@@ -1230,6 +1222,21 @@ def browser_navigate(url: str, task_id: Optional[str] = None) -> str:
             )
             response["stealth_features"] = active_features
+        # Auto-take a compact snapshot so the model can act immediately
+        # without a separate browser_snapshot call.
+        try:
+            snap_result = _run_browser_command(effective_task_id, "snapshot", ["-c"])
+            if snap_result.get("success"):
+                snap_data = snap_result.get("data", {})
+                snapshot_text = snap_data.get("snapshot", "")
+                refs = snap_data.get("refs", {})
+                if len(snapshot_text) > SNAPSHOT_SUMMARIZE_THRESHOLD:
+                    snapshot_text = _truncate_snapshot(snapshot_text)
+                response["snapshot"] = snapshot_text
+                response["element_count"] = len(refs) if refs else 0
+        except Exception as e:
+            logger.debug("Auto-snapshot after navigate failed: %s", e)
         return json.dumps(response, ensure_ascii=False)
     else:
         return json.dumps({
@@ -1376,12 +1383,6 @@ def browser_scroll(direction: str, task_id: Optional[str] = None) -> str:
     Returns:
         JSON string with scroll result
     """
-    if _is_camofox_mode():
-        from tools.browser_camofox import camofox_scroll
-        return camofox_scroll(direction, task_id)
-    effective_task_id = task_id or "default"
     # Validate direction
     if direction not in ["up", "down"]:
         return json.dumps({
@@ -1389,19 +1390,34 @@ def browser_scroll(direction: str, task_id: Optional[str] = None) -> str:
             "error": f"Invalid direction '{direction}'. Use 'up' or 'down'."
         }, ensure_ascii=False)
-    result = _run_browser_command(effective_task_id, "scroll", [direction])
-    if result.get("success"):
-        return json.dumps({
-            "success": True,
-            "scrolled": direction
-        }, ensure_ascii=False)
-    else:
-        return json.dumps({
-            "success": False,
-            "error": result.get("error", f"Failed to scroll {direction}")
-        }, ensure_ascii=False)
+    # Repeat the scroll 5 times to get meaningful page movement.
+    # Most backends scroll ~100px per call, which is barely visible.
+    # 5x gives roughly half a viewport of travel, backend-agnostic.
+    _SCROLL_REPEATS = 5
+    if _is_camofox_mode():
+        from tools.browser_camofox import camofox_scroll
+        result = None
+        for _ in range(_SCROLL_REPEATS):
+            result = camofox_scroll(direction, task_id)
+        return result
+    effective_task_id = task_id or "default"
+    result = None
+    for _ in range(_SCROLL_REPEATS):
+        result = _run_browser_command(effective_task_id, "scroll", [direction])
+        if not result.get("success"):
+            return json.dumps({
+                "success": False,
+                "error": result.get("error", f"Failed to scroll {direction}")
+            }, ensure_ascii=False)
+    return json.dumps({
+        "success": True,
+        "scrolled": direction
+    }, ensure_ascii=False)
 def browser_back(task_id: Optional[str] = None) -> str:
     """
@@ -1463,33 +1479,7 @@ def browser_press(key: str, task_id: Optional[str] = None) -> str:
         }, ensure_ascii=False)
-def browser_close(task_id: Optional[str] = None) -> str:
-    """
-    Close the browser session.
-    Args:
-        task_id: Task identifier for session isolation
-    Returns:
-        JSON string with close result
-    """
-    if _is_camofox_mode():
-        from tools.browser_camofox import camofox_close
-        return camofox_close(task_id)
-    effective_task_id = task_id or "default"
-    with _cleanup_lock:
-        had_session = effective_task_id in _active_sessions
-    cleanup_browser(effective_task_id)
-    response = {
-        "success": True,
-        "closed": True,
-    }
-    if not had_session:
-        response["warning"] = "Session may not have been active"
-    return json.dumps(response, ensure_ascii=False)
 def browser_console(clear: bool = False, expression: Optional[str] = None, task_id: Optional[str] = None) -> str:
@@ -1942,7 +1932,7 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     Clean up browser session for a task.
     Called automatically when a task completes or when inactivity timeout is reached.
-    Closes both the agent-browser session and the Browserbase session.
+    Closes both the agent-browser/Browserbase session and Camofox sessions.
     Args:
         task_id: Task identifier to clean up
@@ -1950,6 +1940,14 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     if task_id is None:
         task_id = "default"
+    # Also clean up Camofox session if running in Camofox mode
+    if _is_camofox_mode():
+        try:
+            from tools.browser_camofox import camofox_close
+            camofox_close(task_id)
+        except Exception as e:
+            logger.debug("Camofox cleanup for task %s: %s", task_id, e)
     logger.debug("cleanup_browser called for task_id: %s", task_id)
     logger.debug("Active sessions: %s", list(_active_sessions.keys()))
@@ -2168,14 +2166,7 @@ registry.register(
     check_fn=check_browser_requirements,
     emoji="⌨️",
 )
-registry.register(
-    name="browser_close",
-    toolset="browser",
-    schema=_BROWSER_SCHEMA_MAP["browser_close"],
-    handler=lambda args, **kw: browser_close(task_id=kw.get("task_id")),
-    check_fn=check_browser_requirements,
-    emoji="🚪",
-)
 registry.register(
     name="browser_get_images",
     toolset="browser",

View file

@@ -103,6 +103,32 @@ def _canonical_skills(skill: Optional[str] = None, skills: Optional[Any] = None)
+def _resolve_model_override(model_obj: Optional[Dict[str, Any]]) -> tuple:
+    """Resolve a model override object into (provider, model) for job storage.
+    If provider is omitted, pins the current main provider from config so the
+    job doesn't drift when the user later changes their default via hermes model.
+    Returns (provider_str_or_none, model_str_or_none).
+    """
+    if not model_obj or not isinstance(model_obj, dict):
+        return (None, None)
+    model_name = (model_obj.get("model") or "").strip() or None
+    provider_name = (model_obj.get("provider") or "").strip() or None
+    if model_name and not provider_name:
+        # Pin to the current main provider so the job is stable
+        try:
+            from hermes_cli.config import load_config
+            cfg = load_config()
+            model_cfg = cfg.get("model", {})
+            if isinstance(model_cfg, dict):
+                provider_name = model_cfg.get("provider") or None
+        except Exception:
+            pass  # Best-effort; provider stays None
+    return (provider_name, model_name)
 def _normalize_optional_job_value(value: Optional[Any], *, strip_trailing_slash: bool = False) -> Optional[str]:
     if value is None:
         return None
@@ -392,14 +418,9 @@ Use action='list' to inspect jobs.
 Use action='update', 'pause', 'resume', 'remove', or 'run' to manage an existing job.
 Jobs run in a fresh session with no current-chat context, so prompts must be self-contained.
-If skill or skills are provided on create, the future cron run loads those skills in order, then follows the prompt as the task instruction.
+If skills are provided on create, the future cron run loads those skills in order, then follows the prompt as the task instruction.
 On update, passing skills=[] clears attached skills.
-If script is provided on create, the referenced Python script runs before each agent turn.
-Its stdout is injected into the prompt as context. Use this for data collection and change
-detection the script handles gathering data, the agent analyzes and reports.
-On update, pass script="" to clear an attached script.
 NOTE: The agent's final response is auto-delivered to the target. Put the primary
 user-facing content in the final response. Cron jobs run autonomously with no user
 present they cannot ask questions or request clarification.
@@ -418,7 +439,7 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
             },
             "prompt": {
                 "type": "string",
-                "description": "For create: the full self-contained prompt. If skill or skills are also provided, this becomes the task instruction paired with those skills."
+                "description": "For create: the full self-contained prompt. If skills are also provided, this becomes the task instruction paired with those skills."
             },
             "schedule": {
                 "type": "string",
@@ -436,39 +457,30 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
                 "type": "string",
                 "description": "Delivery target: origin, local, telegram, discord, slack, whatsapp, signal, matrix, mattermost, homeassistant, dingtalk, feishu, wecom, email, sms, or platform:chat_id or platform:chat_id:thread_id for Telegram topics. Examples: 'origin', 'local', 'telegram', 'telegram:-1001234567890:17585', 'discord:#engineering'"
             },
-            "model": {
-                "type": "string",
-                "description": "Optional per-job model override used when the cron job runs"
-            },
-            "provider": {
-                "type": "string",
-                "description": "Optional per-job provider override used when resolving runtime credentials"
-            },
-            "base_url": {
-                "type": "string",
-                "description": "Optional per-job base URL override paired with provider/model routing"
-            },
-            "include_disabled": {
-                "type": "boolean",
-                "description": "For list: include paused/completed jobs"
-            },
-            "skill": {
-                "type": "string",
-                "description": "Optional single skill name to load before executing the cron prompt"
-            },
             "skills": {
                 "type": "array",
                 "items": {"type": "string"},
-                "description": "Optional ordered list of skills to load before executing the cron prompt. On update, pass an empty array to clear attached skills."
+                "description": "Optional ordered list of skill names to load before executing the cron prompt. On update, pass an empty array to clear attached skills."
             },
-            "reason": {
-                "type": "string",
-                "description": "Optional pause reason"
+            "model": {
+                "type": "object",
+                "description": "Optional per-job model override. If provider is omitted, the current main provider is pinned at creation time so the job stays stable.",
+                "properties": {
+                    "provider": {
+                        "type": "string",
+                        "description": "Provider name (e.g. 'openrouter', 'anthropic'). Omit to use and pin the current provider."
+                    },
+                    "model": {
+                        "type": "string",
+                        "description": "Model name (e.g. 'anthropic/claude-sonnet-4', 'claude-sonnet-4')"
+                    }
+                },
+                "required": ["model"]
             },
             "script": {
                 "type": "string",
                 "description": "Optional path to a Python script that runs before each cron job execution. Its stdout is injected into the prompt as context. Use for data collection and change detection. Relative paths resolve under ~/.hermes/scripts/. On update, pass empty string to clear."
-            }
+            },
         },
         "required": ["action"]
     }
@@ -502,7 +514,7 @@ registry.register(
     name="cronjob",
     toolset="cronjob",
     schema=CRONJOB_SCHEMA,
-    handler=lambda args, **kw: cronjob(
+    handler=lambda args, **kw: (lambda _mo=_resolve_model_override(args.get("model")): cronjob(
         action=args.get("action", ""),
         job_id=args.get("job_id"),
         prompt=args.get("prompt"),
@@ -510,16 +522,16 @@ registry.register(
         name=args.get("name"),
         repeat=args.get("repeat"),
         deliver=args.get("deliver"),
-        include_disabled=args.get("include_disabled", False),
+        include_disabled=args.get("include_disabled", True),
         skill=args.get("skill"),
         skills=args.get("skills"),
-        model=args.get("model"),
-        provider=args.get("provider"),
+        model=_mo[1],
+        provider=_mo[0] or args.get("provider"),
         base_url=args.get("base_url"),
         reason=args.get("reason"),
         script=args.get("script"),
         task_id=kw.get("task_id"),
-    ),
+    ))(),
     check_fn=check_cronjob_requirements,
     emoji="",
 )

View file

@@ -37,14 +37,12 @@ _HERMES_CORE_TOOLS = [
     "read_file", "write_file", "patch", "search_files",
     # Vision + image generation
     "vision_analyze", "image_generate",
-    # MoA
-    "mixture_of_agents",
     # Skills
     "skills_list", "skill_view", "skill_manage",
     # Browser automation
     "browser_navigate", "browser_snapshot", "browser_click",
     "browser_type", "browser_scroll", "browser_back",
-    "browser_press", "browser_close", "browser_get_images",
+    "browser_press", "browser_get_images",
     "browser_vision", "browser_console",
     # Text-to-speech
     "text_to_speech",
@@ -116,7 +114,7 @@ TOOLSETS = {
         "tools": [
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
-            "browser_press", "browser_close", "browser_get_images",
+            "browser_press", "browser_get_images",
             "browser_vision", "browser_console", "web_search"
         ],
         "includes": []
@@ -214,7 +212,7 @@ TOOLSETS = {
     "safe": {
         "description": "Safe toolkit without terminal access",
-        "tools": ["mixture_of_agents"],
+        "tools": [],
         "includes": ["web", "vision", "image_gen"]
     },
@@ -235,7 +233,7 @@ TOOLSETS = {
             "skills_list", "skill_view", "skill_manage",
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
-            "browser_press", "browser_close", "browser_get_images",
+            "browser_press", "browser_get_images",
             "browser_vision", "browser_console",
             "todo", "memory",
             "session_search",
@@ -255,14 +253,12 @@ TOOLSETS = {
             "read_file", "write_file", "patch", "search_files",
             # Vision + image generation
             "vision_analyze", "image_generate",
-            # MoA
-            "mixture_of_agents",
             # Skills
             "skills_list", "skill_view", "skill_manage",
             # Browser automation
             "browser_navigate", "browser_snapshot", "browser_click",
             "browser_type", "browser_scroll", "browser_back",
-            "browser_press", "browser_close", "browser_get_images",
+            "browser_press", "browser_get_images",
             "browser_vision", "browser_console",
             # Planning & memory
             "todo", "memory",

View file

@@ -20,7 +20,6 @@ In addition to built-in tools, Hermes can load tools dynamically from MCP server
 |------|-------------|----------------------|
 | `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
 | `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
-| `browser_close` | Close the browser session and release resources. Call this when done with browser tasks to free up Browserbase session quota. | — |
 | `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
 | `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
 | `browser_navigate` | Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need… | — |

@@ -52,7 +52,7 @@ Or in-session:
 | Toolset | Tools | Purpose |
 |---------|-------|---------|
-| `browser` | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
+| `browser` | `browser_back`, `browser_click`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
 | `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
 | `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
 | `cronjob` | `cronjob` | Schedule and manage recurring tasks. |

@@ -277,10 +277,6 @@ Check the browser console for any JavaScript errors
 Use `clear=True` to clear the console after reading, so subsequent calls only show new messages.
-### `browser_close`
-Close the browser session and release resources. Call this when done to free up Browserbase session quota.
 ## Practical Examples
 ### Filling Out a Web Form
@@ -295,7 +291,6 @@ Agent workflow:
 4. browser_type(ref="@e5", text="SecurePass123")
 5. browser_click(ref="@e8") → clicks "Create Account"
 6. browser_snapshot() → confirms success
-7. browser_close()
 ```
 ### Researching Dynamic Content
@@ -307,7 +302,6 @@ Agent workflow:
 1. browser_navigate("https://github.com/trending")
 2. browser_snapshot(full=true) → reads trending repo list
 3. Returns formatted results
-4. browser_close()
 ```
 ## Session Recording
@@ -349,5 +343,5 @@ If paid features aren't available on your plan, Hermes automatically falls back
 - **Text-based interaction** — relies on accessibility tree, not pixel coordinates
 - **Snapshot size** — large pages may be truncated or LLM-summarized at 8000 characters
 - **Session timeout** — cloud sessions expire based on your provider's plan settings
-- **Cost** — cloud sessions consume provider credits; use `browser_close` when done. Use `/browser connect` for free local browsing.
+- **Cost** — cloud sessions consume provider credits; sessions are automatically cleaned up when the conversation ends or after inactivity. Use `/browser connect` for free local browsing.
 - **No file downloads** — cannot download files from the browser
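
The revised **Cost** bullet states that sessions are cleaned up when the conversation ends or after inactivity. The following is a minimal sketch of that general pattern, not the code from `tools/browser_tool.py`: `Session`, `SessionRegistry`, `touch`, `cleanup_task`, `reap_inactive`, and the `idle_timeout` parameter are all hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Session:
    """Hypothetical stand-in for a browser session handle."""
    task_id: str
    last_used: float
    closed: bool = False


class SessionRegistry:
    """Illustrative registry (assumed design, not the Hermes implementation)."""

    def __init__(self, idle_timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock  # injectable so the behavior can be tested
        self.sessions: Dict[str, Session] = {}

    def touch(self, task_id: str) -> Session:
        """Create the session on first use and refresh its last-used time."""
        now = self.clock()
        session = self.sessions.get(task_id)
        if session is None:
            session = Session(task_id=task_id, last_used=now)
            self.sessions[task_id] = session
        else:
            session.last_used = now
        return session

    def cleanup_task(self, task_id: str) -> bool:
        """Close a task's session at conversation end; safe to call twice."""
        session = self.sessions.pop(task_id, None)
        if session is None:
            return False
        session.closed = True
        return True

    def reap_inactive(self) -> List[str]:
        """Close every session idle for longer than idle_timeout."""
        now = self.clock()
        stale = [tid for tid, s in self.sessions.items()
                 if now - s.last_used > self.idle_timeout]
        for tid in stale:
            self.cleanup_task(tid)
        return stale
```

Because `cleanup_task` is idempotent, an explicit close call adds nothing once both the end-of-conversation hook and a periodic `reap_inactive` pass are in place, which is the reasoning the commit message gives for dropping `browser_close`.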