mirror of https://github.com/NousResearch/hermes-agent.git, synced 2026-04-25 00:51:20 +00:00
refactor: remove browser_close tool — auto-cleanup handles it (#5792)
* refactor: remove browser_close tool — auto-cleanup handles it
The browser_close tool was called after only 9% of browser navigations
(13 of 144, across 66 sessions), and always redundantly: cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.
Removing it saves one tool schema slot in every browser-enabled API call.
Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.
Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references
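A minimal sketch of the Camofox cleanup fix described above, assuming each backend keeps its sessions in its own dict. The names cleanup_all, registries and closers are hypothetical and are not part of the repository; the real change simply calls camofox_close() from cleanup_browser().

```python
from typing import Callable, Dict, List, Optional


def cleanup_all(
    task_id: Optional[str],
    registries: Dict[str, Dict[str, object]],
    closers: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Drop the task's session from every backend registry (hypothetical helper)."""
    task_id = task_id or "default"
    cleaned = []
    for backend, sessions in registries.items():
        if task_id not in sessions:
            continue
        try:
            closers[backend](task_id)  # backend-specific close call
        except Exception:
            pass  # best-effort: one failing backend must not block the others
        finally:
            sessions.pop(task_id, None)
        cleaned.append(backend)
    return cleaned


# Two separate registries, mirroring the "separate dict" situation in the message.
registries = {
    "browserbase": {"task-1": {"session_name": "sess-1"}},
    "camofox": {"task-1": {"tab_id": "tab-1"}},
}
closers = {"browserbase": lambda task: None, "camofox": lambda task: None}
cleaned = cleanup_all("task-1", registries, closers)
```

Iterating over every registry is what keeps a second backend from being skipped when a task ends.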
* fix: repeat browser_scroll 5x per call for meaningful page movement
Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.
Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.
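The repeat-and-break-early behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: scroll_repeated and fake_scroll are hypothetical names, and the fake backend simply moves 100px per call.

```python
from typing import Callable, Dict

SCROLL_REPEATS = 5  # same repeat count the commit message describes


def scroll_repeated(
    scroll_once: Callable[[str], Dict],
    direction: str,
    repeats: int = SCROLL_REPEATS,
) -> Dict:
    """Call a single-step scroll several times, stopping at the first failure."""
    if direction not in ("up", "down"):
        return {"success": False,
                "error": f"Invalid direction '{direction}'. Use 'up' or 'down'."}
    for _ in range(repeats):
        result = scroll_once(direction)
        if not result.get("success"):
            # Break early: report the backend error instead of scrolling on.
            return {"success": False,
                    "error": result.get("error", f"Failed to scroll {direction}")}
    return {"success": True, "scrolled": direction}


# Hypothetical stand-in backend that moves ~100px per call.
position = {"y": 0}


def fake_scroll(direction: str) -> Dict:
    position["y"] += 100 if direction == "down" else -100
    return {"success": True}


outcome = scroll_repeated(fake_scroll, "down")
```

Repeating the call in the wrapper keeps the approach backend-agnostic: no backend needs its own scroll-distance setting.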
* feat: auto-return compact snapshot from browser_navigate
Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.
The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.
Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.
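A minimal sketch of the "snapshot is a bonus" pattern described above, assuming the navigate and snapshot steps are passed in as callables. navigate_with_snapshot, SNAPSHOT_LIMIT and the truncation marker are hypothetical; the real code uses SNAPSHOT_SUMMARIZE_THRESHOLD and _truncate_snapshot.

```python
import json
from typing import Callable, Dict

SNAPSHOT_LIMIT = 8000  # assumed threshold, echoing the "8000 chars" in the schema text


def navigate_with_snapshot(
    navigate: Callable[[str], Dict],
    take_snapshot: Callable[[], Dict],
    url: str,
) -> str:
    """Navigate, then attach a compact snapshot if one can be taken."""
    response = navigate(url)
    if not response.get("success"):
        return json.dumps(response)
    try:
        snap = take_snapshot()
        text = snap.get("snapshot", "")
        if len(text) > SNAPSHOT_LIMIT:
            text = text[:SNAPSHOT_LIMIT] + "\n[... truncated ...]"
        response["snapshot"] = text
        response["element_count"] = len(snap.get("refs") or {})
    except Exception:
        pass  # navigation already succeeded; the snapshot is optional
    return json.dumps(response)
```

Because the snapshot step sits inside its own try block, a snapshot failure can never turn a successful navigation into an error.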
Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.
* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params
Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:
Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object
Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
If provider is omitted, the current main provider is pinned at creation
time so the job stays stable even if the user changes their default.
Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading
Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.
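The consolidated 'model' object described above could be resolved roughly like this. It is a sketch, assuming the current main provider is passed in as an argument rather than read from config; resolve_model_override and default_provider are illustrative names, not the repository's API.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_model_override(
    model_obj: Optional[Dict[str, Any]],
    default_provider: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider, model), pinning default_provider when none is given."""
    if not isinstance(model_obj, dict):
        return (None, None)
    model = (model_obj.get("model") or "").strip() or None
    provider = (model_obj.get("provider") or "").strip() or None
    if model and not provider:
        # Pin the provider at creation time so later default changes
        # do not silently move the job to another provider.
        provider = default_provider
    return (provider, model)


pinned = resolve_model_override({"model": "claude-sonnet-4"}, "anthropic")
explicit = resolve_model_override(
    {"model": "anthropic/claude-sonnet-4", "provider": "openrouter"}, "anthropic"
)
```

Storing the resolved pair on the job, rather than looking up the default at run time, is what keeps a scheduled job stable.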
* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools
MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.
Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
parent cafdfd3654
commit 8b861b77c1
15 changed files with 136 additions and 142 deletions
@@ -39,7 +39,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
     "browser_scroll": "execute",
     "browser_press": "execute",
     "browser_back": "execute",
-    "browser_close": "execute",
     "browser_get_images": "read",
     # Agent internals
     "delegate_task": "execute",
@@ -890,8 +890,6 @@ def get_cute_tool_message(
         return _wrap(f"┊ ◀️ back {dur}")
     if tool_name == "browser_press":
         return _wrap(f"┊ ⌨️ press {args.get('key', '?')} {dur}")
-    if tool_name == "browser_close":
-        return _wrap(f"┊ 🚪 close browser {dur}")
     if tool_name == "browser_get_images":
         return _wrap(f"┊ 🖼️ images extracting {dur}")
     if tool_name == "browser_vision":
@@ -744,7 +744,6 @@ def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -
         "browser_type",
         "browser_scroll",
         "browser_console",
-        "browser_close",
         "browser_press",
         "browser_get_images",
         "browser_vision",
@@ -539,7 +539,7 @@ platform_toolsets:
 # terminal - terminal, process
 # file - read_file, write_file, patch, search
 # browser - browser_navigate, browser_snapshot, browser_click, browser_type,
-# browser_scroll, browser_back, browser_press, browser_close,
+# browser_scroll, browser_back, browser_press,
 # browser_get_images, browser_vision (requires BROWSERBASE_API_KEY)
 # vision - vision_analyze (requires OPENROUTER_API_KEY)
 # image_gen - image_generate (requires FAL_KEY)
@@ -211,7 +211,7 @@ _LEGACY_TOOLSET_MAP = {
     "browser_tools": [
         "browser_navigate", "browser_snapshot", "browser_click",
         "browser_type", "browser_scroll", "browser_back",
-        "browser_press", "browser_close", "browser_get_images",
+        "browser_press", "browser_get_images",
         "browser_vision", "browser_console"
     ],
     "cronjob_tools": ["cronjob"],
@@ -16,7 +16,7 @@ This skill guides you through systematic exploratory QA testing of web applicati
 
 ## Prerequisites
 
-- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`, `browser_close`)
+- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`)
 - A target URL and testing scope from the user
 
 ## Inputs
@@ -148,7 +148,6 @@ Save the report to `{output_dir}/report.md`.
 | `browser_press` | Press a keyboard key |
 | `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
 | `browser_console` | Get JS console output and errors |
-| `browser_close` | Close the browser session |
 
 ## Tips
 
@@ -39,7 +39,7 @@ class TestHermesApiServerToolset:
         tools = resolve_toolset("hermes-api-server")
         for tool in ["browser_navigate", "browser_snapshot", "browser_click",
                      "browser_type", "browser_scroll", "browser_back",
-                     "browser_press", "browser_close"]:
+                     "browser_press"]:
             assert tool in tools, f"Missing browser tool: {tool}"
 
     def test_toolset_includes_homeassistant_tools(self):
@@ -65,18 +65,6 @@ class TestBrowserCleanup:
         mock_stop.assert_called_once_with("task-1")
         mock_run.assert_called_once_with("task-1", "close", [], timeout=10)
 
-    def test_browser_close_delegates_to_cleanup_browser(self):
-        import json
-
-        browser_tool = self.browser_tool
-        browser_tool._active_sessions["task-2"] = {"session_name": "sess-2"}
-
-        with patch("tools.browser_tool.cleanup_browser") as mock_cleanup:
-            result = json.loads(browser_tool.browser_close("task-2"))
-
-        assert result == {"success": True, "closed": True}
-        mock_cleanup.assert_called_once_with("task-2")
-
     def test_emergency_cleanup_clears_all_tracking_state(self):
         browser_tool = self.browser_tool
         browser_tool._cleanup_done = False
@@ -240,6 +240,25 @@ def camofox_navigate(url: str, task_id: Optional[str] = None) -> str:
                 "Browser is visible via VNC. "
                 "Share this link with the user so they can watch the browser live."
             )
+
+        # Auto-take a compact snapshot so the model can act immediately
+        try:
+            snap_data = _get(
+                f"/tabs/{session['tab_id']}/snapshot",
+                params={"userId": session["user_id"]},
+            )
+            snapshot_text = snap_data.get("snapshot", "")
+            from tools.browser_tool import (
+                SNAPSHOT_SUMMARIZE_THRESHOLD,
+                _truncate_snapshot,
+            )
+            if len(snapshot_text) > SNAPSHOT_SUMMARIZE_THRESHOLD:
+                snapshot_text = _truncate_snapshot(snapshot_text)
+            result["snapshot"] = snapshot_text
+            result["element_count"] = snap_data.get("refsCount", 0)
+        except Exception:
+            pass  # Navigation succeeded; snapshot is a bonus
+
         return json.dumps(result)
     except requests.HTTPError as e:
         return json.dumps({"success": False, "error": f"Navigation failed: {e}"})
@@ -518,7 +518,7 @@ atexit.register(_stop_browser_cleanup_thread)
 BROWSER_TOOL_SCHEMAS = [
     {
         "name": "browser_navigate",
-        "description": "Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need to interact with a page (click, fill forms, dynamic content).",
+        "description": "Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need to interact with a page (click, fill forms, dynamic content). Returns a compact page snapshot with interactive elements and ref IDs — no need to call browser_snapshot separately after navigating.",
         "parameters": {
             "type": "object",
             "properties": {
@@ -532,7 +532,7 @@ BROWSER_TOOL_SCHEMAS = [
     },
     {
         "name": "browser_snapshot",
-        "description": "Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: complete page content. Snapshots over 8000 chars are truncated or LLM-summarized. Requires browser_navigate first.",
+        "description": "Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: complete page content. Snapshots over 8000 chars are truncated or LLM-summarized. Requires browser_navigate first. Note: browser_navigate already returns a compact snapshot — use this to refresh after interactions that change the page, or with full=true for complete content.",
         "parameters": {
             "type": "object",
             "properties": {
@@ -615,15 +615,7 @@ BROWSER_TOOL_SCHEMAS = [
             "required": ["key"]
         }
     },
-    {
-        "name": "browser_close",
-        "description": "Close the browser session and release resources. Call this when done with browser tasks to free up Browserbase session quota.",
-        "parameters": {
-            "type": "object",
-            "properties": {},
-            "required": []
-        }
-    },
     {
         "name": "browser_get_images",
         "description": "Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first.",
@@ -1230,6 +1222,21 @@ def browser_navigate(url: str, task_id: Optional[str] = None) -> str:
                 )
             response["stealth_features"] = active_features
 
+        # Auto-take a compact snapshot so the model can act immediately
+        # without a separate browser_snapshot call.
+        try:
+            snap_result = _run_browser_command(effective_task_id, "snapshot", ["-c"])
+            if snap_result.get("success"):
+                snap_data = snap_result.get("data", {})
+                snapshot_text = snap_data.get("snapshot", "")
+                refs = snap_data.get("refs", {})
+                if len(snapshot_text) > SNAPSHOT_SUMMARIZE_THRESHOLD:
+                    snapshot_text = _truncate_snapshot(snapshot_text)
+                response["snapshot"] = snapshot_text
+                response["element_count"] = len(refs) if refs else 0
+        except Exception as e:
+            logger.debug("Auto-snapshot after navigate failed: %s", e)
+
         return json.dumps(response, ensure_ascii=False)
     else:
         return json.dumps({
@@ -1376,12 +1383,6 @@ def browser_scroll(direction: str, task_id: Optional[str] = None) -> str:
     Returns:
         JSON string with scroll result
     """
-    if _is_camofox_mode():
-        from tools.browser_camofox import camofox_scroll
-        return camofox_scroll(direction, task_id)
-
-    effective_task_id = task_id or "default"
-
     # Validate direction
     if direction not in ["up", "down"]:
         return json.dumps({
@@ -1389,19 +1390,34 @@ def browser_scroll(direction: str, task_id: Optional[str] = None) -> str:
             "error": f"Invalid direction '{direction}'. Use 'up' or 'down'."
         }, ensure_ascii=False)
 
-    result = _run_browser_command(effective_task_id, "scroll", [direction])
+    # Repeat the scroll 5 times to get meaningful page movement.
+    # Most backends scroll ~100px per call, which is barely visible.
+    # 5x gives roughly half a viewport of travel, backend-agnostic.
+    _SCROLL_REPEATS = 5
 
-    if result.get("success"):
-        return json.dumps({
-            "success": True,
-            "scrolled": direction
-        }, ensure_ascii=False)
-    else:
-        return json.dumps({
-            "success": False,
-            "error": result.get("error", f"Failed to scroll {direction}")
-        }, ensure_ascii=False)
+    if _is_camofox_mode():
+        from tools.browser_camofox import camofox_scroll
+        result = None
+        for _ in range(_SCROLL_REPEATS):
+            result = camofox_scroll(direction, task_id)
+        return result
+
+    effective_task_id = task_id or "default"
+
+    result = None
+    for _ in range(_SCROLL_REPEATS):
+        result = _run_browser_command(effective_task_id, "scroll", [direction])
+        if not result.get("success"):
+            return json.dumps({
+                "success": False,
+                "error": result.get("error", f"Failed to scroll {direction}")
+            }, ensure_ascii=False)
+
+    return json.dumps({
+        "success": True,
+        "scrolled": direction
+    }, ensure_ascii=False)
 
 
 def browser_back(task_id: Optional[str] = None) -> str:
     """
@@ -1463,33 +1479,7 @@ def browser_press(key: str, task_id: Optional[str] = None) -> str:
         }, ensure_ascii=False)
 
 
-def browser_close(task_id: Optional[str] = None) -> str:
-    """
-    Close the browser session.
-
-    Args:
-        task_id: Task identifier for session isolation
-
-    Returns:
-        JSON string with close result
-    """
-    if _is_camofox_mode():
-        from tools.browser_camofox import camofox_close
-        return camofox_close(task_id)
-
-    effective_task_id = task_id or "default"
-    with _cleanup_lock:
-        had_session = effective_task_id in _active_sessions
-
-    cleanup_browser(effective_task_id)
-
-    response = {
-        "success": True,
-        "closed": True,
-    }
-    if not had_session:
-        response["warning"] = "Session may not have been active"
-    return json.dumps(response, ensure_ascii=False)
-
-
 def browser_console(clear: bool = False, expression: Optional[str] = None, task_id: Optional[str] = None) -> str:
@@ -1942,7 +1932,7 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     Clean up browser session for a task.
 
     Called automatically when a task completes or when inactivity timeout is reached.
-    Closes both the agent-browser session and the Browserbase session.
+    Closes both the agent-browser/Browserbase session and Camofox sessions.
 
     Args:
         task_id: Task identifier to clean up
@@ -1950,6 +1940,14 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     if task_id is None:
         task_id = "default"
 
+    # Also clean up Camofox session if running in Camofox mode
+    if _is_camofox_mode():
+        try:
+            from tools.browser_camofox import camofox_close
+            camofox_close(task_id)
+        except Exception as e:
+            logger.debug("Camofox cleanup for task %s: %s", task_id, e)
+
     logger.debug("cleanup_browser called for task_id: %s", task_id)
     logger.debug("Active sessions: %s", list(_active_sessions.keys()))
 
@@ -2168,14 +2166,7 @@ registry.register(
     check_fn=check_browser_requirements,
     emoji="⌨️",
 )
-registry.register(
-    name="browser_close",
-    toolset="browser",
-    schema=_BROWSER_SCHEMA_MAP["browser_close"],
-    handler=lambda args, **kw: browser_close(task_id=kw.get("task_id")),
-    check_fn=check_browser_requirements,
-    emoji="🚪",
-)
 registry.register(
     name="browser_get_images",
     toolset="browser",
@@ -103,6 +103,32 @@ def _canonical_skills(skill: Optional[str] = None, skills: Optional[Any] = None)
 
 
 
+
+def _resolve_model_override(model_obj: Optional[Dict[str, Any]]) -> tuple:
+    """Resolve a model override object into (provider, model) for job storage.
+
+    If provider is omitted, pins the current main provider from config so the
+    job doesn't drift when the user later changes their default via hermes model.
+
+    Returns (provider_str_or_none, model_str_or_none).
+    """
+    if not model_obj or not isinstance(model_obj, dict):
+        return (None, None)
+    model_name = (model_obj.get("model") or "").strip() or None
+    provider_name = (model_obj.get("provider") or "").strip() or None
+    if model_name and not provider_name:
+        # Pin to the current main provider so the job is stable
+        try:
+            from hermes_cli.config import load_config
+            cfg = load_config()
+            model_cfg = cfg.get("model", {})
+            if isinstance(model_cfg, dict):
+                provider_name = model_cfg.get("provider") or None
+        except Exception:
+            pass  # Best-effort; provider stays None
+    return (provider_name, model_name)
+
+
 def _normalize_optional_job_value(value: Optional[Any], *, strip_trailing_slash: bool = False) -> Optional[str]:
     if value is None:
         return None
@@ -392,14 +418,9 @@ Use action='list' to inspect jobs.
 Use action='update', 'pause', 'resume', 'remove', or 'run' to manage an existing job.
 
 Jobs run in a fresh session with no current-chat context, so prompts must be self-contained.
-If skill or skills are provided on create, the future cron run loads those skills in order, then follows the prompt as the task instruction.
+If skills are provided on create, the future cron run loads those skills in order, then follows the prompt as the task instruction.
 On update, passing skills=[] clears attached skills.
 
-If script is provided on create, the referenced Python script runs before each agent turn.
-Its stdout is injected into the prompt as context. Use this for data collection and change
-detection — the script handles gathering data, the agent analyzes and reports.
-On update, pass script="" to clear an attached script.
-
 NOTE: The agent's final response is auto-delivered to the target. Put the primary
 user-facing content in the final response. Cron jobs run autonomously with no user
 present — they cannot ask questions or request clarification.
@@ -418,7 +439,7 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
             },
             "prompt": {
                 "type": "string",
-                "description": "For create: the full self-contained prompt. If skill or skills are also provided, this becomes the task instruction paired with those skills."
+                "description": "For create: the full self-contained prompt. If skills are also provided, this becomes the task instruction paired with those skills."
             },
             "schedule": {
                 "type": "string",
@@ -436,39 +457,30 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
                 "type": "string",
                 "description": "Delivery target: origin, local, telegram, discord, slack, whatsapp, signal, matrix, mattermost, homeassistant, dingtalk, feishu, wecom, email, sms, or platform:chat_id or platform:chat_id:thread_id for Telegram topics. Examples: 'origin', 'local', 'telegram', 'telegram:-1001234567890:17585', 'discord:#engineering'"
             },
-            "model": {
-                "type": "string",
-                "description": "Optional per-job model override used when the cron job runs"
-            },
-            "provider": {
-                "type": "string",
-                "description": "Optional per-job provider override used when resolving runtime credentials"
-            },
-            "base_url": {
-                "type": "string",
-                "description": "Optional per-job base URL override paired with provider/model routing"
-            },
-            "include_disabled": {
-                "type": "boolean",
-                "description": "For list: include paused/completed jobs"
-            },
-            "skill": {
-                "type": "string",
-                "description": "Optional single skill name to load before executing the cron prompt"
-            },
             "skills": {
                 "type": "array",
                 "items": {"type": "string"},
-                "description": "Optional ordered list of skills to load before executing the cron prompt. On update, pass an empty array to clear attached skills."
+                "description": "Optional ordered list of skill names to load before executing the cron prompt. On update, pass an empty array to clear attached skills."
             },
-            "reason": {
-                "type": "string",
-                "description": "Optional pause reason"
+            "model": {
+                "type": "object",
+                "description": "Optional per-job model override. If provider is omitted, the current main provider is pinned at creation time so the job stays stable.",
+                "properties": {
+                    "provider": {
+                        "type": "string",
+                        "description": "Provider name (e.g. 'openrouter', 'anthropic'). Omit to use and pin the current provider."
+                    },
+                    "model": {
+                        "type": "string",
+                        "description": "Model name (e.g. 'anthropic/claude-sonnet-4', 'claude-sonnet-4')"
+                    }
+                },
+                "required": ["model"]
             },
             "script": {
                 "type": "string",
                 "description": "Optional path to a Python script that runs before each cron job execution. Its stdout is injected into the prompt as context. Use for data collection and change detection. Relative paths resolve under ~/.hermes/scripts/. On update, pass empty string to clear."
-            }
+            },
         },
         "required": ["action"]
     }
@@ -502,7 +514,7 @@ registry.register(
     name="cronjob",
     toolset="cronjob",
     schema=CRONJOB_SCHEMA,
-    handler=lambda args, **kw: cronjob(
+    handler=lambda args, **kw: (lambda _mo=_resolve_model_override(args.get("model")): cronjob(
         action=args.get("action", ""),
         job_id=args.get("job_id"),
         prompt=args.get("prompt"),
@@ -510,16 +522,16 @@ registry.register(
         name=args.get("name"),
         repeat=args.get("repeat"),
         deliver=args.get("deliver"),
-        include_disabled=args.get("include_disabled", False),
+        include_disabled=args.get("include_disabled", True),
         skill=args.get("skill"),
         skills=args.get("skills"),
-        model=args.get("model"),
-        provider=args.get("provider"),
+        model=_mo[1],
+        provider=_mo[0] or args.get("provider"),
         base_url=args.get("base_url"),
         reason=args.get("reason"),
         script=args.get("script"),
         task_id=kw.get("task_id"),
-    ),
+    ))(),
     check_fn=check_cronjob_requirements,
     emoji="⏰",
 )
toolsets.py (14 lines changed)
@@ -37,14 +37,12 @@ _HERMES_CORE_TOOLS = [
     "read_file", "write_file", "patch", "search_files",
     # Vision + image generation
     "vision_analyze", "image_generate",
-    # MoA
-    "mixture_of_agents",
     # Skills
     "skills_list", "skill_view", "skill_manage",
     # Browser automation
     "browser_navigate", "browser_snapshot", "browser_click",
     "browser_type", "browser_scroll", "browser_back",
-    "browser_press", "browser_close", "browser_get_images",
+    "browser_press", "browser_get_images",
     "browser_vision", "browser_console",
     # Text-to-speech
     "text_to_speech",
@ -116,7 +114,7 @@ TOOLSETS = {
|
||||||
"tools": [
|
"tools": [
|
||||||
"browser_navigate", "browser_snapshot", "browser_click",
|
"browser_navigate", "browser_snapshot", "browser_click",
|
||||||
"browser_type", "browser_scroll", "browser_back",
|
"browser_type", "browser_scroll", "browser_back",
|
||||||
"browser_press", "browser_close", "browser_get_images",
|
"browser_press", "browser_get_images",
|
||||||
"browser_vision", "browser_console", "web_search"
|
"browser_vision", "browser_console", "web_search"
|
||||||
],
|
],
|
||||||
"includes": []
|
"includes": []
|
||||||
|
|
@ -214,7 +212,7 @@ TOOLSETS = {
|
||||||
|
|
||||||
"safe": {
|
"safe": {
|
||||||
"description": "Safe toolkit without terminal access",
|
"description": "Safe toolkit without terminal access",
|
||||||
"tools": ["mixture_of_agents"],
|
"tools": [],
|
||||||
"includes": ["web", "vision", "image_gen"]
|
"includes": ["web", "vision", "image_gen"]
|
||||||
},
|
},
|
||||||
|
|
||||||
|
|
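After this hunk the `safe` toolset lists no tools of its own and gets everything through `includes`. How Hermes expands `includes` is not shown in this diff, so the following is only a sketch of one plausible resolution strategy. The `resolve_tools` helper and the contents of the `web`, `vision`, and `image_gen` entries are assumptions made for illustration.

```python
# Hypothetical toolset table; only the shape of "safe" comes from the diff.
TOOLSETS = {
    "web": {"tools": ["web_search"], "includes": []},
    "vision": {"tools": ["vision_analyze"], "includes": []},
    "image_gen": {"tools": ["image_generate"], "includes": []},
    "safe": {"tools": [], "includes": ["web", "vision", "image_gen"]},
}


def resolve_tools(name, toolsets, _seen=None):
    """Return the toolset's own tools plus those of its includes, in order."""
    seen = set() if _seen is None else _seen
    if name in seen:
        # Guard against include cycles.
        return []
    seen.add(name)

    entry = toolsets[name]
    resolved = list(entry["tools"])
    for included in entry["includes"]:
        for tool in resolve_tools(included, toolsets, seen):
            if tool not in resolved:
                resolved.append(tool)
    return resolved


safe_tools = resolve_tools("safe", TOOLSETS)
```

Under this reading, dropping `mixture_of_agents` from the `tools` list is enough to remove it from `safe`, because none of the included toolsets in the sketch provides it.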
@ -235,7 +233,7 @@ TOOLSETS = {
|
||||||
"skills_list", "skill_view", "skill_manage",
|
"skills_list", "skill_view", "skill_manage",
|
||||||
"browser_navigate", "browser_snapshot", "browser_click",
|
"browser_navigate", "browser_snapshot", "browser_click",
|
||||||
"browser_type", "browser_scroll", "browser_back",
|
"browser_type", "browser_scroll", "browser_back",
|
||||||
"browser_press", "browser_close", "browser_get_images",
|
"browser_press", "browser_get_images",
|
||||||
"browser_vision", "browser_console",
|
"browser_vision", "browser_console",
|
||||||
"todo", "memory",
|
"todo", "memory",
|
||||||
"session_search",
|
"session_search",
|
||||||
|
|
@ -255,14 +253,12 @@ TOOLSETS = {
|
||||||
"read_file", "write_file", "patch", "search_files",
|
"read_file", "write_file", "patch", "search_files",
|
||||||
# Vision + image generation
|
# Vision + image generation
|
||||||
"vision_analyze", "image_generate",
|
"vision_analyze", "image_generate",
|
||||||
# MoA
|
|
||||||
"mixture_of_agents",
|
|
||||||
# Skills
|
# Skills
|
||||||
"skills_list", "skill_view", "skill_manage",
|
"skills_list", "skill_view", "skill_manage",
|
||||||
# Browser automation
|
# Browser automation
|
||||||
"browser_navigate", "browser_snapshot", "browser_click",
|
"browser_navigate", "browser_snapshot", "browser_click",
|
||||||
"browser_type", "browser_scroll", "browser_back",
|
"browser_type", "browser_scroll", "browser_back",
|
||||||
"browser_press", "browser_close", "browser_get_images",
|
"browser_press", "browser_get_images",
|
||||||
"browser_vision", "browser_console",
|
"browser_vision", "browser_console",
|
||||||
# Planning & memory
|
# Planning & memory
|
||||||
"todo", "memory",
|
"todo", "memory",
|
||||||
|
|
|
||||||
|
|
@ -20,7 +20,6 @@ In addition to built-in tools, Hermes can load tools dynamically from MCP server
|
||||||
|------|-------------|----------------------|
|
|------|-------------|----------------------|
|
||||||
| `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
|
| `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
|
||||||
| `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
|
| `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
|
||||||
| `browser_close` | Close the browser session and release resources. Call this when done with browser tasks to free up Browserbase session quota. | — |
|
|
||||||
| `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
|
| `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
|
||||||
| `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
|
| `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
|
||||||
| `browser_navigate` | Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need… | — |
|
| `browser_navigate` | Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need… | — |
|
||||||
|
|
|
||||||
|
|
@ -52,7 +52,7 @@ Or in-session:
|
||||||
|
|
||||||
| Toolset | Tools | Purpose |
|
| Toolset | Tools | Purpose |
|
||||||
|---------|-------|---------|
|
|---------|-------|---------|
|
||||||
| `browser` | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
|
| `browser` | `browser_back`, `browser_click`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
|
||||||
| `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
|
| `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
|
||||||
| `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
|
| `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
|
||||||
| `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
|
| `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
|
||||||
|
|
|
||||||
|
|
@ -277,10 +277,6 @@ Check the browser console for any JavaScript errors
|
||||||
|
|
||||||
Use `clear=True` to clear the console after reading, so subsequent calls only show new messages.
|
Use `clear=True` to clear the console after reading, so subsequent calls only show new messages.
|
||||||
|
|
||||||
### `browser_close`
|
|
||||||
|
|
||||||
Close the browser session and release resources. Call this when done to free up Browserbase session quota.
|
|
||||||
|
|
||||||
## Practical Examples
|
## Practical Examples
|
||||||
|
|
||||||
### Filling Out a Web Form
|
### Filling Out a Web Form
|
||||||
|
|
@ -295,7 +291,6 @@ Agent workflow:
|
||||||
4. browser_type(ref="@e5", text="SecurePass123")
|
4. browser_type(ref="@e5", text="SecurePass123")
|
||||||
5. browser_click(ref="@e8") → clicks "Create Account"
|
5. browser_click(ref="@e8") → clicks "Create Account"
|
||||||
6. browser_snapshot() → confirms success
|
6. browser_snapshot() → confirms success
|
||||||
7. browser_close()
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Researching Dynamic Content
|
### Researching Dynamic Content
|
||||||
|
|
@ -307,7 +302,6 @@ Agent workflow:
|
||||||
1. browser_navigate("https://github.com/trending")
|
1. browser_navigate("https://github.com/trending")
|
||||||
2. browser_snapshot(full=true) → reads trending repo list
|
2. browser_snapshot(full=true) → reads trending repo list
|
||||||
3. Returns formatted results
|
3. Returns formatted results
|
||||||
4. browser_close()
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Session Recording
|
## Session Recording
|
||||||
|
|
@ -349,5 +343,5 @@ If paid features aren't available on your plan, Hermes automatically falls back
|
||||||
- **Text-based interaction** — relies on accessibility tree, not pixel coordinates
|
- **Text-based interaction** — relies on accessibility tree, not pixel coordinates
|
||||||
- **Snapshot size** — large pages may be truncated or LLM-summarized at 8000 characters
|
- **Snapshot size** — large pages may be truncated or LLM-summarized at 8000 characters
|
||||||
- **Session timeout** — cloud sessions expire based on your provider's plan settings
|
- **Session timeout** — cloud sessions expire based on your provider's plan settings
|
||||||
- **Cost** — cloud sessions consume provider credits; use `browser_close` when done. Use `/browser connect` for free local browsing.
|
- **Cost** — cloud sessions consume provider credits; sessions are automatically cleaned up when the conversation ends or after inactivity. Use `/browser connect` for free local browsing.
|
||||||
- **No file downloads** — cannot download files from the browser
|
- **No file downloads** — cannot download files from the browser
|
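The updated **Cost** bullet relies on two cleanup paths: an explicit cleanup when the conversation ends, and a reaper that closes sessions after inactivity. The sketch below illustrates that combination in isolation. The `SessionRegistry` class, its method names, and the timeout value are hypothetical and are not taken from `tools/browser_tool.py`.

```python
import time


class SessionRegistry:
    """Hypothetical tracker for per-task browser sessions."""

    def __init__(self, idle_timeout_s, clock=time.monotonic):
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        self._last_used = {}  # task_id -> timestamp of last activity

    def touch(self, task_id):
        """Record activity; called by every browser tool invocation."""
        self._last_used[task_id] = self._clock()

    def cleanup(self, task_id):
        """Explicit cleanup at conversation end. Returns True if a session existed."""
        return self._last_used.pop(task_id, None) is not None

    def reap_idle(self):
        """Close every session idle for longer than the timeout."""
        now = self._clock()
        idle = [
            task_id
            for task_id, last in self._last_used.items()
            if now - last > self._idle_timeout_s
        ]
        for task_id in idle:
            self.cleanup(task_id)
        return idle
```

A background thread or timer would call `reap_idle()` periodically. Because `cleanup()` is idempotent, the end-of-conversation path and the reaper can both run without coordinating with each other.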
||||||
|
|
|
||||||