hermes-agent/tests
Teknium 8b861b77c1
refactor: remove browser_close tool — auto-cleanup handles it (#5792)
* refactor: remove browser_close tool — auto-cleanup handles it

The browser_close tool was called in only 9% of browser sessions (13/144
navigations across 66 sessions), always redundantly — cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.

Removing it saves one tool schema slot in every browser-enabled API call.

Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.

Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
  camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
  acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references

* fix: repeat browser_scroll 5x per call for meaningful page movement

Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.

Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.

* feat: auto-return compact snapshot from browser_navigate

Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.

The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.

Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.

Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.

* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params

Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:

Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object

Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
  If provider is omitted, the current main provider is pinned at creation
  time so the job stays stable even if the user changes their default.

Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading

Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.

* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools

MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.

Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
2026-04-07 03:28:44 -07:00
..
acp feat(api): structured run events via /v1/runs SSE endpoint 2026-04-05 12:05:13 -07:00
agent Fix compaction summary retries for temperature-restricted models 2026-04-06 16:49:57 -07:00
cron fix(cron): suppress delivery when [SILENT] appears anywhere in response 2026-04-06 16:49:40 -07:00
e2e test(e2e): remove section separator comments 2026-04-01 15:23:52 -07:00
fakes fix: streaming tool call parsing, error handling, and fake HA state mutation 2026-03-14 14:27:20 +03:00
gateway refactor: remove browser_close tool — auto-cleanup handles it (#5792) 2026-04-07 03:28:44 -07:00
hermes_cli fix: handle launchctl kickstart exit code 113 in launchd_start() 2026-04-06 13:20:01 -07:00
honcho_plugin fix(honcho): migration guard for observation mode default change 2026-04-05 12:34:11 -07:00
integration refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804) 2026-03-24 07:30:25 -07:00
plugins fix(memory): clean up supermemory provider threads 2026-04-06 22:15:58 -07:00
skills fix: protect profile-scoped google workspace oauth tokens 2026-04-03 17:49:18 -07:00
tools refactor: remove browser_close tool — auto-cleanup handles it (#5792) 2026-04-07 03:28:44 -07:00
__init__.py A bit of restructuring for simplicity and organization 2025-10-01 23:29:25 +00:00
conftest.py fix(approval): show full command in dangerous command approval (#1553) 2026-03-17 02:02:33 -07:00
run_interrupt_test.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_413_compression.py fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness 2026-04-04 10:18:57 -07:00
test_860_dedup.py fix: eliminate 3x SQLite message duplication in gateway sessions (#860) 2026-03-10 15:22:44 -07:00
test_1630_context_overflow_loop.py fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630, #1558) 2026-03-17 01:50:59 -07:00
test_agent_guardrails.py feat: pre-call sanitization and post-call tool guardrails (#1732) 2026-03-17 04:24:27 -07:00
test_agent_loop.py fix: salvage gateway dedup and executor cleanup from PR #993 2026-03-14 11:03:20 -07:00
test_agent_loop_tool_calling.py fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness 2026-04-04 10:18:57 -07:00
test_agent_loop_vllm.py test: restore vllm integration coverage and add dict-args regression 2026-03-15 08:02:29 -07:00
test_anthropic_adapter.py fix: preserve Anthropic thinking block signatures across tool-use turns 2026-04-02 10:30:32 -07:00
test_anthropic_error_handling.py fix(ci): pin acp <0.9 and update retry-exhaust test (#3320) 2026-03-26 19:21:34 -07:00
test_anthropic_oauth_flow.py fix: preflight Anthropic auth and prefer Claude store 2026-03-14 19:38:55 -07:00
test_anthropic_provider_persistence.py fix: preflight Anthropic auth and prefer Claude store 2026-03-14 19:38:55 -07:00
test_api_key_providers.py fix(credential_pool): auto-detect Z.AI endpoint via probe and cache 2026-04-07 00:00:08 -07:00
test_async_httpx_del_neuter.py fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398) 2026-03-27 09:45:25 -07:00
test_atomic_json_write.py test: cover atomic temp cleanup on interrupts 2026-03-14 22:31:51 -07:00
test_atomic_yaml_write.py test: cover atomic temp cleanup on interrupts 2026-03-14 22:31:51 -07:00
test_auth_codex_provider.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_auth_commands.py fix: hermes auth remove now clears env-seeded credentials permanently (#5285) 2026-04-05 12:00:53 -07:00
test_auth_nous_provider.py Fix nous refresh token rotation failure in case where api key mint/retrieval fails 2026-03-02 17:18:15 +11:00
test_auxiliary_config_bridge.py feat(compression): add summary_base_url + move compression config to YAML-only 2026-03-17 04:46:15 -07:00
test_batch_runner_checkpoint.py fix: sanitize chat payloads and provider precedence 2026-03-13 23:59:12 -07:00
test_branch_command.py fix: clear ghost status-bar lines on terminal resize (#4960) 2026-04-03 22:43:45 -07:00
test_cli_approval_ui.py fix(cli): repair dangerous command approval UI 2026-03-14 11:57:44 -07:00
test_cli_background_tui_refresh.py fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048) 2026-03-25 15:00:33 -07:00
test_cli_browser_connect.py fix: cross-platform browser test path separators 2026-04-06 16:54:16 -07:00
test_cli_context_warning.py fix: add missing provider attrs to cli_obj test fixture 2026-04-01 01:12:23 -07:00
test_cli_extension_hooks.py refactor(cli): add protected TUI extension hooks for wrapper CLIs 2026-03-21 09:42:07 -07:00
test_cli_file_drop.py refactor: extract _detect_file_drop() + add 28 tests 2026-04-02 00:40:27 -07:00
test_cli_init.py fix(cli): surface recent sessions inside /history and /resume 2026-04-03 00:50:49 -07:00
test_cli_interrupt_subagent.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_cli_loading_indicator.py fix(cli): add loading indicators for slow slash commands 2026-03-10 17:31:00 -07:00
test_cli_mcp_config_watch.py fix: auto-reload MCP tools when mcp_servers config changes without restart (#1474) 2026-03-15 19:03:34 -07:00
test_cli_new_session.py fix: complete session reset — missing compressor counters + test 2026-03-20 04:35:17 -07:00
test_cli_plan_command.py fix: save /plan output in workspace (#1381) 2026-03-14 21:28:51 -07:00
test_cli_prefix_matching.py feat: add /tools disable/enable/list slash commands with session reset (#1652) 2026-03-17 02:05:26 -07:00
test_cli_preloaded_skills.py fix: move activated skills line below welcome text 2026-03-23 06:20:19 -07:00
test_cli_provider_resolution.py feat: show model pricing for OpenRouter and Nous Portal providers 2026-04-05 22:02:21 -07:00
test_cli_retry.py test: lock retry replacement semantics 2026-03-14 21:19:22 -07:00
test_cli_save_config_value.py fix(cli): use atomic write in save_config_value to prevent config loss on interrupt 2026-03-31 12:21:55 -07:00
test_cli_secret_capture.py feat: secure skill env setup on load (core #688) 2026-03-13 03:14:04 -07:00
test_cli_skin_integration.py fix(test): add missing voice state attrs to CLI stub in skin tests 2026-03-14 15:00:45 +03:00
test_cli_status_bar.py fix(cli): handle CJK wide chars in TUI input height 2026-04-06 16:54:16 -07:00
test_cli_tools_command.py fix: resolve 7 failing CI tests (#3936) 2026-03-30 08:10:14 -07:00
test_codex_execution_paths.py fix(tests): provide model name in Codex 401 refresh tests for CI (#4166) 2026-03-30 21:17:09 -07:00
test_codex_models.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_compression_boundary.py fix(agent): prevent silent tool result loss during context compression (#1993) 2026-03-18 15:22:51 -07:00
test_compression_persistence.py fix: persist compressed context to gateway session after mid-run compression 2026-03-30 18:49:14 -07:00
test_compressor_fallback_update.py feat(providers): add ordered fallback provider chain (salvage #1761) (#3813) 2026-03-29 16:04:53 -07:00
test_config_env_expansion.py feat(config): support ${ENV_VAR} substitution in config.yaml (#2684) 2026-03-23 16:02:06 -07:00
test_context_pressure.py fix: cap context pressure percentage at 100% in display (#3480) 2026-03-27 21:42:09 -07:00
test_context_references.py fix(context): restrict @ references to safe workspace paths (#2601) 2026-03-23 06:40:05 -07:00
test_context_token_tracking.py fix(tests): resolve all consistently failing tests 2026-03-22 05:58:26 -07:00
test_credential_pool.py fix(delegate): share credential pools with subagents + per-task leasing 2026-04-06 23:01:11 -07:00
test_credential_pool_routing.py Honor provider reset windows in pooled credential failover 2026-04-05 00:20:53 -07:00
test_crossloop_client_cache.py fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701) 2026-03-25 17:31:56 -07:00
test_dict_tool_call_args.py test: restore vllm integration coverage and add dict-args regression 2026-03-15 08:02:29 -07:00
test_display.py feat: add inline diff previews for write actions 2026-04-01 02:13:57 -07:00
test_evidence_store.py feat: add OSS Security Forensics skill (Skills Hub) (#1482) 2026-03-15 21:59:53 -07:00
test_exit_cleanup_interrupt.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_external_credential_detection.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_fallback_model.py feat: upgrade MiniMax default to M2.7 + add new OpenRouter models 2026-03-18 02:42:58 -07:00
test_file_permissions.py security: enforce 0600/0700 file permissions on sensitive files (inspired by openclaw) 2026-03-09 02:19:32 -07:00
test_flush_memories_codex.py fix: update all test mocks for call_llm migration 2026-03-11 21:06:54 -07:00
test_gemini_provider.py fix: update Gemini model catalog + wire models.dev as live model source 2026-04-06 10:28:03 -07:00
test_hermes_logging.py feat: centralized logging, instrumentation, hermes logs CLI, gateway noise fix (#5430) 2026-04-06 00:08:20 -07:00
test_hermes_state.py fix: merge dotted+hyphenated FTS5 quoting into single pass 2026-04-02 00:49:11 -07:00
test_honcho_client_config.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_insights.py feat: add route-aware pricing estimates (#1695) 2026-03-17 03:44:44 -07:00
test_interactive_interrupt.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_interrupt_propagation.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_large_tool_result.py feat: save oversized tool results to file instead of destructive truncation (#5210) 2026-04-05 10:29:57 -07:00
test_long_context_tier_429.py fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (#4747) 2026-04-03 02:05:02 -07:00
test_managed_server_tool_support.py test: fix stale CI assumptions in parser and quick-command coverage (#1236) 2026-03-13 21:56:12 -07:00
test_mcp_serve.py feat: add MCP server mode — hermes mcp serve (#3795) 2026-03-29 15:47:19 -07:00
test_minisweagent_path.py chore: remove all remaining mini-swe-agent references 2026-03-24 08:19:23 -07:00
test_model_metadata_local_ctx.py fix: prefer loaded instance context size over max for LM Studio 2026-03-19 21:24:53 +01:00
test_model_normalize.py Fix #5211: Preserve dots in OpenCode Go model names 2026-04-06 11:25:06 -07:00
test_model_provider_persistence.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_model_tools.py Add request-scoped plugin lifecycle hooks 2026-04-05 23:31:29 -07:00
test_model_tools_async_bridge.py fix: use per-thread persistent event loops in worker threads 2026-03-20 15:41:06 -04:00
test_ollama_cloud_auth.py fix: Ollama Cloud auth, /model switch persistence, and alias tab completion 2026-04-05 11:06:06 -07:00
test_openai_client_lifecycle.py fix: audit fixes — 5 bugs found and resolved 2026-03-16 06:35:46 -07:00
test_packaging_metadata.py chore: prepare Hermes for Homebrew packaging (#4099) 2026-03-30 17:34:43 -07:00
test_percentage_clamp.py fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599) 2026-03-28 14:55:18 -07:00
test_personality_none.py feat(cli,gateway): add /personality none and custom personality support 2026-03-09 17:31:54 +03:00
test_plugin_cli_registration.py fix(plugins): only register CLI commands for the active memory provider 2026-04-05 12:34:11 -07:00
test_plugins.py feat(plugins): pre_api_request/post_api_request with narrow payloads 2026-04-05 23:31:29 -07:00
test_plugins_cmd.py feat(plugins): prompt for required env vars during hermes plugins install 2026-04-06 16:37:53 -07:00
test_primary_runtime_restore.py feat: per-turn primary runtime restoration and transport recovery (#4624) 2026-04-02 10:52:01 -07:00
test_project_metadata.py fix: exclude matrix from [all] extras — python-olm is upstream-broken (#4615) 2026-04-02 09:21:37 -07:00
test_provider_fallback.py feat(providers): add ordered fallback provider chain (salvage #1761) (#3813) 2026-03-29 16:04:53 -07:00
test_provider_parity.py test: add test for _should_sanitize_tool_calls() 2026-04-05 00:13:25 -07:00
test_quick_commands.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_real_interrupt_subagent.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_reasoning_command.py fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405) 2026-03-27 09:57:50 -07:00
test_redirect_stdout_issue.py fix: use session_key instead of chat_id for adapter interrupt lookups 2026-03-12 08:35:45 -07:00
test_resume_display.py feat: display previous messages when resuming a session in CLI 2026-03-08 17:45:45 -07:00
test_run_agent.py feat(plugins): pre_api_request/post_api_request with narrow payloads 2026-04-05 23:31:29 -07:00
test_run_agent_codex_responses.py fix(codex): handle reasoning-only responses and replay path (#2070) 2026-03-19 10:34:44 -07:00
test_runtime_provider_resolution.py fix: stale OAuth credentials block OpenRouter users on auto-detect (#5746) 2026-04-06 23:01:43 -07:00
test_session_meta_filtering.py fix: filter transcript-only roles from chat-completions payload (#4715) 2026-04-03 14:57:33 -07:00
test_session_reset_fix.py fix(session): clear compressor summary and turn counter on /clear and /new (#3102) 2026-03-25 18:22:21 -07:00
test_setup_model_selection.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_sql_injection.py fix(security): eliminate SQL string formatting in execute() calls 2026-03-19 15:16:35 +01:00
test_streaming.py test: add codex transport drop regression 2026-03-31 12:05:06 -07:00
test_strict_api_validation.py test: add strict API validation tests for Fireworks compatibility 2026-04-05 00:13:25 -07:00
test_surrogate_sanitization.py fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624) 2026-03-28 16:53:14 -07:00
test_timezone.py fix: skip stale cron jobs on gateway restart instead of firing immediately 2026-03-16 23:48:14 -07:00
test_token_persistence_non_cli.py fix(insights): persist token usage for non-CLI sessions 2026-04-02 10:47:13 -07:00
test_tool_arg_coercion.py feat: coerce tool call arguments to match JSON Schema types (#5265) 2026-04-05 10:57:34 -07:00
test_tool_call_parsers.py fix(mistral-parser): handle nested JSON in fallback extraction 2026-03-21 09:41:17 -07:00
test_toolset_distributions.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_toolsets.py fix: add missing Platform.SIGNAL to toolset mappings, update test + config docs 2026-03-09 23:27:19 -07:00
test_trajectory_compressor.py fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148) 2026-03-30 20:36:56 -07:00
test_trajectory_compressor_async.py fix: create AsyncOpenAI lazily in trajectory_compressor to avoid closed event loop (#4013) 2026-03-30 13:16:16 -07:00
test_utils_truthy_values.py Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway. 2026-03-30 13:28:10 +09:00
test_worktree.py fix: harden salvaged worktree include checks 2026-03-14 21:51:27 -07:00
test_worktree_security.py fix: harden salvaged worktree include checks 2026-03-14 21:51:27 -07:00