hermes-agent/tests/tools
Teknium 8d59881a62
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX

Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.

- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647

* fix(tests): prevent pool auto-seeding from host env in credential pool tests

Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.

- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test

* feat(auth): add thread safety, least_used strategy, and request counting

- Add threading.Lock to CredentialPool for gateway thread safety
  (concurrent requests from multiple gateway sessions could race on
  pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
  with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
  with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
  thread safety (4 threads × 20 selects with no corruption)

* feat(auth): add interactive mode for bare 'hermes auth' command

When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:

1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
   add flow explicitly asks 'API key or OAuth login?' — making it
   clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
   least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection

The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.

* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)

Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.

* feat(auth): support custom endpoint credential pools keyed by provider name

Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).

- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
  model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
  providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
  pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing

* docs: add Excalidraw diagram of full credential pool flow

Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration

Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g

* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow

The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached

* docs: add comprehensive credential pool documentation

- New page: website/docs/user-guide/features/credential-pools.md
  Full guide covering quick start, CLI commands, rotation strategies,
  error recovery, custom endpoint pools, auto-discovery, thread safety,
  architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
  first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide

* chore: remove excalidraw diagram from repo (external link only)

* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns

- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
  (token_type, scope, client_id, portal_base_url, obtained_at,
  expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
  agent_key_obtained_at, tls) into a single extra dict with
  __getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider

Net -17 lines. All 383 targeted tests pass.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
..
__init__.py test: reorganize test structure and add missing unit tests 2026-02-26 03:20:08 +03:00
test_ansi_strip.py fix: strip ANSI at the source — clean terminal output before it reaches the model 2026-03-23 07:43:12 -07:00
test_approval.py fix(security): catch sensitive path writes in approval checks (#3859) 2026-03-29 20:57:57 -07:00
test_browser_camofox.py fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148) 2026-03-30 20:36:56 -07:00
test_browser_cdp_override.py fix: normalize live Chrome CDP endpoints for browser tools 2026-03-19 10:17:03 -07:00
test_browser_cleanup.py Fix browser cleanup consistency and screenshot recovery 2026-03-14 11:28:26 -07:00
test_browser_console.py fix: add browser_console to browser toolset and core tools list (#1084) 2026-03-17 02:02:57 -07:00
test_browser_content_none_guard.py fix(browser): guard LLM response content against None in snapshot and vision (#3642) 2026-03-28 17:25:04 -07:00
test_browser_homebrew_paths.py fix: add macOS Homebrew paths to browser and terminal PATH resolution 2026-03-23 22:45:55 -07:00
test_browser_ssrf_local.py fix(tools): make browser SSRF check configurable via browser.allow_private_urls (#4198) 2026-03-31 02:11:55 -07:00
test_checkpoint_manager.py fix: reduce file tool log noise 2026-03-13 22:14:00 -07:00
test_clarify_tool.py test(tools): add unit tests for clarify_tool.py 2026-02-27 03:29:26 -05:00
test_clipboard.py fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths 2026-03-13 21:32:53 -07:00
test_code_execution.py refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804) 2026-03-24 07:30:25 -07:00
test_command_guards.py fix: make tirith block verdicts approvable instead of hard-blocking (#3428) 2026-03-27 13:22:01 -07:00
test_config_null_guard.py fix: guard config.get() against YAML null values to prevent AttributeError (#3377) 2026-03-27 04:03:00 -07:00
test_credential_files.py feat: mount skills directory into all remote backends with live sync (#3890) 2026-03-30 02:45:41 -07:00
test_cron_prompt_injection.py fix: cron prompt injection scanner bypass for multi-word variants 2026-02-26 13:55:54 +03:00
test_cronjob_tools.py fix(tools): remove unnecessary crontab requirement from cronjob tool (#1638) 2026-03-17 01:40:02 -07:00
test_daytona_environment.py feat: mount skills directory into all remote backends with live sync (#3890) 2026-03-30 02:45:41 -07:00
test_debug_helpers.py fix(tests): isolate HERMES_HOME in tests and adjust log directory for debug session 2026-03-02 04:34:21 -08:00
test_delegate.py feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) 2026-03-31 03:10:01 -07:00
test_delegate_toolset_scope.py fix(security): restrict subagent toolsets to parent's enabled set (#3269) 2026-03-26 14:50:26 -07:00
test_docker_environment.py refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804) 2026-03-24 07:30:25 -07:00
test_docker_find.py fix: Docker backend fails when docker is not in PATH (macOS gateway) 2026-03-10 20:45:13 -07:00
test_env_passthrough.py feat: env var passthrough for skills and user config (#2807) 2026-03-24 08:19:34 -07:00
test_file_operations.py fix: search_files now reports error for non-existent paths instead of silent empty results 2026-03-08 16:47:20 -07:00
test_file_tools.py fix: strip ANSI at the source — clean terminal output before it reaches the model 2026-03-23 07:43:12 -07:00
test_file_tools_live.py fix: skip hanging tests + add global test timeout 2026-03-12 01:23:28 -07:00
test_file_write_safety.py fix(security): harden terminal safety and sandbox file writes (#1653) 2026-03-17 02:22:12 -07:00
test_force_dangerous_override.py fix(skills): honor policy table for dangerous verdicts 2026-03-14 11:27:02 -07:00
test_fuzzy_match.py test: reorganize test structure and add missing unit tests 2026-02-26 03:20:08 +03:00
test_hidden_dir_filter.py fix: use Path.parts for hidden directory filter in skill listing 2026-03-04 18:34:16 +03:00
test_homeassistant_tool.py fix: add service domain blocklist and entity_id validation to HA tools 2026-03-01 11:53:50 +03:00
test_honcho_tools.py fix(banner): show honcho tools as available when configured (#3810) 2026-03-29 15:55:05 -07:00
test_interrupt.py feat: concurrent tool execution with ThreadPoolExecutor 2026-03-13 02:51:51 -07:00
test_llm_content_none_guard.py fix: guard aux LLM calls against None content + reasoning fallback + retry (salvage #3389) (#3449) 2026-03-27 15:28:19 -07:00
test_local_env_blocklist.py fix: add macOS Homebrew paths to browser and terminal PATH resolution 2026-03-23 22:45:55 -07:00
test_local_persistent.py fix(terminal): avoid merging heredoc EOF with fence wrapper (#3598) 2026-03-28 14:43:41 -07:00
test_mcp_dynamic_discovery.py feat(mcp): dynamic tool discovery via notifications/tools/list_changed (#3812) 2026-03-29 15:52:54 -07:00
test_mcp_oauth.py fix(mcp-oauth): port mismatch, path traversal, and shared handler state (salvage #2521) (#2552) 2026-03-22 15:02:26 -07:00
test_mcp_probe.py feat: interactive MCP tool configuration in hermes tools (#1694) 2026-03-17 03:48:44 -07:00
test_mcp_tool.py fix: add MCP tool name collision protection (#3077) 2026-03-25 16:52:04 -07:00
test_mcp_tool_issue_948.py fix(mcp): resolve npx stdio connection failures (#1291) 2026-03-14 05:44:00 -07:00
test_memory_tool.py fix: tighten memory and session recall guidance 2026-03-14 11:36:47 -07:00
test_mixture_of_agents_tool.py refactor: tighten MoA traceback logging scope (#1307) 2026-03-14 07:53:56 -07:00
test_modal_sandbox_fixes.py refactor: replace swe-rex with native Modal SDK for Modal backend (#3538) 2026-03-28 11:21:44 -07:00
test_parse_env_var.py fix(docker): add explicit env allowlist for container credentials (#1436) 2026-03-17 02:34:35 -07:00
test_patch_parser.py fix: handle addition-only hunks in V4A patch parser (#3325) 2026-03-26 19:38:04 -07:00
test_process_registry.py fix(gateway): persist watcher metadata in checkpoint for crash recovery (#1706) 2026-03-17 03:52:15 -07:00
test_read_loop_detection.py fix: remove post-compression file-read history injection (#2226) 2026-03-20 14:54:25 -07:00
test_registry.py perf(ttft): salvage easy-win startup optimizations from #3346 (#3395) 2026-03-27 07:49:44 -07:00
test_rl_training_tool.py fix: call _stop_training_run on early-return failure paths 2026-03-10 17:09:51 -07:00
test_search_hidden_dirs.py fix: exclude hidden directories from find/grep search backends (#1558) 2026-03-17 02:02:57 -07:00
test_send_message_missing_platforms.py fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk (#3796) 2026-03-29 15:17:46 -07:00
test_send_message_tool.py test: replace real-looking WhatsApp jid in regression test 2026-03-17 15:38:37 +00:00
test_session_search.py feat(sessions): add --source flag for third-party session isolation (#3255) 2026-03-26 14:35:31 -07:00
test_singularity_preflight.py fix(tests): use case-insensitive regex in singularity preflight tests 2026-03-16 19:01:39 +03:00
test_skill_env_passthrough.py fix(skills): stop marking persisted env vars missing on remote backends (#3650) 2026-03-28 17:52:32 -07:00
test_skill_manager_tool.py fix(skills): block category path traversal in skill manager (#3844) 2026-03-29 20:08:22 -07:00
test_skill_view_path_check.py refactor: use Path.is_relative_to() for skill_view boundary check 2026-03-04 05:30:43 -08:00
test_skill_view_traversal.py fix(security): block path traversal in skill_view file_path (fixes #220) 2026-03-02 02:00:09 -08:00
test_skills_guard.py fix(skills): preserve trust for skills-sh identifiers + reduce resolution churn (#3251) 2026-03-26 13:40:21 -07:00
test_skills_hub.py fix(skills): validate hub bundle paths before install (#3986) 2026-03-30 08:37:19 -07:00
test_skills_hub_clawhub.py fix: improve clawhub skill search matching 2026-03-14 23:15:04 -07:00
test_skills_sync.py feat: nix flake — uv2nix build, NixOS module, persistent container mode (#20) 2026-03-26 01:08:02 +05:30
test_skills_tool.py fix(skills): stop marking persisted env vars missing on remote backends (#3650) 2026-03-28 17:52:32 -07:00
test_ssh_environment.py merge: resolve conflicts with origin/main (SSH preflight check) 2026-03-15 21:13:40 -07:00
test_symlink_prefix_confusion.py fix: use is_relative_to() for symlink boundary check in skills_guard 2026-03-04 17:23:23 +03:00
test_terminal_disk_usage.py fix(terminal): log disk warning check failures at debug level (salvage #2372) (#2394) 2026-03-21 17:10:17 -07:00
test_terminal_requirements.py refactor: replace swe-rex with native Modal SDK for Modal backend (#3538) 2026-03-28 11:21:44 -07:00
test_terminal_timeout_output.py fix(terminal): preserve partial output when command times out (#3868) 2026-03-29 21:51:44 -07:00
test_terminal_tool_requirements.py refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804) 2026-03-24 07:30:25 -07:00
test_tirith_security.py fix: send_animation metadata, MarkdownV2 inline code splitting, tirith cosign-free install (#1626) 2026-03-16 23:39:41 -07:00
test_todo_tool.py fix: update test_non_empty_has_markers to match todo filtering behavior 2026-03-08 23:07:38 +03:00
test_transcription.py feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) 2026-03-31 03:10:01 -07:00
test_transcription_tools.py fix(acp): complete session management surface for editor clients (salvage #3501) (#3675) 2026-03-28 23:45:53 -07:00
test_url_safety.py fix(security): add SSRF protection to vision_tools and web_tools (hardened) 2026-03-23 15:40:42 -07:00
test_vision_tools.py fix(vision): reject non-image files and enforce website policy (salvage #1940) (#3845) 2026-03-29 20:55:04 -07:00
test_voice_cli_integration.py fix: voice pipeline hardening — 7 bug fixes with tests 2026-03-14 14:27:21 +03:00
test_voice_mode.py fix: voice pipeline hardening — 7 bug fixes with tests 2026-03-14 14:27:21 +03:00
test_web_tools_config.py feat(web): add Tavily as web search/extract/crawl backend (#1731) 2026-03-17 04:28:03 -07:00
test_web_tools_tavily.py feat(web): add Tavily as web search/extract/crawl backend (#1731) 2026-03-17 04:28:03 -07:00
test_website_policy.py fix: resolve 7 failing CI tests (#3936) 2026-03-30 08:10:14 -07:00
test_windows_compat.py fix: guard POSIX-only process functions for Windows compatibility 2026-03-01 01:54:27 +03:00
test_write_deny.py fix: resolve symlink bypass in write deny list on macOS 2026-02-26 13:30:55 +03:00
test_yolo_mode.py fix(security): harden terminal safety and sandbox file writes (#1653) 2026-03-17 02:22:12 -07:00