hermes-agent/tools
teknium1 d3ffbc6409 feat(stt): add stt.providers.<name> command-provider registry
Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any
shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds,
SenseVoice, curl pipelines — become an STT backend with zero Python.
Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved
untouched via the built-in local_command path) and the
register_transcription_provider() Python plugin hook also shipped in this
PR.

Resolution order (mirrors TTS exactly):

  1. Built-in (local, local_command, groq, openai, mistral, xai)
     → native handler. Always wins.
  2. stt.providers.<name>: type: command  → command-provider runner.
  3. Plugin-registered TranscriptionProvider → plugin dispatch.
  4. No match → 'No STT provider available'.

Files
-----
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained;
  added _resolve_command_stt_provider_config, _transcribe_command_stt,
  and local helpers for template rendering, shell-quote context, and
  process-tree termination. Helpers are documented as mirrors of their
  tts_tool.py counterparts (kept local to avoid cross-tool private
  import). Wire-in is one insertion point in transcribe_audio() after
  the xai elif and before the plugin dispatcher. Plugin dispatcher
  additionally defensively short-circuits when a same-name command
  config exists (command-wins-over-plugin invariant).

- tests/tools/test_transcription_command_providers.py: 50 new tests
  covering resolution (builtin precedence, type/command gating,
  case-insensitive lookup, legacy stt.<name> back-compat), helpers
  (timeout fallback, format validation, iter, has-any), template
  rendering (shell-quote contexts, doubled-brace preservation),
  end-to-end via _transcribe_command_stt (output_path read, stdout
  fallback, timeout, nonzero exit envelope, model override,
  language precedence), and dispatcher integration via the real
  transcribe_audio() including command-wins-over-plugin and
  builtin-shadow-rejection.

- tests/plugins/transcription/check_parity_vs_main.py: extended from
  10 to 13 scenarios. New cases: command-provider-installed,
  command-vs-plugin-same-name (verifies command wins precedence),
  explicit-openai-with-command-shadow (verifies built-in wins).
  Adds command_provider dispatch_kind detection via transcript prefix
  (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished
  from plugin scenarios even when sharing a provider name.

- website/docs/user-guide/features/tts.md: new 'STT custom command
  providers' section symmetric to the TTS section — example config,
  placeholder grammar table (input_path / output_path / output_dir /
  format / language / model), transcript-read-back semantics (file
  first, then stdout fallback), optional keys table, behavior notes,
  security note. Updated 'Python plugin providers (STT)' to include
  the new 'When to pick which (STT)' decision table and updated
  resolution-order section (now 4 layers instead of 3).

Verification
------------
189/189 STT targeted tests + 50/50 new command-provider tests pass.
Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/
8623/8623 — zero regressions across 14,199 tests.

Parity harness: 13 scenarios, 9 OK + 4 expected diffs
(no_provider_error → plugin, plugin_unavailable, command_provider × 2).

E2E live-verified in an isolated HERMES_HOME with a real .wav file:

  command:                    → dispatched to stt.providers.my-fake-cli
  plugin:                     → dispatched to registered TranscriptionProvider
  command-wins-over-plugin:   → command provider beats same-name plugin
  builtin-wins-over-command:  → built-in OpenAI handler fires;
                                stt.providers.openai: type: command
                                does NOT hijack it.
2026-05-25 01:41:19 -07:00
..
computer_use fix(computer-use): skip capture_after when action failed (ok=False) 2026-05-22 01:19:01 -07:00
environments feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop 2026-05-24 18:05:33 -07:00
neutts_samples refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow 2026-03-17 02:33:12 -07:00
__init__.py Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-31 08:48:54 +09:00
ansi_strip.py fix: strip ANSI at the source — clean terminal output before it reaches the model 2026-03-23 07:43:12 -07:00
approval.py fix(approval): pin 'silence is not consent' contract on timeout/deny (#24912) (#30879) 2026-05-23 02:59:13 -07:00
binary_extensions.py fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening 2026-04-08 02:24:32 -07:00
browser_camofox.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_camofox_state.py feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419) 2026-04-01 04:18:50 -07:00
browser_cdp_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_dialog_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_supervisor.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
browser_tool.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
budget_config.py chore: remove Atropos RL environments and tinker-atropos integration (#26106) 2026-05-15 10:36:38 +05:30
checkpoint_manager.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
clarify_gateway.py feat(gateway): wire clarify tool with inline keyboard buttons on Telegram (#24199) 2026-05-12 16:33:33 -07:00
clarify_tool.py refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites 2026-04-07 13:36:38 -07:00
code_execution_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
computer_use_tool.py feat(computer-use): cua-driver backend, universal any-model schema 2026-05-08 11:07:38 -07:00
credential_files.py fix(gateway): translate inbound document host paths to container paths for Docker backend 2026-05-07 05:02:26 -07:00
cronjob_tools.py fix(cron): allow emoji ZWJ sequences in prompts 2026-05-19 00:10:43 -07:00
debug_helpers.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
delegate_tool.py fix(delegation): preserve configured_provider name when runtime returns 'custom' 2026-05-17 11:40:05 -07:00
discord_tool.py feat: add Discord message deletion action 2026-05-07 05:11:09 -07:00
env_passthrough.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
fal_common.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
feishu_doc_tool.py perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138) 2026-05-08 16:39:32 -07:00
feishu_drive_tool.py perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138) 2026-05-08 16:39:32 -07:00
file_operations.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
file_state.py feat(delegate): cross-agent file state coordination for concurrent subagents (#13718) 2026-04-21 16:41:26 -07:00
file_tools.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
fuzzy_match.py chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926) 2026-05-11 11:03:29 -07:00
homeassistant_tool.py fix: clean up description escaping, add string-data tests 2026-04-13 04:45:07 -07:00
image_generation_tool.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
interrupt.py fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907) 2026-04-17 20:39:25 -07:00
kanban_tools.py feat(kanban): stamp originating ACP session_id on tasks 2026-05-18 21:15:21 -07:00
lazy_deps.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
managed_tool_gateway.py fix(tools): add debug logging for token refresh and tighten domain check 2026-04-02 12:40:03 +11:00
mcp_oauth.py fix(security): guard os.chmod(parent) against / and top-level dirs 2026-05-20 22:56:55 -07:00
mcp_oauth_manager.py fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226) 2026-05-07 05:35:33 -07:00
mcp_tool.py fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450) 2026-05-24 04:44:59 -07:00
memory_tool.py fix(memory): guard against external drift in MEMORY.md/USER.md (#26045) (#30877) 2026-05-23 02:51:29 -07:00
microsoft_graph_auth.py feat(msgraph): add auth and client foundation 2026-05-08 09:27:26 -07:00
microsoft_graph_client.py fix(msgraph): stream download_to_file body instead of buffering 2026-05-08 09:27:26 -07:00
mixture_of_agents_tool.py chore: ruff auto-fix C401, C416, C408, PLR1722 (#23940) 2026-05-11 11:20:58 -07:00
neutts_synth.py fix(tts): document NeuTTS provider and align install guidance (#1903) 2026-03-18 02:55:30 -07:00
openrouter_client.py refactor: route ad-hoc LLM consumers through centralized provider router 2026-03-11 20:02:36 -07:00
osv_check.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
patch_parser.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
path_security.py refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
process_registry.py fix(process_registry): use taskkill /T /F for tree-kill on Windows 2026-05-23 20:30:29 -07:00
registry.py security: sanitize tool error strings before injecting into model context (#26823) 2026-05-16 00:57:39 -07:00
schema_sanitizer.py fix(xai-responses): strip enum values containing '/' from tool schemas 2026-05-18 10:37:35 -07:00
send_message_tool.py refactor(gateway): migrate Mattermost adapter to bundled plugin 2026-05-24 18:05:33 -07:00
session_search_tool.py feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
skill_manager_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
skill_provenance.py fix(curator): only mark agent-created for background-review sediment (#19621) 2026-05-04 02:42:16 -07:00
skill_usage.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_ast_audit.py refactor(skills): slim AST diagnostic to single entry point 2026-05-23 17:47:26 -07:00
skills_guard.py fix(skills_guard): explain why --force is rejected on dangerous verdicts 2026-05-23 02:37:30 -07:00
skills_hub.py fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths 2026-05-22 19:59:24 -07:00
skills_sync.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_tool.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
slash_confirm.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
terminal_tool.py fix(terminal): warn at call time when background=true runs silently (#31289) 2026-05-23 21:02:14 -07:00
tirith_security.py fix(tirith): suppress .app lookalike_tld false positives in warn verdicts 2026-05-18 10:20:07 -07:00
todo_tool.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
tool_backend_helpers.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
tool_output_limits.py feat(skills): add design-md skill for Google's DESIGN.md spec (#14876) 2026-04-23 21:51:19 -07:00
tool_result_storage.py fix(tool-result-storage): persist via stdin to bypass 128 KB exec-arg cap (#22913) 2026-05-09 18:44:58 -07:00
transcription_tools.py feat(stt): add stt.providers.<name> command-provider registry 2026-05-25 01:41:19 -07:00
tts_tool.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
url_safety.py fix(url_safety): block IPv4-mapped IPv6 addresses to prevent SSRF bypass 2026-05-18 10:51:15 -07:00
video_generation_tool.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
vision_tools.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
voice_mode.py fix(voice): chunk oversized CLI recordings 2026-05-21 14:17:39 -07:00
web_tools.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
website_policy.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
x_search_tool.py fix(x_search): surface degraded results + validate dates 2026-05-21 02:38:45 +05:30
xai_http.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
yuanbao_tools.py Fix unsafe gateway media path delivery 2026-05-23 01:40:35 -07:00