hermes-agent/website/docs/user-guide/features
teknium1 d3ffbc6409 feat(stt): add stt.providers.<name> command-provider registry
Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any
shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds,
SenseVoice, curl pipelines — become an STT backend with zero Python.
Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved
untouched via the built-in local_command path) and the
register_transcription_provider() Python plugin hook also shipped in this
PR.

Resolution order (mirrors TTS exactly):

  1. Built-in (local, local_command, groq, openai, mistral, xai)
     → native handler. Always wins.
  2. stt.providers.<name>: type: command  → command-provider runner.
  3. Plugin-registered TranscriptionProvider → plugin dispatch.
  4. No match → 'No STT provider available'.

Files
-----
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained;
  added _resolve_command_stt_provider_config, _transcribe_command_stt,
  and local helpers for template rendering, shell-quote context, and
  process-tree termination. Helpers are documented as mirrors of their
  tts_tool.py counterparts (kept local to avoid cross-tool private
  import). Wire-in is one insertion point in transcribe_audio() after
  the xai elif and before the plugin dispatcher. Plugin dispatcher
  additionally defensively short-circuits when a same-name command
  config exists (command-wins-over-plugin invariant).

- tests/tools/test_transcription_command_providers.py: 50 new tests
  covering resolution (builtin precedence, type/command gating,
  case-insensitive lookup, legacy stt.<name> back-compat), helpers
  (timeout fallback, format validation, iter, has-any), template
  rendering (shell-quote contexts, doubled-brace preservation),
  end-to-end via _transcribe_command_stt (output_path read, stdout
  fallback, timeout, nonzero exit envelope, model override,
  language precedence), and dispatcher integration via the real
  transcribe_audio() including command-wins-over-plugin and
  builtin-shadow-rejection.

- tests/plugins/transcription/check_parity_vs_main.py: extended from
  10 to 13 scenarios. New cases: command-provider-installed,
  command-vs-plugin-same-name (verifies command wins precedence),
  explicit-openai-with-command-shadow (verifies built-in wins).
  Adds command_provider dispatch_kind detection via transcript prefix
  (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished
  from plugin scenarios even when sharing a provider name.

- website/docs/user-guide/features/tts.md: new 'STT custom command
  providers' section symmetric to the TTS section — example config,
  placeholder grammar table (input_path / output_path / output_dir /
  format / language / model), transcript-read-back semantics (file
  first, then stdout fallback), optional keys table, behavior notes,
  security note. Updated 'Python plugin providers (STT)' to include
  the new 'When to pick which (STT)' decision table and updated
  resolution-order section (now 4 layers instead of 3).

Verification
------------
189/189 STT targeted tests + 50/50 new command-provider tests pass.
Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/
8623/8623 — zero regressions across 14,199 tests.

Parity harness: 13 scenarios, 9 OK + 4 expected diffs
(no_provider_error → plugin, plugin_unavailable, command_provider × 2).

E2E live-verified in an isolated HERMES_HOME with a real .wav file:

  command:                    → dispatched to stt.providers.my-fake-cli
  plugin:                     → dispatched to registered TranscriptionProvider
  command-wins-over-plugin:   → command provider beats same-name plugin
  builtin-wins-over-command:  → built-in OpenAI handler fires;
                                stt.providers.openai: type: command
                                does NOT hijack it.
2026-05-25 01:41:19 -07:00
..
_category_.json
acp.md docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497) 2026-05-18 23:55:25 -07:00
api-server.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
batch-processing.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
browser.md docs: surface 'hermes setup --portal' and 'hermes portal' across user-facing pages (#30869) 2026-05-23 02:42:31 -07:00
built-in-plugins.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
code-execution.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
codex-app-server-runtime.md docs(codex_app_server): document multi-root Kanban writable_roots (#27941) 2026-05-18 21:03:25 -07:00
computer-use.md feat(computer-use): refresh cua-driver on hermes update + add install --upgrade (#24063) 2026-05-11 17:10:58 -07:00
context-files.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
context-references.md docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) 2026-04-05 19:45:50 -07:00
credential-pools.md fix: avoid persisting borrowed credential secrets (#31416) 2026-05-25 00:32:08 -07:00
cron.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
curator.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
delegation.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
deliverable-mode.md feat(gateway): deliverable mode — ship artifacts as native uploads from any agent surface (#27813) 2026-05-18 02:14:43 -07:00
extending-the-dashboard.md feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194) 2026-05-10 07:09:28 -07:00
fallback-providers.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
goals.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
honcho.md docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) 2026-05-09 13:19:51 -07:00
hooks.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
image-generation.md docs: surface 'hermes setup --portal' and 'hermes portal' across user-facing pages (#30869) 2026-05-23 02:42:31 -07:00
kanban-tutorial.md docs: align kanban readiness docs and smoke tests 2026-05-18 21:07:03 -07:00
kanban-worker-lanes.md feat(kanban): stranded_in_ready diagnostic for unclaimed tasks (#23578) 2026-05-10 21:58:44 -07:00
kanban.md feat(kanban): warn users that scratch workspaces are deleted on completion (#30949) 2026-05-23 11:27:00 -07:00
lsp.md docs(lsp): replace "git worktree" with "git repository" in LSP docs 2026-05-13 23:05:20 -07:00
mcp.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
memory-providers.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
memory.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
overview.md feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
personality.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
plugins.md feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
provider-routing.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
skills.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
skins.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
spotify.md docs(spotify): document Home Assistant speaker routing 2026-05-16 20:32:43 -07:00
subscription-proxy.md feat(proxy): local OpenAI-compatible proxy for OAuth providers (#25969) 2026-05-14 15:40:48 -07:00
tool-gateway.md docs: surface 'hermes setup --portal' and 'hermes portal' across user-facing pages (#30869) 2026-05-23 02:42:31 -07:00
tools.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
tts.md feat(stt): add stt.providers.<name> command-provider registry 2026-05-25 01:41:19 -07:00
vision.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
voice-mode.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
web-dashboard.md docs(dashboard): clarify chat tab tui flag 2026-05-16 20:32:43 -07:00
web-search.md fix(website): cross-locale doc links + drop empty ko locale (#31895) 2026-05-24 23:16:20 -07:00
x-search.md fix(model): include Premium+ in xAI OAuth label 2026-05-24 18:12:16 -07:00