mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-07 08:02:23 +00:00
feat(stt): add register_transcription_provider() plugin hook
Add an opt-in Python plugin surface for speech-to-text backends,
mirroring the TTS hook pattern. New backends (OpenRouter, SenseAudio,
Gemini-STT, custom proprietary engines) can be implemented as plugins
without modifying tools/transcription_tools.py.
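As a rough illustration of the plugin surface described above, a backend might look like the sketch below. Only `name`, `is_available()`, `transcribe()`, and `register_transcription_provider()` are named in this commit. The stand-in base class, the `transcribe()` signature, the result-dict keys, the `register(ctx)` entry point, and the `EchoProvider` and `FakeContext` names are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class TranscriptionProvider(ABC):
    """Stand-in for agent.transcription_provider.TranscriptionProvider (assumed shape)."""

    name: str = ""

    def is_available(self) -> bool:
        return True

    @abstractmethod
    def transcribe(self, file_path: str, model: Optional[str] = None,
                   language: Optional[str] = None) -> Dict[str, Any]:
        ...


class EchoProvider(TranscriptionProvider):
    """Hypothetical backend that only reports what it was asked to transcribe."""

    name = "echo_stt"  # must not collide with a built-in name such as "openai"

    def transcribe(self, file_path, model=None, language=None):
        # A real backend would call its engine here; these dict keys are assumed.
        return {"success": True, "transcript": f"<audio:{file_path}>",
                "provider": self.name, "model": model, "language": language}


def register(ctx) -> None:
    """Hypothetical plugin entry point; ctx plays the role of PluginContext."""
    ctx.register_transcription_provider(EchoProvider())


class FakeContext:
    """Minimal test double so the sketch runs without hermes installed."""

    def __init__(self) -> None:
        self.providers: Dict[str, TranscriptionProvider] = {}

    def register_transcription_provider(self, provider) -> None:
        # Mirrors the isinstance check in the hook; everything else is simplified.
        if isinstance(provider, TranscriptionProvider):
            self.providers[provider.name] = provider


ctx = FakeContext()
register(ctx)
result = ctx.providers["echo_stt"].transcribe("clip.ogg", model="tiny", language="en")
```

With a real install, `FakeContext` would be replaced by the `PluginContext` that the plugin loader passes in.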
Built-ins always win
--------------------
The 6 built-in STT providers (local/faster-whisper, local_command,
groq, openai, mistral, xai) keep their native handlers. Plugins
attempting to register under a built-in name are rejected at
registration time with a warning and re-checked defensively at
dispatch.
Resolution order
----------------
1. stt.provider matches a built-in → built-in dispatch (unchanged)
2. stt.provider matches a registered plugin →
   a. if plugin.is_available() returns False → unavailability envelope
      identifying the plugin (not the generic "No STT provider"
      message, because the user explicitly opted into this plugin)
   b. otherwise plugin.transcribe() with model + language forwarded
      from stt.<provider>.{model,language} config
3. No match → legacy "No STT provider available" error (unchanged)
Per-provider config namespace
-----------------------------
Plugins read their config from stt.<provider> in config.yaml, mirroring
how built-ins read stt.openai.model / stt.mistral.model. The dispatcher
forwards `model` and `language` from this section. Caller's explicit
`model=` argument overrides the config-set model.
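A minimal sketch of the forwarding rule above, assuming the parsed `stt` section of config.yaml is a plain dict. The function name, the `senseaudio` section, and its `sense-v1` model value are hypothetical, and treating only `None` as "no explicit model" is also an assumption.

```python
from typing import Any, Dict, Optional


def plugin_call_options(stt_config: Dict[str, Any], provider: str,
                        model: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Pick model/language for a plugin call from stt.<provider>."""
    section = stt_config.get(provider)
    if not isinstance(section, dict):
        section = {}  # missing or malformed section: forward nothing
    return {
        # The caller's explicit model= wins over the config-set model.
        "model": model if model is not None else section.get("model"),
        "language": section.get("language"),
    }


stt_config = {
    "provider": "senseaudio",
    "senseaudio": {"model": "sense-v1", "language": "en"},
}
from_config = plugin_call_options(stt_config, "senseaudio")
overridden = plugin_call_options(stt_config, "senseaudio", model="sense-large")
```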
Files
-----
- agent/transcription_provider.py: TranscriptionProvider ABC
- agent/transcription_registry.py: register/get/list providers,
  built-in shadow guard, _reset_for_tests
- hermes_cli/plugins.py: register_transcription_provider() on
  PluginContext
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset,
  _dispatch_to_plugin_provider() with availability gate, wire-in
  after xai branch and before "No STT provider" error
- tests/agent/test_transcription_registry.py: 27 tests
- tests/hermes_cli/test_plugins_transcription_registration.py: 3 tests
- tests/tools/test_transcription_plugin_dispatch.py: 28 tests
  (covering built-in short-circuit, plugin dispatch, exception
  envelope, non-dict guard, availability gate, language forwarding)
- tests/plugins/transcription/check_parity_vs_main.py: 10-scenario
  subprocess-pinned parity harness vs origin/main
- website/docs/user-guide/features/{tts,plugins}.md: docs
Behavior parity
---------------
10 scenarios, 8 OK + 2 expected DIFFs:
  no_provider_error → plugin (plugin-installed scenario)
  no_provider_error → plugin_unavailable (plugin-installed-unavailable
    scenario; PR returns cleaner envelope)
Zero behavior change for users not opting into a plugin.
Follow-up to #30398.
parent 2e0ac31a72
commit 2cd952e110
11 changed files with 1831 additions and 1 deletions
@@ -678,6 +678,50 @@ class PluginContext:
             self.manifest.name, provider.name,
         )
 
+    # -- transcription (STT) provider registration ---------------------------
+
+    def register_transcription_provider(self, provider) -> None:
+        """Register a speech-to-text backend.
+
+        ``provider`` must be an instance of
+        :class:`agent.transcription_provider.TranscriptionProvider`.
+        The ``provider.name`` attribute is what ``stt.provider`` in
+        ``config.yaml`` matches against when routing
+        :func:`tools.transcription_tools.transcribe_audio` calls —
+        **but only when**:
+
+        1. ``provider.name`` is NOT a built-in STT provider name
+           (``local``, ``local_command``, ``groq``, ``openai``,
+           ``mistral``, ``xai``). Built-ins always win — the registry
+           rejects shadowing names with a warning.
+        2. There is NO ``stt.providers.<name>: type: command`` entry
+           with the same name. Command-providers win on name
+           collision because config is more local than plugin install
+           — same precedence rule as TTS.
+
+        Coexists with the in-tree dispatcher and the STT
+        command-provider registry rather than replacing them. The 6
+        built-in STT backends keep their native implementations in
+        ``tools/transcription_tools.py``; this hook is for *new* Python
+        engines (OpenRouter, SenseAudio, Gemini-STT, custom proprietary
+        backends).
+        """
+        from agent.transcription_provider import TranscriptionProvider
+        from agent.transcription_registry import register_provider as _register_stt_provider
+
+        if not isinstance(provider, TranscriptionProvider):
+            logger.warning(
+                "Plugin '%s' tried to register a transcription provider that "
+                "does not inherit from TranscriptionProvider. Ignoring.",
+                self.manifest.name,
+            )
+            return
+        _register_stt_provider(provider)
+        logger.info(
+            "Plugin '%s' registered transcription provider: %s",
+            self.manifest.name, provider.name,
+        )
+
     # -- platform adapter registration ---------------------------------------
 
     def register_platform(