hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

History

kshitijk4poor 00ec0b617c feat(tts): add register_tts_provider() plugin hook (closes #30398 ) Adds a `TTSProvider(ABC)` + `register_tts_provider()` extension point to the plugin context API, alongside the existing config-driven `tts.providers.<name>: type: command` registry from PR #17843. This is additive — the command-provider surface stays as the primary way to add a TTS backend. The hook covers cases the shell-template grammar can't reasonably express: - Native Python SDKs without a CLI (Cartesia, Fish Audio, etc.) - Streaming synthesis (chunked Opus → voice-bubble delivery) - Voice metadata API for the `hermes tools` picker - OAuth-refreshing auth flows None of the 10 inline built-in providers (`edge`, `openai`, `elevenlabs`, `minimax`, `gemini`, `mistral`, `xai`, `piper`, `kittentts`, `neutts`) are migrated to plugins. They stay inline. The hook is for new engines that aren't built-in. ## Resolution order The dispatcher's resolution order is the load-bearing invariant: 1. `tts.provider` is a built-in name → built-in dispatch. Always wins. 2. `tts.provider` matches `tts.providers.<name>` with `command:` set → command-provider dispatch (PR #17843). 3. `tts.provider` matches a plugin-registered `TTSProvider` → plugin dispatch (new). 4. No match → falls through to Edge TTS default (legacy behavior). Built-ins-always-win is enforced at THREE layers: - Registry: `register_provider()` rejects shadowing names with a warning. - Dispatcher: `_dispatch_to_plugin_provider()` short-circuits built-in names defensively before consulting the registry. - Picker: `_plugin_tts_providers()` filters built-in shadows out of the `hermes tools` row list defensively. Command-providers-win-over-plugins is enforced at TWO layers: - The caller in `text_to_speech_tool` checks `_resolve_command_provider_config` first. - `_dispatch_to_plugin_provider` re-checks for a same-name command config defensively so a refactor of the caller can't silently break the invariant. ## New files - `agent/tts_provider.py` — `TTSProvider(ABC)` with `synthesize()` (required), `list_voices()`, `list_models()`, `get_setup_schema()`, `stream()`, `voice_compatible` (all optional with sane defaults). Mirrors `agent/image_gen_provider.py` shape. - `agent/tts_registry.py` — `register_provider`/`get_provider`/`list_providers` with `_BUILTIN_NAMES` reject-shadowing invariant. Mirrors `agent/image_gen_registry.py` shape. - `plugins/tts/...` directory ready for community plugins (none shipped). ## Modified files - `hermes_cli/plugins.py` — `register_tts_provider()` method on `PluginContext`. Matches the gating shape of `register_image_gen_provider()` / `register_browser_provider()`. - `tools/tts_tool.py` — `_dispatch_to_plugin_provider()` + `_plugin_provider_is_voice_compatible()` + walrus-elif wiring into the main dispatcher. Built-in elif chain untouched. - `hermes_cli/tools_config.py` — `_plugin_tts_providers()` injects plugin rows into the Text-to-Speech picker category alongside the 10 hardcoded built-in rows. ## Tests - `tests/agent/test_tts_registry.py` — 47 tests covering registration, lookup, ABC contract, helpers, AND a `TestBuiltinSync` regression test that fails if `agent.tts_registry._BUILTIN_NAMES` drifts from `tools.tts_tool.BUILTIN_TTS_PROVIDERS` (kept duplicated due to circular import constraints). - `tests/tools/test_tts_plugin_dispatch.py` — 35 tests covering built-in-always-wins, command-wins-over-plugin, plugin dispatch, exception passthrough, voice_compatible helper. - `tests/hermes_cli/test_tts_picker.py` — 10 tests covering the picker surface, builtin shadowing defense, integration with `_visible_providers`. - `tests/hermes_cli/test_plugins_tts_registration.py` — 3 end-to-end tests via `PluginManager.discover_and_load()`. - `tests/plugins/tts/check_parity_vs_main.py` — 9-scenario subprocess parity harness vs `origin/main`. The only intentional diff is `fallback_edge → plugin` for the `plugin-installed` scenario. ## Verification - 95/95 new tests pass. - 170/170 pre-existing TTS tests (test_tts_command_providers, test_tts_max_text_length, test_tts_speed, etc.) pass unchanged. - Parity harness against `origin/main`: 8 OK + 1 expected DIFF. - E2E smoke: a registered plugin's `synthesize()` is called via `text_to_speech_tool` with the standard JSON envelope returned. - Ruff clean on all touched files. ## Docs - `website/docs/user-guide/features/tts.md` — new "Python plugin providers" section with a decision table (command-provider vs plugin), minimal plugin example, and the optional-hook reference. - `website/docs/user-guide/features/plugins.md` — TTS row updated to mention both surfaces (command-provider primary, plugin for SDK/streaming). Closes #30398		2026-05-24 18:04:54 -07:00
..
browser	fix(browser): self-review pass — dead-import, log levels, future-proofing	2026-05-17 04:04:15 -07:00
image_gen	refactor(image_gen): port FAL backend to plugins/image_gen/fal	2026-05-22 04:10:45 -07:00
memory	fix(memory): skip OpenViking upload symlinks	2026-05-14 07:48:03 -07:00
model_providers	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape	2026-05-23 02:20:28 -07:00
tts	feat(tts): add register_tts_provider() plugin hook (closes #30398 )	2026-05-24 18:04:54 -07:00
video_gen	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 )	2026-05-17 02:29:41 -07:00
web	test: use subprocesses for each test file (#29016 )	2026-05-21 16:40:04 +05:30
__init__.py	fix: mem0 API v2 compat, prefetch context fencing, secret redaction (#5423 )	2026-04-05 22:43:33 -07:00
test_achievements_plugin.py	test: use subprocesses for each test file (#29016 )	2026-05-21 16:40:04 +05:30
test_disk_cleanup_plugin.py	feat(plugins): make all plugins opt-in by default	2026-04-20 04:46:45 -07:00
test_google_meet_audio.py	feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364 )	2026-04-27 06:22:25 -07:00
test_google_meet_node.py	feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364 )	2026-04-27 06:22:25 -07:00
test_google_meet_plugin.py	feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364 )	2026-04-27 06:22:25 -07:00
test_google_meet_realtime.py	feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364 )	2026-04-27 06:22:25 -07:00
test_kanban_dashboard_plugin.py	fix(kanban-dashboard): restore implementations dropped during salvages (#28481 )	2026-05-18 21:54:56 -07:00
test_kanban_worker_runs.py	feat(kanban): worker visibility endpoints (workers/active, runs/{id}, inspect)	2026-05-18 21:01:47 -07:00
test_langfuse_plugin.py	fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342 , #22763 ) (#26320 )	2026-05-15 05:04:02 -07:00
test_retaindb_plugin.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797 )	2026-04-17 14:21:22 -07:00
test_teams_pipeline_plugin.py	fix(teams-pipeline): fill in missing delivery URL in adapter-reuse test	2026-05-08 12:00:09 -07:00