hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-25 17:04:52 +00:00

Teknium fce6c3cdf6 feat(tts): add Google Gemini TTS provider (#11229)

Adds Google Gemini TTS as the seventh voice provider, with 30 prebuilt voices (Zephyr, Puck, Kore, Enceladus, Gacrux, etc.) and natural-language prompt control.

Integrates through the existing provider chain:

- tools/tts_tool.py: new _generate_gemini_tts() calls the generativelanguage REST endpoint with responseModalities=[AUDIO], wraps the returned 24kHz mono 16-bit PCM (L16) in a WAV RIFF header, then ffmpeg-converts to MP3 or Opus depending on output extension. For .ogg output, libopus is forced explicitly so Telegram voice bubbles get Opus (ffmpeg defaults to Vorbis for .ogg).
- hermes_cli/tools_config.py: exposes 'Google Gemini TTS' as a provider option in the curses-based 'hermes tools' UI.
- hermes_cli/setup.py: adds gemini to the setup wizard picker, tool status display, and API key prompt branch (accepts existing GEMINI_API_KEY or GOOGLE_API_KEY, falls back to Edge if neither is set).
- tests/tools/test_tts_gemini.py: 15 unit tests covering WAV header wrap correctness, env var fallback (GEMINI/GOOGLE), voice/model overrides, snake_case vs camelCase inlineData handling, HTTP error surfacing, and empty-audio edge cases.
- docs: TTS features page updated to list seven providers with the new gemini config block and ffmpeg notes.

Live-tested with an API key against gemini-2.5-flash-preview-tts: .wav, .mp3, and Telegram-compatible .ogg (Opus codec) all produce valid playable audio.

2026-04-16 14:23:16 -07:00
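The PCM-to-WAV wrap and the .ogg codec choice described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code in tools/tts_tool.py: the helper names pcm_to_wav and ffmpeg_args are hypothetical, the WAV header is produced with Python's standard wave module rather than whatever the commit actually does, and the ffmpeg command is built but not executed.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM in a WAV RIFF container (hypothetical helper).

    Defaults match the format named in the commit message:
    24 kHz, mono, 16-bit (2 bytes per sample).
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def ffmpeg_args(src: str, dst: str) -> list:
    """Build an ffmpeg argument list for the conversion step (hypothetical helper).

    For .ogg output the audio codec is set to libopus explicitly,
    since ffmpeg would otherwise pick Vorbis for that container.
    """
    args = ["ffmpeg", "-y", "-i", src]
    if dst.lower().endswith(".ogg"):
        args += ["-c:a", "libopus"]
    args.append(dst)
    return args


if __name__ == "__main__":
    wav = pcm_to_wav(b"\x00\x00" * 100)  # 100 silent 16-bit samples
    print(wav[:4], wav[8:12], len(wav))
    print(ffmpeg_args("speech.wav", "speech.ogg"))
```

The resulting list could be passed to subprocess.run on a machine where ffmpeg is installed with libopus support; other output extensions are left to ffmpeg's default codec selection in this sketch.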
features	feat(tts): add Google Gemini TTS provider (#11229)	2026-04-16 14:23:16 -07:00
messaging	feat(telegram): add dedicated TELEGRAM_PROXY env var and config.yaml proxy_url support	2026-04-15 22:13:11 -07:00
skills	feat(google-workspace): add --from flag for custom sender display name (#9931)	2026-04-14 16:55:34 -07:00
_category_.json	feat: add documentation website (Docusaurus)	2026-03-05 05:24:55 -08:00
checkpoints-and-rollback.md	docs: restructure site navigation — promote features and platforms to top-level (#4116)	2026-03-30 18:39:51 -07:00
cli.md	docs: comprehensive update for recent merged PRs (#9019)	2026-04-13 10:50:59 -07:00
configuration.md	fix: stop hermes update from nagging about llm-wiki's wiki.path (#11222)	2026-04-16 13:34:16 -07:00
docker.md	docs(docker): add dashboard section, expose API port, update Compose example	2026-04-14 15:41:30 -07:00
git-worktrees.md	docs: restructure site navigation — promote features and platforms to top-level (#4116)	2026-03-30 18:39:51 -07:00
profiles.md	fix(honcho): plugin drift overhaul -- observation config, chunking, setup wizard, docs, dead code cleanup	2026-04-05 12:34:11 -07:00
security.md	docs: comprehensive docs audit — cover 13 features from last week's PRs (#5815)	2026-04-07 10:21:03 -07:00
sessions.md	docs: add QQBot to all 14 docs pages (full platform parity)	2026-04-14 00:11:49 -07:00