fix(tts): document NeuTTS provider and align install guidance (#1903)

Co-authored-by: charles-édouard <59705750+ccbbccbb@users.noreply.github.com>
This commit is contained in:
Teknium 2026-03-18 02:55:30 -07:00 committed by GitHub
parent fb923d5efc
commit 11f029c311
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 79 additions and 17 deletions

View file

@ -10,13 +10,14 @@ Hermes Agent supports both text-to-speech output and voice message transcription
## Text-to-Speech
Convert text to speech with three providers:
Convert text to speech with four providers:
| Provider | Quality | Cost | API Key |
|----------|---------|------|---------|
| **Edge TTS** (default) | Good | Free | None needed |
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
| **NeuTTS** | Good | Free | None needed |
### Platform Delivery
@ -32,7 +33,7 @@ Convert text to speech with three providers:
```yaml
# In ~/.hermes/config.yaml
tts:
provider: "edge" # "edge" | "elevenlabs" | "openai"
provider: "edge" # "edge" | "elevenlabs" | "openai" | "neutts"
edge:
voice: "en-US-AriaNeural" # 322 voices, 74 languages
elevenlabs:
@ -41,6 +42,11 @@ tts:
openai:
model: "gpt-4o-mini-tts"
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
### Telegram Voice Bubbles & ffmpeg
@ -49,6 +55,7 @@ Telegram voice bubbles require Opus/OGG audio format:
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup
- **Edge TTS** (default) outputs MP3 and needs **ffmpeg** to convert:
- **NeuTTS** outputs WAV and also needs **ffmpeg** to convert for Telegram voice bubbles
```bash
# Ubuntu/Debian
@ -61,7 +68,7 @@ brew install ffmpeg
sudo dnf install ffmpeg
```
Without ffmpeg, Edge TTS audio is sent as a regular audio file (playable, but shows as a rectangular player instead of a voice bubble).
Without ffmpeg, Edge TTS and NeuTTS audio are sent as regular audio files (playable, but shown as a rectangular player instead of a voice bubble).
:::tip
If you want voice bubbles without installing ffmpeg, switch to the OpenAI or ElevenLabs provider.