mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-09 08:21:50 +00:00
* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS
Discord voice mode can now feel conversational: the bot speaks a short
acknowledgement before it starts working, and a subtle ambient 'thinking' bed
plays underneath while tools run, ducking under speech and swelling back — the
Grok-voice-mode feel.
discord.py plays only one audio stream per voice connection, so this adds a
software mixer (VoiceMixer, a discord.AudioSource) installed once per guild on
join. It sums an ambient loop, verbal acks, and TTS replies into that single
20ms/48kHz/stereo stream (numpy int16 add + clip), so they overlap instead of
stop-and-swap. Speech ducks the ambient gain down and releases it smoothly.
- plugins/platforms/discord/voice_mixer.py: VoiceMixer + MixerChild (gain,
loop, fade, duck/release), decode_to_pcm (ffmpeg), synth_ambient_pcm (no
asset needed — synthesised pad).
- adapter: install mixer on join, tear down on leave, route
play_in_voice_channel through the mixer (legacy one-shot path kept as
fallback), play_ack_in_voice, voice_mixer_active. Defensive getattr for the
object.__new__ test helpers.
- gateway/run.py: tool_start_callback fires a one-time verbal ack on the first
tool call of a turn when in a voice channel (independent of the text
tool-progress gate). No system-prompt or message-flow changes.
- config: discord.voice_fx.* (OFF by default; ambient/duck/speech gains, ack
phrases). All in config.yaml, not .env.
- docs + tests (mixer unit + adapter integration).
Verified: 19 new tests pass, existing voice suite green (2 pre-existing
davey-module env failures unchanged), and a real-mixer E2E confirms ambient
streams, TTS overlaps it, acks layer in, and teardown is clean.
* fix(discord): make voice mixer numpy import lazy (numpy is voice-extra-only)
numpy ships in the optional 'voice' extra, not [all,dev], so a module-level
'import numpy' broke CI test collection (and would break the always-imported
Discord adapter on any install without the voice extra). Defer numpy to the
functions that actually mix audio via _require_numpy(); guard the test module
with pytest.importorskip('numpy').
|
||
|---|---|---|
| .. | ||
| docs | ||
| i18n/zh-Hans/docusaurus-plugin-content-docs/current | ||
| scripts | ||
| src | ||
| static | ||
| .gitignore | ||
| docusaurus.config.ts | ||
| package-lock.json | ||
| package.json | ||
| README.md | ||
| sidebars.ts | ||
| tsconfig.json | ||
Website
This website is built using Docusaurus, a modern static website generator.
Installation
yarn
Local Development
yarn start
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
Build
yarn build
This command generates static content into the build directory and can be served using any static contents hosting service.
Deployment
Using SSH:
USE_SSH=true yarn deploy
Not using SSH:
GIT_USER=<Your GitHub username> yarn deploy
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.
Diagram Linting
CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (````mermaid`) or plain lists/tables instead of ASCII boxes to avoid CI failures.