hermes-agent/gateway
Teknium 8a9ded5b21
feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS (#39659)
* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS

Discord voice mode can now feel conversational: the bot speaks a short
acknowledgement before it starts working, and a subtle ambient 'thinking' bed
plays underneath while tools run, ducking under speech and swelling back — the
Grok-voice-mode feel.

discord.py plays only one audio stream per voice connection, so this adds a
software mixer (VoiceMixer, a discord.AudioSource) installed once per guild on
join. It sums an ambient loop, verbal acks, and TTS replies into that single
20ms/48kHz/stereo stream (numpy int16 add + clip), so they overlap instead of
stop-and-swap. Speech ducks the ambient gain down and releases it smoothly.

- plugins/platforms/discord/voice_mixer.py: VoiceMixer + MixerChild (gain,
  loop, fade, duck/release), decode_to_pcm (ffmpeg), synth_ambient_pcm (no
  asset needed — synthesised pad).
- adapter: install mixer on join, tear down on leave, route
  play_in_voice_channel through the mixer (legacy one-shot path kept as
  fallback), play_ack_in_voice, voice_mixer_active. Defensive getattr for the
  object.__new__ test helpers.
- gateway/run.py: tool_start_callback fires a one-time verbal ack on the first
  tool call of a turn when in a voice channel (independent of the text
  tool-progress gate). No system-prompt or message-flow changes.
- config: discord.voice_fx.* (OFF by default; ambient/duck/speech gains, ack
  phrases). All in config.yaml, not .env.
- docs + tests (mixer unit + adapter integration).

Verified: 19 new tests pass, existing voice suite green (2 pre-existing
davey-module env failures unchanged), and a real-mixer E2E confirms ambient
streams, TTS overlaps it, acks layer in, and teardown is clean.

* fix(discord): make voice mixer numpy import lazy (numpy is voice-extra-only)

numpy ships in the optional 'voice' extra, not [all,dev], so a module-level
'import numpy' broke CI test collection (and would break the always-imported
Discord adapter on any install without the voice extra). Defer numpy to the
functions that actually mix audio via _require_numpy(); guard the test module
with pytest.importorskip('numpy').
2026-06-05 03:10:40 -07:00
..
assets fix: improve telegram topic mode setup 2026-05-04 12:07:17 -07:00
builtin_hooks remove: BOOT.md built-in hook (#17093) 2026-04-28 09:50:27 -07:00
platforms fix(deps): promote markdown to a core dependency so rich delivery works out of the box (#32486) (#38649) 2026-06-04 16:46:36 -07:00
__init__.py docs(gateway): mention Weixin in gateway help and docstrings 2026-05-12 17:08:51 -07:00
channel_directory.py refactor(ntfy): convert built-in adapter to platform plugin 2026-05-23 16:13:01 -07:00
config.py fix(gateway): bridge shared-key loop to nested platform config blocks 2026-06-04 05:31:47 -07:00
delivery.py fix(gateway): drop outbound silence-narration messages pre-send 2026-05-29 19:06:05 -07:00
display_config.py fix(gateway): keep Telegram heartbeat + interim commentary on; edit heartbeat in place (#33187) 2026-05-27 05:21:53 -07:00
hooks.py fix(plugins): register dynamically-loaded modules in sys.modules before exec 2026-04-29 23:34:35 -07:00
memory_monitor.py Port from cline/cline#10343: periodic gateway memory logging (#27102) 2026-05-16 12:55:23 -07:00
mirror.py refactor(gateway): drop _append_to_jsonl from mirror 2026-05-20 13:00:57 -07:00
pairing.py fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips 2026-05-23 01:46:34 -07:00
platform_registry.py refactor(plugins): add apply_yaml_config_fn registry hook 2026-05-13 22:20:30 -07:00
restart.py fix(gateway): address restart review feedback 2026-04-10 21:18:34 -07:00
run.py feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS (#39659) 2026-06-05 03:10:40 -07:00
runtime_footer.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
session.py feat(gateway): bring /undo [N] to messaging platforms (parity with CLI/TUI) (#36699) 2026-06-01 02:04:14 -07:00
session_context.py fix(desktop): stabilize project folder sessions (#37586) 2026-06-02 20:23:09 +00:00
shutdown_forensics.py chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926) 2026-05-11 11:03:29 -07:00
slash_access.py feat(gateway): per-platform admin/user split for slash commands (salvage of #4443) (#23373) 2026-05-10 12:33:54 -07:00
status.py fix(gateway): tolerate non-UTF-8 status/pid files in gateway status reads 2026-06-04 22:05:23 -07:00
sticker_cache.py fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes 2026-05-19 00:12:41 -07:00
stream_consumer.py fix(telegram): finalize sealed overflow chunk so split streamed replies render formatting 2026-06-04 17:11:12 -07:00
stream_dispatch.py feat(gateway): structured stream-event protocol + Telegram draft formatting parity (#37250) 2026-06-02 00:33:50 -07:00
stream_events.py feat(gateway): structured stream-event protocol + Telegram draft formatting parity (#37250) 2026-06-02 00:33:50 -07:00
whatsapp_identity.py fix(whatsapp_identity): pin identifier regex to ASCII, clarify it's defense-in-depth 2026-04-26 20:48:31 -07:00