mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-23 10:42:00 +00:00
Slack in-app voice clips ("record a clip") arrive as MP4/AAC containers
(mimetype audio/mp4, filename audio_message*.mp4), and Slack sometimes
labels them video/mp4. The inbound audio handler derived the cache
extension from the mimetype and fell back to ".ogg" for anything not in
{.ogg,.mp3,.wav,.webm,.m4a} — so audio/mp4 voice messages were cached as
.ogg. OpenAI STT (whisper-1, gpt-4o-transcribe) sniffs the container from
the FILENAME extension, so it received MP4 bytes named .ogg and rejected
them. WhatsApp .ogg and uploaded .m4a worked only because their extension
happened to match the bytes.
Fix:
- _resolve_slack_audio_ext(): pick the cache extension from the real
filename first, then a mimetype map (audio/mp4 -> .m4a), defaulting to
.m4a — never the bogus .ogg fallback. Mirrors the video branch and the
audio map already in gateway/platforms/bluebubbles.py.
- _is_slack_voice_clip(): detect audio-only clips mislabeled video/mp4
via the slack_audio subtype / audio_message* filename, and route them
through the audio path (cached as audio, reported as audio/*) so they
reach STT instead of video understanding. Genuine videos (and
slack_video screen recordings) are left on the video path.
Verified end-to-end against a real audio-only MP4: old path cached it as
.ogg (ffprobe shows MP4 bytes -> container mismatch -> OpenAI rejects);
new path caches it as .mp4 (extension matches bytes -> accepted).
Adds inbound-audio tests (previously none): helper unit tests plus
_handle_slack_message E2E coverage for audio/mp4, video/mp4-mislabeled
voice clips, and a real video staying on the video path. Confirmed the
two voice-message tests fail without the fix (mutation check).
|
||
|---|---|---|
| .. | ||
| browser | ||
| context_engine | ||
| cron | ||
| dashboard_auth | ||
| disk-cleanup | ||
| google_meet | ||
| hermes-achievements | ||
| image_gen | ||
| kanban | ||
| memory | ||
| model-providers | ||
| observability | ||
| platforms | ||
| security-guidance | ||
| spotify | ||
| teams_pipeline | ||
| video_gen | ||
| web | ||
| __init__.py | ||
| plugin_utils.py | ||