mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-30 06:41:51 +00:00
Two coordinated changes that unblock downstream audio pipelines (diarization, custom transcription, archival) on attachments larger than the public Bot API's 20MB getFile ceiling.

- `stt.enabled: false` no longer drops voice/audio with a generic "transcription disabled" note. The gateway probes the cached file's duration (wave → mutagen → ffprobe ladder) and surfaces `[The user sent a voice message: <abs path> (duration: M:SS)]` to the agent so a skill or tool can pick up the raw file. The previous placeholder is replaced rather than appended when present.
- `platforms.telegram.extra.base_url` set → adapter auto-lifts its document size cap from 20MB to 2GB (the local telegram-bot-api `--local` ceiling) and the "too large" reply reports the active limit dynamically. No new config knob; presence of `base_url` is the opt-in.
- `platforms.telegram.extra.local_mode: true` wires `Application.builder().local_mode(True)` on the python-telegram-bot builder. PTB then reads files from disk instead of HTTP, which is required when telegram-bot-api runs in `--local` mode (the server returns absolute filesystem paths, not `/file/bot...` URLs).

- gateway/run.py: rewrites the `stt.enabled: false` branch of `_enrich_message_with_transcription`. New `_format_duration` + `_probe_audio_duration` helpers.
- gateway/platforms/telegram.py: `_max_doc_bytes` instance attribute derived from `extra.base_url`; `local_mode` builder wiring; dynamic "too large" message.

- tests/gateway/test_stt_config.py: covers path-surfacing with and without an existing user message, and placeholder replacement.
- tests/gateway/test_telegram_max_doc_bytes.py: 3 cases: default 20MB without base_url, 2GB when set, empty-string base_url keeps default.

- website/docs/user-guide/messaging/telegram.md: new "Skipping STT" subsection under Voice Messages and a full "Large Files (>20MB) via Local Bot API Server" walkthrough (api_id/api_hash, docker-compose, one-time `logOut` migration, `platforms.telegram.extra` config, the `local_mode` disk-access requirement, the silent HTTP-fallback 404).
- website/docs/user-guide/features/voice-mode.md: documents the `stt.enabled` knob in the config reference.

- `pytest tests/gateway/test_telegram_max_doc_bytes.py tests/gateway/test_stt_config.py` → 9/9 passing.
- Verified end-to-end on a live deployment: gateway log shows `Using custom Telegram base_url: http://...` and `Using Telegram local_mode (read files from disk)` on startup; voice messages above 20MB cache to disk and surface their path to the agent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
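The commit message describes two small pieces of logic: a duration probe that falls through wave → mutagen → ffprobe, and a document size cap that depends on whether `base_url` is set. The sketch below illustrates both under stated assumptions and is not the code in `gateway/run.py` or `gateway/platforms/telegram.py`. The function names, the binary interpretation of "20MB" and "2GB", and the fallback note when no duration can be determined are all hypothetical.

```python
import json
import os
import shutil
import subprocess
import wave
from typing import Optional

# Assumption: "20MB" and "2GB" are read as binary multiples here.
DEFAULT_MAX_DOC_BYTES = 20 * 1024 * 1024       # public Bot API getFile ceiling
LOCAL_MAX_DOC_BYTES = 2 * 1024 * 1024 * 1024   # local telegram-bot-api --local ceiling


def format_duration(seconds: float) -> str:
    """Render seconds as M:SS (assumption: minutes are not capped at 59)."""
    total = max(0, int(round(seconds)))
    return f"{total // 60}:{total % 60:02d}"


def probe_audio_duration(path: str) -> Optional[float]:
    """Try wave, then mutagen, then ffprobe; return seconds, or None."""
    # 1. Standard library: succeeds only for PCM WAV files.
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnframes() / float(wav.getframerate())
    except (wave.Error, EOFError, OSError, ZeroDivisionError):
        pass
    # 2. mutagen, if installed: reads container metadata for many formats.
    try:
        import mutagen
        audio = mutagen.File(path)
        if audio is not None and audio.info is not None:
            return float(audio.info.length)
    except Exception:
        pass
    # 3. ffprobe, if it is on PATH.
    if shutil.which("ffprobe"):
        try:
            out = subprocess.run(
                ["ffprobe", "-v", "error", "-show_entries", "format=duration",
                 "-of", "json", path],
                capture_output=True, text=True, timeout=10, check=True,
            ).stdout
            return float(json.loads(out)["format"]["duration"])
        except (subprocess.SubprocessError, ValueError, KeyError, OSError):
            pass
    return None


def voice_note(path: str) -> str:
    """Build the note surfaced to the agent when STT is disabled."""
    abs_path = os.path.abspath(path)
    duration = probe_audio_duration(abs_path)
    if duration is None:
        # Assumption: the duration suffix is simply omitted when probing fails.
        return f"[The user sent a voice message: {abs_path}]"
    return (f"[The user sent a voice message: {abs_path} "
            f"(duration: {format_duration(duration)})]")


def max_doc_bytes(base_url: Optional[str]) -> int:
    """A non-empty base_url is the opt-in for the larger cap."""
    return LOCAL_MAX_DOC_BYTES if base_url else DEFAULT_MAX_DOC_BYTES
```

Ordering the probes from cheapest to most expensive keeps the common case free of extra dependencies: the `wave` module ships with Python, `mutagen` is an optional import, and `ffprobe` is only spawned when both earlier steps fail. The size-cap helper mirrors the three cases listed for `test_telegram_max_doc_bytes.py`: unset, set, and empty string.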
| | | |
|---|---|---|
| _category_.json | ||
| acp.md | ||
| api-server.md | ||
| batch-processing.md | ||
| browser.md | ||
| built-in-plugins.md | ||
| code-execution.md | ||
| codex-app-server-runtime.md | ||
| computer-use.md | ||
| context-files.md | ||
| context-references.md | ||
| credential-pools.md | ||
| cron.md | ||
| curator.md | ||
| delegation.md | ||
| deliverable-mode.md | ||
| extending-the-dashboard.md | ||
| fallback-providers.md | ||
| goals.md | ||
| honcho.md | ||
| hooks.md | ||
| image-generation.md | ||
| kanban-tutorial.md | ||
| kanban-worker-lanes.md | ||
| kanban.md | ||
| lsp.md | ||
| mcp.md | ||
| memory-providers.md | ||
| memory.md | ||
| overview.md | ||
| personality.md | ||
| plugins.md | ||
| provider-routing.md | ||
| skills.md | ||
| skins.md | ||
| spotify.md | ||
| subscription-proxy.md | ||
| tool-gateway.md | ||
| tools.md | ||
| tts.md | ||
| vision.md | ||
| voice-mode.md | ||
| web-dashboard.md | ||
| web-search.md | ||
| x-search.md | ||