hermes-agent/website/docs/user-guide/features
Albert G ad2531be08 feat(telegram): skip-STT audio path + 2GB cap via local Bot API server
Two coordinated changes that unblock downstream audio pipelines
(diarization, custom transcription, archival) on attachments larger
than the public Bot API's 20MB getFile ceiling.

- `stt.enabled: false` no longer drops voice/audio with a generic
  "transcription disabled" note. The gateway probes the cached file's
  duration (wave → mutagen → ffprobe ladder) and surfaces
  `[The user sent a voice message: <abs path> (duration: M:SS)]` to
  the agent so a skill or tool can pick up the raw file. The previous
  placeholder is replaced rather than appended when present.

- `platforms.telegram.extra.base_url` set → adapter auto-lifts its
  document size cap from 20MB to 2GB (the local telegram-bot-api
  `--local` ceiling) and the "too large" reply reports the active
  limit dynamically. No new config knob; presence of `base_url` is the
  opt-in.

- `platforms.telegram.extra.local_mode: true` wires
  `Application.builder().local_mode(True)` on the python-telegram-bot
  builder. PTB then reads files from disk instead of HTTP, which is
  required when telegram-bot-api runs in `--local` mode (the server
  returns absolute filesystem paths, not `/file/bot...` URLs).

- gateway/run.py: rewrites the `stt.enabled: false` branch of
  `_enrich_message_with_transcription`. New `_format_duration` +
  `_probe_audio_duration` helpers.
- gateway/platforms/telegram.py: `_max_doc_bytes` instance attribute
  derived from `extra.base_url`; `local_mode` builder wiring;
  dynamic "too large" message.
- tests/gateway/test_stt_config.py: covers path-surfacing with and
  without an existing user message, and placeholder replacement.
- tests/gateway/test_telegram_max_doc_bytes.py: 3 cases — default 20MB
  without base_url, 2GB when set, empty-string base_url keeps default.
- website/docs/user-guide/messaging/telegram.md: new "Skipping STT"
  subsection under Voice Messages and a full "Large Files (>20MB) via
  Local Bot API Server" walkthrough (api_id/api_hash, docker-compose,
  one-time `logOut` migration, `platforms.telegram.extra` config, the
  `local_mode` disk-access requirement, the silent HTTP-fallback 404).
- website/docs/user-guide/features/voice-mode.md: documents the
  `stt.enabled` knob in the config reference.

- `pytest tests/gateway/test_telegram_max_doc_bytes.py
  tests/gateway/test_stt_config.py` → 9/9 passing.
- Verified end-to-end on a live deployment: gateway log shows
  `Using custom Telegram base_url: http://...` and
  `Using Telegram local_mode (read files from disk)` on startup;
  voice messages above 20MB cache to disk and surface their path to
  the agent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 22:59:40 -07:00
..
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
acp.md feat(acp): hermes acp --setup-browser bootstraps browser tools for registry installs 2026-05-15 01:38:24 -07:00
api-server.md docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) 2026-05-09 13:19:51 -07:00
batch-processing.md fix: normalize remaining reasoning effort orderings and add missing 'minimal' 2026-04-09 14:20:16 -07:00
browser.md fix(browser): honor pre-set AGENT_BROWSER_ARGS and document the bypass 2026-05-14 19:02:17 -07:00
built-in-plugins.md fix(plugins): remove unreachable hermes tools → Langfuse path 2026-05-16 17:15:19 -07:00
code-execution.md docs(execute_code): document project/strict execution modes (#12073) 2026-04-18 01:53:09 -07:00
codex-app-server-runtime.md docs(codex_app_server): document multi-root Kanban writable_roots (#27941) 2026-05-18 21:03:25 -07:00
computer-use.md feat(computer-use): refresh cua-driver on hermes update + add install --upgrade (#24063) 2026-05-11 17:10:58 -07:00
context-files.md feat: progressive subdirectory hint discovery (#5291) 2026-04-05 12:33:47 -07:00
context-references.md docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) 2026-04-05 19:45:50 -07:00
credential-pools.md docs(anthropic): correct OAuth scope to Max plan + extra usage credits only (#17404) 2026-04-29 04:11:14 -07:00
cron.md fix(cron): route Telegram cron deliveries to a dedicated topic via TELEGRAM_CRON_THREAD_ID 2026-05-18 22:36:11 -07:00
curator.md docs(curator): update CLI docs for synchronous-by-default manual run 2026-05-07 05:27:47 -07:00
delegation.md docs(delegation): show api_mode override in custom-endpoint example 2026-05-16 20:32:43 -07:00
deliverable-mode.md feat(gateway): deliverable mode — ship artifacts as native uploads from any agent surface (#27813) 2026-05-18 02:14:43 -07:00
extending-the-dashboard.md feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194) 2026-05-10 07:09:28 -07:00
fallback-providers.md feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
goals.md makes the Persistent Goals docs accessible in the docs nav (and llms.txt) (#18481) 2026-05-01 10:29:22 -07:00
honcho.md docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) 2026-05-09 13:19:51 -07:00
hooks.md test+docs: cover transform_llm_output hook + release author map 2026-05-07 05:46:05 -07:00
image-generation.md feat(image-gen): add GPT Image 2 to FAL catalog (#13677) 2026-04-21 13:35:31 -07:00
kanban-tutorial.md docs: align kanban readiness docs and smoke tests 2026-05-18 21:07:03 -07:00
kanban-worker-lanes.md feat(kanban): stranded_in_ready diagnostic for unclaimed tasks (#23578) 2026-05-10 21:58:44 -07:00
kanban.md feat(kanban): configure worktree paths and branches 2026-05-18 21:33:08 -07:00
lsp.md docs(lsp): replace "git worktree" with "git repository" in LSP docs 2026-05-13 23:05:20 -07:00
mcp.md feat: add supports_parallel_tool_calls for MCP servers (#26825) 2026-05-16 01:04:28 -07:00
memory-providers.md docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) 2026-05-09 13:19:51 -07:00
memory.md docs(session_search): update all docs for the single-shape rewrite (#27840) 2026-05-18 00:36:17 -07:00
overview.md feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
personality.md docs: document SOUL.md as primary agent identity (#1927) 2026-03-18 04:18:08 -07:00
plugins.md fix(plugins): surface category-namespaced plugins in hermes plugins list 2026-05-16 17:15:19 -07:00
provider-routing.md docs: fallback providers + /background command documentation 2026-03-15 06:24:28 -07:00
skills.md feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373) 2026-05-18 21:38:05 -07:00
skins.md fix(tui): honor skin highlight colors (#20895) 2026-05-06 14:01:56 -07:00
spotify.md docs(spotify): document Home Assistant speaker routing 2026-05-16 20:32:43 -07:00
subscription-proxy.md feat(proxy): local OpenAI-compatible proxy for OAuth providers (#25969) 2026-05-14 15:40:48 -07:00
tool-gateway.md docs(tool-gateway): rewrite as pitch-first marketing page (#20827) 2026-05-06 13:20:09 -07:00
tools.md docs(tools): add video_generate / video_gen toolset to user-facing tool docs (#27050) 2026-05-16 11:53:13 -07:00
tts.md docs(voice): add Doubao speech integration examples (TTS + STT) 2026-05-05 13:54:33 -07:00
vision.md docs: two-week gap sweep — platforms, CLI, config, TUI, hooks, providers (#17727) 2026-04-29 20:32:37 -07:00
voice-mode.md feat(telegram): skip-STT audio path + 2GB cap via local Bot API server 2026-05-18 22:59:40 -07:00
web-dashboard.md docs(dashboard): clarify chat tab tui flag 2026-05-16 20:32:43 -07:00
web-search.md docs(web-search): explain auxiliary-model summarization for web_extract (#23211) 2026-05-10 06:40:23 -07:00
x-search.md feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth (#26763) 2026-05-16 00:58:27 -07:00