fix(slack): harden attachment handling

Multiple overlapping Slack attachment improvements:

1. Upload retry with backoff on transient errors (429, 5xx, connection
   reset, rate_limited, service unavailable). New _is_retryable_upload_error
   helper covers three upload paths: _upload_file, send_video,
   send_document. Up to 3 attempts with 1.5s * attempt backoff.

2. Thread participation tracking: successful file uploads now add the
   thread_ts to _bot_message_ts, mirroring how text replies are tracked.
   This lets follow-up thread messages auto-trigger the bot (same
   engagement rules as replied threads).

3. Thread metadata preservation in the image redirect-guard fallback
   (send_image → send text fallback) and in two gateway.run.py send
   paths (image + document fallback calls).

4. HTML response rejection in _download_slack_file_bytes. Parallels
   the existing check in _download_slack_file. Guards against Slack
   returning a sign-in / redirect page as document bytes when scopes
   are missing, so the agent doesn't get HTML-as-a-PDF.

5. File lifecycle event acks (file_shared / file_created / file_change).
   These events arrive around snippet uploads. Acking them silences the
   slack_bolt 'Unhandled request' 404 warnings without changing behavior.

6. Post-loop message type classification so a mixed image+document upload
   classifies as PHOTO (or VOICE if no image), falling back to DOCUMENT.
   Previously, the per-file classification in the inbound loop could be
   overwritten unpredictably.

7. Expanded text-inject whitelist in inbound document handling to cover
   .csv, .json, .xml, .yaml, .yml, .toml, .ini, .cfg (up to 100KB) so
   snippets and config files are directly visible to the agent, not just
   cached as opaque uploads. Paired with new MIME entries in
   SUPPORTED_DOCUMENT_TYPES in base.py.

Squashed from two commits in #11819 so the single commit carries the
contributor's GitHub attribution (the original commits were authored
under a local dev hostname).
This commit is contained in:
ghostmfr 2026-04-26 18:16:15 -07:00 committed by Teknium
parent b16f9d438b
commit e818ec520a
5 changed files with 297 additions and 32 deletions

View file

@ -735,6 +735,7 @@ class TestSlackDownloadSlackFileBytes:
fake_response = MagicMock()
fake_response.content = b"raw bytes here"
fake_response.raise_for_status = MagicMock()
fake_response.headers = {"content-type": "application/pdf"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=fake_response)
@ -750,6 +751,29 @@ class TestSlackDownloadSlackFileBytes:
result = asyncio.run(run())
assert result == b"raw bytes here"
def test_rejects_html_response(self):
"""Slack HTML sign-in pages should not be accepted as file bytes."""
adapter = _make_slack_adapter()
fake_response = MagicMock()
fake_response.content = b"<!DOCTYPE html><html><title>Slack</title></html>"
fake_response.raise_for_status = MagicMock()
fake_response.headers = {"content-type": "text/html; charset=utf-8"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=fake_response)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=False)
async def run():
with patch("httpx.AsyncClient", return_value=mock_client):
await adapter._download_slack_file_bytes(
"https://files.slack.com/file.bin"
)
with pytest.raises(ValueError, match="HTML instead of file bytes"):
asyncio.run(run())
def test_retries_on_429_then_succeeds(self):
"""429 on first attempt is retried; raw bytes returned on second."""
adapter = _make_slack_adapter()
@ -757,6 +781,7 @@ class TestSlackDownloadSlackFileBytes:
ok_response = MagicMock()
ok_response.content = b"final bytes"
ok_response.raise_for_status = MagicMock()
ok_response.headers = {"content-type": "application/pdf"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(