fix(relay): re-attach DM author user_id on outbound for connector egress

A DM reply carries no guild_id, so the connector's egress guard cannot
resolve the owning tenant from metadata.guild_id and declines the send
with "discord egress declined: target not routed to an onboarded tenant"
— the bug behind "the bot never replies in DMs". Guild replies are
unaffected (they carry guild_id), which is why the guild path worked
end-to-end while DMs looked broken.

The connector now resolves a DM reply's tenant from the recipient's
author binding (gateway-gateway #67, resolveByUser keyed on
metadata.user_id) — the outbound counterpart to inbound Phase 7a
author-first resolution. But it needs the recipient user_id ON the
outbound action, and the adapter only re-attached guild_id
(_capture_scope/_with_scope), which is a no-op for DMs (the docstring
even said so).
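
The two connector-side resolution paths described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the connector's code (which lives in routedEgressGuard.ts). The names `resolve_tenant`, `guard_egress`, `ROUTING_TABLE` and `AUTHOR_BINDINGS` are hypothetical, the ids are made up, and the guild-before-user lookup order is an assumption.

```python
from typing import Dict, Optional

# Hypothetical tables standing in for the connector's routing table and
# author bindings; the ids are made up for illustration.
ROUTING_TABLE: Dict[str, str] = {"guild-9": "tenant-a"}
AUTHOR_BINDINGS: Dict[str, str] = {"user-42": "tenant-a"}

DECLINED = "discord egress declined: target not routed to an onboarded tenant"


def resolve_tenant(metadata: Dict[str, str]) -> Optional[str]:
    """Return the owning tenant, or None when nothing resolves."""
    guild_id = metadata.get("guild_id")
    if guild_id:
        # Guild reply: routing-table lookup keyed on metadata.guild_id.
        return ROUTING_TABLE.get(guild_id)
    user_id = metadata.get("user_id")
    if user_id:
        # DM reply: author-binding lookup keyed on metadata.user_id.
        return AUTHOR_BINDINGS.get(user_id)
    # Neither discriminator present: nothing to resolve from.
    return None


def guard_egress(metadata: Dict[str, str]) -> str:
    """Decline the send when no tenant can be resolved (assumed behaviour)."""
    tenant = resolve_tenant(metadata)
    if tenant is None:
        raise PermissionError(DECLINED)
    return tenant
```

Under these assumptions a DM reply with empty metadata is declined, while the same reply carrying `user_id` resolves through the author binding.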

This extends the adapter's inbound-scope capture: for a DM (no guild_id)
remember chat_id -> the authentic author user_id we observed, and
re-attach it as metadata.user_id on outbound. Guild capture is unchanged
and wins when present; user_id is the DM-only fallback. The id is the one
the connector observed inbound (never gateway-asserted), so the trust
invariant holds.
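
The capture and re-attach rules above can be condensed into a small standalone sketch. `ScopeTracker`, `capture` and `with_scope` are hypothetical names used for illustration; in the actual change the logic lives in the adapter's `_capture_scope` and `_with_scope` methods and reads the ids from the inbound event's source.

```python
from typing import Any, Dict, Optional


class ScopeTracker:
    """Illustrative stand-in for the adapter's capture/re-attach pair."""

    def __init__(self) -> None:
        self.scope_by_chat: Dict[str, str] = {}    # chat_id -> guild_id
        self.dm_user_by_chat: Dict[str, str] = {}  # chat_id -> author user_id

    def capture(self, chat_id: Optional[str], guild_id: Optional[str],
                user_id: Optional[str]) -> None:
        """Remember the discriminator observed on an inbound message."""
        if not chat_id:
            return
        if guild_id:
            # Guild message: guild capture wins, user_id is not recorded.
            self.scope_by_chat[str(chat_id)] = str(guild_id)
            return
        if user_id:
            # DM: remember the author we observed inbound.
            self.dm_user_by_chat[str(chat_id)] = str(user_id)

    def with_scope(self, chat_id: str,
                   metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """Return outbound metadata with the missing discriminator added."""
        meta: Dict[str, Any] = dict(metadata or {})
        if not meta.get("guild_id"):
            scope = self.scope_by_chat.get(str(chat_id))
            if scope:
                meta["guild_id"] = scope
        # DM-only fallback: never set alongside guild_id, never overwritten.
        if not meta.get("guild_id") and not meta.get("user_id"):
            dm_user = self.dm_user_by_chat.get(str(chat_id))
            if dm_user:
                meta["user_id"] = dm_user
        return meta
```

For example, after `tracker.capture("dm-1", None, "user-42")`, calling `tracker.with_scope("dm-1")` returns `{"user_id": "user-42"}`, while a chat that was never seen inbound gets its metadata back unchanged.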

+4 unit tests (DM reply re-attaches user_id + no guild_id; unknown chat
invents nothing; explicit user_id preserved; guild reply never carries
user_id). Proved load-bearing (reverting the re-attach fails the DM
test). 144 relay tests pass, ruff clean.

Pairs with gateway-gateway #67 (the connector-side resolver). Together
they close the DM-reply egress gap end-to-end.
Ben 2026-06-25 12:34:51 +10:00 committed by Ben Barclay
parent c15945655f
commit 0c3f197cff
2 changed files with 119 additions and 12 deletions


@@ -66,6 +66,11 @@ class RelayAdapter(BasePlatformAdapter):
         # re-attach the scope here from what we saw inbound. Keyed by chat_id
         # (channel) since that's what send() receives. See routedEgressGuard.ts.
         self._scope_by_chat: Dict[str, str] = {}
+        # chat_id -> author user_id for DM channels (no guild_id). A DM reply has
+        # no guild discriminator, so the connector resolves its tenant from the
+        # recipient's author binding; we re-attach this user_id as
+        # metadata.user_id on the outbound action so it can. See _capture_scope.
+        self._dm_user_by_chat: Dict[str, str] = {}
         self.supports_code_blocks = descriptor.markdown_dialect not in ("", "plain")
         # Phase 7 Unit 7d-B: watches the transport for a terminal auth revocation
         # (a 4401 close after a successful handshake = the operator opted this
@@ -199,29 +204,65 @@ class RelayAdapter(BasePlatformAdapter):
         await self.handle_message(event)

     def _capture_scope(self, event) -> None:
-        """Remember chat_id -> guild scope from an inbound event so our outbound
-        (the agent's reply) can re-assert it for the connector's egress tenant
-        resolution. Never raises: scope tracking must not break inbound."""
+        """Remember a chat_id's egress discriminator from an inbound event so our
+        outbound (the agent's reply) can re-assert it for the connector's egress
+        tenant resolution. Never raises: scope tracking must not break inbound.
+        Two cases, matching the connector's two tenant-resolution paths:
+        - GUILD message: remember chat_id -> guild_id. The connector resolves
+          the tenant from metadata.guild_id (routing table).
+        - DM (no guild_id): remember chat_id -> the authentic author user_id.
+          A DM carries no guild discriminator, so the connector instead resolves
+          the tenant from the recipient's author binding (resolveByUser); it
+          needs the user_id on the OUTBOUND action to do that. Without this, a
+          DM reply has no resolvable discriminator and the connector's egress
+          guard declines it as "target not routed to an onboarded tenant".
+        See gateway-gateway routedEgressGuard.ts / discordTenantOf.
+        """
         try:
             src = getattr(event, "source", None)
-            scope = getattr(src, "guild_id", None) if src else None
-            chat = getattr(src, "chat_id", None) if src else None
-            if scope and chat:
-                self._scope_by_chat[str(chat)] = str(scope)
+            if not src:
+                return
+            chat = getattr(src, "chat_id", None)
+            if not chat:
+                return
+            guild = getattr(src, "guild_id", None)
+            if guild:
+                self._scope_by_chat[str(chat)] = str(guild)
+                return
+            # DM: no guild_id. Remember the authentic author id for outbound
+            # author-binding resolution (the user we're replying to in this DM).
+            user_id = getattr(src, "user_id", None)
+            if user_id:
+                self._dm_user_by_chat[str(chat)] = str(user_id)
         except Exception:  # noqa: BLE001 - scope tracking must never break inbound
             pass

     def _with_scope(self, chat_id: str, metadata: Optional[Dict[str, Any]]) -> Dict[str, Any]:
-        """Ensure the outbound metadata carries guild_id for the connector's
-        egress tenant resolution. The connector resolves the owning tenant from
-        metadata.guild_id (Discord); without it egress is declined as
-        'target not routed to an onboarded tenant'. No-op when we have no scope
-        for this chat (e.g. DMs) or it's already present."""
+        """Ensure the outbound metadata carries the discriminator the connector's
+        egress guard needs to resolve the owning tenant. Two cases:
+        - GUILD reply: re-attach metadata.guild_id (routing-table resolution).
+        - DM reply: there is no guild_id, so re-attach metadata.user_id (the
+          authentic author id we saw inbound), which the connector resolves to
+          the tenant via the recipient's author binding (resolveByUser). Without
+          one of these, egress is declined as 'target not routed to an onboarded
+          tenant'. See gateway-gateway routedEgressGuard.ts / discordTenantOf.
+        No-op when the relevant value is already present or unknown for this chat.
+        """
         meta: Dict[str, Any] = dict(metadata or {})
         if not meta.get("guild_id"):
             scope = self._scope_by_chat.get(str(chat_id))
             if scope:
                 meta["guild_id"] = scope
+        # DM author-binding discriminator. Only meaningful when there's no guild
+        # (a guild reply resolves by guild_id); harmless to carry otherwise, but
+        # we only set it when this chat is a known DM and the field is absent.
+        if not meta.get("guild_id") and not meta.get("user_id"):
+            dm_user = self._dm_user_by_chat.get(str(chat_id))
+            if dm_user:
+                meta["user_id"] = dm_user
         return meta

     async def on_interrupt(self, session_key: str, chat_id: str) -> None:


@@ -104,6 +104,21 @@ def _make_event(chat_id="chan-1", guild_id="guild-9"):
     return MessageEvent(text="hi", source=src, message_type=MessageType.TEXT)


+def _make_dm_event(chat_id="dm-1", user_id="user-42"):
+    """An inbound DM: no guild_id, carries the authentic author user_id."""
+    from gateway.platforms.base import MessageEvent, MessageType
+    from gateway.session import SessionSource
+    src = SessionSource(
+        platform=Platform.RELAY,
+        chat_id=chat_id,
+        chat_type="dm",
+        guild_id=None,
+        user_id=user_id,
+    )
+    return MessageEvent(text="hi", source=src, message_type=MessageType.TEXT)
+
+
 @pytest.mark.asyncio
 async def test_send_reattaches_guild_id_from_inbound_scope():
     """The connector's egress guard resolves the owning tenant from
@@ -142,6 +157,57 @@ async def test_send_preserves_explicit_guild_id():
     assert t.sent["metadata"]["guild_id"] == "explicit-1"


+@pytest.mark.asyncio
+async def test_send_reattaches_dm_user_id_from_inbound_scope():
+    """A DM reply has no guild_id, so the connector resolves the tenant from the
+    recipient's author binding; it needs metadata.user_id. The adapter must
+    re-attach the authentic author id learned from the inbound DM. Regression for
+    live 'discord egress declined: target not routed to an onboarded tenant' on
+    DM replies (the connector-side fix is gateway-gateway #67)."""
+    t = _CaptureTransport()
+    a = RelayAdapter(PlatformConfig(), make_desc(platform="discord"), transport=t)
+    a._capture_scope(_make_dm_event(chat_id="dm-1", user_id="user-42"))
+    await a.send("dm-1", "the reply")
+    assert t.sent["metadata"].get("user_id") == "user-42"
+    # A DM carries no guild_id, only the author discriminator.
+    assert "guild_id" not in t.sent["metadata"]
+
+
+@pytest.mark.asyncio
+async def test_send_dm_does_not_invent_user_id_for_unknown_chat():
+    """A chat we never saw inbound gets neither discriminator: a no-op."""
+    t = _CaptureTransport()
+    a = RelayAdapter(PlatformConfig(), make_desc(platform="discord"), transport=t)
+    await a.send("unknown-dm", "hi")
+    assert "user_id" not in t.sent["metadata"]
+    assert "guild_id" not in t.sent["metadata"]
+
+
+@pytest.mark.asyncio
+async def test_send_preserves_explicit_user_id():
+    """An explicitly-provided metadata.user_id is never overwritten."""
+    t = _CaptureTransport()
+    a = RelayAdapter(PlatformConfig(), make_desc(platform="discord"), transport=t)
+    a._capture_scope(_make_dm_event(chat_id="dm-1", user_id="user-42"))
+    await a.send("dm-1", "hi", metadata={"user_id": "explicit-user"})
+    assert t.sent["metadata"]["user_id"] == "explicit-user"
+
+
+@pytest.mark.asyncio
+async def test_guild_reply_does_not_carry_user_id():
+    """A guild reply resolves by guild_id and must NOT carry a DM user_id even if
+    the same chat_id was somehow seen: guild capture wins and user_id stays out
+    (guild_id is the discriminator; user_id is the DM-only fallback)."""
+    t = _CaptureTransport()
+    a = RelayAdapter(PlatformConfig(), make_desc(platform="discord"), transport=t)
+    a._capture_scope(_make_event(chat_id="chan-1", guild_id="guild-9"))
+    await a.send("chan-1", "hi")
+    assert t.sent["metadata"].get("guild_id") == "guild-9"
+    assert "user_id" not in t.sent["metadata"]


 # ── Phase 7 Unit 7d-B: terminal auth revocation → clean "relay disabled" ─────