fix(relay): authorize relay inbound via connector-enforced upstream authz

A hosted instance fronted by the Team Gateway connector dropped EVERY relay
message as "Unauthorized user" and the agent never replied — despite the
message routing correctly through the connector to the instance.

Root cause: gateway authorization (_is_user_authorized) had no notion of
upstream-enforced authz. Platform.RELAY matches no {PLATFORM}_ALLOWED_USERS
allowlist and isn't in the HA/WEBHOOK always-authorized set, so a relay user
with no env allowlist configured hit the default-deny ("No user allowlists
configured. All unauthorized users will be denied."). The message was received,
then silently denied before reaching the agent.

This is incorrect for relay: the connector authenticates the gateway's WS with
a per-instance secret and performs owner-only author-binding resolution BEFORE
delivering. A message only reaches this gateway because the connector resolved
it to THIS instance's bound user (user_instance_binding), keyed on the author id
the connector OBSERVED off the event — never a gateway claim. The authorization
decision is already made by a trusted, authenticated upstream; there is no local
RELAY_ALLOWED_USERS allowlist to consult, and default-denying for its absence is
the bug.

Fix: add a generic BasePlatformAdapter.authorization_is_upstream capability
(default False) that the relay adapter overrides to True, plus a dedicated
trusted branch in _is_user_authorized that honors it. This is delegation to a
trusted upstream, NOT a fail-open: it fires only for an adapter that explicitly
declares the flag; every direct network-exposed adapter leaves it False and the
env-allowlist default-deny (SECURITY.md §2.6) is unchanged. Distinct from
enforces_own_access_policy, which mirrors a LOCAL config-driven allowlist —
this delegates to an authenticated upstream's decision.

Tests: behavior contract that the base defaults False, the relay adapter
declares True, a relay user (group + DM) is authorized with no env allowlist,
and crucially a non-upstream adapter with no allowlist still default-denies
(guards against the fix becoming a blanket fail-open). 6 new tests; relay +
authz + config-policy suites green (134 + 90).

Found via live staging debug of the Discord self-serve onboarding flow.
This commit is contained in:
Ben 2026-06-25 09:30:55 +10:00 committed by Ben Barclay
parent a378b1e980
commit d335164833
4 changed files with 229 additions and 0 deletions

View file

@ -31,6 +31,28 @@ from gateway.whatsapp_identity import (
class GatewayAuthorizationMixin:
"""User/chat authorization methods for ``GatewayRunner``."""
def _adapter_authorization_is_upstream(self, platform: Optional[Platform]) -> bool:
"""Whether the adapter for *platform* delegates authz to a trusted upstream.
Mirrors ``BasePlatformAdapter.authorization_is_upstream``. The relay
adapter sets this True: the Team Gateway connector authenticates the
gateway's WS and resolves owner-only author bindings before delivering,
so an inbound relay event is already authorized as this instance's bound
user. Unlike ``_adapter_enforces_own_access_policy`` (a LOCAL config
policy the gateway mirrors only when it's an allowlist), this is an
UPSTREAM decision the gateway honors directly. Defaults to ``False`` when
the adapter is unknown or doesn't expose the flag.
"""
if not platform:
return False
adapters = getattr(self, "adapters", None)
if not adapters:
return False
adapter = adapters.get(platform)
if adapter is None:
return False
return bool(getattr(adapter, "authorization_is_upstream", False))
def _adapter_enforces_own_access_policy(self, platform: Optional[Platform]) -> bool:
"""Whether the adapter for *platform* gates access at intake itself.
@ -193,6 +215,21 @@ class GatewayAuthorizationMixin:
if source.platform in {Platform.HOMEASSISTANT, Platform.WEBHOOK}:
return True
# Relay (and any adapter whose authorization is enforced by a trusted
# authenticated upstream): the Team Gateway connector authenticates this
# gateway's WS with a per-instance secret and resolves owner-only author
# bindings BEFORE delivering, so an inbound relay event was already
# authorized as this instance's bound user (the author id is the one the
# connector observed, never gateway-asserted). There is no local
# RELAY_ALLOWED_USERS env allowlist to consult, and default-denying for
# its absence is the bug this branch fixes. This is delegation to a
# trusted upstream, NOT a fail-open: it fires only for an adapter that
# explicitly sets authorization_is_upstream=True; every direct
# network-exposed adapter leaves it False and the env-allowlist
# default-deny below still applies unchanged.
if self._adapter_authorization_is_upstream(source.platform):
return True
user_id = source.user_id
# Telegram (and similar) authorize entire group/forum/channel chats

View file

@ -2247,6 +2247,38 @@ class BasePlatformAdapter(ABC):
"""
return False
@property
def authorization_is_upstream(self) -> bool:
"""Whether inbound on this adapter was already authorized UPSTREAM.
Distinct from ``enforces_own_access_policy``: that flag describes an
adapter that enforces a LOCAL, config-driven access surface
(``dm_policy: allowlist`` / ``allow_from``) the gateway can mirror. This
flag describes an adapter whose authorization is performed by a TRUSTED
UPSTREAM over an authenticated transport there is no local policy to
consult, and the env allowlist (``{PLATFORM}_ALLOWED_USERS``) does not
apply because the sender identity isn't a platform account the operator
configures here.
The relay adapter is the sole user: it fronts the Team Gateway
connector over a per-instance-authenticated WebSocket, and the connector
performs owner-only author-binding resolution BEFORE delivering a
message only reaches this gateway because the connector resolved it to
THIS instance's bound user (``user_instance_binding``). The author id is
read off the event the connector observed, never gateway-asserted. So an
inbound relay event carries an authorization decision already made by a
trusted, authenticated upstream; default-denying it (no env allowlist
deny) is incorrect.
This is NOT a fail-open: it is authorization DELEGATED to a trusted
upstream that authenticated the transport (the relay WS secret) and
enforced owner-only binding, as opposed to authorization being ABSENT.
It only takes effect for an adapter that explicitly overrides this to
``True``; every network-exposed direct adapter leaves it ``False`` and
the env-allowlist default-deny continues to apply unchanged.
"""
return False
def supports_draft_streaming(
self,
chat_type: Optional[str] = None,

View file

@ -75,6 +75,20 @@ class RelayAdapter(BasePlatformAdapter):
self._revocation_monitor: Optional[asyncio.Task[None]] = None
# ── capability surface (from descriptor) ─────────────────────────────
@property
def authorization_is_upstream(self) -> bool:
"""Relay authorization is enforced by the connector, not locally.
The connector authenticates this gateway's WS (per-instance secret) and
performs owner-only author-binding resolution before delivering, so any
inbound relay event was already authorized as THIS instance's bound user
(``user_instance_binding``, keyed on the connector-observed author id).
The instance therefore must not default-deny relay users for lack of a
local ``RELAY_ALLOWED_USERS`` env allowlist. See
``BasePlatformAdapter.authorization_is_upstream``.
"""
return True
@property
def message_len_fn(self) -> Callable[[str], int]:
return _LEN_FNS.get(self.descriptor.len_unit, len)

View file

@ -0,0 +1,146 @@
"""Tests for relay upstream-enforced authorization at the gateway layer.
Background: the relay adapter fronts the Team Gateway connector over a
per-instance-authenticated WebSocket. The connector performs owner-only
author-binding resolution BEFORE delivering an inbound event a message only
reaches this gateway because the connector resolved it to THIS instance's bound
user (``user_instance_binding``, keyed on the connector-observed author id,
never a gateway claim). So a relay inbound is already authorized by a trusted,
authenticated upstream.
Before this fix, ``_is_user_authorized`` had no notion of upstream
authorization: ``Platform.RELAY`` matched no ``*_ALLOWED_USERS`` allowlist and
isn't in the HA/WEBHOOK always-authorized set, so every relay user hit the
default-deny ("No user allowlists configured. All unauthorized users will be
denied.") and the agent never saw the message. This was the live staging bug:
the message routed correctly through the connector to the instance, then the
instance's authz layer dropped it as ``Unauthorized user``.
The fix adds a generic ``BasePlatformAdapter.authorization_is_upstream``
capability (default ``False``) that the relay adapter overrides to ``True``,
plus a dedicated trusted branch in ``_is_user_authorized``. It is delegation to
a trusted upstream, NOT a fail-open: it fires only for an adapter that
explicitly declares the flag; every direct network-exposed adapter leaves it
``False`` and the env-allowlist default-deny is unchanged.
"""
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import Platform
from gateway.session import SessionSource
def _clear_auth_env(monkeypatch) -> None:
for key in (
"DISCORD_ALLOWED_USERS",
"GATEWAY_ALLOWED_USERS",
"GATEWAY_ALLOW_ALL_USERS",
"DISCORD_ALLOW_ALL_USERS",
):
monkeypatch.delenv(key, raising=False)
def _make_runner(*, platform: Platform, authorization_is_upstream: bool):
"""Build a bare GatewayRunner with one adapter for *platform*.
``authorization_is_upstream`` controls whether that adapter declares the
upstream-authz capability.
"""
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
adapter = SimpleNamespace(
send=AsyncMock(),
authorization_is_upstream=authorization_is_upstream,
enforces_own_access_policy=False,
)
runner.adapters = {platform: adapter}
runner.pairing_store = MagicMock()
runner.pairing_store.is_approved.return_value = False
runner.pairing_store._is_rate_limited.return_value = False
return runner, adapter
def _relay_source(**kw) -> SessionSource:
base = dict(
platform=Platform.RELAY,
user_id="428014785045725184",
chat_id="1400724139874058314",
user_name="definitely_not_cthulhu",
chat_type="group",
)
base.update(kw)
return SessionSource(**base)
# ---------------------------------------------------------------------------
# Capability contract
# ---------------------------------------------------------------------------
def test_base_adapter_defaults_to_not_upstream_authorized():
"""The base property is False — direct adapters keep env default-deny."""
from gateway.platforms.base import BasePlatformAdapter
assert BasePlatformAdapter.authorization_is_upstream.fget(object()) is False
def test_relay_adapter_declares_upstream_authz():
"""The relay adapter overrides the capability to True (static capability)."""
from gateway.relay.adapter import RelayAdapter
# Property reflects a static capability, independent of instance config.
assert RelayAdapter.authorization_is_upstream.fget(object()) is True
# ---------------------------------------------------------------------------
# Authorization behavior
# ---------------------------------------------------------------------------
def test_relay_user_authorized_with_no_env_allowlist(monkeypatch):
"""A relay user is authorized even with NO env allowlist configured.
This is the staging-bug regression guard: the connector already authorized
the author via owner-only binding, so the instance must not default-deny.
"""
_clear_auth_env(monkeypatch)
runner, _ = _make_runner(platform=Platform.RELAY, authorization_is_upstream=True)
assert runner._is_user_authorized(_relay_source()) is True
def test_relay_dm_authorized_with_no_env_allowlist(monkeypatch):
"""The /link DM path is also authorized (DMs are upstream-bound too)."""
_clear_auth_env(monkeypatch)
runner, _ = _make_runner(platform=Platform.RELAY, authorization_is_upstream=True)
assert runner._is_user_authorized(_relay_source(chat_type="dm")) is True
def test_non_upstream_adapter_still_default_denies(monkeypatch):
"""A direct adapter that does NOT declare the flag still default-denies.
Guards against the fix becoming a blanket fail-open: an adapter with
authorization_is_upstream=False and no env allowlist must remain denied.
"""
_clear_auth_env(monkeypatch)
runner, _ = _make_runner(platform=Platform.DISCORD, authorization_is_upstream=False)
src = SessionSource(
platform=Platform.DISCORD,
user_id="123",
chat_id="456",
user_name="someone",
chat_type="dm",
)
assert runner._is_user_authorized(src) is False
def test_upstream_authz_helper_false_for_unknown_platform(monkeypatch):
"""The helper returns False when there's no adapter for the platform."""
_clear_auth_env(monkeypatch)
runner, _ = _make_runner(platform=Platform.RELAY, authorization_is_upstream=True)
# A platform with no registered adapter must not be treated as upstream-authz.
assert runner._adapter_authorization_is_upstream(Platform.DISCORD) is False
assert runner._adapter_authorization_is_upstream(None) is False