hermes-agent/tests/gateway/relay
Ben Barclay a64fc490fe
fix(relay): make hosted gateways actually connect AND complete the inbound/outbound round-trip (#48828)
* fix(relay): enable RELAY platform + normalize dial URL so hosted gateways actually connect

Three bugs blocked a self-provisioned hosted gateway from ever establishing its
inbound relay WS (found while standing up the live staging end-to-end). Each
masked the next; all three are needed for inbound to work.

1. RELAY platform never enabled in config.platforms (gateway/config.py).
   register_relay_adapter() puts the adapter in the platform_registry, but
   start_gateway()'s connect loop iterates self.config.platforms — which never
   contained Platform.RELAY. So the adapter was "registered" but never connected
   (logs showed "relay adapter registered" then "No messaging platforms
   enabled"). Fix: _apply_env_overrides now enables Platform.RELAY (mirroring
   relay_url into extra for the connected-checker) when GATEWAY_RELAY_URL (env)
   or gateway.relay_url (yaml) is set. Absent -> no RELAY entry (direct/
   single-tenant gateways unaffected).

2. URL scheme not converted for the WS dial (gateway/relay/ws_transport.py).
   The relay URL is configured once as the http(s):// base (used as-is for the
   provision POST), but websockets.connect rejects http(s):// with "scheme isn't
   ws or wss". Fix: _ws_dial_url converts https->wss / http->ws.

3. /relay path not appended (same helper). The connector mounts its
   WebSocketServer at path "/relay" and returns HTTP 400 on an upgrade to any
   other path. GATEWAY_RELAY_URL is the base (no /relay), so the dial hit "/"
   -> 400. Fix: _ws_dial_url ensures the path ends in /relay. Idempotent — a URL
   already carrying ws(s):// and/or /relay is unchanged, so provision's
   _provision_url (which derives /relay/provision from either form) still works.

Why the cross-repo E2E missed #2/#3: the stub connector binds ws://host:port and
its websockets.serve accepts ANY path, so neither the scheme nor the /relay path
was exercised. Real connector needs both.

Verified live on staging hermes-agent-stg-automated-perception-5054: after the
fixes the gateway logs "Connecting to relay..." -> "✓ relay connected" ->
"Gateway running with 1 platform(s)" against
wss://gateway-gateway.staging-nousresearch.com/relay, stable.

Tests: added _ws_dial_url scheme+path+idempotency cases (test_ws_transport.py)
and RELAY-platform-enablement cases for env + yaml + absent (test_config.py).
Full gateway/relay + config suites green (191 passed).

Relay-adapter lane. EXPERIMENTAL.

* fix(relay): re-attach guild_id to outbound so connector egress resolves the tenant

The final bug in the hosted-relay round-trip. Inbound worked end to end (Discord
-> connector -> bus -> agent WS -> agent runs -> reply), but the reply's egress
was declined by the connector: "discord egress declined: target not routed to an
onboarded tenant".

Cause: the connector's routedEgressGuard resolves the owning tenant from the
OUTBOUND action's metadata.guild_id (Discord's routing discriminator). The
gateway's generic delivery path builds outbound metadata via
run.py _thread_metadata_for_source, which only carries thread_id (and returns
None entirely for a non-threaded message) — so guild_id never reached the
connector, tenant resolution failed, and the shared bot refused to post.

Fix (relay-adapter-local, no perturbation of the generic delivery path or other
platforms): RelayAdapter learns chat_id -> guild_id from each inbound event
(_capture_scope) and re-attaches it to the outbound action's metadata in send()
(_with_scope) when not already present. No-op for chats we never saw inbound
(e.g. DMs) and never overwrites an explicit guild_id.

Verified live on staging hermes-agent-stg-automated-perception-5054: an
@mention in #general now produces a visible bot reply — full multi-tenant relay
round-trip (real Discord -> shared connector bot -> tenant routing -> agent WS ->
reply egress -> Discord).

Tests: _capture_scope/_with_scope reattach, no-scope no-op, explicit-guild_id
preserved (test_relay_adapter.py). Full relay + config suites green (160 passed).

Relay-adapter lane. EXPERIMENTAL.
2026-06-19 16:30:24 +10:00
..
__init__.py feat(relay): experimental CapabilityDescriptor schema 2026-06-17 16:37:45 -07:00
stub_connector.py feat(relay): WS-only inbound on the gateway adapter (Phase 3) (#48294) 2026-06-19 09:33:15 +10:00
test_auth.py feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) 2026-06-18 12:01:54 +10:00
test_contract_doc_conformance.py test(gateway): enforce relay contract-doc ⟷ Python conformance 2026-06-17 16:37:45 -07:00
test_descriptor.py feat(relay): experimental CapabilityDescriptor schema 2026-06-17 16:37:45 -07:00
test_descriptor_from_entry.py feat(relay): derive descriptor from PlatformEntry 2026-06-17 16:37:45 -07:00
test_no_stub_leak.py test(relay): assert connector stub never leaks into production paths 2026-06-17 16:37:45 -07:00
test_relay_adapter.py fix(relay): make hosted gateways actually connect AND complete the inbound/outbound round-trip (#48828) 2026-06-19 16:30:24 +10:00
test_relay_follow_up.py feat(gateway): token-less follow_up outbound op (A2 capability action) 2026-06-17 16:37:45 -07:00
test_relay_interrupt.py feat(relay): WS-only inbound on the gateway adapter (Phase 3) (#48294) 2026-06-19 09:33:15 +10:00
test_relay_registration.py test(gateway): live ws-transport round-trip + config-driven registration 2026-06-17 16:37:45 -07:00
test_relay_roundtrip.py feat(relay): transport protocol + test-only stub connector 2026-06-17 16:37:45 -07:00
test_relay_roundtrip_telegram.py test(gateway): Telegram relay round-trip (Phase 1 generalization proof) 2026-06-17 16:37:45 -07:00
test_relay_sheds_crypto.py feat(relay): WS-only inbound on the gateway adapter (Phase 3) (#48294) 2026-06-19 09:33:15 +10:00
test_self_provision.py fix(relay): trigger self-provision on relay-config + NAS token, not is_managed() (#48724) 2026-06-19 01:01:24 +00:00
test_ws_transport.py fix(relay): make hosted gateways actually connect AND complete the inbound/outbound round-trip (#48828) 2026-06-19 16:30:24 +10:00