# Relay ↔ Connector Contract (v1, EXPERIMENTAL)

> **Status:** EXPERIMENTAL. This contract MAY CHANGE without a deprecation
> cycle until at least two real Class-1 platforms (Discord + Telegram) have
> validated it. Evolution during the experimental phase is **additive-only**,
> gated by `contract_version`. A breaking change updates both repos in lockstep.

This document is the formal interface between the **Hermes gateway** (Python, `gateway/relay/`) and the **connector** (Node/TypeScript, `NousResearch/gateway-gateway`). The connector implementer's first action is to read this file.

The gateway runs a generic `RelayAdapter` that dials **out** to the connector, receives a `CapabilityDescriptor` at handshake, then exchanges normalized `MessageEvent`s (inbound) and actions (outbound) over a per-turn bidirectional WebSocket. The gateway never learns which concrete platform is fronting it; the connector owns all platform-specific socket/identity logic.

---

## 1. Handshake

1. Gateway opens the transport (`connect`).
2. Gateway calls `handshake()`; connector returns a `CapabilityDescriptor` (section 2) describing the platform this adapter instance fronts.
3. Gateway configures the adapter from the descriptor (char limit, length unit, draft/edit/thread/markdown capabilities) and registers an inbound handler.
4. Connector then streams inbound events and accepts outbound actions.

`contract_version` (currently `1`) is carried in the descriptor. The gateway ignores unknown descriptor fields (forward-compat) and fills missing optional fields from defaults.

---

## 2. CapabilityDescriptor (handshake payload)

JSON object. Source of truth: `gateway/relay/descriptor.py`.

| Field | Type | Required | Meaning |
| --- | --- | --- | --- |
| `contract_version` | int | yes | Contract version (additive-only within a version). |
| `platform` | string | yes | Platform name (e.g. `"discord"`, `"telegram"`). |
| `label` | string | yes | Human-readable label. |
| `max_message_length` | int | yes | Char limit; gateway exposes as `MAX_MESSAGE_LENGTH`. 0 → treat as 4096. |
| `supports_draft_streaming` | bool | yes | Native draft-streaming preview support. |
| `supports_edit` | bool | yes | Edit-based streaming possible; if false, consumer degrades to one-message-per-segment. |
| `supports_threads` | bool | yes | `create_handoff_thread` capability. |
| `markdown_dialect` | string | yes | `"plain"`, `"markdown_v2"`, `"discord"`, … (drives `supports_code_blocks`). |
| `len_unit` | string | yes | `"chars"` (builtin len) or `"utf16"` (Telegram UTF-16 code units). |
| `emoji` | string | no | Display emoji (default 🔌). |
| `platform_hint` | string | no | System-prompt platform hint. |
| `pii_safe` | bool | no | Redact PII in session descriptions. |

Most fields are a projection of the gateway's existing `PlatformEntry`; the runtime-only fields (`len_unit`, `supports_*`, `markdown_dialect`) come from the live platform adapter's capability methods.

---

## 3. Inbound: `MessageEvent` envelope

The connector normalizes each platform wire event into a `MessageEvent` (`gateway/platforms/base.py`) and delivers it to the gateway. **Inbound is delivered over the gateway's OUTBOUND `/relay` WebSocket** (see the transport note below) — the connector pushes an `inbound` frame down the socket the gateway already dialed.

The gateway keys the session via `build_session_key()` from the embedded `SessionSource` — so populating the right discriminators is the single highest-correctness responsibility of the connector.

### Inbound transport (WS back-channel, not HTTP)

The gateway dials **out** to the connector's `/relay` WebSocket for the handshake + outbound actions (§4) + its own `/stop` egress (§5). Inbound rides the **same socket** in the other direction: the connector pushes an `inbound` frame (and `interrupt_inbound` for §5) down the gateway's outbound WS.
There is **no gateway-side inbound HTTP endpoint** — a gateway need not (and, when hosted, cannot) expose any inbound port; everything flows over the connection it initiated.

**Multi-instance routing.** The connector instance that owns a platform's socket (and thus produces inbound events) is generally **not** the instance the gateway dialed its outbound WS into. The producing instance therefore publishes the event on the connector's internal **relay bus** (Redis pub/sub; `RelayBus` in `src/core/relayBus.ts`) keyed by tenant. Every connector instance subscribes and routes each message to its **local** sessions for that tenant (`RelayServer.routeBusMessage`); the single instance that actually holds the gateway's socket delivers it, and instances with no local session for the tenant no-op. Cross-instance delivery is thus an in-cluster Redis hop, not a public HTTP call.

Frames (connector → gateway, over the WS):

- `{"type":"inbound", "event": , "bufferId"?}`
- `{"type":"interrupt_inbound", "session_key", "chat_id"}` (§5)
- `{"type":"passthrough_forward", "forward": , "bufferId"?}` (§5.1)

`PassthroughForward` is the wire form of a forwarded passthrough-plane request (Class-2/3 webhooks — Discord interactions, Twilio): `{platform, botId, method, path, headers: [[k,v],…], bodyB64}`. The body is base64-encoded so arbitrary bytes survive the newline-delimited-JSON transport; the gateway base64-decodes back to the exact bytes the connector forwarded (the connector already verified the provider signature and stripped any shared-identity credential at the edge — §6 — so the gateway re-processes a sanitized, token-free body and acts on it via the token-less `follow_up` path). See §3.1.

**Trust.** The WS upgrade is authenticated with the gateway's per-gateway secret (§6.1), so the channel is trusted end to end — inbound frames are not separately HMAC-signed (the authenticated socket subsumes the per-delivery origin proof the old HTTP path needed).
The relay-bus hop is inside the connector trust domain (same as the lease/buffer/capability stores).

> Earlier drafts of this contract delivered inbound over a signed **HTTP POST**
> to a `gatewayEndpoint` (`HttpGatewayDelivery` + a gateway-side
> `inbound_receiver`), HMAC-signed with a per-tenant delivery key. That required
> every gateway to expose a reachable inbound URL — impossible for hosted
> gateways, which have no public IP. The WS back-channel above replaces it; the
> per-tenant delivery key is retained at provision for forward-compat but is no
> longer used for inbound. The **passthrough plane** (Class-2/3 webhooks like
> Discord interactions / Twilio) historically still used `gatewayEndpoint` for
> its post-ACK forward; Phase 5 §5.1 moves that forward onto the WS too (the
> `passthrough_forward` frame above), so a hosted gateway needs zero public
> inbound surface and `gatewayEndpoint` is retired once the cutover lands.

### 3.1 Passthrough-plane forward (§5.1)

The passthrough plane answers the provider's latency-critical ACK at the connector EDGE (e.g. Discord's deferred interaction response within ~3s), then does a **fire-and-forget** forward of the real request to the gateway. That forward needs no response back (the provider was already satisfied), so it rides the same outbound WS as `inbound` via a `passthrough_forward` frame rather than an HTTP POST. The gateway processes the decoded request through its normal agent path (a Discord interaction is decoded to a `MessageEvent` and handled like a message; the reply egresses over the outbound / `follow_up` path). `bufferId` is present when the forward was buffered (Phase 5 §5.3 buffered-only flip) and the gateway acks it after durable handoff.

### SessionSource fields (the wire surface)

Source of truth: `SessionSource.to_dict()` in `gateway/session.py`. These are every key the gateway accepts on the wire.
`platform`, `chat_id`, `chat_type`, `user_id`, `user_name`, `thread_id`, `chat_name`, and `chat_topic` are always present (may be `null`); the rest are included only when set.

| Field | Type | Always sent | Meaning |
| --- | --- | --- | --- |
| `platform` | string | yes | Platform name (matches the descriptor's `platform`). |
| `chat_id` | string | yes | Primary conversation id (channel/chat). Session-key discriminator. |
| `chat_type` | string | yes | `dm` / `group` / `channel` / `thread` / `forum`. |
| `chat_name` | string\|null | yes | Human-readable chat name. |
| `user_id` | string\|null | yes | Message author id. Session-key discriminator. |
| `user_name` | string\|null | yes | Author display name. |
| `thread_id` | string\|null | yes | Thread/forum-topic id when in a thread. Session-key discriminator. |
| `chat_topic` | string\|null | yes | Channel topic/description (Discord, Slack). |
| `user_id_alt` | string | no | Platform-specific stable alt id (Signal UUID, Feishu union_id). |
| `chat_id_alt` | string | no | Alternate chat id (e.g. Signal group internal id). |
| `guild_id` | string | no | Discord guild / Slack workspace / Matrix server scope. **REQUIRED for Discord server isolation.** Session-key discriminator. |
| `parent_chat_id` | string | no | Parent channel when `chat_id` refers to a thread. |
| `message_id` | string | no | Id of the triggering message (for pin/reply/react). |

> `is_bot` (author-is-a-bot/webhook classification) exists on the gateway-side
> dataclass but is **intentionally NOT on the wire** in v1 — it is not part of
> `to_dict()`. Do not add it to the connector's `SessionSource` until it is
> first added here and to `to_dict()` (additive bump).
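
The always-sent versus only-when-set rule in the table above can be sketched as a small projection. This is an illustration, not the real `SessionSource.to_dict()`: the helper name `session_source_to_wire` is hypothetical, and rejecting unknown keys with a `ValueError` is an assumption about how a connector might guard against putting `is_bot` on the wire.

```python
from typing import Dict, Optional

# Keys the table marks "Always sent"; a missing value becomes None (JSON null).
ALWAYS_SENT = (
    "platform", "chat_id", "chat_type", "chat_name",
    "user_id", "user_name", "thread_id", "chat_topic",
)
# Keys the table marks as included only when set.
ONLY_WHEN_SET = (
    "user_id_alt", "chat_id_alt", "guild_id", "parent_chat_id", "message_id",
)


def session_source_to_wire(
    fields: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: project loose fields onto the v1 wire shape."""
    unknown = set(fields) - set(ALWAYS_SENT) - set(ONLY_WHEN_SET)
    if unknown:
        # Assumption: fail loudly rather than leak a non-wire key such as is_bot.
        raise ValueError(f"not a v1 wire key: {sorted(unknown)}")
    # Always-sent keys appear even when the value is unknown.
    wire: Dict[str, Optional[str]] = {key: fields.get(key) for key in ALWAYS_SENT}
    # Optional keys appear only when they carry a value.
    for key in ONLY_WHEN_SET:
        if fields.get(key) is not None:
            wire[key] = fields[key]
    return wire
```

For a Discord guild message this keeps `thread_id` present as `null` while `guild_id` appears only because it was set, which is the shape `build_session_key()` is described as consuming.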
### SessionSource discriminators per platform

| Platform | chat_id | chat_type | user_id | thread_id | guild_id |
| --- | --- | --- | --- | --- | --- |
| **Discord** | channel id | `dm`/`group`/`thread` | author id | thread channel id (threads) | **guild id** (REQUIRED for server isolation) |
| **Telegram** | chat id | `dm`/`group`/`forum` | from id | forum topic id (forums) | — |

**Get Discord's `guild_id` wrong and two servers collide into one session.** This is the #1 High-severity risk. The gateway's `build_session_key()` is the conformance oracle: for a given `SessionSource`, the connector's normalization must produce the same key the Python adapter would. (The Phase-1 stub tests assert known-input → known-key.)

### Bot identity vs tenant (single-bot consolidation, Appendix A)

The envelope carries the **originating bot identity** as a field **distinct from tenant**. Tenant is resolved from the event's own discriminator (Discord `guild_id`, Telegram `chat_id`, webhook path/subdomain) — **never** from which token/socket/process delivered it. This keeps one shared bot able to front many tenants (Phase 6) without overloading an existing field.

### Author-first resolution + the account-link (DM) path (Phase 7)

Phase 7 adds **self-serve, per-user onboarding to a shared bot**, which changes *which* discriminator resolves the instance for a routed inbound message — and adds a management path for users to bind their own account.

**Author-first resolution (the multi-tenant-guild rule, D-7.2).** A single Discord guild may hold **many** tenants — different members each linked to their own agent. So for delivery the connector resolves the destination instance from the **authenticated author binding** (`user_instance_binding`, keyed by `(tenant, platform, platform_user_id)` via `resolveByUser`), **NOT** by a guild→instance route.
Concretely:

- A routed message authored by a **linked** user reaches **only that user's** instance — even when a second linked user in the **same guild** is served by a different instance (each reaches only their own).
- A message authored by an **unlinked** user resolves to **no** instance and is dropped (**fail-closed** — never broadcast to the guild's other tenants).
- The author id used is the **authentic `user_id` off the observed event**, the same `SessionSource.user_id` documented above — never a value asserted by a gateway or carried in a management frame.

This is the per-`user_id` owner-only routing the connector enforces in `WsGatewayDelivery` (the gateway-side multi-tenant-guild E2E driver `gateway_multitenant_guild_driver.py` is the cross-repo oracle).

**The account-link (DM) path.** A user binds their account to an instance with a one-time code, redeemed by DMing the shared bot:

1. The owner triggers a link from the Portal (or a self-hosted CLI). The connector mints a short-lived **link code** for the **authenticated** instance (`POST /manage/link`; instanceId comes from the caller's principal — a NAS-signed `aud=agent:{instanceId}` token or the instance's own per-gateway secret — **never** the request body).
2. The user sends `/link ` as a **direct message** to the shared bot from the account they want to bind.
3. The connector's inbound observer **consumes** that DM (it is not routed to any agent) and writes the `user_instance_binding` using the **authentic `user_id`** off the observed DM event. From then on, author-first resolution routes that user's messages to the bound instance.

**Opt-out is connector-authoritative.** Deprovisioning an instance (`POST /manage/deprovision`) drops its author bindings (so its users stop resolving to it) **and** revokes its per-gateway secret (so its socket can no longer authenticate — the next WS upgrade is closed **4401**).
A gateway that sees a **4401 close after a previously-successful handshake** treats it as a terminal revocation: it stops reconnecting and reports the relay platform as **disabled** (not a retryable error). A 4401 *before* any successful handshake stays retryable (a cold-start / not-yet-provisioned race, not a revocation).

### 3.2 Going-idle / buffered-flip primitive (§5.3)

A scale-to-zero PRIMITIVE (not the behaviour — nothing here decides to sleep or suspends a machine; a later workstream consumes these frames). It lets a gateway enter a drain/idle transition without losing inbound that arrives while it is gone, by making the connector buffer for that instance and replay on reconnect.

Three frames (all keyed by the connection's **authenticated** per-instance id — read off the stored secret record at the WS upgrade, never asserted in a frame):

- `{"type":"going_idle"}` (gateway → connector) — emitted as part of the gateway's EXISTING drain transition (the adapter sends it before tearing down the socket). Asks the connector to flip this instance to **buffered-only**.
- `{"type":"going_idle_ack"}` (connector → gateway) — the connector has flipped: live delivery has stopped and subsequent inbound for this instance buffers durably. The gateway **stays serving until this ack** (so an event landing in the flip window is delivered live, not lost — the same SUBSCRIBE-before-serve ordering discipline as the bus). Only after the ack is it safe to close.
- `{"type":"inbound_ack", "bufferId"}` (gateway → connector) — durable receipt of a buffered `inbound` delivery (which carries its `bufferId`) replayed on reconnect. The connector acks the buffer entry only after this, giving drain-without-dup on the **delivery leg**: an instance that dies mid-drain redelivers exactly the unacked tail; an acked entry never redelivers.

**Buffer + drain.** While flipped, the connector appends inbound to a durable per-instance delivery-leg buffer (`delivery:`) instead of pushing it live.
On the gateway's **reconnect** (a NET-NEW reconnect loop re-dials + re-handshakes after an unexpected close), the new handshake triggers the connector to drain that backlog over the new socket **in order, ack-gated**, then clear the flip so live delivery resumes. This reuses the same `drainWithoutDup` machinery as the Discord→connector ingest leg, applied to the connector→gateway delivery leg. Connector-authoritative throughout: a gateway can only flip/drain ITS OWN instance.

> NOT in scope (deferred behaviour): the autonomous idle timer that DECIDES to
> drain, the actual machine suspend, and the NAS suspended-health model. The
> primitive is "when the gateway drains, relay flips to buffered + replays on
> reconnect, with no loss/dup"; WHAT triggers the drain is out of scope.

### 3.3 Wake poke (§5.2)

The other half of the sleep/wake loop: how a SUSPENDED gateway finds out it has buffered work waiting. A PRIMITIVE — nothing here suspends a machine; it wires the wake SIGNAL so a future scale-to-zero behaviour layer can rely on "buffered ⇒ wake poked."

- **Registration.** The gateway registers a **wake URL** at enroll/provision — any reachable URL the connector can GET to wake it (a Fly autostart hostname, a dashboard host). Self-hosted: `hermes gateway enroll --wake-url ` (or `GATEWAY_RELAY_WAKE_URL` / `gateway.relay_wake_url`). Managed/NAS: stamped into the container env beside `GATEWAY_RELAY_URL`. Forwarded in the `/relay/provision` body as `wakeUrl` and stored per-instance on the connector's secret record (gateway-asserted but safely scoped — same posture as `instanceId`; the org/tenant stays token-verified, so a gateway can only register a wake target for ITS OWN instance). DISTINCT from the retired `gatewayEndpoint`: a **poke target**, not a delivery target.
- **The poke.** When a buffered-only (going-idle) destination receives its FIRST buffered event, the connector issues a **payload-free, unsigned GET** to that instance's registered `wakeUrl`, **directly** (NOT NAS-mediated — relay stays NAS-independent). It carries no tenant data and no inbound: it only says "you have buffered work, reconnect." Tenant authority is re-established the normal way when the gateway re-dials (the authenticated WS upgrade), so a leaked/guessed wake URL can at worst cause a spurious reconnect of ITS OWN instance. Rate-limited per instance (one poke per cooldown window, not per event), and best-effort — a failed poke is swallowed; the gateway still drains whenever it next reconnects on its own. No new frame: the wake is an out-of-band HTTP GET, not a relay-WS message (the socket is down — that's the whole point).

> NOT in scope (deferred behaviour): the actual machine suspend (Fly
> `autostop:"suspend"`) and the autonomous idle timer that decides to sleep. The
> primitive is "buffered event for a sleeping instance ⇒ its wakeUrl gets poked";
> WHAT makes the instance sleep (and wake-to-serve) is the behaviour layer.

### 3.4 Obligations on a future scale-to-zero behaviour layer

§3.2 and §3.3 ship the **primitives**; this section is the **contract a separate scale-to-zero behaviour workstream must honour to consume them safely.** It owns the *decision* to suspend, the actual machine suspend, and the platform/health model — none of which live here — but it MUST hold these guarantees, which the primitives assume:

1. **Register a `wakeUrl` before the instance can ever be suspended.** A suspended instance with no registered `wakeUrl` is a black hole — buffered inbound never triggers a poke, so it sleeps through its own traffic until something else reconnects it. The behaviour layer MUST ensure a reachable wake target is registered (self-hosted: `--wake-url`; managed: stamped) as a precondition of allowing suspend. A wake URL that is unreachable while the machine is suspended (e.g. points at the suspended machine itself with no platform autostart in front) is equivalent to none.
2. **Drain through `going_idle` → await `going_idle_ack` BEFORE tearing down the socket or suspending.** Never suspend with an un-acked flip in flight. The ack is the connector's confirmation that delivery for this instance is now buffered-only; a machine that suspends after sending `going_idle` but before the ack can drop the inbound that races the flip. The gateway already gates socket teardown on the ack (Q-5.3c); the suspend step MUST sit *after* a clean drain completes, not race it.
3. **Keep the NET-NEW reconnect loop live as a precondition of suspend.** The wake→drain contract is "poke ⇒ the gateway re-dials ⇒ the connector drains on the reconnect handshake." If the reconnect loop is disabled, a poke lands on a machine that never re-dials and the buffer strands. The behaviour layer must not suspend an instance whose relay transport won't reconnect on wake.
4. **Treat suspended ≠ down in the health model (Q-5.3b).** A suspended instance is healthy-asleep, not failed. The health/monitoring layer MUST distinguish the two (e.g. via the platform machine-state) so a suspended instance is not restarted, alerted on, or reaped as unhealthy — that would defeat the suspend and can race the wake/drain.
5. **The wake poke is best-effort and rate-limited — do not assume exactly-once or immediate wake.** At most one poke per cooldown window per instance, and a failed poke is swallowed. The behaviour layer must not rely on the poke as a guaranteed/prompt signal; correctness still rests on "the gateway drains whenever it next reconnects." A belt-and-suspenders wake (e.g. a scheduled job that also reconnects) is the behaviour layer's call, not the primitive's.
6. **Suspend only when genuinely idle — and idle is connector-observable, not gateway-guessed.** WHAT counts as idle (no in-flight turn + no inbound for N min) is the behaviour layer's policy, but it must compose with the existing drain machinery (`gateway_state` running→draining) rather than introduce a parallel relay-only idle path — the same integration constraint §3.2 places on `going_idle`.

These are guarantees the behaviour layer OWES the primitives; the primitives owe the behaviour layer only what §3.2/§3.3 already specify (a flip-on-going_idle, a durable per-instance buffer + ack-gated reconnect drain, and a poke on the first buffered event for a flipped instance).

---

## 4. Outbound: action set

The gateway calls the transport with action dicts. Source of truth: `gateway/relay/transport.py` + `gateway/relay/adapter.py`.

| `op` | Fields | Result |
| --- | --- | --- |
| `send` | `chat_id`, `content`, `reply_to?`, `metadata?` | `{success: bool, message_id?, error?}` |
| `edit` | `chat_id`, `message_id`, `content`, `metadata?` | `{success: bool, error?}` |
| `typing` | `chat_id` | `{success: bool}` |
| `follow_up` | `session_key`, `kind`, `content`, `metadata?` | `{success: bool, message_id?, error?}` |

`get_chat_info(chat_id)` is a separate proxied call returning at least `{name, type}`. Media actions follow the same envelope shape (deferred to a later contract revision; additive).

**`follow_up` (A2 capability action).** Some inbound payloads carry a credential that acts on the **shared** bot identity (e.g. a Discord interaction follow-up token). Per §6 the connector strips that at the edge and binds it in its capability vault keyed by the session; it **never reaches the gateway**. To use it, the gateway issues `follow_up` naming the **session it is already in** (`session_key`) plus the capability `kind` (e.g. `discord.interaction_token`) — **never a token**.
The connector resolves the real value from its vault, enforces the tenant match (tenant B can never wield tenant A's capability), and egresses. `success: false` when the capability is absent/expired or the tenant doesn't match — the gateway has nothing to retry with, by design (a leaked gateway holds zero capability material). Source of truth: `gateway/relay/transport.py` (`send_follow_up`) + `gateway/relay/adapter.py`.

---

## 5. Interrupt (`/stop`) routing

- **Gateway → connector:** `send_interrupt(session_key, reason?)` egresses a mid-turn `/stop` over the outbound WS. The connector MUST forward it to the gateway instance running that `session_key` (the routing invariant).
- **Connector → gateway:** an inbound interrupt for a `session_key` is delivered as an `interrupt_inbound` frame down the gateway's outbound WS (§3 transport note) — routed cross-instance via the relay bus to whichever instance holds the socket — and bridged by the adapter's `on_interrupt(session_key, chat_id)` into the existing per-session interrupt mechanism, cancelling exactly that turn (siblings untouched).

Both directions ride the gateway's outbound WS: the gateway→connector `/stop` egresses over it, and the connector→gateway interrupt rides the same `inbound` back-channel as a normalized event.

---

## 6. Trust boundary & signed-body handling (A2)

**The connector is the sole crypto/identity boundary. The gateway re-validates nothing.**

Webhook signatures (Discord ed25519, Twilio HMAC, WeCom BizMsgCrypt) are computed over exact raw bytes, and some payloads are *encrypted* with a shared secret.
The connector fronts a **shared** bot for many tenants and holds every tenant's platform secrets, so it:

- **verifies / decrypts at the edge** (the only place the secrets live),
- **normalizes** the payload into a tenant-scoped `MessageEvent` (§3),
- **strips any shared-identity capability** out of the payload and binds it in its capability vault, keyed by the session (see §4 `follow_up`),
- **forwards only the sanitized `MessageEvent`** — never the raw signed body.

The gateway therefore performs **no** platform signature/crypto verification on the relay path; it trusts the normalized event. This is an enforced invariant on the gateway side (`tests/gateway/relay/test_relay_sheds_crypto.py`: the relay package imports/calls no platform-crypto).

**Why not "forward the signed body byte-for-byte so the gateway re-validates"?** That earlier model is incoherent under an untrusted, disposable tenant gateway:

- Re-validating Twilio HMAC / WeCom crypto would require handing the gateway the **shared signing secret** — which is itself the leak, and on a shared bot it's a *cross-tenant* leak.
- WeCom payloads are encrypted with the shared secret; the connector must decrypt at the edge just to route, so forwarding ciphertext would again require giving the gateway the secret.
- A Discord interaction token lives **inside** the signed JSON body — you cannot both preserve the bytes and strip the credential; they are the same bytes.

So byte-preservation is abandoned deliberately: the connector re-serializes the sanitized event and the gateway trusts it. This also unifies the passthrough and relay planes — both are "verify at the edge → emit a normalized event," differing only in transport.

See `docs/capability-trust-boundary.md` (connector repo: `gateway-gateway`) for the full A2 rationale and the connector-side vault.
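
The strip-and-bind step, and the tenant check that `follow_up` relies on, can be sketched as follows. This is a minimal illustration in Python rather than the connector's TypeScript: `CapabilityVault` and `strip_and_bind` are hypothetical names, the vault is an in-memory dict, expiry is not modeled, and the payload is assumed to be an already-verified Discord interaction whose credential sits in a top-level `token` field.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class CapabilityVault:
    """Hypothetical in-memory stand-in for the connector-side capability vault."""

    # (session_key, kind) -> (tenant, secret value)
    _entries: Dict[Tuple[str, str], Tuple[str, str]] = field(default_factory=dict)

    def bind(self, session_key: str, kind: str, tenant: str, value: str) -> None:
        self._entries[(session_key, kind)] = (tenant, value)

    def resolve(self, session_key: str, kind: str, tenant: str) -> Optional[str]:
        entry = self._entries.get((session_key, kind))
        if entry is None or entry[0] != tenant:
            # Absent or cross-tenant: the caller answers success: false.
            return None
        return entry[1]


def strip_and_bind(
    payload: Dict[str, Any],
    session_key: str,
    tenant: str,
    vault: CapabilityVault,
) -> Dict[str, Any]:
    """Return a token-free copy of an already-verified payload."""
    sanitized = dict(payload)  # shallow copy; the caller's dict is left untouched
    token = sanitized.pop("token", None)
    if token is not None:
        # The credential stays at the edge, keyed by the session it arrived in.
        vault.bind(session_key, "discord.interaction_token", tenant, token)
    return sanitized
```

Only the sanitized copy would go on to normalization and delivery. A later `follow_up` names `session_key` and `kind`, and the vault lookup is the single place the real token is read back.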
### 6.1 Channel authentication (the connector⇄gateway link itself)

A2 makes the connector the sole holder of platform secrets while the gateway may be **customer-managed and internet-exposed**, so the connector⇄gateway channel is itself authenticated. The gateway holds an enrollment- or provision-issued **per-gateway secret** (`hermes gateway enroll` → connector `/relay/enroll`, or managed self-provision → `/relay/provision`) that authenticates its outbound WS upgrade. It is an HMAC-SHA256 scheme with a multi-secret rotation verify list (gateway side: `gateway/relay/auth.py`; connector side: `src/core/relayAuthToken.ts`).

| Leg | Credential | Mechanism |
| --- | --- | --- |
| Gateway → connector WS upgrade | per-gateway secret | An `Authorization` bearer header on the `/relay` upgrade. The token is `base64url(payload:exp:sig)` where `payload = gatewayId` and `sig = HMAC(payload:exp, secret)`. Connector verifies and rejects the upgrade (**close 4401**) on mismatch/absence/revocation. The authenticated tenant comes from the connector's store, never the `hello` frame. |
| Connector → gateway inbound (`inbound` / `interrupt_inbound` frames) | — (rides the authenticated WS) | Inbound is pushed down the gateway's already-authenticated outbound socket (§3), so no per-message signature is needed. A **per-tenant delivery key** is still issued at enroll/provision and retained for forward-compat, but is no longer used to sign inbound. |

This is the **channel** authenticator — distinct from platform crypto, which the relay path still sheds entirely (§6). The gateway holds zero platform secrets; the per-gateway secret authenticates only the connector link. Full threat model + enrollment/rotation/kill-switch design: `docs/connector-gateway-auth-design.md` (connector repo).

---

## 7. Per-instance delivery & the management plane (Phase 6)

Phases 1–5 treat the connector as a single-tenant front: inbound events for a tenant fan out to that tenant's gateway socket(s).
**Phase 6 makes delivery per-INSTANCE** — a shared bot can front many users/agents in one tenant (one Discord guild, one Telegram bot) without cross-delivery — and adds a small **management plane** the agent (or a managed Portal) uses to declare who-sees-what and what's-relevant. All of this lives **connector-side**; the gateway's only new responsibility is to **declare its relevance policy** at boot (§7.3).

### 7.1 The delivery gate (connector-side, informational)

For each inbound event the connector decides which instances receive it by composing three AND-ed filters. The gateway does not implement these — they run in the connector — but they define the delivery semantics the gateway relies on:

| Layer | Question | Source of truth |
| --- | --- | --- |
| **owner / scope ∧ principal** | May this instance *see* this author here? | per-user `user_id → instance` bindings (the owner floor) + per-instance `(guild, channel)` scope grants + an `owner-only` / `allow-list` / `any` principal policy. |
| **visibility floor** | Can the instance's bound owner actually `VIEW_CHANNEL` this in Discord? | live Discord ACL (effective permissions), fail-closed. Narrows an over-broad scope grant downward. |
| **relevance** | *Given* it may see it, should the agent engage? | the relevance policy declared in §7.3 (address-gating / free-response / allow-bots). |

The composition only ever **narrows** delivery (`deliver ⇔ authorized ∧ visible ∧ relevant`); the **owner floor bypasses the relevance layer** (an author's own message always reaches their own instance — you don't @mention your own agent). A message authored by an unbound user reaches no instance (fail-closed). The full design + invariants live in the connector repo (`NousResearch/gateway-gateway`); this section is the gateway-facing summary.

### 7.2 Management routes (connector-side, authenticated)

The connector mounts authenticated management routes.
They share the **same dual-auth** as the WS upgrade: either a managed NAS-signed `aud=agent:{instanceId}` RS256 JWT, **or** the gateway's own per-gateway secret bearer (§6.1 `make_upgrade_token`). In both cases the connector resolves the authoritative `{tenant, instanceId}` from its **stored** record — **never** from the request body (a body-asserted `instanceId` is ignored).

| Route | Purpose |
| --- | --- |
| `POST /manage/link` | Issue a short-lived code to bind a platform account to the authenticated instance (the `/link ` flow; the connector reads the authentic `user_id` off the inbound event). |
| `POST /manage/scope`, `/manage/scope/release` | Claim / release a `(guild, channel)` scope for the authenticated instance. A channel is owned by at most one instance (non-overlap is a PK constraint). |
| `POST /manage/principal` | Set the instance's principal policy (`owner-only` \| `allow-list` \| `any`). |
| `POST /manage/dm-default` | Set the user's DM-default instance (DM tie-break when a user linked more than one). |
| `POST /relay/policy` | Declare the instance's **relevance policy** (§7.3). |

These are connector-owned (the management plane is not part of the gateway's agent path); the gateway only calls `POST /relay/policy` (§7.3). The others are driven by the managed Portal / `hermes` CLI.

### 7.3 Relevance-policy declaration (the gateway's responsibility)

The relevance layer (§7.1) is the per-tenant parity for the gateway's own behaviour knobs (`require_mention`, `free_response_channels`, `{PLATFORM}_ALLOW_BOTS`). So that the **same** behaviour governs relay delivery, the gateway projects those knobs into a **platform-agnostic** policy and POSTs it to `POST /relay/policy` at boot (after its per-gateway secret is resolved).
Body (`gateway/relay/__init__.py` `relay_relevance_policy()` → `send_relay_policy()`):

| Field | Type | Projected from | Meaning |
| --- | --- | --- | --- |
| `platform` | string | the fronted platform (`relay_platform_identity`) | which platform this policy applies to. |
| `requireAddress` | bool | `require_mention` | a non-owner message must @mention / reply-to the bot to be relevant. |
| `freeResponseScopes` | string[] | `free_response_channels` | scope (channel) ids where `requireAddress` is waived. Same scope vocabulary as §7.1's scope grants. |
| `allowOtherBots` | bool | `{PLATFORM}_ALLOW_BOTS ∈ {mentions, all}` | admit bot-authored messages (default off). |

Auth is the per-gateway upgrade token (§6.1), so the connector attaches the policy to the authenticated instance. The gateway is the **source of truth** and re-declares **every boot** (a full replace, mirroring the `routeKeys` upsert at provision — self-healing). When the projected policy is all-default the gateway sends nothing (the connector's absent-row default already matches). The POST is **fail-soft**: a failure logs and boot proceeds — relevance is an optimization layered on the authorization gate (§7.1), never a boot dependency.

There is **no new gateway inbound surface** and **no new credential** — it reuses the per-gateway secret and the same host as `/relay/provision`.

> A relevance drop happens **before** the connector wakes a scaled-to-zero agent
> (Phase 5), so excluded chatter never spins an agent up — relevance is the
> primary scale-to-zero lever as well as a correctness filter.

---

## 8. Versioning policy

- `contract_version` is an int; bump **only** for additive changes during the experimental phase (new optional fields, new `op`s).
- A breaking change (renamed/removed field, changed semantics) requires a coordinated update of both repos and a version bump.
- The connector's first PR references the commit SHA of this file it implements against.
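
Additive evolution only works if the reader tolerates fields it does not know yet. The sketch below shows one way a forward-compatible reader for the §2 descriptor might look. It is not the code in `gateway/relay/descriptor.py`: `read_descriptor` is a hypothetical name, and the defaults for `platform_hint` and `pii_safe` are assumptions (§2 states only the emoji default and the `0 → 4096` rule).

```python
from typing import Any, Dict

# Fields section 2 marks as required.
REQUIRED = (
    "contract_version", "platform", "label", "max_message_length",
    "supports_draft_streaming", "supports_edit", "supports_threads",
    "markdown_dialect", "len_unit",
)
# Only the emoji default is stated in section 2; the other two are assumed.
OPTIONAL_DEFAULTS: Dict[str, Any] = {
    "emoji": "🔌",
    "platform_hint": "",
    "pii_safe": False,
}
FALLBACK_MAX_LENGTH = 4096  # section 2: a limit of 0 is treated as 4096


def read_descriptor(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical forward-compatible reader for a handshake descriptor."""
    missing = [name for name in REQUIRED if name not in raw]
    if missing:
        raise ValueError(f"descriptor missing required fields: {missing}")
    # Copy only the fields this reader knows; anything else is ignored,
    # so a newer connector can add optional fields without breaking it.
    known: Dict[str, Any] = {name: raw[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        known[name] = raw.get(name, default)
    if known["max_message_length"] == 0:
        known["max_message_length"] = FALLBACK_MAX_LENGTH
    return known
```

A descriptor that carries an extra field from a later additive revision still parses, and the extra key is simply absent from the result. A missing required field fails at handshake instead of surfacing later as a misconfigured adapter.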