hermes-agent/gateway/relay
Ben Barclay 0ddd21c74e
feat(relay): managed-boot self-provision client (Phase 3, gateway side) (#48242)
The gateway half of relay Phase 3. On a MANAGED boot with relay configured and
no secret pinned, the runtime self-provisions its relay credentials IN-PROCESS:
resolve the agent's own Nous access token (resolve_nous_access_token) -> POST
the connector's /relay/provision asserting its own endpoint + route keys ->
set GATEWAY_RELAY_ID/SECRET/DELIVERY_KEY into os.environ so the immediately-
following register_relay_adapter() reads them and dials out authenticated.

No human, no enrollment token, no disk write — the creds live only in process
memory (save_env_value refuses under managed anyway, and keeping the secret off
any volume is the stronger posture). Stateless: process-env creds don't survive
a restart, so a managed container re-provisions every boot; the connector's
rotation window covers a still-connected prior instance. An explicitly-pinned
GATEWAY_RELAY_SECRET is respected (skip). Self-hosted is unchanged: humans keep
using `hermes gateway enroll`.

Endpoint provenance is gateway-asserted (GATEWAY_RELAY_ENDPOINT +
GATEWAY_RELAY_ROUTE_KEYS, env or gateway.relay_* config) — uniform code path
whether the operator sets it (self-hosted) or NAS stamps it (hosted, the only
case NAS knows the public URL). Both absent -> outbound-only provisioning
(credentials, no inbound routes). The connector scopes the asserted endpoint to
the verified tenant, so it stays within the security model.

- gateway/relay/__init__.py: relay_endpoint(), relay_route_keys(),
  _provision_url(), _post_provision(), self_provision_if_managed() (never
  raises — a provision failure logs and boots without relay auth).
- gateway/run.py: call self_provision_if_managed() immediately before
  register_relay_adapter() in the startup path.

Tests: 12 unit (trigger logic, respect-pinned-secret, in-process env wiring,
endpoint+routes vs outbound-only, fail-soft on token/connector failure);
mutation-checked (drop is_managed guard / pinned-secret guard -> tests fail).
Cross-repo live E2E driver lands on the connector side (depends on this).

EXPERIMENTAL: relay auth scheme may change until >=2 Class-1 platforms validate.
2026-06-18 15:25:29 +10:00
..
__init__.py feat(relay): managed-boot self-provision client (Phase 3, gateway side) (#48242) 2026-06-18 15:25:29 +10:00
adapter.py feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) 2026-06-18 12:01:54 +10:00
auth.py feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) 2026-06-18 12:01:54 +10:00
descriptor.py feat(relay): derive descriptor from PlatformEntry 2026-06-17 16:37:45 -07:00
inbound_receiver.py feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) 2026-06-18 12:01:54 +10:00
transport.py feat(gateway): token-less follow_up outbound op (A2 capability action) 2026-06-17 16:37:45 -07:00
ws_transport.py feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) 2026-06-18 12:01:54 +10:00