docs(gateway): document multiplexing opt-in + contract changes

Extend the 'Running Many Gateways at Once' user-guide page with a 'one gateway for all profiles (multiplexing)' section, kept to a single page:

- How to opt in (gateway.multiplex_profiles on the default profile) and when to prefer it vs one-process-per-profile.
- Every contract change a user sees when the flag is on:
  1. secondary-profile 'gateway start' is a hard error (--force escape hatch),
  2. HTTP-inbound reached via /p/<profile>/ prefix; secondary profiles must NOT enable a port-binding platform (webhook/api_server/msgraph_webhook/feishu/wecom_callback/bluebubbles/sms), which is a config error at startup,
  3. per-credential platforms still need their own token per profile,
  4. session keys namespaced agent:<profile>: (default stays agent:main:),
  5. single PID/lock + aggregated hermes status, per-profile runtime_status.json.
- What does NOT change: per-profile .env credential isolation (stricter, incl. MCP/Kanban subprocess env), Kanban, profile-scoped skills/memory/SOUL, routing.

All inert when the flag is off.
2026-06-21 10:22:18 +00:00 · 2026-06-19 15:49:25 +10:00 · 2026-06-19 15:49:25 +10:00 · 2dd285f9b3
commit 2dd285f9b3
parent 1e70df5fdd
1 changed file with 133 additions and 0 deletions
--- a/website/docs/user-guide/multi-profile-gateways.md
+++ b/website/docs/user-guide/multi-profile-gateways.md
@@ -56,6 +56,139 @@ research gateway start
 That's it — three independent agents, each on its own process, restarting
 automatically on crash and on user login.

+## Alternative: one gateway for all profiles (multiplexing)
+
+The model above runs **one process per profile**. That is the default and is
+the right choice for most setups. But on a host with many profiles — or a
+container deployment where one process per profile is operationally heavy — you
+can instead run a **single multiplexing gateway**: the default profile's gateway
+becomes the sole inbound process and serves messages for *every* profile on the
+box.
+
+This is **opt-in** and **off by default**. When it's off, the rest of this page
+applies unchanged and every behavior described below is inert.
+
+### When to prefer multiplexing
+
+- A container/VPS deployment where N supervisor units, N ports, and N PID files
+  are a burden.
+- Many low-traffic profiles that don't each justify a full process.
+- You want a single thing to start, monitor, and restart.
+
+Stick with one-process-per-profile when you want hard process-level isolation
+between profiles (separate memory footprints, independent crash domains, the
+ability to restart one profile without touching the others).
+
+### How to opt in
+
+Set the flag on the **default profile** (it owns the multiplexer) and restart
+its gateway:
+
+```bash
+hermes config set gateway.multiplex_profiles true
+hermes gateway restart
+```
+
+Equivalently, in the default profile's `~/.hermes/config.yaml`:
+
+```yaml
+gateway:
+  multiplex_profiles: true
+```
+
+(The flag is also accepted as a top-level `multiplex_profiles: true` for
+convenience.) On the next start the default gateway enumerates every profile,
+brings up each profile's enabled platforms under that profile's own
+credentials, and routes each inbound message to the profile it belongs to. Each
+turn resolves the routed profile's config, skills, memory, SOUL, **and provider
+keys** — credentials are never shared across profiles.
+
+You do **not** run `hermes gateway start` for the secondary profiles — the
+default gateway serves them. See the contract changes below.
+
+### What changes when multiplexing is on
+
+Enabling the flag changes how a few things behave. All of these revert the
+moment the flag is off.
+
+#### 1. Secondary profiles must not start their own gateway
+
+With a multiplexer running, `hermes gateway start` (or `run`) for a named
+profile fails with a **hard error** that points you back at the multiplexer:
+
+```
+The default gateway is running as a profile multiplexer and already serves
+profile 'coder'. ...
+```
+
+The multiplexer is the single inbound process; a second profile gateway would
+double-bind that profile's platforms. Pass `--force` only if you deliberately
+want a separate process for that profile (not recommended while the multiplexer
+is running). The cross-profile lifecycle wrapper script described later on this
+page is therefore **not** used in multiplex mode; you only manage the default
+gateway.
+
+#### 2. HTTP-inbound platforms are reached via a `/p/<profile>/` URL prefix
+
+Webhook (and other HTTP-inbound) traffic for a secondary profile arrives on the
+default listener under a profile prefix, **not** a second port:
+
+```
+# default profile
+POST http://host:8644/webhooks/<route>
+# the "coder" profile, same listener
+POST http://host:8644/p/coder/webhooks/<route>
+```
+
+A request whose prefix names an unknown or unconfigured profile returns `404`.
+Because the one shared listener already serves every profile this way, a
+**secondary profile must not enable a port-binding platform itself**. Doing so
+is a config error, and the gateway refuses to start with a message naming the
+profile and platform:
+
+```
+Profile 'coder' enables the port-binding platform 'webhook', but
+gateway.multiplex_profiles is on. ... Remove platforms.webhook from profile
+'coder's config.yaml (configure it only on the default profile).
+```
+
+Port-binding platforms covered by this rule: `webhook`, `api_server`,
+`msgraph_webhook`, `feishu`, `wecom_callback`, `bluebubbles`, `sms`. Configure
+any of these **only on the default profile**; every profile is reachable through
+its `/p/<profile>/` prefix.
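+
+A minimal sketch of how the two `config.yaml` files might be split under this
+rule. Only `gateway.multiplex_profiles` and the `platforms.webhook` key path
+appear on this page; the settings nested under `webhook` are left out rather
+than guessed, and the two files are shown together in one block purely for
+comparison.
+
+```yaml
+# Default profile (~/.hermes/config.yaml): owns the multiplexer and the
+# shared HTTP listener, so port-binding platforms are configured here only.
+gateway:
+  multiplex_profiles: true
+platforms:
+  webhook:
+    # webhook settings go here (keys not shown on this page)
+---
+# Secondary profile "coder" (its own config.yaml, a separate file).
+# There is no platforms.webhook entry; adding one would trigger the startup
+# config error above. Requests for this profile arrive on the default
+# listener under the /p/coder/ prefix.
+```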
+
+#### 3. Per-credential platforms still need their own token per profile
+
+Polling/connection platforms (Telegram, Discord, Slack, Matrix, Signal, …) work
+fine multiplexed, but each profile that enables one must supply its **own** bot
+token — the same token cannot be polled by two profiles at once. If two profiles
+configure the same `(platform, token)`, startup fails fast naming both profiles
+(see [Token-conflict safety](#token-conflict-safety) — the rule is unchanged,
+it's just enforced inside the one process now).
+
+#### 4. Session keys are namespaced by profile
+
+Each profile's sessions live under an `agent:<profile>:…` namespace so two
+profiles on the same platform/chat never collide in the shared session store.
+The **default** profile keeps the historical `agent:main:…` namespace
+byte-for-byte, so existing default-profile sessions are unaffected — no
+migration, no orphaned history.
+
+#### 5. One PID/lock and one status surface
+
+There is a single process-level PID and lock (the multiplexer, under the default
+home). `hermes status` reports the multiplexer and the profiles it serves;
+`hermes status -p <name>` narrows the report to one profile. Each profile still
+writes its own `runtime_status.json` under its own home, so existing
+per-profile readers keep working.
+
+#### What does **not** change
+
+Per-profile `.env` credential isolation is preserved and, if anything,
+stricter: a profile's keys are resolved from its own scope and are never unioned
+into a shared environment (this also means subprocesses like MCP servers and
+Kanban workers only ever see their own profile's secrets). Kanban,
+profile-scoped skills/memory/SOUL, and model routing all behave per-profile
+exactly as they do with separate gateways.
+
 ## Start, stop, or restart all gateways at once

 The CLI ships with single-profile lifecycle commands. To act across every