Follow-up to Ben's PR #37892. Adds a TestInternalCredential block to
test_dashboard_auth_ws_tickets.py exercising the mint-once stability,
multi-use, unminted-rejection, empty-value, wrong-value, reset-and-remint,
and ticket-store-independence branches directly (previously only covered
indirectly via _ws_auth_ok, which left the unminted and empty-value
branches unexercised).
Also corrects the consume_internal_credential docstring: the returned
identity dict is discarded by the current _ws_auth_ok caller (which only
needs the boolean outcome), so the prior 'carry it into its session log'
wording over-promised.
The embedded-TUI PTY child attaches to two server-internal WebSockets:
/api/ws (its primary JSON-RPC gateway backend) and /api/pub (the event
sidecar). Both URLs are built server-side in web_server.py and handed to
the child via its environment.
In OAuth-gated mode (auth_required=true, every hosted Fly agent), _ws_auth_ok
unconditionally rejects the legacy ?token=<_SESSION_TOKEN> path — a leaked
session token must not grant WS access once the gate is engaged. But
_build_gateway_ws_url() still only emitted ?token=, with no gated-mode
branch (its sibling _build_sidecar_url had been given a ticket branch; the
gateway-url builder was missed). So the TUI child's /api/ws upgrade was
rejected 4401 -> 'gateway websocket connection failed' -> 'gateway startup
timeout', leaving the embedded chat unusable on every gated deployment.
A single-use 30s browser ticket is the wrong shape for this link: the child
reads its attach URL once at startup and reuses it on every reconnect, and
on a slow cold boot it may not dial within the TTL. (_build_sidecar_url's
own docstring already flagged this fragility.)
Fix: add a process-lifetime, multi-use internal credential to
dashboard_auth.ws_tickets (internal_ws_credential / consume_internal_credential),
minted once per process and NEVER injected into the SPA — it only leaves the
process via a spawned child's env, so browser-side XSS can't read it, and a
leak grants no more than a ticket already does. _ws_auth_ok accepts it via
?internal= in gated mode only. Both _build_gateway_ws_url and
_build_sidecar_url now use it, so the child can reconnect both sockets.
Loopback / --insecure behavior is unchanged (still ?token=).
Needs review: touches _ws_auth_ok + dashboard_auth (core auth surface).
Phase 5 task 5.1. Browsers cannot set Authorization on a WebSocket
upgrade, so in gated mode the SPA needs an alternative way to bind the
upgrade to its authenticated session.
hermes_cli/dashboard_auth/ws_tickets.py — in-memory single-use ticket
store with 30s TTL. Thread-safe (threading.Lock), token_urlsafe(32)
values, ticket value truncated to 8 chars in error messages for log
hygiene. Module-level state with _reset_for_tests() helper.
hermes_cli/dashboard_auth/routes.py — adds POST /api/auth/ws-ticket.
Auth-required (the gate middleware already attaches Session to
request.state.session). Returns {ticket, ttl_seconds}; emits
WS_TICKET_MINTED audit event with user_id + provider + ip.
hermes_cli/dashboard_auth/audit.py — adds WS_TICKET_REJECTED enum
value for the consume-side rejection event (wired into the WS
endpoints in task 5.2).
11 new tests covering round-trip, single-use, TTL boundary, unknown
ticket rejection, secret-hygiene truncation in error messages, and
concurrent mint+consume from 20 threads.