hermes-agent/hermes_cli/dashboard_auth/middleware.py
Ben b26d81d536 feat(dashboard-auth): honour X-Forwarded-Prefix + __Host-/__Secure- cookies
Mission-control style deploys reverse-proxy the dashboard at a path
prefix (e.g. mission-control.tilos.com/hermes/* -> :9119) and inject
X-Forwarded-Prefix: /hermes on every request. The SPA mount already
honoured this for asset URLs and the bootstrap __HERMES_BASE_PATH__,
but the OAuth gate didn't:

  1. The gate's Location: header to /login and the 401 envelope's
     login_url were built bare ("/login?next=..."). Under a /hermes
     prefix the browser follows that to mission-control.tilos.com/login
     which the proxy doesn't route to the dashboard.
  2. _redirect_uri (the OAuth callback URL handed to the IDP) used
     request.url_for() which doesn't honour X-Forwarded-Prefix
     (Starlette/uvicorn only proxy_headers Host + Proto + For). The
     IDP redirects back to /auth/callback instead of /hermes/auth/
     callback → 404 in the user's browser.
  3. Cookies were set with Path=/ which leaks them to other apps on
     the same origin and won't be sent back on requests under the
     prefix in the first place.

Fix threads the normalised prefix through every boundary:

  * New hermes_cli/dashboard_auth/prefix.py — single source of truth
    for X-Forwarded-Prefix parsing. web_server._normalise_prefix
    becomes a re-export so the SPA mount, the gate, and the cookies
    helper all agree.
  * middleware._unauth_response builds login_url = f"{prefix}/login".
  * routes._redirect_uri splices the prefix into the path component
    of the IDP-bound URL (with full validation of the header).
  * cookies.{set,clear}_{session,pkce}_cookie now take prefix="".
    Path attribute switches to /hermes when set; cookie name switches
    name variant (see below). Every caller passes the request's
    normalised prefix.

Cookie hardening (Teknium's lesser-note #1 in the PR review): adopt
the __Host- / __Secure- cookie name prefixes per draft-west-cookie-
prefixes. The variant is selected from (use_https, prefix):

  * Loopback HTTP → bare "hermes_session_at" (both prefixes require
    Secure, incompatible with HTTP).
  * HTTPS, direct deploy (Path=/) → "__Host-hermes_session_at".
    Strongest spec: bound to exact origin, no Domain attribute, Secure
    required.
  * HTTPS, behind a proxy prefix (Path=/hermes) →
    "__Secure-hermes_session_at". __Host- forbids Path != "/"; the
    explicit Path=/hermes covers same-origin app isolation.

Setter and reader BOTH consult the prefix because the cookie *name*
changes — a reader that looked up the bare name when the setter wrote
__Secure- would never find the value. The reader falls back across
all three variants so a request whose shape changed mid-session (e.g.
post-deploy from no-prefix to /hermes) still picks up the existing
cookie until it expires.

Test coverage:

  - tests/hermes_cli/test_dashboard_auth_prefix.py — new file. 11 tests
    pinning:
      • Location: /hermes/login on the gate's HTML redirect
      • 401 envelope login_url carries the prefix
      • Malformed X-Forwarded-Prefix is ignored (header-injection
        defence; the script-tag value is normalised to empty string)
      • _redirect_uri splices /hermes into the path (the property
        that prevents the IDP-returns-to-404 failure)
      • PKCE cookie uses Path=/hermes + __Secure- when proxied
      • Session cookies use __Host- when direct, __Secure- when
        proxied, bare on loopback HTTP
      • End-to-end round trip with hand-managed PKCE cookie carriage
        (TestClient can't simulate a Path=/hermes cookie automatically)
  - tests/hermes_cli/test_dashboard_auth_cookies.py — rewritten to pin
    each (use_https, prefix) shape produces its expected cookie name,
    plus reader-side coverage that __Host- and __Secure- variants are
    both recognised.
  - Existing tests across middleware / 401-reauth / etc. updated to
    match the new cookie names (substring contains instead of
    startswith).

Mutation-tested: reverting _unauth_response to build the bare
"/login" URL trips exactly the two tests that pin the prefix
carriage, confirming the suite discriminates the regression.
2026-05-27 02:12:27 -07:00

207 lines
7.5 KiB
Python

"""Auth-gate middleware for the dashboard.
Engaged when ``app.state.auth_required is True``. The gate's job:
1. Allow a small set of routes through unauthenticated (login page,
``/auth/*`` OAuth round trip, ``/api/auth/providers``, static
assets).
2. For everything else, demand a valid session cookie and attach the
verified :class:`Session` to ``request.state.session``.
3. On HTML routes, redirect missing/invalid cookies to ``/login``.
On ``/api/*`` routes, return 401 JSON.
The middleware is a no-op when ``auth_required`` is False (loopback
mode); the legacy ``_SESSION_TOKEN`` ``auth_middleware`` handles those
binds.
"""
from __future__ import annotations
import logging
from typing import Awaitable, Callable
from fastapi import Request
from fastapi.responses import JSONResponse, RedirectResponse, Response
from hermes_cli.dashboard_auth import list_providers
from hermes_cli.dashboard_auth.audit import AuditEvent, audit_log
from hermes_cli.dashboard_auth.base import ProviderError
from hermes_cli.dashboard_auth.cookies import read_session_cookies
_log = logging.getLogger(__name__)
# Paths that bypass the auth gate. Order matters: prefix match.
_GATE_PUBLIC_PREFIXES: tuple[str, ...] = (
"/auth/login",
"/auth/callback",
"/auth/logout",
"/login",
"/api/auth/providers",
"/assets/",
"/favicon.ico",
"/ds-assets/",
"/fonts/",
"/fonts-terminal/",
)
def _path_is_public(path: str) -> bool:
return any(
path == prefix or path.startswith(prefix)
for prefix in _GATE_PUBLIC_PREFIXES
)
def _client_ip(request: Request) -> str:
fwd = request.headers.get("x-forwarded-for", "")
if fwd:
return fwd.split(",")[0].strip()
return request.client.host if request.client else ""
def _unauth_response(request: Request, *, reason: str) -> Response:
"""API routes → 401 JSON with ``login_url``; HTML routes → 302 → /login.
The JSON envelope carries a ``login_url`` field with a ``next=`` query
string so the SPA's global 401 handler can drop the user back where
they were after re-auth. The contract is intentionally simple so any
fetch-wrapper can implement the redirect without parsing details:
if response.status === 401 && body.error in ("unauthenticated",
"session_expired"):
window.location.assign(body.login_url);
HTML redirects also carry the ``next=`` query string so direct
navigation to ``/sessions`` (etc.) without a cookie comes back to
``/sessions`` after login.
Under a reverse proxy with ``X-Forwarded-Prefix: /hermes``, the
``login_url`` is prefixed (``/hermes/login?next=...``) so the
browser's window.location.assign / Location: follow lands on the
proxied login page rather than the bare ``/login`` (which the
proxy doesn't route to the dashboard).
"""
from hermes_cli.dashboard_auth.prefix import prefix_from_request
path = request.url.path
next_param = _safe_next_target(request)
prefix = prefix_from_request(request)
login_url = (
f"{prefix}/login?next={next_param}" if next_param
else f"{prefix}/login"
)
if path.startswith("/api/"):
# API routes never get redirects: the browser fetch() API would
# follow a 302 into the cross-origin OAuth dance opaquely. Return
# 401 with a structured envelope so the SPA can full-page-navigate
# to login_url.
error_code = (
"session_expired"
if reason == "invalid_or_expired_session"
else "unauthenticated"
)
return JSONResponse(
{
"error": error_code,
"detail": "Unauthorized",
"reason": reason,
"login_url": login_url,
},
status_code=401,
)
return RedirectResponse(url=login_url, status_code=302)
def _safe_next_target(request: Request) -> str:
"""Build the URL-encoded ``next`` query value, or empty string.
Only same-origin relative paths are accepted; absolute URLs or
``//evil.com`` open-redirect attempts are silently dropped. The empty
string return means the caller produces a bare ``/login`` URL — fine,
user lands at the dashboard root after re-auth.
"""
path = request.url.path
# Reject anything that doesn't start with "/" or starts with "//"
# (protocol-relative URL — would open-redirect to an attacker host).
if not path or not path.startswith("/") or path.startswith("//"):
return ""
# Don't redirect back to the auth routes themselves — that loops.
if any(
path == p or path.startswith(p)
for p in ("/login", "/auth/", "/api/auth/")
):
return ""
# Preserve query string if present (e.g. /sessions?page=2).
query = request.url.query
target = f"{path}?{query}" if query else path
# urlencode the whole thing as a single value.
from urllib.parse import quote
return quote(target, safe="")
async def gated_auth_middleware(
request: Request,
call_next: Callable[[Request], Awaitable[Response]],
) -> Response:
"""Engaged only when ``app.state.auth_required is True``.
No-op pass-through in loopback mode so the legacy auth_middleware can
handle those binds via ``_SESSION_TOKEN``.
"""
if not getattr(request.app.state, "auth_required", False):
return await call_next(request)
path = request.url.path
if _path_is_public(path):
return await call_next(request)
at, _rt = read_session_cookies(request)
if not at:
return _unauth_response(request, reason="no_cookie")
# Try every registered provider's verify_session in turn. Providers
# MUST return None for tokens they don't recognise (not raise). This
# lets multiple providers stack — the first one that recognises a
# token wins.
session = None
for provider in list_providers():
try:
session = provider.verify_session(access_token=at)
except ProviderError as e:
_log.warning(
"dashboard-auth: provider %r unreachable during verify: %s",
provider.name, e,
)
audit_log(
AuditEvent.SESSION_VERIFY_FAILURE,
provider=provider.name,
reason="provider_unreachable",
ip=_client_ip(request),
)
return JSONResponse(
{"detail": f"Auth provider {provider.name!r} unreachable"},
status_code=503,
)
if session is not None:
break
if session is None:
audit_log(
AuditEvent.SESSION_VERIFY_FAILURE,
reason="no_provider_recognises",
ip=_client_ip(request),
)
response = _unauth_response(request, reason="invalid_or_expired_session")
# Clear the dead cookie so the browser doesn't keep sending it.
# Contract v1: no refresh token to retry with, so the only correct
# next step is full re-auth via /login. Importing locally avoids a
# cycle with cookies → middleware at module load. Pass the active
# prefix so the deletion's Path matches the set-Path (otherwise
# the browser ignores it).
from hermes_cli.dashboard_auth.cookies import clear_session_cookies
from hermes_cli.dashboard_auth.prefix import prefix_from_request
clear_session_cookies(response, prefix=prefix_from_request(request))
return response
request.state.session = session
return await call_next(request)