feat(dashboard-auth): HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url override

Operators behind reverse proxies that don't reliably forward
X-Forwarded-Host / X-Forwarded-Proto / X-Forwarded-Prefix (manual
nginx setups, on-prem ingresses, custom-domain Fly deploys with
incomplete proxy chains) had no way to force the absolute base URL
the OAuth callback redirects from. The dashboard would reconstruct
the redirect_uri from request headers, the IDP would echo it back,
and the user would land on the wrong host or wrong path — 404.

Add `dashboard.public_url` to config.yaml with env override
HERMES_DASHBOARD_PUBLIC_URL. When set, it is the complete authority —
scheme + host + optional path prefix (e.g. https://example.com/hermes) —
and becomes the base for the OAuth `redirect_uri`. X-Forwarded-Prefix
is IGNORED on this code path because the operator has explicitly
declared the public URL; we no longer need to guess from proxy
headers, and stacking the prefix on top would double-prefix the
common case where the prefix is already baked into public_url.

When unset, the existing proxy_headers + X-Forwarded-Prefix
reconstruction runs untouched. Existing Fly.io deploys continue to
work without configuration — this is purely additive.

Precedence mirrors dashboard.oauth.client_id:

  env (non-empty) > config.yaml > reconstructed from request

Implementation:

  - hermes_cli/config.py: add dashboard.public_url to DEFAULT_CONFIG
    with a multi-paragraph doc comment explaining the use case,
    the X-Forwarded-Prefix interaction, and the validation rules.
  - hermes_cli/dashboard_auth/prefix.py: factored out the existing
    _REJECT_CHARS frozenset, added _normalise_public_url() validator
    (requires http/https scheme + non-empty host + no header-injection
    chars), _load_dashboard_section() loader (robust to load_config
    raising, non-dict shapes), and resolve_public_url() entry point
    with the env-overrides-config precedence. A malformed value
    silently falls through to ""; the caller treats "" as "reconstruct
    from request" so a typo never breaks the login flow.
  - hermes_cli/dashboard_auth/routes.py: rewrite _redirect_uri()
    docstring to spell out the three resolution tiers; add the
    public_url short-circuit before the existing X-Forwarded-Prefix
    splicing. Source-level comment notes that X-Forwarded-Prefix is
    intentionally ignored when public_url is set so a future reader
    doesn't try to "fix" the missing prefix layering.
  - cli-config.yaml.example: extend the existing dashboard section
    with a public_url block.
  - website/docs/user-guide/features/web-dashboard.md: new "Public
    URL override" section between the provider configuration and
    the OAuth flow walkthrough. Documents the env-vs-config table,
    the validation rules, and the `http://` `public_url` ↔ Secure
    cookie footgun.

Test coverage — new TestPublicUrlOverride class (8 tests):

  - env var overrides request reconstruction (the primary motivating
    case)
  - config.yaml used when env unset
  - env wins over config (precedence pin)
  - public_url with a path prefix already baked in (the Q1-a case the
    user explicitly chose)
  - public_url suppresses X-Forwarded-Prefix layering (defends
    against the double-prefix bug)
  - trailing slash stripped from public_url (no //auth/callback)
  - malformed public_url falls through to reconstruction (six
    hostile inputs: javascript:, ftp:, missing scheme, missing host,
    quote chars, CRLF injection)
  - empty env string doesn't shadow config.yaml entry (CI / Fly
    provisioned-but-empty secret case)

Mutation-tested: flipping the precedence in resolve_public_url() trips
exactly test_env_overrides_config_public_url; weakening the validator
(accept any scheme) trips exactly test_malformed_public_url_falls_through_to_reconstruction.
Both other tests in each pair stay green, confirming the suite
discriminates the specific regression each test pins.
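
To illustrate what "trips exactly" means, here is a minimal sketch. The
resolver and check names are hypothetical stand-ins, not the functions or
tests in this commit, and validation is left out on purpose.

```python
def resolve(env_value: str, cfg_value: str) -> str:
    """Hypothetical resolver with the intended precedence: env, then config."""
    return env_value.strip() or cfg_value.strip()


def resolve_mutant(env_value: str, cfg_value: str) -> str:
    """Mutant with the precedence flipped: config, then env."""
    return cfg_value.strip() or env_value.strip()


def pins_precedence(resolver) -> bool:
    """Passes only when env wins while both surfaces are set."""
    result = resolver("https://from-env.example", "https://from-config.example")
    return result == "https://from-env.example"


def pins_config_fallback(resolver) -> bool:
    """Passes when config is used while env is empty; blind to ordering."""
    result = resolver("", "https://from-config.example")
    return result == "https://from-config.example"


for name, resolver in [("original", resolve), ("mutant", resolve_mutant)]:
    print(name, pins_precedence(resolver), pins_config_fallback(resolver))
# original True True
# mutant False True
```

Only the precedence check fails against the mutant; the fallback check
stays green because it never sets both surfaces at once.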
Ben 2026-05-26 13:56:22 +10:00 committed by Teknium
parent 0af37ff272
commit a890389b69
6 changed files with 400 additions and 19 deletions


@@ -1108,6 +1108,7 @@ display:
#
# dashboard.oauth.client_id <- HERMES_DASHBOARD_OAUTH_CLIENT_ID
# dashboard.oauth.portal_url <- HERMES_DASHBOARD_PORTAL_URL
# dashboard.public_url <- HERMES_DASHBOARD_PUBLIC_URL
#
# Env wins when set to a non-empty value. This is what Fly.io's platform-
# secret injection uses to push per-deploy client_ids without needing to
@@ -1121,3 +1122,21 @@ display:
# oauth:
# client_id: "" # agent:{instance_id}; Portal provisions this at deploy
# portal_url: "" # blank → default https://portal.nousresearch.com
#
# # Force the absolute base URL the OAuth callback (and any other public
# # URL the dashboard hands to external systems) is built from. Set this
# # for deploys behind reverse proxies that don't reliably forward
# # X-Forwarded-Host / X-Forwarded-Proto / X-Forwarded-Prefix (manual
# # nginx setups, on-prem ingresses, custom-domain Fly deploys without
# # full proxy header chains).
# #
# # When set, the value is the complete authority: scheme + host +
# # optional path prefix (e.g. "https://example.com/hermes"). The OAuth
# # callback URL becomes "<public_url>/auth/callback" — X-Forwarded-Prefix
# # is IGNORED on this code path because the operator has explicitly
# # declared the public URL and we no longer need to guess.
# #
# # Leave empty to use the existing proxy-header reconstruction (the
# # default — works on Fly.io out of the box).
# #
# # public_url: "https://example.com/hermes"


@@ -1197,6 +1197,27 @@ DEFAULT_CONFIG = {
"client_id": "", # agent:{instance_id} — Portal provisions this
"portal_url": "", # blank → use plugin default (production Portal)
},
# Public URL override (env: ``HERMES_DASHBOARD_PUBLIC_URL``).
# When set, this is the complete authority — scheme + host +
# optional path prefix (e.g. ``https://example.com/hermes``) —
# the OAuth ``redirect_uri`` is built from. Set this for deploys
# behind reverse proxies that don't reliably forward
# ``X-Forwarded-Host`` / ``X-Forwarded-Proto`` / ``X-Forwarded-Prefix``
# (manual nginx setups, on-prem ingresses, custom-domain Fly
# deploys without proper proxy headers). When set,
# ``X-Forwarded-Prefix`` is IGNORED on the OAuth path because
# the operator has declared the public URL — we no longer need
# to guess from proxy headers, and stacking the prefix on top
# would double-prefix the common case where the prefix is
# already baked into ``public_url``. Leave empty to use the
# existing proxy-header reconstruction (the default).
#
# Validation: rejects values without ``http(s)://`` scheme or
# without a host, and any string containing quote / angle /
# whitespace / control characters. A malformed value silently
# falls through to request reconstruction rather than breaking
# the login flow.
"public_url": "",
},
# Privacy settings


@@ -2,18 +2,35 @@
Mission-control style deploys reverse-proxy the dashboard at a path
prefix (e.g. ``mission-control.tilos.com/hermes/*`` -> dashboard on
:9119). The proxy injects ``X-Forwarded-Prefix: /hermes`` so the
backend can reconstruct prefixed URLs (Location: headers, OAuth
redirect_uri, cookie Path attributes, SPA asset URLs).
:9119), injecting ``X-Forwarded-Prefix: /hermes`` so the backend can
reconstruct prefixed URLs (Location: headers, OAuth redirect_uri,
cookie Path attributes, SPA asset URLs).
The single source of truth for the parsed prefix lives here so the
gate middleware, the OAuth routes, the cookie helpers, and the SPA
mount all agree on validation rules.
This module is also the home of the ``HERMES_DASHBOARD_PUBLIC_URL`` /
``dashboard.public_url`` resolution: when the operator declares a
complete public URL (scheme + host + optional path prefix), we use
that directly for the OAuth ``redirect_uri`` and skip the
X-Forwarded-Prefix reconstruction. Relief valve for deploys where the
proxy header chain isn't reliable.
The single source of truth for both helpers lives here so the gate
middleware, the OAuth routes, the cookie helpers, and the SPA mount
all agree on validation rules.
"""
from __future__ import annotations
import logging
import os
import urllib.parse
from typing import Optional
_log = logging.getLogger(__name__)
# Characters that, if present in a public_url or prefix value, indicate
# either a typo or a header-injection attempt. Reject the whole value
# rather than try to sanitise — the operator can fix their config.
_REJECT_CHARS = frozenset(('"', "'", "<", ">", " ", "\n", "\r", "\t"))
def normalise_prefix(raw: Optional[str]) -> str:
"""Normalise an X-Forwarded-Prefix header value.
@@ -35,7 +52,7 @@ def normalise_prefix(raw: Optional[str]) -> str:
if (
"//" in p
or ".." in p
or any(c in p for c in ('"', "'", "<", ">", " ", "\n", "\r", "\t"))
or any(c in p for c in _REJECT_CHARS)
):
return ""
if len(p) > 64:
@@ -48,3 +65,93 @@ def prefix_from_request(request) -> str:
Request and normalises it. Returns ``""`` when no prefix.
"""
return normalise_prefix(request.headers.get("x-forwarded-prefix"))
# ---------------------------------------------------------------------------
# HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url
# ---------------------------------------------------------------------------
def _normalise_public_url(raw: Optional[str]) -> str:
    """Normalise a ``dashboard.public_url`` value.

    Returns the cleaned URL (scheme://netloc[/path], trailing slash
    removed) on success, or ``""`` when the value is empty, malformed,
    or contains characters that suggest header injection. The caller
    must treat ``""`` as "fall back to request reconstruction", never
    as "the user explicitly chose no public URL", because the two are
    indistinguishable from an empty env var.
    """
    if not raw:
        return ""
    url = raw.strip()
    if not url:
        return ""
    # Reject control / quote / whitespace characters before trying to
    # parse — urlparse is permissive enough to accept some hostile
    # values (e.g. embedded newlines) and we want a hard "no" rather
    # than a soft "maybe".
    if any(c in url for c in _REJECT_CHARS):
        return ""
    try:
        parsed = urllib.parse.urlparse(url)
    except ValueError:
        return ""
    if parsed.scheme not in {"http", "https"}:
        return ""
    if not parsed.netloc:
        return ""
    # Strip a single trailing slash so callers can append paths without
    # producing ``//`` double-slashes.
    return url.rstrip("/")


def _load_dashboard_section() -> dict:
    """Return the ``dashboard`` block from ``config.yaml`` if it exists
    and is a dict; otherwise an empty dict.

    Robust to (a) load_config() raising (malformed YAML, IO error,
    config.yaml absent), and (b) ``dashboard`` being absent or non-dict.
    Both shapes fall through to ``{}`` so the caller can rely on
    ``.get(...)`` access.
    """
    try:
        from hermes_cli.config import load_config
    except Exception:
        return {}
    try:
        cfg = load_config()
    except Exception as exc:  # noqa: BLE001 — broad catch is intentional
        _log.debug(
            "dashboard-auth.prefix: load_config() raised %s; "
            "falling back to env-only configuration",
            exc,
        )
        return {}
    section = cfg.get("dashboard") if isinstance(cfg, dict) else None
    return section if isinstance(section, dict) else {}


def resolve_public_url() -> str:
    """Resolve the operator-declared dashboard public URL.

    Precedence (mirrors ``dashboard.oauth.client_id``):

    1. ``HERMES_DASHBOARD_PUBLIC_URL`` env var (when non-empty after
       strip; empty values are treated as unset so a provisioned-but-
       not-populated Fly secret can't shadow a valid config.yaml entry).
    2. ``dashboard.public_url`` in ``config.yaml``.
    3. Empty string, which signals "no override, reconstruct from
       request" to the caller.

    Each candidate value is run through :func:`_normalise_public_url`.
    A malformed env var falls through to the config.yaml entry; a
    malformed config entry falls through to ``""``. This means a typo
    in one surface doesn't prevent the other from working.
    """
    env_raw = os.environ.get("HERMES_DASHBOARD_PUBLIC_URL", "")
    env_clean = _normalise_public_url(env_raw)
    if env_clean:
        return env_clean
    cfg_raw = _load_dashboard_section().get("public_url", "")
    return _normalise_public_url(str(cfg_raw))


@@ -50,23 +50,47 @@ router = APIRouter()
def _redirect_uri(request: Request) -> str:
"""Reconstruct the absolute callback URL the IDP redirects back to.
Reads from the request URL; under uvicorn's ``proxy_headers=True``
this picks up the public https URL from ``X-Forwarded-Host`` plus
``X-Forwarded-Proto``.
Three resolution tiers:
Under ``X-Forwarded-Prefix: /hermes`` (Mission Control deploys), we
additionally prepend the prefix to the path so the IDP redirects
the user back to ``https://mission-control.tilos.com/hermes/auth/callback``
rather than the bare ``/auth/callback`` (which the proxy doesn't
route to the dashboard). FastAPI's ``url_for`` doesn't natively
honour X-Forwarded-Prefix (that header isn't part of the
Starlette/uvicorn proxy_headers set), so we splice the prefix in
manually.
1. ``HERMES_DASHBOARD_PUBLIC_URL`` env var or
``dashboard.public_url`` in config.yaml: when set, this is
the complete authority (scheme + host + optional path prefix)
and we append ``/auth/callback`` verbatim. ``X-Forwarded-Prefix``
is IGNORED on this code path because the operator has declared
the public URL; we no longer need to guess from proxy headers,
and stacking the prefix on top would double-prefix the common
case where the prefix is already baked into ``public_url``.
Relief valve for deploys behind reverse proxies whose forwarded
headers aren't reliable.
2. ``X-Forwarded-Prefix: /hermes`` (Mission Control deploys): we
prepend the prefix to the path FastAPI's ``url_for`` produces
(it doesn't natively honour this header — it isn't part of the
Starlette/uvicorn proxy_headers set).
3. Bare ``request.url_for("auth_callback")``: under uvicorn's
``proxy_headers=True`` this picks up the public https URL from
``X-Forwarded-Host`` plus ``X-Forwarded-Proto``. Fly.io's
default path.
"""
from urllib.parse import urlparse, urlunparse
from hermes_cli.dashboard_auth.prefix import prefix_from_request
from hermes_cli.dashboard_auth.prefix import (
prefix_from_request,
resolve_public_url,
)
# Tier 1: operator-declared public URL.
public_url = resolve_public_url()
if public_url:
# ``public_url`` is the complete authority (possibly with a
# path prefix already baked in). Append the auth callback path
# verbatim. ``resolve_public_url`` already stripped any trailing
# slash so we don't produce ``//auth/callback`` double-slashes.
return f"{public_url}/auth/callback"
# Tier 2 + 3: reconstruct from the request URL, optionally with
# X-Forwarded-Prefix layered on top of the path.
base = str(request.url_for("auth_callback"))
prefix = prefix_from_request(request)
if not prefix:
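
The hunk ends before the prefix splice, so the remainder of `_redirect_uri` is not visible here. As a rough illustration of how the three tiers in the docstring relate, the sketch below uses plain strings instead of a real `Request`. The function name and parameters are hypothetical, and the tier 2 splice is an assumption about the approach (the `urlparse` / `urlunparse` import suggests something similar), not the actual implementation.

```python
from urllib.parse import urlparse, urlunparse


def resolve_callback_url(
    public_url: str,
    reconstructed_url: str,
    forwarded_prefix: str,
) -> str:
    """Hypothetical stand-in for the three tiers, using plain strings."""
    # Tier 1: an operator-declared public URL wins and the prefix is ignored.
    if public_url:
        return f"{public_url}/auth/callback"
    # Tier 3: no prefix header, so the reconstructed URL is used unchanged.
    if not forwarded_prefix:
        return reconstructed_url
    # Tier 2: splice the prefix in front of the reconstructed path.
    parsed = urlparse(reconstructed_url)
    return urlunparse(parsed._replace(path=f"{forwarded_prefix}{parsed.path}"))


base = "https://mission-control.tilos.com/auth/callback"

print(resolve_callback_url("https://example.com/hermes", base, "/should-be-ignored"))
# https://example.com/hermes/auth/callback
print(resolve_callback_url("", base, "/hermes"))
# https://mission-control.tilos.com/hermes/auth/callback
print(resolve_callback_url("", base, ""))
# https://mission-control.tilos.com/auth/callback
```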


@@ -203,6 +203,191 @@ class TestOAuthRedirectUriRespectsPrefix:
assert parsed.path == "/auth/callback"
# ---------------------------------------------------------------------------
# HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url override
# ---------------------------------------------------------------------------
class TestPublicUrlOverride:
    """``dashboard.public_url`` (env override:
    ``HERMES_DASHBOARD_PUBLIC_URL``) lets an operator force the absolute
    base URL the OAuth ``redirect_uri`` is built from.

    When set, it is the *complete authority*: scheme + host + optional
    path prefix. ``X-Forwarded-Prefix`` is ignored on that code path
    because the operator has explicitly declared the public URL and we
    no longer need to guess from proxy headers. This is the relief
    valve for deploys behind reverse proxies that don't set
    ``X-Forwarded-Host`` / ``X-Forwarded-Proto`` / ``X-Forwarded-Prefix``
    correctly (or at all): manual nginx setups, on-prem ingresses,
    Fly.io deploys with custom domains where the proxy header chain is
    incomplete.

    When unset, the existing ``proxy_headers=True`` + X-Forwarded-Prefix
    reconstruction path runs untouched. Existing Fly.io deploys
    continue to work without configuration.

    Precedence (mirrors ``client_id``):

        env (non-empty) > config.yaml > reconstructed from request
    """

    @pytest.fixture
    def patch_config(self, monkeypatch):
        """Replace ``hermes_cli.config.load_config`` with a stub
        returning the given ``public_url``. Pass ``None`` to set no
        config-side value."""

        def _set(public_url) -> None:
            cfg = {}
            if public_url is not None:
                cfg = {"dashboard": {"public_url": public_url}}
            monkeypatch.setattr(
                "hermes_cli.config.load_config", lambda: cfg
            )

        return _set

    def _redirect_uri(self, gated_app, *, headers=None) -> str:
        """Drive /auth/login and read the redirect_uri the IDP saw."""
        r = gated_app.get(
            "/auth/login?provider=stub",
            headers=headers or {},
            follow_redirects=False,
        )
        assert r.status_code == 302, r.text
        # Stub IDP echoes redirect_uri back as the prefix of the
        # Location header (`{redirect_uri}?code=stub_code&state=…`).
        return r.headers["location"].split("?", 1)[0]

    def test_public_url_env_overrides_request_reconstruction(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """``HERMES_DASHBOARD_PUBLIC_URL`` wins over the URL the
        request would otherwise reconstruct to. Critical for deploys
        whose proxy headers don't match the public URL."""
        patch_config(None)
        monkeypatch.setenv(
            "HERMES_DASHBOARD_PUBLIC_URL", "https://custom.example",
        )
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://custom.example/auth/callback", (
            f"public_url env var didn't override reconstruction "
            f"(got {redirect_uri!r})"
        )

    def test_public_url_config_yaml_used_when_env_unset(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        monkeypatch.delenv("HERMES_DASHBOARD_PUBLIC_URL", raising=False)
        patch_config("https://from-config.example")
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://from-config.example/auth/callback"

    def test_env_overrides_config_public_url(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """Precedence pin — env wins over config.yaml. Fly.io / CI
        secret injection depends on this ordering."""
        monkeypatch.setenv(
            "HERMES_DASHBOARD_PUBLIC_URL", "https://from-env.example",
        )
        patch_config("https://from-config.example")
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://from-env.example/auth/callback", (
            "env var must override config.yaml — Fly secret injection "
            "depends on this precedence"
        )

    def test_public_url_with_path_prefix_baked_in(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """When public_url already carries a path prefix
        (``https://example.com/hermes``), the OAuth callback URL is
        the path appended verbatim. The operator is declaring the
        whole authority; we trust them."""
        patch_config(None)
        monkeypatch.setenv(
            "HERMES_DASHBOARD_PUBLIC_URL", "https://example.com/hermes",
        )
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://example.com/hermes/auth/callback"

    def test_public_url_ignores_x_forwarded_prefix(
        self, gated_app_proxied, patch_config, monkeypatch
    ):
        """X-Forwarded-Prefix is the auto-reconstruction signal; when
        public_url is set we no longer need to guess, and stacking the
        prefix on top would double-prefix in the common case where
        the operator already baked their prefix into public_url."""
        patch_config(None)
        monkeypatch.setenv(
            "HERMES_DASHBOARD_PUBLIC_URL", "https://example.com/already-prefixed",
        )
        redirect_uri = self._redirect_uri(
            gated_app_proxied,
            headers={"x-forwarded-prefix": "/should-be-ignored"},
        )
        assert (
            redirect_uri == "https://example.com/already-prefixed/auth/callback"
        ), (
            f"public_url should suppress X-Forwarded-Prefix layering, "
            f"got {redirect_uri!r}"
        )

    def test_public_url_strips_trailing_slash(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """``https://example.com/`` and ``https://example.com`` must
        produce identical results: no ``//auth/callback`` double slash."""
        patch_config(None)
        monkeypatch.setenv(
            "HERMES_DASHBOARD_PUBLIC_URL", "https://example.com/",
        )
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://example.com/auth/callback"

    def test_malformed_public_url_falls_through_to_reconstruction(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """Defence against header injection: a public_url that doesn't
        parse as ``http(s)://host[/path]`` is dropped and we fall back
        to request reconstruction. The login flow continues to work
        rather than dispatching the user to a hostile URL."""
        from urllib.parse import urlparse

        patch_config(None)
        for bad in [
            "javascript:alert(1)",
            "ftp://example.com",
            "example.com",  # missing scheme
            "https://",  # missing host
            'https://example.com/"injected',  # quote char
            "https://example.com/\nhttps://evil",  # CRLF injection
        ]:
            monkeypatch.setenv("HERMES_DASHBOARD_PUBLIC_URL", bad)
            redirect_uri = self._redirect_uri(gated_app_direct)
            # Fell through to request reconstruction — netloc is the
            # bound host, NOT the hostile value.
            parsed = urlparse(redirect_uri)
            assert parsed.netloc == "fly-app.fly.dev", (
                f"malformed public_url={bad!r} leaked into redirect_uri: "
                f"{redirect_uri!r}"
            )
            assert parsed.path == "/auth/callback"

    def test_empty_public_url_env_treated_as_unset(
        self, gated_app_direct, patch_config, monkeypatch
    ):
        """Same defensive behaviour as the other env vars in this
        plugin: an empty env var doesn't shadow a valid config.yaml
        entry."""
        monkeypatch.setenv("HERMES_DASHBOARD_PUBLIC_URL", "")
        patch_config("https://from-config.example")
        redirect_uri = self._redirect_uri(gated_app_direct)
        assert redirect_uri == "https://from-config.example/auth/callback"
# ---------------------------------------------------------------------------
# Cookies: Path attribute + __Host- / __Secure- prefix rules
# ---------------------------------------------------------------------------


@@ -367,6 +367,31 @@ Or pass --insecure to skip the auth gate (NOT recommended on untrusted
networks).
```
### Public URL override
By default, the dashboard reconstructs the OAuth callback URL from the request — `X-Forwarded-Host` + `X-Forwarded-Proto` + `X-Forwarded-Prefix` (when uvicorn is configured with `proxy_headers=True`, which `start_server` enables under the gate). This works out of the box on Fly.io, which sets all three headers correctly.
For deploys behind reverse proxies that don't reliably forward those headers (manual nginx setups, on-prem ingresses, custom-domain Fly deploys with partial proxy chains), set `dashboard.public_url` (or `HERMES_DASHBOARD_PUBLIC_URL`) to the **complete public URL** the dashboard is reached at:
```yaml
dashboard:
  public_url: "https://dashboard.example.com/hermes"
```
When set, the OAuth callback URL becomes `<public_url>/auth/callback` verbatim — `X-Forwarded-Prefix` is ignored on that code path because the operator has explicitly declared the public URL. This is intentional: stacking the prefix on top would double-prefix the common case where the prefix is already baked into `public_url`.
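
A small sketch of the double-prefix problem this avoids. Both helper names are hypothetical and exist only to show the string arithmetic; they are not part of the dashboard.

```python
def callback_from_public_url(public_url: str) -> str:
    """Hypothetical helper: append the callback path to the declared URL."""
    return f"{public_url.rstrip('/')}/auth/callback"


def callback_with_prefix_stacked(public_url: str, forwarded_prefix: str) -> str:
    """Hypothetical counter-example: layer X-Forwarded-Prefix on top anyway."""
    return f"{public_url.rstrip('/')}{forwarded_prefix}/auth/callback"


public_url = "https://dashboard.example.com/hermes"

print(callback_from_public_url(public_url))
# https://dashboard.example.com/hermes/auth/callback

print(callback_with_prefix_stacked(public_url, "/hermes"))
# https://dashboard.example.com/hermes/hermes/auth/callback
```

The second URL is the one a proxy serving the dashboard at `/hermes` would most likely fail to route, which is why the header is ignored once `public_url` is set.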
Same precedence as the other dashboard settings — env wins over `config.yaml`:
| Surface | Override path | When to use |
|---------|---------------|-------------|
| `dashboard.public_url` in `config.yaml` | `HERMES_DASHBOARD_PUBLIC_URL` | Local dev / on-prem (canonical) |
| `HERMES_DASHBOARD_PUBLIC_URL` env var | — | Fly.io platform secrets / CI |
| (unset) | — | Default — reconstruct from `X-Forwarded-*` headers |
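
The table can be read as a simple lookup order. The sketch below is illustrative only: `pick_public_url` is a hypothetical name, the environment and config are passed in as plain mappings, and validation is omitted.

```python
from typing import Mapping, Optional


def pick_public_url(
    environ: Mapping[str, str],
    dashboard_cfg: Mapping[str, object],
) -> Optional[str]:
    """Hypothetical resolver: env (non-empty) > config.yaml > None."""
    env_value = environ.get("HERMES_DASHBOARD_PUBLIC_URL", "").strip()
    if env_value:
        return env_value
    cfg_value = str(dashboard_cfg.get("public_url") or "").strip()
    # None stands for "reconstruct from X-Forwarded-* headers".
    return cfg_value or None


cfg = {"public_url": "https://from-config.example"}

# Env set and non-empty: env wins.
print(pick_public_url({"HERMES_DASHBOARD_PUBLIC_URL": "https://from-env.example"}, cfg))
# Env provisioned but empty: the config.yaml value is used.
print(pick_public_url({"HERMES_DASHBOARD_PUBLIC_URL": ""}, cfg))
# Neither surface set: None.
print(pick_public_url({}, {}))
```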
Validation rejects values without `http://` / `https://` scheme, without a host, or containing quote / angle / whitespace / control characters. A malformed value silently falls through to header reconstruction so the login flow keeps working rather than dispatching the user to a hostile URL.
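
To check a candidate value before deploying it, something like the following can help. It is a sketch that mirrors the rules described above, not the dashboard's own validator: `is_acceptable_public_url` and `REJECT_CHARS` are hypothetical names, and the reject set is assumed to be quote, angle-bracket, space, newline, carriage-return, and tab characters.

```python
from urllib.parse import urlparse

# Assumed reject set: quotes, angle brackets, and whitespace characters.
REJECT_CHARS = frozenset("\"'<> \n\r\t")


def is_acceptable_public_url(value: str) -> bool:
    """Hypothetical check: http(s) scheme, non-empty host, no rejected chars."""
    url = value.strip()
    if not url or any(ch in REJECT_CHARS for ch in url):
        return False
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


for candidate in [
    "https://dashboard.example.com/hermes",
    "javascript:alert(1)",
    "ftp://example.com",
    "example.com",
    "https://",
    'https://example.com/"injected',
    "https://example.com/\nhttps://evil",
]:
    print(f"{candidate!r}: {is_acceptable_public_url(candidate)}")
```

Only the first candidate passes; the others fail on scheme, host, or a rejected character.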
> **Note:** `public_url` overrides the OAuth callback URL only. The `Secure` cookie flag is still controlled by `request.url.scheme` (X-Forwarded-Proto under proxy_headers), so an `http://` `public_url` on a TLS-terminated public deploy will produce non-Secure cookies. This is an operator footgun — pair `public_url` with proper TLS termination upstream.
### OAuth flow
The provider implements the [Nous Portal OAuth contract v1](https://github.com/NousResearch/nous-account-service/blob/main/docs/agent-dashboard-oauth-contract.md) — authorization-code grant with PKCE (S256):