Commit graph

6 commits

Author SHA1 Message Date
Ben
b3dc539304 feat(dashboard-auth): Nous plugin always-on; default portal URL; specific error messages
The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.

Three changes:

1. plugins/dashboard_auth/nous/__init__.py:
   - HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
     'https://portal.nousresearch.com'. Override only for staging
     (portal.rewbs.uk) or a custom deployment. Empty string also
     falls back to the default so an empty Fly secret can't point
     the dashboard at nowhere.
   - Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
     reads when no providers register. Cleared on each register() call.
     Skip reasons are human-readable and actionable
     ('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
     provisions this env var…').

2. plugins/dashboard_auth/nous/plugin.yaml:
   - requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
     is mandatory. Description updated to reflect this.

3. hermes_cli/web_server.py:
   - When the gate fail-closes for 'no providers', it now reads each
     bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
     message. Operator sees the specific config fix needed:
       Bundled providers reported these issues:
         • nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
     instead of the prior generic 'Install the default Nous provider'.

Tests:
  - TestPluginRegister rewritten to assert the new defaults +
    LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
  - New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
  - test_get_method_is_not_allowed widened to handle the SPA-shell 200
    path explicitly — assertion now verifies no JSON ticket leaks
    rather than asserting a specific status code (covers all four of
    401/404/405/200).

Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
2026-05-27 02:12:27 -07:00
Ben
5e9308b5b8 feat(dashboard-auth): Phase 6 — 401 re-auth envelope + next= propagation
Contract V1 of nous-account-service PR #180 ships no refresh tokens, so
the original Phase 6 silent-refresh design is replaced with a thinner
'401 → redirect to /login' UX. The dashboard's gated middleware now
emits a structured envelope on any auth failure; the SPA's fetch
wrapper sees it and full-page-navigates the user through re-auth.

hermes_cli/dashboard_auth/cookies.py:
  set_session_cookies(refresh_token='') SKIPS writing the
  hermes_session_rt cookie. Forward-compat: a non-empty refresh_token
  still emits the cookie unchanged, so a future Portal contract that
  starts issuing RTs flips the persistence on with no other change.
  clear_session_cookies still emits a Max-Age=0 deletion for the RT
  cookie so stale cookies from earlier deployments get flushed on
  logout / session expiry. Deprecation marker + rationale in
  module docstring per the user's docstring-only deprecation pattern.

hermes_cli/dashboard_auth/middleware.py:
  _unauth_response now builds a structured JSON envelope for API 401s:
    { error: 'session_expired' | 'unauthenticated',
      detail: 'Unauthorized',
      reason: <internal>,
      login_url: '/login?next=<safe-path>' }
  HTML redirects also carry next= so a user landing on /sessions
  without a cookie bounces back to /sessions after re-auth.
  _safe_next_target validates same-origin: drops protocol-relative
  paths (//evil.com), absolute URLs, and any /login or /auth/* loop.
  Dead cookies are cleared on the 401 path so the browser stops
  replaying invalid tokens.

hermes_cli/dashboard_auth/routes.py:
  /auth/callback accepts next= query param and validates via
  _validate_post_login_target (same rules as the gate's
  _safe_next_target — defence-in-depth because next= survived a full
  IDP round trip and attacker-controlled state can re-enter via the
  callback URL). Open-redirect attempts land at '/' instead.

web/src/lib/api.ts:
  fetchJSON parses the 401 envelope and full-page-navigates to
  body.login_url ONLY on the known session-expiry error codes.
  Domain-level 401s (e.g. permission errors) bubble up as regular
  errors. credentials: 'include' added so cookie auth works for all
  fetches routed through this wrapper. sessionStorage.lastLocation is
  preserved for future use by AuthWidget / hermes_status.

Test files marked with pytest.mark.xdist_group so the four files that
mutate web_server.app.state.auth_required serialize onto the same xdist
worker — eliminates 'works locally, fails in CI' app-state bleed.

20 new tests in test_dashboard_auth_401_reauth.py:
  - set_session_cookies(refresh_token='') skips RT cookie
  - clear_session_cookies still emits RT deletion
  - 401 envelope shape (unauthenticated vs session_expired)
  - dead cookie cleared on invalid-token 401
  - login_url carries next= for deep paths
  - login loop avoided when path is /login/auth/api-auth
  - protocol-relative URL rejected
  - _safe_next_target unit tests (accept same-origin, reject loops/abs)
  - /auth/callback respects safe next= but rejects open redirects

2 pre-existing tests updated to accept the new /login?next=%2F shape.

Full dashboard-auth suite: 168 passed, 1 skipped (Phase 0 pre-existing).
2026-05-27 02:12:27 -07:00
Ben
53736b3922 feat(dashboard-auth): fail-closed on no providers; proxy_headers when gated; suppress _SESSION_TOKEN injection
Phase 3, Task 3.5. Three changes to web_server.py:

  1. start_server replaces the legacy SystemExit-refusing-to-bind guard
     with: if app.state.auth_required and no providers registered, exit
     with a clear message; otherwise log the gate-on banner. --insecure
     keeps its existing behaviour.

  2. uvicorn proxy_headers flag is computed from app.state.auth_required.
     Loopback / --insecure keep it False (so _ws_client_is_allowed sees
     the real peer for the loopback gate); gated mode flips it True so
     X-Forwarded-Proto from Fly's TLS terminator is honoured for cookie
     Secure-flag decisions in detect_https().

  3. _serve_index no longer injects window.__HERMES_SESSION_TOKEN__ when
     the gate is on — the SPA reads identity from /api/auth/me using
     cookie auth instead. window.__HERMES_AUTH_REQUIRED__ flag lets the
     SPA pick between ticket-auth (gated) and token-auth (loopback) for
     /api/pty + /api/ws (Phase 5 will wire this in the React layer).

4 new behavioural tests; loopback regression harness still green.
2026-05-27 02:12:27 -07:00
Ben
949ad95e4b feat(dashboard): stash auth_required flag on app.state
Phase 0, Task 0.3. start_server now computes should_require_auth(host,
allow_public) and records it on app.state.auth_required BEFORE the
existing legacy SystemExit guard fires. This gives middleware, the SPA
token-injection path, and WS endpoints a consistent read source for
'is the gate active'. The flag is set but no one reads it yet — Phase 3
registers the gate middleware.

Note: 4 pre-existing test failures in tests/hermes_cli/test_web_server.py
(PtyWebSocket) + test_update_hangup_protection.py reproduce on pristine
HEAD and are unrelated to this change (starlette TestClient WS regression).
2026-05-27 02:12:27 -07:00
Ben
8773bbf186 feat(dashboard): add should_require_auth predicate for OAuth gate
Phase 0, Task 0.2. Single source of truth for 'is the auth gate active?'.
Reuses the existing _LOOPBACK_HOST_VALUES frozenset so this stays in sync
with the DNS-rebinding host-header check. RFC1918/CGNAT/link-local are
treated as public — exact threat model the gate exists for.
2026-05-27 02:12:27 -07:00
Ben
f2b479e7a2 test(dashboard): pin current loopback auth behavior as regression harness
Phase 0, Task 0.1 of the dashboard-oauth plan. Establishes a baseline for
the loopback dashboard's auth surface so future phases can prove they
didn't regress the existing _SESSION_TOKEN flow when adding the OAuth gate.
2026-05-27 02:12:27 -07:00