feat(dashboard-auth): Nous plugin always-on; default portal URL; specific error messages

The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.

Three changes:

1. plugins/dashboard_auth/nous/__init__.py:
   - HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
     'https://portal.nousresearch.com'. Override only for staging
     (portal.rewbs.uk) or a custom deployment. Empty string also
     falls back to the default so an empty Fly secret can't point
     the dashboard at nowhere.
   - Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
     reads when no providers register. Cleared on each register() call.
     Skip reasons are human-readable and actionable
     ('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
     provisions this env var…').

2. plugins/dashboard_auth/nous/plugin.yaml:
   - requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
     is mandatory. Description updated to reflect this.

3. hermes_cli/web_server.py:
   - When the gate fail-closes for 'no providers', it now reads each
     bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
     message. Operator sees the specific config fix needed:
       Bundled providers reported these issues:
         • nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
     instead of the prior generic 'Install the default Nous provider'.

Tests:
  - TestPluginRegister rewritten to assert the new defaults +
    LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
  - New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
  - test_get_method_is_not_allowed widened to handle the SPA-shell 200
    path explicitly — assertion now verifies no JSON ticket leaks
    rather than asserting a specific status code (covers all four of
    401/404/405/200).

Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
This commit is contained in:
Ben 2026-05-21 18:27:56 +10:00 committed by Teknium
parent af3d4a687f
commit b3dc539304
7 changed files with 219 additions and 56 deletions

View file

@ -323,14 +323,30 @@ If the gate would engage but **no** `DashboardAuthProvider` is registered (no No
### Default provider: Nous Research
The bundled `plugins/dashboard_auth/nous` plugin is auto-loaded and registers a `DashboardAuthProvider` named `nous` when these environment variables are present:
The bundled `plugins/dashboard_auth/nous` plugin is **always installed** and auto-loaded. It auto-registers a `DashboardAuthProvider` named `nous` when the per-instance client ID is set:
| Env var | Format | Provisioned by |
|---------|--------|----------------|
| `HERMES_DASHBOARD_OAUTH_CLIENT_ID` | `agent:{instance_id}` | Nous Portal at Fly.io provisioning time |
| `HERMES_DASHBOARD_PORTAL_URL` | `https://portal.nousresearch.com` | Nous Portal at Fly.io provisioning time |
| Env var | Required? | Format | Provisioned by |
|---------|-----------|--------|----------------|
| `HERMES_DASHBOARD_OAUTH_CLIENT_ID` | **yes** | `agent:{instance_id}` | Nous Portal at Fly.io provisioning time |
| `HERMES_DASHBOARD_PORTAL_URL` | no | `https://portal.nousresearch.com` (default) | Portal — override only for staging or a custom deployment |
Both are injected automatically when you deploy the Hermes Agent VPS through the Nous Portal — you don't set them by hand. If either is absent, the Nous plugin loads silently and registers nothing (the gate's fail-closed branch then kicks in if a public bind is attempted).
`HERMES_DASHBOARD_OAUTH_CLIENT_ID` is the only required variable; it's injected automatically when you deploy through the Nous Portal. The portal URL defaults to production, so the typical operator never touches it — set it explicitly only if you're pointing at staging (`portal.rewbs.uk`) or a custom Portal deployment.
If `HERMES_DASHBOARD_OAUTH_CLIENT_ID` is absent or malformed, the plugin reports the specific reason and the dashboard's fail-closed bind error tells you exactly what to fix:
```
Refusing to bind dashboard to 0.0.0.0 — the OAuth auth gate engages on
non-loopback binds, but no auth providers are registered.
Bundled providers reported these issues:
• nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
provisions this env var (shape 'agent:{instance_id}') when it
deploys a Hermes Agent instance — set it to your provisioned
client id, or pass --insecure to skip the OAuth gate entirely.
Or pass --insecure to skip the auth gate (NOT recommended on untrusted
networks).
```
### OAuth flow
@ -390,9 +406,9 @@ The login page lists all registered providers; multiple providers can be stacked
### Verifying the gate is on
```bash
# Run the dashboard with the gate engaged (Fly.io shape):
# Run the dashboard with the gate engaged (Fly.io shape).
# HERMES_DASHBOARD_PORTAL_URL is optional — defaults to production.
HERMES_DASHBOARD_OAUTH_CLIENT_ID=agent:test \
HERMES_DASHBOARD_PORTAL_URL=https://portal.nousresearch.com \
hermes dashboard --host 0.0.0.0
# Hit /api/status to see the gate state: