When jwt.decode raises InvalidTokenError, decode the token a second time
without signature verification (safe — we never trust the values, just
display them) and append the actual iss/aud claims plus our configured
expected values to the error message. Lets operators see config drift
between HERMES_DASHBOARD_PORTAL_URL / HERMES_DASHBOARD_OAUTH_CLIENT_ID
and what Portal is actually emitting without having to hand-decode the
JWT from the browser cookie.
The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.
Three changes:
1. plugins/dashboard_auth/nous/__init__.py:
- HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
'https://portal.nousresearch.com'. Override only for staging
(portal.rewbs.uk) or a custom deployment. Empty string also
falls back to the default so an empty Fly secret can't point
the dashboard at nowhere.
- Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
reads when no providers register. Cleared on each register() call.
Skip reasons are human-readable and actionable
('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
provisions this env var…').
2. plugins/dashboard_auth/nous/plugin.yaml:
- requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
is mandatory. Description updated to reflect this.
3. hermes_cli/web_server.py:
- When the gate fail-closes for 'no providers', it now reads each
bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
message. Operator sees the specific config fix needed:
Bundled providers reported these issues:
• nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
instead of the prior generic 'Install the default Nous provider'.
Tests:
- TestPluginRegister rewritten to assert the new defaults +
LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
- New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
- test_get_method_is_not_allowed widened to handle the SPA-shell 200
path explicitly — assertion now verifies no JSON ticket leaks
rather than asserting a specific status code (covers all four of
401/404/405/200).
Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected
env vars are present:
HERMES_DASHBOARD_OAUTH_CLIENT_ID — agent:{instance_id}
HERMES_DASHBOARD_PORTAL_URL — Portal base URL
Loopback / --insecure operators leave both unset and never see this
plugin register anything. The fail-closed branch in start_server handles
the 'public bind + zero providers' case independently.
Implementation follows nous-account-service PR #180's published OAuth
contract verbatim:
- client_id is per-instance (agent:{instance_id}); the suffix is
cross-checked against the token's agent_instance_id claim as
defense-in-depth (contract C9).
- scope is agent_dashboard:access only (contract C3).
- aud is the bare client_id, no hermes-cli: prefix (contract C2).
- RS256 JWT verification against /.well-known/jwks.json with
5-minute cache (contract C7).
- No refresh tokens in V1: refresh_session always raises
RefreshExpiredError; revoke_session is a no-op (contract C5).
- oauth_contract_version claim: missing → warn + proceed; present
and != 1 → refuse (contract C11, OQ-C2 tolerant treatment).
- redirect_uri validated client-side as defense before bouncing to
Portal; authoritative check is server-side per agent-redirect-uri.ts.
41 new tests covering construction, plugin-entry env gating, start_login
shape, complete_login httpx-mocked happy path + error mapping,
verify_session JWT verification (RSA keypair fixture, full claim-check
matrix), refresh_session always raising, revoke_session no-op.
PyJWT + cryptography are already in the venv (jose was previously
suggested; switched to pyjwt[crypto] since the latter is already
pulled in transitively).