hermes-agent/hermes_cli/dashboard_auth/audit.py
Ben cb9cb6ba1c feat(dashboard-auth): generic non-interactive API-token capability
Task 2.0a of the safe-shutdown drain-coordination plan. Widens the dashboard
auth framework GENERICALLY to support non-interactive (service-to-service)
bearer-token auth, mirroring the existing supports_password precedent. This is
a reusable capability — any future machine-credential provider plugs in without
core changes (decisions.md Q-C). The drain bearer-secret plugin (Task 2.0b) is
the first consumer, not the definition.

- base.py: add TokenPrincipal dataclass (the token analog of Session) +
  supports_token capability flag + verify_token() on the ABC (default raises
  NotImplementedError so a misconfigured provider fails loud). Contract mirrors
  verify_session stacking: return None for unrecognised tokens (never raise),
  raise ProviderError only on a genuine backing-store outage.
- registry.py: list_token_providers() — the supports_token subset, in
  registration order. Empty when none registered (token routes fail closed).
- token_auth.py (new): route-agnostic seam. Routes opt in via
  register_token_route(exact path); token_auth_middleware owns the auth
  decision for those routes only — authenticate via stacked providers, attach
  request.state.token_principal + token_authenticated, pass through. 401 on
  missing/unrecognised token, 503 when a provider was unreachable, untouched
  passthrough for non-token routes. Fails closed (never open).
- web_server.py: install the seam OUTERMOST (registered last → runs first).
  Both downstream gates (legacy auth_middleware + gated_auth_middleware) honour
  request.state.token_authenticated and skip enforcement, so a token-authed
  service request is never bounced to /login.
- audit.py: TOKEN_AUTH_SUCCESS / TOKEN_AUTH_FAILURE events.

Tests: tests/hermes_cli/test_dashboard_token_auth.py — ABC flag default,
verify_token NotImplementedError, registry filter, bearer extraction
(case-insensitive scheme, malformed/non-bearer → ""), provider stacking
(first-match-wins, unreachable-remembered, unreachable-then-valid, buggy
provider doesn't crash the gate), and the seam's passthrough/401/503/
fail-closed behaviour. 29 new tests; full dashboard-auth suite 169 passed.

Intentionally deferred:
- The concrete shared-bearer-secret provider plugin — Task 2.0b.
- The begin/cancel-drain endpoint that registers itself as a token route —
  Task 2.1.

Build status: dashboard-auth + plugin-hook suites green.
2026-06-26 00:47:19 -07:00

89 lines
2.9 KiB
Python

"""Audit log for dashboard-auth events.
Profile-aware location: ``$HERMES_HOME/logs/dashboard-auth.log``.
Format: one JSON object per line. Token-like fields are stripped before
serialisation to avoid leaking refresh tokens or JWTs to disk.
This module deliberately keeps a minimal dependency surface — no imports
from ``hermes_constants`` or other hermes_cli modules — so it can be
imported safely from middleware code that loads early in the startup
sequence.
"""
from __future__ import annotations
import datetime as _dt
import enum
import json
import logging
import os
import threading
from pathlib import Path
from typing import Any
_log = logging.getLogger(__name__)
_write_lock = threading.Lock()
# Field names that must never appear in the log raw. Any kwarg matching
# these is silently dropped.
_REDACTED_FIELDS: frozenset = frozenset({
"access_token", "refresh_token", "code", "code_verifier",
"state", "ticket", "cookie", "Authorization", "authorization",
})
class AuditEvent(enum.Enum):
"""Event types written to dashboard-auth.log.
Values are the literal ``event`` field on the JSON line.
"""
LOGIN_START = "login_start"
LOGIN_SUCCESS = "login_success"
LOGIN_FAILURE = "login_failure"
LOGOUT = "logout"
REFRESH_SUCCESS = "refresh_success"
REFRESH_FAILURE = "refresh_failure"
REVOKE = "revoke"
SESSION_VERIFY_FAILURE = "session_verify_failure"
WS_TICKET_MINTED = "ws_ticket_minted"
WS_TICKET_REJECTED = "ws_ticket_rejected"
TOKEN_AUTH_SUCCESS = "token_auth_success"
TOKEN_AUTH_FAILURE = "token_auth_failure"
def _resolve_log_path() -> Path:
"""``$HERMES_HOME/logs/dashboard-auth.log`` with the standard fallback.
Mirrors ``hermes_constants.get_hermes_home`` semantics: env var wins,
else ``~/.hermes``. A local copy avoids an import cycle with the
middleware which lives below ``hermes_cli``.
"""
home = os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")
return Path(home) / "logs" / "dashboard-auth.log"
def audit_log(event: AuditEvent, **fields: Any) -> None:
"""Append one event to the audit log.
Token-like fields are dropped. Missing log directory is created.
Write failures are logged at WARNING but never raise — auth must not
fail because the audit logger broke.
"""
safe_fields = {
k: v for k, v in fields.items()
if k not in _REDACTED_FIELDS
}
entry = {
"ts": _dt.datetime.now(_dt.timezone.utc).isoformat(),
"event": event.value,
**safe_fields,
}
line = json.dumps(entry, separators=(",", ":")) + "\n"
path = _resolve_log_path()
try:
path.parent.mkdir(parents=True, exist_ok=True)
with _write_lock:
with open(path, "a", encoding="utf-8") as f:
f.write(line)
except Exception as e:
_log.warning("dashboard-auth audit log write failed: %s", e)