mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-22 10:32:00 +00:00
Live-test finding: the Chronos fire webhook was only on the APIServerAdapter
(aiohttp), but hosted agents expose `hermes dashboard` (the FastAPI web_server
app on :9119) as their public URL — NOT the api_server adapter. So NAS's relay
callback to {callback_url}/api/cron/fire could never reach the verifier on a
hosted agent (the exact target environment). Two layers were wrong:
1. Wrong server: /api/cron/fire didn't exist on the dashboard app. Added
   cron_fire_webhook there, alongside the existing /api/cron/* dashboard
   routes. It resolves the job's profile (_find_cron_job_profile) and runs
   fire_due via the resolved provider under the cron-profile retarget lock
   (_fire_cron_job_for_profile, mirroring _call_cron_for_profile) so the CAS
   claim + run_one_job operate on the right profile's jobs.json. It runs with
   no live adapters (delivery falls back to the per-platform send path, like
   the desktop cron path). It returns 202 and fires in the background so a
   long turn never trips NAS's timeout; the store CAS de-dupes a NAS retry.
   A job that is not found returns 200 "gone".
2. Auth gate: the dashboard auth middleware 401s any non-cookie request
   before the handler runs. Added /api/cron/fire to the shared
   PUBLIC_API_PATHS so the NAS bearer-JWT callback reaches the verifier; the
   JWT (purpose=cron_fire), not the cookie, is the real gate. One shared
   frozenset feeds both the loopback and OAuth middlewares, so no drift.
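The request flow in item 1 can be sketched without any web framework. This is a minimal illustration, not the actual cron_fire_webhook: the name handle_cron_fire, its callable parameters, and every response body other than "gone" are hypothetical, and a plain thread stands in for whatever background mechanism the dashboard app really uses.

```python
import threading
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str]]


def handle_cron_fire(
    bearer: Optional[str],
    body: Dict[str, str],
    *,
    verify_token: Callable[[str], bool],
    find_profile: Callable[[str], Optional[str]],
    fire_for_profile: Callable[[str, str], None],
) -> Tuple[Response, Optional[threading.Thread]]:
    """Return the HTTP response plus the background thread, if one was started."""
    # 1. The bearer token is the only gate; no cookie is consulted.
    if not bearer or not verify_token(bearer):
        return (401, {"error": "invalid_token"}), None

    # 2. Reject a callback without a job_id before touching any job store.
    job_id = body.get("job_id")
    if not job_id:
        return (400, {"error": "missing_job_id"}), None

    # 3. An unknown job yields 200 "gone" rather than an error status.
    profile = find_profile(job_id)
    if profile is None:
        return (200, {"status": "gone"}), None

    # 4. Acknowledge immediately and run the job off the request path.
    worker = threading.Thread(
        target=fire_for_profile, args=(profile, job_id), daemon=True
    )
    worker.start()
    return (202, {"status": "accepted"}), worker
```

Returning the thread is only a convenience for tests, which can join it before asserting that the job fired for the resolved profile.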
Kept the APIServerAdapter route too (valid self-host api_server surface).
Contract doc updated to name the dashboard app as the hosted-agent callback
surface.
Tests: test_cron_fire_dashboard (6) — route registered on the dashboard app,
in PUBLIC_API_PATHS, 401 on bad token WITH the cookie gate engaged (proves it's
reachable past the gate + JWT is the gate), 400 missing job_id, 200 gone for
unknown job, 202 + fire_due invoked for the resolved profile on a valid token.
Full hermes_cli + cron + chronos + webhook suites green (7637).
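The bad-token case above depends on the handler rejecting any token whose signature, expiry, or purpose claim is wrong. The sketch below shows that check using only the standard library. It assumes an HS256 token signed with a shared secret and a numeric exp claim; the source only states that the JWT is short-lived and carries purpose=cron_fire, so the algorithm, the secret handling, and both function names are assumptions. A maintained JWT library would normally replace the hand-rolled parsing.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Mint a token. Test helper only; the real issuer is NAS."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_cron_fire_token(
    token: str, secret: bytes, now: Optional[float] = None
) -> bool:
    """True only for a well-signed, unexpired token with purpose=cron_fire."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return False
        signing_input = f"{header_b64}.{payload_b64}".encode()
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return False
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return False  # malformed token of any kind

    if not isinstance(claims, dict):
        return False
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return False
    # A valid signature is not enough: the token must be minted for this use.
    return claims.get("purpose") == "cron_fire"
```

Checking the purpose claim last means a token issued for some other flow fails here even when its signature and expiry are valid.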
Why the original tests missed it: the api_server webhook test built an
APIServerAdapter client directly and never asserted which server the hosted
public URL exposes — green-but-wrong-integration. The new test pins the route
to the dashboard app.
55 lines
2.6 KiB
Python
"""Shared allowlist of ``/api/*`` paths that bypass dashboard auth.

Two middlewares enforce dashboard auth and previously kept independent
copies of this list:

* ``hermes_cli.web_server.auth_middleware`` — loopback / ``--insecure``
  mode, gates on the ephemeral ``_SESSION_TOKEN``.
* ``hermes_cli.dashboard_auth.middleware.gated_auth_middleware`` —
  non-loopback mode, gates on the OAuth session cookie.

When the lists drifted, ``/api/status`` ended up public under the legacy
gate but 401'd under the OAuth gate. That broke the portal's wildcard
liveness probe (``nous-account-service`` ``fly-provider.ts``
``getInstanceRuntimeStatus``), which fetches ``/api/status`` without a
cookie as its sole signal of "agent dashboard is alive": every healthy
wildcard-subdomain agent surfaced as STARTING/down in the portal UI even
though the dashboard was serving correctly.

Centralising the allowlist here so both middlewares import the same
frozenset prevents the next drift. Keep this list minimal — only truly
non-sensitive, read-only endpoints belong here. As a sanity check, every
entry should be safe to expose to:

* external uptime probes (Pingdom, Better Stack, NAS),
* the dashboard SPA before the user has logged in,
* anyone who happens to ``curl`` the hostname.

If a new endpoint doesn't pass all three tests, it should be gated and
the SPA should bootstrap it after login instead.
"""
from __future__ import annotations

PUBLIC_API_PATHS: frozenset[str] = frozenset({
    # Liveness probe target. Returns version, gateway state, active
    # session count, and the dashboard auth-gate shape. No bodies, no
    # session content, no secrets. Documented as the portal's wildcard
    # liveness probe in
    # ``docs/agent-dashboard-public-url-contract.md`` (NAS side).
    "/api/status",
    # Read-only config-defaults / schema feeds for the SPA's Config page.
    "/api/config/defaults",
    "/api/config/schema",
    # Read-only model metadata (context windows, etc.) — same shape as
    # provider catalogs already exposed on the public internet.
    "/api/model/info",
    # Read-only theme + plugin manifests for the dashboard skin engine.
    "/api/dashboard/themes",
    "/api/dashboard/plugins",
    # Chronos managed-cron fire webhook (NAS -> agent). NOT cookie-gated: it
    # carries its own short-lived NAS-minted JWT (purpose=cron_fire), which the
    # handler verifies as the real auth. Must bypass the dashboard auth gate so
    # the NAS relay's bearer-only callback reaches the verifier instead of a
    # 401 no_cookie. The JWT — not this allowlist — is the security boundary.
    "/api/cron/fire",
})
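How two gates can share one allowlist is easiest to see in a reduced form. The sketch below is illustrative only: loopback_gate and oauth_gate are simplified stand-ins for the two real middlewares, the allowlist is an abbreviated copy, /api/sessions is a made-up gated path, and the exact-match lookup is an assumption about how the set is consulted.

```python
import hmac
from typing import FrozenSet, Optional

# Abbreviated copy of the allowlist, for illustration only.
PUBLIC_API_PATHS: FrozenSet[str] = frozenset({"/api/status", "/api/cron/fire"})


def is_public(path: str) -> bool:
    # Exact-match lookup (an assumption of this sketch): a prefix match would
    # expose every sub-path of an allowlisted route.
    return path in PUBLIC_API_PATHS


def loopback_gate(path: str, token: Optional[str], session_token: str) -> int:
    """Stand-in for the loopback gate: checks an ephemeral session token."""
    if is_public(path):
        return 200
    if token is not None and hmac.compare_digest(token, session_token):
        return 200
    return 401


def oauth_gate(path: str, cookie: Optional[str], sessions: FrozenSet[str]) -> int:
    """Stand-in for the OAuth gate: checks a session cookie."""
    if is_public(path) or cookie in sessions:
        return 200
    return 401


# Both gates must agree on every public path; this is the drift invariant.
for public_path in PUBLIC_API_PATHS:
    assert loopback_gate(public_path, None, "tok") == 200
    assert oauth_gate(public_path, None, frozenset()) == 200
```

Because both gates call the same is_public helper, adding a path to the frozenset changes their behaviour together, which is the property the module docstring is protecting.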