mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-21 10:22:18 +00:00
* feat(billing): nous_billing http client + BillingState core (phase 2b)
Phase 2b terminal-billing client foundation:
- hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints
(state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired,
BillingRateLimited, BillingAuthError) mapped from the live-verified contract;
fail-open is the caller's job. Idempotency-Key enforced client-side.
- agent/billing_view.py: surface-agnostic BillingState core + Decimal money
parsing (server emits decimal strings, not 2dp), fail-open builder,
idempotency-key gen, custom-amount validation.
- 51 unit tests (decimal parse/format, payload tiering, error->exception
matrix, fail-open, amount validation).
Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md
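The Decimal money rule above (server emits decimal strings, not fixed 2dp) can be illustrated with a minimal sketch. The helper names `parse_money` and `format_usd` are hypothetical and are not taken from agent/billing_view.py; the rounding mode is likewise an assumption.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP
from typing import Optional


def parse_money(raw: object) -> Optional[Decimal]:
    """Parse a server decimal string such as "142.5"; return None if unusable.

    Hypothetical helper: only str input is accepted so a float can never
    sneak into the money path.
    """
    if not isinstance(raw, str) or not raw.strip():
        return None
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return None
    # Decimal accepts "NaN" and "Infinity"; neither is a valid amount.
    return value if value.is_finite() else None


def format_usd(value: Decimal) -> str:
    """Render an amount with exactly two decimal places (assumed display rule)."""
    return f"${value.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)}"
```

Keeping the parsed value as a `Decimal` and quantizing only at display time avoids the binary rounding drift a float round-trip would introduce.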
* feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b)
- NOUS_BILLING_MANAGE_SCOPE constant.
- nous_token_has_billing_scope(): split-based scope check (no false-positive
substring match).
- step_up_nous_billing_scope(): re-runs the device flow requesting
billing:manage, reusing the held credential's portal/inference URLs + client_id
(so a preview stays a preview), persists like _login_nous but WITHOUT the model
picker. Returns True iff the minted token carries the scope (False when NAS
silently downscopes a non-admin / unticked grant).
Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope
from a billing call triggers this. 7 unit tests.
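A minimal sketch of the split-based scope check described above, assuming the token's scope claim is a space-delimited string (or a list of strings). The function name `scope_granted` is hypothetical; how the real nous_token_has_billing_scope() obtains the claim is not shown in this commit message.

```python
def scope_granted(scope_claim: object, required: str = "billing:manage") -> bool:
    """Return True only when `required` appears as a whole scope entry.

    Splitting on whitespace avoids the substring false positive where a
    scope such as "billing:manage_readonly" would match "billing:manage".
    """
    if isinstance(scope_claim, str):
        granted = scope_claim.split()
    elif isinstance(scope_claim, (list, tuple)):
        granted = [s for s in scope_claim if isinstance(s, str)]
    else:
        return False
    return required in granted
```

For comparison, the naive check `"billing:manage" in "billing:manage_readonly"` evaluates to True, which is the case the split-based version rejects.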
* feat(billing): billing JSON-RPC methods for the TUI (phase 2b)
billing.state / charge / charge_status / auto_reload / step_up in
tui_gateway/server.py. Return STRUCTURED success envelopes (result.ok +
result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise
always resolves and the TUI branches on the typed billing error code
(insufficient_scope, rate_limited, no_payment_method, …) to render the right
affordance. Money serialized as decimal STRINGS + display strings. charge mints
+ echoes an idempotency_key for retry reuse. 16 unit tests.
* feat(billing): /billing CLI handler + command registry (phase 2b)
- CommandDef("billing", subcommands=buy|auto-reload|limit), added to
_SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap
parity test green, same as /credits).
- cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→
poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal /
_prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal
deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable
poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on
403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal.
- 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green.
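The poll-loop rules above (2s interval, 5-min cap, cancellable, 429/503 treated as retry rather than failure) can be sketched as follows. This is an illustration under stated assumptions, not the cli.py implementation: `poll_charge`, `RetryLater`, and the injectable `sleep`/`clock` parameters are all hypothetical names.

```python
import time
from typing import Callable, Optional


class RetryLater(Exception):
    """Stand-in for a 429/503 outcome (hypothetical; not the client's class)."""

    def __init__(self, retry_after: Optional[int] = None) -> None:
        super().__init__("retry later")
        self.retry_after = retry_after


def poll_charge(
    fetch_status: Callable[[], dict],
    *,
    interval: float = 2.0,
    deadline: float = 300.0,
    cancelled: Callable[[], bool] = lambda: False,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return "settled", "failed", "timeout", or "cancelled"."""
    started = clock()
    while clock() - started < deadline:
        if cancelled():
            return "cancelled"
        wait = interval
        try:
            status = fetch_status().get("status")
        except RetryLater as exc:
            # Rate limiting is not a payment failure: back off, keep polling.
            wait = max(interval, float(exc.retry_after or 0))
        else:
            if status in ("settled", "failed"):
                return status
        sleep(wait)
    # Still pending at the cap: report a timeout, not an error.
    return "timeout"
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; a caller would pass a closure over the charge id as `fetch_status`.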
* feat(billing): /billing Ink TUI screens + tests (phase 2b)
- ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5
screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/
5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> →
ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay
(D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope
arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to
portal. Client-side amount validation mirrors the server (bounds + 2dp).
- gatewayTypes.ts: Billing* response interfaces.
- registry.ts: register billingCommands.
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll-
settled/no_payment_method/step-up/limit/auto-reload/validation).
TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built.
* docs(billing): scrub private cross-repo references
NAS is a private repo — remove all references to it from the public PR:
- drop the cross-repo planning doc (planning scaffolding, not a deliverable;
the PR description documents the design)
- replace 'NAS' / 'PR #412 preview' mentions in code + test comments with
generic 'the server' / 'a preview deployment'
* docs(billing): scrub final NAS reference in step-up docstring
* docs(billing): drop dangling plan-doc refs
The phase-2b plan doc was removed in the cross-repo scrub (300afcc0b)
but two module docstrings still pointed at it. Drop the dead refs.
* feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes
Adds the interactive /billing TUI overlay and hardens the terminal-billing
client across CLI and TUI.
- TUI: full /billing overlay state machine (overview to buy to confirm,
auto-reload, read-only monthly limit) reusing the existing confirm overlay.
- Step-up: surface the verification link in-transcript and open the browser
via the TUI's own opener (the device flow runs in the headless gateway, so a
printed URL was being dropped); run the step-up handler off the main loop and
emit the link as an out-of-band event so the gateway stays responsive.
- Step-up copy is scope-accurate ("Billing permission granted") and re-checks
/state so it never claims "enabled" when the org kill-switch is still off.
- Portal deep-links resolve to absolute URLs against the active portal base
(the server emits them relative) - fixes a bare "/billing?topup=open" link.
- Billing calls refresh an expired access token via the stored refresh token
instead of reporting a false "not logged in".
- Optimistic funnel: advise "set up a saved card on the portal" up front when
no card is on file (advisory, not a hard gate).
- Token resolution is cached briefly so the 2s charge poll loop stops
re-locking + re-reading the auth store on every tick; 401 re-resolves fresh.
- Remove the temporary demo-mode shims.
Validation: 87 Python billing tests, 88 TS tests (billing command + gateway
event handler), tsc clean, ink + ui-tui builds green.
* docs(billing): add /billing TUI screenshots for PR
* fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test
The UI-invalidate throttle read self._last_invalidate unconditionally, which
raised AttributeError on HermesCLI instances built without __init__ (the
thread-safety test's object.__new__ shell). Guard the read with getattr.
The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel
cleanly to None instead of falling back to a bare input() that would hang on the
slash-worker thread; the test still asserted the old direct-input fallback.
Update it to assert the current intended behavior: returns None, calls neither
run_in_terminal nor input(), and does not hang.
406 lines
15 KiB
Python
"""Nous Portal terminal-billing HTTP client (Phase 2b).

Thin, fail-loud client for the four ``/api/billing/*`` endpoints the terminal
billing screens drive. Companion to ``hermes_cli/nous_account.py`` (which owns
read-only entitlement/balance) — this module owns the *write* side: buy credits,
poll a charge, configure auto-reload.

Design rules:

- **Money is decimal, never float.** The server emits decimal STRINGS
  (``"142.5"`` — not fixed 2dp). We parse with :class:`decimal.Decimal` and never
  round-trip through float.
- **This client raises typed exceptions; it does NOT fail open.** Fail-open is the
  *caller's* job (the ``agent/billing_view.py`` builders) so each surface can
  decide how to degrade. A raw network/HTTP error here surfaces as
  :class:`BillingError` (or a subclass) carrying the parsed server ``error`` code,
  HTTP status, ``portalUrl`` deep-link, and ``retry_after``.
- **Auth** = the OAuth bearer JWT Hermes already holds for inference
  (``get_provider_auth_state("nous")["access_token"]``). No API-key auth on these.
- **Portal base URL** resolves with the same precedence as the device-flow login
  (``auth.py``): ``HERMES_PORTAL_BASE_URL`` → ``NOUS_PORTAL_BASE_URL`` → the
  stored auth-state ``portal_base_url`` → the registry default. This is how the
  E2E run points the client at a preview deployment with zero code change.
"""

from __future__ import annotations

import json
import os
import urllib.error
import urllib.parse
import urllib.request
from typing import Any, Optional

DEFAULT_PORTAL_BASE_URL = "https://portal.nousresearch.com"

# Default HTTP timeout (seconds). Charge/poll calls are quick; keep this tight so
# a hung portal doesn't freeze the TUI.
DEFAULT_TIMEOUT = 15.0

# Scope the privileged billing endpoints require. Mirrored from
# hermes_cli.auth.NOUS_BILLING_MANAGE_SCOPE (kept here too so this module has no
# import-time dependency on the much heavier auth module).
BILLING_MANAGE_SCOPE = "billing:manage"


# =============================================================================
# Typed errors
# =============================================================================


class BillingError(Exception):
    """A billing HTTP call failed.

    Carries everything a surface needs to render the right message + affordance:
    the server ``error`` code, HTTP ``status``, an optional human ``message``, the
    ``portalUrl`` deep-link (present on every gate denial), and ``retry_after``
    seconds (429/503). ``payload`` is the full parsed JSON body when available.
    """

    def __init__(
        self,
        message: str,
        *,
        status: Optional[int] = None,
        error: Optional[str] = None,
        portal_url: Optional[str] = None,
        retry_after: Optional[int] = None,
        payload: Optional[dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.status = status
        self.error = error
        self.portal_url = portal_url
        self.retry_after = retry_after
        self.payload = payload or {}


class BillingScopeRequired(BillingError):
    """``403 insufficient_scope`` — the held token lacks ``billing:manage``.

    The lazy step-up trigger: catching this kicks off a fresh device-connect that
    requests ``billing:manage`` (and tells the user an ADMIN must tick "Allow
    terminal billing"). Also fires mid-session if the scope is stripped on refresh
    after the user loses ADMIN.
    """


class BillingRateLimited(BillingError):
    """``429 rate_limited`` or ``503 temporarily_unavailable``.

    NOT a payment failure. Carries ``retry_after`` (seconds) — back off and tell
    the user "try again in N min"; never auto-retry-spam (the limiter is
    5/org/hr + 5/token/hr and easy to dig deeper into).
    """


class BillingAuthError(BillingError):
    """``401`` — missing/invalid bearer token (not logged in / expired)."""


# =============================================================================
# Base-URL + auth resolution
# =============================================================================


def resolve_portal_base_url(state: Optional[dict[str, Any]] = None) -> str:
    """Resolve the portal base URL with login-time precedence.

    ``HERMES_PORTAL_BASE_URL`` → ``NOUS_PORTAL_BASE_URL`` → stored auth-state
    ``portal_base_url`` → registry default. Trailing slash stripped.
    """
    env = os.getenv("HERMES_PORTAL_BASE_URL") or os.getenv("NOUS_PORTAL_BASE_URL")
    if env and env.strip():
        return env.strip().rstrip("/")
    if state:
        stored = state.get("portal_base_url")
        if isinstance(stored, str) and stored.strip():
            return stored.strip().rstrip("/")
    return DEFAULT_PORTAL_BASE_URL


def _absolutize_portal_url(portal_url: Optional[str]) -> Optional[str]:
    """Resolve a (possibly relative) server portalUrl to an absolute URL.

    The server emits ``portalUrl`` relative by design (e.g. ``/billing?topup=open``)
    — it doesn't know which deployment the client points at. Resolve it against the
    client's portal base (preview / staging / prod) so deep-links are clickable.
    Idempotent: an already-absolute URL is returned unchanged (urljoin keeps it).
    """
    if not (isinstance(portal_url, str) and portal_url.strip()):
        return portal_url
    base = resolve_portal_base_url()
    # urljoin needs a trailing slash on the base to treat it as a directory and
    # join an absolute path like "/billing?..." against the host. An already-
    # absolute portal_url (with its own scheme/host) is returned as-is.
    return urllib.parse.urljoin(base.rstrip("/") + "/", portal_url)


# Short-lived cache for the resolved (token, base). `resolve_nous_access_token`
# acquires two cross-process file locks + reads two files on every call (even on
# its fast path), which is wasteful when the 2s/5-min charge poll loop calls a
# billing endpoint ~150x per purchase. Cache the result briefly: the resolver
# only ever returns a token with >=120s of life (its refresh skew), so a 30s
# cache can never hand back an about-to-expire token. A 401 still surfaces
# normally (the cache holds a valid token, not the HTTP outcome).
_TOKEN_CACHE_TTL_SECONDS = 30.0
_token_cache: tuple[float, str, str] | None = None  # (cached_at, token, base)


def _billing_not_logged_in(exc: Optional[BaseException] = None) -> "BillingAuthError":
    """Build the canonical 'not logged in' BillingAuthError (single source)."""
    err = BillingAuthError(
        "Not logged into Nous Portal — run `hermes portal` to log in.",
        status=401,
        error="invalid_token",
    )
    if exc is not None:
        err.__cause__ = exc
    return err


def _resolve_token_and_base(*, use_cache: bool = True) -> tuple[str, str]:
    """Return ``(access_token, portal_base_url)`` for billing calls.

    Uses the same refresh-aware resolver the inference path uses
    (``resolve_nous_access_token``), so a short-lived (~15 min) access token that
    has expired is transparently refreshed via the stored ``refresh_token``
    instead of failing as "not logged in". Raises :class:`BillingAuthError` only
    when there is no usable Nous session at all.

    The result is cached for ``_TOKEN_CACHE_TTL_SECONDS`` to keep the charge poll
    loop from re-locking + re-reading the auth store on every 2s tick. Pass
    ``use_cache=False`` to force a fresh resolution (e.g. after a 401).
    """
    global _token_cache
    import time as _time

    if use_cache and _token_cache is not None:
        cached_at, token, base = _token_cache
        if (_time.time() - cached_at) < _TOKEN_CACHE_TTL_SECONDS:
            return token, base

    try:
        from hermes_cli.auth import get_provider_auth_state

        state = get_provider_auth_state("nous") or {}
    except Exception:
        state = {}

    base = resolve_portal_base_url(state)

    try:
        from hermes_cli.auth import AuthError, resolve_nous_access_token
    except ImportError:
        # auth module unavailable — fall back to the raw stored token.
        token = state.get("access_token")
        if isinstance(token, str) and token.strip():
            resolved = (token.strip(), base)
            _token_cache = (_time.time(), *resolved)
            return resolved
        raise _billing_not_logged_in()

    try:
        token = resolve_nous_access_token()
    except AuthError as exc:
        raise _billing_not_logged_in(exc) from exc
    resolved = (token.strip(), base)
    _token_cache = (_time.time(), *resolved)
    return resolved


# =============================================================================
# HTTP plumbing
# =============================================================================


def _retry_after_seconds(headers: Any) -> Optional[int]:
    """Parse a ``Retry-After`` header (integer seconds) — None if absent/bad."""
    if headers is None:
        return None
    try:
        raw = headers.get("Retry-After")
    except Exception:
        raw = None
    if raw is None:
        return None
    try:
        return int(str(raw).strip())
    except (TypeError, ValueError):
        return None


def _raise_for_error(
    status: int, payload: dict[str, Any], headers: Any = None
) -> None:
    """Map an HTTP error response to the right typed :class:`BillingError`."""
    error = payload.get("error") if isinstance(payload, dict) else None
    message = payload.get("message") if isinstance(payload, dict) else None
    portal_url = _absolutize_portal_url(
        payload.get("portalUrl") if isinstance(payload, dict) else None
    )
    retry_after = _retry_after_seconds(headers)

    common = {
        "status": status,
        "error": error,
        "portal_url": portal_url,
        "retry_after": retry_after,
        "payload": payload if isinstance(payload, dict) else None,
    }

    if status == 401:
        raise BillingAuthError(message or "Authentication required.", **common)
    if status == 403 and error == "insufficient_scope":
        raise BillingScopeRequired(
            message or "This action needs the billing:manage scope.", **common
        )
    if status in (429, 503):
        raise BillingRateLimited(
            message or "Rate limited — try again shortly.", **common
        )
    raise BillingError(message or error or f"Billing request failed ({status}).", **common)


def _request(
    method: str,
    path: str,
    *,
    body: Optional[dict[str, Any]] = None,
    extra_headers: Optional[dict[str, str]] = None,
    timeout: float = DEFAULT_TIMEOUT,
    _retried_auth: bool = False,
) -> dict[str, Any]:
    """Make an authenticated billing request; return the parsed JSON dict.

    Raises a typed :class:`BillingError` on any non-2xx response (or transport
    failure). 2xx with an empty body returns ``{}``. A 401 triggers exactly one
    retry with a freshly-resolved token (bypassing the short token cache) so a
    cached-but-just-expired token self-heals instead of failing the call.
    """
    token, base = _resolve_token_and_base(use_cache=not _retried_auth)
    url = f"{base}{path}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    if body is not None:
        headers["Content-Type"] = "application/json"
    if extra_headers:
        headers.update(extra_headers)

    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers, method=method)

    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            raw = resp.read().decode("utf-8")
            return json.loads(raw) if raw.strip() else {}
    except urllib.error.HTTPError as exc:
        # A 401 on a cached token → drop the cache and retry once with a fresh
        # (refresh-aware) resolve before surfacing the auth error.
        if exc.code == 401 and not _retried_auth:
            global _token_cache
            _token_cache = None
            return _request(
                method,
                path,
                body=body,
                extra_headers=extra_headers,
                timeout=timeout,
                _retried_auth=True,
            )
        raw = ""
        try:
            raw = exc.read().decode("utf-8")
        except Exception:
            raw = ""
        try:
            payload = json.loads(raw) if raw.strip() else {}
        except json.JSONDecodeError:
            payload = {}
        _raise_for_error(exc.code, payload, getattr(exc, "headers", None))
        raise  # unreachable; _raise_for_error always raises
    except urllib.error.URLError as exc:
        raise BillingError(
            f"Could not reach Nous Portal: {exc.reason}", error="network_error"
        ) from exc


# =============================================================================
# The four endpoints
# =============================================================================


def get_billing_state(*, timeout: float = DEFAULT_TIMEOUT) -> dict[str, Any]:
    """``GET /api/billing/state`` — role-tiered overview (no scope required)."""
    return _request("GET", "/api/billing/state", timeout=timeout)


def patch_auto_top_up(
    *,
    enabled: bool,
    threshold: float | str,
    top_up_amount: float | str,
    timeout: float = DEFAULT_TIMEOUT,
) -> dict[str, Any]:
    """``PATCH /api/billing/auto-top-up`` — configure auto-reload (scope required).

    Body is strict server-side: extra keys (``maxMonthlySpend``, a payment method)
    are rejected with 400. Numbers are sent as JSON numbers per the contract.
    """
    return _request(
        "PATCH",
        "/api/billing/auto-top-up",
        body={
            "enabled": bool(enabled),
            "threshold": float(threshold),
            "topUpAmount": float(top_up_amount),
        },
        timeout=timeout,
    )


def post_charge(
    *,
    amount_usd: float | str,
    idempotency_key: str,
    timeout: float = DEFAULT_TIMEOUT,
) -> dict[str, Any]:
    """``POST /api/billing/charge`` — buy credits (scope required).

    ``Idempotency-Key`` header is MANDATORY (a missing header is a server 400, not
    a default): generate a UUID per user-confirmed purchase and reuse it on retry.
    Returns ``202 {chargeId}`` — money is NOT confirmed yet; poll with
    :func:`get_charge_status`.
    """
    if not (isinstance(idempotency_key, str) and idempotency_key.strip()):
        raise BillingError(
            "Idempotency-Key is required for a charge.",
            error="idempotency_key_required",
        )
    return _request(
        "POST",
        "/api/billing/charge",
        body={"amountUsd": float(amount_usd)},
        extra_headers={"Idempotency-Key": idempotency_key.strip()},
        timeout=timeout,
    )


def get_charge_status(
    charge_id: str, *, timeout: float = DEFAULT_TIMEOUT
) -> dict[str, Any]:
    """``GET /api/billing/charge/{id}`` — poll a charge (scope required).

    Returns ``{status: "pending"|"settled"|"failed", ...}``. An unknown or foreign
    id returns ``{status:"pending"}`` (never 404, never another org's data) — so a
    ``pending`` that never resolves past the 5-min cap is a *timeout*, not an error.
    """
    if not (isinstance(charge_id, str) and charge_id.strip()):
        raise BillingError("A charge id is required.", error="invalid_charge_id")
    # urllib does not need manual quoting for the opaque ids the server mints, but
    # guard against a stray slash that would change the path shape.
    safe_id = urllib.parse.quote(charge_id.strip(), safe="")
    return _request("GET", f"/api/billing/charge/{safe_id}", timeout=timeout)