hermes-agent/plugins/memory/honcho/oauth.py
Eri Barrett ba9e3a491b
feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335)
* feat(memory): OAuth token storage and refresh for the Honcho provider

* feat(memory): refresh the Honcho OAuth token in the client and session

* feat(memory): zero-CLI loopback OAuth authorization flow

* feat(memory): generic memory-provider OAuth connect endpoints

* feat(desktop): memory-provider OAuth connect link

* feat(memory): CLI OAuth sign-in with source-tagged authorize links

* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link

* fix(memory): profile-scope the memory-provider OAuth endpoints

* refactor(desktop): generic memory-provider OAuth client functions

* docs(memory): trim OAuth module docstrings to the invariants

* docs(memory): document OAuth connect as an optional auth method

* fix(memory): send home-relative display path to consent, not the absolute path

* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read

* fix(memory): log OAuth refresh failures at warning, not debug

* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken

* test(memory): cover the desktop Connect launcher, status, and provider dispatch

* fix(desktop): keep the memory-provider dropdown one size regardless of connect state

* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched

* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router

* refactor(desktop): import MemoryConnect directly, drop the single-export barrel

* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard

* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky

* test(honcho): isolate auth-method prompt from deployment-shape wizard tests

main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.

* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)

README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.

* fix(honcho): default the CLI peer prompt to the OAuth consent name

The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.

* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)

resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.

* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup

status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.

* docs(honcho): drop 'shared pool' wording from unified observation mode help

* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation

The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.

* refactor(honcho): one OAuth client (hermes-agent) for all surfaces

Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.
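
The override rule in the last sentence might look like the following minimal sketch. `resolve_client_id` and `DEFAULT_CLIENT_ID` are hypothetical names used only for illustration; the `HONCHO_OAUTH_CLIENT_ID` variable and the `hermes-agent` default are the only parts taken from this commit message.

```python
from __future__ import annotations

import os
from typing import Mapping

# Single client_id shared by the CLI and the desktop app, per this commit.
DEFAULT_CLIENT_ID = "hermes-agent"


def resolve_client_id(env: Mapping[str, str] | None = None) -> str:
    """Return the OAuth client_id, honoring the self-hosting override.

    Hypothetical helper: the real resolution lives in resolve_endpoints,
    whose signature is not shown here.
    """
    env = os.environ if env is None else env
    override = (env.get("HONCHO_OAUTH_CLIENT_ID") or "").strip()
    return override or DEFAULT_CLIENT_ID
```

Passing the environment as a parameter keeps the sketch testable without mutating `os.environ`.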

* fix(honcho): per-session resolves to session_id, never remapped by title

Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
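
The precedence described above can be sketched as a first-match lookup. This is an illustration only: `pick_session_name` and its parameters are hypothetical, the real `resolve_session_name` is not shown in this file, and the relative order of the cwd map and the title is assumed from the order in which the message lists them.

```python
from __future__ import annotations


def pick_session_name(
    *,
    gateway_chat_key: str | None = None,
    session_id: str | None = None,
    cwd_mapped_name: str | None = None,
    title: str | None = None,
) -> str | None:
    """Return the first available name, stable identifiers before labels."""
    # Stable ids come first so a label can never remap a live conversation.
    for candidate in (gateway_chat_key, session_id, cwd_mapped_name, title):
        if candidate:
            return candidate
    return None
```

With this ordering, a conversation that has a `session_id` keeps it even after a title is generated mid-stream.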
2026-06-22 19:16:47 -05:00

371 lines
14 KiB
Python

"""OAuth credential storage and refresh for the Honcho memory provider.

An access token authenticates exactly like a scoped API key, so it is stored
as the host's ``apiKey``; this module exchanges the refresh token before
expiry to keep it live.

Refresh tokens rotate with single-use reuse detection: a replayed stale token
revokes the whole grant. So every refresh must persist the rotated token
atomically and be serialized, and a failed refresh never raises into the
agent (stale token stays; the fail-open path absorbs the eventual 401).
"""

from __future__ import annotations

import json
import logging
import os
import threading
import time
from contextlib import contextmanager
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable

logger = logging.getLogger(__name__)

ACCESS_TOKEN_PREFIX = "hch-at-"
REFRESH_TOKEN_PREFIX = "hch-rt-"

# Refresh this many seconds before the access token actually expires, so an
# in-flight request never races the expiry boundary.
_REFRESH_SKEW_SECONDS = 120

# Default HTTP timeout for the token exchange. Kept short: the refresh happens
# on the path to a memory call, and a stalled auth server must not hang it.
_REFRESH_TIMEOUT_SECONDS = 15.0

# Serializes refresh across threads sharing one process's config. Re-checked
# under the lock (double-checked) so racing callers don't replay a rotated
# refresh token and trip reuse detection.
_refresh_lock = threading.Lock()


@contextmanager
def _config_refresh_lock(path: Path):
    """Machine-wide advisory lock around read-refresh-persist.

    The in-process ``_refresh_lock`` can't stop a second process (a sibling
    Hermes profile or the desktop app sharing this honcho.json) from replaying
    the single-use refresh token and tripping reuse-detection, which revokes
    the whole grant. An OS file lock on ``<config>.lock`` serializes rotation
    across processes; best-effort, so a platform without flock degrades to
    in-process serialization only.
    """
    lock_path = Path(f"{path}.lock")
    fh = None
    try:
        lock_path.parent.mkdir(parents=True, exist_ok=True)
        fh = open(lock_path, "a+b")
        if os.name == "nt":
            import msvcrt

            fh.seek(0)
            msvcrt.locking(fh.fileno(), msvcrt.LK_LOCK, 1)
        else:
            import fcntl

            fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
    except Exception:
        logger.debug("Honcho OAuth cross-process lock unavailable; in-process only", exc_info=True)
        if fh is not None:
            fh.close()
            fh = None
    try:
        yield
    finally:
        if fh is not None:
            try:
                if os.name == "nt":
                    import msvcrt

                    fh.seek(0)
                    msvcrt.locking(fh.fileno(), msvcrt.LK_UNLCK, 1)
                else:
                    import fcntl

                    fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
            except Exception:
                pass
            fh.close()


# In-memory expiry cache keyed by (config path, host) → (expires_at, access).
# Lets the hot path (every memory access calls this) skip the honcho.json read
# while the token is comfortably live; disk is only touched near expiry, on a
# cache miss, or when an explicit ``raw`` is supplied. Single-key dict ops are
# atomic under the GIL, so no separate lock is needed. An access token stays
# valid until its own expiry regardless of out-of-band rotation, so a stale
# cache entry can't break auth; it just defers picking up external changes
# until the token nears expiry and disk is read again.
_expiry_cache: dict[tuple[str, str], tuple[float, str]] = {}


def is_oauth_access_token(value: str | None) -> bool:
    """True when ``value`` is an OAuth access token (vs a static API key)."""
    # isinstance guard: a non-string value read from JSON (e.g. a number)
    # yields False instead of raising AttributeError on ``startswith``.
    return isinstance(value, str) and value.startswith(ACCESS_TOKEN_PREFIX)


@dataclass
class OAuthCredential:
    """An OAuth grant as stored in a honcho.json host block.

    ``access_token`` mirrors the host's ``apiKey``; the remaining fields live in
    the host's ``oauth`` sub-block. ``expires_at`` is absolute epoch seconds.
    """

    access_token: str
    refresh_token: str
    expires_at: float
    client_id: str
    token_endpoint: str
    scope: str = "write"
    token_type: str = "Bearer"
    # Transient consent peer name: set only on a fresh grant, never persisted.
    consent_peer_name: str | None = None

    @classmethod
    def from_host_block(cls, block: dict[str, Any]) -> "OAuthCredential | None":
        """Build a credential from a honcho.json host block, or None if incomplete."""
        oauth = block.get("oauth")
        access = block.get("apiKey")
        if not isinstance(oauth, dict) or not is_oauth_access_token(access):
            return None
        refresh = oauth.get("refreshToken")
        endpoint = oauth.get("tokenEndpoint")
        client_id = oauth.get("clientId")
        if not (refresh and endpoint and client_id):
            return None
        try:
            expires_at = float(oauth.get("expiresAt", 0))
        except (TypeError, ValueError):
            expires_at = 0.0
        return cls(
            access_token=access,
            refresh_token=str(refresh),
            expires_at=expires_at,
            client_id=str(client_id),
            token_endpoint=str(endpoint),
            scope=str(oauth.get("scope", "write")),
            token_type=str(oauth.get("tokenType", "Bearer")),
        )

    def oauth_block(self) -> dict[str, Any]:
        """The ``oauth`` sub-block to persist (the access token lives in apiKey)."""
        return {
            "refreshToken": self.refresh_token,
            "expiresAt": int(self.expires_at),
            "clientId": self.client_id,
            "tokenEndpoint": self.token_endpoint,
            "scope": self.scope,
            "tokenType": self.token_type,
        }

    def is_expired(self, *, now: float, skew: float = _REFRESH_SKEW_SECONDS) -> bool:
        """True when the access token is within ``skew`` seconds of expiry."""
        return now >= (self.expires_at - skew)


# Indirection so tests can drive the exchange without a live server.
def _http_post_form(url: str, data: dict[str, str], timeout: float) -> dict[str, Any]:
    """POST form-encoded ``data`` to ``url`` and return the parsed JSON body."""
    import httpx

    resp = httpx.post(url, data=data, timeout=timeout)
    resp.raise_for_status()
    return resp.json()


def _exchange_refresh_token(cred: OAuthCredential, *, now: float) -> OAuthCredential:
    """Run the refresh_token grant and return the rotated credential.

    Raises on any transport/protocol failure; callers fail open.
    """
    body = _http_post_form(
        cred.token_endpoint,
        {
            "grant_type": "refresh_token",
            "client_id": cred.client_id,
            "refresh_token": cred.refresh_token,
        },
        _REFRESH_TIMEOUT_SECONDS,
    )
    access = body.get("access_token")
    refresh = body.get("refresh_token")
    if not is_oauth_access_token(access) or not refresh:
        raise ValueError("refresh response missing access_token/refresh_token")
    try:
        expires_in = int(body.get("expires_in", 0))
    except (TypeError, ValueError):
        expires_in = 0
    return OAuthCredential(
        access_token=access,
        refresh_token=str(refresh),
        expires_at=now + expires_in,
        client_id=cred.client_id,
        token_endpoint=cred.token_endpoint,
        scope=str(body.get("scope", cred.scope)),
        token_type=str(body.get("token_type", cred.token_type)),
    )


def _read_config(path: Path) -> dict[str, Any]:
    """Load the config JSON, or return ``{}`` if the file is unreadable or malformed."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}


def _atomic_write_config(path: Path, raw: dict[str, Any]) -> None:
    """Write ``raw`` to ``path`` atomically, preserving 0600 on the new file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(f".{path.name}.tmp")
    text = json.dumps(raw, indent=2) + "\n"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
    except Exception:
        tmp.unlink(missing_ok=True)
        raise
    os.replace(tmp, path)


def _deep_merge(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge ``overlay`` into ``base`` (overlay wins on scalars/lists)."""
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _deep_merge(base[key], value)
        else:
            base[key] = value
    return base


def _persist_credential(path: Path, host: str, cred: OAuthCredential) -> None:
    """Persist ``cred`` into ``host``'s block (apiKey + oauth), leaving all else intact."""
    raw = _read_config(path)
    hosts = raw.setdefault("hosts", {})
    block = hosts.setdefault(host, {})
    block["apiKey"] = cred.access_token
    block["oauth"] = cred.oauth_block()
    _atomic_write_config(path, raw)
    _expiry_cache[(str(path), host)] = (cred.expires_at, cred.access_token)


def ensure_fresh_token(
    path: Path,
    host: str,
    raw: dict[str, Any] | None = None,
    *,
    now: float | None = None,
) -> tuple[str | None, bool]:
    """Return ``(access_token, refreshed)`` for ``host``, refreshing if near expiry.

    Returns ``(None, False)`` when the host has no OAuth credential (e.g. a plain
    API key) so callers leave the existing token untouched. Refresh failures are
    swallowed: the current (possibly stale) token is returned with
    ``refreshed=False`` and the fail-open path handles any resulting 401.
    """
    now = time.time() if now is None else now
    key = (str(path), host)

    # Hot path: trust the cached expiry while the token is well clear of the
    # skew window, so no disk read. Bypassed when an explicit ``raw`` is supplied.
    if raw is None:
        cached = _expiry_cache.get(key)
        if cached is not None and now < cached[0] - _REFRESH_SKEW_SECONDS:
            return cached[1], False

    source = raw if raw is not None else _read_config(path)
    block = (source.get("hosts") or {}).get(host) or {}
    cred = OAuthCredential.from_host_block(block)
    if cred is None:
        _expiry_cache.pop(key, None)
        return None, False
    _expiry_cache[key] = (cred.expires_at, cred.access_token)
    if not cred.is_expired(now=now):
        return cred.access_token, False

    with _refresh_lock, _config_refresh_lock(path):
        # Re-read under both locks: another thread or process may have just
        # rotated the token; adopt theirs instead of replaying the old one.
        fresh_block = (_read_config(path).get("hosts") or {}).get(host) or {}
        current = OAuthCredential.from_host_block(fresh_block) or cred
        if not current.is_expired(now=now):
            return current.access_token, current.access_token != cred.access_token
        try:
            rotated = _exchange_refresh_token(current, now=now)
        except Exception as exc:
            logger.warning("Honcho OAuth refresh failed for host %s: %s", host, exc)
            return current.access_token, False
        _persist_credential(path, host, rotated)
        logger.info("Honcho OAuth token refreshed for host %s", host)
        return rotated.access_token, True


def install_grant(
    path: Path,
    host: str,
    grant: dict[str, Any],
    *,
    client_id: str,
    token_endpoint: str,
    apply_config: bool = True,
    now: float | None = None,
) -> OAuthCredential:
    """Apply a fresh OAuth grant to ``path`` for ``host``.

    Deep-merges the grant's ``config`` (the manifest default_config) into the
    file root (preserving other hosts and root keys), then writes the host's
    ``apiKey`` and ``oauth`` block. ``grant`` is an OAuthTokenResponse dict
    (access_token, refresh_token, expires_in, scope, config).
    ``apply_config=False`` skips the config merge and stores tokens only.
    """
    now = time.time() if now is None else now
    access = grant.get("access_token")
    refresh = grant.get("refresh_token")
    if not is_oauth_access_token(access) or not refresh:
        raise ValueError("grant missing access_token/refresh_token")
    try:
        expires_in = int(grant.get("expires_in", 0))
    except (TypeError, ValueError):
        expires_in = 0
    cred = OAuthCredential(
        access_token=access,
        refresh_token=str(refresh),
        expires_at=now + expires_in,
        client_id=client_id,
        token_endpoint=token_endpoint,
        scope=str(grant.get("scope", "write")),
        token_type=str(grant.get("token_type", "Bearer")),
    )
    raw = _read_config(path)
    granted_config = grant.get("config")
    if isinstance(granted_config, dict):
        cred.consent_peer_name = granted_config.get("peerName")
        if apply_config:
            _deep_merge(raw, granted_config)
    _expiry_cache[(str(path), host)] = (cred.expires_at, cred.access_token)
    hosts = raw.setdefault("hosts", {})
    block = hosts.setdefault(host, {})
    block["apiKey"] = cred.access_token
    block["oauth"] = cred.oauth_block()
    _atomic_write_config(path, raw)
    return cred


def apply_token_to_client(client: Any, token: str) -> bool:
    """Rotate the live Honcho client's Bearer in place. Returns success.

    The SDK builds its auth header per request from the HTTP client's
    ``api_key``, so mutating it rotates every holder of the singleton without a
    rebuild. Guarded: an SDK shape change degrades to False and the caller can
    fall back to resetting the client.
    """
    http = getattr(client, "_http", None)
    if http is None or not hasattr(http, "api_key"):
        return False
    http.api_key = token
    return True
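
The double-checked refresh that `ensure_fresh_token` relies on can be illustrated with a toy simulation. Everything in this sketch is hypothetical (`FakeAuthServer`, `TokenStore`, `ensure_fresh`, `run_race`): it uses an in-memory store instead of honcho.json and a fake endpoint instead of the real token exchange, and is meant only to show why re-checking under the lock keeps a single-use refresh token from being replayed.

```python
from __future__ import annotations

import threading


class FakeAuthServer:
    """Toy token endpoint: refresh tokens are single-use; a replay revokes the grant."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.valid_refresh = "rt-0"
        self.rotations = 0
        self.revoked = False

    def refresh(self, refresh_token: str) -> tuple[str, str]:
        with self._lock:
            if self.revoked or refresh_token != self.valid_refresh:
                self.revoked = True  # reuse detection: the whole grant is gone
                raise PermissionError("grant revoked")
            self.rotations += 1
            self.valid_refresh = f"rt-{self.rotations}"
            return f"at-{self.rotations}", self.valid_refresh


class TokenStore:
    """In-memory stand-in for the persisted host block."""

    def __init__(self) -> None:
        self.access, self.refresh, self.expired = "at-0", "rt-0", True
        self.lock = threading.Lock()


def ensure_fresh(store: TokenStore, server: FakeAuthServer) -> str:
    if not store.expired:
        return store.access
    with store.lock:
        # Re-check under the lock: a racing caller may already have rotated.
        if not store.expired:
            return store.access
        access, refresh = server.refresh(store.refresh)
        store.access, store.refresh = access, refresh
        store.expired = False  # published last, after the new tokens are in place
        return access


def run_race(callers: int = 8) -> tuple[FakeAuthServer, list[str]]:
    """Start ``callers`` threads that all try to refresh the same expired token."""
    server, store = FakeAuthServer(), TokenStore()
    results: list[str] = []
    threads = [
        threading.Thread(target=lambda: results.append(ensure_fresh(store, server)))
        for _ in range(callers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server, results
```

Calling `run_race()` should leave the fake server with exactly one rotation and no revocation, with every caller holding the same rotated access token. Dropping the second `if not store.expired` check would let waiting callers replay `rt-0` and revoke the grant.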