* feat(memory): OAuth token storage and refresh for the Honcho provider
* feat(memory): refresh the Honcho OAuth token in the client and session
* feat(memory): zero-CLI loopback OAuth authorization flow
* feat(memory): generic memory-provider OAuth connect endpoints
* feat(desktop): memory-provider OAuth connect link
* feat(memory): CLI OAuth sign-in with source-tagged authorize links
* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link
* fix(memory): profile-scope the memory-provider OAuth endpoints
* refactor(desktop): generic memory-provider OAuth client functions
* docs(memory): trim OAuth module docstrings to the invariants
* docs(memory): document OAuth connect as an optional auth method
* fix(memory): send home-relative display path to consent, not the absolute path
* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read
* fix(memory): log OAuth refresh failures at warning, not debug
* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken
* test(memory): cover the desktop Connect launcher, status, and provider dispatch
* fix(desktop): keep the memory-provider dropdown one size regardless of connect state
* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched
* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router
* refactor(desktop): import MemoryConnect directly, drop the single-export barrel
* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard
* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky
* test(honcho): isolate auth-method prompt from deployment-shape wizard tests
main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.
* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)
README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.
* fix(honcho): default the CLI peer prompt to the OAuth consent name
The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.
* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)
resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.
* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup
status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.
* docs(honcho): drop 'shared pool' wording from unified observation mode help
* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation
The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.
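A minimal sketch of the guarded read-refresh-persist section described above. The helper names (`refresh_lock`, `refresh_if_needed`) and the `load` / `is_expired` / `rotate` / `persist` callables are hypothetical stand-ins, not the plugin's actual API; only the `<config>.lock` path and the flock fallback come from the commit message.

```python
import contextlib
import os
import threading

try:
    import fcntl  # POSIX only
except ImportError:  # platforms without flock, e.g. Windows
    fcntl = None

_process_lock = threading.Lock()  # in-process serialization


@contextlib.contextmanager
def refresh_lock(config_path):
    """Hold <config_path>.lock exclusively while the block runs."""
    with _process_lock:
        if fcntl is None:
            # Best-effort: degrade to in-process serialization only.
            yield
            return
        fd = os.open(config_path + ".lock", os.O_CREAT | os.O_RDWR, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until a sibling releases
            yield
        finally:
            os.close(fd)  # closing the descriptor releases the flock


def refresh_if_needed(config_path, load, is_expired, rotate, persist):
    """Rotate at most once; later callers re-read the persisted token."""
    with refresh_lock(config_path):
        cred = load()  # re-read under the lock: a sibling may have rotated
        if not is_expired(cred):
            return cred
        cred = rotate(cred)  # the single-use refresh token is spent here
        persist(cred)
        return cred
```

The re-read inside the lock is what keeps a waiting process from replaying a refresh token that a sibling already spent.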
* refactor(honcho): one OAuth client (hermes-agent) for all surfaces
Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.
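The single-client resolution can be sketched roughly as follows. `resolve_client_id` and `authorize_query` are hypothetical names, and the `"cli"` / `"desktop"` source values are assumptions; the default `hermes-agent`, the `HONCHO_OAUTH_CLIENT_ID` override, and the `source` query param are taken from the text above.

```python
import os

DEFAULT_CLIENT_ID = "hermes-agent"


def resolve_client_id(env=None):
    """One client_id for every surface; the env var wins for self-hosting."""
    env = os.environ if env is None else env
    override = (env.get("HONCHO_OAUTH_CLIENT_ID") or "").strip()
    return override or DEFAULT_CLIENT_ID


def authorize_query(source, env=None):
    """The surface only changes `source`, never the grant identity."""
    return {"client_id": resolve_client_id(env), "source": source}
```

Because both surfaces send the same `client_id`, a refresh token issued to one can be rotated by the other without a client mismatch.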
* fix(honcho): per-session resolves to session_id, never remapped by title
Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
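The new precedence can be illustrated with a small sketch. The function name and parameters below are hypothetical (the real resolve_session_name signature is not shown in this log); only the ordering is taken from the description above.

```python
from typing import Mapping, Optional


def resolve_session_name_sketch(
    gateway_chat_key: Optional[str] = None,
    session_id: Optional[str] = None,
    cwd: Optional[str] = None,
    cwd_map: Optional[Mapping[str, str]] = None,
    title: Optional[str] = None,
) -> Optional[str]:
    """Stable identifiers win over labels."""
    if gateway_chat_key:          # 1. gateway per-chat key
        return gateway_chat_key
    if session_id:                # 2. per-session session_id
        return session_id
    if cwd and cwd_map and cwd in cwd_map:
        return cwd_map[cwd]       # 3. cwd map
    return title or None          # 4. title, only when nothing stable exists
```

With this order, a title that appears mid-conversation cannot change the result once a session_id is present.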
The single/multi/hybrid 'deployment shape' was a misnomer: these keys only
affect the gateway (the one entrypoint supplying a runtime user ID), and the
three preset names stamped a lossy taxonomy onto three orthogonal knobs while
hiding which keys got written.
Replace it with an intent-led tree gated on gateway detection:
- _gateway_platforms() lazily inspects the gateway config (best-effort, no
hard dependency); the step auto-skips when no platform is connected.
- 'who talks to this?' → just me / me+others (pooled?) / only others, deriving
pinUserPeer + userPeerAliases + runtimePeerPrefix and echoing the result.
- [e] drops to a raw-knob editor for power users.
- The single→multi orphan guard survives as a pooling steer.
The setup wizard wrote the legacy pinPeerName even though pinUserPeer is
the canonical key that outranks it in the resolver — so it had to scrub
the canonical key afterward to stop it winning. Write pinUserPeer directly
and migrate any legacy pinPeerName onto it on touch (setup load + clone),
which removes the precedence-fighting entirely.
Resolver still reads pinPeerName as a back-compat alias; that's deferred.
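A rough sketch of the migrate-on-touch step, assuming the config block is a plain dict. `migrate_pin_key` is a hypothetical helper name, and keeping an already-present pinUserPeer over the legacy value is an assumption based on the canonical key outranking the alias.

```python
def migrate_pin_key(block):
    """Move a legacy pinPeerName onto pinUserPeer without mutating the input."""
    out = dict(block)
    if "pinPeerName" in out:
        legacy = out.pop("pinPeerName")
        # Assumption: an existing canonical value is kept as-is.
        out.setdefault("pinUserPeer", legacy)
    return out
```

After migration only the canonical key remains in the written block, so there is nothing left to scrub.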
Self-hosted Honcho setup had four sharp edges:
- local/cloud URLs ending in /vN double-prefixed by the SDK (/v3/v3/... 404)
- authenticated local servers had no setup prompt for a JWT/bearer token
- profile-derived host keys could be dot-containing workspace IDs Honcho rejects
- memory-provider config files with API keys written world-readable per umask
This keeps existing behavior but makes those paths safer:
- strip a trailing /vN version segment from any configured baseUrl before SDK
init (the SDK's route builders always prepend their own version prefix);
auth-skipping stays loopback-only
- add an optional local JWT/bearer prompt in honcho setup, stored under
hosts.<host>.apiKey
- derive new profile host keys with underscores, still reading legacy
hermes.<profile> blocks
- write memory-provider config files atomically with 0600 via a shared
utils.atomic_json_write(mode=) arg (honcho/hindsight/mem0/supermemory)
- skip honcho.json parsing in gateway cache-busting unless Honcho is the active
memory provider; memoize by honcho.json mtime when active
- bust the gateway agent cache on memory.provider change
- add a hermes memory setup <provider> one-liner so fresh installs can configure
a named provider without the picker (the per-provider hermes <provider>
subcommand only registers once that provider is active)
Closes #20688, #29885, #26459, #30246, #33382, #32244.
Co-authored-by: BROCCOLO1D
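Two of the fixes in the self-hosted setup commit above (stripping a trailing /vN segment, and writing config files atomically with 0600) can be sketched as below. Both function names are hypothetical and this is not the project's utils.atomic_json_write; it only illustrates the technique under the assumption of a POSIX-style filesystem.

```python
import json
import os
import re
import tempfile
from urllib.parse import urlsplit, urlunsplit


def strip_version_suffix(base_url):
    """Drop a trailing /vN path segment so the SDK's own prefix is not doubled."""
    parts = urlsplit(base_url)
    path = re.sub(r"/v\d+/?$", "", parts.path)
    return urlunsplit(parts._replace(path=path))


def atomic_json_write_sketch(path, data, mode=0o600):
    """Write JSON to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.chmod(tmp, mode)    # explicit mode, independent of the umask
        os.replace(tmp, path)  # atomic when tmp and path share a filesystem
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```

Operating on the parsed path rather than the raw string avoids touching a host that happens to be named like a version segment.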
Three related regressions stemming from the pinUserPeer alias landing:
- Setup wizard read host-only fields when detecting current shape but the
parser supports root-level config and gives host pinUserPeer higher
precedence than pinPeerName. Re-running setup could mis-detect shape
and silently flip routing. Detection now uses the same resolver order
as HonchoClientConfig, and each shape branch scrubs every peer-mapping
key before writing so a stale pinUserPeer=false can't outrank a freshly
written pinPeerName=true. Multi no longer auto-writes
userPeerAliases={} (was silently masking root-level baselines).
- clone_honcho_for_profile inherited pinPeerName but not pinUserPeer, so
a default profile configured with the newer key produced cloned
profiles without the pin.
- Gateway cache-busting signature fingerprinted Honcho user-peer fields
but not ai_peer. Since HonchoSessionManager freezes cfg.ai_peer at
init, mid-flight aiPeer edits kept assistant writes on the old peer
until an unrelated cache eviction. ai_peer is now part of the
signature.
Remove "PR #14984 / #27371 / #1969" references and "the original key /
legacy / backwards-compatible / Port #N" narration from the honcho
plugin README, tests, and one stale code comment. These artefacts age
poorly: they describe how a change happened rather than what the code
does today, and they tax readers who weren't around for the original
work.
Also drop a dangling reference to scratch/memory-plugin-ux-specs.md in
__init__.py — the file isn't in the repo or git history.
No behaviour change.
Three correctness gaps when honcho.json's identity-mapping config changes
mid-flight:
1. The gateway's agent cache signature ignored honcho identity keys, so
editing peerName / pinPeerName / userPeerAliases / runtimePeerPrefix
was silently dropped until an unrelated cache eviction. Extend
_extract_cache_busting_config to fingerprint the resolved honcho
config so the AIAgent rebuilds on the next message.
2. cmd_setup let single → multi flips orphan the pinned-pool history
under peerName without warning. Detect the transition, warn that
runtime users will resolve to fresh empty peers, and auto-steer to
hybrid (alias the operator's runtime IDs back to peerName) so the
operator's own continuity survives. yes / no overrides available.
3. README didn't document the orphaning behaviour. Add a "Migrating
single → multi" callout under Deployment shapes.
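Item 1 can be sketched as a stable fingerprint over the identity keys. `honcho_identity_signature` is a hypothetical name and the hashing scheme is an assumption; the real _extract_cache_busting_config may fold these values into a larger signature differently.

```python
import hashlib
import json

IDENTITY_KEYS = ("peerName", "pinPeerName", "userPeerAliases", "runtimePeerPrefix")


def honcho_identity_signature(resolved_config):
    """Return a digest that changes whenever any identity key changes."""
    subset = {key: resolved_config.get(key) for key in IDENTITY_KEYS}
    # sort_keys makes the digest independent of dict insertion order
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A cache keyed on this digest rebuilds the agent on the next message after an edit and ignores keys outside the identity set.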
Tests:
- TestPinTransition (test_pin_peer_name.py): fresh-manager flip resolves
to runtime, in-process flip is gated by the per-key session cache
(documents the gateway-cache-must-bust contract), 3 cache-bust
signature tests for pin / aliases / prefix.
- TestProfilePeerUniqueness: two profiles pinned to distinct peerNames
resolve to distinct peers; host-level peerName overrides root when
pinned.
- test_single_to_multi_steers_to_hybrid_by_default and
test_single_to_multi_yes_override_keeps_multi (test_cli.py): wizard
guard end-to-end coverage.
The PR #27371 resolver introduced three identity-mapping config keys
(pinPeerName, userPeerAliases, runtimePeerPrefix), but operators had
no guided way to set them — they had to read the README, understand
the resolver ladder, and hand-edit honcho.json. This commit adds an
interactive step to 'hermes honcho setup' that asks one question
('what's your deployment shape?') and writes the right combination
of keys.
Three shapes cover the realistic deployments:
* single -- pinPeerName=true. All gateway users collapse to your
peerName. Recommended for personal/single-operator use.
* multi -- pinPeerName=false, no aliases. Each runtime user gets
their own peer. Optional runtimePeerPrefix for cross-
platform namespace isolation.
* hybrid -- pinPeerName=false, with userPeerAliases mapping YOUR
runtime IDs (Telegram UID, Discord snowflake, Slack
user, Matrix MXID) to peerName. Multi-user gateway
where you are a privileged operator.
A 'skip' option leaves existing identity-mapping config untouched —
critical because re-running setup must not silently wipe operator-
curated aliases.
The wizard detects the current shape from existing config so the
prompt's default matches what the operator already has.
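The shape-to-keys mapping described above could look roughly like this. `keys_for_shape` and its parameters are hypothetical; the key names and values per shape follow the three bullets, and 'skip' returns nothing to write.

```python
def keys_for_shape(shape, peer_name=None, runtime_ids=(), prefix=None):
    """Return the identity-mapping keys the wizard would write for a shape."""
    if shape == "skip":
        return {}  # leave existing identity-mapping config untouched
    if shape == "single":
        return {"pinPeerName": True}
    if shape == "multi":
        keys = {"pinPeerName": False}
        if prefix:
            keys["runtimePeerPrefix"] = prefix  # optional namespace isolation
        return keys
    if shape == "hybrid":
        # Alias the operator's own runtime IDs back to peerName.
        aliases = {runtime_id: peer_name for runtime_id in runtime_ids}
        return {"pinPeerName": False, "userPeerAliases": aliases}
    raise ValueError(f"unknown deployment shape: {shape!r}")
```

For example, `keys_for_shape("hybrid", "alice", ["tg:1001"])` yields a block that keeps other gateway users on their own peers while the operator's Telegram ID resolves to `alice`.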
PR #27371 added host-scoped userPeerAliases, runtimePeerPrefix, and
pinPeerName, but the cloned-profile allowlist in
plugins/memory/honcho/cli.py::clone_honcho_for_profile() omitted them.
A new profile created via 'hermes honcho setup' or similar would
silently drop the operator's identity-mapping config, causing gateway
users to resolve to raw runtime IDs and fragmenting Honcho memory
across an unintended set of peers.
Add the three keys to the allowlist and a regression test class
covering all three plus the unset case.
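An allowlist-based clone of the kind described here can be sketched as follows. `clone_identity_keys` is a hypothetical helper, not the body of clone_honcho_for_profile(); it only shows why a key missing from the allowlist is silently dropped.

```python
import copy

IDENTITY_KEYS = ("userPeerAliases", "runtimePeerPrefix", "pinPeerName")


def clone_identity_keys(source_block, allowlist=IDENTITY_KEYS):
    """Copy only allowlisted keys that are actually set in the source block."""
    return {
        key: copy.deepcopy(source_block[key])  # avoid sharing nested dicts
        for key in allowlist
        if key in source_block
    }
```

Unset keys stay unset in the clone, which is the fourth case the regression tests cover.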
The scheme-validation commit (e77a3f2c) was too strict: a user with
legacy 'baseUrl: localhost:8000' (no 'http://' prefix) in their
'~/.honcho/config.json' would get 'No API key configured' from the
CLI after that change, even though their setup worked before.
urlparse on a schemeless host:port treats the host segment as the
scheme and leaves netloc empty, so the http/https check rejected it.
Falls back to a lenient check for schemeless strings that look like
hosts: contain '.' or ':', aren't a boolean/null literal, aren't pure
digits. The SDK still rejects truly malformed URLs at connect time
with a clearer error than ours.
Three new tests: legacy schemeless hosts accepted; obvious garbage
literals ('true', 'null', '12345') still rejected. Reviewer
noted concern #1: schemeless regression for self-hosters with old
configs.
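The strict check plus the lenient schemeless fallback can be sketched as below. `looks_like_base_url` is a hypothetical name and the exact literal set is an assumption; the rules (http/https with a netloc, or a schemeless string containing '.' or ':' that is not a boolean/null literal or pure digits) follow the description above. The sketch deliberately does not rely on how urlparse splits a schemeless host:port, since that has varied between Python versions.

```python
from urllib.parse import urlparse

_LITERALS = {"true", "false", "null"}


def looks_like_base_url(value):
    """Accept http(s) URLs and legacy schemeless hosts such as localhost:8000."""
    if not isinstance(value, str):
        return False
    text = value.strip()
    if not text:
        return False
    try:
        parsed = urlparse(text)
    except ValueError:
        return False
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return True
    if "://" in text:
        return False  # explicit scheme that is not usable http/https
    if text.lower() in _LITERALS or text.isdigit():
        return False  # obvious garbage literals
    return "." in text or ":" in text  # lenient: looks like a host
```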
Two small follow-ups to the PR review:
- Hoist hashlib import from _enforce_session_id_limit() to module top.
stdlib imports are free after first cache, but keeping all imports at
module top matches the rest of the codebase.
- _resolve_api_key now URL-parses baseUrl and requires http/https +
non-empty netloc before returning the 'local' sentinel. A typo like
baseUrl: 'true' (or bare 'localhost') no longer silently passes the
credential guard; the CLI correctly reports 'not configured'.
Three new tests cover the new validation (garbage strings, non-http
schemes, valid https).
_resolve_api_key() only checks for apiKey / HONCHO_API_KEY, so all
CLI subcommands (identity --show, status, migrate, etc.) bail with
"No API key configured" on self-hosted instances that use baseUrl
without an API key.
Return "local" when baseUrl or HONCHO_BASE_URL is set, matching the
client.py behavior that already handles this case for the SDK.
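In outline, the fixed lookup behaves like the sketch below. `resolve_api_key_sketch` takes the config and environment as plain mappings, which is an assumption made for illustration; the real _resolve_api_key() reads them from wherever the CLI stores them.

```python
def resolve_api_key_sketch(config, env):
    """Return the API key, 'local' for keyless self-hosted setups, or None."""
    key = config.get("apiKey") or env.get("HONCHO_API_KEY")
    if key:
        return key
    if config.get("baseUrl") or env.get("HONCHO_BASE_URL"):
        return "local"  # sentinel: self-hosted instance without an API key
    return None  # caller reports "No API key configured"
```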
Tested on: macOS, self-hosted Honcho (Docker, localhost:8000).
Hardens the dialectic lifecycle against three failure modes that could
leave the prefetch pipeline stuck or injecting stale content:
- Stale-thread watchdog: _thread_is_live() treats any prefetch thread
older than timeout × 2.0 as dead. A hung Honcho call can no longer
block subsequent fires indefinitely.
- Stale-result discard: pending _prefetch_result is tagged with its
fire turn. prefetch() discards the result if more than cadence × 2
turns passed before a consumer read it (e.g. a run of trivial-prompt
turns between fire and read).
- Empty-streak backoff: consecutive empty dialectic returns widen the
effective cadence (dialectic_cadence + streak, capped at cadence × 8).
A healthy fire resets the streak. Prevents the plugin from hammering
the backend every turn when the peer graph is cold.
- liveness_snapshot() on the provider exposes current turn, last fire,
pending fire-at, empty streak, effective cadence, and thread status
for in-process diagnostics.
- system_prompt_block: nudge the model that honcho_reasoning accepts
reasoning_level minimal/low/medium/high/max per call.
- hermes honcho status: surface base reasoning level, cap, and heuristic
toggle so config drift is visible at a glance.
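The three liveness rules reduce to small pure functions, sketched here with hypothetical names and signatures. The multipliers (timeout × 2.0, cadence × 2, cap at cadence × 8) are the ones stated above; the real _thread_is_live() presumably also consults the thread object itself, which this sketch omits.

```python
import time


def thread_is_live(started_at, timeout, now=None):
    """A prefetch thread older than timeout * 2.0 is treated as dead."""
    now = time.monotonic() if now is None else now
    return (now - started_at) <= timeout * 2.0


def result_is_stale(fired_turn, current_turn, cadence):
    """Discard a pending result if more than cadence * 2 turns have passed."""
    return (current_turn - fired_turn) > cadence * 2


def effective_cadence(cadence, empty_streak):
    """Widen the cadence by the empty streak, capped at cadence * 8."""
    return min(cadence + empty_streak, cadence * 8)
```

With a base cadence of 2, three consecutive empty returns give an effective cadence of 5, and the value never exceeds 16 however long the streak grows; a healthy fire resets the streak to 0 and the cadence back to 2.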
Tests: 550 passed.
- TestDialecticLiveness (8 tests): stale-thread recovery, stale-result
discard, fresh-result retention, backoff widening, backoff ceiling,
streak reset on success, streak increment on empty, snapshot shape.
- Existing TestDialecticCadenceAdvancesOnSuccess::test_in_flight_thread_is_not_stacked
updated to set _prefetch_thread_started_at so it tests the
fresh-thread-blocks branch (stale path covered separately).
- test_cli TestCmdStatus fake updated with the new config attrs surfaced
in the status block.