Make verification closure the default coding behavior after landed file edits while keeping bounded retries and config/env switches for users who need to disable it.
Continuable cron jobs (attach_to_session / cron.mirror_delivery, default
OFF) now prefer a dedicated thread on thread-capable platforms, falling
back to origin-DM mirroring where threads don't exist.
- Thread-capable (Telegram topics, Discord/Slack threads): open a fresh
thread for the job via the shipped adapter.create_handoff_thread,
route the brief into it, and seed the thread-keyed session so the
user's in-thread reply continues with full context. This is the
'continuable cron opens its own thread' interface.
- DM-only (WhatsApp/Signal/SMS): create_handoff_thread returns None ->
fall back to mirroring into the origin DM session (existing behaviour).
Reuses existing infrastructure end-to-end — no new adapter surface, no
provider-chain signature change:
- adapter.create_handoff_thread (already implemented per-platform,
returns None on unsupported platforms = the fallback signal)
- the live SessionStore via adapter._session_store (already set on every
adapter), reached without threading a new param through the frozen
CronScheduler.start() contract
- gateway.mirror.mirror_to_session for the seed/append
- existing per-target delivery routing carries the new thread_id for free
Mirrors GatewayRunner._process_handoff's open-thread-or-fallback +
seed pattern, standalone for the cron delivery path. thread_seeded
guards against a double-mirror after seeding. Scoped to the origin
target only; fan-out/broadcast targets are never threaded or mirrored.
Config docs updated (cron.mirror_delivery) + cronjob tool
attach_to_session description reframed around continuable/thread-preferred.
Tests: +5 (thread id returned on thread platform; None on DM platform;
None without capability/loop; seed creates thread session + mirrors;
seed no-op on empty). 22/22 in TestCronDeliveryMirror; 532 cron tests
pass (4 failures pre-existing: croniter-not-installed + TZ).
Adds an opt-in path so a cron job's delivered output is also appended to
the TARGET chat's gateway session transcript (as an assistant turn), so a
user reply to a recurring delivery (daily brief, reminder) is answered with
the delivery in context instead of 'what is that?' amnesia.
- Reuses the shipped gateway.mirror.mirror_to_session — the same primitive
interactive send_message mirroring already uses. No messaging-toolset
change (cron still can't call send_message; this rides delivery).
- Gated: per-job attach_to_session overrides global cron.mirror_delivery
(config.yaml). Default OFF — historical isolation preserved byte-for-byte.
- Mirrors the CLEAN agent output, not the cron header/footer wrapper.
- Alternation/cache-safe: append lands at a turn boundary, never mid-loop,
never mutates the cached system prompt. Cold-start (no target session)
is a silent no-op; mirror errors never fail a successful delivery.
- Surfaced on the cronjob tool (attach_to_session) + config schema.
Driven by enterprise cron-as-control-plane use case. 10 new tests; full
cron + cronjob-tool suites pass (600).
The desktop bootstrap (and curl/PowerShell/docker installs) seeded
~/.hermes/SOUL.md with a comment-only scaffold that contained no persona
text. That shadowed the runtime default (_ensure_default_soul_md ->
DEFAULT_SOUL_MD), since seeding is guarded by 'if SOUL.md doesn't exist'.
Result: every fresh installer install got the empty template instead of
the documented Hermes persona; desktop just made it visible in onboarding.
- install.sh / install.ps1 / docker/SOUL.md now write DEFAULT_SOUL_MD.
- _ensure_default_soul_md() upgrades a SOUL.md still matching the known
legacy scaffold in place; customized files (any deviation, incl. a
persona appended below the comment) are never touched.
- Detection normalizes CRLF/BOM so Windows-installer drift still matches.
The gateway-side BEHAVIOUR layer that consumes the relay scale-to-zero
primitives (gateway-gateway Phase 5): the gateway decides it is idle and
drives the relay transport dormant so the platform (Fly autostop:"suspend")
can suspend the now-traffic-idle machine, which wakes on the connector's
wakeUrl poke (decisions.md Q3=C', D1-D13).
- gateway/scale_to_zero.py: pure helpers — scale_to_zero_enabled (the NAS
Labs HERMES_SCALE_TO_ZERO stamp, D11/Q8=A), parse_idle_timeout_seconds
(config.yaml gateway.scale_to_zero.idle_timeout_minutes, D2),
messaging_is_relay_only_or_absent (F6/D1), should_arm (D1/D11/§3.4(1)),
is_idle (D2/D3/F7).
- gateway/run.py: _last_inbound_at clock stamped on user inbound in
_handle_message (F13); the arm-gate + idle predicate + the
_scale_to_zero_watcher dormant sequence (mark draining -> adapter
go_dormant() -> cooldown), started only when armed. Deliberately NOT the
stop path and NOT mark_resume_pending (F12/D13).
- tools/process_registry.py: has_any_active() for the bg-work guard (D3/F7).
- hermes_cli/config.py: gateway.scale_to_zero.idle_timeout_minutes default 5.
Tests: 38 pure-logic + 6 watcher (incl. bg-work regression guard proven RED).
Full relay + scale-to-zero suites: 184 passed. The 20 unrelated failures in
the broader run are PRE-EXISTING on origin/main (custom-provider/tools tests),
confirmed via a pristine baseline worktree.
PR #52151 hardened the runtime-status liveness check to trust a readable
live process command line over stale gateway_state.json argv, so a recycled
PID now owned by an s6 supervisor no longer counts as a running gateway.
That fix is correct but incomplete for the reported symptom: the web
dashboard showed a named profile's gateway green while
`hermes -p <name> gateway status` showed it stopped. Two further issues:
1. Cross-profile PID reuse. In per-profile Docker supervision, one profile's
stale `gateway_state.json` can record a PID the OS later recycled onto a
DIFFERENT profile's live gateway. That PID's command line still
`looks_like_gateway`, so the dead profile was reported running. The
recorded argv has its `-p <name>` selector stripped in-process by
`_apply_profile_override`, so it cannot disambiguate; the live `/proc`
cmdline still carries it. `get_runtime_status_running_pid` now accepts an
`expected_home` and validates the live command line belongs to THAT
profile (mirroring `hermes_cli.gateway._matches_current_profile`, the
logic the CLI scan path already uses — which is why the CLI was correct).
`_check_gateway_running` passes the enumerated profile dir.
2. The existing regression test `test_gateway_running_check_falls_back_to_
runtime_state` used the live pytest PID with a gateway-shaped record; once
the live cmdline became authoritative it no longer looked like a gateway.
Updated to mock the live cmdline to the real separate-process scenario it
describes.
The active-profile path (`get_running_pid`) is intentionally left unscoped:
it is lock-verified and any live gateway cmdline is acceptable there. Multiplex
mode is unaffected — `running` state is only ever written to a gateway's own
home, never a secondary served profile's.
Adds coverage for: cross-profile PID reuse (named + default), matching
profile cmdline (`-p`, `--profile`, explicit HERMES_HOME=), the bare default
gateway, and the unreadable-cmdline cross-platform fallback. Each new
cross-profile assertion fails without the profile scope and passes with it.
Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>
On macOS, the desktop updater's stage 1 (hermes update --gateway) ends by
restarting running gateways. launchd_restart() SIGTERMs the gateway and
silently waits up to agent.restart_drain_timeout (default 180s) for the
drain; the manual profile-gateway loop waits its drain budget per gateway
the same way. Neither path prints anything before the wait, so the desktop
updater's live output goes dead for minutes right after '✓ Update
complete!' — users read it as a hung update and force-kill their gateway
processes to make it move (#44515). The systemd branch already announces
its drain ('draining (up to Ns)...'); launchd and the manual loop did not.
Print the stop/drain (with PID and budget) before the wait in both paths,
mirroring the systemd branch, and assert the message in the existing
launchd drain test.
Fixes#44515
Ship the final pet-generation UX polish (provider picker behavior, step-2 cancel flow, banner integration, and visual consistency) and make saturated-chroma background removal C-op driven so hatch processing no longer hammers the machine during long runs.
- pet.generate / pet.hatch (parallel rows, off the reader thread) +
cooperative pet.cancel; pet.export / pet.rename.
- pet.gallery localOnly fast path + background manifest prefetch so the
picker never blocks on petdex; rename follows the active-pet config.
- gateway request gains optional timeout + AbortSignal for real Stop.
When the current provider is a custom endpoint (custom or custom:*), the model
switch pipeline must NOT auto-switch to a native provider/OpenRouter based on a
static-catalog match. The user explicitly configured their own endpoint and the
same model name may be served there; silently rewriting model.provider destroys
their config.
- detect_static_provider_for_model(): skip the static-catalog scan when the
current provider is custom/custom:*
- switch_model() Step e: extend is_custom to cover custom:* so the
detect_provider_for_model() last-resort fallback cannot fire
Salvaged from #48351 by Elshayib (authorship preserved).
Fixes#48305
Completes the #45006 fix. PR-base commit (configured-provider routing) handles
the case where a typed model IS declared in user/custom provider config. This
commit closes the other root: when a typed model is NOT in any config and the
current provider is a soft-accepting one (openai-codex / xai-oauth), the
hidden-model soft-accept (#16172 / #19729) would accept ANY unknown name as a
hidden model — so `qwen3.5-4b` typed on a Codex-default session "succeeded" and
mislabeled the provider as "OpenAI Codex" (the exact reported symptom), then
400'd on the next turn.
Gate the soft-accept to slugs that plausibly belong to the provider's family
(openai-codex -> gpt-/codex-/o1/o3/o4; xai-oauth -> grok-). Family-shaped
unknown slugs are still soft-accepted (preserving the #16172 entitlement-gated
hidden-model intent); unrelated names are rejected with actionable guidance to
pin the right provider via `--provider <slug>` or the picker.
Adds TestCodexSoftAcceptPlausibilityGate (5 tests): unrelated names rejected on
codex/xai, family-shaped hidden slugs still accepted, real catalog models
unaffected. Verified load-bearing.
The dashboard/desktop spawn gateway actions with stdin=DEVNULL and
HERMES_NONINTERACTIVE=1 (hermes_cli/web_server.py), but prompt_yes_no
ignored that contract and called sys.exit(1) on the resulting EOFError.
On Windows, `gateway start` asks "Install it now so the gateway starts on
login? [Y/n]" when the scheduled task / startup entry is not yet
installed. Spawned from the desktop app there is no stdin to answer it, so
every desktop-triggered gateway restart aborted at that prompt and the
gateway never started ("Gateway service is not installed").
Fall back to the prompt's default when HERMES_NONINTERACTIVE is set, and
treat a bare EOFError as "accept default" rather than exiting. This lets
the Windows start path proceed unattended (Startup-folder fallback + direct
spawn) while interactive TTY usage is unchanged. Ctrl+C still exits.
After `hermes update`, a globally-installed agent-browser's npm postinstall
(fixUnixSymlink) re-points the global symlink (e.g. /opt/homebrew/bin/agent-browser)
at our local node_modules binary. The next update wipes node_modules, leaving a
dangling symlink that `which` still reports but exec fails on with exit 127 —
silently breaking every browser tool (#48521).
Root cause is trust-on-presence: shutil.which/Path.exists accept a name that
resolves but won't run. Add hermes_constants.agent_browser_runnable() (resolves
the path + runs --version) and gate all four resolution sites on it:
_find_agent_browser now skips a dead candidate and falls through to the next
working one (extended PATH -> local .bin -> npx), self-healing the dangling link.
dep_ensure/doctor/nous_subscription validate too; doctor warns on a broken link.
Closes#48521.
Anthropic migrated the OAuth token endpoint from
console.anthropic.com/v1/oauth/token (now returns HTTP 404) to
platform.claude.com/v1/oauth/token. The token *refresh* path already
iterated both hosts, but the two initial code-exchange call sites were
hardcoded to the dead console host, so every new Claude OAuth login
failed with 'Token exchange failed: HTTP Error 404: Not Found' and saved
no credentials.
Fix the whole bug class:
- Add _OAUTH_TOKEN_URLS [platform.claude.com, console.anthropic.com] in
agent/anthropic_adapter.py; _OAUTH_TOKEN_URL now points at the live
host for backward-compat with existing imports.
- run_hermes_oauth_login_pure() (CLI flow) iterates the list, first
success wins, mirroring the refresh path.
- hermes_cli/web_server.py (desktop dashboard flow) imports the list and
iterates it too, so the GUI login path is fixed identically.
Probe: console.anthropic.com/v1/oauth/token -> HTTP 404 (gone),
platform.claude.com/v1/oauth/token -> HTTP 400 (alive). Verified a real
Claude MAX OAuth login now succeeds end-to-end.
The 30-slot default could not fit Hermes's ~50 built-in commands, so
every skill command (and 20 built-ins) were silently dropped from the
Telegram \`/\` menu by default — they only worked when typed manually.
Raising the default to 60 keeps all built-ins plus common skill commands
visible out of the box while staying under Telegram's ~4KB payload limit.
Users can still tune it via platforms.telegram.extra.command_menu.
Adds a configurable Telegram BotCommand menu cap and priority list via
platforms.telegram.extra.command_menu (max_commands clamped 1..100;
priority_mode prepend|append|replace). Default cap stays 30; hidden
commands remain invokable when typed and /commands lists the full set.
Salvaged from PR #42021. Cherry-picked onto current main; the original
edited gateway/platforms/telegram.py, now relocated to
plugins/platforms/telegram/adapter.py.
Selective --clone / --clone-from / --clone-config copied .env but not
auth.json, silently dropping the credential pool — including OAuth tokens
(Anthropic `claude /login`, Codex, xAI) that never land in .env. A profile
cloned from an OAuth-authenticated default therefore resolved a different
provider (or none) than the source under provider: auto. --clone-all already
carried auth.json via the full copytree; only the selective path missed it.
Add auth.json to _CLONE_CONFIG_FILES and tighten it to 0o600 after copy,
matching .env semantics.
On Windows, start_server() served uvicorn via a bare asyncio.run(_serve()),
which uses the default ProactorEventLoop. uvicorn's socket-serving stack
assumes a SelectorEventLoop on win32 (uvicorn/loops/asyncio.py forces it, and
uvicorn.Server.run threads config.get_loop_factory() into its runner for
exactly this reason). Driving uvicorn on the proactor loop makes
server.startup() bind a socket that never accepts: the dashboard and desktop
backend print "Skipping web UI build" then hang forever with the port
LISTENING but no TCP handshake completing.
Fix is win32-scoped to keep the blast radius minimal: POSIX keeps the exact
asyncio.run(_serve()) it had (its default loop is already a SelectorEventLoop /
uvloop, which is what uvicorn serves on). Only on Windows do we mirror
uvicorn.Server.run and run on the loop factory uvicorn picks, with a fallback
to WindowsSelectorEventLoopPolicy for uvicorn < 0.36.
Fixes hermes dashboard and hermes desktop (the Electron app spawns a
hermes dashboard backend). The gateway symptom in the report has a separate
root cause (no uvicorn) and is not addressed here.
The dashboard Profiles view showed "Gateway stopped" for a gateway that
is in fact running — while the sidebar status strip and `hermes gateway
status` (CLI) both correctly showed it running. Reported on v0.17.0
running the gateway + dashboard in one Docker container.
Root cause: three liveness surfaces with three detection strengths, all
reading the same `gateway.pid`:
- `hermes gateway status` -> find_gateway_pids() (process-table scan)
- sidebar /api/status -> get_running_pid() + gateway_state.json PID
fallback + health-URL probe
- Profiles view -> _check_gateway_running() = get_running_pid()
ONLY, no fallback
`get_running_pid()` short-circuits to None the moment the runtime lock
(`gateway.lock`) doesn't register as held by the *calling* process —
which is always true when the reader is a separate process from the
gateway (the dashboard is its own s6 service in the container), and also
for any launch-service-managed gateway that left a fresh
`gateway_state.json` but no live PID file. So the Profiles view alone
reported the live gateway as stopped.
Fix: give _check_gateway_running the same fallback the sidebar already
has — after the pid-file/lock check misses, validate the PID recorded in
that profile's gateway_state.json against the live process table via the
existing get_runtime_status_running_pid(). read_runtime_status() gains an
optional path arg so a profile's state file can be read without mutating
the process-global HERMES_HOME (preserving the contextvar-based profile
isolation the dashboard relies on). Backward compatible: every existing
caller passes no argument.
Tests: a regression test that fails pre-fix (live gateway, lock check
returns None -> must still report running) and a guard test that a
'stopped' state file is never reported running even with a live PID.
Profiles without their own messaging token inherit the default
profile's token via os.getenv, hit a token collision, and exit with
startup_failed. s6 restarts them immediately, creating ~30MB tirith
sandbox dirs in /tmp each cycle — filling the disk in hours (#51228).
Changes:
- gateway/restart.py: add GATEWAY_FATAL_CONFIG_EXIT_CODE = 78
- gateway/run.py: set exit_code=78 on non-retryable startup errors
(token collision, no platforms)
- hermes_cli/service_manager.py: add _render_finish_script() that
translates exit 78 → exit 125 (s6 permanent failure)
- hermes_cli/container_boot.py: write finish script alongside run
script during profile registration
The s6 finish script pattern follows docker/s6-rc.d/dashboard/finish.
Closes#51228
The cron ticker only runs inside the gateway (_start_cron_ticker); there
is no standalone cron daemon. When the gateway isn't running, next_run_at
passes but jobs never fire and last_run_at stays null — and manual
'hermes cron run' (which bypasses the ticker) appears to work, masking
the real cause. This is the most common cron support report (#51038).
cron list already warned; extend the same warning to cron create (the
moment the user is most likely to hit this) via a shared helper, and add
a pointer to 'hermes cron status'. Silent when a gateway is running, so
the gateway /cron path is unaffected.
Builds on @wgu9's runtime-tracking fix: now that find_gateway_pids() can
see a no-supervisor `gateway restart` runtime, have stop_profile_gateway()
fall back to an orphan-aware, profile-scoped reap (SIGTERM then SIGKILL)
when the pidfile/runtime record is missing or stale. Closes the duplicate-
accumulation path in #51325 — a follow-up restart now kills the prior
orphan instead of stacking another listener on :8644. Gated on
not supports_systemd_services() so a transient `gateway restart` argv on
supervised hosts is never killed.
Also adds the AUTHOR_MAP entry for the salvaged contributor.
_ensure_uv_for_termux only checked resolve_uv() (the managed
$HERMES_HOME/bin/uv) before falling back to pip, so a uv installed via
`pkg install uv` lives on PATH but is invisible to the helper. Combined
with the cherry-picked wheel-only fallback, a Termux user with no managed
uv still hit `pip install uv`, which has no Android wheel and tried to
source-build the Rust crate, OOM-killing low-memory devices.
Probe shutil.which("uv") right after the Termux guard and reuse it before
pip. Add a regression test that keeps resolve_uv() returning None while a
uv exists on PATH and asserts pip is never invoked.
Register a per-instance wakeUrl and forward it to the connector at
self-provision so a suspended gateway can be poked awake when buffered
work arrives (pairs with the connector-side WakePoker).
- relay_wake_url() resolver (env GATEWAY_RELAY_WAKE_URL, then
gateway.relay_wake_url in config.yaml), mirroring relay_instance_id()
- thread wake_url through _post_provision (adds wakeUrl to the body only
when set) + self_provision_relay (resolve, forward, log)
- hermes gateway enroll --wake-url <url> persists GATEWAY_RELAY_WAKE_URL
- document the §5.2 wake poke in relay-connector-contract.md §3.3
- tests: relay_wake_url resolution (env/config/absent), provision
forwarding, body-only-when-set (6 new; 130 relay tests pass)
The actual reconnect+drain on wake is Unit B's loop; this unit only
wires the wake SIGNAL. Opt-in: absent wakeUrl => connector never pokes.
* fix(windows): harden gateway scheduled task
* fix(windows): launch gateway scheduled task via console-less wscript
The Scheduled Task ran the gateway through cmd.exe, which allocates a
console. During logon Windows broadcasts CTRL_CLOSE_EVENT to console
process groups, reaping cmd.exe and the half-initialized gateway with
STATUS_CONTROL_C_EXIT (0xC000013A) - which Task Scheduler treats as a
user cancel, so RestartOnFailure never fires and the gateway vanishes on
every reboot (issue #45599 root cause #1).
Add a console-less .vbs launcher (wscript.exe -> pythonw.exe, both
GUI-subsystem) mirroring the gateway.cmd env + argv, and point the task
action at it. The .cmd stays for the Startup-folder fallback and /Run.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Jeff <jeffrobodie@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Open-ended skill learning across every surface. /learn <free text> takes a
description of any source — a directory, a URL, the workflow you just walked
the agent through, or pasted notes — and the live agent gathers it with the
tools it already has (read_file/search_files, web_extract, the conversation,
the pasted text), then authors a SKILL.md via skill_manage following the
house authoring standards (<=60-char description, the standard section order,
Hermes-tool framing, no invented commands).
No engine, no model-tool footprint, works on any terminal backend (local,
Docker, remote): /learn builds a standards-guided prompt and hands it to the
agent as a normal turn.
- agent/learn_prompt.py: shared standards-guided prompt builder
- /learn registry entry (both surfaces) + CLI handler (inject onto input
queue) + gateway handler (rewrite turn, fall through, /blueprint pattern)
- tui_gateway command.dispatch returns a send directive -> TUI + dashboard chat
- dashboard Skills page 'Learn a skill' panel (dir + URL + open-ended text)
composes a /learn request and runs it in chat
- docs (slash-commands ref + skills feature page), 11 targeted tests
Inspired by OpenAI Codex's Record & Replay and the /learn concept from #47234
(dir-distillation engine); reworked to be open-ended and engine-free per
review.
When a session rotates id on compression, _sync_session_key_after_compress()
re-anchored the session_key, approval-notify routing, yolo state, and slash
worker — but never moved the active-session lease, which stayed keyed to the
pre-compression id. And _find_live_session_by_key() matched live sessions on
the stale session_key, not the live agent's current agent.session_id. After
compression a resume/create path failed to recognize the existing live agent
and could build a SECOND live agent against the same DB continuation -> forked
lineage / cross-session message mixing.
- active_sessions.transfer_active_session(): move a lease in place to the new
id under the exclusive file lock (no slot drop).
- gateway _transfer_active_session_slot(): call it inside
_sync_session_key_after_compress(); on the rare fallback (entry pruned)
RESERVE the new slot before releasing the old lease (reserve-before-release),
so a concurrent gateway at the session cap cannot grab the freed slot in a
release-then-reacquire window and leave this session with no lease; if the
reserve fails, keep the existing lease (review fix).
- _session_lookup_key(): make live-session lookup authoritative on
agent.session_id, wired into all stale-session_key consumers
(_find_live_session_by_key, _session_live_item, _live_session_payload) —
fixes the whole lookup class.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
By default `hermes slack manifest` opts the app into Slack's AI Assistant
container (assistant_view feature + assistant:write scope +
assistant_thread_* events). Slack then renders DMs as the right-hand
Assistant split-pane, where every exchange is a thread and bare slash
commands (/help, /new, ...) are not delivered as normal command events —
they only work when the bot is @mentioned. There was no way to opt out
short of hand-editing the generated JSON.
Add --no-assistant to emit a flat-DM manifest that omits those three
pieces, so DMs render as a normal chat and slash commands dispatch
inline. The regular messaging surface (Messages tab, slash commands,
Socket Mode, channel + DM scopes/events) is preserved in both modes.
Default behaviour is unchanged (assistant mode still on).
Tests: cover both manifest modes and the argparse wiring.
Adds a per-platform display.reasoning_style setting (code | blockquote |
subtext) controlling how the show_reasoning summary renders on the gateway.
Discord defaults to "subtext" (-# small grey metadata text); every other
platform keeps the fenced code block. Resolves through the existing
display.platforms.<platform>.reasoning_style override chain.
* Revert "fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993)"
This reverts commit 660e36f097.
* Revert "fix(cron): anchor cron storage at the default root home (not the active profile)"
This reverts commit a5c09fd176.
* feat(memory): OAuth token storage and refresh for the Honcho provider
* feat(memory): refresh the Honcho OAuth token in the client and session
* feat(memory): zero-CLI loopback OAuth authorization flow
* feat(memory): generic memory-provider OAuth connect endpoints
* feat(desktop): memory-provider OAuth connect link
* feat(memory): CLI OAuth sign-in with source-tagged authorize links
* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link
* fix(memory): profile-scope the memory-provider OAuth endpoints
* refactor(desktop): generic memory-provider OAuth client functions
* docs(memory): trim OAuth module docstrings to the invariants
* docs(memory): document OAuth connect as an optional auth method
* fix(memory): send home-relative display path to consent, not the absolute path
* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read
* fix(memory): log OAuth refresh failures at warning, not debug
* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken
* test(memory): cover the desktop Connect launcher, status, and provider dispatch
* fix(desktop): keep the memory-provider dropdown one size regardless of connect state
* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched
* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router
* refactor(desktop): import MemoryConnect directly, drop the single-export barrel
* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard
* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky
* test(honcho): isolate auth-method prompt from deployment-shape wizard tests
main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.
* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)
README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.
* fix(honcho): default the CLI peer prompt to the OAuth consent name
The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.
* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)
resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.
* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup
status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.
* docs(honcho): drop 'shared pool' wording from unified observation mode help
* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation
The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.
* refactor(honcho): one OAuth client (hermes-agent) for all surfaces
Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.
* fix(honcho): per-session resolves to session_id, never remapped by title
Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
The card was macOS-only. cua-driver also runs on Windows and Linux, so
fold `cua-driver doctor` (cross-platform binary/health probes) into a
single OS-aware `ready` signal:
- macOS: ready == both TCC grants; keeps the permission rows + grant flow.
- Windows/Linux: no TCC toggles, so ready == driver health, with a
per-OS note (SmartScreen/UIAccess on Windows; X11/XWayland on Linux).
`computer_use_status()` replaces the macOS-only `permissions_status()` and
surfaces `platform`, `ready`, `can_grant`, and the doctor `checks` (non-ok
ones render as warnings). CLI `permissions status`, the REST endpoint, and
the desktop card all key off the one payload. Grant stays macOS-only (400
elsewhere — nothing to grant).
Computer Use already worked through the desktop backend (the cua-driver
toolset enables + installs via Settings -> Skills & Tools), but there was
no in-app way to see or grant the two macOS permissions it needs, so "give
a model my Mac" was tribal knowledge.
The grants attach to cua-driver's OWN TCC identity (com.trycua.driver /
the installed CuaDriver.app), not Hermes -- so no app entitlement is
involved. cua-driver 0.5+ exposes `permissions status/grant`, which we wrap:
- tools/computer_use/permissions.py: thin client over the two subcommands
- hermes computer-use permissions {status,grant}: CLI parity
- GET /api/tools/computer-use/status, POST .../permissions/grant: desktop REST
- ComputerUsePanel: live Accessibility + Screen Recording state with a
Grant button (dialog attributed to CuaDriver), shown in the expanded
Computer Use toolset row. Binary install stays in the existing provider
post-setup runner.
Follow-ups: i18n the card copy; a "Stop driver" control (cua-driver stop)
for the runaway-`serve` case.
Adds auxiliary.background_review.{provider,model} (default auto = main chat
model — unchanged). Set it to a different, cheaper model and the post-turn
self-improvement review runs there for ~3-5x lower cost.
Cache-aware by design: the main chat is warm in the prompt cache, so the
default full-history replay on the main model is cheap cache reads — left
exactly as-is. A different model can't reuse that cache (different key), so
when (and only when) routed to a different model the fork replays a compact
digest instead of the full transcript, minimising what it cold-writes on the
aux model. Same model -> full replay; different model -> digest.
Quality holds in benchmarks: memory capture identical, skill near-identical.
Nothing changes unless you opt in by naming a different model.
Co-authored-by: Hermes Agent <noreply@nousresearch.com>
The #32091 fix moved every profile's cron jobs into one shared root store,
but never wired the execution-scoping half it recommended: a job still ran
under whichever profile's ticker picked it up, not its owning profile. So a
job created under `hermes -p donna` could execute with the root profile's
.env / config.yaml / credentials.
- jobs.py: create_job auto-captures the active profile (explicit profile=
override available) and stores it on the job; resolve_profile_home() maps a
profile name to its HERMES_HOME; legacy jobs backfill to 'default'.
- scheduler.py: run_job applies the job's profile via a scoped HERMES_HOME
override (env var + in-process ContextVar) before any .env/config/script
load, restored in finally. tick() routes profile-mismatched jobs to the
single-worker sequential pool so the env mutation can't race.
- cronjob tool threads profile through (NOT exposed in the model schema, to
avoid cross-profile privilege escalation); hermes cron add gains --profile.
E2E verified against a temp HERMES_HOME with a real profile dir: a root-profile
ticker runs a profile='donna' job with HERMES_HOME=donna during execution and
restores the ticker env afterward.
Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and
the messaging-gateway handler built a bare MemoryStore() with the hardcoded
default char limits (2200/1375), ignoring the user's configured
memory.memory_char_limit / user_char_limit. A live agent honors those
overrides (agent/agent_init.py), so an approval applied without a live agent
could accept a write the user's lower cap would reject, or vice versa.
Extract a shared tools.memory_tool.load_on_disk_store() factory that reads
the configured limits (falling back to defaults if config can't load) and
wire both the CLI and gateway handlers to it, closing the gap on both
surfaces and de-duplicating the construction block.
The CLI /memory slash handler (cli_commands_mixin._handle_memory_command)
passed self.agent._memory_store straight through, which is None when the
command runs without a live agent — e.g. /memory approve from the Desktop
GUI. The shared write-approval handler then returns "memory store
unavailable" and applies nothing, even with built-in memory enabled and
pending writes present.
Fall back to a freshly loaded on-disk MemoryStore when no live store is
available, mirroring the gateway path (gateway/slash_commands.py). It
persists to the same MEMORY/USER.md and creates MEMORY.md on the first
approved write.
Fixes#46783
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>