Builds on @wgu9's runtime-tracking fix: now that find_gateway_pids() can
see a no-supervisor `gateway restart` runtime, have stop_profile_gateway()
fall back to an orphan-aware, profile-scoped reap (SIGTERM then SIGKILL)
when the pidfile/runtime record is missing or stale. Closes the duplicate-
accumulation path in #51325 — a follow-up restart now kills the prior
orphan instead of stacking another listener on :8644. Gated on
not supports_systemd_services() so a transient `gateway restart` argv on
supervised hosts is never killed.
Also adds the AUTHOR_MAP entry for the salvaged contributor.
atomic_yaml_write (and two sibling config writers) called yaml.dump
without allow_unicode=True. The default personalities shipped in cli.py
contain emoji/kaomoji, so PyYAML escaped astral-plane chars as 8-digit
\\UXXXXXXXX sequences inside multi-line double-quoted strings wrapped
with \\ line-continuations. Stricter/non-PyYAML parsers, editors, and
hand-edits break that structure into unclosed quotes, failing the whole
config parse -> silent fallback to defaults -> custom_providers lost.
Add allow_unicode=True to the canonical writer plus tui_gateway/server.py
and the telegram adapter's atomic config write so config is written as
readable UTF-8 with no escape/fold artifacts.
Fixes#51356
deliver=origin (or omitted) from a TUI or classic-CLI session produces a
job with origin=null, because those sessions never populate the
HERMES_SESSION_PLATFORM/CHAT_ID context vars that _origin_from_env reads.
The scheduler then resolves no delivery target and skips delivery — the
job runs and saves output to last_output, but nothing reaches the user
and they only find out by polling cronjob(action='list') (#51568).
This is by design (local sessions have no live-delivery channel), so the
fix surfaces it instead of silently dropping the intent:
- cronjob create now appends an informational notice to its result when
a created job resolves to zero delivery targets and the user did not
explicitly ask for deliver='local'. The check uses the scheduler's own
_resolve_delivery_targets so it accounts for origin, home channels,
'all', and explicit platform targets — no false positives.
- PLATFORM_HINTS gains a 'tui' entry (the TUI had none) and the 'cli'
hint now states that cron jobs from these sessions are local-only and
that deliver must target a gateway-connected platform to notify the
user. This stops the agent promising a delivery that never happens.
No scheduler/delivery behavior change; no new env var; cron isolation
invariant untouched.
Adds the #51579 regression test the issue asked for: run the real
docker_config_migrate.py boot path twice (host-reboot scenario under
--restart unless-stopped) and assert $HERMES_HOME/.env survives
byte-for-byte and the second boot is a no-op (no re-migration, no new
backup). Exercises real migrate_config + real file I/O via subprocess.
spectrum-ts routes stream telemetry through @photon-ai/otel's createLogger,
which sends severity>=ERROR to console.error and WARN/INFO to console.log.
The two lines the health monitor keys off land on different channels:
log.error("stream persistently failing") -> console.error (caught), but
log.warn("stream interrupted; reconnecting") -> console.log (was missed).
The original interception patched console.error only, so the recovering->
degraded escalation counter never saw the interrupt bursts that are the
primary silent-inbound symptom. Verified live against spectrum-ts 3.1.0 +
@photon-ai/otel: 3 real log.warn('stream interrupted') calls now escalate
to degraded -> process.exit(75) -> adapter reconnect.
Adds a shared classifyStreamLog() fed by both console.error and console.log,
plus a regression test asserting both channels are intercepted.
Source-level regression guard (the script only runs on Windows, so there's no
runner on Linux CI). Asserts Resolve-AvailablePythonVersion exists, that
Install-Venv re-resolves the interpreter before the venv-creation line, and
that Test-Python and the resolver share the single $PythonFallbackVersions
constant so detection and venv creation can't drift apart again.
_ensure_uv_for_termux only checked resolve_uv() (the managed
$HERMES_HOME/bin/uv) before falling back to pip, so a uv installed via
`pkg install uv` lives on PATH but is invisible to the helper. Combined
with the cherry-picked wheel-only fallback, a Termux user with no managed
uv still hit `pip install uv`, which has no Android wheel and tried to
source-build the Rust crate, OOM-killing low-memory devices.
Probe shutil.which("uv") right after the Termux guard and reuse it before
pip. Add a regression test that keeps resolve_uv() returning None while a
uv exists on PATH and asserts pip is never invoked.
Register a per-instance wakeUrl and forward it to the connector at
self-provision so a suspended gateway can be poked awake when buffered
work arrives (pairs with the connector-side WakePoker).
- relay_wake_url() resolver (env GATEWAY_RELAY_WAKE_URL, then
gateway.relay_wake_url in config.yaml), mirroring relay_instance_id()
- thread wake_url through _post_provision (adds wakeUrl to the body only
when set) + self_provision_relay (resolve, forward, log)
- hermes gateway enroll --wake-url <url> persists GATEWAY_RELAY_WAKE_URL
- document the §5.2 wake poke in relay-connector-contract.md §3.3
- tests: relay_wake_url resolution (env/config/absent), provision
forwarding, body-only-when-set (6 new; 130 relay tests pass)
The actual reconnect+drain on wake is Unit B's loop; this unit only
wires the wake SIGNAL. Opt-in: absent wakeUrl => connector never pokes.
The gateway half of the going-idle/buffered-flip primitive (scale-to-zero
PRIMITIVE, not the behaviour). Integrates with the EXISTING drain transition:
- ws_transport: `go_idle()` sends `going_idle` + awaits the connector's
`going_idle_ack` (connector-authoritative flip-then-ack, Q-5.3c — stays
serving until the ack so nothing is lost in the flip window); acks a buffered
inbound (bufferId present) via `inbound_ack` after the handler runs
(drain-without-dup on the delivery leg); NET-NEW reconnect loop re-dials +
re-handshakes after an unexpected close (off by default, on in production).
- adapter: emits `going_idle` from its existing `disconnect()` drain seam before
tearing down the socket; best-effort + guarded (never blocks shutdown).
- transport Protocol + contract doc §3.2 document the 3 new frames.
+6 relay tests (124 pass). NOT in scope: the autonomous idle timer / machine
suspend / NAS health model (deferred behaviour). Ben's relay-adapter solo lane.
* fix(windows): harden gateway scheduled task
* fix(windows): launch gateway scheduled task via console-less wscript
The Scheduled Task ran the gateway through cmd.exe, which allocates a
console. During logon Windows broadcasts CTRL_CLOSE_EVENT to console
process groups, reaping cmd.exe and the half-initialized gateway with
STATUS_CONTROL_C_EXIT (0xC000013A) - which Task Scheduler treats as a
user cancel, so RestartOnFailure never fires and the gateway vanishes on
every reboot (issue #45599 root cause #1).
Add a console-less .vbs launcher (wscript.exe -> pythonw.exe, both
GUI-subsystem) mirroring the gateway.cmd env + argv, and point the task
action at it. The .cmd stays for the Startup-folder fallback and /Run.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Jeff <jeffrobodie@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When _auto_create_thread() creates a thread from a user message via
message.create_thread(), Discord fires a second MESSAGE_CREATE event
for the 'thread starter message'. That starter message carries
message.id == thread.id and may arrive with type=default instead of
type=21 (thread_starter_message), so the existing type filter in
on_message does not catch it — triggering a second call into
_handle_message and thus a second agent run and response.
Fix: after _auto_create_thread succeeds and returns a thread, pre-seed
the dedup cache with str(thread.id) via self._dedup.is_duplicate().
The dedup cache is the same TTL-based MessageDeduplicator that already
guards against Discord RESUME event replays. Calling is_duplicate()
marks the ID as seen; when the duplicate thread-starter MESSAGE_CREATE
arrives, on_message's guard returns True and the event is dropped.
This is a minimal, targeted fix:
- No new state: reuses the existing _dedup instance
- No timing/race: the pre-seed happens synchronously inside the async
_handle_message, before the thread-starter event can be dispatched
- Scoped: only fires when auto-threading is enabled AND thread creation
succeeds (thread object is not None)
Also adds tests in tests/gateway/test_discord_double_dispatch.py
covering the pre-seed behaviour, failure modes (thread creation fails,
auto-thread disabled), and dedup cache integrity.
Closes#51057
_session_task_is_stale() failed to detect a stale session lock when the owner
task completed and cleaned _session_tasks (del in _process_message_background's
finally) but _active_sessions was NOT released because _release_session_guard
skipped on a guard mismatch (a concurrent reset/new command or drain handoff
swapped _active_sessions[key] to a different guard). With no owner task left to
inspect, _session_task_is_stale reported 'not stale', the orphaned guard was
never healed, and the session deadlocked permanently — later messages received
but never dispatched.
Reorder the finally cleanup to release-then-conditional-delete: release the
guard first, then drop the _session_tasks entry ONLY if the guard was actually
released (session_key no longer in _active_sessions). On a guard mismatch the
done-task entry survives, so the on-entry self-heal (_session_task_is_stale ->
_heal_stale_session_lock) detects the stale lock and clears it on the next
inbound message.
Extracted the cleanup into a callable _cleanup_finished_session_task() helper so
the regression test drives the REAL production code path rather than a copy of
its logic (the original test inlined the fixed logic and passed regardless of
the production order — mutation-verified the rewritten tests now fail on the
buggy del-first order). Added a positive-path test (guard matches -> release +
delete) so both branches are pinned.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Open-ended skill learning across every surface. /learn <free text> takes a
description of any source — a directory, a URL, the workflow you just walked
the agent through, or pasted notes — and the live agent gathers it with the
tools it already has (read_file/search_files, web_extract, the conversation,
the pasted text), then authors a SKILL.md via skill_manage following the
house authoring standards (<=60-char description, the standard section order,
Hermes-tool framing, no invented commands).
No engine, no model-tool footprint, works on any terminal backend (local,
Docker, remote): /learn builds a standards-guided prompt and hands it to the
agent as a normal turn.
- agent/learn_prompt.py: shared standards-guided prompt builder
- /learn registry entry (both surfaces) + CLI handler (inject onto input
queue) + gateway handler (rewrite turn, fall through, /blueprint pattern)
- tui_gateway command.dispatch returns a send directive -> TUI + dashboard chat
- dashboard Skills page 'Learn a skill' panel (dir + URL + open-ended text)
composes a /learn request and runs it in chat
- docs (slash-commands ref + skills feature page), 11 targeted tests
Inspired by OpenAI Codex's Record & Replay and the /learn concept from #47234
(dir-distillation engine); reworked to be open-ended and engine-free per
review.
PTB's HTTPXRequest builds its httpx.AsyncClient with
`limits = httpx.Limits(max_connections=connection_pool_size)` and no
keepalive tuning, so httpx's default keepalive_expiry=5.0 applies. Behind
an HTTP proxy (Cloudflare Warp etc.) a peer-initiated FIN can sit in
CLOSE_WAIT longer than that, leaking fds in the general request pool
(_request[1], which routes bot.send_message/set_my_commands) — the pool
_drain_polling_connections never resets. Telegram was the lone holdout
adapter not using the shared #18451 CLOSE_WAIT helper.
Wire gateway.platforms._http_client_limits.platform_httpx_limits() into
the httpx client across ALL THREE request-construction branches —
fallback-transport, proxy, and plain — via httpx_kwargs["limits"], which
PTB spreads last into its client kwargs so our tuned limits win. PTB's
connection_pool_size (max_connections) is preserved; only keepalive
behaviour is tightened (max_keepalive_connections + keepalive_expiry<5.0).
The fix is macOS-import-safe: no Linux-only socket TCP_KEEPIDLE/INTVL/CNT
constants at module scope (unlike the broken candidate which crashed on
import on the reporter's OS), and it patches the actual proxy path the
repro hits rather than TelegramFallbackTransport, which the proxy repro
never instantiates.
Adds a mutation-survivable behavior-contract test asserting every
HTTPXRequest built by connect() receives httpx_kwargs["limits"] with
keepalive_expiry < httpx's 5.0 default, across both the proxy and plain
branches. Reverting the limits wiring fails the test.
Co-authored-by: indigokarasu <mx.indigo.karasu@gmail.com>
When a session rotates id on compression, _sync_session_key_after_compress()
re-anchored the session_key, approval-notify routing, yolo state, and slash
worker — but never moved the active-session lease, which stayed keyed to the
pre-compression id. And _find_live_session_by_key() matched live sessions on
the stale session_key, not the live agent's current agent.session_id. After
compression a resume/create path failed to recognize the existing live agent
and could build a SECOND live agent against the same DB continuation -> forked
lineage / cross-session message mixing.
- active_sessions.transfer_active_session(): move a lease in place to the new
id under the exclusive file lock (no slot drop).
- gateway _transfer_active_session_slot(): call it inside
_sync_session_key_after_compress(); on the rare fallback (entry pruned)
RESERVE the new slot before releasing the old lease (reserve-before-release),
so a concurrent gateway at the session cap cannot grab the freed slot in a
release-then-reacquire window and leave this session with no lease; if the
reserve fails, keep the existing lease (review fix).
- _session_lookup_key(): make live-session lookup authoritative on
agent.session_id, wired into all stale-session_key consumers
(_find_live_session_by_key, _session_live_item, _live_session_payload) —
fixes the whole lookup class.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
/reload-mcp -> shutdown_mcp_servers -> _kill_orphaned_mcp_children(include_active=True)
-> _send_signal -> killpg(pgid, SIGTERM). When a tracked MCP stdio child shares
the gateway's OWN process group, killpg delivers SIGTERM to the gateway itself,
firing its SIGTERM handler -> os._exit(0): /reload-mcp crashes the gateway.
Pre-compute the gateway's own pgid (os.getpgrp(), None on Windows/restricted)
and, in _send_signal, skip killpg when pgid == own pgid, falling through to the
per-pid os.kill path so the child is still reaped without self-signaling.
Adds a regression test (folded in) that pins the guard: with a tracked pgid
equal to the gateway's own pgid, killpg is never called for that pgid and the
per-pid kill fallback is used. Mutation-checked.
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Map Hermes xhigh→max to unlock DeepSeek V4's 'Max thinking' tier
through Ollama Cloud's OpenAI-compatible /v1/chat/completions endpoint.
low/medium/high pass through unchanged; disabled/none suppress
reasoning entirely.
Empirically confirmed: reasoning_effort:max produces ~2.5× more
thinking tokens than high on deepseek-v4-pro:cloud (1576 vs 642).
Parity with the classic CLI status bar's ⛓ indicator (PR #51441). The
Ink TUI status bar now shows ⛓ N for live background/async subagents
(delegate_task batches + background single delegations).
- tui_gateway/server.py: _get_usage() embeds active_subagents from
tools.async_delegation.active_count() — the same registry the CLI
reads — onto the existing per-update usage payload, guarded so a
raising active_count() leaves the field off without breaking usage.
- ui-tui appChrome: new 'subagents' status segment (breakpoint w>=92,
slots between bg and cost in the shed-order), renders ⛓ N from
usage.active_subagents.
- Usage / SessionUsageResponse types gain active_subagents?.
Distinct from the turn-scoped SpawnHud / /agents overlay, which mirror
live in-turn subagent.* events; this is the persistent registry count.
By default `hermes slack manifest` opts the app into Slack's AI Assistant
container (assistant_view feature + assistant:write scope +
assistant_thread_* events). Slack then renders DMs as the right-hand
Assistant split-pane, where every exchange is a thread and bare slash
commands (/help, /new, ...) are not delivered as normal command events —
they only work when the bot is @mentioned. There was no way to opt out
short of hand-editing the generated JSON.
Add --no-assistant to emit a flat-DM manifest that omits those three
pieces, so DMs render as a normal chat and slash commands dispatch
inline. The regular messaging surface (Messages tab, slash commands,
Socket Mode, channel + DM scopes/events) is preserved in both modes.
Default behaviour is unchanged (assistant mode still on).
Tests: cover both manifest modes and the argparse wiring.
The classic prompt_toolkit status bar already shows two background
indicators: ▶ N (/background agent threads) and ⚙ N (shell processes
spawned by terminal(background=true)). Background/async subagents
(delegate_task batches and background single delegations) had no
indicator despite being long-running work the user should be able to
see at a glance.
Add a third indicator ⛓ N sourced from
tools.async_delegation.active_count() — the count of delegations still
in the 'running' state. Renders in the plain-text builder and the
styled-fragment builder across the same width tiers as the other two
(omitted on the narrow <52 tier), guarded so a raising active_count()
leaves the snapshot at 0.
Adds a per-platform display.reasoning_style setting (code | blockquote |
subtext) controlling how the show_reasoning summary renders on the gateway.
Discord defaults to "subtext" (-# small grey metadata text); every other
platform keeps the fenced code block. Resolves through the existing
display.platforms.<platform>.reasoning_style override chain.
Replace the old "skips download when a system browser exists" assertions with
tests for the new behavior:
- no PATH scan for browser command names, and the "use the system browser" path
is gone;
- find_system_browser consults only an explicit AGENT_BROWSER_EXECUTABLE_PATH
override (which still skips the bundled download);
- strip_snap_browser_override runs on both install paths and a /snap/* path is
rejected, so already-affected installs auto-recover on update.
ci: centralize path-gating behind single orchestrator + all-checks-pass
gate
Replace the scattered per-workflow detect-changes pattern with a single
ci.yml orchestrator that runs the classifier once, then conditionally
calls sub-workflows via workflow_call based on lane outputs. A final
all-checks-pass job (if: always()) aggregates all results so branch
protection only needs to require one check.
Changes:
- New .github/workflows/ci.yml orchestrator (detect + conditional calls
+ all-checks-pass gate)
- Extend classify_changes.py with scan/deps/mcp_catalog lanes, absorbing
supply-chain-audit's internal changes job
- Update detect-changes/action.yml to expose the new lane outputs
- Convert all 10 PR-gated sub-workflows to workflow_call-only triggers,
removing their push/pull_request triggers and per-step detect-changes
guards (gating now happens at the orchestrator level)
- lint.yml + supply-chain-audit.yml receive event_name as a
workflow_call
input to replace github.event_name (which is "workflow_call" inside
called workflows)
- supply-chain-audit.yml: remove internal changes job + *-gate jobs
(orchestrator handles gating, booleans arrive as inputs)
- contributor-check.yml: remove internal filter step
- Update test_classify_changes.py for 6-lane output + new supply-chain
test cases
Heavy PR checks run on every PR because the workflows deliberately avoid
`on.paths` filters — a path-gated workflow leaves its required check pending
forever when no matching file changes, blocking merge. So a docs-only PR
still spins up the TypeScript matrix, the full Python suite, and ruff/ty.
Keep every workflow triggering on every PR (checks always report) but gate
the expensive *steps* on what the PR touches. Skipping a step (not the job)
leaves the job green, so required checks never hang — the same idiom already
proven in contributor-check.yml.
A classifier (scripts/ci/classify_changes.py) maps the PR diff to three
lanes — python, frontend, site — surfaced as step outputs by a composite
action (.github/actions/detect-changes). Fail-open: an empty diff or any
.github/ change runs everything; python is a denylist (skipped only when
every file is provably prose or a frontend-only package); skills/**/SKILL.md
counts as python-relevant since the skill-doc tests read that tree. Non-PR
events always run the full pipeline.
Slack in-app voice clips ("record a clip") arrive as MP4/AAC containers
(mimetype audio/mp4, filename audio_message*.mp4), and Slack sometimes
labels them video/mp4. The inbound audio handler derived the cache
extension from the mimetype and fell back to ".ogg" for anything not in
{.ogg,.mp3,.wav,.webm,.m4a} — so audio/mp4 voice messages were cached as
.ogg. OpenAI STT (whisper-1, gpt-4o-transcribe) sniffs the container from
the FILENAME extension, so it received MP4 bytes named .ogg and rejected
them. WhatsApp .ogg and uploaded .m4a worked only because their extension
happened to match the bytes.
Fix:
- _resolve_slack_audio_ext(): pick the cache extension from the real
filename first, then a mimetype map (audio/mp4 -> .m4a), defaulting to
.m4a — never the bogus .ogg fallback. Mirrors the video branch and the
audio map already in gateway/platforms/bluebubbles.py.
- _is_slack_voice_clip(): detect audio-only clips mislabeled video/mp4
via the slack_audio subtype / audio_message* filename, and route them
through the audio path (cached as audio, reported as audio/*) so they
reach STT instead of video understanding. Genuine videos (and
slack_video screen recordings) are left on the video path.
Verified end-to-end against a real audio-only MP4: old path cached it as
.ogg (ffprobe shows MP4 bytes -> container mismatch -> OpenAI rejects);
new path caches it as .mp4 (extension matches bytes -> accepted).
Adds inbound-audio tests (previously none): helper unit tests plus
_handle_slack_message E2E coverage for audio/mp4, video/mp4-mislabeled
voice clips, and a real video staying on the video path. Confirmed the
two voice-message tests fail without the fix (mutation check).
The gateway half of Phase 6 Unit ζ: project the agent's existing relevance
knobs into the connector's platform-agnostic vocabulary and declare them at boot
over the /relay/policy route, so the SAME mention-gating / free-response /
allow-bots behavior the agent applies directly also governs relay delivery (and
excluded chatter never wakes a scaled-to-zero agent).
- gateway/relay/__init__.py:
- relay_relevance_policy(): project require_mention -> requireAddress,
free_response_channels -> freeResponseScopes, {PLATFORM}_ALLOW_BOTS in
{mentions,all} -> allowOtherBots. Reads the fronted platform's config block
+ bridged top-level keys. Returns None when all-default (the connector's
quiet default already matches) or no concrete platform is fronted.
- send_relay_policy(): POST /relay/policy authenticated with the gateway's own
per-gateway upgrade token (make_upgrade_token — same bearer as the WS
upgrade), so the connector attaches it to the authenticated instance, never
a body-asserted id. Re-declares every boot (self-healing, full replace).
NEVER raises, NEVER blocks boot — relevance is an optimization layered on
the δ/ε authorization gate. Reuses the per-gateway secret + the
/relay/provision host; no new inbound surface, no new credential.
- _policy_url(): ws(s)://…/relay -> http(s)://…/relay/policy.
- gateway/run.py: call send_relay_policy() after register_relay_adapter()
succeeds (the secret is resolved by then).
- docs/relay-connector-contract.md: new §7 documenting per-instance delivery +
the management plane (/manage/* + /relay/policy) + the relevance-declaration
contract; versioning renumbered to §8. Contract conformance test stays green
(§2/§3 tables untouched).
Tests: +12 (projection mapping incl. comma-string + top-level fallback; send
auth/skip/fail-soft/non-200). Full relay suite 118 pass. The connector route is
already E2E-proven (connector repo gateway_policy_driver.py); this adds the real
gateway send-path it pairs with.
This completes Phase 6 (Team Gateway per-user isolation) end to end.
A "one-shot" is a single stateless model call that runs OUTSIDE any conversation:
it never touches session history, never breaks prompt caching, and returns plain
text. UI surfaces need this for small generative chores — a commit message from a
diff, a rename suggestion, a summary — where an agent turn would pollute the
thread and hand-rolling an LLM call at every call site would be worse.
- `agent/oneshot.py`: `run_oneshot(...)` over the existing auxiliary-client
plumbing (same path as title generation). Two call shapes: explicit
instructions/input, or a registered `template` + `variables` (templates own the
prompt engineering so it stays consistent across CLI/TUI/desktop). Ships a
`commit_message` template. Model selection inherits the live session via
`main_runtime`, else the configured aux `task` backend.
- `tui_gateway/server.py`: `llm.oneshot` RPC (long-handler) inheriting the
session's model when `session_id` resolves.
Stateless by construction — no session mutation, cache untouched.
Follow-up to the coding-context posture (#43316): that PR detects each repo's
verify loop (manifests, package manager, exact test/lint/build commands, context
files) and bakes it into the system-prompt snapshot — but only as a string, for
the model. Non-prompt consumers (the desktop verify UI) had no way to read it
without re-sniffing and drifting from the prompt.
Split detection from rendering, keeping one source of truth:
- `detect_project_facts(root) -> ProjectFacts` (frozen) holds the structured
facts; `_project_facts()` now renders it into the same snapshot lines, so the
prompt block stays byte-identical (cache-safe).
- `project_facts_for(cwd)` resolves the workspace root (git, else marker) and
returns the structured facts, or None outside a workspace.
- `project.facts` gateway RPC surfaces it to any client (desktop/TUI/ACP).
Tests assert the structured output and that the UI-facing commands never drift
from what the prompt block renders (one detector feeds both).
* Revert "fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993)"
This reverts commit 660e36f097.
* Revert "fix(cron): anchor cron storage at the default root home (not the active profile)"
This reverts commit a5c09fd176.
* feat(memory): OAuth token storage and refresh for the Honcho provider
* feat(memory): refresh the Honcho OAuth token in the client and session
* feat(memory): zero-CLI loopback OAuth authorization flow
* feat(memory): generic memory-provider OAuth connect endpoints
* feat(desktop): memory-provider OAuth connect link
* feat(memory): CLI OAuth sign-in with source-tagged authorize links
* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link
* fix(memory): profile-scope the memory-provider OAuth endpoints
* refactor(desktop): generic memory-provider OAuth client functions
* docs(memory): trim OAuth module docstrings to the invariants
* docs(memory): document OAuth connect as an optional auth method
* fix(memory): send home-relative display path to consent, not the absolute path
* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read
* fix(memory): log OAuth refresh failures at warning, not debug
* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken
* test(memory): cover the desktop Connect launcher, status, and provider dispatch
* fix(desktop): keep the memory-provider dropdown one size regardless of connect state
* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched
* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router
* refactor(desktop): import MemoryConnect directly, drop the single-export barrel
* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard
* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky
* test(honcho): isolate auth-method prompt from deployment-shape wizard tests
main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.
* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)
README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.
* fix(honcho): default the CLI peer prompt to the OAuth consent name
The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.
* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)
resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.
* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup
status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.
* docs(honcho): drop 'shared pool' wording from unified observation mode help
* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation
The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.
* refactor(honcho): one OAuth client (hermes-agent) for all surfaces
Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.
* fix(honcho): per-session resolves to session_id, never remapped by title
Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
Adds auxiliary.background_review.{provider,model} (default auto = main chat
model — unchanged). Set it to a different, cheaper model and the post-turn
self-improvement review runs there for ~3-5x lower cost.
Cache-aware by design: the main chat is warm in the prompt cache, so the
default full-history replay on the main model is cheap cache reads — left
exactly as-is. A different model can't reuse that cache (different key), so
when (and only when) routed to a different model the fork replays a compact
digest instead of the full transcript, minimising what it cold-writes on the
aux model. Same model -> full replay; different model -> digest.
Quality holds in benchmarks: memory capture identical, skill near-identical.
Nothing changes unless you opt in by naming a different model.
Co-authored-by: Hermes Agent <noreply@nousresearch.com>
The #32091 fix moved every profile's cron jobs into one shared root store,
but never wired the execution-scoping half it recommended: a job still ran
under whichever profile's ticker picked it up, not its owning profile. So a
job created under `hermes -p donna` could execute with the root profile's
.env / config.yaml / credentials.
- jobs.py: create_job auto-captures the active profile (explicit profile=
override available) and stores it on the job; resolve_profile_home() maps a
profile name to its HERMES_HOME; legacy jobs backfill to 'default'.
- scheduler.py: run_job applies the job's profile via a scoped HERMES_HOME
override (env var + in-process ContextVar) before any .env/config/script
load, restored in finally. tick() routes profile-mismatched jobs to the
single-worker sequential pool so the env mutation can't race.
- cronjob tool threads profile through (NOT exposed in the model schema, to
avoid cross-profile privilege escalation); hermes cron add gains --profile.
E2E verified against a temp HERMES_HOME with a real profile dir: a root-profile
ticker runs a profile='donna' job with HERMES_HOME=donna during execution and
restores the ticker env afterward.
File tools (read_file, write_file, patch, list_directory, etc.) used
os.path.expanduser() which reads the gateway process HOME env var.
In Docker/systemd/s6 deployments where the gateway HOME differs from
interactive sessions, tilde expanded to the wrong directory.
Add _expand_tilde() helper that delegates to get_subprocess_home() when
available, falling back to os.path.expanduser(). Replace all 9
expanduser() call sites in file_tools.py with _expand_tilde().
Follow-up to #50767, which redacted the chat-platform (_approval_notify_sync)
and SSE/API (_approval_notify) approval transports. The TUI JSON-RPC transport
is the third egress and was missed: three register_gateway_notify callbacks in
tui_gateway/server.py emitted the raw approval_data — including the unredacted
command Tirith flagged — straight to the TUI client via _emit.
Route all three registrations through a new module-level _emit_approval_request()
helper that redacts payload['command'] via the shared
gateway.run._redact_approval_command seam before emitting, matching the pattern
used for the other two transports. Completes the whole-bug-class fix for #48456.
Tests: assert the helper emits a redacted command (real credential pattern),
handles missing/None command, and a wiring guard that no registration emits the
raw payload directly (only the helper may). Both mutation-checked.
The #48456 fix series originated from @liuhao1024's #48462 — credit to them for
the original report and chat-platform fix; this completes the remaining transport.
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and
the messaging-gateway handler built a bare MemoryStore() with the hardcoded
default char limits (2200/1375), ignoring the user's configured
memory.memory_char_limit / user_char_limit. A live agent honors those
overrides (agent/agent_init.py), so an approval applied without a live agent
could accept a write the user's lower cap would reject, or vice versa.
Extract a shared tools.memory_tool.load_on_disk_store() factory that reads
the configured limits (falling back to defaults if config can't load) and
wire both the CLI and gateway handlers to it, closing the gap on both
surfaces and de-duplicating the construction block.