Lift the attachment drop engine (dragActive + the 7 drag/drop handlers + the
in-app-ref vs OS-upload split) out of ChatBar into
composer/hooks/use-composer-drop.ts. Self-contained, off the keystroke path —
consumes insertInlineRefs + onAttachDroppedItems + requestMainFocus. Verbatim
move, behaviour-preserving.
Lift the dictation + voice-conversation + auto-speak subsystem out of ChatBar
into composer/hooks/use-composer-voice.ts. It owns voiceConversationActive,
lastSpokenIdRef, the pending-reply readers, submitVoiceTurn, the voice
hooks (recorder/conversation/auto-speak), the Ctrl+B toggle event, and
handleToggleAutoSpeak; it exposes dictate/voiceStatus/voiceActivityState/
conversation/start+endConversation/handleToggleAutoSpeak for the controls.
Self-contained: consumes the draft/submit primitives (insertText, clearDraft,
focusInput, onSubmit) passed in, nothing depends back on it — so unlike the
queue subsystem (which is circularly coupled to the draft helpers) it lifts
cleanly. Behaviour-preserving; verbatim move.
The real composer state-engine fix. ChatBar subscribed to the full draft string
(`useAuiState(s => s.composer.text)`), so every keystroke re-rendered the whole
~2k-line component even though the contentEditable DOM already owns the text.
Replace that with:
- an imperative composer-runtime subscription (useComposerRuntime().subscribe)
that mirrors text into draftRef, repaints the editor ONLY on external changes
(clear/restore/insert; the focused editor is the source otherwise), and drives
the debounced per-session stash — all without a React render. This folds the
old `[draft]` sync effect and the `[draft]` debounced-stash effect into one
place keyed off the runtime, surviving core rebinds via the effect dep.
- coarse edge selectors (hasText / isHelpHint / isSteerableText, plus
isEmpty / hasHardNewline in useComposerMetrics) for the chrome, which only
re-render when an edge actually flips.
Net: typing within a line does zero ChatBar re-renders / style invalidations;
work happens only on real edges. Behaviour-preserving — draftRef + editor are
already kept current by every mutation path; verified by the composer DOM repro
tests (enter-submit, IME composition, slash-nav) + text-guard.
First step of decomposing the ChatBar god component (composer/index.tsx). Pull
the self-contained *sizing* engine — stacked/inline layout + the measured-height
CSS vars the thread reads for clearance — into composer/hooks/use-composer-metrics.ts.
The hook owns: the media-query `narrow`, `expanded`/`tight`, the 8px height
bucketing (so per-keystroke growth never invalidates the tree's computed style),
the ResizeObserver, the popout re-sync, and the CSS-var cleanup. ChatBar now
just calls `useComposerMetrics(...)` and consumes `stacked`.
Behaviour-preserving (no keystroke/IME/contentEditable path touched): code moved
verbatim. Deliberately a low-risk first slice on the app's most fragile file;
the draft/state-engine spine is the next, dogfood-heavy step
(see desktop-composer-plan.md).
Two status-stack icon nits:
- Subagents used `hubot`; switch to the dedicated `agent` codicon.
- The queue section had no icon while every other group (todos, subagents,
background) has one. Give it `layers` (a stack of pending turns), matched to
the group-icon styling so all four sections read consistently.
MCP tools that were connected and enabled never surfaced in the agent's
session toolset on the desktop app + dashboard WebUI (#51587).
Background MCP discovery threads have two independent owners, one per
surface: tui_gateway.entry (stdio 'hermes --tui') and hermes_cli.mcp_startup
(desktop app + dashboard WS sidecar via tui_gateway/ws.py, and 'hermes
dashboard'). The late-refresh scheduler gates on
tui_gateway.entry.mcp_discovery_in_flight(), which previously read ONLY the
entry thread global. On the desktop/dashboard surfaces that global is None, so
a server slower than the bounded build-time wait never triggered a late
refresh and its tools stayed invisible for the whole session.
Make mcp_discovery_in_flight() / join_mcp_discovery() consult BOTH thread
owners. Adds the matching in-flight/join helpers to hermes_cli.mcp_startup
and has tui_gateway.entry delegate to them as a second owner.
Preview links (detected HTML files / localhost dev URLs) were rendered as
CHILDREN of the background StatusSection, which is collapsed by default — so the
moment a background task appeared, the previews got swallowed into the collapsed
"N Background" expandable and vanished until you manually expanded it. With no
background group they rendered as a standalone always-visible block, so the bug
only showed once a bg task was running.
Render the preview links as their own always-visible block right after the
background section instead of as collapsible children. They stay visually
associated with the background group (a localhost dev server and its preview are
the same thing) but are no longer hidden by its collapse — a one-tap open is the
whole point.
_sanitize_api_messages() compared raw tool_call_id strings without
stripping whitespace. When assistant-side IDs and tool-result IDs
diverged due to surrounding whitespace, valid tool results were treated
as orphaned and replaced with [Result unavailable] stub placeholders.
Strip whitespace in _get_tool_call_id_static() (both call_id/id paths,
dict and object) and at the two result_call_id comparison sites in
sanitize_api_messages(). Adds regression tests for preserved-whitespace
results and orphaned-whitespace removal.
Closes #9999
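
The normalization can be sketched roughly as follows. This is a simplified illustration, not the real `_sanitize_api_messages()`: the function names, the `call_id`-before-`id` lookup order, and the choice to drop orphans instead of stubbing them are all assumptions.

```python
from typing import Any, Dict, List, Optional


def get_tool_call_id(tool_call: Any) -> Optional[str]:
    """Return the whitespace-stripped ID of a dict- or object-shaped call."""
    for key in ("call_id", "id"):  # lookup order is an assumption
        if isinstance(tool_call, dict):
            value = tool_call.get(key)
        else:
            value = getattr(tool_call, key, None)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def drop_orphaned_results(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep tool results whose stripped ID matches an assistant tool call."""
    known = set()
    for msg in messages:
        if msg.get("role") == "assistant":
            for call in msg.get("tool_calls") or []:
                call_id = get_tool_call_id(call)
                if call_id:
                    known.add(call_id)

    kept = []
    for msg in messages:
        if msg.get("role") == "tool":
            # Strip at the comparison site too, so " call_1" == "call_1 ".
            result_id = (msg.get("tool_call_id") or "").strip()
            if result_id not in known:
                continue  # genuinely orphaned
        kept.append(msg)
    return kept
```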
After the slash dispatcher, the next-largest body unit was submitPromptText —
a ~280-line submit pipeline. Lift it into a colocated useSubmitPrompt sub-hook
(use-prompt-actions/submit.ts) with a typed SubmitPromptDeps object; body moves
verbatim. SubmitTextOptions moves to utils.ts (shared by submit + submitText).
Pure restructuring, no behaviour change (full use-prompt-actions suite green).
index.ts: 1,212 -> 937.
The remaining bulk of useMessageStream was handleGatewayEvent — a ~550-line
event-type dispatcher. Lift it into a colocated useGatewayEventHandler sub-hook
(use-message-stream/gateway-event.ts): the values it closed over (sibling
streaming callbacks + the 3 stable refs the deps array omitted + options)
become a typed GatewayEventDeps object; the dispatcher body moves verbatim.
Pure restructuring, no behaviour change (utils tests still green). index.ts:
1,120 -> 540.
Follow-up hardening on the salvaged #54465 backoff persistence work.
The lease refresher's loop treated ANY falsy refresh as a permanent stop
(`if not refreshed: break`), conflating two distinct cases:
- genuine lost-ownership (rowcount 0) — correct to stop, and
- a one-off transient DB error (write contention that escapes
_execute_write's retry budget) — which returned False identically.
A single transient blip therefore killed the lease for the rest of a
multi-minute compression call, silently reintroducing the exact 300s-TTL <
~361s-call expiry wedge the PR set out to fix.
Changes:
- _CompressionLockLeaseRefresher._run now tolerates a bounded run of
consecutive failures (_MAX_CONSECUTIVE_REFRESH_FAILURES = 3) before giving
up the lease; a recovered tick resets the counter. Worst-case extra hold is
cap * refresh_interval, still bounded by the acquirer's TTL.
- Replace the two remaining silent `except Exception: pass` arms in the
compression-failure-cooldown persist/clear helpers with debug logging, for
parity with their sqlite3.Error sibling arms (a non-sqlite bug was invisible).
- Document the join(timeout=1.0) quiesce bound in stop().
- Add 3 regression tests: single-blip tolerance, persistent-failure stop at the
cap, and refresh-raising tolerance.
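
A sketch of the bounded-tolerance loop, assuming a `refresh` callable that returns a bool (or raises) and a `threading.Event` for shutdown. `run_refresher` and its parameters are illustrative names, not the real `_CompressionLockLeaseRefresher._run`.

```python
import threading
from typing import Callable

MAX_CONSECUTIVE_REFRESH_FAILURES = 3


def run_refresher(
    refresh: Callable[[], bool],
    stop: threading.Event,
    interval: float,
    max_failures: int = MAX_CONSECUTIVE_REFRESH_FAILURES,
) -> int:
    """Refresh the lease every `interval` seconds; return the tick count."""
    failures = 0
    ticks = 0
    # Event.wait returns False on timeout, i.e. "not stopped, tick again".
    while not stop.wait(interval):
        ticks += 1
        try:
            refreshed = refresh()
        except Exception:
            refreshed = False  # a raising refresh counts as one failure
        if refreshed:
            failures = 0  # a recovered tick resets the counter
            continue
        failures += 1
        if failures >= max_failures:
            break  # persistent failure: give up the lease
    return ticks
```

A single False between successful ticks no longer ends the loop; only `max_failures` failures in a row do.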
The usePromptActions body's largest unit was executeSlashCommand — a ~530-line
`/command` dispatcher. Lift it into a colocated useSlashCommand sub-hook
(use-prompt-actions/slash.ts): the ~13 values it closed over become a typed
SlashCommandDeps object the parent passes in; the dispatcher body (and its inner
runSlash recursion) moves verbatim. SlashActionCtx (slash-only) moves with it.
Pure restructuring, no behaviour change (verified: full use-prompt-actions test
suite still green). index.ts: 1,772 -> ~1,250.
Cron pre-run scripts were capped at 120s by default, which surprised
users running long data-collection scripts on crons (the whole point of
crons being to offload long work). Raise _DEFAULT_SCRIPT_TIMEOUT to 3600s
(1 hour).
This bounds the script only — skill/agent jobs already run on a separate
inactivity budget (HERMES_CRON_TIMEOUT, default 600s idle, 0=unlimited),
not a wall-clock cap. Scripts dispatch to a persistent thread pool and do
not hold the tick lock, so a long script doesn't starve other due jobs.
Docs now make the script-vs-agent timeout distinction explicit. The
env/config overrides (HERMES_CRON_SCRIPT_TIMEOUT,
cron.script_timeout_seconds) are unchanged and still take precedence.
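
One way the override resolution could look. This is a sketch under assumptions: the env-before-config order, the fallback on invalid or non-positive values, and the `resolve_script_timeout` name are not confirmed by the text above; only the two override names and the 3600s default are.

```python
from typing import Any, Mapping

DEFAULT_SCRIPT_TIMEOUT = 3600.0  # seconds (1 hour)


def resolve_script_timeout(
    env: Mapping[str, str],
    config: Mapping[str, Any],
) -> float:
    """Pick the pre-run script timeout: env, then config, then default."""
    raw = env.get("HERMES_CRON_SCRIPT_TIMEOUT")
    if raw is None:
        # Assumed shape: {"cron": {"script_timeout_seconds": ...}}
        raw = (config.get("cron") or {}).get("script_timeout_seconds")
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return DEFAULT_SCRIPT_TIMEOUT  # unset or unparsable
    # Treating non-positive values as "use the default" is an assumption.
    return value if value > 0 else DEFAULT_SCRIPT_TIMEOUT
```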
Extract the standalone gateway-event helpers (session-info patch derivation,
completion-error detection, todo-payload routing, delegate_task -> subagent
spec mapping, + the stream-flush/subagent-event constants) out of the
1,285-line hook into a colocated, tested use-message-stream/utils.ts. index.ts
keeps the stateful streaming hook and consumes the helpers.
Pure restructuring, no behaviour change; folder index keeps the import path
intact. index.ts: 1,285 -> ~1,120. Adds unit tests for the pure helpers.
Extract the ~16 standalone helpers (message reconciliation, optimistic/resolved
session upserts, stored-session resolution, runtime-info application, error
classification) out of the 1,254-line god hook into a colocated, tested
use-session-actions/utils.ts. index.ts keeps the hook orchestrator (the
stateful action callbacks) and consumes the helpers.
Pure restructuring, no behaviour change; folder index keeps the import path
(`@/app/session/hooks/use-session-actions`) intact. index.ts: 1,254 -> ~950.
Adds unit tests for the pure helpers.
Detect a routing key whose session is already ended in state.db
(end_reason set) inside get_or_create_session and drop the stale entry
instead of silently routing the message into a closed session.
Previously the only runtime cleanup of sessions.json was the startup
_prune_stale_sessions_locked (#52808/#54138), which requires a restart.
A session ended while the gateway stays alive — any path that finalizes
the DB row without clearing sessions.json — left a live routing key
pointing at a closed session. get_or_create_session never consulted
end_reason, so it returned that stale entry and every subsequent message
was silently dropped (no log, no error, no response) until the next
restart. This is the live-gateway variant of #52804/FM9, which needed an
actual gateway crash.
The guard drops the stale entry and falls through to
_recover_session_from_db, which reopens agent_close-ended rows and
resumes the SAME session_id (transcript preserved); if the row ended for
a non-recoverable reason (e.g. /new) it correctly starts a fresh
session. A warning is logged so the event is visible (the field
incident reported zero log output).
Adds tests/gateway/test_session_store_runtime_stale_guard.py covering
the _is_session_ended_in_db helper and the end-to-end routing self-heal
(recover-vs-fresh, live-entry untouched, stale-wins-over-suspended,
force_new short-circuit).
Closes #54878.
Co-authored-by: David Gutowsky <david.gutowsky@gmail.com>
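
The routing self-heal can be illustrated with a toy store. Everything here is hypothetical scaffolding: a plain dict replaces sessions.json, the `sessions(id, end_reason)` table is an assumed schema, and recovery is inlined instead of going through `_recover_session_from_db`.

```python
import logging
import sqlite3
import uuid
from typing import Dict, Optional

log = logging.getLogger("session_store_sketch")


def end_reason(conn: sqlite3.Connection, session_id: str) -> Optional[str]:
    """Return the row's end_reason, or None if still open (or missing)."""
    row = conn.execute(
        "SELECT end_reason FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else None


def get_or_create_session(
    routing: Dict[str, str], conn: sqlite3.Connection, key: str
) -> str:
    session_id = routing.get(key)
    if session_id is not None:
        reason = end_reason(conn, session_id)
        if reason is None:
            return session_id  # live entry, untouched
        # Stale entry: make the event visible, then drop it.
        log.warning("stale routing key %s -> %s (%s)", key, session_id, reason)
        del routing[key]
        if reason == "agent_close":
            # Recoverable: reopen the row and resume the SAME session_id.
            conn.execute(
                "UPDATE sessions SET end_reason = NULL WHERE id = ?",
                (session_id,),
            )
            routing[key] = session_id
            return session_id
    # No entry, or a non-recoverable end reason: start a fresh session.
    new_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO sessions (id, end_reason) VALUES (?, NULL)", (new_id,)
    )
    routing[key] = new_id
    return new_id
```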
Two robustness gaps from the #54843 truncate-store path:
- _store_full_text wrote the full clean page to cache/web with no upper
bound (path.write_text(content)), so a multi-MB page meant an unbounded
per-extract disk write. Cap the stored text at MAX_STORED_TEXT_CHARS (2MB,
the pre-truncate-store refusal ceiling) and append a marker when capped.
- The truncation footer told the model 'read_file ... offset=<line>' — a
literal placeholder it had to guess. Compute the real starting line of the
omitted middle (head line count + 1) so the first read_file lands in the gap.
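
Both fixes in miniature. The helper names, the marker wording, and the exact constant value (2 * 1024 * 1024 for "2MB") are assumptions made for illustration.

```python
MAX_STORED_TEXT_CHARS = 2 * 1024 * 1024  # assumed value for "2MB"


def cap_stored_text(content: str, limit: int = MAX_STORED_TEXT_CHARS) -> str:
    """Bound what gets written to the cache, flagging the cut if one occurs."""
    if len(content) <= limit:
        return content
    # Hypothetical marker text; the real wording is not given above.
    return content[:limit] + f"\n[stored copy capped at {limit} chars]\n"


def omitted_start_line(head: str) -> int:
    """1-based line where the omitted middle begins: head line count + 1."""
    return len(head.splitlines()) + 1


def truncation_footer(head: str, path: str) -> str:
    """Footer with a concrete offset instead of a literal '<line>'."""
    return f"read_file {path} offset={omitted_start_line(head)}"
```

For a head of `"a\nb\n"` (two lines), the footer points at `offset=3`, so the first read lands on the first omitted line.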
Multiple @-references in one message (esp. @url: refs, each a full
web_extract round-trip) were expanded in a serial `for ref in refs: await`
loop. Switch to asyncio.gather over the independent _expand_reference calls,
reassembling warnings/blocks in original positional order so output is
byte-identical to the serial path; the token-budget check is unchanged.
Generic + provider-agnostic: helps every web backend equally (exa/tavily/
firecrawl/parallel) since it's above the provider layer. RED/GREEN test:
3 url refs @ 0.2s each = 0.60s serial -> ~0.20s concurrent.
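
A sketch of the serial-to-concurrent switch. `expand_all` and `fake_expand` are hypothetical; the real `_expand_reference` signature is not shown above. The property relied on is that `asyncio.gather` returns results in argument order regardless of completion order.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Sequence, Tuple

Expanded = Tuple[Optional[str], str]  # (warning or None, rendered block)


async def expand_all(
    refs: Sequence[str],
    expand: Callable[[str], Awaitable[Expanded]],
) -> Tuple[List[str], List[str]]:
    """Expand independent references concurrently, keeping input order."""
    # gather preserves positional order, so reassembly matches the serial loop.
    results = await asyncio.gather(*(expand(ref) for ref in refs))
    warnings = [w for w, _ in results if w is not None]
    blocks = [block for _, block in results]
    return warnings, blocks


async def fake_expand(ref: str) -> Expanded:
    """Stand-in for a web round-trip; later refs deliberately finish first."""
    delay = {"@url:a": 0.03, "@url:b": 0.02, "@url:c": 0.01}.get(ref, 0.0)
    await asyncio.sleep(delay)
    warning = f"slow:{ref}" if delay >= 0.03 else None
    return warning, f"<block {ref}>"


if __name__ == "__main__":
    print(asyncio.run(expand_all(["@url:a", "@url:b", "@url:c"], fake_expand)))
```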
The top-center floating HUDs (command palette + session switcher) pin at
top-3, overlapping the titlebar's `[-webkit-app-region:drag]` bands. Drag
regions win hit-testing over the DOM regardless of z-index, so the top of
each surface — the search input — swallowed clicks, leaving only a ~2px
strip focusable. Add `[-webkit-app-region:no-drag]` to the shared
HUD_SURFACE so the whole surface is interactive.
Finding 2 of the desktop UI-consistency pass. Several surfaces intentionally
make an entire row/cell the click target while hosting nested layout inside a
raw <button> (each re-justifying the pattern in a local comment). Introduce a
zero-style RowButton primitive (components/ui/row-button.tsx) that bakes in the
shared semantics — type="button" + a stable data-slot — without imposing any
styling, then migrate every genuine row-button onto it:
- app/overlays/panel.tsx
- app/artifacts/index.tsx
- app/chat/sidebar/chrome.tsx (SidebarRowBody, SidebarRowLink)
- app/settings/providers-settings.tsx
- components/desktop-onboarding-overlay.tsx (PROVIDER_ROW_CLASS rows)
Fully behavior-preserving: RowButton adds no classes, so each row keeps its
exact layout/look (verified by a unit test asserting className passthrough).
Left as-is (not row-buttons; converting would risk visual regressions): the
compact bespoke buttons in shell/statusbar-controls.tsx (STATUSBAR_ACTION_CLASS,
also a nested DropdownMenuTrigger asChild) and pet-generate/reference-chip.tsx.
Finding 1 of the desktop UI-consistency pass: SVG icon sizing had four
competing conventions with no source of truth. Introduce a named icon-size
scale (iconSize.xs/sm/md/lg/xl -> size-3/3.5/4/5/6) in lib/icons.ts and migrate
the genuine icon deviants onto it:
- desktop-install-overlay.tsx: Loader2/Check/AlertTriangle/Chevron* (h-4 w-4,
h-3.5 w-3.5 -> iconSize.md/sm)
- composer/controls.tsx, voice-activity.tsx, queue-panel.tsx: numeric size={N}
on Tabler icons -> iconSize classes
Sizes snap to the nearest scale step; the only rendered deltas are size={11}
-> 12px (queue/stop glyphs, +1px) and AudioLines size={15} -> 14px (-1px, now
matches its sibling toolbar icons). All other migrations are exact (12/14/16px).
Out of scope (different sizing mechanisms, left untouched): non-icon h-N w-N
layout (sliders, skeletons, swatches), sprite size props (PixelEggSprite), and
Codicon font-icon sizing. Broader size-N -> token adoption is follow-up.
#53552 flipped verify_on_stop to default OFF because the guard fired on
doc/markdown/skill edits and felt like noise. That doc/markdown/skill
suppression already shipped in the same change (_filter_verifiable_paths in
agent/verification_stop.py), so the original noise rationale no longer holds:
the guard already skips prose-only turns.
Restore the surface-aware "auto" default — ON for interactive coding surfaces
(CLI, TUI, desktop) and programmatic callers, OFF for conversational messaging
surfaces (Telegram, Discord, etc.) where the verification narrative would reach
a human as chat noise. The missing/unrecognized fallback in
verify_on_stop_enabled now resolves to the same surface-aware default instead of
hard OFF, so both the DEFAULT_CONFIG value and the resolver agree.
Scope: this changes the shipped default for fresh installs and configs without
an explicit verify_on_stop key. Existing configs that #53552/#54740 migrated to
an explicit `false` are respected and unchanged — this PR does not add a
force-migration of those values back to auto.
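
The resolver behaviour described above, sketched with assumed surface identifiers. The lowercase surface strings and the `None` surface for programmatic callers are illustrative; the real `verify_on_stop_enabled` signature may differ.

```python
from typing import Any, Mapping, Optional

# Assumed identifiers for conversational messaging surfaces.
MESSAGING_SURFACES = frozenset({"telegram", "discord"})


def verify_on_stop_enabled(
    config: Mapping[str, Any], surface: Optional[str]
) -> bool:
    """Resolve verify_on_stop, honouring explicit booleans first."""
    value = config.get("verify_on_stop", "auto")
    if isinstance(value, bool):
        return value  # an explicit true/false always wins
    # "auto", missing, or unrecognized: fall back to the surface default.
    # ON for interactive coding surfaces and programmatic callers (None),
    # OFF where the verification narrative would reach a human as chat.
    return (surface or "").lower() not in MESSAGING_SURFACES
```

Note that a config migrated to an explicit `False` stays off on every surface, matching the scope note above.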
The usePromptActions hook is the textbook "god hook" AGENTS.md warns against.
As a first, safe slice, pull its module-level standalone helpers (no closure
over hook state) into a focused, testable use-prompt-actions-utils.ts sibling:
- error classifiers: isSessionNotFoundError, isSessionBusyError,
isProviderSetupError, inlineErrorMessage
- session-busy retry: withSessionBusyRetry (+ its constants)
- attachment IO: base64FromDataUrl, imageFilenameFromPath,
readImageForRemoteAttach, readFileDataUrlForAttach, friendlyRemoteAttachError
- misc: delay, isSessionIdCandidate, blobToDataUrl, renderCommandsCatalog,
slashStatusText, appendText, visibleUserOrdinal, visibleUserIndexAtOrdinal,
the _submitInFlight guard set, and the GatewayRequest type
Pure restructuring, no behavior change; the usePromptActions and
uploadComposerAttachment exports (and their import paths) are unchanged. Adds
unit tests for the pure helpers. use-prompt-actions.ts: 1,956 -> 1,772.
These tests patch `<module>.subprocess.run`, which is the shared `subprocess`
module singleton, so the patch is process-wide. Importing `tui_gateway.server`
runs `prefetch_update_check()` at import time, spawning an unnamed daemon thread
(`Thread-N (_run)`) that shells out to `git ... origin` (`text=True, timeout=5`).
That call races the test and lands in the captured list, intermittently failing
`test_tui_gateway_fuzzy_file_listing_hides_git_windows` with either
`KeyError: 'creationflags'` (the daemon's git call has no creationflags) or a
call-count mismatch (3 git calls captured, not 2). It only reproduced under the
parallel test harness because of the extra concurrency/timing.
Filter captured calls to the distinctive argv tokens of the call under test
(`--show-toplevel`, `ls-files`, `branch --show-current`, `diff`, `rg`,
`taskkill`) and read `creationflags` via `.get`, mirroring the existing
hardening on `test_gateway_pid_scan_hides_wmic_and_powershell_windows`. The
production code is unchanged; this is a test-isolation fix.
DesktopController is a route root that had grown a controller's worth of
session-list plumbing inline. Extract the cohesive fetch/paging cluster into
a focused hook and a tested pure helper, per AGENTS.md's "keep route roots
thin" guidance:
- use-session-list-actions.ts: refreshSessions / loadMoreSessions /
loadMoreSessionsForProfile / loadMoreMessagingForPlatform / refreshCronJobs
(plus the private cron/messaging refreshers, sessionsToKeep, and the
excluded-source constants)
- desktop-controller-utils.ts: pure sameCronSignature helper (+ unit tests)
Pure restructuring, no behavior change. desktop-controller.tsx: 1,441 -> 1,233.