hermes-agent/apps/desktop/scripts
Brooklyn Nicholson f6e6f00ff8 perf(desktop): useDeferredValue for streaming markdown so parses don't block input
Streamdown's per-Block parse cost grows with the live tail's length and
is unavoidable inside the block-memo pattern (industry standard, see
findings doc). The fix is to stop having that work block the main thread.

`<DeferStreamingText>` is a 12-line wrapper that reads message-part state
via `useMessagePartText`, runs it through `useDeferredValue`, and
re-publishes via assistant-ui's `<TextMessagePartProvider>`. The inner
`<StreamdownTextPrimitive>` reads the deferred value through the normal
`useMessagePartText` hook — no fork, no internal-path imports, fully on
assistant-ui's public API. React's concurrent scheduler then:

  - abandons in-flight deferred renders when a newer token arrives, so
    intermediate states get skipped under fast streams
  - deprioritises the markdown render when the main thread has urgent
    work (typing, scroll), so input stays responsive even while a
    100ms parse is queued
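
The first bullet can be pictured with a toy "latest value wins" model. This is a minimal sketch and not React's scheduler: `DeferredRenderer`, `push`, `flush`, and `simulate` are hypothetical names made up for illustration, and "idle" is modelled as an explicit call rather than real priority scheduling.

```typescript
// Toy model of deferred rendering: a newer value overwrites any value
// that has not been rendered yet, so intermediate states are skipped.
// Assumption: this only illustrates the coalescing idea; React's
// concurrent scheduler is far more involved.
class DeferredRenderer {
  private pending: string | null = null;
  readonly committed: string[] = [];

  // A new token arrived: replace whatever was waiting to render.
  push(text: string): void {
    this.pending = text;
  }

  // The main thread has spare time: render the newest pending value.
  flush(): void {
    if (this.pending !== null) {
      this.committed.push(this.pending);
      this.pending = null;
    }
  }
}

// Feed tokens one by one; the renderer only gets to run after the
// token indexes listed in `idleAfter`, plus once when the stream ends.
function simulate(tokens: string[], idleAfter: Set<number>): string[] {
  const renderer = new DeferredRenderer();
  let text = "";
  tokens.forEach((token, i) => {
    text += token;
    renderer.push(text);
    if (idleAfter.has(i)) renderer.flush();
  });
  renderer.flush(); // the final text always commits
  return renderer.committed;
}

// Five tokens, one idle slot after the second token:
// two commits instead of five.
console.log(simulate(["# ", "Hel", "lo ", "wor", "ld"], new Set([1])));
// prints [ '# Hel', '# Hello world' ]
```

With no idle slots at all, the model commits only the final text, which is the same shape as the single-commit run described in the results.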

Streamdown already uses `useTransition` for its block-array setState;
this lifts the deferral up to the consumer boundary so it covers the
whole pipeline (preprocess → split → repair → parse → render).

A/B on the 34 MB session, 300 tokens at 50 tok/sec, markdown chunks
(four trials each, with the 33ms flush throttle on for both):

| run | avgFps | p99 frame | long tasks (LTs) per 5 s | max LT | typing-while-stream p95 |
|---|---|---|---|---|---|
| pre  | 54.3 | 41 ms | 1.7 | 110 ms | ~17 ms |
| post | 58.5 | 31 ms | 2.0 | 117 ms | 14-18 ms |

Long-task count and max LT are essentially unchanged: `useDeferredValue`
doesn't reduce CPU work, only its priority. The avgFps lift and p99
frame drop are the evidence that the existing CPU work is no longer
blocking the 60 fps cadence. One
clean run logged MUTATIONS=0 — React skipped every intermediate text
state and only committed the final one (textbook deferred-value
behaviour).
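
For readers who want to reproduce the avgFps and p99 frame columns, here is one way such numbers could be derived from a list of frame timestamps. This is a sketch under assumptions: `frameStats` is a hypothetical helper, the nearest-rank percentile is one common choice, and the actual harness scripts may compute these figures differently.

```typescript
// Derive average FPS and p99 frame time from frame timestamps (ms),
// e.g. values collected in a requestAnimationFrame loop.
// Assumption: nearest-rank percentile; the real harness is not shown here.
function frameStats(timestampsMs: number[]): { avgFps: number; p99FrameMs: number } {
  if (timestampsMs.length < 2) {
    throw new Error("need at least two frame timestamps");
  }

  // Frame time = gap between consecutive timestamps.
  const deltas: number[] = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    deltas.push(timestampsMs[i] - timestampsMs[i - 1]);
  }

  // Average FPS = frames rendered divided by elapsed seconds.
  const totalMs = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  const avgFps = (deltas.length / totalMs) * 1000;

  // Nearest-rank p99: the smallest frame time that is at least as
  // large as 99% of the samples.
  const sorted = [...deltas].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil(0.99 * sorted.length));
  const p99FrameMs = sorted[rank - 1];

  return { avgFps, p99FrameMs };
}

// Four 10 ms frames followed by one 50 ms hitch.
console.log(frameStats([0, 10, 20, 30, 40, 90]));
```

A single slow frame barely moves the average but sets the p99, which is why the two columns are reported side by side.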

The actually-reduce-CPU path is replacing the parser with a state
machine like Flowdown — left for a future PR; see
`apps/desktop/scripts/profile-typing-lag.md` for the full investigation.
2026-05-21 20:31:26 -05:00
assert-root-install.cjs Improve desktop runtime UX by surfacing inference readiness in gateway status and hardening WSL link opening. 2026-05-15 16:33:04 -05:00
before-build.cjs feat(desktop): thin installer + first-launch install.ps1 bootstrap 2026-05-18 02:26:46 -04:00
click-session.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
dev-no-hmr.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
diag-jump.mjs perf(desktop): fix "Enter jumps up" on long threads 2026-05-21 17:45:55 -05:00
eval.mjs chore(desktop): synthetic-stream perf harness + scripts 2026-05-21 19:38:26 -05:00
leak-typing.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
measure-jump.mjs perf(desktop): fix "Enter jumps up" on long threads 2026-05-21 17:45:55 -05:00
measure-latency.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
measure-real-stream.mjs chore(desktop): synthetic-stream perf harness + scripts 2026-05-21 19:38:26 -05:00
measure-submit.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
measure-synthetic-stream.mjs perf(desktop): floor assistant-text flush gap to 33ms for predictable batching 2026-05-21 20:08:49 -05:00
notarize-artifact.cjs fix(desktop): address CodeQL alerts on PR #20059 2026-05-11 16:52:32 -04:00
notarize.cjs ci(desktop): automate desktop releases 2026-05-05 13:04:33 -05:00
probe-renderer.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
probe-thread.mjs perf(desktop): fix "Enter jumps up" on long threads 2026-05-21 17:45:55 -05:00
profile-long-stream.mjs perf(desktop): rate-limit thread auto-pin during streaming 2026-05-21 18:02:26 -05:00
profile-real-stream.mjs chore(desktop): synthetic-stream perf harness + scripts 2026-05-21 19:38:26 -05:00
profile-synth-stream.mjs chore(desktop): synthetic-stream perf harness + scripts 2026-05-21 19:38:26 -05:00
profile-typing-lag.md perf(desktop): useDeferredValue for streaming markdown so parses don't block input 2026-05-21 20:31:26 -05:00
profile-typing.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
reload-renderer.mjs Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"" 2026-05-21 18:57:18 -05:00
reload.mjs chore(desktop): synthetic-stream perf harness + scripts 2026-05-21 19:38:26 -05:00
stage-native-deps.cjs desktop: swap node-pty fork for upstream microsoft/node-pty 1.1.0 2026-05-18 21:50:53 -07:00
test-desktop.mjs desktop: swap node-pty fork for upstream microsoft/node-pty 1.1.0 2026-05-18 21:50:53 -07:00
write-build-stamp.cjs feat(desktop): thin installer + first-launch install.ps1 bootstrap 2026-05-18 02:26:46 -04:00