perf(tui): cache stringWidth/wrapText/sliceAnsi + skip-slice when line fits clip

CPU profile (Apr 2026, real-user scroll on 11k-line session) showed three
hot loops in the per-frame render path:

  Output.get() per-frame walk:                 24% total
  └─ sliceAnsi(line, from, to) per write:     18% total
  stringWidth(line) chain (cached + JS):      14% total

All three were re-doing identical work every frame: same string → same
clipped slice → same width.

Fixes:

1. Memoize stringWidth (8k-entry LRU) for non-ASCII strings; ASCII fast-path
   skips the cache (inline scan beats Map.get for short ASCII, the >90%
   case). A String.charCodeAt scan of up to 64 chars is cheaper than the
   regex fallback.

2. Memoize wrapText (4k-entry LRU keyed by maxWidth|wrapType|text) — wrapAnsi
   is pure and the same content reflows identically every frame.

3. Memoize sliceAnsi (4k-entry LRU keyed by start|end|str) for the
   end-defined hot path used by Output.get().

4. Skip the slice entirely in Output.get() when the line already fits the
   clip box (startsBefore=false && endsAfter=false). Most transcript lines
   never exceed their container width, and tokenizing them just to slice
   (line, 0, width) was pure overhead. This single fast-path drops
   sliceAnsi from 18% → ~0% in the profile.

Also tighten virtualization constants (MAX_MOUNTED 260→120, OVERSCAN 40→20,
SLIDE_STEP 25→12) and cap historical-message render at 800 chars / 16
lines via HISTORY_RENDER_MAX_*; messages inside the FULL_RENDER_TAIL_ITEMS
window still render in full so reading-zone behavior is unchanged.
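A hedged sketch of the history cap (the `HISTORY_RENDER_MAX_*` names come from the commit; the helper, its signature, and the exact truncation order are assumptions, and the tail-window size is taken as a parameter since the commit doesn't state it):

```typescript
const HISTORY_RENDER_MAX_CHARS = 800
const HISTORY_RENDER_MAX_LINES = 16

function capHistoryText(text: string, indexFromEnd: number, tailItems: number): string {
  // Messages inside the full-render tail window are never truncated,
  // so the reading zone behaves exactly as before.
  if (indexFromEnd < tailItems) return text
  const lines = text.split('\n')
  let out = lines.length > HISTORY_RENDER_MAX_LINES
    ? lines.slice(0, HISTORY_RENDER_MAX_LINES).join('\n')
    : text
  if (out.length > HISTORY_RENDER_MAX_CHARS) out = out.slice(0, HISTORY_RENDER_MAX_CHARS)
  return out
}
```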

Validation (real-user CPU profile, page-up scroll on an 11k-line session):

  Output.get() self-time:     24%   →   0.3%
  sliceAnsi total:            18%   →   not in top 25
  stringWidth family:         14%   →   ~3%
  idle:                     60.7%   →  77.3%

Frame timings (synthetic page-up profile harness):
  dur p95:   ~10ms   →  4.87ms
  dur p99:   25ms+   → 12.80ms
  yoga p99:  ~20ms   →  1.87ms

The remaining CPU in the profile is Yoga layoutNode + React commit,
which is the irreducible work for this UI tree size.
commit c370e2e1e5
parent 85e9a23efb
Author: Brooklyn Nicholson
Date:   2026-04-26 19:28:09 -05:00

14 files changed, 450 insertions(+), 42 deletions(-)

@@ -1,5 +1,5 @@
 import { Box, Link, Text } from '@hermes/ink'
-import { memo, type ReactNode, useMemo } from 'react'
+import { memo, useMemo, type ReactNode } from 'react'
 import { ensureEmojiPresentation } from '../lib/emoji.js'
 import { highlightLine, isHighlightable } from '../lib/syntax.js'
@@ -213,8 +213,57 @@ function MdInline({ t, text }: { t: Theme; text: string }) {
   return <Text>{parts.length ? parts : <Text>{text}</Text>}</Text>
 }
+// Cross-instance parsed-children cache. `Md` is mounted fresh whenever a
+// virtualized row enters the mount window — useMemo's per-instance cache
+// doesn't survive remounts, so PageUp into cold/resumed history reparses
+// every row (markdown scan + per-line syntax highlight).
+//
+// Outer WeakMap keyed by theme so palette swaps drop stale baked-in colors
+// without code intervention. Inner Map is LRU-bounded; key folds `compact`
+// in so the two layout modes don't poison each other.
+const MD_CACHE_LIMIT = 512
+const mdCache = new WeakMap<Theme, Map<string, ReactNode[]>>()
+const cacheBucket = (t: Theme) => {
+  let b = mdCache.get(t)
+  if (!b) {
+    b = new Map()
+    mdCache.set(t, b)
+  }
+  return b
+}
+const cacheGet = (b: Map<string, ReactNode[]>, key: string) => {
+  const v = b.get(key)
+  if (v) {
+    b.delete(key)
+    b.set(key, v)
+  }
+  return v
+}
+const cacheSet = (b: Map<string, ReactNode[]>, key: string, v: ReactNode[]) => {
+  b.set(key, v)
+  if (b.size > MD_CACHE_LIMIT) {
+    b.delete(b.keys().next().value!)
+  }
+}
 function MdImpl({ compact, t, text }: MdProps) {
   const nodes = useMemo(() => {
+    const bucket = cacheBucket(t)
+    const cacheKey = `${compact ? '1' : '0'}|${text}`
+    const cached = cacheGet(bucket, cacheKey)
+    if (cached) {
+      return cached
+    }
     const lines = ensureEmojiPresentation(text).split('\n')
     const nodes: ReactNode[] = []
@@ -615,6 +664,8 @@ function MdImpl({ compact, t, text }: MdProps) {
       i++
     }
+    cacheSet(bucket, cacheKey, nodes)
     return nodes
   }, [compact, t, text])