fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly (#26011)

* fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly

The composer's fast-echo path (canFastAppend / canFastBackspace) writes
characters straight to stdout to skip an Ink re-render on the hot
typing path. The previous guard only checked
'stringWidth(text) === text.length', which lets a lot of non-ASCII
through:

  - Vietnamese precomposed letters (ề, ắ, ờ, ự, ...) report width 1 and
    length 1, but a Vietnamese Telex / IME stack produces them across
    multiple keystrokes; the intermediate composition state must be
    drawn by Ink so the rendered cell, the stored value, and the
    cursor column stay in lockstep when the final commit replaces the
    preview.
  - NFD combining marks (U+0300..U+036F) are zero-width but length 1,
    so even a passing equality lets them slip and silently desync the
    cell column.
  - CJK/East-Asian wide characters and emoji were rejected, but only
    because their width and length happen to differ; the guard tested
    the shape of the string, not whether it was safe to echo.

User-visible bug from the original report:
  Example: eê noiói nge neène
  -> the bypass committed the IME preview char before the diacritic
     replaced it, leaving doubled letters on screen.

Fix: gate fast-echo on pure printable ASCII (0x20-0x7e). The
performance-critical English typing path is unchanged; everything else
goes through the normal Ink render path so layout stays accurate.
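
A minimal sketch of how the ASCII gate classifies input, for illustration
only. The regex is the one added in the diff; `isFastEchoSafe` and the
sample list are hypothetical names that do not exist in the codebase.

```typescript
// Same pattern as ASCII_PRINTABLE_RE in the diff: one or more chars in 0x20-0x7e.
const ASCII_PRINTABLE_RE = /^[\x20-\x7e]+$/

// Hypothetical wrapper, used only in this sketch.
function isFastEchoSafe(text: string): boolean {
  return ASCII_PRINTABLE_RE.test(text)
}

const samples: Array<[string, string]> = [
  ['plain ASCII', 'hello'],
  ['precomposed Vietnamese', '\u1ec1'],
  ['NFD sequence', 'e\u0302\u0300'],
  ['CJK', '\u4e2d'],
  ['emoji', '\ud83d\ude00'],
  ['NBSP', '\u00a0'],
  ['tab', '\t'],
  ['ANSI escape', '\x1b[0m']
]

// Only the first sample qualifies; everything else falls back to Ink.
const results = samples.map(([label, text]) => ({ label, safe: isFastEchoSafe(text) }))

for (const { label, safe } of results) {
  console.log(`${label}: ${safe ? 'fast-echo' : 'Ink render'}`)
}
```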

Also extracts the shape preconditions as pure exported helpers
(canFastAppendShape / canFastBackspaceShape) so the regression matrix
is testable without spinning up a TextInput.

Tests: ui-tui/src/__tests__/textInputFastEcho.test.ts adds 20 cases
covering ASCII still works, Vietnamese precomposed + NFD, CJK, emoji,
NBSP / Latin-1, ANSI / control bytes, multi-line, and end-of-line
preconditions. Verified RED on the previous guard (11 of 20 fail) and
GREEN on the new guard.
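
For readers without the test file at hand, a table-driven sketch of the
kind of matrix described above. This is not the content of
textInputFastEcho.test.ts; the case list and the `Case` type are
assumptions, and `canFastAppendShape` is copied locally from the diff so
the sketch runs on its own.

```typescript
const ASCII_PRINTABLE_RE = /^[\x20-\x7e]+$/

// Local copy of the helper from the diff.
function canFastAppendShape(
  current: string,
  cursor: number,
  text: string,
  columns: number,
  currentLineWidth: number
): boolean {
  if (cursor !== current.length) return false
  if (current.length === 0) return false
  if (current.includes('\n')) return false
  if (!ASCII_PRINTABLE_RE.test(text)) return false
  return currentLineWidth + text.length < Math.max(1, columns)
}

// Hypothetical case shape for this sketch.
interface Case {
  name: string
  args: [string, number, string, number, number]
  expected: boolean
}

const cases: Case[] = [
  { name: 'ASCII append at end', args: ['ab', 2, 'c', 80, 2], expected: true },
  { name: 'cursor not at end', args: ['ab', 1, 'c', 80, 2], expected: false },
  { name: 'empty input', args: ['', 0, 'c', 80, 0], expected: false },
  { name: 'multi-line value', args: ['a\nb', 3, 'c', 80, 1], expected: false },
  { name: 'Vietnamese precomposed', args: ['ab', 2, '\u1ec1', 80, 2], expected: false },
  { name: 'append would fill the last column', args: ['ab', 2, 'c', 3, 2], expected: false }
]

// Collect mismatches instead of throwing, so every case is evaluated.
const failures = cases.filter(c => canFastAppendShape(...c.args) !== c.expected)

console.log(failures.length === 0 ? 'all cases pass' : failures.map(f => f.name).join(', '))
```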

Refs: #5221, #7443, #17602, #17603 (similar wide-char rendering bugs).

* docs(tui): clarify Vietnamese char terminology in regression comment

Address Copilot review: 'single byte width' implied UTF-8 byte semantics,
but the relevant property is JS code units (`text.length === 1`) and
display width (`stringWidth === 1`). Reworded to match.
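
A small sketch of the distinction, assuming a runtime that provides the
standard `TextEncoder` global (Node and browsers both do). Display width
is not computed here because `stringWidth` comes from an external
package.

```typescript
const ch = '\u1ec1' // "ề", precomposed (NFC) form

// JS string length counts UTF-16 code units, not bytes.
const codeUnits = ch.length

// The same character takes three bytes when encoded as UTF-8.
const utf8Bytes = new TextEncoder().encode(ch).length

// Decomposed (NFD) form: base letter "e" plus two combining marks.
const nfdCodeUnits = ch.normalize('NFD').length

console.log({ codeUnits, utf8Bytes, nfdCodeUnits })
```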
brooklyn! 2026-05-15 07:41:50 -07:00 committed by GitHub
parent d5416284f1
commit 9fb40e6a3d
2 changed files with 218 additions and 19 deletions

@@ -179,6 +179,84 @@ export function lineNav(s: string, p: number, dir: -1 | 1): null | number {
 export { offsetFromPosition }
+const ASCII_PRINTABLE_RE = /^[\x20-\x7e]+$/
+/**
+ * Pure shape-only precondition for the fast-echo append path.
+ *
+ * The fast-echo path bypasses Ink's renderer and writes text directly to
+ * stdout, so the stored value, the rendered terminal cells, and the cursor
+ * column must all stay in sync without any layout work. We only allow it
+ * when the inserted text is pure printable ASCII so that:
+ *
+ * - `text.length` matches the number of grapheme clusters (no combining
+ *   marks, no surrogate pairs, no precomposed CJK / Latin-Extended
+ *   letters that an IME might still be holding open as a composition),
+ * - terminal width is exactly 1 cell per character (no East-Asian wide,
+ *   no zero-width, no ambiguous-width fonts),
+ * - input methods (Vietnamese Telex, IME, dead-keys) cannot leak
+ *   intermediate composition bytes through the bypass before the final
+ *   commit arrives; those always go through the normal Ink render path
+ *   and stay layout-accurate (closes #5221, #7443, #17602/#17603).
+ *
+ * We deliberately do NOT just check `stringWidth(text) === text.length`:
+ * Vietnamese precomposed letters like "ề" (U+1EC1) report width 1 and
+ * length 1 but are still produced by IME compositions and must not be
+ * fast-echoed.
+ */
+export function canFastAppendShape(
+  current: string,
+  cursor: number,
+  text: string,
+  columns: number,
+  currentLineWidth: number
+): boolean {
+  if (cursor !== current.length) {
+    return false
+  }
+  if (current.length === 0) {
+    return false
+  }
+  if (current.includes('\n')) {
+    return false
+  }
+  if (!ASCII_PRINTABLE_RE.test(text)) {
+    return false
+  }
+  return currentLineWidth + text.length < Math.max(1, columns)
+}
+/**
+ * Pure shape-only precondition for the fast-echo backspace path.
+ *
+ * Same reasoning as canFastAppendShape: only allow the direct
+ * "\b \b" stdout shortcut when the deleted grapheme is pure printable
+ * ASCII. Anything else (combining marks, IME compositions, wide chars,
+ * tabs, ANSI fragments) goes through the normal render path so Ink can
+ * recompute cell widths.
+ */
+export function canFastBackspaceShape(current: string, cursor: number): boolean {
+  if (cursor !== current.length) {
+    return false
+  }
+  if (cursor <= 0) {
+    return false
+  }
+  if (current.includes('\n')) {
+    return false
+  }
+  const removed = current.slice(prevPos(current, cursor), cursor)
+  return ASCII_PRINTABLE_RE.test(removed)
+}
 function renderWithCursor(value: string, cursor: number) {
   const pos = Math.max(0, Math.min(cursor, value.length))
@@ -444,26 +522,11 @@ export function TextInput({
   const canFastEchoBase = () => focus && termFocus && !selected && !mask && !!stdout?.isTTY
-  const canFastAppend = (current: string, cursor: number, text: string) => {
-    const sw = stringWidth(text)
+  const canFastAppend = (current: string, cursor: number, text: string) =>
+    canFastEchoBase() && canFastAppendShape(current, cursor, text, columns, lineWidthRef.current)
-    return (
-      canFastEchoBase() &&
-      cursor === current.length &&
-      current.length > 0 &&
-      !current.includes('\n') &&
-      sw === text.length &&
-      lineWidthRef.current + sw < Math.max(1, columns)
-    )
-  }
-  const canFastBackspace = (current: string, cursor: number) => {
-    if (!canFastEchoBase() || cursor !== current.length || cursor <= 0 || current.includes('\n')) {
-      return false
-    }
-    return stringWidth(current.slice(prevPos(current, cursor), cursor)) === 1
-  }
+  const canFastBackspace = (current: string, cursor: number) =>
+    canFastEchoBase() && canFastBackspaceShape(current, cursor)
   const commit = (
     next: string,