fix(tui): anchor splitReasoning unclosed-tag regex to start of input (#29426)

`splitReasoning()` strips paired `<think>…</think>` blocks first, then runs an unclosed-trailing regex to catch reasoning that hasn't yet streamed its closer. That second regex was unanchored and greedy: new RegExp(`<${tag}>([\\s\\S]*)$`, 'i') So any literal `<think>` somewhere in prose — a model quoting the tag, a code example, or a stream-mid-tag before the closer arrives — consumed every paragraph after it to EOF. User-visible symptom: "TUI eats last paragraph of output," both during streaming and on settled turns. Real reasoning streams always lead the message (that's the only place an unclosed opener can legitimately appear during streaming). Anchor the regex to `^\s*` so mid-prose mentions of the tag are preserved. Empirical repro before the fix: splitReasoning('final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.') → text: 'final answer paragraph one.' ← paragraph two GONE After: → text: 'final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.' Updated the existing trailing-unclosed test to lead with `<think>` (the real-world shape) and added a regression test pinning the mid-text case. ui-tui type-check clean, 808/808 vitest pass.
2026-06-05 07:41:39 +00:00 · 2026-05-20 14:09:38 -05:00 · 2026-05-20 14:09:38 -05:00 · 88f5186d35
commit 88f5186d35
parent eeb747de25
2 changed files with 24 additions and 4 deletions
--- a/ui-tui/src/tests/reasoning.test.ts
+++ b/ui-tui/src/tests/reasoning.test.ts
@ -21,11 +21,26 @@ describe('splitReasoning', () => {
    expect(text).toBe('body')
  })

-  it('treats unclosed trailing <think>… as reasoning', () => {
-    const { reasoning, text } = splitReasoning('answer start <think>still deciding')
+  it('treats unclosed leading <think>… as reasoning (real reasoning-model stream)', () => {
+    const { reasoning, text } = splitReasoning('<think>still deciding')

    expect(reasoning).toBe('still deciding')
-    expect(text).toBe('answer start')
+    expect(text).toBe('')
+  })
+
+  it('does not strip trailing prose after a stray mid-text <think> mention', () => {
+    // Regression for "TUI eats last paragraph of output": when the model
+    // emits a literal `<think>` somewhere in prose (quoted explanation, code
+    // example, partial stream-mid-tag), the trailing greedy unclosed-tag
+    // regex used to consume every paragraph after it. Real unclosed
+    // reasoning blocks always lead the message — anchor to ^ so prose
+    // mentions are preserved.
+    const { reasoning, text } = splitReasoning(
+      'final answer paragraph one.\n\n<think>internal note never closed\n\nfinal answer paragraph two.'
+    )
+
+    expect(reasoning).toBe('')
+    expect(text).toBe('final answer paragraph one.\n\n<think>internal note never closed\n\nfinal answer paragraph two.')
  })

  it('returns empty reasoning and untouched text when no tags present', () => {
--- a/ui-tui/src/lib/reasoning.ts
+++ b/ui-tui/src/lib/reasoning.ts
@ -21,7 +21,12 @@ export function splitReasoning(input: string): SplitReasoning {
      return ''
    })

-    const unclosed = new RegExp(`<${tag}>([\\s\\S]*)$`, 'i')
+    // Anchor to start-of-input so a literal `<think>` mid-prose (model quoting
+    // the word, code blocks containing the tag, etc.) doesn't eat every
+    // paragraph after it. Real unclosed reasoning blocks always lead the
+    // message — that's how reasoning models stream. See test
+    // "does not strip trailing prose after a stray mid-text <think> mention".
+    const unclosed = new RegExp(`^\\s*<${tag}>([\\s\\S]*)$`, 'i')
    text = text.replace(unclosed, (_m, inner: string) => {
      const trimmed = inner.trim()