fix(tui): voice TTS speak-back + transcript-key bug + auto-submit

Three issues surfaced during end-to-end testing of the CLI-parity voice loop and are fixed together because they all blocked "speak → agent responds → TTS reads it back" from working at all: 1. Wrong result key (hermes_cli/voice.py) transcribe_recording() returns {"success": bool, "transcript": str}, matching cli.py:_voice_stop_and_transcribe. The wrapper was reading result.get("text"), which is None, so every successful Groq / local STT response was thrown away and the 3-strikes halt fired after three silent-looking cycles. Fixed by reading "transcript" and also honouring "success" like the CLI does. Updated the loop simulation tests to return the correct shape. 2. TTS speak-back was missing (tui_gateway/server.py + hermes_cli/voice.py) The TUI had a voice.toggle "tts" subcommand but nothing downstream actually read the flag — agent replies never spoke. Mirrored cli.py:8747-8754's dispatch: on message.complete with status == "complete", if _voice_tts_enabled() is true, spawn a daemon thread running speak_text(response). Rewrote speak_text as a full port of cli.py:_voice_speak_response — same markdown-strip regex pipeline (code blocks, links, bold/italic, inline code, headers, list bullets, horizontal rules, excessive newlines), same 4000-char cap, same explicit mp3 output path, same MP3-over-OGG playback choice (afplay misbehaves on OGG), same cleanup of both extensions. Keeps TUI TTS audible output byte-for-byte identical to the classic CLI. 3. Auto-submit swallowed on non-empty composer (createGatewayEventHandler.ts) The voice.transcript handler branched on prev input via a setInput updater and fired submitRef.current inside the updater when prev was empty. React strict mode double-invokes state updaters, which would queue the submit twice; and when the composer had any content the transcript was merely appended — the agent never saw it. CLI _pending_input.put(transcript) unconditionally feeds the transcript as the next turn, so match that: always clear the composer and setTimeout(() => submitRef.current(text), 0) outside any updater. Side effect can't run twice this way, and a half-typed draft on the rare occasion is a fair trade vs. silently dropping the turn. Also added peak_rms to the rec.stop debug line so "recording too quiet" is diagnosable at a glance when HERMES_VOICE_DEBUG=1.
2026-04-25 00:51:20 +00:00 · 2026-04-24 01:27:19 +03:00 · 2026-04-24 01:27:19 +03:00 · 42ff785771
commit 42ff785771
parent 04c489b587
3 changed files with 84 additions and 38 deletions
--- a/ui-tui/src/app/createGatewayEventHandler.ts
+++ b/ui-tui/src/app/createGatewayEventHandler.ts
@ -301,19 +301,16 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
          return
        }

-        // Match CLI's _pending_input.put(transcript): auto-submit when the
-        // composer is empty, otherwise append so the user can keep editing
-        // a partial draft they were working on.
-        setInput(prev => {
-          if (!prev) {
-            // defer submit so React commits the state change first
-            setTimeout(() => submitRef.current(text), 0)
-
-            return ''
-          }
-
-          return `${prev}${/\s$/.test(prev) ? '' : ' '}${text}`
-        })
+        // CLI parity: _pending_input.put(transcript) unconditionally feeds
+        // the transcript to the agent as its next turn — draft handling
+        // doesn't apply because voice-mode users are speaking, not typing.
+        //
+        // We can't branch on composer input from inside a setInput updater
+        // (React strict mode double-invokes it, duplicating the submit).
+        // Just clear + defer submit so the cleared input is committed before
+        // submit reads it.
+        setInput('')
+        setTimeout(() => submitRef.current(text), 0)

        return
      }