From 20428f5e600cddb414ab1ec6152aabb0497d14e1 Mon Sep 17 00:00:00 2001 From: brooklyn! Date: Mon, 4 May 2026 15:49:28 -0700 Subject: [PATCH] fix(tui): respect voice.record_key config (supersedes #19028, #19339) (#19835) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(tui): respect voice.record_key config instead of hardcoded Ctrl+B Classic CLI loaded ``voice.record_key`` from config.yaml and bound the prompt-toolkit handler dynamically (``cli.py`` paths). The new TUI hard- coded ``Ctrl+B`` everywhere — ``isVoiceToggleKey`` (input handler), ``/voice status`` ("Record key: Ctrl+B"), and ``/voice on`` ("Ctrl+B to start/stop recording"). A user who set ``voice.record_key: ctrl+o`` (or any other key) saw the documented config silently ignored — only Ctrl+B worked, the displayed shortcut lied about it. Wire the configured key end to end through the existing channels: * **Backend** (``tui_gateway/server.py``): ``voice.toggle`` action=status AND action=on/off responses now include ``record_key``, sourced from ``config.get('voice', {}).get('record_key', 'ctrl+b')``. * **Backend types** (``ui-tui/src/gatewayTypes.ts``): ``ConfigFullResponse`` now exposes ``config.voice.record_key`` and ``VoiceToggleResponse`` carries ``record_key`` so the TUI can both bind and display it. * **Frontend parser/formatter** (``ui-tui/src/lib/platform.ts``): ``parseVoiceRecordKey()`` accepts ``ctrl+b`` / ``alt+r`` / ``cmd+space`` and the common aliases (``option``, ``cmd``, ``win``, …); falls back to the documented Ctrl+B for empty / multi-character / malformed input so a typo never silently disables the shortcut. ``formatVoiceRecordKey()`` renders for status text. ``isVoiceToggleKey`` now takes a parsed ``ParsedVoiceRecordKey`` argument; the hardcoded ``ch === 'b'`` is gone. Default arg keeps existing call sites back-compat. * **Hydration** (``ui-tui/src/app/useConfigSync.ts``, ``useMainApp.ts``): startup ``config.get full`` already runs; extract ``cfg.voice.record_key`` from it, parse, push into a new ``voiceRecordKey`` state, and forward to the input handler ctx (``InputHandlerContext.voice.recordKey``). Mtime-poll path also re-applies the parsed key so a hand-edit of config.yaml takes effect the next tick — matches existing behaviour for display options. * **Input handler** (``ui-tui/src/app/useInputHandlers.ts``): ``isVoiceToggleKey(key, ch, voice.recordKey)`` so the configured binding fires. * **Slash command** (``ui-tui/src/app/slash/commands/session.ts``): ``/voice status`` and ``/voice on`` use ``formatVoiceRecordKey`` on the response's ``record_key`` instead of the hardcoded label. Tests: * ``parseVoiceRecordKey`` covers ctrl/alt/cmd/super aliases, multi-char rejection, and empty fallback. * ``formatVoiceRecordKey`` covers the doc examples (``Ctrl+B``, ``Ctrl+O``, ``Alt+R``, ``Cmd+B``). * ``isVoiceToggleKey`` regression: ``ctrl+o`` configured → only ``o`` matches, not ``b``; ``alt+r`` matches both alt-bit and meta-bit encodings (terminal protocol parity); omitted-arg call still binds Ctrl+B for back-compat. Full TUI suite (555 tests) passes; ``tsc --noEmit`` clean. Fixes #18994 Co-authored-by: asheriif * fix(tui): support named-key tokens in voice.record_key (space, enter, …) Reviewer caught that the round-1 parser in #18994 rejected every multi-character token, so a config value like ``ctrl+space`` (which the CLI happily binds via prompt_toolkit's ``c-space`` rewrite in ``cli.py``) silently fell back to the documented Ctrl+B default — re-introducing the same false-shortcut bug the PR was meant to fix, just at a different surface. Add explicit named-key support that mirrors what the CLI accepts: * ``space`` (alias: ``spc``) → matches ``ch === ' '`` * ``enter`` (alias: ``return``, ``ret``) → matches ``key.return`` * ``tab`` → matches ``key.tab`` * ``escape`` (alias: ``esc``) → matches ``key.escape`` * ``backspace`` (alias: ``bs``) → matches ``key.backspace`` * ``delete`` (alias: ``del``) → matches ``key.delete`` ``ParsedVoiceRecordKey`` gains an optional ``named`` field; ``ch`` holds either a single char (back-compat) or the canonical named token, and the runtime matcher dispatches on ``named`` before checking the modifier shape. Aliases collapse to one canonical name so ``ctrl+esc`` and ``ctrl+escape`` behave identically. Unrecognised multi-character tokens (e.g. ``ctrl+spcae`` typo, or unsupported keys like ``ctrl+f5``) still fall back to the Ctrl+B default rather than silently disabling the binding — keeps the "typo never silently kills the shortcut" guarantee. Tests: * ``parseVoiceRecordKey`` parametrised over every named token + each alias variant. * New ``isVoiceToggleKey`` cases for space (ch-based match), enter (``key.return``), tab, escape, backspace, delete, including modifier-mismatch negatives. * ``formatVoiceRecordKey`` renders named keys in title case (``Ctrl+Space``, ``Ctrl+Enter``). * Existing fall-back-to-Ctrl+B contract preserved for empty input AND unrecognised multi-char tokens. Full TUI suite: 559/559 pass; ``tsc --noEmit`` clean. Refs #18994 (round-1 review feedback) Co-authored-by: asheriif * test(tui): assert voice.toggle returns configured record_key Salvage the backend regression from #19339 — asserts ``voice.toggle`` action=on AND action=status responses carry the configured ``voice.record_key`` end-to-end through ``_load_cfg()``. Keeps the CLI→TUI parity contract visible in the Python test suite alongside the existing frontend parser/matcher/formatter coverage from #19028. * fix(tui): address Copilot review on #19835 voice.record_key wiring Five tightenings on the parser + matcher + hydration surface, all caught by the Copilot review on the PR — each one turns a silent false-fire or display/binding skew into a deterministic behaviour. * **isVoiceToggleKey ctrl branch was too permissive for named keys.** The doc-default macOS Cmd+B muscle-memory fallback (``isActionMod(key)`` on top of ``key.ctrl``) fired for every configured key, so bare Esc — which hermes-ink reports with ``key.meta`` on some macOS terminals — triggered ``ctrl+escape``, and Alt+Space / Alt+Tab triggered ``ctrl+space`` / ``ctrl+tab``. Gate the fallback to the literal ``ctrl+b`` binding so any custom chord requires the real Ctrl bit. * **Alt branch guarded against Ctrl/Cmd co-press.** Without this, Ctrl+Alt+ and Cmd+Alt+ also fired ``alt+``. * **Dropped the ``meta`` modifier variant and its alias.** In hermes-ink ``key.meta`` is Alt on xterm-style terminals and Cmd on legacy macOS ones, so a literal ``meta+b`` config displayed as ``Cmd+B`` while matching Alt+B — exactly the kind of false shortcut the PR was meant to remove. ``cmd`` / ``command`` now collapse onto ``super`` (kitty-style ``key.super``, with a macOS ``key.meta`` fallback) and render as ``Cmd+B``. Unknown modifier tokens fall back to the documented Ctrl+B default rather than silently coercing to Ctrl. * **Slash-command display/binding skew.** ``/voice status`` and ``/voice on`` rendered from the fresh gateway ``record_key`` response, but ``useInputHandlers()`` still bound the old key until the next 5s mtime poll. Thread ``setVoiceRecordKey`` through ``SlashHandlerContext.voice`` and push the parsed spec into frontend state on every response so text and binding stay consistent. * **Test coverage for the two paths Copilot flagged.** Added vitest coverage for (a) the three-case ``/voice`` slash output in ``createSlashHandler.test.ts`` and (b) the ``applyDisplay → voice.record_key`` hydration + omit-setter back-compat paths in ``useConfigSync.test.ts``. Plus regression cases for every false-fire scenario above. Suite: 575/575 green, tsc --noEmit clean. * fix(tui): address Copilot round-2 review on #19835 Three tightenings on the surface introduced in the round-1 fix: * **``/voice tts`` reset custom bindings to Ctrl+B.** The ``tts`` branch of ``voice.toggle`` omitted ``record_key`` from its response, so the frontend's ``r.record_key ?? 'ctrl+b'`` coerced a user's custom binding back to the default on every TTS toggle. Two-sided fix: the backend now includes ``record_key`` on the ``tts`` branch (parity with ``status``/``on``/``off``), and the slash handler only pushes frontend state when the response actually carries ``record_key`` — belt-and-suspenders against any future branch forgetting to include it. * **``super+b`` / ``win+b`` / ``cmd+b`` displayed "Cmd+B" on Linux and Windows.** ``formatVoiceRecordKey`` rendered ``mod === 'super'`` as ``Cmd`` universally, which told non-mac users the wrong modifier to press even though ``isVoiceToggleKey`` matched the right event bits. Gate the label to ``isMac`` so non-mac renders ``Super+B``. * **``control+b`` / ``ctrl + b`` lost the macOS Cmd+B fallback.** ``_isDefaultVoiceKey`` keyed off ``parsed.raw`` — so semantically-equal aliases of the documented default dropped into the strict branch even though they bind Ctrl+B. Compare on the parsed spec (mod + ch + named) instead. Coverage added: Linux ``Super+B`` rendering (and macOS ``Cmd+B``), ``control+b`` / ``ctrl + b`` accepting the Cmd+B fallback on darwin, ``/voice tts`` without ``record_key`` not clobbering cached binding, and a backend regression asserting every ``voice.toggle`` branch carries the configured key. Suite: 579/579 TUI vitest green, 2/2 backend voice tests green, tsc --noEmit clean. * fix(tui): address Copilot round-3 review on #19835 Three classes of robustness issue caught on the second pass — all revolve around malformed YAML tipping ``parseVoiceRecordKey`` or ``_voice_record_key`` into a crash instead of the documented fallback. * **Parser crashed on non-string YAML scalars.** ``config.get full`` returns raw ``yaml.safe_load`` output, so ``voice.record_key: 1`` or ``voice.record_key: true`` in a hand-edited config would hit ``.trim()`` on a number/bool and throw, breaking startup and every mtime re-apply. Accept ``unknown`` at the signature, guard with ``typeof raw !== 'string'``, and fall back to the default. * **Backend blew up on non-dict ``voice:``.** Same YAML hazard on the gateway side: ``voice: true`` / ``voice: cmd+b`` left ``_load_cfg().get("voice")`` as a bool/str, so ``.get("record_key")`` raised AttributeError and took every ``voice.toggle`` branch down with it. Centralised the lookup in a single ``_voice_record_key()`` helper that ``isinstance``-guards both ``voice`` and ``record_key`` and falls back to ``ctrl+b``. * **Multi-modifier chords silently dropped extras.** The previous validator only checked the first modifier token, so ``ctrl+alt+r`` silently parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as ``super+b`` — a typo bound a different shortcut than the user configured. Reject multi-modifier spellings outright; the classic CLI only supports single-modifier bindings via prompt_toolkit's ``c-x`` / ``a-x`` rewrite, so this matches CLI parity. Coverage added: * ``parseVoiceRecordKey`` fallback on ``1`` / ``true`` / ``null`` / ``undefined`` / ``{}``. * ``parseVoiceRecordKey`` fallback on ``ctrl+alt+r`` / ``cmd+ctrl+b`` / ``alt+ctrl+space``. * ``test_voice_toggle_handles_non_dict_voice_cfg`` exercises every non-dict ``voice:`` shape (bool, str, None, int, list) and asserts each falls back to ``record_key: 'ctrl+b'``. Suite: 581/581 TUI vitest green, 3/3 backend voice tests green, tsc --noEmit clean. * fix(tui): address Copilot round-4 review on #19835 Four final corners of the voice.record_key surface: * **Bare-char configs silently coerced to ``ctrl+``.** A config like ``voice.record_key: o`` / ``space`` / ``escape`` fell through to the default ``mod = 'ctrl'`` and silently bound Ctrl+O, while the classic CLI's prompt_toolkit would bind the raw key (no rewrite) — so the two runtimes silently disagreed on what "o" means. Require an explicit modifier; bare-char configs fall back to the documented Ctrl+B default. * **Reserved ctrl+ bindings would never fire.** ``useInputHandlers()`` intercepts ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), and ``ctrl+l`` (clear screen) before the voice check runs, so those configs would be advertised in /voice status but the advertised shortcut never actually triggers push-to-talk. Added ``_RESERVED_CTRL_CHARS`` at parse time so the user gets the documented default instead of a dead shortcut. (``alt+c``, ``cmd+l``, etc. are not intercepted and stay usable.) * **``_load_cfg()`` root itself may be a non-dict.** ``_voice_record_key()`` isinstance-guarded the ``voice`` subkey but not the root — a malformed config.yaml that collapsed to a scalar/list at the top level (``config.yaml: true`` or ``[]``) would still raise on ``.get("voice")``. Added the top-level guard too so every malformed shape falls back to ``ctrl+b``. * **Stale header comment on ``isVoiceToggleKey``.** The doc-comment still claimed "On macOS we additionally accept the platform action modifier (Cmd) for the configured letter" even though the implementation gates the Cmd fallback to the documented default only. Rewrote to match. Coverage added: * ``parseVoiceRecordKey`` fallback on bare chars (``o``, ``b``, ``space``, ``escape``). * ``parseVoiceRecordKey`` fallback on ``ctrl+c`` / ``ctrl+d`` / ``ctrl+l``; positive case for ``alt+c`` / ``cmd+l`` still usable. * Backend ``test_voice_toggle_handles_non_dict_voice_cfg`` now exercises 5 non-dict shapes at the YAML root too. Suite: 583/583 TUI vitest green, 3/3 backend voice tests green, tsc --noEmit clean. * fix(tui): address Copilot round-5 review on #19835 Three follow-ups on the voice matcher's modifier + shift discipline: * **``super`` branch falsely fired on Alt+ / bare Esc on macOS.** ``isVoiceToggleKey`` accepted ``isMac && key.meta`` as a Cmd fallback for the ``super`` modifier — but hermes-ink sets ``key.meta`` for plain Alt/Option AND for bare Escape on some macOS terminals. A ``cmd+b`` config silently fired on Alt+B; ``cmd+space`` on Alt+Space; ``cmd+escape`` on bare Esc. Drop the fallback and require the literal ``key.super`` bit. Legacy- terminal users who need Cmd should upgrade to a kitty-protocol terminal or bind ``alt+X`` explicitly. * **Shift bit was never checked.** The parser rejects multi- modifier configs like ``ctrl+shift+tab``, but the runtime matcher didn't check ``key.shift`` — so ``ctrl+tab`` also fired on Ctrl+Shift+Tab and ``alt+enter`` on Alt+Shift+Enter. Early-return on ``key.shift === true`` so the runtime only fires the exact chord the user configured. * **Test leaked ``HERMES_VOICE=1`` into later tests.** ``voice.toggle`` action=on writes to ``os.environ`` directly (CLI parity, runtime-only flag); ``test_voice_toggle_returns_ configured_record_key`` dispatched action=on without letting monkeypatch take ownership of the var first. Any later test that read voice mode in the same Python process could inherit a stale enabled state. Added ``monkeypatch.setenv("HERMES_VOICE", "0")`` up front so monkeypatch restores the original value at teardown. Coverage added: * ``cmd+b`` / ``cmd+space`` / ``cmd+escape`` do NOT fire on ``key.meta``-only events on darwin. * ``ctrl+tab`` / ``alt+enter`` / ``ctrl+o`` reject matches when ``key.shift`` is held; sanity cases without Shift still fire. Suite: 585/585 TUI vitest green, 3/3 backend voice tests green, tsc --noEmit clean. * fix(tui): address Copilot round-6 review on #19835 Three classes of modifier-discipline tightening + one config-surface honesty fix: * **Default ``ctrl+b`` Cmd fallback leaked Alt+B.** The default's macOS Cmd+B muscle-memory path used ``isActionMod(key)``, which returns ``key.meta || key.super`` on darwin. hermes-ink also reports plain Alt as ``key.meta``, so Alt+B silently fired the default binding. Replaced with strict ``isMac && key.super === true`` — kitty-style Cmd+B still works, Alt+B correctly rejected. Legacy-terminal mac users (Terminal.app without CSI-u) now get raw Ctrl+B only; the documented default still works everywhere. * **ctrl / super branches accepted extra modifier bits.** The parser rejects multi-modifier configs like ``ctrl+alt+o``, but the runtime matcher was permissive — ``ctrl+o`` fired on Ctrl+Alt+O / Ctrl+Cmd+O, and ``super+b`` fired on Cmd+Alt+B / Ctrl+Cmd+B. Added strict ``!key.alt && !key.meta && key.super !== true`` on ctrl, and ``!key.ctrl && !key.alt && !key.meta`` on super, so the runtime only fires the exact chord the parser would let you configure. * **Dropped ``cmd`` / ``command`` aliases.** They parsed to ``super`` and rendered as ``Cmd+X``, but legacy macOS terminals report Cmd as ``key.meta`` (same signal as Alt), so a ``cmd+o`` config was advertised as working but never actually fired on Terminal.app-without-CSI-u. That recreated the "displayed shortcut does not work" problem this PR was meant to remove. Users who want the platform action modifier spell it ``super`` / ``win`` — that matches the unambiguous ``key.super`` bit, and kitty-style macOS terminals render it as ``Cmd+X`` via platform-aware formatter. Coverage updated: * Default ctrl+b no longer fires on Alt+B via ``key.meta`` leak; raw Ctrl+B and kitty-style Cmd+B still fire. * ``ctrl+o`` rejects Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O chords. * ``super+b`` rejects Cmd+Alt+B / Cmd+Meta+B / Ctrl+Cmd+B chords. * ``cmd+b`` / ``command+b`` / ``meta+b`` all fall back to the documented default at parse time (joined the ambiguous-mac-mod rejection class). * Round-2 expectations that asserted ``cmd+b`` parsed as super and accepted ``key.meta`` on darwin updated to reflect the new stricter contract. Suite: 588/588 TUI vitest green, 3/3 backend voice tests green, tsc --noEmit clean. * fix(tui): address Copilot follow-up on wire typing + escape precedence Two follow-ups from the latest Copilot pass: * **Config wire typing honesty (`gatewayTypes.ts`)** `config.get full` forwards raw `yaml.safe_load()` output, so `voice.record_key` can be any scalar/container when hand-edited. Typing it as `string` suggests a normalized contract that the backend does not guarantee and makes unsafe callers more likely. Change `ConfigVoiceConfig.record_key` to `unknown` with an explicit comment that callers must normalize at runtime. * **Escape-based voice bindings were swallowed before voice check** `useInputHandlers()` handled `key.escape` for queue-edit cancel and selection clear before `isVoiceToggleKey(...)`, so configured `ctrl+escape` / `alt+escape` / `super+escape` chords were advertised but never toggled recording in those UI states. Add an early escape+voice check before generic Esc handlers so escape-based voice bindings win when configured, while plain Esc behavior remains unchanged. Also updated PR #19835 description text to remove stale cmd/command alias claims and match the current parser contract. * fix(tui): pass configured voice shortcut through TextInput layer Thread the live parsed voiceRecordKey into TextInput so configured voice.record_key chords bubble to useInputHandlers instead of being consumed as editor input. This removes the last hardcoded Ctrl+B pass-through in the composer path while preserving existing global control chord behavior. * fix(tui): require explicit alt bit for escape-based alt chords Hermes-ink reports bare Escape as meta=true+escape=true on some terminals, so a configured alt+escape binding was firing on bare Esc. Require an explicit key.alt bit when the configured named key is escape so plain Esc stays plain Esc; kitty-style alt+escape still fires. * fix(tui): harden voice.record + TextInput paste + super-mod reserved list Three round-7 Copilot follow-ups on #19835: - voice.record start handler used _load_cfg().get('voice', {}).get(...) without shape checks, so malformed YAML (bool/scalar/list) returned 5025 instead of using VAD defaults. Centralized _voice_cfg_dict() helper and type-guarded silence_threshold/silence_duration with numeric fallbacks. - TextInput pass-through check moved above paste/copy handling so configured voice chords (ctrl+v / alt+v / cmd+v) beat the composer's paste/copy defaults. - parser now also rejects super+{c,d,l,v} — on macOS those are copy/exit/clear/paste and would be advertised in /voice status but never actually toggle recording. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(tui): round-8 Copilot review — allow ctrl+x, gate super reservations to macOS, preserve voice key on transient RPC failure Three round-8 Copilot follow-ups on #19835: - Revert ctrl+x addition to _RESERVED_CTRL_CHARS (landed via Copilot Autofix commit 731ec86): ctrl+x is only claimed during queue-edit (queueEditIdx !== null), so voice works the rest of the session and matches CLI ctrl+ parity. - Gate super+{c,d,l,v} reservation to isMac. Linux/Windows TUI globals key off Ctrl, so kitty/CSI-u super+ configs don't collide on non-mac and should stay usable. - applyDisplay() now skips setVoiceRecordKey when cfg is null so one transient quietRpc() failure after a config edit doesn't clobber the cached binding back to Ctrl+B until the next successful poll. New coverage: - parseVoiceRecordKey preserves ctrl+x on linux - super+{c,d,l,v} rejected on darwin, allowed on linux - applyDisplay(null, ...) leaves voiceRecordKey untouched * fix(cli,tui): normalize voice.record_key aliases across CLI + TUI for parity Round-9 Copilot review on #19835: TUI accepted control+/option+/opt+/super+/win+ aliases but the classic CLI only rewrote literal ctrl+/alt+ before handing to prompt_toolkit, so a TUI-valid config silently bound a different (or no) shortcut in the CLI. - Added normalize_voice_record_key_for_prompt_toolkit() in hermes_cli/voice.py with a single alias table (ctrl/control/alt/option/opt → c-/a-). - Wired it into all three cli.py sites (_enable_voice_mode hint, _show_voice_status display, and the prompt_toolkit binding in _register_voice_handler). - /voice status display now renders control+x as Ctrl+X and option+x as Alt+X (canonical casing) to match TUI formatVoiceRecordKey. - super/win/windows are intentionally left unchanged: prompt_toolkit has no super modifier, so the CLI will reject them loudly at startup rather than silently binding Ctrl+B. Documented this split at both the TUI _MOD_ALIASES comment and the CLI normalizer docstring. - Added tests covering ctrl/control/alt/option/opt mapping, case-insensitivity, non-string fallback, empty-string fallback, and super/win pass-through. * fix(cli): port TUI parser contract into CLI voice.record_key normalizer Round-10 Copilot review on #19835. hermes_cli/voice.py's normalize_voice_record_key_for_prompt_toolkit() previously did blind substring replacement with no trim/validate step, so the CLI diverged from the TUI parser on: - whitespace ('ctrl + b' -> 'c- b' instead of 'c-b') - typoed named keys ('ctrl+spcae' passed through as 'c-spcae' and prompt_toolkit would reject at startup) - bare-char configs ('o' should fall back, not pass through as 'o') - multi-modifier chords ('ctrl+alt+r') - reserved ctrl chars ('ctrl+c/d/l') - unknown modifiers ('meta+b' / 'shift+b') - named-key aliases ('return'/'esc'/'bs'/'del' not collapsed to prompt_toolkit canonicals) Port the TUI parser contract into Python (_VOICE_MOD_ALIASES, _VOICE_NAMED_KEYS, _VOICE_RESERVED_CTRL_CHARS) so one config value binds the same shortcut in both runtimes. Also added format_voice_record_key_for_status() shared between the PTT hint and /voice status display. Non-string scalars (voice.record_key: true / 1) now surface as 'Ctrl+B' instead of the raw scalar — /voice status no longer advertises a shortcut that can never bind. Tests: 29/29 in test_voice_wrapper.py, including 11 new regressions covering whitespace, named-key aliases, typos, bare-char, multi-modifier, reserved ctrl, unknown mods, non-string fallback, and formatter contract. * fix(cli): shape-safe voice config read + graceful super/win fallback Round-11 Copilot review on #19835. Two remaining cross-runtime gaps: 1. load_config().get('voice', {}) still assumed voice was a dict, so a hand-edited voice: true / voice: cmd+b at the top level raised AttributeError before the voice UI could start. Added voice_record_key_from_config(cfg) to hermes_cli/voice.py that isinstance-guards both the root and the voice subkey. All three cli.py read sites (_enable_voice_mode hint, _show_voice_status, PTT binding) now use it. 2. The CLI normalizer previously passed super+/win+/windows+ through unrewritten so prompt_toolkit would reject them loudly at startup — but that crash was a worse UX than a silent fallback. Normalizer now returns c-b for those spellings, and the PTT binding site logs a warning so users see why their TUI-only shortcut isn't binding in the CLI. Coverage: 34/34 in tests/hermes_cli/test_voice_wrapper.py (5 new cases for voice_record_key_from_config + malformed-root + malformed-voice + extractor/normalizer composition). * fix(cli): self-audit cleanup — remaining voice-config shape safety + doc drift Self-review of the voice.record_key change set turned up four remaining items Copilot would very likely flag next round: 1. cli.py _voice_start_continuous still read load_config().get('voice', {}).get('silence_threshold') without an isinstance guard, so a hand-edited voice: true / voice: cmd+b (non-dict) raised AttributeError on VAD recording start. Shape-safe coerce the voice dict and numeric-guard silence_threshold/silence_duration. 2. cli.py _enable_voice_mode's auto_tts check had the same bug — fixed with the same isinstance guard. 3. hermes_cli/voice.py module comment on _VOICE_MOD_ALIASES still said super/win/windows 'pass through unchanged and prompt_toolkit's add() call loudly rejects them at startup'. Round 11 changed the normalizer to silently fall back to c-b with a warning at the binding site; updated the comment to match. 4. ui-tui/src/lib/platform.ts header comment had the same stale 'CLI will loudly reject them at startup' claim; updated to 'falls back to the documented default and logs a warning'. No behavior change on the code paths already covered by test_voice_wrapper.py; the two cli.py fixes are defensive against malformed YAML that previous rounds already hardened in tui_gateway/server.py but missed in the classic CLI. * fix(cli,tui): round-12 Copilot review — alt-collide on mac, bool-in-int guards, voice UI hardcodes, mtime-reload test Five round-12 Copilot review items on #19835: 1. platform.ts: hermes-ink reports Alt as key.meta on many terminals; isActionMod on darwin accepts key.meta as the action modifier. So alt+c/d/l get claimed by isCopyShortcut / isAction('d')/'l') before the voice check. Reject those configs at parse time on macOS only (non-mac keeps them usable). 2. cli.py: four remaining hardcoded 'Ctrl+B' sites in voice-facing UI (_get_voice_status_fragments status bar, _voice_start_recording hints, _get_placeholder composer text) were still lying about non-default configs. Added self._voice_record_key_label() shared helper and wired it into all three sites. 3. server.py + cli.py: bool is a subclass of int, so isinstance(silence_threshold, (int, float)) accepted True/False from malformed YAML and forwarded 1/0 to the VAD engine. Exclude bool explicitly so boolean typos fall back to the documented 200 / 3.0 defaults. 4. useConfigSync.ts: extracted the config.get-full fetch+apply body into a shared hydrateFullConfig() helper. Both the initial hydration and mtime-reload paths now use it, so the polling/RPC wiring is exercised by direct unit tests (4 new cases: fresh apply, reapply on new value, transient RPC failure preserves cache, back-compat without voice setter). 5. Added alt+{c,d,l} rejection regressions on darwin + allow on linux, and bool-leak regressions for both silence_threshold and silence_duration in tests/test_tui_gateway_server.py. Suite: 602/602 TUI vitest, 38/38 backend voice tests, typecheck + lints clean. * fix(cli): cache voice record-key label at binding time + status-bar coverage Round-13 Copilot review on #19835. _voice_record_key_label() was reading live config on every render, which caused two problems: 1. prompt_toolkit registers the push-to-talk binding once at session start (@kb.add(_voice_key)); the binding does NOT re-read config. Editing voice.record_key mid-session would switch the status-bar / placeholder / recording-hint label to the new shortcut while the actual keybinding stayed on the startup chord — reintroducing the display/binding drift this whole PR is fighting. 2. Hot render path: during recording the UI is invalidated every 150ms, so re-loading + deep-merging config on every call added avoidable UI overhead. Fix: cache the label at the same site that registers the prompt_toolkit binding via new set_voice_record_key_cache(raw_key). _voice_record_key_label() now just returns the cached value (falls back to 'Ctrl+B' before startup). Status/placeholder/hint are always in sync with the live binding; no config reload per render. Also added 4 regression cases to tests/cli/test_cli_status_bar.py: configured ctrl+ renders in both wide and compact status bars, configured named key (ctrl+space) renders in the recording hint, pre-startup absent cache falls back to Ctrl+B, and malformed configs (bool True) fall through the formatter to Ctrl+B. Suite: 60/60 test_cli_status_bar + test_voice_wrapper, typecheck + lints clean. * fix(cli): route /voice on + /voice status through startup-pinned label; mac alt+cdl parity Round-14 Copilot review on #19835. All three comments legit: 1. _enable_voice_mode still formatted label from live load_config() — mid-session config edit would make /voice on announce the new shortcut while the prompt_toolkit binding stayed the startup chord. Use self._voice_record_key_label() (cached at binding time, round-13) so /voice on cannot drift from the live binding. 2. _show_voice_status had the same bug — /voice status reported live config instead of the pinned startup binding. Fixed the same way. 3. CLI normalizer accepted alt+c/alt+d/alt+l even though the TUI parser rejects them on macOS (Copilot round-12 — hermes-ink reports Alt as key.meta, isActionMod on darwin accepts it, collides with isCopyShortcut / isAction). Added _VOICE_RESERVED_ALT_CHARS_MAC = {c,d,l} gated to sys.platform == 'darwin' so a shared config like option+c falls back to c-b on both runtimes on macOS; non-mac still binds a-c. Coverage: 4 new tests in test_voice_wrapper.py covering mac alt+cdl rejection, linux alt+cdl allowed, option/opt alias forms, and mac-specific exclusions for other alt letters. 62/62 in voice wrapper + status bar suites. --------- Co-authored-by: Tranquil-Flow Co-authored-by: asheriif Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --- cli.py | 149 ++++-- hermes_cli/voice.py | 186 ++++++++ tests/cli/test_cli_status_bar.py | 62 +++ tests/hermes_cli/test_voice_wrapper.py | 237 +++++++++ tests/test_tui_gateway_server.py | 167 +++++++ tui_gateway/server.py | 68 ++- .../src/__tests__/createSlashHandler.test.ts | 61 ++- ui-tui/src/__tests__/platform.test.ts | 451 +++++++++++++++++- .../__tests__/textInputPassThrough.test.ts | 43 ++ ui-tui/src/__tests__/useConfigSync.test.ts | 140 +++++- ui-tui/src/app/interfaces.ts | 4 + ui-tui/src/app/slash/commands/session.ts | 31 +- ui-tui/src/app/useConfigSync.ts | 61 ++- ui-tui/src/app/useInputHandlers.ts | 16 +- ui-tui/src/app/useMainApp.ts | 13 +- ui-tui/src/components/appLayout.tsx | 1 + ui-tui/src/components/textInput.tsx | 50 +- ui-tui/src/gatewayTypes.ts | 9 +- ui-tui/src/lib/platform.ts | 362 +++++++++++++- 19 files changed, 2027 insertions(+), 84 deletions(-) create mode 100644 ui-tui/src/__tests__/textInputPassThrough.test.ts diff --git a/cli.py b/cli.py index eefd14fe47..e5af3c88f2 100644 --- a/cli.py +++ b/cli.py @@ -2579,23 +2579,59 @@ class HermesCLI: return f" {txt} ({elapsed_str})" return f" {txt}" + def _voice_record_key_label(self) -> str: + """Return the configured voice push-to-talk key formatted for UI. + + Shared helper so every voice-facing status line / placeholder / + recording hint advertises the SAME label as the registered + prompt_toolkit binding. + + Cached at startup (see ``set_voice_record_key_cache``) rather + than re-read per render. Two reasons (Copilot round-13 on + #19835): + + * The prompt_toolkit binding is registered once at session + start via ``@kb.add(_voice_key)``; re-reading config per + render meant the status bar could advertise a new shortcut + after a config edit while the actual binding was still the + startup chord — exactly the display/binding drift this PR + is trying to eliminate. + * The label is on the hot render path (status bar + composer + placeholder invalidated every 150ms during recording), so + reading config on every call added avoidable UI overhead. + """ + return getattr(self, "_voice_record_key_display_cache", None) or "Ctrl+B" + + def set_voice_record_key_cache(self, raw_key: object) -> None: + """Populate the voice label cache from a raw ``voice.record_key``. + + Called at CLI startup after the prompt_toolkit binding is + registered so the cached label always matches the live binding. + """ + try: + from hermes_cli.voice import format_voice_record_key_for_status + self._voice_record_key_display_cache = format_voice_record_key_for_status(raw_key) + except Exception: + self._voice_record_key_display_cache = "Ctrl+B" + def _get_voice_status_fragments(self, width: Optional[int] = None): """Return the voice status bar fragments for the interactive TUI.""" width = width or self._get_tui_terminal_width() compact = self._use_minimal_tui_chrome(width=width) + label = self._voice_record_key_label() if self._voice_recording: if compact: return [("class:voice-status-recording", " ● REC ")] - return [("class:voice-status-recording", " ● REC Ctrl+B to stop ")] + return [("class:voice-status-recording", f" ● REC {label} to stop ")] if self._voice_processing: if compact: return [("class:voice-status", " ◉ STT ")] return [("class:voice-status", " ◉ Transcribing... ")] if compact: - return [("class:voice-status", " 🎤 Ctrl+B ")] + return [("class:voice-status", f" 🎤 {label} ")] tts = " | TTS on" if self._voice_tts else "" cont = " | Continuous" if self._voice_continuous else "" - return [("class:voice-status", f" 🎤 Voice mode{tts}{cont} — Ctrl+B to record ")] + return [("class:voice-status", f" 🎤 Voice mode{tts}{cont} — {label} to record ")] def _build_status_bar_text(self, width: Optional[int] = None) -> str: """Return a compact one-line session status string for the TUI footer.""" @@ -8270,20 +8306,38 @@ class HermesCLI: return self._voice_recording = True - # Load silence detection params from config - voice_cfg = {} + # Load silence detection params from config. Shape-safe: a + # hand-edited ``voice: true`` / ``voice: cmd+b`` leaves + # ``load_config()['voice']`` as a non-dict; coerce to {} so + # continuous recording falls back to the documented defaults + # instead of crashing on ``.get()``. + voice_cfg: dict = {} try: from hermes_cli.config import load_config - voice_cfg = load_config().get("voice", {}) + _cfg = load_config().get("voice") + voice_cfg = _cfg if isinstance(_cfg, dict) else {} except Exception: pass if self._voice_recorder is None: self._voice_recorder = create_audio_recorder() - # Apply config-driven silence params - self._voice_recorder._silence_threshold = voice_cfg.get("silence_threshold", 200) - self._voice_recorder._silence_duration = voice_cfg.get("silence_duration", 3.0) + # Apply config-driven silence params (numeric-guarded so YAML + # scalar corruption doesn't break recording start-up). + # + # ``bool`` is explicitly excluded from the numeric check — in + # Python bool is a subclass of int, so a hand-edited + # ``silence_threshold: true`` would otherwise be forwarded as + # ``1`` instead of falling back to the 200 default (Copilot + # round-12 on #19835). + _threshold = voice_cfg.get("silence_threshold") + _duration = voice_cfg.get("silence_duration") + self._voice_recorder._silence_threshold = ( + _threshold if isinstance(_threshold, (int, float)) and not isinstance(_threshold, bool) else 200 + ) + self._voice_recorder._silence_duration = ( + _duration if isinstance(_duration, (int, float)) and not isinstance(_duration, bool) else 3.0 + ) def _on_silence(): """Called by AudioRecorder when silence is detected after speech.""" @@ -8309,12 +8363,13 @@ class HermesCLI: with self._voice_lock: self._voice_recording = False raise + _label = self._voice_record_key_label() if getattr(self._voice_recorder, "supports_silence_autostop", True): - _recording_hint = "auto-stops on silence | Ctrl+B to stop & exit continuous" + _recording_hint = f"auto-stops on silence | {_label} to stop & exit continuous" elif _is_termux_environment(): - _recording_hint = "Termux:API capture | Ctrl+B to stop" + _recording_hint = f"Termux:API capture | {_label} to stop" else: - _recording_hint = "Ctrl+B to stop" + _recording_hint = f"{_label} to stop" _cprint(f"\n{_ACCENT}● Recording...{_RST} {_DIM}({_recording_hint}){_RST}") # Periodically refresh prompt to update audio level indicator @@ -8559,10 +8614,12 @@ class HermesCLI: with self._voice_lock: self._voice_mode = True - # Check config for auto_tts + # Check config for auto_tts (shape-safe — malformed ``voice:`` YAML + # leaves ``voice_config`` as a non-dict, so guard before .get()). try: from hermes_cli.config import load_config - voice_config = load_config().get("voice", {}) + _raw_voice = load_config().get("voice") + voice_config = _raw_voice if isinstance(_raw_voice, dict) else {} if voice_config.get("auto_tts", False): with self._voice_lock: self._voice_tts = True @@ -8574,13 +8631,11 @@ class HermesCLI: # _voice_message_prefix property and its usage in _process_message(). tts_status = " (TTS enabled)" if self._voice_tts else "" - try: - from hermes_cli.config import load_config - _raw_ptt = load_config().get("voice", {}).get("record_key", "ctrl+b") - _ptt_key = _raw_ptt.lower().replace("ctrl+", "c-").replace("alt+", "a-") - except Exception: - _ptt_key = "c-b" - _ptt_display = _ptt_key.replace("c-", "Ctrl+").upper() + # Use the startup-pinned cache so the advertised shortcut always + # matches the live prompt_toolkit binding — reading live config + # here would drift after a mid-session config edit (Copilot + # round-14 on #19835, same class as round-13). + _ptt_display = self._voice_record_key_label() _cprint(f"\n{_ACCENT}Voice mode enabled{tts_status}{_RST}") _cprint(f" {_DIM}{_ptt_display} to start/stop recording{_RST}") _cprint(f" {_DIM}/voice tts to toggle speech output{_RST}") @@ -8637,7 +8692,6 @@ class HermesCLI: def _show_voice_status(self): """Show current voice mode status.""" - from hermes_cli.config import load_config from tools.voice_mode import check_voice_requirements reqs = check_voice_requirements() @@ -8646,9 +8700,11 @@ class HermesCLI: _cprint(f" Mode: {'ON' if self._voice_mode else 'OFF'}") _cprint(f" TTS: {'ON' if self._voice_tts else 'OFF'}") _cprint(f" Recording: {'YES' if self._voice_recording else 'no'}") - _raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b") - _display_key = _raw_key.replace("ctrl+", "Ctrl+").upper() if "ctrl+" in _raw_key.lower() else _raw_key - _cprint(f" Record key: {_display_key}") + # Display the startup-pinned label so /voice status always + # matches the live prompt_toolkit binding (Copilot round-14 on + # #19835, same class as round-13). Reading live config here + # would drift after a mid-session config edit. + _cprint(f" Record key: {self._voice_record_key_label()}") _cprint(f"\n {_BOLD}Requirements:{_RST}") for line in reqs["details"].split("\n"): _cprint(f" {line}") @@ -10622,15 +10678,44 @@ class HermesCLI: run_in_terminal(_suspend) # Voice push-to-talk key: configurable via config.yaml (voice.record_key) - # Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search) - # Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format. + # Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search). + # Config spellings (ctrl/control/alt/option/opt) are normalized to + # prompt_toolkit's c-x / a-x format via ``normalize_voice_record_key_for_prompt_toolkit`` + # so the same config value binds identically in the TUI and CLI + # (Copilot round-9 review on #19835). ``super``/``win``/``windows`` + # configs silently fall back to the default here since prompt_toolkit + # has no super modifier — log a warning so users notice the + # TUI/CLI split instead of a silent mismatch (round-11). + _raw_key: object = "ctrl+b" try: from hermes_cli.config import load_config - _raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b") - _voice_key = _raw_key.lower().replace("ctrl+", "c-").replace("alt+", "a-") + from hermes_cli.voice import ( + normalize_voice_record_key_for_prompt_toolkit, + voice_record_key_from_config, + ) + _raw_key = voice_record_key_from_config(load_config()) + _voice_key = normalize_voice_record_key_for_prompt_toolkit(_raw_key) + if ( + isinstance(_raw_key, str) + and _raw_key.strip().lower().split("+", 1)[0].strip() in {"super", "win", "windows"} + and _voice_key == "c-b" + ): + logger.warning( + "voice.record_key %r uses a TUI-only modifier (super/win); " + "CLI fell back to Ctrl+B. Use ctrl+ or alt+ for " + "cross-runtime parity.", + _raw_key, + ) except Exception: _voice_key = "c-b" + # Cache the UI label here — same ``_raw_key`` that drives the + # prompt_toolkit binding below. Every status / placeholder / + # recording-hint render reads this cached value so display can + # never drift from the live keybinding even if the user edits + # voice.record_key mid-session (Copilot round-13 on #19835). + self.set_voice_record_key_cache(_raw_key) + @kb.add(_voice_key) def handle_voice_record(event): """Toggle voice recording when voice mode is active. @@ -10933,7 +11018,8 @@ class HermesCLI: def _get_placeholder(): if cli_ref._voice_recording: - return "recording... Ctrl+B to stop, Ctrl+C to cancel" + _label = cli_ref._voice_record_key_label() + return f"recording... {_label} to stop, Ctrl+C to cancel" if cli_ref._voice_processing: return "transcribing..." if cli_ref._sudo_state: @@ -10953,7 +11039,8 @@ class HermesCLI: if cli_ref._agent_running: return "msg=interrupt · /queue · /bg · /steer · Ctrl+C cancel" if cli_ref._voice_mode: - return "type or Ctrl+B to record" + _label = cli_ref._voice_record_key_label() + return f"type or {_label} to record" return "" input_area.control.input_processors.append(_PlaceholderProcessor(_get_placeholder)) diff --git a/hermes_cli/voice.py b/hermes_cli/voice.py index 0a355ce4fa..f85f30c7bf 100644 --- a/hermes_cli/voice.py +++ b/hermes_cli/voice.py @@ -27,6 +27,192 @@ import sys import threading from typing import Any, Callable, Optional +# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``) +# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime +# mismatch Copilot flagged in round-9 on #19835. +# +# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit +# has no super/meta modifier for the Cmd key, so those spellings are +# TUI-only. The normalizer below returns the documented default +# (``c-b``) for them — a silent fallback was preferred to a hard +# startup crash (Copilot round-11). The CLI binding site +# (``_register_voice_handler`` in cli.py) logs a warning when that +# fallback fires so users see why their TUI-only shortcut isn't +# bound in the classic CLI. +_VOICE_MOD_ALIASES = { + "ctrl": "c-", + "control": "c-", + "alt": "a-", + "option": "a-", + "opt": "a-", +} + +# Named keys prompt_toolkit accepts in ``c-`` / ``a-`` form. +# Aliases collapse to prompt_toolkit's canonical spelling so the same +# config value binds identically in both runtimes (Copilot round-10 on +# #19835). +_VOICE_NAMED_KEYS = { + "space": "space", + "spc": "space", + "enter": "enter", + "return": "enter", + "ret": "enter", + "tab": "tab", + "escape": "escape", + "esc": "escape", + "backspace": "backspace", + "bs": "backspace", + "delete": "delete", + "del": "delete", +} + +# ``useInputHandlers()`` intercepts these before the voice check runs, +# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or +# ``ctrl+l`` (clear screen) would be advertised in /voice status but +# never fire push-to-talk — the same blocklist the TUI parser uses. +_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"}) + +# On macOS the classic CLI's prompt_toolkit bindings for copy / exit / +# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier +# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals. +# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc. +# don't bind Alt+C in the CLI while the TUI silently falls back to +# Ctrl+B (Copilot round-14 on #19835). +_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"}) + +_DEFAULT_PT_KEY = "c-b" + + +def voice_record_key_from_config(cfg: Any) -> Any: + """Shape-safe ``cfg.voice.record_key`` lookup. + + ``load_config()`` deep-merges raw YAML and preserves scalar + overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b`` + leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the + naive ``.get("voice", {}).get("record_key")`` chain raises + AttributeError before voice can even start (Copilot round-11 on + #19835). Return ``None`` for malformed shapes so call sites can + feed the result straight into the normalizer/formatter and get + the documented default. + """ + if not isinstance(cfg, dict): + return None + + voice = cfg.get("voice") + if not isinstance(voice, dict): + return None + + return voice.get("record_key") + + +def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str: + """Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format. + + Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``) + so one config value binds the same shortcut in both runtimes: + + * non-string / empty / typo'd / bare-char / multi-modifier / reserved + ``ctrl+c|d|l`` → documented default ``c-b`` + * single-char keys: ``ctrl+o`` → ``c-o`` + * named keys: ``ctrl+space`` → ``c-space`` (aliases collapse: + ``ctrl+return`` → ``c-enter``) + * ``super`` / ``win`` / ``windows`` → ``c-b`` (TUI-only modifiers — + prompt_toolkit has no super mod; the CLI binding site is + expected to warn when this fallback fires so users see the + cross-runtime split, Copilot round-11 on #19835) + """ + if not isinstance(raw, str): + return _DEFAULT_PT_KEY + + lowered = raw.strip().lower() + if not lowered: + return _DEFAULT_PT_KEY + + parts = [p.strip() for p in lowered.split("+") if p.strip()] + if not parts: + return _DEFAULT_PT_KEY + + # Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts + # in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse + # to the documented default instead of silently diverging. + if len(parts) > 2: + return _DEFAULT_PT_KEY + + # Bare char / bare named key (no explicit modifier) — the CLI's + # prompt_toolkit binds the raw key without a modifier, which the TUI + # parser refuses; reject here too so both runtimes agree. + if len(parts) == 1: + return _DEFAULT_PT_KEY + + modifier_token, key_token = parts + + # ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has + # no super modifier, so ``@kb.add(super+b)`` crashes the CLI at + # startup). Fall back to the documented default here; the CLI + # binding site is expected to log a warning when the configured + # value is one of these spellings so users know the TUI+CLI + # runtimes diverge on that shortcut (Copilot round-11 on #19835). + if modifier_token in {"super", "win", "windows"}: + return _DEFAULT_PT_KEY + + normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token) + if not normalized_mod: + return _DEFAULT_PT_KEY + + # Single-char key: reject reserved-ctrl chords that the TUI would + # also block at parse time, plus the mac-only alt reservation. + if len(key_token) == 1: + if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS: + return _DEFAULT_PT_KEY + if ( + normalized_mod == "a-" + and sys.platform == "darwin" + and key_token in _VOICE_RESERVED_ALT_CHARS_MAC + ): + return _DEFAULT_PT_KEY + return f"{normalized_mod}{key_token}" + + # Multi-char key token must be a known named key; typos like + # ``ctrl+spcae`` fall back to the default rather than being passed + # through as ``c-spcae`` (which prompt_toolkit would reject). + named = _VOICE_NAMED_KEYS.get(key_token) + if not named: + return _DEFAULT_PT_KEY + + return f"{normalized_mod}{named}" + + +def format_voice_record_key_for_status(raw: Any) -> str: + """Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form. + + Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` / + ``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the + documented default so status never advertises a shortcut that + won't bind (Copilot round-10 on #19835). + """ + normalized = normalize_voice_record_key_for_prompt_toolkit(raw) + + if normalized.startswith("c-"): + prefix, key = "Ctrl+", normalized[2:] + elif normalized.startswith("a-"): + prefix, key = "Alt+", normalized[2:] + elif "+" in normalized: + # ``super+`` / ``win+`` — CLI won't bind them, but + # render in title case so status output is still readable. + mod, key = normalized.split("+", 1) + prefix = mod[0].upper() + mod[1:] + "+" + else: + return "Ctrl+B" + + if not key: + return prefix.rstrip("+") + + if len(key) == 1: + return prefix + key.upper() + + return prefix + key[0].upper() + key[1:] + + from tools.voice_mode import ( create_audio_recorder, is_whisper_hallucination, diff --git a/tests/cli/test_cli_status_bar.py b/tests/cli/test_cli_status_bar.py index 4a65c6e467..f5c18bfc4d 100644 --- a/tests/cli/test_cli_status_bar.py +++ b/tests/cli/test_cli_status_bar.py @@ -266,6 +266,68 @@ class TestCLIStatusBar: assert fragments == [("class:voice-status-recording", " ● REC ")] + # Round-13 Copilot review regressions on #19835. The label in voice + # status bar / recording hint / placeholder must render the + # configured ``voice.record_key`` — not hardcoded Ctrl+B. Pinning + # the cache (``set_voice_record_key_cache``) keeps display in sync + # with the prompt_toolkit binding without re-reading config on + # every render. + def test_voice_status_bar_renders_configured_ctrl_letter(self): + cli_obj = _make_cli() + cli_obj._voice_mode = True + cli_obj._voice_recording = False + cli_obj._voice_processing = False + cli_obj._voice_tts = False + cli_obj._voice_continuous = False + cli_obj.set_voice_record_key_cache("ctrl+o") + + wide = cli_obj._get_voice_status_fragments(width=120) + assert any("Ctrl+O to record" in text for _cls, text in wide) + + compact = cli_obj._get_voice_status_fragments(width=50) + assert compact == [("class:voice-status", " 🎤 Ctrl+O ")] + + def test_voice_recording_status_bar_renders_configured_named_key(self): + cli_obj = _make_cli() + cli_obj._voice_mode = True + cli_obj._voice_recording = True + cli_obj._voice_processing = False + cli_obj.set_voice_record_key_cache("ctrl+space") + + fragments = cli_obj._get_voice_status_fragments(width=120) + + assert fragments == [("class:voice-status-recording", " ● REC Ctrl+Space to stop ")] + + def test_voice_status_bar_falls_back_to_ctrl_b_without_cache(self): + cli_obj = _make_cli() + cli_obj._voice_mode = True + cli_obj._voice_recording = False + cli_obj._voice_processing = False + cli_obj._voice_tts = False + cli_obj._voice_continuous = False + # No cache set — mirrors pre-startup state; fall back to + # documented Ctrl+B default (Copilot round-13 review). + + compact = cli_obj._get_voice_status_fragments(width=50) + + assert compact == [("class:voice-status", " 🎤 Ctrl+B ")] + + def test_voice_status_bar_renders_malformed_config_as_default(self): + cli_obj = _make_cli() + cli_obj._voice_mode = True + cli_obj._voice_recording = False + cli_obj._voice_processing = False + cli_obj._voice_tts = False + cli_obj._voice_continuous = False + # Non-string / typoed configs fall through the formatter to the + # documented default so the status bar never advertises an + # invalid shortcut. + cli_obj.set_voice_record_key_cache(True) + + compact = cli_obj._get_voice_status_fragments(width=50) + + assert compact == [("class:voice-status", " 🎤 Ctrl+B ")] + class TestCLIUsageReport: def test_show_usage_includes_estimated_cost(self, capsys): diff --git a/tests/hermes_cli/test_voice_wrapper.py b/tests/hermes_cli/test_voice_wrapper.py index a372c1194f..3caacf4313 100644 --- a/tests/hermes_cli/test_voice_wrapper.py +++ b/tests/hermes_cli/test_voice_wrapper.py @@ -31,6 +31,243 @@ class TestPublicAPI: assert callable(speak_text) +class TestNormalizeVoiceRecordKeyForPromptToolkit: + """Round-9 Copilot review regression on #19835. + + Classic CLI only normalized ``ctrl+`` / ``alt+``, so TUI-valid + aliases like ``control+``, ``option+``, ``opt+`` silently bound a + different (or no) shortcut in the CLI. Normalizer now maps the + same set of aliases the TUI parser accepts, so one config value + binds identically in both runtimes. + """ + + def test_ctrl_and_alt_map_to_prompt_toolkit_form(self): + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+b") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("alt+r") == "a-r" + + def test_control_option_opt_aliases_match_tui_parser(self): + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("control+o") == "c-o" + assert normalize_voice_record_key_for_prompt_toolkit("option+space") == "a-space" + assert normalize_voice_record_key_for_prompt_toolkit("opt+enter") == "a-enter" + + def test_case_insensitive(self): + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("Ctrl+B") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("CONTROL+O") == "c-o" + + def test_non_string_falls_back_to_default(self): + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit(None) == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit(1) == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit(True) == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit({}) == "c-b" + + def test_empty_string_falls_back(self): + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("") == "c-b" + + def test_super_win_fall_back_to_default_in_cli(self): + """prompt_toolkit has no super modifier, so ``super+b`` / ``win+o`` + would crash the classic CLI at startup if passed through. Fall + back to the documented default; the CLI binding site is + expected to warn so users know the shortcut is TUI-only + (Copilot round-11 on #19835).""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("super+b") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("win+o") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("windows+o") == "c-b" + + # Round-10 Copilot review regressions on #19835. + def test_strips_whitespace_within_and_around(self): + """``ctrl + b`` / `` option + space `` are accepted by the TUI + parser; the CLI normalizer must mirror that or the same config + binds different shortcuts across runtimes.""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("ctrl + b") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit(" option + space ") == "a-space" + + def test_named_key_aliases_collapse_to_prompt_toolkit_canonical(self): + """TUI accepts ``return`` / ``esc`` / ``bs`` / ``del`` etc.; + CLI must collapse to prompt_toolkit's canonical spelling + (``enter`` / ``escape`` / ``backspace`` / ``delete``).""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+return") == "c-enter" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+esc") == "c-escape" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+bs") == "c-backspace" + assert normalize_voice_record_key_for_prompt_toolkit("alt+del") == "a-delete" + + def test_typoed_named_keys_fall_back_to_default(self): + """``ctrl+spcae`` would otherwise pass through as ``c-spcae`` and + prompt_toolkit would reject it at startup — fall back instead.""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+spcae") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+f5") == "c-b" + + def test_bare_char_and_multi_modifier_fall_back(self): + """TUI parser rejects bare-char (``o``) and multi-modifier + (``ctrl+alt+r``) configs; the CLI normalizer must match.""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("o") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("b") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+alt+r") == "c-b" + + def test_reserved_ctrl_chars_fall_back(self): + """``ctrl+c`` / ``ctrl+d`` / ``ctrl+l`` are always claimed by + the CLI's prompt_toolkit input layer or terminal driver; match + the TUI parser's rejection to keep /voice status honest.""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+c") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+d") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("ctrl+l") == "c-b" + + def test_unknown_modifier_falls_back(self): + """``meta+b`` is ambiguous on the wire (Alt on xterm, Cmd on + legacy macOS), same class as the TUI parser's rejection.""" + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("meta+b") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("shift+b") == "c-b" + + # Round-14 Copilot review regression on #19835. On macOS the TUI + # parser rejects alt+c/d/l because hermes-ink reports Alt as + # ``key.meta`` and isActionMod(darwin) accepts it. The CLI + # normalizer must mirror that platform-gated rejection so shared + # configs like ``option+c`` don't bind Alt+C in the CLI while the + # TUI falls back to Ctrl+B. + def test_alt_cdl_rejected_on_macos(self, monkeypatch): + monkeypatch.setattr("sys.platform", "darwin") + + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("alt+c") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("alt+d") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("alt+l") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("option+c") == "c-b" + assert normalize_voice_record_key_for_prompt_toolkit("opt+d") == "c-b" + # Other alt letters still bind on darwin. + assert normalize_voice_record_key_for_prompt_toolkit("alt+r") == "a-r" + assert normalize_voice_record_key_for_prompt_toolkit("alt+space") == "a-space" + + def test_alt_cdl_allowed_on_non_macos(self, monkeypatch): + monkeypatch.setattr("sys.platform", "linux") + + from hermes_cli.voice import normalize_voice_record_key_for_prompt_toolkit + + assert normalize_voice_record_key_for_prompt_toolkit("alt+c") == "a-c" + assert normalize_voice_record_key_for_prompt_toolkit("alt+d") == "a-d" + assert normalize_voice_record_key_for_prompt_toolkit("alt+l") == "a-l" + + +class TestVoiceRecordKeyFromConfig: + """Round-11 Copilot review regression on #19835. + + ``load_config()`` preserves YAML scalar overrides, so a hand-edited + ``voice: true`` or ``voice: cmd+b`` made the naive + ``cfg.get('voice', {}).get('record_key')`` chain raise + AttributeError before voice could run. The shape-safe extractor + returns None for every malformed shape so the call-site fallback + (``normalize_…`` / ``format_…``) surfaces the documented default. + """ + + def test_dict_voice_with_string_record_key(self): + from hermes_cli.voice import voice_record_key_from_config + + assert voice_record_key_from_config({"voice": {"record_key": "ctrl+o"}}) == "ctrl+o" + + def test_non_dict_config_root(self): + from hermes_cli.voice import voice_record_key_from_config + + for bad_root in (None, True, 1, "ctrl+b", [], ["ctrl+b"]): + assert voice_record_key_from_config(bad_root) is None, bad_root + + def test_non_dict_voice_entry(self): + from hermes_cli.voice import voice_record_key_from_config + + for bad_voice in (None, True, "cmd+b", 42, ["ctrl+b"]): + assert voice_record_key_from_config({"voice": bad_voice}) is None, bad_voice + + def test_missing_record_key_returns_none(self): + from hermes_cli.voice import voice_record_key_from_config + + assert voice_record_key_from_config({"voice": {"beep_enabled": True}}) is None + assert voice_record_key_from_config({}) is None + + def test_normalizer_accepts_extractor_output_directly(self): + """voice_record_key_from_config + normalize_… must compose — + None / non-string scalars all fall back to c-b.""" + from hermes_cli.voice import ( + normalize_voice_record_key_for_prompt_toolkit, + voice_record_key_from_config, + ) + + for raw in (None, True, 1, "cmd+b", ["ctrl+b"]): + extracted = voice_record_key_from_config({"voice": raw}) + assert normalize_voice_record_key_for_prompt_toolkit(extracted) == "c-b" + + +class TestFormatVoiceRecordKeyForStatus: + """Round-10 Copilot review regression on #19835. + + ``/voice status`` used to print the raw scalar (``True`` / ``1``) + for non-string configs even though the actual binding falls back + to Ctrl+B. The formatter routes through the same normalizer so + status always matches what the CLI actually binds. + """ + + def test_ctrl_and_alt_letter_keys_render_canonically(self): + from hermes_cli.voice import format_voice_record_key_for_status + + assert format_voice_record_key_for_status("ctrl+b") == "Ctrl+B" + assert format_voice_record_key_for_status("ctrl+o") == "Ctrl+O" + assert format_voice_record_key_for_status("alt+r") == "Alt+R" + + def test_named_keys_render_in_title_case(self): + from hermes_cli.voice import format_voice_record_key_for_status + + assert format_voice_record_key_for_status("ctrl+space") == "Ctrl+Space" + assert format_voice_record_key_for_status("alt+enter") == "Alt+Enter" + assert format_voice_record_key_for_status("ctrl+esc") == "Ctrl+Escape" + + def test_aliases_render_via_normalized_form(self): + from hermes_cli.voice import format_voice_record_key_for_status + + assert format_voice_record_key_for_status("control+o") == "Ctrl+O" + assert format_voice_record_key_for_status("option+space") == "Alt+Space" + assert format_voice_record_key_for_status("opt+enter") == "Alt+Enter" + + def test_non_string_scalar_falls_back_to_ctrl_b_label(self): + from hermes_cli.voice import format_voice_record_key_for_status + + # Copilot round-10 regression: previously /voice status printed + # the raw scalar ("True" / "1") even though the actual binding + # fell back to Ctrl+B. + assert format_voice_record_key_for_status(True) == "Ctrl+B" + assert format_voice_record_key_for_status(1) == "Ctrl+B" + assert format_voice_record_key_for_status(None) == "Ctrl+B" + assert format_voice_record_key_for_status({}) == "Ctrl+B" + + def test_malformed_configs_fall_back_to_ctrl_b(self): + from hermes_cli.voice import format_voice_record_key_for_status + + assert format_voice_record_key_for_status("ctrl+spcae") == "Ctrl+B" + assert format_voice_record_key_for_status("ctrl+alt+r") == "Ctrl+B" + assert format_voice_record_key_for_status("") == "Ctrl+B" + assert format_voice_record_key_for_status(" ") == "Ctrl+B" + + class TestStopWithoutStart: def test_returns_none_when_no_recording_active(self, monkeypatch): """Idempotent no-op: stop before start must not raise or touch state.""" diff --git a/tests/test_tui_gateway_server.py b/tests/test_tui_gateway_server.py index 41b5194da6..469b8895ea 100644 --- a/tests/test_tui_gateway_server.py +++ b/tests/test_tui_gateway_server.py @@ -81,6 +81,173 @@ def test_dispatch_rejects_non_object_params(): } +def test_voice_toggle_returns_configured_record_key(monkeypatch): + monkeypatch.setattr( + server, + "_load_cfg", + lambda: {"voice": {"record_key": "ctrl+o"}}, + ) + monkeypatch.setitem( + sys.modules, + "tools.voice_mode", + types.SimpleNamespace( + check_voice_requirements=lambda: {"available": True, "details": ""} + ), + ) + # ``voice.toggle`` action=on mutates ``os.environ["HERMES_VOICE"]`` + # directly (CLI parity, runtime-only flag). Take monkeypatch + # ownership of the var so the change is reverted at teardown and + # later tests don't inherit a stale ON state (Copilot round-5 + # review on #19835). + monkeypatch.setenv("HERMES_VOICE", "0") + + on_resp = server.dispatch( + {"id": "voice-on", "method": "voice.toggle", "params": {"action": "on"}} + ) + status_resp = server.dispatch( + {"id": "voice-status", "method": "voice.toggle", "params": {"action": "status"}} + ) + + assert on_resp["result"]["record_key"] == "ctrl+o" + assert status_resp["result"]["record_key"] == "ctrl+o" + + +def test_voice_toggle_handles_non_dict_voice_cfg(monkeypatch): + """Round-3 Copilot review regression on #19835. + + ``_load_cfg()`` is raw ``yaml.safe_load()`` output — a hand-edited + ``voice: true`` / ``voice: cmd+b`` / ``voice: null`` leaves ``voice`` + as a bool/str/None, not a dict. Previously ``.get("record_key")`` + on a non-dict broke every ``voice.toggle`` branch. Now it falls + back to the documented default. + """ + monkeypatch.setitem( + sys.modules, + "tools.voice_mode", + types.SimpleNamespace( + check_voice_requirements=lambda: {"available": True, "details": ""} + ), + ) + + for bad in (True, "cmd+b", None, 42, ["ctrl+b"]): + monkeypatch.setattr(server, "_load_cfg", lambda b=bad: {"voice": b}) + + status_resp = server.dispatch( + {"id": "voice-status", "method": "voice.toggle", "params": {"action": "status"}} + ) + + assert status_resp["result"]["record_key"] == "ctrl+b", ( + f"voice.record_key fell back to default for voice={bad!r}" + ) + + # Round-4 follow-up: the YAML root itself may be a non-dict. A + # hand-edit that collapses config.yaml to a scalar / list would + # otherwise crash ``.get("voice")`` before the inner isinstance + # guard gets a chance to run. + for bad_root in (True, None, [], "ctrl+b", 42): + monkeypatch.setattr(server, "_load_cfg", lambda r=bad_root: r) + + status_resp = server.dispatch( + {"id": "voice-status-root", "method": "voice.toggle", "params": {"action": "status"}} + ) + + assert status_resp["result"]["record_key"] == "ctrl+b", ( + f"voice.record_key fell back to default for root={bad_root!r}" + ) + + +def test_voice_record_start_handles_non_dict_voice_cfg(monkeypatch): + """Round-7 Copilot review regression on #19835. + + The ``voice.record`` start path previously read + ``_load_cfg().get("voice", {}).get(...)`` without any shape checks. + When ``voice`` is a non-dict (bool/scalar/list) ``get`` raises + AttributeError and the handler returns 5025 instead of falling + back to the VAD defaults. Now it uses ``_voice_cfg_dict()`` and + non-numeric silence values are coerced to the documented defaults. + """ + captured: dict = {} + + def fake_start_continuous(**kwargs): + captured.update(kwargs) + + monkeypatch.setitem( + sys.modules, + "hermes_cli.voice", + types.SimpleNamespace(start_continuous=fake_start_continuous, stop_continuous=lambda: None), + ) + monkeypatch.setenv("HERMES_VOICE", "1") + + for bad in (True, "cmd+b", None, 42, ["ctrl+b"], {"silence_threshold": "loud"}): + captured.clear() + monkeypatch.setattr(server, "_load_cfg", lambda b=bad: {"voice": b}) + + resp = server.dispatch( + {"id": "voice-record", "method": "voice.record", "params": {"action": "start"}} + ) + + assert "result" in resp, f"voice.record raised for voice={bad!r}: {resp.get('error')}" + assert resp["result"]["status"] == "recording" + assert captured["silence_threshold"] == 200 + assert captured["silence_duration"] == 3.0 + + # Round-12 Copilot review regression on #19835: ``bool`` is a subclass + # of ``int``, so the naive ``isinstance(threshold, (int, float))`` + # guard would forward ``silence_threshold: true`` as ``1`` instead + # of falling back to the documented 200 default. + for bad_bool_cfg in ( + {"silence_threshold": True, "silence_duration": False}, + {"silence_threshold": False}, + {"silence_duration": True}, + ): + captured.clear() + monkeypatch.setattr(server, "_load_cfg", lambda c=bad_bool_cfg: {"voice": c}) + + resp = server.dispatch( + {"id": "voice-record-bool", "method": "voice.record", "params": {"action": "start"}} + ) + + assert "result" in resp, f"voice.record raised for bool cfg={bad_bool_cfg!r}" + assert captured["silence_threshold"] == 200, ( + f"bool silence_threshold leaked through for {bad_bool_cfg!r}" + ) + assert captured["silence_duration"] == 3.0, ( + f"bool silence_duration leaked through for {bad_bool_cfg!r}" + ) + + +def test_voice_toggle_tts_branch_also_carries_record_key(monkeypatch): + """Round-2 Copilot review regression on #19835. + + The ``tts`` branch used to omit ``record_key`` from its response, so a + TUI client would parse ``r.record_key ?? 'ctrl+b'`` and reset a + custom binding to the default on every TTS toggle. Every branch of + ``voice.toggle`` now carries the configured key so frontend state + stays authoritative. + """ + monkeypatch.setattr( + server, + "_load_cfg", + lambda: {"voice": {"record_key": "ctrl+space"}}, + ) + monkeypatch.setitem( + sys.modules, + "tools.voice_mode", + types.SimpleNamespace( + check_voice_requirements=lambda: {"available": True, "details": ""} + ), + ) + monkeypatch.setenv("HERMES_VOICE", "1") + monkeypatch.delenv("HERMES_VOICE_TTS", raising=False) + + tts_resp = server.dispatch( + {"id": "voice-tts", "method": "voice.toggle", "params": {"action": "tts"}} + ) + + assert tts_resp["result"]["record_key"] == "ctrl+space" + assert tts_resp["result"]["tts"] is True + + def test_load_enabled_toolsets_prefers_tui_env(monkeypatch): monkeypatch.setenv("HERMES_TUI_TOOLSETS", "web, terminal, ,memory") diff --git a/tui_gateway/server.py b/tui_gateway/server.py index 01407308da..2a4377c3f1 100644 --- a/tui_gateway/server.py +++ b/tui_gateway/server.py @@ -5278,6 +5278,30 @@ def _voice_tts_enabled() -> bool: return os.environ.get("HERMES_VOICE_TTS", "").strip() == "1" +def _voice_cfg_dict() -> dict: + """Shape-safe accessor for the ``voice:`` block in config.yaml. + + ``_load_cfg()`` returns raw ``yaml.safe_load()`` output, so both the + root AND ``voice`` may be any YAML scalar / list / None. A hand-edit + like ``voice: true`` or a malformed top-level config that parses to + a scalar would otherwise break ``.get("…")`` and take every + ``voice.*`` branch down with it (Copilot round-3..7 review on + #19835). Coerce through ``isinstance`` at every level so malformed + config falls back to an empty dict instead of crashing /voice. + """ + cfg = _load_cfg() + voice_cfg = cfg.get("voice") if isinstance(cfg, dict) else None + + return voice_cfg if isinstance(voice_cfg, dict) else {} + + +def _voice_record_key() -> str: + """Current ``voice.record_key`` value, documented default on error.""" + record_key = _voice_cfg_dict().get("record_key") + + return str(record_key) if isinstance(record_key, str) and record_key else "ctrl+b" + + @method("voice.toggle") def _(rid, params: dict) -> dict: """CLI parity for the ``/voice`` slash command. @@ -5298,8 +5322,13 @@ def _(rid, params: dict) -> dict: # Mirror CLI's _show_voice_status: include STT/TTS provider # availability so the user can tell at a glance *why* voice mode # isn't working ("STT provider: MISSING ..." is the common case). + # ``record_key`` mirrors the configured ``voice.record_key`` so the + # TUI can both bind it (frontend ``isVoiceToggleKey``) and display + # it in /voice status — previously the TUI hardcoded Ctrl+B and + # ignored the config (#18994). payload: dict = { "enabled": _voice_mode_enabled(), + "record_key": _voice_record_key(), "tts": _voice_tts_enabled(), } try: @@ -5336,7 +5365,14 @@ def _(rid, params: dict) -> dict: except Exception as e: logger.warning("voice: stop_continuous failed during toggle off: %s", e) - return _ok(rid, {"enabled": enabled, "tts": _voice_tts_enabled()}) + return _ok( + rid, + { + "enabled": enabled, + "record_key": _voice_record_key(), + "tts": _voice_tts_enabled(), + }, + ) if action == "tts": if not _voice_mode_enabled(): @@ -5344,7 +5380,18 @@ def _(rid, params: dict) -> dict: new_value = not _voice_tts_enabled() # Runtime-only flag (CLI parity) — see voice.toggle on/off above. os.environ["HERMES_VOICE_TTS"] = "1" if new_value else "0" - return _ok(rid, {"enabled": True, "tts": new_value}) + # Include ``record_key`` on every branch so a /voice tts toggle + # doesn't reset the TUI's cached shortcut to the default when a + # user has a custom binding configured (Copilot review, round 2 + # on #19835). Keeps parity with the status/on/off branches above. + return _ok( + rid, + { + "enabled": True, + "record_key": _voice_record_key(), + "tts": new_value, + }, + ) return _err(rid, 4013, f"unknown voice action: {action}") @@ -5376,15 +5423,26 @@ def _(rid, params: dict) -> dict: from hermes_cli.voice import start_continuous - voice_cfg = _load_cfg().get("voice", {}) + # Shape-safe lookups: malformed ``voice:`` YAML (bool/scalar/list) + # must not crash /voice with a 5025 — fall back to VAD defaults. + # + # Exclude ``bool`` from the numeric check since Python's bool is + # a subclass of int — a hand-edit like ``silence_threshold: true`` + # would otherwise forward as ``1`` instead of falling back to + # the documented 200 / 3.0 defaults (Copilot round-12 on #19835). + voice_cfg = _voice_cfg_dict() + threshold = voice_cfg.get("silence_threshold") + duration = voice_cfg.get("silence_duration") + safe_threshold = threshold if isinstance(threshold, (int, float)) and not isinstance(threshold, bool) else 200 + safe_duration = duration if isinstance(duration, (int, float)) and not isinstance(duration, bool) else 3.0 start_continuous( on_transcript=lambda t: _voice_emit("voice.transcript", {"text": t}), on_status=lambda s: _voice_emit("voice.status", {"state": s}), on_silent_limit=lambda: _voice_emit( "voice.transcript", {"no_speech_limit": True} ), - silence_threshold=voice_cfg.get("silence_threshold", 200), - silence_duration=voice_cfg.get("silence_duration", 3.0), + silence_threshold=safe_threshold, + silence_duration=safe_duration, ) return _ok(rid, {"status": "recording"}) diff --git a/ui-tui/src/__tests__/createSlashHandler.test.ts b/ui-tui/src/__tests__/createSlashHandler.test.ts index e8c50c05d2..c9447f16d8 100644 --- a/ui-tui/src/__tests__/createSlashHandler.test.ts +++ b/ui-tui/src/__tests__/createSlashHandler.test.ts @@ -173,6 +173,64 @@ describe('createSlashHandler', () => { expect(ctx.transcript.sys).toHaveBeenCalledWith(expect.stringContaining('usage: /skills')) }) + // Regressions from Copilot review on #19835: /voice output + frontend + // binding state must both track the gateway's fresh ``record_key`` on + // every response, or a config edit shows the new shortcut in text + // while push-to-talk still fires the old one until the next mtime + // poll (~5s). + it('/voice status renders the gateway record_key and pushes it into frontend state', async () => { + const rpc = vi.fn(() => Promise.resolve({ enabled: true, record_key: 'ctrl+space', tts: false })) + const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } }) + + expect(createSlashHandler(ctx)('/voice status')).toBe(true) + await vi.waitFor(() => { + expect(ctx.transcript.sys).toHaveBeenCalledWith(' Record key: Ctrl+Space') + }) + expect(ctx.voice.setVoiceRecordKey).toHaveBeenCalledWith( + expect.objectContaining({ ch: 'space', mod: 'ctrl', named: 'space' }) + ) + }) + + it('/voice on renders the configured binding for the start/stop hint', async () => { + const rpc = vi.fn(() => Promise.resolve({ enabled: true, record_key: 'alt+r', tts: false })) + const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } }) + + expect(createSlashHandler(ctx)('/voice on')).toBe(true) + await vi.waitFor(() => { + expect(ctx.transcript.sys).toHaveBeenCalledWith('Voice mode enabled') + expect(ctx.transcript.sys).toHaveBeenCalledWith(' Alt+R to start/stop recording') + }) + expect(ctx.voice.setVoiceRecordKey).toHaveBeenCalledWith( + expect.objectContaining({ ch: 'r', mod: 'alt' }) + ) + }) + + it('/voice falls back to Ctrl+B when the gateway response omits record_key', async () => { + const rpc = vi.fn(() => Promise.resolve({ enabled: false, tts: false })) + const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } }) + + expect(createSlashHandler(ctx)('/voice status')).toBe(true) + await vi.waitFor(() => { + expect(ctx.transcript.sys).toHaveBeenCalledWith(' Record key: Ctrl+B') + }) + }) + + // Round-2 Copilot review on #19835: a response missing ``record_key`` + // (e.g. the old tts branch, or any future branch that forgets to + // include it) MUST NOT clobber the user's cached binding back to + // Ctrl+B. The label still renders the default for display; the + // frontend state keeps whatever was last authoritatively set. + it('/voice tts without record_key does not clobber cached frontend binding', async () => { + const rpc = vi.fn(() => Promise.resolve({ enabled: true, tts: true })) + const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } }) + + expect(createSlashHandler(ctx)('/voice tts')).toBe(true) + await vi.waitFor(() => { + expect(ctx.transcript.sys).toHaveBeenCalledWith('Voice TTS enabled.') + }) + expect(ctx.voice.setVoiceRecordKey).not.toHaveBeenCalled() + }) + it('cycles details mode and persists it', async () => { const ctx = buildCtx() @@ -648,7 +706,8 @@ const buildTranscript = () => ({ }) const buildVoice = () => ({ - setVoiceEnabled: vi.fn() + setVoiceEnabled: vi.fn(), + setVoiceRecordKey: vi.fn() }) interface Ctx { diff --git a/ui-tui/src/__tests__/platform.test.ts b/ui-tui/src/__tests__/platform.test.ts index 4166f0b71f..77f1347a3a 100644 --- a/ui-tui/src/__tests__/platform.test.ts +++ b/ui-tui/src/__tests__/platform.test.ts @@ -67,11 +67,15 @@ describe('isVoiceToggleKey', () => { expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'B')).toBe(true) }) - it('matches Cmd+B on macOS (preserve platform muscle memory)', async () => { + it('matches kitty-style Cmd+B on macOS via key.super', async () => { const { isVoiceToggleKey } = await importPlatform('darwin') - expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b')).toBe(true) expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b')).toBe(true) + // ``key.meta`` is NOT accepted as Cmd — hermes-ink uses meta for + // Alt too, so accepting it leaked Alt+B into the default binding + // (Copilot round-6 review on #19835). Legacy-terminal mac users + // get strict Ctrl+B. + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b')).toBe(false) }) it('matches Ctrl+B on non-macOS platforms', async () => { @@ -89,6 +93,449 @@ describe('isVoiceToggleKey', () => { }) }) +describe('parseVoiceRecordKey (#18994)', () => { + it('falls back to Ctrl+B for empty input', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + expect(parseVoiceRecordKey('')).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + it('parses ctrl+ bindings', async () => { + const { parseVoiceRecordKey } = await importPlatform('linux') + + expect(parseVoiceRecordKey('ctrl+o')).toEqual({ ch: 'o', mod: 'ctrl', raw: 'ctrl+o' }) + expect(parseVoiceRecordKey('Ctrl+R')).toEqual({ ch: 'r', mod: 'ctrl', raw: 'ctrl+r' }) + }) + + it('parses alt/super aliases', async () => { + const { parseVoiceRecordKey } = await importPlatform('linux') + + expect(parseVoiceRecordKey('alt+b').mod).toBe('alt') + expect(parseVoiceRecordKey('option+b').mod).toBe('alt') + expect(parseVoiceRecordKey('super+b').mod).toBe('super') + expect(parseVoiceRecordKey('win+b').mod).toBe('super') + }) + + it('treats ambiguous mac modifiers (meta / cmd / command) as unrecognised', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // ``meta`` / ``cmd`` / ``command`` are ambiguous on the wire: + // hermes-ink sets ``key.meta`` for plain Alt on every platform AND + // for Cmd on legacy macOS terminals. Accepting any of them would + // produce a display/binding mismatch (Copilot round-6 review on + // #19835). Users on modern kitty-style terminals spell the + // platform action modifier ``super`` / ``win``. + expect(parseVoiceRecordKey('meta+b')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('cmd+b')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('command+b')).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + it('parses named keys (space, enter, tab, escape, backspace, delete)', async () => { + const { parseVoiceRecordKey } = await importPlatform('linux') + + // Every named token from the CLI's prompt_toolkit ``c-`` set is + // accepted with both the canonical name and its common alias. + expect(parseVoiceRecordKey('ctrl+space')).toEqual({ + ch: 'space', + mod: 'ctrl', + named: 'space', + raw: 'ctrl+space' + }) + expect(parseVoiceRecordKey('alt+enter').named).toBe('enter') + expect(parseVoiceRecordKey('alt+return').named).toBe('enter') // ``return`` ↔ ``enter`` + expect(parseVoiceRecordKey('ctrl+tab').named).toBe('tab') + expect(parseVoiceRecordKey('ctrl+escape').named).toBe('escape') + expect(parseVoiceRecordKey('ctrl+esc').named).toBe('escape') // ``esc`` alias + expect(parseVoiceRecordKey('ctrl+backspace').named).toBe('backspace') + expect(parseVoiceRecordKey('ctrl+delete').named).toBe('delete') + expect(parseVoiceRecordKey('ctrl+del').named).toBe('delete') // ``del`` alias + }) + + it('falls back to Ctrl+B for unrecognised multi-character tokens', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // Typos / unsupported names (``ctrl+spcae``, ``ctrl+f5``, …) fall back + // to the documented Ctrl+B default rather than silently disabling the + // binding. + expect(parseVoiceRecordKey('ctrl+spcae')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('ctrl+f5')).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + // Round-3 Copilot review regressions on #19835. + it('does not throw on non-string YAML scalars — falls back instead', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // ``config.get full`` surfaces raw YAML values; ``voice.record_key: 1`` + // or ``voice.record_key: true`` would otherwise crash ``.trim()``. + expect(parseVoiceRecordKey(1 as unknown as string)).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey(true as unknown as string)).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey(null as unknown as string)).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey(undefined as unknown as string)).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey({} as unknown as string)).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + it('rejects multi-modifier chords rather than silently dropping extras', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // Previously ``ctrl+alt+r`` parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as + // ``super+b`` — a typo silently bound a different shortcut. Now a + // multi-modifier spelling falls back to the documented default. + expect(parseVoiceRecordKey('ctrl+alt+r')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('cmd+ctrl+b')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('alt+ctrl+space')).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + // Round-4 Copilot review regressions on #19835. + it('rejects bare-char configs without an explicit modifier', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // The classic CLI's prompt_toolkit binds raw-char configs to the key + // itself (``c-o`` requires an explicit modifier); rewriting ``o`` + // → ``ctrl+o`` would silently diverge the two runtimes. Refuse. + expect(parseVoiceRecordKey('o')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('b')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('space')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('escape')).toEqual(DEFAULT_VOICE_RECORD_KEY) + }) + + it('rejects ctrl+c / ctrl+d / ctrl+l — reserved by the TUI input handler', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('linux') + + // ``useInputHandlers()`` intercepts these before the voice check, + // so a binding like ``ctrl+c`` would be advertised but never fire. + // Fall back to the documented default instead of lying to the user. + expect(parseVoiceRecordKey('ctrl+c')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('ctrl+d')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('ctrl+l')).toEqual(DEFAULT_VOICE_RECORD_KEY) + // Alt-modifier versions of those letters are NOT intercepted, so + // they remain usable. + expect(parseVoiceRecordKey('alt+c').mod).toBe('alt') + // ``ctrl+x`` is intentionally allowed — only intercepted during + // queue-edit (``queueEditIdx !== null``), so the voice binding + // works for most of the session (Copilot round-8 review). + expect(parseVoiceRecordKey('ctrl+x').mod).toBe('ctrl') + expect(parseVoiceRecordKey('ctrl+x').ch).toBe('x') + }) + + it('rejects super+{c,d,l,v} on macOS — action-mod chords are claimed before voice', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('darwin') + + // On macOS super+c/d/l/v are copy / exit / clear / paste. Reject at + // parse time so /voice status doesn't advertise dead bindings. + expect(parseVoiceRecordKey('super+c')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('super+d')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('super+l')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('super+v')).toEqual(DEFAULT_VOICE_RECORD_KEY) + // Other super letters still work (no global chord claims them). + expect(parseVoiceRecordKey('super+b').mod).toBe('super') + expect(parseVoiceRecordKey('super+o').mod).toBe('super') + }) + + it('allows super+{c,d,l,v} on Linux/Windows — those globals key off Ctrl, not Super', async () => { + const { parseVoiceRecordKey } = await importPlatform('linux') + + // Kitty/CSI-u users on non-mac report Cmd/Super as ``key.super``, + // but the TUI's global shortcuts (copy/exit/clear/paste) key off + // Ctrl there, so ``super+`` doesn't collide. Reject would + // silently coerce valid configs to Ctrl+B (Copilot round-8 review). + expect(parseVoiceRecordKey('super+c').mod).toBe('super') + expect(parseVoiceRecordKey('super+d').mod).toBe('super') + expect(parseVoiceRecordKey('super+l').mod).toBe('super') + expect(parseVoiceRecordKey('super+v').mod).toBe('super') + }) + + it('rejects alt+{c,d,l} on macOS — meta-as-alt collides with isAction', async () => { + const { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } = await importPlatform('darwin') + + // hermes-ink reports Alt as ``key.meta`` on many terminals, and + // ``isActionMod`` on darwin accepts ``key.meta`` as the action + // modifier. So ``alt+c`` / ``alt+d`` / ``alt+l`` get claimed by + // isCopyShortcut / isAction('d') / isAction('l') before voice + // runs (Copilot round-12 on #19835). + expect(parseVoiceRecordKey('alt+c')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('alt+d')).toEqual(DEFAULT_VOICE_RECORD_KEY) + expect(parseVoiceRecordKey('alt+l')).toEqual(DEFAULT_VOICE_RECORD_KEY) + // Other alt letters stay usable on darwin. + expect(parseVoiceRecordKey('alt+r').mod).toBe('alt') + expect(parseVoiceRecordKey('alt+space').mod).toBe('alt') + }) + + it('allows alt+{c,d,l} on Linux/Windows — non-mac isAction keys off Ctrl', async () => { + const { parseVoiceRecordKey } = await importPlatform('linux') + + // On Linux/Windows ``isActionMod`` ignores key.meta, so alt+ + // doesn't collide with copy/exit/clear. Those configs stay usable. + expect(parseVoiceRecordKey('alt+c').mod).toBe('alt') + expect(parseVoiceRecordKey('alt+d').mod).toBe('alt') + expect(parseVoiceRecordKey('alt+l').mod).toBe('alt') + }) + + // Round-5 Copilot review regressions on #19835. + it('super+ does NOT fire on key.meta-only events (Alt+X false-fire guard)', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + + // hermes-ink sets ``key.meta`` for Alt/Option AND for bare Esc on + // some macOS terminals. The super branch used to accept + // ``isMac && key.meta`` as a Cmd fallback, which made super+ + // bindings silently fire on Alt+ / bare Esc. + const superB = parseVoiceRecordKey('super+b') + const superSpace = parseVoiceRecordKey('super+space') + const superEscape = parseVoiceRecordKey('super+escape') + + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', superB)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, ' ', superSpace)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, escape: true, meta: true, super: false }, '', superEscape)).toBe(false) + }) + + // Round-6 Copilot review regressions on #19835. + it('default ctrl+b does NOT fire on Alt+B via isActionMod meta leak', async () => { + const { DEFAULT_VOICE_RECORD_KEY, isVoiceToggleKey } = await importPlatform('darwin') + + // ``isActionMod(key)`` on darwin was accepting ``key.meta`` as the + // action modifier, so Alt+B (key.meta=true) fired the default + // ctrl+b binding. Now the Cmd-fallback path requires literal + // ``key.super`` on macOS and rejects ``key.meta``. + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(false) + // Literal Ctrl+B and Cmd+B (kitty-style) still work on darwin. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(true) + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(true) + }) + + it('ctrl+ rejects chords with extra alt / meta / super bits', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + const ctrlO = parseVoiceRecordKey('ctrl+o') + + // ``ctrl+o`` must fire ONLY on literal Ctrl+O, not on + // Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O — otherwise the runtime + // matches a different chord than the parser would let you + // configure. + expect(isVoiceToggleKey({ alt: true, ctrl: true, meta: false, super: false }, 'o', ctrlO)).toBe(false) + expect(isVoiceToggleKey({ ctrl: true, meta: true, super: false }, 'o', ctrlO)).toBe(false) + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: true }, 'o', ctrlO)).toBe(false) + // Sanity: plain Ctrl+O still fires. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'o', ctrlO)).toBe(true) + }) + + it('super+ rejects chords with extra ctrl / alt / meta bits', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + const superB = parseVoiceRecordKey('super+b') + + expect(isVoiceToggleKey({ alt: true, ctrl: false, meta: false, super: true }, 'b', superB)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: true }, 'b', superB)).toBe(false) + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: true }, 'b', superB)).toBe(false) + // Sanity: plain Super+B still fires. + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', superB)).toBe(true) + }) + + it('alt+escape does not fire on bare Esc meta-shape', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const altEscape = parseVoiceRecordKey('alt+escape') + + // Some terminals surface bare Esc as meta=true + escape=true. + expect(isVoiceToggleKey({ ctrl: false, escape: true, meta: true, super: false }, '', altEscape)).toBe(false) + // Explicit alt bit (kitty-style) still fires the configured chord. + expect(isVoiceToggleKey({ alt: true, ctrl: false, escape: true, meta: false, super: false }, '', altEscape)).toBe(true) + }) + + it('rejects matches when Shift is held (different chord than configured)', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + + // Parser rejects multi-modifier configs like ``ctrl+shift+tab``, + // so the runtime matcher must also reject Shift-held events — + // otherwise ``ctrl+tab`` would fire on Ctrl+Shift+Tab. + const ctrlTab = parseVoiceRecordKey('ctrl+tab') + const altEnter = parseVoiceRecordKey('alt+enter') + const ctrlO = parseVoiceRecordKey('ctrl+o') + + expect(isVoiceToggleKey({ ctrl: true, meta: false, shift: true, super: false, tab: true }, '', ctrlTab)).toBe(false) + expect(isVoiceToggleKey({ alt: true, ctrl: false, meta: false, return: true, shift: true, super: false }, '', altEnter)).toBe(false) + expect(isVoiceToggleKey({ ctrl: true, meta: false, shift: true, super: false }, 'o', ctrlO)).toBe(false) + + // Sanity: same events without Shift still fire. + expect(isVoiceToggleKey({ ctrl: true, meta: false, shift: false, super: false, tab: true }, '', ctrlTab)).toBe(true) + expect(isVoiceToggleKey({ ctrl: true, meta: false, shift: false, super: false }, 'o', ctrlO)).toBe(true) + }) +}) + +describe('formatVoiceRecordKey (#18994)', () => { + it('renders as the user expects in /voice status', async () => { + const { formatVoiceRecordKey, parseVoiceRecordKey } = await importPlatform('linux') + + expect(formatVoiceRecordKey(parseVoiceRecordKey('ctrl+b'))).toBe('Ctrl+B') + expect(formatVoiceRecordKey(parseVoiceRecordKey('ctrl+o'))).toBe('Ctrl+O') + expect(formatVoiceRecordKey(parseVoiceRecordKey('alt+r'))).toBe('Alt+R') + // ``super``/``win`` render as ``Super`` on non-mac so the hint + // doesn't tell Linux/Windows users to press a Cmd key they don't + // have. + expect(formatVoiceRecordKey(parseVoiceRecordKey('super+b'))).toBe('Super+B') + }) + + it('renders named keys in title case (Ctrl+Space, Ctrl+Enter)', async () => { + const { formatVoiceRecordKey, parseVoiceRecordKey } = await importPlatform('linux') + + expect(formatVoiceRecordKey(parseVoiceRecordKey('ctrl+space'))).toBe('Ctrl+Space') + expect(formatVoiceRecordKey(parseVoiceRecordKey('alt+enter'))).toBe('Alt+Enter') + expect(formatVoiceRecordKey(parseVoiceRecordKey('ctrl+esc'))).toBe('Ctrl+Escape') + expect(formatVoiceRecordKey(parseVoiceRecordKey('super+space'))).toBe('Super+Space') + }) +}) + +describe('isVoiceToggleKey honours configured record key (#18994)', () => { + it('binds the configured letter, not hardcoded b', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + const ctrlO = parseVoiceRecordKey('ctrl+o') + + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'o', ctrlO)).toBe(true) + // The old hardcoded 'b' must NOT match when the user configured 'o'. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b', ctrlO)).toBe(false) + }) + + it('alt+ binding matches alt OR meta (terminal-protocol parity)', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + const altR = parseVoiceRecordKey('alt+r') + + expect(isVoiceToggleKey({ alt: true, ctrl: false, meta: false, super: false }, 'r', altR)).toBe(true) + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'r', altR)).toBe(true) + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: false }, 'r', altR)).toBe(false) + }) + + it('binds named keys via ink event flags (space → ch === " ", enter → key.return, …)', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('linux') + + const ctrlSpace = parseVoiceRecordKey('ctrl+space') + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, ' ', ctrlSpace)).toBe(true) + // Single-char ``b`` must NOT match a ``space``-configured binding. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b', ctrlSpace)).toBe(false) + // Space without the configured modifier must not fire either. + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: false }, ' ', ctrlSpace)).toBe(false) + + const ctrlEnter = parseVoiceRecordKey('ctrl+enter') + expect(isVoiceToggleKey({ ctrl: true, meta: false, return: true, super: false }, '', ctrlEnter)).toBe(true) + expect(isVoiceToggleKey({ ctrl: true, meta: false, return: false, super: false }, '', ctrlEnter)).toBe(false) + + const altTab = parseVoiceRecordKey('alt+tab') + expect(isVoiceToggleKey({ alt: true, ctrl: false, meta: false, super: false, tab: true }, '', altTab)).toBe(true) + expect(isVoiceToggleKey({ alt: false, ctrl: false, meta: false, super: false, tab: true }, '', altTab)).toBe(false) + + const ctrlEscape = parseVoiceRecordKey('ctrl+escape') + expect(isVoiceToggleKey({ ctrl: true, escape: true, meta: false, super: false }, '', ctrlEscape)).toBe(true) + expect(isVoiceToggleKey({ ctrl: true, escape: false, meta: false, super: false }, '', ctrlEscape)).toBe(false) + + const ctrlBackspace = parseVoiceRecordKey('ctrl+backspace') + expect(isVoiceToggleKey({ backspace: true, ctrl: true, meta: false, super: false }, '', ctrlBackspace)).toBe(true) + + const ctrlDelete = parseVoiceRecordKey('ctrl+delete') + expect(isVoiceToggleKey({ ctrl: true, delete: true, meta: false, super: false }, '', ctrlDelete)).toBe(true) + }) + + it('omitted configured key falls back to ctrl+b (back-compat)', async () => { + const { isVoiceToggleKey } = await importPlatform('linux') + + // No third arg → DEFAULT_VOICE_RECORD_KEY → Ctrl+B behaviour. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b')).toBe(true) + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'o')).toBe(false) + }) + + // Regressions from Copilot review on #19835: the previous implementation + // accepted ``isActionMod(key)`` in the ``ctrl`` branch for every + // configured key, so bare Esc (which hermes-ink reports with + // ``key.meta`` on some macOS terminals) fired ``ctrl+escape``, and + // Alt+Space / Alt+Tab fired ``ctrl+space`` / ``ctrl+tab``. The fallback + // is now gated to the documented default (``ctrl+b``) only. + it('ctrl+escape does NOT fire on bare Esc via key.meta on macOS', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const ctrlEscape = parseVoiceRecordKey('ctrl+escape') + + // Bare Esc on a legacy macOS terminal: ``key.meta: true``, ``key.escape: true``, no ctrl. + expect(isVoiceToggleKey({ ctrl: false, escape: true, meta: true, super: false }, '', ctrlEscape)).toBe(false) + // Real Ctrl+Esc still fires. + expect(isVoiceToggleKey({ ctrl: true, escape: true, meta: false, super: false }, '', ctrlEscape)).toBe(true) + }) + + it('ctrl+space does NOT fire on Alt+Space on macOS', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const ctrlSpace = parseVoiceRecordKey('ctrl+space') + + // Alt+Space surfaces as ``key.meta: true`` with space char. + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, ' ', ctrlSpace)).toBe(false) + // Real Ctrl+Space still fires. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, ' ', ctrlSpace)).toBe(true) + }) + + it('default ctrl+b accepts raw Ctrl+B and kitty-style Cmd+B on macOS', async () => { + const { DEFAULT_VOICE_RECORD_KEY, isVoiceToggleKey } = await importPlatform('darwin') + + // Raw Ctrl+B: always works. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(true) + // Cmd+B via kitty-style ``key.super``: still works. + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(true) + // Cmd+B via legacy ``key.meta`` NO LONGER works — ``key.meta`` is + // hermes-ink's Alt signal, so accepting it leaked Alt+B into the + // default binding (Copilot round-6 review on #19835). + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', DEFAULT_VOICE_RECORD_KEY)).toBe(false) + }) + + it('custom ctrl+ does NOT accept Cmd fallback on macOS', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const ctrlO = parseVoiceRecordKey('ctrl+o') + + // Only ``ctrl+b`` gets the action-modifier fallback; ``ctrl+o`` must + // be a literal Ctrl bit — otherwise Cmd+O would steal the shortcut. + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'o', ctrlO)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'o', ctrlO)).toBe(false) + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'o', ctrlO)).toBe(true) + }) + + it('super+b renders "Cmd+B" on darwin and requires the literal key.super bit', async () => { + const { formatVoiceRecordKey, isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const superB = parseVoiceRecordKey('super+b') + + expect(formatVoiceRecordKey(superB)).toBe('Cmd+B') + // Kitty-style: key.super fires the binding. + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', superB)).toBe(true) + // ``key.meta`` is NOT accepted — hermes-ink uses meta for Alt too, + // so accepting it here would make super+b silently fire on Alt+B + // (Copilot round-5 review on #19835). + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', superB)).toBe(false) + // Ctrl held at the same time → reject (different chord). + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: true }, 'b', superB)).toBe(false) + }) + + // Round-2 Copilot review regressions on #19835. + it('super+b renders "Super+B" on Linux (not "Cmd+B")', async () => { + const { formatVoiceRecordKey, parseVoiceRecordKey } = await importPlatform('linux') + + expect(formatVoiceRecordKey(parseVoiceRecordKey('super+b'))).toBe('Super+B') + expect(formatVoiceRecordKey(parseVoiceRecordKey('win+b'))).toBe('Super+B') + }) + + it('super+b still renders "Cmd+B" on macOS', async () => { + const { formatVoiceRecordKey, parseVoiceRecordKey } = await importPlatform('darwin') + + expect(formatVoiceRecordKey(parseVoiceRecordKey('super+b'))).toBe('Cmd+B') + expect(formatVoiceRecordKey(parseVoiceRecordKey('win+b'))).toBe('Cmd+B') + }) + + it('ctrl+b aliases (control+b, "ctrl + b") still accept Cmd+B fallback on macOS', async () => { + const { isVoiceToggleKey, parseVoiceRecordKey } = await importPlatform('darwin') + const controlB = parseVoiceRecordKey('control+b') + const spacedB = parseVoiceRecordKey('ctrl + b') + + // Both parse to the documented default semantically; both must keep + // the macOS Cmd+B muscle-memory fallback via kitty-style key.super. + // ``key.meta`` is NOT accepted — that's hermes-ink's Alt signal + // (round-6 review), so legacy-terminal users get strict Ctrl+B. + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', controlB)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, meta: true, super: false }, 'b', spacedB)).toBe(false) + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', controlB)).toBe(true) + expect(isVoiceToggleKey({ ctrl: false, meta: false, super: true }, 'b', spacedB)).toBe(true) + // Literal Ctrl+B still fires. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'b', controlB)).toBe(true) + // And still reject a ctrl bit on a different letter. + expect(isVoiceToggleKey({ ctrl: true, meta: false, super: false }, 'o', controlB)).toBe(false) + }) +}) + describe('isMacActionFallback', () => { it('routes raw Ctrl+K and Ctrl+W to readline kill-to-end / delete-word on macOS', async () => { const { isMacActionFallback } = await importPlatform('darwin') diff --git a/ui-tui/src/__tests__/textInputPassThrough.test.ts b/ui-tui/src/__tests__/textInputPassThrough.test.ts new file mode 100644 index 0000000000..5988580f9b --- /dev/null +++ b/ui-tui/src/__tests__/textInputPassThrough.test.ts @@ -0,0 +1,43 @@ +import { describe, expect, it } from 'vitest' + +import { shouldPassThroughToGlobalHandler } from '../components/textInput.js' +import { DEFAULT_VOICE_RECORD_KEY, parseVoiceRecordKey } from '../lib/platform.js' + +const key = (overrides: Record = {}) => + ({ ctrl: false, meta: false, ...overrides }) as any + +describe('shouldPassThroughToGlobalHandler', () => { + it('passes through the configured voice shortcut while composer is focused', () => { + expect( + shouldPassThroughToGlobalHandler('o', key({ ctrl: true }), parseVoiceRecordKey('ctrl+o')) + ).toBe(true) + expect( + shouldPassThroughToGlobalHandler('r', key({ meta: true }), parseVoiceRecordKey('alt+r')) + ).toBe(true) + expect( + shouldPassThroughToGlobalHandler(' ', key({ ctrl: true }), parseVoiceRecordKey('ctrl+space')) + ).toBe(true) + expect( + shouldPassThroughToGlobalHandler('', key({ ctrl: true, return: true }), parseVoiceRecordKey('ctrl+enter')) + ).toBe(true) + }) + + it('keeps the legacy default pass-through when no custom key is provided', () => { + expect(shouldPassThroughToGlobalHandler('b', key({ ctrl: true }), DEFAULT_VOICE_RECORD_KEY)).toBe(true) + expect(shouldPassThroughToGlobalHandler('b', key({ ctrl: true }))).toBe(true) + }) + + it('does not swallow ordinary typing keys', () => { + expect(shouldPassThroughToGlobalHandler('h', key(), parseVoiceRecordKey('ctrl+o'))).toBe(false) + expect(shouldPassThroughToGlobalHandler('o', key(), parseVoiceRecordKey('ctrl+o'))).toBe(false) + }) + + it('always passes through non-voice global control keys', () => { + expect(shouldPassThroughToGlobalHandler('c', key({ ctrl: true }))).toBe(true) + expect(shouldPassThroughToGlobalHandler('x', key({ ctrl: true }))).toBe(true) + expect(shouldPassThroughToGlobalHandler('', key({ escape: true }))).toBe(true) + expect(shouldPassThroughToGlobalHandler('', key({ tab: true }))).toBe(true) + expect(shouldPassThroughToGlobalHandler('', key({ pageUp: true }))).toBe(true) + expect(shouldPassThroughToGlobalHandler('', key({ pageDown: true }))).toBe(true) + }) +}) diff --git a/ui-tui/src/__tests__/useConfigSync.test.ts b/ui-tui/src/__tests__/useConfigSync.test.ts index fc2dad19f1..39020d2763 100644 --- a/ui-tui/src/__tests__/useConfigSync.test.ts +++ b/ui-tui/src/__tests__/useConfigSync.test.ts @@ -1,13 +1,15 @@ -import { beforeEach, describe, expect, it, vi } from 'vitest' +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest' import { $uiState, resetUiState } from '../app/uiStore.js' import { applyDisplay, + hydrateFullConfig, normalizeBusyInputMode, normalizeIndicatorStyle, normalizeMouseTracking, normalizeStatusBar } from '../app/useConfigSync.js' +import type { ParsedVoiceRecordKey } from '../lib/platform.js' describe('applyDisplay', () => { beforeEach(() => { @@ -292,3 +294,139 @@ describe('applyDisplay → tui_status_indicator', () => { expect($uiState.get().indicatorStyle).toBe('kaomoji') }) }) + +// Regressions from Copilot review on #19835: the config-hydration path +// for voice.record_key was untested, so a future regression in the +// hydration or mtime-reapply wiring would slip past the suite. +describe('applyDisplay → voice.record_key (#18994)', () => { + beforeEach(() => { + resetUiState() + }) + + it('parses voice.record_key and pushes it through the setter', () => { + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + applyDisplay( + { config: { display: {}, voice: { record_key: 'ctrl+space' } } }, + setBell, + setVoiceRecordKey + ) + + expect(setVoiceRecordKey).toHaveBeenCalledWith( + expect.objectContaining({ ch: 'space', mod: 'ctrl', named: 'space', raw: 'ctrl+space' }) + ) + }) + + it('falls back to the documented default when voice.record_key is missing', () => { + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + applyDisplay({ config: { display: {} } }, setBell, setVoiceRecordKey) + + expect(setVoiceRecordKey).toHaveBeenCalledWith( + expect.objectContaining({ ch: 'b', mod: 'ctrl', raw: 'ctrl+b' }) + ) + }) + + it('is a no-op when the voice setter is not passed (back-compat)', () => { + const setBell = vi.fn() + + // applyDisplay is used in the setVoiceEnabled-less init path too; + // omitting the third arg must not throw. + expect(() => + applyDisplay({ config: { display: {}, voice: { record_key: 'alt+r' } } }, setBell) + ).not.toThrow() + }) + + it('does not reset voiceRecordKey when cfg is null (transient RPC failure)', () => { + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + // quietRpc() collapses request failures to null. Resetting the + // cached shortcut on every null would clobber a custom binding + // after one transient error until the next successful poll + // (Copilot round-8 review on #19835). + applyDisplay(null, setBell, setVoiceRecordKey) + + expect(setVoiceRecordKey).not.toHaveBeenCalled() + // bell is still applied (defaults to false on null), so the setter + // runs — we specifically only skip voiceRecordKey. + expect(setBell).toHaveBeenCalledWith(false) + }) +}) + +// Round-12 Copilot review regression on #19835: the live mtime-reload +// path was previously untested, so a regression in the polling/RPC +// wiring to applyDisplay would only be visible at runtime. The fetch +// + apply body is now shared as ``hydrateFullConfig()``, exercised +// directly from both the initial hydration and the poll-tick body. +describe('hydrateFullConfig', () => { + beforeEach(() => { + resetUiState() + }) + + const makeFakeGw = (payload: unknown) => + ({ + request: vi.fn(() => Promise.resolve(payload)), + on: vi.fn(), + off: vi.fn() + }) as any + + it('re-applies voice.record_key from a fresh config.get full response', async () => { + const gw = makeFakeGw({ config: { display: {}, voice: { record_key: 'ctrl+o' } } }) + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + await hydrateFullConfig(gw, setBell, setVoiceRecordKey) + + expect(gw.request).toHaveBeenCalledWith('config.get', { key: 'full' }) + expect(setVoiceRecordKey).toHaveBeenCalledWith( + expect.objectContaining({ ch: 'o', mod: 'ctrl', raw: 'ctrl+o' }) + ) + expect(setBell).toHaveBeenCalledWith(false) + }) + + it('reapplies the latest value on each invocation (mtime-reload semantics)', async () => { + const gw = makeFakeGw({ config: { display: {}, voice: { record_key: 'ctrl+b' } } }) + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + await hydrateFullConfig(gw, setBell, setVoiceRecordKey) + expect(setVoiceRecordKey).toHaveBeenLastCalledWith(expect.objectContaining({ ch: 'b' })) + + // Simulate a config edit: gw now returns a new shortcut. + gw.request = vi.fn(() => Promise.resolve({ config: { display: {}, voice: { record_key: 'alt+space' } } })) + + await hydrateFullConfig(gw, setBell, setVoiceRecordKey) + expect(setVoiceRecordKey).toHaveBeenLastCalledWith( + expect.objectContaining({ ch: 'space', mod: 'alt', named: 'space' }) + ) + }) + + it('leaves cached voiceRecordKey untouched when the RPC fails', async () => { + const gw = { request: vi.fn(() => Promise.reject(new Error('boom'))), on: vi.fn(), off: vi.fn() } as any + const setBell = vi.fn() + const setVoiceRecordKey = vi.fn() + + const result = await hydrateFullConfig(gw, setBell, setVoiceRecordKey) + + // quietRpc() swallows the error and returns null; applyDisplay + // sees cfg=null and skips the voice setter (Copilot round-8). + expect(result).toBeNull() + expect(setVoiceRecordKey).not.toHaveBeenCalled() + // bell setter still fires — applyDisplay's null-cfg path applies + // the documented bell default (false). + expect(setBell).toHaveBeenCalledWith(false) + }) + + it('threads through without a voice setter (back-compat call sites)', async () => { + const gw = makeFakeGw({ config: { display: { bell_on_complete: true } } }) + const setBell = vi.fn() + + // No third arg — applyDisplay must not throw and must still apply + // display flags (round-2 / round-8 invariant). + await expect(hydrateFullConfig(gw, setBell)).resolves.toBeTruthy() + expect(setBell).toHaveBeenCalledWith(true) + }) +}) diff --git a/ui-tui/src/app/interfaces.ts b/ui-tui/src/app/interfaces.ts index baf637aa25..dfe88fc040 100644 --- a/ui-tui/src/app/interfaces.ts +++ b/ui-tui/src/app/interfaces.ts @@ -4,6 +4,7 @@ import type { MutableRefObject, ReactNode, RefObject, SetStateAction } from 'rea import type { PasteEvent } from '../components/textInput.js' import type { GatewayClient } from '../gatewayClient.js' import type { ImageAttachResponse } from '../gatewayTypes.js' +import type { ParsedVoiceRecordKey } from '../lib/platform.js' import type { RpcResult } from '../lib/rpc.js' import type { Theme } from '../theme.js' import type { @@ -210,6 +211,7 @@ export interface InputHandlerContext { } voice: { enabled: boolean + recordKey: ParsedVoiceRecordKey recording: boolean setProcessing: StateSetter setRecording: StateSetter @@ -291,6 +293,7 @@ export interface SlashHandlerContext { } voice: { setVoiceEnabled: StateSetter + setVoiceRecordKey: (v: ParsedVoiceRecordKey) => void } } @@ -318,6 +321,7 @@ export interface AppLayoutComposerProps { queuedDisplay: string[] submit: (value: string) => void updateInput: StateSetter + voiceRecordKey: ParsedVoiceRecordKey } export interface AppLayoutProgressProps { diff --git a/ui-tui/src/app/slash/commands/session.ts b/ui-tui/src/app/slash/commands/session.ts index 0a5324ef55..a75419c3b0 100644 --- a/ui-tui/src/app/slash/commands/session.ts +++ b/ui-tui/src/app/slash/commands/session.ts @@ -10,6 +10,7 @@ import type { SessionUsageResponse, VoiceToggleResponse } from '../../../gatewayTypes.js' +import { formatVoiceRecordKey, parseVoiceRecordKey } from '../../../lib/platform.js' import { fmtK } from '../../../lib/text.js' import type { PanelSection } from '../../../types.js' import { DEFAULT_INDICATOR_STYLE, INDICATOR_STYLES, type IndicatorStyle } from '../../interfaces.js' @@ -221,6 +222,30 @@ export const sessionCommands: SlashCommand[] = [ ctx.guarded(r => { ctx.voice.setVoiceEnabled(!!r.enabled) + // Render the configured record key (config.yaml ``voice.record_key``) + // instead of hardcoded "Ctrl+B" — the gateway response carries the + // current value so /voice status and /voice on stay in sync with + // both the CLI and the TUI's actual binding (#18994). + // + // Copilot review on #19835 caught that rendering from the fresh + // backend response WITHOUT updating the frontend ``voice.recordKey`` + // state would skew display and binding between config-edit and + // the next ``mtime`` poll (~5s). Parse once, push into state so + // ``useInputHandlers()`` picks up the new binding immediately. + // + // Round-2 follow-up: only push state when the response actually + // carries ``record_key`` — otherwise an older gateway (or a future + // branch that forgets to include it) would clobber a custom user + // binding back to the default on every /voice invocation. The + // label still falls back to the documented default for display. + const parsed = r.record_key ? parseVoiceRecordKey(r.record_key) : undefined + + if (parsed) { + ctx.voice.setVoiceRecordKey(parsed) + } + + const recordKeyLabel = formatVoiceRecordKey(parsed ?? parseVoiceRecordKey('ctrl+b')) + // Match CLI's _show_voice_status / _enable_voice_mode / // _toggle_voice_tts output shape so users don't have to learn // two vocabularies. @@ -230,11 +255,11 @@ export const sessionCommands: SlashCommand[] = [ ctx.transcript.sys('Voice Mode Status') ctx.transcript.sys(` Mode: ${mode}`) ctx.transcript.sys(` TTS: ${tts}`) - ctx.transcript.sys(' Record key: Ctrl+B') + ctx.transcript.sys(` Record key: ${recordKeyLabel}`) // CLI's "Requirements:" block — surfaces STT/audio setup issues // so the user sees "STT provider: MISSING ..." instead of - // silently failing on every Ctrl+B press. + // silently failing on every record-key press. if (r.details) { ctx.transcript.sys('') ctx.transcript.sys(' Requirements:') @@ -259,7 +284,7 @@ export const sessionCommands: SlashCommand[] = [ if (r.enabled) { const tts = r.tts ? ' (TTS enabled)' : '' ctx.transcript.sys(`Voice mode enabled${tts}`) - ctx.transcript.sys(' Ctrl+B to start/stop recording') + ctx.transcript.sys(` ${recordKeyLabel} to start/stop recording`) ctx.transcript.sys(' /voice tts to toggle speech output') ctx.transcript.sys(' /voice off to disable voice mode') } else { diff --git a/ui-tui/src/app/useConfigSync.ts b/ui-tui/src/app/useConfigSync.ts index ad8f52f148..b0e590ee2c 100644 --- a/ui-tui/src/app/useConfigSync.ts +++ b/ui-tui/src/app/useConfigSync.ts @@ -7,6 +7,11 @@ import type { ConfigMtimeResponse, ReloadMcpResponse } from '../gatewayTypes.js' +import { + DEFAULT_VOICE_RECORD_KEY, + parseVoiceRecordKey, + type ParsedVoiceRecordKey +} from '../lib/platform.js' import { asRpcResult } from '../lib/rpc.js' import { @@ -89,10 +94,47 @@ const quietRpc = async = Record>( } } -export const applyDisplay = (cfg: ConfigFullResponse | null, setBell: (v: boolean) => void) => { +const _voiceRecordKeyFromConfig = (cfg: ConfigFullResponse | null): ParsedVoiceRecordKey => { + const raw = cfg?.config?.voice?.record_key + + return raw ? parseVoiceRecordKey(raw) : DEFAULT_VOICE_RECORD_KEY +} + +/** Fetch ``config.get full`` and fan the result through ``applyDisplay``. + * + * Extracted so the mtime-reload path can be exercised by the test + * suite without a React runtime (Copilot round-12 review on #19835). + * Both the initial hydration and the mtime poller use this shared + * helper, so a regression in the fetch/apply plumbing now fails the + * useConfigSync tests instead of only being visible at runtime. */ +export async function hydrateFullConfig( + gw: GatewayClient, + setBell: (v: boolean) => void, + setVoiceRecordKey?: (v: ParsedVoiceRecordKey) => void +): Promise { + const cfg = await quietRpc(gw, 'config.get', { key: 'full' }) + applyDisplay(cfg, setBell, setVoiceRecordKey) + return cfg +} + +export const applyDisplay = ( + cfg: ConfigFullResponse | null, + setBell: (v: boolean) => void, + setVoiceRecordKey?: (v: ParsedVoiceRecordKey) => void +) => { const d = cfg?.config?.display ?? {} setBell(!!d.bell_on_complete) + // Only push the voice record key when the RPC actually returned a + // config payload. ``quietRpc()`` collapses failures to ``null``; if we + // reset the cached shortcut on every null we would clobber a custom + // binding after one transient RPC error until the next config edit + // (Copilot round-8 review on #19835). The mtime-poll loop advances + // ``mtimeRef`` before this call, so staying silent on null preserves + // the last-good state and lets the next successful poll refresh it. + if (setVoiceRecordKey && cfg) { + setVoiceRecordKey(_voiceRecordKeyFromConfig(cfg)) + } patchUiState({ busyInputMode: normalizeBusyInputMode(d.busy_input_mode), compact: !!d.tui_compact, @@ -109,7 +151,13 @@ export const applyDisplay = (cfg: ConfigFullResponse | null, setBell: (v: boolea }) } -export function useConfigSync({ gw, setBellOnComplete, setVoiceEnabled, sid }: UseConfigSyncOptions) { +export function useConfigSync({ + gw, + setBellOnComplete, + setVoiceEnabled, + setVoiceRecordKey, + sid +}: UseConfigSyncOptions) { const mtimeRef = useRef(0) useEffect(() => { @@ -125,8 +173,8 @@ export function useConfigSync({ gw, setBellOnComplete, setVoiceEnabled, sid }: U quietRpc(gw, 'config.get', { key: 'mtime' }).then(r => { mtimeRef.current = Number(r?.mtime ?? 0) }) - quietRpc(gw, 'config.get', { key: 'full' }).then(r => applyDisplay(r, setBellOnComplete)) - }, [gw, setBellOnComplete, setVoiceEnabled, sid]) + void hydrateFullConfig(gw, setBellOnComplete, setVoiceRecordKey) + }, [gw, setBellOnComplete, setVoiceEnabled, setVoiceRecordKey, sid]) useEffect(() => { if (!sid) { @@ -154,17 +202,18 @@ export function useConfigSync({ gw, setBellOnComplete, setVoiceEnabled, sid }: U quietRpc(gw, 'reload.mcp', { session_id: sid, confirm: true }).then( r => r && turnController.pushActivity('MCP reloaded after config change') ) - quietRpc(gw, 'config.get', { key: 'full' }).then(r => applyDisplay(r, setBellOnComplete)) + void hydrateFullConfig(gw, setBellOnComplete, setVoiceRecordKey) }) }, MTIME_POLL_MS) return () => clearInterval(id) - }, [gw, setBellOnComplete, sid]) + }, [gw, setBellOnComplete, setVoiceRecordKey, sid]) } export interface UseConfigSyncOptions { gw: GatewayClient setBellOnComplete: (v: boolean) => void setVoiceEnabled: (v: boolean) => void + setVoiceRecordKey?: (v: ParsedVoiceRecordKey) => void sid: null | string } diff --git a/ui-tui/src/app/useInputHandlers.ts b/ui-tui/src/app/useInputHandlers.ts index a74c9e8431..20e9b087a4 100644 --- a/ui-tui/src/app/useInputHandlers.ts +++ b/ui-tui/src/app/useInputHandlers.ts @@ -348,9 +348,17 @@ export function useInputHandlers(ctx: InputHandlerContext): InputHandlerResult { return scrollTranscript(key.pageUp ? -step : step) } - // Queue-edit cancel beats selection-clear: the queue header explicitly - // promises "Esc cancel", so honoring it takes priority over the implicit - // selection-dismissal convention. Without an active edit, fall through. + // Escape-based voice bindings (ctrl/alt/super+escape) must win before the + // generic Esc handlers below; otherwise queue-edit cancel / selection-clear + // would swallow the chord and /voice would advertise a shortcut that never + // actually toggles recording in those UI states. + if (key.escape && isVoiceToggleKey(key, ch, voice.recordKey)) { + return voiceRecordToggle() + } + + // Queue-edit cancel beats selection-clear for plain Esc: the queue header + // explicitly promises "Esc cancel", so honoring it takes priority over the + // implicit selection-dismissal convention. Without an active edit, fall through. if (key.escape && cState.queueEditIdx !== null) { return cActions.clearIn() } @@ -439,7 +447,7 @@ export function useInputHandlers(ctx: InputHandlerContext): InputHandlerResult { return } - if (isVoiceToggleKey(key, ch)) { + if (isVoiceToggleKey(key, ch, voice.recordKey)) { return voiceRecordToggle() } diff --git a/ui-tui/src/app/useMainApp.ts b/ui-tui/src/app/useMainApp.ts index 218654f531..b39cc29a32 100644 --- a/ui-tui/src/app/useMainApp.ts +++ b/ui-tui/src/app/useMainApp.ts @@ -18,7 +18,7 @@ import { useGitBranch } from '../hooks/useGitBranch.js' import { useVirtualHistory } from '../hooks/useVirtualHistory.js' import { appendTranscriptMessage } from '../lib/messages.js' import { composerPromptWidth } from '../lib/inputMetrics.js' -import { isMac } from '../lib/platform.js' +import { DEFAULT_VOICE_RECORD_KEY, isMac, type ParsedVoiceRecordKey } from '../lib/platform.js' import { asRpcResult, rpcErrorMessage } from '../lib/rpc.js' import { terminalParityHints } from '../lib/terminalParity.js' import { buildToolTrailLine, sameToolTrailGroup, toolTrailLabel } from '../lib/text.js' @@ -104,6 +104,7 @@ export function useMainApp(gw: GatewayClient) { const [voiceEnabled, setVoiceEnabled] = useState(false) const [voiceRecording, setVoiceRecording] = useState(false) const [voiceProcessing, setVoiceProcessing] = useState(false) + const [voiceRecordKey, setVoiceRecordKey] = useState(DEFAULT_VOICE_RECORD_KEY) const [sessionStartedAt, setSessionStartedAt] = useState(() => Date.now()) const [turnStartedAt, setTurnStartedAt] = useState(null) const [goodVibesTick, setGoodVibesTick] = useState(0) @@ -394,7 +395,7 @@ export function useMainApp(gw: GatewayClient) { } }, [ui.busy]) - useConfigSync({ gw, setBellOnComplete, setVoiceEnabled, sid: ui.sid }) + useConfigSync({ gw, setBellOnComplete, setVoiceEnabled, setVoiceRecordKey, sid: ui.sid }) // Tab title: `⚠` waiting on approval/sudo/secret/clarify, `⏳` busy, `✓` idle. const model = ui.info?.model?.replace(/^.*\//, '') ?? '' @@ -539,6 +540,7 @@ export function useMainApp(gw: GatewayClient) { terminal: { hasSelection, scrollRef, scrollWithSelection, selection, stdout }, voice: { enabled: voiceEnabled, + recordKey: voiceRecordKey, recording: voiceRecording, setProcessing: setVoiceProcessing, setRecording: setVoiceRecording, @@ -642,7 +644,7 @@ export function useMainApp(gw: GatewayClient) { }, slashFlightRef, transcript: { page, panel, send, setHistoryItems, sys, trimLastExchange: session.trimLastExchange }, - voice: { setVoiceEnabled } + voice: { setVoiceEnabled, setVoiceRecordKey } }), [ catalog, @@ -782,9 +784,10 @@ export function useMainApp(gw: GatewayClient) { queueEditIdx: composerState.queueEditIdx, queuedDisplay: composerState.queuedDisplay, submit, - updateInput: composerActions.setInput + updateInput: composerActions.setInput, + voiceRecordKey }), - [cols, composerActions, composerState, empty, pagerPageSize, submit] + [cols, composerActions, composerState, empty, pagerPageSize, submit, voiceRecordKey] ) // Pass current progress through unfrozen — streaming update throttling diff --git a/ui-tui/src/components/appLayout.tsx b/ui-tui/src/components/appLayout.tsx index 8c2d210ca1..ec60726ed3 100644 --- a/ui-tui/src/components/appLayout.tsx +++ b/ui-tui/src/components/appLayout.tsx @@ -288,6 +288,7 @@ const ComposerPane = memo(function ComposerPane({ onSubmit={composer.submit} placeholder={composer.empty ? PLACEHOLDER : ui.busy ? 'Ctrl+C to interrupt…' : ''} value={composer.input} + voiceRecordKey={composer.voiceRecordKey} /> diff --git a/ui-tui/src/components/textInput.tsx b/ui-tui/src/components/textInput.tsx index 3008f0baf4..d8151e72b7 100644 --- a/ui-tui/src/components/textInput.tsx +++ b/ui-tui/src/components/textInput.tsx @@ -5,7 +5,14 @@ import { type MutableRefObject, useEffect, useMemo, useRef, useState } from 'rea import { setInputSelection } from '../app/inputSelectionStore.js' import { readClipboardText, writeClipboardText } from '../lib/clipboard.js' import { cursorLayout, offsetFromPosition } from '../lib/inputMetrics.js' -import { isActionMod, isMac, isMacActionFallback } from '../lib/platform.js' +import { + DEFAULT_VOICE_RECORD_KEY, + isActionMod, + isMac, + isMacActionFallback, + isVoiceToggleKey, + type ParsedVoiceRecordKey +} from '../lib/platform.js' type InkExt = typeof Ink & { stringWidth: (s: string) => number @@ -239,6 +246,7 @@ export function TextInput({ onSubmit, mask, mouseApiRef, + voiceRecordKey = DEFAULT_VOICE_RECORD_KEY, placeholder = '', focus = true }: TextInputProps) { @@ -699,6 +707,15 @@ export function TextInput({ (inp: string, k: Key, event: InputEvent) => { const eventRaw = event.keypress.raw + // Configured voice shortcut wins over composer-level defaults like + // paste/copy so users who bind voice to ctrl+v / alt+v / cmd+v + // actually get voice toggled instead of a paste (Copilot round-7 + // follow-up on #19835). The pass-through predicate is a no-op for + // ordinary typing and plain paste when voice is unbound to 'v'. + if (shouldPassThroughToGlobalHandler(inp, k, voiceRecordKey)) { + return + } + if ( eventRaw === '\x1bv' || eventRaw === '\x1bV' || @@ -744,22 +761,6 @@ export function TextInput({ return } - // Ctrl chords claimed by useInputHandlers — pass through instead of - // letting them fall into readline-style nav or a literal char insert. - // Ctrl+B = voice toggle, Ctrl+X = delete queued message while editing. - if ( - (k.ctrl && inp === 'c') || - (k.ctrl && inp === 'b') || - (k.ctrl && inp === 'x') || - k.tab || - (k.shift && k.tab) || - k.pageUp || - k.pageDown || - k.escape - ) { - return - } - if (k.return) { if (k.shift || k.ctrl || (isMac ? isActionMod(k) : k.meta)) { flushParentChange() @@ -1041,8 +1042,23 @@ interface TextInputProps { onSubmit?: (v: string) => void placeholder?: string value: string + voiceRecordKey?: ParsedVoiceRecordKey } +export const shouldPassThroughToGlobalHandler = ( + input: string, + key: Key, + voiceRecordKey: ParsedVoiceRecordKey = DEFAULT_VOICE_RECORD_KEY +): boolean => + (key.ctrl && input === 'c') || + (key.ctrl && input === 'x') || + key.tab || + (key.shift && key.tab) || + key.pageUp || + key.pageDown || + key.escape || + isVoiceToggleKey(key, input, voiceRecordKey) + export interface TextInputMouseApi { dragAt: (row: number, col: number) => void end: () => void diff --git a/ui-tui/src/gatewayTypes.ts b/ui-tui/src/gatewayTypes.ts index a1513d2a6e..7fca2837fa 100644 --- a/ui-tui/src/gatewayTypes.ts +++ b/ui-tui/src/gatewayTypes.ts @@ -75,8 +75,14 @@ export interface ConfigDisplayConfig { tui_statusbar?: 'bottom' | 'off' | 'on' | 'top' | boolean } +export interface ConfigVoiceConfig { + // Raw `yaml.safe_load()` value from config; may be non-string if hand-edited. + // Callers must normalize/validate at runtime (parseVoiceRecordKey()). + record_key?: unknown +} + export interface ConfigFullResponse { - config?: { display?: ConfigDisplayConfig } + config?: { display?: ConfigDisplayConfig; voice?: ConfigVoiceConfig } } export interface ConfigMtimeResponse { @@ -279,6 +285,7 @@ export interface VoiceToggleResponse { available?: boolean details?: string enabled?: boolean + record_key?: string stt_available?: boolean tts?: boolean } diff --git a/ui-tui/src/lib/platform.ts b/ui-tui/src/lib/platform.ts index 343d8f8683..d7d2cc1ff0 100644 --- a/ui-tui/src/lib/platform.ts +++ b/ui-tui/src/lib/platform.ts @@ -51,13 +51,359 @@ export const isCopyShortcut = ( (isMac && key.ctrl && (key.meta || key.super === true))) /** - * Voice recording toggle key (Ctrl+B). + * Voice recording toggle key — configurable via ``voice.record_key`` in + * ``config.yaml`` (default ``ctrl+b``). * - * Documented as "Ctrl+B" everywhere: tips.py, config.yaml's voice.record_key - * default, and the Python CLI prompt_toolkit handler. We accept raw Ctrl+B on - * every platform so the TUI matches those docs. On macOS we additionally - * accept Cmd+B (the platform action modifier) so existing macOS muscle memory - * keeps working. + * Documented in tips.py, the Python CLI prompt_toolkit handler, and the + * config.yaml default. The TUI honours the same config knob (#18994); + * when ``voice.record_key`` is e.g. ``ctrl+o`` the TUI binds Ctrl+O. + * + * Only the documented default (``ctrl+b``) additionally accepts the + * macOS action modifier (Cmd+B) — custom bindings like ``ctrl+o`` + * require the literal Ctrl bit so Cmd+O can't steal the shortcut. */ -export const isVoiceToggleKey = (key: { ctrl: boolean; meta: boolean; super?: boolean }, ch: string): boolean => - (key.ctrl || isActionMod(key)) && ch.toLowerCase() === 'b' +export type VoiceRecordKeyMod = 'alt' | 'ctrl' | 'super' + +/** Named (multi-character) keys we support, matching the CLI's + * prompt_toolkit binding shape (``c-space``, ``c-enter``, etc.) so a + * config value like ``ctrl+space`` binds in both runtimes. */ +export type VoiceRecordKeyNamed = 'backspace' | 'delete' | 'enter' | 'escape' | 'space' | 'tab' + +export interface ParsedVoiceRecordKey { + /** Single character (``'b'``, ``'o'``) when ``named`` is undefined, + * otherwise the named-key token (``'space'``, ``'enter'``…). Kept as + * one field for back-compat with the v1 ``{ ch, mod, raw }`` shape. */ + ch: string + mod: VoiceRecordKeyMod + named?: VoiceRecordKeyNamed + raw: string +} + +export const DEFAULT_VOICE_RECORD_KEY: ParsedVoiceRecordKey = { + ch: 'b', + mod: 'ctrl', + raw: 'ctrl+b' +} + +/** Modifier aliases. + * + * ``meta`` / ``cmd`` / ``command`` are intentionally absent. + * hermes-ink sets ``key.meta`` for plain Alt/Option on every platform + * AND for Cmd on some legacy macOS terminals (Terminal.app without + * kitty-protocol passthrough). Accepting any of those as a literal + * modifier would produce a display/binding mismatch — a config like + * ``cmd+b`` would render as ``Cmd+B`` but silently fire on Alt+B, or + * never fire at all on legacy terminals even though the UI advertises + * it (Copilot round-6 review on #19835). Users on modern kitty-style + * terminals (iTerm2 CSI-u, Ghostty, Kitty, WezTerm, Alacritty) spell + * the platform action modifier ``super`` / ``win``, which match the + * unambiguous ``key.super`` bit. macOS users on Terminal.app stick + * with the documented ``ctrl+b``. + * + * Cross-runtime parity: the ``ctrl`` / ``control`` / ``alt`` / ``option`` / + * ``opt`` spellings are normalized identically in the classic CLI + * (``hermes_cli/voice.py::normalize_voice_record_key_for_prompt_toolkit``) + * so one ``voice.record_key`` value binds the same shortcut in both + * runtimes (Copilot round-9 review on #19835). The ``super`` / + * ``win`` / ``windows`` spellings are TUI-only — prompt_toolkit has no + * super modifier, so the CLI falls back to the documented default and + * logs a warning at startup (Copilot round-11 review on #19835). */ +const _MOD_ALIASES: Record = { + alt: 'alt', + control: 'ctrl', + ctrl: 'ctrl', + option: 'alt', + opt: 'alt', + super: 'super', + win: 'super', + windows: 'super' +} + +/** Map config-string named tokens to the canonical name used at match time. + * + * Aliases mirror what prompt_toolkit accepts (``return`` ↔ ``enter``, + * ``esc`` ↔ ``escape``) so a config that round-trips through the CLI also + * binds in the TUI. */ +const _NAMED_KEY_ALIASES: Record = { + backspace: 'backspace', + bs: 'backspace', + del: 'delete', + delete: 'delete', + enter: 'enter', + esc: 'escape', + escape: 'escape', + ret: 'enter', + return: 'enter', + space: 'space', + spc: 'space', + tab: 'tab' +} + +/** ``useInputHandlers()`` intercepts these unconditionally before the + * voice check runs, so a binding like ``ctrl+c`` (interrupt), + * ``ctrl+d`` (quit), or ``ctrl+l`` (clear screen) would be advertised + * in /voice status but never fire push-to-talk. Reject at parse time + * so the user gets the documented Ctrl+B instead of a dead shortcut + * (Copilot round-4 review on #19835). + * + * ``ctrl+x`` is intentionally NOT here — it's only claimed during + * queue-edit (``queueEditIdx !== null``), so the voice binding works + * for most of the session and matches CLI parity for ``ctrl+`` + * bindings (Copilot round-8 review on #19835). */ +const _RESERVED_CTRL_CHARS = new Set(['c', 'd', 'l']) + +/** On macOS the action-modifier intercepts these editor chords via + * ``isCopyShortcut`` / ``isAction`` in ``useInputHandlers()``: + * - super+c → copy + * - super+d → exit + * - super+l → clear screen + * - super+v → paste (also claimed at the TextInput layer) + * On Linux/Windows those globals key off Ctrl instead of Super, so + * super+ bindings don't collide. Gate the rejection to darwin + * at parse time so kitty/CSI-u ``super+`` configs still work for + * non-mac users (Copilot round-8 review on #19835). */ +const _RESERVED_SUPER_CHARS = new Set(['c', 'd', 'l', 'v']) + +/** On macOS ``isActionMod`` accepts ``key.meta`` as the action + * modifier — but hermes-ink reports Alt as ``key.meta`` on many + * terminals. So on darwin a configured ``alt+c`` / ``alt+d`` / ``alt+l`` + * gets swallowed by ``isCopyShortcut`` / ``isAction`` before the voice + * check runs. Block at parse time so /voice status doesn't advertise + * a shortcut that actually copies / quits / clears (Copilot round-12 + * review on #19835). */ +const _RESERVED_ALT_CHARS_MAC = new Set(['c', 'd', 'l']) + +interface RuntimeKeyEvent { + alt?: boolean + backspace?: boolean + ctrl: boolean + delete?: boolean + escape?: boolean + meta: boolean + return?: boolean + shift?: boolean + super?: boolean + tab?: boolean +} + +/** Match an ink ``key`` event against a parsed named key. The ink runtime + * sets one boolean per named key; ``space`` is a printable char so it + * arrives as ``ch === ' '`` rather than a dedicated ``key.space`` flag. */ +const _matchesNamedKey = ( + named: VoiceRecordKeyNamed, + key: RuntimeKeyEvent, + ch: string +): boolean => { + switch (named) { + case 'backspace': + return key.backspace === true + case 'delete': + return key.delete === true + case 'enter': + return key.return === true + case 'escape': + return key.escape === true + case 'space': + return ch === ' ' + case 'tab': + return key.tab === true + } +} + +/** + * Parse a config-string voice record key like ``ctrl+b`` / ``alt+r`` / + * ``ctrl+space`` into ``{mod, ch, named?}``. Accepts single characters + * AND the named tokens declared in ``_NAMED_KEY_ALIASES`` (``space``, + * ``enter``/``return``, ``tab``, ``escape``/``esc``, ``backspace``, + * ``delete``) — matching the keys prompt_toolkit accepts on the CLI + * side via the ``c-`` rewrite in ``cli.py``. + * + * Accepts ``unknown`` because the source is raw YAML via + * ``config.get full`` — a hand-edited ``voice.record_key: 1`` or + * ``voice.record_key: true`` would otherwise crash ``.trim()`` on a + * non-string scalar (Copilot round-3 review on #19835). Non-string / + * empty / unrecognised values fall back to the documented Ctrl+B + * default so a typo never silently disables the shortcut. + */ +export const parseVoiceRecordKey = (raw: unknown): ParsedVoiceRecordKey => { + if (typeof raw !== 'string') { + return DEFAULT_VOICE_RECORD_KEY + } + + const lower = raw.trim().toLowerCase() + + if (!lower) { + return DEFAULT_VOICE_RECORD_KEY + } + + const parts = lower.split('+').map(p => p.trim()).filter(Boolean) + + if (!parts.length) { + return DEFAULT_VOICE_RECORD_KEY + } + + const last = parts[parts.length - 1] + const modCandidates = parts.slice(0, -1) + + // Reject multi-modifier chords (``ctrl+alt+r``, ``cmd+ctrl+b``) rather + // than silently dropping the extra modifier — the previous + // single-token validator made a typo bind a different shortcut than + // the user configured (Copilot round-3 review on #19835). The classic + // CLI only supports single-modifier bindings via prompt_toolkit's + // ``c-x`` / ``a-x`` rewrite in ``cli.py``, so this matches CLI parity. + if (modCandidates.length > 1) { + return DEFAULT_VOICE_RECORD_KEY + } + + // Require an explicit modifier. A bare ``o`` / ``space`` / ``escape`` + // has no sensible mapping: the CLI's prompt_toolkit binds the raw + // key (no rewrite) so bare-char configs would silently diverge + // between the two runtimes (Copilot round-4 review on #19835). + // Fall back to the documented default. + if (modCandidates.length === 0) { + return DEFAULT_VOICE_RECORD_KEY + } + + const norm = _MOD_ALIASES[modCandidates[0]] + + // Unknown modifier token (e.g. bare ``meta+b`` which is ambiguous on + // the wire) falls back to the documented default rather than + // silently coercing to Ctrl and producing a misleading bind. + if (!norm) { + return DEFAULT_VOICE_RECORD_KEY + } + + const mod = norm + + // Block bindings the TUI input handler intercepts before the voice + // check — ``ctrl+c`` / ``ctrl+d`` / ``ctrl+l`` would never actually + // fire push-to-talk, so advertising them in /voice status is a lie. + if (mod === 'ctrl' && last.length === 1 && _RESERVED_CTRL_CHARS.has(last)) { + return DEFAULT_VOICE_RECORD_KEY + } + + // Same for ``super+c`` / ``super+d`` / ``super+l`` / ``super+v`` on + // macOS only — those are copy / exit / clear / paste and get claimed + // by ``isCopyShortcut`` / ``isAction`` / the TextInput paste layer + // before voice has a chance to toggle. On Linux/Windows the TUI + // globals key off Ctrl (not Super), so kitty/CSI-u ``super+`` + // bindings stay usable for non-mac users. + if (isMac && mod === 'super' && last.length === 1 && _RESERVED_SUPER_CHARS.has(last)) { + return DEFAULT_VOICE_RECORD_KEY + } + + // On macOS hermes-ink reports Alt as ``key.meta``, which ``isActionMod`` + // accepts as the mac action modifier. So ``alt+c`` / ``alt+d`` / ``alt+l`` + // collide with copy / exit / clear in ``useInputHandlers()`` before the + // voice check. Reject at parse time on darwin only — non-mac ``alt+`` + // bindings are still usable (Copilot round-12 review on #19835). + if (isMac && mod === 'alt' && last.length === 1 && _RESERVED_ALT_CHARS_MAC.has(last)) { + return DEFAULT_VOICE_RECORD_KEY + } + + if (last.length === 1) { + return { ch: last, mod, raw: lower } + } + + const named = _NAMED_KEY_ALIASES[last] + + if (named) { + return { ch: named, mod, named, raw: lower } + } + + // Unknown multi-character token (e.g. typo'd ``ctrl+spcae``) — fall back + // to the doc default rather than silently disabling the binding. + return DEFAULT_VOICE_RECORD_KEY +} + +/** Render a parsed key back as ``Ctrl+B`` / ``Ctrl+Space`` for status text. + * + * Platform-aware for the ``super`` modifier: renders ``Cmd`` on macOS and + * ``Super`` elsewhere. Previously rendered ``Cmd`` universally, which told + * Linux/Windows users the wrong modifier to press (Copilot review, round + * 2 on #19835). */ +export const formatVoiceRecordKey = (parsed: ParsedVoiceRecordKey): string => { + const modLabel = + parsed.mod === 'super' ? (isMac ? 'Cmd' : 'Super') : parsed.mod[0].toUpperCase() + parsed.mod.slice(1) + // Named tokens render in title case (Ctrl+Space, Ctrl+Enter); single + // chars render upper-case to match the existing Ctrl+B convention. + const keyLabel = parsed.named + ? parsed.named[0].toUpperCase() + parsed.named.slice(1) + : parsed.ch.toUpperCase() + + return `${modLabel}+${keyLabel}` +} + +/** Whether the parsed binding is the documented default (ctrl+b). + * + * Compare on the parsed spec rather than ``raw`` so semantically-equal + * aliases (``control+b``, ``ctrl + b``) still get the macOS Cmd+B + * muscle-memory fallback (Copilot review, round 2 on #19835). */ +const _isDefaultVoiceKey = (parsed: ParsedVoiceRecordKey): boolean => + parsed.mod === DEFAULT_VOICE_RECORD_KEY.mod && + parsed.ch === DEFAULT_VOICE_RECORD_KEY.ch && + parsed.named === DEFAULT_VOICE_RECORD_KEY.named + +export const isVoiceToggleKey = ( + key: RuntimeKeyEvent, + ch: string, + configured: ParsedVoiceRecordKey = DEFAULT_VOICE_RECORD_KEY +): boolean => { + // Match the configured key first (single-char compare or named-key + // event-property check). Bail out before evaluating modifier shape + // so the wrong key never reaches the modifier guard. + if (configured.named) { + if (!_matchesNamedKey(configured.named, key, ch)) { + return false + } + } else if (ch.toLowerCase() !== configured.ch) { + return false + } + + // The parser rejects multi-modifier configs (``ctrl+shift+b`` etc.), + // so at match time Shift must always be clear — otherwise + // ``ctrl+tab`` would also fire on Ctrl+Shift+Tab and ``alt+enter`` + // on Alt+Shift+Enter, triggering a different chord than configured + // (Copilot round-5 review on #19835). + if (key.shift === true) { + return false + } + + switch (configured.mod) { + case 'alt': + // Most terminals surface Alt as either ``alt`` or ``meta``; accept + // both so the binding works across xterm-style and kitty-style + // protocols. Guard against ctrl/super bits so a chord like + // Ctrl+Alt+ or Cmd+Alt+ doesn't spuriously fire the + // alt binding. + // + // Bare Escape on hermes-ink can arrive as ``key.meta=true`` on some + // terminals, so a configured ``alt+escape`` must not match that shape; + // require an explicit alt bit for escape chords (Copilot round-7 + // follow-up on #19835). + return (key.alt === true || (key.meta && key.escape !== true)) && !key.ctrl && key.super !== true + case 'ctrl': + // Require the Ctrl bit AND a clear Alt/Super so a chord like + // Ctrl+Alt+ / Ctrl+Cmd+ doesn't spuriously match + // ``ctrl+`` (Copilot round-6 review on #19835). + // + // The documented default (``ctrl+b``) additionally accepts the + // explicit ``key.super`` bit on macOS for Cmd+B muscle memory — + // but ONLY ``key.super`` (kitty-style), never ``key.meta``, since + // ``key.meta`` is hermes-ink's Alt signal and accepting it would + // fire the binding on Alt+B. + if (key.ctrl) { + return !key.alt && !key.meta && key.super !== true + } + + return _isDefaultVoiceKey(configured) && isMac && key.super === true && !key.alt && !key.meta + case 'super': + // Require the explicit ``key.super`` bit (kitty-style protocol) + // AND clear Ctrl/Alt/Meta so Ctrl+Cmd+X or Alt+Cmd+X don't + // spuriously fire the super binding (Copilot round-6 review on + // #19835). Legacy-terminal users whose Cmd arrives as + // ``key.meta`` need a kitty-protocol terminal — see the + // _MOD_ALIASES doc-comment for the rationale. + return key.super === true && !key.ctrl && !key.alt && !key.meta + } +}