The LLM review prompt mentioned bespoke `archive_skill` and `pin_skill`
tools that are not registered as model tools. Swap the prompt to rely
on the real surface:
- skill_manage action=patch — for patching and consolidation
- terminal — to `mv` skill dirs into .archive/
Also drop `pin` from the model's decision list — pinning is a user
opt-out for `hermes curator pin <skill>`, not something the model
should do autonomously.
Decision list is now: keep / patch / consolidate / archive.
Tests updated: prompt-invariant test now asserts the existing tools
are referenced and that bespoke tool names do NOT appear. New test
prevents `pin` from being re-added as a model decision.
Adds the Curator — an auxiliary-model background task that periodically
reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage,
transitions unused skills through active → stale → archived, and spawns
a forked AIAgent to consolidate overlaps and patch drift.
Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI
startup and gateway boot when the last run is older than interval_hours
(default 24) AND the agent has been idle for min_idle_hours (default 2).
Invariants (all load-bearing):
- Never touches bundled or hub-installed skills (.bundled_manifest +
.hub/lock.json double-filter)
- Never auto-deletes — archive only. Archives are recoverable
via `hermes curator restore <skill>`
- Pinned skills bypass all auto-transitions
- Uses the aux client; never touches the main session's prompt cache
New files:
- tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes,
provenance filter
- agent/curator.py — orchestrator: config, idle gating, state-machine
transitions (pure, no LLM), forked-agent review prompt
- hermes_cli/curator.py — `hermes curator {status,run,pause,resume,
pin,unpin,restore}` subcommand
- tests/tools/test_skill_usage.py — 29 tests
- tests/agent/test_curator.py — 25 tests
Modified files (surgical patches):
- tools/skills_tool.py — bump view_count on successful skill_view
- tools/skill_manager_tool.py — bump patch_count on skill_manage
patch/edit/write_file/remove_file; forget record on delete
- hermes_cli/config.py — add curator: section to DEFAULT_CONFIG
- hermes_cli/commands.py — add /curator CommandDef with subcommands
- hermes_cli/main.py — register `hermes curator` subparser via
register_cli() from hermes_cli.curator
- cli.py — /curator slash-command dispatch + startup hook
- gateway/run.py — gateway-boot hook (mirrors CLI)
Validation:
- 54 new tests across skill_usage + curator, all passing in 3s
- 346 tests across all touched files' neighbors green
- 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green
- CLI smoke: `hermes curator status/pause/resume` work end-to-end
Companion to PR #16026 (class-first skill review prompt) — together
they form a loop: the review prompt stops near-duplicate skill creation
at the source, and the curator prunes/consolidates what still accumulates.
Refs #7816.
Relative entries in skills.external_dirs were resolved against the
process cwd via Path.resolve(), making them silently fail when Hermes
was launched from a different directory.
Resolve relative paths against get_hermes_home() for consistent
behavior across CLI, gateway, and cron contexts. Absolute paths
and env-var/tilde expansion are unchanged.
Commit 3c42064e made config.yaml the single source of truth for
TERMINAL_CWD, but the config bridge passes cwd values verbatim to
os.environ. When a user sets terminal.cwd: ~/ in config.yaml, the
literal string '~/'' reaches subprocess.Popen, which the kernel
rejects because it does not expand shell tilde syntax.
This patch adds three defensive layers:
1. gateway/run.py — expanduser at config bridge time so TERMINAL_CWD
is always an absolute path.
2. tools/terminal_tool.py — expanduser when reading TERMINAL_CWD in
_get_env_config(), guarding against stale or manually-set env vars.
3. tools/environments/local.py — expanduser in LocalEnvironment before
passing cwd to subprocess.Popen, the final safety net.
Includes regression tests in test_config_cwd_bridge.py for nested
terminal.cwd, top-level cwd alias, and precedence ordering.
Refs: 3c42064e
Finish the Copilot review cleanup for lazy prompt submission:
- prompt.submit now claims session.running before returning success, preserving
the existing RPC-level session busy error so the frontend can queue.
- agent-init timeout/failure now emits a normal error event instead of writing a
second JSON-RPC response for an already-settled request id.
Tests:
- python -m py_compile tui_gateway/server.py tui_gateway/entry.py
- cd ui-tui && npm run type-check && npm run build
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
The known-key splitter in `_sanitize_env_lines` used substring matching
to find concatenated KEY=VALUE pairs. When a registered key was a suffix
of another (LM_API_KEY is a suffix of GLM_API_KEY), the shorter key's
needle would match inside the longer one, causing the sanitizer to
rewrite `GLM_API_KEY=...` as `G\nLM_API_KEY=...` and silently break
Z.AI/GLM auth (and similarly `GLM_BASE_URL` -> `G\nLM_BASE_URL`).
Drop matches whose needle range is fully contained within a longer
overlapping match. Two regression tests cover the suffix-collision case
and confirm a real concatenation that happens to start with the longer
key still splits where it should.
Fixes#17138
Respond to Copilot's lazy-start review: session metadata/history/usage do not
need a constructed AIAgent, so keep them on the no-wait session path. This
preserves the deferred startup model and avoids blocking simple session RPCs on
agent initialization.
Tests:
- python -m py_compile tui_gateway/server.py tui_gateway/entry.py
- cd ui-tui && npm run type-check && npm run build
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
Classic CLI exposes ``/reload`` (re-reads ~/.hermes/.env into
``os.environ`` via ``hermes_cli.config.reload_env``) so newly added API
keys take effect without restarting the session. The TUI was missing
the parity command, so users had to Ctrl+C out and ``hermes --tui``
again whenever they added or rotated a credential.
Three small wires:
* New ``reload.env`` JSON-RPC method in ``tui_gateway/server.py`` that
delegates to ``hermes_cli.config.reload_env`` and returns the count
of vars updated.
* New ``/reload`` slash command in ``ui-tui/src/app/slash/commands/ops.ts``
matching the existing ``/reload-mcp`` pattern (native RPC, no slash
worker).
* Drop ``cli_only=True`` from the ``reload`` ``CommandDef`` in
``hermes_cli/commands.py`` so help/menus surface it in the TUI too.
``reload_env`` itself is environment-agnostic.
Same caveat as classic CLI: the *currently constructed* agent's
credential pool / provider routing does not auto-rebuild. Users who
want a brand-new credential resolution should follow with ``/new``.
Tests:
* New ``test_reload_env_rpc_calls_hermes_cli_reload_env`` confirms
RPC delegates and reports the count.
* New ``test_reload_env_rpc_surfaces_errors`` confirms exceptions are
rendered as JSON-RPC errors.
* ``createSlashHandler.test.ts`` slash-parity matrix extended with
``['/reload', 'reload.env', {}]`` so we can't regress the routing.
Validation:
scripts/run_tests.sh tests/test_tui_gateway_server.py — 92/92.
scripts/run_tests.sh tests/hermes_cli/test_commands.py — 128/128.
cd ui-tui && npm run type-check — clean; npm test --run — 390/390.
After PR #7885 (97b0cd51e) added content-side segment breaks for
natural mid-turn assistant messages, the tool-progress task in
gateway/run.py was not updated to match. progress_msg_id and
progress_lines persisted for the whole run, so after a tool batch
produced bubble B1 followed by content bubble C1, the next tool.started
kept editing the OLD bubble B1 above C1 — making the chat appear out
of order on Telegram, Discord, and Slack.
Add on_new_message callback to GatewayStreamConsumer, fired at the
four sites where a fresh content bubble lands on the platform:
- _send_or_edit first-send branch (NOT edits)
- _send_commentary
- _send_new_chunk (overflow split)
- each successful chunk of _send_fallback_final
Gateway supplies a lambda that enqueues ('__reset__',) into the
progress_queue. send_progress_messages() handles the marker in both
the main loop and the CancelledError drain path, clearing
progress_msg_id, progress_lines, and the dedup state so the next
tool.started opens a fresh bubble below the new content.
Result: each tool batch appears in chronological order below the
preceding content. When no content appears between tool batches,
tools still group in one bubble (CLI-style compactness).
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
init_session() runs a login shell bootstrap that sources profile scripts
(.bashrc, .bash_profile, etc.) before capturing pwd. If any profile
script changes the working directory, the captured cwd overwrites the
configured terminal.cwd value — so terminal commands run in the wrong
directory despite the TUI banner showing the configured path.
Add an explicit 'builtin cd' to the configured cwd in the bootstrap
script, after profile sourcing but before pwd capture, ensuring the
configured terminal.cwd is always what gets recorded.
Fixes#14044
* Reject unsupported schemes (anything outside http/https/ws/wss) in
cli.py /browser connect before probing or persisting, matching the
gateway's existing 4015 path.
* Defend gateway browser.manage against `{"url": null}` and
non-string urls: empty/null falls back to DEFAULT_BROWSER_CDP_URL,
non-string returns a 4015 instead of slipping into the generic
5031 catch via TypeError on `"://" in url`.
* Add regression tests for both null-url fallback and non-string
rejection.
* Gate `browser.progress` emit on truthy `session_id`. The TUI
prints `messages` from the response when there's no session, so
emitting events too would double-render. Now: with a session →
events stream live; without one → bundled messages only.
* Resolve `system = platform.system()` once in `_browser_connect`
and thread it through `try_launch_chrome_debug` and
`_failure_messages` → `manual_chrome_debug_command`, so the
generated hint is consistent (and tests are deterministic) on
any host.
* Add `test_browser_manage_connect_no_session_skips_progress_events`
to lock in the gating behavior.
Fixes from Copilot's two passes on PR #17238:
* Validate parsed URL once: reject missing host, invalid port, and
unsupported scheme up front so malformed inputs (e.g. http://:9222
or http://localhost:abc) don't fall through to a generic 5031.
* Tighten _is_default_local_cdp to require a discovery-style path so
ws://127.0.0.1:9222/devtools/browser/<id> is not collapsed to bare
http://127.0.0.1:9222 (which would lose the path and break the
connect).
* Move browser.manage into _LONG_HANDLERS so the up-to-10s
launch-and-retry loop runs on the RPC pool instead of blocking the
main dispatcher.
* try_launch_chrome_debug uses Windows-appropriate detach kwargs
(creationflags=DETACHED_PROCESS|CREATE_NEW_PROCESS_GROUP) instead
of POSIX-only start_new_session=True.
* manual_chrome_debug_command uses subprocess.list2cmdline on
Windows so the printed instruction is cmd.exe-compatible.
* Mirror host/port validation in cli.py /browser connect so the
classic CLI never persists an invalid BROWSER_CDP_URL.
Split browser.manage into a small dispatcher with named connect/disconnect
helpers, fold _http_ok / _probe_urls / _normalize_cdp_url out of the nested
probe loop, collapse the failure-message scaffolding, and DRY the chrome
candidate path tables. Behaviour and event shape unchanged.
Emit browser.progress JSON-RPC notifications during the connect work and render them in the TUI as system transcript lines, so users see the same step-by-step status the base CLI prints instead of nothing for ~1m followed by a final result.
Return CLI-style browser connect status messages from the gateway and render them in the TUI so local Chrome launch attempts are visible instead of ending in a silent delayed failure.
Detect an actual Chrome/Chromium executable before printing a manual CDP launch command, including common WSL-mounted Windows browser paths, so /browser connect does not suggest google-chrome when it is unavailable.
Share Chrome CDP launch helpers between the classic CLI and TUI so default /browser connect uses loopback consistently, retries local Chrome launch, and reports a copyable manual-start command instead of claiming a dead connection.
Clean up the remaining review nits:
- let the deferred @hermes/ink import retry after a transient failure instead
of memoizing a rejected promise forever
- keep memory-monitor in-flight state inside a finally so future exceptions
cannot suppress that memory level indefinitely
- use read_raw_config for the TUI MCP cold-start probe instead of full
load_config()
- keep input.detect_drop for explicit relative path prefixes (./ and ../)
while preserving the no-RPC fast path for ordinary plain prompts
Tests:
- python -m py_compile tui_gateway/server.py tui_gateway/entry.py
- cd ui-tui && npm run type-check && npm run build
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
A cleanup review found that adding prompt.submit to _LONG_HANDLERS made the RPC
pool own the full first-turn wait even though the handler itself already spawns
a turn thread. Keep prompt.submit inline and make it return immediately:
- look up the session without waiting
- kick the lazy agent build
- spawn a short waiter thread that blocks on agent_ready, then starts the
existing turn dispatcher
This keeps stdin dispatch responsive, avoids occupying a bounded pool worker for
a normal chat turn, and preserves the lazy-start hydration behavior.
Tests:
- python -m py_compile tui_gateway/server.py
- cd ui-tui && npm run type-check && npm run build
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
Copilot correctly flagged two concurrency windows:
- memoryMonitor could re-enter while awaiting the lazy @hermes/ink import or
heap dump, producing duplicate imports/dumps under sustained pressure.
- _start_agent_build used a check-then-set guard without synchronization, so
concurrent agent-backed RPCs could start duplicate agent builders.
Fix both with single-flight guards: cache the dynamic import promise and track
per-level dump in-flight state in memoryMonitor, and protect the TUI agent build
flag with a per-session lock.
Tests:
- python -m py_compile tui_gateway/server.py
- cd ui-tui && npm run type-check && npm run build
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
The lazy startup panel could remain stuck on the placeholder when no first
prompt was submitted because agent construction only started from _sess(). Keep
session.create cheap, but schedule _start_agent_build shortly after returning
the placeholder so tools/skills hydrate automatically.
Also replace the ugly placeholder bar rows with compact unicode-animations
braille loaders for the tools and skills sections.
Tests:
- python -m py_compile tui_gateway/server.py
- cd ui-tui && npm run type-check && npm run build
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
Match classic CLI perceived startup behavior: show the TUI shell and composer
before constructing the full AIAgent. session.create now returns a lightweight
placeholder session with lazy=true and no longer starts _make_agent eagerly.
The first method that needs the agent triggers _start_agent_build() via _sess();
prompt.submit is routed through the RPC worker pool so that the initial wait for
agent construction does not block the stdio dispatcher.
The intro panel renders skeleton rows for tools/skills while the real
session.info payload is absent, then hydrates to the real tools/skills panel once
AIAgent initialization completes. Also skip the startup /voice status probe and
avoid the input.detect_drop RPC for ordinary plain-text prompts to keep early
startup/first-submit paths cheap.
Measurements on macOS Terminal.app:
- Previous full ready p50 after earlier PR commits: ~1537ms
- Lazy skeleton panel p50: ~794ms
- Original baseline full ready p50: ~1843ms
So the visible startup surface is now ~743ms faster than the prior PR state and
~1.05s faster than the original baseline. First prompt still pays the same agent
construction cost if it races the background/skeleton state, matching classic
CLI's deferred behavior.
Tests:
- python -m py_compile tui_gateway/server.py
- cd ui-tui && npm run type-check && npm run build
- scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
- cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts
The background skill-review prompts (_SKILL_REVIEW_PROMPT and the **Skills**
half of _COMBINED_REVIEW_PROMPT) steered the reviewer toward passive
behavior — most passes concluded 'Nothing to save.' even when the session
produced real lessons. User-preference corrections (style, format,
legibility, verbosity) were especially lost: they were read as memory
signals only, so skills never carried the fix.
This rewrite changes the stance:
- **Active-update bias.** The reviewer now treats inaction as a missed
learning opportunity. 'Nothing to save.' remains an explicit escape
but is no longer framed as the most-common outcome.
- **User-preference corrections are first-class skill signals.** Style,
tone, format, legibility, verbosity complaints — and the actual
phrasings users use ('stop doing X', 'this is too verbose', 'I hate
when you Y', 'remember this') — now warrant patching the skill that
governs the task, not just writing to memory.
- **Loaded-skill-first preference order.** When a skill was loaded via
/skill-name or skill_view during the session, the reviewer patches
THAT one first. It was in play; it's the right place.
- **Four-step ladder: patch-loaded → patch-umbrella → support-file →
create.** Support files are explicitly enumerated as three kinds:
* references/<topic>.md — session-specific detail OR condensed
knowledge banks (quoted research, API docs excerpts, domain notes)
* templates/<name>.<ext> — starter files to copy and modify
* scripts/<name>.<ext> — statically re-runnable actions
- **Name-veto for CREATE.** New skill names MUST be class-level — no PR
numbers, error strings, codenames, library-alone names, or session
artifacts ('fix-X / debug-Y / audit-Z-today'). If the proposed name
only fits today's task, fall back to one of the patch/support-file
options.
- **Memory scope clarified.** 'who the user is and what the current
situation and state of your operations are' — MEMORY.md is
situational/state, USER.md is identity/preferences.
- **Curator handoff.** Reviewer flags overlap; the background curator
handles consolidation at scale. Single-session reviewer doesn't
attempt umbrella-rebalancing.
Tests: tests/run_agent/test_review_prompt_class_first.py upgraded to
assert the new behavioral contracts (active bias, user-correction
signals, loaded-skill-first, support-file kinds, name-veto, memory
framing, curator handoff). 17 tests, all pass.
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
Three modules independently implemented the same "preserve head+tail of
a secret, mask the middle" logic with slightly different behaviors that
had started to drift:
hermes_cli/config.py redact_key — 12-char floor, 4+4, DIM '(not set)'
hermes_cli/status.py redact_key — 12-char floor, 4+4, plain '(not set)' ← drift
hermes_cli/dump.py _redact — 12-char floor, 4+4, empty string
The visible bug: 'hermes status' displayed the '(not set)' placeholder
in plain text while 'hermes config' showed it in dim text. Same concept,
inconsistent UI.
Introduces mask_secret() in agent/redact.py as the canonical helper,
with head/tail/floor/placeholder/empty kwargs. The three call sites
become one-line wrappers that differ only in the 'empty' handling:
config.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM))
status.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM))
dump._redact → mask_secret(v) # empty → ''
agent.redact._mask_token (log redactor, different policy: 18-char floor,
6+4 visible, '***' on empty) also ports to mask_secret but retains its
own empty-case handling to preserve the historical '***' return.
Net: the three display-time redactors now agree on formatting, the
canonical helper lives in one place, and future tweaks (e.g. adding
bullet-point masking, changing the head/tail widths) happen once.
Verified:
- 3/3 tests/hermes_cli/test_web_server.py::TestRedactKey pass
- 89/89 agent/tests/test_redact.py + tests/tools/test_browser_secret_exfil.py
+ tests/hermes_cli/test_redact_config_bridge.py pass
- Live 'hermes status', 'hermes config', 'hermes dump' all render the
same way they did before (verified against actual env with real
keys: OpenRouter, Firecrawl, Browserbase, FAL, Tinker all show
'prefix...suffix'; Kimi shows '***' at <12 chars; unset shows
'(not set)' uniformly).
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
TUI session readiness was still laggy after the gateway-ready fixes. Profiling
session.create -> session.info showed the slow phase is background AIAgent
construction (~1.1s). A cProfile run of tui_gateway.server::_make_agent showed
model_tools/tool discovery importing tools.code_execution_tool, whose
module-level EXECUTE_CODE_SCHEMA calls _get_execution_mode(), which imported
cli.CLI_CONFIG.
That pulled the classic interactive CLI stack (prompt_toolkit/Rich and REPL
setup) into every agent startup path, including hermes --tui where it is not
used. Replace that with hermes_cli.config.read_raw_config(), which is cached and
reads only the raw code_execution section. Existing defaults still apply when
the key is absent.
Measurements on macOS Terminal.app:
- import run_agent: ~466ms -> ~347ms
- model_tools import: ~418ms -> ~272ms
- _make_agent: ~1452ms -> ~1239ms
- session.create -> session.info: ~1167ms -> ~999ms
- full hermes --tui ready p50: ~1655ms -> ~1537ms
Tests:
- scripts/run_tests.sh tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py
Match the buffered-stdin rearm cadence to IN_PASTE state so large pastes do not spin the normal escape timeout while waiting for readable data to drain.
Keep the latest prompt sticky while the viewport is in live assistant output beyond history, and clear stale sticky state at the real bottom using fresh scroll height.
detect_dangerous_command() and detect_hardline_command() were calling
re.search(pattern, text, re.IGNORECASE | re.DOTALL) inline — Python's
re._cache (512 patterns) amortizes compile cost on the warm path, but:
1. The first terminal() call per process pays the full compile fan-out
for all 59 patterns (12 HARDLINE + 47 DANGEROUS). Measured at
~2.6 ms per detect_dangerous_command() call after re.purge().
2. The re._cache is LRU — unrelated regex work elsewhere in the agent
(response parsing, text normalization, etc.) can evict our patterns
and silently re-compile them on the next terminal() call.
Precompiling at module load eliminates both costs:
detect_dangerous_command:
cold 2.613 ms → 0.298 ms (-88%)
warm 0.042 ms → 0.004 ms (-90%)
detect_hardline_command:
cold ~0.6 ms → 0.006 ms
warm 0.011 ms → 0.002 ms
Savings are per terminal() call. Agents with heavy terminal use see
compound savings; the bigger value is the stability guarantee (no
re._cache eviction can silently re-introduce the 2.6 ms cold cost
mid-session).
Implementation:
- HARDLINE_PATTERNS_COMPILED and DANGEROUS_PATTERNS_COMPILED built at
module load from the existing (pattern, description) tuples, using
shared _RE_FLAGS = re.IGNORECASE | re.DOTALL.
- detect_* functions now iterate the compiled list and call pattern_re.search(text).
- Original HARDLINE_PATTERNS and DANGEROUS_PATTERNS lists kept as-is
(other code in the file uses them for key derivation /
_PATTERN_KEY_ALIASES).
Verified:
- 160/161 tests/tools/test_approval*.py pass (1 pre-existing heartbeat
test flake on main).
- 349/349 tests/tools/ 'approval or terminal or dangerous' pass.
- Live hermes chat smoke: 3 benign terminal commands + 1 rm -rf /tmp/
(clarify prompt fired — approval path still works) + 1 sudo (sudo
password prompt fired — DANGEROUS pattern match still works). 23
log lines in the smoke window, zero errors.
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
Replace the removed built-in boot-md hook (#17093) with a how-to that
shows users how to wire up the same behavior themselves via the hooks
system. Uses _resolve_gateway_model() + _resolve_runtime_agent_kwargs()
so the example works against custom endpoints and OAuth providers,
not just the aggregator defaults that the old built-in silently assumed.
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
Validate configured providers against both Hermes runtime provider ids and
catalog-normalized provider ids. This keeps providers like ai-gateway from
being rejected after catalog resolution maps them to models.dev ids.
Keep credential checks and vendor-slug warnings anchored to the runtime id
so doctor reports actionable provider names in follow-up diagnostics.
Two amplifying optimizations to per-turn overhead in the gateway:
1. get_tool_definitions() memoization (model_tools.py)
Keyed on (frozenset(enabled), frozenset(disabled),
registry._generation, config.yaml mtime+size). Only active when
quiet_mode=True (which is every hot-path caller — gateway,
AIAgent.__init__); quiet_mode=False keeps the existing print side
effects. Cached path returns a shallow-copy list sharing read-only
schema dicts.
Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway
constructs fresh AIAgent per message, so this saves ~7 ms/turn before
any LLM work.
2. check_fn() TTL cache (tools/registry.py)
check_fn callables like check_terminal_requirements probe external
state (Docker daemon, Modal SDK, playwright binary). For a long-lived
process, hitting them on every get_definitions() pass was pure waste
— external state changes on human timescales. 30 s TTL so env-var
flips (hermes tools enable X) propagate within a turn or two without
explicit invalidation.
Measured: first call 7.5ms → 1.6ms (check_fn probes now dominate);
subsequent calls ~0.01ms via the upstream memoization.
Invalidation surface:
- registry._generation bumps on register/deregister/register_toolset_alias,
invalidating the memoized definitions automatically.
- config.yaml mtime in the cache key captures user-visible config edits
affecting dynamic schemas (execute_code mode, discord allowlist).
- invalidate_check_fn_cache() exposed for explicit flushes (e.g. after
hermes tools enable/disable).
- tests/conftest.py autouse fixture clears both caches before every test
so env-var monkeypatches don't see stale results.
Also fixes a regression from PR #17046 that I missed:
- tools/web_tools.py — Firecrawl was removed from module scope by the
lazy import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'.
Applied the same _FirecrawlProxy pattern used in auxiliary_client/
run_agent for OpenAI (module-level proxy that looks like the class
but imports the SDK on first call/isinstance; patch() replaces the
attribute as usual).
Verified:
- 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main)
- 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in
the full suite due to check_fn TTL cross-test pollution; fixed by
the autouse fixture)
- 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp
dynamic discovery, 5 mcp structured content — all confirmed on main)
- 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails)
- 868/868 tests/run_agent/ (excluding test_run_agent.py which has
pre-existing suite-level issues)
- Live smoke: 2 turns + /model switch + tool calls, zero errors in
agent.log session window.
Co-authored-by: teknium1 <teknium@users.noreply.github.com>
Two targeted fixes on the critical path from `hermes --tui` launch to
`gateway.ready`:
1. **Defer `@hermes/ink` import in memoryMonitor.ts.** The static top-level
import dragged the full ~414KB Ink bundle (React + renderer + all
components/hooks) onto the critical path *before* `gw.start()` could
spawn the Python gateway — serialising ~155ms of Node work in front of
it on every launch. `evictInkCaches` only runs inside the 10-second
tick under heap pressure, so it moves to a lazy dynamic import. First
tick hits the ESM cache because the app entry has long since imported
`@hermes/ink`.
2. **Gate `tools.mcp_tool` import on config in tui_gateway/entry.py.**
Importing the module transitively pulls the MCP SDK + pydantic + httpx
+ jsonschema + starlette formparsers (~200ms). The overwhelming
majority of users have no `mcp_servers` configured, so this runs for
nothing. A cheap `load_config()` check (~25ms) skips the 200ms import
when no servers are declared, with a conservative fallback to the old
behaviour if the config probe itself fails.
## Measurements (macOS Terminal.app, Apple Silicon, n=12)
| Metric | Before (p50) | After (p50) | Δ |
|----------------------------|--------------|-------------|----------|
| Python gateway boot alone | 252–365ms | 105–151ms | −180ms |
| `hermes --tui` banner paint | 686ms | 665ms | −21ms |
| `hermes --tui` → ready | **1843ms** | **1655ms** | **−188ms (−10.2%)** |
| `hermes --tui` → ready p90 | 1932ms | 1778ms | −154ms |
| stdev (ready) | 126ms | 83ms | also more consistent |
## Tests
- `scripts/run_tests.sh tests/tui_gateway/ tests/tools/test_mcp_tool.py`:
195 passed. (The one pre-existing failure in
`test_session_resume_returns_hydrated_messages` reproduces on main —
unrelated, it's a mock-DB kwarg mismatch.)
- `ui-tui` vitest: 430 tests, all pass.
- `npm run type-check` in ui-tui: clean.
## Notes
- Node-side first paint ("banner") didn't move meaningfully because that
latency is dominated by Ink's render pipeline + React mount, not by
which imports load first.
- The win shows up entirely in the time from banner to `gateway.ready`
— exactly where we expected it, since both fixes shorten the Python
gateway's boot path or let it overlap more with Node startup.
- No user-visible behaviour change. Memory monitoring still fires every
10s; MCP still works when `mcp_servers` is configured.
* fix(tui): honor documented mouse_tracking config key
The TUI runtime was reading display.tui_mouse while docs and user-facing
examples pointed users at display.mouse_tracking. That made persistent
mouse-disable config look like a no-op for users trying to restore native
terminal selection/copy behavior on Linux/SSH/tmux terminals.
Use display.mouse_tracking as the canonical key, keep display.tui_mouse as
a legacy fallback, and have /mouse write the documented key. Both gateway
config.get and client-side config sync now share the same precedence: the
canonical key wins, then the legacy key, then default on.
* review(copilot): align mouse tracking config coercion
- Load gateway config once before deriving display.mouse_tracking state.
- Use key-presence precedence on the TUI client too, so canonical
mouse_tracking wins over legacy tui_mouse even when the value is null.
- Treat numeric 0 as disabled on both gateway and client, matching the
existing string "0" handling.
- Widen ConfigDisplayConfig mouse fields because config.get full returns raw
YAML, not normalized booleans.
This PR groups the TUI fixes that restore macOS Terminal usability and clean up the theme/composer regressions:
- copy transcript selections on macOS drag-release so Terminal.app users can copy while mouse tracking is enabled
- copy composer selections on macOS drag-release; composer selection is internal to TextInput and does not use the global Ink selection bus
- keep IDE Cmd+C forwarding setup macOS-only, and make keybinding conflict checks respect simple when-clause overlap/negation
- force truecolor before chalk initializes (unless NO_COLOR / FORCE_COLOR / HERMES_TUI_TRUECOLOR opt-outs apply) so the default banner keeps its gold/amber/bronze gradient in Terminal.app
- move TUI surfaces onto semantic theme tokens and preserve skin prompt symbols as bare tokens with renderer-owned spacing
- render focused placeholders as dim hint text in TTY mode instead of inverse/selected-looking synthetic cursor text
Round 1 of #17174 hit `nix-lockfile-check` failure. Root cause was
NOT a stale hash — the primary `nix (ubuntu-latest)` and
`nix (macos-latest)` builds passed. GitHub's Magic Nix Cache returned
HTTP 418 (rate-limited / throttled) mid-run, so the rebuild bailed
with `some outputs of '/nix/store/...-npm-deps.drv' are not valid,
so checking is not possible` — no `got:` line for the script to
extract.
The script then incorrectly treated this as 'build failed with no
hash mismatch' and exited 1, breaking the lint on every PR whenever
the cache is throttled.
Now we recognize the throttling/cache-disabled signature and skip
that entry with a warning. A real stale hash still surfaces in the
primary `.#$ATTR` build (separate CI job), so we don't lose
coverage.
`web/package-lock.json` was updated by the design-system refactor
(merged via #17007 + follow-ups: spinner / select / badges / buttons)
without bumping `nix/web.nix::npmDeps.hash`, breaking nix builds on
every PR + main since 2026-04-28T18:46.
Hash sourced from the actual `Check flake` failure output:
specified: sha256-AahWmJ9gDQ9pMPa1FYwUjYdO2mOi6JM9Mst27E0vp68=
got: sha256-+B2+Fe4djPzHHcUXRx+m0cuyaopAhW0PcHsMgYfV5VE=
Standalone single-file fix so it can land fast and clear nix on
every other open PR.
* feat(tui): pluggable busy-indicator styles (kaomoji/emoji/unicode/ascii)
The status-bar `FaceTicker` rotated through wide-and-variable kaomoji
glyphs (`(。•́︿•̀。)`, `( ͡° ͜ʖ ͡°)`, …) every 2.5s. Real display widths range
from ~5 to ~16 columns, so the rest of the bar (cwd, ctx %, voice,
bg counter) shifted on every cycle. Padding the verb alone (#17116)
helped but didn't address the dominant jitter source — the glyph
itself.
Add four indicator styles, configurable + hot-swappable:
* `kaomoji` (default — preserves the existing vibe; verb is now
pad-stable so the only width churn left is the kaomoji itself).
* `emoji` — single 2-col emoji frame (`⚕ 🌀🤔✨🍵🔮`).
* `unicode` — `unicode-animations` braille spinner (1-col, smooth).
* `ascii` — `| / - \` (1-col, max compat).
Wires:
* `display.tui_status_indicator` in `DEFAULT_CONFIG` (default
`kaomoji`).
* New JSON-RPC `config.set/get indicator` keys, narrow allow-list.
* `applyDisplay` reads the field and patches `UiState.indicatorStyle`,
so the existing `mtime` poll picks up `~/.hermes/config.yaml` edits
within ~5s without a TUI restart.
* `/indicator [style]` slash command (alias `/indicator-style`,
subcommand completion `kaomoji|emoji|unicode|ascii`). Bare form
shows the current style; setter fires `config.set` and
optimistically `patchUiState({ indicatorStyle })` so the live TUI
swaps immediately, matching the `/skin` UX.
* `CommandDef("indicator", ..., subcommands=...)` so classic CLI
autocomplete + TUI `complete.slash` both surface it.
* `FaceTicker` decouples spinner cadence from verb cadence — the
glyph runs at the spinner's authored interval (or `FACE_TICK_MS`
for kaomoji), the verb stays on the original 2.5s cycle, and both
re-arm cleanly when style changes.
Tests:
* `normalizeIndicatorStyle` rejects unknown / non-string input.
* `applyDisplay → tui_status_indicator` covers fan-out + fallback.
* `/indicator <style>` hot-swaps `UiState.indicatorStyle` after a
successful `config.set`.
* `/indicator sparkle` rejects with the usage hint and never hits
the gateway.
* Slash-parity matrix gets `'/indicator'` → `config.get`.
Validation:
cd ui-tui && npm run type-check — clean; npm test --run — 398/398.
scripts/run_tests.sh tests/test_tui_gateway_server.py
tests/hermes_cli/test_commands.py — 220/220.
* chore(tui): drop /indicator-style alias to declutter autocomplete
* fix(tui): drop verb-width pad — /indicator handles glyph jitter directly
* fix(tui): unicode indicator style hides the verb (cleanest option)
* refactor(tui): single source of truth for INDICATOR_STYLES; cleaner error format
Round 1 Copilot review on PR #17150:
- Exported `INDICATOR_STYLES` const tuple from `interfaces.ts`;
`IndicatorStyle` union type is derived from it. `useConfigSync`
builds its validation Set from the tuple, and `session.ts` uses it
for both the usage hint and the runtime allow-list — adding/removing
a style now touches one line.
- Backend `config.set indicator` error message: switched
`sorted(allowed)` list repr to `pick one of ascii|emoji|kaomoji|unicode`
(matches the TUI usage hint), and reports the normalized `raw`
instead of the original `value`. Backend allowed tuple now has a
comment pointing back at `INDICATOR_STYLES` so the two stay aligned.
Note: kept the verb portion unpadded per design intent — fixed-width
padding was the exact UX the `/indicator` command was added to remove.
Stable width comes from the glyph; verbs cycling is part of the kawaii
aesthetic. Reply on the verb thread will explain.
* fix(tui): drop type collapse + gate verb timer + DEFAULT_INDICATOR_STYLE
Round 2 Copilot review on PR #17150:
- `tui_status_indicator?: 'ascii' | ... | string` collapses to `string`
in TS — consumers got no narrowing. Documented as plain `string` with
a comment about runtime validation via `normalizeIndicatorStyle`.
- `FaceTicker` always started a 2.5s verb interval, even for the
`unicode` style which hides the verb entirely. Now gated on
`showVerb` from `renderIndicator` — `unicode` stays calm.
Pre-emptive self-review (avoid round 3):
- Three call sites duplicated the literal `'kaomoji'` default
(uiStore, normalizeIndicatorStyle, slash command). Added
`DEFAULT_INDICATOR_STYLE` to interfaces.ts and threaded it through
so changing the default touches one line.
* fix(tui-gateway): normalize config.get indicator output to match TUI render
Round 4 Copilot review on PR #17150: `config.get` for `indicator`
returned the raw `display.tui_status_indicator` value without
validation, so a hand-edited config.yaml with stray casing or an
unknown style would leave `/indicator` printing one thing while
the TUI rendered the kaomoji default (frontend's
`normalizeIndicatorStyle` does this normalization on receive).
Lifted the allow-list to module scope as `_INDICATOR_STYLES` /
`_INDICATOR_DEFAULT`, reused by both `config.set` and `config.get`.
Comment notes the alignment with `INDICATOR_STYLES` /
`DEFAULT_INDICATOR_STYLE` in interfaces.ts so adding/removing a
style is a one-line change on each end.
Tests cover: known value verbatim, casing/whitespace normalize,
unknown→default, unset→default.
* fix(tui-gateway): preserve falsy-input diagnostics in config.set indicator error
Round 5 Copilot review on PR #17150: `raw = str(value or "").strip().lower()`
collapsed any falsy non-string (`0`, `False`, `[]`) to empty string,
so the error message read `unknown indicator: ` with nothing after —
losing the original input.
Switched to `("" if value is None else str(value)).strip().lower()`
so only `None` (the genuine 'no value' case) becomes blank. Used
`{raw!r}` in the error so the diagnostic is unambiguous (`'0'` vs `0`).
Tests:
- known-value happy path (`'EMOJI'` → `'emoji'`)
- falsy non-string inputs (`0` / `False` / `[]`) surface meaningfully
- `None` keeps the blank-repr error
* feat(tui): expand light-terminal auto-detection (HERMES_TUI_THEME, BG hex)
Modern terminals (Ghostty, Warp, iTerm2) don't set COLORFGBG, so the
auto-light path was effectively COLORFGBG-only and silently broken for
many users. Two pragmatic additions, both opt-in, plus a clearer
priority chain:
1. **`HERMES_TUI_THEME=light|dark`** as a symmetric explicit override.
The existing `HERMES_TUI_LIGHT` is fine but reads as boolean noise;
a named theme env var matches `display.skin` muscle memory.
2. **`HERMES_TUI_BACKGROUND` hex/rgb hint.** Lets advanced users
(or a future OSC11 query helper that caches the answer) state a
ground-truth background colour. Decoded to Rec. 709 luma; ≥ 0.6
counts as light.
Priority order is now fully ordered and explainable:
1. `HERMES_TUI_LIGHT` (1/0/true/false/on/off).
2. `HERMES_TUI_THEME=light|dark`.
3. `HERMES_TUI_BACKGROUND` luminance.
4. `COLORFGBG` last field — light slots 7/15 → light, 0–15 → dark
(authoritative when set, so the new TERM_PROGRAM path can never
stomp on a terminal that already volunteered a dark answer).
5. `TERM_PROGRAM` allow-list — empty by default. The slot is left
in place because folks asked for it but populating it risks
wrongly flipping users on Apple_Terminal / iTerm2 dark profiles
to light. Easy to add per terminal once we have signal.
Tests: 5 new cases in `theme.test.ts` covering theme env, background
hex (3- and 6-char), invalid hex falling through, and COLORFGBG taking
precedence over the future allow-list.
Validation: `npm run type-check` clean, `npm test --run` 392/392.
* review(copilot): tighten theme detection comments + drop unnecessary cast
* review(copilot): strict hex regex so partial garbage doesn't slip into luminance
* test(tui): make TERM_PROGRAM allow-list injectable so precedence is provable
Copilot review on PR #17113: `LIGHT_DEFAULT_TERM_PROGRAMS` is empty
in production, so the prior assertion would have passed even if
`detectLightMode` ignored `COLORFGBG` entirely. That defeats the
test's purpose.
`detectLightMode` now takes the allow-list as an optional second
argument (defaults to the production set). The test injects a set
containing `Apple_Terminal`, asserts the allow-list alone WOULD
return light, then asserts `COLORFGBG: '15;0'` overrides it — the
precedence rule is now exercised, not assumed.
* fix(tui): COLORFGBG empty-trailing-field falls through; isolate DEFAULT_THEME tests
Round 2 Copilot review on PR #17113:
1. `Number(colorfgbg.split(';').at(-1))` returns 0 for an empty trailing
field (e.g. `COLORFGBG='15;'` → bg===0), which would have looked
like an authoritative dark slot and incorrectly blocked the
TERM_PROGRAM allow-list. Added a `/^\d+$/` guard before coercion;
non-numeric trailing fields now fall through.
2. Fixed the misleading '0–6 / 8–15 ranges are dark' comment — the
block returns true for bg===15, so the range is actually 0–6 / 8–14.
3. `DEFAULT_THEME` is computed from `process.env` at module-load.
A developer shell with `HERMES_TUI_THEME=light` (or a bright
`HERMES_TUI_BACKGROUND`) would flip it and break local tests.
The DEFAULT_THEME describe blocks now sterilize the relevant env
vars + dynamically import theme.ts (vi.resetModules pattern from
platform.test.ts). fromSkin tests compare against DARK_THEME
directly to decouple them from ambient env.
* test(tui): isolate ALL env-coupled theme symbols, not just DEFAULT_THEME
Round 3 Copilot review on PR #17113: the static top-level imports of
`fromSkin`, `DARK_THEME`, `LIGHT_THEME` evaluated theme.ts before
`importThemeWithCleanEnv` had a chance to clean the env. Because
`fromSkin` closes over `DEFAULT_THEME`, an ambient `HERMES_TUI_THEME=light`
or bright `HERMES_TUI_BACKGROUND` would still flip the base palette
and cause local-only failures.
Removed the static import entirely. Every test now obtains its theme
symbols via `importThemeWithCleanEnv`, including `detectLightMode`
(for consistency, even though it takes env as a parameter).
`fromSkin` tests assert against the cleaned `DEFAULT_THEME` from the
same dynamic import — preserves the actual contract (skins extend the
ambient base palette) without coupling the test to dev-shell state.
Verified by running with HERMES_TUI_THEME=light + HERMES_TUI_BACKGROUND=#ffffff:
all 20 theme tests still pass.
Self-review (avoid round 4):
- Audited other test files importing DEFAULT_THEME (syntax.test.ts,
streamingMarkdown.test.ts, constants.test.ts) — all just pass it as
a parameter or assert palette property existence (works on both
light + dark), so no env coupling there.
* fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races
`tui_gateway` reports `tui_gateway_crash.log` traces where the main
thread sits in `sys.stdin` while a worker holds `_stdout_lock` mid-
flush, and SIGTERM then calls `sys.exit(0)` while the lock is still
held — the interpreter shutdown stalls behind the wedged write.
Two narrowly scoped hardenings:
**`tui_gateway/transport.py`**
* Move JSON serialisation outside the lock — long messages no longer
block sibling writers while we serialise.
* Treat `BrokenPipeError`, `ValueError` ("I/O on closed file") and
generic `OSError` from both `write` and `flush` as "peer is gone":
return `False` instead of bubbling, matching what `write_json`'s
callers in `entry.py` already expect.
* Split `flush` into its own try block so a stuck flush never strands
a partial write or holds the lock indefinitely on its way out.
* Optional `HERMES_TUI_GATEWAY_NO_FLUSH=1` env knob to skip explicit
`flush()` entirely on environments where a half-closed read pipe
produces an indefinite kernel-level block. Default unchanged.
**`tui_gateway/entry.py`**
* `_log_signal` now spawns a 1-second daemon timer that calls
`os._exit(0)` if the orderly `sys.exit(0)` path is itself stuck
behind a wedged worker. Atexit handlers run inside the grace
window when they can; the timer is the safety net so a deadlocked
flush no longer strands the gateway process.
Tests:
* `test_write_json_closed_stream_returns_false` — ValueError path.
* `test_write_json_oserror_on_flush_returns_false` — OSError on flush
must not strand the lock; the write portion still landed before the
flush failure.
* `test_write_json_no_flush_env_skips_flush` — env knob bypass.
Validation: `scripts/run_tests.sh tests/tui_gateway/test_protocol.py`
(42/42 pass; one pre-existing failure on
`test_session_resume_returns_hydrated_messages` is unrelated to this
change — same `include_ancestors` mock kwarg issue tracked elsewhere).
`scripts/run_tests.sh tests/test_tui_gateway_server.py` 90/90 pass.
* review(copilot): tighten transport hardening comments + test cleanup
* review(copilot): narrow exception capture, configurable grace, simpler no-flush test
* fix(tui-gateway): narrow ValueError to closed-stream; surface UnicodeEncodeError
Copilot review on PR #17118: `UnicodeEncodeError` is a ValueError
subclass, so a non-UTF-8 stdout (mismatched PYTHONIOENCODING / locale)
would have been silently swallowed as 'peer gone' under
`except ValueError`. That hides a real environment bug.
Now:
- UnicodeEncodeError → log with exc_info (warning) and drop the frame
- ValueError where str(e) contains 'closed file' → peer gone, return False
- Any other ValueError → log loudly, drop frame (defensive, but visible)
Same shape applied to flush. Adds two regression tests.
* fix(tui-gateway): reserve write() False for peer-gone; re-raise programming errors
Round 2 Copilot review on PR #17118: `Transport.write()` returning
`False` is documented as 'peer is gone', and `entry.py` reacts by
calling `sys.exit(0)`. But the implementation also returned False
for non-IO conditions (non-JSON-safe payloads, UnicodeEncodeError,
unrelated ValueErrors), so a programming error or local env bug would
present as a clean disconnect — exactly the diagnosis pain we wanted
to eliminate.
Now:
- `json.dumps` failure → re-raises (TypeError/ValueError surfaces in crash log)
- `BrokenPipeError` → False (peer gone)
- `ValueError('...closed file...')` → False (peer gone)
- `UnicodeEncodeError` and any other ValueError → re-raise
- `OSError` → False (existing IO-failure semantics, debug-logged)
Tests updated to assert the re-raise behaviour and added a
non-serializable-payload regression test.
* fix(tui-gateway): narrow OSError to peer-gone errnos; honest test naming
Round 3 Copilot review on PR #17118:
- Docstring claimed False = peer gone, but generic OSError on write/flush
also returned False — meaning ENOSPC/EACCES/EIO would silently exit.
Added `_PEER_GONE_ERRNOS = {EPIPE, ECONNRESET, EBADF, ESHUTDOWN, +WSA}`
and narrowed the OSError handlers; non-peer-gone errnos re-raise.
Docstring now lists OSError as peer-gone branch with the errno set.
- The `_DISABLE_FLUSH` test was named after the env var but actually
patched the module constant. Renamed it to reflect the contract being
tested (skips flush when constant is true) AND added a real
end-to-end test that sets the env var, reloads transport.py, and
asserts the constant flips. Cleanup reload restores defaults so
parallel tests stay isolated.
Self-review (avoid round 4):
- Verified TeeTransport's secondary-swallow stays intentional.
- _log_signal grace path already covered by separate tests.
* fix(tui): honor display.busy_input_mode in TUI v2
The TUI v2 frontend hard-coded `composerActions.enqueue(full)` whenever
`ui.busy` was true. The classic CLI and gateway adapters honor the
`display.busy_input_mode` config key (`interrupt` | `queue` | `steer`),
but Ink ignored it — sending a message during a long-running turn always
landed in the queue regardless of config. The config default is already
`interrupt` (hermes_cli/config.py), so users who explicitly opted into
that experience were silently stuck on the legacy queue path.
This wires the value through the existing config-sync surface:
* `applyDisplay` now reads `display.busy_input_mode`, defaults to
`interrupt` (matching `_load_busy_input_mode` in tui_gateway), and
drops it into a new `UiState.busyInputMode` field.
* `dispatchSubmission` and the queue-edit fall-through call a shared
`handleBusyInput` helper that branches on the mode:
* `queue` — legacy behavior, append to the queue.
* `steer` — call `session.steer`; on rejection, fall back to
queue with a sys note.
* `interrupt` — `turnController.interruptTurn(...)` then `send()`,
so the new prompt actually moves.
* Mtime polling in `useConfigSync` already re-applies `config.full`, so
flipping `display.busy_input_mode` in `~/.hermes/config.yaml` takes
effect on the next 5s tick without restarting the TUI.
Tests:
* `applyDisplay → busy_input_mode` covers normalization + UiState fan-out.
* `normalizeBusyInputMode` mirrors the Python side's allow-list.
Validation:
* `npm run type-check` (in `ui-tui/`) — clean.
* `npm test --run` (in `ui-tui/`) — 394/394.
* review(copilot): narrow busy_input_mode type, preserve queue order on steer fallback
* review(copilot): clarify handleBusyInput comment (option, not return value)
* fix(tui): default busy_input_mode to queue in TUI (CLI keeps interrupt)
In a full-screen TUI users typically author the next prompt while the
agent is still streaming, so an unintended interrupt loses in-flight
typing. TUI fallback now defaults to `queue`; CLI / messaging
adapters keep `interrupt` as the framework default.
Override per-config via `display.busy_input_mode: interrupt` (or
`steer`) — the normalize/wire path is unchanged, only the missing-
value branch differs from the Python default.
uiStore initial value also flipped to `queue` so first-frame render
before `config.full` lands matches the eventual normalized value.