hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00

Author	SHA1	Message	Date
Teknium	f0dc919f92	fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217	2026-04-30 23:03:54 -07:00
hharry11	24130b7e53	fix(approval): harden YOLO mode env parsing against quoted-bool strings	2026-04-30 20:37:37 -07:00
sprmn24	5ed27c0f74	fix(tui_gateway): guard env var parsing against invalid values at import _SLASH_WORKER_TIMEOUT_S and _pool used raw float()/int() on env vars at module level. A non-numeric value (e.g. HERMES_TUI_SLASH_TIMEOUT_S=abc) raises ValueError during import, preventing TUI gateway from starting with no useful error message. Wrap both parses in try/except with safe fallbacks: - HERMES_TUI_SLASH_TIMEOUT_S: fallback to 45.0s - HERMES_TUI_RPC_POOL_WORKERS: fallback to 4 workers	2026-04-30 20:26:23 -07:00
hharry11	ca9a61ae38	fix(plugins): await async handlers in CLI and TUI dispatch	2026-04-30 19:56:18 -07:00
Yukipukii1	2110a3a0c4	fix(tui): return JSON-RPC errors for invalid request shapes	2026-04-30 19:47:00 -07:00
Teknium	bbbce92651	feat(tui): render self-improvement review summaries in the transcript The Ink TUI (\`hermes --tui\` + dashboard \`/chat\`) had no wiring for the background self-improvement review. When the review fired and patched a skill or saved a memory entry, the change landed but the user had no visual indication it happened — only the CLI had a print surface for the '💾 Self-improvement review: …' line. Changes: - tui_gateway/server.py: in _init_session, attach agent.background_review_callback to an _emit('review.summary', sid, {text}) closure. Wrapped in try/except so agents with locked attribute slots don't break session startup. - ui-tui/src/app/createGatewayEventHandler.ts: handle 'review.summary' by routing ev.payload.text through sys(…), matching the existing 'background.complete' pattern. Empty / whitespace payloads are ignored so the transcript never gets a blank system line. - ui-tui/src/gatewayTypes.ts: extend the GatewayEvent discriminated union with { type: 'review.summary', payload?: { text?: string } }. Gateway platforms (Telegram, Discord, Slack, …) already route the review summary via background_review_callback → post-delivery queue in gateway/run.py, so they pick up the new 'Self-improvement review:' prefix from the companion run_agent change with no platform edits. Tests: - tests/tui_gateway/test_review_summary_callback.py (Python, 2 tests): _init_session attaches a callback that emits the right event; the callback path survives agents that can't accept the attribute. - ui-tui/src/__tests__/createGatewayEventHandler.test.ts (vitest, 2 new cases): review.summary events feed sys(...) with the full text; empty / missing payloads are no-ops. - TypeScript type-check passes. - tui_gateway suite: 64/64 pass.	2026-04-30 14:07:22 -07:00
Brooklyn Nicholson	b9d9fa7df8	fix(tui): respect max turns config Co-authored-by: YuShu <24110240104@m.fudan.edu.cn>	2026-04-30 12:26:57 -05:00
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00
Teknium	21e695fcb6	fix: clean up defensive shims and finish CI stabilization from #17660 (#17801 ) PR #17660 landed a sweep of CI fixes but left three loose ends: 1. tests/cli/test_cli_loading_indicator.py::test_reload_mcp_sets_busy_state_ and_prints_status — /reload-mcp gained a prompt-cache-invalidation confirmation (commit `4d7fc0f37`) that was never wired into this test. The test exercises the loading-indicator path, so pre-approve via config and go straight into _reload_mcp(). 2. tools/mcp_tool.py _make_tool_handler — the added getattr(server, '_rpc_lock', None) + 'skip the lock if missing' branch is inconsistent with four sibling call sites that still direct-access server._rpc_lock. The lock is guaranteed by MCPServerTask.__init__; falling through to an unlocked session.call_tool would silently serialize-strip RPCs if the guard ever triggered. Restore direct access. 3. tui_gateway/server.py _messages_as_conversation — the helper existed only to catch 'TypeError: include_ancestors unexpected' from mocked SessionDBs that don't actually exist. The real SessionDB.get_messages_as_conversation has accepted include_ancestors since introduction, and every test FakeDB in the repo already declares the kwarg. Remove the shim, inline the two call sites.	2026-04-29 23:53:17 -07:00
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660 ) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
Brooklyn Nicholson	8dcab19d02	fix(gateway): fail closed when session.delete can't enumerate active sessions If a concurrent RPC mutates _sessions while session.delete is iterating it (e.g. a parallel session.create on the thread pool), the bare except swallowed the RuntimeError and let the delete proceed against a row that may still be live. Snapshot via list(_sessions.values()) and return an error when even that raises, instead of treating "couldn't check" as "no active sessions."	2026-04-29 20:21:16 -07:00
Brooklyn Nicholson	24b5279f43	feat(tui): delete sessions from /resume picker with `d` Pressing `d` on the highlighted row in the resume picker prompts `delete? y/n`; `y` deletes the session (DB row + on-disk transcript files), anything else cancels. The active session is excluded from deletion server-side. Adds a new `session.delete` JSON-RPC handler that wraps `SessionDB.delete_session`, forwarding the per-profile `sessions/` directory so transcripts get cleaned up alongside the row.	2026-04-29 20:21:16 -07:00
brooklyn!	fc7f55f490	fix(tui): responsive /compress with live progress + CLI-parity feedback (#17661 ) * fix(tui): offload manual compaction RPC Route TUI session compression through the existing long-handler pool so slow compaction does not block other gateway RPCs. * fix(tui): show compaction progress immediately Print a local status line before the compress RPC starts so slow manual compaction does not look like a no-op. * feat(tui): rich /compress feedback parity with CLI Show pre-compaction message count and rough token estimate immediately, emit a status update so the bottom bar reflects ongoing compaction, and report a multi-line summary (headline + token delta + optional note) using the shared summarize_manual_compression helper. * fix(tui): show live compaction estimate in transcript Mirror compression progress status into the transcript so users see the backend message count and token estimate while /compress is still running. * fix(tui): single live compaction line with spinner glyph Drop the redundant local "compressing context..." placeholder and prefix the live backend status line with a braille spinner glyph so /compress reads as a single in-progress row. * fix(tui): address review nits on /compress feedback Reuse the precomputed token estimate inside _compress_session_history so the gateway does not redo the O(n) work while holding history_lock, keep the status bar pinned during long manual compactions instead of auto-restoring after 4s, and drop the redundant noop bullet that doubled with the system role glyph. * fix(tui): release history_lock during compaction LLM call Move the snapshot/commit pattern into _compress_session_history so the lock is held only across the in-memory bookkeeping, not during agent._compress_context. Also emit a final neutral status update from session.compress so the pinned compressing indicator clears even on errors. * fix(tui): rebuild prompt cleanly + sync session_key after compress Pass system_message=None so AIAgent._compress_context rebuilds the system prompt without nesting the cached identity block. Reuse the handler's pre-snapshotted history inside _compress_session_history to avoid a second O(n) copy under the lock. After compaction, when AIAgent._compress_context rotates session_id, sync the gateway session_key, migrate approval notify + yolo state, restart the slash worker, and clear the stale pending title. Mirrors HermesCLI._manual_compress. * Avoid /compress lock re-entry in slash side effects. Stop pre-locking history before _compress_session_history in slash command mirroring, keep session-key sync parity with manual compression, and add a regression test that asserts /compress is invoked without holding history_lock.	2026-04-29 18:01:18 -07:00
brooklyn!	5e6e8b6af3	fix(tui): honor launch toolsets (#17623 ) * fix(tui): honor launch toolsets Carry chat --toolsets through the TUI launcher so TUI sessions use the same per-session tool scope as the classic CLI. * fix(tui): parse top-level toolsets flag Allow top-level hermes --tui --toolsets to reach the implicit chat session, matching chat subcommand behavior. * fix(tui): validate launch toolsets Filter invalid HERMES_TUI_TOOLSETS entries and fall back to configured CLI toolsets when the override contains no valid toolsets. * fix(tui): avoid config load for builtin toolsets Honor built-in HERMES_TUI_TOOLSETS values before loading config and treat all/* as the all-toolsets sentinel. * fix(cli): honor toolsets in oneshot mode Forward top-level --toolsets into oneshot agent construction so the flag is not silently ignored outside the TUI path. * fix(cli): validate oneshot toolsets Reject invalid-only oneshot toolset overrides before output redirection and clarify TUI fallback warnings. * fix(cli): preserve all-toolsets sentinel Map explicit all/* oneshot toolset overrides to the all-toolsets sentinel and replace locals() checks in TUI toolset loading. * fix(cli): warn on extra all-toolset entries Warn when all/* toolset overrides include additional ignored entries so typos are still visible. * fix(tui): honor plugin toolset overrides Discover plugin toolsets before rejecting unresolved explicit toolset overrides and read raw config for MCP name validation. * fix(tui): reuse toolset argument normalizer Share top-level TUI toolset argument parsing with the oneshot path to avoid duplicate normalization logic. * fix(cli): reject disabled mcp toolsets Validate explicit toolset overrides against enabled MCP servers only and clarify top-level toolset flag help. * fix(cli): distinguish disabled mcp from unknown toolsets Report disabled MCP servers separately from unknown toolset entries and stub plugin discovery in invalid-name tests for determinism.	2026-04-29 16:55:27 -07:00
brooklyn!	d9bf093728	Merge pull request #17638 from NousResearch/bb/tui-details-persist fix(tui): persist global details mode sections	2026-04-29 15:15:37 -07:00
Brooklyn Nicholson	faa467ccaf	fix(tui): share detail section constants Reuse one gateway detail-section list for global and per-section detail mode config handling.	2026-04-29 17:05:51 -05:00
Brooklyn Nicholson	c2cb6d1071	fix(tui): persist global details mode sections Pin all detail sections when /details sets a global mode so config sync does not restore built-in section defaults.	2026-04-29 16:46:42 -05:00
Brooklyn Nicholson	7d96a5ab6e	fix(tui): refine reasoning visibility updates Save reasoning display changes atomically and keep trail segments visible when Activity can render them.	2026-04-29 16:03:45 -05:00
Brooklyn Nicholson	d8afafd22b	fix(tui): hide reasoning panels immediately Make /reasoning hide update the thinking section visibility so existing and live reasoning blocks disappear without waiting for config sync.	2026-04-29 15:23:14 -05:00
brooklyn!	5e68503d2f	Merge pull request #17190 from NousResearch/bb/tui-cold-start-profiling perf(tui): cut visible cold start ~57% with lazy agent init	2026-04-28 22:45:14 -07:00
Brooklyn Nicholson	d341af22c0	fix(tui): preserve busy and init error signaling Finish the Copilot review cleanup for lazy prompt submission: - prompt.submit now claims session.running before returning success, preserving the existing RPC-level session busy error so the frontend can queue. - agent-init timeout/failure now emits a normal error event instead of writing a second JSON-RPC response for an already-settled request id. Tests: - python -m py_compile tui_gateway/server.py tui_gateway/entry.py - cd ui-tui && npm run type-check && npm run build - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts	2026-04-29 00:25:09 -05:00
Brooklyn Nicholson	cc5efb6fc1	fix(tui): keep non-agent session RPCs lazy Respond to Copilot's lazy-start review: session metadata/history/usage do not need a constructed AIAgent, so keep them on the no-wait session path. This preserves the deferred startup model and avoids blocking simple session RPCs on agent initialization. Tests: - python -m py_compile tui_gateway/server.py tui_gateway/entry.py - cd ui-tui && npm run type-check && npm run build - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts	2026-04-29 00:22:38 -05:00
Brooklyn Nicholson	97a2474b39	review(copilot): point reload.env docstring at hermes_cli.config.reload_env	2026-04-28 22:22:30 -07:00
Brooklyn Nicholson	4858e26eaa	feat(tui): port classic CLI /reload (.env hot-reload) to TUI Classic CLI exposes ``/reload`` (re-reads ~/.hermes/.env into ``os.environ`` via ``hermes_cli.config.reload_env``) so newly added API keys take effect without restarting the session. The TUI was missing the parity command, so users had to Ctrl+C out and ``hermes --tui`` again whenever they added or rotated a credential. Three small wires: * New ``reload.env`` JSON-RPC method in ``tui_gateway/server.py`` that delegates to ``hermes_cli.config.reload_env`` and returns the count of vars updated. * New ``/reload`` slash command in ``ui-tui/src/app/slash/commands/ops.ts`` matching the existing ``/reload-mcp`` pattern (native RPC, no slash worker). * Drop ``cli_only=True`` from the ``reload`` ``CommandDef`` in ``hermes_cli/commands.py`` so help/menus surface it in the TUI too. ``reload_env`` itself is environment-agnostic. Same caveat as classic CLI: the currently constructed agent's credential pool / provider routing does not auto-rebuild. Users who want a brand-new credential resolution should follow with ``/new``. Tests: * New ``test_reload_env_rpc_calls_hermes_cli_reload_env`` confirms RPC delegates and reports the count. * New ``test_reload_env_rpc_surfaces_errors`` confirms exceptions are rendered as JSON-RPC errors. * ``createSlashHandler.test.ts`` slash-parity matrix extended with ``['/reload', 'reload.env', {}]`` so we can't regress the routing. Validation: scripts/run_tests.sh tests/test_tui_gateway_server.py — 92/92. scripts/run_tests.sh tests/hermes_cli/test_commands.py — 128/128. cd ui-tui && npm run type-check — clean; npm test --run — 390/390.	2026-04-28 22:22:30 -07:00
Brooklyn Nicholson	f95c34f415	fix(browser): address Copilot round-4 on /browser connect * Reject unsupported schemes (anything outside http/https/ws/wss) in cli.py /browser connect before probing or persisting, matching the gateway's existing 4015 path. * Defend gateway browser.manage against `{"url": null}` and non-string urls: empty/null falls back to DEFAULT_BROWSER_CDP_URL, non-string returns a 4015 instead of slipping into the generic 5031 catch via TypeError on `"://" in url`. * Add regression tests for both null-url fallback and non-string rejection.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	679a27498d	fix(browser): address Copilot round-3 on /browser connect * Gate `browser.progress` emit on truthy `session_id`. The TUI prints `messages` from the response when there's no session, so emitting events too would double-render. Now: with a session → events stream live; without one → bundled messages only. * Resolve `system = platform.system()` once in `_browser_connect` and thread it through `try_launch_chrome_debug` and `_failure_messages` → `manual_chrome_debug_command`, so the generated hint is consistent (and tests are deterministic) on any host. * Add `test_browser_manage_connect_no_session_skips_progress_events` to lock in the gating behavior.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	d1ee4915f3	fix(browser): address Copilot review on /browser connect Fixes from Copilot's two passes on PR #17238: * Validate parsed URL once: reject missing host, invalid port, and unsupported scheme up front so malformed inputs (e.g. http://:9222 or http://localhost:abc) don't fall through to a generic 5031. * Tighten _is_default_local_cdp to require a discovery-style path so ws://127.0.0.1:9222/devtools/browser/<id> is not collapsed to bare http://127.0.0.1:9222 (which would lose the path and break the connect). * Move browser.manage into _LONG_HANDLERS so the up-to-10s launch-and-retry loop runs on the RPC pool instead of blocking the main dispatcher. * try_launch_chrome_debug uses Windows-appropriate detach kwargs (creationflags=DETACHED_PROCESS\|CREATE_NEW_PROCESS_GROUP) instead of POSIX-only start_new_session=True. * manual_chrome_debug_command uses subprocess.list2cmdline on Windows so the printed instruction is cmd.exe-compatible. * Mirror host/port validation in cli.py /browser connect so the classic CLI never persists an invalid BROWSER_CDP_URL.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	26816d1f77	refactor(tui): tighten /browser connect plumbing Split browser.manage into a small dispatcher with named connect/disconnect helpers, fold _http_ok / _probe_urls / _normalize_cdp_url out of the nested probe loop, collapse the failure-message scaffolding, and DRY the chrome candidate path tables. Behaviour and event shape unchanged.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	e750829015	fix(tui): stream /browser connect progress as gateway events Emit browser.progress JSON-RPC notifications during the connect work and render them in the TUI as system transcript lines, so users see the same step-by-step status the base CLI prints instead of nothing for ~1m followed by a final result.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	7d39a45749	fix(tui): show /browser connect progress like CLI Return CLI-style browser connect status messages from the gateway and render them in the TUI so local Chrome launch attempts are visible instead of ending in a silent delayed failure.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	69ff114ee2	fix(browser): avoid bogus Chrome launch fallback Detect an actual Chrome/Chromium executable before printing a manual CDP launch command, including common WSL-mounted Windows browser paths, so /browser connect does not suggest google-chrome when it is unavailable.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	f10a3df632	fix(tui): align /browser connect local CDP handling Share Chrome CDP launch helpers between the classic CLI and TUI so default /browser connect uses loopback consistently, retries local Chrome launch, and reports a copyable manual-start command instead of claiming a dead connection.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	88a9efdb1a	fix(tui): tighten cold-start edge cases after review Clean up the remaining review nits: - let the deferred @hermes/ink import retry after a transient failure instead of memoizing a rejected promise forever - keep memory-monitor in-flight state inside a finally so future exceptions cannot suppress that memory level indefinitely - use read_raw_config for the TUI MCP cold-start probe instead of full load_config() - keep input.detect_drop for explicit relative path prefixes (./ and ../) while preserving the no-RPC fast path for ordinary plain prompts Tests: - python -m py_compile tui_gateway/server.py tui_gateway/entry.py - cd ui-tui && npm run type-check && npm run build - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts	2026-04-29 00:08:34 -05:00
Brooklyn Nicholson	72a3af63d4	fix(tui): keep prompt submit off the RPC pool A cleanup review found that adding prompt.submit to _LONG_HANDLERS made the RPC pool own the full first-turn wait even though the handler itself already spawns a turn thread. Keep prompt.submit inline and make it return immediately: - look up the session without waiting - kick the lazy agent build - spawn a short waiter thread that blocks on agent_ready, then starts the existing turn dispatcher This keeps stdin dispatch responsive, avoids occupying a bounded pool worker for a normal chat turn, and preserves the lazy-start hydration behavior. Tests: - python -m py_compile tui_gateway/server.py - cd ui-tui && npm run type-check && npm run build - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts	2026-04-29 00:04:12 -05:00
Brooklyn Nicholson	a2819e1820	fix(tui): address lazy startup review races Copilot correctly flagged two concurrency windows: - memoryMonitor could re-enter while awaiting the lazy @hermes/ink import or heap dump, producing duplicate imports/dumps under sustained pressure. - _start_agent_build used a check-then-set guard without synchronization, so concurrent agent-backed RPCs could start duplicate agent builders. Fix both with single-flight guards: cache the dynamic import promise and track per-level dump in-flight state in memoryMonitor, and protect the TUI agent build flag with a per-session lock. Tests: - python -m py_compile tui_gateway/server.py - cd ui-tui && npm run type-check && npm run build - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py	2026-04-28 23:54:33 -05:00
Brooklyn Nicholson	0a6ecea676	fix(tui): hydrate lazy startup panel and use animated loaders The lazy startup panel could remain stuck on the placeholder when no first prompt was submitted because agent construction only started from _sess(). Keep session.create cheap, but schedule _start_agent_build shortly after returning the placeholder so tools/skills hydrate automatically. Also replace the ugly placeholder bar rows with compact unicode-animations braille loaders for the tools and skills sections. Tests: - python -m py_compile tui_gateway/server.py - cd ui-tui && npm run type-check && npm run build - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py	2026-04-28 23:48:07 -05:00
Brooklyn Nicholson	b66cbb7b4c	perf(tui): defer agent construction until first prompt Match classic CLI perceived startup behavior: show the TUI shell and composer before constructing the full AIAgent. session.create now returns a lightweight placeholder session with lazy=true and no longer starts _make_agent eagerly. The first method that needs the agent triggers _start_agent_build() via _sess(); prompt.submit is routed through the RPC worker pool so that the initial wait for agent construction does not block the stdio dispatcher. The intro panel renders skeleton rows for tools/skills while the real session.info payload is absent, then hydrates to the real tools/skills panel once AIAgent initialization completes. Also skip the startup /voice status probe and avoid the input.detect_drop RPC for ordinary plain-text prompts to keep early startup/first-submit paths cheap. Measurements on macOS Terminal.app: - Previous full ready p50 after earlier PR commits: ~1537ms - Lazy skeleton panel p50: ~794ms - Original baseline full ready p50: ~1843ms So the visible startup surface is now ~743ms faster than the prior PR state and ~1.05s faster than the original baseline. First prompt still pays the same agent construction cost if it races the background/skeleton state, matching classic CLI's deferred behavior. Tests: - python -m py_compile tui_gateway/server.py - cd ui-tui && npm run type-check && npm run build - scripts/run_tests.sh tests/tui_gateway/test_protocol.py::test_sess_found tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py - cd ui-tui && npm test -- --run src/__tests__/useSessionLifecycle.test.ts src/__tests__/useConfigSync.test.ts	2026-04-28 23:32:02 -05:00
Brooklyn Nicholson	0399d4b976	perf(tui): shave ~190ms off `hermes --tui` cold start Two targeted fixes on the critical path from `hermes --tui` launch to `gateway.ready`: 1. Defer `@hermes/ink` import in memoryMonitor.ts. The static top-level import dragged the full ~414KB Ink bundle (React + renderer + all components/hooks) onto the critical path before `gw.start()` could spawn the Python gateway — serialising ~155ms of Node work in front of it on every launch. `evictInkCaches` only runs inside the 10-second tick under heap pressure, so it moves to a lazy dynamic import. First tick hits the ESM cache because the app entry has long since imported `@hermes/ink`. 2. Gate `tools.mcp_tool` import on config in tui_gateway/entry.py. Importing the module transitively pulls the MCP SDK + pydantic + httpx + jsonschema + starlette formparsers (~200ms). The overwhelming majority of users have no `mcp_servers` configured, so this runs for nothing. A cheap `load_config()` check (~25ms) skips the 200ms import when no servers are declared, with a conservative fallback to the old behaviour if the config probe itself fails. ## Measurements (macOS Terminal.app, Apple Silicon, n=12) \| Metric \| Before (p50) \| After (p50) \| Δ \| \|----------------------------\|--------------\|-------------\|----------\| \| Python gateway boot alone \| 252–365ms \| 105–151ms \| −180ms \| \| `hermes --tui` banner paint \| 686ms \| 665ms \| −21ms \| \| `hermes --tui` → ready \| 1843ms \| 1655ms \| −188ms (−10.2%) \| \| `hermes --tui` → ready p90 \| 1932ms \| 1778ms \| −154ms \| \| stdev (ready) \| 126ms \| 83ms \| also more consistent \| ## Tests - `scripts/run_tests.sh tests/tui_gateway/ tests/tools/test_mcp_tool.py`: 195 passed. (The one pre-existing failure in `test_session_resume_returns_hydrated_messages` reproduces on main — unrelated, it's a mock-DB kwarg mismatch.) - `ui-tui` vitest: 430 tests, all pass. - `npm run type-check` in ui-tui: clean. ## Notes - Node-side first paint ("banner") didn't move meaningfully because that latency is dominated by Ink's render pipeline + React mount, not by which imports load first. - The win shows up entirely in the time from banner to `gateway.ready` — exactly where we expected it, since both fixes shorten the Python gateway's boot path or let it overlap more with Node startup. - No user-visible behaviour change. Memory monitoring still fires every 10s; MCP still works when `mcp_servers` is configured.	2026-04-28 19:42:31 -05:00
brooklyn!	188eaa57c4	fix(tui): honor documented mouse_tracking config key (#17188 ) * fix(tui): honor documented mouse_tracking config key The TUI runtime was reading display.tui_mouse while docs and user-facing examples pointed users at display.mouse_tracking. That made persistent mouse-disable config look like a no-op for users trying to restore native terminal selection/copy behavior on Linux/SSH/tmux terminals. Use display.mouse_tracking as the canonical key, keep display.tui_mouse as a legacy fallback, and have /mouse write the documented key. Both gateway config.get and client-side config sync now share the same precedence: the canonical key wins, then the legacy key, then default on. * review(copilot): align mouse tracking config coercion - Load gateway config once before deriving display.mouse_tracking state. - Use key-presence precedence on the TUI client too, so canonical mouse_tracking wins over legacy tui_mouse even when the value is null. - Treat numeric 0 as disabled on both gateway and client, matching the existing string "0" handling. - Widen ConfigDisplayConfig mouse fields because config.get full returns raw YAML, not normalized booleans.	2026-04-28 17:39:07 -07:00
brooklyn!	7d81d76366	feat(tui): pluggable busy-indicator styles (#13610 ) (#17150 ) * feat(tui): pluggable busy-indicator styles (kaomoji/emoji/unicode/ascii) The status-bar `FaceTicker` rotated through wide-and-variable kaomoji glyphs (`(｡•́︿•̀｡)`, `( ͡° ͜ʖ ͡°)`, …) every 2.5s. Real display widths range from ~5 to ~16 columns, so the rest of the bar (cwd, ctx %, voice, bg counter) shifted on every cycle. Padding the verb alone (#17116) helped but didn't address the dominant jitter source — the glyph itself. Add four indicator styles, configurable + hot-swappable: * `kaomoji` (default — preserves the existing vibe; verb is now pad-stable so the only width churn left is the kaomoji itself). * `emoji` — single 2-col emoji frame (`⚕ 🌀 🤔 ✨ 🍵 🔮`). * `unicode` — `unicode-animations` braille spinner (1-col, smooth). * `ascii` — `\| / - \` (1-col, max compat). Wires: * `display.tui_status_indicator` in `DEFAULT_CONFIG` (default `kaomoji`). * New JSON-RPC `config.set/get indicator` keys, narrow allow-list. * `applyDisplay` reads the field and patches `UiState.indicatorStyle`, so the existing `mtime` poll picks up `~/.hermes/config.yaml` edits within ~5s without a TUI restart. * `/indicator [style]` slash command (alias `/indicator-style`, subcommand completion `kaomoji\|emoji\|unicode\|ascii`). Bare form shows the current style; setter fires `config.set` and optimistically `patchUiState({ indicatorStyle })` so the live TUI swaps immediately, matching the `/skin` UX. * `CommandDef("indicator", ..., subcommands=...)` so classic CLI autocomplete + TUI `complete.slash` both surface it. * `FaceTicker` decouples spinner cadence from verb cadence — the glyph runs at the spinner's authored interval (or `FACE_TICK_MS` for kaomoji), the verb stays on the original 2.5s cycle, and both re-arm cleanly when style changes. Tests: * `normalizeIndicatorStyle` rejects unknown / non-string input. * `applyDisplay → tui_status_indicator` covers fan-out + fallback. * `/indicator <style>` hot-swaps `UiState.indicatorStyle` after a successful `config.set`. * `/indicator sparkle` rejects with the usage hint and never hits the gateway. * Slash-parity matrix gets `'/indicator'` → `config.get`. Validation: cd ui-tui && npm run type-check — clean; npm test --run — 398/398. scripts/run_tests.sh tests/test_tui_gateway_server.py tests/hermes_cli/test_commands.py — 220/220. * chore(tui): drop /indicator-style alias to declutter autocomplete * fix(tui): drop verb-width pad — /indicator handles glyph jitter directly * fix(tui): unicode indicator style hides the verb (cleanest option) * refactor(tui): single source of truth for INDICATOR_STYLES; cleaner error format Round 1 Copilot review on PR #17150: - Exported `INDICATOR_STYLES` const tuple from `interfaces.ts`; `IndicatorStyle` union type is derived from it. `useConfigSync` builds its validation Set from the tuple, and `session.ts` uses it for both the usage hint and the runtime allow-list — adding/removing a style now touches one line. - Backend `config.set indicator` error message: switched `sorted(allowed)` list repr to `pick one of ascii\|emoji\|kaomoji\|unicode` (matches the TUI usage hint), and reports the normalized `raw` instead of the original `value`. Backend allowed tuple now has a comment pointing back at `INDICATOR_STYLES` so the two stay aligned. Note: kept the verb portion unpadded per design intent — fixed-width padding was the exact UX the `/indicator` command was added to remove. Stable width comes from the glyph; verbs cycling is part of the kawaii aesthetic. Reply on the verb thread will explain. * fix(tui): drop type collapse + gate verb timer + DEFAULT_INDICATOR_STYLE Round 2 Copilot review on PR #17150: - `tui_status_indicator?: 'ascii' \| ... \| string` collapses to `string` in TS — consumers got no narrowing. Documented as plain `string` with a comment about runtime validation via `normalizeIndicatorStyle`. - `FaceTicker` always started a 2.5s verb interval, even for the `unicode` style which hides the verb entirely. Now gated on `showVerb` from `renderIndicator` — `unicode` stays calm. Pre-emptive self-review (avoid round 3): - Three call sites duplicated the literal `'kaomoji'` default (uiStore, normalizeIndicatorStyle, slash command). Added `DEFAULT_INDICATOR_STYLE` to interfaces.ts and threaded it through so changing the default touches one line. * fix(tui-gateway): normalize config.get indicator output to match TUI render Round 4 Copilot review on PR #17150: `config.get` for `indicator` returned the raw `display.tui_status_indicator` value without validation, so a hand-edited config.yaml with stray casing or an unknown style would leave `/indicator` printing one thing while the TUI rendered the kaomoji default (frontend's `normalizeIndicatorStyle` does this normalization on receive). Lifted the allow-list to module scope as `_INDICATOR_STYLES` / `_INDICATOR_DEFAULT`, reused by both `config.set` and `config.get`. Comment notes the alignment with `INDICATOR_STYLES` / `DEFAULT_INDICATOR_STYLE` in interfaces.ts so adding/removing a style is a one-line change on each end. Tests cover: known value verbatim, casing/whitespace normalize, unknown→default, unset→default. * fix(tui-gateway): preserve falsy-input diagnostics in config.set indicator error Round 5 Copilot review on PR #17150: `raw = str(value or "").strip().lower()` collapsed any falsy non-string (`0`, `False`, `[]`) to empty string, so the error message read `unknown indicator: ` with nothing after — losing the original input. Switched to `("" if value is None else str(value)).strip().lower()` so only `None` (the genuine 'no value' case) becomes blank. Used `{raw!r}` in the error so the diagnostic is unambiguous (`'0'` vs `0`). Tests: - known-value happy path (`'EMOJI'` → `'emoji'`) - falsy non-string inputs (`0` / `False` / `[]`) surface meaningfully - `None` keeps the blank-repr error	2026-04-28 18:19:16 -05:00
brooklyn!	1e326c686d	fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races (#17118 ) * fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races `tui_gateway` reports `tui_gateway_crash.log` traces where the main thread sits in `sys.stdin` while a worker holds `_stdout_lock` mid- flush, and SIGTERM then calls `sys.exit(0)` while the lock is still held — the interpreter shutdown stalls behind the wedged write. Two narrowly scoped hardenings: `tui_gateway/transport.py` * Move JSON serialisation outside the lock — long messages no longer block sibling writers while we serialise. * Treat `BrokenPipeError`, `ValueError` ("I/O on closed file") and generic `OSError` from both `write` and `flush` as "peer is gone": return `False` instead of bubbling, matching what `write_json`'s callers in `entry.py` already expect. * Split `flush` into its own try block so a stuck flush never strands a partial write or holds the lock indefinitely on its way out. * Optional `HERMES_TUI_GATEWAY_NO_FLUSH=1` env knob to skip explicit `flush()` entirely on environments where a half-closed read pipe produces an indefinite kernel-level block. Default unchanged. `tui_gateway/entry.py` * `_log_signal` now spawns a 1-second daemon timer that calls `os._exit(0)` if the orderly `sys.exit(0)` path is itself stuck behind a wedged worker. Atexit handlers run inside the grace window when they can; the timer is the safety net so a deadlocked flush no longer strands the gateway process. Tests: * `test_write_json_closed_stream_returns_false` — ValueError path. * `test_write_json_oserror_on_flush_returns_false` — OSError on flush must not strand the lock; the write portion still landed before the flush failure. * `test_write_json_no_flush_env_skips_flush` — env knob bypass. Validation: `scripts/run_tests.sh tests/tui_gateway/test_protocol.py` (42/42 pass; one pre-existing failure on `test_session_resume_returns_hydrated_messages` is unrelated to this change — same `include_ancestors` mock kwarg issue tracked elsewhere). `scripts/run_tests.sh tests/test_tui_gateway_server.py` 90/90 pass. * review(copilot): tighten transport hardening comments + test cleanup * review(copilot): narrow exception capture, configurable grace, simpler no-flush test * fix(tui-gateway): narrow ValueError to closed-stream; surface UnicodeEncodeError Copilot review on PR #17118: `UnicodeEncodeError` is a ValueError subclass, so a non-UTF-8 stdout (mismatched PYTHONIOENCODING / locale) would have been silently swallowed as 'peer gone' under `except ValueError`. That hides a real environment bug. Now: - UnicodeEncodeError → log with exc_info (warning) and drop the frame - ValueError where str(e) contains 'closed file' → peer gone, return False - Any other ValueError → log loudly, drop frame (defensive, but visible) Same shape applied to flush. Adds two regression tests. * fix(tui-gateway): reserve write() False for peer-gone; re-raise programming errors Round 2 Copilot review on PR #17118: `Transport.write()` returning `False` is documented as 'peer is gone', and `entry.py` reacts by calling `sys.exit(0)`. But the implementation also returned False for non-IO conditions (non-JSON-safe payloads, UnicodeEncodeError, unrelated ValueErrors), so a programming error or local env bug would present as a clean disconnect — exactly the diagnosis pain we wanted to eliminate. Now: - `json.dumps` failure → re-raises (TypeError/ValueError surfaces in crash log) - `BrokenPipeError` → False (peer gone) - `ValueError('...closed file...')` → False (peer gone) - `UnicodeEncodeError` and any other ValueError → re-raise - `OSError` → False (existing IO-failure semantics, debug-logged) Tests updated to assert the re-raise behaviour and added a non-serializable-payload regression test. * fix(tui-gateway): narrow OSError to peer-gone errnos; honest test naming Round 3 Copilot review on PR #17118: - Docstring claimed False = peer gone, but generic OSError on write/flush also returned False — meaning ENOSPC/EACCES/EIO would silently exit. Added `_PEER_GONE_ERRNOS = {EPIPE, ECONNRESET, EBADF, ESHUTDOWN, +WSA}` and narrowed the OSError handlers; non-peer-gone errnos re-raise. Docstring now lists OSError as peer-gone branch with the errno set. - The `_DISABLE_FLUSH` test was named after the env var but actually patched the module constant. Renamed it to reflect the contract being tested (skips flush when constant is true) AND added a real end-to-end test that sets the env var, reloads transport.py, and asserts the constant flips. Cleanup reload restores defaults so parallel tests stay isolated. Self-review (avoid round 4): - Verified TeeTransport's secondary-swallow stays intentional. - _log_signal grace path already covered by separate tests.	2026-04-28 17:54:06 -05:00
brooklyn!	15ef11a8b8	fix(tui): make /browser connect actually take effect on the live agent (#17120 ) * fix(tui): make /browser connect actually take effect on the live agent Reports were that `/browser connect <url>` (and "changes to CDP url don't get picked up") didn't propagate to the live agent in `--tui`, forcing users to fall back to setting `browser.cdp_url` in `config.yaml` and restarting. Tracing the path on current main shows the protocol wiring is already correct — `/browser` is registered in `ui-tui/src/app/slash/commands/ops.ts` and dispatches `browser.manage` through the gateway RPC, NOT the slash worker (covered by the `browser.manage` row in `slashParity.test.ts`). But three real gaps left the experience flaky: 1. `cleanup_all_browsers()` ran AFTER `os.environ["BROWSER_CDP_URL"]` was rewritten. `_ensure_cdp_supervisor(...)` reads the env to resolve its target URL, so a tool call landing in that brief window could re-attach the supervisor to the OLD CDP endpoint just before we reaped sessions, leaving the agent talking to a dead URL. Reorder to clean first, swap env, clean again so the supervisor for the default task is definitively closed. 2. `browser.manage status` reported only the env var, ignoring `browser.cdp_url` from config.yaml. `_get_cdp_override()` (the resolver the agent itself uses) consults both — match it so `/browser status` answers the same question the next `browser_navigate` will see. Closes a stealth bug where users saw "browser not connected" while their CDP URL was perfectly set in config.yaml. 3. `/browser disconnect` only cleared `BROWSER_CDP_URL` and reaped once, leaving the same swap window as connect. Symmetrical double-cleanup here too. Frontend (`ops.ts`): * Echo "next browser tool call will use this CDP endpoint" on success so users see immediate confirmation that the gateway accepted the swap, even before any tool runs. * Mention `browser.cdp_url` in `config.yaml` in the usage hint and the not-connected status line. Persistent config is the correct fix for some terminal-multiplexer / sub-agent flows where env inheritance is unreliable; surfacing it makes that workaround discoverable. Tests (4 new, all hermetic): * `status` returns the resolved URL when only `browser.cdp_url` is set in config.yaml. * `connect` writes env AND cleans before/after, in that order. * `connect` against an unreachable endpoint does NOT mutate env or reap. * `disconnect` removes env and cleans twice. Validation: scripts/run_tests.sh tests/test_tui_gateway_server.py — 94/94 pass. cd ui-tui && npm run type-check — clean; npm test --run — 389/389. * review(copilot): always defer to _get_cdp_override; normalize bare host:port * review(copilot): collapse discovery-style CDP paths so /json/version isn't duplicated * fix(tui): /browser status must not perform CDP discovery I/O Copilot review on PR #17120: previous version routed through `tools.browser_tool._get_cdp_override`, which calls `_resolve_cdp_override` and performs an HTTP probe to /json/version with a multi-second timeout for discovery-style URLs. That blocks the TUI on `/browser status` whenever the configured host is slow or unreachable. Status now reads env-then-config directly with no network I/O. The WS normalization still happens in `browser_navigate` for actual tool calls, so behaviour-on-call is unchanged. * fix(tui): skip /json/version probe for concrete ws://devtools/browser endpoints Round 2 Copilot review on PR #17120: hosted CDP providers (Browserbase, browserless, etc.) return concrete `ws[s]://.../devtools/browser/<id>` URLs which are already directly connectable but don't serve the HTTP discovery path. The previous `/json/version` probe rejected these valid endpoints with 'could not reach browser CDP'. For `ws[s]://...` URLs whose path starts with `/devtools/browser/` we now do a TCP-level reachability check (`socket.create_connection`) instead of the HTTP probe. The actual CDP handshake happens on the next `browser_navigate` call, so we still surface unreachable hosts as 5031 errors — just without the false negatives. Discovery-style URLs (`http://host:port[/json[/version]]`) keep the HTTP probe path unchanged. Updated existing test + added two new ones (TCP-only success, TCP unreachable → 5031).	2026-04-28 17:46:57 -05:00
brooklyn!	87d3fa6f1c	feat(tui): opt-in auto-resume of the most recent session (#17130 ) * feat(tui): opt-in auto-resume of the most recent session `hermes --tui` always forges a fresh session at startup unless the user sets `HERMES_TUI_RESUME=<id>`. Disconnects, terminal-window crashes, and accidental Ctrl+D therefore lose every piece of in-flight context even though `state.db` still has the full history a `/resume` away. Add an opt-in path that mirrors classic CLI's `hermes -c` muscle memory: when `display.tui_auto_resume_recent: true` is set in `~/.hermes/config.yaml`, the TUI looks up the most recent human-facing session and resumes it instead of starting fresh. Default off so existing users aren't surprised; explicit `HERMES_TUI_RESUME` always wins. Wires: * New `session.most_recent` JSON-RPC in `tui_gateway/server.py` that returns the first non-`tool` row from `list_sessions_rich`, or `{"session_id": null}` when none. Uses the same deny-list as `session.list` so sub-agent rows can't sneak in. * `createGatewayEventHandler.handleReady` re-ordered: explicit `STARTUP_RESUME_ID` first (unchanged), then conditional auto-resume via `config.get full → display.tui_auto_resume_recent`, then the legacy `newSession()` fallback. Failures of either RPC fall back to `newSession()` so the path is always finite. * Default `display.tui_auto_resume_recent: False` added to `DEFAULT_CONFIG` in `hermes_cli/config.py` (no `_config_version` bump per AGENTS.md — deep-merge handles the additive key). Tests: * 4 new vitest cases in `createGatewayEventHandler.test.ts` cover every gate-and-fallback combination (env wins, config off, config on with hit, config on with miss). * 3 new pytest cases for `session.most_recent` (denied row skip, tool-only → null, db-unavailable → null). Validation: scripts/run_tests.sh tests/test_tui_gateway_server.py — 93/93. cd ui-tui && npm run type-check — clean; npm test --run — 393/393. * review(copilot): fold session.most_recent errors into null + extend ConfigDisplayConfig * review(copilot): cover RPC-rejection fallbacks in auto-resume tests	2026-04-28 16:53:38 -05:00
Gille	0d957a8d48	fix(tui): surface mouse slash command (#17126 )	2026-04-28 13:27:43 -07:00
Rugved Somwanshi	214ca943ac	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
Ruda Porto Filgueiras	a23f18cc3e	fix(bedrock): add live model discovery and region resolution for non-US regions provider_model_ids("bedrock") fell through to a static _PROVIDER_MODELS table containing only hardcoded us.* model IDs. Users configured for non-US AWS regions (eu-central-1, ap-northeast-1, etc.) saw wrong or no models in /model and autocomplete. Root causes fixed: 1. models.py: provider_model_ids() now calls discover_bedrock_models() keyed by the resolved region before falling back to the static table. A new bedrock_model_ids_or_none() helper in bedrock_adapter.py consolidates the discover -> extract IDs -> fallback pattern used by all three call sites. 2. providers.py: registers bedrock in HERMES_OVERLAYS with transport=bedrock_converse and auth_type=aws_sdk so get_provider("bedrock") and resolve_provider_full("bedrock") work. 3. model_switch.py: list_authenticated_providers() sections 2 and 3 detect AWS credentials via has_aws_credentials() for aws_sdk overlays and use live discovery for the model list. 4. bedrock_adapter.py: resolve_bedrock_region() reads the configured region from botocore.session before falling back to us-east-1, covering users who set their region in ~/.aws/config via a named profile rather than env vars. 5. tui_gateway/server.py: passes provider= to get_model_context_length() so context window lookups work correctly for the Bedrock provider.	2026-04-28 03:53:11 -07:00
Teknium	dd789a4fdf	fix(mcp): move discovery out of model_tools import side effect (#16856 ) (#16899 ) model_tools.py ran discover_mcp_tools() as a module-level side effect. discover_mcp_tools() uses a blocking 120s wait internally (via _run_on_mcp_loop -> future.result(timeout=120)). The gateway lazy-imports run_agent -> model_tools on the first user message, which happens inside the asyncio event loop thread. A slow or unreachable MCP server therefore froze Discord shard heartbeats and Telegram polling for up to 120s on the first message after gateway start. Fix: remove the module-level call. Every entry point now runs discovery explicitly at its own startup, using the context-appropriate blocking/non-blocking pattern: - gateway/run.py: loop.run_in_executor(None, discover_mcp_tools) before platforms start accepting traffic - hermes_cli/main.py: inline (no event loop at CLI startup) - tui_gateway/entry.py: inline (sync stdin loop, no event loop) - acp_adapter/entry.py: inline before asyncio.run() Closes #16856.	2026-04-28 01:17:58 -07:00
Teknium	9e4d79b17f	fix(tui): `/model` writes HERMES_TUI_PROVIDER unconditionally (#16857 ) (#16897 ) `/new` after `/model <custom-provider>:<model>` silently reverted to a native provider whose static catalog happened to contain the same model name (e.g. `deepseek-v4-pro` → native `deepseek` → 401). Root cause at the `/model` writeback site: `HERMES_INFERENCE_PROVIDER` was set unconditionally but `HERMES_TUI_PROVIDER` was only mirrored when it was already set. On sessions launched without `--provider`, `HERMES_TUI_PROVIDER` stayed unset, so `_resolve_startup_runtime()` on `/new` skipped the explicit-provider early return and fell through to `detect_static_provider_for_model()`. Fix: set `HERMES_TUI_PROVIDER` unconditionally alongside `HERMES_INFERENCE_PROVIDER` when `/model` lands. Keeps #15755's invariant intact — `HERMES_TUI_PROVIDER` remains the canonical "explicit this process" carrier, `HERMES_INFERENCE_PROVIDER` remains ambient and does not short-circuit startup resolution. Bug report and diagnosis: @Bartok9 in #16857 / #16873. Fixes #16857	2026-04-28 01:17:04 -07:00
Brooklyn Nicholson	4f59510dd4	fix(tui): tighten fast-mode support validation Distinguish missing model from unsupported model before enabling fast mode and cover both cases so config and live agent state remain untouched on invalid fast toggles.	2026-04-27 13:00:11 -05:00

1 2 3 4

183 commits