mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-10 08:32:09 +00:00
239 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e7a7872a87 |
fix(tui_gateway): dedup re-queued process notifications flooding TUI
_ notification_poller_loop_ re-emits status.update every cycle when a background process completes while the session is busy. The same completion event gets re-queued and re-emitted to the TUI every few ms, flooding the transcript with duplicate lines. Add _notification_event_dedup_key(evt) that returns a tuple identity for each notification event. Only emit status.update on first sight per identity: - completions: (sid, type) — one-shot per process session - watch_match: (sid, type, command, pattern, output, ...) - watch_overflow/disabled: (sid, type, command, message, ...) The dedup key design was refined from an initial sid:type approach after @lordbuffcloud identified that distinct watch_match events (READY vs DONE) for the same process would be incorrectly collapsed. Tests from @tymrtn cover distinct watch matches, exact replay dedup, and completion one-shot behavior. Co-authored-by: tymrtn <ty@tmrtn.com> |
||
|
|
a3fb48b2ce |
fix(state): keep /branch sessions visible after parent reopen
/branch (aka /fork) sessions vanished from /resume and /sessions. Both surfaces funnel through list_sessions_rich(include_children=False), which hid any session with a parent_session_id unless identified as a branch via a heuristic — parent.end_reason == 'branched' AND child.started_at >= parent.ended_at. Two ways that heuristic failed: 1. CLI/gateway branches: once the parent was reopened (e.g. resumed) and re-ended with a different end_reason (tui_shutdown overwriting 'branched'), the heuristic stopped matching and the branch was hidden permanently. 2. TUI branches (tui_gateway session.branch): the TUI never ends the parent as 'branched' — it creates the child while the parent is still live — so the heuristic NEVER matched and TUI branches were hidden from the moment they were created (this is the macOS desktop app's primary symptom). Fix: persist a stable '_branched_from' marker in the branch session's model_config at creation time across ALL THREE branch paths (CLI cli.py, gateway gateway/run.py, and TUI tui_gateway/server.py), and OR a json_extract(model_config, '$._branched_from') IS NOT NULL check into the list_sessions_rich filter. The marker is immutable across the parent's lifecycle, so the branch stays visible regardless of how/whether the parent is ended. The legacy end_reason heuristic is kept (OR'd) so pre-existing branches remain visible. Subagent/compression children (no marker, parent not 'branched') stay correctly hidden. Fixes #20856. Approach by liuhao1024 (PR #20864); reimplemented on current main, extended to the TUI branch path (which the original missed), with regression tests for the reopen+re-end scenario and the TUI marker persistence. |
||
|
|
8077e7d2fb |
fix(tui): narrow resume lock to avoid blocking session.close
The salvaged fix held _session_resume_lock across _make_agent (MCP discovery + AIAgent construction, seconds), serializing it against session.close. Since session.close runs on the main RPC dispatch thread (not a _LONG_HANDLER), a close racing a mid-build resume would stall all fast-path RPCs (approval.respond, session.interrupt). Restructure to double-checked locking: build the agent outside the lock, then re-check _find_live_session_by_key under the lock before _init_session. A losing concurrent resume discards its just-built agent (no worker/poller wired yet) and reuses the winner. Updated the concurrent-resume regression test to assert the real invariant (one surviving live session + loser agent closed) rather than the implementation detail of a single _make_agent call. |
||
|
|
bd6d098762 | fix(tui): keep resumed live history current | ||
|
|
98903d0313 | fix(tui): reuse live session on resume | ||
|
|
f66a929a6b
|
fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out (#38578)
* fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out The desktop app's gateway event handler (use-message-stream.ts) handled clarify.request but had no case for approval.request, sudo.request, or secret.request. When a tool needed approval, the gateway emitted approval.request and blocked the agent thread in _await_gateway_decision() for up to 5 min (approvals.gateway_timeout); the desktop dropped the unknown event, never showed a dialog, then the agent returned BLOCKED. No prompt, just a stall then a block. The Ink TUI already handles all three (createGatewayEventHandler.ts); this brings the Electron app to parity. - store/prompts.ts: approval/sudo/secret atoms (+ request-id-guarded clears) - components/prompt-overlays.tsx: Radix dialogs; close/Esc maps to refusal so silence is never mistaken for consent (parity with TUI Esc->deny) - use-message-stream.ts: wire the three *.request cases; clearAllPrompts on message.complete so an overlay can't outlive its turn - chat-messages.ts: GatewayEventPayload gains command/description/env_var/prompt - mount PromptOverlays in the chat shell * feat(desktop): inline tool-call approval bar (Cursor-style "Run") Render dangerous-command / execute_code approval inline on the pending tool row instead of as a modal. Binding is positional: the desktop tool.start payload carries no structured args, but approval.request only fires from the terminal/execute_code guards and the agent blocks on one approval at a time, so the single pending row of those tools is the one that raised it. Command/description text comes from $approvalRequest. Drops ApprovalDialog from PromptOverlays (sudo/secret stay modal). * style(desktop): make inline approval bar match Cursor's command card Drop the amber alert styling for a neutral elevated card: command on a terminal-prefixed row up top, a divided footer with the muted description on the left and right-aligned controls — a ghost "Reject" (Esc) plus a split primary "Run" (⌘⏎) whose chevron opens "Allow this session" / "Always allow" / "Reject". Wire ⌘/Ctrl+Enter → Run and Esc → Reject to match Cursor's accept/skip bindings, guarded against double-send via the $approvalRequest atom. * style(desktop): shrink inline approval to a tiny Cursor-style button strip The running tool row already shows the command, so drop the whole card + command echo + description band. What's left is a compact strip under the row: a small split "Run ⌘⏎" button (chevron → Allow this session / Always allow / Reject) and a ghost "Reject Esc", indented to sit under the row's title text. * style(desktop): drop the loud blue Run button for a quiet outlined control Swap the primary (blue) Run for a subtle outlined split control — neutral border, transparent fill, hover-accent — so the approval strip reads as quiet inline affordance rather than a big CTA. Reject stays ghost. * style(desktop): make Run a soft primary badge Tint the Run split control with the primary color as a badge (bg-primary/10, primary text, primary/25 border, rounded-md, hover primary/15) instead of a solid CTA or a neutral outline. * style(desktop): slim the approval chevron and space out Reject The chevron button had ballooned because dropping the size prop fell back to the big default size (h-9 + has-svg px-3). Pin size=xs everywhere and give the chevron a tight w-5/px-0. Bump the gap between the Run badge and Reject (gap-2.5) and loosen Reject's internal spacing. * feat(desktop): confirm before "Always allow" persists an approval "Always allow" writes the matched pattern to ~/.hermes/config.yaml and suppresses the prompt in every future session — too consequential to fire straight from a menu click. Route it through a confirm dialog that names the pattern + command and the file it touches. The dialog owns the keyboard while open so Esc closes it instead of denying the approval. * fix(gateway): make sudo + secret prompts actually fire in the desktop Tek's PR added the sudo/secret overlays and callback wiring, but neither reached the live path: - Sudo: the sudo password callback is thread-local (terminal_tool _callback_tls), and _wire_callbacks runs on the agent-build thread, not the turn thread that executes tools. At command time the callback was missing, so terminal sudo fell through to /dev/tty and hung the headless gateway. Re-wire callbacks at the top of the prompt-submit turn thread. - Secret: skills_tool short-circuited to the "secret entry unsupported" hint for any gateway surface, before invoking the callback. Interactive surfaces (desktop/TUI) register a secret-capture callback that routes to the secret.request overlay; only short-circuit when no callback exists, so messaging still gets the hint but the desktop prompts. * docs(desktop): drop Cursor references from approval comments * docs(desktop): drop Cursor reference from prompt-overlays comment * fix(skills): gate in-band secret capture on HERMES_INTERACTIVE, not callback presence The desktop/sudo PR switched the gateway secret-capture short-circuit from "any gateway surface" to "gateway surface with no callback registered". That made a messaging gateway (telegram/discord/...) attempt interactive in-band secret capture whenever any callback happened to be registered, instead of returning the safe "setup unsupported" hint — and broke test_gateway_still_loads_skill_but_returns_setup_guidance. Discriminate on HERMES_INTERACTIVE instead: the desktop app / TUI set it in _enable_gateway_prompts (alongside registering the secret.request callback), while messaging platforms never do. This is the same flag tools/approval.py uses to tell an interactive surface from a messaging one, so messaging keeps the hint and desktop/TUI still prompt. --------- Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com> |
||
|
|
1927ff217e
|
Merge pull request #38517 from NousResearch/bb/desktop-yolo-statusbar-toggle
feat(desktop): YOLO toggle in the status bar (per-session, TUI parity) |
||
|
|
e02a6038a4
|
fix(tui): save TUI /save snapshots under Hermes home with system prompt (#38251)
* fix(tui): save TUI /save snapshots under Hermes home with system prompt The TUI gateway's session.save RPC wrote hermes_conversation_<ts>.json to the workspace/project CWD via os.path.abspath(...) and only exported model and messages. This diverged from the classic CLI /save (which writes under the Hermes profile home) and from the dashboard save (which includes the system prompt). Write the snapshot under get_hermes_home()/sessions/saved/ and include system_prompt, session_id, and session_start so the TUI export matches the CLI and dashboard behavior. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tui): prefer agent.session_start for /save export; assert it in test Address review feedback: derive session_start from the agent's session_start datetime (matching the classic CLI export) and fall back to the gateway session's created_at only when unavailable. Assert session_start in the regression test. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
e76d8bf5aa
|
fix(tui): stop persisting full tool output in trail lines (silent OOM death)
A heavy --tui session (browser snapshots, large tool outputs) silently OOM-killed the Node parent within minutes — closing the gateway child's stdin, which the user saw only as a bare "gateway exited" / stdin EOF. CLI was immune. Root cause: each completed tool's verbose trail line embedded up to 16KB of result_text, persisted in transcript Msg.tools[] for the whole session and rendered EXPANDED by default, so an Ink render-node tree was built for every one of up to 800 messages at once. That tree blew past Node's heap at a few hundred MB — far below the 2.5GB memory-monitor exit threshold, so the death was never even attributed. - text.ts: persisted verbose tool-trail blocks now cap to a small preview (VERBOSE_TRAIL_MAX_CHARS=800/12 lines), not the 16KB live-render budget. Retained trail strings drop ~17x (12.2MB -> 0.7MB at 800 msgs); the live streaming tail still uses the larger LIVE_RENDER budget. - tui_gateway/server.py: lower the gateway-side verbose text cap to match (1KB/16 lines) so we stop shipping output the TUI no longer renders. - memoryMonitor.ts: derive critical/high thresholds from the real V8 heap ceiling (~88%/70%) instead of the hardcoded 2.5GB that killed the process at 31% of an 8GB ceiling; add a one-shot onWarn early-warning on fast sub-threshold heap growth so the next such death is diagnosable, not silent. - entry.tsx: wire onWarn to a crash-log breadcrumb + stderr line. Full tool output is unchanged in the agent context and SQLite session — this is display/transport only, no behavior or context change. Fixes #34095. Related #27282. Tests: ui-tui text + new memoryMonitor suites (33 pass), python verbose-cap guard (5 pass); full ui-tui suite shows no new failures vs pristine main. E2E repro confirms the retention drop. |
||
|
|
ea4fe15631 |
feat(desktop): inline model picker in the status bar
Replace the status-bar model chip's modal with a Cursor-style dropdown: - providers grouped by name in a stable order (no recency reshuffle on select) - per-model hover-Edit submenu for reasoning effort + fast, gated by per-model capabilities now surfaced in the model.options payload - unified Fast toggle: flips the speed=fast param where supported, else swaps to the model's `-fast` variant (base and variant collapse into one row) - localStorage-backed "Edit Models" dialog to choose which models appear Adds reusable dropdown primitives (DropdownMenuSearch, shared row/label tokens, portaled + collision-aware submenus) and reads session state from nanostores rather than prop-drilling, so editing options doesn't rebuild and close the menu. |
||
|
|
31c40c72c0
|
fix(desktop): stabilize project folder sessions (#37586)
* fix(desktop): stabilize project folder sessions Keep desktop folder selection aligned with new sessions and scope TUI gateway cwd through session context so prompts and tools resolve against the selected workspace. * fix(desktop): address review feedback on folder sessions Snapshot sessions before iterating to avoid concurrent-mutation crashes, optional-chain the revealLogs catch, and read console-message args from the correct Electron event/messageDetails positions. * fix(desktop): address second review pass on folder sessions Sync the remembered workspace key with the cwd atom (clear on empty), only load tree children for real directory nodes, and throttle renderer auto-reloads so a deterministic startup crash can't loop forever. * fix(desktop): inherit parent workspace for ephemeral agent tasks Background and preview tasks use ephemeral ids absent from the session map, so pass the parent session cwd into the session context explicitly instead of clearing it back to the gateway launch dir. Also correct the set_session_vars docstring about clear_session_vars semantics. * fix(desktop): validate preview cwd before pinning session context A non-empty but non-existent client cwd would pin an unusable override and silently fall back to the launch dir. Validate once, reuse for both the session context and the terminal override, and fall back to the parent session workspace when invalid. * fix(desktop): harden preview cwd normalization and adopt normalized cwd Guard preview cwd normalization against malformed client paths so a bad input can't fail the whole restart, and adopt the backend's normalized config.get cwd in the no-active-session path so the persisted workspace stays consistent with what the agent uses. |
||
|
|
8977bf282e
|
Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> |
||
|
|
135c65093a |
feat(desktop): stable in-workspace ordering + No-workspace default
- Sidebar: rows within a workspace group now sort by creation time instead of last activity, so they stop reshuffling every time a message lands (muscle memory). Groups still float up by recency. - Sessions only persist a workspace cwd when one was explicitly chosen; an auto-detected launch directory is no longer stamped on the row, so untargeted sessions group under "No workspace" instead of "desktop". The agent still runs in the detected directory. |
||
|
|
85b65e29f0
|
feat(desktop): session hygiene, archive, media streaming + connecting overlay (#37099)
* feat(desktop): session hygiene, archive, media streaming + connecting overlay
Address a batch of desktop feedback:
- Stop leaking empty "Untitled" sessions: the TUI gateway pre-created a DB
row on every session.create (i.e. every launch/draft). Persist the row
lazily on first prompt instead, and hide message-less rows in the sidebar.
- Archive/hide sessions: new `archived` column + set_session_archived, web
API (`?archived=` + PATCH archived), Ctrl/⌘-click and a context-menu item
in the sidebar, and an "Archived Chats" settings panel to restore/delete.
- Videos load via a streaming `hermes-media://` protocol instead of capped,
in-memory data URLs (16 MB limit) — bypasses the cap and supports seeking.
- Background-process completions route to the session that launched them:
the completion event now carries session_key and each poller only consumes
its own.
- Sidebar: "Group by workspace" toggle is always visible; each workspace
group gets a "+" to start a session in that directory; "New agent"/"Agents"
relabeled to "New session"/"Sessions".
- New gateway connecting overlay (ascii decode → fade out) replacing the bare
skeleton/"starting gateway" state.
* fix(desktop): bail connecting overlay on boot error
The shownRef latch kept the connecting overlay mounted behind
BootFailureOverlay after a hard boot failure. Return null on boot.error
so the failure recovery surface fully owns the screen.
* fix(desktop): address Copilot review
- /api/sessions: validate `archived` (400 on unknown) and return `archived`
as a JSON boolean instead of SQLite's 0/1.
- PATCH /api/sessions/{id}: 400 (not a misleading 404) when the body has no
updatable fields; stop conflating a no-op with "not found".
- hermes-media protocol: drop `bypassCSP` — streaming only needs
secure/standard/stream/supportFetchAPI.
- Sidebar workspace header: split the toggle and the "+" into sibling buttons
so we no longer nest interactive elements inside a <button>.
* fix(desktop): address Copilot re-review
- hermes-media protocol: restrict streaming to an audio/video extension
allowlist (415 otherwise) so it can't be used to read arbitrary local files.
- Connecting overlay: use z-[1200] instead of the non-standard z-1200 utility.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
||
|
|
3f7d1c801d |
feat(undo): /undo [N] backs up N user turns with prefill + soft-delete
Extends the existing /undo command from a single in-memory exchange removal into a full rewind: back up N user turns (default 1), soft-delete the truncated rows in SessionDB (active=0, kept for audit, hidden from re-prompts and search), notify memory providers, and prefill the composer with the backed-up message text for editing — CLI and TUI. Reuses the SessionDB rewind primitives, the on_session_switch(rewound=True) memory hook, and the TUI command.dispatch prefill payload from SaguaroDev's #21910 work, wired to /undo [N] instead of a separate /rewind picker. - cli.py: undo_last(n, prefill) — in-memory truncate + SQLite soft-delete + agent surgery (system-prompt invalidate, flush-index reset) + memory notify + editable buffer prefill; /undo dispatch parses optional count; checkpoint-rollback caller passes prefill=False - tui_gateway/server.py: command.dispatch undo branch (was rewind) parses count, picks Nth-from-last user turn, clamps to oldest - commands.py: /undo gains [N] args_hint - tests: rename + expand TUI suite (multi-turn, clamp, invalid-count) - release.py: AUTHOR_MAP entry for SaguaroDev Co-authored-by: SaguaroDev <74339271+SaguaroDev@users.noreply.github.com> |
||
|
|
243e836dce |
feat(tui): wire /rewind through command.dispatch + prefill payload (#21910)
Adds the TUI half of the /rewind feature so the Ink terminal UI gets the same affordance as the prompt_toolkit CLI. Python side (tui_gateway/server.py): - /rewind added to _PENDING_INPUT_COMMANDS so slash.exec rejects it and the TUI falls through to command.dispatch (the only path with access to live session state + memory hooks). - New command.dispatch branch for name == "rewind": v1 auto-picks the most recent user turn (Claude-Code-style single- step undo), calls SessionDB.rewind_to_message, refreshes the in-memory history, fires _memory_manager.on_session_switch with rewound=True, and returns the new "prefill" payload. - A dedicated picker overlay (multi-step rewind) is tracked as a follow-up to #21910. TS side (ui-tui/src/): - New "prefill" variant on CommandDispatchResponse + asCommandDispatch validator. Mirrors "send" but does NOT auto-submit; the client drops the message into the composer for editing. - createSlashHandler renders the optional notice via sys() and calls ctx.composer.setInput(d.message), letting the user edit-and-resubmit the rewound turn — the core UX promised by the issue. Tests: - 7 new tui_gateway tests covering prefill payload shape, in-memory history truncation, DB soft-delete, memory-provider notification (rewound=True), busy-session refusal, missing-session error, and registry placement in _PENDING_INPUT_COMMANDS. - Extended asCommandDispatch vitest covering the new prefill variant (with + without notice, and rejection of malformed payloads). Out of scope for v1 (tracked as #21910 follow-up): - Dedicated picker overlay in Ink (the multi-step rewind UI). v1 auto- picks the most recent user turn, matching the most common case. - Gateway platforms (Telegram, Discord, etc.) — issue scopes v1 to CLI + TUI only. |
||
|
|
77bb64813c
|
fix(desktop): report desktop_contract in lazy session.create info (#36112)
The lazy session.create path hand-builds a partial info dict that omitted desktop_contract. The desktop GUI reads a missing contract as undefined and treats it as an out-of-date backend, so it surfaced a "Backend out of date" toast on every launch even against a current backend. Carry the contract in the lazy payload like _session_info already does for resume/branch. |
||
|
|
51c68d4ab1
|
Add Hermes desktop app (#20059)
* feat: better composer etc * docs: add desktop and dashboard run instructions * fix(desktop): address security scan findings * fix(dashboard): resolve @nous-research/ui path under npm workspaces The sync-assets prebuild step shelled out to 'cp -r node_modules/@nous-research/ui/dist/fonts ...' with a path relative to apps/dashboard/. That works only when the dep is installed locally in the dashboard workspace, but 'npm install' at the repo root (the documented setup — see apps/desktop/README.md) hoists shared deps to the root node_modules under npm workspaces. The relative cp then fails with 'No such file or directory', sync-assets exits 1, the Vite build aborts, and 'hermes dashboard' surfaces a generic 'Web UI build failed' message. Replace the shell one-liner with scripts/sync-assets.cjs, which walks up from the dashboard directory looking for node_modules/ @nous-research/ui — working in both the hoisted (workspaces) and co-located (standalone) layouts. Also guards against a missing dist/fonts or dist/assets with a clearer error pointing at a rebuild of the UI package rather than silently copying nothing. * feat(desktop): support connecting to a remote Hermes backend Add HERMES_DESKTOP_REMOTE_URL and HERMES_DESKTOP_REMOTE_TOKEN env vars that, when set, short-circuit the local-child spawn in startHermes() and connect the Electron renderer to an already- running 'hermes dashboard' server reachable over the network. Motivating use case: WSL2 users who want to run the Hermes core (agent loop, tools, filesystem access) inside their WSL distribution while rendering the Electron GUI on native Windows. Before this change, the desktop app always spawned a local Python child on the same host as the renderer, which doesn't cross the WSL/Windows boundary. The remote path reuses waitForHermes() as a liveness probe (/api/status is in the backend's public endpoint allowlist), so the connection is only returned once the backend is actually ready. WebSocket URL derivation picks ws:// or wss:// based on the input scheme. URL validation rejects non-http(s) schemes and requires both env vars together to avoid a half-configured connection that would silently fall through to the spawn path. No behaviour change when the env vars are unset — the default local-spawn flow is untouched. Typical usage: # in WSL2 hermes dashboard --tui --no-open --host 0.0.0.0 --port 9119 --insecure # on Windows set HERMES_DESKTOP_REMOTE_URL=http://localhost:9119 set HERMES_DESKTOP_REMOTE_TOKEN=<session token> set HERMES_DESKTOP_IGNORE_EXISTING=1 (launch Hermes desktop) * ci(desktop): automate desktop releases Add GitHub Actions release channels for signed desktop installers and document the stable/nightly download paths. * feat: file tabs * refactor(desktop): tighten right-rail tab close API Promote closeRightRailTab/closeActiveRightRailTab as the single public entry point. Drops the activeTabRef + handleCloseDocument indirection in ChatPreviewRail, the unused $rightRailHasContent atom, and the legacy dismissFilePreviewTarget alias. -70 LOC. * feat(desktop): polish composer pill toward reference look Solid foreground-on-background send/voice-conversation circle (black-on-white in light, white-on-black in dark) anchors the right edge as the primary CTA instead of the orange theme primary. Bumps the primary control to 2.125rem so it visually outranks the ghost mic/plus controls. Opens up the surface padding (0.625rem x / 0.5rem y) so the input row breathes around its controls, and nudges the corner radius from 20 to 24px for a slightly pill-ier silhouette. LiquidGlass distortion is preserved. * feat(desktop): add startup and onboarding flow Add phase-based desktop boot progress, fresh-install sandbox testing, and first-run provider credential onboarding so packaged installs can start cleanly without manual settings detours. * fix(desktop): gate prompts on provider setup Show the desktop provider onboarding flow before prompt submission when no inference provider is configured, preventing fresh installs from falling through to backend credential errors. * fix(desktop): surface provider onboarding from session warnings Propagate credential warnings through session runtime info and open desktop onboarding whenever a session reports no usable provider, so unconfigured installs cannot fall through to prompt errors. * fix(desktop): route gateway provider errors to onboarding The "No inference provider configured" auth error reaches the renderer through gateway error events, not the prompt.submit promise; the previous patch only caught the latter, so the error toast still surfaced and onboarding never opened. Also strip credential-shaped env vars from the test:desktop:fresh sandbox so the packaged backend can't see provider keys leaking from the launching shell. * fix(desktop): use strict runtime check to drive onboarding setup.status returned True whenever any provider auth state was discoverable, including indirect fallbacks like a gh-CLI Copilot token. That made desktop think the user was set up while the agent's actual resolve_runtime_provider call still raised AuthError, leaving the user with a useless toast and no onboarding. Add a setup.runtime_check gateway method that runs the same resolver the agent uses on session creation, and switch the desktop onboarding overlay and prompt precheck to use it. * feat(desktop): OAuth-first onboarding using existing dashboard provider API Replace the engineer-flavored API key form with a Sign-in-first onboarding overlay that uses the dashboard's existing /api/providers/oauth catalog and PKCE/device-code endpoints (Anthropic, Nous, OpenAI Codex, etc.). API key entry is now a fallback tab with friendly provider names instead of env var prefixes, and the loud raw resolver error is gone in favor of a one-line welcome message. * fix(desktop): polish onboarding provider list Reorder OAuth providers so Nous Portal is first, give the segmented Sign in / API key control equal column widths, and replace the engineer-flavored backend names like "Anthropic (Claude API)" / "MiniMax (OAuth)" with friendlier in-app titles. External-CLI providers now show a softer subtitle and an external-link icon instead of a chevron. * refactor(desktop): split onboarding overlay into store + view Move the OAuth state machine, runtime check, copy-to-clipboard, and api-key save into store/onboarding.ts (matching the boot.ts pattern), leaving the overlay as a presentation layer that subscribes via useStore. Tabs are now table-driven, child panels read flow from the store instead of prop-drilling, and the polling/PKCE/error/success branches share a small Status atom. * fix(desktop): external CLI providers + center mode tabs External-CLI providers (Claude Code, Qwen Code) now open an in-overlay panel with the CLI command, copy button, and an "I've signed in" recheck instead of firing an invisible toast. Center the Sign in / API key tab control so it sits under the heading instead of hugging the left edge. * fix(desktop): drop onboarding tabs for an inline link, group device-code waiting state Replace the Sign in / API key tab pair with an "I have an API key" footer link under the OAuth provider list, with a "Back to sign in" affordance inside the API key form. Group the device-code "Waiting for you to authorize..." status next to the Cancel button so the alignment matches the action. * refactor(desktop): tighten onboarding store + overlay Drop the dead isOnboardingBusy/BUSY set, factor the catch-fallback dance into safeReq, and share a single reloadAndConnect helper between PKCE submit, device-code success, external recheck, and api-key save. In the overlay, extract Step / CodeBlock / FlowFooter / CancelBtn / DocsLink atoms so the four sign-in panels share the same chrome instead of repeating it inline. Net effect: fewer literal divs, one place to touch the spacing, and the code-block + footer rows are reusable across future flows. * fix(desktop): mount onboarding from frame 1 to kill the FOUT Default onboarding.configured to null (unknown until the runtime check resolves) and have the onboarding overlay render whenever it's not yet confirmed true. The boot overlay now yields to it, so the very first paint is the Welcome card with a "While we get you set up..." progress strip instead of a flash of the chat shell between boot dismiss and onboarding mount. The picker swaps in cleanly once the gateway opens and the runtime check confirms the user is not configured. Already-configured users see the same prep card briefly while their existing runtime warms up, then the overlay dismisses without touching the chat shell. * fix(desktop): top-align empty sessions placeholder The "Start a chat to build your history." empty state used a min-h-35 grid place-items-center container, which floated the text in a tall dead zone. Render it as a flat paragraph that sits right under the section header like the empty pinned state does. * refactor(desktop): drop dead boot overlay Onboarding overlay subsumes the boot card now that it mounts from frame 1 and renders boot progress inline. The standalone DesktopBootOverlay is unreachable in every flow (yields whenever onboarding has not confirmed configured, dismisses once it has). * fix(desktop): hide pinned/recents sections until first session A fresh sidebar showed the Pinned and Recent chats headers with floating empty-state copy underneath. Drop both sections (and the now-orphan SidebarEmptySessionState) when there are no sessions yet — they reappear after the first chat. Skeletons during initial load are unchanged. * feat(gui): route embedded TUI through dashboard gateway (#21979) Inject HERMES_TUI_GATEWAY_URL into dashboard PTY sessions so embedded ui-tui instances attach to the in-process websocket gateway, with coverage for the new env wiring. * Add desktop remote gateway settings Make the desktop gateway connection configurable from settings so local remains the default while remote backends can be saved, tested, and applied without environment variables. * feat(gui): first-class Messaging page + gateway menu redesign - Add Messaging page to the desktop app with per-platform setup, status, and inline guidance. Catalog derives from gateway.config Platform enum + plugin registry, so every messaging adapter the CLI supports (Telegram, Discord, Slack, Mattermost, Matrix, WhatsApp, Signal, BlueBubbles, Home Assistant, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ, Yuanbao, API server, Webhooks, plugins) shows up without per-platform code. - New REST endpoints: GET /api/messaging/platforms, PUT and POST /test on the same path. Secrets go through the existing .env pipeline; enable/disable writes config.yaml. - Replace gateway statusbar dropdown with a richer panel: status row, icon-only restart + system-panel actions, recent activity (with timestamps trimmed in display, full text on hover), platform list. - Auto-poll the messaging page every 6s (paused when hidden) so status updates without a manual check. - Drop Settings / Command Center from the sidebar nav (still reachable via shortcuts and the titlebar cog). - Flatten top corners on Messaging/Skills/Artifacts/Chat panes. - Share new StatusDot component across messaging + gateway menu. - Fix gateway/config.py so an explicit platforms.<name>.enabled=false in config.yaml is honored when env tokens are present. - pb-9 on the chat content area for breathing room above the composer. * Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * pin electron version * hide application menu on non-mac systems * interpret compactPreview for non-string vlaues as JSON or an empty string * fix(desktop): keep composer contenteditable mounted across stacked toggle The composer rendered {input} inside two different parent fragments depending on `stacked`. When auto-expand flipped `stacked` (e.g. the moment typed text wrapped past two lines), React reconciled the two branches as different positions and unmounted/remounted the contenteditable. The fresh mount started empty, so any in-flight characters — most reliably reproduced by holding a key — were lost. Replace the conditional with a single CSS Grid whose template-areas swap on `stacked`. The three children (menu, input, controls) keep stable identities across the toggle; only their grid placement changes, which the browser handles without React tearing down the editor. * refactor(desktop): align install layout with install.ps1 / install.sh Make the desktop app's runtime layout match what scripts/install.ps1 and scripts/install.sh produce, so a desktop-only user and a CLI-only user end up with the same files in the same places and can share one install. Layout - ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only) - VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime) - desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log) - HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere The packaged .app/.exe still ships a read-only payload at process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch or after an installer-driven upgrade we sync factory -> active, then provision the venv and run pip install -e . against the active root. Key behaviors - Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves to the same path resolveHermesHome() picked. Without this, Python falls back to ~/.hermes on every platform - fine on mac/linux, a split-state bug on Windows where our default is %LOCALAPPDATA%\hermes. - Detect developer installs by .git presence at ACTIVE; never overwrite a user's checkout via factory sync. - Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks pyproject hash + factory version + runtime schema version. depsFresh fast-paths when nothing changed. - Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run their local edits, not whatever's under HERMES_HOME. - Better error messages distinguish "no payload" from "no Python". - Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes exists, so users with prior pip/manual installs aren't orphaned. pyproject.toml - Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and pywinpty (Windows) to main dependencies. The dashboard backend (hermes dashboard) needs them at runtime; the previous lazy-import fallback was a footgun for fresh installs. - Empty the [pty] optional-extra; kept as a no-op back-compat alias for any existing pip install hermes-agent[pty] invocations. Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the desktop now installs whatever pyproject.toml says, single source of truth. Files - apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin, factory->active sync, marker v4 - apps/desktop/scripts/test-desktop.mjs: track new venv location - apps/desktop/README.md: new Setup, Runtime Bootstrap, and Debugging sections - pyproject.toml: fastapi/uvicorn/pty backends in main dependencies; [pty] extra emptied Tested locally on Windows: npm run dev boots cleanly, sessions land at the new location, type-check + lint + test:desktop:platforms all pass. Verified end-to-end on a fresh Win11 VM via dist:win installer. Known gaps (filed as follow-ups, not in this PR): - Skills not seeded on packaged installs (sync_skills only runs in cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch. - Git Bash not bundled or detected; agent's terminal tool errors out with a useful message but desktop bootstrapper should pre-flight it. - install.ps1 / install.sh should be decomposed into composable phase libraries so the desktop bootstrapper can reuse them as a single source of truth across all install surfaces. * feat(desktop): theme polish, prose chat typography, composer chrome - DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose - Composer liquid/radius utilities, thread font parity, tool/thinking cues - File tree label scale, preview flex, thread retry loading + streaming tests * feat(desktop): NSIS prereq detection page + auto-install via winget The packaged Windows installer now detects Python 3.11+ and Git for Windows at install time and offers to install missing prereqs via winget. Mirrors the prereq logic scripts/install.ps1 already runs for CLI installs, so desktop installer users get the same out-of-the-box experience as install.ps1 users. Why - Hermes' terminal tool calls bash.exe directly (tools/environments/ local.py); on Windows that's Git Bash from Git for Windows. Without it, the agent fails on the first terminal() call. - Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper errors out at venv creation. - Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python pre-installed but no Git, so the agent's first terminal call failed with "Git Bash isn't installed." - install.ps1 has had Install-Git + Install-Uv functions for ages. The desktop installer was the asymmetric outlier. How — NSIS prereq page - New file: apps/desktop/installer/prereq-check.nsh (plugged into electron-builder via build.nsis.include) - Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir hook (between the Directory page and InstFiles). - Group boxes for Python and Git, each showing detection status. - Pre-checked install checkboxes when winget is available. - Auto-skips silently if both prereqs are already installed. - Falls back to manual download URLs when winget itself is missing. - Detection: - Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python launcher. Microsoft Store "Python stub" (no py.exe) is correctly classified as not-installed. - Git: `where git`. - winget: `where winget` (Win10 1809+ / Win11 with App Installer). - Install execution (in customInstall macro): - Python: nsExec::ExecToLog with `--scope user --silent`. Per-user install, no UAC prompt, output streams to install log. - Git: ExecShellWait via Windows ShellExecute. Critical because Git always installs per-machine and triggers UAC; ShellExecute preserves the foreground focus chain across non-elevated → elevated process spawns, so UAC actually comes to the foreground. nsExec::ExecToLog breaks the chain because winget runs hidden. - Both pass `--disable-interactivity --accept-package-agreements --accept-source-agreements` to suppress winget's own dialogs. - Verification: probes Git's standard install locations via FileExists rather than `where git`. NSIS's process inherits PATH at startup, so a freshly-installed Git won't be visible to `where` until restart. - Silent installs (/S) skip the prompts; managed deploys handle prereqs out-of-band via Group Policy / Intune. How — Electron-side safety net - New findGitBash() in main.cjs, parallel to findSystemPython(). Probes the same locations as tools/environments/local.py:_find_bash() so a positive result here means the agent's terminal tool will work. - ensureRuntime now throws a clear, actionable error on Windows when Git Bash isn't found, matching the existing "Python 3.11+ is required" error path. - Catches users the NSIS page doesn't: .msi installer users (NSIS prereq page doesn't run for MSI), `npm run dev` users, manual installers, anyone who unchecked the install boxes on the NSIS prereq page. - All gated on `IS_WINDOWS`; macOS / Linux unaffected. NSIS build issue (resolved) - electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer emits "warning 6010: function not referenced" for our page functions because Page custom directives don't count as references in its static-analysis pass. The functions ARE called at runtime when NSIS invokes the page; the optimizer just can't see it statically. - Set `build.nsis.warningsAsErrors=false` in package.json so this spurious warning doesn't fail the build. (Documented option from electron-builder's nsisOptions.) Out of scope (filed for future work) - MSI prereq detection: Windows Installer custom actions are a different mechanism. Enterprise deploys typically handle prereqs via GP/Intune. - Bundle PortableGit + python-build-standalone in extraResources for zero-network installs. ~80MB increase. - Mac / Linux GUI prereq flows (different installer formats; Xcode CLT covers most macOS prereqs already; Linux is per-distro hard). Files - apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS) - apps/desktop/package.json (build.nsis.include + warningsAsErrors) - apps/desktop/electron/main.cjs (findGitBash + preflight) - apps/desktop/README.md (Runtime prerequisites section) Cross-platform impact - macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis config is ignored entirely; .nsh is dormant. - npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS. - scripts/install.ps1, scripts/install.sh: no reference to any new files; CLI install paths untouched. - Hermes CLI / dashboard / gateway: no reference; runtime untouched. - All checks: node --check on main.cjs and test-desktop.mjs pass; npm run test:desktop:platforms 4/4 passing; node --test green. Tested - npm run dist:win produces signed .exe and .msi without errors. - Fresh Win11 VM (Python pre-installed, no Git): prereq page renders, Python check shows detected, Git checkbox pre-checked. Click Next → Git installs via winget with UAC prompt in foreground. - After install completes, Hermes launches and the agent's terminal tool can run bash commands. Verified Git Bash is detected at `C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight. * feat: theme changes, composer tweaks, in app update ux, finesse * fix(cli): seed bundled skills on dashboard + gateway entrypoints `sync_skills(quiet=True)` was only being called from inside `cmd_chat`, which meant `hermes dashboard` (the desktop GUI's backend) and `hermes gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled skill library into ~/.hermes/skills/. This surfaced as "No skills found" in the desktop GUI's skills panel on fresh installs, despite the agent having access to the full bundled library when invoked via `hermes chat`. scripts/install.ps1 worked around it by running skills_sync.py as part of Copy-ConfigTemplates, but that's not part of the desktop installer's bootstrap chain. Fix - Extract the skills-sync block from cmd_chat into a module-level `_sync_bundled_skills_quietly()` helper. - Call the helper from cmd_chat (preserving existing behavior), cmd_dashboard (after the --status/--stop early-return paths and fastapi import check, so we don't run skills_sync on management commands or when deps aren't installed), and cmd_gateway. Why these three entrypoints - cmd_chat: the user's primary CLI entrypoint - cmd_dashboard: the desktop GUI's backend; this is what `hermes dashboard --tui` invokes when the desktop bootstrapper spawns Hermes - cmd_gateway: long-running daemons where the user expects the agent to have full skill access Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status, etc.) are management commands that don't need skill discovery and were never running skills_sync in the first place — leaving them alone. Idempotence - tools/skills_sync.py is manifest-based: skipped skills cost milliseconds. Calling it from multiple entrypoints adds no real cost, and users running `hermes chat` then `hermes dashboard` get two fast no-ops on the second call. Failure handling - Helper wraps skills_sync in try/except. Skills are an enhancement, not a hard dependency — Hermes runs fine with an empty skills/ dir. Files - hermes_cli/main.py: + new helper `_sync_bundled_skills_quietly()` at module level + cmd_chat: replace inline block with helper call + cmd_dashboard: add helper call after fastapi import succeeds + cmd_gateway: add helper call before delegating to gateway_command * feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes - Hoist todo to first-class widget (shadcn checkboxes, brand colors, no tool-accordion). Header derives label from active task; non-active rows fade. - Replace raw JSON dumps with structured key/value summaries via formatToolResultSummary; nested error extraction for clearer failures. - Fix loaded-session grouping: stitch interleaved assistant/tool iterations into one bubble instead of orphaned synthetic messages. - Stable tool/thinking timers via keyed registry so unmount/scroll doesn't reset elapsed counts; gate "running" on real live thread state. - Reorganize chat-only assistant-ui components under components/chat/. * fix(desktop): address CodeQL alerts on PR #20059 - settings/helpers.ts: harden setNested against prototype pollution. POLLUTING_PATH_PARTS check is now applied at every assignment site (loop + leaf) and uses Object.defineProperty so CodeQL can see the guard inline rather than via a helper function call. - lib/markdown-preprocess.ts: rebuild the dangling-fence close regex from a fence-char + length instead of marker.replace(...). The marker is captured by `(`{3,}|~{3,})` so it can only be backticks or tildes, but CodeQL was tracing tainted input text into the RegExp source and flagging hostname dots from input as part of the pattern (false positive js/incomplete-hostname-regexp on the test fixture URLs). Reconstructing from a literal char breaks the dataflow. - scripts/notarize-artifact.cjs: drop args from the run() rejection message. Args carry --key-id / --issuer / key file path; the existing outer catch already squashes errors to a generic line, but CodeQL was flagging the args.join(' ') as clear-text logging of APPLE_API_KEY_ID. Composer DOM-text-as-HTML alerts (composer/index.tsx:379, :547) are already addressed in |
||
|
|
cbf851ae1d |
perf(tui): stop slow/dead MCP servers from freezing TUI startup
The 'summoning hermes…' phase blocked on gateway.ready, which ran MCP tool discovery inline. Any configured-but-unreachable MCP server burned its full connect-retry backoff (1+2+4s ≈ 7s) before the composer appeared — startup went from instant to ~7.5s of dead air for anyone with a down stdio/http server in mcp_servers. Move discovery into a background daemon thread so gateway.ready fires immediately; tools register into the shared registry as servers connect, and the agent isn't built until the first prompt. Measured spawn→ready: ~7500ms → ~115ms (dead twozero_td server in config). Also drop rich.console + prompt_toolkit off banner.py's import path (lazy-imported inside cprint/build_welcome_banner). tui_gateway.server imports banner only to reach the lightweight prefetch_update_check helper; the eager rich/pt imports added ~45ms before gateway.ready for no benefit. tui_gateway.server import: ~115ms → ~69ms. |
||
|
|
66827f8947 |
chore: prune unused imports and duplicate import redefinitions
Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves |
||
|
|
3a9bc9d88a
|
fix(model picker): unify /model and hermes model lists, add disk cache (#33867)
* fix(model picker): unify /model and `hermes model` model lists, add disk cache
The /model slash picker and `hermes model` were drifting apart. /model
read the raw static `OPENROUTER_MODELS` list (31 entries, including 5
that fail at runtime — no tool-call support or absent from live catalog),
while `hermes model` ran the same list through the live OpenRouter
/v1/models tool-support filter and showed 26 valid entries. Same problem
existed for every other authed provider: /model used curated static
lists, `hermes model` used live /v1/models.
Unifies both surfaces on `provider_model_ids()` and adds a generic
disk-cached wrapper so the picker stays snappy.
Changes
- hermes_cli/models.py: new `cached_provider_model_ids()` —
~/.hermes/provider_models_cache.json, 1h TTL, per-provider entries
keyed by credential fingerprint (env vars + OAuth file mtimes).
Stale-data-beats-no-data on transient failures. Pair with
`clear_provider_models_cache(provider=None)`.
- hermes_cli/models.py: `provider_model_ids("nous")` now falls back
to the docs-hosted manifest (not the in-repo snapshot) when the live
Portal /models call fails — preserves the model_catalog regression
guarantee while still going through the unified pathway.
- hermes_cli/model_switch.py: `list_authenticated_providers` routes
sections 1, 2, and 2b through `cached_provider_model_ids(slug)` with
curated fallback when the live fetcher comes up empty.
- hermes_cli/model_switch.py: `parse_model_flags` extended to a
4-tuple, parses `--refresh`.
- cli.py / gateway/run.py / tui_gateway/server.py: updated unpacking;
CLI + gateway wire `--refresh` to `clear_provider_models_cache()`.
- hermes_cli/main.py: `hermes model --refresh` argparse flag.
- hermes_cli/commands.py: `/model` args_hint advertises `--refresh`.
- tests/hermes_cli/test_inventory.py: refresh stale comment.
Live PTY parity verification
- /model → OpenRouter row: `(26 models)` (was 31, with broken entries)
- `hermes model` → OpenRouter: 26 models (unchanged)
- The 5 dropped entries: `pareto-code` (no tool-call support),
`gemini-3-pro-image-preview` (no tool-call support),
`elephant-alpha`, `hy3-preview:free`, `ring-2.6-1t:free` (gone
from OpenRouter's live catalog).
Live PTY timing
- First /model open, empty cache: 4624 ms (full network round trip
across every authed provider)
- Second /model open, warm cache: 51 ms (90× faster)
- `/model --refresh` clears the disk cache and re-fetches.
Cache schema (~/.hermes/provider_models_cache.json, ~3 KB):
{ "anthropic": {"fp": "<sha256:16>", "at": 1748..., "models": [...]},
... }
Targeted tests: tests/hermes_cli/ + gateway model tests + tui_gateway —
5855/5855 pass.
* fix(model picker): use blake2b for cache fingerprint to silence CodeQL
py/weak-sensitive-data-hashing flagged the sha256 call in
_credential_fingerprint() as a high-severity alert because the input
includes env var values whose names contain *_API_KEY / *_TOKEN.
The hash is used solely as a cache-bust identity — never reversed, never
stored, collisions are harmless (worst case: cache miss → live re-fetch).
blake2b serves the same purpose and isn't flagged by this rule.
Functional behavior identical: 16-hex-char digest, cache hit/miss logic
unchanged. Live re-verified — 26 OpenRouter models, warm-cache 78ms.
|
||
|
|
0a83247e9f |
feat: add TUI session orchestrator
Add a first-class active-session orchestrator for the Ink TUI: - list, activate, close, and launch live process-local TUI sessions - hydrate committed and in-flight output when switching sessions - dispatch a new prompt session from the +new row with session-scoped model picks - expose a clickable live-session count in the status chrome - preserve stable row order while initially focusing the current session - support mouse hit-testing for floating orchestrator overlays - add backend and frontend regression coverage for the lifecycle and UI helpers |
||
|
|
c9b3eeabdc
|
fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #
|
||
|
|
a627981a65
|
fix(tui): stop slash dropdown from chopping last char of /goal (#31311)
Some checks are pending
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Docker Build and Publish / move-latest (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Two independent bugs caused the slash-command autocomplete to render
`/goal` as `/goa` (and `/gquota` as `/gquot` for that matter) in the TUI:
1. `tui_gateway/server.py` was forwarding `c.display` from
prompt_toolkit's `Completion` straight into the JSON-RPC payload.
prompt_toolkit normalizes `display=` into `FormattedText` (a `list`
subclass), so the wire format became `[["", "/goal"]]` instead of
the `string` that `CompletionItem.display` in the TUI declares.
`meta` already went through `to_plain_text` — `display` did not.
2. The dropdown row in `appOverlays.tsx` used `flexDirection="row"`
with the display `<Text>` and the (very long) meta `<Text>` as
siblings. When the meta overflows the row width, Ink/Yoga shrinks
the *first* column by one cell, lopping the trailing character off
the command name. `/goal` triggers it reliably because its meta
string is the longest of any built-in command (description +
embedded `[text | pause | resume | clear | status]` usage hint).
Wrapping the display column in `<Box flexShrink={0}>` keeps it at
its natural width and lets the meta wrap or truncate instead.
|
||
|
|
83f6a83b24 | fix(tui): handle images with codex app-server | ||
|
|
7de8cd4c5f |
fix(tui): clear TTS env var on voice off, and add TTS indicator to status bar
Bug 1: /voice off in TUI mode did not clear HERMES_VOICE_TTS, leaving TTS stuck ON with no way to disable it (the voice.toggle tts handler requires voice mode to be ON). Bug 2: TUI status bar only showed 'voice on/off' without any indication of whether TTS speech output is active, because the frontend never tracked voiceTts state. - tui_gateway/server.py: clear HERMES_VOICE_TTS when voice is turned off - ui-tui/src/app/useMainApp.ts: add voiceTts state, thread setVoiceTts through voice contexts, display [tts] in status bar - ui-tui/src/app/slash/commands/session.ts: sync tts from voice.toggle response - ui-tui/src/app/interfaces.ts: add setVoiceTts to all voice context interfaces |
||
|
|
1264fab156
|
fix(tui): surface verbose tool details (#30225)
Some checks failed
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Docker Build and Publish / move-latest (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix Lockfile Fix / auto-fix-main (push) Waiting to run
Nix Lockfile Fix / fix (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
Tests / test (push) Waiting to run
Tests / e2e (push) Waiting to run
Deploy Site / deploy-vercel (push) Has been cancelled
Deploy Site / deploy-docs (push) Has been cancelled
OSV-Scanner / Scan lockfiles (push) Has been cancelled
uv.lock check / uv lock --check (push) Has been cancelled
* fix(tui): surface verbose tool details Emit redacted structured verbose args/results to the TUI so /verbose verbose can show full tool detail without reopening stdout, and fail closed if redaction is unavailable. Salvages #29011. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com> * fix(tui): address verbose detail review Label verbose tool failures as errors, cover forced verbose reasoning, and avoid new diff type warnings from the redaction regression tests. * fix(tui): bound verbose tool payloads Cap verbose tool detail text before emitting JSON-RPC events and preserve verbose results on inline diff completions. * fix(tui): align termux argv test with gc flag Update the stale TUI launch expectation so the Termux freshness path matches the current direct Node argv. --------- Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com> |
||
|
|
a7cd254c29
|
feat(tui): mouse_tracking DEC mode presets (salvage of #26681) (#30084)
* feat(tui): make display.mouse_tracking pick which DEC modes to enable
Previously the boolean flag was all-or-nothing across modes 1000+1002+1003+1006.
Inside tmux, mode 1003 (any-motion) makes every mouse cross of the prompt row
fire a clipboard probe that surfaces as "No image in clipboard" — sometimes
dozens in a row. Disabling tracking entirely killed scroll-wheel scrolling too,
since tmux's own scrollback is preempted by the alt-screen TUI.
`display.mouse_tracking` (and `/mouse <preset>`) now accepts `off | wheel |
buttons | all` in addition to the legacy booleans. `wheel` is 1000+1006:
scroll wheel + click only, no drag, no hover — the tmux-friendly subset.
`buttons` adds 1002 for drag-to-select. `all` (= legacy `true`) keeps the
hover-driven UI (scrollbar paginate-on-hover, link mouseenter, etc.).
* fix(tui): repaint + sync mouse mode when display.mouse_tracking changes
Two interacting bugs left the TUI blank when `display.mouse_tracking`
switched at runtime (config edit, /mouse <preset>):
1. AlternateScreen's effect re-runs on every `mouseTracking` change,
tearing down and re-entering the alt screen. After re-entry, ink's
frame buffers are reset by `resetFramesForAltScreen()` but nothing
schedules the follow-up render — the alt screen sits blank until
some other state change happens to trigger one. Add a
`scheduleRender()` in `setAltScreenActive`'s active=true branch so
the freshly-entered alt screen gets a full repaint immediately.
2. `setAltScreenActive` early-returns when `active` hasn't changed,
which silently drops a `mouseTracking` change if the cleanup→setup
pair somehow leaves `altScreenActive` already true. Call
`setAltScreenMouseTracking` explicitly from the AlternateScreen
effect so the in-memory mode and terminal DECSET sequence stay in
sync regardless of how `setAltScreenActive` resolved (the call is a
no-op when the mode is unchanged).
* fix(tui): address copilot review #4341269705
- tui_gateway/server.py: drop the never-referenced _MOUSE_TRACKING_MODES
frozenset (comment #3284802434). _MOUSE_TRACKING_ALIASES already
centralizes the canonical preset set via its values; the separate
constant added no behavior.
- tests/test_tui_gateway_server.py: update the existing
test_config_mouse_uses_documented_key_with_legacy_fallback to assert
the new preset strings ('all'/'off' instead of 'on'/'off',
display.mouse_tracking persisted as 'all' instead of True) and add
test_config_mouse_accepts_preset_strings_and_aliases covering /mouse
set with wheel/click/unknown (comment #3284802453). The on/off legacy
config.set return shape was an implementation detail of the boolean
flag, not a stable API — the slash command, gateway help text, and
docs all advertise the preset values now.
- ui-tui/packages/hermes-ink/src/ink/ink.tsx: schedule a render at the
end of reenterAltScreen() (comment #3284802461). Mirrors the same fix
in setAltScreenActive() from
|
||
|
|
697d38a3f4 |
feat: auto-launch Chromium-family browser for CDP
Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable. Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support. Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing. |
||
|
|
8c3b065124 | fix(cli): show active profile in TUI prompt | ||
|
|
b5c1fe78aa
|
feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)
Skill bundles are tiny YAML files in ~/.hermes/skill-bundles/ that
group several skills under one slash command. Invoking /<bundle-name>
from any surface (CLI, TUI, dashboard, any gateway platform) loads
every referenced skill into a single combined user message.
Use cases:
- /backend-dev → loads github-code-review + test-driven-development
+ github-pr-workflow as one bundle.
- /research → loads several research skills together.
- Team task profiles shared via dotfiles.
Behavior:
- Bundles take precedence over individual skills when slugs collide.
- Missing skills are skipped with a note, not fatal.
- No system-prompt mutation — bundles generate a fresh user message
at invocation time, the same way /<skill> does. Prompt cache stays
intact.
- Works in CLI dispatch, gateway dispatch, autocomplete (CLI + TUI),
/help display.
Schema (~/.hermes/skill-bundles/<slug>.yaml):
name: backend-dev
description: Backend feature work.
skills:
- github-code-review
- test-driven-development
instruction: |
Optional extra guidance prepended to the loaded skills.
New module: agent/skill_bundles.py — load, scan, resolve, build
invocation message, save, delete. yaml.safe_load only; broken
bundles log a warning and are skipped, never raise.
New CLI subcommand: hermes bundles {list,show,create,delete,reload}.
Implementation in hermes_cli/bundles.py; wired in hermes_cli/main.py.
'bundles' added to _BUILTIN_SUBCOMMANDS so plugin discovery skips it.
New in-session slash command: /bundles lists installed bundles in
both CLI and gateway. /<bundle-name> dispatch added to CLI (cli.py)
and gateway (gateway/run.py) before the existing /<skill-name> path.
Autocomplete: SlashCommandCompleter gained an optional
skill_bundles_provider parameter that defaults to None — the prompt
shows '▣ <description> (N skills)' for bundles vs '⚡' for skills.
Tests:
- tests/agent/test_skill_bundles.py — 33 tests covering slugify,
scan/cache freshness, resolve (including underscore→hyphen
Telegram alias), build_bundle_invocation_message (loading, missing
skills, user/bundle instruction injection, dedup), save/delete,
reload diff, list sort.
- tests/hermes_cli/test_bundles.py — 8 tests for the CLI
subcommand (create/list/show/delete/reload, --force, missing
bundle errors).
- tests/gateway/test_bundles_command.py — 4 tests for the gateway
handler and bundle resolution priority.
Live E2E: verified subprocess invocations of hermes bundles
{list,create,show,reload,delete} round-trip correctly against an
isolated HERMES_HOME.
Docs:
- website/docs/user-guide/features/skills.md — new 'Skill Bundles'
section with quick example, YAML schema, management commands,
behavior notes.
- website/docs/reference/cli-commands.md — 'hermes bundles' added to
the top-level command table and given its own subcommand section.
|
||
|
|
2ef501e1f5
|
feat(cli): add /update slash command to CLI and TUI (#23854)
* feat: add /update slash command to CLI and TUI * test(cli): add Python tests for /update slash command Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address Copilot review for /update slash command Route classic CLI /update through prompt_toolkit modal confirmation and defer relaunch to the main-thread cleanup path after app.exit(). Tighten Y/n semantics, add Python wrapper and catalog coverage tests, and assert /update stays visible in the TUI command catalog. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address review feedback on /update command - Replace raw input() with _prompt_text_input_modal in _handle_update_command to avoid EOF/hang/keystroke-leak races with prompt_toolkit's stdin ownership - Fix confirmation logic: only proceed on recognized affirmative aliases (y/yes/1/ok); cancel on everything else including empty string, typos, and unrecognized input — matches all other [Y/n] prompts in the codebase - Route relaunch through main-thread shutdown path: set _pending_relaunch and return False from process_command so process_loop triggers app.exit(); run() then calls relaunch() after prompt_toolkit has restored terminal modes and after cleanup — safe on both POSIX (execvp) and Windows (subprocess+exit) - Fix misleading docstring in test_update_command.py: the Vitest only covers the TypeScript slash handler that emits code 42, not the Python wrapper branch that acts on it - Rewrite tests to use SimpleNamespace pattern (like test_destructive_slash_confirm) so _prompt_text_input_modal can be stubbed directly - Add Python test for _launch_tui exit-code-42 → relaunch branch in main.py Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * fix(cli): polish test fixtures for /update command - Remove unused _prompt_text_input from SimpleNamespace stub - Use pytest.fail sentinel in managed-install guard test to catch unexpected modal invocations Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * chore: re-trigger CI after Copilot review fixes Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> |
||
|
|
9df9816dab |
feat(azure-foundry): add Microsoft Entra ID auth
Use azure-identity DefaultAzureCredential for keyless Foundry auth. Preserve refreshable callable credentials through OpenAI and Anthropic client paths. Add setup, doctor, auth status, docs, and tests for Entra auth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> |
||
|
|
d5416284f1
|
fix(tui): autonomous background process completion notifications (#26071) (#26327)
* feat(process-registry): add format_process_notification shared helper * feat(process-registry): add drain_notifications method * refactor(cli): use shared drain_notifications and format_process_notification * feat(tui): add background notification poller for completion_queue * feat(tui): wire notification poller into session init/finalize * refactor(tui): add post-turn drain using shared helper as safety net |
||
|
|
efc32ab639 |
refactor(inventory): extract shared ConfigContext + build_models_payload
Three call-sites in the codebase each duplicated the same config-slice
+ list_authenticated_providers + post-processing pattern:
- hermes_cli/web_server.py /api/model/options
- tui_gateway/server.py model.options JSON-RPC
- tui_gateway/server.py model.save_key JSON-RPC
This consolidates them onto hermes_cli/inventory.py:
load_picker_context() -> ConfigContext
Replaces the 17-LOC config-slice (model.{default,name,provider,
base_url}, providers:, custom_providers:) every consumer did
inline.
ConfigContext.with_overrides(*, current_provider=, current_model=,
current_base_url=) -> ConfigContext
Truthy-only overlay for TUI agent-session state on top of disk
config. Empty getattr(agent, ...) attrs MUST NOT clobber disk.
build_models_payload(ctx, *, include_unconfigured, picker_hints,
canonical_order, max_models) -> dict
Single payload builder. Delegates curation to
list_authenticated_providers (does not call provider_model_ids
per row \u2014 that pulls non-agentic models). picker_hints +
canonical_order produce the TUI ModelPickerDialog shape;
defaults match the dashboard's existing /api/model/options
contract.
Two latent bugs fixed by consolidation:
1. The dashboard read cfg.get('custom_providers') directly, missing
the v12+ keyed providers: form. Now both surfaces go through
get_compatible_custom_providers().
2. The TUI's canonical-merge keyed on is_user_defined to decide order.
Section 3 of list_authenticated_providers sets is_user_defined=True
on rows from the providers: config dict even when the slug is
canonical \u2014 that silently demoted them to the picker tail.
_reorder_canonical now keys on slug membership instead.
Stats: +666 / -145 (net +521). Module 240 LOC; 18 behavior tests.
This PR replaces the rejected #23369 (which bundled the consolidation
with new scriptable CLI surfaces \u2014 hermes models list/status, hermes
providers list \u2014 and a JSON contract that have no external user
demand). Just the refactor; the CLI surface is deferred to a separate
PR gated on actual demand.
Refs #23359.
|
||
|
|
557deece6f |
fix(tui): use TERMINAL_CWD in _session_info for accurate status line path
_session_info() used os.getcwd() which reflects the gateway process working directory, not the user's actual working directory. This caused the TUI status line to display incorrect paths (e.g. D:\HermesWork instead of D:\Hermes\HermesWork) after agent turns that changed the process cwd. Align with session.create which already correctly reads TERMINAL_CWD env var set by the CLI launcher. |
||
|
|
ce0f529cde
|
chore: ruff auto-fix C401, C416, C408, PLR1722 (#23940)
C401: set(x for x in y) -> {x for x in y} (set comprehension)
C416: [(k,v) for k,v in d] -> list(d.items()) (unnecessary listcomp)
C408: tuple()/dict() -> ()/{} (unnecessary collection call)
PLR1722: exit() -> sys.exit() (adds import sys where needed)
21 instances fixed, 0 remaining. 19 files, +40/-36.
|
||
|
|
2ec8d2b42f
|
chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace with for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero). |
||
|
|
3e7145e0bb
|
revert: roll back /goal checklist + /subgoal feature stack (#23813)
* Revert "fix(goals): force judge to use tool calls instead of JSON-text replies (#23547)" This reverts commit |
||
|
|
404640a2b7
|
feat(goals): /goal checklist + /subgoal user controls (#23456)
* feat(goals): /goal checklist + /subgoal user controls
Two-phase judge for /goal — Phase A decomposes the goal into a detailed
checklist on first turn; Phase B evaluates each pending item harshly
against the agent's most recent response. The goal completes only when
every item is in a terminal status (completed or impossible). Adds
/subgoal so the user can append, complete, mark impossible, undo,
remove, or clear items the judge missed or got wrong.
Mechanics:
- GoalState gains `checklist` and `decomposed` fields, both backwards
compatible (old state_meta rows load unchanged).
- Phase A: aux call writes a harsh, exhaustive checklist; biased toward
more items not fewer. Falls through to legacy freeform judge when
decompose fails.
- Phase B: judge gets the checklist + last-response snippet + path to
a per-session conversation dump at <HERMES_HOME>/goals/<sid>.json.
A bounded read_file tool (max 5 calls per turn, restricted to that
one file) lets the judge inspect history when the snippet is
ambiguous. Stickiness in code: terminal items are frozen, only the
user can revert via /subgoal undo.
- Continuation prompt shows checklist progress when non-empty;
reverts to old prompt when empty.
- Status line shows M/N done counts.
CLI + gateway + TUI gateway all pass the agent reference into
evaluate_after_turn so the dump can be written. Gateway-side
/subgoal is allowed mid-run since it only modifies the checklist
the judge consults at turn boundaries.
Tests: 24 new cases — backcompat round-trip, Phase A decompose,
Phase B updates + new_items + stickiness, user override flows,
conversation dump (incl. unsafe-sid sanitization), judge read_file
restriction. Existing freeform-mode tests updated to patch the
renamed `judge_goal_freeform` and skip Phase A explicitly.
* fix(goals): off-by-one in judge index, message-list plumbing, prompt tuning
Three live-test findings from running /goal end-to-end against
gemini-3-flash-preview as the judge:
1. Off-by-one bug — the judge sees the checklist rendered with 1-based
indices ('1. [ ] foo, 2. [ ] bar') but the apply layer indexed
state.checklist as 0-based. Result: every judge update landed on
the wrong item, evidence got attached to neighbouring rows, and
the genuine 'first pending' item (usually #1) never got marked.
Fix: convert 1 → 0 in _parse_evaluate_response. Also tightened the
user prompt to call out the 1-based scheme explicitly. New tests
cover the parser conversion + an end-to-end fake-judge round-trip.
2. Conversation dump never happened — _extract_agent_messages tried
common AIAgent attribute names (.messages, .conversation_history,
etc.) but AIAgent doesn't expose the message list as an instance
attribute; it lives inside run_conversation()'s scope. Result: the
judge's read_file tool always saw history_path=unavailable. Fix:
added an explicit messages= kwarg to evaluate_after_turn that all
three call sites (CLI, gateway, TUI gateway) now pass directly.
Agent-attribute extraction kept as back-compat fallback.
3. Prompt was too harsh on simple goals. The original 'be HARSH,
default to leaving items pending' wording made the judge refuse
to mark 'file exists' completed even after the agent ran ls,
test -f, os.path.isfile, and find — burning the entire 8-turn
budget on a fizzbuzz task. Softened to 'strict but not absurd'
with explicit guidance on what counts as evidence and a directive
not to require re-proving items already established earlier.
Re-tested live with the same fizzbuzz goal: now terminates in 2
turns with all 8 checklist items correctly attributed to their
own evidence. /subgoal user-action flow (add / complete / undo /
impossible) verified live as well.
|
||
|
|
c7f0aab949
|
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
|
||
|
|
cbce5e93fc |
codebase: add encoding='utf-8' to all bare open() calls (PLW1514)
Closes the last Python-on-Windows UTF-8 exposure by making every
text-mode open() call explicit about its encoding.
Before: on Windows, bare open(path, 'r') defaults to the system
locale encoding (cp1252 on US-locale installs). That means reading
any config/yaml/markdown/json file with non-ASCII content either
crashes with UnicodeDecodeError or silently mis-decodes bytes.
After: all 89 affected call sites in production code now pass
encoding='utf-8' explicitly. Works identically on every platform
and every locale, no surprise behavior.
Mechanical sweep via:
ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix --exclude 'tests,venv,.venv,node_modules,website,optional-skills, skills,tinker-atropos,plugins' .
All 89 fixes have the same shape: open(x) or open(x, mode) became
open(x, encoding='utf-8') or open(x, mode, encoding='utf-8'). Nothing
else changed. Every modified file still parses and the Windows/sandbox
test suite is still green (85 passed, 14 skipped, 0 failed across
tests/tools/test_code_execution_windows_env.py +
tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py +
tests/test_hermes_bootstrap.py).
Scope notes:
- tests/ excluded: test fixtures can use locale encoding intentionally
(exercising edge cases). If we want to tighten tests later that's
a separate PR.
- plugins/ excluded: plugin-specific conventions may differ; plugin
authors own their code.
- optional-skills/ and skills/ excluded: skill scripts are user-authored
and we don't want to mass-edit them.
- website/ and tinker-atropos/ excluded: vendored / generated content.
46 files touched, 89 +/- lines (symmetric replacement). No behavior
change on POSIX or on Windows when the file is ASCII; bug fix on
Windows when the file contains non-ASCII.
|
||
|
|
7f92e5506e
|
Merge pull request #20942 from NousResearch/austin/fix/personality
fix(tui): preserve session when switching personality |
||
|
|
d87c7b99e2
|
fix(analytics): prevent silent token loss and add Claude 4.5–4.7 pricing (#21455)
- Add pricing entries for Claude Opus 4.5/4.6/4.7, Sonnet 4.5/4.6, and Haiku 4.5 with updated source URLs (platform.claude.com) - Add _normalize_anthropic_model_name() to handle dot-notation variants (e.g. claude-opus-4.7 → claude-opus-4-7) for pricing lookups - Fix silent token loss: ensure session row exists before UPDATE in both run_agent.py and hermes_state.py (INSERT OR IGNORE is idempotent) - Log token persistence failures at DEBUG level instead of swallowing them silently — makes undercounted analytics diagnosable - Surface reasoning tokens in CLI /usage and TUI usage panel - Add 'reasoning' and 'cost_status' fields to TUI Usage type |
||
|
|
ec9d0e26d4 | fix(tui): render structured content on resume | ||
|
|
6e46f99e7e
|
fix(tui): surface backend error as visible text when final_response is empty (#21245)
When the provider rejects a request (e.g. invalid model slug like
'--provider nous --model kimi-k2.6' where the valid slug is
'moonshotai/kimi-k2.6'), run_conversation() returns
{failed: True, error: <detail>, final_response: None}. The TUI gateway
and one-shot CLI mode both dropped the error on the floor and emitted
an empty turn, so the user saw a blank response with no indication
that anything went wrong.
Mirror the interactive CLI's existing pattern (cli.py:9832): when
final_response is empty AND (failed|partial) is set AND error is
populated, surface 'Error: <detail>' as the visible text. Leaves
the None-with-no-error path and the '(empty)' sentinel path
untouched — an empty successful turn still renders empty, and
existing sentinel handlers keep owning their lane.
Reported by @counterposition in PR #20873; taking a minimal fix
rather than the broader structured-failure refactor proposed there.
|
||
|
|
65c762b2e8 |
fix(tui): preserve session when switching personality
Previously, /personality in the TUI called _reset_session_agent() which destroyed the agent, cleared conversation history, and effectively started a new session. This made personality switching disruptive — users lost their entire conversation context. Now /personality updates the agent's ephemeral_system_prompt in-place and injects a pivot marker into the conversation history. The marker tells the model to adopt the new persona from that point forward, which is necessary because LLMs tend to pattern-match their prior responses and continue the established tone without an explicit signal. Changes: - tui_gateway/server.py: Rewrite _apply_personality_to_session to update the agent in-place instead of resetting. Inject a user-role pivot marker so the model actually switches style mid-conversation. - ui-tui/src/app/slash/commands/session.ts: Update help text (no longer mentions history reset). - tests/test_tui_gateway_server.py: Update test to verify history is preserved, pivot marker is injected, and ephemeral prompt is set. |
||
|
|
04cf4788cc
|
fix(tui): restore voice push-to-talk parity (#20897)
* fix(tui): restore classic CLI voice push-to-talk parity
(cherry picked from commit
|
||
|
|
d78c34928f |
feat(tui): collapsible sections in startup banner (skills, system prompt, MCP)
The TUI SessionPanel banner now uses collapsible \u25b8/\u25be toggle sections matching the existing Chevron convention used for runtime agent details. Skills, system prompt, and MCP server lists are collapsed by default; tools remain expanded as the most actionable info. - tui_gateway/server.py: _session_info() now passes agent._cached_system_prompt through to the TUI frontend - ui-tui/src/types.ts: added system_prompt?: string to SessionInfo - ui-tui/src/components/branding.tsx: rewrote SessionPanel with CollapseToggle helper + per-section useState toggles Default states: tools=open, skills=collapsed, system=collapsed, mcp=collapsed. Clicking any \u25b8/\u25be header toggles that section. |
||
|
|
3ebdd26449 | fix(browser): surface Lightpanda Chrome fallback warnings |